Project Overview
Vibe-Pi combines a Streamlit chatbot frontend with a FastAPI backend to deliver context‐aware, file-driven conversational AI. It simplifies secure user management, document ingestion, and retrieval-augmented generation (RAG), all packaged for Docker-first deployment.
What is Vibe-Pi?
Vibe-Pi is an open-source toolkit that lets you:
- Register and authenticate users
- Upload documents for semantic indexing
- Chat with a bot that answers using both general language models and your own files
- Get AI-driven prompt recommendations
- Deploy end-to-end via Docker
Problems It Solves
- Secure, multi-user chatbot hosting
- Integrating private documents into RAG pipelines
- Handling file storage, indexing and retrieval out of the box
- Suggesting effective prompts to improve AI responses
- Streamlining setup in development and production environments
When to Choose Vibe-Pi
Choose Vibe-Pi when you need:
- A turnkey chatbot UI backed by custom knowledge
- User registration, login and per-user data isolation
- Built-in prompt recommendation without building from scratch
- Containerized deployment across teams or cloud providers
Key Features
Streamlit Chatbot UI
- Built with Streamlit for rapid customization
- Manages login, file upload, chat sessions and history
- Easy to extend UI components in main.py
FastAPI Backend with User Auth
- Endpoints for /register, /login, /upload/, /chat/, /recommend_prompt/, /history/, /files/
- JWT-based authentication and database integration
- File storage and semantic indexing per user
File-Aware RAG Answers
- Ingests PDFs, TXT and common document formats
- Builds and queries a vector index for context-rich responses
- Falls back to base language model when no file context applies
Prompt Recommendations
- Analyzes user queries and suggests refined prompts
- Improves quality of AI interactions over time
Docker-First Deployment
- Preconfigured Dockerfile and docker-compose.yaml
- Single-command setup for development or production
Quick Start
1. Launch with Docker
# Build and start both frontend and backend
docker-compose up -d
2. Access the Chat UI
Open http://localhost:8501 in your browser, then:
- Register a new user
- Upload documents
- Start chatting with file-aware AI
Example: Chat via Python Client
import requests
API_URL = "http://localhost:8000"
session = requests.Session()
# 1. Register or login
resp = session.post(f"{API_URL}/register", json={"username":"alice","password":"secret"})
resp.raise_for_status()
# 2. Upload a document
with open("doc.pdf", "rb") as f:
    resp = session.post(f"{API_URL}/upload/", files={"file": f})
resp.raise_for_status()
# 3. Get a chat response
payload = {"query": "Summarize the uploaded document"}
resp = session.post(f"{API_URL}/chat/", json=payload)
print("Bot:", resp.json()["answer"])
This setup delivers a secure, containerized RAG chatbot out of the box. Customizing UI components or backend logic only requires editing the respective main.py modules.
Quick Start
Get your chatbot running locally in minutes. Choose between a Python-based setup or containerized workflow.
1. Prerequisites
• Git
• Python 3.11 or Docker & Docker Compose
• An OpenAI API key (or equivalent)
• PostgreSQL (local or remote) for production workflows; SQLite works out of the box for dev
2. Environment Variables
Create a chatbot.env file at the project root (keep it out of version control; it is also excluded from the Docker build context via .dockerignore):
# chatbot.env
OPENAI_API_KEY=sk-...
DATABASE_URL=postgresql+asyncpg://user:pass@localhost:5432/chatbot
CHROMA_DB_DIR=./chroma-db # local vector store path
SECRET_KEY=your-secret-key # JWT or session signing
Load vars in your shell:
export $(grep -v '^#' chatbot.env | xargs)
3. Python Workflow
Clone and install dependencies:
git clone https://github.com/riqtiny/vibe-pi-releasecandidate1.git
cd vibe-pi-releasecandidate1
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
Run the backend API:
uvicorn backend.main:app \
  --reload \
  --host 0.0.0.0 \
  --port 8000
In a second terminal, start the Streamlit frontend:
streamlit run frontend/app.py \
  --server.port 8501 \
  --server.headless true
Open your browser:
- Frontend UI → http://localhost:8501
- API docs → http://localhost:8000/docs
4. Docker Workflow
Leverage Docker Compose to build and orchestrate both services:
git clone https://github.com/riqtiny/vibe-pi-releasecandidate1.git
cd vibe-pi-releasecandidate1
# Ensure chatbot.env lives here
docker-compose up --build
- backend → http://localhost:8000
- frontend → http://localhost:8501
Stop and remove containers with:
docker-compose down
5. First Interaction
- In the Streamlit UI, log in or set a user_id in the sidebar.
- Use “Upload Document” to add a PDF, DOCX, PPTX or image.
- Upon success, the app triggers indexing and displays a chat interface.
- Ask questions; the frontend sends requests to /chat/ and /recommend_prompt/ under the hood.
6. Next Steps
- Tweak UI theme via .streamlit/config.toml
- Add custom RAG pipelines in backend/routers/
- Persist to production-grade Postgres by updating DATABASE_URL in chatbot.env
- Scale with Kubernetes using the provided Dockerfiles
You’re now ready to explore advanced features: vector search tuning, multi-user auth, and deployment best practices.
Architecture & Core Concepts
This section describes the main components, how data flows from the Streamlit UI through the backend to produce RAG-powered answers, and how you can extend or debug each part.
System Components
Streamlit Frontend
- Captures user inputs (login, file uploads, chat prompts)
- Uses frontend/api.py functions to call backend endpoints
- Renders chat history, recommended prompts, and citations
API Client (frontend/api.py)
- Centralizes all HTTP calls to FastAPI
- Transforms responses into simple return values for the UI
- Handles network errors and JSON parsing
FastAPI Backend (api/backend/main.py)
- Defines REST endpoints: /login, /register, /upload, /files, /history, /chat, /recommend_prompt
- Orchestrates indexing, retrieval, chat history persistence, and prompt recommendations
Vector Store (ChromaDB via api/backend/chroma.py)
- Stores embeddings of text chunks extracted from uploaded files
- Supports fast similarity search for RAG retrieval
Database (SQLAlchemy via api/backend/db.py)
- Persists users, file metadata, and chat history
- Exposes a SessionLocal factory and Base for model definitions
Embeddings & LLM (Google GenAI)
- Generates embeddings for document chunks
- Powers the LLM that composes answers from retrieved context
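A minimal sketch of how these pieces might be initialized (the module path api/backend/llm.py, the model names, and the use of the langchain-google-genai package are assumptions for illustration, not the project's verified source):
# api/backend/llm.py (hypothetical layout; model names are illustrative)
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

# Both clients read GOOGLE_API_KEY from the environment.
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")  # chunk embeddings
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)    # composes answers from retrieved context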
Data Flow
1. File Upload & Indexing
- User uploads a file in Streamlit; upload_file(user_id, file) posts to /upload/.
- FastAPI saves the file, persists metadata, then calls index_file(path, mime, metadata).
- The index_file pipeline (api/backend/chroma.py):
  - Extracts text via the appropriate extractor (PDF, DOCX, PPTX, image/OCR).
  - Wraps each page/slide/paragraph in a LangChain Document with combined user metadata.
  - Splits documents into 1,000-character chunks (100-character overlap) using RecursiveCharacterTextSplitter.
  - Embeds and stores chunks in ChromaDB with vectorstore.add_documents(chunks).
Usage example in your upload handler:
from api.backend.chroma import index_file
path = "/tmp/uploads/lecture1.pdf"
mime = "application/pdf"
meta = {"user_id": current_user.id, "filename": "lecture1.pdf"}
if index_file(path, mime, meta):
    print("Indexed successfully")
else:
    print("Indexing failed or no text extracted")
Practical tips:
- Set GOOGLE_API_KEY for embeddings.
- Tune chunk_size/chunk_overlap for your documents (see the sketch below).
- Check logging at INFO level for extraction summaries and warnings on unsupported types.
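As a sketch, the splitting step with the defaults noted above might look like this (the sample Document and commented-out persistence call stand in for the surrounding pipeline; exact values in the project may differ):
from langchain.schema import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter

# One Document per extracted page/slide/paragraph, carrying user metadata.
docs = [Document(page_content="…extracted page text…", metadata={"user_id": 42, "page": 1})]

# Larger chunks give the LLM more context per hit; more overlap preserves
# continuity across chunk boundaries at the cost of a bigger index.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)
# vectorstore.add_documents(chunks)  # then persist to ChromaDB as described above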
2. Chat Request & RAG Pipeline
- UI calls get_chat_response(user_id, message, file_id, history) → POST /chat/.
- The FastAPI /chat/ handler (api/backend/main.py) does the following:
  - Loads the user’s vector store (optionally scoped by file_id).
  - Builds a retriever (vectorstore.as_retriever(search_kwargs={"k": 5})).
  - Constructs a RetrievalQA chain with the Google GenAI LLM and the retriever.
  - Executes the chain on {"query": message}, returning both result and source_documents.
  - Extracts citations (e.g. page numbers, text snippets) from source_documents.
  - Persists the user message and assistant response in the chat history table.
  - Returns JSON:
    {
      "answer": "Assistant’s reply text…",
      "citations": [
        {"page": 2, "text": "…relevant snippet…"},
        …
      ]
    }
Example backend snippet:
from fastapi import APIRouter
from api.backend.chroma import get_vectorstore
from api.backend.llm import llm
from langchain.chains import RetrievalQA

router = APIRouter()

@router.post("/chat/")
def chat(user_id: int, message: str, file_id: int = None, history: list = None):
    # 1. Prepare retriever
    vectorstore = get_vectorstore(user_id, file_id)
    retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
    # 2. Run RAG chain
    qa = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=retriever,
        return_source_documents=True
    )
    result = qa({"query": message})
    # 3. Extract answer & citations
    answer = result["result"]
    citations = [
        {"page": doc.metadata.get("page"), "text": doc.page_content}
        for doc in result["source_documents"]
    ]
    # 4. Persist to DB (omitted for brevity)…
    return {"answer": answer, "citations": citations}
Best practices:
- Limit k (retrieval count) to balance context size vs. latency.
- Cap history to the most recent 10 messages to reduce payload.
- Customize the prompt template in RetrievalQA.from_chain_type as needed (see the sketch below).
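A sketch of that last point, passing a custom prompt into the chain via chain_type_kwargs (llm and retriever as in the snippet above; the template text is illustrative):
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

template = (
    "Answer the question using only the context below. "
    "If the context is insufficient, say so.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)
prompt = PromptTemplate(template=template, input_variables=["context", "question"])

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt},  # overrides the default "stuff" prompt
)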
Core Configuration
Database Engine & Session (api/backend/db.py)
import os
from dotenv import load_dotenv
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
load_dotenv()
DATABASE_URL = os.getenv("DATABASE_URL")
engine = create_engine(DATABASE_URL, pool_pre_ping=True)
SessionLocal = sessionmaker(
    autocommit=False,
    autoflush=False,
    bind=engine
)
Base = declarative_base()
- Call Base.metadata.create_all(bind=engine) to initialize your schema.
- Use SessionLocal() in a dependency or with-block to avoid connection leaks (a sketch follows).
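For example, a common FastAPI dependency built on SessionLocal (a generic pattern, not necessarily how the project wires it):
from api.backend.db import SessionLocal

def get_db():
    """Yield a database session and always close it, even if the request fails."""
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

# In an endpoint: def handler(db: Session = Depends(get_db)): ...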
Embeddings & Vector Store Initialization (api/backend/chroma.py)
from langchain.embeddings import GoogleGenerativeAIEmbeddings
from langchain.vectorstores import Chroma
def get_vectorstore(user_id: int, file_id: int = None) -> Chroma:
    # Path "chromadb/{user_id}/{file_id}" or "chromadb/{user_id}/global"
    persist_dir = f"chromadb/{user_id}/{file_id or 'global'}"
    embeddings = GoogleGenerativeAIEmbeddings()
    return Chroma(persist_directory=persist_dir, embedding_function=embeddings)
Ensure the Chroma persistence directory (CHROMA_DB_DIR) is writable and consistent across instances.
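One simple guard, shown here as a hypothetical helper, is to create the directory before opening the store:
import os

def ensure_persist_dir(user_id: int, file_id: int | None = None) -> str:
    """Create the per-user Chroma directory if it does not exist and return its path."""
    persist_dir = f"chromadb/{user_id}/{file_id or 'global'}"
    os.makedirs(persist_dir, exist_ok=True)
    return persist_dir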
API Client Wrappers (frontend/api.py)
Centralize HTTP calls so your UI code stays clean. Key functions:
- login_user(username, password) → (ok, user_id, message)
- register_user(username, password) → (ok, message)
- get_chat_history(user_id) → list[dict]
- upload_file(user_id, file) → {"file_id", "filename"}
- get_user_files(user_id) → list[dict]
- delete_file(user_id, file_id) → dict
- delete_chat_history(user_id) → dict
- get_chat_response(user_id, message, file_id=None, history=None) → {"answer", "citations"}
- get_recommended_prompts(user_id, file_id=None, n=6) → list[str]
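As an illustration, one of these wrappers might be implemented roughly like this (BASE_URL and the error handling are assumptions, not the actual frontend/api.py source):
import requests

BASE_URL = "http://localhost:8000"  # assumed backend address

def get_chat_response(user_id, message, file_id=None, history=None):
    """POST to /chat/ and normalize the response for the UI."""
    payload = {"user_id": user_id, "message": message, "file_id": file_id, "history": history or []}
    try:
        resp = requests.post(f"{BASE_URL}/chat/", json=payload, timeout=60)
        resp.raise_for_status()
        data = resp.json()
        return {"answer": data.get("answer", ""), "citations": data.get("citations", [])}
    except (requests.RequestException, ValueError) as exc:
        # Network failures and malformed JSON surface as an error answer.
        return {"answer": f"Error contacting backend: {exc}", "citations": []}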
Example usage in Streamlit:
import streamlit as st
from frontend.api import get_chat_response

resp = get_chat_response(
    st.session_state.user_id,
    st.session_state.prompt,
    file_id=st.session_state.file_id,
    history=st.session_state.chat_history[-10:]
)
st.write(resp["answer"])
for cite in resp["citations"]:
    st.write(f"Page {cite['page']}: {cite['text']}")
By following this architecture and understanding each component’s role in the RAG pipeline, you can extend extractors, swap embedding providers, adjust retrieval settings, or add new endpoints with confidence.
API Reference
1. Authentication
1.1 Register a New User
Create a user account.
Endpoint
POST /register
Request
Content-Type: application/json
{
"username": "alice",
"password": "s3cr3t"
}
Response
201 Created
{
"id": 42,
"username": "alice"
}
Errors
• 400 Bad Request – missing fields
• 409 Conflict – username already exists
cURL
curl -X POST http://localhost:8000/register \
-H "Content-Type: application/json" \
-d '{"username":"alice","password":"s3cr3t"}'
1.2 Login and Obtain Token
Authenticate and receive a Bearer token.
Endpoint
POST /login
Request
Content-Type: application/json
{
"username": "alice",
"password": "s3cr3t"
}
Response
200 OK
{
"access_token": "<jwt_token>",
"token_type": "bearer"
}
cURL
curl -X POST http://localhost:8000/login \
-H "Content-Type: application/json" \
-d '{"username":"alice","password":"s3cr3t"}'
2. File Upload & Indexing
POST /upload/
Upload a document, persist metadata, and index its contents for RAG.
Headers
Authorization: Bearer <token>
Content-Type: multipart/form-data
Form Fields
• user_id (int) – your user ID
• file (file) – PDF, DOCX, TXT, etc.
Response
200 OK
{
"file_id": 7,
"filename": "20250717123045_report.pdf"
}
cURL
curl -X POST http://localhost:8000/upload/ \
-H "Authorization: Bearer $TOKEN" \
-F "user_id=42" \
-F "file=@/path/to/report.pdf"
Python (requests)
import requests
token = "<jwt_token>"
files = {'file': open('report.pdf','rb')}
data = {'user_id': 42}
resp = requests.post(
"http://localhost:8000/upload/",
headers={"Authorization": f"Bearer {token}"},
files=files,
data=data
)
print(resp.json())
3. Chat with RAG
POST /chat/
Send a message; receive a response augmented by your indexed files.
Headers
Authorization: Bearer <token>
Content-Type: application/json
Request Body
{
"user_id": 42,
"message": "Summarize the Q2 report",
"file_id": 7 // optional: target a specific upload
}
Response
200 OK
{
"response": "The Q2 report highlights a 15% revenue increase…",
"source_chunks": [
{"text":"…revenue increased by 15%…","document":"report.pdf"},
{"text":"…net profit margin improved…","document":"report.pdf"}
]
}
cURL
curl -X POST http://localhost:8000/chat/ \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"user_id":42,"message":"Summarize the Q2 report","file_id":7}'
4. Prompt Recommendations
GET /recommend_prompt/
Retrieve auto-generated prompts based on your files.
Headers
Authorization: Bearer <token>
Query Parameters
• user_id (int) – your user ID
• file_id (int) – target file
Response
200 OK
{
"prompts": [
"What are the key findings in report.pdf?",
"List challenges mentioned in the Q2 report."
]
}
cURL
curl "http://localhost:8000/recommend_prompt/?user_id=42&file_id=7" \
-H "Authorization: Bearer $TOKEN"
5. Chat History
GET /history/{user_id}
List all past chat interactions for a user.
Headers
Authorization: Bearer <token>
Path Parameter
• user_id (int)
Response
200 OK
[
{
"id": 101,
"message": "Summarize the Q2 report",
"response": "The Q2 report…",
"timestamp": "2025-07-17T12:35:00"
},
{
"id": 102,
"message": "List action items",
"response": "1. Follow up with finance…",
"timestamp": "2025-07-17T12:40:00"
}
]
cURL
curl http://localhost:8000/history/42 \
-H "Authorization: Bearer $TOKEN"
6. File Management
6.1 List Uploaded Files
GET /files/{user_id}
Headers
Authorization: Bearer <token>
Path Parameter
• user_id (int)
Response
200 OK
[
{"file_id":7,"filename":"20250717123045_report.pdf","upload_time":"2025-07-17T12:30:00"},
{"file_id":8,"filename":"20250717124500_notes.txt","upload_time":"2025-07-17T12:45:00"}
]
cURL
curl http://localhost:8000/files/42 \
-H "Authorization: Bearer $TOKEN"
6.2 Delete a File
DELETE /files/{file_id}
Headers
Authorization: Bearer <token>
Path Parameter
• file_id (int)
Response
204 No Content
cURL
curl -X DELETE http://localhost:8000/files/7 \
-H "Authorization: Bearer $TOKEN"
All endpoints support testing with Postman or curl. Include the Authorization: Bearer <token> header for protected routes.
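For scripted testing, a requests.Session can attach that header to every call (endpoints as documented above):
import requests

session = requests.Session()
session.headers.update({"Authorization": "Bearer <jwt_token>"})  # sent with every request

files = session.get("http://localhost:8000/files/42").json()
history = session.get("http://localhost:8000/history/42").json()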
Deployment & Configuration
This section guides you through containerizing, configuring, and running the backend (FastAPI/Uvicorn) and frontend (Streamlit) services for staging or production. You’ll learn how to build images, manage environment variables, orchestrate with Docker Compose, and scale your setup.
1. Building Container Images
Use the provided Dockerfiles to create images for each service.
Backend
Dockerfile.backend installs dependencies and launches Uvicorn on port 8000.
docker build \
-t chatbot-backend:latest \
-f Dockerfile.backend \
.
Frontend
Dockerfile.frontend installs Streamlit requirements and exposes port 8501.
docker build \
-t chatbot-frontend:latest \
-f Dockerfile.frontend \
.
2. Environment Configuration
Centralize runtime settings and secrets in chatbot.env at your project root. Git-ignore this file.
# chatbot.env
OPENAI_API_KEY=sk-…
DATABASE_URL=postgres://user:pass@db:5432/chatbot
STREAMLIT_SERVER_PORT=8501
UVICORN_WORKERS=4
Refer to these variables in your Dockerfiles or application code via os.getenv.
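For example, reading these settings at startup (variable names as defined in chatbot.env above):
import os

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
DATABASE_URL = os.getenv("DATABASE_URL")
UVICORN_WORKERS = int(os.getenv("UVICORN_WORKERS", "4"))  # default matches chatbot.env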
3. Orchestrating with Docker Compose
Use docker-compose.yml to launch both services and inject environment variables.
version: "3.9"
services:
backend:
image: chatbot-backend:latest
ports:
- "${BACKEND_PORT:-8000}:8000"
env_file: chatbot.env
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
interval: 10s
retries: 3
frontend:
image: chatbot-frontend:latest
ports:
- "${FRONTEND_PORT:-8501}:8501"
env_file: chatbot.env
depends_on:
backend:
condition: service_healthy
Launching Services
# Override ports or worker count in a local .env file or CLI
export BACKEND_PORT=8080
export FRONTEND_PORT=8601
docker-compose up -d
Verify:
- Backend Docs → http://localhost:${BACKEND_PORT:-8000}/docs
- Streamlit UI → http://localhost:${FRONTEND_PORT:-8501}
4. Scaling & Performance
Scaling Backends
Run multiple backend replicas to handle increased traffic:
docker-compose up -d --scale backend=3
Load-balance at your ingress (NGINX, Traefik) or cloud load-balancer.
Tuning Uvicorn Workers
Pass UVICORN_WORKERS from chatbot.env into the container and override the start command:
backend:
  image: chatbot-backend:latest
  command: >
    uvicorn app.main:app
    --host 0.0.0.0
    --port 8000
    --workers ${UVICORN_WORKERS}
5. Cleanup and Maintenance
Stop and remove containers, networks, volumes:
docker-compose down --volumes
View real-time logs:
docker-compose logs -f backend
docker-compose logs -f frontend
Rebuild images after code changes:
docker-compose build
docker-compose up -d
Practical Tips
Enable live code reload in development by switching image: to a build: block and mounting source directories.
Keep secrets out of version control; use Docker secrets or a vault in production.
Add resource limits in Compose for production stability:
deploy:
  resources:
    limits:
      cpus: "0.50"
      memory: 512M
Monitor container health and logs via your orchestration platform or a monitoring stack (Prometheus, Grafana).
Development Guide
This guide covers coding standards, local development setup, and contribution workflow for the riqtiny/vibe-pi-releasecandidate1 project.
Coding Standards
Enforce consistency and readability across the codebase.
• Formatting
- Use Black (black .) for code formatting.
- Run isort . to sort imports.
- Target line length: 88 characters.
• Type Checking & Linting
- Annotate functions and methods with type hints.
- Run MyPy (mypy .) to catch type errors.
- Use Flake8 (flake8 .) to enforce lint rules.
• Docstrings
- Follow Google-style docstrings.
- Include parameter and return type descriptions.
Example function with type hints and docstring:
def add_vectors(a: list[float], b: list[float]) -> list[float]:
    """Element-wise addition of two vectors.

    Args:
        a: First vector of floats.
        b: Second vector of floats.

    Returns:
        A new vector where each element is the sum of corresponding elements in a and b.

    Raises:
        ValueError: If input vectors differ in length.
    """
    if len(a) != len(b):
        raise ValueError("Vectors must be the same length")
    return [x + y for x, y in zip(a, b)]
Local Development Setup
Prerequisites
- Python 3.10+
- Docker 20+ & Docker Compose
- Git
Clone & Environment
git clone https://github.com/riqtiny/vibe-pi-releasecandidate1.git
cd vibe-pi-releasecandidate1
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
.dockerignore
The project’s .dockerignore excludes sensitive and build-time files:
• .env and *.env
• __pycache__/, *.pyc
• uploads/
This ensures lean, secure Docker images.
Running Services
Backend (FastAPI):
uvicorn api.backend.main:app --reload --host 0.0.0.0 --port 8000
Frontend (Streamlit):
streamlit run api/frontend/app.py --server.port 8501
Dockerized Development
Build and run via Docker:
docker build -t vibe-pi:dev .
docker run --rm -p 8000:8000 -p 8501:8501 \
-v "$(pwd)":/app \
-e ENV=development \
vibe-pi:dev
Testing
Run the full test suite and linters before merging changes:
pytest --maxfail=1 --disable-warnings -q
black --check .
isort --check-only .
mypy .
flake8 .
Contribution Workflow
Fork & Branch
- Fork the repo.
- Create a descriptive branch: git checkout -b feature/parse-pdf.
Develop & Test
- Adhere to coding standards.
- Write or update tests under tests/.
- Ensure CI passes: tests, lint, type checks.
Commit Messages
- Use imperative tense: Add PDF parsing support.
- Reference issues: Fix #42: handle missing metadata.
Pull Request
- Open PR against main.
- Include a concise summary of changes.
- Tag reviewers and add relevant labels.
Review & Merge
- Address review comments promptly.
- Squash related commits before merge.
- Ensure the changelog is updated if applicable.
Following this guide ensures a consistent workflow, stable builds, and smooth collaboration.