Project Overview
Vibe-Pi combines a Streamlit chatbot frontend with a FastAPI backend to deliver context‐aware, file-driven conversational AI. It simplifies secure user management, document ingestion, and retrieval-augmented generation (RAG), all packaged for Docker-first deployment.
What is Vibe-Pi?
Vibe-Pi is an open-source toolkit that lets you:
- Register and authenticate users
- Upload documents for semantic indexing
- Chat with a bot that answers using both general language models and your own files
- Get AI-driven prompt recommendations
- Deploy end-to-end via Docker
Problems It Solves
- Secure, multi-user chatbot hosting
- Integrating private documents into RAG pipelines
- Handling file storage, indexing and retrieval out of the box
- Suggesting effective prompts to improve AI responses
- Streamlining setup in development and production environments
When to Choose Vibe-Pi
Choose Vibe-Pi when you need:
- A turnkey chatbot UI backed by custom knowledge
- User registration, login and per-user data isolation
- Built-in prompt recommendation without building from scratch
- Containerized deployment across teams or cloud providers
Key Features
Streamlit Chatbot UI
- Built with Streamlit for rapid customization
- Manages login, file upload, chat sessions and history
- Easy to extend UI components in main.py
FastAPI Backend with User Auth
- Endpoints for /register, /login, /upload/, /chat/, /recommend_prompt/, /history/, /files/
- JWT-based authentication and database integration
- File storage and semantic indexing per user
File-Aware RAG Answers
- Ingests PDFs, TXT and common document formats
- Builds and queries a vector index for context-rich responses
- Falls back to base language model when no file context applies
Prompt Recommendations
- Analyzes user queries and suggests refined prompts
- Improves quality of AI interactions over time
Docker-First Deployment
- Preconfigured Dockerfile and docker-compose.yaml
- Single-command setup for development or production
Quick Start
1. Launch with Docker
# Build and start both frontend and backend
docker-compose up -d
2. Access the Chat UI
Open http://localhost:8501 in your browser, then:
- Register a new user
- Upload documents
- Start chatting with file-aware AI
Example: Chat via Python Client
import requests
API_URL = "http://localhost:8000"
session = requests.Session()
# 1. Register or login
resp = session.post(f"{API_URL}/register", json={"username":"alice","password":"secret"})
resp.raise_for_status()
# 2. Upload a document
with open("doc.pdf", "rb") as f:
    resp = session.post(f"{API_URL}/upload/", files={"file": f})
resp.raise_for_status()
# 3. Get a chat response
payload = {"query": "Summarize the uploaded document"}
resp = session.post(f"{API_URL}/chat/", json=payload)
print("Bot:", resp.json()["answer"])
This setup delivers a secure, containerized RAG chatbot out of the box. Customizing UI components or backend logic only requires editing the respective main.py modules.
Quick Start
Get your chatbot running locally in minutes. Choose between a Python-based setup or containerized workflow.
1. Prerequisites
• Git
• Python 3.11 or Docker & Docker Compose
• An OpenAI API key (or equivalent)
• PostgreSQL (local or remote) for production workflows; SQLite works out of the box for dev
2. Environment Variables
Create a chatbot.env file at the project root (keep it out of version control; it is also excluded from the Docker build context via .dockerignore):
# chatbot.env
OPENAI_API_KEY=sk-...
DATABASE_URL=postgresql+asyncpg://user:pass@localhost:5432/chatbot
CHROMA_DB_DIR=./chroma-db # local vector store path
SECRET_KEY=your-secret-key # JWT or session signing
Load vars in your shell:
export $(grep -v '^#' chatbot.env | xargs)
3. Python Workflow
Clone and install dependencies:
git clone https://github.com/riqtiny/vibe-pi-releasecandidate1.git
cd vibe-pi-releasecandidate1
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
Run the backend API:
uvicorn backend.main:app \
  --reload \
  --host 0.0.0.0 \
  --port 8000
In a second terminal, start the Streamlit frontend:
streamlit run frontend/app.py \
  --server.port 8501 \
  --server.headless true
Open your browser:
- Frontend UI → http://localhost:8501
- API docs → http://localhost:8000/docs
4. Docker Workflow
Leverage Docker Compose to build and orchestrate both services:
git clone https://github.com/riqtiny/vibe-pi-releasecandidate1.git
cd vibe-pi-releasecandidate1
# Ensure chatbot.env lives here
docker-compose up --build
- backend → http://localhost:8000
- frontend → http://localhost:8501
Stop and remove containers with:
docker-compose down
5. First Interaction
- In the Streamlit UI, log in or set a user_id in the sidebar.
- Use “Upload Document” to add a PDF, DOCX, PPTX or image.
- Upon success, the app triggers indexing and displays a chat interface.
- Ask questions; the frontend sends requests to /chat/ and /recommend_prompt/ under the hood.
6. Next Steps
- Tweak UI theme via .streamlit/config.toml
- Add custom RAG pipelines in backend/routers/
- Persist to production-grade Postgres by updating DATABASE_URL in chatbot.env
- Scale with Kubernetes using the provided Dockerfiles
You’re now ready to explore advanced features: vector search tuning, multi-user auth, and deployment best practices.
Architecture & Core Concepts
This section describes the main components, how data flows from the Streamlit UI through the backend to produce RAG-powered answers, and how you can extend or debug each part.
System Components
Streamlit Frontend
- Captures user inputs (login, file uploads, chat prompts)
- Uses frontend/api.py functions to call backend endpoints
- Renders chat history, recommended prompts, and citations
API Client (frontend/api.py)
- Centralizes all HTTP calls to FastAPI
- Transforms responses into simple return values for the UI
- Handles network errors and JSON parsing
FastAPI Backend (api/backend/main.py)
- Defines REST endpoints: /login, /register, /upload, /files, /history, /chat, /recommend_prompt
- Orchestrates indexing, retrieval, chat history persistence, and prompt recommendations
Vector Store (ChromaDB via api/backend/chroma.py)
- Stores embeddings of text chunks extracted from uploaded files
- Supports fast similarity search for RAG retrieval
Database (SQLAlchemy via api/backend/db.py)
- Persists users, file metadata, and chat history
- Exposes a SessionLocal factory and Base for model definitions
Embeddings & LLM (Google GenAI)
- Generates embeddings for document chunks
- Powers the LLM that composes answers from retrieved context
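A minimal sketch of how these pieces might be initialized (the module path api/backend/llm.py, the model names, and the use of the langchain-google-genai package are assumptions for illustration, not the project's verified source):
# api/backend/llm.py (hypothetical layout; model names are illustrative)
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

# Both clients read GOOGLE_API_KEY from the environment.
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")  # chunk embeddings
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)    # composes answers from retrieved context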
Data Flow
1. File Upload & Indexing
- User uploads a file in Streamlit; upload_file(user_id, file) posts to /upload/.
- FastAPI saves the file, persists metadata, then calls index_file(path, mime, metadata).
- The index_file pipeline (api/backend/chroma.py):
  - Extracts text via the appropriate extractor (PDF, DOCX, PPTX, image/OCR).
  - Wraps each page/slide/paragraph in a LangChain Document with combined user metadata.
  - Splits documents into 1,000-character chunks (100-character overlap) using RecursiveCharacterTextSplitter.
  - Embeds and stores chunks in ChromaDB with vectorstore.add_documents(chunks).
Usage example in your upload handler:
from api.backend.chroma import index_file
path = "/tmp/uploads/lecture1.pdf"
mime = "application/pdf"
meta = {"user_id": current_user.id, "filename": "lecture1.pdf"}
if index_file(path, mime, meta):
    print("Indexed successfully")
else:
    print("Indexing failed or no text extracted")
Practical tips:
- Set GOOGLE_API_KEY for embeddings.
- Tune chunk_size/chunk_overlap for your documents (see the sketch below).
- Check logging at INFO level for extraction summaries and warnings on unsupported types.
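As a sketch, the splitting step with the defaults noted above might look like this (the sample Document and commented-out persistence call stand in for the surrounding pipeline; exact values in the project may differ):
from langchain.schema import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter

# One Document per extracted page/slide/paragraph, carrying user metadata.
docs = [Document(page_content="…extracted page text…", metadata={"user_id": 42, "page": 1})]

# Larger chunks give the LLM more context per hit; more overlap preserves
# continuity across chunk boundaries at the cost of a bigger index.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)
# vectorstore.add_documents(chunks)  # then persist to ChromaDB as described above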
2. Chat Request & RAG Pipeline
- UI calls get_chat_response(user_id, message, file_id, history) → POST /chat/.
- The FastAPI /chat/ handler (api/backend/main.py) does the following:
  - Loads the user’s vector store (optionally scoped by file_id).
  - Builds a retriever (vectorstore.as_retriever(search_kwargs={"k": 5})).
  - Constructs a RetrievalQA chain with the Google GenAI LLM and the retriever.
  - Executes the chain on {"query": message}, returning both result and source_documents.
  - Extracts citations (e.g. page numbers, text snippets) from source_documents.
  - Persists the user message and assistant response in the chat history table.
  - Returns JSON:
    {
      "answer": "Assistant’s reply text…",
      "citations": [
        {"page": 2, "text": "…relevant snippet…"},
        …
      ]
    }
Example backend snippet:
from fastapi import APIRouter
from api.backend.chroma import get_vectorstore
from api.backend.llm import llm
from langchain.chains import RetrievalQA

router = APIRouter()

@router.post("/chat/")
def chat(user_id: int, message: str, file_id: int = None, history: list = None):
    # 1. Prepare retriever
    vectorstore = get_vectorstore(user_id, file_id)
    retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
    # 2. Run RAG chain
    qa = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=retriever,
        return_source_documents=True
    )
    result = qa({"query": message})
    # 3. Extract answer & citations
    answer = result["result"]
    citations = [
        {"page": doc.metadata.get("page"), "text": doc.page_content}
        for doc in result["source_documents"]
    ]
    # 4. Persist to DB (omitted for brevity)…
    return {"answer": answer, "citations": citations}
Best practices:
- Limit k (retrieval count) to balance context size vs. latency.
- Cap history to the most recent 10 messages to reduce payload.
- Customize the prompt template in RetrievalQA.from_chain_type as needed (see the sketch below).
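A sketch of that last point, passing a custom prompt into the chain via chain_type_kwargs (llm and retriever as in the snippet above; the template text is illustrative):
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

template = (
    "Answer the question using only the context below. "
    "If the context is insufficient, say so.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)
prompt = PromptTemplate(template=template, input_variables=["context", "question"])

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": prompt},  # overrides the default "stuff" prompt
)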
Core Configuration
Database Engine & Session (api/backend/db.py)
import os
from dotenv import load_dotenv
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
load_dotenv()
DATABASE_URL = os.getenv("DATABASE_URL")
engine = create_engine(DATABASE_URL, pool_pre_ping=True)
SessionLocal = sessionmaker(
    autocommit=False,
    autoflush=False,
    bind=engine
)
Base = declarative_base()
- Call Base.metadata.create_all(bind=engine) to initialize your schema.
- Use SessionLocal() in a dependency or with-block to avoid connection leaks (a sketch follows).
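For example, a common FastAPI dependency built on SessionLocal (a generic pattern, not necessarily how the project wires it):
from api.backend.db import SessionLocal

def get_db():
    """Yield a database session and always close it, even if the request fails."""
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

# In an endpoint: def handler(db: Session = Depends(get_db)): ...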
Embeddings & Vector Store Initialization (api/backend/chroma.py)
from langchain.embeddings import GoogleGenerativeAIEmbeddings
from langchain.vectorstores import Chroma
def get_vectorstore(user_id: int, file_id: int = None) -> Chroma:
    # Path "chromadb/{user_id}/{file_id}" or "chromadb/{user_id}/global"
    persist_dir = f"chromadb/{user_id}/{file_id or 'global'}"
    embeddings = GoogleGenerativeAIEmbeddings()
    return Chroma(persist_directory=persist_dir, embedding_function=embeddings)
Ensure the Chroma persistence directory (CHROMA_DB_DIR) is writable and consistent across instances.
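One simple guard, shown here as a hypothetical helper, is to create the directory before opening the store:
import os

def ensure_persist_dir(user_id: int, file_id: int | None = None) -> str:
    """Create the per-user Chroma directory if it does not exist and return its path."""
    persist_dir = f"chromadb/{user_id}/{file_id or 'global'}"
    os.makedirs(persist_dir, exist_ok=True)
    return persist_dir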
API Client Wrappers (frontend/api.py)
Centralize HTTP calls so your UI code stays clean. Key functions:
- login_user(username, password) → (ok, user_id, message)
- register_user(username, password) → (ok, message)
- get_chat_history(user_id) → list[dict]
- upload_file(user_id, file) → {"file_id", "filename"}
- get_user_files(user_id) → list[dict]
- delete_file(user_id, file_id) → dict
- delete_chat_history(user_id) → dict
- get_chat_response(user_id, message, file_id=None, history=None) → {"answer", "citations"}
- get_recommended_prompts(user_id, file_id=None, n=6) → list[str]
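As an illustration, one of these wrappers might be implemented roughly like this (BASE_URL and the error handling are assumptions, not the actual frontend/api.py source):
import requests

BASE_URL = "http://localhost:8000"  # assumed backend address

def get_chat_response(user_id, message, file_id=None, history=None):
    """POST to /chat/ and normalize the response for the UI."""
    payload = {"user_id": user_id, "message": message, "file_id": file_id, "history": history or []}
    try:
        resp = requests.post(f"{BASE_URL}/chat/", json=payload, timeout=60)
        resp.raise_for_status()
        data = resp.json()
        return {"answer": data.get("answer", ""), "citations": data.get("citations", [])}
    except (requests.RequestException, ValueError) as exc:
        # Network failures and malformed JSON surface as an error answer.
        return {"answer": f"Error contacting backend: {exc}", "citations": []}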
Example usage in Streamlit:
import streamlit as st
from frontend.api import get_chat_response

resp = get_chat_response(
    st.session_state.user_id,
    st.session_state.prompt,
    file_id=st.session_state.file_id,
    history=st.session_state.chat_history[-10:]
)
st.write(resp["answer"])
for cite in resp["citations"]:
    st.write(f"Page {cite['page']}: {cite['text']}")
By following this architecture and understanding each component’s role in the RAG pipeline, you can extend extractors, swap embedding providers, adjust retrieval settings, or add new endpoints with confidence.
API Reference
1. Authentication
1.1 Register a New User
Create a user account.
Endpoint
POST /register
Request
Content-Type: application/json
{
"username": "alice",
"password": "s3cr3t"
}
Response
201 Created
{
"id": 42,
"username": "alice"
}
Errors
• 400 Bad Request – missing fields
• 409 Conflict – username already exists
cURL
curl -X POST http://localhost:8000/register \
-H "Content-Type: application/json" \
-d '{"username":"alice","password":"s3cr3t"}'
1.2 Login and Obtain Token
Authenticate and receive a Bearer token.
Endpoint
POST /login
Request
Content-Type: application/json
{
"username": "alice",
"password": "s3cr3t"
}
Response
200 OK
{
"access_token": "<jwt_token>",
"token_type": "bearer"
}
cURL
curl -X POST http://localhost:8000/login \
-H "Content-Type: application/json" \
-d '{"username":"alice","password":"s3cr3t"}'
2. File Upload & Indexing
POST /upload/
Upload a document, persist metadata, and index its contents for RAG.
Headers
Authorization: Bearer <token>
Content-Type: multipart/form-data
Form Fields
• user_id (int) – your user ID
• file (file) – PDF, DOCX, TXT, etc.
Response
200 OK
{
"file_id": 7,
"filename": "20250717123045_report.pdf"
}
cURL
curl -X POST http://localhost:8000/upload/ \
-H "Authorization: Bearer $TOKEN" \
-F "user_id=42" \
-F "file=@/path/to/report.pdf"
Python (requests)
import requests
token = "<jwt_token>"
files = {'file': open('report.pdf','rb')}
data = {'user_id': 42}
resp = requests.post(
"http://localhost:8000/upload/",
headers={"Authorization": f"Bearer {token}"},
files=files,
data=data
)
print(resp.json())
3. Chat with RAG
POST /chat/
Send a message; receive a response augmented by your indexed files.
Headers
Authorization: Bearer <token>
Content-Type: application/json
Request Body
{
"user_id": 42,
"message": "Summarize the Q2 report",
"file_id": 7 // optional: target a specific upload
}
Response
200 OK
{
"response": "The Q2 report highlights a 15% revenue increase…",
"source_chunks": [
{"text":"…revenue increased by 15%…","document":"report.pdf"},
{"text":"…net profit margin improved…","document":"report.pdf"}
]
}
cURL
curl -X POST http://localhost:8000/chat/ \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"user_id":42,"message":"Summarize the Q2 report","file_id":7}'
4. Prompt Recommendations
GET /recommend_prompt/
Retrieve auto-generated prompts based on your files.
Headers
Authorization: Bearer <token>
Query Parameters
• user_id (int) – your user ID
• file_id (int) – target file
Response
200 OK
{
"prompts": [
"What are the key findings in report.pdf?",
"List challenges mentioned in the Q2 report."
]
}
cURL
curl "http://localhost:8000/recommend_prompt/?user_id=42&file_id=7" \
-H "Authorization: Bearer $TOKEN"
5. Chat History
GET /history/{user_id}
List all past chat interactions for a user.
Headers
Authorization: Bearer <token>
Path Parameter
• user_id (int)
Response
200 OK
[
{
"id": 101,
"message": "Summarize the Q2 report",
"response": "The Q2 report…",
"timestamp": "2025-07-17T12:35:00"
},
{
"id": 102,
"message": "List action items",
"response": "1. Follow up with finance…",
"timestamp": "2025-07-17T12:40:00"
}
]
cURL
curl http://localhost:8000/history/42 \
-H "Authorization: Bearer $TOKEN"
6. File Management
6.1 List Uploaded Files
GET /files/{user_id}
Headers
Authorization: Bearer <token>
Path Parameter
• user_id (int)
Response
200 OK
[
{"file_id":7,"filename":"20250717123045_report.pdf","upload_time":"2025-07-17T12:30:00"},
{"file_id":8,"filename":"20250717124500_notes.txt","upload_time":"2025-07-17T12:45:00"}
]
cURL
curl http://localhost:8000/files/42 \
-H "Authorization: Bearer $TOKEN"
6.2 Delete a File
DELETE /files/{file_id}
Headers
Authorization: Bearer <token>
Path Parameter
• file_id (int)
Response
204 No Content
cURL
curl -X DELETE http://localhost:8000/files/7 \
-H "Authorization: Bearer $TOKEN"
All endpoints support testing with Postman or curl. Include the Authorization: Bearer <token> header for protected routes.
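For scripted testing, a requests.Session can attach that header to every call (endpoints as documented above):
import requests

session = requests.Session()
session.headers.update({"Authorization": "Bearer <jwt_token>"})  # sent with every request

files = session.get("http://localhost:8000/files/42").json()
history = session.get("http://localhost:8000/history/42").json()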
Deployment & Configuration
This section guides you through containerizing, configuring, and running the backend (FastAPI/Uvicorn) and frontend (Streamlit) services for staging or production. You’ll learn how to build images, manage environment variables, orchestrate with Docker Compose, and scale your setup.
1. Building Container Images
Use the provided Dockerfiles to create images for each service.
Backend
Dockerfile.backend installs dependencies and launches Uvicorn on port 8000.
docker build \
-t chatbot-backend:latest \
-f Dockerfile.backend \
.
Frontend
Dockerfile.frontend installs Streamlit requirements and exposes port 8501.
docker build \
-t chatbot-frontend:latest \
-f Dockerfile.frontend \
.
2. Environment Configuration
Centralize runtime settings and secrets in chatbot.env at your project root. Git-ignore this file.
# chatbot.env
OPENAI_API_KEY=sk-…
DATABASE_URL=postgres://user:pass@db:5432/chatbot
STREAMLIT_SERVER_PORT=8501
UVICORN_WORKERS=4
Refer to these variables in your Dockerfiles or application code via os.getenv.
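For example, reading these settings at startup (variable names as defined in chatbot.env above):
import os

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
DATABASE_URL = os.getenv("DATABASE_URL")
UVICORN_WORKERS = int(os.getenv("UVICORN_WORKERS", "4"))  # default matches chatbot.env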
3. Orchestrating with Docker Compose
Use docker-compose.yml to launch both services and inject environment variables.
version: "3.9"
services:
backend:
image: chatbot-backend:latest
ports:
- "${BACKEND_PORT:-8000}:8000"
env_file: chatbot.env
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
interval: 10s
retries: 3
frontend:
image: chatbot-frontend:latest
ports:
- "${FRONTEND_PORT:-8501}:8501"
env_file: chatbot.env
depends_on:
backend:
condition: service_healthy
Launching Services
# Override ports or worker count in a local .env file or CLI
export BACKEND_PORT=8080
export FRONTEND_PORT=8601
docker-compose up -d
Verify:
- Backend Docs → http://localhost:${BACKEND_PORT:-8000}/docs
- Streamlit UI → http://localhost:${FRONTEND_PORT:-8501}
4. Scaling & Performance
Scaling Backends
Run multiple backend replicas to handle increased traffic:
docker-compose up -d --scale backend=3
Load-balance at your ingress (NGINX, Traefik) or cloud load-balancer.
Tuning Uvicorn Workers
Pass UVICORN_WORKERS from chatbot.env into the container and override the start command:
backend:
  image: chatbot-backend:latest
  command: >
    uvicorn app.main:app
    --host 0.0.0.0
    --port 8000
    --workers ${UVICORN_WORKERS}
5. Cleanup and Maintenance
Stop and remove containers, networks, volumes:
docker-compose down --volumes
View real-time logs:
docker-compose logs -f backend
docker-compose logs -f frontend
Rebuild images after code changes:
docker-compose build
docker-compose up -d
Practical Tips
Enable live code reload in development by switching image: to a build: block and mounting source directories.
Keep secrets out of version control; use Docker secrets or a vault in production.
Add resource limits in Compose for production stability:
deploy:
  resources:
    limits:
      cpus: "0.50"
      memory: 512M
Monitor container health and logs via your orchestration platform or a monitoring stack (Prometheus, Grafana).
Development Guide
This guide covers coding standards, local development setup, and contribution workflow for the riqtiny/vibe-pi-releasecandidate1 project.
Coding Standards
Enforce consistency and readability across the codebase.
• Formatting
- Use Black (black .) for code formatting.
- Run isort . to sort imports.
- Target line length: 88 characters.
• Type Checking & Linting
- Annotate functions and methods with type hints.
- Run MyPy (mypy .) to catch type errors.
- Use Flake8 (flake8 .) to enforce lint rules.
• Docstrings
- Follow Google-style docstrings.
- Include parameter and return type descriptions.
Example function with type hints and docstring:
def add_vectors(a: list[float], b: list[float]) -> list[float]:
    """Element-wise addition of two vectors.

    Args:
        a: First vector of floats.
        b: Second vector of floats.

    Returns:
        A new vector where each element is the sum of corresponding elements in a and b.

    Raises:
        ValueError: If input vectors differ in length.
    """
    if len(a) != len(b):
        raise ValueError("Vectors must be the same length")
    return [x + y for x, y in zip(a, b)]
Local Development Setup
Prerequisites
- Python 3.10+
- Docker 20+ & Docker Compose
- Git
Clone & Environment
git clone https://github.com/riqtiny/vibe-pi-releasecandidate1.git
cd vibe-pi-releasecandidate1
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
.dockerignore
The project’s .dockerignore excludes sensitive and build-time files:
• .env and *.env
• __pycache__/, *.pyc
• uploads/
This ensures lean, secure Docker images.
Running Services
Backend (FastAPI):
uvicorn api.backend.main:app --reload --host 0.0.0.0 --port 8000
Frontend (Streamlit):
streamlit run api/frontend/app.py --server.port 8501
Dockerized Development
Build and run via Docker:
docker build -t vibe-pi:dev .
docker run --rm -p 8000:8000 -p 8501:8501 \
-v "$(pwd)":/app \
-e ENV=development \
vibe-pi:dev
Testing
Run the full test suite and linters before merging changes:
pytest --maxfail=1 --disable-warnings -q
black --check .
isort --check-only .
mypy .
flake8 .
Contribution Workflow
Fork & Branch
- Fork the repo.
- Create a descriptive branch: git checkout -b feature/parse-pdf.
Develop & Test
- Adhere to coding standards.
- Write or update tests under tests/.
- Ensure CI passes: tests, lint, type checks.
Commit Messages
- Use imperative tense: Add PDF parsing support.
- Reference issues: Fix #42: handle missing metadata.
Pull Request
- Open PR against main.
- Include a concise summary of changes.
- Tag reviewers and add relevant labels.
Review & Merge
- Address review comments promptly.
- Squash related commits before merge.
- Ensure the changelog is updated if applicable.
Following this guide ensures a consistent workflow, stable builds, and smooth collaboration.