
Project Overview

Vibe-Pi combines a Streamlit chatbot frontend with a FastAPI backend to deliver context‐aware, file-driven conversational AI. It simplifies secure user management, document ingestion, and retrieval-augmented generation (RAG), all packaged for Docker-first deployment.

What is Vibe-Pi?

Vibe-Pi is an open-source toolkit that lets you:

  • Register and authenticate users
  • Upload documents for semantic indexing
  • Chat with a bot that answers using both general language models and your own files
  • Get AI-driven prompt recommendations
  • Deploy end-to-end via Docker

Problems It Solves

  • Secure, multi-user chatbot hosting
  • Integrating private documents into RAG pipelines
  • Handling file storage, indexing and retrieval out of the box
  • Suggesting effective prompts to improve AI responses
  • Streamlining setup in development and production environments

When to Choose Vibe-Pi

Choose Vibe-Pi when you need:

  • A turnkey chatbot UI backed by custom knowledge
  • User registration, login and per-user data isolation
  • Built-in prompt recommendation without building from scratch
  • Containerized deployment across teams or cloud providers

Key Features

Streamlit Chatbot UI

  • Built with Streamlit for rapid customization
  • Manages login, file upload, chat sessions and history
  • Easy to extend UI components in main.py

FastAPI Backend with User Auth

  • Endpoints for /register, /login, /upload/, /chat/, /recommend_prompt/, /history/, /files/
  • JWT-based authentication and database integration
  • File storage and semantic indexing per user

File-Aware RAG Answers

  • Ingests PDFs, TXT and common document formats
  • Builds and queries a vector index for context-rich responses
  • Falls back to base language model when no file context applies

Prompt Recommendations

  • Analyzes user queries and suggests refined prompts
  • Improves quality of AI interactions over time

Docker-First Deployment

  • Preconfigured Dockerfile and docker-compose.yaml
  • Single-command setup for development or production

Quick Start

1. Launch with Docker

# Build and start both frontend and backend
docker-compose up -d

2. Access the Chat UI

Open http://localhost:8501 in your browser, then:

  • Register a new user
  • Upload documents
  • Start chatting with file-aware AI

Example: Chat via Python Client

import requests

API_URL = "http://localhost:8000"
session = requests.Session()

# 1. Register, then log in to obtain a Bearer token
resp = session.post(f"{API_URL}/register", json={"username": "alice", "password": "secret"})
resp.raise_for_status()
user_id = resp.json()["id"]

resp = session.post(f"{API_URL}/login", json={"username": "alice", "password": "secret"})
resp.raise_for_status()
session.headers["Authorization"] = f"Bearer {resp.json()['access_token']}"

# 2. Upload a document
with open("doc.pdf", "rb") as f:
    resp = session.post(f"{API_URL}/upload/", files={"file": f}, data={"user_id": user_id})
    resp.raise_for_status()
file_id = resp.json()["file_id"]

# 3. Get a chat response
payload = {"user_id": user_id, "message": "Summarize the uploaded document", "file_id": file_id}
resp = session.post(f"{API_URL}/chat/", json=payload)
print("Bot:", resp.json()["answer"])

This setup delivers a secure, containerized RAG chatbot out of the box. Customizing UI components or backend logic only requires editing the respective main.py modules.

Quick Start

Get your chatbot running locally in minutes. Choose between a Python-based setup or containerized workflow.

1. Prerequisites

• Git
• Python 3.11 or Docker & Docker Compose
• An OpenAI API key (or equivalent)
• PostgreSQL (local or remote) for production workflows; SQLite works out of the box for dev

2. Environment Variables

Create a chatbot.env file at the project root (keep it out of version control; it is also excluded from the Docker build context via .dockerignore):

# chatbot.env
OPENAI_API_KEY=sk-...
DATABASE_URL=postgresql+asyncpg://user:pass@localhost:5432/chatbot
CHROMA_DB_DIR=./chroma-db       # local vector store path
SECRET_KEY=your-secret-key      # JWT or session signing

Load vars in your shell:

export $(grep -v '^#' chatbot.env | xargs)

3. Python Workflow

  1. Clone and install dependencies:

    git clone https://github.com/riqtiny/vibe-pi-releasecandidate1.git
    cd vibe-pi-releasecandidate1
    python -m venv .venv
    source .venv/bin/activate
    pip install --upgrade pip
    pip install -r requirements.txt
    
  2. Run the backend API:

    uvicorn api.backend.main:app \
      --reload \
      --host 0.0.0.0 \
      --port 8000
    
  3. In a second terminal, start the Streamlit frontend:

    streamlit run frontend/app.py \
      --server.port 8501 \
      --server.headless true
    
  4. Open your browser at http://localhost:8501 for the chat UI (the API runs at http://localhost:8000).

4. Docker Workflow

Leverage Docker Compose to build and orchestrate both services:

git clone https://github.com/riqtiny/vibe-pi-releasecandidate1.git
cd vibe-pi-releasecandidate1
# Ensure chatbot.env lives here
docker-compose up --build

Stop and remove containers with:

docker-compose down

5. First Interaction

  1. In the Streamlit UI, log in or set a user_id in the sidebar.
  2. Use “Upload Document” to add a PDF, DOCX, PPTX or image.
  3. Upon success, the app triggers indexing and displays a chat interface.
  4. Ask questions; the frontend sends requests to /chat/ and /recommend_prompt/ under the hood.

6. Next Steps

  • Tweak UI theme via .streamlit/config.toml
  • Add custom RAG pipelines in backend/routers/
  • Persist to production-grade Postgres by updating DATABASE_URL in chatbot.env
  • Scale with Kubernetes using the provided Dockerfiles

You’re now ready to explore advanced features: vector search tuning, multi-user auth, and deployment best practices.

Architecture & Core Concepts

This section describes the main components, how data flows from the Streamlit UI through the backend to produce RAG-powered answers, and how you can extend or debug each part.

System Components

  • Streamlit Frontend

    • Captures user inputs (login, file uploads, chat prompts)
    • Uses frontend/api.py functions to call backend endpoints
    • Renders chat history, recommended prompts, and citations
  • API Client (frontend/api.py)

    • Centralizes all HTTP calls to FastAPI
    • Transforms responses into simple return values for the UI
    • Handles network errors and JSON parsing
  • FastAPI Backend (api/backend/main.py)

    • Defines REST endpoints: /login, /register, /upload, /files, /history, /chat, /recommend_prompt
    • Orchestrates indexing, retrieval, chat history persistence, and prompt recommendations
  • Vector Store (ChromaDB via api/backend/chroma.py)

    • Stores embeddings of text chunks extracted from uploaded files
    • Supports fast similarity search for RAG retrieval
  • Database (SQLAlchemy via api/backend/db.py)

    • Persists users, file metadata, and chat history
    • Exposes a SessionLocal factory and Base for model definitions
  • Embeddings & LLM (Google GenAI)

    • Generates embeddings for document chunks
    • Powers the LLM that composes answers from retrieved context
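
A minimal sketch of how such an LLM handle might be constructed (an assumption based on the langchain_google_genai package; the repository's actual api/backend/llm.py may differ):

# Hypothetical api/backend/llm.py
from langchain_google_genai import ChatGoogleGenerativeAI

# Reads GOOGLE_API_KEY from the environment by default
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.2)  # model name is an assumption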

Data Flow

1. File Upload & Indexing

  1. User uploads a file in Streamlit; upload_file(user_id, file) posts to /upload/.
  2. FastAPI saves the file, persists metadata, then calls index_file(path, mime, metadata).
  3. index_file pipeline (api/backend/chroma.py), sketched after this list:
    • Extracts text via the appropriate extractor (PDF, DOCX, PPTX, image/OCR).
    • Wraps each page/slide/paragraph in a LangChain Document with combined user metadata.
    • Splits documents into 1,000-character chunks (100-character overlap) using RecursiveCharacterTextSplitter.
    • Embeds and stores chunks in ChromaDB with vectorstore.add_documents(chunks).
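
A minimal sketch of the chunk-and-store portion of this pipeline (illustrative only; index_texts is a hypothetical helper, and the real index_file also handles extraction and MIME detection):

from langchain.schema import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter

def index_texts(pages: list[str], metadata: dict, vectorstore) -> bool:
    """Wrap extracted page texts, chunk them, and add them to the vector store."""
    docs = [
        Document(page_content=text, metadata={**metadata, "page": i + 1})
        for i, text in enumerate(pages) if text.strip()
    ]
    if not docs:
        return False  # nothing extracted

    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_documents(docs)
    vectorstore.add_documents(chunks)
    return True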

Usage example in your upload handler:

from api.backend.chroma import index_file

path = "/tmp/uploads/lecture1.pdf"
mime = "application/pdf"
meta = {"user_id": current_user.id, "filename": "lecture1.pdf"}

if index_file(path, mime, meta):
    print("Indexed successfully")
else:
    print("Indexing failed or no text extracted")

Practical tips:

  • Set GOOGLE_API_KEY for embeddings.
  • Tune chunk_size/chunk_overlap for your documents.
  • Check the INFO-level logs for extraction summaries and warnings about unsupported file types.

2. Chat Request & RAG Pipeline

  1. UI calls get_chat_response(user_id, message, file_id, history) → POST /chat/.
  2. FastAPI /chat/ handler (api/backend/main.py) does:
    • Loads the user’s vector store (optionally scoped by file_id).
    • Builds a retriever (vectorstore.as_retriever(search_kwargs={"k":5})).
    • Constructs a RetrievalQA chain with Google GenAI LLM and the retriever.
    • Executes the chain on {"query": message}, returning both result and source_documents.
    • Extracts citations (e.g. page numbers, text snippets) from source_documents.
    • Persists user message and assistant response in the chat history table.
    • Returns JSON:
      {
        "answer": "Assistant’s reply text…",
        "citations": [
          {"page": 2, "text": "…relevant snippet…"},
          …
        ]
      }
      

Example backend snippet:

from fastapi import APIRouter
from api.backend.chroma import get_vectorstore
from api.backend.llm import llm
from langchain.chains import RetrievalQA

router = APIRouter()

@router.post("/chat/")
def chat(user_id: int, message: str, file_id: int | None = None, history: list | None = None):
    # 1. Prepare retriever
    vectorstore = get_vectorstore(user_id, file_id)
    retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

    # 2. Run RAG chain
    qa = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=retriever,
        return_source_documents=True
    )
    result = qa({"query": message})

    # 3. Extract answer & citations
    answer = result["result"]
    citations = [
        {"page": doc.metadata.get("page"), "text": doc.page_content}
        for doc in result["source_documents"]
    ]

    # 4. Persist to DB (omitted for brevity)…

    return {"answer": answer, "citations": citations}

Best practices:

  • Limit k (retrieval count) to balance context size vs. latency.
  • Cap history to the most recent 10 messages to reduce payload.
  • Customize the prompt template in RetrievalQA.from_chain_type as needed (see the sketch below).
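
A sketch of such a custom prompt (assuming the llm and retriever from the backend snippet above):

from langchain.prompts import PromptTemplate

QA_PROMPT = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "Answer the question using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    ),
)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": QA_PROMPT},  # replaces the default combine-documents prompt
)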

Core Configuration

Database Engine & Session (api/backend/db.py)

import os
from dotenv import load_dotenv
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

load_dotenv()
DATABASE_URL = os.getenv("DATABASE_URL")

engine = create_engine(DATABASE_URL, pool_pre_ping=True)

SessionLocal = sessionmaker(
    autocommit=False,
    autoflush=False,
    bind=engine
)

Base = declarative_base()

  • Call Base.metadata.create_all(bind=engine) to initialize your schema.
  • Use SessionLocal() in a dependency or with-block to avoid connection leaks (a sketch follows).
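
A common FastAPI dependency pattern for this (a sketch; adapt to the repo's routers):

from api.backend.db import SessionLocal

def get_db():
    # Yield one session per request and always close it afterwards
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

Route handlers can then declare db: Session = Depends(get_db) to receive a managed session.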

Embeddings & Vector Store Initialization (api/backend/chroma.py)

from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain.vectorstores import Chroma

def get_vectorstore(user_id: int, file_id: int = None) -> Chroma:
    # Path "chromadb/{user_id}/{file_id}" or "chromadb/{user_id}/global"
    persist_dir = f"chromadb/{user_id}/{file_id or 'global'}"
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")  # requires GOOGLE_API_KEY
    return Chroma(persist_directory=persist_dir, embedding_function=embeddings)

Ensure the Chroma persist directory is writable and consistent across instances.


API Client Wrappers (frontend/api.py)

Centralize HTTP calls so your UI code stays clean. Key functions:

  • login_user(username, password) → (ok, user_id, message)
  • register_user(username, password) → (ok, message)
  • get_chat_history(user_id) → list[dict]
  • upload_file(user_id, file) → {"file_id", "filename"}
  • get_user_files(user_id) → list[dict]
  • delete_file(user_id, file_id) → dict
  • delete_chat_history(user_id) → dict
  • get_chat_response(user_id, message, file_id=None, history=None) → {"answer", "citations"}
  • get_recommended_prompts(user_id, file_id=None, n=6) → list[str]
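
A minimal sketch of one such wrapper (assuming the /chat/ request body shown in the API Reference plus the history field above; the real module also handles network errors and JSON parsing):

import requests

API_URL = "http://localhost:8000"

def get_chat_response(user_id, message, file_id=None, history=None):
    payload = {
        "user_id": user_id,
        "message": message,
        "file_id": file_id,
        "history": history or [],
    }
    resp = requests.post(f"{API_URL}/chat/", json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()  # expected keys: "answer", "citations"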

Example usage in Streamlit:

from frontend.api import get_chat_response

resp = get_chat_response(
    st.session_state.user_id,
    st.session_state.prompt,
    file_id=st.session_state.file_id,
    history=st.session_state.chat_history[-10:]
)
st.write(resp["answer"])
for cite in resp["citations"]:
    st.write(f"Page {cite['page']}: {cite['text']}")

By following this architecture and understanding each component’s role in the RAG pipeline, you can extend extractors, swap embedding providers, adjust retrieval settings, or add new endpoints with confidence.

API Reference

1. Authentication

1.1 Register a New User

Create a user account.

Endpoint
POST /register

Request
Content-Type: application/json

{
  "username": "alice",
  "password": "s3cr3t"
}

Response
201 Created

{
  "id": 42,
  "username": "alice"
}

Errors
• 400 Bad Request – missing fields
• 409 Conflict – username already exists

cURL

curl -X POST http://localhost:8000/register \
  -H "Content-Type: application/json" \
  -d '{"username":"alice","password":"s3cr3t"}'

1.2 Login and Obtain Token

Authenticate and receive a Bearer token.

Endpoint
POST /login

Request
Content-Type: application/json

{
  "username": "alice",
  "password": "s3cr3t"
}

Response
200 OK

{
  "access_token": "<jwt_token>",
  "token_type": "bearer"
}

cURL

curl -X POST http://localhost:8000/login \
  -H "Content-Type: application/json" \
  -d '{"username":"alice","password":"s3cr3t"}'

2. File Upload & Indexing

POST /upload/

Upload a document, persist metadata, and index its contents for RAG.

Headers
Authorization: Bearer <token>
Content-Type: multipart/form-data

Form Fields
• user_id (int) – your user ID
• file (file) – PDF, DOCX, TXT, etc.

Response
200 OK

{
  "file_id": 7,
  "filename": "20250717123045_report.pdf"
}

cURL

curl -X POST http://localhost:8000/upload/ \
  -H "Authorization: Bearer $TOKEN" \
  -F "user_id=42" \
  -F "file=@/path/to/report.pdf"

Python (requests)

import requests
token = "<jwt_token>"
files = {'file': open('report.pdf','rb')}
data = {'user_id': 42}
resp = requests.post(
    "http://localhost:8000/upload/",
    headers={"Authorization": f"Bearer {token}"},
    files=files,
    data=data
)
print(resp.json())

3. Chat with RAG

POST /chat/

Send a message; receive a response augmented by your indexed files.

Headers
Authorization: Bearer <token>
Content-Type: application/json

Request Body

{
  "user_id": 42,
  "message": "Summarize the Q2 report",
  "file_id": 7         // optional: target a specific upload
}

Response
200 OK

{
  "response": "The Q2 report highlights a 15% revenue increase…",
  "source_chunks": [
    {"text":"…revenue increased by 15%…","document":"report.pdf"},
    {"text":"…net profit margin improved…","document":"report.pdf"}
  ]
}

cURL

curl -X POST http://localhost:8000/chat/ \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"user_id":42,"message":"Summarize the Q2 report","file_id":7}'

4. Prompt Recommendations

GET /recommend_prompt/

Retrieve auto-generated prompts based on your files.

Headers
Authorization: Bearer <token>

Query Parameters
• user_id (int) – your user ID
• file_id (int) – target file

Response
200 OK

{
  "prompts": [
    "What are the key findings in report.pdf?",
    "List challenges mentioned in the Q2 report."
  ]
}

cURL

curl "http://localhost:8000/recommend_prompt/?user_id=42&file_id=7" \
  -H "Authorization: Bearer $TOKEN"

5. Chat History

GET /history/{user_id}

List all past chat interactions for a user.

Headers
Authorization: Bearer <token>

Path Parameter
• user_id (int)

Response
200 OK

[
  {
    "id": 101,
    "message": "Summarize the Q2 report",
    "response": "The Q2 report…",
    "timestamp": "2025-07-17T12:35:00"
  },
  {
    "id": 102,
    "message": "List action items",
    "response": "1. Follow up with finance…",
    "timestamp": "2025-07-17T12:40:00"
  }
]

cURL

curl http://localhost:8000/history/42 \
  -H "Authorization: Bearer $TOKEN"

6. File Management

6.1 List Uploaded Files

GET /files/{user_id}

Headers
Authorization: Bearer <token>

Path Parameter
• user_id (int)

Response
200 OK

[
  {"file_id":7,"filename":"20250717123045_report.pdf","upload_time":"2025-07-17T12:30:00"},
  {"file_id":8,"filename":"20250717124500_notes.txt","upload_time":"2025-07-17T12:45:00"}
]

cURL

curl http://localhost:8000/files/42 \
  -H "Authorization: Bearer $TOKEN"

6.2 Delete a File

DELETE /files/{file_id}

Headers
Authorization: Bearer <token>

Path Parameter
• file_id (int)

Response
204 No Content

cURL

curl -X DELETE http://localhost:8000/files/7 \
  -H "Authorization: Bearer $TOKEN"

All endpoints support testing with Postman or curl. Include the Authorization: Bearer <token> header for protected routes.

Deployment & Configuration

This section guides you through containerizing, configuring, and running the backend (FastAPI/Uvicorn) and frontend (Streamlit) services for staging or production. You’ll learn how to build images, manage environment variables, orchestrate with Docker Compose, and scale your setup.

1. Building Container Images

Use the provided Dockerfiles to create images for each service.

Backend

Dockerfile.backend installs dependencies and launches Uvicorn on port 8000.

docker build \
  -t chatbot-backend:latest \
  -f Dockerfile.backend \
  .

Frontend

Dockerfile.frontend installs Streamlit requirements and exposes port 8501.

docker build \
  -t chatbot-frontend:latest \
  -f Dockerfile.frontend \
  .

2. Environment Configuration

Centralize runtime settings and secrets in chatbot.env at your project root. Git-ignore this file.

# chatbot.env
OPENAI_API_KEY=sk-…
DATABASE_URL=postgres://user:pass@db:5432/chatbot
STREAMLIT_SERVER_PORT=8501
UVICORN_WORKERS=4

Refer to these variables in your Dockerfiles or application code via os.getenv.
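
For example, application code can read them like this (illustrative):

import os

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
DATABASE_URL = os.getenv("DATABASE_URL")
UVICORN_WORKERS = int(os.getenv("UVICORN_WORKERS", "4"))  # default to 4 workers if unset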

3. Orchestrating with Docker Compose

Use docker-compose.yml to launch both services and inject environment variables.

version: "3.9"

services:
  backend:
    image: chatbot-backend:latest
    ports:
      - "${BACKEND_PORT:-8000}:8000"
    env_file: chatbot.env
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 10s
      retries: 3

  frontend:
    image: chatbot-frontend:latest
    ports:
      - "${FRONTEND_PORT:-8501}:8501"
    env_file: chatbot.env
    depends_on:
      backend:
        condition: service_healthy

Launching Services

# Override ports or worker count in a local .env file or CLI
export BACKEND_PORT=8080
export FRONTEND_PORT=8601

docker-compose up -d

Verify that the backend answers at http://localhost:8080/health and the UI loads at http://localhost:8601 (substitute the ports you exported above).

4. Scaling & Performance

Scaling Backends

Run multiple backend replicas to handle increased traffic:

docker-compose up -d --scale backend=3

Load-balance at your ingress (NGINX, Traefik) or cloud load-balancer.

Tuning Uvicorn Workers

Pass UVICORN_WORKERS from chatbot.env into the container and override the start command:

backend:
  image: chatbot-backend:latest
  command: >
    uvicorn app.main:app
    --host 0.0.0.0
    --port 8000
    --workers ${UVICORN_WORKERS}

5. Cleanup and Maintenance

  • Stop and remove containers, networks, volumes:

    docker-compose down --volumes
    
  • View real-time logs:

    docker-compose logs -f backend
    docker-compose logs -f frontend
    
  • Rebuild images after code changes:

    docker-compose build
    docker-compose up -d
    

Practical Tips

  • Enable live code reload in development by switching image: to a build: block and mounting source directories.

  • Keep secrets out of version control; use Docker secrets or a vault in production.

  • Add resource limits in Compose for production stability:

    deploy:
      resources:
        limits:
          cpus: "0.50"
          memory: 512M
    
  • Monitor container health and logs via your orchestration platform or a monitoring stack (Prometheus, Grafana).

Development Guide

This guide covers coding standards, local development setup, and contribution workflow for the riqtiny/vibe-pi-releasecandidate1 project.

Coding Standards

Enforce consistency and readability across the codebase.

Formatting

  • Use Black (black .) for code formatting.
  • Run isort . to sort imports.
  • Target line length: 88 characters.

Type Checking & Linting

  • Annotate functions and methods with type hints.
  • Run MyPy (mypy .) to catch type errors.
  • Use Flake8 (flake8 .) to enforce lint rules.

Docstrings

  • Follow Google-style docstrings.
  • Include parameter and return type descriptions.

Example function with type hints and docstring:

def add_vectors(a: list[float], b: list[float]) -> list[float]:
    """Element-wise addition of two vectors.

    Args:
        a: First vector of floats.
        b: Second vector of floats.

    Returns:
        A new vector where each element is the sum of corresponding elements in a and b.

    Raises:
        ValueError: If input vectors differ in length.
    """
    if len(a) != len(b):
        raise ValueError("Vectors must be the same length")
    return [x + y for x, y in zip(a, b)]

Local Development Setup

Prerequisites

  • Python 3.10+
  • Docker 20+ & Docker Compose
  • Git

Clone & Environment

git clone https://github.com/riqtiny/vibe-pi-releasecandidate1.git
cd vibe-pi-releasecandidate1
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

.dockerignore

The project’s .dockerignore excludes sensitive and build-time files:

.env and *.env
__pycache__/, *.pyc
uploads/

This ensures lean, secure Docker images.

Running Services

Backend (FastAPI):

uvicorn api.backend.main:app --reload --host 0.0.0.0 --port 8000

Frontend (Streamlit):

streamlit run api/frontend/app.py --server.port 8501

Dockerized Development

Build and run via Docker:

docker build -t vibe-pi:dev .
docker run --rm -p 8000:8000 -p 8501:8501 \
  -v "$(pwd)":/app \
  -e ENV=development \
  vibe-pi:dev

Testing

Run the full test suite and linters before merging changes:

pytest --maxfail=1 --disable-warnings -q
black --check .
isort --check-only .
mypy .
flake8 .
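
A minimal example test in the same pytest style (the import path for add_vectors is hypothetical; place the file under tests/):

# tests/test_vectors.py
import pytest

from vibe_pi.utils import add_vectors  # adjust to wherever add_vectors lives

def test_add_vectors_elementwise():
    assert add_vectors([1.0, 2.0], [3.0, 4.0]) == [4.0, 6.0]

def test_add_vectors_length_mismatch():
    with pytest.raises(ValueError):
        add_vectors([1.0], [1.0, 2.0])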

Contribution Workflow

  1. Fork & Branch

    • Fork the repo.
    • Create a descriptive branch: git checkout -b feature/parse-pdf.
  2. Develop & Test

    • Adhere to coding standards.
    • Write or update tests under tests/.
    • Ensure CI passes: tests, lint, type checks.
  3. Commit Messages

    • Use the imperative mood: Add PDF parsing support.
    • Reference issues: Fix #42: handle missing metadata.
  4. Pull Request

    • Open PR against main.
    • Include a concise summary of changes.
    • Tag reviewers and add relevant labels.
  5. Review & Merge

    • Address review comments promptly.
    • Squash related commits before merge.
    • Ensure the changelog is updated if applicable.

Following this guide ensures a consistent workflow, stable builds, and smooth collaboration.