DocChat — RAG Assistant

A grounded RAG assistant that answers from your documents with citations, with a 0% hallucination rate on out-of-scope queries across its test set.

Live demo · Source code

React 19
TypeScript
Express
Gemini
pgvector
Postgres
SSE

A retrieval-augmented chat app over your own documents. I built the full ingestion pipeline (multi-format parsing, recursive chunking, and batch embedding into pgvector) and a real-time SSE endpoint that streams the retrieval → generation → citation flow, with live controls for top-K and similarity threshold.
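As a rough illustration of the streaming flow, the sketch below frames each stage as a Server-Sent Events message. The stage names, payload shapes, and the `streamAnswer` helper are assumptions made for this example; the project's actual event names are not stated here.

```typescript
// Hypothetical stage names; the real endpoint's event names may differ.
type Stage = "retrieval" | "generation" | "citation";

// Serialize one SSE frame: an `event:` line, a `data:` line, then a blank line.
function formatSseEvent(stage: Stage, payload: unknown): string {
  // JSON.stringify escapes newlines, so the payload stays on a single data line.
  return `event: ${stage}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Emit the three stages in order through any writer,
// for example the write method of an HTTP response.
function streamAnswer(
  write: (frame: string) => void,
  chunkIds: number[],
  tokens: string[],
): void {
  write(formatSseEvent("retrieval", { chunkIds }));
  for (const token of tokens) {
    write(formatSseEvent("generation", { token }));
  }
  write(formatSseEvent("citation", { chunkIds }));
}
```

On a real endpoint the response would also need the `Content-Type: text/event-stream` header so the browser's `EventSource` treats it as a stream.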

// Problem

General-purpose chatbots confidently make things up, which makes them unusable when answers have to come from a specific, trusted set of documents. I wanted an assistant that answers only from the source material and clearly says when it can't.

// Approach

Strict grounding prompts plus citation extraction make every answer trace back to a chunk, and an out-of-scope guard declines rather than guesses. Together they achieved a 0% hallucination rate on out-of-scope queries across the test set.
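A minimal sketch of how a guard and citation extraction like this could fit together. It assumes the guard declines when no retrieved chunk clears the similarity threshold and that the model marks citations as `[n]`. The type names, the refusal text, the marker format, and the synchronous `generate` callback are all hypothetical stand-ins, not the project's actual code.

```typescript
// Hypothetical shape of a retrieved chunk; field names are assumptions.
interface RetrievedChunk {
  id: number;
  text: string;
  similarity: number; // cosine similarity to the query, higher is closer
}

interface GroundedAnswer {
  text: string;
  citations: number[];
}

// Hypothetical refusal text returned instead of calling the model.
const REFUSAL = "I can't answer that from the provided documents.";

// Collect [n] markers from the answer, keeping only ids that match a chunk
// in the context, de-duplicated in order of first appearance.
function extractCitations(answer: string, context: RetrievedChunk[]): number[] {
  const knownIds = context.map((c) => c.id);
  const cited: number[] = [];
  const marker = /\[(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = marker.exec(answer)) !== null) {
    const id = Number(match[1]);
    if (knownIds.indexOf(id) !== -1 && cited.indexOf(id) === -1) cited.push(id);
  }
  return cited;
}

// Out-of-scope guard: if nothing clears the threshold, decline without
// calling the model; otherwise generate from the surviving chunks only.
function guardedAnswer(
  chunks: RetrievedChunk[],
  threshold: number,
  generate: (context: RetrievedChunk[]) => string, // stand-in for the model call
): GroundedAnswer {
  const context = chunks.filter((c) => c.similarity >= threshold);
  if (context.length === 0) return { text: REFUSAL, citations: [] };
  const text = generate(context);
  return { text, citations: extractCitations(text, context) };
}
```

Dropping citation ids that do not match a chunk in the context means a marker the model invents never reaches the UI as a source.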

// Architecture

Ingestion parses PDF/MD/TXT files, chunks them recursively into 512-token pieces with a 50-token overlap, and batch-embeds them via gemini-embedding-001 into pgvector on Neon serverless Postgres. A single SSE endpoint streams the whole retrieval → generation → citation flow, with UI controls to tune top-K and the cosine-similarity threshold live.
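The chunking step can be sketched roughly as follows. This is an illustration under stated assumptions rather than the project's implementation: tokens are approximated by whitespace-separated words (the embedding model's real tokenizer will count differently), the separator hierarchy is a guess, and `chunkText`, `splitRecursive`, and `countTokens` are hypothetical names.

```typescript
// Assumed separator hierarchy, coarsest first: paragraphs, lines, sentences, words.
const SEPARATORS: Array<string | RegExp> = ["\n\n", "\n", ". ", /\s+/];

// Stand-in token counter: whitespace-separated words, not a real tokenizer.
function countTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Recursively split text until every piece fits in maxTokens,
// trying coarse separators before fine ones.
function splitRecursive(text: string, maxTokens: number, level = 0): string[] {
  const trimmed = text.trim();
  if (trimmed === "") return [];
  if (countTokens(trimmed) <= maxTokens || level >= SEPARATORS.length) return [trimmed];
  let out: string[] = [];
  for (const part of trimmed.split(SEPARATORS[level])) {
    out = out.concat(splitRecursive(part, maxTokens, level + 1));
  }
  return out;
}

// Pack pieces into chunks of at most maxTokens, seeding each new chunk with
// the last `overlap` tokens of the previous one. Separators are normalized
// to single spaces, so this sketch does not preserve original whitespace.
function chunkText(text: string, maxTokens = 512, overlap = 50): string[] {
  if (overlap < 0 || overlap >= maxTokens) {
    throw new Error("overlap must be between 0 and maxTokens - 1");
  }
  // Pieces are capped at maxTokens - overlap so one always fits after the overlap.
  const pieces = splitRecursive(text, maxTokens - overlap);
  const chunks: string[] = [];
  let current: string[] = [];
  let fresh = 0; // tokens in `current` that are not carried-over overlap
  for (const piece of pieces) {
    const tokens = piece.split(/\s+/);
    if (fresh > 0 && current.length + tokens.length > maxTokens) {
      chunks.push(current.join(" "));
      current = overlap > 0 ? current.slice(-overlap) : [];
      fresh = 0;
    }
    current = current.concat(tokens);
    fresh += tokens.length;
  }
  if (fresh > 0) chunks.push(current.join(" "));
  return chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some tokens twice.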

// Outcome

Tuning retrieval against a 12-case test set lifted answer accuracy from 43% to 70%, and the live similarity / top-K controls made the trade-offs visible and debuggable instead of buried in config.
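To make the two controls concrete, here is a small in-memory sketch of how top-K and a cosine-similarity threshold interact. It is illustrative only: in the app this ranking presumably runs inside Postgres (pgvector provides a cosine-distance operator, `<=>`), and the `StoredChunk`, `ScoredChunk`, and `retrieve` names are hypothetical.

```typescript
// Hypothetical shapes; in the app the embeddings live in pgvector.
interface StoredChunk {
  id: number;
  embedding: number[];
}

interface ScoredChunk {
  id: number;
  similarity: number;
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

// Drop chunks below the threshold, then keep the topK most similar.
function retrieve(
  query: number[],
  chunks: StoredChunk[],
  topK: number,
  threshold: number,
): ScoredChunk[] {
  return chunks
    .map((c) => ({ id: c.id, similarity: cosineSimilarity(query, c.embedding) }))
    .filter((c) => c.similarity >= threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, topK);
}
```

Raising the threshold shrinks the candidate pool before top-K applies, so a high threshold with a large K can still return only one or two chunks. That interaction is the kind of trade-off the live controls expose.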