Document Copilot
A retrieval-augmented assistant that answers plain-English questions about a corpus of SEC filings and backs each answer with citations to its sources. Built to test the core idea from the Document Copilot case study: trust, not raw model intelligence, is what makes AI usable in high-stakes knowledge work.
Concepts explored: hybrid retrieval (vector + full-text search), chunking strategy for long-form documents, grounding prompts so the model only answers from retrieved evidence, and citation as a first-class output requirement.
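As a rough illustration of three of these concepts (chunking, grounding, and citation as a required output), here is a minimal sketch. It is not the project's actual code: the word-window chunker, the `Chunk` type, the `<doc>#<index>` ID scheme, the prompt wording, and the bracketed citation format are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str  # hypothetical ID scheme: "<doc_id>#<index>"
    text: str


def chunk_text(doc_id: str, text: str, size: int = 200, overlap: int = 40) -> list[Chunk]:
    """Split text into overlapping windows of `size` words (assumed strategy)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for index, start in enumerate(range(0, len(words), step)):
        chunks.append(Chunk(f"{doc_id}#{index}", " ".join(words[start:start + size])))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def build_grounded_prompt(question: str, chunks: list[Chunk]) -> str:
    """Build a prompt that restricts the model to the retrieved evidence."""
    evidence = "\n".join(f"[{c.chunk_id}] {c.text}" for c in chunks)
    return (
        "Answer using only the evidence below. "
        "Cite the bracketed ID of every passage you rely on. "
        "If the evidence does not contain the answer, say so.\n\n"
        f"Evidence:\n{evidence}\n\nQuestion: {question}"
    )


def citations_are_valid(answer: str, chunks: list[Chunk]) -> bool:
    """True if the answer cites at least one chunk and only chunks that were retrieved."""
    cited = set(re.findall(r"\[([^\[\]]+)\]", answer))
    known = {c.chunk_id for c in chunks}
    return bool(cited) and cited <= known
```

Checking citations after generation is one way to treat them as a requirement rather than a suggestion: an answer with no citation, or with a citation to a chunk that was never retrieved, can be rejected or regenerated before it reaches the user.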
Why these decisions: retrieval rather than fine-tuning keeps answers current as filings are added and traceable to source documents; a single Postgres store with pgvector reduces operational burden for a team without dedicated infrastructure; hybrid search matches on semantic similarity as well as exact financial terms.
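To make the hybrid search idea concrete, the sketch below shows one possible way to run both searches against a single Postgres table and merge the two ranked lists. Several things here are assumptions, not details from this project: the `chunks` table and its `id`, `body`, and `embedding` columns are hypothetical, the `%(name)s` placeholders assume a psycopg-style driver, and reciprocal rank fusion is a common merging technique chosen for the example, not necessarily the one used here.

```python
from collections import defaultdict

# Hypothetical schema: chunks(id, body, embedding vector).
# `<=>` is pgvector's cosine-distance operator; smaller means more similar.
VECTOR_SQL = """
SELECT id FROM chunks
ORDER BY embedding <=> %(query_embedding)s::vector
LIMIT %(limit)s
"""

# Postgres built-in full-text search; catches exact terms and figures.
FULLTEXT_SQL = """
SELECT id FROM chunks
WHERE to_tsvector('english', body) @@ plainto_tsquery('english', %(query)s)
ORDER BY ts_rank(to_tsvector('english', body),
                 plainto_tsquery('english', %(query)s)) DESC
LIMIT %(limit)s
"""


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists: each list adds 1 / (k + rank) to an ID's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]    # stand-in for VECTOR_SQL results
    fulltext_hits = ["c", "a", "d"]  # stand-in for FULLTEXT_SQL results
    print(reciprocal_rank_fusion([vector_hits, fulltext_hits]))
```

Chunks that appear in both result lists rise to the top, while a chunk found by only one method (for example, an exact match on a line-item name) still makes it into the merged list.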