RAG & Search (Postgres + pgvector) — Generic Standard
This section defines generic engineering standards for implementing:
- RAG (Retrieval-Augmented Generation) for long-form, cited answers
- Vector search for semantic similarity
- Hybrid search (lexical + semantic) for higher recall and better precision
These standards are Postgres-first:
- Postgres full-text search (tsvector, tsquery, ts_rank)
- pgvector for embeddings and KNN similarity search
This document intentionally avoids domain specifics. It focuses on architecture, contracts, and operational correctness.
Goals
- Scoped retrieval: every query must be constrained to the caller’s authorization scope (tenant/org/team/project).
- Hybrid-by-default: prefer lexical + vector retrieval with a fusion step (when both channels are available).
- Deterministic contracts: stable DB schemas, stable tool interfaces, stable return shapes.
- Operationally safe: backfills, re-indexing, and degradation modes are defined up front.
- Extensible: retrieval can serve both AI agents (tool calls) and standard API endpoints.
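The scoped-retrieval goal can be illustrated with a minimal sketch. It assumes a hypothetical `chunks` table with `tenant_id`, `tsv` (tsvector), and `embedding` (pgvector) columns, and psycopg-style `%s` placeholders. The table, column, and function names are assumptions made for illustration, not part of this standard. The point is that the scope predicate is part of every query, and user input only ever travels as a bound parameter.

```python
from typing import Sequence

# Hypothetical schema, assumed for illustration only:
#   chunks(id, tenant_id, content, tsv tsvector, embedding vector)


def lexical_query(tenant_id: str, text: str, limit: int = 20) -> tuple[str, tuple]:
    """Build a tenant-scoped full-text query; user text is a bound parameter."""
    sql = (
        "SELECT id, ts_rank(tsv, q) AS score "
        "FROM chunks, websearch_to_tsquery('english', %s) AS q "
        "WHERE tenant_id = %s AND tsv @@ q "
        "ORDER BY score DESC LIMIT %s"
    )
    return sql, (text, tenant_id, limit)


def vector_query(
    tenant_id: str, embedding: Sequence[float], limit: int = 20
) -> tuple[str, tuple]:
    """Build a tenant-scoped KNN query using pgvector's cosine distance (<=>)."""
    # pgvector accepts a text literal such as '[0.1,2.0]' cast to vector.
    vec = "[" + ",".join(str(float(x)) for x in embedding) + "]"
    sql = (
        "SELECT id, embedding <=> %s::vector AS distance "
        "FROM chunks "
        "WHERE tenant_id = %s "
        "ORDER BY embedding <=> %s::vector LIMIT %s"
    )
    return sql, (vec, tenant_id, vec, limit)
```

Neither builder offers a way to omit `tenant_id`, so a caller cannot accidentally issue an unscoped query. A real implementation would likely derive the scope from the authenticated request rather than accept it as a plain argument.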
When to use what
- Lexical search (FTS) when:
  - users expect exact term matching
  - the query contains identifiers, error strings, product SKUs, or names
  - you need explainable ranking and highlighting
- Vector search (pgvector) when:
  - users ask concept/intent questions (“how do I…”, “similar to…”, “best option for…”)
  - synonyms and paraphrases matter
  - you need semantic similarity over long text
- Hybrid search (recommended) when:
  - correctness matters and queries are mixed (terms + intent)
  - you need better recall without losing lexical precision
- RAG when:
  - answers must be grounded in internal docs/content
  - you must provide citations (source URLs/IDs) alongside the answer
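To make the citation requirement for RAG concrete, here is a minimal sketch of turning retrieved chunks into a numbered context block plus a parallel citation list. The `RetrievedChunk` type, its fields, and the `build_context` function are hypothetical names chosen for illustration. The actual return shape should follow whatever contract the service defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedChunk:
    """Hypothetical shape of one retrieved passage."""
    source_id: str
    url: str
    text: str


def build_context(chunks: list[RetrievedChunk]) -> tuple[str, list[dict]]:
    """Number each chunk so the model can cite it as [n], and keep a
    citation list that maps each marker back to its source."""
    lines = []
    citations = []
    for n, chunk in enumerate(chunks, start=1):
        lines.append(f"[{n}] {chunk.text.strip()}")
        citations.append(
            {"marker": n, "source_id": chunk.source_id, "url": chunk.url}
        )
    return "\n\n".join(lines), citations


if __name__ == "__main__":
    context, cites = build_context(
        [
            RetrievedChunk("doc-1", "https://example.com/a", "First passage."),
            RetrievedChunk("doc-2", "https://example.com/b", "Second passage."),
        ]
    )
    print(context)
    print(cites)
```

Returning the citation list separately from the prompt text means the API response can include source URLs/IDs even if the model omits a marker in its answer.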
Reference architecture
flowchart TD
    Sources[Sources] --> Ingestion[Ingestion]
    Ingestion --> Chunking[Chunking_Normalization]
    Chunking --> Embeddings[Embeddings_Generation]
    Embeddings --> Storage[(Postgres_pgvector)]
    Storage --> Retrieval[Hybrid_Retrieval]
    Retrieval --> Rerank[RRF_or_WeightedBlend]
    Rerank --> Context[Context_Formatting]
    Context --> LLM[LLM_Response]
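The fusion step in the diagram can be sketched with Reciprocal Rank Fusion (RRF), which scores each document as the sum of 1 / (k + rank) over the ranked lists it appears in. The function name, the `limit` parameter, and the tie-breaking rule below are assumptions for illustration. k = 60 is a commonly used default, not a requirement of this standard.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60, limit: int = 10) -> list[str]:
    """Fuse several ranked ID lists (best first) with Reciprocal Rank Fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score descending; break ties by ID so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))[:limit]


if __name__ == "__main__":
    lexical = ["a", "b", "c"]  # IDs from the full-text channel, best first
    vector = ["b", "d", "a"]   # IDs from the pgvector channel, best first
    print(rrf_fuse([lexical, vector]))  # ['b', 'a', 'd', 'c']
```

RRF uses only rank positions, so it does not require `ts_rank` scores and vector distances to be on comparable scales. A weighted blend, the other option named in the diagram, would need score normalization first.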
Standards that apply here
- Input safety: all user-provided strings must follow the XSS prevention and sanitization standards in:
  - docs/backend/01_security/xss_prevention.md
- Testing: retrieval/search behaviors must be covered using the backend testing conventions in:
  - docs/backend/04_testing/overview.md
  - docs/backend/04_testing/structure.md