flowCreate.solutions

RAG & Search (Postgres + pgvector) — Generic Standard

This section defines generic engineering standards for implementing:

  • RAG (Retrieval-Augmented Generation) for long-form, cited answers
  • Vector search for semantic similarity
  • Hybrid search (lexical + semantic) for higher recall and better precision

These standards are Postgres-first:

  • Postgres full-text search (tsvector, tsquery, ts_rank)
  • pgvector for embeddings and KNN similarity search

This document intentionally avoids domain specifics. It focuses on architecture, contracts, and operational correctness.

Goals

  • Scoped retrieval: every query must be constrained to the caller’s authorization scope (tenant/org/team/project).
  • Hybrid-by-default: prefer lexical + vector retrieval with a fusion step (when both channels are available).
  • Deterministic contracts: stable DB schemas, stable tool interfaces, stable return shapes.
  • Operationally safe: backfills, re-indexing, and degradation modes are defined up front.
  • Extensible: retrieval can serve both AI agents (tool calls) and standard API endpoints.
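
The scoping and contract goals above can be sketched as query builders that refuse to run without a scope and always return the same columns. This is a minimal sketch under stated assumptions, not a prescribed implementation: the table `doc_chunks`, its columns (`tenant_id`, `project_id`, `search_vector`, `embedding`, `content`), the `Scope` type, and the `%(name)s` placeholder style (as used by psycopg) are all hypothetical. The SQL relies only on standard Postgres full-text functions and the pgvector cosine-distance operator `<=>`.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Scope:
    """Caller's authorization scope (hypothetical shape)."""
    tenant_id: str
    project_id: Optional[str] = None


def _scope_filter(scope: Scope) -> Tuple[str, Dict[str, Any]]:
    # Every query goes through this helper, so an unscoped query cannot be built.
    if not scope.tenant_id:
        raise ValueError("tenant_id is required; unscoped retrieval is not allowed")
    clauses = ["tenant_id = %(tenant_id)s"]
    params: Dict[str, Any] = {"tenant_id": scope.tenant_id}
    if scope.project_id is not None:
        clauses.append("project_id = %(project_id)s")
        params["project_id"] = scope.project_id
    return " AND ".join(clauses), params


def lexical_query(scope: Scope, text: str, limit: int = 20) -> Tuple[str, Dict[str, Any]]:
    """Build a scoped full-text query; user text is passed only as a bound parameter."""
    where, params = _scope_filter(scope)
    params.update({"q": text, "limit": limit})
    sql = (
        "SELECT chunk_id, source_id, content, "
        "ts_rank(search_vector, websearch_to_tsquery('english', %(q)s)) AS score "
        "FROM doc_chunks "
        f"WHERE {where} AND search_vector @@ websearch_to_tsquery('english', %(q)s) "
        "ORDER BY score DESC LIMIT %(limit)s"
    )
    return sql, params


def vector_query(scope: Scope, embedding: Sequence[float], limit: int = 20) -> Tuple[str, Dict[str, Any]]:
    """Build a scoped KNN query; score is cosine similarity (1 - cosine distance)."""
    where, params = _scope_filter(scope)
    # pgvector accepts a text literal such as '[0.1,0.2]' cast to the vector type.
    params.update({"emb": "[" + ",".join(str(x) for x in embedding) + "]", "limit": limit})
    sql = (
        "SELECT chunk_id, source_id, content, "
        "1 - (embedding <=> %(emb)s::vector) AS score "
        "FROM doc_chunks "
        f"WHERE {where} "
        "ORDER BY embedding <=> %(emb)s::vector LIMIT %(limit)s"
    )
    return sql, params
```

Both builders select the same four columns, so agent tools and API endpoints can share one result shape regardless of which channel produced a hit.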

When to use what

  • Lexical search (FTS) when:

    • users expect exact term matching
    • the query contains identifiers, error strings, product SKUs, or names
    • you need explainable ranking and highlighting
  • Vector search (pgvector) when:

    • users ask concept/intent questions (“how do I…”, “similar to…”, “best option for…”)
    • synonyms and paraphrases matter
    • you need semantic similarity over long text that has been chunked and embedded
  • Hybrid search (recommended) when:

    • correctness matters and queries mix exact terms with intent
    • you need better recall without losing lexical precision
  • RAG when:

    • answers must be grounded in internal docs/content
    • you must provide citations (source URLs/IDs) alongside the answer
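
For the RAG case, the citation requirement can be illustrated with a small context formatter that numbers each retrieved chunk and returns a matching citation table. This is a sketch under stated assumptions: `RetrievedChunk`, its field names, and the `[n]` marker convention are hypothetical, not part of this standard.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple, Union


@dataclass(frozen=True)
class RetrievedChunk:
    """One retrieved passage (hypothetical field names)."""
    source_id: str
    source_url: str
    text: str


def build_context(
    chunks: Sequence[RetrievedChunk],
) -> Tuple[str, List[Dict[str, Union[int, str]]]]:
    """Return the prompt context plus a citation table keyed by marker number."""
    blocks: List[str] = []
    citations: List[Dict[str, Union[int, str]]] = []
    for n, chunk in enumerate(chunks, start=1):
        # Each block carries its marker so the model can cite it as [n].
        blocks.append(f"[{n}] (source: {chunk.source_id})\n{chunk.text.strip()}")
        citations.append(
            {"marker": n, "source_id": chunk.source_id, "source_url": chunk.source_url}
        )
    return "\n\n".join(blocks), citations
```

Returning the citation table separately from the context string lets the caller attach source URLs/IDs to the final answer even if the model omits a marker.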

Reference architecture

flowchart TD
  Sources[Sources] --> Ingestion[Ingestion]
  Ingestion --> Chunking[Chunking_Normalization]
  Chunking --> Embeddings[Embeddings_Generation]
  Embeddings --> Storage[(Postgres_pgvector)]
  Storage --> Retrieval[Hybrid_Retrieval]
  Retrieval --> Rerank[RRF_or_WeightedBlend]
  Rerank --> Context[Context_Formatting]
  Context --> LLM[LLM_Response]
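
The fusion step in the diagram (RRF_or_WeightedBlend) can be sketched with Reciprocal Rank Fusion, where each chunk scores the sum of 1 / (k + rank) over the ranked lists it appears in. This is a minimal sketch, not a mandated implementation: the function name `rrf_fuse`, the use of chunk IDs as strings, and the default `k = 60` (a commonly used constant) are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60, limit: int = 10) -> List[str]:
    """Fuse ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Assumes each input list is ordered best-first and contains no duplicate IDs.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID so output is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [chunk_id for chunk_id, _ in ordered[:limit]]


lexical_hits = ["a", "b", "c"]   # best-first IDs from the full-text channel
vector_hits = ["b", "d", "a"]    # best-first IDs from the pgvector channel
fused = rrf_fuse([lexical_hits, vector_hits])
```

Because RRF uses only ranks, it does not need `ts_rank` and vector similarity scores to be on a comparable scale. If one channel is unavailable, passing a single list (or an empty list for the missing channel) returns the remaining channel's order unchanged, which gives a simple degradation mode.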

Standards that apply here

  • Input safety: all user-provided strings must be handled according to the XSS prevention and sanitization standards in:
    • docs/backend/01_security/xss_prevention.md
  • Testing: retrieval and search behaviors must be covered by tests that follow the backend testing conventions in:
    • docs/backend/04_testing/overview.md
    • docs/backend/04_testing/structure.md