RAG & Search — Entity Hybrid Search
This page defines how to implement hybrid search for structured entities (catalog items, FAQs, listings, services, etc.) using:
- Postgres full-text search (search_vector)
- pgvector embeddings (embedding)
- a deterministic ranking fusion
Unlike RAG chunk retrieval, entity search typically returns compact objects for selection and UI display.
Data model requirements
Each entity table that supports hybrid search should include:
- scope_id (tenant/org/project identifier)
- embedding vector(<dim>) (nullable)
- search_vector tsvector (nullable or stored/generated)
- embedding_version (string) (recommended)
Search text generation (standard)
Define a deterministic function generate_entity_search_text(entity) -> str.
Rules:
- Include only fields intended to be searchable.
- Add stable labels to reduce ambiguity.
- Keep field ordering stable.
- Avoid large non-informative fields (raw HTML, giant blobs).
Example pattern:
name: <name>
description: <description>
tags: <tag1> <tag2> <tag3>
guidance: <guidance>
Use the output for:
- embedding input text
- to_tsvector(...) input text
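The rules above can be sketched as a small Python function. This is a minimal illustration, not a definitive implementation: the field list, the MAX_FIELD_CHARS cap, and the dict-shaped entity are assumptions made for the example.

```python
from typing import Any, Mapping

# Assumed field order for illustration; it is fixed so that the same entity
# always produces the same text regardless of dict key order.
SEARCHABLE_FIELDS = ("name", "description", "tags", "guidance")
MAX_FIELD_CHARS = 2000  # assumed cap that keeps very large fields out of the text


def generate_entity_search_text(entity: Mapping[str, Any]) -> str:
    """Build labeled, deterministically ordered search text for one entity."""
    lines = []
    for field in SEARCHABLE_FIELDS:
        value = entity.get(field)
        if value is None:
            continue
        if isinstance(value, (list, tuple)):
            # Multi-valued fields (e.g. tags) are joined with single spaces.
            text = " ".join(str(v).strip() for v in value if str(v).strip())
        else:
            # Collapse runs of whitespace so formatting noise does not change the text.
            text = " ".join(str(value).split())
        if not text:
            continue
        lines.append(f"{field}: {text[:MAX_FIELD_CHARS]}")
    return "\n".join(lines)
```

Fields outside SEARCHABLE_FIELDS are ignored, so adding a non-searchable column to the entity does not change the generated text.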
Write-path standards (keeping indexes current)
Create
Recommended approach:
- Persist the entity row first.
- Populate embedding and search_vector in a second step.
Failure handling:
- If embedding fails, do not fail the entity creation by default.
- Mark indexing status so it can be backfilled.
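A minimal sketch of the second step of the create path, assuming an in-memory row object and an injected embed callable. The EntityRow shape and the index_status values are hypothetical; search_vector is left out because it would normally be computed in Postgres via to_tsvector.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class EntityRow:
    """In-memory stand-in for an already persisted entity row (hypothetical shape)."""
    id: int
    search_text: str
    embedding: Optional[Sequence[float]] = None
    index_status: str = "pending"  # assumed values: pending | indexed | needs_backfill


def index_after_create(
    row: EntityRow, embed: Callable[[str], Sequence[float]]
) -> EntityRow:
    """Populate derived fields after the row exists; never raise to the caller."""
    try:
        row.embedding = embed(row.search_text)
        row.index_status = "indexed"
    except Exception:
        # Embedding failed: keep the entity and flag it for a later backfill job.
        row.embedding = None
        row.index_status = "needs_backfill"
    return row
```

A periodic job can then select rows whose status is needs_backfill and retry the embedding call.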
Update
Recompute derived fields only when a searchable field changed.
Rules:
- If a non-searchable field changes, do not regenerate embedding/search_vector.
- If regeneration fails, either:
  - fail the update (strict mode), or
  - allow the update and mark for backfill (resilient mode)
Pick one and document it; default preference is resilient mode for non-critical search.
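The update rules might look like the following sketch. The SEARCHABLE_FIELDS set, the reindex callable, and the returned status strings are assumptions for illustration only.

```python
from typing import Any, Callable, Mapping

# Assumed set of fields that feed the search text.
SEARCHABLE_FIELDS = frozenset({"name", "description", "tags", "guidance"})


def needs_reindex(old: Mapping[str, Any], new: Mapping[str, Any]) -> bool:
    """True only if at least one searchable field differs between versions."""
    return any(old.get(f) != new.get(f) for f in SEARCHABLE_FIELDS)


def apply_update(
    old: Mapping[str, Any],
    new: Mapping[str, Any],
    reindex: Callable[[Mapping[str, Any]], None],
    strict: bool = False,
) -> str:
    """Return the resulting index status for an update from old to new."""
    if not needs_reindex(old, new):
        return "unchanged"  # non-searchable change: skip regeneration entirely
    try:
        reindex(new)
        return "indexed"
    except Exception:
        if strict:
            raise  # strict mode: the caller fails the whole update
        return "needs_backfill"  # resilient mode: keep the update, backfill later
```

Making the mode an explicit parameter keeps the chosen behavior visible at each call site.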
Read-path standards (search API)
Inputs
Required:
- scope_id
- query (free text)
- pagination (page, limit) or cursor-based pagination
Outputs
Return a compact shape:
- entity id
- display fields (e.g., name, description, image_url, url)
- optional: scoring/debug fields in non-production or admin-only endpoints
Do not return heavy blobs by default.
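One way to keep the response compact is a small result type whose debug field is dropped unless explicitly requested. The EntityHit class and the include_debug flag below are hypothetical names used only for this sketch.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class EntityHit:
    """Compact search result: id plus display fields (hypothetical shape)."""
    id: str
    name: str
    description: str = ""
    image_url: Optional[str] = None
    url: Optional[str] = None
    score: Optional[float] = None  # debug-only; hidden by default


def to_response(hit: EntityHit, include_debug: bool = False) -> dict:
    """Serialize a hit, omitting scoring fields unless debug output is enabled."""
    payload = asdict(hit)
    if not include_debug:
        payload.pop("score")
    return payload
```

An admin-only endpoint would pass include_debug=True; public endpoints would use the default.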
Hybrid search implementation (recommended)
Step 1: Full-text search (FTS)
Standards:
- query via plainto_tsquery
- rank via ts_rank / ts_rank_cd
- retrieve up to k_lexical candidates (often limit * 2 or limit * 5)
Minimal SQL pattern:
SELECT id,
ts_rank(search_vector, plainto_tsquery('english', :q)) AS rank
FROM entities
WHERE scope_id = :scope_id
AND search_vector @@ plainto_tsquery('english', :q)
ORDER BY rank DESC
LIMIT :k_lexical;
Step 2: Semantic search (pgvector)
Standards:
- embed the query text once
- retrieve up to
k_vectorcandidates (oftenlimit * 2orlimit * 5) - include only rows where
embedding IS NOT NULL
Minimal SQL pattern:
SELECT id,
1 - (embedding <=> :query_embedding) AS similarity  -- <=> is cosine distance in pgvector
FROM entities
WHERE scope_id = :scope_id
AND embedding IS NOT NULL
ORDER BY embedding <=> :query_embedding
LIMIT :k_vector;
Step 3: Fuse and rank
Choose one default method and apply consistently:
- Weighted blend (common for entity search):
  - normalize lexical rank scores to 0–1 within the result set
  - combine with semantic similarity using weights
  - apply a “both channels” bonus when an entity appears in both lists
  - apply deterministic tie-breakers (e.g., business priority, then recency)
- RRF (also acceptable for entities):
  - works well when you primarily want rank-based robustness
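Both fusion methods can be sketched in a few lines of Python. The weights, the bonus value, and the max-based normalization are illustrative assumptions. The entity id stands in for the business-priority and recency tie-breakers, and k = 60 is a commonly used RRF constant rather than a value this page prescribes.

```python
from typing import Dict, List, Tuple


def weighted_blend(
    lexical: Dict[str, float],   # id -> raw ts_rank score
    semantic: Dict[str, float],  # id -> similarity, assumed to lie in [0, 1]
    w_lex: float = 0.4,
    w_sem: float = 0.6,
    both_bonus: float = 0.05,
) -> List[Tuple[str, float]]:
    """Blend normalized lexical scores with semantic similarity."""
    top = max(lexical.values(), default=0.0)
    fused: Dict[str, float] = {}
    for eid in set(lexical) | set(semantic):
        # Normalize lexical scores to 0-1 by dividing by the best score in the set.
        lex = lexical[eid] / top if eid in lexical and top > 0 else 0.0
        sem = semantic.get(eid, 0.0)
        bonus = both_bonus if eid in lexical and eid in semantic else 0.0
        fused[eid] = w_lex * lex + w_sem * sem + bonus
    # Score descending, then id ascending as a deterministic tie-breaker.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


def rrf(rankings: List[List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over every ranked list."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, eid in enumerate(ranking, start=1):
            scores[eid] = scores.get(eid, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

The weighted blend depends on score magnitudes, so it needs the normalization step. RRF uses only rank positions, which is why it tolerates score scales that differ between the two channels.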
Standard knobs to document
At minimum:
- k_lexical
- k_vector
- weights or RRF constant
- limit
- pagination behavior (offset vs cursor)
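The knobs can be collected in a single documented configuration object. The sketch below is one possible shape; the class name, the default values, and the candidate_multiplier field are assumptions, not recommended settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridSearchConfig:
    """Documented tuning knobs for entity hybrid search (illustrative defaults)."""
    limit: int = 20
    candidate_multiplier: int = 5  # assumed: k_lexical and k_vector = limit * multiplier
    w_lexical: float = 0.4
    w_semantic: float = 0.6
    rrf_k: int = 60
    pagination: str = "offset"  # "offset" or "cursor"

    def __post_init__(self) -> None:
        # Reject configurations that cannot produce a valid query.
        if self.limit <= 0 or self.candidate_multiplier <= 0:
            raise ValueError("limit and candidate_multiplier must be positive")
        if self.pagination not in ("offset", "cursor"):
            raise ValueError("pagination must be 'offset' or 'cursor'")

    @property
    def k_lexical(self) -> int:
        return self.limit * self.candidate_multiplier

    @property
    def k_vector(self) -> int:
        return self.limit * self.candidate_multiplier
```

Deriving both candidate counts from limit keeps them consistent when a caller changes the page size.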
Degradation modes
Entity search must degrade safely:
- If embeddings are missing or provider is down: lexical-only.
- If FTS vector is missing: semantic-only.
- If both are unavailable: return empty results rather than raising an exception (this matters most in tool contexts).
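The degradation rules can be expressed as a thin wrapper around the two channels. In this sketch every callable (lexical_search, vector_search, embed, fuse) is a hypothetical injected dependency. It swallows all exceptions for brevity; a real service would at least log them.

```python
from typing import Callable, List, Optional, Sequence


def search_with_degradation(
    query: str,
    lexical_search: Callable[[str], List[str]],
    vector_search: Callable[[Sequence[float]], List[str]],
    embed: Callable[[str], Sequence[float]],
    fuse: Callable[[List[str], List[str]], List[str]],
) -> List[str]:
    """Run both channels, falling back to whichever one is still available."""
    lexical: Optional[List[str]]
    semantic: Optional[List[str]]
    try:
        lexical = lexical_search(query)
    except Exception:
        lexical = None  # FTS unavailable
    try:
        semantic = vector_search(embed(query))
    except Exception:
        semantic = None  # embedding provider or vector index unavailable

    if lexical is None and semantic is None:
        return []  # both channels down: empty result, not an exception
    if semantic is None:
        return lexical  # lexical-only
    if lexical is None:
        return semantic  # semantic-only
    return fuse(lexical, semantic)
```

Returning an empty list in the worst case lets a calling tool report "no matches" instead of failing the whole request.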