RAG & Search — Entity Hybrid Search

This page defines how to implement hybrid search for structured entities (catalog items, FAQs, listings, services, etc.) using:

Postgres full-text search (search_vector)
pgvector embeddings (embedding)
a deterministic ranking fusion

Unlike RAG chunk retrieval, entity search typically returns compact objects for selection and UI display.

Data model requirements

Each entity table that supports hybrid search should include:

scope_id (tenant/org/project identifier)
embedding vector(<dim>) (nullable)
search_vector tsvector (nullable or stored/generated)
embedding_version (string; recommended, so rows embedded with an older model or search-text format can be identified and backfilled)

Search text generation (standard)

Define a deterministic function generate_entity_search_text(entity) -> str.

Rules:

Include only fields intended to be searchable.
Add stable labels to reduce ambiguity.
Keep field ordering stable.
Avoid large non-informative fields (raw HTML, giant blobs).

Example pattern:

name: <name>
description: <description>
tags: <tag1> <tag2> <tag3>
guidance: <guidance>

Use the output for:

embedding input text
to_tsvector(...) input text
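The rules above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: it assumes the entity arrives as a mapping, that the searchable fields are the four in the example pattern, and that tags may be sorted (a choice made here so the output does not depend on storage order).

```python
from typing import Any, Mapping

# Assumed field list and order, taken from the example pattern; adjust per entity type.
SEARCHABLE_FIELDS = ("name", "description", "tags", "guidance")


def generate_entity_search_text(entity: Mapping[str, Any]) -> str:
    """Build labeled, stably ordered search text from an entity mapping."""
    lines = []
    for field in SEARCHABLE_FIELDS:
        value = entity.get(field)
        if isinstance(value, (list, tuple)):
            # Assumption: sort multi-valued fields so input order cannot change the text.
            value = " ".join(sorted(str(v).strip() for v in value if str(v).strip()))
        elif value is not None:
            # Collapse whitespace so formatting-only edits do not alter the text.
            value = " ".join(str(value).split())
        if value:
            # Empty or missing fields are skipped entirely.
            lines.append(f"{field}: {value}")
    return "\n".join(lines)
```

Fields outside SEARCHABLE_FIELDS (raw HTML, blobs, internal ids) never reach the output, so the same string can feed both the embedding call and to_tsvector(...).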

Write-path standards (keeping indexes current)

Create

Recommended approach:

Persist the entity row first.
Populate embedding and search_vector in a second step.

Failure handling:

If embedding generation fails, do not fail entity creation by default.
Mark indexing status so it can be backfilled.
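A minimal sketch of this create flow, using an in-memory dict in place of the database. The index_status field, its values, and the function names are hypothetical; the source only says to mark indexing status. search_vector is omitted because it would normally be computed in SQL.

```python
from typing import Callable, Dict, List

# Hypothetical status values, not defined by this page.
INDEX_PENDING, INDEX_READY, INDEX_FAILED = "pending", "ready", "failed"


def create_entity(
    store: Dict[str, dict],
    entity_id: str,
    fields: dict,
    search_text: str,
    embed: Callable[[str], List[float]],
) -> dict:
    """Persist the row first, then try to index it without failing creation."""
    # Step 1: persist the entity with no derived fields yet.
    row = {**fields, "id": entity_id, "embedding": None, "index_status": INDEX_PENDING}
    store[entity_id] = row

    # Step 2: populate derived fields; a provider error must not undo step 1.
    try:
        row["embedding"] = embed(search_text)
        row["index_status"] = INDEX_READY
    except Exception:
        # Keep the row and flag it so a backfill job can retry later.
        row["index_status"] = INDEX_FAILED
    return row


def rows_needing_backfill(store: Dict[str, dict]) -> List[str]:
    """Return ids whose indexing did not complete."""
    return sorted(k for k, r in store.items() if r["index_status"] != INDEX_READY)
```

With a real database, step 1 and step 2 would typically be separate transactions so that a failed embedding call cannot roll back the insert.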

Update

Recompute derived fields only when a searchable field changed.

Rules:

If a non-searchable field changes, do not regenerate embedding/search_vector.
If regeneration fails, either:
- fail the update (strict mode), or
- allow the update and mark for backfill (resilient mode)

Pick one and document it; default preference is resilient mode for non-critical search.
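The update rules could look roughly like this. The needs_backfill flag, the strict parameter, and the injected build_text and embed callables are assumptions made for the sketch; the searchable field set reuses the names from the example pattern.

```python
from typing import Callable, List

# Assumed searchable fields, taken from the example search-text pattern.
SEARCHABLE_FIELDS = frozenset({"name", "description", "tags", "guidance"})


def apply_update(
    row: dict,
    changes: dict,
    build_text: Callable[[dict], str],
    embed: Callable[[str], List[float]],
    strict: bool = False,
) -> dict:
    """Apply changes; re-index only if a searchable field actually changed."""
    # Only count fields whose value really differs from the stored one.
    touched = {k for k, v in changes.items() if row.get(k) != v}
    updated = {**row, **changes}
    if not (touched & SEARCHABLE_FIELDS):
        return updated  # non-searchable change: keep existing derived fields
    try:
        updated["embedding"] = embed(build_text(updated))
        updated["needs_backfill"] = False
    except Exception:
        if strict:
            raise  # strict mode: the caller is expected to roll back the update
        # Resilient mode: keep the update and flag the now-stale embedding.
        updated["needs_backfill"] = True
    return updated
```

Comparing old and new values, rather than checking only which keys were sent, avoids re-embedding when a client resubmits unchanged fields.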

Read-path standards (search API)

Inputs

Required:

scope_id
query (free text)
pagination (page, limit) or cursor-based pagination

Outputs

Return a compact shape:

entity id
display fields (e.g., name, description, image_url, url)
optional: scoring/debug fields in non-production or admin-only endpoints

Do not return heavy blobs by default.
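As an illustration of the compact shape, the projection below copies only the id and display fields from a row and attaches scoring details on request. The function name, the debug key, and the include_debug flag are hypothetical.

```python
from typing import Any, Dict

# Display fields taken from the examples in the list above; adjust per entity type.
DISPLAY_FIELDS = ("name", "description", "image_url", "url")


def to_search_result(
    row: Dict[str, Any], score: float, include_debug: bool = False
) -> Dict[str, Any]:
    """Project a full entity row onto the compact search-result shape."""
    result: Dict[str, Any] = {"id": row["id"]}
    # Copy only whitelisted fields, so embeddings and blobs never leak out.
    result.update({f: row.get(f) for f in DISPLAY_FIELDS})
    if include_debug:
        # Intended for non-production or admin-only endpoints.
        result["debug"] = {"score": score}
    return result
```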

Hybrid search implementation (recommended)

Step 1: Full-text search (FTS)

Standards:

query via plainto_tsquery
rank via ts_rank / ts_rank_cd
retrieve up to k_lexical candidates (often limit * 2 or limit * 5)

Minimal SQL pattern:

SELECT id,
       ts_rank(search_vector, plainto_tsquery('english', :q)) AS rank
FROM entities
WHERE scope_id = :scope_id
  AND search_vector @@ plainto_tsquery('english', :q)
ORDER BY rank DESC
LIMIT :k_lexical;

Step 2: Semantic search (pgvector)

Standards:

embed the query text once
retrieve up to k_vector candidates (often limit * 2 or limit * 5)
include only rows where embedding IS NOT NULL

Minimal SQL pattern:

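SELECT id,
       -- <=> is pgvector's cosine distance, so 1 - distance is cosine similarity
       1 - (embedding <=> :query_embedding) AS similarity
FROM entities
WHERE scope_id = :scope_id
  AND embedding IS NOT NULL
ORDER BY embedding <=> :query_embedding
LIMIT :k_vector;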

Step 3: Fuse and rank

Choose one default method and apply consistently:

Weighted blend (common for entity search):
- normalize lexical rank scores to 0–1 within the result set
- combine with semantic similarity using weights
- apply a “both channels” bonus when an entity appears in both lists
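- apply deterministic tie-breakers (e.g., business priority, then recency)

Reciprocal rank fusion (RRF; also acceptable for entities):
- scores each entity by summing 1 / (k + rank) over the lists it appears in, so no score normalization is needed
- works well when you primarily want rank-based robustness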
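Both fusion methods can be sketched in a few lines. The weights (0.4 / 0.6), the bonus (0.05), and normalizing lexical scores by the maximum in the result set are illustrative assumptions, not values defined by this page; k = 60 is a commonly used RRF constant. The entity id stands in for the real tie-breakers (business priority, then recency).

```python
from typing import Dict, List, Tuple


def weighted_blend(
    lexical: Dict[str, float],   # id -> ts_rank score (non-negative)
    semantic: Dict[str, float],  # id -> similarity, assumed to lie in [0, 1]
    w_lex: float = 0.4,
    w_sem: float = 0.6,
    both_bonus: float = 0.05,
) -> List[Tuple[str, float]]:
    """Blend normalized lexical scores with semantic similarity."""
    top = max(lexical.values(), default=0.0)
    fused: Dict[str, float] = {}
    for eid in set(lexical) | set(semantic):
        # Assumption: normalize by the best lexical score in this result set.
        lex = lexical[eid] / top if eid in lexical and top > 0 else 0.0
        score = w_lex * lex + w_sem * semantic.get(eid, 0.0)
        if eid in lexical and eid in semantic:
            score += both_bonus  # "both channels" bonus
        fused[eid] = score
    # Sort by score descending, then by id so ties are deterministic.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


def rrf(rankings: List[List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Reciprocal rank fusion over any number of ranked id lists."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, eid in enumerate(ranking, start=1):
            scores[eid] = scores.get(eid, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

The blend depends on the score scales of both channels, which is why the lexical scores are normalized first; RRF uses only positions, so it is unaffected by how ts_rank or the distance metric is scaled.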

Standard knobs to document

At minimum:

k_lexical
k_vector
weights or RRF constant
limit
pagination behavior (offset vs cursor)
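One way to keep these knobs documented and in one place is a small immutable config object. The class name, field names, and default values below are hypothetical; candidate_multiplier encodes the "limit * 2 or limit * 5" heuristic from the steps above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridSearchConfig:
    """Hypothetical container for the documented search knobs."""

    limit: int = 20
    candidate_multiplier: int = 2   # candidates per channel = limit * multiplier
    w_lexical: float = 0.4          # used by the weighted blend
    w_semantic: float = 0.6
    rrf_k: int = 60                 # used only when RRF is the fusion method
    pagination: str = "offset"      # "offset" or "cursor"

    @property
    def k_lexical(self) -> int:
        return self.limit * self.candidate_multiplier

    @property
    def k_vector(self) -> int:
        return self.limit * self.candidate_multiplier
```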

Degradation modes

Entity search must degrade safely:

If embeddings are missing or provider is down: lexical-only.
If FTS vector is missing: semantic-only.
If both are unavailable: return empty results rather than raising an exception (this matters most when search is called from a tool context).
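The degradation rules can be sketched as a wrapper that treats each channel as optional. This is an assumption-laden outline: the four callables are hypothetical injection points for the FTS query, the vector query, the embedding provider, and the fusion step, and any exception is treated as "channel unavailable".

```python
import logging
from typing import Callable, List, Optional

logger = logging.getLogger("entity_search")


def hybrid_search(
    query: str,
    lexical_search: Callable[[str], List[str]],
    vector_search: Callable[[List[float]], List[str]],
    embed: Callable[[str], List[float]],
    fuse: Callable[[List[str], List[str]], List[str]],
) -> List[str]:
    """Run both channels and fall back to whichever one is available."""
    lexical: Optional[List[str]] = None
    semantic: Optional[List[str]] = None
    try:
        lexical = lexical_search(query)
    except Exception:
        logger.warning("lexical channel unavailable", exc_info=True)
    try:
        # Embed the query once; provider failures disable only this channel.
        semantic = vector_search(embed(query))
    except Exception:
        logger.warning("semantic channel unavailable", exc_info=True)

    if lexical is not None and semantic is not None:
        return fuse(lexical, semantic)
    if lexical is not None:
        return lexical   # lexical-only
    if semantic is not None:
        return semantic  # semantic-only
    return []            # both unavailable: empty result, no exception
```

Logging each failure keeps the degraded mode visible to operators even though callers only ever see a (possibly empty) result list.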