flowCreate.solutions

RAG & Search — Entity Hybrid Search

This page defines how to implement hybrid search for structured entities (catalog items, FAQs, listings, services, etc.) using:

  • Postgres full-text search (search_vector)
  • pgvector embeddings (embedding)
  • a deterministic ranking fusion

Unlike RAG chunk retrieval, entity search typically returns compact objects for selection and UI display.

Data model requirements

Each entity table that supports hybrid search should include:

  • scope_id (tenant/org/project identifier)
  • embedding vector(<dim>) (nullable)
  • search_vector tsvector (nullable, or a stored generated column)
  • embedding_version (string) (recommended)

Search text generation (standard)

Define a deterministic function generate_entity_search_text(entity) -> str.

Rules:

  • Include only fields intended to be searchable.
  • Add stable labels to reduce ambiguity.
  • Keep field ordering stable.
  • Avoid large non-informative fields (raw HTML, giant blobs).

Example pattern:

name: <name>
description: <description>
tags: <tag1> <tag2> <tag3>
guidance: <guidance>

Use the output for:

  • embedding input text
  • to_tsvector(...) input text
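
The rules above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes the entity arrives as a plain mapping, and the SEARCHABLE_FIELDS tuple and the whitespace handling are assumptions chosen to match the example pattern.

```python
from typing import Any, Mapping

# Assumed searchable fields, in the fixed order used by the example pattern.
SEARCHABLE_FIELDS = ("name", "description", "tags", "guidance")


def generate_entity_search_text(entity: Mapping[str, Any]) -> str:
    """Build labeled, stably ordered search text from an entity mapping."""
    lines = []
    for field in SEARCHABLE_FIELDS:
        value = entity.get(field)
        if isinstance(value, (list, tuple)):
            # Join list-like fields such as tags with single spaces.
            value = " ".join(str(v).strip() for v in value if str(v).strip())
        elif value is not None:
            # Collapse whitespace so formatting-only edits do not change the text.
            value = " ".join(str(value).split())
        if value:
            # Empty or missing fields are skipped entirely.
            lines.append(f"{field}: {value}")
    return "\n".join(lines)
```

Because the same string feeds both the embedding call and to_tsvector(...), the two indexes stay aligned on what counts as searchable content.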

Write-path standards (keeping indexes current)

Create

Recommended approach:

  • Persist the entity row first.
  • Populate embedding and search_vector in a second step.

Failure handling:

  • If embedding fails, do not fail the entity creation by default.
  • Mark indexing status so it can be backfilled.
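
A minimal sketch of the two-step create, using an in-memory dict in place of the entity table. The index_status column, its values, and the injected build_text and embed callables are hypothetical names introduced for illustration only.

```python
from typing import Callable, Dict, List

# Hypothetical status values; the standard only requires that failures be marked.
INDEXED = "indexed"
PENDING_BACKFILL = "pending_backfill"


def create_entity(
    store: Dict[str, dict],
    entity_id: str,
    fields: dict,
    build_text: Callable[[dict], str],
    embed: Callable[[str], List[float]],
) -> dict:
    """Persist the row first, then try to populate the derived search fields."""
    # Step 1: persist the entity without derived fields.
    row = dict(fields, id=entity_id, embedding=None, search_text=None,
               index_status=PENDING_BACKFILL)
    store[entity_id] = row

    # Step 2: index in a separate step; a failure here must not undo step 1.
    try:
        text = build_text(fields)
        row["search_text"] = text       # in Postgres this would feed to_tsvector(...)
        row["embedding"] = embed(text)  # may raise if the provider is down
        row["index_status"] = INDEXED
    except Exception:
        # Keep the row, flagged so a backfill job can retry later.
        row["index_status"] = PENDING_BACKFILL
    return row
```

Setting the search text before calling the embedding provider means a provider outage still leaves the row findable through lexical search.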

Update

Recompute derived fields only when a searchable field has changed.

Rules:

  • If a non-searchable field changes, do not regenerate embedding/search_vector.
  • If regeneration fails, either:
    • fail the update (strict mode), or
    • allow the update and mark for backfill (resilient mode)

Pick one mode and document it. The default preference is resilient mode wherever search is not critical.
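
The update rules could look like the sketch below. The set of searchable fields, the strict flag, and the index_status values are assumptions; rows are plain dicts so the logic can be read without a database.

```python
from typing import Callable, List

# Assumed set of fields that feed the search text.
SEARCHABLE_FIELDS = frozenset({"name", "description", "tags", "guidance"})


def update_entity(
    row: dict,
    changes: dict,
    build_text: Callable[[dict], str],
    embed: Callable[[str], List[float]],
    strict: bool = False,
) -> dict:
    """Apply changes; regenerate derived fields only if a searchable field changed."""
    touched = {k for k, v in changes.items() if row.get(k) != v}
    updated = {**row, **changes}  # work on a copy so strict mode can abort cleanly
    if not (touched & SEARCHABLE_FIELDS):
        return updated  # non-searchable change: leave embedding and search text alone
    try:
        text = build_text(updated)
        updated["search_text"] = text
        updated["embedding"] = embed(text)
        updated["index_status"] = "indexed"
    except Exception:
        if strict:
            raise  # strict mode: the caller aborts the whole update
        updated["index_status"] = "pending_backfill"  # resilient mode
    return updated
```

In resilient mode the previous embedding stays in place until the backfill runs, so semantic results may briefly reflect the old content.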

Read-path standards (search API)

Inputs

Required:

  • scope_id
  • query (free text)
  • pagination (page, limit) or cursor-based pagination

Outputs

Return a compact shape:

  • entity id
  • display fields (e.g., name, description, image_url, url)
  • optional: scoring/debug fields in non-production or admin-only endpoints

Do not return heavy blobs by default.
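
One way to enforce the compact shape is to project rows through an explicit type. The EntityHit class and to_hit helper below are hypothetical; the field names follow the display fields listed above.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class EntityHit:
    id: str
    name: str
    description: Optional[str] = None
    image_url: Optional[str] = None
    url: Optional[str] = None
    debug: Optional[dict] = None  # scoring details; admin or non-production only


def to_hit(row: dict, score_debug: Optional[dict] = None,
           include_debug: bool = False) -> dict:
    """Project a full row to the compact shape, dropping anything not listed."""
    hit = EntityHit(
        id=row["id"],
        name=row["name"],
        description=row.get("description"),
        image_url=row.get("image_url"),
        url=row.get("url"),
        debug=score_debug if include_debug else None,
    )
    # Omit unset optional fields to keep the payload small.
    return {k: v for k, v in asdict(hit).items() if v is not None}
```

Because only named fields are copied, heavy columns such as embeddings or raw HTML cannot leak into the response by accident.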

Step 1: Full-text search (FTS)

Standards:

  • build the query with plainto_tsquery, which treats user input as plain text rather than query syntax
  • rank via ts_rank or ts_rank_cd
  • retrieve up to k_lexical candidates (often limit * 2 or limit * 5)

Minimal SQL pattern:

SELECT id,
       ts_rank(search_vector, plainto_tsquery('english', :q)) AS rank
FROM entities
WHERE scope_id = :scope_id
  AND search_vector @@ plainto_tsquery('english', :q)
ORDER BY rank DESC, id  -- id keeps the order deterministic when ranks tie
LIMIT :k_lexical;

Step 2: Semantic search (pgvector)

Standards:

  • embed the query text once
  • retrieve up to k_vector candidates (often limit * 2 or limit * 5)
  • include only rows where embedding IS NOT NULL

Minimal SQL pattern:

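SELECT id,
       -- <=> is pgvector's cosine distance, so 1 - distance is cosine similarity
       1 - (embedding <=> :query_embedding) AS similarity
FROM entities
WHERE scope_id = :scope_id
  AND embedding IS NOT NULL
ORDER BY embedding <=> :query_embedding
LIMIT :k_vector;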

Step 3: Fuse and rank

Choose one default method and apply consistently:

  • Weighted blend (common for entity search):

    • normalize lexical rank scores to 0–1 within the result set
    • combine with semantic similarity using weights
    • apply a “both channels” bonus when an entity appears in both lists
    • apply deterministic tie-breakers (e.g., business priority, then recency)
  • RRF (also acceptable for entities):

    • works well when you primarily want rank-based robustness
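
Both fusion methods can be sketched in a few lines. The weights, bonus, and RRF constant below are placeholder values, not recommendations. Dividing by the maximum lexical score is one assumed way to normalize to 0–1, and the id tie-breaker stands in for business priority or recency.

```python
from typing import Dict, List, Tuple


def weighted_blend(
    lexical: Dict[str, float],   # id -> raw ts_rank score
    semantic: Dict[str, float],  # id -> similarity, assumed to be in 0-1
    w_lex: float = 0.4,
    w_sem: float = 0.6,
    both_bonus: float = 0.1,
) -> List[Tuple[str, float]]:
    """Fuse lexical and semantic scores into one ranked list."""
    # Normalize lexical scores within this result set by dividing by the max.
    top = max(lexical.values(), default=0.0)
    lex_norm = {k: (v / top if top > 0 else 0.0) for k, v in lexical.items()}
    fused: Dict[str, float] = {}
    for eid in set(lex_norm) | set(semantic):
        score = w_lex * lex_norm.get(eid, 0.0) + w_sem * semantic.get(eid, 0.0)
        if eid in lex_norm and eid in semantic:
            score += both_bonus  # "both channels" bonus
        fused[eid] = score
    # Deterministic order: score descending, then id ascending.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


def rrf(lexical_ids: List[str], semantic_ids: List[str],
        k: int = 60) -> List[Tuple[str, float]]:
    """Reciprocal rank fusion over two ranked id lists (ranks start at 1)."""
    fused: Dict[str, float] = {}
    for ranked in (lexical_ids, semantic_ids):
        for rank, eid in enumerate(ranked, start=1):
            fused[eid] = fused.get(eid, 0.0) + 1.0 / (k + rank)
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
```

The weighted blend depends on score magnitudes, so it is sensitive to how each channel is scaled. RRF uses only positions, which is why it suits cases where rank-based robustness matters most.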

Standard knobs to document

At minimum:

  • k_lexical
  • k_vector
  • weights or RRF constant
  • limit
  • pagination behavior (offset vs cursor)
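
Keeping the knobs in one validated object makes them easier to document and review. The HybridSearchConfig class, its defaults, and the candidate_multiplier field are all hypothetical; deriving k_lexical and k_vector from limit is just one option.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridSearchConfig:
    limit: int = 20
    candidate_multiplier: int = 5  # candidates per channel = limit * multiplier
    w_lexical: float = 0.4         # placeholder weight
    w_semantic: float = 0.6        # placeholder weight
    rrf_k: int = 60                # only used when RRF is the chosen method
    pagination: str = "offset"     # "offset" or "cursor"

    def __post_init__(self) -> None:
        if self.limit < 1 or self.candidate_multiplier < 1:
            raise ValueError("limit and candidate_multiplier must be >= 1")
        if self.pagination not in ("offset", "cursor"):
            raise ValueError("pagination must be 'offset' or 'cursor'")

    @property
    def k_lexical(self) -> int:
        return self.limit * self.candidate_multiplier

    @property
    def k_vector(self) -> int:
        return self.limit * self.candidate_multiplier
```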

Degradation modes

Entity search must degrade safely:

  • If embeddings are missing or provider is down: lexical-only.
  • If FTS vector is missing: semantic-only.
  • If both are unavailable: return empty results instead of raising an exception, particularly in tool contexts.
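
A sketch of the degradation logic, with each channel passed in as a callable so the fallback behavior can be read in isolation. The function name, the mode labels, and the broad exception handling are assumptions; a real service would likely log each failure and catch narrower error types.

```python
from typing import Callable, List, Optional, Sequence


def hybrid_candidates(
    query: str,
    lexical_search: Callable[[str], List[str]],
    semantic_search: Callable[[Sequence[float]], List[str]],
    embed: Callable[[str], Sequence[float]],
) -> dict:
    """Run both channels, tolerating the failure of either one."""
    lexical: Optional[List[str]] = None
    semantic: Optional[List[str]] = None
    try:
        lexical = lexical_search(query)
    except Exception:
        lexical = None  # FTS unavailable
    try:
        # Embed the query once; a provider outage disables only this channel.
        semantic = semantic_search(embed(query))
    except Exception:
        semantic = None
    if lexical is not None and semantic is not None:
        mode = "hybrid"
    elif lexical is not None:
        mode = "lexical-only"
    elif semantic is not None:
        mode = "semantic-only"
    else:
        mode = "unavailable"  # caller gets empty lists, not an exception
    return {"mode": mode, "lexical": lexical or [], "semantic": semantic or []}
```

Returning the mode alongside the candidates lets callers record when results were served in a degraded state.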