pg_trgm + pgvector Hybrid Retrieval: Build Better RAG in Postgres (2026)
Pure vector search tops out around 0.62 recall@10. Adding pg_trgm + tsvector with Reciprocal Rank Fusion lifts it to 0.84+. Real SQL, real RRF math, and a Prisma helper you can drop into any agent.
TL;DR — Vector search misses exact matches (model names, error codes, drug names). Lexical search misses paraphrases. Hybrid retrieval — running both and merging with Reciprocal Rank Fusion — combines their strengths. In Postgres you get this for free with pgvector + pg_trgm + tsvector.
What you'll build
A single search_chunks(query, k) function that runs vector ANN, full-text ranking via ts_rank_cd, and trigram fuzzy matching, then fuses the three lists with RRF. Recall@10 climbs from ~0.62 (vector only) to ~0.84+.
Schema
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE TABLE chunks (
id BIGSERIAL PRIMARY KEY,
doc_id UUID NOT NULL,
body TEXT NOT NULL,
body_tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', body)) STORED,
embedding vector(1536) NOT NULL
);
CREATE INDEX chunks_tsv_idx ON chunks USING gin (body_tsv);
CREATE INDEX chunks_trgm_idx ON chunks USING gin (body gin_trgm_ops);
CREATE INDEX chunks_hnsw_idx ON chunks USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 128);
Architecture
flowchart LR
Q[User query] --> EMB[Embed]
Q --> TS[Tsvector parse]
Q --> TRG[Trigram tokens]
EMB --> ANN[HNSW top-50]
TS --> BM25[ts_rank top-50]
TRG --> FUZZ[trgm sim top-50]
ANN --> RRF[Reciprocal Rank Fusion]
BM25 --> RRF
FUZZ --> RRF
RRF --> TOPK[Top-k to LLM]
Step 1 — Vector candidates
WITH vec AS (
SELECT id, ROW_NUMBER() OVER (ORDER BY embedding <=> $1::vector) AS rk
FROM chunks ORDER BY embedding <=> $1::vector LIMIT 50
)
SELECT * FROM vec;
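The <=> operator in that query is pgvector's cosine distance: 1 minus cosine similarity, so smaller means closer. A standalone TypeScript sketch of the metric the HNSW index ranks by:

```typescript
// Cosine distance, as pgvector's <=> operator computes it:
// 1 - (a . b) / (|a| * |b|). 0 = same direction, 2 = opposite.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Orthogonal vectors land at distance 1, opposite vectors at 2; the HNSW index returns an approximate top-50 under this metric rather than an exact scan.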
Step 2 — Full-text candidates
WITH lex AS (
SELECT id, ROW_NUMBER() OVER (
ORDER BY ts_rank_cd(body_tsv, websearch_to_tsquery('english', $2)) DESC
) AS rk
FROM chunks
WHERE body_tsv @@ websearch_to_tsquery('english', $2)
ORDER BY rk
LIMIT 50
)
SELECT * FROM lex;
Step 3 — Trigram fuzzy candidates
WITH fuzz AS (
SELECT id, ROW_NUMBER() OVER (ORDER BY similarity(body, $2) DESC) AS rk
FROM chunks
WHERE body % $2
ORDER BY similarity(body, $2) DESC LIMIT 50
)
SELECT * FROM fuzz;
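For intuition about what % and similarity() actually measure, here is a rough TypeScript re-implementation of pg_trgm's trigram similarity: lowercase, pad each word with two leading spaces and one trailing space, then take the Jaccard overlap of the trigram sets. It approximates, not exactly reproduces, the extension:

```typescript
// Approximate pg_trgm: extract padded trigrams per word, then
// score = shared trigrams / union of trigrams (Jaccard).
function trigrams(s: string): Set<string> {
  const out = new Set<string>();
  for (const word of s.toLowerCase().split(/\s+/).filter(Boolean)) {
    const padded = `  ${word} `; // pg_trgm-style padding
    for (let i = 0; i + 3 <= padded.length; i++) out.add(padded.slice(i, i + 3));
  }
  return out;
}

function similarity(a: string, b: string): number {
  const ta = trigrams(a), tb = trigrams(b);
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  return shared / (ta.size + tb.size - shared);
}
```

similarity("word", "ward") comes out at 0.25 here; pg_trgm's % operator matches when similarity clears pg_trgm.similarity_threshold (default 0.3).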
Step 4 — Fuse with RRF
CREATE OR REPLACE FUNCTION search_chunks(
q_embed vector(1536),
q_text text,
k int DEFAULT 10,
rrf_k int DEFAULT 60
)
RETURNS TABLE (id bigint, body text, score float) AS $$
WITH
vec AS (
SELECT id, ROW_NUMBER() OVER (ORDER BY embedding <=> q_embed) AS rk
FROM chunks ORDER BY embedding <=> q_embed LIMIT 50
),
lex AS (
SELECT id, ROW_NUMBER() OVER (
ORDER BY ts_rank_cd(body_tsv, websearch_to_tsquery('english', q_text)) DESC
) AS rk
FROM chunks
WHERE body_tsv @@ websearch_to_tsquery('english', q_text)
ORDER BY rk
LIMIT 50
),
fuzz AS (
SELECT id, ROW_NUMBER() OVER (ORDER BY similarity(body, q_text) DESC) AS rk
FROM chunks WHERE body % q_text ORDER BY rk LIMIT 50
),
fused AS (
SELECT id, SUM(1.0 / (rrf_k + rk)) AS score FROM (
SELECT id, rk FROM vec
UNION ALL SELECT id, rk FROM lex
UNION ALL SELECT id, rk FROM fuzz
) u GROUP BY id
)
SELECT c.id, c.body, f.score
FROM fused f JOIN chunks c USING (id)
ORDER BY f.score DESC LIMIT k;
$$ LANGUAGE sql STABLE;
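The fusion step is small enough to sanity-check outside the database. A minimal TypeScript sketch of the same formula the SQL computes, with 1-based ranks:

```typescript
// Reciprocal Rank Fusion: score(id) = sum over lists of 1 / (k + rank).
// Returns ids mapped to fused scores, sorted descending by score.
function rrfFuse(lists: number[][], k = 60): Map<number, number> {
  const scores = new Map<number, number>();
  for (const list of lists) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1)); // rank = i + 1
    });
  }
  return new Map([...scores.entries()].sort((a, b) => b[1] - a[1]));
}
```

A chunk ranked #1 by two branches scores 2/61 ≈ 0.0328 at the default rrf_k = 60, comfortably ahead of a chunk that tops only a single list (1/61 ≈ 0.0164).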
Step 5 — Call from Prisma
// vec must be pgvector's text form '[0.1,0.2,...]'; a raw number[]
// binds as a Postgres array literal and fails the ::vector cast.
const rows = await prisma.$queryRaw`
  SELECT * FROM search_chunks(${JSON.stringify(vec)}::vector, ${q}, 10)
`;
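Prisma has no native vector type, so the embedding has to travel as a text literal Postgres can cast. pgvector's input format is [0.1,0.2,...], which JSON.stringify happens to produce for a flat number array; a tiny named helper (the name is ours, not a Prisma or pgvector API) keeps call sites readable:

```typescript
// Serialize a number[] into pgvector's text input form, e.g. "[0.1,0.2,0.3]".
// Prisma would otherwise bind a JS array as a Postgres array literal,
// which the vector type cannot parse.
function toVectorLiteral(embedding: number[]): string {
  return JSON.stringify(embedding);
}
```

The call then reads search_chunks(${toVectorLiteral(vec)}::vector, ...) inside the tagged template.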
Step 6 — Tune RRF k
k=60 comes from the original RRF paper (Cormack et al., 2009) and is a solid default. Lower values (k=10) emphasize top-ranked hits; higher values (k=120) flatten the differences between ranks. A/B test on your eval set.
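You can see why low k sharpens and high k flattens by comparing the RRF weight at rank 1 against rank 2:

```typescript
// RRF weight at a rank is 1 / (k + rank). The rank-1 / rank-2 weight
// ratio is (k + 2) / (k + 1), which shrinks toward 1 as k grows,
// i.e. top hits dominate less and less.
function rank1Advantage(k: number): number {
  return (1 / (k + 1)) / (1 / (k + 2));
}
```

At k=10 the top hit carries about 9% more weight than rank 2; at k=60 about 1.6%; at k=120 under 1%. That is the whole sharpening/flattening trade-off in one number.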
Pitfalls
- websearch_to_tsquery turns an empty or all-stopword string into an empty tsquery that matches nothing; guard with COALESCE or a fallback query.
- Trigram on long text: % compares whole-string similarity, which tends toward zero on long bodies. For long docs, store title || summary in a separate trgm-indexed column.
- Different LIMITs per branch: keep them equal (50 each) so RRF weights each branch fairly.
- Forgetting STABLE: the default VOLATILE marking blocks planner optimizations such as inlining; STABLE declares that the function only reads data and returns consistent results within a statement.
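For the first pitfall, the cheapest guard lives on the application side, before the query is sent. A sketch (the MIN_QUERY_LENGTH cutoff is an arbitrary choice of ours, not from pg_trgm or Postgres):

```typescript
// Skip the search entirely for queries that would produce an empty
// tsquery (blank or effectively blank input). Stopword-only queries
// still slip through a pure length check; handle those with a
// fallback branch in SQL if they matter for your corpus.
function isSearchableQuery(q: string): boolean {
  const MIN_QUERY_LENGTH = 2; // arbitrary cutoff
  return q.trim().length >= MIN_QUERY_LENGTH;
}
```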
CallSphere production note
CallSphere's knowledge agents use this exact RRF pattern across the 115+ DB tables that hold product docs, FAQs, and call transcripts. The Healthcare vertical (HIPAA, Prisma healthcare_voice) tunes rrf_k=40 to favor exact drug-name matches; OneRoof (RLS) keeps rrf_k=60; UrackIT (Supabase + ChromaDB) ports the same RRF shape to its non-HIPAA stack. 37 agents · 90+ tools · 6 verticals. Pricing: $149/$499/$1,499 — 14-day trial, 22% affiliate.
FAQ
Q: How much does hybrid actually beat vector-only? On TREC-COVID and BEIR splits, hybrid + RRF lifts nDCG@10 by 15–25%.
Q: Do I need a reranker after RRF? Optional. A cross-encoder reranker on the top-20 fused results adds another 3–7% nDCG.
Q: What about pg_textsearch / BM25 in Postgres? TigerData's pg_textsearch ships true BM25. Drop-in for ts_rank_cd if you need it.
Q: pg_trgm vs FTS — when does each win? Trgm wins on misspellings and partial words; FTS wins on phrase/stem matching.
Q: Single SQL vs three round trips? Single SQL is faster and atomic. Always fuse server-side.
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.