AI Engineering · 11 min read

Supabase + pgvector for AI Agents: Memory, RAG, and Auth in One Backend (2026)

Supabase ships pgvector, RLS, and edge functions out of the box. Here's a working AI-agent memory layer with semantic recall, per-user RLS, and a match_documents RPC ready for LangChain.

TL;DR — Supabase = managed Postgres + pgvector + RLS + edge functions. For AI agent memory and RAG it's the lowest-friction stack in 2026 — two tables, one RPC, and your LangChain agent has long-term memory.

What you'll build

A Supabase project with a documents table for RAG, an agent_memories table for per-user long-term memory, RLS policies that scope both tables to auth.uid(), and a match_documents RPC that LangChain.js calls directly.

Schema

CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  user_id UUID NOT NULL REFERENCES auth.users (id) ON DELETE CASCADE,
  content TEXT NOT NULL,
  metadata JSONB DEFAULT '{}',
  embedding vector(1536) NOT NULL,
  created_at TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX documents_hnsw_idx ON documents
  USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);

ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

CREATE POLICY user_owns_documents ON documents
  USING (user_id = auth.uid())
  WITH CHECK (user_id = auth.uid());
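The vector(1536) column pins the table to 1536-dimensional embeddings (the size text-embedding-3-small produces), and Postgres rejects inserts of any other length. A guard in the ingestion path fails faster and with a clearer message. This is an illustrative sketch; assertEmbeddingDim is a made-up helper name, not a library API:

```typescript
// Guard: documents.embedding is declared vector(1536), so every vector
// we insert must have exactly 1536 dimensions or Postgres rejects it.
const EXPECTED_DIM = 1536;

function assertEmbeddingDim(embedding: number[], expected: number = EXPECTED_DIM): number[] {
  if (embedding.length !== expected) {
    throw new Error(
      `embedding has ${embedding.length} dims, column expects ${expected} - ` +
      `check the model passed to embeddings.create()`
    );
  }
  return embedding;
}
```

Wiring this in front of every insert turns a cryptic database error into an immediate, named failure at the call site.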

Architecture

flowchart LR
  USER[User in app] --> SB[Supabase Auth JWT]
  SB --> PG[(Postgres + pgvector)]
  PG --> RLS[RLS scopes to auth.uid]
  AGENT[LangChain agent] --> RPC[match_documents RPC]
  RPC --> PG
  PG --> RESULTS[Top-k user docs]
  RESULTS --> AGENT

Step 1 — Create the match RPC

CREATE OR REPLACE FUNCTION match_documents(
  query_embedding vector(1536),
  match_count int DEFAULT 5,
  filter jsonb DEFAULT '{}'
)
RETURNS TABLE (id bigint, content text, metadata jsonb, similarity float)
LANGUAGE plpgsql STABLE
AS $$
BEGIN
  RETURN QUERY
  SELECT d.id, d.content, d.metadata,
         1 - (d.embedding <=> query_embedding) AS similarity
  FROM documents d
  WHERE d.user_id = auth.uid()
    AND d.metadata @> filter
  ORDER BY d.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;
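The RPC's similarity column is 1 minus pgvector's cosine-distance operator (<=>), which is plain cosine similarity. If you want to sanity-check returned scores client-side, the same number can be computed directly. cosineSimilarity below is an illustrative helper, not part of pgvector or supabase-js:

```typescript
// Mirror of the RPC's `1 - (embedding <=> query_embedding)` score:
// cosine similarity = dot(a, b) / (|a| * |b|). Identical directions
// score 1, orthogonal vectors score 0, opposite directions score -1.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```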

Step 2 — Wire into LangChain.js

import { SupabaseVectorStore } from "@langchain/community/vectorstores/supabase";
import { OpenAIEmbeddings } from "@langchain/openai";
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_ANON_KEY!,
  { global: { headers: { Authorization: `Bearer ${userJwt}` } } },
);

const store = new SupabaseVectorStore(new OpenAIEmbeddings(), {
  client: supabase,
  tableName: "documents",
  queryName: "match_documents",
});

const docs = await store.similaritySearch("appointment policy", 5);

The user's JWT carries the sub claim that auth.uid() reads, so RLS plus the RPC's WHERE clause keep results scoped to that user automatically.
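auth.uid() resolves to the sub claim of the JWT you attached to the client, so when queries mysteriously return zero rows, decoding the token locally shows which user the database will see. A debugging sketch; decodeJwtSub is a made-up helper, and it decodes without verifying the signature (verification stays Supabase's job):

```typescript
// Debug helper: extract the `sub` claim (the auth.uid() value) from a
// Supabase JWT. Decode only - no signature verification.
function decodeJwtSub(jwt: string): string | undefined {
  const parts = jwt.split(".");
  if (parts.length !== 3) return undefined;
  // JWT payloads are base64url: map the URL-safe alphabet back before decoding.
  const payload = Buffer.from(
    parts[1].replace(/-/g, "+").replace(/_/g, "/"),
    "base64",
  ).toString("utf8");
  return JSON.parse(payload).sub;
}
```

If the printed sub does not match the user_id on your rows, the client is sending the anon key's JWT instead of the user's session token.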


Step 3 — Edge function for ingestion

// supabase/functions/ingest/index.ts
import { OpenAI } from "https://esm.sh/openai@4";
import { createClient } from "https://esm.sh/@supabase/supabase-js@2";

Deno.serve(async (req) => {
  const { content } = await req.json();
  const oai = new OpenAI({ apiKey: Deno.env.get("OPENAI_API_KEY") });
  const v = (await oai.embeddings.create({
    model: "text-embedding-3-small", input: content,
  })).data[0].embedding;

  const sb = createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
  );
  // The header is "Bearer <jwt>" - getUser() wants the bare token.
  const jwt = req.headers.get("Authorization")?.replace("Bearer ", "") ?? "";
  const { data, error } = await sb.auth.getUser(jwt);
  if (error || !data.user) return new Response("unauthorized", { status: 401 });
  await sb.from("documents").insert({ user_id: data.user.id, content, embedding: v });
  return new Response("ok");
});
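From the browser, the deployed function is reached through supabase.functions.invoke, which forwards the logged-in session's JWT in the Authorization header. A sketch, assuming the function is deployed under the name ingest; buildIngestPayload is an illustrative helper, and FunctionsClient is just a structural stand-in for the slice of supabase-js used here:

```typescript
// Validate the payload shape the edge function expects ({ content }).
// Illustrative: rejects empty input before spending an embedding call.
function buildIngestPayload(content: string): { content: string } {
  const trimmed = content.trim();
  if (!trimmed) throw new Error("refusing to embed empty content");
  return { content: trimmed };
}

// Structural stand-in for the bits of supabase-js we touch, so the
// sketch stays self-contained.
type FunctionsClient = {
  functions: {
    invoke: (name: string, opts: { body: unknown }) => Promise<{ error: Error | null }>;
  };
};

// invoke() attaches the current session's JWT, which the edge function
// reads back via auth.getUser().
async function ingest(supabase: FunctionsClient, content: string): Promise<void> {
  const { error } = await supabase.functions.invoke("ingest", {
    body: buildIngestPayload(content),
  });
  if (error) throw error;
}
```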

Step 4 — Add agent memory

CREATE TABLE agent_memories (
  id BIGSERIAL PRIMARY KEY,
  user_id UUID NOT NULL REFERENCES auth.users (id) ON DELETE CASCADE,
  agent TEXT NOT NULL,
  kind TEXT NOT NULL,  -- 'episodic' | 'semantic'
  content TEXT NOT NULL,
  embedding vector(1536) NOT NULL,
  created_at TIMESTAMPTZ DEFAULT now()
);
ALTER TABLE agent_memories ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_owns_memories ON agent_memories
  USING (user_id = auth.uid())
  WITH CHECK (user_id = auth.uid());
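Retrieved memories only help once they are folded into the agent's prompt. A minimal formatter over rows shaped like the table above; formatMemories and its ordering policy (semantic facts before episodic events, newest first within each kind) are illustrative choices, not a LangChain convention:

```typescript
// Subset of the agent_memories columns this formatter needs.
type MemoryRow = { kind: "episodic" | "semantic"; content: string; created_at: string };

// Fold retrieved memories into a prompt block: stable semantic facts
// lead the context, then episodic events, newest first within each kind.
function formatMemories(rows: MemoryRow[]): string {
  const order = { semantic: 0, episodic: 1 } as const;
  return rows
    .slice() // don't mutate the caller's array
    .sort((a, b) =>
      order[a.kind] - order[b.kind] || b.created_at.localeCompare(a.created_at))
    .map((m) => `- [${m.kind}] ${m.content}`)
    .join("\n");
}
```

The resulting block drops straight into a system prompt ahead of the user turn.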

Step 5 — Realtime updates

Supabase Realtime streams row changes for any table added to the supabase_realtime publication. Run ALTER PUBLICATION supabase_realtime ADD TABLE agent_memories; (or toggle it in the dashboard under Database → Replication) and clients see new memories live.
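A subscription sketch, assuming agent_memories is already in the supabase_realtime publication. The filter uses PostgREST's column=eq.value syntax; memoryFilter, watchMemories, and the channel name "memories" are illustrative, and RealtimeClient is a structural stand-in for the slice of supabase-js used:

```typescript
// postgres_changes filter syntax is "column=eq.value" - scope the stream
// to one user's rows (RLS still applies on top of this filter).
function memoryFilter(userId: string): string {
  return `user_id=eq.${userId}`;
}

// Structural stand-in for the supabase-js realtime surface we touch.
type RealtimeClient = {
  channel: (name: string) => {
    on: (
      event: "postgres_changes",
      opts: object,
      cb: (payload: unknown) => void,
    ) => { subscribe: () => unknown };
  };
};

// Fire onInsert for every new memory row belonging to this user.
function watchMemories(
  supabase: RealtimeClient,
  userId: string,
  onInsert: (payload: unknown) => void,
) {
  return supabase
    .channel("memories")
    .on(
      "postgres_changes",
      { event: "INSERT", schema: "public", table: "agent_memories", filter: memoryFilter(userId) },
      onInsert,
    )
    .subscribe();
}
```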

Step 6 — Cost guard

Embedding API costs add up. Hash content before embedding to avoid duplicates:

import crypto from "node:crypto";
const hash = crypto.createHash("sha1").update(content).digest("hex");
// Skip the embedding call if this exact content was already ingested
// (store { hash } in metadata on insert so this lookup works).
const { data: dup } = await sb.from("documents").select("id").eq("metadata->>hash", hash).limit(1);
if (dup?.length) return;

Pitfalls

  • Service-role key in browser — never. Use anon key + RLS.
  • auth.uid() returns null — RPC was called with anon JWT. Pass the user's JWT in headers.
  • Index built on empty table — fine, but rebuild after first 100k inserts for better recall.
  • Realtime row size — Supabase realtime caps at 1MB per row.

CallSphere production note

UrackIT — CallSphere's non-HIPAA support vertical — runs Supabase + ChromaDB for fast prototyping and per-user memory. Healthcare and Behavioral Health stay on dedicated Postgres with the healthcare_voice Prisma schema; OneRoof keeps RLS for landlord isolation. Across 115+ DB tables · 37 agents · 90+ tools · 6 verticals, the same RRF + RLS pattern ports cleanly between hosts. Plans: $149/$499/$1,499 — 14-day trial, 22% affiliate.

FAQ

Q: Free tier limits? 500 MB DB + 2 GB egress + 50k auth users. Fine for prototypes.


Q: How big can pgvector grow on Supabase? Pro plan + dedicated compute scales to 100M+ rows.

Q: Can I bring my own LLM? Yes — edge functions can call any provider; no Supabase lock-in.

Q: What about data residency? Supabase supports US, EU, AP regions; pick at project create time.

Q: Migrate off Supabase later? pg_dump + restore to any Postgres. No proprietary surfaces beyond auth + storage.
