Building a Memory Layer for AI Agents: From Simple Lists to Vector Stores
Explore four approaches to building agent memory — in-memory lists, file-based storage, relational databases, and vector stores — with practical Python implementations and guidance on when to use each.
Why Agents Need a Memory Layer
Without memory, every agent interaction starts from scratch. The agent cannot recall what it did five minutes ago, what the user prefers, or what tools returned previously. A memory layer gives agents the ability to store, retrieve, and reason over information across turns and sessions.
The right memory architecture depends on your requirements: how much data you store, how you query it, whether memory persists across restarts, and whether you need semantic search. Let us walk through four approaches in increasing order of sophistication.
Approach 1: In-Memory Lists
The simplest memory is a Python list. It is fast, requires no infrastructure, and works well for prototypes and single-session agents.
The diagram below previews the full pipeline of the most sophisticated option, the vector store of Approach 4; the three simpler approaches drop pieces of it (embeddings, the ANN index, metadata filters) in exchange for simplicity.

```mermaid
flowchart TD
    DOC(["Document"])
    CHUNK["Chunker<br/>recursive plus overlap"]
    EMB["Embedding model"]
    META["Attach metadata<br/>source, page, tenant"]
    INDEX[("HNSW or IVF index<br/>in vector store")]
    Q(["Query"])
    QEMB["Embed query"]
    SEARCH["ANN search<br/>cosine similarity"]
    FILTER["Metadata filter<br/>tenant or date"]
    HITS(["Top-k chunks"])
    DOC --> CHUNK --> EMB --> META --> INDEX
    Q --> QEMB --> SEARCH
    INDEX --> SEARCH --> FILTER --> HITS
    style INDEX fill:#4f46e5,stroke:#4338ca,color:#fff
    style HITS fill:#059669,stroke:#047857,color:#fff
```
```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class MemoryEntry:
    content: str
    category: str  # "fact", "preference", "task_result"
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    metadata: dict = field(default_factory=dict)

class InMemoryStore:
    def __init__(self):
        self._entries: List[MemoryEntry] = []

    def store(self, content: str, category: str, **metadata):
        entry = MemoryEntry(content=content, category=category, metadata=metadata)
        self._entries.append(entry)

    def search(self, keyword: str, category: Optional[str] = None) -> List[MemoryEntry]:
        results = []
        for entry in self._entries:
            if keyword.lower() in entry.content.lower():
                if category is None or entry.category == category:
                    results.append(entry)
        return results

    def get_recent(self, n: int = 10) -> List[MemoryEntry]:
        return self._entries[-n:]

# Usage
memory = InMemoryStore()
memory.store("User prefers dark mode", "preference")
memory.store("API returned 42 results for query X", "task_result")
results = memory.search("dark mode")
```
Limitations: All data is lost when the process ends. Keyword search is brittle — it misses semantic matches. It does not scale beyond a few thousand entries.
Approach 2: File-Based Persistence
Adding file persistence ensures memory survives restarts. JSON files work well for small datasets.
```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import List

class FileMemoryStore:
    def __init__(self, path: str = "agent_memory.json"):
        self.path = Path(path)
        self._entries: List[dict] = []
        self._load()

    def _load(self):
        if self.path.exists():
            with open(self.path, "r") as f:
                self._entries = json.load(f)

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self._entries, f, indent=2, default=str)

    def store(self, content: str, category: str, **metadata):
        entry = {
            "content": content,
            "category": category,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "metadata": metadata,
        }
        self._entries.append(entry)
        self._save()

    def search(self, keyword: str) -> List[dict]:
        return [e for e in self._entries if keyword.lower() in e["content"].lower()]
```
File-based storage is ideal for single-user desktop agents or CLI tools. It falls apart with concurrent access or when you need complex queries.
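One failure mode is worth hedging against even in the single-user case: a crash mid-write leaves a truncated, unparseable JSON file. A common mitigation is an atomic write, where you dump to a temporary file in the same directory and then swap it into place with `os.replace`. A minimal sketch (the `atomic_save` helper name is mine, not part of the class above):

```python
import json
import os
import tempfile
from pathlib import Path

def atomic_save(path: Path, entries: list) -> None:
    """Write JSON to a temp file, then atomically swap it into place.

    os.replace is atomic on POSIX (and on Windows for same-volume paths),
    so readers never observe a half-written file.
    """
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(entries, f, indent=2, default=str)
        os.replace(tmp_name, path)  # atomic swap over the old file
    except BaseException:
        os.unlink(tmp_name)  # clean up the temp file on failure
        raise

# Usage
path = Path("agent_memory.json")
atomic_save(path, [{"content": "User prefers dark mode", "category": "preference"}])
```

Note this prevents torn writes, not lost updates from concurrent writers; for that you need file locking or a database.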
Approach 3: Database-Backed Memory
A relational database like SQLite or PostgreSQL adds query flexibility, concurrency support, and scalability.
```python
import json
import sqlite3
from contextlib import contextmanager
from typing import Optional

class SQLiteMemoryStore:
    def __init__(self, db_path: str = "agent_memory.db"):
        self.db_path = db_path
        self._init_db()

    def _init_db(self):
        with self._connect() as conn:
            conn.execute("""
                CREATE TABLE IF NOT EXISTS memories (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    content TEXT NOT NULL,
                    category TEXT NOT NULL,
                    timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
                    metadata TEXT DEFAULT '{}'
                )
            """)
            conn.execute(
                "CREATE INDEX IF NOT EXISTS idx_category ON memories(category)"
            )

    @contextmanager
    def _connect(self):
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()

    def store(self, content: str, category: str, **metadata):
        with self._connect() as conn:
            conn.execute(
                "INSERT INTO memories (content, category, metadata) VALUES (?, ?, ?)",
                (content, category, json.dumps(metadata)),
            )

    def search(self, keyword: str, category: Optional[str] = None, limit: int = 20):
        query = "SELECT * FROM memories WHERE content LIKE ?"
        params = [f"%{keyword}%"]
        if category:
            query += " AND category = ?"
            params.append(category)
        query += " ORDER BY timestamp DESC LIMIT ?"
        params.append(limit)
        with self._connect() as conn:
            return conn.execute(query, params).fetchall()
```
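The `LIKE` search above does a full table scan and cannot rank results. SQLite ships with the FTS5 extension, which gives indexed, relevance-ranked full-text search with no extra dependencies (assuming your SQLite build includes FTS5, as most modern ones do). A standalone sketch, not wired into the class above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table: content is tokenized and indexed for full-text search
conn.execute("CREATE VIRTUAL TABLE memories_fts USING fts5(content, category)")
conn.executemany(
    "INSERT INTO memories_fts (content, category) VALUES (?, ?)",
    [
        ("User prefers dark mode", "preference"),
        ("API returned 42 results for query X", "task_result"),
        ("User asked about dark chocolate", "fact"),
    ],
)
# MATCH uses the full-text index; multiple terms are implicitly ANDed,
# and bm25() ranks rows by relevance (lower score is better)
rows = conn.execute(
    "SELECT content FROM memories_fts WHERE memories_fts MATCH ? "
    "ORDER BY bm25(memories_fts)",
    ("dark mode",),
).fetchall()
```

FTS still matches tokens, not meaning: a query for "theme preference" will not find "dark mode", which is exactly what motivates Approach 4.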
Approach 4: Vector Store Memory
When you need semantic search — finding memories by meaning rather than exact keywords — a vector store is essential. This approach embeds each memory as a high-dimensional vector and retrieves the closest matches.
```python
import chromadb
from typing import Optional

class VectorMemoryStore:
    def __init__(self, collection_name: str = "agent_memory"):
        self.client = chromadb.PersistentClient(path="./chroma_data")
        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            metadata={"hnsw:space": "cosine"},  # use cosine distance for ANN search
        )
        self._counter = self.collection.count()

    def store(self, content: str, category: str, **metadata):
        # Simple sequential id scheme; can collide if entries are ever deleted
        self._counter += 1
        self.collection.add(
            documents=[content],
            metadatas=[{"category": category, **metadata}],
            ids=[f"mem_{self._counter}"],
        )

    def search(self, query: str, n_results: int = 5, category: Optional[str] = None):
        where_filter = {"category": category} if category else None
        results = self.collection.query(
            query_texts=[query],
            n_results=n_results,
            where=where_filter,
        )
        return results["documents"][0] if results["documents"] else []
```
With a vector store, searching for "user interface theme preference" correctly retrieves a memory stored as "User prefers dark mode" even though none of the words match.
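Under the hood, "closest match" means cosine similarity between embedding vectors. To make that concrete without a real embedding model, here is the arithmetic on tiny hand-made 3-d vectors (real embeddings have hundreds or thousands of dimensions; the vectors below are invented for illustration):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product of the vectors divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": dimensions loosely encode (UI-related, theme-related, numeric-result)
memories = {
    "User prefers dark mode":  [0.9, 0.8, 0.1],
    "API returned 42 results": [0.1, 0.0, 0.9],
}
query = [0.8, 0.9, 0.0]  # stands in for embedding "user interface theme preference"

# Retrieval is just "which stored vector has the highest similarity to the query?"
best = max(memories, key=lambda m: cosine_similarity(memories[m], query))
```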
Comparison Table
| Approach | Persistence | Semantic Search | Concurrency | Setup Cost |
|---|---|---|---|---|
| In-Memory List | None | No | No | Zero |
| File-Based | Restart-safe | No | No | Minimal |
| SQLite/Postgres | Full | No (FTS partial) | Yes | Low-Medium |
| Vector Store | Full | Yes | Yes | Medium |
FAQ
When should I use a vector store instead of a database?
Use a vector store when your agent needs to retrieve memories by semantic similarity — for example, finding relevant past decisions when the user describes a situation in different words. If you only need exact-match or keyword lookups, a relational database is simpler and faster.
Can I combine a relational database with a vector store?
Yes, this is a common production pattern. Store structured data (timestamps, categories, metadata) in PostgreSQL and store the embedding vectors in a dedicated vector store like Chroma, Pinecone, or pgvector. Query both and merge results.
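Once you query both stores, you need to merge two differently-scored result lists. Reciprocal rank fusion (RRF) is a simple, score-free way to do that: each result earns 1/(k + rank) from every list it appears in, so items ranked highly in multiple lists rise to the top. A minimal sketch (k=60 is the conventional default from the RRF literature; the ids are invented):

```python
from collections import defaultdict

def rrf_merge(*ranked_lists: list[str], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Only ranks are used, so incompatible scores (SQL relevance vs.
    cosine distance) never have to be normalized against each other.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.__getitem__, reverse=True)

# Keyword search and vector search each return ids in their own order
keyword_hits = ["mem_3", "mem_1", "mem_7"]
vector_hits = ["mem_1", "mem_9", "mem_3"]
merged = rrf_merge(keyword_hits, vector_hits)
```

Here `mem_1` wins because it appears near the top of both lists.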
How much memory should an agent retain?
It depends on the use case. Customer support agents might keep the last 30 days. Research agents might keep everything. Implement a retention policy that expires old, low-relevance memories to keep storage costs manageable and retrieval quality high.
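A retention policy can be as simple as a periodic sweep that drops entries past a maximum age while sparing protected categories. A sketch against plain dict entries like those in Approach 2 (the field names match that example; the age limit and protected set are illustrative defaults, not recommendations):

```python
from datetime import datetime, timedelta, timezone

def prune(entries: list[dict], max_age_days: int = 30,
          keep_categories: frozenset = frozenset({"preference"})) -> list[dict]:
    """Drop entries older than max_age_days unless their category is protected."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return [
        e for e in entries
        if e["category"] in keep_categories
        or datetime.fromisoformat(e["timestamp"]) >= cutoff
    ]

# Usage: stale task results expire, but long-lived preferences survive
now = datetime.now(timezone.utc)
entries = [
    {"content": "old result", "category": "task_result",
     "timestamp": (now - timedelta(days=90)).isoformat()},
    {"content": "dark mode", "category": "preference",
     "timestamp": (now - timedelta(days=90)).isoformat()},
    {"content": "fresh result", "category": "task_result",
     "timestamp": now.isoformat()},
]
kept = prune(entries)
```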
#AgentMemory #VectorStores #DatabaseDesign #Python #AgenticAI #LearnAI #AIEngineering