Building a Memory Layer for AI Agents: From Simple Lists to Vector Stores
Explore four approaches to building agent memory — in-memory lists, file-based storage, relational databases, and vector stores — with practical Python implementations and guidance on when to use each.
Why Agents Need a Memory Layer
Without memory, every agent interaction starts from scratch. The agent cannot recall what it did five minutes ago, what the user prefers, or what tools returned previously. A memory layer gives agents the ability to store, retrieve, and reason over information across turns and sessions.
The right memory architecture depends on your requirements: how much data you store, how you query it, whether memory persists across restarts, and whether you need semantic search. Let us walk through four approaches in increasing order of sophistication.
Approach 1: In-Memory Lists
The simplest memory is a Python list. It is fast, requires no infrastructure, and works well for prototypes and single-session agents.
The diagram below previews the full pipeline of the most sophisticated option, the vector store of Approach 4; the three simpler approaches drop pieces of it (embeddings, the ANN index, metadata filters) in exchange for simplicity.

```mermaid
flowchart TD
    DOC(["Document"])
    CHUNK["Chunker<br/>recursive plus overlap"]
    EMB["Embedding model"]
    META["Attach metadata<br/>source, page, tenant"]
    INDEX[("HNSW or IVF index<br/>in vector store")]
    Q(["Query"])
    QEMB["Embed query"]
    SEARCH["ANN search<br/>cosine similarity"]
    FILTER["Metadata filter<br/>tenant or date"]
    HITS(["Top-k chunks"])
    DOC --> CHUNK --> EMB --> META --> INDEX
    Q --> QEMB --> SEARCH
    INDEX --> SEARCH --> FILTER --> HITS
    style INDEX fill:#4f46e5,stroke:#4338ca,color:#fff
    style HITS fill:#059669,stroke:#047857,color:#fff
```
```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class MemoryEntry:
    content: str
    category: str  # "fact", "preference", "task_result"
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    metadata: dict = field(default_factory=dict)

class InMemoryStore:
    def __init__(self):
        self._entries: List[MemoryEntry] = []

    def store(self, content: str, category: str, **metadata):
        entry = MemoryEntry(content=content, category=category, metadata=metadata)
        self._entries.append(entry)

    def search(self, keyword: str, category: Optional[str] = None) -> List[MemoryEntry]:
        results = []
        for entry in self._entries:
            if keyword.lower() in entry.content.lower():
                if category is None or entry.category == category:
                    results.append(entry)
        return results

    def get_recent(self, n: int = 10) -> List[MemoryEntry]:
        return self._entries[-n:]

# Usage
memory = InMemoryStore()
memory.store("User prefers dark mode", "preference")
memory.store("API returned 42 results for query X", "task_result")
results = memory.search("dark mode")
```
Limitations: All data is lost when the process ends. Keyword search is brittle — it misses semantic matches. It does not scale beyond a few thousand entries.
Approach 2: File-Based Persistence
Adding file persistence ensures memory survives restarts. JSON files work well for small datasets.
```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import List

class FileMemoryStore:
    def __init__(self, path: str = "agent_memory.json"):
        self.path = Path(path)
        self._entries: List[dict] = []
        self._load()

    def _load(self):
        if self.path.exists():
            with open(self.path, "r") as f:
                self._entries = json.load(f)

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self._entries, f, indent=2, default=str)

    def store(self, content: str, category: str, **metadata):
        entry = {
            "content": content,
            "category": category,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "metadata": metadata,
        }
        self._entries.append(entry)
        self._save()

    def search(self, keyword: str) -> List[dict]:
        return [e for e in self._entries if keyword.lower() in e["content"].lower()]
```
File-based storage is ideal for single-user desktop agents or CLI tools. It falls apart with concurrent access or when you need complex queries.
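One failure mode is worth hedging against even in the single-user case: a crash mid-write leaves a truncated, unparseable JSON file. A common mitigation is an atomic write, where you dump to a temporary file in the same directory and then swap it into place with `os.replace`. A minimal sketch (the `atomic_save` helper name is mine, not part of the class above):

```python
import json
import os
import tempfile
from pathlib import Path

def atomic_save(path: Path, entries: list) -> None:
    """Write JSON to a temp file, then atomically swap it into place.

    os.replace is atomic on POSIX (and on Windows for same-volume paths),
    so readers never observe a half-written file.
    """
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(entries, f, indent=2, default=str)
        os.replace(tmp_name, path)  # atomic swap over the old file
    except BaseException:
        os.unlink(tmp_name)  # clean up the temp file on failure
        raise

# Usage
path = Path("agent_memory.json")
atomic_save(path, [{"content": "User prefers dark mode", "category": "preference"}])
```

Note this prevents torn writes, not lost updates from concurrent writers; for that you need file locking or a database.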
Approach 3: Database-Backed Memory
A relational database like SQLite or PostgreSQL adds query flexibility, concurrency support, and scalability.
```python
import json
import sqlite3
from contextlib import contextmanager
from typing import Optional

class SQLiteMemoryStore:
    def __init__(self, db_path: str = "agent_memory.db"):
        self.db_path = db_path
        self._init_db()

    def _init_db(self):
        with self._connect() as conn:
            conn.execute("""
                CREATE TABLE IF NOT EXISTS memories (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    content TEXT NOT NULL,
                    category TEXT NOT NULL,
                    timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
                    metadata TEXT DEFAULT '{}'
                )
            """)
            conn.execute(
                "CREATE INDEX IF NOT EXISTS idx_category ON memories(category)"
            )

    @contextmanager
    def _connect(self):
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()

    def store(self, content: str, category: str, **metadata):
        with self._connect() as conn:
            conn.execute(
                "INSERT INTO memories (content, category, metadata) VALUES (?, ?, ?)",
                (content, category, json.dumps(metadata)),
            )

    def search(self, keyword: str, category: Optional[str] = None, limit: int = 20):
        query = "SELECT * FROM memories WHERE content LIKE ?"
        params = [f"%{keyword}%"]
        if category:
            query += " AND category = ?"
            params.append(category)
        query += " ORDER BY timestamp DESC LIMIT ?"
        params.append(limit)
        with self._connect() as conn:
            return conn.execute(query, params).fetchall()
```
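The `LIKE` search above does a full table scan and cannot rank results. SQLite ships with the FTS5 extension, which gives indexed, relevance-ranked full-text search with no extra dependencies (assuming your SQLite build includes FTS5, as most modern ones do). A standalone sketch, not wired into the class above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table: content is tokenized and indexed for full-text search
conn.execute("CREATE VIRTUAL TABLE memories_fts USING fts5(content, category)")
conn.executemany(
    "INSERT INTO memories_fts (content, category) VALUES (?, ?)",
    [
        ("User prefers dark mode", "preference"),
        ("API returned 42 results for query X", "task_result"),
        ("User asked about dark chocolate", "fact"),
    ],
)
# MATCH uses the full-text index; multiple terms are implicitly ANDed,
# and bm25() ranks rows by relevance (lower score is better)
rows = conn.execute(
    "SELECT content FROM memories_fts WHERE memories_fts MATCH ? "
    "ORDER BY bm25(memories_fts)",
    ("dark mode",),
).fetchall()
```

FTS still matches tokens, not meaning: a query for "theme preference" will not find "dark mode", which is exactly what motivates Approach 4.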
Approach 4: Vector Store Memory
When you need semantic search — finding memories by meaning rather than exact keywords — a vector store is essential. This approach embeds each memory as a high-dimensional vector and retrieves the closest matches.
```python
import chromadb
from typing import Optional

class VectorMemoryStore:
    def __init__(self, collection_name: str = "agent_memory"):
        self.client = chromadb.PersistentClient(path="./chroma_data")
        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            metadata={"hnsw:space": "cosine"},  # use cosine distance for ANN search
        )
        self._counter = self.collection.count()

    def store(self, content: str, category: str, **metadata):
        # Simple sequential id scheme; can collide if entries are ever deleted
        self._counter += 1
        self.collection.add(
            documents=[content],
            metadatas=[{"category": category, **metadata}],
            ids=[f"mem_{self._counter}"],
        )

    def search(self, query: str, n_results: int = 5, category: Optional[str] = None):
        where_filter = {"category": category} if category else None
        results = self.collection.query(
            query_texts=[query],
            n_results=n_results,
            where=where_filter,
        )
        return results["documents"][0] if results["documents"] else []
```
With a vector store, searching for "user interface theme preference" correctly retrieves a memory stored as "User prefers dark mode" even though none of the words match.
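Under the hood, "closest match" means cosine similarity between embedding vectors. To make that concrete without a real embedding model, here is the arithmetic on tiny hand-made 3-d vectors (real embeddings have hundreds or thousands of dimensions; the vectors below are invented for illustration):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product of the vectors divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": dimensions loosely encode (UI-related, theme-related, numeric-result)
memories = {
    "User prefers dark mode":  [0.9, 0.8, 0.1],
    "API returned 42 results": [0.1, 0.0, 0.9],
}
query = [0.8, 0.9, 0.0]  # stands in for embedding "user interface theme preference"

# Retrieval is just "which stored vector has the highest similarity to the query?"
best = max(memories, key=lambda m: cosine_similarity(memories[m], query))
```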
Comparison Table
| Approach | Persistence | Semantic Search | Concurrency | Setup Cost |
|---|---|---|---|---|
| In-Memory List | None | No | No | Zero |
| File-Based | Restart-safe | No | No | Minimal |
| SQLite/Postgres | Full | No (FTS partial) | Yes | Low-Medium |
| Vector Store | Full | Yes | Yes | Medium |
FAQ
When should I use a vector store instead of a database?
Use a vector store when your agent needs to retrieve memories by semantic similarity — for example, finding relevant past decisions when the user describes a situation in different words. If you only need exact-match or keyword lookups, a relational database is simpler and faster.
Can I combine a relational database with a vector store?
Yes, this is a common production pattern. Store structured data (timestamps, categories, metadata) in PostgreSQL and store the embedding vectors in a dedicated vector store like Chroma, Pinecone, or pgvector. Query both and merge results.
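Once you query both stores, you need to merge two differently-scored result lists. Reciprocal rank fusion (RRF) is a simple, score-free way to do that: each result earns 1/(k + rank) from every list it appears in, so items ranked highly in multiple lists rise to the top. A minimal sketch (k=60 is the conventional default from the RRF literature; the ids are invented):

```python
from collections import defaultdict

def rrf_merge(*ranked_lists: list[str], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Only ranks are used, so incompatible scores (SQL relevance vs.
    cosine distance) never have to be normalized against each other.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.__getitem__, reverse=True)

# Keyword search and vector search each return ids in their own order
keyword_hits = ["mem_3", "mem_1", "mem_7"]
vector_hits = ["mem_1", "mem_9", "mem_3"]
merged = rrf_merge(keyword_hits, vector_hits)
```

Here `mem_1` wins because it appears near the top of both lists.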
How much memory should an agent retain?
It depends on the use case. Customer support agents might keep the last 30 days. Research agents might keep everything. Implement a retention policy that expires old, low-relevance memories to keep storage costs manageable and retrieval quality high.
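A retention policy can be as simple as a periodic sweep that drops entries past a maximum age while sparing protected categories. A sketch against plain dict entries like those in Approach 2 (the field names match that example; the age limit and protected set are illustrative defaults, not recommendations):

```python
from datetime import datetime, timedelta, timezone

def prune(entries: list[dict], max_age_days: int = 30,
          keep_categories: frozenset = frozenset({"preference"})) -> list[dict]:
    """Drop entries older than max_age_days unless their category is protected."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return [
        e for e in entries
        if e["category"] in keep_categories
        or datetime.fromisoformat(e["timestamp"]) >= cutoff
    ]

# Usage: stale task results expire, but long-lived preferences survive
now = datetime.now(timezone.utc)
entries = [
    {"content": "old result", "category": "task_result",
     "timestamp": (now - timedelta(days=90)).isoformat()},
    {"content": "dark mode", "category": "preference",
     "timestamp": (now - timedelta(days=90)).isoformat()},
    {"content": "fresh result", "category": "task_result",
     "timestamp": now.isoformat()},
]
kept = prune(entries)
```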
#AgentMemory #VectorStores #DatabaseDesign #Python #AgenticAI #LearnAI #AIEngineering