Learn Agentic AI

Building a Memory Layer for AI Agents: From Simple Lists to Vector Stores

Explore four approaches to building agent memory — in-memory lists, file-based storage, relational databases, and vector stores — with practical Python implementations and guidance on when to use each.

Why Agents Need a Memory Layer

Without memory, every agent interaction starts from scratch. The agent cannot recall what it did five minutes ago, what the user prefers, or what tools returned previously. A memory layer gives agents the ability to store, retrieve, and reason over information across turns and sessions.

The right memory architecture depends on your requirements: how much data you store, how you query it, whether memory persists across restarts, and whether you need semantic search. Let us walk through four approaches in increasing order of sophistication.

Approach 1: In-Memory Lists

The simplest memory is a Python list. It is fast, requires no infrastructure, and works well for prototypes and single-session agents.

(The flowchart below previews the full ingestion and query pipeline of the vector-store approach covered in Approach 4; the list-based store in this section needs none of it.)

flowchart TD
    DOC(["Document"])
    CHUNK["Chunker<br/>recursive plus overlap"]
    EMB["Embedding model"]
    META["Attach metadata<br/>source, page, tenant"]
    INDEX[("HNSW or IVF index<br/>in vector store")]
    Q(["Query"])
    QEMB["Embed query"]
    SEARCH["ANN search<br/>cosine similarity"]
    FILTER["Metadata filter<br/>tenant or date"]
    HITS(["Top-k chunks"])
    DOC --> CHUNK --> EMB --> META --> INDEX
    Q --> QEMB --> SEARCH
    INDEX --> SEARCH --> FILTER --> HITS
    style INDEX fill:#4f46e5,stroke:#4338ca,color:#fff
    style HITS fill:#059669,stroke:#047857,color:#fff
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class MemoryEntry:
    content: str
    category: str  # "fact", "preference", "task_result"
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    metadata: dict = field(default_factory=dict)

class InMemoryStore:
    def __init__(self):
        self._entries: List[MemoryEntry] = []

    def store(self, content: str, category: str, **metadata):
        entry = MemoryEntry(content=content, category=category, metadata=metadata)
        self._entries.append(entry)

    def search(self, keyword: str, category: Optional[str] = None) -> List[MemoryEntry]:
        results = []
        for entry in self._entries:
            if keyword.lower() in entry.content.lower():
                if category is None or entry.category == category:
                    results.append(entry)
        return results

    def get_recent(self, n: int = 10) -> List[MemoryEntry]:
        return self._entries[-n:]

# Usage
memory = InMemoryStore()
memory.store("User prefers dark mode", "preference")
memory.store("API returned 42 results for query X", "task_result")
results = memory.search("dark mode")

Limitations: all data is lost when the process ends; keyword search is brittle, missing semantic matches (searching "theme" never finds "dark mode"); and the linear scan does not scale beyond a few thousand entries.
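To see the brittleness concretely, here is a minimal sketch that re-declares a stripped-down version of the store (dicts instead of the dataclass, so the snippet runs standalone) and shows a semantically related query finding nothing:

```python
class InMemoryStore:
    """Stripped-down re-declaration of the store above, for a standalone demo."""

    def __init__(self):
        self._entries = []

    def store(self, content, category, **metadata):
        self._entries.append({"content": content, "category": category})

    def search(self, keyword, category=None):
        return [
            e for e in self._entries
            if keyword.lower() in e["content"].lower()
            and (category is None or e["category"] == category)
        ]

memory = InMemoryStore()
memory.store("User prefers dark mode", "preference")

print(len(memory.search("dark")))   # 1 -- substring match succeeds
print(len(memory.search("theme")))  # 0 -- semantically related, but no hit
```

The second query is exactly what the vector-store approach later in this article is designed to answer.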


Approach 2: File-Based Persistence

Adding file persistence ensures memory survives restarts. JSON files work well for small datasets.

import json
from datetime import datetime, timezone
from pathlib import Path
from typing import List

class FileMemoryStore:
    def __init__(self, path: str = "agent_memory.json"):
        self.path = Path(path)
        self._entries: List[dict] = []
        self._load()

    def _load(self):
        if self.path.exists():
            with open(self.path, "r") as f:
                self._entries = json.load(f)

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self._entries, f, indent=2, default=str)

    def store(self, content: str, category: str, **metadata):
        entry = {
            "content": content,
            "category": category,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "metadata": metadata,
        }
        self._entries.append(entry)
        self._save()

    def search(self, keyword: str) -> List[dict]:
        return [e for e in self._entries if keyword.lower() in e["content"].lower()]

File-based storage is ideal for single-user desktop agents or CLI tools. It falls apart under concurrent access (two processes rewriting the JSON file can clobber each other's writes) or when you need complex queries.
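A quick round-trip shows the persistence guarantee. This sketch re-declares a minimal version of the store (same store/search shape as above) and writes to a temporary file:

```python
import json
import tempfile
from pathlib import Path

class FileMemoryStore:
    """Minimal re-declaration of the file-backed store, for a standalone demo."""

    def __init__(self, path):
        self.path = Path(path)
        # Reload any entries a previous process left behind
        self._entries = json.loads(self.path.read_text()) if self.path.exists() else []

    def store(self, content, category):
        self._entries.append({"content": content, "category": category})
        self.path.write_text(json.dumps(self._entries))

    def search(self, keyword):
        return [e for e in self._entries if keyword.lower() in e["content"].lower()]

path = Path(tempfile.mkdtemp()) / "agent_memory.json"

first = FileMemoryStore(path)
first.store("User prefers dark mode", "preference")
del first  # simulate the process exiting

second = FileMemoryStore(path)  # a "restart": entries come back from disk
print(second.search("dark")[0]["content"])  # User prefers dark mode
```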

Approach 3: Database-Backed Memory

A relational database like SQLite or PostgreSQL adds query flexibility, concurrency support, and scalability.

import json
import sqlite3
from contextlib import contextmanager

class SQLiteMemoryStore:
    def __init__(self, db_path: str = "agent_memory.db"):
        self.db_path = db_path
        self._init_db()

    def _init_db(self):
        with self._connect() as conn:
            conn.execute("""
                CREATE TABLE IF NOT EXISTS memories (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    content TEXT NOT NULL,
                    category TEXT NOT NULL,
                    timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
                    metadata TEXT DEFAULT '{}'
                )
            """)
            conn.execute(
                "CREATE INDEX IF NOT EXISTS idx_category ON memories(category)"
            )

    @contextmanager
    def _connect(self):
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()

    def store(self, content: str, category: str, **metadata):
        with self._connect() as conn:
            conn.execute(
                "INSERT INTO memories (content, category, metadata) VALUES (?, ?, ?)",
                (content, category, json.dumps(metadata)),
            )

    def search(self, keyword: str, category: str | None = None, limit: int = 20):
        query = "SELECT * FROM memories WHERE content LIKE ?"
        params = [f"%{keyword}%"]
        if category:
            query += " AND category = ?"
            params.append(category)
        query += " ORDER BY timestamp DESC LIMIT ?"
        params.append(limit)
        with self._connect() as conn:
            return conn.execute(query, params).fetchall()
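The LIKE scan above works, but the comparison table later in this article notes that SQLite also offers partial full-text search. That refers to the FTS5 extension, which is compiled into most standard SQLite builds; a hedged sketch of what swapping it in looks like:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# FTS5 virtual table; available in most stock SQLite builds shipped with CPython.
conn.execute("CREATE VIRTUAL TABLE memories_fts USING fts5(content, category)")
conn.executemany(
    "INSERT INTO memories_fts (content, category) VALUES (?, ?)",
    [
        ("User prefers dark mode", "preference"),
        ("API returned 42 results for query X", "task_result"),
    ],
)
# MATCH consults FTS5's inverted index instead of doing a full-table LIKE scan.
rows = conn.execute(
    "SELECT content FROM memories_fts WHERE memories_fts MATCH ?", ("dark",)
).fetchall()
print(rows)  # [('User prefers dark mode',)]
```

FTS5 gives you tokenized word matching and ranking (BM25), but it is still lexical: it will not bridge "theme" to "dark mode" the way embeddings do.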

Approach 4: Vector Store Memory

When you need semantic search — finding memories by meaning rather than exact keywords — a vector store is essential. This approach embeds each memory as a high-dimensional vector and retrieves the closest matches.


import chromadb

class VectorMemoryStore:
    def __init__(self, collection_name: str = "agent_memory"):
        self.client = chromadb.PersistentClient(path="./chroma_data")
        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            metadata={"hnsw:space": "cosine"},
        )
        self._counter = self.collection.count()

    def store(self, content: str, category: str, **metadata):
        self._counter += 1
        self.collection.add(
            documents=[content],
            metadatas=[{"category": category, **metadata}],
            ids=[f"mem_{self._counter}"],
        )

    def search(self, query: str, n_results: int = 5, category: str | None = None):
        where_filter = {"category": category} if category else None
        results = self.collection.query(
            query_texts=[query],
            n_results=n_results,
            where=where_filter,
        )
        return results["documents"][0] if results["documents"] else []

With a vector store, searching for "user interface theme preference" correctly retrieves a memory stored as "User prefers dark mode" even though the two share almost no keywords.
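Under the hood, that match rests on embedding geometry rather than shared tokens. A toy sketch with made-up 3-dimensional vectors (real embedding models produce hundreds to thousands of dimensions) shows the retrieval step Chroma performs for you:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: texts with nearby meanings get nearby vectors.
memories = {
    "User prefers dark mode":  [0.9, 0.1, 0.2],
    "API returned 42 results": [0.1, 0.8, 0.3],
}
query_vec = [0.85, 0.15, 0.25]  # pretend embedding of "user interface theme preference"

best = max(memories, key=lambda text: cosine_similarity(memories[text], query_vec))
print(best)  # User prefers dark mode
```

A production vector store replaces this exhaustive loop with an approximate nearest-neighbor index (the HNSW index in the Chroma example above) so it stays fast at millions of entries.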

Comparison Table

| Approach | Persistence | Semantic Search | Concurrency | Setup Cost |
| --- | --- | --- | --- | --- |
| In-Memory List | None | No | No | Zero |
| File-Based | Restart-safe | No | No | Minimal |
| SQLite/Postgres | Full | No (FTS partial) | Yes | Low-Medium |
| Vector Store | Full | Yes | Yes | Medium |

FAQ

When should I use a vector store instead of a database?

Use a vector store when your agent needs to retrieve memories by semantic similarity — for example, finding relevant past decisions when the user describes a situation in different words. If you only need exact-match or keyword lookups, a relational database is simpler and faster.

Can I combine a relational database with a vector store?

Yes, this is a common production pattern. Store structured data (timestamps, categories, metadata) in PostgreSQL and store the embedding vectors in a dedicated vector store like Chroma, Pinecone, or pgvector. Query both and merge results.
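One hedged sketch of the merge step is reciprocal rank fusion, a common way to combine a keyword ranking with a semantic ranking without having to calibrate their incompatible scores. The id lists below are illustrative stand-ins for results from PostgreSQL and the vector store:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked id lists; ids ranked high in any list float up."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # k dampens the influence of any single list's top result
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["mem_3", "mem_7", "mem_1"]   # e.g. from a SQL LIKE/FTS query
semantic_hits = ["mem_7", "mem_2", "mem_3"]  # e.g. from the vector store

merged = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(merged[0])  # mem_7 -- ranked near the top of both lists
```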

How much memory should an agent retain?

It depends on the use case. Customer support agents might keep the last 30 days. Research agents might keep everything. Implement a retention policy that expires old, low-relevance memories to keep storage costs manageable and retrieval quality high.
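A minimal retention sweep might look like the sketch below. It assumes entries carry a UTC timestamp and an optional relevance score; both field names are illustrative, not part of the stores above:

```python
from datetime import datetime, timedelta, timezone

def sweep(entries, max_age_days=30, min_relevance=0.2):
    """Keep entries that are either recent or high-relevance; drop the rest."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return [
        e for e in entries
        if e["timestamp"] >= cutoff or e.get("relevance", 0.0) >= min_relevance
    ]

now = datetime.now(timezone.utc)
entries = [
    {"content": "old, unimportant", "timestamp": now - timedelta(days=90), "relevance": 0.05},
    {"content": "old, but key fact", "timestamp": now - timedelta(days=90), "relevance": 0.9},
    {"content": "fresh", "timestamp": now, "relevance": 0.1},
]
kept = sweep(entries)
print([e["content"] for e in kept])  # ['old, but key fact', 'fresh']
```

Run the sweep on a schedule (or lazily at read time) so that storage stays bounded while durable facts survive.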


#AgentMemory #VectorStores #DatabaseDesign #Python #AgenticAI #LearnAI #AIEngineering


