Learn Agentic AI

SQLiteSession: Building Persistent Conversations for AI Agents

Learn how to use SQLiteSession in the OpenAI Agents SDK to build persistent multi-turn conversations with automatic history retrieval, storage, and session limits.

Why Agent Memory Matters

Every meaningful conversation depends on memory. When a user asks your AI agent "What did I just say?" or "Can you change the second item?", the agent needs access to the conversation history. Without persistence, every interaction starts from zero — the agent has no idea who it is talking to or what has been discussed.

The OpenAI Agents SDK solves this with sessions — pluggable backends that store and retrieve conversation history automatically. The simplest and most portable option is SQLiteSession, which uses SQLite as the storage engine.

SQLiteSession Basics

SQLiteSession comes built into the OpenAI Agents SDK. It supports two modes: in-memory (for testing and ephemeral conversations) and file-based (for true persistence across process restarts). The diagram below places session storage in a broader agent memory architecture: the session is what persists the working and episodic layers between turns.

flowchart TD
    MSG(["New message"])
    WORKING["Working memory<br/>rolling window"]
    EPISODIC[("Episodic memory<br/>past sessions")]
    SEMANTIC[("Semantic memory<br/>facts and preferences")]
    SUM["Summarizer<br/>compresses old turns"]
    ROUTER{"Retrieve<br/>needed memories"}
    PROMPT["Assembled context"]
    LLM["LLM"]
    UPD["Memory updater<br/>writes new facts"]
    MSG --> WORKING --> ROUTER
    ROUTER -->|Past sessions| EPISODIC
    ROUTER -->|User facts| SEMANTIC
    EPISODIC --> SUM --> PROMPT
    SEMANTIC --> PROMPT
    WORKING --> PROMPT --> LLM --> UPD
    UPD --> EPISODIC
    UPD --> SEMANTIC
    style ROUTER fill:#4f46e5,stroke:#4338ca,color:#fff
    style LLM fill:#f59e0b,stroke:#d97706,color:#1f2937
    style EPISODIC fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
    style SEMANTIC fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b

In-Memory Sessions

An in-memory session lives only as long as the Python process. It is perfect for unit tests and short-lived scripts where you need multi-turn behavior but do not need data to survive a restart.

from agents import Agent, Runner, SQLiteSession

# In-memory session — data lost when the process ends.
# The session ID is bound at construction time.
session = SQLiteSession("demo-session")

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant. Remember what the user tells you.",
)

async def chat(user_message: str) -> str:
    # The session carries its own ID, so the runner needs no extra parameter
    result = await Runner.run(agent, user_message, session=session)
    return result.final_output

File-Based Sessions

For real persistence, pass a file path to SQLiteSession. The database file is created automatically if it does not exist.

from agents import SQLiteSession

# File-based session — survives process restarts
session = SQLiteSession("user-123", db_path="./conversations.db")

That single change means your agent remembers conversations across restarts, deployments, and even server migrations (just copy the .db file).
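The persistence difference is a property of SQLite itself: an in-memory database disappears with its connection, while a file-backed one survives a reconnect. A quick stdlib demonstration (the table and values here are illustrative, not the SDK's actual schema):

```python
import os
import sqlite3
import tempfile

# File-backed: data written by one connection is visible to a later one
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE items (content TEXT)")
conn.execute("INSERT INTO items VALUES ('My favorite color is blue.')")
conn.commit()
conn.close()

conn2 = sqlite3.connect(db_path)  # simulates a process restart
rows = conn2.execute("SELECT content FROM items").fetchall()
print(rows)  # [('My favorite color is blue.',)]

# In-memory: a fresh ':memory:' connection starts empty every time
mem = sqlite3.connect(":memory:")
tables = mem.execute("SELECT name FROM sqlite_master").fetchall()
print(tables)  # []
```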

Automatic History Retrieval and Storage

The key design principle of sessions in the Agents SDK is that they are transparent. You do not need to manually load history before a run or save it after. The runner handles both automatically.

When you call Runner.run() with a session:

  1. Before the run: The runner calls session.get_items() to load the prior conversation items for that session.
  2. During the run: The agent sees the full history as context and generates a response.
  3. After the run: The runner calls session.add_items(new_items) to persist the new turn.
import asyncio
from agents import Agent, Runner, SQLiteSession

# The session ID identifies this conversation in the database file
session = SQLiteSession("user-123-conversation-1", db_path="./my_agent.db")

agent = Agent(
    name="MemoryBot",
    instructions="You remember everything the user tells you. When asked to recall, be specific.",
)

async def main():
    # Turn 1
    result = await Runner.run(agent, "My favorite color is blue.", session=session)
    print(result.final_output)

    # Turn 2 — the agent automatically sees Turn 1
    result = await Runner.run(agent, "What is my favorite color?", session=session)
    print(result.final_output)  # "Your favorite color is blue."

asyncio.run(main())

No manual history threading. No message array management. The session handles it.
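The load-run-save cycle is easy to picture as a small wrapper. This sketch replaces both the SDK and the LLM: `fake_model`, `run_turn`, and the dict-backed store are all illustrative stand-ins, not SDK internals:

```python
# Stand-in for the SQLite tables: session_id -> list of conversation items
history_store: dict[str, list[dict]] = {}

def fake_model(context: list[dict]) -> str:
    # Pretend LLM: just reports how many prior items it can see
    return f"I can see {len(context) - 1} prior items."

def run_turn(session_id: str, user_message: str) -> str:
    history = history_store.setdefault(session_id, [])       # 1. load prior items
    context = history + [{"role": "user", "content": user_message}]
    reply = fake_model(context)                              # 2. generate with full context
    history += [{"role": "user", "content": user_message},
                {"role": "assistant", "content": reply}]     # 3. persist the new turn
    return reply

print(run_turn("s1", "My favorite color is blue."))   # I can see 0 prior items.
print(run_turn("s1", "What is my favorite color?"))   # I can see 2 prior items.
```

Each call sees two more items than the last, because every turn appends a user message and an assistant reply to the store.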

Multi-Turn Conversation Example

Let us build a more realistic example — a travel planning agent that accumulates preferences over multiple turns.

import asyncio
from agents import Agent, Runner, SQLiteSession

session = SQLiteSession("trip-planning-session-42", db_path="./travel_planner.db")

travel_agent = Agent(
    name="TravelPlanner",
    instructions="""You are a travel planning assistant. As the user shares preferences,
    build up a mental model of their ideal trip. Summarize what you know when asked.
    Be specific about dates, budget, and destinations mentioned.""",
)

async def multi_turn_demo():
    turns = [
        "I want to visit Japan in October.",
        "My budget is around $3000 for flights and hotels.",
        "I love hiking and traditional temples.",
        "Can you summarize what you know about my trip so far?",
    ]

    for message in turns:
        print(f"User: {message}")
        result = await Runner.run(travel_agent, message, session=session)
        print(f"Agent: {result.final_output}\n")

asyncio.run(multi_turn_demo())

Each turn builds on the previous ones. The fourth message triggers a summary that references Japan, October, the $3000 budget, and the hiking and temples preferences — all pulled from the session automatically.


Limiting Retrieved History

Long conversations accumulate tokens fast. Sessions load the full stored history by default, but the Session protocol's get_items() method accepts a limit parameter that returns only the most recent items. To run with a capped window, fetch the trimmed history yourself, pass it as the input list, and write the new turn back:

from agents import Agent, Runner, SQLiteSession

session = SQLiteSession("user-456", db_path="./conversations.db")

# Load only the last 20 items from history
history = await session.get_items(limit=20)

new_message = {"role": "user", "content": "What were we discussing?"}
result = await Runner.run(agent, history + [new_message])

# Persist the new turn manually, since the runner was not given the session
await session.add_items([new_message] + [item.to_input_item() for item in result.new_items])

This is critical for production systems where conversations can span hundreds of turns. Without a limit, you risk exceeding the model's context window or paying for unnecessary input tokens.

Choosing the Right Limit

  • Quick Q&A bot: 10-20 items
  • Customer support agent: 30-50 items
  • Long-running project assistant: 50-100 items
  • Unlimited context: no fixed limit; pair with compaction of older turns
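"Compaction" here means replacing older turns with a summary item so the window stays bounded while nothing is fully forgotten. A generic sketch, with a stub summarizer standing in for an LLM call (none of these names come from the SDK):

```python
from typing import Callable

def compact(items: list[dict], keep_last: int,
            summarize: Callable[[list[dict]], str]) -> list[dict]:
    """Replace everything except the last `keep_last` items with one summary item."""
    if len(items) <= keep_last:
        return items
    old, recent = items[:-keep_last], items[-keep_last:]
    summary = {"role": "system",
               "content": f"Summary of earlier turns: {summarize(old)}"}
    return [summary] + recent

# Stub summarizer — in practice this would be an LLM call
stub = lambda old: f"{len(old)} earlier messages about trip planning"

history = [{"role": "user", "content": f"message {i}"} for i in range(10)]
window = compact(history, keep_last=4, summarize=stub)
print(len(window))           # 5: one summary item plus the last four messages
print(window[0]["content"])  # Summary of earlier turns: 6 earlier messages about trip planning
```

Run compaction whenever the stored history crosses a threshold, then write the compacted window back to the session in place of the old items.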

Full Working Chatbot Example

Here is a complete, production-style chatbot that uses SQLiteSession with file persistence, per-session IDs, and proper async handling.

import asyncio
import uuid
from agents import Agent, Runner, SQLiteSession

DB_PATH = "./chatbot_sessions.db"

assistant = Agent(
    name="ChatBot",
    instructions="""You are a friendly and helpful conversational assistant.
    You remember the user's name, preferences, and prior requests.
    When the user returns to a topic discussed earlier, reference it naturally.
    Keep responses concise but warm.""",
)

async def handle_message(session_id: str, user_input: str) -> str:
    """Process a single user message and return the agent response."""
    # File-backed sessions with the same ID share history across calls
    session = SQLiteSession(session_id, db_path=DB_PATH)
    result = await Runner.run(assistant, user_input, session=session)
    return result.final_output

async def main():
    print("ChatBot ready. Type 'quit' to exit, 'new' for a new session.\n")
    session_id = str(uuid.uuid4())
    print(f"Session: {session_id}\n")

    while True:
        user_input = input("You: ").strip()
        if user_input.lower() == "quit":
            break
        if user_input.lower() == "new":
            session_id = str(uuid.uuid4())
            print(f"\nNew session: {session_id}\n")
            continue
        if not user_input:
            continue

        response = await handle_message(session_id, user_input)
        print(f"Bot: {response}\n")

asyncio.run(main())

What Happens Under the Hood

  1. The user types a message.
  2. Runner.run() calls session.get_items(), which runs a SELECT over that session's stored rows in insertion order.
  3. The retrieved items are prepended to the conversation context.
  4. The agent generates a response using the full context.
  5. The new user message and agent response are persisted via session.add_items().
  6. On the next turn, the cycle repeats with the updated history.

When to Use SQLiteSession

SQLiteSession is the right choice when:

  • You are building a single-server application or CLI tool
  • You want zero-dependency persistence (SQLite is built into Python)
  • You need a quick prototype with real persistence
  • Your conversations are bound to a single process or machine

For distributed systems where multiple workers need access to the same sessions, look at RedisSession or SQLAlchemySession instead. But for a remarkable number of use cases — personal assistants, development tools, local chatbots, and MVP products — SQLiteSession is all you need.
