API Pagination for AI Agent Data: Cursor-Based, Offset, and Keyset Pagination

Why Pagination Matters for AI Agent APIs

AI agents generate enormous volumes of data: conversation histories, tool call logs, evaluation results, and audit trails. Returning all records in a single response is impractical. Without pagination, a single query for an agent's conversation history could return millions of messages, consuming excessive memory, saturating the network, and timing out.

Pagination splits large result sets into manageable pages. The three dominant strategies — offset-based, cursor-based, and keyset pagination — each offer different performance characteristics and consistency guarantees.

Offset-Based Pagination: Simple but Fragile

Offset pagination uses a page number or offset combined with a limit. It is the most intuitive approach and maps directly to SQL's LIMIT and OFFSET clauses.
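
When clients send a page number instead of a raw offset, the conversion is simple arithmetic. The sketch below is illustrative only: the helper names page_to_offset and total_pages are hypothetical and do not appear in the endpoint code in this article, and 1-based page numbering is an assumption.

```python
def page_to_offset(page: int, limit: int) -> int:
    """Convert a 1-based page number into a row offset (assumed convention)."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be >= 1")
    return (page - 1) * limit


def total_pages(total: int, limit: int) -> int:
    """Number of pages needed to cover `total` rows, using ceiling division."""
    if limit < 1:
        raise ValueError("limit must be >= 1")
    return -(-total // limit)


print(page_to_offset(3, 20))  # 40
print(total_pages(45, 20))    # 3
```

The resulting offset is what ends up in the SQL OFFSET clause, with limit passed through unchanged.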

flowchart LR
    CLIENT(["Client SDK"])
    GW["API Gateway<br/>auth plus rate limit"]
    APP["FastAPI app<br/>handlers and DI"]
    VAL["Pydantic validation"]
    SVC["Service layer<br/>business logic"]
    DB[(Database)]
    QUEUE[(Background queue)]
    OBS[(Tracing)]
    CLIENT --> GW --> APP --> VAL --> SVC
    SVC --> DB
    SVC --> QUEUE
    SVC --> OBS
    SVC --> CLIENT
    style GW fill:#4f46e5,stroke:#4338ca,color:#fff
    style APP fill:#f59e0b,stroke:#d97706,color:#1f2937
    style DB fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b

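from fastapi import Depends, FastAPI, Query
from pydantic import BaseModel
from sqlalchemy import select, func
from sqlalchemy.ext.asyncio import AsyncSession

# Message (the ORM model) and get_db (the session dependency) are assumed
# to be defined elsewhere in the application.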

app = FastAPI()

class PaginatedResponse(BaseModel):
    data: list[dict]
    total: int
    offset: int
    limit: int
    has_more: bool

@app.get("/v1/agents/{agent_id}/messages")
async def list_messages_offset(
    agent_id: str,
    offset: int = Query(0, ge=0),
    limit: int = Query(20, ge=1, le=100),
    db: AsyncSession = Depends(get_db),
):
    total = await db.scalar(
        select(func.count())
        .select_from(Message)
        .where(Message.agent_id == agent_id)
    )

    rows = await db.execute(
        select(Message)
        .where(Message.agent_id == agent_id)
        .order_by(Message.created_at.desc())
        .offset(offset)
        .limit(limit)
    )
    messages = rows.scalars().all()

    return PaginatedResponse(
        data=[m.to_dict() for m in messages],
        total=total,
        offset=offset,
        limit=limit,
        has_more=offset + limit < total,
    )

Offset pagination degrades at scale. OFFSET 1000000 forces the database to read and discard one million rows before returning the first result, and the separate COUNT query adds another pass over the matching rows on every request. It is also inconsistent under concurrent writes: if records are inserted or deleted while the client is paginating, pages shift, and items are duplicated or skipped.
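
The page-shift problem is easy to reproduce without a database. The following is a minimal simulation, assuming an in-memory list stands in for a table ordered newest first; fetch_page is a hypothetical helper, not part of the endpoint above.

```python
def fetch_page(rows, offset, limit):
    """Emulate LIMIT/OFFSET over an already-ordered list of ids."""
    return rows[offset:offset + limit]


# Case 1: an insert between requests duplicates an item.
messages = [5, 4, 3, 2, 1]            # newest first
page1 = fetch_page(messages, 0, 2)    # [5, 4]
messages.insert(0, 6)                 # a new message arrives
page2 = fetch_page(messages, 2, 2)    # [4, 3]
duplicates = set(page1) & set(page2)
print(duplicates)                     # {4}

# Case 2: a delete between requests skips an item.
messages = [5, 4, 3, 2, 1]
first = fetch_page(messages, 0, 2)    # [5, 4]
messages.remove(5)                    # a message is deleted
second = fetch_page(messages, 2, 2)   # [2, 1]
skipped = set(messages) - set(first) - set(second)
print(skipped)                        # {3}
```

In both cases the client asked for consecutive offsets and still received a wrong view of the data, because an offset identifies a position in the list rather than a specific record.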

Cursor-Based Pagination: Consistent and Scalable

Cursor pagination uses an opaque token representing the position of the last item on the current page. The server decodes the cursor to determine where to start the next page, avoiding the performance cliff of large offsets.

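import base64
import json
from datetime import datetime

def encode_cursor(created_at: str, id: str) -> str:
    # Pack the sort key (timestamp plus id tiebreaker) into an opaque token.
    payload = json.dumps({"created_at": created_at, "id": id})
    return base64.urlsafe_b64encode(payload.encode()).decode()

def decode_cursor(cursor: str) -> dict:
    payload = base64.urlsafe_b64decode(cursor.encode()).decode()
    decoded = json.loads(payload)
    # Parse the ISO 8601 string back into a datetime so the comparison
    # against the created_at column uses the correct type.
    decoded["created_at"] = datetime.fromisoformat(decoded["created_at"])
    return decoded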

class CursorPaginatedResponse(BaseModel):
    data: list[dict]
    next_cursor: str | None
    has_more: bool

@app.get("/v1/agents/{agent_id}/conversations")
async def list_conversations_cursor(
    agent_id: str,
    cursor: str | None = Query(None),
    limit: int = Query(20, ge=1, le=100),
    db: AsyncSession = Depends(get_db),
):
    query = (
        select(Conversation)
        .where(Conversation.agent_id == agent_id)
        .order_by(
            Conversation.created_at.desc(),
            Conversation.id.desc(),
        )
    )

    if cursor:
        decoded = decode_cursor(cursor)
        query = query.where(
            (Conversation.created_at < decoded["created_at"])
            | (
                (Conversation.created_at == decoded["created_at"])
                & (Conversation.id < decoded["id"])
            )
        )

    rows = await db.execute(query.limit(limit + 1))
    items = rows.scalars().all()

    has_more = len(items) > limit
    items = items[:limit]

    next_cursor = None
    if has_more and items:
        last = items[-1]
        next_cursor = encode_cursor(
            last.created_at.isoformat(), str(last.id)
        )

    return CursorPaginatedResponse(
        data=[c.to_dict() for c in items],
        next_cursor=next_cursor,
        has_more=has_more,
    )

The trick of fetching limit + 1 items lets you determine whether more pages exist without running a separate count query.
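
To see the limit + 1 trick and the (created_at, id) tiebreaker working together, here is a small in-memory sketch. It assumes a plain Python list in place of the database, and ROWS, fetch_page, and fetch_all are hypothetical names. Python's tuple comparison is lexicographic, so `row < cursor` expresses the same condition as the two-part WHERE clause in the endpoint.

```python
from datetime import datetime

# Hypothetical rows as (created_at, id); some share a timestamp on purpose.
ROWS = [
    (datetime(2026, 1, 1, 12, 0), 1),
    (datetime(2026, 1, 1, 12, 0), 2),
    (datetime(2026, 1, 1, 12, 5), 3),
    (datetime(2026, 1, 1, 12, 9), 4),
    (datetime(2026, 1, 1, 12, 9), 5),
]


def fetch_page(cursor, limit):
    """Return (items, next_cursor, has_more) for one page, newest first."""
    ordered = sorted(ROWS, reverse=True)
    if cursor is not None:
        # Same meaning as: created_at < c OR (created_at = c AND id < cid)
        ordered = [row for row in ordered if row < cursor]
    window = ordered[:limit + 1]       # fetch one extra row
    has_more = len(window) > limit     # the extra row proves another page exists
    items = window[:limit]
    next_cursor = items[-1] if has_more else None
    return items, next_cursor, has_more


def fetch_all(limit):
    """Follow next_cursor until the server reports no more pages."""
    seen, cursor = [], None
    while True:
        items, cursor, has_more = fetch_page(cursor, limit)
        seen.extend(items)
        if not has_more:
            return seen


all_ids = [row_id for _, row_id in fetch_all(limit=2)]
print(all_ids)  # [5, 4, 3, 2, 1]
```

Rows with identical timestamps are neither repeated nor dropped across page boundaries, which is the reason the id column is part of both the ordering and the cursor.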

Keyset Pagination: Maximum Database Performance

Keyset pagination is a variant of cursor pagination that directly uses column values rather than opaque tokens. It requires a strict, unique ordering and leverages database indexes for maximum efficiency.

@app.get("/v1/agents/{agent_id}/tool-calls")
async def list_tool_calls_keyset(
    agent_id: str,
    after_id: int | None = Query(None),
    limit: int = Query(50, ge=1, le=200),
    db: AsyncSession = Depends(get_db),
):
    query = (
        select(ToolCall)
        .where(ToolCall.agent_id == agent_id)
        .order_by(ToolCall.id.asc())
    )

    if after_id is not None:
        query = query.where(ToolCall.id > after_id)

    rows = await db.execute(query.limit(limit + 1))
    items = rows.scalars().all()
    has_more = len(items) > limit
    items = items[:limit]

    return {
        "data": [t.to_dict() for t in items],
        "next_after_id": items[-1].id if has_more else None,
        "has_more": has_more,
    }

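This generates a simple WHERE agent_id = :agent_id AND id > :after_id ORDER BY id LIMIT :limit query. With an index that covers the filter and ordering columns, such as (agent_id, id), the database seeks directly to the starting row instead of scanning and discarding earlier ones, so latency stays roughly constant regardless of how deep into the dataset you paginate.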
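
The same query shape can be tried end to end with the standard library's sqlite3 module. This is a sketch under stated assumptions: the tool_calls table, its columns, the index name, and the keyset_page helper are all made up for illustration, and SQLite stands in for whatever database the API actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tool_calls (id INTEGER PRIMARY KEY, agent_id TEXT NOT NULL)"
)
# Composite index so the agent filter and the id range can share one index.
conn.execute("CREATE INDEX ix_tool_calls_agent_id ON tool_calls (agent_id, id)")
conn.executemany(
    "INSERT INTO tool_calls (id, agent_id) VALUES (?, ?)",
    [(i, "agent-a" if i % 2 else "agent-b") for i in range(1, 11)],
)


def keyset_page(agent_id, after_id, limit):
    """Return (ids, next_after_id, has_more) for one keyset page."""
    sql = "SELECT id FROM tool_calls WHERE agent_id = ?"
    params = [agent_id]
    if after_id is not None:
        sql += " AND id > ?"
        params.append(after_id)
    sql += " ORDER BY id LIMIT ?"
    params.append(limit + 1)           # one extra row to detect another page
    ids = [row[0] for row in conn.execute(sql, params)]
    has_more = len(ids) > limit
    ids = ids[:limit]
    return ids, (ids[-1] if has_more else None), has_more


collected, after_id = [], None
while True:
    ids, after_id, has_more = keyset_page("agent-a", after_id, limit=2)
    collected.extend(ids)
    if not has_more:
        break

print(collected)  # [1, 3, 5, 7, 9]
```

Each request carries only the last id seen, so the cost of a page does not depend on how many rows precede it.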

Choosing the Right Strategy

Use offset pagination for admin dashboards and internal tools where datasets are small, users need to jump to specific pages, and simplicity is valued over performance.

Use cursor pagination for public APIs consumed by AI agents that iterate through large datasets sequentially. It provides stable results and consistent performance.

Use keyset pagination when you control both the API and the client, your ordering column is indexed and unique, and you need maximum query performance on tables with millions of rows.

FAQ

Can I mix pagination strategies in the same API?

Yes, but be consistent within each resource. For example, use cursor pagination for conversation messages (which are append-heavy and sequentially accessed) and offset pagination for an admin dashboard that needs page jumping. Document the strategy for each endpoint clearly in your OpenAPI spec.

How do I handle filtering with cursor pagination?

Apply filters before cursor conditions. The cursor encodes position within the filtered result set. If a user changes filters mid-pagination, they must start from the beginning with no cursor. Never reuse a cursor from a different filter combination — the underlying position may point to a record that no longer matches the new filter.
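
One way to enforce the "never reuse a cursor across filters" rule on the server is to embed a fingerprint of the active filters in the cursor and reject mismatches. This is only a sketch of that idea, not something the endpoints above implement: filter_fingerprint, encode_filtered_cursor, and decode_filtered_cursor are hypothetical helpers, and the payload keys "pos" and "f" are arbitrary choices.

```python
import base64
import hashlib
import json


def filter_fingerprint(filters: dict) -> str:
    """Stable short hash of the filter set (key order does not matter)."""
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]


def encode_filtered_cursor(position: dict, filters: dict) -> str:
    """Pack the position together with the fingerprint of the filters in use."""
    payload = {"pos": position, "f": filter_fingerprint(filters)}
    return base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()


def decode_filtered_cursor(cursor: str, filters: dict) -> dict:
    """Return the position, or raise if the cursor belongs to other filters."""
    payload = json.loads(base64.urlsafe_b64decode(cursor.encode()).decode())
    if payload.get("f") != filter_fingerprint(filters):
        raise ValueError("cursor was issued for a different filter set")
    return payload["pos"]


token = encode_filtered_cursor(
    {"created_at": "2026-01-01T12:00:00", "id": "42"},
    {"status": "completed", "model": "example-model"},
)
# Same filters in a different key order still validate.
position = decode_filtered_cursor(
    token, {"model": "example-model", "status": "completed"}
)
print(position["id"])  # 42
```

In an HTTP handler, the ValueError would typically be translated into a 400 response that tells the client to restart pagination without a cursor.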

What page size should I default to for AI agent APIs?

Start with 20 to 50 items per page, with a maximum of 100 to 200. AI agents processing data in bulk may benefit from larger pages to reduce HTTP round trips, but excessively large pages increase memory pressure and response latency. Let clients specify the page size via a limit query parameter with a sane default and a hard maximum.
