Building a File Upload API for AI Agents: Multipart, Presigned URLs, and Chunked Uploads

Upload Strategies for AI Agent Platforms

AI agents frequently upload files for processing: documents for RAG pipelines, images for vision models, audio for transcription, and datasets for fine-tuning. Each upload strategy — multipart form data, presigned URLs, and chunked uploads — serves different use cases and file size ranges.

Multipart form data works well for files under 50 MB. Presigned URLs offload the transfer to object storage and suit files up to 5 GB, the size limit for a single S3 PUT request. Chunked uploads support resumable transfers for unreliable networks and for files beyond that limit.
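The size ranges above can be turned into a small selection helper. This is a minimal sketch rather than a prescribed policy: `choose_upload_strategy`, `UploadStrategy`, and the `unreliable_network` flag are hypothetical names, the 50 MB threshold comes from this article, and the 5 GB ceiling is an assumption based on the S3 single-PUT limit.

```python
from enum import Enum

MB = 1024 * 1024
GB = 1024 * MB

# Thresholds are assumptions for illustration; tune them for your platform.
MULTIPART_LIMIT = 50 * MB
PRESIGNED_LIMIT = 5 * GB


class UploadStrategy(str, Enum):
    MULTIPART = "multipart"
    PRESIGNED = "presigned"
    CHUNKED = "chunked"


def choose_upload_strategy(size_bytes: int, unreliable_network: bool = False) -> UploadStrategy:
    """Pick an upload strategy from the file size and network conditions."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # Small files: a single request is simplest, and a retry is cheap.
    if size_bytes <= MULTIPART_LIMIT:
        return UploadStrategy.MULTIPART
    # Very large files, or flaky links, benefit from resumable parts.
    if size_bytes > PRESIGNED_LIMIT or unreliable_network:
        return UploadStrategy.CHUNKED
    return UploadStrategy.PRESIGNED
```

An agent SDK could call this before each upload and route to the matching endpoint.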

Multipart Upload: The Standard Approach

Multipart form data is the most widely supported upload mechanism. The file is sent as part of an HTTP request body, alongside optional metadata fields.

flowchart LR
    CLIENT(["Client SDK"])
    GW["API Gateway<br/>auth plus rate limit"]
    APP["FastAPI app<br/>handlers and DI"]
    VAL["Pydantic validation"]
    SVC["Service layer<br/>business logic"]
    DB[(Database)]
    QUEUE[(Background queue)]
    OBS[(Tracing)]
    CLIENT --> GW --> APP --> VAL --> SVC
    SVC --> DB
    SVC --> QUEUE
    SVC --> OBS
    SVC --> CLIENT
    style GW fill:#4f46e5,stroke:#4338ca,color:#fff
    style APP fill:#f59e0b,stroke:#d97706,color:#1f2937
    style DB fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b

from fastapi import FastAPI, UploadFile, File, Form, HTTPException
from pathlib import Path
import uuid
import hashlib

app = FastAPI()

ALLOWED_TYPES = {
    "application/pdf",
    "text/plain",
    "text/csv",
    "application/json",
    "image/png",
    "image/jpeg",
    "audio/wav",
    "audio/mpeg",
}
MAX_FILE_SIZE = 50 * 1024 * 1024  # 50 MB

@app.post("/v1/files", status_code=201)
async def upload_file(
    file: UploadFile = File(...),
    purpose: str = Form(...),
):
    # Validate content type
    if file.content_type not in ALLOWED_TYPES:
        raise HTTPException(
            status_code=415,
            detail=f"Unsupported file type: {file.content_type}. "
                   f"Allowed: {', '.join(sorted(ALLOWED_TYPES))}",
        )

    # Read at most one byte past the limit, so an oversized upload is
    # rejected without loading the whole file into memory
    contents = await file.read(MAX_FILE_SIZE + 1)
    if len(contents) > MAX_FILE_SIZE:
        raise HTTPException(
            status_code=413,
            detail=f"File exceeds maximum size of {MAX_FILE_SIZE} bytes",
        )

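    # purpose is client-supplied and becomes part of the storage path, so
    # reject anything other than letters, digits, hyphens, and underscores
    if not purpose.replace("-", "").replace("_", "").isalnum():
        raise HTTPException(status_code=422, detail="Invalid purpose")

    # Generate unique filename and checksum
    file_id = str(uuid.uuid4())
    checksum = hashlib.sha256(contents).hexdigest()
    extension = Path(file.filename or "unknown").suffix
    storage_path = f"uploads/{purpose}/{file_id}{extension}"

    # Save to storage (local filesystem or S3); save_to_storage is an
    # application-specific helper that is not shown here
    await save_to_storage(storage_path, contents)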

    return {
        "id": file_id,
        "filename": file.filename,
        "purpose": purpose,
        "size": len(contents),
        "content_type": file.content_type,
        "checksum": f"sha256:{checksum}",
        "status": "uploaded",
    }
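To make the wire format concrete, the sketch below builds a multipart/form-data body by hand using only the standard library. `encode_multipart` is a hypothetical helper written for illustration; in practice an HTTP client library would normally do this encoding. The field names `file` and `purpose` match the endpoint above.

```python
import uuid


def encode_multipart(
    fields: dict[str, str],
    file_field: str,
    filename: str,
    content_type: str,
    data: bytes,
) -> tuple[bytes, str]:
    """Return (body, Content-Type header value) for a multipart/form-data request.

    Illustrative only: names and filenames are not escaped, so avoid
    quotes or line breaks in them.
    """
    boundary = uuid.uuid4().hex
    lines: list[bytes] = []
    # Each metadata field is its own part with a form-data disposition.
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode(),
        ]
    # The file part additionally carries a filename and its own Content-Type.
    lines += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",
        data,
        f"--{boundary}--".encode(),  # closing delimiter
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


body, header = encode_multipart(
    {"purpose": "rag"}, "file", "notes.txt", "text/plain", b"hello"
)
```

The returned header value goes in the request's Content-Type header, so the server can find the boundary that separates the parts.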

Presigned URLs: Offloading to Object Storage

For large files, routing the upload through your API server wastes bandwidth and ties up worker processes. Presigned URLs let agents upload directly to S3 or S3-compatible storage: your server generates a short-lived signed URL, the agent uploads to it, and a confirmation call, webhook, or polling mechanism confirms completion.

import boto3
from botocore.config import Config
from pydantic import BaseModel

s3_client = boto3.client(
    "s3",
    config=Config(signature_version="s3v4"),
)

class PresignedUploadRequest(BaseModel):
    filename: str
    content_type: str
    size: int
    purpose: str

@app.post("/v1/files/presigned", status_code=201)
async def create_presigned_upload(body: PresignedUploadRequest):
    if body.content_type not in ALLOWED_TYPES:
        raise HTTPException(status_code=415, detail="Unsupported type")

    if body.size > 5 * 1024 * 1024 * 1024:  # 5 GB
        raise HTTPException(status_code=413, detail="File too large")

    file_id = str(uuid.uuid4())
    extension = Path(body.filename).suffix
    key = f"uploads/{body.purpose}/{file_id}{extension}"

    presigned = s3_client.generate_presigned_url(
        "put_object",
        Params={
            "Bucket": "agent-uploads",
            "Key": key,
            "ContentType": body.content_type,
            "ContentLength": body.size,
        },
        ExpiresIn=3600,  # 1 hour
    )

    # Save pending upload record to database
    await save_upload_record(file_id, key, body)

    return {
        "id": file_id,
        "upload_url": presigned,
        "expires_in": 3600,
        "method": "PUT",
        "headers": {
            "Content-Type": body.content_type,
            "Content-Length": str(body.size),
        },
    }

@app.post("/v1/files/{file_id}/complete")
async def confirm_upload(file_id: str):
    """Agent calls this after uploading to the presigned URL."""
    record = await get_upload_record(file_id)
    if not record:
        raise HTTPException(status_code=404, detail="Upload not found")

    exists = await verify_s3_object(record["key"])
    if not exists:
        raise HTTPException(
            status_code=400,
            detail="File not yet uploaded to storage",
        )

    await mark_upload_complete(file_id)
    return {"id": file_id, "status": "completed"}
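The `verify_s3_object` helper is not defined in this article. One plausible shape for it, sketched below under stated assumptions, issues a HEAD request and also compares the stored size with the size the agent declared. `object_matches_record` is a hypothetical name, the client is passed in as a parameter, and the error is inspected through its `response` attribute (the attribute botocore's `ClientError` exposes) so the sketch does not import botocore.

```python
from typing import Any


def object_matches_record(client: Any, bucket: str, key: str, expected_size: int) -> bool:
    """Return True if the object exists and has the size the agent declared."""
    try:
        head = client.head_object(Bucket=bucket, Key=key)
    except Exception as exc:
        # Assumption: a missing object surfaces as an error whose parsed
        # response carries a not-found code. Anything else is re-raised.
        error = getattr(exc, "response", None) or {}
        if error.get("Error", {}).get("Code") in {"404", "NoSuchKey", "NotFound"}:
            return False
        raise
    # A size mismatch suggests a truncated or substituted upload.
    return head["ContentLength"] == expected_size
```

Because boto3 calls block, an async handler would likely run this in a worker thread, for example via `asyncio.to_thread`.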

Chunked Upload: Resumable Transfers

Chunked uploads split a large file into smaller parts. Each part is uploaded independently, allowing the agent to resume from the last successful chunk after a failure.

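from pydantic import BaseModel, Field

class InitiateChunkedUpload(BaseModel):
    filename: str
    # Both sizes must be positive; a zero chunk_size would otherwise
    # cause a division by zero when computing total_chunks
    total_size: int = Field(..., gt=0)
    chunk_size: int = Field(default=10 * 1024 * 1024, gt=0)  # 10 MB default
    content_type: str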

@app.post("/v1/files/chunked", status_code=201)
async def initiate_chunked_upload(body: InitiateChunkedUpload):
    upload_id = str(uuid.uuid4())
    total_chunks = -(-body.total_size // body.chunk_size)  # ceil division

    await create_chunked_upload_record(
        upload_id, body.filename, total_chunks, body.total_size,
    )

    return {
        "upload_id": upload_id,
        "chunk_size": body.chunk_size,
        "total_chunks": total_chunks,
        "upload_endpoint": f"/v1/files/chunked/{upload_id}/parts",
    }

@app.put("/v1/files/chunked/{upload_id}/parts/{part_number}")
async def upload_chunk(
    upload_id: str,
    part_number: int,
    chunk: UploadFile = File(...),
):
    record = await get_chunked_upload(upload_id)
    if not record:
        raise HTTPException(status_code=404)

    if part_number < 1 or part_number > record["total_chunks"]:
        raise HTTPException(status_code=400, detail="Invalid part number")

    contents = await chunk.read()
    checksum = hashlib.sha256(contents).hexdigest()

    await store_chunk(upload_id, part_number, contents, checksum)

    return {
        "part_number": part_number,
        "checksum": f"sha256:{checksum}",
        "status": "uploaded",
    }

@app.post("/v1/files/chunked/{upload_id}/complete")
async def complete_chunked_upload(upload_id: str):
    record = await get_chunked_upload(upload_id)
    if not record:
        raise HTTPException(status_code=404, detail="Upload not found")

    uploaded = await get_uploaded_parts(upload_id)

    if len(uploaded) != record["total_chunks"]:
        missing = set(range(1, record["total_chunks"] + 1)) - set(uploaded)
        raise HTTPException(
            status_code=400,
            detail=f"Missing parts: {sorted(missing)}",
        )

    await assemble_chunks(upload_id)
    return {"id": upload_id, "status": "completed"}
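The `assemble_chunks` step is left undefined above. As a rough illustration of what it needs to do, the sketch below joins parts in order and re-checks the SHA-256 digest recorded for each part at upload time. `assemble_parts` and its in-memory `parts` mapping are assumptions made for the example; a production version would more likely stream parts from disk or rely on the object store's own multipart completion.

```python
import hashlib


def assemble_parts(
    parts: dict[int, tuple[bytes, str]], total_chunks: int
) -> tuple[bytes, str]:
    """Join parts 1..total_chunks in order, verifying each stored checksum.

    `parts` maps part_number -> (contents, sha256 hex digest recorded at upload).
    Returns the assembled bytes and the SHA-256 hex digest of the whole file.
    """
    whole = hashlib.sha256()
    assembled = bytearray()
    for part_number in range(1, total_chunks + 1):
        if part_number not in parts:
            raise ValueError(f"missing part {part_number}")
        contents, recorded = parts[part_number]
        # Detect corruption that happened between upload and assembly.
        if hashlib.sha256(contents).hexdigest() != recorded:
            raise ValueError(f"checksum mismatch in part {part_number}")
        whole.update(contents)
        assembled += contents
    return bytes(assembled), whole.hexdigest()
```

Returning a digest of the whole file lets the agent confirm that the assembled result matches its local copy.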

FAQ

When should I use presigned URLs versus direct multipart upload?

Use direct multipart upload for files under 50 MB where simplicity is important. Use presigned URLs for anything larger, or when you want to reduce load on your API servers. Presigned URLs let the file data go directly from the agent to object storage, keeping your API server free for business logic. They also support much larger files since the transfer does not go through your infrastructure.

How do I validate file contents beyond the Content-Type header?

Never trust the Content-Type header alone — it can be spoofed. Read the file's magic bytes (the first few bytes that identify the format) to verify the actual file type. Libraries like python-magic can detect file types from content. For security-sensitive applications, run uploaded files through a virus scanner (ClamAV is a common choice) before making them available for processing.
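As a dependency-free illustration of the magic-byte check, the sketch below recognizes a few of the binary types from `ALLOWED_TYPES` by their leading bytes. `sniff_content_type` and `declared_type_is_plausible` are hypothetical helpers, the signature table is deliberately incomplete (it does not cover audio/mpeg, for example), and a library such as python-magic would cover far more formats.

```python
from typing import Optional

# Leading-byte signatures for a few common formats (not exhaustive).
_SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
]
_SNIFFABLE = {mime for _, mime in _SIGNATURES} | {"audio/wav"}


def sniff_content_type(head: bytes) -> Optional[str]:
    """Guess a MIME type from the first bytes of a file, or return None."""
    for signature, mime in _SIGNATURES:
        if head.startswith(signature):
            return mime
    # WAV is a RIFF container whose form type at bytes 8..12 is "WAVE".
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "audio/wav"
    return None


def declared_type_is_plausible(declared: str, head: bytes) -> bool:
    """Check the client-declared Content-Type against the sniffed type."""
    sniffed = sniff_content_type(head)
    if sniffed is None:
        # Text formats have no signature, so only types this table can
        # recognize are rejected when no signature is found.
        return declared not in _SNIFFABLE
    return sniffed == declared
```

An upload handler could call `declared_type_is_plausible(file.content_type, contents[:16])` right after reading the file and return 415 when it fails.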

How do I handle upload failures in chunked upload mode?

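The main advantage of chunked uploads is built-in resumability. When an upload fails, the agent asks the server which parts it already holds, then uploads only the missing ones. A dedicated status endpoint is not shown in the code above, but the "Missing parts" error returned by the complete endpoint carries the same information. The agent should compare the checksum returned for each part against one computed locally. Set an expiration on incomplete uploads (24 to 48 hours is a reasonable range) and clean them up automatically.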
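On the agent side, resuming comes down to working out which byte ranges still need to be sent. The sketch below assumes the agent has already learned the set of uploaded part numbers from the server; `plan_resume` is a hypothetical helper, and part numbers start at 1 to match the endpoints above.

```python
def plan_resume(
    total_size: int, chunk_size: int, uploaded_parts: set[int]
) -> list[tuple[int, int, int]]:
    """Return (part_number, offset, length) for every part still missing."""
    if total_size <= 0 or chunk_size <= 0:
        raise ValueError("total_size and chunk_size must be positive")
    total_chunks = -(-total_size // chunk_size)  # ceiling division
    plan = []
    for part_number in range(1, total_chunks + 1):
        if part_number in uploaded_parts:
            continue
        offset = (part_number - 1) * chunk_size
        # The final part is usually shorter than chunk_size.
        length = min(chunk_size, total_size - offset)
        plan.append((part_number, offset, length))
    return plan


# A 25-byte file in 10-byte chunks where only part 1 succeeded:
print(plan_resume(25, 10, {1}))  # [(2, 10, 10), (3, 20, 5)]
```

The agent can then seek to each offset in the local file, read `length` bytes, and PUT them to the matching part number.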
