Learn Agentic AI

Building a Custom MCP Server for Your REST API

Build a production-ready MCP server that wraps your existing REST API endpoints as callable tools, using FastAPI and the MCP Python SDK to expose your business logic to AI agents.

Why Build a Custom MCP Server?

Most MCP tutorials use pre-built servers — the filesystem server, the Git server, the Postgres server. These cover common use cases. But every company has its own REST APIs: inventory systems, billing platforms, CRM endpoints, internal dashboards. To let an AI agent interact with your specific business logic, you need to wrap those APIs as MCP tools.

A custom MCP server sits between the AI agent and your REST API. The agent calls tools defined by your server, and your server translates those tool calls into HTTP requests against your existing endpoints. Your API does not need to change at all. The MCP server is an adapter layer.

In this post, we will build a complete custom MCP server using the official MCP Python SDK and FastAPI, exposing a sample e-commerce REST API as a set of agent-callable tools.

MCP Server Architecture

An MCP server has three responsibilities:

  1. Declare tools — Expose a list of tools with names, descriptions, and input schemas so agents know what they can call.
  2. Execute tools — When an agent invokes a tool, run the associated logic (in our case, an HTTP request to your API) and return the result.
  3. Communicate via protocol — Speak the MCP protocol over either stdio (for local subprocess servers) or HTTP with SSE (for remote servers).

A custom tool server sits alongside the pre-built servers in the broader MCP topology (Mermaid source for the diagram):

flowchart LR
    HOST(["MCP host<br/>Claude Desktop or IDE"])
    CLIENT["MCP client"]
    subgraph SERVERS["MCP Servers"]
        S1["Filesystem server"]
        S2["GitHub server"]
        S3["Postgres server"]
        SX["Custom tool server"]
    end
    LLM["LLM session"]
    OUT(["Grounded action"])
    HOST <--> CLIENT
    CLIENT <-->|stdio or HTTP+SSE| S1
    CLIENT <--> S2
    CLIENT <--> S3
    CLIENT <--> SX
    CLIENT --> LLM --> OUT
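
Responsibility 1 surfaces to clients as the result of a tools/list request. As a rough sketch of the wire shape (hand-written here for illustration; in practice the SDK serializes this for you from the Tool objects your server returns):

```python
import json

# Approximate shape of a tools/list result as a client sees it.
tools_list_result = {
    "tools": [
        {
            "name": "list_products",
            "description": "List available products",
            "inputSchema": {
                "type": "object",
                "properties": {"limit": {"type": "integer", "default": 20}},
            },
        }
    ]
}
print(json.dumps(tools_list_result, indent=2))
```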

The architecture looks like this:

Agent (OpenAI SDK)
    |
    | MCP Protocol (stdio or HTTP+SSE)
    v
Custom MCP Server
    |
    | HTTP requests
    v
Your REST API (FastAPI, Express, Rails, etc.)

Setting Up the Project

Start by installing the MCP Python SDK:

pip install mcp httpx pydantic

Create a project structure:

my-mcp-server/
  server.py        # MCP server definition
  api_client.py    # HTTP client for your REST API
  config.py        # Configuration and environment variables
  requirements.txt

The REST API We Are Wrapping

For this tutorial, assume we have an e-commerce API with these endpoints:

GET    /api/products              List all products
GET    /api/products/{id}         Get product details
POST   /api/orders                Create an order
GET    /api/orders/{id}           Get order status
GET    /api/customers/{id}        Get customer profile
POST   /api/customers/{id}/notes  Add a note to a customer

This is a standard CRUD API. The goal is to make every endpoint callable by an AI agent through MCP tools.
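
Before writing any code, it helps to fix the endpoint-to-tool mapping up front. The tool names below are our own convention for this tutorial, not anything the API dictates:

```python
# Planned endpoint-to-tool mapping (tool names are our own choice).
ENDPOINT_TOOLS = {
    "GET /api/products":              "list_products",
    "GET /api/products/{id}":         "get_product",
    "POST /api/orders":               "create_order",
    "GET /api/orders/{id}":           "get_order_status",
    "GET /api/customers/{id}":        "get_customer",
    "POST /api/customers/{id}/notes": "add_customer_note",
}
print(len(ENDPOINT_TOOLS))  # -> 6
```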

Building the API Client

First, create a typed HTTP client for your API. This keeps the MCP server code clean and separates protocol logic from HTTP logic:

# api_client.py
import httpx
from typing import Optional

class EcommerceAPIClient:
    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url.rstrip("/")
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        }

    async def list_products(
        self, category: Optional[str] = None, limit: int = 20
    ) -> dict:
        params = {"limit": limit}
        if category:
            params["category"] = category
        async with httpx.AsyncClient() as client:
            resp = await client.get(
                f"{self.base_url}/api/products",
                headers=self.headers,
                params=params,
            )
            resp.raise_for_status()
            return resp.json()

    async def create_order(
        self, customer_id: str, product_ids: list[str], shipping_address: str
    ) -> dict:
        async with httpx.AsyncClient() as client:
            resp = await client.post(
                f"{self.base_url}/api/orders",
                headers=self.headers,
                json={
                    "customer_id": customer_id,
                    "product_ids": product_ids,
                    "shipping_address": shipping_address,
                },
            )
            resp.raise_for_status()
            return resp.json()

    async def get_order(self, order_id: str) -> dict:
        async with httpx.AsyncClient() as client:
            resp = await client.get(
                f"{self.base_url}/api/orders/{order_id}",
                headers=self.headers,
            )
            resp.raise_for_status()
            return resp.json()

    # Additional methods follow the same pattern:
    # get_product(), get_customer(), add_customer_note()

Defining the MCP Server

Now create the MCP server that registers each API method as a tool:

# server.py
import json
import os
from mcp.server import Server
from mcp.types import Tool, TextContent
from api_client import EcommerceAPIClient

# Initialize
api = EcommerceAPIClient(
    base_url=os.environ["ECOMMERCE_API_URL"],
    api_key=os.environ["ECOMMERCE_API_KEY"],
)
server = Server("ecommerce-mcp")

@server.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="list_products",
            description="List available products, optionally filtered by category",
            inputSchema={
                "type": "object",
                "properties": {
                    "category": {
                        "type": "string",
                        "description": "Filter by category (e.g. electronics, clothing)",
                    },
                    "limit": {
                        "type": "integer",
                        "description": "Max results to return (default 20)",
                        "default": 20,
                    },
                },
            },
        ),
        Tool(
            name="create_order",
            description="Place a new order for a customer",
            inputSchema={
                "type": "object",
                "properties": {
                    "customer_id": {"type": "string"},
                    "product_ids": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "List of product IDs to order",
                    },
                    "shipping_address": {"type": "string"},
                },
                "required": ["customer_id", "product_ids", "shipping_address"],
            },
        ),
        # Additional tools: get_product, get_order_status,
        # get_customer, add_customer_note follow the same pattern
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    try:
        if name == "list_products":
            result = await api.list_products(
                category=arguments.get("category"),
                limit=arguments.get("limit", 20),
            )
        elif name == "create_order":
            result = await api.create_order(
                customer_id=arguments["customer_id"],
                product_ids=arguments["product_ids"],
                shipping_address=arguments["shipping_address"],
            )
        # ... handle remaining tools with the same pattern
        else:
            return [TextContent(type="text", text=f"Unknown tool: {name}")]
        return [TextContent(type="text", text=json.dumps(result, indent=2))]
    except Exception as e:
        return [TextContent(type="text", text=f"Error: {str(e)}")]
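
The error-handling section later assumes a dispatch_tool helper, and a dispatch table also replaces the if/elif chain as the tool count grows. A self-contained sketch, with a stub handler standing in for the real api calls so it runs on its own:

```python
import asyncio

# Stub in place of api.list_products() so this sketch runs standalone.
async def _list_products(args: dict) -> dict:
    return {"products": [], "limit": args.get("limit", 20)}

# One registration line per tool instead of an if/elif branch.
TOOL_HANDLERS = {
    "list_products": _list_products,
}

async def dispatch_tool(name: str, arguments: dict) -> dict:
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name}")
    return await handler(arguments)

print(asyncio.run(dispatch_tool("list_products", {"limit": 5})))
# -> {'products': [], 'limit': 5}
```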

Running as a Stdio Server

The simplest deployment is stdio — the agent SDK spawns your server as a subprocess:

# At the bottom of server.py
import asyncio
from mcp.server.stdio import stdio_server

async def main():
    async with stdio_server() as (read_stream, write_stream):
        # The low-level Server.run() also takes initialization options
        await server.run(
            read_stream,
            write_stream,
            server.create_initialization_options(),
        )

if __name__ == "__main__":
    asyncio.run(main())

Connect it from the agent side:

from agents import Agent, Runner
from agents.mcp import MCPServerStdio

ecommerce = MCPServerStdio(
    name="Ecommerce",
    params={
        "command": "python",
        "args": ["server.py"],
        "env": {
            "ECOMMERCE_API_URL": "https://api.myshop.com",
            "ECOMMERCE_API_KEY": "sk-...",
        },
    },
    cache_tools_list=True,
)

agent = Agent(
    name="Shop Assistant",
    instructions="You help customers browse products, place orders, and check order status.",
    mcp_servers=[ecommerce],
)

async def main():
    async with ecommerce:
        result = await Runner.run(agent, "What electronics do you have in stock?")
        print(result.final_output)

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

Running as an HTTP Server

For production, you often want the MCP server to run as a standalone service. Use the Streamable HTTP transport:

# http_server.py
# The SDK exposes Streamable HTTP via a session manager mounted in an
# ASGI app (import paths as of recent mcp SDK releases; check your version).
import contextlib
from starlette.applications import Starlette
from starlette.routing import Mount
from mcp.server.streamable_http_manager import StreamableHTTPSessionManager
from server import server

session_manager = StreamableHTTPSessionManager(app=server)

@contextlib.asynccontextmanager
async def lifespan(app):
    async with session_manager.run():
        yield

app = Starlette(
    routes=[Mount("/mcp", app=session_manager.handle_request)],
    lifespan=lifespan,
)

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8001)

Then connect from the agent:

from agents.mcp import MCPServerStreamableHTTP

ecommerce = MCPServerStreamableHTTP(
    name="Ecommerce",
    params={"url": "http://ecommerce-mcp:8001/mcp"},
    cache_tools_list=True,
)

Error Handling Best Practices

Your MCP server must handle errors gracefully. API failures should return informative messages, not crash the server:

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    try:
        result = await dispatch_tool(name, arguments)
        return [TextContent(type="text", text=json.dumps(result, indent=2))]
    except httpx.HTTPStatusError as e:
        error_msg = f"API returned {e.response.status_code}"
        if e.response.status_code == 404:
            error_msg = f"Resource not found: {arguments}"
        elif e.response.status_code == 403:
            error_msg = "Permission denied for this operation"
        return [TextContent(type="text", text=error_msg)]
    except httpx.ConnectError:
        return [TextContent(
            type="text",
            text="Cannot reach the API server. Please try again later.",
        )]
    except Exception as e:
        return [TextContent(type="text", text=f"Unexpected error: {str(e)}")]
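
For transient failures (timeouts, dropped connections), a small retry-with-backoff wrapper around the API call often helps before surfacing an error to the agent. A sketch follows; attempt counts and delays are assumptions to tune, and it is demonstrated with a fake flaky call rather than real HTTP:

```python
import asyncio

async def with_retries(fn, *, attempts=3, base_delay=0.1):
    """Retry an async callable with exponential backoff.

    In the real server you would catch httpx.ConnectError / httpx.ReadTimeout
    here; the sketch uses the builtin ConnectionError so it runs standalone.
    """
    for attempt in range(attempts):
        try:
            return await fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)

# Demo: a fake call that fails once, then succeeds.
calls = {"n": 0}

async def flaky():
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("transient")
    return "ok"

print(asyncio.run(with_retries(flaky)))  # -> ok
```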

Testing Your MCP Server

Test tools individually before connecting them to an agent. The MCP SDK provides a test client:

import pytest
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

@pytest.mark.asyncio
async def test_list_products():
    params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            tool_names = [t.name for t in tools.tools]
            assert "list_products" in tool_names
            result = await session.call_tool("list_products", {"limit": 5})
            assert len(result.content) > 0

Building a custom MCP server is the bridge between your existing APIs and the world of AI agents. The pattern is always the same: define tools with schemas, map tool calls to API requests, and handle errors cleanly. Once your first server is working, adding new tools takes minutes.
