
MCP Security Best Practices for Production Agents

Secure your MCP-powered agents for production with authentication, network policies, tool approval workflows, audit logging, rate limiting, and defense-in-depth strategies.

Why MCP Security Matters

MCP servers give AI agents the ability to take real actions — read files, query databases, send emails, modify records. A misconfigured MCP server is not just a bug. It is a security vulnerability that an adversary or a hallucinating model can exploit to access data or modify systems.

The default configuration of most MCP servers is designed for development convenience, not production security. Moving to production requires deliberately layering security controls at every level. This post covers five essential layers: authentication, network policies, tool approval, audit logging, and rate limiting.

Layer 1: Authentication and Authorization

Server-to-Server Authentication

Every MCP server should require authentication. In a typical deployment, the MCP host reaches several servers through a single MCP client, so one unauthenticated server exposes the whole toolchain:

flowchart LR
    HOST(["MCP host<br/>Claude Desktop or IDE"])
    CLIENT["MCP client"]
    subgraph SERVERS["MCP Servers"]
        S1["Filesystem server"]
        S2["GitHub server"]
        S3["Postgres server"]
        SX["Custom tool server"]
    end
    LLM["LLM session"]
    OUT(["Grounded action"])
    HOST <--> CLIENT
    CLIENT <-->|stdio or HTTP+SSE| S1
    CLIENT <--> S2
    CLIENT <--> S3
    CLIENT <--> SX
    CLIENT --> LLM --> OUT
    style HOST fill:#f1f5f9,stroke:#64748b,color:#0f172a
    style CLIENT fill:#4f46e5,stroke:#4338ca,color:#fff
    style OUT fill:#059669,stroke:#047857,color:#fff

For HTTP-based MCP servers (Streamable HTTP transport), use bearer tokens or mutual TLS:
from starlette.middleware import Middleware
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse
import os

# Filter out empty strings so an unset env var cannot yield a valid blank token
VALID_TOKENS = {t for t in os.environ.get("MCP_AUTH_TOKENS", "").split(",") if t}

class AuthMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        auth = request.headers.get("Authorization", "")
        if not auth.startswith("Bearer "):
            return JSONResponse({"error": "Unauthorized"}, status_code=401)
        if auth[7:] not in VALID_TOKENS:
            return JSONResponse({"error": "Forbidden"}, status_code=403)
        return await call_next(request)

server = Server("secure-server")
app = StreamableHTTPServer(server, middleware=[Middleware(AuthMiddleware)])

On the agent side, pass auth headers when connecting:

import os

from agents.mcp import MCPServerStreamableHTTP

secure_server = MCPServerStreamableHTTP(
    name="SecureDB",
    params={
        "url": "http://db-mcp:8001/mcp",
        "headers": {
            "Authorization": f"Bearer {os.environ['MCP_DB_TOKEN']}",
        },
    },
    cache_tools_list=True,
)

Per-User Authorization

Not every user should have access to every tool. Implement per-user authorization by passing user context (role, user ID) through the MCP arguments and checking a permissions map server-side. Map each tool to a list of allowed roles (e.g., "delete_records" requires "admin", while "read_records" allows "viewer"). Reject calls from unauthorized roles with a clear permission denied message.
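A minimal sketch of that server-side check; the tool and role names here are illustrative, not part of the MCP spec:

```python
# Illustrative permissions map: each tool lists the roles allowed to call it.
TOOL_PERMISSIONS: dict[str, set[str]] = {
    "read_records": {"viewer", "editor", "admin"},
    "update_records": {"editor", "admin"},
    "delete_records": {"admin"},
}

def authorize(tool_name: str, role: str) -> None:
    """Raise PermissionError unless `role` may call `tool_name`."""
    allowed = TOOL_PERMISSIONS.get(tool_name, set())
    if role not in allowed:
        raise PermissionError(
            f"Permission denied: role '{role}' cannot call '{tool_name}'"
        )
```

Call `authorize(name, role)` at the top of your tool handler, with the role taken from the user context passed in the arguments. Unknown tools deny by default, which is the safer failure mode.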


Layer 2: Network Policies

Principle of Least Network Access

MCP servers should only be accessible from the agent service, never from the public internet. In Kubernetes, use NetworkPolicy to restrict traffic:

# network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-server-policy
  namespace: agents
spec:
  podSelector:
    matchLabels:
      app: mcp-database-server
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: agent-service
      ports:
        - port: 8001
          protocol: TCP

This ensures only the agent service pod can reach the MCP database server. No other pod, and no external traffic, can connect.

Stdio Server Isolation

For stdio-based MCP servers, the security boundary is the subprocess environment. Limit what the subprocess can access:

import os

from agents.mcp import MCPServerStdio

# Restrict environment variables passed to the subprocess
safe_env = {
    "PATH": "/usr/local/bin:/usr/bin:/bin",
    "HOME": "/tmp/mcp-sandbox",
    "ECOMMERCE_API_URL": os.environ["ECOMMERCE_API_URL"],
    "ECOMMERCE_API_KEY": os.environ["ECOMMERCE_API_KEY"],
}

server = MCPServerStdio(
    name="EcommerceTools",
    params={
        "command": "python",
        "args": ["ecommerce_server.py"],
        "env": safe_env,  # Only these env vars are visible
    },
)

Never pass the full os.environ to a subprocess. This could leak database passwords, cloud credentials, or API keys that the MCP server does not need.

Layer 3: Tool Approval Workflows

Even with authentication and network controls, you may want human approval before certain tools execute. The Agents SDK supports approval policies for this purpose.

Static Approval for Dangerous Tools

Use a static tool filter combined with an approval callback:

from agents.mcp.util import create_static_tool_filter

# Allow read tools freely, require approval for writes
read_tools = {"list_products", "get_product", "get_order", "get_customer"}
write_tools = {"create_order", "update_order", "delete_order", "add_note"}

async def approval_callback(tool_name: str, arguments: dict) -> bool:
    """In production, this sends a Slack message or UI prompt."""
    if tool_name in read_tools:
        return True

    print(f"APPROVAL REQUIRED: {tool_name}")
    print(f"Arguments: {arguments}")
    # In production: send to approval queue, wait for response
    # For demo: auto-approve
    return True

tool_filter = create_static_tool_filter(
    allowed_tool_names=list(read_tools | write_tools)
)
# Pass tool_filter to the MCP server config, e.g. MCPServerStdio(..., tool_filter=tool_filter)

You can extend this pattern with context-aware approval that checks argument values — auto-approving small orders while requiring human sign-off for large ones, or always requiring approval for delete operations.


Layer 4: Audit Logging

Every tool invocation should be logged with enough context to reconstruct what happened, when, and why. This is essential for compliance, debugging, and incident response.

Structured Audit Logs

import time

import structlog
from mcp.types import TextContent

audit = structlog.get_logger("mcp.audit")
SENSITIVE = {"password", "token", "api_key", "secret", "ssn"}

def sanitize(args: dict) -> dict:
    return {k: "***" if k.lower() in SENSITIVE else v for k, v in args.items()}

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    # execute_tool is your application's own dispatcher (not shown here)
    user_id = arguments.pop("_user_id", "unknown")
    start = time.perf_counter()
    try:
        result = await execute_tool(name, arguments)
        audit.info("tool_ok", tool=name, user=user_id,
                    args=sanitize(arguments),
                    ms=round((time.perf_counter() - start) * 1000, 2))
        return result
    except Exception as e:
        audit.error("tool_fail", tool=name, user=user_id,
                     args=sanitize(arguments), error=str(e),
                     ms=round((time.perf_counter() - start) * 1000, 2))
        raise

For compliance, persist audit logs to a durable store such as PostgreSQL or a dedicated logging service rather than stdout alone. The sanitize step above redacts top-level sensitive fields (passwords, tokens, API keys) before anything is written; extend it to recurse into nested dictionaries if your tools accept structured arguments.
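A minimal sketch of durable persistence, using SQLite as a stand-in for PostgreSQL or a logging service; the table name and columns are illustrative:

```python
import json
import sqlite3

def init_audit_store(path: str = "audit.db") -> sqlite3.Connection:
    """Open (or create) the audit database with a single append-only table."""
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS tool_audit (
            ts TEXT NOT NULL,            -- ISO-8601 timestamp
            user_id TEXT NOT NULL,
            tool TEXT NOT NULL,
            args TEXT NOT NULL,          -- sanitized, JSON-encoded
            status TEXT NOT NULL,        -- 'ok' or 'fail'
            duration_ms REAL NOT NULL
        )
    """)
    return conn

def persist_audit(conn, ts, user_id, tool, args, status, duration_ms):
    """Append one audit row; args must already be sanitized."""
    conn.execute(
        "INSERT INTO tool_audit VALUES (?, ?, ?, ?, ?, ?)",
        (ts, user_id, tool, json.dumps(args), status, duration_ms),
    )
    conn.commit()
```

Call `persist_audit` from the same success and failure branches that emit the structlog events, so the durable record and the log stream never disagree.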

Layer 5: Rate Limiting

Without rate limiting, a runaway agent loop could hammer your MCP servers with thousands of tool calls per minute. This can exhaust database connections, trigger API rate limits on downstream services, or simply consume excessive resources.

Per-Tool Rate Limiting

from collections import defaultdict
from datetime import datetime, timedelta

class ToolRateLimiter:
    def __init__(self):
        self.call_timestamps = defaultdict(list)
        self.limits = {
            "create_order": {"max_calls": 10, "window_seconds": 60},
            "delete_order": {"max_calls": 5, "window_seconds": 60},
            "_default": {"max_calls": 100, "window_seconds": 60},
        }

    def check(self, tool_name: str) -> bool:
        config = self.limits.get(tool_name, self.limits["_default"])
        cutoff = datetime.now() - timedelta(seconds=config["window_seconds"])
        self.call_timestamps[tool_name] = [
            ts for ts in self.call_timestamps[tool_name] if ts > cutoff
        ]
        if len(self.call_timestamps[tool_name]) >= config["max_calls"]:
            return False
        self.call_timestamps[tool_name].append(datetime.now())
        return True

rate_limiter = ToolRateLimiter()

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if not rate_limiter.check(name):
        return [TextContent(
            type="text",
            text=f"Rate limit exceeded for '{name}'. Try again later.",
        )]
    return await execute_tool(name, arguments)

In addition to per-tool limits, add a global per-session rate limit that caps the total number of tool calls any single agent session can make. This prevents runaway loops from exhausting resources even if individual tool limits are not exceeded.
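A minimal sketch of that session-level cap, assuming your host application can supply a stable session identifier with each call; the default ceiling of 200 is illustrative:

```python
from collections import defaultdict

class SessionRateLimiter:
    """Caps the total number of tool calls any single agent session may make."""

    def __init__(self, max_calls_per_session: int = 200):
        self.max_calls = max_calls_per_session
        self.counts: dict[str, int] = defaultdict(int)

    def check(self, session_id: str) -> bool:
        """Return True and count the call, or False once the session cap is hit."""
        if self.counts[session_id] >= self.max_calls:
            return False
        self.counts[session_id] += 1
        return True
```

Run this check before the per-tool limiter: a session that trips its global cap is almost certainly stuck in a loop, so refusing further calls (and alerting) is the right response regardless of which tools it was calling.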

Defense in Depth: Putting It All Together

No single security layer is sufficient. Production MCP deployments should combine all five: authenticated connections on the agent side (Layer 1), NetworkPolicies restricting server access in Kubernetes (Layer 2), tool approval callbacks for write operations (Layer 3), structured audit logging inside the server (Layer 4), and per-tool and per-session rate limiting (Layer 5). Each layer catches threats that others miss.

Security Checklist for Production MCP

Before deploying an MCP-powered agent to production, verify each item:

  • All HTTP MCP servers require bearer tokens or mTLS, rotated every 90 days
  • Stdio servers receive only the environment variables they need — no secrets hardcoded in source
  • MCP servers are not exposed to the public internet; Kubernetes NetworkPolicies restrict ingress to the agent service only
  • Write and delete tools require explicit approval; tool filters block unnecessary tools
  • Every tool call is logged with user ID, tool name, sanitized arguments, status, and duration
  • Per-tool and per-session rate limits are configured, with violations logged and alerted
  • You have a runbook for revoking tokens and disabling tools without redeployment

MCP security is not a one-time setup. It requires ongoing attention as new servers are added, tools are modified, and agents are given new capabilities. Treat every MCP tool like an API endpoint — because that is exactly what it is. Apply the same security rigor you would to any production API surface.
