
Debug Logging and Configuration Best Practices for OpenAI Agents

Configure the OpenAI Agents SDK for development and production. Covers API keys, model defaults, verbose logging, sensitive data protection, and a production readiness checklist.

Getting Configuration Right

Configuration is where development convenience meets production security. The OpenAI Agents SDK provides multiple configuration mechanisms — environment variables, programmatic settings, and per-run overrides. Getting these right from the start saves hours of debugging and prevents security incidents.

API Key Configuration

The SDK automatically reads OPENAI_API_KEY from the environment:

export OPENAI_API_KEY="sk-proj-your-key-here"

This is the recommended approach because:

  • Keys stay out of source code
  • Different environments (dev, staging, prod) use different keys
  • Key rotation does not require code changes
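Whichever mechanism you use, it pays to fail fast at startup if the key is absent, rather than on the first model call. A minimal sketch (the `sk-` prefix check is a heuristic, not an SDK requirement):

```python
import os

def require_api_key() -> str:
    """Return the API key, or fail at startup instead of on the first model call."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key.startswith("sk-"):
        raise RuntimeError("OPENAI_API_KEY is missing or does not look like an OpenAI key")
    return key
```

Call this once during application bootstrap, before constructing any agents.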

Programmatic Key Setting

For cases where environment variables are not practical:

from agents import set_default_openai_key

set_default_openai_key("sk-proj-your-key-here")

This sets the key for all subsequent agent runs in the process. Call this once at application startup, not before every run.

Per-Run Key Override

For multi-tenant applications where different requests use different API keys:

from openai import AsyncOpenAI
from agents import Agent, Runner, OpenAIChatCompletionsModel

# Create a client with a specific key
client = AsyncOpenAI(api_key="sk-proj-tenant-specific-key")

agent = Agent(
    name="Tenant Agent",
    instructions="Help the user.",
    model=OpenAIChatCompletionsModel(
        model="gpt-4o",
        openai_client=client,
    ),
)

result = await Runner.run(agent, "Hello")

Custom OpenAI Client Configuration

For advanced scenarios — proxies, custom base URLs, or organization IDs — configure the underlying OpenAI client:

from openai import AsyncOpenAI
from agents import set_default_openai_client

client = AsyncOpenAI(
    api_key="sk-proj-your-key",
    organization="org-your-org-id",
    base_url="https://your-proxy.example.com/v1",
    timeout=60.0,
    max_retries=3,
)

set_default_openai_client(client)

This is useful for:

  • API proxies: Route traffic through a logging proxy or gateway
  • Azure OpenAI: Use a custom base URL for Azure-hosted models
  • Organization isolation: Set the organization ID for billing separation

Model Configuration

Default Model

The SDK defaults to gpt-4o. Override globally with an environment variable:

export OPENAI_DEFAULT_MODEL="gpt-4o-mini"

Or programmatically:

from agents import Agent

# Per-agent model selection
fast_agent = Agent(
    name="Fast Agent",
    instructions="Respond quickly.",
    model="gpt-4o-mini",
)

smart_agent = Agent(
    name="Smart Agent",
    instructions="Analyze deeply.",
    model="gpt-4o",
)

reasoning_agent = Agent(
    name="Reasoning Agent",
    instructions="Solve complex problems step by step.",
    model="o3-mini",
)
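The production checklist later in this article recommends matching the model to task complexity. A trivial sketch of such a selection helper (the task categories and tier mapping are illustrative, not part of the SDK):

```python
def pick_model(task_kind: str) -> str:
    """Map a coarse task category to a model tier (categories are assumptions)."""
    tiers = {
        "classification": "gpt-4o-mini",  # cheap and fast for simple labeling
        "generation": "gpt-4o",           # general-purpose work
        "reasoning": "o3-mini",           # multi-step problem solving
    }
    # Default to the cheapest tier for unrecognized task kinds
    return tiers.get(task_kind, "gpt-4o-mini")
```

Centralizing this choice in one function makes it easy to audit and adjust model spend later.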

Responses API vs Chat Completions API

By default, the SDK uses the OpenAI Responses API, which is newer and supports features like built-in tools (web search, file search) and constrained JSON output.

For compatibility with non-OpenAI providers or older setups, you can switch to the Chat Completions API:

from agents import Agent
from agents.models.openai_chatcompletions import OpenAIChatCompletionsModel
from openai import AsyncOpenAI

# Use Chat Completions API with any OpenAI-compatible provider
client = AsyncOpenAI(
    base_url="https://api.together.xyz/v1",
    api_key="your-together-api-key",
)

agent = Agent(
    name="Together Agent",
    instructions="You are helpful.",
    model=OpenAIChatCompletionsModel(
        model="meta-llama/Llama-3-70b-chat-hf",
        openai_client=client,
    ),
)

This makes the SDK work with any provider that exposes an OpenAI-compatible Chat Completions endpoint — Together AI, Anyscale, vLLM, Ollama, and more.

Debug Logging

Verbose Stdout Logging

The fastest way to see what the agent loop is doing:

from agents import enable_verbose_stdout_logging

enable_verbose_stdout_logging()

# Now every agent run prints detailed information:
# - Each LLM call with the full message list
# - Tool calls and their results
# - Handoff events
# - Timing information

This is invaluable during development. Never enable this in production — it prints potentially sensitive data including full prompts and responses.

Python Logging Integration

For more control, use Python's standard logging:

import logging

# Set the agents logger to DEBUG
logging.getLogger("agents").setLevel(logging.DEBUG)

# Configure a handler
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
    "%(asctime)s [%(name)s] %(levelname)s: %(message)s"
))
logging.getLogger("agents").addHandler(handler)

In production, route these logs to your observability stack (Datadog, CloudWatch, etc.) at INFO level or above.
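If your observability stack ingests structured logs, a small JSON formatter keeps agent logs machine-parseable (the field names here are illustrative, not a standard):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit each log record as a single JSON line, convenient for log shippers."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": self.formatTime(record),
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logging.getLogger("agents").addHandler(handler)
```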

What Gets Logged

At DEBUG level, the SDK logs:

  • LLM request: model name, message count, tool count
  • LLM response: response type, token usage
  • Tool execution: tool name, execution time
  • Tool error: tool name, error message
  • Handoff: source agent, target agent
  • Loop iteration: turn number, current agent

Tracing

The SDK includes built-in tracing that captures the full execution flow of every agent run. Tracing is enabled by default:


from agents import Agent, Runner, RunConfig

result = await Runner.run(
    agent,
    "User query here",
    run_config=RunConfig(
        workflow_name="customer-support",
        trace_id="req-12345-abc",
        group_id="session-67890",
        tracing_disabled=False,  # Default: enabled
    ),
)

Traces capture:

  • The complete agent loop execution with timing
  • All LLM calls with input/output
  • Tool calls with arguments and results
  • Handoff events
  • Error events

Disabling Tracing

For sensitive workloads or to reduce overhead:

result = await Runner.run(
    agent,
    "Sensitive query",
    run_config=RunConfig(tracing_disabled=True),
)

Sensitive Data Protection

What to Protect

In production, be conscious of what data flows through the agent system:

  • User PII: Names, emails, phone numbers, addresses
  • Financial data: Credit card numbers, bank accounts
  • Authentication tokens: API keys, session tokens, passwords
  • Health information: Medical records, diagnoses

Protection Strategies

1. Scrub inputs before sending to the agent:

import re

def scrub_pii(text: str) -> str:
    # Mask email addresses
    text = re.sub(r'[\w.-]+@[\w.-]+\.\w+', '[EMAIL]', text)
    # Mask phone numbers
    text = re.sub(r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b', '[PHONE]', text)
    # Mask credit card numbers
    text = re.sub(r'\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b', '[CARD]', text)
    return text

result = await Runner.run(agent, scrub_pii(user_input))

2. Use context for sensitive data instead of conversation messages:

from agents import RunContextWrapper, function_tool

# PaymentContext is an application-defined class holding the user's payment details

@function_tool
async def process_payment(
    context: RunContextWrapper[PaymentContext],
    amount: float,
) -> str:
    """Process a payment for the current user.

    Args:
        amount: Payment amount in USD.
    """
    # Access payment info from context, not from the conversation
    card = context.context.payment_method
    # Process payment...
    return f"Payment of ${amount} processed successfully."

The payment details live in the context and are never sent to the LLM; the model sees only the amount and the result string.

3. Disable tracing for sensitive operations:

result = await Runner.run(
    payment_agent,
    "Process my payment",
    run_config=RunConfig(tracing_disabled=True),
)

Production Configuration Checklist

Before deploying agents to production, verify each item:

Security

  • API keys are loaded from environment variables or a secrets manager
  • No API keys are hardcoded in source code
  • PII scrubbing is applied to user inputs where appropriate
  • Sensitive data flows through context, not conversation messages
  • Output guardrails are configured to catch unsafe responses
  • Tracing is disabled or filtered for sensitive workflows

Reliability

  • max_turns is set on every Runner.run() call
  • Tool timeouts are configured for all I/O tools
  • Retry policies are configured for transient failures
  • MaxTurnsExceeded and other exceptions are caught and handled
  • Circuit breakers are in place for external service calls
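The retry-policy item above can be sketched as a generic helper with exponential backoff. This is not part of the SDK; it is a minimal wrapper you might put around your own I/O calls:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(fn: Callable[[], T], attempts: int = 3, base_delay: float = 0.5) -> T:
    """Call fn, retrying transient failures with exponential backoff."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * (2 ** i))
    raise AssertionError("unreachable")
```

In practice you would narrow the `except` clause to the transient error types your tools actually raise (timeouts, rate limits) rather than catching everything.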

Observability

  • Logging is configured at INFO level (not DEBUG in production)
  • Tracing is enabled with meaningful workflow names and trace IDs
  • Trace IDs are correlated with your application's request IDs
  • Token usage is tracked for cost monitoring
  • Error rates are monitored with alerting

Performance

  • Model selection matches the task complexity (do not use gpt-4o for simple classification)
  • max_tokens is set to prevent unnecessarily long responses
  • WebSocket transport is used for high-frequency streaming applications
  • Connection pooling is configured on custom OpenAI clients
  • Async Runner.run() is used in async contexts (not run_sync())

Cost Control

  • Token usage is logged and monitored
  • max_turns prevents runaway loops
  • max_tokens is set appropriately per agent
  • Cheaper models are used for simple tasks
  • Rate limiting is implemented at the application level
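The token-monitoring items above can be enforced with a small application-level counter. Reading usage figures out of each run result is omitted here; only the bookkeeping is shown, as a sketch:

```python
class TokenBudget:
    """Track cumulative token usage against a hard limit."""
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def record(self, tokens: int) -> None:
        self.used += tokens

    def allow(self) -> bool:
        # Refuse to start new runs once the budget is exhausted
        return self.used < self.limit
```

A budget per tenant or per session, checked before each `Runner.run()` call, is a simple backstop against runaway spend.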

Environment-Specific Configuration Pattern

A clean pattern for managing configuration across environments:

import os
from dataclasses import dataclass

@dataclass
class AgentConfig:
    openai_api_key: str
    default_model: str
    max_turns: int
    enable_tracing: bool
    log_level: str

    @classmethod
    def from_env(cls) -> "AgentConfig":
        env = os.getenv("ENVIRONMENT", "development")

        if env == "production":
            return cls(
                openai_api_key=os.environ["OPENAI_API_KEY"],
                default_model="gpt-4o",
                max_turns=10,
                enable_tracing=True,
                log_level="INFO",
            )
        elif env == "staging":
            return cls(
                openai_api_key=os.environ["OPENAI_API_KEY"],
                default_model="gpt-4o-mini",
                max_turns=15,
                enable_tracing=True,
                log_level="DEBUG",
            )
        else:  # development
            return cls(
                openai_api_key=os.getenv("OPENAI_API_KEY", ""),
                default_model="gpt-4o-mini",
                max_turns=25,
                enable_tracing=False,
                log_level="DEBUG",
            )

config = AgentConfig.from_env()

This keeps all environment-specific decisions in one place and makes it easy to audit what each environment uses.


Source: OpenAI Agents SDK — Configuration
