
Building a Triage Agent: Intelligent Request Routing to the Right Specialist

Build a triage agent that classifies incoming requests, routes them to the correct specialist agent with confidence scoring, and handles ambiguous or unclassifiable inputs with fallback strategies.

The Triage Agent's Role

Every multi-agent system needs a front door. The triage agent is that front door — it receives the raw user input, determines what kind of request it is, and routes it to the specialist best equipped to handle it. A well-built triage agent is the difference between a multi-agent system that feels intelligent and one that feels like a phone tree.

The triage agent has three core responsibilities: classify the request, select the right specialist, and transfer cleanly. It should not attempt to answer the question itself. Its job is routing, not resolution.

A Simple Triage Agent

Here is a basic triage agent that routes between three specialists:

from agents import Agent, Runner, handoff

sales_agent = Agent(
    name="Sales Agent",
    instructions="""You handle pricing questions, product comparisons,
    and purchase decisions. Be helpful and consultative.""",
)

support_agent = Agent(
    name="Support Agent",
    instructions="""You handle technical issues, bug reports, and
    troubleshooting. Be patient and methodical.""",
)

general_agent = Agent(
    name="General Agent",
    instructions="""You handle general inquiries that do not fit sales
    or support. Answer questions about the company, careers, and
    partnerships.""",
)

triage = Agent(
    name="Triage Agent",
    instructions="""You are a request router. Analyze the user's
    message and hand off to the correct specialist:

    - Sales Agent: pricing, plans, discounts, purchases, upgrades
    - Support Agent: bugs, errors, outages, how-to questions, config
    - General Agent: company info, careers, partnerships, everything else

    Hand off immediately. Do not answer questions yourself.""",
    handoffs=[
        handoff(sales_agent),
        handoff(support_agent),
        handoff(general_agent),
    ],
)

result = Runner.run_sync(triage, "How much does the enterprise plan cost?")
print(result.last_agent.name)  # Agent that produced the answer; should be "Sales Agent"
print(result.final_output)

This works for straightforward requests. But production traffic is messy. Users say things like "I think there is a bug with the billing page." Is that a support issue (a bug) or a sales issue (billing)? The basic triage agent guesses, and sometimes guesses wrong.

Adding Structured Classification

To make routing more reliable, give the triage agent a classification tool that forces structured reasoning:

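from enum import Enum

from agents import Agent, Runner, function_tool, handoff

class RequestCategory(str, Enum):
    SALES = "sales"
    SUPPORT = "support"
    GENERAL = "general"

@function_tool
def classify_request(
    category: str,
    confidence: float,
    reasoning: str,
) -> str:
    """Classify a user request into a category with confidence score.

    Args:
        category: One of 'sales', 'support', or 'general'
        confidence: Float between 0.0 and 1.0
        reasoning: Brief explanation of classification decision
    """
    # Reject categories outside the enum so a typo cannot pass through silently.
    valid = [c.value for c in RequestCategory]
    if category not in valid:
        return f"Invalid category {category!r}. Use one of: {', '.join(valid)}"
    # Clamp the self-reported score into the documented 0.0-1.0 range.
    confidence = max(0.0, min(1.0, confidence))
    return f"Category: {category} | Confidence: {confidence} | Reason: {reasoning}"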

triage = Agent(
    name="Triage Agent",
    instructions="""You are a request classifier. For every user message:

    1. Call classify_request with your assessment
    2. If confidence >= 0.8, hand off to the matching specialist
    3. If confidence < 0.8, ask the user a clarifying question

    Categories:
    - sales: pricing, plans, discounts, purchases, upgrades, demos
    - support: bugs, errors, outages, how-to, configuration, API issues
    - general: company info, careers, partnerships, feedback""",
    tools=[classify_request],
    handoffs=[
        handoff(sales_agent),
        handoff(support_agent),
        handoff(general_agent),
    ],
)

Now the triage agent is instructed to declare its classification, confidence, and reasoning before routing. Because that declaration is a tool call, it is recorded with the run and creates an auditable decision trail. When routing goes wrong, you can inspect the classification output to understand why.
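The threshold rule in the instructions above is enforced only by the prompt. One option is to mirror it in ordinary code so it can be unit-tested and applied to logged classifications. The following is a minimal sketch, not part of the SDK: `decide_route`, `RoutingDecision`, and the constant names are hypothetical, and the 0.8 cutoff is simply the value used in the instructions.

```python
from dataclasses import dataclass
from typing import Optional

VALID_CATEGORIES = {"sales", "support", "general"}
HANDOFF_THRESHOLD = 0.8  # same cutoff as in the triage instructions


@dataclass(frozen=True)
class RoutingDecision:
    action: str            # "handoff" or "clarify"
    target: Optional[str]  # category to hand off to, or None when clarifying
    reason: str


def decide_route(category: str, confidence: float) -> RoutingDecision:
    """Apply the confidence-threshold rule to one classification (hypothetical helper)."""
    normalized = category.strip().lower()
    if normalized not in VALID_CATEGORIES:
        return RoutingDecision("clarify", None, f"unknown category: {category!r}")
    if not 0.0 <= confidence <= 1.0:
        return RoutingDecision("clarify", None, f"confidence out of range: {confidence}")
    if confidence >= HANDOFF_THRESHOLD:
        return RoutingDecision("handoff", normalized, "confidence at or above threshold")
    return RoutingDecision("clarify", None, "confidence below threshold")


if __name__ == "__main__":
    print(decide_route("sales", 0.92))
    print(decide_route("support", 0.55))
```

Keeping the rule in one small function also makes it easy to replay past classifications against a different threshold before changing the prompt.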


Handling Ambiguous Requests

The confidence threshold approach handles much of this ambiguity. When the triage agent encounters a message like "I want to change my plan but the upgrade button is broken," it sees both sales and support signals. The confidence for either category alone should then fall below 0.8, triggering a clarifying question. Because the score is self-reported by the model rather than calibrated, treat 0.8 as a starting point and tune it against logged routing outcomes.

You can also handle multi-intent requests by splitting them:

@function_tool
def detect_intents(
    primary_intent: str,
    secondary_intent: str,
    primary_confidence: float,
    secondary_confidence: float,
) -> str:
    """Detect multiple intents in a single user message.

    Args:
        primary_intent: The main request category
        secondary_intent: A secondary category, or 'none'
        primary_confidence: Confidence for primary intent
        secondary_confidence: Confidence for secondary intent
    """
    if secondary_intent != "none":
        return f"Multi-intent: {primary_intent} ({primary_confidence}) + {secondary_intent} ({secondary_confidence})"
    return f"Single intent: {primary_intent} ({primary_confidence})"

When two intents are detected, the triage agent can address the primary intent first and note the secondary intent for follow-up.
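The "primary first, secondary as follow-up" behaviour can be sketched in plain Python. This is an illustration under assumptions: `IntentPlan`, `plan_intents`, and the 0.5 minimum confidence for keeping a secondary intent are all hypothetical choices, not values from the SDK.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IntentPlan:
    route_now: str                                        # category to hand off to first
    follow_ups: List[str] = field(default_factory=list)   # categories to revisit afterwards


def plan_intents(
    primary_intent: str,
    secondary_intent: str,
    secondary_confidence: float,
    min_secondary_confidence: float = 0.5,  # assumed cutoff, tune for your traffic
) -> IntentPlan:
    """Route the primary intent now and queue a credible secondary intent."""
    plan = IntentPlan(route_now=primary_intent)
    is_distinct = secondary_intent not in ("none", primary_intent)
    if is_distinct and secondary_confidence >= min_secondary_confidence:
        plan.follow_ups.append(secondary_intent)
    return plan


if __name__ == "__main__":
    # "I want to change my plan but the upgrade button is broken"
    print(plan_intents("support", "sales", 0.7))
```

The follow-up list can be passed along with the handoff so the specialist knows to return to the remaining topic once the first one is resolved.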

Fallback Strategies

Every triage system needs a fallback for requests that do not fit any category. There are three common fallback strategies:

1. Default specialist. Route unclassifiable requests to a general-purpose agent that can handle anything, even if less expertly.

2. Human escalation. If the triage agent cannot classify with sufficient confidence after one clarifying question, escalate to a human operator.

3. Graceful decline. Acknowledge that the request is outside the system's capabilities and suggest alternative channels.


fallback_agent = Agent(
    name="Fallback Agent",
    instructions="""You handle requests that could not be confidently
    routed to a specialist. Do your best to help, but proactively
    offer to connect the user with a human agent if you cannot fully
    resolve their issue.

    Say: 'I want to make sure you get the best help. Would you like
    me to connect you with a team member who specializes in this?'""",
)

triage = Agent(
    name="Triage Agent",
    instructions="""Classify and route requests. If confidence is below
    0.5 even after asking a clarifying question, hand off to the
    Fallback Agent.""",
    # The confidence score comes from the classify_request tool defined earlier.
    tools=[classify_request],
    handoffs=[
        handoff(sales_agent),
        handoff(support_agent),
        handoff(general_agent),
        handoff(fallback_agent),
    ],
)
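The three fallback strategies can be combined into a single decision rule. The sketch below shows one possible ordering, not a prescribed one: `Fallback`, `choose_fallback`, and the `in_scope` and `human_available` flags are hypothetical, and the 0.5 cutoff is borrowed from the triage instructions above.

```python
from enum import Enum


class Fallback(str, Enum):
    ASK_CLARIFYING = "ask_clarifying_question"
    DEFAULT_SPECIALIST = "default_specialist"
    HUMAN_ESCALATION = "human_escalation"
    GRACEFUL_DECLINE = "graceful_decline"


def choose_fallback(
    confidence: float,
    clarifications_asked: int,
    in_scope: bool = True,
    human_available: bool = True,
    low_confidence: float = 0.5,  # assumed cutoff, matches the instructions above
) -> Fallback:
    """Pick a fallback for a request that could not be routed confidently."""
    if not in_scope:
        # The system cannot help at all: decline and point to another channel.
        return Fallback.GRACEFUL_DECLINE
    if clarifications_asked < 1:
        # Always give the user one chance to clarify first.
        return Fallback.ASK_CLARIFYING
    if confidence < low_confidence and human_available:
        return Fallback.HUMAN_ESCALATION
    # Either confidence is moderate or no human is on call.
    return Fallback.DEFAULT_SPECIALIST


if __name__ == "__main__":
    print(choose_fallback(confidence=0.3, clarifications_asked=1).value)
```

Limiting clarification to one question keeps the user from being stuck in a loop of follow-up prompts before reaching someone who can help.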

Optimizing the Triage Agent for Speed

The triage agent sits in the critical path of every conversation. Every millisecond it spends reasoning is latency the user feels before getting a real answer. Two techniques help:

Use a fast model. The triage agent performs classification, not complex reasoning. GPT-4o-mini handles this task well and responds significantly faster than GPT-4o.

Minimize instructions. Keep the triage agent's system prompt short. It does not need background context about the company or detailed product knowledge — it only needs routing criteria.

triage = Agent(
    name="Triage Agent",
    model="gpt-4o-mini",
    instructions="Route to sales (pricing/plans), support (bugs/errors), or general (other). Hand off immediately.",
    handoffs=[handoff(sales_agent), handoff(support_agent), handoff(general_agent)],
)
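To check whether a faster model or a shorter prompt actually helps, it is worth measuring triage latency directly. A minimal sketch using only the standard library follows; `measure_latency` is a hypothetical helper, and `route_fn` stands in for whatever callable runs your triage agent on one message.

```python
import time
from statistics import median
from typing import Callable, Dict, Iterable


def measure_latency(route_fn: Callable[[str], object], messages: Iterable[str]) -> Dict[str, float]:
    """Time route_fn on each message and summarize wall-clock latency in milliseconds."""
    timings_ms = []
    for message in messages:
        start = time.perf_counter()
        route_fn(message)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    if not timings_ms:
        raise ValueError("messages must not be empty")
    return {"median_ms": median(timings_ms), "max_ms": max(timings_ms)}


if __name__ == "__main__":
    # Stand-in router for demonstration; replace with a call into your triage agent.
    def fake_router(message: str) -> str:
        return "sales" if "price" in message else "general"

    print(measure_latency(fake_router, ["What is the price?", "Are you hiring?"]))
```

Running the same message set against two triage configurations gives a like-for-like comparison, as long as network conditions are similar between runs.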

FAQ

Should the triage agent maintain conversation state?

No. The triage agent should be stateless. It classifies the current message and routes. The only context it needs is the current exchange, so that it can interpret the user's reply to its own clarifying question. It does not need to remember previous conversations. If a returning user sends a follow-up message, the triage agent classifies it fresh and routes accordingly. State management belongs in the specialist agents or in an external session store.
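An external session store can be very small. The sketch below keeps everything in memory for illustration; `InMemorySessionStore` and its method names are hypothetical, and a production system would back the same interface with a database or cache.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class InMemorySessionStore:
    """Hypothetical store holding per-session history outside the triage agent."""

    def __init__(self) -> None:
        self._messages: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
        self._active_specialist: Dict[str, str] = {}

    def add_message(self, session_id: str, role: str, content: str) -> None:
        self._messages[session_id].append((role, content))

    def messages(self, session_id: str) -> List[Tuple[str, str]]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._messages.get(session_id, []))

    def set_active_specialist(self, session_id: str, name: str) -> None:
        self._active_specialist[session_id] = name

    def active_specialist(self, session_id: str) -> Optional[str]:
        return self._active_specialist.get(session_id)


if __name__ == "__main__":
    store = InMemorySessionStore()
    store.add_message("session-1", "user", "How much does the enterprise plan cost?")
    store.set_active_specialist("session-1", "Sales Agent")
    print(store.active_specialist("session-1"), store.messages("session-1"))
```

With the history and the active specialist stored here, the triage agent itself can stay free of long-lived state.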

What if the user changes topics mid-conversation?

If the user starts with a billing question and then asks a technical question, the currently active specialist agent should detect that the new request is outside its scope and hand off back to the triage agent, or directly to the appropriate specialist if it has that handoff configured.

How do I measure triage accuracy?

Log every classification decision (category, confidence, reasoning) alongside the final outcome. Periodically review cases where the user had to repeat themselves or where the conversation was transferred multiple times — these indicate triage errors. Target 90%+ first-attempt routing accuracy.
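First-attempt accuracy can be computed directly from those logs. The sketch below assumes a hypothetical `TriageLog` record with a transfer count and a "user repeated themselves" flag; the field names and the way errors are labeled are assumptions about your logging, not something the SDK provides.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TriageLog:
    category: str
    confidence: float
    reasoning: str
    transfers: int        # handoffs between specialists after the initial routing
    user_repeated: bool   # the user had to restate the request


def is_first_attempt_success(log: TriageLog) -> bool:
    """A routing counts as correct when no re-transfer or repetition followed it."""
    return log.transfers == 0 and not log.user_repeated


def first_attempt_accuracy(logs: Iterable[TriageLog]) -> float:
    records: List[TriageLog] = list(logs)
    if not records:
        return 0.0
    successes = sum(1 for record in records if is_first_attempt_success(record))
    return successes / len(records)


if __name__ == "__main__":
    sample = [
        TriageLog("sales", 0.95, "asks about pricing", 0, False),
        TriageLog("support", 0.85, "reports an error", 0, False),
        TriageLog("general", 0.81, "unclear topic", 1, False),
        TriageLog("sales", 0.90, "mentions billing", 0, True),
    ]
    print(f"First-attempt accuracy: {first_attempt_accuracy(sample):.0%}")
```

Filtering the same records with `is_first_attempt_success` returning false gives the list of conversations to review by hand.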

