Call Transfer Patterns for AI Agents: Warm Transfer, Cold Transfer, and Conferencing

Master the three call transfer patterns for AI voice agents: cold transfer, warm transfer, and conferencing. Covers context passing, hold music, agent whisper, and seamless handoff implementation.

The Three Transfer Patterns

When an AI agent cannot fully resolve a caller's issue, it must transfer the call to a human. How that transfer happens dramatically affects customer experience. There are three patterns, each with distinct tradeoffs:

Cold Transfer — The AI connects the caller directly to the destination. The caller may hear ringing and must re-explain their issue. Fast but frustrating.

Warm Transfer — The AI first speaks to the human agent, passes context, then bridges the caller in. The caller does not repeat themselves. Slower but much better experience.

Conference Transfer — The AI, caller, and human agent are briefly all on the same call. The AI introduces the situation, then drops off. Best for complex handoffs.

Cold Transfer Implementation

Cold transfer is the simplest pattern. The AI terminates its leg of the call and connects the caller directly to the destination:

flowchart LR
    CALLER(["Caller"])
    subgraph TEL["Telephony"]
        SIP["Twilio SIP and PSTN"]
    end
    subgraph BRAIN["Business AI Agent"]
        STT["Streaming STT<br/>Deepgram or Whisper"]
        NLU{"Intent and<br/>Entity Extraction"}
        TOOLS["Tool Calls"]
        TTS["Streaming TTS<br/>ElevenLabs or Rime"]
    end
    subgraph DATA["Live Data Plane"]
        CRM[("CRM and Notes")]
        CAL[("Calendar and<br/>Schedule")]
        KB[("Knowledge Base<br/>and Policies")]
    end
    subgraph OUT["Outcomes"]
        O1(["Booking captured"])
        O2(["CRM record created"])
        O3(["Human handoff"])
    end
    CALLER --> SIP --> STT --> NLU
    NLU -->|Lookup| TOOLS
    TOOLS <--> CRM
    TOOLS <--> CAL
    TOOLS <--> KB
    NLU --> TTS --> SIP --> CALLER
    NLU -->|Resolved| O1
    NLU -->|Schedule| O2
    NLU -->|Escalate| O3
    style CALLER fill:#f1f5f9,stroke:#64748b,color:#0f172a
    style NLU fill:#4f46e5,stroke:#4338ca,color:#fff
    style O1 fill:#059669,stroke:#047857,color:#fff
    style O2 fill:#0ea5e9,stroke:#0369a1,color:#fff
    style O3 fill:#f59e0b,stroke:#d97706,color:#1f2937

from twilio.twiml.voice_response import VoiceResponse, Dial
from fastapi import FastAPI, Request
from fastapi.responses import Response

app = FastAPI()

@app.post("/cold-transfer")
async def cold_transfer(request: Request):
    """Transfer the caller directly to a human agent."""
    form = await request.form()
    call_sid = form.get("CallSid")

    # Log the transfer context before disconnecting.
    # save_transfer_context is an application-specific helper not shown here.
    await save_transfer_context(call_sid, {
        "reason": "billing_dispute",
        "caller_sentiment": "frustrated",
        "summary": "Caller disputing charge of $49.99 from March 3",
    })

    response = VoiceResponse()
    response.say(
        "I am connecting you to a billing specialist now. "
        "Please hold."
    )

    dial = Dial(
        caller_id=form.get("From"),  # Preserve caller ID
        timeout=30,
        action="/transfer-complete",  # Called when dial ends
    )
    dial.number(
        "+15559876543",
        status_callback="/agent-answered",
        status_callback_event="initiated ringing answered completed",
    )
    response.append(dial)

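    # No verbs follow <Dial> here: because the Dial sets an action URL,
    # Twilio requests /transfer-complete when the dial ends and never
    # reaches TwiML placed after it in this response.
    return Response(content=str(response), media_type="application/xml")

@app.post("/transfer-complete")
async def transfer_complete(request: Request):
    """Handle the end of the dial; fall back if no agent answered."""
    form = await request.form()
    response = VoiceResponse()

    if form.get("DialCallStatus") in ("completed", "answered"):
        # The agent took the call, so there is nothing left to do
        response.hangup()
    else:
        # Fallback if agent does not answer (busy, no-answer, failed, canceled)
        response.say(
            "I am sorry, no agent is available right now. "
            "Let me take a message."
        )
        response.redirect("/take-message")

    return Response(content=str(response), media_type="application/xml")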

Warm Transfer Implementation

Warm transfer requires managing three call legs: the original call (on hold), a whisper call to the agent, and the final bridged call:

from twilio.rest import Client
import os

twilio_client = Client()

class WarmTransferManager:
    """Manages warm transfers with context passing."""

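    def __init__(self, twilio_client, webhook_base, redis_client):
        self.client = twilio_client
        self.webhook_base = webhook_base
        # Async Redis client (e.g. redis.asyncio.Redis) used by store_context
        self.redis = redis_client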

    async def initiate_warm_transfer(
        self, call_sid: str, agent_number: str, context: dict
    ):
        """Start the warm transfer process."""
        # Step 1: Put the caller on hold with music
        self.client.calls(call_sid).update(
            twiml='<Response><Play loop="0">'
                  'https://api.twilio.com/cowbell.mp3'
                  '</Play></Response>',
        )

        # Step 2: Store context for the whisper
        await self.store_context(call_sid, context)

        # Step 3: Call the human agent with a whisper
        whisper_call = self.client.calls.create(
            to=agent_number,
            from_=os.environ["TWILIO_NUMBER"],
            url=(
                f"{self.webhook_base}/agent-whisper"
                f"?original_call={call_sid}"
            ),
            status_callback=f"{self.webhook_base}/whisper-status",
        )

        return whisper_call.sid

    async def store_context(self, call_sid: str, context: dict):
        """Store transfer context for the receiving agent."""
        import json
        # Use Redis for fast retrieval during the whisper
        await self.redis.set(
            f"transfer:{call_sid}",
            json.dumps(context),
            ex=300,  # 5 minute TTL
        )

@app.post("/agent-whisper")
async def agent_whisper(request: Request):
    """Play context to the human agent before bridging."""
    # original_call arrives in the webhook URL's query string, not the POST body
    original_call = request.query_params.get("original_call")

    # Retrieve the transfer context
    context = await get_transfer_context(original_call)

    response = VoiceResponse()

    # Whisper: only the agent hears this
    whisper_text = (
        f"Incoming transfer. Caller: {context['caller_name']}. "
        f"Issue: {context['summary']}. "
        f"Sentiment: {context['sentiment']}. "
        f"Press 1 to accept, 2 to decline."
    )

    # Nest the whisper inside <Gather> so a key press during the message counts
    gather = response.gather(
        num_digits=1,
        action=f"/agent-accept?original_call={original_call}",
        timeout=10,
    )
    gather.say(whisper_text, voice="Polly.Joanna")

    # Timeout fallback: treat no input as a decline. The held caller still
    # needs to be returned to the AI, for example from /whisper-status.
    response.say("No response received. Transfer cancelled.")
    response.hangup()

    return Response(content=str(response), media_type="application/xml")

@app.post("/agent-accept")
async def agent_accept_transfer(request: Request):
    """Bridge the caller and agent after acceptance."""
    form = await request.form()
    digit = form.get("Digits")
    # original_call was passed in the action URL's query string
    original_call = request.query_params.get("original_call")

    response = VoiceResponse()

    if digit == "1":
        # Agent accepted — bridge the calls via conference
        conference_name = f"transfer-{original_call}"

        # Connect the agent to the conference
        dial = Dial()
        dial.conference(conference_name, end_conference_on_exit=True)
        response.append(dial)

        # Move the original caller into the same conference
        twilio_client.calls(original_call).update(
            twiml=(
                f'<Response><Dial><Conference>'
                f'{conference_name}'
                f'</Conference></Dial></Response>'
            ),
        )
    else:
        response.say("Transfer declined.")
        response.hangup()
        # Return caller to AI agent
        twilio_client.calls(original_call).update(
            url=f"{os.environ['WEBHOOK_BASE']}/return-to-ai",
            method="POST",
        )

    return Response(content=str(response), media_type="application/xml")
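
The /agent-whisper handler calls get_transfer_context, which is not shown above. A minimal sketch of what it might look like follows, assuming the context was saved as JSON under the same transfer:{call_sid} key that store_context uses. The explicit store argument, the InMemoryStore stand-in, and the default values are assumptions made so the example runs without a Redis server; a real deployment would pass an async Redis client instead.

```python
import asyncio
import json
from typing import Any, Optional


class InMemoryStore:
    """Hypothetical stand-in for an async Redis client, for local runs."""

    def __init__(self) -> None:
        self._data = {}

    async def set(self, key: str, value: str, ex: Optional[int] = None) -> None:
        # A real Redis client would expire the key after `ex` seconds.
        self._data[key] = value

    async def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


async def get_transfer_context(store: Any, call_sid: str) -> dict:
    """Load the context saved under transfer:{call_sid}, with safe defaults."""
    raw = await store.get(f"transfer:{call_sid}")
    context = json.loads(raw) if raw else {}
    # Defaults keep the whisper readable if the key expired or a field is missing.
    return {
        "caller_name": context.get("caller_name", "unknown caller"),
        "summary": context.get("summary", "no summary available"),
        "sentiment": context.get("sentiment", "unknown"),
    }


async def demo() -> dict:
    store = InMemoryStore()
    await store.set(
        "transfer:CA123",
        json.dumps({"caller_name": "Dana", "summary": "billing dispute"}),
        ex=300,
    )
    return await get_transfer_context(store, "CA123")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Returning defaults instead of raising means an expired key (the TTL above is five minutes) degrades the whisper rather than failing the webhook mid-transfer.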

Conference Transfer (Three-Way Introduction)

The conference pattern keeps all three parties briefly on the same call:

import asyncio

class ConferenceTransferManager:
    """Three-way conference transfer with AI introduction."""

    async def initiate_conference_transfer(
        self, call_sid: str, agent_number: str, context: dict
    ):
        """Set up a three-way call for handoff."""
        conference_name = f"handoff-{call_sid}"

        # Move caller into a conference (from hold)
        twilio_client.calls(call_sid).update(
            twiml=(
                f'<Response>'
                f'<Say>I am bringing in a specialist now.</Say>'
                f'<Dial><Conference>{conference_name}'
                f'</Conference></Dial></Response>'
            ),
        )

        # Add the AI agent to the conference (for introduction)
        ai_participant = twilio_client.conferences(
            conference_name
        ).participants.create(
            from_=os.environ["TWILIO_NUMBER"],
            to="sip:[email protected]",
            early_media=True,
        )

        # Add the human agent
        human_participant = twilio_client.conferences(
            conference_name
        ).participants.create(
            from_=os.environ["TWILIO_NUMBER"],
            to=agent_number,
            early_media=True,
        )

        return conference_name

    async def ai_introduction(self, conference_name, context):
        """AI speaks the introduction then leaves."""
        intro_text = (
            f"Hello everyone. I have {context['caller_name']} on "
            f"the line who needs help with {context['summary']}. "
            f"I will leave you to it."
        )
        # Speak the introduction via TTS. speak_in_conference and
        # remove_ai_from_conference are helpers not shown here.
        await self.speak_in_conference(conference_name, intro_text)

        # Remove the AI from the conference
        await asyncio.sleep(2)  # Brief pause after speaking
        await self.remove_ai_from_conference(conference_name)
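
If the introduction is ever delivered as TwiML rather than through the AI's own audio stream, the caller name and summary end up inside XML, so they should be escaped first. The sketch below shows one way to do that with the standard library. The function name build_intro_twiml is hypothetical, and this is an illustration of the escaping step only, not the article's speak_in_conference helper.

```python
from xml.sax.saxutils import escape


def build_intro_twiml(context: dict) -> str:
    """Return TwiML that speaks the handoff introduction."""
    intro_text = (
        f"Hello everyone. I have {context['caller_name']} on "
        f"the line who needs help with {context['summary']}. "
        f"I will leave you to it."
    )
    # escape() converts &, <, and > so caller-supplied text cannot break the XML.
    return f"<Response><Say>{escape(intro_text)}</Say></Response>"


if __name__ == "__main__":
    print(build_intro_twiml(
        {"caller_name": "Sam", "summary": "a refund for order A&B"}
    ))
```

The same precaution applies to the f-string TwiML used elsewhere in this article whenever an interpolated value comes from the caller or the CRM.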

Context Passing Best Practices

The value of a warm transfer is the context. Structure it well:

from dataclasses import dataclass
from typing import Optional

@dataclass
class TransferContext:
    """Structured context passed during call transfer."""
    caller_name: str
    caller_number: str
    call_duration_seconds: int
    issue_summary: str
    sentiment: str  # positive, neutral, frustrated, angry
    intent: str
    actions_taken: list[str]
    information_collected: dict
    previous_transfers: int
    preferred_language: str = "en"
    priority: str = "normal"
    notes: Optional[str] = None

    def to_whisper_script(self) -> str:
        """Generate a concise whisper message for the agent."""
        actions = ", ".join(self.actions_taken) if self.actions_taken else "none yet"
        return (
            f"Caller: {self.caller_name}. "
            f"Issue: {self.issue_summary}. "
            f"Mood: {self.sentiment}. "
            f"Already tried: {actions}. "
            f"Priority: {self.priority}."
        )

    def to_screen_pop(self) -> dict:
        """Generate data for the agent's screen pop display."""
        return {
            "caller": self.caller_name,
            "phone": self.caller_number,
            "summary": self.issue_summary,
            "sentiment_color": {
                "positive": "green",
                "neutral": "yellow",
                "frustrated": "orange",
                "angry": "red",
            }.get(self.sentiment, "yellow"),
            "history": self.actions_taken,
            "collected_data": self.information_collected,
            "transfer_count": self.previous_transfers,
        }

FAQ

When should I use warm transfer versus cold transfer?

Use cold transfer for simple routing where context is not critical — e.g., transferring to a general queue. Use warm transfer when the caller has already explained their issue to the AI and repeating it would cause frustration — especially for complaints, complex issues, or VIP callers. The extra 10-15 seconds for a warm transfer pays for itself in customer satisfaction.
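
The rule of thumb above can be written down as a small routing function. This is a sketch under assumptions: the CallSituation flags and the needs_live_introduction switch are hypothetical names, and a real agent would likely derive them from intent and sentiment signals.

```python
from dataclasses import dataclass


@dataclass
class CallSituation:
    """Hypothetical flags an AI agent might track during a call."""
    issue_explained: bool = False          # caller already described the problem
    is_complaint: bool = False
    is_vip: bool = False
    needs_live_introduction: bool = False  # complex handoff, AI should introduce


def choose_transfer_pattern(situation: CallSituation) -> str:
    """Return 'cold', 'warm', or 'conference' for the given situation."""
    if situation.needs_live_introduction:
        return "conference"
    if situation.issue_explained or situation.is_complaint or situation.is_vip:
        # Repeating the issue would frustrate the caller, so pass context first.
        return "warm"
    # Simple routing where context is not critical, e.g. a general queue.
    return "cold"


if __name__ == "__main__":
    print(choose_transfer_pattern(CallSituation()))
    print(choose_transfer_pattern(CallSituation(issue_explained=True)))
    print(choose_transfer_pattern(CallSituation(needs_live_introduction=True)))
```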

How do I handle the case where the human agent does not answer?

Implement a timeout with fallback logic. After 20-30 seconds of ringing, cancel the transfer and either return the caller to the AI agent, offer to take a message, or try an alternative agent. Always inform the caller what is happening: "Our specialist is not available right now. Would you like me to take a message, or would you prefer to try again later?"
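
The fallback decision can be kept separate from the webhook code so it is easy to test. The sketch below assumes the status string comes from the DialCallStatus parameter Twilio sends to a Dial action URL; the next_step function and its return labels are hypothetical.

```python
from typing import List, Optional, Tuple

# DialCallStatus values that mean the agent was not reached.
UNANSWERED = {"busy", "no-answer", "failed", "canceled"}


def next_step(
    dial_status: str, remaining_agents: List[str]
) -> Tuple[str, Optional[str]]:
    """Decide what to do after a transfer attempt ends.

    Returns (action, number): action is 'done', 'dial', or 'take_message';
    number is set only when action is 'dial'.
    """
    if dial_status not in UNANSWERED:
        return ("done", None)
    if remaining_agents:
        # Try the next agent before giving up on a live handoff.
        return ("dial", remaining_agents[0])
    return ("take_message", None)


if __name__ == "__main__":
    print(next_step("no-answer", ["+15550000001"]))
    print(next_step("busy", []))
    print(next_step("completed", []))
```

A webhook would call next_step with the status it received and the agents not yet tried, then emit the matching TwiML, telling the caller what is happening before each new attempt.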

How do I pass context to the agent's screen in addition to the whisper?

Use a parallel HTTP notification. When you initiate the warm transfer, simultaneously POST the TransferContext data to your contact center's API or the agent's desktop application. Most modern contact center platforms (Five9, Genesys, Talkdesk) have APIs for screen pops. The whisper provides audio context, and the screen pop provides visual context — both arrive before the caller is bridged in.
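
A sketch of the parallel notification, assuming two coroutines supplied by the application: start_whisper, which places the whisper call and returns its SID, and send_screen_pop, which POSTs the payload to the contact center. Both names are hypothetical, and the demo uses fakes in place of real network calls.

```python
import asyncio


async def transfer_with_screen_pop(start_whisper, send_screen_pop, screen_pop):
    """Start the whisper call and the screen pop at the same time.

    Returns (whisper_sid, screen_pop_delivered).
    """
    whisper_result, pop_result = await asyncio.gather(
        start_whisper(),
        send_screen_pop(screen_pop),
        return_exceptions=True,
    )
    if isinstance(whisper_result, Exception):
        # Without the whisper call there is no transfer, so surface the error.
        raise whisper_result
    # A failed screen pop should not block the call; report it instead.
    return whisper_result, not isinstance(pop_result, Exception)


async def demo():
    async def fake_whisper():
        return "CA-whisper-demo"

    async def fake_screen_pop(payload):
        return None

    return await transfer_with_screen_pop(
        fake_whisper, fake_screen_pop, {"caller": "Dana"}
    )


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Treating the screen pop as best effort is a design choice: the agent still hears the whisper if the desktop integration is down.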

