
Post-Call Analytics with GPT-4o-mini: Sentiment, Lead Scoring, and Intent

Build a post-call analytics pipeline with GPT-4o-mini — sentiment, intent, lead scoring, satisfaction, and escalation detection.

The cheap AI that earns its keep

Running the Realtime API for live conversation is expensive. Running GPT-4o-mini over the transcript afterwards is nearly free — and it is where most of the operational insight actually comes from. Sentiment, intent, lead score, satisfaction, escalation reason: all of it falls out of one structured JSON call per transcript.

This post walks through the post-call analytics pipeline CallSphere runs in production, including the exact schema, the prompt, and the queue architecture that keeps it off the hot path.

call ends
   │
   ▼
queue.publish(post_call, {transcript, metadata})
   │
   ▼
worker pulls
   │
   ▼
GPT-4o-mini call with JSON schema
   │
   ▼
UPSERT call_analytics
   │
   ▼
trigger downstream (CRM, dashboards)

Architecture overview

┌────────────────────┐
│ Voice agent runtime│
└─────────┬──────────┘
          │ on_call_end
          ▼
┌────────────────────┐
│ Queue (SQS/Redis)  │
└─────────┬──────────┘
          ▼
┌────────────────────┐
│ Analytics worker   │
│ • GPT-4o-mini call │
│ • JSON validation  │
└─────────┬──────────┘
          ▼
┌────────────────────┐
│ call_analytics     │
└─────────┬──────────┘
          ▼
   dashboards, CRM,
   alerts, exports
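The on_call_end hook only needs to serialize and publish; everything else happens in the worker. A minimal sketch of the payload builder (the queue client itself, whether SQS or Redis, is whatever you already run; the function name here is illustrative):

```python
import json
import time

def build_post_call_event(call_id: str, transcript: list[dict], metadata: dict) -> str:
    # Serialize once at the edge of the voice runtime; the worker only ever
    # sees this JSON envelope, never the live call object.
    return json.dumps({
        "event": "post_call",
        "call_id": call_id,
        "ended_at": int(time.time()),
        "transcript": transcript,  # [{role, text}] turns
        "metadata": metadata,
    })
```

Publishing is then one line against your queue client, e.g. redis.lpush("post_call", build_post_call_event(...)).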

Prerequisites

  • A queue for background jobs.
  • Postgres (or another analytics-friendly store) for the analytics table.
  • An OpenAI key with GPT-4o-mini access.
  • The call transcript in a structured [{role, text}] format.

Step-by-step walkthrough

1. Define the output schema

ANALYTICS_SCHEMA = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "sentiment_score": {"type": "number", "minimum": -1, "maximum": 1},
        "intent": {"type": "string"},
        "lead_score": {"type": "integer", "minimum": 0, "maximum": 100},
        "satisfaction": {"type": "integer", "minimum": 1, "maximum": 5},
        "escalated": {"type": "boolean"},
        "escalation_reason": {"type": ["string", "null"]},
        "next_action": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["summary", "sentiment", "intent", "lead_score", "satisfaction", "escalated", "next_action"],
}
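JSON mode guarantees syntactically valid JSON but not schema conformance, so every response should be checked. In production you would likely reach for the jsonschema package; here is a dependency-free sketch covering just the keywords this schema uses (required, enum, minimum, maximum):

```python
def validate_against_schema(obj: dict, schema: dict) -> list[str]:
    # Returns a list of human-readable errors; an empty list means valid.
    errors = []
    for key in schema.get("required", []):
        if key not in obj:
            errors.append(f"missing required field: {key}")
    for key, rules in schema.get("properties", {}).items():
        if key not in obj:
            continue
        value = obj[key]
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{key}: {value!r} not in {rules['enum']}")
        # bool is an int subclass in Python; exclude it from range checks.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            if "minimum" in rules and value < rules["minimum"]:
                errors.append(f"{key}: {value} below minimum {rules['minimum']}")
            if "maximum" in rules and value > rules["maximum"]:
                errors.append(f"{key}: {value} above maximum {rules['maximum']}")
    return errors
```

Call it with the analytics output and ANALYTICS_SCHEMA; a non-empty return feeds the retry path described in step 5.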

2. Write the worker

import json

from openai import AsyncOpenAI

client = AsyncOpenAI()

PROMPT = """
You are an analyst reviewing a completed phone call between a customer and an AI voice agent.
Return a JSON object matching the provided schema. Be concise and accurate.
Do not invent facts. If something is unclear, say so in the summary.
"""

async def analyze(transcript: list[dict]) -> dict:
    text = "\n".join(f"{t['role']}: {t['text']}" for t in transcript)
    resp = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
        temperature=0.1,
    )
    return json.loads(resp.choices[0].message.content)
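The surrounding worker loop is mostly plumbing. One unit of work, sketched with injected dependencies so it can be tested without a live queue or database (analyze_fn and upsert_fn are stand-ins for the real calls; the production version would be async):

```python
import json

def process_event(event_json: str, analyze_fn, upsert_fn) -> dict:
    # Decode the queued event, run the analysis, persist the result.
    # Dependency injection keeps this testable offline.
    event = json.loads(event_json)
    result = analyze_fn(event["transcript"])
    upsert_fn(event["call_id"], result)
    return result
```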

3. Persist and index

CREATE TABLE call_analytics (
  call_id TEXT PRIMARY KEY,
  summary TEXT,
  sentiment TEXT,
  sentiment_score REAL,
  intent TEXT,
  lead_score INT,
  satisfaction INT,
  escalated BOOLEAN,
  escalation_reason TEXT,
  next_action TEXT,
  tags TEXT[],
  created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX ON call_analytics (sentiment, created_at);
CREATE INDEX ON call_analytics (lead_score DESC) WHERE lead_score >= 70;

4. Trigger downstream actions

async def on_analytics(result: dict, call_id: str):
    if result["lead_score"] >= 75:
        await hubspot_log_hot_lead(call_id, result)
    if result["escalated"]:
        await pager_alert(call_id, result["escalation_reason"])

5. Handle failures gracefully

Validate the JSON against the schema. On failure, retry once with a "fix your previous output" prompt. On repeated failure, park the event in a DLQ for manual review.
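Putting that together: validate, retry once with the errors fed back, then park. A sketch with injected dependencies (analyze_fn, validate_fn, and park_fn are illustrative stand-ins; the production version would be async):

```python
def analyze_with_retry(transcript, analyze_fn, validate_fn, park_fn):
    # First attempt.
    result = analyze_fn(transcript, fix_errors=None)
    errors = validate_fn(result)
    if not errors:
        return result
    # One retry, feeding the validation errors back into the prompt.
    result = analyze_fn(transcript, fix_errors=errors)
    errors = validate_fn(result)
    if not errors:
        return result
    # Still broken: park in the dead-letter queue for manual review.
    park_fn(transcript, result, errors)
    return None
```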


6. Sample and spot-check

Every day, have a human reviewer grade 10 random analytics outputs for accuracy. Drift in the base model shows up here first.
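Pulling the daily review sample is a one-liner against the analytics table (Postgres syntax; random() ordering is fine at this sample size):

```sql
SELECT * FROM call_analytics
WHERE created_at > now() - interval '1 day'
ORDER BY random()
LIMIT 10;
```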


Production considerations

  • Cost: GPT-4o-mini is ~$0.15/1M input tokens. A 5-minute call is roughly $0.001 to analyze.
  • Latency: this runs async, so latency does not affect the caller, but keep the worker under 10s to avoid backlog.
  • PII: redact credit cards and SSNs before sending the transcript to the LLM.
  • Schema evolution: version the schema and store the version alongside the row.
  • Bias monitoring: spot-check scores across demographics to avoid systematic skew.
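To sanity-check the cost bullet, here is a rough back-of-envelope helper, assuming about 4 characters per token (a common heuristic for English text) and counting input tokens only:

```python
def input_cost_usd(transcript_chars: int,
                   usd_per_million_tokens: float = 0.15,
                   chars_per_token: float = 4.0) -> float:
    # Rough estimate: derive a token count from character count,
    # then apply linear per-token pricing.
    tokens = transcript_chars / chars_per_token
    return tokens / 1_000_000 * usd_per_million_tokens
```

A 5-minute call at roughly 6,000 transcript characters comes out around $0.0002 of input, consistent with the ~$0.001 figure once output tokens and the system prompt are included.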

CallSphere's real implementation

CallSphere runs exactly this pipeline for every call across every vertical. The voice plane uses the OpenAI Realtime API with gpt-4o-realtime-preview-2025-06-03, PCM16 at 24kHz, and server VAD. When a call ends, the transcript plus metadata is published to a queue, and a worker calls GPT-4o-mini with a JSON schema almost identical to the one above, then writes the result into per-vertical Postgres.

The healthcare vertical tunes the schema for insurance and clinical intent signals (14 tools), real estate uses tighter lead-scoring and tour-booking intent (10 agents), salon optimizes for rebooking and upsell (4 agents), after-hours escalation focuses on urgency classification (7 tools), IT helpdesk combines intent with RAG-hit quality (10 tools + RAG), and the ElevenLabs sales pod tracks objection categories (5 GPT-4 specialists). All of them feed the same admin dashboard. CallSphere runs 57+ languages with analytics computed identically across them.

Common pitfalls

  • Running analytics synchronously: it blocks the next call.
  • Trusting the JSON without validation: small JSON errors blow up downstream.
  • Mixing verticals in one prompt: every vertical needs its own schema.
  • Ignoring drift: spot-check or you will miss regressions.
  • Logging raw PII: use field-level encryption for the summary column.

FAQ

Why GPT-4o-mini and not the full model?

Cost. GPT-4o-mini is accurate enough for analytics and 10-20x cheaper.

How do I keep dashboards fast as call volume grows?

Roll up nightly into a summary table; do not re-query the raw table every time.
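A nightly rollup might look like this (sketch; call_analytics_daily is a hypothetical summary table, driven by pg_cron or a plain cron job):

```sql
INSERT INTO call_analytics_daily
  (day, calls, avg_sentiment_score, avg_lead_score, escalations)
SELECT
  date_trunc('day', created_at) AS day,
  count(*),
  avg(sentiment_score),
  avg(lead_score),
  count(*) FILTER (WHERE escalated)
FROM call_analytics
WHERE created_at >= date_trunc('day', now() - interval '1 day')
  AND created_at <  date_trunc('day', now())
GROUP BY 1
ON CONFLICT (day) DO UPDATE SET
  calls = EXCLUDED.calls,
  avg_sentiment_score = EXCLUDED.avg_sentiment_score,
  avg_lead_score = EXCLUDED.avg_lead_score,
  escalations = EXCLUDED.escalations;
```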


Can I use the same output to route follow-ups?

Yes — the next_action field is designed for it.

What about multi-language calls?

GPT-4o-mini handles 50+ languages well for sentiment and intent.

How do I correlate analytics with business outcomes?

Join call_analytics.call_id to your CRM deal closure data.

Next steps

Want sentiment, intent, and lead scoring on every call? Book a demo, explore the technology page, or see pricing.

#CallSphere #PostCallAnalytics #GPT4oMini #VoiceAI #Sentiment #LeadScoring #AIVoiceAgents

