
CrewAI Callbacks and Event Hooks: Monitoring Agent Progress in Real Time

Implement step callbacks, task callbacks, and custom event handlers in CrewAI to monitor agent reasoning in real time, log progress, and build observable multi-agent systems.

Why Observability Matters in Multi-Agent Systems

When a single LLM call produces unexpected output, you read the prompt and response. When a crew of five agents runs for three minutes and produces a poor result, debugging is exponentially harder. Which agent went off track? At which step? Did a tool return bad data? Did an agent misinterpret context from a previous task?

CrewAI's callback system solves this by giving you hooks into every step of agent execution. You can log progress, track costs, save intermediate results, send notifications, or halt execution — all without modifying your agent or task definitions.

Task Callbacks

The simplest callback is attached at the task level. It fires when a task completes and receives the task output.

[Diagram: a hierarchical crew. A manager agent routes Task A (research), Task B (analyze), and Task C (draft) to researcher, analyst, and writer agents; the researcher uses tools (web search, files), and the writer produces the final crew output.]

Attaching a task callback looks like this:
from crewai import Agent, Task
import json
from datetime import datetime

def on_task_complete(output):
    # `output` is a TaskOutput: log a compact, structured summary
    log_entry = {
        "timestamp": datetime.now().isoformat(),
        "description": output.description[:80],
        "output_length": len(output.raw),
        "output_preview": output.raw[:200],
    }
    print(f"[TASK DONE] {json.dumps(log_entry, indent=2)}")

researcher = Agent(
    role="Researcher",
    goal="Find accurate data",
    backstory="Expert researcher.",
)

task = Task(
    description="Research the top 5 AI startups funded in 2026.",
    expected_output="A numbered list with company name, funding amount, and focus area.",
    agent=researcher,
    callback=on_task_complete,
)

The callback receives a TaskOutput object with properties including raw (the string output), description (the task description), and agent (the role of the agent that executed it). This is your primary tool for logging what each task produced.
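
Because the callback receives the full TaskOutput, it is also a natural place to persist intermediate results so a failed run keeps its partial work. A minimal sketch (the task_outputs.jsonl filename is just an example):

import json

def save_output(output):
    # Append one JSON line per completed task.
    with open("task_outputs.jsonl", "a") as f:
        f.write(json.dumps({
            "agent": str(output.agent),
            "description": output.description,
            "raw": output.raw,
        }) + "\n")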

Step Callbacks

Step callbacks fire at each reasoning step within an agent's execution loop. They provide granular visibility into the agent's thought process, tool calls, and intermediate outputs:

from crewai import Agent

def on_agent_step(step_output):
    # The object passed here varies by CrewAI version (an agent action,
    # a tool result, or a final answer), so read attributes defensively.
    print(f"[STEP] Type: {type(step_output).__name__}")
    tool = getattr(step_output, "tool", None)
    if tool:
        print(f"[STEP] Tool used: {tool}")
        print(f"[STEP] Tool input: {getattr(step_output, 'tool_input', '')}")
    result = getattr(step_output, "result", None) or getattr(step_output, "text", "")
    print(f"[STEP] Output: {str(result)[:150]}...")
    print("---")

researcher = Agent(
    role="Researcher",
    goal="Find accurate data using web search",
    backstory="Expert online researcher.",
    step_callback=on_agent_step,
    verbose=True,
)

Step callbacks let you see exactly what the agent is thinking at each iteration. When an agent makes a bad tool call or misinterprets data, the step callback captures the exact moment things went wrong.

Building a Structured Logger

For production systems, combine callbacks with a structured logging system:

import logging
import json
from datetime import datetime

logging.basicConfig(
    filename="crew_execution.log",
    level=logging.INFO,
    format="%(message)s",
)

class CrewLogger:
    def __init__(self, crew_name: str):
        self.crew_name = crew_name
        self.task_count = 0  # tasks completed so far

    def on_task_complete(self, output):
        self.task_count += 1
        entry = {
            "crew": self.crew_name,
            "event": "task_complete",
            "task_number": self.task_count,
            "timestamp": datetime.now().isoformat(),
            "description": output.description[:100],
            "output_chars": len(output.raw),
        }
        logging.info(json.dumps(entry))

    def on_step(self, step_output):
        entry = {
            "crew": self.crew_name,
            "event": "agent_step",
            "task_number": self.task_count + 1,  # the task currently running
            "timestamp": datetime.now().isoformat(),
            # step payload types vary by CrewAI version; stringify defensively
            "action": str(getattr(step_output, "action", step_output))[:100],
        }
        logging.info(json.dumps(entry))

logger = CrewLogger("market_research")

Use the logger with your agents and tasks:

researcher = Agent(
    role="Researcher",
    goal="Find data",
    backstory="Expert researcher.",
    step_callback=logger.on_step,
)

task = Task(
    description="Research AI market trends.",
    expected_output="A summary of 5 trends.",
    agent=researcher,
    callback=logger.on_task_complete,
)

This produces a structured log file that can be ingested by any log aggregation system — ELK, Datadog, CloudWatch, or a simple script that parses JSON lines.
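
Because each line is standalone JSON, analysis scripts stay trivial. For instance, to summarize a run from the file above:

import json

with open("crew_execution.log") as f:
    events = [json.loads(line) for line in f if line.strip()]

steps = [e for e in events if e["event"] == "agent_step"]
tasks = [e for e in events if e["event"] == "task_complete"]
print(f"{len(steps)} agent steps across {len(tasks)} completed tasks")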

Cost Tracking with Callbacks

One of the most practical uses of callbacks is tracking how much work each run performs. Step counts and tool calls are a rough proxy for token usage and cost:

class CostTracker:
    def __init__(self):
        self.total_steps = 0
        self.tool_calls = 0
        self.tasks_completed = 0

    def on_step(self, step_output):
        self.total_steps += 1
        # tool attribute presence varies by payload type; probe defensively
        if getattr(step_output, "tool", None):
            self.tool_calls += 1

    def on_task_complete(self, output):
        self.tasks_completed += 1

    def summary(self):
        return {
            "total_steps": self.total_steps,
            "tool_calls": self.tool_calls,
            "tasks_completed": self.tasks_completed,
            "avg_steps_per_task": (
                self.total_steps / self.tasks_completed
                if self.tasks_completed > 0
                else 0
            ),
        }

tracker = CostTracker()

After a crew run, call tracker.summary() to understand how much work each execution required. Track this over time to identify optimization opportunities.
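
Wiring the tracker into a crew takes one argument per hook. A minimal sketch, reusing the researcher and task from earlier (model credentials are assumed to be configured in your environment):

from crewai import Crew

crew = Crew(
    agents=[researcher],
    tasks=[task],
    step_callback=tracker.on_step,           # fires on every agent step
    task_callback=tracker.on_task_complete,  # fires when each task finishes
)

result = crew.kickoff()
print(tracker.summary())

If you need actual token counts rather than step counts, recent CrewAI versions also attach usage metrics to the crew object after kickoff(); check your version's documentation for the exact attribute.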


Halting Execution from Callbacks

While CrewAI does not natively support halting execution from a callback, you can raise an exception to stop a run:

class SafetyGuard:
    def __init__(self, max_steps: int = 50):
        self.max_steps = max_steps
        self.step_count = 0

    def on_step(self, step_output):
        self.step_count += 1
        if self.step_count > self.max_steps:
            raise RuntimeError(
                f"Safety limit reached: {self.max_steps} steps exceeded. "
                "Agent may be in a loop."
            )

This prevents runaway agents from consuming unlimited tokens. Set the threshold based on your expected task complexity.
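
To wire the guard in, pass its method as a step callback and catch the exception around kickoff(). A sketch, assuming a crew built from this agent as in the earlier examples:

guard = SafetyGuard(max_steps=40)

researcher = Agent(
    role="Researcher",
    goal="Find data",
    backstory="Expert researcher.",
    step_callback=guard.on_step,  # raising here aborts the run
)

try:
    crew.kickoff()
except RuntimeError as e:
    print(f"Run halted: {e}")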

FAQ

Can I use async callbacks?

CrewAI's callback system currently expects synchronous functions. If you need to perform async operations (like writing to an async database), use a synchronous wrapper that schedules the async work or writes to a queue that an async consumer processes.
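
One workable pattern: the callback only enqueues, and a separate consumer coroutine drains the queue. A sketch, where handle_event is a hypothetical stand-in for your async database write:

import asyncio
import queue

events: "queue.Queue[str]" = queue.Queue()

def on_step(step_output):
    # Keep the synchronous callback fast: enqueue and return immediately.
    events.put(str(step_output)[:500])

async def consume_events():
    while True:
        # Read from the thread-safe queue without blocking the event loop.
        item = await asyncio.to_thread(events.get)
        await handle_event(item)  # hypothetical async write goes here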

Do callbacks affect agent performance?

Callbacks add negligible overhead — they run between LLM calls, not during them. The LLM inference time dominates execution. A callback that takes 10 milliseconds is invisible when each LLM call takes 1 to 3 seconds.

Can I attach multiple callbacks to the same agent?

Not directly. The step_callback parameter accepts a single function. To run multiple handlers, create a dispatcher function that calls all your handlers sequentially within a single callback.
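
For example, a minimal dispatcher that fans one callback out to the logger, tracker, and guard objects defined earlier:

def dispatch(*handlers):
    """Combine several step handlers into a single callback."""
    def callback(step_output):
        for handler in handlers:
            handler(step_output)
    return callback

researcher = Agent(
    role="Researcher",
    goal="Find data",
    backstory="Expert researcher.",
    step_callback=dispatch(logger.on_step, tracker.on_step, guard.on_step),
)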


#CrewAI #Callbacks #Observability #Monitoring #Python #AgenticAI #LearnAI #AIEngineering

