Build a Durable AI Agent with Inngest Async Workflows in 2026
Inngest steps give every LLM call retries, sleeps, human-in-the-loop pauses, and replay-safe state. Build a research agent that survives 2-hour approvals.
TL;DR — Inngest's `step.run` makes every LLM call automatically retried, idempotent, and replay-safe. With `step.waitForEvent` you can pause an agent for hours waiting on a human approval — without keeping a process alive.
What you'll build
A research agent that (1) plans subqueries with an LLM, (2) fans out tool calls in parallel, (3) pauses for human approval on the synthesis step, and (4) emits a final report — all durable across redeploys.
Prerequisites
- `inngest@^3.30`, `@inngest/agent-kit@^0.7`, Node 20+.
- Inngest dev server (`npx inngest-cli@latest dev`) and a Vercel/Netlify/Node host.
Architecture
```mermaid
flowchart LR
  E[event: research.requested] --> P[step.run plan]
  P --> F[step.run fanout tools x N]
  F --> H[step.waitForEvent approval]
  H --> S[step.run synthesize]
  S --> O[event: research.completed]
```
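The two event payloads in the diagram can be typed up front. These shapes are assumptions for this tutorial, not part of the Inngest SDK:

```typescript
// Hypothetical payload shapes for this tutorial's events (not SDK types).
type ResearchRequested = { name: "research.requested"; data: { q: string } };
type ResearchApproved = { name: "research.approved"; data: { runId: string } };

// Narrowing guard — handy when one handler receives multiple event types.
function isResearchRequested(e: { name: string }): e is ResearchRequested {
  return e.name === "research.requested";
}

console.log(isResearchRequested({ name: "research.requested" })); // true
console.log(isResearchRequested({ name: "research.approved" })); // false
```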
Step 1 — Define the function
```ts
import { Inngest } from "inngest";

export const inngest = new Inngest({ id: "research-agent" });

export const research = inngest.createFunction(
{ id: "research" },
{ event: "research.requested" },
async ({ event, step }) => {
const plan = await step.run("plan", async () => llm.plan(event.data.q));
    const findings = await Promise.all(
      plan.subqueries.map((q, i) =>
        step.run(`fetch-${i}`, () => searchTool(q)),
      ),
    );
    const approval = await step.waitForEvent("await-approval", {
      event: "research.approved",
      timeout: "2h",
      if: `async.data.runId == "${event.id}"`,
    });
if (!approval) return { ok: false, reason: "timeout" };
return await step.run("synthesize", () => llm.synthesize(findings));
},
);
```
Step 2 — Wire AgentKit for the LLM logic
```ts
import { createAgent, createNetwork, openai } from "@inngest/agent-kit";

const planner = createAgent({
  name: "planner",
  model: openai({ model: "gpt-4o-mini" }),
  system: "Break the user question into 3-5 search subqueries.",
});

const writer = createAgent({
  name: "writer",
  model: openai({ model: "gpt-4o" }),
  system: "Write a 1-page synthesis with citations.",
});

export const network = createNetwork({ agents: [planner, writer] });
```
Step 3 — Mount the Next.js handler
```ts
// app/api/inngest/route.ts
import { serve } from "inngest/next";
import { inngest, research } from "@/inngest";

export const { GET, POST, PUT } = serve({ client: inngest, functions: [research] });
```
Step 4 — Trigger + approve
```ts
await inngest.send({
  name: "research.requested",
  data: { q: "GLP-1 telehealth landscape" },
});

// later, after a human reviewer approves:
await inngest.send({
  name: "research.approved",
  data: { runId: originalEventId }, // the `event.id` the waitForEvent match expects
});
```
Step 5 — Watch in Inngest UI
Inngest's local dev UI shows each step's input, output, retries, and timing. Failed LLM calls auto-retry with exponential backoff (default 4 tries).
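Inngest computes the retry schedule internally, but the backoff shape is worth internalizing. A minimal sketch of the general pattern — illustrative only, not the Inngest implementation:

```typescript
// Illustrative exponential backoff: delay doubles per attempt, capped at a max.
function backoffMs(attempt: number, baseMs = 1000, capMs = 60_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

console.log(backoffMs(0)); // 1000
console.log(backoffMs(3)); // 8000
console.log(backoffMs(10)); // 60000 (capped)
```

The cap matters in practice: without it, a flaky LLM endpoint can push the next attempt hours into the future.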
Step 6 — Production checklist
Add onFailure for dead-letter handling, set concurrency: { limit: 5 } to bound API spend, and enable Inngest's tracing export to Datadog or Honeycomb.
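The checklist translates roughly to the following options object — `retries`, `concurrency`, and `onFailure` are real Inngest v3 options, but verify field names against your SDK version; the handler body is illustrative:

```typescript
// Sketch of the first argument to inngest.createFunction in production.
const productionOpts = {
  id: "research",
  retries: 4,
  concurrency: { limit: 5 }, // bound parallel runs to cap API spend
  onFailure: async ({ error }: { error: Error }) => {
    // dead-letter handling: log, alert, or forward to a queue
    console.error("research run failed:", error.message);
  },
};

console.log(productionOpts.concurrency.limit); // 5
```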
Pitfalls
- Non-determinism in `step.run`: a step body re-executes on retry, so keep it idempotent — pure functions only, with all side effects behind tools that tolerate replays.
- Random IDs: derive step IDs from loop data (``step.run(`fetch-${i}`, ...)``), never `Math.random()` — Inngest uses the step ID as the cache key.
- Long pauses: the default `waitForEvent` timeout is 7 days; bump it for week-long human reviews, but stored state cost grows.
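The step-ID pitfall in one line: IDs must come out identical on every replay, so derive them from loop indices or input data. A pure sketch, no SDK needed:

```typescript
// Stable IDs: same inputs produce the same IDs on every replay,
// so Inngest can match each step to its memoized result.
const subqueries = ["pricing", "regulation", "competitors"];
const stepIds = subqueries.map((_, i) => `fetch-${i}`);

console.log(stepIds); // ["fetch-0", "fetch-1", "fetch-2"]

// Anti-pattern: a fresh ID per replay breaks memoization and re-runs the work.
// const badId = `fetch-${Math.random()}`;
```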
How CallSphere does this in production
CallSphere runs durable async agents on Inngest for the OneRoof real-estate product (Next.js 16 + React 19) — lead scoring, drip outreach, and CRM enrichment all flow through step.run with retries. The platform spans 37 agents, 90+ tools, 115+ DB tables, and 6 verticals at $149/$499/$1,499 with a 14-day no-card trial and 22% affiliate.
FAQ
Inngest vs Temporal? Inngest is event-first and serverless-native; Temporal is workflow-first and needs workers. Inngest deploys to Vercel/Netlify in minutes.
Pricing? Free tier ~50K runs/month; paid starts at $20/mo. Self-host is open-source.
Can I use it with LangGraph? Yes — wrap your LangGraph in a single step.run for durability boundaries.
Does AgentKit replace LangChain? AgentKit is lighter and TypeScript-native; it covers ~80% of agent use-cases with no Python.
Sources
- Inngest docs - https://www.inngest.com/docs
- AgentKit - https://agentkit.inngest.com/overview
- Weaviate - Building Agentic Workflows with Inngest - https://weaviate.io/blog/inngest-ai-workflows
- AgentKit GitHub - https://github.com/inngest/agent-kit
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available, no signup required.