AI Voice Agents · 12 min read

Build an AI Voice Agent on the T3 Stack (Next.js + tRPC + Prisma, 2026)

create-t3-app gives you Next.js 15, TypeScript, Tailwind, tRPC v11, Prisma, and Auth.js v5 in one CLI. Add OpenAI Realtime and you have a typed voice agent in 90 minutes.

TL;DR: pnpm create t3-app@latest scaffolds a Next.js 15 App Router project with tRPC v11, Prisma, Auth.js v5, and Tailwind. Bolt OpenAI Realtime + WebRTC ephemeral keys onto it and you ship a typed, authenticated voice agent in one afternoon.

What you'll build

A logged-in user opens /voice, the page mints an ephemeral OpenAI key from a tRPC procedure, the browser opens a WebRTC peer connection to OpenAI Realtime, and call transcripts are written to Postgres via Prisma — all type-safe.

Prerequisites

  1. Node 20+, pnpm@9.
  2. pnpm create t3-app@latest --noInstall and pick: tRPC, Prisma, Tailwind, Auth.js.
  3. OPENAI_API_KEY and a Postgres URL.

Architecture

```mermaid
flowchart TD
  U[User] --> NX[Next.js 15 App Router]
  NX --> TR[tRPC v11 - mintEphemeral]
  TR --> OA[POST /v1/realtime/sessions]
  OA --> NX
  NX -- WebRTC SDP --> RT[OpenAI Realtime]
  NX --> PR[Prisma transcripts]
```

Step 1 — Add the realtime procedure

```ts
// server/api/routers/voice.ts
import { z } from "zod";
import { protectedProcedure, createTRPCRouter } from "@/server/api/trpc";

export const voiceRouter = createTRPCRouter({
  // Mints a short-lived client secret so the long-lived API key never leaves the server.
  mintEphemeral: protectedProcedure
    .input(z.object({ voice: z.enum(["alloy", "verse"]).default("alloy") }))
    .mutation(async ({ input }) => {
      const r = await fetch("https://api.openai.com/v1/realtime/sessions", {
        method: "POST",
        headers: {
          Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ model: "gpt-realtime", voice: input.voice }),
      });
      // Surface upstream failures instead of returning an error payload to the browser.
      if (!r.ok) throw new Error(`OpenAI session request failed: ${r.status}`);
      return (await r.json()) as { client_secret: { value: string } };
    }),
});
```
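The procedure above casts the JSON response straight to the expected shape, so a malformed or error payload would only fail later in the browser. As a minimal sketch, the response could be narrowed at runtime before it is returned. The helper name `extractClientSecret` is hypothetical and not part of create-t3-app or any OpenAI library; the only field it assumes is the `client_secret.value` shown in the step above.

```ts
// Hypothetical helper: narrows an untyped JSON payload to the ephemeral key string.
// Assumes only the `client_secret.value` field used in Step 1.
function extractClientSecret(json: unknown): string {
  const value = (json as { client_secret?: { value?: unknown } } | null)
    ?.client_secret?.value;
  if (typeof value !== "string" || value.length === 0) {
    throw new Error("Response did not include client_secret.value");
  }
  return value;
}
```

Inside the mutation, this would replace the bare cast, for example `return { client_secret: { value: extractClientSecret(await r.json()) } };`, which keeps the return type the client already expects.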

Step 2 — Browser WebRTC handshake

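```tsx
"use client";
import { useRef } from "react";
import { api } from "@/trpc/react";

export function VoiceCall() {
  // Audio sink for the model's remote track.
  const audioEl = useRef<HTMLAudioElement>(null);
  const mint = api.voice.mintEphemeral.useMutation();

  async function start() {
    // Mint the ephemeral key right before the offer; it expires quickly.
    const { client_secret } = await mint.mutateAsync({ voice: "alloy" });
    const pc = new RTCPeerConnection();
    pc.ontrack = (e) => {
      if (audioEl.current) audioEl.current.srcObject = e.streams[0] ?? null;
    };
    // Send the microphone to the model.
    const ms = await navigator.mediaDevices.getUserMedia({ audio: true });
    ms.getTracks().forEach((t) => pc.addTrack(t, ms));
    // Data channel that carries Realtime events (used in Step 3).
    const dc = pc.createDataChannel("oai-events");
    const offer = await pc.createOffer();
    await pc.setLocalDescription(offer);
    // Exchange SDP: post the offer, apply the answer.
    const ans = await fetch("https://api.openai.com/v1/realtime?model=gpt-realtime", {
      method: "POST",
      body: offer.sdp,
      headers: {
        Authorization: `Bearer ${client_secret.value}`,
        "Content-Type": "application/sdp",
      },
    });
    await pc.setRemoteDescription({ type: "answer", sdp: await ans.text() });
  }

  // Minimal markup, assumed here: a start button and the audio element for playback.
  return (
    <>
      <button onClick={() => void start()}>Start call</button>
      <audio ref={audioEl} autoPlay />
    </>
  );
}
```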
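The handshake passes whatever the SDP endpoint returns into setRemoteDescription, so an expired key or a blocked request shows up as an opaque WebRTC error. A minimal sketch of a guard, assuming only that a valid SDP document begins with the version line `v=0`; the function name `parseSdpAnswer` is hypothetical.

```ts
// Hypothetical guard for the SDP exchange: rejects non-2xx responses and
// bodies that are not SDP before they reach setRemoteDescription.
function parseSdpAnswer(
  status: number,
  body: string,
): { type: "answer"; sdp: string } {
  if (status < 200 || status >= 300) {
    // Include a short slice of the body; error responses are usually JSON text.
    throw new Error(`SDP exchange failed with HTTP ${status}: ${body.slice(0, 200)}`);
  }
  if (!body.startsWith("v=0")) {
    throw new Error("Response body is not an SDP document");
  }
  return { type: "answer", sdp: body };
}
```

In the component this would be used as `await pc.setRemoteDescription(parseSdpAnswer(ans.status, await ans.text()));`.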

Step 3 — Persist transcripts

```prisma
model Transcript {
  id        String   @id @default(cuid())
  userId    String
  text      String
  role      String
  createdAt DateTime @default(now())
}
```

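```ts
// Inside VoiceCall. `dc` is the "oai-events" data channel from Step 2.
// saveTurn is assumed to be a protectedProcedure mutation on voiceRouter that
// writes a Transcript row for the signed-in user; it is not shown in this article.
const saveTurn = api.voice.saveTurn.useMutation(); // call at the component's top level

dc.addEventListener("message", async (e) => {
  const evt = JSON.parse(e.data);
  // Fired once per assistant turn with the full transcript text.
  if (evt.type === "response.audio_transcript.done")
    await saveTurn.mutateAsync({ text: evt.transcript, role: "assistant" });
});
```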
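The listener above parses every data-channel message inline. One way to keep that logic testable is to move it into a pure function that turns a raw message into a transcript turn or nothing. This is a sketch: `turnFromEvent`, `Turn`, and `ROLE_BY_EVENT` are hypothetical names, and the only event type mapped is the one used in this step. Other event types would need to be added to the map after checking them against the Realtime documentation.

```ts
type Role = "assistant" | "user";

interface Turn {
  text: string;
  role: Role;
}

// Event types that carry a finished transcript, mapped to the speaker's role.
// Only the event used in Step 3 is listed; extend as needed.
const ROLE_BY_EVENT = new Map<string, Role>([
  ["response.audio_transcript.done", "assistant"],
]);

// Returns a Turn for transcript events, or null for anything else
// (other event types, invalid JSON, empty transcripts).
function turnFromEvent(raw: string): Turn | null {
  let evt: unknown;
  try {
    evt = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof evt !== "object" || evt === null) return null;
  const { type, transcript } = evt as { type?: unknown; transcript?: unknown };
  if (typeof type !== "string" || typeof transcript !== "string") return null;
  const role = ROLE_BY_EVENT.get(type);
  const text = transcript.trim();
  if (!role || text === "") return null;
  return { text, role };
}
```

The message handler then shrinks to a lookup and a save: compute `turnFromEvent(e.data)` and call the mutation only when the result is not null.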

Step 4 — Auth gate

Wrap the page in auth() — Auth.js v5 has stable App Router support. Unauthenticated users see a redirect; the tRPC procedure already uses protectedProcedure.
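As a sketch of the gate's decision logic, the function below returns the sign-in URL for visitors without a session and null for signed-in users. The names `SessionLike` and `signInRedirectFor` are hypothetical; the sign-in path and `callbackUrl` parameter are the Auth.js defaults and would differ if a custom sign-in page is configured.

```ts
// Only the part of the session this gate reads.
type SessionLike = { user?: object | null } | null | undefined;

// Returns where to send an unauthenticated visitor, or null if they may stay.
function signInRedirectFor(session: SessionLike, returnTo: string): string | null {
  if (session?.user) return null;
  return `/api/auth/signin?callbackUrl=${encodeURIComponent(returnTo)}`;
}

// In the /voice server component this would be wired up roughly as:
//   const target = signInRedirectFor(await auth(), "/voice");
//   if (target) redirect(target); // redirect from "next/navigation"
```

Keeping the decision in a pure function means the redirect rule can be unit-tested without rendering the page.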

Step 5 — Deploy

vercel --prod works out of the box; set OPENAI_API_KEY and DATABASE_URL as environment variables. Use Vercel Postgres or Neon.
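A missing variable tends to surface only on the first call after deploy. The sketch below is a generic preflight check for the two variables named above; `missingEnv` and `REQUIRED_ENV` are hypothetical names, and in a T3 project the same effect is usually achieved by adding OPENAI_API_KEY to the scaffolded env schema.

```ts
// The two variables this tutorial depends on.
const REQUIRED_ENV = ["OPENAI_API_KEY", "DATABASE_URL"] as const;

// Returns the names of required variables that are unset or blank.
function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => !env[name]?.trim());
}

// Typical use at server start-up:
//   const missing = missingEnv(process.env);
//   if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```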

Pitfalls

  • WebRTC + corporate firewalls: Some networks block UDP 3478. Fall back to WebSocket if pc.iceConnectionState stays at "checking" for more than 5 seconds.
  • Ephemeral key TTL: Default 60s — mint right before createOffer.
  • Auth.js v5 cookies: Edge middleware sometimes strips __Secure- cookies — set useSecureCookies: process.env.NODE_ENV === "production" explicitly.
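The first two pitfalls reduce to timing checks, sketched below as pure functions. The 5-second and 60-second defaults come from the list above; the function names and the 5-second safety margin on the key are assumptions for illustration.

```ts
// True when ICE has been stuck in "checking" longer than the limit,
// which is the signal to switch to the WebSocket transport.
function shouldFallBackToWebSocket(
  iceState: string,
  msInState: number,
  limitMs = 5000,
): boolean {
  return iceState === "checking" && msInState > limitMs;
}

// True while an ephemeral key is still safe to use for a new offer.
// marginMs leaves headroom for the round trip (assumed value).
function isKeyUsable(
  mintedAtMs: number,
  nowMs: number,
  ttlMs = 60000,
  marginMs = 5000,
): boolean {
  const ageMs = nowMs - mintedAtMs;
  return ageMs >= 0 && ageMs < ttlMs - marginMs;
}
```

In the component, `shouldFallBackToWebSocket(pc.iceConnectionState, elapsed)` would be polled from a timer started after setLocalDescription, and `isKeyUsable` checked before reusing a previously minted key.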

How CallSphere does this in production

CallSphere's platform applies T3-style end-to-end typing across 37 agents, 90+ tools, 115+ database tables, and 6 verticals. OneRoof (Next.js 16 + React 19) is the closest analog: Auth.js gates the realtime endpoint, Prisma persists transcripts, and tRPC carries every tool call. Plans are priced at $149, $499, and $1,499, with a 14-day no-card trial and a 22% affiliate program.

FAQ

Is T3 still relevant in 2026? Yes — best free TS full-stack starter, but consider T4 (T3 + Vercel AI SDK + RAG) for AI-heavy apps.

Can I swap Prisma for Drizzle? create-t3-app --drizzle is supported in 2026.

Why WebRTC over WebSocket? Browsers handle echo cancellation + jitter natively over WebRTC.

Where does the API key live? Server only — clients receive only ephemeral keys.
