TL;DR — Nuxt 3.13+ on Vue 3.5 ships a built-in Nitro server, perfect for hiding OpenAI keys. Wrap WebRTC + the Realtime API in a useVoiceAgent composable for a clean Vue voice UI.

What you'll build

A Nuxt 3 page with a Talk button that uses an ephemeral key minted by a Nitro server route, opens WebRTC to OpenAI gpt-realtime, and streams transcripts into a Pinia store.

Prerequisites

nuxt@^3.13, vue@^3.5, pinia@^2.2.
OPENAI_API_KEY in .env.
Node 20+ or Bun 1.3.

Architecture

flowchart LR
  V[Nuxt page] --> N[Nitro /api/realtime/key]
  N -- POST sessions --> OA1[OpenAI]
  OA1 --> N --> V
  V -- WebRTC SDP --> OA2[OpenAI Realtime]

Step 1 — Nitro endpoint

```ts
// server/api/realtime/key.post.ts
// Mints a short-lived client secret so the browser never sees OPENAI_API_KEY.
export default defineEventHandler(async () => {
  const r = await $fetch<{ client_secret: { value: string } }>(
    "https://api.openai.com/v1/realtime/sessions",
    {
      method: "POST",
      headers: { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
      body: { model: "gpt-realtime", voice: "alloy" },
    },
  );
  return r;
});
```

Step 2 — Composable

```ts
// composables/useVoiceAgent.ts
// ref and $fetch are auto-imported by Nuxt.
export function useVoiceAgent() {
  const live = ref(false);
  const transcript = ref("");
  const audioEl = ref<HTMLAudioElement | null>(null);

  async function start() {
    // Fetch an ephemeral key from the Nitro route in Step 1.
    const { client_secret } = await $fetch("/api/realtime/key", { method: "POST" });

    const pc = new RTCPeerConnection();
    // Play the model's audio as soon as the remote track arrives.
    pc.ontrack = (e) => {
      if (audioEl.value) audioEl.value.srcObject = e.streams[0];
    };

    // Send the microphone to the model.
    const ms = await navigator.mediaDevices.getUserMedia({ audio: true });
    ms.getTracks().forEach((t) => pc.addTrack(t, ms));

    // Realtime events arrive as JSON on this data channel.
    const dc = pc.createDataChannel("oai-events");
    dc.addEventListener("message", (e) => {
      const evt = JSON.parse(e.data);
      if (evt.type === "response.audio_transcript.delta")
        transcript.value += evt.delta;
    });

    // Exchange SDP with OpenAI, authenticating with the ephemeral key.
    const offer = await pc.createOffer();
    await pc.setLocalDescription(offer);
    const ans = await fetch(
      "https://api.openai.com/v1/realtime?model=gpt-realtime",
      {
        method: "POST",
        body: offer.sdp,
        headers: {
          Authorization: `Bearer ${client_secret.value}`,
          "Content-Type": "application/sdp",
        },
      },
    );
    if (!ans.ok) throw new Error(`SDP exchange failed: ${ans.status}`);
    await pc.setRemoteDescription({ type: "answer", sdp: await ans.text() });
    live.value = true;
  }

  return { live, transcript, audioEl, start };
}
```
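The composable only starts a call. A minimal teardown sketch follows, assuming the peer connection and microphone stream are hoisted out of start() so they can be reached later. The helper name stopVoiceAgent and the small structural interfaces are hypothetical; they exist so the function can be exercised outside a browser, and RTCPeerConnection and MediaStream satisfy them.

```typescript
// Minimal structural types; RTCPeerConnection and MediaStream both fit.
interface ClosablePeer {
  close(): void;
}
interface StoppableTrack {
  stop(): void;
}
interface TrackSource {
  getTracks(): StoppableTrack[];
}

// Hypothetical helper: releases the microphone, then closes the connection.
// Accepts null so it is safe to call before a call was ever started.
function stopVoiceAgent(pc: ClosablePeer | null, mic: TrackSource | null): void {
  mic?.getTracks().forEach((t) => t.stop());
  pc?.close();
}
```

In the composable this would likely be wrapped in a stop() function that also sets live.value = false, and called from onBeforeUnmount so the microphone indicator goes away when the user leaves the page.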

Step 3 — Page

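A minimal page, assuming the composable from Step 2 is auto-imported and the file lives at pages/voice.vue (the file name is an assumption):

```vue
<script setup lang="ts">
const { live, transcript, audioEl, start } = useVoiceAgent();
</script>

<template>
  <ClientOnly>
    <button :disabled="live" @click="start">Talk</button>
    <audio ref="audioEl" autoplay />
    <p>{{ transcript }}</p>
  </ClientOnly>
</template>
```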

Step 4 — Pinia store for multi-page state

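```ts
// stores/calls.ts
import { defineStore } from "pinia";

// Options-style store: state is declared directly on the options object.
export const useCalls = defineStore("calls", {
  state: () => ({ history: [] as { role: string; text: string }[] }),
});
```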
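To stream transcripts into that history, one option is a small pure function that folds each data-channel event into the array. This is a sketch, not the article's implementation: the name applyTranscriptEvent is hypothetical, the "assistant" role label is an assumption, and consecutive deltas are simply merged into the last assistant turn.

```typescript
interface Turn {
  role: string;
  text: string;
}
interface RealtimeEvent {
  type: string;
  delta?: string;
}

// Returns a new history with the delta appended to the current assistant turn.
// Events that are not transcript deltas return the same array reference.
function applyTranscriptEvent(history: Turn[], evt: RealtimeEvent): Turn[] {
  if (evt.type !== "response.audio_transcript.delta" || !evt.delta) return history;
  const last = history[history.length - 1];
  if (last && last.role === "assistant") {
    // Extend the turn in progress without mutating the input array.
    return [...history.slice(0, -1), { ...last, text: last.text + evt.delta }];
  }
  return [...history, { role: "assistant", text: evt.delta }];
}
```

Inside the data-channel handler this could be used as `calls.history = applyTranscriptEvent(calls.history, evt)`, where `calls` is the store instance returned by useCalls().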

Step 5 — Deploy

Run npx nuxi build && npx nuxi preview to test the production build locally, or set the cloudflare-pages Nitro preset to deploy to Cloudflare Pages. Vercel and Netlify presets ship out of the box.

Step 6 — Tool calls

Listen for response.function_call_arguments.done, run a Nitro endpoint to execute the tool server-side (so you keep secrets server-only), and reply via the data channel.
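The reply step can be sketched as a pure function that builds the two client events sent back over the data channel: a function_call_output item carrying the tool result, followed by response.create so the model continues speaking. The helper name buildToolReply is hypothetical, and the event interface lists only the fields this sketch reads.

```typescript
// Subset of the server event that the sketch relies on.
interface FunctionCallDone {
  type: "response.function_call_arguments.done";
  call_id: string;
  arguments: string; // JSON-encoded arguments chosen by the model
}

type ClientEvent =
  | {
      type: "conversation.item.create";
      item: { type: "function_call_output"; call_id: string; output: string };
    }
  | { type: "response.create" };

// Builds the events to send after the tool has run server-side.
function buildToolReply(evt: FunctionCallDone, result: unknown): ClientEvent[] {
  return [
    {
      type: "conversation.item.create",
      item: {
        type: "function_call_output",
        call_id: evt.call_id, // ties the output to the model's call
        output: JSON.stringify(result),
      },
    },
    { type: "response.create" }, // ask the model to respond using the output
  ];
}
```

In the message handler, the result would come from a Nitro route (for example a hypothetical /api/tools/run), and each returned event would be sent with `dc.send(JSON.stringify(event))`.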

Pitfalls

process.env in Nitro vs useRuntimeConfig — use the latter for typed config.
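WebRTC + SSR: RTCPeerConnection and getUserMedia exist only in the browser, so wrap the voice UI in <ClientOnly> or disable SSR for that route with a routeRules entry (ssr: false) in nuxt.config.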
Vapor mode (preview): Vue's experimental Vapor mode skips the virtual DOM, but it is not part of the stable 3.5 releases and is still a preview, so opt in carefully.
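For the first pitfall, a minimal sketch of reading the key from runtime config instead of process.env. It assumes nuxt.config.ts declares `runtimeConfig: { openaiApiKey: "" }`, which Nuxt fills at runtime from the NUXT_OPENAI_API_KEY environment variable; that variable name differs from the OPENAI_API_KEY used in the prerequisites, and both the config key and the helper name requireOpenAIKey are assumptions.

```typescript
// Assumed shape of the private runtime config (see the lead-in above).
interface PrivateRuntimeConfig {
  openaiApiKey?: string;
}

// Fails fast with a clear message instead of sending "Bearer undefined".
function requireOpenAIKey(config: PrivateRuntimeConfig): string {
  const key = config.openaiApiKey?.trim();
  if (!key) {
    throw new Error("openaiApiKey is not set (expected NUXT_OPENAI_API_KEY)");
  }
  return key;
}
```

In the Nitro handler from Step 1 this would be called as `requireOpenAIKey(useRuntimeConfig(event))`, with `event` taken from the handler's first argument.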

How CallSphere does this in production

CallSphere's stack is multi-framework: Healthcare (FastAPI), OneRoof (Next.js 16 + React 19), Salon (NestJS 10 + Prisma), Sales (Node.js 20 + React 18 + Vite). Some agency white-label customers prefer Vue/Nuxt — supported via the same realtime relay. 37 agents · 90+ tools · 115+ DB tables · 6 verticals. $149/$499/$1,499, 14-day trial, 22% affiliate.

FAQ

Vue 3.5 vs 3.4? 3.5 brings reactivity perf wins and useTemplateRef.

WebRTC vs WebSocket on Nuxt? WebRTC for browser-direct, WebSocket if you need server-side audit/policy.

Cloudflare Pages limit? WS connections capped at 100 concurrent on free tier — bump to Workers Paid for production.

Ephemeral key TTL? 60s default; refresh before each call.
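Because the key is short-lived, a client may want to check it before starting the SDP exchange. The sketch below assumes the session response's client_secret also carries an expires_at field in Unix seconds (Step 1 only types value); the helper name isKeyUsable and the 5-second safety margin are hypothetical.

```typescript
interface ClientSecret {
  value: string;
  expires_at: number; // assumed: seconds since the Unix epoch
}

// True when the key still has more than marginMs of lifetime left.
function isKeyUsable(
  secret: ClientSecret,
  nowMs: number = Date.now(),
  marginMs: number = 5_000,
): boolean {
  return secret.expires_at * 1000 - nowMs > marginMs;
}
```

If the check fails, the simplest recovery is to call the Nitro route again and start the call with the fresh key.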

Sources

Nuxt 3 docs - https://nuxt.com/
Mamezou - Nuxt + OpenAI Realtime - https://developer.mamezou-tech.com/en/blogs/2024/10/16/openai-realtime-api-nuxt/
Vue.js AI SDK Getting Started - https://ai-sdk.dev/docs/getting-started/nuxt
VueSchool - AI Interfaces with Vue + Nuxt - https://vueschool.io/courses/ai-interfaces-with-vue-nuxt-and-the-ai-sdk

How this plays out in production

If you are taking the ideas in "Build an AI Voice Agent with Nuxt 3 + Vue 3.5 + OpenAI Realtime (2026)" and putting them in front of real customers, the constraint that decides everything is ASR error rates on long-tail entities (drug names, street names, SKUs) and the post-call pipeline that must reconcile what was actually heard. Treat this as a voice-first system from the first prompt: the agent's persona, its tool surface, and its escalation rules all flow from that single decision. Teams that ship fast tend to instrument the loop end-to-end before they tune any single component, because the bottleneck is rarely where intuition puts it.

Voice agent architecture, end to end

A production-grade voice stack at CallSphere stitches Twilio Programmable Voice (PSTN ingress, TwiML, bidirectional Media Streams) to a realtime reasoning layer (typically OpenAI Realtime or ElevenLabs Conversational AI) with sub-second response as a hard SLO. Anything north of one second of perceived silence and callers either repeat themselves or hang up; that single number drives the whole architecture. Server-side VAD with proper barge-in support is non-negotiable, otherwise the agent talks over the caller and the conversation collapses. Streaming TTS with phoneme-aligned interruption keeps the cadence natural even when the user changes their mind mid-sentence.

Post-call, every transcript is run through a structured pipeline: sentiment, intent classification, lead score, escalation flag, and a normalized slot extraction (name, callback number, reason, urgency). For healthcare workloads, the BAA-covered storage path, audit logs, encryption-at-rest, and PHI-safe transcript redaction are wired in from day one, not bolted on at compliance review. The end state is a system where every call produces a row of structured data, not just a recording.

FAQ

What changes when you move a voice agent the way "Build an AI Voice Agent with Nuxt 3 + Vue 3.5 + OpenAI Realtime (2026)" describes? Treat the architecture in this post as a starting point and instrument it before you tune it. The metrics that matter most early on are end-to-end latency (target < 1s for voice, < 3s for chat), barge-in correctness, tool-call success rate, and post-conversation lead score distribution. Optimize whatever the data flags as the bottleneck, not whatever feels slowest in your head.

Where does this break down for voice agent deployments at scale? The two failure modes that bite hardest are silent context loss across multi-turn handoffs and tool calls that succeed in dev but get rate-limited in production. Both are solvable with a proper agent backplane that pins state to a session ID, retries with backoff, and writes every tool invocation to an audit log you can replay.

How does the salon stack (GlamBook) keep bookings clean across stylists and services? GlamBook runs 4 agents that handle booking, rescheduling, fuzzy service-name matching, and confirmations. Every appointment gets a deterministic reference like GB-YYYYMMDD-### so the salon, the customer, and the agent all reference the same object across SMS, email, and voice.

See it live

Book a 30-minute working session at https://calendly.com/sagar-callsphere/new-meeting and bring a real call flow; we will walk it through the live salon booking agent (GlamBook) at https://salon.callsphere.tech and show you exactly where the production wiring sits.
