AudioWorklet for Low-Latency Voice Processing in 2026: RNNoise + WASM at 13ms
RNNoise compiled to WASM and loaded into an AudioWorklet processes 48 kHz audio at ~13 ms latency, runs noise-gate hysteresis, and converts Float32 to Int16 PCM — all on a non-DOM audio thread.
The change
The 2026 production pattern for browser-side voice processing has converged on a tight stack: AudioWorkletProcessor as the host, RNNoise (a BSD-licensed recurrent-neural-network denoiser) compiled to WebAssembly as the DSP engine, plus a hysteresis noise gate to prevent rapid mute toggling. Reference implementations (Sokuji, Fluxer) report ~13 ms processing latency at 48 kHz, well below the roughly 100 ms at which added delay starts to be noticeable in conversation. The worklet runs on a high-priority audio thread isolated from the DOM, so React rerenders, scrolling, and background-tab throttling do not produce audio glitches. The same processor handles Float32-to-Int16 PCM conversion, which is the format most WebSocket-based AI voice APIs expect on the wire.
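The sources do not break down the ~13 ms figure. One plausible reading is that it is mostly buffering: the Web Audio render quantum is 128 samples by default, while upstream RNNoise consumes 480-sample frames (10 ms at 48 kHz), so the worklet has to accumulate quanta before each denoise call. The sketch below shows that accumulation step only. `FrameAccumulator` and `bufferingLatencyMs` are hypothetical names for illustration, not part of any library, and the denoiser itself is left out.

```typescript
// Assumptions: 128 is the default Web Audio render quantum and 480 is the
// frame size of upstream RNNoise. Both names below are illustrative only.
const RENDER_QUANTUM = 128;
const RNNOISE_FRAME = 480;

class FrameAccumulator {
  private readonly frameSize: number;
  private readonly buffer: Float32Array;
  private filled = 0;

  constructor(frameSize: number = RNNOISE_FRAME) {
    this.frameSize = frameSize;
    this.buffer = new Float32Array(frameSize);
  }

  // Appends one block of samples and calls onFrame once per completed frame.
  push(block: Float32Array, onFrame: (frame: Float32Array) => void): void {
    let offset = 0;
    while (offset < block.length) {
      const n = Math.min(this.frameSize - this.filled, block.length - offset);
      this.buffer.set(block.subarray(offset, offset + n), this.filled);
      this.filled += n;
      offset += n;
      if (this.filled === this.frameSize) {
        // Copy so the caller can keep the frame; a real worklet would
        // likely reuse preallocated buffers to avoid garbage collection.
        onFrame(this.buffer.slice());
        this.filled = 0;
      }
    }
  }
}

// Worst-case buffering delay: one full denoiser frame plus one render quantum.
function bufferingLatencyMs(sampleRate: number): number {
  return ((RNNOISE_FRAME + RENDER_QUANTUM) / sampleRate) * 1000;
}
```

Under these assumptions, `bufferingLatencyMs(48000)` comes to about 12.7 ms, which is in line with the reported figure. Inside a real processor, `push` would be called from `process()` with the first input channel.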
What it unlocks
Production-grade noise suppression that does not depend on a server. RNNoise outperforms WebRTC's built-in noiseSuppression on background-conversation rejection (the canonical "open-plan office" case). Hysteresis with separate open/close thresholds (e.g. -45 dB open, -55 dB close) prevents rapid flapping when ambient noise sits right at a single threshold. Moving PCM conversion into the worklet removes a common JS bottleneck: converting Float32 at 48 kHz on the main thread costs 2-4% CPU for no benefit. Combined, this gives cleaner audio uploaded to your AI gateway, 60% bandwidth savings during silence (because gated frames can be dropped), and a main thread whose CPU stays free for UI.
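The Float32-to-Int16 conversion mentioned above is a short loop. The following is a minimal sketch, assuming samples are nominally in [-1, 1] and that out-of-range values should be clamped rather than wrapped. `floatToInt16` is a hypothetical helper name, and the asymmetric scaling (32768 for negatives, 32767 for positives) is one common convention, not something the sources prescribe.

```typescript
// Converts Float32 samples in [-1, 1] to signed 16-bit PCM.
// Out-of-range input is clamped so it cannot wrap around.
function floatToInt16(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    const s = Math.max(-1, Math.min(1, input[i]));
    // Negative full scale is -32768, positive full scale is 32767.
    out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return out;
}
```

Inside a worklet, the resulting buffer can be handed to the main thread without copying by passing it in the transfer list, for example `this.port.postMessage(pcm.buffer, [pcm.buffer])`.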
flowchart TD
A[Microphone · getUserMedia] --> B[AudioContext 48 kHz]
B --> C[AudioWorkletNode]
C --> D[AudioWorkletProcessor]
D --> E[WASM RNNoise · 13ms latency]
E --> F{Noise gate}
F -- above -45 dB --> G[Float32 to Int16 PCM]
F -- below -55 dB --> H[Drop frame · upstream silence]
G --> I[postMessage to main thread]
I --> J[WebSocket / WebCodecs]
CallSphere context
CallSphere ships 37 agents · 90+ tools · 115+ tables · 6 verticals · HIPAA + SOC 2 aligned. Our browser voice client embeds RNNoise + a hysteresis gate inside one AudioWorkletProcessor compiled from Rust. Mean processing latency clocks 12-14 ms across M2 MacBook, Snapdragon X Elite, and ThinkPad X1. Background-noise rejection cut "could you repeat that" clarification turns by 22% in the Real Estate OneRoof Pion Go gateway 1.23 flow. Plans $149 / $499 / $1,499, 14-day trial, 22% affiliate Year 1.
Migration steps
- Compile RNNoise (or jitsi-meet's variant) to WASM via Emscripten
- Inline the WASM as base64 in your worklet processor file (avoids cross-origin issues)
- Implement hysteresis: opening = -45 dB, closing = -55 dB, with a 100 ms close timer
- Convert Float32 to Int16 inside the worklet, not on the main thread
- Profile in chrome://media-internals to verify zero audio glitches under load
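The hysteresis step in the list above can be sketched as a small state machine. This is an illustrative version, not the implementation used by any of the projects named here. It assumes the thresholds are in dBFS measured as per-frame RMS, that frames are 10 ms long, and that the "close timer" means the gate stays open until the level has been below the close threshold for more than 100 ms. `HysteresisGate` and `GateOptions` are hypothetical names.

```typescript
interface GateOptions {
  openDb: number;  // level at or above which a closed gate opens
  closeDb: number; // level below which an open gate starts its close timer
  holdMs: number;  // how long the level must stay low before closing
  frameMs: number; // duration of each frame passed to process()
}

class HysteresisGate {
  private open = false;
  private quietMs = 0;
  private readonly opts: GateOptions;

  constructor(
    opts: GateOptions = { openDb: -45, closeDb: -55, holdMs: 100, frameMs: 10 }
  ) {
    this.opts = opts;
  }

  // RMS level of a frame in dB relative to full scale.
  static levelDb(frame: Float32Array): number {
    let sum = 0;
    for (let i = 0; i < frame.length; i++) sum += frame[i] * frame[i];
    const rms = Math.sqrt(sum / Math.max(1, frame.length));
    return rms > 0 ? 20 * Math.log10(rms) : -Infinity;
  }

  // Returns true if the frame should be forwarded, false if it can be dropped.
  process(frame: Float32Array): boolean {
    const db = HysteresisGate.levelDb(frame);
    if (!this.open) {
      if (db >= this.opts.openDb) {
        this.open = true;
        this.quietMs = 0;
      }
    } else if (db < this.opts.closeDb) {
      // Below the close threshold: run the close timer, keep the tail audible.
      this.quietMs += this.opts.frameMs;
      if (this.quietMs > this.opts.holdMs) this.open = false;
    } else {
      // Between thresholds or louder: stay open and reset the timer.
      this.quietMs = 0;
    }
    return this.open;
  }
}
```

Because a closed gate needs -45 dB to open and an open gate needs to fall below -55 dB to start closing, a signal hovering around either value alone does not toggle the gate on every frame.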
FAQ
Why not use the browser's built-in noise suppression? It targets human voice in moderate noise. RNNoise is stronger on dense background speech and constant fan/HVAC noise.
What about Krisp / NVIDIA Broadcast? Stronger models, server-grade. RNNoise is the open-source baseline you can ship without per-seat fees.
Will hysteresis cause audio cutoff? With the close timer, no. The 100ms tail keeps trailing words audible.
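Does WASM in a worklet need cross-origin isolation? Only if you use SharedArrayBuffer, which browsers expose only on cross-origin-isolated pages. If you pass audio between threads with postMessage, isolation is not required.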
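The cross-origin isolation answer can be turned into a small runtime check. The sketch below is an assumption about how a client might choose between the two paths. `RuntimeCaps`, `detectCaps`, and `pickTransport` are hypothetical names. The `crossOriginIsolated` global is a standard browser property that is true when the page is served with `Cross-Origin-Opener-Policy: same-origin` and `Cross-Origin-Embedder-Policy: require-corp`.

```typescript
type Transport = "shared-ring-buffer" | "postMessage";

interface RuntimeCaps {
  crossOriginIsolated: boolean;
  hasSharedArrayBuffer: boolean;
}

// Reads capabilities from the current global scope. Outside a browser the
// crossOriginIsolated property may be missing and is treated as false.
function detectCaps(): RuntimeCaps {
  const g = globalThis as unknown as { crossOriginIsolated?: boolean };
  return {
    crossOriginIsolated: g.crossOriginIsolated === true,
    hasSharedArrayBuffer: typeof SharedArrayBuffer !== "undefined",
  };
}

// Uses shared memory only when the page is isolated and the constructor
// exists; otherwise falls back to postMessage, which needs no special headers.
function pickTransport(caps: RuntimeCaps): Transport {
  return caps.crossOriginIsolated && caps.hasSharedArrayBuffer
    ? "shared-ring-buffer"
    : "postMessage";
}
```

A caller would run `pickTransport(detectCaps())` once at startup and configure the worklet accordingly.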
Sources
- MDN - AudioWorklet - https://developer.mozilla.org/en-US/docs/Web/API/AudioWorklet
- MDN - AudioWorkletProcessor - https://developer.mozilla.org/en-US/docs/Web/API/AudioWorkletProcessor
- DEV - Audio Worklets for Low-Latency Audio Processing - https://dev.to/omriluz1/audio-worklets-for-low-latency-audio-processing-3b9p
- DeepWiki - Audio Worklets in kizuna-ai-lab/sokuji - https://deepwiki.com/kizuna-ai-lab/sokuji/6.4-audio-worklets
- Picovoice - Noise Suppression Guide 2026 - https://picovoice.ai/blog/complete-guide-to-noise-suppression/