PII and Secret Leak Detection in AI Logs

Your traces are a security exposure. PHI, credit card numbers, and passwords end up in spans, prompts, and tool args. Here's a layered redaction pipeline that runs before export.

TL;DR — Redact at three points: in the agent before logging, at the OTel collector before export, and at the storage layer before display. Defence in depth.

What goes wrong

flowchart TD
  Client[Client] --> Edge[Cloudflare Worker]
  Edge -->|WS upgrade| DO[Durable Object]
  DO --> AI[(OpenAI Realtime WS)]
  AI --> DO
  DO --> Client
  DO -.hibernation.-> Storage[(Persisted state)]
CallSphere reference architecture

A 2023 study found ~4.7% of employees had pasted confidential data into ChatGPT and ~11% of all employee-submitted data was confidential. The same data ends up in your traces. PHI in a healthcare voice agent prompt. A credit card a user reads aloud and STT captures verbatim. An API key the user paste-bombs into a chat.

If those bytes land in your trace store and your trace store has a privileged-IAM bug, the leak is the size of your retention window times your traffic. OWASP's logging guidance explicitly warns against writing secrets and sensitive personal data to logs. Your observability is a security boundary.

How to monitor

Three layers of redaction, each redundant:

  1. Agent-side: strip obvious PII (cards, SSNs, emails, phone numbers) with regexes plus Microsoft Presidio's NLP-based recognizers before the prompt or response is added to a span.
  2. Collector-side: the OTel collector runs a transform processor that re-applies the same regexes plus secret patterns (AWS keys, Stripe keys, JWT-shaped tokens). This is the last line of defense before export.
  3. Storage-side: at display time, redact anything that slipped through. The UI shows [REDACTED:phi] instead of the raw value.

Plus an alert: if any redaction fires at the storage layer, page security. It means an earlier layer missed something.
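
The layering and the storage-layer alert can be sketched in a few lines of Python. This is a minimal illustration, not CallSphere's pipeline: the rule sets, labels, and the `run_layers` and `should_page_security` helpers are hypothetical, and the regexes are deliberately simplified.

```python
import re

# Hypothetical rule sets: labels and regexes are illustrative only.
PII_RULES = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}
SECRET_RULES = {
    "aws": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}
ALL_RULES = {**PII_RULES, **SECRET_RULES}

# Each layer is (name, rules). Later layers re-apply earlier rules on purpose.
LAYERS = [("agent", PII_RULES), ("collector", ALL_RULES), ("storage", ALL_RULES)]


def redact(text: str, rules: dict) -> tuple[str, int]:
    """Replace every match with [REDACTED:<label>]; return (text, hit count)."""
    hits = 0
    for label, pattern in rules.items():
        text, n = pattern.subn(f"[REDACTED:{label}]", text)
        hits += n
    return text, hits


def run_layers(text: str, layers: list) -> tuple[str, list]:
    """Apply each layer in order and record which layers actually redacted."""
    fired = []
    for name, rules in layers:
        text, hits = redact(text, rules)
        if hits:
            fired.append(name)
    return text, fired


def should_page_security(fired: list) -> bool:
    """A hit at the storage layer means an earlier layer missed something."""
    return "storage" in fired
```

With all three layers in place, a string containing an SSN and an AWS key is cleaned by the agent and collector layers and the storage layer stays silent. Dropping the collector layer from the list makes the storage layer fire, which is the condition that should page.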

CallSphere stack

CallSphere is HIPAA-aligned for the /industries/healthcare build. Our redaction pipeline:

  • Agent SDK: every prompt and response goes through a redaction wrapper that calls Presidio (Spanish + English NER) and strips PHI before logging. The Healthcare FastAPI on :8084 runs Presidio in-process.
  • OTel Collector: DaemonSet on k3s, transform processor with 28 regex rules covering AWS access keys, Stripe sk_*, JWTs, US/UK/IN phone numbers, US SSNs, and common card BINs. Runs after the agent-side layer and catches anything that slipped through.
  • Trace store: Langfuse self-hosted with row-level encryption; the UI shows [REDACTED] for any field tagged sensitive.
  • Audit: every [REDACTED] event creates an audit row in Postgres, reviewed weekly.

The Real Estate 6-container NATS pod is similar: PII redaction runs before NATS publish, so messages between services are already clean. The Sales WebSocket + PM2 service redacts at frame ingest. The after-hours Bull/Redis queue redacts on job enqueue.

We support BAAs at the $1499 tier on /pricing. $499 includes our standard redaction. The 14-day trial ships with redaction on by default.

Implementation

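  1. Agent-side Presidio.
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

# Build the engines once at import time; constructing them loads the NLP model.
analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

def redact(text: str) -> str:
    """Return text with detected PII replaced by placeholders such as <PERSON>."""
    # Detect PII spans (names, emails, phone numbers, ...) in English text.
    findings = analyzer.analyze(text=text, language="en")
    # Replace each detected span; the default operator substitutes <ENTITY_TYPE>.
    return anonymizer.anonymize(text=text, analyzer_results=findings).text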
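  2. Collector transform.
processors:
  transform/pii:
    log_statements:
      - context: log
        statements:
          # replace_all_patterns needs a map target, so these rules cover map-valued log bodies;
          # for plain string bodies, use replace_pattern(body, "<regex>", "<replacement>").
          # {24,} rather than {24}: a longer key would otherwise be only partly masked.
          - replace_all_patterns(body, "value", "sk_live_[A-Za-z0-9]{24,}", "[REDACTED:stripe]")
          - replace_all_patterns(body, "value", "AKIA[0-9A-Z]{16}", "[REDACTED:aws]")
          - replace_all_patterns(body, "value", "\\b\\d{3}-\\d{2}-\\d{4}\\b", "[REDACTED:ssn]")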
  3. Display-side redaction as a thin React wrapper component over any trace field.

  4. Alert on redactions. Any span tagged callsphere.redaction=storage_layer fires alert_type=security to security-on-call.

  5. Test it. Add a CI test that submits a known credit-card-shaped string into a fixture trace and verifies it never appears in trace storage.
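
A canary test along those lines might look like the sketch below. `FakeTraceStore` and `redact_cards` are hypothetical stand-ins: in a real CI job the span would go through the actual pipeline and the assertion would query the real trace store. The canary is a well-known test card number, not a live one.

```python
import re

# Card-shaped: 13 to 19 digits, optionally separated by spaces or dashes.
CARD_SHAPED = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

# Well-known test card number used as the canary value.
CANARY = "4111 1111 1111 1111"


def redact_cards(text: str) -> str:
    """Stand-in for the real redaction pipeline (assumption for this sketch)."""
    return CARD_SHAPED.sub("[REDACTED:card]", text)


class FakeTraceStore:
    """In-memory stand-in for the trace store; redacts every field on export."""

    def __init__(self) -> None:
        self.rows = []

    def export(self, span: dict) -> None:
        self.rows.append({k: redact_cards(str(v)) for k, v in span.items()})


def test_card_never_reaches_storage() -> None:
    store = FakeTraceStore()
    store.export({
        "prompt": f"my card is {CANARY}",
        "tool_args": CANARY.replace(" ", ""),  # same number without separators
    })
    dump = repr(store.rows)
    # Neither the spaced nor the compact form may survive anywhere in storage.
    assert CANARY not in dump
    assert CANARY.replace(" ", "") not in dump
    assert "[REDACTED:card]" in dump
```

Searching a full dump of the stored rows, rather than one known field, is the point of the design: the test should still fail if the canary leaks through a field nobody thought to check.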

FAQ

Q: Will redaction break debugging? A: For privileged users, store an encrypted copy of the original value (the "preimage") together with an HMAC. Authorized engineers can decrypt it for incident review. The default view is redacted.
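
As a partial illustration of the HMAC half only, the sketch below embeds a truncated keyed digest in the placeholder, so engineers can tell that two redacted values were the same without seeing either. This is an assumption about how the HMAC might be used, not the article's confirmed design. Encryption of the preimage is left out because it needs a cryptography library beyond the Python standard library, and `redaction_tag` is a hypothetical helper name.

```python
import hashlib
import hmac


def redaction_tag(value: str, key: bytes, label: str) -> str:
    """Return a placeholder carrying a truncated HMAC-SHA256 of the value.

    The same value under the same key always yields the same tag, so
    occurrences can be correlated across traces. The key must be kept
    secret, otherwise low-entropy values (such as SSNs) can be brute-forced.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"[REDACTED:{label}:{digest[:12]}]"
```

Truncating to 12 hex characters keeps the placeholder short while leaving collisions unlikely at typical trace volumes; the length is an illustrative choice.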

Q: How do I keep PII out of LLM prompts in the first place? A: Same redaction layer applied before sending to the model. We block the request entirely if a credit-card pattern is detected.
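
A pre-send guard of that kind could be sketched as follows. `guard_prompt` and `BlockedPrompt` are hypothetical names, and the Luhn checksum is an added assumption here, used to cut false positives on long digit strings such as order IDs.

```python
import re

# Card-shaped: 13 to 19 digits, optionally separated by spaces or dashes.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


class BlockedPrompt(Exception):
    """Raised when a prompt must not be sent to the model."""


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def contains_card(text: str) -> bool:
    """True if any card-shaped run of digits also passes the Luhn check."""
    return any(
        luhn_valid(re.sub(r"\D", "", m.group()))
        for m in CARD_CANDIDATE.finditer(text)
    )


def guard_prompt(text: str) -> str:
    """Return the prompt unchanged, or raise before it reaches the model."""
    if contains_card(text):
        raise BlockedPrompt("card-like number detected; request not sent")
    return text
```

Calling `guard_prompt` immediately before the model client means a blocked prompt never leaves the process, and the exception message itself carries no card digits that could end up in logs.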

Q: What about voice — STT will transcribe PII verbatim. A: Yes. We redact the transcript before logging. Audio is stored encrypted with strict access controls.

Q: Is regex enough? A: No. Combine regex (precision) with NER (recall). Presidio + custom patterns is the standard.
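
One way to combine the two detectors is to treat both as sources of character spans, merge any overlaps, and redact once. The sketch below assumes findings arrive as (start, end, label) tuples; `Finding`, `merge`, and `apply_findings` are hypothetical names, and the NER findings would come from a library such as Presidio rather than from this code.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def merge(findings: list) -> list:
    """Merge overlapping spans; the earliest, longest span keeps its label."""
    merged = []
    for f in sorted(findings, key=lambda f: (f.start, -f.end)):
        if merged and f.start < merged[-1].end:
            last = merged[-1]
            merged[-1] = Finding(last.start, max(last.end, f.end), last.label)
        else:
            merged.append(f)
    return merged


def apply_findings(text: str, findings: list) -> str:
    """Replace each merged span with a placeholder, right to left so that
    earlier offsets stay valid while the string changes length."""
    for f in reversed(merge(findings)):
        text = text[:f.start] + f"[REDACTED:{f.label}]" + text[f.end:]
    return text
```

Merging before replacing avoids double redaction when a regex and the NER model flag the same or overlapping text.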

Q: BAAs? A: Available for /industries/healthcare on the $1499 enterprise tier. Includes our redaction architecture as part of the audit package.
