Why PII Leaks in AI APIs Are Different
PII in traditional web apps is at least contained: it goes into your database, stays there, and you control the schema. PII in AI applications is a different problem entirely. The moment you send a prompt to an LLM, that data is transmitted to a third-party inference endpoint, may be used for model training (depending on provider terms), appears in your application logs, and gets echoed back in completions — sometimes in contexts you didn't expect.
Echo-back: The model includes user-submitted PII in its response. A user pastes a customer record; the model quotes it back verbatim. Your application logs the full exchange.
Training data extraction: Adversarial users use prompt injection to make the model recall patterns from training data — including real PII from fine-tuning or RAG document sets.
Accidental context leakage: System prompts, RAG-retrieved documents, and tool outputs containing PII get injected into the conversation context and surface in completions — even when the user didn't submit PII themselves.
The regulatory exposure is real. GDPR Article 25 requires privacy by design — meaning "we didn't know users would paste SSNs" is not a defense. HIPAA's Security Rule applies to ePHI whether it reaches your system intentionally or not. SOC 2 Type II auditors will ask what controls prevent PII from appearing in your AI logs. CCPA requires you to know what personal data you collect — including data collected inadvertently through AI prompts.
The fix is to intercept and strip PII before it hits the model. Three approaches exist, each with different tradeoffs.
Three Approaches to PII Detection
Regex-based detection
Pattern matching on known formats (SSN: \d{3}-\d{2}-\d{4}, email: \S+@\S+). Fast to implement, fast to break.
NER-based detection
Named-entity recognition models trained to identify persons, organizations, and locations. Catches unlabeled PII regex misses. Adds latency.
Policy-layer proxy
Pre-built detection + configurable policy + audit trail, sitting between your app and the LLM. No model weights to maintain.
Approach 1: Regex-Based Detection
Regex is the obvious starting point. Define patterns for known PII formats and strip them before the prompt goes out. Here's what a basic regex scrubber looks like:
import re

PII_PATTERNS = {
    'ssn': re.compile(r'\b\d{3}[-\s]?\d{2}[-\s]?\d{4}\b'),
    'credit_card': re.compile(r'\b(?:\d{4}[-\s]?){3}\d{4}\b'),
    'email': re.compile(r'\b[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}\b'),
    'phone': re.compile(r'\b(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b'),
}

def scrub_pii(text: str) -> str:
    for label, pattern in PII_PATTERNS.items():
        placeholder = f'[REDACTED_{label.upper()}]'
        text = pattern.sub(placeholder, text)
    return text

# Usage
user_input = "My SSN is 123-45-6789 and my email is john@example.com"
clean = scrub_pii(user_input)
# → "My SSN is [REDACTED_SSN] and my email is [REDACTED_EMAIL]"
This works for well-formatted PII. The problem is that real-world PII is not well-formatted. Consider what regex misses:
- Names: "Send this to Sarah Johnson in accounting" — no pattern to match
- Addresses: "My address is 123 Maple Street" — too many variations to reliably pattern-match
- Account numbers: "My member ID is AB-7734291" — organization-specific formats
- Obfuscated formats: "123.45.6789" or digits spelled out one at a time — variants the pattern author didn't anticipate
- Context-sensitive PII: "Born on January 12, 1985" — a birthday only becomes HIPAA-relevant alongside other identifying data
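To make the gap concrete, here's a quick check using the same ssn and email patterns as the scrubber above — a minimal sketch, not a full test suite:

```python
import re

# Same ssn/email patterns as the scrubber above
PII_PATTERNS = {
    'ssn': re.compile(r'\b\d{3}[-\s]?\d{2}[-\s]?\d{4}\b'),
    'email': re.compile(r'\b[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}\b'),
}

def scrub_pii(text: str) -> str:
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f'[REDACTED_{label.upper()}]', text)
    return text

# Well-formatted PII is caught:
assert scrub_pii("SSN 123-45-6789") == "SSN [REDACTED_SSN]"
# Names and dotted variants sail straight through:
assert scrub_pii("Send this to Sarah Johnson") == "Send this to Sarah Johnson"
assert scrub_pii("SSN 123.45.6789") == "SSN 123.45.6789"
```

Every assertion passes: the name and the dotted SSN reach the model unredacted, with no error and no log entry to tell you it happened.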
A regex scrubber that catches 80% of PII is worse than no scrubber in one important way: it creates the appearance of compliance without the reality. Security auditors and regulators aren't looking for "we tried to catch PII" — they want evidence that your controls are reliable.
Approach 2: NER-Based Detection
Named-Entity Recognition (NER) models are trained to identify real-world entities in unstructured text — person names, organizations, locations, dates. Running a lightweight NER model over prompts before forwarding them catches structured PII that regex misses:
# pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
REDACT_LABELS = {"PERSON", "ORG", "GPE", "DATE", "CARDINAL"}

def ner_scrub(text: str) -> str:
    doc = nlp(text)
    tokens = list(text)
    for ent in reversed(doc.ents):  # reverse to preserve offsets
        if ent.label_ in REDACT_LABELS:
            placeholder = f'[{ent.label_}]'
            tokens[ent.start_char:ent.end_char] = placeholder
    return ''.join(tokens)

user_input = "Please send this to Sarah Johnson at Acme Corp in New York"
clean = ner_scrub(user_input)
# → "Please send this to [PERSON] at [ORG] in [GPE]"
NER catches names, organizations, and locations that regex can't touch. But it introduces new problems:
- Latency: NER inference adds 50–200ms per request (plus model load time at startup). That's acceptable for some use cases, prohibitive for real-time chat.
- False positives: "Apple announced earnings" will redact "Apple" as an organization. Generic NER models don't understand your domain.
- No audit trail: Your scrubber runs in-process — there's no record of what was detected, what action was taken, or which policy applied.
- Infrastructure overhead: You're now running ML inference as part of your API request path. Scaling, monitoring, and updating the model is your problem.
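The false-positive problem is typically patched with a domain allowlist applied after NER and before redaction. Here's a minimal sketch of that post-filter — it operates on plain (text, label, start, end) tuples as a spaCy pipeline would yield them, so it runs without any model; the allowlist entries are illustrative:

```python
# Post-filter: drop NER hits that match a known-safe domain allowlist
# before redacting. ALLOWLIST entries here are illustrative.
ALLOWLIST = {("Apple", "ORG"), ("Acme Corp", "ORG")}

def filter_entities(entities):
    return [e for e in entities if (e[0], e[1]) not in ALLOWLIST]

def redact(text, entities):
    # Replace right-to-left so earlier character offsets stay valid
    for span, label, start, end in sorted(entities, key=lambda e: -e[2]):
        text = text[:start] + f"[{label}]" + text[end:]
    return text

msg = "Apple announced earnings and Tim Cook spoke"
ents = [("Apple", "ORG", 0, 5), ("Tim Cook", "PERSON", 29, 37)]
print(redact(msg, filter_entities(ents)))
# → "Apple announced earnings and [PERSON] spoke"
```

This works, but now the allowlist is one more artifact your team curates, reviews, and ships — the infrastructure overhead compounds.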
For teams with dedicated ML infrastructure, a fine-tuned NER model is a solid approach. For everyone else — especially those with compliance requirements — the infrastructure cost and audit gap are dealbreakers.
Approach 3: Policy-Layer Proxy (Recommended)
The most reliable approach is to put a purpose-built policy enforcement layer between your application and the LLM. This is what SentinelGate does: it combines pattern detection, contextual analysis, and configurable policy rules into a single proxy that your existing OpenAI SDK calls route through — no model weights to maintain, no audit gap, sub-20ms overhead.
The key advantages over DIY solutions:
- ✓ Structured audit events per request — every detection, every action, every policy name, immutably logged. This is what compliance evidence looks like.
- ✓ Multiple PII classes with configurable actions — redact SSNs, block on credit cards, observe on names. Different rules for different data classes.
- ✓ No model to maintain — SentinelGate's detection runs at the gateway, not in your process. Updates are transparent.
- ✓ Response scanning — PII in the model's response is also caught. Not just inputs.
- ✓ One URL change — integrates with any OpenAI-compatible SDK. Your existing code is unchanged.
Run your first PII-protected LLM call
Free tier includes 100 requests/month. Takes 5 minutes from signup to your first redacted prompt. No credit card required.
SentinelGate Walkthrough: Configuring a PII Policy
Here's how to configure PII protection with SentinelGate and see it in action. You'll need a free API key from the signup page — takes 2 minutes.
Replace your base URL. Your SentinelGate API key goes in the Authorization header. Your LLM API key (OpenAI, etc.) is configured once in the dashboard under BYOK — SentinelGate uses it to forward requests.
curl -X POST https://sentinelgate.polsia.app/v1/chat/completions \
  -H "Authorization: Bearer sg_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{
      "role": "user",
      "content": "Summarize this record: John Doe, SSN 123-45-6789, email john@example.com"
    }]
  }'
With an active PII policy, the proxy intercepts this request, redacts the SSN and email before the prompt reaches the model, and logs a structured audit event. The model receives:
Summarize this record: John Doe, SSN [REDACTED_SSN], email [REDACTED_EMAIL]
Log into your dashboard and create a policy. Here's a complete PII policy for a compliance-sensitive application:
{
"name": "pii-strict",
"mode": "enforce",
"rules": [
{
"type": "pii_redaction",
"detect": ["ssn", "credit_card", "email", "phone"],
"action": "redact" // strip before forwarding to model
},
{
"type": "pii_redaction",
"detect": ["name"],
"action": "mask" // replace with [NAME] token
},
{
"type": "audit_logging",
"retention_days": 365,
"include_prompts": false // log detections, not prompt text
}
]
}
If you're using the OpenAI SDK, the integration is a single base_url parameter change. Everything else stays the same:
from openai import OpenAI

# One change — point at SentinelGate
client = OpenAI(
    api_key="sg_live_YOUR_KEY",
    base_url="https://sentinelgate.polsia.app/v1"
)

# This prompt contains PII — SentinelGate redacts before forwarding
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Customer record: Jane Smith, 555-867-5309, jane@corp.com"
    }]
)

# The model received the prompt with PII stripped.
# SentinelGate logged: 1x phone detected+redacted, 1x email detected+redacted.
# The response comes back normally — your code is unchanged.
print(response.choices[0].message.content)
import OpenAI from 'openai';

// One change — point at SentinelGate
const client = new OpenAI({
  apiKey: 'sg_live_YOUR_KEY',
  baseURL: 'https://sentinelgate.polsia.app/v1'
});

// Prompt contains PII — SentinelGate strips before forwarding
const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{
    role: 'user',
    content: 'Patient DOB 01/15/1982, MRN 447821, admitted 2026-03-01'
  }]
});

// Model received prompt with PHI redacted.
// Audit event logged: date + numeric identifiers detected and masked.
console.log(response.choices[0].message.content);
Every proxied request generates a structured audit event in the SentinelGate dashboard. This is the evidence that demonstrates your controls are active. Here's what a PII detection event looks like:
{
"id": "evt_sg_9k2m4p7r",
"timestamp": "2026-04-22T09:14:33.201Z",
"model": "gpt-4o",
"policy": "pii-strict",
"decision": "redacted",
"detections": [
{ "type": "pii", "class": "phone", "action": "redact", "count": 1 },
{ "type": "pii", "class": "email", "action": "redact", "count": 1 },
{ "type": "pii", "class": "credit_card", "action": "block", "count": 0 }
],
"latency_ms": 14,
"upstream_ms": 921,
"upstream_status": 200
}
Application logs tell you something happened. Audit events tell you what policy was active, what was detected, and what action was taken — with a timestamp, a request ID, and a machine-readable structure. Security auditors and compliance reviewers want the latter. include_prompts: false keeps the evidence without storing the sensitive content itself.
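Because the events are machine-readable, turning them into the summary an auditor actually asks for takes a few lines of code. A sketch, assuming events are exported as one JSON object per line in the shape shown above (the export format is an assumption for illustration):

```python
import json
from collections import Counter

def summarize_detections(event_lines):
    """Tally detection counts per (class, action) across exported audit events."""
    totals = Counter()
    for line in event_lines:
        for d in json.loads(line).get("detections", []):
            if d["count"] > 0:  # skip rules that fired on nothing
                totals[(d["class"], d["action"])] += d["count"]
    return totals

# One exported event, shaped like the example above
export = [
    '{"detections": [{"type": "pii", "class": "phone", "action": "redact", "count": 1}, '
    '{"type": "pii", "class": "email", "action": "redact", "count": 1}]}'
]
print(summarize_detections(export))
# → Counter({('phone', 'redact'): 1, ('email', 'redact'): 1})
```

Run over a quarter's worth of events, this yields the per-class evidence a GDPR Article 30 record or SOC 2 review expects.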
Regulatory Compliance Mapping
Each major regulation has specific requirements around PII in AI systems. Here's how SentinelGate's features map to those requirements:
| Regulation | Requirement | SentinelGate feature |
|---|---|---|
| GDPR Art. 25 | Privacy by design — data minimization before processing. Personal data should not reach processors unless necessary. | PII redaction before prompt reaches upstream LLM. Configurable per-class detection. |
| GDPR Art. 30 | Records of processing activities — documentation of what data is processed, by whom, and for what purpose. | Structured audit log with per-request policy name, decision, and detection classes. Exportable. |
| HIPAA Security Rule | Technical safeguards for ePHI — access controls, audit controls, transmission security. | PHI detection (SSN, DOB, MRN patterns), redaction before inference, audit log with 365-day retention. |
| SOC 2 Type II | CC6.1 / CC7.2 — logical access controls and monitoring of data flows into AI systems. | Per-request audit events, policy enforcement logs, API key scoping, BYOK encryption at rest. |
| CCPA §1798.100 | Right to know — businesses must document what personal information they collect, including via AI interactions. | Audit log captures detection classes per request. Queryable record of what PII categories appeared in prompts. |
| EU AI Act Art. 9 | Risk management system for high-risk AI — documented controls, monitoring, and logging. | Policy-based enforcement with audit trail; exportable compliance bundles available on Pro and Enterprise plans. |
What You've Accomplished
With SentinelGate's PII protection active, you now have:
- ✓ Pre-inference redaction: SSNs, credit cards, email addresses, and phone numbers are stripped from prompts before reaching the model — not logged by your provider, not echoed back
- ✓ Response scanning: PII that appears in model outputs (echo-back) is detected and can be masked before reaching your application
- ✓ Compliance-grade audit trail: structured events per request — detection classes, action taken, policy name, latency — in a queryable, exportable format
- ✓ No ML infrastructure to maintain: detection runs at the gateway, updated transparently — your team manages policy configuration, not model weights
- ✓ Zero code changes: existing OpenAI SDK calls work unchanged — you changed one URL and one API key
Frequently Asked Questions
Does SentinelGate catch PII in the model's response, not just the prompt?
Yes. Both request and response pass through the policy engine. If the model echoes back PII from a prompt it received before you added SentinelGate, or generates PII-like patterns in its response, those are detected and actioned according to your policy.
What PII classes does SentinelGate detect?
The standard detection set includes: email addresses, phone numbers, SSNs, credit card PANs, and names. The Pro and Enterprise plans add custom pattern support — you can define your own detection classes using regex patterns for organization-specific identifiers (member IDs, account numbers, internal codes).
Can I log which PII was detected without storing the raw prompt?
Yes. Setting include_prompts: false in your policy (the default) logs detection metadata — types detected, counts, actions taken — without storing the prompt text. This is the preferred configuration for compliance: you have evidence of what your system detected and acted on, without creating a secondary PII store in your audit logs.
How is the PII detection different from regex I could write myself?
SentinelGate's detection combines pattern matching with contextual analysis — it catches obfuscated formats, common variants, and multi-field patterns that single-field regex misses. More importantly, it runs at the gateway with a structured audit trail. Your DIY regex has no audit record, no configurable policy, and no response scanning. See the AI Guardrails guide for a broader overview of everything the policy engine covers.
Is there a way to test PII detection before enforcing it in production?
Yes. Set your policy mode to observe instead of enforce. In observe mode, detections are logged and you can review what would have been redacted — without actually stripping anything from live prompts. When you're satisfied with the detection accuracy, switch to enforce. This lets you tune false positive rates before hardening production. See pricing for plan details.
Stop sending PII to your LLM provider
Free tier, no credit card, takes 5 minutes. Your first 100 requests are covered — enough to validate the integration and see your first audit events.