Why PII Leaks in AI APIs Are Different
PII in traditional web apps is at least contained: it goes into your database, stays there, and you control the schema. PII in AI applications is a different problem entirely. The moment you send a prompt to an LLM, that data is transmitted to a third-party inference endpoint, may be used for model training (depending on provider terms), appears in your application logs, and gets echoed back in completions — sometimes in contexts you didn't expect.
Echo-back: The model includes user-submitted PII in its response. A user pastes a customer record; the model quotes it back verbatim. Your application logs the full exchange.
Training data extraction: Adversarial users use prompt injection to make the model recall patterns from training data — including real PII from fine-tuning or RAG document sets.
Accidental context leakage: System prompts, RAG-retrieved documents, and tool outputs containing PII get injected into the conversation context and surface in completions — even when the user didn't submit PII themselves.
The regulatory exposure is real. GDPR Article 25 requires privacy by design — meaning "we didn't know users would paste SSNs" is not a defense. HIPAA's Security Rule applies to ePHI whether it reaches your system intentionally or not. SOC 2 Type II auditors will ask what controls prevent PII from appearing in your AI logs. CCPA requires you to know what personal data you collect — including data collected inadvertently through AI prompts.
The fix is to intercept and strip PII before it hits the model. Three approaches exist, each with different tradeoffs.
Three Approaches to PII Detection
Regex-based detection
Pattern matching on known formats (SSN: \d{3}-\d{2}-\d{4}, email: \S+@\S+). Fast to implement, fast to break.
NER-based detection
Named-entity recognition models trained to identify persons, organizations, and locations. Catches unlabeled PII regex misses. Adds latency.
Policy-layer proxy
Pre-built detection + configurable policy + audit trail, sitting between your app and the LLM. No model weights to maintain.
Approach 1: Regex-Based Detection
Regex is the obvious starting point. Define patterns for known PII formats and strip them before the prompt goes out. Here's what a basic regex scrubber looks like:
import re

PII_PATTERNS = {
    'ssn': re.compile(r'\b\d{3}[-\s]?\d{2}[-\s]?\d{4}\b'),
    'credit_card': re.compile(r'\b(?:\d{4}[-\s]?){3}\d{4}\b'),
    'email': re.compile(r'\b[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}\b'),
    'phone': re.compile(r'\b(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b'),
}

def scrub_pii(text: str) -> str:
    for label, pattern in PII_PATTERNS.items():
        placeholder = f'[REDACTED_{label.upper()}]'
        text = pattern.sub(placeholder, text)
    return text

# Usage
user_input = "My SSN is 123-45-6789 and my email is john@example.com"
clean = scrub_pii(user_input)
# → "My SSN is [REDACTED_SSN] and my email is [REDACTED_EMAIL]"
This works for well-formatted PII. The problem is that real-world PII is not well-formatted. Consider what regex misses:
- Names: "Send this to Sarah Johnson in accounting" — no pattern to match
- Addresses: "My address is 123 Maple Street" — too many variations to reliably pattern-match
- Account numbers: "My member ID is AB-7734291" — organization-specific formats
- Obfuscated formats: "123.45.6789" or digits spelled out one at a time — variants the pattern author didn't anticipate
- Context-sensitive PII: "Born on January 12, 1985" — a birthday only becomes HIPAA-relevant alongside other identifying data
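To make the gap concrete, here's a quick check using the same ssn and email patterns as the scrubber above — a minimal sketch, not a full test suite:

```python
import re

# Same ssn/email patterns as the scrubber above
PII_PATTERNS = {
    'ssn': re.compile(r'\b\d{3}[-\s]?\d{2}[-\s]?\d{4}\b'),
    'email': re.compile(r'\b[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}\b'),
}

def scrub_pii(text: str) -> str:
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f'[REDACTED_{label.upper()}]', text)
    return text

# Well-formatted PII is caught:
assert scrub_pii("SSN 123-45-6789") == "SSN [REDACTED_SSN]"
# Names and dotted variants sail straight through:
assert scrub_pii("Send this to Sarah Johnson") == "Send this to Sarah Johnson"
assert scrub_pii("SSN 123.45.6789") == "SSN 123.45.6789"
```

Every assertion passes: the name and the dotted SSN reach the model unredacted, with no error and no log entry to tell you it happened.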
A regex scrubber that catches 80% of PII is worse than no scrubber in one important way: it creates the appearance of compliance without the reality. Security auditors and regulators aren't looking for "we tried to catch PII" — they want evidence that your controls are reliable.
Approach 2: NER-Based Detection
Named-Entity Recognition (NER) models are trained to identify real-world entities in unstructured text — person names, organizations, locations, dates. Running a lightweight NER model over prompts before forwarding them catches structured PII that regex misses:
# pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
REDACT_LABELS = {"PERSON", "ORG", "GPE", "DATE", "CARDINAL"}

def ner_scrub(text: str) -> str:
    doc = nlp(text)
    tokens = list(text)
    for ent in reversed(doc.ents):  # reverse to preserve offsets
        if ent.label_ in REDACT_LABELS:
            placeholder = f'[{ent.label_}]'
            tokens[ent.start_char:ent.end_char] = placeholder
    return ''.join(tokens)

user_input = "Please send this to Sarah Johnson at Acme Corp in New York"
clean = ner_scrub(user_input)
# → "Please send this to [PERSON] at [ORG] in [GPE]"
NER catches names, organizations, and locations that regex can't touch. But it introduces new problems:
- Latency: NER inference adds 50–200ms per request (plus model load time at startup). That's acceptable for some use cases, prohibitive for real-time chat.
- False positives: "Apple announced earnings" will redact "Apple" as an organization. Generic NER models don't understand your domain.
- No audit trail: Your scrubber runs in-process — there's no record of what was detected, what action was taken, or which policy applied.
- Infrastructure overhead: You're now running ML inference as part of your API request path. Scaling, monitoring, and updating the model is your problem.
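The false-positive problem is typically patched with a domain allowlist applied after NER and before redaction. Here's a minimal sketch of that post-filter — it operates on plain (text, label, start, end) tuples as a spaCy pipeline would yield them, so it runs without any model; the allowlist entries are illustrative:

```python
# Post-filter: drop NER hits that match a known-safe domain allowlist
# before redacting. ALLOWLIST entries here are illustrative.
ALLOWLIST = {("Apple", "ORG"), ("Acme Corp", "ORG")}

def filter_entities(entities):
    return [e for e in entities if (e[0], e[1]) not in ALLOWLIST]

def redact(text, entities):
    # Replace right-to-left so earlier character offsets stay valid
    for span, label, start, end in sorted(entities, key=lambda e: -e[2]):
        text = text[:start] + f"[{label}]" + text[end:]
    return text

msg = "Apple announced earnings and Tim Cook spoke"
ents = [("Apple", "ORG", 0, 5), ("Tim Cook", "PERSON", 29, 37)]
print(redact(msg, filter_entities(ents)))
# → "Apple announced earnings and [PERSON] spoke"
```

This works, but now the allowlist is one more artifact your team curates, reviews, and ships — the infrastructure overhead compounds.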
For teams with dedicated ML infrastructure, a fine-tuned NER model is a solid approach. For everyone else — especially those with compliance requirements — the infrastructure cost and audit gap are dealbreakers.
Approach 3: Policy-Layer Proxy (Recommended)
The most reliable approach is to put a purpose-built policy enforcement layer between your application and the LLM. This is what SentinelGate does: it combines pattern detection, contextual analysis, and configurable policy rules into a single proxy that your existing OpenAI SDK calls route through — no model weights to maintain, no audit gap, sub-20ms overhead.
The key advantages over DIY solutions:
- ✓ Structured audit events per request — every detection, every action, every policy name, immutably logged. This is what compliance evidence looks like.
- ✓ Multiple PII classes with configurable actions — redact SSNs, block on credit cards, observe on names. Different rules for different data classes.
- ✓ No model to maintain — SentinelGate's detection runs at the gateway, not in your process. Updates are transparent.
- ✓ Response scanning — PII in the model's response is also caught. Not just inputs.
- ✓ One URL change — integrates with any OpenAI-compatible SDK. Your existing code is unchanged.
Run your first PII-protected LLM call
Free tier includes 100 requests/month. Takes 5 minutes from signup to your first redacted prompt. No credit card required.
SentinelGate Walkthrough: Configuring a PII Policy
Here's how to configure PII protection with SentinelGate and see it in action. You'll need a free API key from the signup page — takes 2 minutes.
Replace your base URL. Your SentinelGate API key goes in the Authorization header. Your LLM API key (OpenAI, etc.) is configured once in the dashboard under BYOK — SentinelGate uses it to forward requests.
curl -X POST https://sentinelgate.polsia.app/v1/chat/completions \
  -H "Authorization: Bearer sg_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{
      "role": "user",
      "content": "Summarize this record: John Doe, SSN 123-45-6789, email john@example.com"
    }]
  }'
With an active PII policy, the proxy intercepts this request, redacts the SSN and email before the prompt reaches the model, and logs a structured audit event. The model receives:
Summarize this record: John Doe, SSN [REDACTED_SSN], email [REDACTED_EMAIL]
Log into your dashboard and create a policy. Here's a complete PII policy for a compliance-sensitive application:
{
"name": "pii-strict",
"mode": "enforce",
"rules": [
{
"type": "pii_redaction",
"detect": ["ssn", "credit_card", "email", "phone"],
"action": "redact" // strip before forwarding to model
},
{
"type": "pii_redaction",
"detect": ["name"],
"action": "mask" // replace with [NAME] token
},
{
"type": "audit_logging",
"retention_days": 365,
"include_prompts": false // log detections, not prompt text
}
]
}
If you're using the OpenAI SDK, the integration is a single base_url parameter change. Everything else stays the same:
from openai import OpenAI

# One change — point at SentinelGate
client = OpenAI(
    api_key="sg_live_YOUR_KEY",
    base_url="https://sentinelgate.polsia.app/v1"
)

# This prompt contains PII — SentinelGate redacts before forwarding
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Customer record: Jane Smith, 555-867-5309, jane@corp.com"
    }]
)

# The model received the prompt with PII stripped.
# SentinelGate logged: 1x phone detected+redacted, 1x email detected+redacted.
# The response comes back normally — your code is unchanged.
print(response.choices[0].message.content)
import OpenAI from 'openai';

// One change — point at SentinelGate
const client = new OpenAI({
  apiKey: 'sg_live_YOUR_KEY',
  baseURL: 'https://sentinelgate.polsia.app/v1'
});

// Prompt contains PII — SentinelGate strips before forwarding
const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{
    role: 'user',
    content: 'Patient DOB 01/15/1982, MRN 447821, admitted 2026-03-01'
  }]
});

// Model received prompt with PHI redacted.
// Audit event logged: date + numeric identifiers detected and masked.
console.log(response.choices[0].message.content);
Every proxied request generates a structured audit event in the SentinelGate dashboard. This is the evidence that demonstrates your controls are active. Here's what a PII detection event looks like:
{
"id": "evt_sg_9k2m4p7r",
"timestamp": "2026-04-22T09:14:33.201Z",
"model": "gpt-4o",
"policy": "pii-strict",
"decision": "redacted",
"detections": [
{ "type": "pii", "class": "phone", "action": "redact", "count": 1 },
{ "type": "pii", "class": "email", "action": "redact", "count": 1 },
{ "type": "pii", "class": "credit_card", "action": "block", "count": 0 }
],
"latency_ms": 14,
"upstream_ms": 921,
"upstream_status": 200
}
Application logs tell you something happened. Audit events tell you what policy was active, what was detected, and what action was taken — with a timestamp, a request ID, and a machine-readable structure. Security auditors and compliance reviewers want the latter. include_prompts: false keeps the evidence without storing the sensitive content itself.
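Because the events are machine-readable, turning them into the summary an auditor actually asks for takes a few lines of code. A sketch, assuming events are exported as one JSON object per line in the shape shown above (the export format is an assumption for illustration):

```python
import json
from collections import Counter

def summarize_detections(event_lines):
    """Tally detection counts per (class, action) across exported audit events."""
    totals = Counter()
    for line in event_lines:
        for d in json.loads(line).get("detections", []):
            if d["count"] > 0:  # skip rules that fired on nothing
                totals[(d["class"], d["action"])] += d["count"]
    return totals

# One exported event, shaped like the example above
export = [
    '{"detections": [{"type": "pii", "class": "phone", "action": "redact", "count": 1}, '
    '{"type": "pii", "class": "email", "action": "redact", "count": 1}]}'
]
print(summarize_detections(export))
# → Counter({('phone', 'redact'): 1, ('email', 'redact'): 1})
```

Run over a quarter's worth of events, this yields the per-class evidence a GDPR Article 30 record or SOC 2 review expects.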
Regulatory Compliance Mapping
Each major regulation has specific requirements around PII in AI systems. Here's how SentinelGate's features map to those requirements:
| Regulation | Requirement | SentinelGate feature |
|---|---|---|
| GDPR Art. 25 | Privacy by design — data minimization before processing. Personal data should not reach processors unless necessary. | PII redaction before prompt reaches upstream LLM. Configurable per-class detection. |
| GDPR Art. 30 | Records of processing activities — documentation of what data is processed, by whom, and for what purpose. | Structured audit log with per-request policy name, decision, and detection classes. Exportable. |
| HIPAA Security Rule | Technical safeguards for ePHI — access controls, audit controls, transmission security. | PHI detection (SSN, DOB, MRN patterns), redaction before inference, audit log with 365-day retention. |
| SOC 2 Type II | CC6.1 / CC7.2 — logical access controls and monitoring of data flows into AI systems. | Per-request audit events, policy enforcement logs, API key scoping, BYOK encryption at rest. |
| CCPA §1798.100 | Right to know — businesses must document what personal information they collect, including via AI interactions. | Audit log captures detection classes per request. Queryable record of what PII categories appeared in prompts. |
| EU AI Act Art. 9 | Risk management system for high-risk AI — documented controls, monitoring, and logging. | Policy-based enforcement with audit trail; exportable compliance bundles available on Pro and Enterprise plans. |
What You've Accomplished
With SentinelGate's PII protection active, you now have:
- ✓ Pre-inference redaction: SSNs, credit cards, email addresses, and phone numbers are stripped from prompts before reaching the model — not logged by your provider, not echoed back
- ✓ Response scanning: PII that appears in model outputs (echo-back) is detected and can be masked before reaching your application
- ✓ Compliance-grade audit trail: structured events per request — detection classes, action taken, policy name, latency — in a queryable, exportable format
- ✓ No ML infrastructure to maintain: detection runs at the gateway, updated transparently — your team manages policy configuration, not model weights
- ✓ Zero code changes: existing OpenAI SDK calls work unchanged — you changed one URL and one API key
Frequently Asked Questions
Does SentinelGate catch PII in the model's response, not just the prompt?
Yes. Both request and response pass through the policy engine. If the model echoes back PII from a prompt it received before you added SentinelGate, or generates PII-like patterns in its response, those are detected and actioned according to your policy.
What PII classes does SentinelGate detect?
The standard detection set includes: email addresses, phone numbers, SSNs, credit card PANs, and names. The Pro and Enterprise plans add custom pattern support — you can define your own detection classes using regex patterns for organization-specific identifiers (member IDs, account numbers, internal codes).
Can I log which PII was detected without storing the raw prompt?
Yes. Setting include_prompts: false in your policy (the default) logs detection metadata — types detected, counts, actions taken — without storing the prompt text. This is the preferred configuration for compliance: you have evidence of what your system detected and acted on, without creating a secondary PII store in your audit logs.
How is the PII detection different from regex I could write myself?
SentinelGate's detection combines pattern matching with contextual analysis — it catches obfuscated formats, common variants, and multi-field patterns that single-field regex misses. More importantly, it runs at the gateway with a structured audit trail. Your DIY regex has no audit record, no configurable policy, and no response scanning. See the AI Guardrails guide for a broader overview of everything the policy engine covers.
Is there a way to test PII detection before enforcing it in production?
Yes. Set your policy mode to observe instead of enforce. In observe mode, detections are logged and you can review what would have been redacted — without actually stripping anything from live prompts. When you're satisfied with the detection accuracy, switch to enforce. This lets you tune false positive rates before hardening production. See pricing for plan details.
Stop sending PII to your LLM provider
Free tier, no credit card, takes 5 minutes. Your first 100 requests are covered — enough to validate the integration and see your first audit events.