The Problem: LLMs Are Unguarded by Default
When you call api.openai.com/v1/chat/completions, there's nothing between your users and the model. Whatever they type — credit card numbers, social security numbers, confidential medical details — hits the model and gets logged. Whatever the model returns — fabricated information, leaked context, harmful content — goes straight back to your application.
- PII leakage: user inputs containing emails, SSNs, phone numbers, and names are sent unredacted to third-party AI providers and stored in your application logs.
- Jailbreaks: adversarial users exploit prompt injection to override your system prompt, extract confidential instructions, or make the model produce outputs that violate your policies.
- No audit trail: without structured logging per request, you can't demonstrate compliance, investigate incidents, or prove due diligence to security teams.
Fixing this used to require building a custom middleware layer — parsing every request, running detectors, logging structured events. That's weeks of engineering. SentinelGate is that middleware, pre-built, exposed as a drop-in proxy.
The Solution: A Policy Proxy in Front of Your LLM
SentinelGate sits between your application and the upstream LLM provider (OpenAI, Anthropic, or any OpenAI-compatible endpoint). It's an AI governance proxy — every request passes through the policy engine before reaching the model.
Here's what happens on every call:
- Inspect the prompt — scan for PII, injection patterns, policy violations
- Act according to policy — redact, block, or allow depending on your config
- Forward to upstream — the modified request goes to OpenAI (or wherever)
- Inspect the response — check model output for unsafe content or data leakage
- Log a structured audit event — every interaction, every detection, immutably stored
The proxy is 100% OpenAI-compatible — your existing code doesn't change, only the base URL.
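The five steps above can be sketched as a minimal pipeline. This is an illustration of the control flow only — the function names (proxy_call, inspect, forward, check_output) are hypothetical stand-ins, not the SentinelGate API:

```python
from typing import Callable

def proxy_call(prompt: str,
               inspect: Callable[[str], tuple],
               forward: Callable[[str], str],
               check_output: Callable[[str], bool]) -> dict:
    """Illustrative gateway loop: inspect -> act -> forward -> inspect -> log."""
    safe_prompt, detections = inspect(prompt)   # steps 1-2: scan, then redact/block
    reply = forward(safe_prompt)                # step 3: upstream LLM call
    output_ok = check_output(reply)             # step 4: scan the model's output
    event = {                                   # step 5: structured audit event
        "prompt_detections": detections,
        "output_ok": output_ok,
    }
    return {"reply": reply, "audit": event}

# Toy stand-ins for the real detectors and the upstream call:
result = proxy_call(
    "Hello",
    inspect=lambda p: (p, []),
    forward=lambda p: f"echo: {p}",
    check_output=lambda r: True,
)
```

In the real proxy, the audit event is persisted before the response is returned, so a failed or blocked call still leaves a record.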
Create a free SentinelGate account. Your account includes 100 guardrailed requests per month at no cost — enough to test in development and stage a real integration. No credit card required.
After signup, go to your dashboard and generate your first API key. It looks like:
sg_live_a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2
This key identifies your account in the proxy. It's separate from your upstream LLM key — you configure your OpenAI (or other provider) key in the dashboard under BYOK (Bring Your Own Key). SentinelGate uses your upstream key to forward requests and never exposes it.
Replace https://api.openai.com with https://sentinelgate.polsia.app. That's the entire integration. Your existing OpenAI SDK calls, curl scripts, and API clients work unchanged.
curl -X POST https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'

curl -X POST https://sentinelgate.polsia.app/v1/chat/completions \
  -H "Authorization: Bearer sg_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hello"}]}'
Two things changed: the base URL, and the Authorization header (now uses your sg_live_* key instead of the OpenAI key directly). That's it.
If you're using the OpenAI Python or Node.js SDK, the integration is one line:
# pip install openai
from openai import OpenAI

client = OpenAI(
    api_key="sg_live_YOUR_KEY",
    base_url="https://sentinelgate.polsia.app/v1"  # ← one change
)

# All existing calls work unchanged
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)
// npm install openai
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sg_live_YOUR_KEY',
  baseURL: 'https://sentinelgate.polsia.app/v1' // ← one change
});

// All existing calls work unchanged
const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }]
});
With the proxy running, log into your SentinelGate dashboard and configure a policy. A policy defines what the gateway checks and how it reacts. You can have multiple policies for different use cases — your customer-facing chatbot has different risk tolerance than your internal copilot.
Key policy controls available on all tiers:
- PII detection classes: email, phone, SSN, credit card, name, custom patterns
- Redaction vs. block: strip PII from the prompt before it reaches the model (redact), or reject the request entirely (block)
- Jailbreak / injection detection: pattern-based and semantic detection for prompt injection, role override attacks, and system prompt extraction attempts
- Enforcement mode: Observe (log only, pass through), Enforce (act on violations), or Strict (block on any detection)
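The three enforcement modes can be thought of as a small decision function. This is a sketch of the assumed semantics, not SentinelGate internals — in particular, it simplifies Enforce's per-rule actions down to a single "redact" outcome:

```python
def decide(mode: str, detections: list) -> str:
    """Map an enforcement mode plus detections to a gateway decision.

    Assumed semantics: observe never blocks; strict blocks on any
    detection; enforce acts on the violation per the matching rule
    (simplified here to 'redact')."""
    if not detections:
        return "allow"
    if mode == "observe":
        return "allow"   # log only, pass through
    if mode == "strict":
        return "block"   # block on any detection
    return "redact"      # enforce: act on the violation

decision = decide("enforce", ["ssn"])  # -> "redact"
```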
Here's an example policy configuration for a customer-facing chatbot:
{
"name": "public-chatbot",
"mode": "enforce",
"rules": [
{
"type": "pii_redaction",
"detect": ["email", "phone", "ssn", "credit_card"],
"action": "redact" // replace with [REDACTED_EMAIL] etc.
},
{
"type": "jailbreak_detection",
"action": "block" // reject the request entirely
},
{
"type": "audit_logging",
"retention_days": 90
}
]
}
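To make the redact action concrete, here's roughly what the pii_redaction rule does to a prompt before it's forwarded. The regexes below are deliberately simplified illustrations for two of the detection classes — the real detectors are broader than a pair of patterns:

```python
import re

# Simplified patterns for two detection classes from the policy above.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.\w+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(prompt: str) -> str:
    """Replace each detected value with a class-labeled placeholder."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED_{label}]", prompt)
    return prompt

before = "My SSN is 123-45-6789, email me at jane@example.com"
after = redact(before)
# after == "My SSN is [REDACTED_SSN], email me at [REDACTED_EMAIL]"
```

The model only ever sees the placeholder, so the raw value never reaches the upstream provider or its logs.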
And an example internal copilot policy with stricter controls:
{
"name": "internal-copilot",
"mode": "enforce",
"rules": [
{
"type": "pii_redaction",
"detect": ["email", "phone", "ssn", "name", "credit_card"],
"action": "redact"
},
{
"type": "jailbreak_detection",
"action": "block"
},
{
"type": "topic_restriction",
"blocked_topics": ["competitor_products", "legal_advice"],
"action": "block"
},
{
"type": "audit_logging",
"retention_days": 365,
"include_prompts": true
}
]
}
What You Get: The Audit Event
Every proxied request generates a structured audit event. This is the core compliance artifact — immutable, queryable, and exportable. Here's what a single event looks like:
{
"id": "evt_sg_8f3k2m9x",
"timestamp": "2026-04-21T14:32:11.004Z",
"model": "gpt-4o",
"policy": "internal-copilot",
"decision": "redacted",
"detections": [
{ "type": "pii", "class": "ssn", "action": "redact", "count": 1 },
{ "type": "pii", "class": "email", "action": "redact", "count": 2 }
],
"latency_ms": 12, // gateway overhead only
"upstream_ms": 843, // OpenAI round-trip
"upstream_status": 200
}
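Because events are structured, compliance summaries reduce to aggregation. A sketch of tallying detections across exported events — the event shape follows the example above, while the export mechanism itself is out of scope here:

```python
from collections import Counter

def detection_summary(events: list) -> Counter:
    """Total detected items per PII class across audit events."""
    totals = Counter()
    for event in events:
        for d in event.get("detections", []):
            totals[d["class"]] += d["count"]
    return totals

events = [
    {"detections": [{"type": "pii", "class": "ssn", "action": "redact", "count": 1},
                    {"type": "pii", "class": "email", "action": "redact", "count": 2}]},
    {"detections": [{"type": "pii", "class": "email", "action": "redact", "count": 1}]},
]
summary = detection_summary(events)
# summary["email"] == 3, summary["ssn"] == 1
```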
SOC 2, HIPAA, and GDPR all require evidence that you're controlling data flows into AI systems. An audit log with per-request detection records — who called what model, what PII was present, what action was taken — is that evidence. Without it, "we have guardrails" is a claim. With it, it's provable.
Policy Examples for Common Use Cases
Healthcare / HIPAA
Redact PHI (patient names, DOB, MRN, diagnoses) from prompts. Block responses containing protected health information.
Finance / SOC 2
Detect and redact account numbers, routing numbers, card PAN. Log all AI interactions for SOC 2 audit evidence.
Legal / GDPR
PII redaction before inference. Retention controls per tenant region. Evidence packs for GDPR Article 22 documentation.
Internal AI Agents
Block injection patterns and tool-call manipulation. Enforce topic restrictions. Rate limit by user or org.
What You've Accomplished
With those three steps, you now have:
- ✓ PII protection: emails, SSNs, phone numbers, and credit card numbers are redacted from prompts before reaching the model
- ✓ Jailbreak defense: prompt injection and role override attempts are detected and blocked at the gateway
- ✓ Audit trail: every request generates a structured, immutable event with detections and actions logged
- ✓ Zero code changes: existing OpenAI SDK calls work unchanged — you changed one URL
- ✓ Compliance-ready evidence: exportable audit bundles for SOC 2, HIPAA, and GDPR review
Next Steps
A few things worth doing after the initial integration:
- Review your first audit events. Go to the dashboard Audit Logs tab and check what's being detected. You'll often find patterns you didn't expect — users pasting email threads, support tickets with customer PII, internal IDs that match your custom patterns.
- Tune your policy. If you're seeing too many false positives (legitimate requests blocked), switch from enforce to observe mode to log without blocking while you tune. Adjust the detection sensitivity and redaction classes to fit your data.
- Set up a second policy for production. Dev and staging can run in observe mode; production in enforce. Separate policies let you harden over time without breaking prod flows.
- Add a tenant header. If you're building multi-tenant, pass X-Tenant-Id: your-tenant-slug in requests. Audit events are scoped by tenant, giving you per-customer usage and compliance reports.
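If you're on the Python SDK, the tenant header can be attached once at client construction rather than per call — default_headers is a standard option on the OpenAI client (the tenant slug below is a made-up example):

```python
from openai import OpenAI

client = OpenAI(
    api_key="sg_live_YOUR_KEY",
    base_url="https://sentinelgate.polsia.app/v1",
    default_headers={"X-Tenant-Id": "acme-corp"},  # scopes audit events to this tenant
)
```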
Ready to run your first guardrailed request?
Free tier includes 100 requests/month. No credit card. Takes under 5 minutes from signup to first proxied call.
Frequently Asked Questions
Does SentinelGate add latency to my LLM calls?
Yes — every request passes through the policy engine. Gateway overhead is typically under 20ms (p99 sub-50ms). For most chat and completion use cases, this is unnoticeable relative to the model's own response time (usually 500ms–3s+). You can see per-request gateway latency vs. upstream latency in your audit events.
Which LLM providers does SentinelGate support?
Any provider with an OpenAI-compatible API: OpenAI, Azure OpenAI, Groq, Together AI, Perplexity, Mistral, and others. For Anthropic Claude (non-OpenAI format), you'll need to configure the SDK for OpenAI compatibility mode or use the compatibility wrapper.
Does SentinelGate store my prompts?
Audit events store detection metadata — what types of PII were found, what actions were taken, model, latency, policy name. Whether prompt text is stored in audit logs is configurable per policy via include_prompts: false (the default). Full prompt storage is opt-in and subject to your configured retention period.
Can I use SentinelGate without pointing it at OpenAI?
Yes. In BYOK mode, you configure your own LLM API key in the dashboard and SentinelGate forwards to your provider. SentinelGate is the proxy — your LLM key stays in your account and is encrypted at rest.
Is there a self-hosted option?
Enterprise plans include self-hosted and on-premise deployment options for environments where data cannot leave your infrastructure. See pricing for details, or book a demo.