Guardrail Gap Analyzer

Describe your LLM application (its surface, trust boundary, data sensitivity, hosting, and the controls you already run) and get a layered defense-in-depth matrix (Input → Model → Output → Monitoring). Every cell is scored Covered, Gap, or N/A. Gaps are ranked by residual risk, and each one links to the mitigating control and our implementation guides.

Nothing leaves your browser — evaluation runs client-side with a transparent rule set, and your configuration is encoded in the URL so you can share or bookmark a result. Controls and severities are drawn from our defense techniques guide (reviewed 2026-05).
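
To illustrate how a configuration can travel in a URL, here is a minimal Python sketch of the round trip. It is not the analyzer's actual encoding: the field names (`surface`, `boundary`, `data`, `hosting`, `controls`), the example URL, and the choice of base64-encoded JSON in the URL fragment are all assumptions made for the example.

```python
import base64
import json
from urllib.parse import urlsplit, urlunsplit


def encode_config(base_url: str, config: dict) -> str:
    """Return base_url with the config packed into the URL fragment."""
    # Compact, key-sorted JSON so the same config always yields the same URL.
    raw = json.dumps(config, sort_keys=True, separators=(",", ":")).encode("utf-8")
    # URL-safe base64; padding is stripped to keep the link short.
    token = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
    parts = urlsplit(base_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, token))


def decode_config(url: str) -> dict:
    """Recover the config dict from a URL produced by encode_config."""
    token = urlsplit(url).fragment
    padded = token + "=" * (-len(token) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Hypothetical configuration; every key and value here is illustrative.
example = {
    "surface": "rag_chatbot",
    "boundary": "public",
    "data": "pii",
    "hosting": "hosted_api",
    "controls": ["rate_limiting", "logging"],
}
link = encode_config("https://example.com/guardrail-gap-analyzer", example)
restored = decode_config(link)
```

The sketch uses the fragment rather than the query string because browsers do not include the fragment in HTTP requests, which fits a tool whose evaluation stays client-side.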

What does your LLM application look like?

Pick the closest shape. This drives which controls even apply.

Who can reach it (trust boundary)?

Untrusted callers raise the bar on input and abuse controls.

What's the most sensitive data in the prompt or retrieval path?

Regulated/PII data makes leakage controls mandatory, not optional.

Where does the model run?

A hosted API means prompts leave your trust boundary.

Which controls do you ALREADY have in place?

Check everything you genuinely run today. Unchecked applicable controls become gaps.
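
The scoring rule for a single cell can be sketched as follows. This is an illustration under stated assumptions, not the analyzer's rule set: the `AppProfile` fields and the applicability predicates in `APPLIES` are hypothetical, and only the three outcomes (Covered, Gap, N/A) and the control names come from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppProfile:
    """Hypothetical summary of the answers about the application's shape."""
    uses_retrieval: bool
    uses_tools: bool


# Assumed applicability rules: each predicate says whether a control
# applies to a given application profile at all.
APPLIES = {
    "Retrieved-document sanitization (indirect injection)": lambda p: p.uses_retrieval,
    "Privilege separation / least-privilege tool scoping": lambda p: p.uses_tools,
    "Rate limiting & per-user quotas": lambda p: True,
}


def score_cell(control: str, profile: AppProfile, in_place: set) -> str:
    """Score one control as 'Covered', 'Gap', or 'N/A'."""
    if not APPLIES[control](profile):
        return "N/A"  # control does not apply to this application shape
    # Applicable controls are Covered only if the user checked them.
    return "Covered" if control in in_place else "Gap"


# A retrieval-backed app with no tool calls that only runs rate limiting.
profile = AppProfile(uses_retrieval=True, uses_tools=False)
checked = {"Rate limiting & per-user quotas"}
scores = {name: score_cell(name, profile, checked) for name in APPLIES}
```

In this example the unchecked sanitization control becomes a Gap because the app uses retrieval, while tool scoping is N/A because the app makes no tool calls.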

All 17 controls we score

The defense-in-depth model: an attack should have to defeat a control at the Input layer, the Model layer, the Output layer, and Monitoring. Each control below carries a severity (1–5) used to rank residual risk when it is applicable but missing.

| Control | Layer | Severity if missing | Mitigates |
| --- | --- | --- | --- |
| Prompt-injection / jailbreak input classifier | Input | 5 | LLM01 Prompt Injection; Jailbreak / instruction override |
| Input schema / length / format validation | Input | 3 | LLM01 Prompt Injection; Resource exhaustion / context stuffing |
| Inbound PII detection & redaction | Input | 4 | LLM02 Sensitive Information Disclosure; Regulated-data leakage to model provider |
| Retrieved-document sanitization (indirect injection) | Input | 5 | LLM01 Prompt Injection (indirect); Data / knowledge-base poisoning |
| Rate limiting & per-user quotas | Input | 4 | LLM10 Unbounded Consumption; Model extraction / scraping; Cost-based denial of wallet |
| Authentication & tenant isolation at the boundary | Input | 5 | LLM02 Sensitive Information Disclosure; Cross-tenant data exposure |
| System-prompt hardening & instruction hierarchy | Model | 4 | LLM01 Prompt Injection; Instruction override / role confusion |
| Privilege separation / least-privilege tool scoping | Model | 5 | LLM06 Excessive Agency; Confused-deputy via tool calls |
| Dual-model / privileged-vs-quarantined architecture | Model | 3 | LLM01 Prompt Injection; LLM06 Excessive Agency |
| Grounding / context-faithfulness constraints | Model | 3 | LLM09 Misinformation; Hallucination / unsupported claims |
| Output safety / harmful-content classifier | Output | 5 | LLM05 Improper Output Handling; Harmful / policy-violating generations |
| Outbound PII / secrets leakage scrubbing | Output | 5 | LLM02 Sensitive Information Disclosure; Training-data / context leakage |
| Structured-output schema enforcement | Output | 4 | LLM05 Improper Output Handling; Injection into downstream systems |
| System-prompt canary / leak detection | Output | 2 | LLM07 System Prompt Leakage; Prompt extraction |
| Request/response logging & tracing | Monitoring | 4 | No detection / no forensics; Undetected abuse |
| Abuse / anomaly & drift detection with alerting | Monitoring | 3 | Coordinated abuse / attack campaigns; Output drift; Slow data exfiltration |
| Continuous adversarial testing in CI | Monitoring | 3 | Security regression; Untested attack surface |
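
As a rough illustration of how severities can drive the ranking, the sketch below orders a set of gaps using a few rows from the table above. It assumes residual risk is simply the control's severity, with ties broken by layer order (Input, Model, Output, Monitoring). That tie-break and the function name `rank_gaps` are assumptions for the example, not the analyzer's documented formula.

```python
LAYER_ORDER = ["Input", "Model", "Output", "Monitoring"]

# (control name, layer, severity if missing): a subset of the table rows.
CONTROLS = [
    ("Prompt-injection / jailbreak input classifier", "Input", 5),
    ("Input schema / length / format validation", "Input", 3),
    ("System-prompt hardening & instruction hierarchy", "Model", 4),
    ("System-prompt canary / leak detection", "Output", 2),
    ("Request/response logging & tracing", "Monitoring", 4),
]


def rank_gaps(gap_names: set) -> list:
    """Return gap rows sorted by severity (highest first), then by layer."""
    gaps = [row for row in CONTROLS if row[0] in gap_names]
    # Negative severity sorts high-severity gaps first; the layer index
    # keeps equal-severity gaps in Input -> Monitoring order.
    return sorted(gaps, key=lambda row: (-row[2], LAYER_ORDER.index(row[1])))


ranked = rank_gaps({
    "System-prompt canary / leak detection",
    "Request/response logging & tracing",
    "Prompt-injection / jailbreak input classifier",
    "System-prompt hardening & instruction hierarchy",
})
for name, layer, severity in ranked:
    print(f"{severity}  {layer:<10}  {name}")
```

With these four gaps, the missing input classifier (severity 5) ranks first and the missing canary (severity 2) ranks last.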
