🔒 Prompt Security Controls (Enterprise)

AI Pilot supports layered prompt controls suitable for enterprise AI Gateway deployments.


📌 Control categories

| Control | What it detects | Typical action |
| --- | --- | --- |
| Prompt injection | Instruction override attempts | Block |
| Jailbreak | Safety bypass phrases/patterns | Block |
| Data exfiltration | Bulk export / leakage intent | Block |
| Secret leakage | API keys/tokens/secrets in content | Redact |
| PII detection | Sensitive personal data | Redact |
| Malware/code abuse | Malicious command/code prompts | Annotate or block |

🛡️ Policy behavior options

Each control can be set to one of:

  • Block: reject the request (HTTP 400)
  • Redact: sanitize sensitive fragments before forwarding
  • Annotate: allow the traffic, but mark the event for audit and monitoring
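The three behaviors above can be sketched as a dispatch on a per-control policy map. This is a minimal illustration, not the product's actual API: the control names mirror the table above, and `apply_policy` is a hypothetical helper.

```python
from enum import Enum

class Action(Enum):
    BLOCK = "block"        # reject the request (HTTP 400)
    REDACT = "redact"      # sanitize matched fragments, then forward
    ANNOTATE = "annotate"  # forward unchanged, but log for audit

# Hypothetical per-control policy map; names mirror the control table.
POLICY = {
    "prompt_injection": Action.BLOCK,
    "jailbreak": Action.BLOCK,
    "data_exfiltration": Action.BLOCK,
    "secret_leakage": Action.REDACT,
    "pii": Action.REDACT,
    "code_abuse": Action.ANNOTATE,
}

def apply_policy(control: str, prompt: str, matches: list[str]) -> tuple[int, str]:
    """Return (HTTP status, forwarded prompt) for a triggered control."""
    action = POLICY[control]
    if action is Action.BLOCK:
        return 400, ""                       # request rejected, nothing forwarded
    if action is Action.REDACT:
        for m in matches:
            prompt = prompt.replace(m, "[REDACTED]")
        return 200, prompt                   # sanitized, then forwarded
    return 200, prompt                       # annotate: forwarded as-is, event logged
```

Note that redact keeps the request flowing while removing only the matched fragments, which is why it suits PII and secret leakage.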

🔒 Data security controls

Custom blocklists

Add enterprise-specific terms as:

  • Exact match (e.g., internal project codename)
  • Regex (e.g., custom identifier patterns)
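As a sketch of how the two blocklist entry types combine, the snippet below checks a prompt against exact terms and regex patterns. The codename and identifier pattern are made-up examples, not shipped defaults.

```python
import re

# Hypothetical blocklist entries (illustrative only):
EXACT_TERMS = {"project-zephyr"}                   # internal codename, exact match
REGEX_PATTERNS = [re.compile(r"\bEMP-\d{6}\b")]    # custom identifier pattern

def blocklist_hits(text: str) -> list[str]:
    """Return every blocklisted fragment found in the prompt text."""
    hits = [t for t in EXACT_TERMS if t in text]
    for pat in REGEX_PATTERNS:
        hits.extend(pat.findall(text))
    return hits
```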

Allowlist terms

Define safe terms that should bypass overly broad detections in approved contexts.

Trusted domains

Limit data exfiltration exceptions to approved destinations/domains.
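A trusted-domain check typically matches the destination host against the approved list, including subdomains. The sketch below assumes hypothetical domain entries; it is not the gateway's actual matching logic.

```python
from urllib.parse import urlparse

# Hypothetical approved destinations; subdomains of an entry also pass.
TRUSTED_DOMAINS = {"partner.example.com", "internal.example.org"}

def is_trusted_destination(url: str) -> bool:
    """True if the URL's host is an approved domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)
```

Matching on the parsed hostname (rather than substring search over the raw URL) avoids bypasses like `evil.com/?x=partner.example.com`.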


🏗️ Visual flow

(Diagram: end-to-end flow of the prompt security controls; image not reproduced here.)


👁️ Audit and governance mapping

Use these events for compliance and incident review:

  • AI_TRAFFIC_LOG
  • AI_POLICY_VIOLATION
  • AI_PII_REDACTION

Recommended:

  • Alert on spikes in AI_POLICY_VIOLATION events.
  • Trend AI_PII_REDACTION by bouncer and application.
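The alerting recommendation above can be sketched as a simple per-application threshold over the audit stream. The event shape (`type` and `app` fields) is an assumption for illustration, not the product's documented log schema.

```python
from collections import Counter

def violation_spikes(events: list[dict], threshold: int = 5) -> dict[str, int]:
    """Count AI_POLICY_VIOLATION events per application and
    return the applications at or above the alert threshold."""
    counts = Counter(e["app"] for e in events if e["type"] == "AI_POLICY_VIOLATION")
    return {app: n for app, n in counts.items() if n >= threshold}
```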

🔒 Hardening recommendations

  • Default to block for injection/jailbreak/exfiltration.
  • Use redact for PII/secrets where business flow must continue.
  • Keep annotate for low-confidence classes initially, then tighten.
  • Review allowlists/trusted domains quarterly.