🤖 AI Gateway Configuration
This guide explains the modern AI Pilot configuration flow in Settings -> AI Pilot.
📌 Navigation model
AI Pilot controls are grouped in one place for each selected bouncer:
- Settings: baseline and override controls
- Forensics: summary of denies, failures, and incidents
- Trace Log: investigation events with action and reason context
📌 Configuration fields and intent
Baseline scope
- Use Global AI Baseline: inherit tenant-level defaults for this bouncer.
- Use for consistency across many bouncers.
- Disable only when a specific bouncer needs stricter or different controls.
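To make the inherit-vs-override decision concrete, a per-bouncer override might look like the following sketch (hypothetical field names mirroring the UI labels; the actual export or API schema may differ):

```yaml
# Hypothetical config shape, not the gateway's actual schema.
bouncer: payments-api
use_global_ai_baseline: false   # override tenant defaults for this bouncer only
overrides:
  harm_threshold: 0             # stricter than the global baseline
  violation_action: block
```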
Cost and throughput controls
- Tokens / Min / User: max token budget per user in the configured window.
- Burst Tokens: short burst budget before throttling.
- Window (seconds): sliding window duration for rate checks.
- Prompt Caching: cache deterministic prompt responses.
- Cache TTL (seconds): cache lifetime.
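The three rate-control fields interact: sustained usage is capped by the per-user budget, a short burst allowance absorbs spikes, and the sliding window decides when old usage ages out. A minimal sketch of that interaction, assuming hypothetical names (the real gateway enforces this server-side):

```python
import time
from collections import deque

class TokenBudget:
    """Illustrative sliding-window token limiter.

    tokens_per_window ~ Tokens / Min / User, burst_tokens ~ Burst Tokens,
    window_seconds ~ Window (seconds). Names are hypothetical.
    """

    def __init__(self, tokens_per_window, burst_tokens, window_seconds):
        self.limit = tokens_per_window
        self.burst = burst_tokens
        self.window = window_seconds
        self.events = deque()  # (timestamp, tokens) per accepted request

    def allow(self, tokens, now=None):
        now = time.monotonic() if now is None else now
        # Age out usage that has left the sliding window.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()
        used = sum(t for _, t in self.events)
        # Sustained budget plus a short burst allowance.
        if used + tokens <= self.limit + self.burst:
            self.events.append((now, tokens))
            return True
        return False
```

For example, with a 1000-token budget, 200 burst tokens, and a 60-second window, a user can briefly reach 1200 tokens in-window, after which further requests are throttled until earlier usage ages out.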
Safety controls
- Prompt Shields: enable jailbreak and prompt injection controls.
- Harm Threshold: sensitivity level (0 = strictest, 3 = most permissive).
- Violation Action:
  - block: deny the unsafe request or response
  - redact: remove unsafe content and continue
  - annotate: allow but tag/flag for review
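The three violation actions can be sketched as a simple dispatch (hypothetical helper; the gateway applies these outcomes internally):

```python
def apply_violation_action(action, text, flagged_spans):
    """Illustrate the block / redact / annotate outcomes.

    `flagged_spans` stands in for whatever the safety check flags;
    all names here are hypothetical.
    """
    if action == "block":
        # Deny the unsafe request or response outright.
        return None
    if action == "redact":
        # Remove flagged content, then let the request continue.
        for span in flagged_spans:
            text = text.replace(span, "[REDACTED]")
        return text
    if action == "annotate":
        # Allow the content through, but tag it for review.
        return {"text": text, "review": True, "flags": flagged_spans}
    raise ValueError(f"unknown violation action: {action}")
```

The design choice to keep annotate non-destructive is what makes it useful for tuning: reviewers see the original content alongside the flags before deciding whether block or redact is warranted.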
📌 Recommended pilot defaults
- Use global baseline enabled for most bouncers.
- Set a deny-oriented posture in sandbox first, then tune to reduce false positives.
- Start with conservative token windows and increase based on observed usage.
- Enable prompt caching for high-volume, repetitive workloads.
- Keep prompt shields enabled in both sandbox and production.
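Taken together, the defaults above might look like this as a starting point (hypothetical field names mirroring the UI labels; the actual schema and sensible numeric values will vary by workload):

```yaml
# Hypothetical representation of the recommended pilot defaults.
use_global_ai_baseline: true
tokens_per_min_per_user: 1000   # conservative start; raise based on observed usage
burst_tokens: 200
window_seconds: 60
prompt_caching: true            # worthwhile for high-volume, repetitive workloads
cache_ttl_seconds: 300
prompt_shields: true            # keep enabled in sandbox and production
harm_threshold: 0               # strictest; relax only after tuning false positives
violation_action: block         # deny-oriented posture for the sandbox phase
```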
📌 Validation checklist
- Correct bouncer selected in the intended environment
- Baseline mode (Use Global) intentionally set
- Token/cost controls saved without validation errors
- Safety controls produce expected block/redact/annotate outcomes
- Forensics and trace logs show expected events for test traffic