🤖 AI Gateway Configuration

This guide explains the modern AI Pilot configuration flow in Settings -> AI Pilot.

For each selected bouncer, AI Pilot controls are grouped into three areas:

  • Settings: baseline and override controls
  • Forensics: summary of denies, failures, and incidents
  • Trace Log: investigation events with action and reason context

📌 Configuration fields and intent

Baseline scope

  • Use Global AI Baseline: inherit tenant-level defaults for this bouncer.
    • Use for consistency across many bouncers.
    • Disable only when a specific bouncer needs stricter or different controls.
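The inherit-or-override behavior above can be sketched as a small settings-resolution step. This is a hypothetical illustration, not the product's actual API: the field names (`use_global_baseline`, `overrides`) and the `effective_settings` helper are assumptions made for the example.

```python
# Hypothetical sketch: resolving the settings a bouncer actually runs
# with, given a tenant-level baseline and optional per-bouncer overrides.

GLOBAL_BASELINE = {"tokens_per_min": 1000, "prompt_shields": True}

def effective_settings(bouncer: dict) -> dict:
    """Return the effective configuration for one bouncer."""
    if bouncer.get("use_global_baseline", True):
        # Inherit tenant defaults wholesale, for consistency across bouncers.
        return dict(GLOBAL_BASELINE)
    # Otherwise start from the baseline and apply this bouncer's overrides.
    merged = dict(GLOBAL_BASELINE)
    merged.update(bouncer.get("overrides", {}))
    return merged

# A stricter bouncer that opts out of the global baseline:
strict = {"use_global_baseline": False,
          "overrides": {"tokens_per_min": 200}}
assert effective_settings(strict)["tokens_per_min"] == 200
# An unconfigured bouncer inherits the tenant defaults:
assert effective_settings({})["tokens_per_min"] == 1000
```

Note that unspecified fields fall back to the baseline even when overriding, which keeps a partially configured bouncer safe by default.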

Cost and throughput controls

  • Tokens / Min / User: max token budget per user in the configured window.
  • Burst Tokens: short burst budget before throttling.
  • Window (seconds): sliding window duration for rate checks.
  • Prompt Caching: cache deterministic prompt responses.
  • Cache TTL (seconds): cache lifetime.

Safety controls

  • Prompt Shields: enable jailbreak and prompt injection controls.
  • Harm Threshold: sensitivity level (0 = strictest, 3 = most permissive).
  • Violation Action:
    • block: deny unsafe request/response
    • redact: remove unsafe content and continue
    • annotate: allow but tag/flag for review

📌 Recommended practices

  • Keep Use Global AI Baseline enabled for most bouncers.
  • Set a deny-oriented posture in sandbox first, then tune for false positives.
  • Start with conservative token windows and increase based on observed usage.
  • Enable prompt caching for high-volume, repetitive workloads.
  • Keep prompt shields enabled in both sandbox and production.
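The three Violation Action values described above (block, redact, annotate) can be sketched as a small dispatch over a flagged response. The function name, span format, and tags here are illustrative assumptions, not the product's actual API.

```python
# Hypothetical sketch of applying a Violation Action to a response in
# which the safety controls flagged some character spans as unsafe.

def apply_violation_action(action: str, text: str,
                           spans: list[tuple[int, int]]):
    """Return (delivered_text, audit_tags) for the flagged spans."""
    if action == "block":
        # Deny the unsafe request/response entirely.
        return None, ["blocked"]
    if action == "redact":
        # Remove the unsafe content and continue.
        for start, end in sorted(spans, reverse=True):
            text = text[:start] + "[REDACTED]" + text[end:]
        return text, ["redacted"]
    if action == "annotate":
        # Allow the content through, but tag it for review.
        return text, [f"flagged:{start}-{end}" for start, end in spans]
    raise ValueError(f"unknown action: {action}")

text = "hello UNSAFE world"
assert apply_violation_action("block", text, [(6, 12)]) == (None, ["blocked"])
redacted, _ = apply_violation_action("redact", text, [(6, 12)])
assert redacted == "hello [REDACTED] world"
assert apply_violation_action("annotate", text, [(6, 12)])[1] == ["flagged:6-12"]
```

Running test traffic through each of the three actions in sandbox is a quick way to confirm the configured outcome matches intent before promoting to production.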

📌 Validation checklist

  • Correct bouncer selected in the intended environment
  • Baseline mode (Use Global) intentionally set
  • Token/cost controls saved without validation errors
  • Safety controls produce expected block/redact/annotate outcomes
  • Forensics and trace logs show expected events for test traffic