AI Pilot

Audience: Platform engineers, AI governance owners
Time: ~5 min read

Control Core AI Pilot delivers enterprise authorization, observability, and runtime controls for every AI interaction that flows through your bouncers. AI Pilot is built into every bouncer, so there is no separate product to deploy — the Control Plane simply delivers configuration to it.

What you get

  • Multi-provider routing to OpenAI, Anthropic, Azure OpenAI, Google Vertex, AWS Bedrock, and any OpenAI-compatible endpoint
  • Multi-target enforcement in one bouncer — a single bouncer can govern many LLMs, MCP servers, RAG services, AI agents, and internal APIs at the same time
  • Unified API translation so applications can call a single endpoint and have the gateway speak each provider's native protocol
  • Token-aware rate limits, budget caps, fallback chains, and A/B and canary routing, all expressed declaratively
  • Credential management at the gateway — no API keys in application code
  • Upstream authentication and authorization for connected AI/LLM providers: OAuth2/OIDC token rotation, MFA-required flag, scope mapping
  • Guardrails and DLP for prompts and responses, powered by the bouncer's external processor filter
  • Prompt cache + semantic match with an optional Redis sidecar near the bouncer
  • Token usage per transaction captured in the audit trail and surfaced in the Token Ledger
  • Cost + latency circuit breakers that trip on cost per minute, p95 latency, or error rate, not just on 5xx responses
  • Server-side resilience (retries, retry budget, hedging) so client SDKs no longer need bespoke resiliency code
  • Intelligent transparent MCP proxy — passthrough, registry, or broker mode with tool allowlists and tool-response caching
  • Advanced AI audit with decision lineage, prompt/response hashing, redaction log, and tool-call chain
  • Full audit trail of every AI call with direct SIEM export
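
To make the multi-signal circuit breaker idea concrete, here is a minimal Python sketch of a breaker that evaluates cost per minute, p95 latency, and error rate over a sliding window. It is an illustration under stated assumptions, not AI Pilot's implementation: the class name, field names, thresholds, and the nearest-rank p95 calculation are all hypothetical.

```python
import math
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Sample:
    timestamp: float  # seconds since an arbitrary epoch
    latency_s: float
    cost_usd: float
    is_error: bool


@dataclass
class MultiSignalBreaker:
    # Hypothetical thresholds; these names are illustrative, not AI Pilot settings.
    max_cost_per_minute: float
    max_p95_latency_s: float
    max_error_rate: float
    window_s: float = 60.0
    samples: deque = field(default_factory=deque)

    def record(self, sample: Sample) -> None:
        self.samples.append(sample)
        # Drop samples that have aged out of the sliding window.
        cutoff = sample.timestamp - self.window_s
        while self.samples and self.samples[0].timestamp <= cutoff:
            self.samples.popleft()

    def tripped_signals(self) -> list:
        """Return the names of the signals currently over their threshold."""
        if not self.samples:
            return []
        n = len(self.samples)
        # Normalise spend in the window to a per-minute figure.
        cost_per_minute = sum(s.cost_usd for s in self.samples) * 60.0 / self.window_s
        latencies = sorted(s.latency_s for s in self.samples)
        p95 = latencies[max(0, math.ceil(0.95 * n) - 1)]  # nearest-rank percentile
        error_rate = sum(s.is_error for s in self.samples) / n

        tripped = []
        if cost_per_minute > self.max_cost_per_minute:
            tripped.append("cost")
        if p95 > self.max_p95_latency_s:
            tripped.append("latency")
        if error_rate > self.max_error_rate:
            tripped.append("error")
        return tripped


# Ten fast, successful calls that are nevertheless expensive.
breaker = MultiSignalBreaker(max_cost_per_minute=1.00,
                             max_p95_latency_s=2.0,
                             max_error_rate=0.5)
for i in range(10):
    breaker.record(Sample(timestamp=float(i), latency_s=0.5,
                          cost_usd=0.20, is_error=False))
print(breaker.tripped_signals())  # ['cost']
```

In this example every call succeeds quickly, so a breaker that only watched 5xx responses would stay closed, while the cost signal trips.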

How AI Pilot is delivered

The Control Plane compiles your UI inputs into AI Pilot manifests that the bouncer applies at runtime. The 2026-Q2 bundle is aigateway.envoyproxy.io/v1 and emits the following resource kinds:

  • Gateway and HTTPRoute for traffic entry
  • AIGatewayRoute for model name mapping, conditional routing, fallback
  • AIServiceBackend per provider
  • BackendSecurityPolicy for credentials (API key, AWS SigV4, OAuth2/OIDC)
  • BackendTrafficPolicy for token-aware rate limits and multi-signal circuit breakers
  • AIGatewayFilter for ext_proc (guardrails, DLP, prompt cache)
  • MCPRoute for the intelligent transparent MCP proxy
  • AIGatewayUpstreamAuth for provider-side OAuth2/OIDC and scope mapping
  • AIGatewayMetricsPolicy for per-transaction token usage telemetry
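
When reviewing a compiled bundle (for example, the YAML preview in the dashboard), it can help to summarise it by kind. The Python sketch below assumes the manifests have already been parsed into dictionaries with the usual Kubernetes top-level "kind" key; the function name and the sample bundle are hypothetical, and the set of kinds is copied from the list above.

```python
from collections import Counter

# Resource kinds copied from the list above.
KNOWN_KINDS = {
    "Gateway", "HTTPRoute", "AIGatewayRoute", "AIServiceBackend",
    "BackendSecurityPolicy", "BackendTrafficPolicy", "AIGatewayFilter",
    "MCPRoute", "AIGatewayUpstreamAuth", "AIGatewayMetricsPolicy",
}


def summarise_bundle(manifests):
    """Count manifests per kind and list any kind not in KNOWN_KINDS."""
    counts = Counter(m.get("kind", "<missing kind>") for m in manifests)
    unknown = sorted(kind for kind in counts if kind not in KNOWN_KINDS)
    return dict(counts), unknown


# Hypothetical bundle: two provider backends plus one unexpected resource.
bundle = [
    {"kind": "Gateway", "metadata": {"name": "ai-pilot"}},
    {"kind": "HTTPRoute", "metadata": {"name": "ai-pilot"}},
    {"kind": "AIServiceBackend", "metadata": {"name": "provider-a"}},
    {"kind": "AIServiceBackend", "metadata": {"name": "provider-b"}},
    {"kind": "ConfigMap", "metadata": {"name": "stray"}},
]
counts, unknown = summarise_bundle(bundle)
print(counts["AIServiceBackend"])  # 2
print(unknown)                     # ['ConfigMap']
```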

Your policy enforcement point (PEP) stays generic. Control Core's value-add (guardrails, DLP, prompt cache, MCP enrichment, decision lineage) runs as an external processor attached to AI Pilot, never in application code.

Where the cache and rate limits live

AI Pilot supports three deployment topologies, picked per bouncer in Settings -> AI Pilot -> Cache & Rate Limits:

  • bundled — Helm subchart and Compose profile ship redis:7-alpine and envoyproxy/ratelimit:1.4 next to the bouncer
  • external — operator supplies REDIS_URL and (optionally) a RATELIMIT_ENDPOINT; the Control Plane probes connectivity
  • disabled — only per-process limits and local cache; no shared store
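
The rules for the three topologies can be sketched as a small validation function. This is a hedged illustration only: the function names and message wording are hypothetical, and the only inputs taken from this page are the topology names and the REDIS_URL and RATELIMIT_ENDPOINT variables.

```python
def validate_topology(topology, env):
    """Return a list of problems for the chosen topology (empty if none)."""
    if topology == "bundled":
        # Redis and the rate limit service ship next to the bouncer,
        # so no operator-supplied endpoints are needed.
        return []
    if topology == "external":
        problems = []
        if not env.get("REDIS_URL"):
            problems.append("external topology requires REDIS_URL")
        # RATELIMIT_ENDPOINT is optional, so its absence is not a problem.
        return problems
    if topology == "disabled":
        # Per-process limits and local cache only; nothing to validate.
        return []
    return [f"unknown topology: {topology!r}"]


def uses_shared_store(topology):
    """Only 'disabled' runs without a shared store."""
    return topology in ("bundled", "external")


print(validate_topology("external", {}))
# ['external topology requires REDIS_URL']
print(validate_topology("external", {"REDIS_URL": "redis://redis.example.internal:6379"}))
# []
print(uses_shared_store("disabled"))
# False
```

The Redis address in the example is a placeholder, not a default shipped with the product.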

See Cache & Rate Limits for the full deep-dive.

The /pilot dashboard

Open AI Pilot (top-level navigation) for the enterprise cockpit. It is organised into nine tabs that span the full width of the page:

  1. Overview — live bouncer inventory, SIEM outbox health, mTLS/Cache/RLS status badges
  2. Analytics — requests, tokens, cost, TTFT/ITL; breakdowns by provider, model, MCP server, AI agent, RAG service, application, user
  3. Traffic log — paginated per-request log with outcome, guardrail action, DLP action, CSV export
  4. Routing & models — visual builder for service backends, routes, fallback chains, rate limits; previews compiled AI Pilot YAML
  5. Guardrails & DLP — content safety classifiers, DLP profiles with predefined and custom detectors, prompt cache settings
  6. Token Ledger — per-transaction token usage table with cost, provider, model, principal; CSV export
  7. Resilience — circuit-breaker state per backend (cost / latency / error), retry budget, hedging
  8. MCP Proxy — live tool-call view, server health, denied-tool list
  9. Audit — SSE-driven live trace with decision-lineage drill-down

Advanced global baselines, credential vaults, and the multi-target rule tables live under Settings -> AI Pilot.

Typical operator flow

  1. Register a bouncer (it ships with AI Pilot embedded).
  2. Open AI Pilot -> Routing & models and add backends, routes, and rate limits.
  3. Open Guardrails & DLP and enable the classifiers and DLP profiles your organisation requires.
  4. Send test traffic, then inspect Analytics and Traffic log.
  5. Use Overview -> Open AI audit trail to verify that AI_POLICY_VIOLATION, AI_PII_REDACTION, and AI_TRAFFIC_LOG events are being captured.
  6. Configure a SIEM target from Settings -> Audit logs; the SIEM outbox card on the dashboard shows delivery health.
  7. Promote the configuration from Sandbox to Production once it meets your acceptance criteria.
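
The audit check in step 5 can also be scripted against exported events. The Python sketch below assumes each event is a dictionary whose type is stored under an "event_type" field; that field name and the sample events are hypothetical, while the three event type names come from the step above.

```python
# Event types named in step 5 of the operator flow.
REQUIRED_EVENT_TYPES = {
    "AI_POLICY_VIOLATION",
    "AI_PII_REDACTION",
    "AI_TRAFFIC_LOG",
}


def missing_event_types(events, type_field="event_type"):
    """Return the required event types that never appear in `events`."""
    seen = {event.get(type_field) for event in events}
    return sorted(REQUIRED_EVENT_TYPES - seen)


# Hypothetical export from a test run in which no PII was redacted.
sample_events = [
    {"event_type": "AI_TRAFFIC_LOG", "outcome": "allowed"},
    {"event_type": "AI_TRAFFIC_LOG", "outcome": "denied"},
    {"event_type": "AI_POLICY_VIOLATION", "outcome": "denied"},
]
print(missing_event_types(sample_events))  # ['AI_PII_REDACTION']
```

A non-empty result suggests the test traffic did not exercise that control, or that the corresponding events are not reaching the audit trail.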

Troubleshooting: If the dashboard shows an empty bouncer list, confirm the bouncer is registered in the current environment (use the header environment selector) and that it can reach the Control Plane. See AI Pilot Troubleshooting.

Environment model

Control Core runs as one Control Plane instance with two isolated environments:

  • Sandbox for development and validation
  • Production for live enforcement

Isolation applies to policies, bouncers, resources, AI connections, PIP connections, and action destinations. The only cross-environment flow is controlled promotion from Sandbox to Production.
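
One way to picture the isolation rule is a data model in which every object belongs to exactly one environment and the only operation that crosses the boundary is a one-way promotion. The Python sketch below is a conceptual illustration; the ConfigItem type and promote function are hypothetical and do not describe Control Core's internal model.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Environment(Enum):
    SANDBOX = "sandbox"
    PRODUCTION = "production"


@dataclass(frozen=True)
class ConfigItem:
    # Hypothetical stand-in for any environment-scoped object.
    name: str
    environment: Environment


def promote(item: ConfigItem) -> ConfigItem:
    """Return a Production copy of a Sandbox item; reject anything else."""
    if item.environment is not Environment.SANDBOX:
        raise ValueError("only Sandbox items can be promoted")
    # The Sandbox original is left untouched; promotion creates a new object.
    return replace(item, environment=Environment.PRODUCTION)


draft = ConfigItem(name="pii-guardrail", environment=Environment.SANDBOX)
live = promote(draft)
print(draft.environment.value, live.environment.value)  # sandbox production
```

Because the dataclass is frozen, nothing can move an existing item between environments in place, which mirrors the idea that promotion is the only cross-environment flow.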

Next steps