AI Pilot — Enterprise Onboarding

Audience: Platform engineers, AI governance owners, security architects

Time: ~15 min read

This is the entry point for new enterprise deployments of Control Core AI Pilot. It assumes you already have a Control Plane installed and at least one bouncer registered, and walks you through enabling the full AI Pilot feature set in the right order.

If you have never deployed Control Core, work through the Quick Start Guide before continuing.

What you will end up with

  • One bouncer governing many AI targets at once: LLMs, MCP servers, RAG services, AI agents, internal AI-touching APIs.
  • mTLS enforced for the AI tier.
  • Per-target cost rules (per provider, per model, per MCP server, per app, per role).
  • Cost + latency + 5xx circuit breakers with retry budget and hedging.
  • Intelligent transparent MCP proxy with tool allowlists and tool-response caching.
  • Upstream OAuth2/OIDC for AI providers — no API keys in app code.
  • Token usage captured per transaction with full SIEM-bound audit lineage.
  • An optional Redis + Rate Limit Service sidecar deployed next to the bouncer (or BYO).


Step 1 — Register bouncers and resources

Open Settings -> Bouncer Management and confirm at least one bouncer is Connected and Intercepting traffic for the environment you are configuring (Sandbox first).

Open Settings -> Resources and register every AI target the bouncer should govern: LLM endpoints (OpenAI, Anthropic, Azure OpenAI, Bedrock, Vertex), MCP servers, internal RAG endpoints, AI agents, and any AI-touching internal APIs. Tag each resource with classification, owner, and SLA tier — these enrichment fields drive policy templates and cost rule wizards downstream.

Networking: Control Plane → bouncer plugin admin (TCP 9998)

The Control Plane API calls the bouncer's plugin admin HTTP listener on TCP 9998. The host is taken from the registered bouncer health URL, and the port is 9998. Live pilot telemetry (GET /pilot/resilience, GET /pilot/mcp-proxy, GET /pilot/cache-probe, and related endpoints) is served from that listener. Allow Control Plane → bouncer:9998 in your network policies or security groups; exposing only the Envoy admin port 9901 is not enough for those panels.

Troubleshooting: If AI Pilot → Resilience or MCP Proxy shows errors while other tabs work, test reachability to http://<bouncer-host>:9998/pilot/cache-probe from a host on the same network as the Control Plane. Common causes: firewall rules, missing Service port in Helm, or forwarding only the main data plane port. See AI Pilot Troubleshooting.
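The reachability test above can be scripted. The following is a minimal sketch using only the Python standard library; the helper names `probe_url` and `is_reachable` are illustrative and not part of Control Core. It treats any HTTP response, including an error status, as proof that the network path to port 9998 is open, and makes no assumption about what the endpoint returns.

```python
import http.client
import urllib.error
import urllib.request


def probe_url(bouncer_host: str, path: str = "/pilot/cache-probe", port: int = 9998) -> str:
    """Build the plugin admin URL for a bouncer host (hypothetical helper)."""
    return f"http://{bouncer_host}:{port}{path}"


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if something answers HTTP at the URL, whatever the status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The listener answered with an error status, so the path itself is open.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Refused, timed out, DNS failure, or a non-HTTP reply: treat as unreachable.
        return False
```

Run it from a host on the same network as the Control Plane, for example `is_reachable(probe_url("<bouncer-host>"))` with your own host name substituted. A `False` result points at the network causes listed above rather than at the AI Pilot configuration.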

Single-bouncer multi-target is the norm. A reverse-proxy bouncer can sit in front of many providers, MCP servers, RAG services, and AI agents at once.

Step 2 — Pick cache & rate-limit topology

Open Settings -> AI Pilot -> Cache & Rate Limits and pick one of:

  • bundled — ship Redis + envoyproxy/ratelimit as sidecars next to the bouncer (Helm subchart or Compose profile)
  • external — reuse an existing Redis (and optional RLS) already in your infra
  • disabled — only per-process limits and local cache (good for sandbox or air-gapped demos)

PAP probes connectivity and shows the result on the /pilot Overview tab. See Cache & Rate Limits for the full deep-dive.

Step 3 — Configure upstream auth for providers

For every LLM provider you registered in Step 1, open the Upstream Auth section under Settings -> AI Pilot and pick the appropriate auth type:

  • API key (current default)
  • OAuth2 / OIDC (recommended for enterprise)
  • AWS SigV4 (Bedrock)
  • Azure AD (Azure OpenAI)

Once OAuth2/OIDC is set, the bouncer rotates tokens automatically and your app code never needs to handle API keys. See Upstream Auth.
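To make "rotates tokens automatically" concrete, here is a minimal sketch of the general pattern: cache an access token and fetch a new one shortly before it expires. This is not the bouncer's implementation. `TokenCache`, the `fetch` callable, and the 30-second skew are assumptions for illustration; in practice `fetch` would perform the OAuth2 client-credentials request against your identity provider.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CachedToken:
    value: str
    expires_at: float  # clock time at which the token stops being valid


class TokenCache:
    """Hypothetical cache that refreshes a token shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],  # returns (token, lifetime in seconds)
        skew_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._skew = skew_seconds
        self._clock = clock
        self._token: Optional[CachedToken] = None

    def get(self) -> str:
        now = self._clock()
        # Refresh when no token is cached or the cached one is inside the skew window.
        if self._token is None or now >= self._token.expires_at - self._skew:
            value, lifetime = self._fetch()
            self._token = CachedToken(value, now + lifetime)
        return self._token.value
```

Because the proxy holds the credential and the refresh logic, application code only sends requests and never sees the token.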

Step 4 — Author cost rules per target

Open Settings -> AI Pilot -> Cost Optimization and add rules in the table. Each rule is keyed by target_type + provider + model (or MCP server) plus optional scope filters (application, user_role):

  • Tokens per minute per user and total tokens per minute
  • USD per day cap (cost-aware)
  • Burst tokens, window seconds
  • Fallback route when the rule trips

A single bouncer can hold many cost rules at once — for example "GPT-4o capped at $50/day total but $5/user/day for Marketing" alongside "Claude on Bedrock allowed only for Legal" alongside "MCP weather_lookup 100 calls/min". See Cost Optimization for Multi-Provider for the full pattern catalog.
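The way several rules can coexist for one target is easier to see in code. The sketch below models a rule keyed by target_type + provider + model with optional application and user_role filters, and picks the most specific matching rule. The class names, the single `usd_per_day` field, and the "most specific wins" precedence are assumptions made for illustration; the product's actual precedence is not described here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Request:
    target_type: str
    provider: str
    model: str
    application: str
    user_role: str


@dataclass(frozen=True)
class CostRule:
    target_type: str
    provider: str
    model: str
    usd_per_day: float
    application: Optional[str] = None  # None means "any application"
    user_role: Optional[str] = None    # None means "any role"

    def matches(self, req: Request) -> bool:
        key_ok = (self.target_type, self.provider, self.model) == (
            req.target_type, req.provider, req.model)
        app_ok = self.application is None or self.application == req.application
        role_ok = self.user_role is None or self.user_role == req.user_role
        return key_ok and app_ok and role_ok

    def specificity(self) -> int:
        # Number of scope filters set; used to rank overlapping rules.
        return (self.application is not None) + (self.user_role is not None)


def select_rule(rules: Iterable[CostRule], req: Request) -> Optional[CostRule]:
    """Return the most specific matching rule (assumed precedence), or None."""
    candidates = [r for r in rules if r.matches(req)]
    return max(candidates, key=CostRule.specificity, default=None)
```

With a $50/day rule for GPT-4o and a second $5/day rule scoped to the marketing role, a marketing request resolves to the $5 rule while every other role falls back to the $50 rule.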

Step 5 — Enable resilience & circuit breakers

Open Settings -> AI Pilot -> Resilience and configure thresholds for cost-per-minute, p95 latency, and error rate. Set the cool-down period and the fallback route to use when a breaker trips. Add a retry budget and hedging policy for the routes that need it.

This is what lets you remove resiliency code from your application SDKs — the bouncer handles retries, fallbacks, and breakers transparently. See Resilience & Circuit Breaker.
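For readers new to the pattern, the following sketch shows how an error-rate threshold, a cool-down period, and a fallback route interact. It is a simplified illustration, not the bouncer's logic: `ErrorRateBreaker` and all of its parameters are hypothetical, and it omits the cost-per-minute and p95 latency thresholds, the retry budget, and hedging.

```python
import time
from collections import deque
from typing import Callable, Deque, Optional


class ErrorRateBreaker:
    """Opens when the recent error rate exceeds a threshold, then cools down."""

    def __init__(
        self,
        max_error_rate: float = 0.5,
        window: int = 20,
        min_samples: int = 5,
        cool_down_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_error_rate = max_error_rate
        self._min_samples = min_samples
        self._cool_down = cool_down_seconds
        self._clock = clock
        self._results: Deque[bool] = deque(maxlen=window)
        self._opened_at: Optional[float] = None

    def record(self, ok: bool) -> None:
        """Record one upstream result and trip the breaker if needed."""
        self._results.append(ok)
        if len(self._results) < self._min_samples:
            return
        errors = sum(1 for r in self._results if not r)
        if errors / len(self._results) > self._max_error_rate:
            self._opened_at = self._clock()
            self._results.clear()  # start from a clean window after the trip

    def route(self, primary: str, fallback: str) -> str:
        """Return the fallback route while open, the primary route otherwise."""
        if self._opened_at is None:
            return primary
        if self._clock() - self._opened_at >= self._cool_down:
            self._opened_at = None  # cool-down elapsed: close the breaker
            return primary
        return fallback
```

Keeping this state in the proxy rather than in each SDK means every client of a route sees the same breaker state and the same fallback.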

Step 6 — Turn on guardrails & DLP

Open Settings -> AI Pilot -> Content Safety and enable the categories your governance program requires. Configure custom blocklists, allowlists, trusted domains, and the violation action (block / redact / annotate).
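The three violation actions differ in what reaches the upstream. The sketch below applies a custom blocklist to a prompt and shows the effect of each action. The function name, the `Verdict` type, and the `[REDACTED]` placeholder are assumptions for illustration; real category detection is far richer than literal term matching.

```python
import re
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Verdict:
    allowed: bool                 # False means the request is rejected
    text: str                     # text forwarded upstream (empty when blocked)
    annotations: List[str] = field(default_factory=list)  # matched terms


def apply_blocklist(text: str, blocklist: Sequence[str], action: str) -> Verdict:
    """Apply a literal-term blocklist with one of: block, redact, annotate."""
    if action not in ("block", "redact", "annotate"):
        raise ValueError(f"unknown violation action: {action!r}")
    if not blocklist:
        return Verdict(True, text)
    pattern = re.compile("|".join(re.escape(t) for t in blocklist), re.IGNORECASE)
    hits = sorted({m.group(0).lower() for m in pattern.finditer(text)})
    if not hits:
        return Verdict(True, text)
    if action == "block":
        return Verdict(False, "", hits)
    if action == "redact":
        return Verdict(True, pattern.sub("[REDACTED]", text), hits)
    # annotate: forward the text unchanged and record what matched
    return Verdict(True, text, hits)
```

Redact keeps the request flowing with the sensitive span removed, while annotate changes nothing in the payload and only records the match for audit.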

Step 7 — Configure MCP proxy mode

If you registered MCP servers in Step 1, open Settings -> AI Pilot -> MCP Proxy and pick a mode:

  • passthrough — bouncer just observes and applies PBAC
  • registry — bouncer maintains a server allowlist; only registered MCP servers are reachable
  • broker — bouncer is the only MCP endpoint for clients; it routes to the right server based on the requested tool

Add tool allowlists, response cache TTLs, and PBAC overlays. See MCP Proxy Mode.
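A tool allowlist combined with a response cache TTL behaves roughly as in the sketch below. `ToolGateway` and the `call_upstream` callable are hypothetical stand-ins for the proxy and the MCP server, and caching by tool name plus arguments is an assumption about the cache key.

```python
import json
import time
from typing import Any, Callable, Dict, Iterable, Tuple


class ToolGateway:
    """Rejects tools that are not allowlisted and caches responses for a TTL."""

    def __init__(
        self,
        allowlist: Iterable[str],
        ttl_seconds: float,
        call_upstream: Callable[[str, Dict[str, Any]], Any],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._allowlist = frozenset(allowlist)
        self._ttl = ttl_seconds
        self._call_upstream = call_upstream
        self._clock = clock
        self._cache: Dict[Tuple[str, str], Tuple[float, Any]] = {}

    def call(self, tool: str, args: Dict[str, Any]) -> Any:
        if tool not in self._allowlist:
            raise PermissionError(f"tool {tool!r} is not on the allowlist")
        # Assumed cache key: tool name plus canonical JSON of the arguments.
        key = (tool, json.dumps(args, sort_keys=True))
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh cached response, upstream is not called
        result = self._call_upstream(tool, args)
        self._cache[key] = (now, result)
        return result
```

A short TTL suits tools such as weather_lookup whose answers change slowly. Tools with side effects should generally not be cached at all.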

Step 8 — Enforce mTLS

Open Settings -> AI Pilot -> mTLS and toggle "Required for: LLM | MCP | RAG | API". The bouncer presents SPIRE-issued SVIDs to every selected upstream and refuses to connect on any other path. See mTLS Enforcement.

Step 9 — Validate observability & audit

Open the /pilot dashboard:

  • Overview — should show your bouncer with green badges for Connected, mTLS, Cache, RLS.
  • Analytics — apply filters by provider / model / MCP server / AI agent / application / user / decision and confirm metrics appear.
  • Token Ledger — confirm per-transaction rows are populated with cost.
  • Resilience — confirm circuit-breaker state matches what you configured.
  • MCP Proxy — confirm each MCP server is Healthy and tool calls are flowing.
  • Audit — open the SSE stream and confirm AI_DECISION_LINEAGE events are arriving.

Also check the Audit page at /audit?quickFilter=ai-pilot and the SIEM outbox card on the dashboard.
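If you want to confirm the audit stream from a script rather than the UI, a small Server-Sent Events parser is enough. The sketch below follows the standard SSE line format. It assumes the event type is carried in the `event:` field, which is not confirmed here, so adjust it if AI_DECISION_LINEAGE appears inside the `data:` payload instead. The function names are illustrative.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event = "message"  # default event type when no "event:" field is sent
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip events without data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if name == "event":
            event = value
        elif name == "data":
            data.append(value)


def count_lineage(lines: Iterable[str]) -> int:
    """Count events whose type is AI_DECISION_LINEAGE."""
    return sum(1 for event, _ in parse_sse(lines) if event == "AI_DECISION_LINEAGE")
```

Feed it the lines of the streaming HTTP response body. A count that stays at zero while traffic is flowing suggests events are not reaching the stream.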

Promotion to production

Once Sandbox is validated end-to-end, use the existing Promotion workflow under Controls to move your pilot config and policies to Production. Deploy a separate bouncer in the Production environment and repeat Step 2 (cache topology) — the rest of the configuration promotes automatically.

Where to go next