AI Pilot scenario library

Audience: Platform engineers, AI governance owners
Time: ~10–45 min per scenario (varies)

Short, repeatable narratives aligned with the AI Pilot gateway-first rule: prefer CRDs and filters before custom PEP logic.

1) Cold-start multi-provider route (~20 min)

Goal: Route /team-a/* to Provider A and /team-b/* to Provider B with distinct backends.

  1. Open /pilotRouting & models, add two AIServiceBackend entries (providers + models).
  2. Add AIGatewayRoute rules with path prefix conditions (or import paths via OpenAPI upload, then merge).
  3. Save & deploy, verify compiled YAML contains both backends and rules.

Troubleshooting: If one route 404s upstream, verify base_url and credentials on each backend. See Upstream Auth.

2) Token budget + fallback chain (~25 min)

Goal: Enforce token-aware limits and ordered fallbacks.

  1. Define BackendTrafficPolicy inputs via rate limit descriptors and resilience blocks (Control Plane routing / resilience panels).
  2. Add primary + fallback backendRefs on the route rule.
  3. Confirm breaker thresholds match your SLO stories (cost / latency / 5xx).

Troubleshooting: Limits not applying usually means the traffic policy is not attached in the runtime profile — see Resilience & Circuit Breaker.

3) MCP broker with tool allowlist (~30 min)

Goal: Central MCP egress with allowed tools only.

  1. Configure MCPRoute (servers, mode, allowlists) per MCP Proxy Mode.
  2. Layer PBAC in OPA for sensitive tools (reference MCP dimensions in policies).
  3. Use /pilot MCP Proxy tab for observed activity — routing truth stays in MCPRoute.

4) Semantic cache + Redis (~30 min)

Goal: Shared prompt cache with honest topology probes.

  1. Choose bundled or external Redis per Cache & Rate Limits.
  2. Enable semantic cache in guardrails / filter settings so AIGatewayFilter carries the block ext_proc expects.
  3. From the Control Plane, run the cache probe path documented in onboarding — expect configured vs explicit errors (never silent success).

5) Promote sandbox → production checklist (~20 min)

  1. Export or snapshot routing + resilience + MCP settings from Sandbox.
  2. Reapply under Production environment; re-verify 9998 reachability from Production Control Plane pods.
  3. Watch audit for PILOT_* events and SIEM outbox counters on /pilot.

See also

Troubleshooting: Scenario partially works after promotion — compare compiled YAML bundle revisions between environments and Policy Bridge bundle freshness. Link: Troubleshooting.