Diagnostic Logs
Audience: DevOps / Platform engineers, Support leads
Time: ~10 min
Prerequisites:
- Control Plane running (see Quick-Start)
- At least one Bouncer registered (see Deployment Overview)
- Administrator role in the Control Plane
Diagnostic logs are off by default. Control authors and operators turn them on only when they need to troubleshoot — either on their own or under guidance from Control Core support. Diagnostic logs are separate from the compliance audit trail and do not replace it.
TL;DR — capture, export, purge
# 1) Turn on the subsystems you need (UI: Settings → Diagnostic Logs)
# 2) Reproduce the issue once
# 3) Pull a tarball from the Control Plane
curl -sS -H "Authorization: Bearer $CC_TOKEN" \
"https://controlplane.example.com/log-management/diagnostics/export?since=2026-04-30T10:00:00Z" \
-o diagnostic-logs.tar.gz
# 4) Share the bundle with Control Core support (UI: Settings → Diagnostic Logs
# → Share with support) or inspect locally with standard tools:
mkdir -p rca-input && tar -xzf diagnostic-logs.tar.gz -C rca-input
grep -r '"level":"ERROR"' rca-input/
jq -s 'sort_by(.ts)' rca-input/*.jsonl | less
# 5) Purge when done (confirm=true required)
curl -sS -X POST -H "Authorization: Bearer $CC_TOKEN" \
-H "Content-Type: application/json" \
-d '{"confirm":true}' \
"https://controlplane.example.com/log-management/diagnostics/purge"
Troubleshooting: If the export returns an empty tarball, check that at least one subsystem is enabled at level INFO or higher. With every subsystem OFF, nothing is written to disk, so there is nothing to export. Open Settings → Diagnostic Logs, enable the relevant subsystem, reproduce the issue, and export again.
1. What gets logged
Each subsystem writes one JSONL line per event under /var/log/controlcore/control-plane/<subsystem>.jsonl (Control Plane) or /var/log/controlcore/bouncer/<subsystem>.jsonl (Bouncer). The schema is uniform across tiers:
{"ts":"2026-04-30T10:00:01Z","level":"ERROR","subsystem":"auth.login","correlation_id":"abc-123","msg":"password rejected"}
Standard fields:
| Field | Notes |
|---|---|
| ts | ISO-8601 UTC timestamp |
| level | ERROR, WARN, INFO, DEBUG, or TRACE (never OFF; OFF means no line is written) |
| subsystem | One of the catalog IDs listed below |
| correlation_id / trace_id | Request-scoped identifiers used to join events end-to-end |
| msg | Short human-readable description |
| tenant | Present when tenant routing is enabled |
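To make the schema concrete, here is a minimal Python sketch that parses JSONL lines, checks the standard fields, and joins events on correlation_id. The choice of which fields count as required, the helper names (parse_event, group_by_correlation), and the second sample event are assumptions for illustration, not part of the product.

```python
import json
from collections import defaultdict

# Field names come from the table above. Treating these four as always
# present is an assumption made for this sketch.
REQUIRED_FIELDS = ("ts", "level", "subsystem", "msg")
LEVELS = {"ERROR", "WARN", "INFO", "DEBUG", "TRACE"}


def parse_event(line):
    """Parse one JSONL line and check the standard fields."""
    event = json.loads(line)
    missing = [name for name in REQUIRED_FIELDS if name not in event]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if event["level"] not in LEVELS:
        raise ValueError(f"unexpected level: {event['level']}")
    return event


def group_by_correlation(lines):
    """Group events by correlation_id, each group ordered by timestamp."""
    groups = defaultdict(list)
    for line in lines:
        if line.strip():
            event = parse_event(line)
            groups[event.get("correlation_id")].append(event)
    for events in groups.values():
        # ISO-8601 UTC strings in one consistent format sort chronologically.
        events.sort(key=lambda e: e["ts"])
    return dict(groups)


# The first line is the example from this page; the second is hypothetical.
sample = [
    '{"ts":"2026-04-30T10:00:02Z","level":"INFO","subsystem":"http.requests",'
    '"correlation_id":"abc-123","msg":"request completed"}',
    '{"ts":"2026-04-30T10:00:01Z","level":"ERROR","subsystem":"auth.login",'
    '"correlation_id":"abc-123","msg":"password rejected"}',
]
timeline = group_by_correlation(sample)["abc-123"]
print([event["subsystem"] for event in timeline])
```

Sorting on the raw ts string keeps the sketch dependency-free; if timestamps can carry differing precision or offsets, parse them into datetime objects before sorting.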
PII is scrubbed before write: Bearer tokens, JWTs, email-like strings, password=…, and api_key=… are replaced with [REDACTED]. No prompt bodies, passwords, or JWTs ever hit disk.
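The redaction step can be pictured with a short Python sketch. The regular expressions below are illustrative assumptions only; the scrubbing rules Control Core actually applies are not documented on this page and may be stricter or shaped differently.

```python
import re

# Hypothetical patterns for the categories named above. They are not the
# product's real rules.
_KEY_VALUE = re.compile(r"\b(?:password|api_key)=\S+")
_PATTERNS = [
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"),                         # bearer tokens
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*"),  # JWT-like strings
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),       # email-like strings
]


def scrub(text):
    """Replace sensitive-looking substrings with [REDACTED]."""
    # Key/value pairs go first so the whole pair is replaced in one step.
    text = _KEY_VALUE.sub("[REDACTED]", text)
    for pattern in _PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


print(scrub("login failed for alice@example.com password=hunter2"))
print(scrub("Authorization: Bearer abc.def-123"))
```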
2. Subsystem catalog
Open Settings → Diagnostic Logs. The subsystem table is grouped into Control Plane and Bouncer sections; each row has an Enable switch and a Level selector (OFF, ERROR, WARN, INFO, DEBUG, TRACE).
| Group | Subsystem | What it captures |
|---|---|---|
| Control Plane | auth.login | Login, password reset, session issuance |
| Control Plane | auth.sso | SSO/SAML/OIDC exchanges |
| Control Plane | pep.registration | Bouncer register/heartbeat/deregister |
| Control Plane | resource.discovery | Protected-resource scans, probes, metadata fetch |
| Control Plane | policy.generation | Rego generation (Visual Builder + SCCA) |
| Control Plane | policy.lifecycle | Control create/update/version/delete |
| Control Plane | policy.activation | Activation, deactivation, environment promotion |
| Control Plane | policy.sync | Bundle build, GitHub sync, policy bundle push timing |
| Control Plane | bridge.server | Embedded policy distribution server and WebSocket subscriptions |
| Control Plane | bridge.sync | Per-bouncer policy sync outcomes |
| Control Plane | pip.datasource | PIP data-source fetch, cache, row counts |
| Control Plane | aigw.compiler | AI Pilot Kubernetes CRD emission |
| Control Plane | audit.writer | Audit persistence and SIEM outbox flush |
| Control Plane | http.requests | Inbound HTTP requests (correlation_id only) |
| Bouncer | bouncer.ext_authz | Envoy ext_authz gRPC decisions |
| Bouncer | bouncer.ext_proc | Envoy ext_proc body inspection and obligation application |
| Bouncer | bouncer.opa | OPA evaluation timing and decision-log plugin |
| Bouncer | bouncer.opal_client | Policy Bridge client reloads and WebSocket reconnects |
| Bouncer | bouncer.envoy_access | Envoy access log aligned to the diagnostic schema |
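The Level selector can be read as a severity threshold. The Python sketch below assumes the usual logging convention: a subsystem set to a given level writes events at that severity and anything more severe, and OFF writes nothing. The threshold behaviour is an assumption; this page states only the OFF case explicitly. The function name is_written is hypothetical.

```python
# Levels in the order the selector lists them, most severe first.
SEVERITY = ["ERROR", "WARN", "INFO", "DEBUG", "TRACE"]


def is_written(event_level, configured_level):
    """Return True if an event at event_level would be written to disk."""
    if configured_level == "OFF":
        return False  # OFF means no line is written
    # Assumed convention: write the configured level and everything more severe.
    return SEVERITY.index(event_level) <= SEVERITY.index(configured_level)


print(is_written("ERROR", "INFO"))   # more severe than the threshold
print(is_written("DEBUG", "INFO"))   # less severe than the threshold
print(is_written("ERROR", "OFF"))    # subsystem disabled
```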
Troubleshooting: If a subsystem you need is not in the catalog, check your Control Plane version. The catalog is shipped as a single source of truth and expands with new capabilities. File a support ticket with the subsystem you want and a one-line description of what you want logged.
3. Time-boxed capture
For short troubleshooting windows — reproducing a flake, chasing a transient policy-sync failure — use Time-boxed capture instead of permanently enabling a subsystem.
- Select the subsystems you want temporarily loud.
- Pick a duration (30 s – 1 h).
- Pick a level (default DEBUG).
- Click Start capture.
- Reproduce the issue.
- The capture expires automatically; every participating subsystem reverts to OFF.
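The lifecycle in the steps above can be modelled in a few lines of Python. This is a sketch of the described behaviour (a 30 s to 1 h duration, DEBUG by default, revert to OFF on expiry), not the Control Plane's implementation; the Capture class and its method names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MIN_DURATION = timedelta(seconds=30)
MAX_DURATION = timedelta(hours=1)


@dataclass
class Capture:
    subsystems: tuple
    started_at: datetime
    duration: timedelta
    level: str = "DEBUG"  # default level from the steps above

    def __post_init__(self):
        if not MIN_DURATION <= self.duration <= MAX_DURATION:
            raise ValueError("duration must be between 30 s and 1 h")

    @property
    def expires_at(self):
        return self.started_at + self.duration

    def level_for(self, subsystem, now):
        """Level this capture imposes on a subsystem at time `now`."""
        if subsystem not in self.subsystems:
            return None  # not part of this capture; its own setting applies
        return self.level if now < self.expires_at else "OFF"


start = datetime(2026, 4, 30, 10, 0, tzinfo=timezone.utc)
capture = Capture(("policy.sync",), start, timedelta(minutes=5))
print(capture.level_for("policy.sync", start + timedelta(minutes=1)))
print(capture.level_for("policy.sync", start + timedelta(minutes=5)))
```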
Troubleshooting: If capture status shows active: true past the expiry window, the Control Plane capture-timer job hasn't run. Hit the endpoint manually:
curl -sS -X DELETE -H "Authorization: Bearer $CC_TOKEN" \
"https://controlplane.example.com/log-management/diagnostics/capture"
Full reference: Remote Troubleshooting Runbook.
4. Export a diagnostic slice
GET /log-management/diagnostics/export?since=<iso>&until=<iso> streams a tar.gz containing every JSONL file clipped to the time window, plus a manifest.json documenting which files and how many events were included.
- The archive is plain JSONL with no encoded blobs, so grep, jq, and yq all work directly on the expanded bundle.
- When sharing with Control Core support, use Settings → Diagnostic Logs → Share diagnostic bundle for a signed, HMAC-verified package. See Security of Support.
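Because the bundle is plain JSONL inside a tar.gz, it can also be read without unpacking to disk. The Python sketch below builds a tiny archive in memory so it runs without a server, then merges every .jsonl member into one timeline. The member paths and the event messages are assumptions; the real archive layout and the manifest.json schema are not described on this page.

```python
import io
import json
import tarfile


def load_events(archive_bytes):
    """Return all events from *.jsonl members of a tar.gz, sorted by ts."""
    events = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar.getmembers():
            if not (member.isfile() and member.name.endswith(".jsonl")):
                continue  # skips manifest.json and directories
            for raw in tar.extractfile(member):
                if raw.strip():
                    events.append(json.loads(raw))
    return sorted(events, key=lambda e: e["ts"])


def make_archive(files):
    """Build a small tar.gz in memory (stand-in for a real export)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, text in files.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


# Hypothetical member paths and messages, for illustration only.
archive = make_archive({
    "control-plane/policy.sync.jsonl":
        '{"ts":"2026-04-30T10:00:05Z","level":"WARN","subsystem":"policy.sync","msg":"bundle push slow"}\n',
    "bouncer/bouncer.opa.jsonl":
        '{"ts":"2026-04-30T10:00:03Z","level":"DEBUG","subsystem":"bouncer.opa","msg":"evaluation done"}\n',
    "manifest.json": "{}",
})
events = load_events(archive)
print([event["subsystem"] for event in events])
```

For a real export, read diagnostic-logs.tar.gz in binary mode and pass its contents to load_events.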
5. Purge
POST /log-management/diagnostics/purge (body {"confirm":true}) removes every *.jsonl* file from the configured control-plane and bouncer log directories and writes a DIAGNOSTIC_PURGE audit event. Purging is irreversible; the audit record proves the purge happened.
Troubleshooting: If purge reports freed_bytes: 0 on a system you know has logs, check Settings → Diagnostic Logs → Storage & Retention for the configured paths. The purge only touches those directories.
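The scoping rule explains the freed_bytes: 0 case: only files matching *.jsonl* inside the configured directories are removed. A minimal Python sketch of that matching, run against a temporary directory, follows. The purge function, the rotated file name, and the single-key response are assumptions for illustration; the server's real implementation and full response body are not documented here.

```python
import tempfile
from pathlib import Path


def purge(directories):
    """Delete every *.jsonl* file in the given directories; report bytes freed."""
    freed = 0
    for directory in directories:
        # Materialise the matches first so files are not deleted mid-iteration.
        for path in list(Path(directory).glob("*.jsonl*")):
            if path.is_file():
                freed += path.stat().st_size
                path.unlink()
    return {"freed_bytes": freed}


with tempfile.TemporaryDirectory() as log_dir:
    root = Path(log_dir)
    (root / "auth.login.jsonl").write_text("0123456789")  # 10 bytes
    (root / "auth.login.jsonl.1").write_text("01234")     # hypothetical rotated name
    (root / "notes.txt").write_text("kept")               # not matched by the glob
    first = purge([log_dir])
    second = purge([log_dir])                             # nothing left to delete
    survivors = sorted(p.name for p in root.iterdir())

print(first, second, survivors)
```

A second call reporting zero bytes is expected behaviour; a first call reporting zero usually means the configured paths do not point where the logs actually live.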
6. Retention
Diagnostic logs rotate daily and keep the last 14 days on disk by default. Tune retention under Settings → Diagnostic Logs → Storage & Retention; the tier minimum/maximum matches your audit retention policy.
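As a worked example of the default, the Python sketch below picks which daily files fall outside a 14-day window. It assumes one rotated file per calendar day and that "the last 14 days" means today plus the 13 days before it; the exact boundary and the rotated file naming are not specified on this page, and expired_days is a hypothetical helper.

```python
from datetime import date, timedelta


def expired_days(file_days, today, keep_days=14):
    """Return the dates whose daily files fall outside the retention window."""
    cutoff = today - timedelta(days=keep_days)
    # Anything on or before the cutoff date is outside the last keep_days days.
    return sorted(day for day in file_days if day <= cutoff)


today = date(2026, 4, 30)
days_on_disk = [today - timedelta(days=offset) for offset in range(20)]
expired = expired_days(days_on_disk, today)
kept = [day for day in days_on_disk if day not in expired]
print(len(kept), len(expired), expired[-1].isoformat())
```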
Next steps
- Route logs to Grafana/Splunk: Remote Troubleshooting Runbook
- Send a bundle to support: Security of Support
- Understand what stays in audit vs diagnostic: Audit vs Diagnostic Logs