Diagnostic Logs

Audience: DevOps / Platform engineers, Support leads
Time: ~10 min
Prerequisites:

Diagnostic logs are off by default. Control authors and operators turn them on only when they need to troubleshoot — either on their own or under guidance from Control Core support. Diagnostic logs are separate from the compliance audit trail and do not replace it.

TL;DR — capture, export, purge

# 1) Turn on the subsystems you need (UI: Settings → Diagnostic Logs)
# 2) Reproduce the issue once
# 3) Pull a tarball from the Control Plane (-f makes curl fail on an HTTP error
#    instead of saving the error body as the archive)
curl -fsS -H "Authorization: Bearer $CC_TOKEN" \
  "https://controlplane.example.com/log-management/diagnostics/export?since=2026-04-30T10:00:00Z" \
  -o diagnostic-logs.tar.gz

# 4) Share the bundle with Control Core support (UI: Settings → Diagnostic Logs
#    → Share with support) or inspect locally with standard tools:
mkdir -p rca-input && tar -xzf diagnostic-logs.tar.gz -C rca-input
grep -r '"level":"ERROR"' rca-input/
jq -s 'sort_by(.ts)' rca-input/*.jsonl | less

# 5) Purge when done (confirm=true required)
curl -sS -X POST -H "Authorization: Bearer $CC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"confirm":true}' \
  "https://controlplane.example.com/log-management/diagnostics/purge"

Troubleshooting: If the export returns an empty tarball, check that at least one subsystem is enabled at level INFO or higher. With every subsystem OFF, nothing is written to disk, so there is nothing to export. Open Settings → Diagnostic Logs, enable the relevant subsystem, reproduce the issue, and export again.

1. What gets logged

Each subsystem writes one JSONL line per event under /var/log/controlcore/control-plane/<subsystem>.jsonl (Control Plane) or /var/log/controlcore/bouncer/<subsystem>.jsonl (Bouncer). The schema is uniform across tiers:

{"ts":"2026-04-30T10:00:01Z","level":"ERROR","subsystem":"auth.login","correlation_id":"abc-123","msg":"password rejected"}

Standard fields:

| Field | Notes |
| --- | --- |
| ts | ISO-8601 UTC timestamp |
| level | ERROR, WARN, INFO, DEBUG, or TRACE (never OFF; OFF means no line is written) |
| subsystem | One of the catalog IDs listed below |
| correlation_id / trace_id | Request-scoped identifiers used to join events end-to-end |
| msg | Short human-readable description |
| tenant | Present when tenant routing is enabled |
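
As a rough illustration of the schema, the sketch below parses one line and checks the standard fields. Treating ts, level, subsystem, and msg as always present is an assumption made for this example, and parse_event is a hypothetical helper, not part of any Control Core tooling.

```python
import json
from datetime import datetime, timezone

# Levels that can appear in a written line (OFF never appears).
LEVELS = {"ERROR", "WARN", "INFO", "DEBUG", "TRACE"}
# Assumption: these four fields are present on every line.
REQUIRED = ("ts", "level", "subsystem", "msg")


def parse_event(line: str) -> dict:
    """Parse one JSONL line and check the standard fields."""
    event = json.loads(line)
    missing = [field for field in REQUIRED if field not in event]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if event["level"] not in LEVELS:
        raise ValueError(f"unexpected level: {event['level']!r}")
    # Swap the trailing 'Z' so fromisoformat() also works before Python 3.11.
    event["ts"] = datetime.fromisoformat(event["ts"].replace("Z", "+00:00"))
    return event


sample = (
    '{"ts":"2026-04-30T10:00:01Z","level":"ERROR","subsystem":"auth.login",'
    '"correlation_id":"abc-123","msg":"password rejected"}'
)
event = parse_event(sample)
print(event["subsystem"], event["level"], event["ts"].isoformat())
# auth.login ERROR 2026-04-30T10:00:01+00:00
```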

PII and secrets are scrubbed before each line is written: Bearer tokens, JWTs, email-like strings, password=…, and api_key=… are replaced with [REDACTED]. No prompt bodies, passwords, or JWTs ever hit disk.
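
To make the redaction rule concrete, here is a minimal sketch of pattern-based scrubbing. The regular expressions and the scrub function are illustrative assumptions; the patterns the product actually uses are not documented here, and this sketch replaces the whole key=value pair rather than only the value.

```python
import re

# Illustrative patterns only; not the product's actual rules.
PATTERNS = [
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"),  # Bearer tokens
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*"),  # JWT-shaped strings
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),  # email-like strings
    re.compile(r"\b(?:password|api_key)=\S+"),  # key=value secrets
]


def scrub(text: str) -> str:
    """Replace every match of every pattern with [REDACTED]."""
    for pattern in PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


print(scrub("login failed for alice@example.com password=hunter2"))
# login failed for [REDACTED] [REDACTED]
```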

2. Subsystem catalog

Open Settings → Diagnostic Logs. The subsystem table is grouped into Control Plane and Bouncer sections; each row has an Enable switch and a Level selector (OFF, ERROR, WARN, INFO, DEBUG, TRACE).

| Group | Subsystem | What it captures |
| --- | --- | --- |
| Control Plane | auth.login | Login, password reset, session issuance |
| Control Plane | auth.sso | SSO/SAML/OIDC exchanges |
| Control Plane | pep.registration | Bouncer register/heartbeat/deregister |
| Control Plane | resource.discovery | Protected-resource scans, probes, metadata fetch |
| Control Plane | policy.generation | Rego generation (Visual Builder + SCCA) |
| Control Plane | policy.lifecycle | Control create/update/version/delete |
| Control Plane | policy.activation | Activation, deactivation, environment promotion |
| Control Plane | policy.sync | Bundle build, GitHub sync, policy bundle push timing |
| Control Plane | bridge.server | Embedded policy distribution server and WebSocket subscriptions |
| Control Plane | bridge.sync | Per-bouncer policy sync outcomes |
| Control Plane | pip.datasource | PIP data-source fetch, cache, row counts |
| Control Plane | aigw.compiler | AI Pilot Kubernetes CRD emission |
| Control Plane | audit.writer | Audit persistence and SIEM outbox flush |
| Control Plane | http.requests | Inbound HTTP requests (correlation_id only) |
| Bouncer | bouncer.ext_authz | Envoy ext_authz gRPC decisions |
| Bouncer | bouncer.ext_proc | Envoy ext_proc body inspection and obligation application |
| Bouncer | bouncer.opa | OPA evaluation timing and decision-log plugin |
| Bouncer | bouncer.opal_client | Policy Bridge client reloads and WebSocket reconnects |
| Bouncer | bouncer.envoy_access | Envoy access log aligned to the diagnostic schema |

Troubleshooting: If a subsystem you need is not in the catalog, check your Control Plane version. The catalog ships as a single source of truth and grows as new capabilities are added, so an older version may not list newer subsystems. If the subsystem is still missing, file a support ticket that names it and gives a one-line description of what you want logged.

3. Time-boxed capture

For short troubleshooting windows — reproducing a flake, chasing a transient policy-sync failure — use Time-boxed capture instead of permanently enabling a subsystem.

  1. Select the subsystems you want temporarily loud.
  2. Pick a duration (30 s – 1 h).
  3. Pick a level (default DEBUG).
  4. Click Start capture.
  5. Reproduce the issue.
  6. The capture expires automatically; every participating subsystem reverts to OFF.
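
The steps above amount to a small state model: a set of subsystems, a bounded duration, a level, and an automatic fall back to OFF. The sketch below is a toy model of that behaviour for reasoning about expiry; the Capture class and its fields are hypothetical and do not describe how the Control Plane implements the timer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MIN_DURATION = timedelta(seconds=30)
MAX_DURATION = timedelta(hours=1)


@dataclass
class Capture:
    subsystems: tuple
    started_at: datetime
    duration: timedelta
    level: str = "DEBUG"  # default level named in step 3

    def __post_init__(self):
        if not (MIN_DURATION <= self.duration <= MAX_DURATION):
            raise ValueError("duration must be between 30 s and 1 h")

    @property
    def expires_at(self) -> datetime:
        return self.started_at + self.duration

    def level_at(self, now: datetime) -> str:
        """Level in force for the participating subsystems at time `now`."""
        return self.level if now < self.expires_at else "OFF"


start = datetime(2026, 4, 30, 10, 0, tzinfo=timezone.utc)
capture = Capture(("policy.sync", "bridge.sync"), start, timedelta(minutes=5))
print(capture.level_at(start + timedelta(minutes=2)))  # DEBUG
print(capture.level_at(start + timedelta(minutes=5)))  # OFF
```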

Troubleshooting: If capture status still shows active: true after the expiry window, the Control Plane capture-timer job has not run yet. End the capture manually with a DELETE request to the capture endpoint:

curl -sS -X DELETE -H "Authorization: Bearer $CC_TOKEN" \
  "https://controlplane.example.com/log-management/diagnostics/capture"

Full reference: Remote Troubleshooting Runbook.

4. Export a diagnostic slice

GET /log-management/diagnostics/export?since=<iso>&until=<iso> streams a tar.gz containing every JSONL file clipped to the time window, plus a manifest.json documenting which files and how many events were included.

  • The files inside the archive are plain JSONL with no encoded blobs, so grep, jq, and yq all work directly on the extracted bundle.
  • When sharing with Control Core support, use Settings → Diagnostic Logs → Share diagnostic bundle for a signed, HMAC-verified package. See Security of Support.
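
Because the bundle is plain JSONL, following one request across subsystems takes only a few lines of scripting. The sketch below assumes the archive has already been extracted to a local directory and searches it recursively, since the directory layout inside the tarball is not specified here. load_events and timeline are hypothetical helper names.

```python
import json
from pathlib import Path


def load_events(bundle_dir):
    """Yield every event from every .jsonl file under bundle_dir."""
    for path in sorted(Path(bundle_dir).rglob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)


def timeline(bundle_dir, correlation_id):
    """Return the events that share a correlation_id, ordered by timestamp."""
    events = [
        event
        for event in load_events(bundle_dir)
        if event.get("correlation_id") == correlation_id
    ]
    # Assumption: every ts uses the same ISO-8601 UTC format as the sample
    # line, so plain string comparison sorts chronologically.
    return sorted(events, key=lambda event: event["ts"])
```

For example, timeline("rca-input", "abc-123") would return every event for that request from both the Control Plane and Bouncer files, oldest first. manifest.json is skipped because it does not match *.jsonl.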

5. Purge

POST /log-management/diagnostics/purge (body {"confirm":true}) removes every *.jsonl* file from the configured control-plane and bouncer log directories and writes a DIAGNOSTIC_PURGE audit event. Purging is irreversible; the audit record proves the purge happened.

Troubleshooting: If purge reports freed_bytes: 0 on a system you know has logs, check Settings → Diagnostic Logs → Storage & Retention for the configured paths. The purge only touches those directories.
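
To see what a purge would touch before sending the request, a local dry run can list the files that match the *.jsonl* pattern and total their size. This is a sketch run on the host that holds the logs; purge_preview is a hypothetical helper, it deletes nothing, and matching only the top level of each directory is an assumption.

```python
from pathlib import Path


def purge_preview(log_dirs):
    """List files matching *.jsonl* in each directory and total their bytes."""
    matches = []
    for directory in log_dirs:
        root = Path(directory)
        if not root.is_dir():
            continue  # a wrong or missing path contributes nothing
        matches.extend(p for p in sorted(root.glob("*.jsonl*")) if p.is_file())
    return matches, sum(p.stat().st_size for p in matches)


files, total_bytes = purge_preview([
    "/var/log/controlcore/control-plane",
    "/var/log/controlcore/bouncer",
])
print(len(files), "files,", total_bytes, "bytes")
```

If the preview finds nothing in a directory you expect to hold logs, the configured path under Storage & Retention is the first thing to compare.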

6. Retention

Diagnostic logs rotate daily and keep the last 14 days on disk by default. Tune retention under Settings → Diagnostic Logs → Storage & Retention; the minimum and maximum allowed for your tier match your audit retention policy.
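
As a worked example of the default, the sketch below works out which daily files fall outside a 14-day window. Counting today as day 1 of the window is an assumption about the boundary, and expired is a hypothetical helper rather than the product's rotation logic.

```python
from datetime import date, timedelta


def expired(file_dates, today, keep_days=14):
    """Return the dates whose daily files fall outside the retention window."""
    # With keep_days=14 and today counted as day 1, the oldest kept file is
    # dated today - 13 days; anything on or before the cutoff has expired.
    cutoff = today - timedelta(days=keep_days)
    return [d for d in file_dates if d <= cutoff]


today = date(2026, 5, 14)
dates = [date(2026, 4, 29), date(2026, 4, 30), date(2026, 5, 1), date(2026, 5, 14)]
print(expired(dates, today))
# [datetime.date(2026, 4, 29), datetime.date(2026, 4, 30)]
```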

Next steps