Auto-Classification
Audience: Platform engineers, governance owners · Time: ~6 min read · Prerequisites: Familiarity with Resource Enrichment.
Auto-classification proposes enrichment values for a resource using URL, name, and business-context heuristics. The classifier runs without an LLM call by default; SCCA fallback is opt-in. The proposed values feed directly into the control suggestions surfaced by the Policy Builder, so getting them right shortens the path from a newly registered resource to an enforced control.
Endpoints
| Endpoint | What it does |
|---|---|
| POST /resources/{id}/auto-classify | Read-only. Returns a ProposedEnrichment payload. The operator confirms before anything is written. |
| POST /resources/{id}/apply-classification | Write. Persists an operator-confirmed payload, sets classification_source (default manual if not provided), updates last_classified_at, writes a RESOURCE_UPDATED audit row. |
| POST /resources/auto-classify-bulk | Batch write. Heuristic-classifies many resources at once. Operator-set values are preserved. |
ProposedEnrichment shape
```json
{
  "resource_kind": "llm_endpoint",
  "ai_provider": "openai",
  "ai_model_family": "gpt-4*",
  "mcp_protocol_version": null,
  "agent_capabilities": [],
  "pii_categories": ["emails"],
  "egress_destinations": ["api.openai.com"],
  "suggested_data_classification": "internal",
  "suggested_compliance_tags": ["GDPR"],
  "confidence": 0.55,
  "source": "heuristic",
  "rationale": [
    "resource_kind=llm_endpoint from URL/name heuristic",
    "ai_provider=openai",
    "ai_model_family=gpt-4*",
    "pii_categories=emails",
    "suggested_data_classification=internal"
  ]
}
```
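A client may want to sanity-check a proposal before showing it to an operator. The following is a minimal sketch, not the service's own validation: the function name `validate_proposal` is hypothetical, and the accepted `source` values are limited to the two this page mentions (`heuristic` and `scca`).

```python
from typing import List

# The two source values mentioned on this page; the real API may allow others.
ALLOWED_SOURCES = {"heuristic", "scca"}

# Fields that appear as JSON arrays in the example payload.
LIST_FIELDS = (
    "agent_capabilities",
    "pii_categories",
    "egress_destinations",
    "suggested_compliance_tags",
    "rationale",
)


def validate_proposal(proposal: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    conf = proposal.get("confidence")
    is_number = isinstance(conf, (int, float)) and not isinstance(conf, bool)
    if not is_number or not 0.0 <= conf <= 1.0:
        problems.append("confidence must be a number between 0.0 and 1.0")
    if proposal.get("source") not in ALLOWED_SOURCES:
        problems.append("source must be one of: " + ", ".join(sorted(ALLOWED_SOURCES)))
    for field in LIST_FIELDS:
        if not isinstance(proposal.get(field), list):
            problems.append(field + " must be a list")
    return problems
```

An empty result means the payload has the expected shape; it says nothing about whether the proposed values are right, which remains the operator's call.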
How the heuristic decides
The classifier scans the lower-cased concatenation of name + url + original_host + business_context against fixed substring tables:
| Signal (substring) | Field it sets |
|---|---|
| /v1/chat/completions, /inference, /generate | resource_kind = llm_endpoint |
| /mcp/, mcp., model-context-protocol | resource_kind = mcp_server |
| /agent/, /copilot | resource_kind = agent |
| /rag/, /retrieval, /vector | resource_kind = rag_index |
| openai, anthropic, bedrock, vertex, azure_openai | ai_provider (only when kind is AI) |
| email, card, medical, ssn | pii_categories |
| /upload, /run, /exec, /browse, /memory | agent_capabilities |
The first kind match wins; api is the safe default when nothing matches. Confidence accumulates from the strength of evidence, capped at 1.0.
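The matching rule can be sketched in a few lines. This is an illustration under stated assumptions, not the shipped classifier: the function name `classify`, the space used to join the fields, the reading of "kind is AI" as "any kind other than api", and both confidence weights are hypothetical. Only the substring tables and the first-match-wins and cap-at-1.0 rules come from this page.

```python
# Substring tables copied from the signal table above. Order matters because
# the first kind match wins.
KIND_SIGNALS = [
    ("llm_endpoint", ("/v1/chat/completions", "/inference", "/generate")),
    ("mcp_server", ("/mcp/", "mcp.", "model-context-protocol")),
    ("agent", ("/agent/", "/copilot")),
    ("rag_index", ("/rag/", "/retrieval", "/vector")),
]
PROVIDER_SIGNALS = ("openai", "anthropic", "bedrock", "vertex", "azure_openai")

# Hypothetical weights: the real scoring is not documented on this page.
KIND_WEIGHT = 0.4
PROVIDER_WEIGHT = 0.15


def classify(name, url, original_host="", business_context=""):
    """Propose resource_kind and ai_provider from fixed substring tables."""
    # Lower-cased concatenation of the four fields (joined with spaces here).
    haystack = " ".join([name, url, original_host, business_context]).lower()

    kind = "api"  # safe default when nothing matches
    confidence = 0.0
    rationale = []

    for candidate, needles in KIND_SIGNALS:
        if any(needle in haystack for needle in needles):
            kind = candidate
            confidence += KIND_WEIGHT
            rationale.append("resource_kind=" + candidate + " from URL/name heuristic")
            break  # first kind match wins

    provider = None
    if kind != "api":  # assumption: every non-default kind counts as an AI kind
        for candidate in PROVIDER_SIGNALS:
            if candidate in haystack:
                provider = candidate
                confidence += PROVIDER_WEIGHT
                rationale.append("ai_provider=" + candidate)
                break

    return {
        "resource_kind": kind,
        "ai_provider": provider,
        "confidence": min(confidence, 1.0),  # capped at 1.0
        "source": "heuristic",
        "rationale": rationale,
    }
```

Because the tables are plain substrings, a resource whose name or URL follows none of these conventions falls through to `api` with zero confidence, which is the case the troubleshooting note below addresses.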
Troubleshooting: Confidence is too low (< 0.4)? That just means the heuristic didn't find strong signals. Either (a) accept the proposal and refine in the Enrich modal, or (b) opt into SCCA fallback (next section).
SCCA fallback (optional)
When the feature flag flags.resources.scca_enrichment is on and heuristic confidence falls below a threshold, the classifier can call SCCA's LLM service with a strict, schema-validated prompt. The output replaces the heuristic proposal, and source becomes scca so you can audit it later.
This is off by default. Enable it in Settings → Feature Flags. SCCA fallback never runs without operator opt-in because it adds latency and an LLM call per classification.
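The gating described above amounts to two conditions that must both hold. Here is a hedged sketch: the function names, the flag lookup through a plain dict, the `scca_call` callable, and the 0.4 threshold are all assumptions made for illustration (the page does not state the actual threshold value).

```python
SCCA_FLAG = "flags.resources.scca_enrichment"


def should_use_scca(flags, confidence, threshold=0.4):
    """True only when the flag is on AND heuristic confidence is below threshold.

    The 0.4 default is a placeholder, not a documented value.
    """
    return bool(flags.get(SCCA_FLAG, False)) and confidence < threshold


def propose(heuristic, flags, scca_call=None, threshold=0.4):
    """Return the heuristic proposal, or an SCCA replacement when gated in.

    scca_call is a hypothetical callable that takes the heuristic proposal
    and returns a replacement dict.
    """
    if scca_call is None:
        return heuristic
    if not should_use_scca(flags, heuristic.get("confidence", 0.0), threshold):
        return heuristic
    proposal = dict(scca_call(heuristic))
    proposal["source"] = "scca"  # marks the proposal for later audit
    return proposal
```

With the flag absent or off, `propose` returns the heuristic result untouched, which matches the off-by-default behavior.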
What the classifier never does
- Never overwrites operator-set values. In bulk mode, only empty cells are filled.
- Never writes during auto-classify (it's a pure proposal).
- Never includes PII in prompts when SCCA fallback runs. owner_email and similar fields are excluded.
- Never branches on tenant ID, vendor names beyond the public substring tables, or framework paths. The classifier is generic; the substrings are public conventions.
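The "only empty cells are filled" rule for bulk mode can be illustrated with a small merge function. This is a sketch under assumptions: the function names are hypothetical, and "empty" is taken to mean `None`, an empty string, or an empty list, which the page does not spell out.

```python
def is_empty(value):
    """Treat None, empty string, and empty list as 'not set by an operator'."""
    return value is None or value == "" or value == []


def merge_preserving_operator_values(current, proposed):
    """Fill only the empty fields of current with values from proposed."""
    merged = dict(current)  # never mutate the stored record in place
    for key, value in proposed.items():
        if is_empty(merged.get(key)):
            merged[key] = value
    return merged
```

A field an operator has set to any non-empty value, such as `resource_kind = agent`, survives the merge even when the heuristic proposes something else.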
Workflow: classify a single resource
```bash
# 1. Get the proposal
curl -X POST -H "Authorization: Bearer $TOKEN" \
  https://controlplane.example.com/api/resources/42/auto-classify

# 2. Review and adjust the JSON, then apply
curl -X POST -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "resource_kind":"llm_endpoint",
    "ai_provider":"openai",
    "data_classification":"confidential",
    "compliance_tags":["GDPR","PCI-DSS"],
    "pii_categories":["emails","card_numbers"],
    "classification_source":"manual"
  }' \
  https://controlplane.example.com/api/resources/42/apply-classification
```
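Step 2 involves turning a reviewed proposal into the body sent to apply-classification. The sketch below shows one way to script that step. It rests on assumptions inferred only from the two example payloads on this page: that the `suggested_` prefix is dropped on apply, and that `confidence`, `source`, and `rationale` are not sent back. The function name `build_apply_payload` is hypothetical.

```python
# Assumed renames, inferred from comparing the two example payloads.
RENAMES = {
    "suggested_data_classification": "data_classification",
    "suggested_compliance_tags": "compliance_tags",
}
# Assumed proposal-only metadata that is not part of the applied payload.
PROPOSAL_ONLY = {"confidence", "source", "rationale"}


def build_apply_payload(proposal, overrides=None, classification_source="manual"):
    """Build an apply-classification body from a proposal plus operator edits."""
    payload = {}
    for key, value in proposal.items():
        if key in PROPOSAL_ONLY:
            continue
        payload[RENAMES.get(key, key)] = value
    payload.update(overrides or {})  # operator adjustments win over the proposal
    payload["classification_source"] = classification_source
    return payload
```

Passing the operator's edits as `overrides` keeps the review step explicit: the proposal is never applied without a deliberate call that states which values were changed.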
Troubleshooting: apply-classification returns 404? Check that the resource ID exists in the current environment. Resources are environment-scoped: a sandbox ID won't resolve in production.