🛠️ Admin Troubleshooting Guide

This guide helps administrators troubleshoot common issues in Control Core deployments.

🛠️ Common Issues

Configuration Issues

Issue: Settings Not Saving

Symptoms:

  • Changes in Settings pages don't persist
  • Error messages appear when saving
  • Settings revert after page refresh

Diagnosis:

  1. Check browser console for JavaScript errors
  2. Verify you have admin permissions
  3. Check backend logs for errors
  4. Verify database connection is working

Solutions:

  1. Check Permissions:

    • Verify you're logged in as an admin user
    • Check user role in Settings → Users
    • Ensure user has "admin" role
  2. Clear Browser Cache:

    • Clear browser cache and cookies
    • Or retry in an incognito/private browsing window
    
  3. Check Backend Logs:

    # View backend logs
    docker logs control-plane-api
    # or
    kubectl logs deployment/control-plane-api
    
  4. Verify Database Connection:

    • Check database is running
    • Verify DATABASE_URL is correct
    • Test database connection

Issue: Environment Variables Not Applied

Symptoms:

  • Configuration changes don't take effect
  • Services use default values instead of configured values
  • Environment-specific settings not working

Solutions:

  1. Verify Environment File:

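    # Check .env file exists and is readable (without printing secrets to the terminal)
    test -r .env && echo ".env is readable"
    
    # Verify the variable is set (replace VARIABLE_NAME with the real name)
    grep '^VARIABLE_NAME=' .env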
    
  2. Restart Services:

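    # Recreate containers so they pick up the new values.
    # Note: `docker-compose restart` reuses the existing containers and
    # does not apply changed environment variables.
    docker-compose up -d
    # or
    kubectl rollout restart deployment/control-plane-api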
    
  3. Check Variable Format:

    • No spaces around = sign
    • No quotes unless needed
    • Escape special characters properly
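
The first format rule above can be checked mechanically. The following is a minimal sketch, assuming a plain KEY=value file; the function name lint_env_file is hypothetical and not part of Control Core. It only flags whitespace next to the first = sign and does not validate quoting or escaping.

```shell
# lint_env_file FILE
# Prints one line for every assignment with whitespace next to the first '='.
# Returns 1 if any such line is found, 0 otherwise.
lint_env_file() {
  awk '
    /^[ \t]*(#|$)/ { next }   # skip comment lines and blank lines
    /^[^=]*[ \t]=/ || /^[^=]*=[ \t]/ {
      printf "line %d: whitespace around \"=\": %s\n", NR, $0
      bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}

# Usage:
#   lint_env_file .env
```

A non-zero exit status makes the check easy to wire into a deployment script, for example `lint_env_file .env || exit 1`.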

Connection Issues

Issue: Bouncer Not Connecting to Control Plane

Symptoms:

  • Bouncer shows "Disconnected" in Admin UI
  • Heartbeat failures in logs
  • Policies not syncing to bouncer

Diagnosis:

  1. Check network connectivity
  2. Verify API credentials
  3. Check firewall rules
  4. Review bouncer logs

Solutions:

  1. Test Network Connectivity:

    # From bouncer host
    curl -I https://your-control-plane.company.com/health
    
  2. Verify API Credentials:

    # Check API key format (print only the prefix so the secret stays off the screen)
    printf '%s\n' "$CONTROL_PLANE_API_KEY" | cut -c1-8
    # Should print sk_live_ or sk_test_
    
    # Test credentials
    curl https://your-control-plane.company.com/api/v1/health \
      -H "Authorization: Bearer $CONTROL_PLANE_API_KEY"
    
  3. Check Firewall Rules:

    • Allow outbound HTTPS (port 443)
    • Allow outbound gRPC (port 50051) if enabled
    • Check for proxy blocking connections
  4. Review Bouncer Logs:

    docker logs control-core-bouncer
    # Look for connection errors or authentication failures
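
The network and credential checks above can be told apart by the HTTP status code of the credential test. This is a rough sketch under stated assumptions: the helper name explain_status is hypothetical, and the mapping reflects generic HTTP semantics rather than documented Control Plane behavior.

```shell
# explain_status CODE
# Maps the HTTP status code from the credential test to a likely cause.
explain_status() {
  case "$1" in
    2??)     echo "reachable and credentials accepted" ;;
    401|403) echo "reachable but credentials rejected: check the API key" ;;
    000)     echo "no HTTP response: check DNS, firewall, proxy, or TLS" ;;
    *)       echo "unexpected status $1: review Control Plane logs" ;;
  esac
}

# Usage: -w '%{http_code}' prints only the status code; curl reports 000
# when no HTTP response was received.
#   code=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' \
#     -H "Authorization: Bearer $CONTROL_PLANE_API_KEY" \
#     https://your-control-plane.company.com/api/v1/health)
#   explain_status "$code"
```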
    

Issue: License Server Connection Failed

Symptoms:

  • "Test Connection" fails in Telemetry settings
  • Policy testing disabled
  • Production promotion disabled

Solutions:

  1. Verify License Admin URL:

    • Check URL is correct (no typos)
    • Ensure URL includes https:// protocol
    • Verify URL is accessible from your network
  2. Check API Key:

    • Verify API key is correct (no extra spaces)
    • Check API key hasn't expired
    • Contact support to regenerate if needed
  3. Test Network Connectivity:

    # Test URL accessibility
    curl https://business-admin.controlcore.io/health
    
  4. Check Firewall/Proxy:

    • Verify firewall allows outbound HTTPS
    • Check proxy settings if behind corporate proxy
    • Test URL accessibility from server

Performance Issues

Issue: Slow Policy Evaluation

Symptoms:

  • High latency on authorization decisions
  • Slow dashboard loading
  • Timeout errors

Solutions:

  1. Enable Caching:

    # Enable decision caching
    CACHE_ENABLED=true
    # Cache TTL in seconds
    DECISION_CACHE_TTL=60
    
  2. Optimize Policies:

    • Review policy complexity
    • Use policy templates for common patterns
    • Avoid expensive operations in policies
  3. Check Database Performance:

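    -- Check slow queries (requires the pg_stat_statements extension;
    -- the column is total_exec_time on PostgreSQL 13+, total_time on older versions)
    SELECT query, calls, total_exec_time FROM pg_stat_statements 
    ORDER BY total_exec_time DESC LIMIT 10;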
    
  4. Monitor Resource Usage:

    • Check CPU and memory usage
    • Monitor database connection pool
    • Review Redis cache hit rates
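
For the Redis hit-rate review, the rate can be derived from the keyspace_hits and keyspace_misses counters that Redis reports in its INFO stats section. A minimal sketch, assuming redis-cli can reach the cache instance; cache_hit_rate is a hypothetical helper name.

```shell
# cache_hit_rate
# Reads `redis-cli info stats` output on stdin and prints the hit rate as a percentage.
cache_hit_rate() {
  # INFO output uses CRLF line endings, so strip carriage returns first.
  tr -d '\r' | awk -F: '
    $1 == "keyspace_hits"   { hits = $2 }
    $1 == "keyspace_misses" { misses = $2 }
    END {
      total = hits + misses
      if (total == 0) { print "n/a"; exit }
      printf "%.1f%%\n", 100 * hits / total
    }
  '
}

# Usage:
#   redis-cli info stats | cache_hit_rate
```

The counters are cumulative since the last Redis restart, so compare two readings taken some time apart when judging the effect of a configuration change.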

Issue: High Memory Usage

Symptoms:

  • Services consuming excessive memory
  • Out of memory errors
  • System slowdowns

Solutions:

  1. Review Cache Settings:

    # Reduce cache size if too large (set below the default)
    CACHE_MAX_SIZE=5000
    
  2. Check Database Connections:

    # Reduce connection pool size if needed
    DATABASE_POOL_SIZE=10
    
  3. Monitor Memory Usage:

    # Check container memory usage
    docker stats
    # or
    kubectl top pods
    
  4. Review Log Retention:

    • Reduce log retention period
    • Archive old logs
    • Enable log rotation
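
To spot the containers behind high memory usage, the docker stats output can be filtered by a threshold. A small sketch, assuming Docker deployments; high_mem_containers is a hypothetical helper and the 80 percent threshold is only an example.

```shell
# high_mem_containers THRESHOLD
# Reads "name percent%" lines on stdin and prints those at or above THRESHOLD percent.
high_mem_containers() {
  awk -v limit="$1" '
    {
      pct = $2
      sub(/%$/, "", pct)                       # drop the trailing percent sign
      if (pct + 0 >= limit + 0) print $1, $2
    }
  '
}

# Usage:
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | high_mem_containers 80
```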

Policy Issues

Issue: Policies Not Enforcing

Symptoms:

  • Policies created but not blocking access
  • All requests allowed despite policies
  • Policies not appearing in bouncer

Solutions:

  1. Verify Policy Status:

    • Check policy is enabled
    • Verify policy is in correct environment (Sandbox/Production)
    • Ensure policy is assigned to resource
  2. Check Bouncer Configuration:

    # Verify bouncer is using correct environment (production or sandbox)
    ENVIRONMENT=production
    
    # Check security posture (should be deny-all, not allow-all)
    SECURITY_POSTURE=deny-all
    
  3. Review Policy Logic:

    • Test policy with Control Simulator
    • Verify policy syntax is correct
    • Check policy test cases pass
  4. Check Policy Sync:

    • Verify bouncer is connected
    • Check policy sync status in Admin UI
    • Review bouncer logs for sync errors
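
The bouncer configuration check above can be run against the variables a bouncer actually sees. This is a sketch only: check_bouncer_env is a hypothetical helper, and it assumes the bouncer container exposes the env command.

```shell
# check_bouncer_env EXPECTED_ENV
# Reads KEY=value lines on stdin, reports mismatches, and returns 1 if any are found.
check_bouncer_env() {
  awk -F= -v want="$1" '
    $1 == "ENVIRONMENT"      { env = $2 }
    $1 == "SECURITY_POSTURE" { posture = $2 }
    END {
      if (env != want)           { print "ENVIRONMENT=" env " (expected " want ")"; bad = 1 }
      if (posture != "deny-all") { print "SECURITY_POSTURE=" posture " (expected deny-all)"; bad = 1 }
      if (!bad) print "bouncer environment and posture look correct"
      exit bad + 0
    }
  '
}

# Usage (container name as used elsewhere in this guide):
#   docker exec control-core-bouncer env | check_bouncer_env production
```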

Issue: Policy Testing Disabled

Symptoms:

  • "Test Policy" button is disabled
  • Error message: "Connect to License Server"
  • Control Simulator unavailable

Solutions:

  1. Configure License Server:

    • Navigate to Settings → General → Telemetry
    • Enter License Admin URL
    • Enter API Key
    • Click "Test Connection"
    • Save configuration
  2. Verify Connection Status:

    • Check telemetry health status
    • Review transmission history
    • Ensure connection test succeeds
  3. Check Subscription Status:

    • Verify subscription tier is active
    • Check trial expiration (if on Kickstart plan)
    • Contact support if subscription issues persist

Database Issues

Issue: Database Connection Failed

Symptoms:

  • "Database connection error" messages
  • Services fail to start
  • Data not loading

Solutions:

  1. Verify Database is Running:

    # Check PostgreSQL container
    docker ps | grep postgres
    
    # Check database status
    docker exec postgres pg_isready
    
  2. Check Connection String:

    # Verify DATABASE_URL format (the output includes the password, so avoid shared screens)
    echo "$DATABASE_URL"
    # Format: postgresql://user:password@host:port/database
    
  3. Test Database Connection:

    # Test connection (quote the URL so the shell does not split or expand it)
    psql "$DATABASE_URL" -c "SELECT 1;"
    
  4. Check Credentials:

    • Verify username and password are correct
    • Check database name exists
    • Verify user has proper permissions
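
When checking the connection string, it helps to validate its shape and hide the password before pasting it into a ticket or chat. A minimal sketch, assuming the postgresql://user:password@host:port/database format given above; both helper names are hypothetical.

```shell
# mask_db_url URL
# Prints URL with the password replaced by ****.
mask_db_url() {
  printf '%s\n' "$1" | sed -E 's#^([a-z]+://[^:/@]+:)[^@]*@#\1****@#'
}

# db_url_format_ok URL
# Succeeds if URL looks like postgresql://user:password@host:port/database.
db_url_format_ok() {
  printf '%s\n' "$1" | grep -Eq '^postgresql://[^:/@]+:[^@]+@[^:/@]+:[0-9]+/[^/]+$'
}

# Usage:
#   mask_db_url "$DATABASE_URL"
#   db_url_format_ok "$DATABASE_URL" || echo "DATABASE_URL does not match the expected format"
```

The format check is deliberately strict; a URL with extra options such as a query string would need a looser pattern.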

Issue: Database Performance Issues

Symptoms:

  • Slow queries
  • High database CPU usage
  • Connection pool exhaustion

Solutions:

  1. Optimize Queries:

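    -- List indexes that have never been scanned (candidates for removal)
    SELECT * FROM pg_stat_user_indexes 
    WHERE idx_scan = 0;
    
    -- Find tables with many sequential scans (candidates for a new index)
    SELECT relname, seq_scan, idx_scan FROM pg_stat_user_tables
    ORDER BY seq_scan DESC LIMIT 10;
    
    -- Refresh planner statistics
    ANALYZE;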
    
  2. Adjust Connection Pool:

    # Increase pool size if needed
    DATABASE_POOL_SIZE=30
    DATABASE_MAX_OVERFLOW=40
    
  3. Monitor Database:

    -- Check active connections
    SELECT count(*) FROM pg_stat_activity;
    
    -- Check slow queries
    SELECT * FROM pg_stat_statements 
    ORDER BY mean_exec_time DESC LIMIT 10;
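
Before raising the pool settings, it is worth checking the worst case against the server's max_connections (shown by `SHOW max_connections;`, typically 100 by default in PostgreSQL). The sketch below assumes each service instance can open up to DATABASE_POOL_SIZE + DATABASE_MAX_OVERFLOW connections, which is how pools such as SQLAlchemy's QueuePool behave; whether Control Core follows that model is an assumption, and both helper names are hypothetical.

```shell
# peak_db_connections POOL_SIZE MAX_OVERFLOW INSTANCES
# Prints the worst-case number of connections all instances could open together.
peak_db_connections() {
  echo $(( ($1 + $2) * $3 ))
}

# pool_fits POOL_SIZE MAX_OVERFLOW INSTANCES MAX_CONNECTIONS
# Succeeds if the worst case stays within the server limit.
pool_fits() {
  [ "$(peak_db_connections "$1" "$2" "$3")" -le "$4" ]
}

# Usage: two instances with the values shown above, against a limit of 100.
#   pool_fits 30 40 2 100 || echo "pool settings can exceed max_connections"
```

Under this assumption, two instances with a pool size of 30 and an overflow of 40 could open 140 connections, which is more than a limit of 100 allows.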
    

Authentication Issues

Issue: Cannot Login

Symptoms:

  • Login fails with "Invalid credentials"
  • User account locked
  • Password reset not working

Solutions:

  1. Verify Credentials:

    • Check username and password are correct
    • Verify account is not locked
    • Check account is active
  2. Reset Password:

    • Use password reset functionality
    • Contact admin to reset password
    • Check email for reset link
  3. Check Account Status:

    -- Check user account status
    SELECT username, is_active, is_locked 
    FROM users WHERE username = 'your-username';
    
  4. Review Authentication Logs:

    • Check audit logs for failed login attempts
    • Review authentication errors
    • Check for account lockout policies

Issue: Session Timeout Not Working

Symptoms:

  • Session timeout modal doesn't appear
  • "Stay Logged In" doesn't work
  • Sessions don't expire

Solutions:

  1. Verify Session Timeout Setting:

    • Check Settings → General → Platform
    • Verify session timeout is configured (15-480 minutes)
    • Save settings
  2. Clear Browser Cache:

    • Clear cookies and cache
    • Try incognito/private mode
    • Check browser console for errors
  3. Check Backend Configuration:

    • Verify session management is enabled
    • Check backend logs for session errors
    

Telemetry Issues

Issue: Telemetry Not Sending

Symptoms:

  • Transmission history shows failures
  • No telemetry data in preview
  • License Server not receiving data

Solutions:

  1. Check Telemetry Configuration:

    • Verify telemetry is enabled
    • Check License Admin URL is correct
    • Verify API key is valid
  2. Test Connection:

    • Use "Test Connection" button
    • Review error messages
    • Check network connectivity
  3. Review Transmission History:

    • Check for failed transmissions
    • Review error details
    • Retry failed transmissions
  4. Check Telemetry Data:

    • Verify bouncers are registered
    • Check audit logs are being generated
    • Ensure data aggregation is working

Environment Issues

Issue: Wrong Environment Data Showing

Symptoms:

  • Sandbox policies appearing in Production
  • Production data in Sandbox view
  • Environment selector not working

Solutions:

  1. Verify Environment Selector:

    • Check environment badge in header
    • Switch environments using selector
    • Refresh page after switching
  2. Check Bouncer Environment:

    # Verify bouncer environment setting (sandbox or production)
    ENVIRONMENT=sandbox
    
  3. Review Policy Environment:

    • Check policies are in correct environment
    • Verify policy promotion workflow
    • Check environment-specific resources

Control Map Issues

Issue: Control Map does not show expected policies

Symptoms:

  • Expected policies are not visible on the map
  • Resource appears but policy node is missing

Solutions:

  1. Confirm the selected environment in the page header.
  2. Check that the policy is active in that environment; Control Map only includes policies active in the selected environment.
  3. If policies are grouped, use See More (+5) in Selection Details.
  4. Use Search + Type Filter to narrow policy nodes quickly.

Issue: Control Map navigation opens wrong screen

Symptoms:

  • Clicking a node opens a generic page instead of the specific entity

Solutions:

  1. Verify deep-link parameters exist in URL:
    • bouncer_id, resource_id, policy_id, connection, destination_ref
  2. Refresh destination page and open again from map.
  3. Clear browser cache and retry.
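
To check a copied destination URL for the deep-link parameters listed above, a small helper can report which ones are present. This is a sketch: deeplink_params is a hypothetical name, the example URL is made up, and the guide does not state which parameters each node type requires.

```shell
# deeplink_params URL
# Prints, one per line, the Control Map deep-link parameters found in URL.
deeplink_params() {
  query=${1#*\?}      # keep everything after the first '?'
  query="&$query"     # prefix with '&' so every parameter starts with '&'
  for name in bouncer_id resource_id policy_id connection destination_ref; do
    case "$query" in
      *"&$name="*) echo "$name" ;;
    esac
  done
}

# Usage (hypothetical URL):
#   deeplink_params 'https://your-control-plane.company.com/policies?policy_id=42'
```

An empty result means the link carries none of the parameters, which matches the symptom of landing on a generic page.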

Issue: Map viewport is hard to manage

Symptoms:

  • User gets lost while zooming/panning
  • Graph does not fit screen comfortably

Solutions:

  1. Use Recenter to center the current zoomed canvas.
  2. Use Reset View to restore default position/zoom/filter.
  3. Use Full Screen mode for larger topology reviews.
  4. Press Esc to exit full screen if the on-screen exit control does not respond.

📌 Diagnostic Commands

Health Checks

# Control Plane health
curl https://your-control-plane.company.com/health

# Bouncer health
curl http://localhost:8080/health

# Database health
docker exec postgres pg_isready

# Redis health
redis-cli ping
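
The health checks above can be combined into a single pass/fail summary. A minimal sketch, assuming the same hosts and container names used in this guide; run_check is a hypothetical helper. The curl flag -f makes curl return a non-zero status on HTTP errors, which plain curl does not.

```shell
# run_check NAME COMMAND [ARGS...]
# Runs COMMAND quietly and prints "OK" or "FAIL" followed by NAME.
# Returns 1 if the command fails.
run_check() {
  name=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "OK   $name"
  else
    echo "FAIL $name"
    return 1
  fi
}

# Usage:
#   run_check control-plane curl -fsS --max-time 5 https://your-control-plane.company.com/health
#   run_check bouncer       curl -fsS --max-time 5 http://localhost:8080/health
#   run_check database      docker exec postgres pg_isready
#   run_check redis         redis-cli ping
```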

Log Review

# View Control Plane logs
docker logs control-plane-api --tail 100

# View Bouncer logs
docker logs control-core-bouncer --tail 100

# Search for errors (2>&1 includes output the container wrote to stderr)
docker logs control-plane-api 2>&1 | grep -i error

# Follow logs in real-time
docker logs -f control-plane-api

Database Diagnostics

-- Check database size
SELECT pg_size_pretty(pg_database_size('control_core_db'));

-- Check table sizes
SELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

-- Check active connections
SELECT count(*) FROM pg_stat_activity;

-- Check slow queries
SELECT query, mean_exec_time, calls 
FROM pg_stat_statements 
ORDER BY mean_exec_time DESC LIMIT 10;

📞 Getting Additional Help

Support Resources

When Contacting Support

Include the following information:

  1. Issue Description: Clear description of the problem
  2. Steps to Reproduce: How to reproduce the issue
  3. Error Messages: Full error messages and stack traces
  4. Configuration: Relevant configuration (sanitize secrets)
  5. Logs: Relevant log excerpts
  6. Environment: Deployment environment and version
  7. Timeline: When the issue started
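
For the configuration item, secrets can be redacted before sharing. The sketch below is a heuristic, not an official tool: redact_secrets is a hypothetical helper that hides the value of any variable whose name contains KEY, SECRET, TOKEN, PASSWORD, or DATABASE_URL. Review the output manually before sending it.

```shell
# redact_secrets
# Reads KEY=value lines on stdin and replaces sensitive values with [REDACTED].
redact_secrets() {
  sed -E 's/^([A-Za-z0-9_]*(KEY|SECRET|TOKEN|PASSWORD|DATABASE_URL)[A-Za-z0-9_]*)=.*/\1=[REDACTED]/'
}

# Usage:
#   redact_secrets < .env > env-for-support.txt
```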