Security Incident Response Runbook
This runbook describes the procedure for responding to security incidents, including breaches, unauthorized access, and data exfiltration.
Prerequisites
- Security team contact information
- Incident response team assembled
- Access to logs and monitoring
- Backup and restore procedures ready
Incident Severity Levels
- P0 (Critical): Active breach, data exfiltration, ransomware
- P1 (High): Unauthorized access, privilege escalation
- P2 (Medium): Suspicious activity, potential vulnerability
- P3 (Low): Security alerts, false positives
Immediate Response (First 15 Minutes)
Step 1: Containment
# Isolate affected systems
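# Option A: Scale down affected deployment
# (this deletes the pods, and kubectl logs cannot read logs from deleted pods;
# export the evidence in Step 2 first if time allows)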
kubectl scale deployment veza-backend-api --replicas=0 -n veza-production
# Option B: Block network access
kubectl apply -f k8s/network-policies/block-all.yaml -n veza-production
# Option C: Revoke credentials
# Update secrets immediately
kubectl delete secret veza-secrets -n veza-production
# Restore from Vault with new credentials
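The contents of `k8s/network-policies/block-all.yaml` are not shown in this runbook. As a rough sketch, a policy that denies all ingress and egress for every pod in a namespace usually looks like the manifest below. The policy name `block-all` and the helper `render_block_all_policy` are assumptions for illustration, and the policy only takes effect if the cluster's network plugin enforces NetworkPolicy.

```shell
# Hypothetical helper: prints a default-deny NetworkPolicy manifest to stdout.
# An empty podSelector selects every pod in the namespace; listing both
# policy types with no allow rules denies all ingress and egress traffic.
render_block_all_policy() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: block-all
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
EOF
}

# Usage (requires cluster access):
#   render_block_all_policy | kubectl apply -n veza-production -f -
```

Rendering the manifest from a function keeps the containment step usable even when the repository checkout is not at hand.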
Step 2: Preserve Evidence
# Export logs
kubectl logs deployment/veza-backend-api -n veza-production > /tmp/incident-logs-$(date +%s).log
# Export events
kubectl get events -n veza-production --sort-by='.lastTimestamp' > /tmp/incident-events-$(date +%s).log
# Export pod configurations
kubectl get pods -n veza-production -o yaml > /tmp/incident-pods-$(date +%s).yaml
# Export network policies
kubectl get networkpolicies -n veza-production -o yaml > /tmp/incident-netpol-$(date +%s).yaml
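The four exports above each compute their own timestamp, so the files can end up with different suffixes. One possible way to keep them together is sketched below: a single timestamped directory plus a checksum file to support chain of custody. The helper name `collect_evidence` and the file names are assumptions, and `sha256sum` is assumed to be available (GNU coreutils).

```shell
# Hypothetical helper: collect evidence for one namespace into a single
# timestamped directory and record checksums for chain of custody.
# Usage: collect_evidence <namespace> <deployment> [output-root]
collect_evidence() {
  ce_ns="$1"
  ce_deploy="$2"
  ce_dir="${3:-/tmp}/incident-$(date -u +%Y%m%dT%H%M%SZ)"
  mkdir -p "$ce_dir" || return 1

  # --timestamps prefixes each log line with the time it was written
  kubectl logs "deployment/$ce_deploy" -n "$ce_ns" --all-containers --timestamps \
    > "$ce_dir/logs.log"
  kubectl get events -n "$ce_ns" --sort-by='.lastTimestamp' > "$ce_dir/events.log"
  kubectl get pods -n "$ce_ns" -o yaml > "$ce_dir/pods.yaml"
  kubectl get networkpolicies -n "$ce_ns" -o yaml > "$ce_dir/netpol.yaml"

  # Checksums let reviewers confirm later that the files were not altered
  (cd "$ce_dir" && sha256sum logs.log events.log pods.yaml netpol.yaml > SHA256SUMS)
  echo "$ce_dir"
}

# Usage (requires cluster access):
#   collect_evidence veza-production veza-backend-api
```

The function prints the directory it created, which can be pasted straight into the incident log.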
Step 3: Notify Team
# Send immediate notification
# (Use your notification system)
# PagerDuty, Slack, Email, etc.
# Document incident
echo "INCIDENT: $(date)" >> /tmp/incident-log.txt
echo "Severity: P0" >> /tmp/incident-log.txt
echo "Description: [Description]" >> /tmp/incident-log.txt
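The three `echo` lines above can be wrapped in a small function so every entry has the same shape and the severity is checked against the levels defined earlier. This is only a sketch: the name `log_incident`, the UTC timestamp format, and the extra "Recorded by" field are assumptions, not part of the original procedure.

```shell
# Hypothetical helper: append one structured entry to the incident log.
# Usage: log_incident <P0|P1|P2|P3> <description> [log-file]
log_incident() {
  case "$1" in
    P0|P1|P2|P3) ;;
    *) echo "unknown severity: $1" >&2; return 1 ;;
  esac
  li_file="${3:-/tmp/incident-log.txt}"
  {
    echo "INCIDENT: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    echo "Severity: $1"
    echo "Description: $2"
    echo "Recorded by: $(id -un)"
  } >> "$li_file"
}

# Usage:
#   log_incident P0 "Active breach suspected on veza-backend-api"
```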
Investigation Phase
Step 1: Identify Scope
# Check for unauthorized pods
kubectl get pods --all-namespaces -o wide
# Check for suspicious services
kubectl get svc -n veza-production
# Check for unauthorized ingress
kubectl get ingress -n veza-production
# Check network policies
kubectl get networkpolicies -n veza-production
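Reading through pod listings by eye is slow during an incident. A possible shortcut, sketched below, is to print each pod's container images and flag any that do not start with an expected prefix. The helper `find_unexpected_images` and the prefix `veza-` (taken from the image name `veza-backend-api` used in this runbook) are assumptions; legitimate sidecars or third-party images would also be flagged and need manual review.

```shell
# Hypothetical helper: reads "<pod> <image> [<image>...]" lines on stdin and
# prints every pod/image pair whose image does not start with the given prefix.
# Usage: ... | find_unexpected_images <allowed-image-prefix>
find_unexpected_images() {
  awk -v prefix="$1" '{
    for (i = 2; i <= NF; i++)
      if (index($i, prefix) != 1) print $1, $i
  }'
}

# Usage (requires cluster access):
#   kubectl get pods -n veza-production \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}' \
#     | find_unexpected_images veza-
```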
Step 2: Review Access Logs
# Check API access logs
kubectl logs deployment/veza-backend-api -n veza-production | \
grep -i "unauthorized\|forbidden\|failed\|error"
# Check authentication logs
kubectl logs deployment/veza-backend-api -n veza-production | \
grep -i "login\|auth\|token\|jwt"
# Check database access
kubectl logs postgres-pod -n veza-production | \
grep -i "connection\|login\|failed"
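The `grep` pipelines above print every matching line, which can be a lot of output. As a first pass it may help to count matches per keyword and only then drill into the noisy ones. The sketch below assumes nothing about the log format; the helper name `count_log_keywords` is hypothetical.

```shell
# Hypothetical helper: reads log text on stdin and prints a count of lines
# matching each keyword (case-insensitive), so spikes are easy to spot.
# Usage: ... | count_log_keywords <keyword> [<keyword>...]
count_log_keywords() {
  clk_tmp="$(mktemp)" || return 1
  cat > "$clk_tmp"
  for clk_kw in "$@"; do
    # grep -c prints the number of matching lines (0 when none match)
    printf '%s\t%s\n' "$clk_kw" "$(grep -ci -- "$clk_kw" "$clk_tmp")"
  done
  rm -f "$clk_tmp"
}

# Usage (requires cluster access); --since limits output to recent entries:
#   kubectl logs deployment/veza-backend-api -n veza-production --since=1h \
#     | count_log_keywords unauthorized forbidden failed
```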
Step 3: Check for Data Exfiltration
# Check database access patterns
kubectl exec -it postgres-pod -n veza-production -- \
psql -U veza_user -d veza_db -c "
SELECT * FROM pg_stat_activity
WHERE state = 'active'
ORDER BY query_start DESC;
"
# Check for large data exports
kubectl exec -it postgres-pod -n veza-production -- \
psql -U veza_user -d veza_db -c "
SELECT schemaname, relname, seq_scan, seq_tup_read, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables
ORDER BY seq_tup_read DESC;
"
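Alongside the queries above, it can be useful to see which roles and client addresses currently hold database connections. The following is a minimal sketch reusing the placeholders from this runbook (`postgres-pod`, `veza_user`, `veza_db`); the helper names `pg_query` and `list_db_clients` are assumptions.

```shell
# Hypothetical helper: run one SQL statement inside the database pod.
# "postgres-pod", "veza_user", and "veza_db" are the placeholders used above.
pg_query() {
  kubectl exec postgres-pod -n veza-production -- \
    psql -U veza_user -d veza_db -c "$1"
}

# Connections grouped by role and client address; an unfamiliar address or
# an unusually high count is worth a closer look.
list_db_clients() {
  pg_query "SELECT usename, client_addr, count(*) AS connections
            FROM pg_stat_activity
            GROUP BY usename, client_addr
            ORDER BY connections DESC;"
}

# Usage (requires cluster access):
#   list_db_clients
```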
Remediation Phase
Step 1: Revoke Compromised Credentials
# Revoke JWT secrets
# Update in Vault
vault kv put secret/veza/production/jwt-secret value="$(openssl rand -base64 32)"
# Force External Secrets to sync
kubectl annotate externalsecret veza-secrets \
force-sync=$(date +%s) \
-n veza-production \
--overwrite
# Restart applications
kubectl rollout restart deployment/veza-backend-api -n veza-production
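`kubectl rollout restart` returns as soon as the restart is requested, not when the new pods are ready. A possible follow-up, sketched here, is to wait on `kubectl rollout status` so a failed restart is noticed before moving to the next step. The helper name `restart_and_wait` and the default timeout of 180 seconds are assumptions.

```shell
# Hypothetical helper: restart a deployment and block until the new pods
# are ready, so a failed restart is noticed before moving on.
# Usage: restart_and_wait <deployment> <namespace> [timeout]
restart_and_wait() {
  kubectl rollout restart "deployment/$1" -n "$2" || return 1
  if kubectl rollout status "deployment/$1" -n "$2" --timeout="${3:-180s}"; then
    echo "rollout complete: $1"
  else
    echo "rollout did not finish within ${3:-180s}: $1" >&2
    return 1
  fi
}

# Usage (requires cluster access):
#   restart_and_wait veza-backend-api veza-production
```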
Step 2: Patch Vulnerabilities
# Update vulnerable images
# (pin the patched version; :latest does not record which build was deployed)
kubectl set image deployment/veza-backend-api \
veza-backend-api=veza-backend-api:<patched-tag> \
-n veza-production
# Apply security patches
kubectl apply -f k8s/security-patches/ -n veza-production
Step 3: Restore from Clean Backup
If data was compromised:
# Follow data restore procedure
# See runbooks/data-restore.md
# Restore from backup before incident
kubectl scale deployment veza-backend-api --replicas=0 -n veza-production
# Restore database
# (Follow data-restore.md procedure)
# Restart applications
kubectl scale deployment veza-backend-api --replicas=3 -n veza-production
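The commands above assume the deployment normally runs 3 replicas. If that number may have changed, one option is to record the current count before scaling to zero and reuse it afterwards. The sketch below does that with a small state file; the helper names and the state file are assumptions.

```shell
# Hypothetical helpers: remember the replica count before scaling to zero,
# then scale back to that count once the restore is done.
# Usage: scale_down_saving <deployment> <namespace> <state-file>
scale_down_saving() {
  kubectl get deployment "$1" -n "$2" -o jsonpath='{.spec.replicas}' > "$3" || return 1
  kubectl scale deployment "$1" --replicas=0 -n "$2"
}

# Usage: scale_back_up <deployment> <namespace> <state-file>
scale_back_up() {
  sbu_n="$(cat "$3")"
  # Refuse to act on a missing or non-numeric count
  case "$sbu_n" in
    ''|*[!0-9]*) echo "no valid replica count in $3" >&2; return 1 ;;
  esac
  kubectl scale deployment "$1" --replicas="$sbu_n" -n "$2"
}

# Usage (requires cluster access):
#   scale_down_saving veza-backend-api veza-production /tmp/veza-replicas
#   ... restore the database per data-restore.md ...
#   scale_back_up veza-backend-api veza-production /tmp/veza-replicas
```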
Step 4: Strengthen Security
# Apply network policies
kubectl apply -f k8s/network-policies/ -n veza-production
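# Enable audit logging
# (an audit policy is read by kube-apiserver via --audit-policy-file; unless this
# manifest wraps the policy in another resource, kubectl apply alone will not
# enable it, and managed clusters usually expose audit logging as a provider setting)
kubectl apply -f k8s/audit/audit-policy.yaml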
# Update RBAC
kubectl apply -f k8s/rbac/ -n veza-production
# Enable pod security controls
# (PodSecurityPolicy was removed in Kubernetes 1.25; newer clusters use Pod Security Admission)
kubectl apply -f k8s/pod-security/ -n veza-production
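The manifests under `k8s/pod-security/` are not shown here. On clusters that use Pod Security Admission, pod security levels are set with namespace labels, and a cautious rollout might look like the sketch below: warn and audit first, enforce only once violations are understood. The helper name `set_pod_security_level` is an assumption; the label keys are the standard `pod-security.kubernetes.io/*` ones.

```shell
# Hypothetical helper: label a namespace so Pod Security Admission applies
# the given Pod Security Standard level (privileged, baseline, or restricted).
# Sets warn/audit only unless "enforce" is passed as the third argument.
# Usage: set_pod_security_level <namespace> <level> [enforce]
set_pod_security_level() {
  case "$2" in
    privileged|baseline|restricted) ;;
    *) echo "unknown level: $2" >&2; return 1 ;;
  esac
  if [ "${3:-}" = enforce ]; then
    kubectl label namespace "$1" --overwrite "pod-security.kubernetes.io/enforce=$2"
  else
    kubectl label namespace "$1" --overwrite \
      "pod-security.kubernetes.io/warn=$2" "pod-security.kubernetes.io/audit=$2"
  fi
}

# Usage (requires cluster access):
#   set_pod_security_level veza-production restricted
#   set_pod_security_level veza-production restricted enforce
```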
Recovery Phase
Step 1: Verify System Integrity
# Check all pods are running
kubectl get pods -n veza-production
# Verify health checks
curl -fsS https://api.veza.com/health
# Check for anomalies
kubectl top pods -n veza-production
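A single `curl` can hit the service while pods are still starting. One way to make the health check more robust, sketched below, is to poll until the endpoint answers successfully or a retry budget runs out. The helper name `wait_for_health` and the defaults (10 attempts, 5 seconds apart) are assumptions.

```shell
# Hypothetical helper: poll a health endpoint until it returns a successful
# HTTP status or the attempts run out.
# Usage: wait_for_health <url> [attempts] [delay-seconds]
wait_for_health() {
  wfh_left="${2:-10}"
  while [ "$wfh_left" -gt 0 ]; do
    # -f makes curl exit non-zero on HTTP errors (4xx/5xx)
    if curl -fsS --max-time 5 -o /dev/null "$1"; then
      echo "healthy: $1"
      return 0
    fi
    wfh_left=$((wfh_left - 1))
    [ "$wfh_left" -gt 0 ] && sleep "${3:-5}"
  done
  echo "still unhealthy: $1" >&2
  return 1
}

# Usage:
#   wait_for_health https://api.veza.com/health
```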
Step 2: Monitor for Recurrence
# Set up enhanced monitoring
# (Configure additional alerts)
# Review logs continuously
kubectl logs -f deployment/veza-backend-api -n veza-production
Step 3: Gradual Re-enablement
# Gradually scale up services
kubectl scale deployment veza-backend-api --replicas=1 -n veza-production
# Monitor for issues
# Wait 15 minutes
# Scale to full capacity
kubectl scale deployment veza-backend-api --replicas=3 -n veza-production
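The scale-up steps above can be sketched as a loop that applies each replica count, waits for the rollout to settle, and then pauses for observation. The helper name `gradual_scale` and the 300-second rollout timeout are assumptions; the pause and the replica steps come from the procedure above.

```shell
# Hypothetical helper: step a deployment through increasing replica counts,
# waiting for each rollout and pausing before the next step.
# Usage: gradual_scale <deployment> <namespace> <pause-seconds> <count>...
gradual_scale() {
  gs_deploy="$1"; gs_ns="$2"; gs_pause="$3"
  shift 3
  for gs_n in "$@"; do
    kubectl scale deployment "$gs_deploy" --replicas="$gs_n" -n "$gs_ns" || return 1
    kubectl rollout status "deployment/$gs_deploy" -n "$gs_ns" --timeout=300s || return 1
    echo "running with $gs_n replica(s); observing for ${gs_pause}s"
    sleep "$gs_pause"
  done
}

# Usage matching the steps above (15 minutes = 900 seconds):
#   gradual_scale veza-backend-api veza-production 900 1 3
```

The function stops at the first failed step, leaving the deployment at the last replica count that rolled out cleanly.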
Post-Incident Tasks
Immediate (First 24 Hours)
- Document Incident
  - Timeline of events
  - Actions taken
  - Systems affected
  - Data compromised (if any)
- Notify Stakeholders
  - Internal team
  - Management
  - Legal (if required)
  - Customers (if data breach)
- Preserve Evidence
  - Secure all logs
  - Document all actions
  - Maintain chain of custody
Short Term (First Week)
- Root Cause Analysis
  - Identify vulnerability
  - Determine attack vector
  - Assess impact
- Remediation
  - Patch vulnerabilities
  - Update security policies
  - Implement additional controls
- Communication
  - Internal post-mortem
  - External communication (if needed)
  - Regulatory notifications (if required)
Long Term (Ongoing)
- Prevention
  - Security training
  - Regular security audits
  - Penetration testing
  - Security monitoring improvements
- Documentation
  - Update security procedures
  - Update incident response plan
  - Document lessons learned
Verification Checklist
- Incident contained
- Evidence preserved
- Compromised credentials revoked
- Vulnerabilities patched
- Systems restored
- Monitoring enhanced
- Documentation updated
- Stakeholders notified
- Post-mortem scheduled
Communication Templates
Internal Notification
Subject: [SECURITY INCIDENT] Veza Platform - <Description>
Severity: P0/P1/P2/P3
Status: Contained/Investigating/Resolved
Impact: <Description>
Actions Taken: <List>
Next Steps: <List>
Incident response team is actively working on resolution.
External Notification (if required)
Subject: Security Incident Notification
We are writing to inform you of a security incident that may have affected your account.
What happened: <Description>
What we're doing: <Actions>
What you should do: <Recommendations>
Timeline: <Dates>
We take security seriously and are committed to protecting your data.