# Application Rollback Runbook

This runbook describes the procedure for rolling back a failed application deployment.

## Prerequisites

- Access to the Kubernetes cluster
- `kubectl` configured for that cluster
- Previous deployment version available

## Detection

### Automatic Detection

Health checks will automatically detect:

- Application crashes
- High error rates
- Slow response times
- Failed readiness probes

### Manual Detection

```bash
# Check pod status
kubectl get pods -n veza-production -l app=veza-backend-api

# Check deployment status
kubectl rollout status deployment/veza-backend-api -n veza-production

# Check application logs (Ctrl+C to stop following)
kubectl logs -f deployment/veza-backend-api -n veza-production

# Check metrics
kubectl top pods -n veza-production
```

## Rollback Procedure

### Step 1: Verify Issue

```bash
# Check current deployment
kubectl get deployment veza-backend-api -n veza-production -o yaml

# Check recent events
kubectl get events -n veza-production --sort-by='.lastTimestamp' | tail -20

# Verify health endpoint
curl https://api.veza.com/health
```

### Step 2: Check Rollback History

```bash
# View deployment history
kubectl rollout history deployment/veza-backend-api -n veza-production

# View details of a previous revision
# (<revision> is a placeholder; use a number from the history output)
kubectl rollout history deployment/veza-backend-api -n veza-production --revision=<revision>
```

### Step 3: Execute Rollback

#### Option A: Rollback to Previous Version

```bash
# Rollback to previous version
kubectl rollout undo deployment/veza-backend-api -n veza-production

# Monitor rollback progress
kubectl rollout status deployment/veza-backend-api -n veza-production
```

#### Option B: Rollback to Specific Revision

```bash
# Rollback to specific revision
# (<revision> is a placeholder; use a number from the history output)
kubectl rollout undo deployment/veza-backend-api -n veza-production --to-revision=<revision>

# Monitor rollback progress
kubectl rollout status deployment/veza-backend-api -n veza-production
```

### Step 4: Verify Rollback

```bash
# Check pod status
kubectl get pods -n veza-production -l app=veza-backend-api

# Check deployment status
kubectl get deployment veza-backend-api -n veza-production

# Verify pods are ready
kubectl wait --for=condition=ready pod \
  -l app=veza-backend-api \
  -n veza-production \
  --timeout=300s

# Check application logs (Ctrl+C to stop following)
kubectl logs -f deployment/veza-backend-api -n veza-production

# Test health endpoint
curl https://api.veza.com/health

# Test critical endpoints
curl https://api.veza.com/api/v1/tracks
```

### Step 5: Verify Application Functionality

```bash
# Run smoke tests
# (Use your application's test suite)

# Check metrics
kubectl top pods -n veza-production

# Monitor error rates
# (Check monitoring dashboard)
```

## Multi-Service Rollback

If multiple services need rollback:

```bash
# Rollback backend API (handles chat since v0.502 merge)
kubectl rollout undo deployment/veza-backend-api -n veza-production

# Rollback frontend
kubectl rollout undo deployment/veza-frontend -n veza-production

# Rollback stream server (if media layer affected)
kubectl rollout undo deployment/veza-stream-server -n veza-production

# Monitor all rollbacks
kubectl rollout status deployment/veza-backend-api -n veza-production
kubectl rollout status deployment/veza-frontend -n veza-production
kubectl rollout status deployment/veza-stream-server -n veza-production
```

## Database Migration Rollback

If the rollback includes database changes:

```bash
# 1. Stop application
kubectl scale deployment veza-backend-api --replicas=0 -n veza-production

# 2. Rollback database migration
# (Use your migration tool)
# Example with migrate tool:
kubectl run migrate-rollback --rm -it --image=veza-backend-api:previous \
  --restart=Never \
  --env="DATABASE_URL=$DATABASE_URL" \
  -- migrate -path /migrations -database "$DATABASE_URL" down 1

# 3. Rollback application
kubectl rollout undo deployment/veza-backend-api -n veza-production

# 4. Restart application
kubectl scale deployment veza-backend-api --replicas=3 -n veza-production
```

## Verification Checklist

- [ ] Previous version identified
- [ ] Rollback executed
- [ ] Pods are running and ready
- [ ] Health checks passing
- [ ] Application logs show no errors
- [ ] Critical endpoints responding
- [ ] Metrics normalized
- [ ] Users can access platform
- [ ] Monitoring alerts cleared

## Troubleshooting

### Rollback Fails

```bash
# Check deployment status
kubectl describe deployment veza-backend-api -n veza-production

# Check pod events (<pod-name> is a placeholder for the affected pod)
kubectl describe pod <pod-name> -n veza-production

# Check image availability
kubectl get pod <pod-name> -n veza-production -o jsonpath='{.spec.containers[0].image}'

# If the image is missing, you may need to rebuild it or use a different image
```

### Pods Not Starting

```bash
# Check pod logs (<pod-name> is a placeholder for the affected pod)
kubectl logs <pod-name> -n veza-production

# Check resource constraints
kubectl describe pod <pod-name> -n veza-production | grep -A 5 "Limits\|Requests"

# Check node resources
kubectl top nodes
```

### Application Still Failing After Rollback

```bash
# Verify correct version is deployed
kubectl get deployment veza-backend-api -n veza-production -o jsonpath='{.spec.template.spec.containers[0].image}'

# Check if the issue is in the previous version too
# (<pod-name> is a placeholder for the affected pod)
kubectl logs <pod-name> -n veza-production

# You may need to roll back further or investigate the root cause
```

## Post-Rollback Tasks

1. **Investigate Root Cause**
   - Review deployment logs
   - Check application logs
   - Identify what caused the failure

2. **Fix Issue**
   - Address root cause
   - Test fix in staging
   - Prepare new deployment

3. **Document Incident**
   - Document rollback procedure
   - Note any issues encountered
   - Update deployment process if needed

4. **Notify Stakeholders**
   - Send incident report
   - Update status page
   - Schedule post-mortem if needed

## Prevention

To reduce the need for future rollbacks:

- **Automated Testing**: Run the full test suite before deployment
- **Staged Rollouts**: Use canary or blue-green deployments
- **Health Checks**: Provide comprehensive health check endpoints
- **Monitoring**: Set up real-time monitoring and alerting
- **Gradual Rollout**: Deploy to a small percentage of traffic first

## References

- [Kubernetes Rollout Documentation](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#rolling-back-a-deployment)
- [Deployment Best Practices](../README.md)
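
## Appendix: Scripted Rollback Sketch

The rollback and verification steps in this runbook (undo the rollout, wait for it to finish, then check the health endpoint) could be chained in a small helper script. The following is a minimal sketch, not an established part of the deployment tooling. The function names `wait_for_health` and `rollback_and_verify`, the `NAMESPACE` and `HEALTH_URL` variables, the retry defaults, and the assumption that a healthy service answers `/health` with HTTP 200 are all illustrative and should be adapted to the actual environment.

```bash
#!/usr/bin/env bash
# Hypothetical helper functions; names and defaults are illustrative only.

NAMESPACE="${NAMESPACE:-veza-production}"
HEALTH_URL="${HEALTH_URL:-https://api.veza.com/health}"

# Poll the health endpoint until it returns HTTP 200 (assumed to mean
# "healthy") or the attempts run out.
# Usage: wait_for_health [max_attempts] [delay_seconds]
wait_for_health() {
  local max_attempts="${1:-10}"
  local delay="${2:-5}"
  local attempt=1
  local status
  while [ "$attempt" -le "$max_attempts" ]; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    status="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$HEALTH_URL" || true)"
    if [ "$status" = "200" ]; then
      echo "healthy after ${attempt} attempt(s)"
      return 0
    fi
    echo "attempt ${attempt}/${max_attempts}: status=${status:-none}" >&2
    # Skip the pause after the final attempt
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Undo the latest rollout of one deployment, wait for the rollout to
# complete, then confirm the health endpoint responds.
# Usage: rollback_and_verify <deployment-name>
rollback_and_verify() {
  local deployment="$1"
  kubectl rollout undo "deployment/${deployment}" -n "$NAMESPACE" &&
    kubectl rollout status "deployment/${deployment}" -n "$NAMESPACE" --timeout=300s &&
    wait_for_health
}
```

After sourcing the file, `rollback_and_verify veza-backend-api` would run the Option A rollback and return a non-zero exit code if any stage fails, which makes it straightforward to call from an incident script. The functions only define behavior and run nothing on their own, so the manual steps above remain the reference procedure.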