# Load Balancing Configuration for Veza Platform
This directory contains configurations for load balancing across the Veza platform to ensure high availability, scalability, and optimal performance.
## Overview
Veza uses multiple layers of load balancing:
1. **Kubernetes Services**: Internal load balancing using kube-proxy
2. **Ingress Controllers**: Nginx Ingress for HTTP/HTTPS traffic
3. **Cloud Load Balancers**: AWS ALB, GCP Load Balancer, Azure Load Balancer
4. **Application-Level**: Session affinity, health checks, circuit breakers
## Architecture
```
┌───────────────────────────┐
│      Internet Users       │
└─────────────┬─────────────┘
┌─────────────▼─────────────┐
│    Cloud Load Balancer    │
│    (AWS ALB / GCP LB)     │
└─────────────┬─────────────┘
┌─────────────▼─────────────┐
│    Ingress Controller     │
│      (Nginx Ingress)      │
└─────────────┬─────────────┘
┌─────────────▼─────────────┐
│    Kubernetes Services    │
│        (ClusterIP)        │
└─────────────┬─────────────┘
┌─────────────▼─────────────┐
│     Application Pods      │
│    (Backend, Frontend)    │
└───────────────────────────┘
```
## Components
### 1. Kubernetes Services
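Kubernetes Services provide internal load balancing through kube-proxy. The behavior available depends on the kube-proxy mode:
- **Default (iptables mode)**: Picks a ready pod at random for each new connection, which spreads load roughly evenly
- **Session Affinity**: Sticky sessions for stateful applications, set per Service with `sessionAffinity: ClientIP`
- **Least Connections**: Routes to the pod with the fewest connections; available only in IPVS mode (`lc` scheduler)
- **IP Hash**: Hashing based on client IP; available only in IPVS mode (`sh` scheduler)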
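Least-connections and source-hash scheduling are properties of kube-proxy's IPVS mode, not fields on an individual Service. The fragment below is a minimal sketch of that setting, assuming you control the kube-proxy configuration (many managed clusters do not expose it) and that the nodes have the IPVS kernel modules available. The scheduler choice applies cluster-wide.

```yaml
# Sketch only: how this file reaches kube-proxy depends on the installer
# (kubeadm, for example, keeps it in the kube-proxy ConfigMap in kube-system).
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "lc"  # least connection; "rr" is round robin, "sh" is source hashing
```

Because the scheduler is shared by every Service in the cluster, per-application choices are better made at the Ingress layer.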
### 2. Ingress Controller
Nginx Ingress Controller provides:
- SSL/TLS termination
- Path-based routing
- Rate limiting
- WebSocket support
- Load balancing algorithms
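As an illustration of the features listed above, the following sketch combines TLS termination, path-based routing, per-client rate limiting, and longer proxy timeouts for WebSocket connections on one Ingress. The secret name `veza-api-tls`, the ingress class `nginx`, and all numeric values are assumptions for the example, not values taken from the Veza deployment.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"         # redirect HTTP to HTTPS
    nginx.ingress.kubernetes.io/limit-rps: "20"              # requests per second per client IP (assumed value)
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"   # keep idle WebSocket connections open (seconds)
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  ingressClassName: nginx        # assumed class name
  tls:
    - hosts:
        - api.veza.com
      secretName: veza-api-tls   # hypothetical TLS secret
  rules:
    - host: api.veza.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: veza-backend-api
                port:
                  number: 8080
```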
### 3. Cloud Load Balancers
Cloud provider load balancers provide:
- Global load balancing
- Health checks
- SSL/TLS termination
- DDoS protection
- Geographic routing
## Configuration
### Service Load Balancing
#### Basic Service (Round Robin)
```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  sessionAffinity: None  # Default: no stickiness, any ready pod may serve a connection
  ports:
    - port: 8080
      targetPort: 8080
  selector:
    app: veza-backend-api
```
#### Session Affinity (Sticky Sessions)
```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600  # 1 hour
  ports:
    - port: 8080
      targetPort: 8080
  selector:
    app: veza-backend-api
```
### Ingress Load Balancing
#### Nginx Ingress with Load Balancing
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress
  annotations:
    # Load balancing algorithm for this Ingress
    nginx.ingress.kubernetes.io/load-balance: "round_robin"
    # Consistent hashing takes precedence over load-balance when set,
    # so enable it only if hash-based routing is wanted:
    # nginx.ingress.kubernetes.io/upstream-hash-by: "$request_uri"
    # The upstream-keepalive-* settings (connections, timeout, requests) are
    # controller-wide ConfigMap keys in ingress-nginx, not Ingress annotations.
spec:
  rules:
    - host: api.veza.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: veza-backend-api
                port:
                  number: 8080
```
#### Load Balancing Algorithms
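Algorithms for Nginx Ingress (community `ingress-nginx` controller):
- `round_robin`: Default, distributes requests evenly
- `ewma`: Peak EWMA, favors backends with lower recent response latency
- `upstream-hash-by: "$remote_addr"`: Consistent hashing based on client IP (separate annotation, overrides `load-balance`)
- `upstream-hash-by: "$request_uri"`: Consistent hashing based on request URI (separate annotation, overrides `load-balance`)

Early controller releases also accepted `least_conn` and `ip_hash` as `load-balance` values; recent releases document only `round_robin` and `ewma`, so check the documentation for your controller version.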
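Controller-wide defaults, including the upstream keepalive settings, live in the ingress-nginx ConfigMap rather than on individual Ingress objects. The sketch below assumes the ConfigMap is named `ingress-nginx-controller` in the `ingress-nginx` namespace; the real name is whatever the controller's `--configmap` flag points at.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller  # assumed name; must match the controller's --configmap flag
  namespace: ingress-nginx
data:
  load-balance: "round_robin"            # default algorithm for all Ingresses
  upstream-keepalive-connections: "64"   # idle keepalive connections kept per worker
  upstream-keepalive-timeout: "60"       # seconds an idle upstream connection stays open
  upstream-keepalive-requests: "100"     # requests served per upstream connection
```

A per-Ingress `load-balance` annotation still overrides the ConfigMap default for that Ingress.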
### Cloud Load Balancer Configuration
#### AWS Network Load Balancer (NLB)
```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
  annotations:
    # A Service of type LoadBalancer provisions an NLB; ALBs are created
    # from Ingress resources by the AWS Load Balancer Controller.
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "10"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: "5"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "2"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "3"
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: veza-backend-api
```
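An Application Load Balancer is provisioned from an Ingress rather than from a Service. The following is a minimal sketch, assuming the AWS Load Balancer Controller is installed and exposes an ingress class named `alb`; the Ingress name and the health check values are illustrative.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-alb-ingress  # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip   # register pod IPs directly as targets
    alb.ingress.kubernetes.io/healthcheck-path: /health
    alb.ingress.kubernetes.io/healthcheck-interval-seconds: "10"
spec:
  ingressClassName: alb   # assumed class name
  rules:
    - host: api.veza.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: veza-backend-api
                port:
                  number: 8080
```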
#### GCP Load Balancer
```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
  annotations:
    # Requests an internal load balancer; remove for an external one
    cloud.google.com/load-balancer-type: "Internal"
    # BackendConfig is applied when this Service backs a GKE Ingress
    cloud.google.com/backend-config: '{"default": "veza-backend-config"}'
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: veza-backend-api
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: veza-backend-config
spec:
  healthCheck:
    checkIntervalSec: 10
    timeoutSec: 5
    healthyThreshold: 2
    unhealthyThreshold: 3
    type: HTTP
    requestPath: /health
  sessionAffinity:
    affinityType: "CLIENT_IP"
    affinityCookieTtlSec: 3600  # Takes effect only with affinityType "GENERATED_COOKIE"
```
#### Azure Load Balancer
```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "false"
    service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: "/health"
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: veza-backend-api
```
## Health Checks
### Service Health Checks
```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  ports:
    - port: 8080
      targetPort: 8080
  selector:
    app: veza-backend-api
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: veza-backend-api
spec:
  selector:
    matchLabels:
      app: veza-backend-api
  template:
    metadata:
      labels:
        app: veza-backend-api  # Must match the Service selector
    spec:
      containers:
        - name: veza-backend-api
          image: veza-backend-api:latest
          # Restart the container when /health stops responding
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          # Remove the pod from Service endpoints until /ready succeeds
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
```
### Ingress Health Checks
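The community `ingress-nginx` controller does not run active health checks of its own; it routes only to endpoints whose pods pass their readiness probe. The annotations below are not part of its documented annotation set, so confirm that your ingress controller supports them before relying on them.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress
  annotations:
    nginx.ingress.kubernetes.io/health-check: "true"
    nginx.ingress.kubernetes.io/health-check-path: "/health"
    nginx.ingress.kubernetes.io/health-check-interval: "10s"
    nginx.ingress.kubernetes.io/health-check-timeout: "5s"
    nginx.ingress.kubernetes.io/health-check-expected-status: "200"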
```
## Load Balancing Strategies
### 1. Round Robin (Default)
**Use Case**: Stateless applications, API services
**Configuration**:
```yaml
sessionAffinity: None
```
**Pros**:
- Simple and fair distribution
- No state management needed
**Cons**:
- Does not account for per-pod load or request cost
- No session persistence
### 2. Session Affinity (Sticky Sessions)
**Use Case**: Stateful applications, user sessions
**Configuration**:
```yaml
sessionAffinity: ClientIP
sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 3600
```
**Pros**:
- Maintains user sessions
- Better for stateful applications
**Cons**:
- Uneven load distribution
- Sessions pinned to a pod are lost if that pod crashes
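`ClientIP` affinity on a Service keys on the source address that kube-proxy sees, and by default `ingress-nginx` sends traffic straight to pod endpoints instead of through the Service's cluster IP. For traffic that enters through the Ingress, cookie-based affinity at the Ingress layer is therefore usually the more reliable option. A minimal sketch, with a hypothetical cookie name:

```yaml
annotations:
  nginx.ingress.kubernetes.io/affinity: "cookie"
  nginx.ingress.kubernetes.io/session-cookie-name: "veza-affinity"  # hypothetical cookie name
  nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"        # seconds, matches the 1 hour timeout above
```

Clients must accept and return the cookie for this to work, so it suits browsers better than simple API clients.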
### 3. Least Connections
**Use Case**: Long-lived connections, WebSocket
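**Configuration**: Recent `ingress-nginx` releases do not accept `least_conn`; their load-aware option is `ewma`. A true least-connections scheduler is available from kube-proxy in IPVS mode.
```yaml
annotations:
  nginx.ingress.kubernetes.io/load-balance: "ewma"
```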
**Pros**:
- Better load distribution
- Accounts for active connections
**Cons**:
- More complex than round robin
- Requires connection tracking
### 4. Consistent Hashing
**Use Case**: Caching, distributed systems
**Configuration**:
```yaml
annotations:
  nginx.ingress.kubernetes.io/upstream-hash-by: "$request_uri"
```
**Pros**:
- Predictable routing
- Better cache utilization
**Cons**:
- Uneven distribution if hash keys are skewed
- More complex configuration
## Best Practices
### 1. Health Checks
- **Liveness Probe**: Detects and restarts unhealthy pods
- **Readiness Probe**: Removes pods from load balancer if not ready
- **Startup Probe**: Allows slow-starting applications time to initialize
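The Deployment examples in this document define liveness and readiness probes only. A startup probe could be added to the same container along these lines; the thresholds are assumed values and should be sized to the application's real start-up time.

```yaml
# Container-level fragment; sits alongside livenessProbe and readinessProbe
startupProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 10
  failureThreshold: 30  # allows up to 30 x 10s = 300s before the container is restarted
```

Liveness and readiness probes do not run until the startup probe has succeeded, so `initialDelaySeconds` on the liveness probe can then be reduced.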
### 2. Resource Limits
Set appropriate resource limits to prevent resource exhaustion:
```yaml
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
```
### 3. Pod Disruption Budgets
Ensure minimum availability during updates:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: veza-backend-api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: veza-backend-api
```
### 4. Horizontal Pod Autoscaling
Automatically scale based on load:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: veza-backend-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: veza-backend-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
### 5. Circuit Breakers
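Limit traffic to failing backends. The annotations below configure passive failure detection (a backend is skipped for the fail timeout after the given number of failed attempts) rather than a full circuit breaker. Recent `ingress-nginx` releases no longer document them, so check your controller version before relying on them:
```yaml
annotations:
  nginx.ingress.kubernetes.io/upstream-max-fails: "3"
  nginx.ingress.kubernetes.io/upstream-fail-timeout: "30s"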
```
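On current `ingress-nginx` releases, a related form of resilience is retrying a failed request on another endpoint. This is retry behavior rather than circuit breaking, and the values below are assumptions to adjust per service:

```yaml
annotations:
  # Conditions under which the request is passed to the next endpoint
  nginx.ingress.kubernetes.io/proxy-next-upstream: "error timeout http_502 http_503"
  # Maximum number of endpoints tried for one request
  nginx.ingress.kubernetes.io/proxy-next-upstream-tries: "3"
  # Total time budget for retries, in seconds
  nginx.ingress.kubernetes.io/proxy-next-upstream-timeout: "10"
```

NGINX does not retry non-idempotent requests such as `POST` unless `non_idempotent` is added to the condition list, which keeps retries from duplicating writes.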
## Monitoring
### Metrics to Monitor
- **Request Rate**: Requests per second per backend
- **Response Time**: P50, P95, P99 latencies
- **Error Rate**: 4xx and 5xx error rates
- **Connection Count**: Active connections per backend
- **Health Check Status**: Success/failure rates
### Prometheus Queries
```promql
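# Request rate per backend
sum by (service) (rate(nginx_ingress_controller_requests[5m]))
# Response time percentiles (P95 shown; use 0.5 or 0.99 for P50 and P99)
histogram_quantile(0.95, sum by (le) (rate(nginx_ingress_controller_request_duration_seconds_bucket[5m])))
# Error rate (fraction of requests returning 5xx)
sum(rate(nginx_ingress_controller_requests{status=~"5.."}[5m])) / sum(rate(nginx_ingress_controller_requests[5m]))
# Active connections
nginx_ingress_controller_nginx_process_connections{state="active"}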
```
## Troubleshooting
### Uneven Load Distribution
```bash
# Check pod distribution across nodes
kubectl get pods -n veza-production -o wide
# Check service endpoints
kubectl get endpoints veza-backend-api -n veza-production
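# Check load balancer statistics (substitute your controller pod name;
# 10246 is the default status port in current ingress-nginx, 10254 serves /healthz and /metrics)
kubectl exec -it nginx-ingress-controller-pod -n ingress-nginx -- \
  curl http://localhost:10246/nginx_status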
```
### Health Check Failures
```bash
# Check pod health
kubectl get pods -n veza-production
# Check health check logs
kubectl logs deployment/veza-backend-api -n veza-production | grep health
# Test health endpoint manually
kubectl exec -it deployment/veza-backend-api -n veza-production -- \
  curl http://localhost:8080/health
```
### Session Affinity Issues
```bash
# Check session affinity configuration
kubectl get service veza-backend-api -n veza-production -o yaml | grep -A 5 sessionAffinity
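# Test session persistence (ClientIP affinity keys on the source IP, not on the cookie,
# so send all requests from the same client)
for i in {1..10}; do
  curl -H "Cookie: session=test" https://api.veza.com/api/v1/tracks
done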
```
## References
- [Kubernetes Services Documentation](https://kubernetes.io/docs/concepts/services-networking/service/)
- [Nginx Ingress Controller](https://kubernetes.github.io/ingress-nginx/)
- [AWS Load Balancer Controller](https://kubernetes-sigs.github.io/aws-load-balancer-controller/)
- [GCP Load Balancer](https://cloud.google.com/kubernetes-engine/docs/how-to/load-balance-ingress)