
# Load Balancing Configuration for Veza Platform

This directory contains configurations for load balancing across the Veza platform to ensure high availability, scalability, and optimal performance.

## Overview

Veza uses multiple layers of load balancing:

1. **Kubernetes Services**: Internal load balancing using kube-proxy
2. **Ingress Controllers**: Nginx Ingress for HTTP/HTTPS traffic
3. **Cloud Load Balancers**: AWS ALB, GCP Load Balancer, Azure Load Balancer
4. **Application-Level**: Session affinity, health checks, circuit breakers

## Architecture

```
┌─────────────────────────────────────────┐
│             Internet Users              │
└────────────────────┬────────────────────┘
                     │
        ┌────────────▼─────────────┐
        │  Cloud Load Balancer     │
        │  (AWS ALB / GCP LB)      │
        └────────────┬─────────────┘
                     │
        ┌────────────▼─────────────┐
        │  Ingress Controller      │
        │  (Nginx Ingress)         │
        └────────────┬─────────────┘
                     │
        ┌────────────▼─────────────┐
        │  Kubernetes Services     │
        │  (ClusterIP)             │
        └────────────┬─────────────┘
                     │
        ┌────────────▼─────────────┐
        │  Application Pods        │
        │  (Backend, Frontend)     │
        └──────────────────────────┘
```

## Components

### 1. Kubernetes Services

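Kubernetes Services provide internal load balancing through kube-proxy. The Service spec itself exposes only session affinity; how connections are spread across pods depends on the kube-proxy mode:

- **Random** (iptables mode, the usual default): Sends each new connection to a randomly chosen ready pod, which evens out over many connections
- **Round Robin** (IPVS mode default): Distributes connections evenly
- **Session Affinity**: Sticky sessions for stateful applications (`sessionAffinity: ClientIP`)
- **Least Connections** (IPVS mode only): Routes to pod with fewest connections
- **Source Hashing** (IPVS mode only): Consistent hashing based on client IP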
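
The scheduler used for options such as least connections is set in the kube-proxy configuration, not in a Service manifest, and it applies to every Service in the cluster. A minimal sketch, assuming kube-proxy runs in IPVS mode and that you are able to manage its configuration (many managed Kubernetes offerings do not allow this):

```yaml
# Sketch of a kube-proxy configuration fragment; cluster-wide, not per Service.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "lc"   # least connection; "rr" = round robin, "sh" = source hashing
```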

### 2. Ingress Controller

Nginx Ingress Controller provides:
- SSL/TLS termination
- Path-based routing
- Rate limiting
- WebSocket support
- Load balancing algorithms

### 3. Cloud Load Balancers

Cloud provider load balancers provide:
- Global load balancing
- Health checks
- SSL/TLS termination
- DDoS protection
- Geographic routing

## Configuration

### Service Load Balancing

#### Basic Service (No Session Affinity)

```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  sessionAffinity: None  # Default: no stickiness; kube-proxy spreads connections across pods
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    app: veza-backend-api
```

#### Session Affinity (Sticky Sessions)

```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600  # 1 hour
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    app: veza-backend-api
```
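
Note that ingress-nginx sends traffic straight to pod endpoints rather than through the Service's cluster IP, so `sessionAffinity: ClientIP` generally does not pin requests that arrive through the Ingress. Cookie-based affinity on the Ingress is the usual alternative. A minimal sketch, reusing the host and Service names from this README; the Ingress name, cookie name, and `ingressClassName` value are assumptions:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress-sticky            # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "VEZA_AFFINITY"  # assumed cookie name
    nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"        # 1 hour, matching the Service example
spec:
  ingressClassName: nginx              # assumes the controller's IngressClass is named "nginx"
  rules:
  - host: api.veza.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: veza-backend-api
            port:
              number: 8080
```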

### Ingress Load Balancing

#### Nginx Ingress with Load Balancing

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress
  annotations:
    # upstream-hash-by enables consistent hashing and takes precedence over load-balance
    nginx.ingress.kubernetes.io/upstream-hash-by: "$request_uri"
    # Used only when upstream-hash-by is unset; valid values are "round_robin" and "ewma"
    # nginx.ingress.kubernetes.io/load-balance: "round_robin"
    # Keepalive tuning (upstream-keepalive-connections: "64", upstream-keepalive-timeout: "60",
    # upstream-keepalive-requests: "100") belongs in the controller ConfigMap, not in annotations
spec:
  rules:
  - host: api.veza.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: veza-backend-api
            port:
              number: 8080
```

#### Load Balancing Algorithms

Available algorithms for Nginx Ingress:

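- `round_robin`: Default, distributes requests evenly
- `ewma`: Peak EWMA, favors backends with the lowest recent latency
- Consistent hashing: Configured with the `upstream-hash-by` annotation instead, for example `"$request_uri"` to hash on the request URI or `"$remote_addr"` to hash on the client IP

Current ingress-nginx releases do not accept `least_conn` or `ip_hash` as `load-balance` values.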
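
Upstream keepalive tuning and the default balancing algorithm are controller-wide settings that live in the ingress-nginx ConfigMap. A sketch, assuming the controller was installed with the ConfigMap name `ingress-nginx-controller` in the `ingress-nginx` namespace (both names are assumptions; check the controller's `--configmap` flag):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # assumed; must match the controller's --configmap flag
  namespace: ingress-nginx         # assumed namespace
data:
  load-balance: "round_robin"              # global default; per-Ingress annotations override it
  upstream-keepalive-connections: "64"
  upstream-keepalive-timeout: "60"         # seconds
  upstream-keepalive-requests: "100"
```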

### Cloud Load Balancer Configuration

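#### AWS Network Load Balancer (NLB)

A Service of type `LoadBalancer` provisions a Network Load Balancer (or a Classic Load Balancer), not an ALB. ALBs are created from Ingress resources by the AWS Load Balancer Controller.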

```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "10"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: "5"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "2"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "3"
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: veza-backend-api
```
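
The Service above provisions a Network Load Balancer (its type annotation is `nlb`). An ALB is instead provisioned from an Ingress resource by the AWS Load Balancer Controller. A minimal sketch, assuming that controller is installed and its IngressClass is named `alb`; the Ingress name is hypothetical, and the backend refers to the ClusterIP Service on port 8080 shown earlier in this README:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-alb-ingress               # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip          # register pod IPs directly as targets
    alb.ingress.kubernetes.io/healthcheck-path: /health
    alb.ingress.kubernetes.io/healthcheck-interval-seconds: "10"
    alb.ingress.kubernetes.io/healthcheck-timeout-seconds: "5"
spec:
  ingressClassName: alb                # assumed IngressClass name
  rules:
  - host: api.veza.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: veza-backend-api
            port:
              number: 8080
```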

#### GCP Load Balancer

```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
  annotations:
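    # BackendConfig is honored only when this Service backs a GKE Ingress, so the
    # Service is exposed through the Ingress rather than as type LoadBalancer
    cloud.google.com/neg: '{"ingress": true}'  # container-native load balancing
    cloud.google.com/backend-config: '{"default": "veza-backend-config"}'
spec:
  type: ClusterIP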
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: veza-backend-api
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: veza-backend-config
spec:
  healthCheck:
    checkIntervalSec: 10
    timeoutSec: 5
    healthyThreshold: 2
    unhealthyThreshold: 3
    type: HTTP
    requestPath: /health
  sessionAffinity:
    affinityType: "CLIENT_IP"  # affinityCookieTtlSec applies only to GENERATED_COOKIE, so it is omitted
```

#### Azure Load Balancer

```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "false"
    service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: "/health"
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: veza-backend-api
```

## Health Checks

### Service Health Checks

```yaml
apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    app: veza-backend-api
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: veza-backend-api
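spec:
  selector:
    matchLabels:
      app: veza-backend-api
  template:
    metadata:
      labels:
        app: veza-backend-api  # must match both selectors above
    spec: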
      containers:
      - name: veza-backend-api
        image: veza-backend-api:latest
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
```

### Ingress Health Checks

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress
  annotations:
    # ingress-nginx has no active health-check annotations; it relies on pod
    # readiness (via Service endpoints) to decide which pods receive traffic.
    # These annotations bound how long NGINX waits on a backend
    # (illustrative values, in seconds).
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "5"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "60"
```

## Load Balancing Strategies

### 1. Round Robin (Default)

**Use Case**: Stateless applications, API services

**Configuration**:
```yaml
sessionAffinity: None
```

**Pros**:
- Simple and fair distribution
- No state management needed

**Cons**:
- May not account for server load
- No session persistence

### 2. Session Affinity (Sticky Sessions)

**Use Case**: Stateful applications, user sessions

**Configuration**:
```yaml
sessionAffinity: ClientIP
sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 3600
```

**Pros**:
- Maintains user sessions
- Better for stateful applications

**Cons**:
- Uneven load distribution, especially when many clients share one source IP (NAT, proxies)
- Sessions held in a pod's memory are lost if that pod crashes

### 3. Least Connections

**Use Case**: Long-lived connections, WebSocket

**Configuration**:
```yaml
annotations:
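  # Current ingress-nginx releases do not accept "least_conn"; "ewma" (latency-aware) is the closest supported value
  nginx.ingress.kubernetes.io/load-balance: "ewma"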
```

**Pros**:
- Better load distribution
- Accounts for active connections

**Cons**:
- More complex than round robin
- Requires connection tracking

### 4. Consistent Hashing

**Use Case**: Caching, distributed systems

**Configuration**:
```yaml
annotations:
  nginx.ingress.kubernetes.io/upstream-hash-by: "$request_uri"
```

**Pros**:
- Predictable routing
- Better cache utilization

**Cons**:
- Uneven distribution if hash keys are skewed
- More complex configuration

## Best Practices

### 1. Health Checks

- **Liveness Probe**: Restarts a container that has stopped responding
- **Readiness Probe**: Removes a pod from Service endpoints while it is not ready
- **Startup Probe**: Holds off liveness and readiness checks until a slow-starting application has initialized
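
The Deployment example in this README configures liveness and readiness probes but no startup probe. A sketch of one for the backend container, assuming it reuses the `/health` endpoint and that the application may need up to about five minutes to start (both are assumptions):

```yaml
# Container-level fragment; place alongside livenessProbe and readinessProbe.
startupProbe:
  httpGet:
    path: /health        # assumed: same endpoint as the liveness probe
    port: 8080
  periodSeconds: 10
  failureThreshold: 30   # allows up to 30 x 10s = 300s before the container is restarted
```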

### 2. Resource Limits

Set appropriate resource limits to prevent resource exhaustion:

```yaml
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
```

### 3. Pod Disruption Budgets

Ensure minimum availability during updates:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: veza-backend-api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: veza-backend-api
```

### 4. Horizontal Pod Autoscaling

Automatically scale based on load:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: veza-backend-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: veza-backend-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

### 5. Circuit Breakers

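ingress-nginx does not implement a true circuit breaker, and current releases do not recognize the `upstream-max-fails` or `upstream-fail-timeout` annotations. The closest built-in behavior is retrying a failed request on another pod:

```yaml
annotations:
  nginx.ingress.kubernetes.io/proxy-next-upstream: "error timeout http_502 http_503"
  nginx.ingress.kubernetes.io/proxy-next-upstream-tries: "3"
  nginx.ingress.kubernetes.io/proxy-next-upstream-timeout: "30"  # seconds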
```

## Monitoring

### Metrics to Monitor

- **Request Rate**: Requests per second per backend
- **Response Time**: P50, P95, P99 latencies
- **Error Rate**: 4xx and 5xx error rates
- **Connection Count**: Active connections per backend
- **Health Check Status**: Success/failure rates

### Prometheus Queries

```promql
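# Request rate per backend Service
sum by (service) (rate(nginx_ingress_controller_requests[5m]))

# 95th percentile response time (buckets aggregated by le)
histogram_quantile(0.95, sum by (le) (rate(nginx_ingress_controller_request_duration_seconds_bucket[5m])))

# Rate of 5xx responses per backend Service
sum by (service) (rate(nginx_ingress_controller_requests{status=~"5.."}[5m]))

# Active connections per controller pod
nginx_ingress_controller_nginx_process_connections{state="active"}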
```
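
The error-rate query can also back an alert. A sketch of a Prometheus rule file, assuming a 5% threshold and a 10-minute hold; the group name, alert name, threshold, and severity label are all hypothetical:

```yaml
groups:
- name: veza-ingress                  # hypothetical group name
  rules:
  - alert: VezaIngressHigh5xxRatio    # hypothetical alert name
    # Share of requests answered with a 5xx status, per backend Service
    expr: |
      sum by (service) (rate(nginx_ingress_controller_requests{status=~"5.."}[5m]))
        /
      sum by (service) (rate(nginx_ingress_controller_requests[5m]))
        > 0.05
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "More than 5% of requests to {{ $labels.service }} are failing with 5xx"
```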

## Troubleshooting

### Uneven Load Distribution

```bash
# Check pod distribution across nodes
kubectl get pods -n veza-production -o wide

# Check service endpoints
kubectl get endpoints veza-backend-api -n veza-production

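# Check NGINX connection statistics (substitute the name of a real controller pod;
# 10246 is the controller's default status port, 10254 serves /healthz and /metrics)
kubectl exec nginx-ingress-controller-pod -n ingress-nginx -- \
  curl -s http://127.0.0.1:10246/nginx_status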
```

### Health Check Failures

```bash
# Check pod health
kubectl get pods -n veza-production

# Check health check logs
kubectl logs deployment/veza-backend-api -n veza-production | grep health

# Test health endpoint manually
kubectl exec -it deployment/veza-backend-api -n veza-production -- \
  curl http://localhost:8080/health
```

### Session Affinity Issues

```bash
# Check session affinity configuration
kubectl get service veza-backend-api -n veza-production -o yaml | grep -A 5 sessionAffinity

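# Test session persistence: send repeated requests, then confirm in the pod logs
# that a single pod served them. ClientIP affinity keys on the source IP, so a
# Cookie header does not influence routing.
for i in {1..10}; do
  curl -s -o /dev/null -w "%{http_code}\n" https://api.veza.com/api/v1/tracks
done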
```

## References

- [Kubernetes Services Documentation](https://kubernetes.io/docs/concepts/services-networking/service/)
- [Nginx Ingress Controller](https://kubernetes.github.io/ingress-nginx/)
- [AWS Load Balancer Controller](https://kubernetes-sigs.github.io/aws-load-balancer-controller/)
- [GCP Load Balancer](https://cloud.google.com/kubernetes-engine/docs/how-to/load-balance-ingress)