senke 9bd3ec8fec [INFRA-011] infra: Set up load balancing		2025-12-25 21:41:39 +01:00

Load Balancing Configuration for Veza Platform

This directory contains configurations for load balancing across the Veza platform to ensure high availability, scalability, and optimal performance.

Overview

Veza uses multiple layers of load balancing:

Kubernetes Services: Internal load balancing using kube-proxy
Ingress Controllers: Nginx Ingress for HTTP/HTTPS traffic
Cloud Load Balancers: AWS ALB, GCP Load Balancer, Azure Load Balancer
Application-Level: Session affinity, health checks, circuit breakers

Architecture

┌───────────────────────────────────────────────────────────┐
│                    Internet Users                         │
└────────────────────┬──────────────────────────────────────┘
                     │
        ┌────────────▼─────────────┐
        │  Cloud Load Balancer     │
        │  (AWS ALB / GCP LB)      │
        └────────────┬─────────────┘
                     │
        ┌────────────▼─────────────┐
        │  Ingress Controller      │
        │  (Nginx Ingress)         │
        └────────────┬─────────────┘
                     │
        ┌────────────▼─────────────┐
        │  Kubernetes Services     │
        │  (ClusterIP)             │
        └────────────┬─────────────┘
                     │
        ┌────────────▼─────────────┐
        │  Application Pods        │
        │  (Backend, Frontend)     │
        └──────────────────────────┘

Components

1. Kubernetes Services

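Kubernetes Services load-balance inside the cluster through kube-proxy. The Service API itself exposes only session affinity; the balancing algorithm depends on the kube-proxy mode:

Random (iptables mode): Each new connection goes to a randomly chosen ready pod, which evens out over many connections
Session Affinity: Sticky sessions by client IP (sessionAffinity: ClientIP) for stateful applications
Round Robin / Least Connections: Available in IPVS mode only, selected cluster-wide through the kube-proxy scheduler setting (rr is the IPVS default, lc is least connection)
Source Hashing: IPVS scheduler (sh) that hashes on the client IP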
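
Least connections and source hashing are kube-proxy IPVS schedulers rather than per-Service settings, so they are chosen once for the whole cluster. The fragment below is a minimal sketch of a kube-proxy configuration that selects the least-connection scheduler. How this file reaches kube-proxy (ConfigMap, kubeadm, or a managed add-on) is cluster-specific and assumed here.

```yaml
# Sketch only: applies to every Service in the cluster, not just veza-backend-api.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  # rr = round robin (IPVS default), lc = least connection, sh = source hashing
  scheduler: "lc"
```

Managed Kubernetes offerings do not always allow changing the kube-proxy mode, in which case Service-level balancing stays with the provider's default.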

2. Ingress Controller

Nginx Ingress Controller provides:

SSL/TLS termination
Path-based routing
Rate limiting
WebSocket support
Load balancing algorithms
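
As a rough illustration of how several of these features combine, the sketch below terminates TLS, routes two path prefixes, rate-limits clients, and extends proxy timeouts so idle WebSocket connections are not cut off. The Secret name `veza-api-tls`, the `veza-frontend` Service, the `nginx` ingress class, and the numeric limits are assumptions for illustration, not values taken from this repository.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress
  annotations:
    # Assumed limit: requests per second accepted from one client IP
    nginx.ingress.kubernetes.io/limit-rps: "20"
    # Keep long-lived (for example WebSocket) connections open for up to one hour
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  ingressClassName: nginx            # assumed class name
  tls:
  - hosts:
    - api.veza.com
    secretName: veza-api-tls         # hypothetical Secret holding the certificate
  rules:
  - host: api.veza.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: veza-backend-api
            port:
              number: 8080
      - path: /
        pathType: Prefix
        backend:
          service:
            name: veza-frontend      # hypothetical frontend Service
            port:
              number: 80
```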

3. Cloud Load Balancers

Cloud provider load balancers provide:

Global load balancing
Health checks
SSL/TLS termination
DDoS protection
Geographic routing

Configuration

Service Load Balancing

Basic Service (Round Robin)

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  sessionAffinity: None  # No stickiness (default); kube-proxy picks a pod per connection
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    app: veza-backend-api

Session Affinity (Sticky Sessions)

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600  # 1 hour
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    app: veza-backend-api

Ingress Load Balancing

Nginx Ingress with Load Balancing

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress
  annotations:
    # Consistent hashing on the request URI; when set, it is used instead of load-balance
    nginx.ingress.kubernetes.io/upstream-hash-by: "$request_uri"
    # Algorithm used when no hash key is set: round_robin (default) or ewma
    nginx.ingress.kubernetes.io/load-balance: "round_robin"
    # upstream-keepalive-connections, -timeout, and -requests are controller-wide
    # ConfigMap keys in ingress-nginx, not per-Ingress annotations, so they are
    # not set here (original values: "64", "60", "100")
spec:
  rules:
  - host: api.veza.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: veza-backend-api
            port:
              number: 8080

Load Balancing Algorithms

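Algorithms for Nginx Ingress. Current ingress-nginx releases accept only round_robin and ewma for the load-balance setting; least_conn and ip_hash are NGINX upstream directives that older controller releases exposed, so check the controller version before relying on them:

round_robin: Default, distributes requests evenly
ewma: Peak EWMA, favors backends with the lowest recent latency
least_conn: Routes to backend with fewest connections (older releases only)
ip_hash: Consistent hashing based on client IP (older releases only)
hash $request_uri: Consistent hashing based on request URI, configured through the upstream-hash-by annotation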
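
The balancing algorithm and upstream keepalive settings can also be set once for the whole controller through its ConfigMap. This is a minimal sketch; the ConfigMap name and namespace are assumptions that match a common default installation and should be adjusted to the actual deployment.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller     # assumed name; must match the controller's --configmap flag
  namespace: ingress-nginx           # assumed namespace
data:
  load-balance: "ewma"               # or "round_robin" (default)
  upstream-keepalive-connections: "64"
  upstream-keepalive-timeout: "60"
  upstream-keepalive-requests: "100"
```

A per-Ingress `nginx.ingress.kubernetes.io/load-balance` annotation overrides the ConfigMap value for that Ingress only.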

Cloud Load Balancer Configuration

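AWS Network Load Balancer (NLB)

A Service of type LoadBalancer with the annotations below provisions a Network Load Balancer, not an Application Load Balancer. ALBs are created from Ingress resources by the AWS Load Balancer Controller.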

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "10"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: "5"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "2"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "3"
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: veza-backend-api
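
If an Application Load Balancer is wanted instead, it is normally requested through an Ingress handled by the AWS Load Balancer Controller. The sketch below assumes that controller is installed, that the `alb` ingress class exists, and that `veza-backend-api` is the ClusterIP Service on port 8080 shown earlier. The Ingress name is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress-alb             # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip        # register pod IPs directly as targets
    alb.ingress.kubernetes.io/healthcheck-path: /health
spec:
  ingressClassName: alb              # assumed class provided by the controller
  rules:
  - host: api.veza.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: veza-backend-api
            port:
              number: 8080
```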

GCP Load Balancer

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
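  annotations:
    # Provisions an internal passthrough load balancer; remove for an external one
    cloud.google.com/load-balancer-type: "Internal"
    # BackendConfig is honored only when this Service is a backend of a GKE Ingress
    cloud.google.com/backend-config: '{"default": "veza-backend-config"}'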
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: veza-backend-api
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: veza-backend-config
spec:
  healthCheck:
    checkIntervalSec: 10
    timeoutSec: 5
    healthyThreshold: 2
    unhealthyThreshold: 3
    type: HTTP
    requestPath: /health
  sessionAffinity:
    affinityType: "CLIENT_IP"
    affinityCookieTtlSec: 3600  # only used when affinityType is "GENERATED_COOKIE"

Azure Load Balancer

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "false"
    service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: "/health"
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: veza-backend-api

Health Checks

Service Health Checks

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    app: veza-backend-api
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: veza-backend-api
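spec:
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: veza-backend-api
  template:
    metadata:
      labels:
        app: veza-backend-api
    spec:
      containers: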
      - name: veza-backend-api
        image: veza-backend-api:latest
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3

Ingress Health Checks

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress
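  annotations:
    # NOTE: the community ingress-nginx controller does not document these
    # health-check annotations and ignores annotations it does not recognize.
    # It routes only to endpoints whose readiness probe passes, so the pod
    # probes are what take backends out of rotation. Verify these keys
    # against the controller actually deployed before relying on them.
    nginx.ingress.kubernetes.io/health-check: "true"
    nginx.ingress.kubernetes.io/health-check-path: "/health"
    nginx.ingress.kubernetes.io/health-check-interval: "10s"
    nginx.ingress.kubernetes.io/health-check-timeout: "5s"
    nginx.ingress.kubernetes.io/health-check-expected-status: "200"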

Load Balancing Strategies

1. Round Robin (Default)

Use Case: Stateless applications, API services

Configuration:

sessionAffinity: None

Pros:

Simple and fair distribution
No state management needed

Cons:

May not account for server load
No session persistence

2. Session Affinity (Sticky Sessions)

Use Case: Stateful applications, user sessions

Configuration:

sessionAffinity: ClientIP
sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 3600

Pros:

Maintains user sessions
Better for stateful applications

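Cons:

Uneven load distribution, especially when many clients share one NAT address
Sessions pinned to a pod are lost if that pod crashes
No effect on traffic arriving through ingress-nginx, which routes to pod endpoints directly instead of through the Service IP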
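
For HTTP traffic that enters through ingress-nginx, stickiness is usually configured on the Ingress with a cookie rather than on the Service. A minimal sketch of the annotations involved follows; the cookie name is a hypothetical placeholder and the one-hour lifetime mirrors the Service example above.

```yaml
metadata:
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "veza-route"   # hypothetical cookie name
    nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"      # seconds
```

Cookie affinity follows the browser rather than the source IP, so clients behind a shared NAT are still spread across pods.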

3. Least Connections

Use Case: Long-lived connections, WebSocket

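Configuration (least_conn is accepted only by older ingress-nginx releases; current ones take round_robin or ewma, and ewma is the closest load-aware choice):

annotations:
  nginx.ingress.kubernetes.io/load-balance: "least_conn"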

Pros:

Better load distribution
Accounts for active connections

Cons:

More complex than round robin
Requires connection tracking

4. Consistent Hashing

Use Case: Caching, distributed systems

Configuration:

annotations:
  nginx.ingress.kubernetes.io/upstream-hash-by: "$request_uri"

Pros:

Predictable routing
Better cache utilization

Cons:

Uneven distribution if hash keys are skewed
More complex configuration

Best Practices

1. Health Checks

Liveness Probe: Detects and restarts unhealthy pods
Readiness Probe: Removes pods from load balancer if not ready
Startup Probe: Allows slow-starting applications time to initialize
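
The Deployment example earlier defines liveness and readiness probes but no startup probe. A container-level fragment along these lines could be added next to them; the path and port reuse the `/health` endpoint on 8080 from that example, while the period and threshold are assumed values to tune per service.

```yaml
startupProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 5
  # Up to 30 x 5s = 150s to start; liveness and readiness checks begin only after this probe succeeds
  failureThreshold: 30
```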

2. Resource Limits

Set appropriate resource limits to prevent resource exhaustion:

resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"

3. Pod Disruption Budgets

Ensure minimum availability during updates:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: veza-backend-api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: veza-backend-api

4. Horizontal Pod Autoscaling

Automatically scale based on load:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: veza-backend-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: veza-backend-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
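
The autoscaler computes the desired replica count as ceil(currentReplicas x currentUtilization / targetUtilization). With 3 replicas averaging 90% CPU against the 70% target, that gives ceil(3 x 90 / 70) = 4. To keep scale-down from flapping after short spikes, a `behavior` block can be added under `spec`. The fragment below is a sketch with assumed values.

```yaml
spec:
  behavior:
    scaleDown:
      # Act on the highest recommendation seen in the last 5 minutes
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1             # remove at most one pod...
        periodSeconds: 60    # ...per minute
```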

5. Circuit Breakers

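NGINX has no full circuit breaker, but passive failure detection comes close: after upstream-max-fails failed attempts within upstream-fail-timeout, a backend is skipped for that period. Recent ingress-nginx releases may no longer honor these annotations, so verify them against the controller version in use: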

annotations:
  nginx.ingress.kubernetes.io/upstream-max-fails: "3"
  nginx.ingress.kubernetes.io/upstream-fail-timeout: "30s"
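
A related resilience measure in ingress-nginx is retrying a failed request on another backend. The annotations below are a sketch of that approach, not a replacement for an application-level circuit breaker; the retry count and timeout are assumed values.

```yaml
metadata:
  annotations:
    # Conditions under which the request is passed to the next backend
    nginx.ingress.kubernetes.io/proxy-next-upstream: "error timeout http_502 http_503"
    # Give up after three attempts in total
    nginx.ingress.kubernetes.io/proxy-next-upstream-tries: "3"
    # Stop retrying after 10 seconds overall
    nginx.ingress.kubernetes.io/proxy-next-upstream-timeout: "10"
```

By default NGINX does not retry non-idempotent requests such as POST, which avoids duplicating writes.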

Monitoring

Metrics to Monitor

Request Rate: Requests per second per backend
Response Time: P50, P95, P99 latencies
Error Rate: 4xx and 5xx error rates
Connection Count: Active connections per backend
Health Check Status: Success/failure rates
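
These metrics become more useful once they drive alerts. The Prometheus rule file below is a minimal sketch that fires when more than 5% of requests to a backend Service fail with 5xx for ten minutes. The group name, alert name, threshold, and duration are assumptions to adapt.

```yaml
groups:
- name: veza-load-balancing            # hypothetical group name
  rules:
  - alert: VezaHigh5xxRatio            # hypothetical alert name
    expr: |
      sum by (service) (rate(nginx_ingress_controller_requests{status=~"5.."}[5m]))
        / sum by (service) (rate(nginx_ingress_controller_requests[5m])) > 0.05
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "More than 5% of requests to {{ $labels.service }} are failing with 5xx"
```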

Prometheus Queries

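# Request rate per backend service
sum by (service) (rate(nginx_ingress_controller_requests[5m]))

# P95 response time per backend service
histogram_quantile(0.95, sum by (le, service) (rate(nginx_ingress_controller_request_duration_seconds_bucket[5m])))

# 5xx error ratio per backend service
sum by (service) (rate(nginx_ingress_controller_requests{status=~"5.."}[5m]))
  / sum by (service) (rate(nginx_ingress_controller_requests[5m]))

# Active connections on the controller
nginx_ingress_controller_nginx_process_connections{state="active"}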

Troubleshooting

Uneven Load Distribution

# Check pod distribution across nodes
kubectl get pods -n veza-production -o wide

# Check service endpoints
kubectl get endpoints veza-backend-api -n veza-production

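# Check NGINX connection statistics (the controller serves stub_status on
# 127.0.0.1:10246; port 10254 serves /healthz and /metrics)
kubectl exec nginx-ingress-controller-pod -n ingress-nginx -- \
  curl -s http://127.0.0.1:10246/nginx_status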

Health Check Failures

# Check pod health
kubectl get pods -n veza-production

# Check health check logs
kubectl logs deployment/veza-backend-api -n veza-production | grep health

# Test health endpoint manually
kubectl exec -it deployment/veza-backend-api -n veza-production -- \
  curl http://localhost:8080/health

Session Affinity Issues

# Check session affinity configuration
kubectl get service veza-backend-api -n veza-production -o yaml | grep -A 5 sessionAffinity

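# Test session persistence (ClientIP affinity keys on the source IP, not on
# cookies; compare per-pod logs to confirm that one pod served every request)
for i in {1..10}; do
  curl -H "Cookie: session=test" https://api.veza.com/api/v1/tracks
done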
