
Load Balancing Configuration for Veza Platform

This directory contains configurations for load balancing across the Veza platform to ensure high availability, scalability, and optimal performance.

Overview

Veza uses multiple layers of load balancing:

  1. Kubernetes Services: Internal load balancing using kube-proxy
  2. Ingress Controllers: Nginx Ingress for HTTP/HTTPS traffic
  3. Cloud Load Balancers: AWS ALB, GCP Load Balancer, Azure Load Balancer
  4. Application-Level: Session affinity, health checks, circuit breakers

Architecture

        ┌─────────────────────────┐
        │     Internet Users      │
        └────────────┬────────────┘
                     │
        ┌────────────▼────────────┐
        │   Cloud Load Balancer   │
        │   (AWS ALB / GCP LB)    │
        └────────────┬────────────┘
                     │
        ┌────────────▼────────────┐
        │   Ingress Controller    │
        │   (Nginx Ingress)       │
        └────────────┬────────────┘
                     │
        ┌────────────▼────────────┐
        │  Kubernetes Services    │
        │  (ClusterIP)            │
        └────────────┬────────────┘
                     │
        ┌────────────▼────────────┐
        │   Application Pods      │
        │  (Backend, Frontend)    │
        └─────────────────────────┘

Components

1. Kubernetes Services

Kubernetes Services provide internal load balancing through kube-proxy:

  • Round Robin (default): in iptables mode, kube-proxy picks a ready endpoint at random, which approximates an even round-robin spread
  • Session Affinity: sessionAffinity: ClientIP pins each client IP to one pod (sticky sessions)
  • Least Connections / IP Hash: not Service fields; these are available only as scheduler options when kube-proxy runs in IPVS mode, or as Ingress-level features
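If the cluster's kube-proxy runs in IPVS mode, the scheduler can be chosen in its configuration. A minimal sketch of the relevant KubeProxyConfiguration fields (how this config is delivered varies by cluster; kubeadm-based clusters store it in the kube-proxy ConfigMap):

```yaml
# Excerpt of a KubeProxyConfiguration enabling IPVS with least connections
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "lc"  # least connections; "rr" (round robin) is the IPVS default
```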

2. Ingress Controller

Nginx Ingress Controller provides:

  • SSL/TLS termination
  • Path-based routing
  • Rate limiting
  • WebSocket support
  • Load balancing algorithms
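For instance, rate limiting and long-lived WebSocket connections can be combined on a single Ingress. A sketch using ingress-nginx annotations; the host, service name, port, and limit value are illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ws-ingress
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "50"             # requests per second per client IP
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"  # keep long-lived WebSocket connections open
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  rules:
  - host: ws.veza.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: veza-stream-server
            port:
              number: 8081
```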

3. Cloud Load Balancers

Cloud provider load balancers provide:

  • Global load balancing
  • Health checks
  • SSL/TLS termination
  • DDoS protection
  • Geographic routing

Configuration

Service Load Balancing

Basic Service (Round Robin)

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  sessionAffinity: None  # Round robin (default)
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    app: veza-backend-api

Session Affinity (Sticky Sessions)

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600  # 1 hour
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    app: veza-backend-api

Ingress Load Balancing

Nginx Ingress with Load Balancing

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress
  annotations:
    # upstream-hash-by takes precedence over load-balance when both are set
    nginx.ingress.kubernetes.io/upstream-hash-by: "$request_uri"  # Consistent hashing
    nginx.ingress.kubernetes.io/load-balance: "round_robin"       # Ignored while upstream-hash-by is set
    # Upstream keepalive tuning (upstream-keepalive-connections/-timeout/-requests)
    # is set in the ingress-nginx ConfigMap, not via per-Ingress annotations
spec:
  rules:
  - host: api.veza.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: veza-backend-api
            port:
              number: 8080

Load Balancing Algorithms

Algorithms supported by the Nginx Ingress load-balance setting:

  • round_robin: Default, distributes requests evenly
  • ewma: Routes on a moving average of request latency, favoring the least-loaded backend

Consistent hashing is configured separately via the upstream-hash-by annotation: "$remote_addr" for client-IP hashing, "$request_uri" for URI hashing. (least_conn and ip_hash were removed from current ingress-nginx releases.)
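As an example, client-IP consistent hashing looks like this (the Ingress name is illustrative; host and service match the examples above):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress-iphash
  annotations:
    # Hash on the client address for sticky, IP-based routing
    nginx.ingress.kubernetes.io/upstream-hash-by: "$remote_addr"
spec:
  rules:
  - host: api.veza.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: veza-backend-api
            port:
              number: 8080
```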

Cloud Load Balancer Configuration

AWS Network Load Balancer (NLB)

Note: the annotations below provision an NLB (aws-load-balancer-type: "nlb"). An ALB is created from an Ingress resource via the AWS Load Balancer Controller, not from a Service of type LoadBalancer.

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "10"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: "5"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "2"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "3"
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: veza-backend-api

GCP Load Balancer

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
  annotations:
    cloud.google.com/load-balancer-type: "Internal"  # Internal LB; omit this annotation for an internet-facing one
    cloud.google.com/backend-config: '{"default": "veza-backend-config"}'
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: veza-backend-api
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: veza-backend-config
spec:
  healthCheck:
    checkIntervalSec: 10
    timeoutSec: 5
    healthyThreshold: 2
    unhealthyThreshold: 3
    type: HTTP
    requestPath: /health
  sessionAffinity:
    affinityType: "CLIENT_IP"
    affinityCookieTtlSec: 3600

Azure Load Balancer

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "false"
    service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: "/health"
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: veza-backend-api

Health Checks

Service Health Checks

apiVersion: v1
kind: Service
metadata:
  name: veza-backend-api
spec:
  type: ClusterIP
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    app: veza-backend-api
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: veza-backend-api
spec:
  template:
    spec:
      containers:
      - name: veza-backend-api
        image: veza-backend-api:latest
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3

Ingress Health Checks

The Nginx Ingress Controller performs no active health checks of its own; it balances across the Service's ready endpoints. Upstream health is therefore driven by the readiness probes shown above: a pod that fails its readiness probe is removed from the endpoint list and stops receiving traffic. (The cloud load balancers in front of the controller do run active health checks, configured with the provider annotations shown earlier.)

Load Balancing Strategies

1. Round Robin (Default)

Use Case: Stateless applications, API services

Configuration:

sessionAffinity: None

Pros:

  • Simple and fair distribution
  • No state management needed

Cons:

  • May not account for server load
  • No session persistence

2. Session Affinity (Sticky Sessions)

Use Case: Stateful applications, user sessions

Configuration:

sessionAffinity: ClientIP
sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 3600

Pros:

  • Maintains user sessions
  • Better for stateful applications

Cons:

  • Uneven load distribution
  • Sessions are lost when the pinned pod crashes or is rescheduled
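At the Ingress layer, cookie-based affinity is usually preferable to ClientIP because it still distinguishes clients behind a shared NAT. A sketch using ingress-nginx's affinity annotations; the cookie name is illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: veza-ingress-sticky
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "veza-route"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"  # match the Service-level 1h timeout
spec:
  rules:
  - host: api.veza.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: veza-backend-api
            port:
              number: 8080
```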

3. Least Connections

Use Case: Long-lived connections, WebSocket

Configuration:

annotations:
  nginx.ingress.kubernetes.io/load-balance: "least_conn"

Pros:

  • Better load distribution
  • Accounts for active connections

Cons:

  • More complex than round robin
  • Requires connection tracking

4. Consistent Hashing

Use Case: Caching, distributed systems

Configuration:

annotations:
  nginx.ingress.kubernetes.io/upstream-hash-by: "$request_uri"

Pros:

  • Predictable routing
  • Better cache utilization

Cons:

  • Uneven distribution if hash keys are skewed
  • More complex configuration

Best Practices

1. Health Checks

  • Liveness Probe: Detects and restarts unhealthy pods
  • Readiness Probe: Removes pods from load balancer if not ready
  • Startup Probe: Allows slow-starting applications time to initialize
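The liveness and readiness probes appear in the Deployment above; the startup probe has no example yet. A sketch for a slow-starting container (thresholds are illustrative):

```yaml
startupProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 10
  failureThreshold: 30  # up to 30 × 10s = 5 minutes to come up
# The kubelet holds off liveness/readiness probes until the startup probe succeeds
```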

2. Resource Limits

Set appropriate resource limits to prevent resource exhaustion:

resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"

3. Pod Disruption Budgets

Ensure minimum availability during updates:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: veza-backend-api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: veza-backend-api

4. Horizontal Pod Autoscaling

Automatically scale based on load:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: veza-backend-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: veza-backend-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

5. Circuit Breakers

Configure passive failover for resilience. Note that the old upstream-max-fails / upstream-fail-timeout annotations were removed from ingress-nginx; use its retry settings instead:

annotations:
  nginx.ingress.kubernetes.io/proxy-next-upstream: "error timeout http_502 http_503"
  nginx.ingress.kubernetes.io/proxy-next-upstream-tries: "3"
  nginx.ingress.kubernetes.io/proxy-next-upstream-timeout: "30"

For true circuit breaking (open/half-open states), use a service mesh such as Istio or an application-level library.

Monitoring

Metrics to Monitor

  • Request Rate: Requests per second per backend
  • Response Time: P50, P95, P99 latencies
  • Error Rate: 4xx and 5xx error rates
  • Connection Count: Active connections per backend
  • Health Check Status: Success/failure rates

Prometheus Queries

# Request rate per backend
rate(nginx_ingress_controller_requests[5m])

# Response time percentiles
histogram_quantile(0.95, rate(nginx_ingress_controller_request_duration_seconds_bucket[5m]))

# Error rate
rate(nginx_ingress_controller_requests{status=~"5.."}[5m])

# Active connections (metric name in current ingress-nginx releases)
nginx_ingress_controller_nginx_process_connections{state="active"}
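These queries can drive alerts. A sketch of a PrometheusRule (assumes the Prometheus Operator CRDs are installed; the name and threshold are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: veza-ingress-alerts
spec:
  groups:
  - name: ingress
    rules:
    - alert: HighIngress5xxRate
      expr: |
        sum(rate(nginx_ingress_controller_requests{status=~"5.."}[5m]))
          / sum(rate(nginx_ingress_controller_requests[5m])) > 0.05
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "More than 5% of ingress requests are failing with 5xx"
```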

Troubleshooting

Uneven Load Distribution

# Check pod distribution across nodes
kubectl get pods -n veza-production -o wide

# Check service endpoints
kubectl get endpoints veza-backend-api -n veza-production

# Check controller metrics (request and connection counters)
kubectl exec -it nginx-ingress-controller-pod -n ingress-nginx -- \
  curl -s http://localhost:10254/metrics | grep nginx_ingress_controller

Health Check Failures

# Check pod health
kubectl get pods -n veza-production

# Check health check logs
kubectl logs deployment/veza-backend-api -n veza-production | grep health

# Test health endpoint manually
kubectl exec -it deployment/veza-backend-api -n veza-production -- \
  curl http://localhost:8080/health

Session Affinity Issues

# Check session affinity configuration
kubectl get service veza-backend-api -n veza-production -o yaml | grep -A 5 sessionAffinity

# Test session persistence: ClientIP affinity keys on the source IP, so
# repeated requests from the same client should hit the same pod
# (easiest to verify if the endpoint echoes its pod name)
for i in {1..10}; do
  curl -s https://api.veza.com/api/v1/tracks
done
