# πŸ“– PRODUCTION GUIDE - VEZA RUST MODULES > **Guide complet pour le dΓ©ploiement et l'exploitation des modules Rust en production** > **Version** : 2.0 Production-Ready > **DerniΓ¨re mise Γ  jour** : 1er juillet 2025 --- ## 🎯 APERΓ‡U SYSTÈME ### **Architecture Production** ``` β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ VEZA PLATFORM β”‚ β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ β”‚ CHAT SERVER β”‚ STREAM SERVER β”‚ β”‚ (Rust) β”‚ (Rust) β”‚ β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ β”‚ β€’ 100k+ WS β”‚ β€’ 10k+ Streams β”‚ β”‚ β€’ <10ms latency β”‚ β€’ 100k+ Listeners β”‚ β”‚ β€’ E2E Encryptionβ”‚ β€’ Adaptive Bitrate β”‚ β”‚ β€’ AI Moderation β”‚ β€’ Real-time Effects β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β” β”‚ BACKEND GO β”‚ β”‚ (API Gateway) β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ ``` ### **SpΓ©cifications Techniques** - **Performance** : 100k+ connexions WebSocket simultanΓ©es - **Latency** : <10ms P99 pour messages, <50ms pour streaming - **Throughput** : 10k+ requΓͺtes/seconde par instance - **Availability** : 99.99% uptime target - **Scalability** : Horizontale avec auto-scaling --- ## πŸ”§ CONFIGURATION PRODUCTION ### **Variables d'Environnement** ```bash # === CORE CONFIG === RUST_LOG=info ENVIRONMENT=production SERVICE_NAME=veza-stream-server VERSION=2.0.0 # === NETWORK === HOST=0.0.0.0 PORT=8080 WS_PORT=8081 GRPC_PORT=50051 # === DATABASE === DATABASE_URL=postgresql://veza:secure_pass@postgres:5432/veza_prod DATABASE_POOL_SIZE=100 DATABASE_TIMEOUT_MS=5000 # === REDIS === REDIS_URL=redis://redis:6379 REDIS_POOL_SIZE=50 REDIS_TTL_DEFAULT=3600 # === MONITORING === PROMETHEUS_PORT=9090 JAEGER_ENDPOINT=http://jaeger:14268/api/traces METRICS_ENABLED=true TRACING_ENABLED=true # === PERFORMANCE === MAX_CONNECTIONS=100000 WORKER_THREADS=16 BLOCKING_THREADS=32 MEMORY_LIMIT_MB=8192 # === SECURITY === JWT_SECRET=your_production_jwt_secret_here ENCRYPTION_KEY=your_32_byte_encryption_key_here RATE_LIMIT_PER_MINUTE=1000 ENABLE_CORS=true ALLOWED_ORIGINS=https://veza.live,https://app.veza.live # === AUDIO STREAMING === MAX_STREAMS=10000 MAX_LISTENERS_PER_STREAM=10000 ADAPTIVE_BITRATE=true DEFAULT_BITRATE=128 CODECS_ENABLED=opus,aac,mp3 # === CHAT === MAX_MESSAGE_SIZE=8192 MESSAGE_HISTORY_LIMIT=1000 MODERATION_ENABLED=true E2E_ENCRYPTION=optional ``` ### **Limites et Quotas RecommandΓ©s** ```yaml production_limits: # CPU & Memory cpu_request: "2000m" # 2 CPU cores minimum cpu_limit: "8000m" # 8 CPU cores maximum memory_request: "4Gi" # 4GB RAM minimum memory_limit: "16Gi" # 16GB RAM maximum # Network max_connections: 100000 bandwidth_limit: "1Gbps" # Storage ephemeral_storage: "10Gi" logs_retention: "30d" # Application max_message_rate: 100 # messages/second/user max_file_upload: "200MB" concurrent_streams: 10000 ``` --- ## πŸš€ DΓ‰PLOIEMENT PRODUCTION ### **1. PrΓ©-requis Infrastructure** - **Kubernetes** 1.25+ ou Docker Swarm - **PostgreSQL** 14+ avec High Availability - **Redis** 7.0+ cluster mode - **Load Balancer** avec SSL termination - **Monitoring** : Prometheus + Grafana stack ### **2. Health Checks** ```yaml healthcheck: readiness: path: /health/ready port: 8080 timeout: 5s period: 10s liveness: path: /health/live port: 8080 timeout: 5s period: 30s failure_threshold: 3 ``` ### **3. Graceful Shutdown** ```rust // Configuration de graceful shutdown (30s) tokio::select! { _ = signal::ctrl_c() => { info!("πŸ›‘ Graceful shutdown initiated"); // 1. Stop accepting new connections server.stop_accepting().await; // 2. Wait for existing connections to finish (max 30s) timeout(Duration::from_secs(30), server.wait_for_connections()).await; // 3. Force close remaining connections server.force_close().await; info!("βœ… Graceful shutdown completed"); } } ``` --- ## πŸ“Š MONITORING & ALERTING ### **MΓ©triques ClΓ©s Γ  Surveiller** #### **Performance Metrics** ```prometheus # Latency (target: P99 < 50ms) http_request_duration_seconds{quantile="0.99"} < 0.05 # Throughput (target: > 10k req/s) rate(http_requests_total[1m]) > 10000 # Error Rate (target: < 0.1%) rate(http_requests_total{status=~"5.."}[5m]) / rate(http_requests_total[5m]) < 0.001 ``` #### **Resource Metrics** ```prometheus # CPU Usage (alert: > 80%) cpu_usage_percent > 80 # Memory Usage (alert: > 85%) memory_usage_percent > 85 # Connection Count (alert: > 90k) websocket_connections_active > 90000 ``` #### **Business Metrics** ```prometheus # Active Users (alert: < 1k unusual drop) increase(active_users_total[5m]) < -1000 # Message Success Rate (alert: < 99.9%) message_delivery_success_rate < 0.999 # Stream Quality (alert: > 5% degraded) stream_quality_degraded_percent > 5 ``` ### **Alerting Rules** ```yaml groups: - name: veza-rust-modules rules: - alert: HighLatency expr: histogram_quantile(0.99, http_request_duration_seconds) > 0.1 for: 2m labels: severity: critical annotations: summary: "High latency detected" - alert: HighErrorRate expr: rate(http_errors_total[5m]) / rate(http_requests_total[5m]) > 0.01 for: 1m labels: severity: critical - alert: ServiceDown expr: up == 0 for: 30s labels: severity: critical ``` --- ## πŸ”’ SΓ‰CURITΓ‰ PRODUCTION ### **1. Network Security** - **TLS 1.3** obligatoire pour toutes les connexions - **Certificate pinning** pour communications inter-services - **Network policies** Kubernetes restrictives - **DDoS protection** avec rate limiting intelligent ### **2. Data Protection** ```rust // Encryption at rest let encryption_key = load_key_from_vault().await?; let encrypted_data = AES_256_GCM.encrypt(&data, &encryption_key)?; // Encryption in transit let tls_config = TlsConfig::builder() .cert_file("/certs/server.crt") .key_file("/certs/server.key") .min_tls_version(TlsVersion::TLSv1_3) .build()?; ``` ### **3. Authentication & Authorization** - **JWT** avec rotation automatique (24h) - **RBAC** granulaire par resource - **API keys** avec scopes limitΓ©s - **2FA** obligatoire pour comptes privilΓ©giΓ©s --- ## πŸ”„ MAINTENANCE & OPΓ‰RATIONS ### **1. Mise Γ  Jour Rolling** ```bash # 1. Update image version kubectl set image deployment/stream-server \ stream-server=veza/stream-server:v2.1.0 # 2. Monitor rollout kubectl rollout status deployment/stream-server # 3. Validate health kubectl get pods -l app=stream-server ``` ### **2. Backup & Recovery** - **Database** : Point-in-time recovery (PITR) avec PostgreSQL - **Configuration** : Git-ops avec validation automatique - **Logs** : RΓ©tention 30 jours avec archivage S3 - **Metrics** : RΓ©tention 1 an avec downsampling ### **3. Scaling Operations** ```bash # Horizontal scaling kubectl scale deployment/stream-server --replicas=10 # Vertical scaling (HPA) kubectl autoscale deployment/stream-server \ --cpu-percent=70 --min=5 --max=50 ``` --- ## 🚨 RUNBOOKS INCIDENTS ### **Incident 1 : High Latency** ```bash # 1. Diagnostic rapide kubectl top pods -l app=stream-server kubectl logs -l app=stream-server --tail=100 # 2. Scaling immΓ©diat si CPU > 80% kubectl scale deployment/stream-server --replicas=20 # 3. Investigation kubectl exec -it stream-server-xxx -- /bin/bash htop # VΓ©rifier CPU/Memory ss -tulpn # VΓ©rifier connexions rΓ©seau ``` ### **Incident 2 : Service Down** ```bash # 1. Restart rapid kubectl rollout restart deployment/stream-server # 2. Check dependencies kubectl get pods -l app=postgres kubectl get pods -l app=redis # 3. Traffic rerouting kubectl patch service/stream-server -p '{"spec":{"selector":{"app":"stream-server-backup"}}}' ``` ### **Incident 3 : Memory Leak** ```bash # 1. Memory profiling kubectl exec stream-server-xxx -- /bin/bash curl http://localhost:9090/debug/pprof/heap > heap.prof # 2. Graceful restart par batch for pod in $(kubectl get pods -l app=stream-server -o name); do kubectl delete $pod sleep 30 # Wait for replacement done ``` --- ## πŸ“ˆ PERFORMANCE TUNING ### **1. OS Level Optimizations** ```bash # Network tuning echo 'net.core.somaxconn=65535' >> /etc/sysctl.conf echo 'net.core.netdev_max_backlog=5000' >> /etc/sysctl.conf echo 'net.ipv4.tcp_max_syn_backlog=65535' >> /etc/sysctl.conf # File descriptor limits echo '* soft nofile 1048576' >> /etc/security/limits.conf echo '* hard nofile 1048576' >> /etc/security/limits.conf ``` ### **2. Application Tuning** ```rust // Tokio runtime optimization let rt = tokio::runtime::Builder::new_multi_thread() .worker_threads(num_cpus::get() * 2) .max_blocking_threads(512) .thread_stack_size(2 * 1024 * 1024) // 2MB stack .enable_all() .build()?; ``` ### **3. Database Optimization** ```sql -- Connection pooling ALTER SYSTEM SET max_connections = 500; ALTER SYSTEM SET shared_buffers = '4GB'; ALTER SYSTEM SET effective_cache_size = '12GB'; ALTER SYSTEM SET work_mem = '64MB'; -- Index optimization for chat messages CREATE INDEX CONCURRENTLY idx_messages_room_created ON messages(room_id, created_at DESC) WHERE deleted_at IS NULL; ``` --- ## πŸ” TROUBLESHOOTING ### **ProblΓ¨mes FrΓ©quents** #### **1. WebSocket Connections Dropping** ```bash # Check load balancer timeout kubectl describe ingress stream-server # Verify heartbeat configuration grep -r "ping_interval" src/ # Monitor connection metrics curl http://localhost:9090/metrics | grep websocket ``` #### **2. Audio Stream Latency** ```rust // Verify buffer configuration let buffer_config = BufferConfig { target_latency: Duration::from_millis(50), max_buffer_size: 1024 * 8, // 8KB adaptive: true, }; ``` #### **3. Memory Usage Growth** ```bash # Check for connection leaks ss -s | grep tcp lsof -p $(pgrep stream-server) | wc -l # Monitor memory pools curl http://localhost:9090/debug/pprof/allocs ``` --- ## πŸ“‹ CHECKLIST PRODUCTION ### **Pre-Deployment** - [ ] Load testing completed (100k+ connections) - [ ] Security audit passed - [ ] Monitoring configured - [ ] Backup strategy validated - [ ] Disaster recovery tested ### **Post-Deployment** - [ ] Health checks passing - [ ] Metrics collecting properly - [ ] Logs flowing to aggregation - [ ] Alerts configured and tested - [ ] Performance within targets ### **Weekly Maintenance** - [ ] Check resource utilization trends - [ ] Review error logs - [ ] Update security patches - [ ] Validate backup integrity - [ ] Performance regression testing --- **🎯 Cette documentation garantit un dΓ©ploiement production robuste et maintenable des modules Rust Veza.**