# Performance Baseline — Veza API

**Version**: v0.951
**Objective**: Document the P50/P95/P99 latencies of the critical endpoints to detect regressions.

## Methodology

1. Start the API in profiling mode: `pprof` is exposed when `ENABLE_PPROF=true`
2. Run a load test (k6 or Go) against the critical endpoints
3. Measure latencies via Prometheus (`http_request_duration_seconds`) or pprof

## Critical endpoints to monitor

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/v1/auth/login` | POST | User login |
| `/api/v1/auth/register` | POST | Registration |
| `/api/v1/tracks` | GET | Track listing (cursor pagination, v0.931) |
| `/api/v1/tracks/search` | GET | Search |
| `/api/v1/users/me` | GET | User profile |
| `/api/v1/marketplace/orders` | POST | Order creation |
| `/api/v1/notifications` | GET | Notifications |
| `/api/v1/conversations` | GET | Conversations |
| `/api/v1/analytics/me` | GET | Analytics |
| `/health` | GET | Health check |

## v1.0 targets (v0.951)

- **P99 < 500 ms** on all critical endpoints at 500 req/s (stress_500rps.js)
- **1000 WebSockets**: connections stable for 5 minutes, delivery rate > 99% (stress_1000ws.js)
- **50 concurrent uploads**: all succeed, backpressure respected (uploads.js)
- **GET /tracks**: cursor-based pagination (v0.931) guarantees constant performance regardless of page depth

## k6 scripts (v0.951)

| Script | Command | Thresholds |
|--------|---------|------------|
| API stress 500 VUs | `k6 run loadtests/backend/stress_500rps.js` | P99 < 500 ms (login, tracks, search, products) |
| WebSocket 1000 | `k6 run loadtests/chat/stress_1000ws.js` | ws_connection_failures < 1%, ws_message_failures < 1% |
| Uploads 50 | `k6 run loadtests/backend/uploads.js` | P95 < 5 s (simple), P95 < 8 s (chunked) |

See [loadtests/README.md](../loadtests/README.md) for the full run procedure.

## pprof command

```bash
# Profile for 30 s while a load test is running
go tool pprof -http=:8081 http://localhost:8080/debug/pprof/profile?seconds=30
```

## Prometheus metrics

The monitoring middlewares expose `http_request_duration_seconds` with the `method`, `path`, and `status` labels. Use histogram quantiles to derive P50/P95/P99.

## Lighthouse v0.982 (Frontend)

**Objective**: Performance ≥ 90, Accessibility ≥ 90, Best Practices ≥ 90 on the critical pages.

### Pages to audit

| Page | Route | Performance target | Accessibility target |
|------|-------|--------------------|----------------------|
| Login | `/login` | ≥ 90 | ≥ 90 |
| Dashboard | `/dashboard` | ≥ 90 | ≥ 90 |
| Tracks | `/library` or `/tracks` | ≥ 90 | ≥ 90 |
| Marketplace | `/marketplace` | ≥ 90 | ≥ 90 |
| Search | `/search` | ≥ 90 | ≥ 90 |
| Profile | `/profile` | ≥ 90 | ≥ 90 |

### Audit procedure

```bash
# Prerequisite: the frontend app is running (npm run dev, or build + preview)
npx lighthouse http://localhost:4173/ --view --output=html --output-path=./lighthouse-reports/home.html
npx lighthouse http://localhost:4173/login --view --output=html --output-path=./lighthouse-reports/login.html
# Repeat for each critical page
```

### Latest audit

See [config/incus/LIGHTHOUSE_AUDIT_REPORT.md](../config/incus/LIGHTHOUSE_AUDIT_REPORT.md) for the latest report (2026-01-15): Accessibility 93 and Best Practices 96, so the v0.982 target is met on those criteria. Performance must be revalidated after the NO_LCP fixes.

---

## v1.0.2 results

**Prerequisites**: `docker compose up -d`, with backend + PostgreSQL + Redis running.
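The v1.0 targets above map directly onto k6 `thresholds`. As a quick sanity check that the stack started with `docker compose up -d` is responding before the full scripts are run, a minimal smoke-test sketch could look like the following. The file name `smoke.js` is hypothetical (it is not one of the scripts under `loadtests/`), and it assumes the backend listens on `localhost:8080` and that `GET /api/v1/tracks` is reachable without a token.

```javascript
// smoke.js: hypothetical sanity check, not part of loadtests/.
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  vus: 10,            // light load only; stress_500rps.js covers the real 500 req/s target
  duration: '30s',
  thresholds: {
    http_req_duration: ['p(99)<500'],  // v1.0 target: P99 < 500 ms
    http_req_failed: ['rate<0.01'],
  },
};

const BASE_URL = __ENV.BASE_URL || 'http://localhost:8080';

export default function () {
  const health = http.get(`${BASE_URL}/health`);
  check(health, { 'health is 200': (r) => r.status === 200 });

  // Assumes the track listing is publicly readable; add an Authorization header otherwise.
  const tracks = http.get(`${BASE_URL}/api/v1/tracks`);
  check(tracks, { 'tracks is 200': (r) => r.status === 200 });

  sleep(1);
}
```

Run it with `k6 run smoke.js`; if these thresholds already fail, the heavier scripts below are not worth running yet.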
### Fixed load tests (v0.502)

- WebSocket load test: CHAT_ORIGIN points to the backend at `ws://localhost:8080`, WS_URL = `/api/v1/ws`
- Files: `loadtests/config.js`, `loadtests/chat/stress_1000ws.js`, `loadtests/chat/websocket.js`

### Commands to run

```bash
k6 run loadtests/backend/stress_500rps.js   # 500 req/s, P99 < 500 ms
k6 run loadtests/chat/stress_1000ws.js      # 1000 WebSockets, < 1% failures
k6 run loadtests/backend/uploads.js         # 50 uploads
```

### Results table (to fill in after running on the target infrastructure)

| Endpoint / Script | P50 | P95 | P99 | Failure rate |
|-------------------|-----|-----|-----|--------------|
| stress_500rps (login, tracks, search) | | | | |
| stress_1000ws | | | | |
| uploads | | | | |

---

## v1.0.9 W4 Day 20 — Mixed-scenarios nightly k6

Capacity gate before launch: sustain **1650 concurrent VUs** for 5 minutes on staging without breaking the global thresholds. Scheduled by `.github/workflows/loadtest.yml` at 02:30 UTC; the acceptance bar is 3 consecutive green nightly runs before the launch goes hot.

### Scenarios

Run in parallel via the k6 `scenarios` block in `scripts/loadtest/k6_mixed_scenarios.js`. Each one uses `executor: constant-vus` so the steady state is unambiguous (a shape sketch is included after the operating notes below).

| Scenario | VUs | Workload | Per-scenario p95 gate |
|----------|-----|----------|-----------------------|
| upload | 100 | initiate + 10×1 MiB chunks (synthetic 10 MiB tracks) | global only |
| streaming | 500 | master.m3u8 → quality playlist → 4-segment loop | < 300 ms |
| browse | 1000 | search 60% / list 30% / detail 10% | < 400 ms |
| checkout | 50 | list products → POST orders (rejected at validation) | < 800 ms |

### Global thresholds (acceptance bar)

| Metric | Threshold | Reason |
|--------|-----------|--------|
| `http_req_duration` | p(95) < 500 ms | Roadmap §Day 20. |
| `http_req_duration` | p(99) < 1500 ms | Tail-latency cap; catches one-off sync stalls. |
| `http_req_failed` | rate < 0.5% | Roadmap §Day 20. Looser per-scenario gates for upload + checkout (network + Hyperswitch). |

### How to run locally

```bash
# Against the lab haproxy (no auth required for browse/streaming):
k6 run scripts/loadtest/k6_mixed_scenarios.js \
  --env BASE_URL=http://haproxy.lxd \
  --env STREAM_TRACK_ID= \
  --env DURATION=2m \
  --env UPLOAD_VUS=10 --env STREAM_VUS=50 --env BROWSE_VUS=100 --env CHECKOUT_VUS=5

# Full nightly profile against staging:
USER_TOKEN=$(./scripts/issue-loadtest-token.sh)
k6 run scripts/loadtest/k6_mixed_scenarios.js \
  --env BASE_URL=https://staging.veza.fr \
  --env STREAM_TRACK_ID= \
  --env USER_TOKEN="$USER_TOKEN"
```

### Operating notes

- **Per-scenario VU overrides.** Use the `UPLOAD_VUS`, `STREAM_VUS`, `BROWSE_VUS`, `CHECKOUT_VUS` env vars to dial the load down for local runs.
- **Staging-only.** The workflow refuses to run against prod; `BASE_URL` is set from `vars.STAGING_BASE_URL` (or the `DEFAULT_BASE_URL` env in the workflow) and never reads from a prod-shaped variable.
- **Token rotation.** `STAGING_LOADTEST_TOKEN` is a long-lived token bound to a dedicated `loadtest@veza.music` user with role=user (no admin powers). Rotate it quarterly.
- **Upload scenario approximation.** The chunked endpoint expects multipart bodies; for load shaping we POST raw 1 MiB chunks with the upload-id header. The costly path (auth + rate limiting + Redis state) is still exercised even though the resulting upload is rejected at the multipart parser.
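As referenced in the Scenarios section, the sketch below shows one way the four `constant-vus` scenarios, the env-var overrides, and the per-scenario p95 gates can be expressed in k6. The names and numbers mirror the tables above, but this is not the actual content of `scripts/loadtest/k6_mixed_scenarios.js`; the workload functions are reduced to stubs here.

```javascript
// Shape sketch only: mirrors the scenario and threshold tables above,
// not the real scripts/loadtest/k6_mixed_scenarios.js.
import { sleep } from 'k6';

const DURATION = __ENV.DURATION || '5m';
const vus = (name, fallback) => Number(__ENV[name] || fallback);

export const options = {
  scenarios: {
    upload:    { executor: 'constant-vus', exec: 'upload',    vus: vus('UPLOAD_VUS', 100),  duration: DURATION },
    streaming: { executor: 'constant-vus', exec: 'streaming', vus: vus('STREAM_VUS', 500),  duration: DURATION },
    browse:    { executor: 'constant-vus', exec: 'browse',    vus: vus('BROWSE_VUS', 1000), duration: DURATION },
    checkout:  { executor: 'constant-vus', exec: 'checkout',  vus: vus('CHECKOUT_VUS', 50), duration: DURATION },
  },
  thresholds: {
    // Global acceptance bar (all scenarios combined).
    http_req_duration: ['p(95)<500', 'p(99)<1500'],
    http_req_failed:   ['rate<0.005'],
    // Per-scenario gates: k6 tags every request with its scenario name.
    'http_req_duration{scenario:streaming}': ['p(95)<300'],
    'http_req_duration{scenario:browse}':    ['p(95)<400'],
    'http_req_duration{scenario:checkout}':  ['p(95)<800'],
  },
};

// Stubs standing in for the real workloads described in the scenario table.
export function upload()    { sleep(1); }  // initiate + 10×1 MiB chunks
export function streaming() { sleep(1); }  // master.m3u8 → playlist → 4 segments
export function browse()    { sleep(1); }  // search 60% / list 30% / detail 10%
export function checkout()  { sleep(1); }  // list products → POST orders
```

The per-scenario gates rely on k6's built-in `scenario` tag, so no extra tagging code is needed in the workload functions.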
### After-run dashboard

The Grafana dashboard `Veza API Overview` (`config/grafana/dashboards/api-overview.json`) carries the p95/p99 panels. Select the k6 run window in the time picker to compare runs. The k6 JSON summary uploaded as a workflow artifact carries the per-scenario breakdown that the dashboard can't show directly (see the sketch at the end of this section).

### Acceptance gate (W4 verification)

- 3 consecutive nightly runs green (no threshold violation).
- p95 < 500 ms on the global metric.
- Per-scenario gates met for every flow.

When the gate breaks, the workflow's "Annotate thresholds in summary" step writes the failing values to the GitHub Actions summary so the on-call can triage from a single page.
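For reference, the per-scenario breakdown in the JSON artifact comes from k6's end-of-test summary data. A minimal sketch of a `handleSummary` hook that writes it follows; the real script may export a different file name or shape.

```javascript
// Sketch only: the workflow's actual artifact name and contents may differ.
export function handleSummary(data) {
  // data.metrics holds the global metrics plus the threshold sub-metrics,
  // e.g. 'http_req_duration{scenario:streaming}', each with its percentile
  // values and threshold pass/fail flags.
  return {
    'k6-summary.json': JSON.stringify(data, null, 2),  // picked up as the workflow artifact
  };
}
```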