veza/config
senke 54af2bc851 feat(observability): RUM Web Vitals beacons + alert rules (v1.0.10 ops item 9)
Real User Monitoring closes the gap between synthetic probes (which
already cover server-side latency) and what users actually see in
their browsers. Slow CDN edges, third-party scripts, mobile-CPU
regressions, and bundle bloat all surface here but stay invisible
to backend-side dashboards.

Frontend (apps/web):
- web-vitals@^4.2.4 dep
- src/observability/webVitals.ts collects LCP / CLS / INP / FID /
  TTFB via the npm web-vitals package and POSTs to the backend
  using sendBeacon (with fetch keepalive fallback)
- Pageload-level sampling decision (flip a coin once, contribute
  all metrics or none) avoids per-metric histogram bias
- Sample rate set via VITE_RUM_SAMPLE_RATE (defaults: 1.0 in dev,
  0.25 in prod)
- main.tsx wires initWebVitals() right after initSentry()
- Route slug derived client-side (strips uuid-ish + numeric ids
  to keep cardinality low)
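
The pageload-level sampling and route-slug steps listed above could be sketched roughly as follows. This is a minimal illustration, not the contents of webVitals.ts: the function names decideSampling and routeSlug, the injectable rng parameter, the "root" fallback, and the dash-joined slug format are all assumptions made for the example.

```typescript
// Hypothetical helpers; names and slug format are illustrative assumptions.

// Decide once per page load whether this page contributes metrics.
// Every metric (LCP, CLS, INP, ...) then follows the same decision,
// so no single histogram is sampled differently from the others.
function decideSampling(rate: number, rng: () => number = Math.random): boolean {
  if (!isFinite(rate) || rate <= 0) return false;
  if (rate >= 1) return true;
  return rng() < rate;
}

// Segments that look like ids are dropped to keep label cardinality low.
const UUID_ISH = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const NUMERIC = /^\d+$/;

// Collapse a pathname into a short slug without uuid-ish or numeric segments.
function routeSlug(pathname: string): string {
  const kept = pathname
    .split("/")
    .filter((seg) => seg !== "" && !UUID_ISH.test(seg) && !NUMERIC.test(seg));
  return kept.length === 0 ? "root" : kept.join("-");
}
```

Calling decideSampling once at startup and storing the boolean, rather than calling it per metric, is what keeps a sampled page's metrics all-or-nothing.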

Backend:
- internal/handlers/web_vitals_handler.go: POST
  /api/v1/observability/web-vitals (anonymous, IP rate-limited via
  the existing FrontendLogRateLimit); validates value ranges and
  normalizes route + device labels to bound cardinality
- internal/monitoring/web_vitals.go: Prometheus histograms with
  buckets aligned to Google's good/needs-improvement/poor
  thresholds, plus beacons-received / beacons-rejected counters
- Tests: 6 handler tests + 3 helper-function tests + 10 frontend
  vitest tests (all pass)
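
For reference, the good/needs-improvement/poor thresholds that the histogram buckets are aligned to can be written down as a small rating function. The values below are the published Core Web Vitals thresholds (milliseconds, except CLS which is unitless); the names THRESHOLDS and rateMetric are hypothetical and the sketch is in TypeScript for illustration only, not a copy of the Go bucket definitions.

```typescript
type MetricName = "LCP" | "CLS" | "INP" | "FID" | "TTFB";
type Rating = "good" | "needs-improvement" | "poor";

// [upper bound of "good", upper bound of "needs-improvement"].
// Milliseconds for every metric except CLS, which is a unitless score.
const THRESHOLDS: Record<MetricName, [number, number]> = {
  LCP: [2500, 4000],
  CLS: [0.1, 0.25],
  INP: [200, 500],
  FID: [100, 300],
  TTFB: [800, 1800],
};

// Classify a single measurement against the thresholds for its metric.
function rateMetric(name: MetricName, value: number): Rating {
  const [good, needsImprovement] = THRESHOLDS[name];
  if (value <= good) return "good";
  if (value <= needsImprovement) return "needs-improvement";
  return "poor";
}
```

Placing histogram bucket boundaries exactly on these values means a p75 estimate can be compared against the "poor" boundary without interpolation error at the point that matters.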

Alerts (veza_rum group in alert_rules.yml):
- WebVitalsLCPP75Poor (p75 LCP > 4s on a route+device for 30m)
- WebVitalsCLSP75Poor (p75 CLS > 0.25 for 30m)
- WebVitalsINPP75Poor (p75 INP > 500ms for 30m)
- WebVitalsBeaconsStopped (zero beacons for 30m, compared against
  yesterday's traffic)

Cardinality discipline: labels are bounded to {route, device}
where route is alnum/dash, ≤32 chars, and device is one of
mobile/desktop/tablet/unknown. No per-user labels.
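
The label rules above (route limited to alnum/dash and at most 32 chars, device drawn from a fixed set) could be enforced with something like the sketch below. It is written in TypeScript for illustration and is not the Go handler's code; the function names, the lowercasing step, the dash-collapsing behavior, and the "unknown" fallback for an empty route are assumptions.

```typescript
const DEVICES = ["mobile", "desktop", "tablet", "unknown"] as const;
type Device = (typeof DEVICES)[number];

// Map any client-supplied device string onto the fixed allow-list.
function normalizeDevice(raw: string): Device {
  const d = raw.trim().toLowerCase();
  return (DEVICES as readonly string[]).indexOf(d) !== -1 ? (d as Device) : "unknown";
}

// Reduce a client-supplied route to alnum/dash, at most 32 characters.
function normalizeRoute(raw: string): string {
  const cleaned = raw
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, "-") // anything outside alnum/dash becomes a dash
    .replace(/-{2,}/g, "-")       // collapse runs of dashes
    .replace(/^-+|-+$/g, "");     // trim leading and trailing dashes
  const bounded = cleaned.slice(0, 32).replace(/-+$/, "");
  return bounded === "" ? "unknown" : bounded;
}
```

Because both functions always return a value from a bounded space, a malformed or hostile beacon cannot create new Prometheus label values beyond what the character set and length cap allow.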

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 19:56:44 +02:00
alertmanager feat(observability): SLO burn-rate alerts + 7 runbook stubs (W2 Day 10) 2026-04-28 01:30:34 +02:00
baremetal/apache state-ownership: delete unused optimisticStoreUpdates.ts file 2026-01-15 19:26:53 +01:00
caddy chore(cleanup): remove veza-chat-server directory and all operational references 2026-02-22 21:13:00 +01:00
docker chore(infra): J6 — mark 3 dormant docker-compose files as deprecated 2026-04-15 12:58:39 +02:00
grafana feat(redis): Sentinel HA + cache hit rate metrics (W3 Day 11) 2026-04-28 13:36:55 +02:00
haproxy feat(infra): blue-green deployment via HAProxy 2026-02-23 19:52:19 +01:00
incus chore(cleanup): remove veza-chat-server directory and all operational references 2026-02-22 21:13:00 +01:00
prometheus feat(observability): RUM Web Vitals beacons + alert rules (v1.0.10 ops item 9) 2026-05-04 19:56:44 +02:00
ssl fix(infra): HAProxy HTTPS and stats security 2026-02-15 15:58:51 +01:00
env.example v0.9.5 2026-03-06 10:02:53 +01:00
logging.toml docs: add project documentation, logging config, status script 2026-03-18 11:36:36 +01:00
metrics.yaml BASE: completing the initial repo state 2025-12-03 22:56:50 +01:00
prometheus.yml feat(monitoring): add Alertmanager with Slack notifications 2026-02-23 19:54:55 +01:00