veza/veza-backend-api/internal/metrics/cache_hit_rate.go
senke 15e591305e
feat(cdn): Bunny.net signed URLs + HLS cache headers + metric collision fix (W3 Day 13)
CDN edge in front of S3/MinIO via origin-pull. Backend signs URLs
with Bunny.net token-auth (SHA-256 over security_key + path + expires)
so edges verify before serving cached objects ; a cache hit with a
valid token never reaches the origin. Cloudflare CDN / R2 / CloudFront
stubs kept.

- internal/services/cdn_service.go : new providers CDNProviderBunny +
  CDNProviderCloudflareR2. SecurityKey added to CDNConfig.
  generateBunnySignedURL implements the documented Bunny scheme
  (url-safe base64, no padding, expires query). HLSSegmentCacheHeaders
  + HLSPlaylistCacheHeaders helpers exported for handlers.
- internal/services/cdn_service_test.go : pin Bunny URL shape +
  base64-url charset ; assert empty SecurityKey fails fast (no
  silent fallback to unsigned URLs).
- internal/core/track/service.go : new CDNURLSigner interface +
  SetCDNService(cdn). GetStorageURL prefers CDN signed URL when
  cdnService.IsEnabled, falls back to direct S3 presign on signing
  error so a CDN partial outage doesn't block playback.
- internal/api/routes_tracks.go + routes_core.go : wire SetCDNService
  on the two TrackService construction sites that serve stream/download.
- internal/config/config.go : 4 new env vars (CDN_ENABLED, CDN_PROVIDER,
  CDN_BASE_URL, CDN_SECURITY_KEY). config.CDNService always non-nil
  after init ; IsEnabled gates the actual usage.
- internal/handlers/hls_handler.go : segments now return
  Cache-Control: public, max-age=86400, immutable (content-addressed
  filenames make this safe). Playlists at max-age=60.
- veza-backend-api/.env.template : 4 placeholder env vars.
- docs/ENV_VARIABLES.md §12 : provider matrix + Bunny vs Cloudflare
  vs R2 trade-offs.

Bug fix collateral : v1.0.9 Day 11 introduced veza_cache_hits_total
which collided in name with monitoring.CacheHitsTotal (different
label set ⇒ promauto MustRegister panic at process init). Day 13
deletes the monitoring duplicate and restores the metrics-package
counter as the single source of truth (label: subsystem). All 8
affected packages green : services, core/track, handlers, middleware,
websocket/chat, metrics, monitoring, config.

Acceptance (Day 13) : code path is wired ; verifying via real Bunny
edge requires a Pull Zone provisioned by the user (EX-? in roadmap).
On the user side : create Pull Zone w/ origin = MinIO, copy token
auth key into CDN_SECURITY_KEY, set CDN_ENABLED=true.

W3 progress : Redis Sentinel ✓ · distributed MinIO ✓ · CDN ✓ ·
DMCA  Day 14 · embed  Day 15.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 14:07:20 +02:00

package metrics

// Cache hit/miss counters per subsystem (v1.0.9 W3 Day 11).
//
// Three call-sites instrumented in v1.0.9 :
//   - rate_limiter : Redis INCR result classified as "hit" if Redis
//     delivered a verdict, "miss" if Redis was unreachable and the
//     in-memory fallback kicked in.
//   - chat_pubsub : "hit" on a successful Publish/Subscribe round-trip,
//     "miss" on connection error (Redis unreachable).
//   - presence : "hit" on a successful Get/Set/Del, "miss" on a key
//     that didn't exist (presence stale or never set) or on an
//     underlying Redis error.
//
// Subsystems are passed as labels rather than baked into separate metrics
// so dashboards can pivot. Cardinality is fixed at the three values above
// (plus future additions in W3+) ; never label by user_id / room_id /
// per-key, since that would explode cardinality.
//
// Note (v1.0.9 W3 Day 13) : the original Day 11 metric collided in name
// with `monitoring.CacheHitsTotal` (different label set). Day 13 deletes
// the monitoring duplicate ; this file is the single source of truth.

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var (
	cacheHits = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "veza_cache_hits_total",
			Help: "Total cache hits per subsystem",
		},
		[]string{"subsystem"},
	)
	cacheMisses = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "veza_cache_misses_total",
			Help: "Total cache misses per subsystem",
		},
		[]string{"subsystem"},
	)
)

// RecordCacheHit increments the hit counter for a subsystem. Subsystem
// must be one of the bounded set documented at file-level ; adding a
// new value is a deliberate choice that should also update Grafana.
func RecordCacheHit(subsystem string) {
	cacheHits.WithLabelValues(subsystem).Inc()
}

// RecordCacheMiss increments the miss counter for a subsystem.
func RecordCacheMiss(subsystem string) {
	cacheMisses.WithLabelValues(subsystem).Inc()
}