Two complementary signals: pool-side (do we have enough connections
for the load?) and per-request (does any single handler quietly run
hundreds of queries?). Both feed Prometheus + Grafana + alert rules.
Pool stats exporter (internal/database/pool_stats_exporter.go):
- Background goroutine ticks every 15s and feeds the existing
veza_db_connections{state} gauges. Previously the gauges were only
refreshed when /health/deep was hit, so PoolExhaustionImminent
evaluated against stale data.
- Wired into cmd/api/main.go alongside the ledger sampler, with a
shutdown hook for clean cancellation (sketch below).
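A minimal sketch of what that exporter loop could look like, assuming a *sql.DB handle and the existing veza_db_connections gauge vec; the function name, label values, and registration details are illustrative, not the committed code:

```go
// Illustrative sketch only — the real code lives in
// internal/database/pool_stats_exporter.go; names here are assumed.
package database

import (
	"context"
	"database/sql"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// dbConnections stands in for the existing veza_db_connections gauge vec
// (registration elided for brevity).
var dbConnections = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{Name: "veza_db_connections"},
	[]string{"state"},
)

// StartPoolStatsExporter refreshes the pool gauges every interval until
// ctx is cancelled; the main.go shutdown hook just calls cancel().
func StartPoolStatsExporter(ctx context.Context, db *sql.DB, interval time.Duration) {
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				s := db.Stats()
				dbConnections.WithLabelValues("in_use").Set(float64(s.InUse))
				dbConnections.WithLabelValues("idle").Set(float64(s.Idle))
				dbConnections.WithLabelValues("max").Set(float64(s.MaxOpenConnections))
			}
		}
	}()
}
```

Driving the loop off a context rather than a stop channel keeps the shutdown hook a plain cancel() call, matching how the rest of main.go tears down.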
N+1 detector (internal/database/n1_detector.go +
internal/middleware/n1_query_counter.go):
- Per-request *int64 counter attached to the request context by the
gin middleware; a GORM after-callback (Query/Create/Update/Delete/
Row/Raw) atomically increments it (sketch after this list).
- Cost: one pointer load + one atomic add per query.
- Cardinality bounded by c.FullPath() (templated route, not the raw URL).
- Threshold defaults to 50; override via VEZA_N1_THRESHOLD.
- Emits histogram veza_db_request_query_count and counter
veza_db_n1_suspicions_total.
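A hedged sketch of the two halves, assuming a gin + GORM v2 setup; ctxKey, the function names, and the metric plumbing are assumptions (only GORM's built-in "gorm:query"/"gorm:raw" hook names come from the library):

```go
// Illustrative sketch only — mirrors the shape of
// internal/middleware/n1_query_counter.go and internal/database/n1_detector.go;
// ctxKey, function names, and metric plumbing are assumptions.
package middleware

import (
	"context"
	"sync/atomic"

	"github.com/gin-gonic/gin"
	"gorm.io/gorm"
)

type ctxKey struct{}

// N1QueryCounter plants a per-request *int64 in the request context and
// inspects it once the handler chain has finished.
func N1QueryCounter(threshold int64) gin.HandlerFunc {
	return func(c *gin.Context) {
		var n int64
		c.Request = c.Request.WithContext(
			context.WithValue(c.Request.Context(), ctxKey{}, &n))
		c.Next()
		total := atomic.LoadInt64(&n)
		if total > threshold {
			// Bump veza_db_n1_suspicions_total labelled by c.FullPath()
			// (templated route, so cardinality stays bounded).
		}
		// Observe total into the veza_db_request_query_count histogram.
	}
}

// RegisterN1Callbacks hooks the counter into GORM's after-callbacks; the
// remaining processors (Create/Update/Delete/Row) register the same way.
func RegisterN1Callbacks(db *gorm.DB) error {
	count := func(db *gorm.DB) {
		if p, ok := db.Statement.Context.Value(ctxKey{}).(*int64); ok {
			atomic.AddInt64(p, 1) // one pointer load + one atomic add
		}
	}
	if err := db.Callback().Query().After("gorm:query").Register("n1:count", count); err != nil {
		return err
	}
	return db.Callback().Raw().After("gorm:raw").Register("n1:count:raw", count)
}
```

Reading the pointer out of db.Statement.Context means queries issued outside an instrumented request cost only a failed type assertion, and nothing is shared across goroutines except the atomic counter itself.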
Alerts in the veza_db_pool_n1 group of alert_rules.yml (example rule below):
- PoolExhaustionImminent (in_use ≥ 90% for 5m)
- PoolStatsExporterStuck (gauges frozen for 10m despite traffic)
- N1QuerySpike (> 3% of requests over threshold for 15m)
- SlowQuerySustained (slow-query rate > 2/min for 15m on the same op+table)
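For shape, a sketch of how the first rule might read in alert_rules.yml; the label names and the exact expression are assumptions, not the committed rule:

```yaml
# Illustrative shape only — the label names ("in_use", "max") and the
# exact expression are assumptions, not the committed rule.
groups:
  - name: veza_db_pool_n1
    rules:
      - alert: PoolExhaustionImminent
        expr: |
          veza_db_connections{state="in_use"}
            / veza_db_connections{state="max"} >= 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "DB connection pool >= 90% in use for 5 minutes"
```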
Tests: 8 detector tests + 4 middleware tests, all passing (sketch below).
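One of the middleware tests might look roughly like this, building on the sketch above; the real suite's names and assertions are not copied here:

```go
// Illustrative test sketch — builds on the middleware sketch above;
// route and test names are assumed.
package middleware

import (
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"testing"

	"github.com/gin-gonic/gin"
)

func TestCounterAttachedToContext(t *testing.T) {
	gin.SetMode(gin.TestMode)
	r := gin.New()
	r.Use(N1QueryCounter(50))
	r.GET("/ping", func(c *gin.Context) {
		// The GORM callback would locate this pointer the same way.
		p, ok := c.Request.Context().Value(ctxKey{}).(*int64)
		if !ok {
			t.Fatal("per-request counter missing from context")
		}
		atomic.AddInt64(p, 3) // simulate three queries
		c.Status(http.StatusOK)
	})
	w := httptest.NewRecorder()
	r.ServeHTTP(w, httptest.NewRequest(http.MethodGet, "/ping", nil))
	if w.Code != http.StatusOK {
		t.Fatalf("got status %d, want 200", w.Code)
	}
}
```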
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>