Three Incus containers, each running redis-server + redis-sentinel (co-located). redis-1 = master at first boot, redis-2/3 = replicas. Sentinel quorum=2 of 3 ; failover-timeout=30s satisfies the W3 acceptance criterion.

- internal/config/redis_init.go : initRedis branches on REDIS_SENTINEL_ADDRS ; non-empty -> redis.NewFailoverClient with MasterName + SentinelAddrs + SentinelPassword. Empty -> existing single-instance NewClient (dev/local stays parametric).
- internal/config/config.go : 3 new fields (RedisSentinelAddrs, RedisSentinelMasterName, RedisSentinelPassword) read from env. parseRedisSentinelAddrs trims+filters CSV.
- internal/metrics/cache_hit_rate.go : new RecordCacheHit / Miss counters, labelled by subsystem. Cardinality bounded.
- internal/middleware/rate_limiter.go : instrument 3 Eval call sites (DDoS, frontend log throttle, upload throttle). Hit = Redis answered, Miss = error -> in-memory fallback.
- internal/services/chat_pubsub.go : instrument Publish + PublishPresence.
- internal/websocket/chat/presence_service.go : instrument SetOnline / SetOffline / Heartbeat / GetPresence. redis.Nil counts as a hit (legitimate empty result).
- infra/ansible/roles/redis_sentinel/ : install Redis 7 + Sentinel, render redis.conf + sentinel.conf, systemd units. Vault assertion prevents shipping placeholder passwords to staging/prod.
- infra/ansible/playbooks/redis_sentinel.yml : provisions the 3 containers + applies common baseline + role.
- infra/ansible/inventory/lab.yml : new groups redis_ha + redis_ha_master.
- infra/ansible/tests/test_redis_failover.sh : kills the master container, polls Sentinel for the new master, asserts elapsed < 30s.
- config/grafana/dashboards/redis-cache-overview.json : 3 hit-rate stats (rate_limiter / chat_pubsub / presence) + ops/s breakdown.
- docs/ENV_VARIABLES.md §3 : 3 new REDIS_SENTINEL_* env vars.
- veza-backend-api/.env.template : 3 placeholders (empty default).
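The CSV handling and the Sentinel/single-instance branch described above can be sketched as follows. This is a minimal illustration, not the code from config.go or redis_init.go: `parseSentinelAddrs` and `useSentinel` are hypothetical names, and port 26379 is assumed only because it is Sentinel's default.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSentinelAddrs is a hypothetical stand-in for parseRedisSentinelAddrs:
// it splits a comma-separated list, trims whitespace around each entry,
// and drops entries that are empty after trimming.
func parseSentinelAddrs(raw string) []string {
	var addrs []string
	for _, part := range strings.Split(raw, ",") {
		if addr := strings.TrimSpace(part); addr != "" {
			addrs = append(addrs, addr)
		}
	}
	return addrs
}

// useSentinel mirrors the branch described for initRedis: a non-empty
// address list selects the failover client, an empty list keeps the
// single-instance client.
func useSentinel(addrs []string) bool {
	return len(addrs) > 0
}

func main() {
	// Assumed example value for REDIS_SENTINEL_ADDRS, with stray spaces
	// and an empty entry to show the filtering.
	addrs := parseSentinelAddrs(" redis-1:26379, ,redis-2:26379,redis-3:26379 ")
	fmt.Println(addrs, useSentinel(addrs))

	// An unset variable yields no addresses, so the single-instance path is taken.
	fmt.Println(useSentinel(parseSentinelAddrs("")))
}
```

Filtering blank entries before the length check means a value such as `" , "` behaves like an unset variable, so dev and local setups do not accidentally switch to the failover client.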
Acceptance (Day 11) : Sentinel failover < 30s ; cache hit-rate dashboard populated. Lab test pending Sentinel deployment.

W3 verification gate progress : Redis Sentinel ✓ (this commit), MinIO EC4+2 ⏳ Day 12, CDN ⏳ Day 13, DMCA ⏳ Day 14, embed ⏳ Day 15.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
102 lines
4.5 KiB
JSON
{
  "annotations": { "list": [] },
  "editable": true,
  "fiscalYearStartMonth": 0,
  "graphTooltip": 1,
  "id": null,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "datasource": { "type": "prometheus", "uid": "prometheus" },
      "fieldConfig": {
        "defaults": { "unit": "percentunit", "min": 0, "max": 1, "color": { "mode": "thresholds" }, "thresholds": { "mode": "absolute", "steps": [{ "color": "red", "value": null }, { "color": "yellow", "value": 0.9 }, { "color": "green", "value": 0.99 }] } },
        "overrides": []
      },
      "gridPos": { "h": 6, "w": 8, "x": 0, "y": 0 },
      "id": 1,
      "options": { "reduceOptions": { "calcs": ["lastNotNull"] }, "orientation": "auto" },
      "targets": [
        {
          "expr": "sum(rate(veza_cache_hits_total{subsystem=\"rate_limiter\"}[5m])) / (sum(rate(veza_cache_hits_total{subsystem=\"rate_limiter\"}[5m])) + sum(rate(veza_cache_misses_total{subsystem=\"rate_limiter\"}[5m])))",
          "refId": "A"
        }
      ],
      "title": "Rate limiter — cache hit rate (5m)",
      "type": "stat"
    },
    {
      "datasource": { "type": "prometheus", "uid": "prometheus" },
      "fieldConfig": {
        "defaults": { "unit": "percentunit", "min": 0, "max": 1, "color": { "mode": "thresholds" }, "thresholds": { "mode": "absolute", "steps": [{ "color": "red", "value": null }, { "color": "yellow", "value": 0.9 }, { "color": "green", "value": 0.99 }] } },
        "overrides": []
      },
      "gridPos": { "h": 6, "w": 8, "x": 8, "y": 0 },
      "id": 2,
      "options": { "reduceOptions": { "calcs": ["lastNotNull"] }, "orientation": "auto" },
      "targets": [
        {
          "expr": "sum(rate(veza_cache_hits_total{subsystem=\"chat_pubsub\"}[5m])) / (sum(rate(veza_cache_hits_total{subsystem=\"chat_pubsub\"}[5m])) + sum(rate(veza_cache_misses_total{subsystem=\"chat_pubsub\"}[5m])))",
          "refId": "A"
        }
      ],
      "title": "Chat PubSub — hit rate (5m)",
      "type": "stat"
    },
    {
      "datasource": { "type": "prometheus", "uid": "prometheus" },
      "fieldConfig": {
        "defaults": { "unit": "percentunit", "min": 0, "max": 1, "color": { "mode": "thresholds" }, "thresholds": { "mode": "absolute", "steps": [{ "color": "red", "value": null }, { "color": "yellow", "value": 0.9 }, { "color": "green", "value": 0.99 }] } },
        "overrides": []
      },
      "gridPos": { "h": 6, "w": 8, "x": 16, "y": 0 },
      "id": 3,
      "options": { "reduceOptions": { "calcs": ["lastNotNull"] }, "orientation": "auto" },
      "targets": [
        {
          "expr": "sum(rate(veza_cache_hits_total{subsystem=\"presence\"}[5m])) / (sum(rate(veza_cache_hits_total{subsystem=\"presence\"}[5m])) + sum(rate(veza_cache_misses_total{subsystem=\"presence\"}[5m])))",
          "refId": "A"
        }
      ],
      "title": "Presence — hit rate (5m)",
      "type": "stat"
    },
    {
      "datasource": { "type": "prometheus", "uid": "prometheus" },
      "fieldConfig": { "defaults": { "unit": "ops", "color": { "mode": "palette-classic" } }, "overrides": [] },
      "gridPos": { "h": 8, "w": 12, "x": 0, "y": 6 },
      "id": 4,
      "options": { "legend": { "displayMode": "list", "placement": "bottom" } },
      "targets": [
        { "expr": "sum by (subsystem) (rate(veza_cache_hits_total[5m]))", "legendFormat": "{{subsystem}} hits", "refId": "A" },
        { "expr": "sum by (subsystem) (rate(veza_cache_misses_total[5m]))", "legendFormat": "{{subsystem}} misses", "refId": "B" }
      ],
      "title": "Hits + misses per subsystem (ops/s)",
      "type": "timeseries"
    },
    {
      "datasource": { "type": "prometheus", "uid": "prometheus" },
      "fieldConfig": { "defaults": { "unit": "short", "color": { "mode": "palette-classic" } }, "overrides": [] },
      "gridPos": { "h": 8, "w": 12, "x": 12, "y": 6 },
      "id": 5,
      "options": { "legend": { "displayMode": "list", "placement": "bottom" } },
      "targets": [
        { "expr": "redis_connected_clients", "legendFormat": "{{instance}} clients", "refId": "A" },
        { "expr": "redis_connected_slaves", "legendFormat": "{{instance}} replicas", "refId": "B" }
      ],
      "title": "Redis connectivity (requires redis_exporter)",
      "type": "timeseries"
    }
  ],
  "refresh": "30s",
  "schemaVersion": 38,
  "style": "dark",
  "tags": ["veza", "redis", "cache"],
  "templating": { "list": [] },
  "time": { "from": "now-1h", "to": "now" },
  "timepicker": {},
  "timezone": "browser",
  "title": "Veza Redis + Cache Hit Rate",
  "uid": "veza-redis-cache",
  "version": 1
}