Three Incus containers, each running redis-server + redis-sentinel (co-located). redis-1 = master at first boot, redis-2/3 = replicas. Sentinel quorum=2 of 3 ; failover-timeout=30s satisfies the W3 acceptance criterion.

- internal/config/redis_init.go : initRedis branches on REDIS_SENTINEL_ADDRS ; non-empty -> redis.NewFailoverClient with MasterName + SentinelAddrs + SentinelPassword. Empty -> existing single-instance NewClient (dev/local stays parametric).
- internal/config/config.go : 3 new fields (RedisSentinelAddrs, RedisSentinelMasterName, RedisSentinelPassword) read from env. parseRedisSentinelAddrs trims+filters CSV.
- internal/metrics/cache_hit_rate.go : new RecordCacheHit / Miss counters, labelled by subsystem. Cardinality bounded.
- internal/middleware/rate_limiter.go : instrument 3 Eval call sites (DDoS, frontend log throttle, upload throttle). Hit = Redis answered, Miss = error -> in-memory fallback.
- internal/services/chat_pubsub.go : instrument Publish + PublishPresence.
- internal/websocket/chat/presence_service.go : instrument SetOnline / SetOffline / Heartbeat / GetPresence. redis.Nil counts as a hit (legitimate empty result).
- infra/ansible/roles/redis_sentinel/ : install Redis 7 + Sentinel, render redis.conf + sentinel.conf, systemd units. Vault assertion prevents shipping placeholder passwords to staging/prod.
- infra/ansible/playbooks/redis_sentinel.yml : provisions the 3 containers + applies common baseline + role.
- infra/ansible/inventory/lab.yml : new groups redis_ha + redis_ha_master.
- infra/ansible/tests/test_redis_failover.sh : kills the master container, polls Sentinel for the new master, asserts elapsed < 30s.
- config/grafana/dashboards/redis-cache-overview.json : 3 hit-rate stats (rate_limiter / chat_pubsub / presence) + ops/s breakdown.
- docs/ENV_VARIABLES.md §3 : 3 new REDIS_SENTINEL_* env vars.
- veza-backend-api/.env.template : 3 placeholders (empty default).
Acceptance (Day 11) : Sentinel failover < 30s ; cache hit-rate dashboard populated. Lab test pending Sentinel deployment.

W3 verification gate progress : Redis Sentinel ✓ (this commit), MinIO EC4+2 ⏳ Day 12, CDN ⏳ Day 13, DMCA ⏳ Day 14, embed ⏳ Day 15.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
54 lines
1.8 KiB
Django/Jinja
# Managed by Ansible — do not edit by hand.
# Veza Redis 7 config — replication via Sentinel (see sentinel.conf).
#
# Topology at first boot :
#   redis-1 : master
#   redis-2 : replicaof redis-1.lxd
#   redis-3 : replicaof redis-1.lxd
# After failover, Sentinel rewrites this file in-place to point at the
# new master. Do NOT re-render this template after first boot — set
# `force: false` in the Ansible task that owns it.

bind {{ redis_bind }}
port {{ redis_port }}
protected-mode {{ redis_protected_mode }}
daemonize no
supervised systemd

requirepass {{ redis_password }}
masterauth {{ redis_password }}

{% if pg_auto_failover_role is not defined and inventory_hostname != groups['redis_ha_master'][0] %}
# Replicas point at the bootstrap master. Sentinel re-points them on
# failover ; this directive only matters at first boot.
replicaof {{ groups['redis_ha_master'][0] }}.lxd {{ redis_port }}
{% endif %}

# Replica reads kept on so the backend's read-mostly fanout (chat
# pubsub history, presence GETs) can be served by either replica
# during steady state.
replica-read-only yes

# Persistence — AOF + occasional RDB. AOF gives ~ 1s RPO with
# everysec ; RDB is fast restore.
{% if redis_aof_enabled %}
appendonly yes
appendfsync everysec
{% else %}
appendonly no
{% endif %}
save {{ redis_save_config }}

# Memory cap + eviction. Eviction is OK for the use cases we have in
# v1.0 (sessions, rate-limit counters, presence — all reconstructible).
maxmemory {{ redis_maxmemory }}
maxmemory-policy {{ redis_maxmemory_policy }}

# Logging
logfile /var/log/redis/redis-server.log
loglevel notice

# Slow log — anything > 10ms gets captured. Useful when we suspect a
# slow Lua script (rate limiter Eval) is back-pressuring.
slowlog-log-slower-than 10000
slowlog-max-len 256