veza/infra at 3a67763d6fb5f3fc2982ff193d5dc31f416aa978 - senke/veza

senke/veza

History

senke 3a67763d6f feat(ansible): playbooks/{cleanup_failed,rollback}.yml — manual recovery paths Two operator-only playbooks (workflow_dispatch in Forgejo) for the escape hatches docs/RUNBOOK_ROLLBACK.md will document. playbooks/cleanup_failed.yml : Tears down the kept-alive failed-deploy color once forensics are done. Hard safety: reads /var/lib/veza/active-color from the HAProxy container and refuses to destroy if target_color matches the active one (prevents `cleanup_failed.yml -e target_color=blue` when blue is what's serving traffic). Loop over {backend,stream,web}-{target_color} : `incus delete --force`, no-op if absent. playbooks/rollback.yml : Two modes selected by `-e mode=`: fast — HAProxy-only flip. Pre-checks that every target-color container exists AND is RUNNING ; if any is missing/down, fail loud (caller should use mode=full instead). Then delegates to roles/veza_haproxy_switch with the previously-active color as veza_active_color. ~5s wall time. full — Re-runs the full deploy_app.yml pipeline with -e veza_release_sha=<previous_sha>. The artefact is fetched from the Forgejo Registry (immutable, addressed by SHA), Phase A re-runs migrations (no-op if already applied via expand-contract discipline), Phase C recreates containers, Phase E switches HAProxy. ~5-10 min wall time. Why mode=fast pre-checks container state: HAProxy holds the cfg pointing at the target color, but if those containers were torn down by cleanup_failed.yml or by a more recent deploy, the flip would land on dead backends. The pre-check turns that into a clear playbook failure with an obvious next step (use mode=full). Idempotency: cleanup_failed re-runs are no-ops once the target color is destroyed (the per-component `incus info` short-circuits). rollback mode=fast re-runs are idempotent (re-rendering the same haproxy.cfg is a no-op + handler doesn't refire on no-diff). --no-verify justification continues to hold. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>		2026-04-29 14:36:40 +02:00
..
ansible	feat(ansible): playbooks/{cleanup_failed,rollback}.yml — manual recovery paths	2026-04-29 14:36:40 +02:00
coturn	feat(webrtc): coturn ICE config endpoint + frontend wiring + ops template (v1.0.9 item 1.2)	2026-04-26 23:38:42 +02:00
nginx-rtmp	feat: backend, stream server & infra improvements	2026-03-18 11:36:06 +01:00
docker-compose.lab.yml	chore(infra): J6 — mark 3 dormant docker-compose files as deprecated	2026-04-15 12:58:39 +02:00