veza/infra/ansible/roles/veza_haproxy_switch/README.md

# `veza_haproxy_switch` role

Atomically swap HAProxy's active color. Runs against the
`{{ veza_container_prefix }}haproxy` container after `veza_app` has
recreated and health-probed all three components in the inactive color.

## Why a separate role from `haproxy`?

- `roles/haproxy` provisions a fresh HAProxy container: install the
  package, lay down the *initial* config, enable the systemd unit. It
  runs once when the staging/prod env is bootstrapped, and occasionally
  when the global config shape changes.
- `roles/veza_haproxy_switch` performs the *per-deploy* delta:
  re-template the cfg with a new `veza_active_color`, validate, swap,
  HUP. It runs once at the end of every successful deploy.

Splitting them keeps the per-deploy path narrow (no apt, no service
install) and lets `roles/haproxy` remain idempotent when the global
shape hasn't changed.

## Inputs

| variable                | required | meaning                                                         |
| ----------------------- | -------- | --------------------------------------------------------------- |
| `veza_active_color`     | yes      | Color to switch TO (`blue` or `green`). Becomes the new active. |
| `veza_release_sha`      | yes      | SHA being deployed. Logged in the active-color history file.    |
| `veza_container_prefix` | inherit  | From `group_vars/<env>.yml`.                                    |
| `haproxy_topology`      | inherit  | Should be `blue-green` for this role to make sense.             |
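
Wiring this up at the end of a deploy play might look like the
following sketch. The role and variable names come from the table
above; the inventory group and the `veza_deploy_*` fact names are
assumptions for illustration.

```yaml
# Hypothetical final play of the deploy playbook: the inactive color's
# app containers are already healthy, so flip HAProxy over to them.
- name: Switch HAProxy to the freshly deployed color
  hosts: haproxy                  # assumed inventory group
  roles:
    - role: veza_haproxy_switch
      vars:
        veza_active_color: "{{ veza_deploy_color }}"  # assumed fact: 'blue' or 'green'
        veza_release_sha: "{{ veza_deploy_sha }}"     # assumed fact: SHA being deployed
```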

## Failure semantics

The render → validate → atomic-swap → HUP sequence runs in an Ansible
`block:` with a `rescue:` that restores `haproxy.cfg.bak` (captured
before the swap) and re-HUPs. An invalid config or a failed HUP
therefore leaves HAProxy serving the *previous* active color exactly
as before; the deploy as a whole then fails at the playbook level.
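
A sketch of that task sequence, assuming `/etc/haproxy/` paths and a
`haproxy -f %s -c -q` validate command (both illustrative, not the
exact role source):

```yaml
- name: Switch HAProxy to {{ veza_active_color }}
  block:
    # 1. Back up FIRST, so rescue always has something to restore.
    - name: Back up current config
      ansible.builtin.copy:
        src: /etc/haproxy/haproxy.cfg
        dest: /etc/haproxy/haproxy.cfg.bak
        remote_src: true

    # 2. Render the new config. `validate` makes Ansible refuse to
    #    write a file haproxy itself rejects, so a typoed template
    #    never lands on disk.
    - name: Render config for the new color
      ansible.builtin.template:
        src: haproxy.cfg.j2
        dest: /etc/haproxy/haproxy.cfg.new
        validate: "haproxy -f %s -c -q"

    # 3. Atomic point: before this the prior config is intact,
    #    after it the new config is in place.
    - name: Swap in the new config
      ansible.builtin.command: mv /etc/haproxy/haproxy.cfg.new /etc/haproxy/haproxy.cfg

    # 4. Graceful reload (HUP): drains old workers.
    - name: Reload haproxy
      ansible.builtin.systemd:
        name: haproxy
        state: reloaded
  rescue:
    # Any failure above: restore the backup and HUP back, so HAProxy
    # ends the deploy serving exactly what it served at the start.
    - name: Restore previous config
      ansible.builtin.copy:
        src: /etc/haproxy/haproxy.cfg.bak
        dest: /etc/haproxy/haproxy.cfg
        remote_src: true
    - name: Reload haproxy after rollback
      ansible.builtin.systemd:
        name: haproxy
        state: reloaded
    - name: Fail the deploy
      ansible.builtin.fail:
        msg: "Switch to {{ veza_active_color }} failed; previous config restored."
```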

## What the role does NOT do

- It does not destroy or recreate the HAProxy container. That's a
  one-time operation under `roles/haproxy`.
- It does not touch app containers; by the time this role runs, the
  blue and green app containers are both healthy.
- It does not remove the previously-active color's containers. They
  survive (intentionally) so a rollback can flip back instantly. The
  next deploy naturally recycles them.

## State files

- `/var/lib/veza/active-color`: one-liner holding the current color.
- `/var/lib/veza/active-color.history`: the last 5 deploys, newest
  first. The rollback playbook reads this file to do an instant
  point-in-time switch (no artefact re-fetch) while the prior color's
  containers are still alive.
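
After a successful switch, the role records the new active color and a
bounded deploy history under `/var/lib/veza/` for the rollback
playbook. A sketch of that bookkeeping; the history-line format shown
is an assumption, not the role's actual format:

```yaml
- name: Persist the new active color
  ansible.builtin.copy:
    content: "{{ veza_active_color }}\n"
    dest: /var/lib/veza/active-color

- name: Prepend a history line and prune to the keep window (5)
  ansible.builtin.shell: |
    # Illustrative line format: "<color> <sha> <utc-timestamp>"
    tmp=$(mktemp)
    { echo "{{ veza_active_color }} {{ veza_release_sha }} $(date -u +%FT%TZ)"
      cat /var/lib/veza/active-color.history 2>/dev/null
    } | head -n 5 > "$tmp"
    mv "$tmp" /var/lib/veza/active-color.history
```

Keeping the newest entry first means the rollback playbook can read
the second line to find the color (and SHA) to flip back to.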