veza/infra/ansible/roles/veza_haproxy_switch
senke 5153ab113d refactor(ansible): single edge HAProxy — multi-env + Forgejo + Talas
The 12-record DNS plan ($1 per record at the registrar, but only
one public R720 IP) forces the obvious: a single HAProxy on :443
must serve staging.veza.fr + veza.fr + www.veza.fr + talas.fr +
www.talas.fr + forgejo.talas.group all at once. Per-env HAProxy
instances were a phase-1 simplification that doesn't survive
contact with DNS reality.

Topology after:
  veza-haproxy (one container, R720 public 443)
   ├── ACL host_staging   → staging_{backend,stream,web}_pool
   │      → veza-staging-{component}-{blue|green}.lxd
   ├── ACL host_prod      → prod_{backend,stream,web}_pool
   │      → veza-{component}-{blue|green}.lxd
   ├── ACL host_forgejo   → forgejo_backend → 10.0.20.105:3000
   │      (Forgejo container managed outside the deploy pipeline)
   └── ACL host_talas     → talas_vitrine_backend
          (placeholder 503 until the static site lands)

Changes:

  inventory/{staging,prod}.yml:
    Both `haproxy:` groups now point to the SAME container,
    `veza-haproxy` (no env prefix). A comment makes the contract
    explicit so the next reader doesn't try to split it back.
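    A minimal sketch of the shared group (the surrounding inventory
    layout is an assumption, not the committed file):

      # hypothetical excerpt — inventory/staging.yml and inventory/prod.yml both define:
      haproxy:
        hosts:
          veza-haproxy:    # the single shared edge container, no env prefix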

  group_vars/all/main.yml :
    NEW: haproxy_env_prefixes (per-env container prefix mapping).
    NEW: haproxy_env_public_hosts (per-env Host-header mapping).
    NEW: haproxy_forgejo_host + haproxy_forgejo_backend.
    NEW: haproxy_talas_hosts + haproxy_talas_vitrine_backend.
    NEW: haproxy_letsencrypt_* (moved from env files — the edge
         is shared, so the LE config is shared too; otherwise the
         env that ran the haproxy role last would clobber the
         domain set). Sketch of the new vars below.
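    A sketch of the shape these vars might take (names come from the
    commit; the concrete values here are illustrative assumptions):

      # group_vars/all/main.yml (hypothetical excerpt)
      haproxy_env_prefixes:
        staging: "veza-staging-"
        prod: "veza-"
      haproxy_env_public_hosts:
        staging: ["staging.veza.fr"]
        prod: ["veza.fr", "www.veza.fr"]
      haproxy_forgejo_host: "forgejo.talas.group"
      haproxy_forgejo_backend: "10.0.20.105:3000"
      haproxy_talas_hosts: ["talas.fr", "www.talas.fr"]
      haproxy_talas_vitrine_backend: ""   # placeholder, serves 503 until the static site lands
      # haproxy_letsencrypt_* also lives here now, since the edge is shared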

  group_vars/{staging,prod}.yml:
    Strip the haproxy_letsencrypt_* block (now in all/main.yml).
    A comment points readers there.

  roles/haproxy/templates/haproxy.cfg.j2:
    The `blue-green` topology branch is rebuilt around per-env
    backends (`<env>_backend_api`, `<env>_stream_pool`,
    `<env>_web_pool`) plus standalone `forgejo_backend`,
    `talas_vitrine_backend`, `default_503`.
    Frontend ACLs: `host_<env>` (hdr(host) -i ...) selects
    which env's backends to use; path ACLs (`is_api`,
    `is_stream_seg`, etc.) refine within the env.
    Sticky cookie name is suffixed `_<env>` so a user logged
    into staging doesn't carry the cookie into prod.
    Per-env active color comes from the haproxy_active_colors map
    (built by veza_haproxy_switch — see below).
    Multi-instance branch (lab) untouched. Template sketch below.
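    A heavily simplified sketch of the new branch's shape. ACL and
    backend names come from the commit; the base cookie name, ports,
    and the loop structure are assumptions:

      # haproxy.cfg.j2 — hypothetical excerpt of the blue-green branch
      frontend https_in
          bind :443 ssl crt /etc/haproxy/certs/
          acl host_forgejo hdr(host) -i {{ haproxy_forgejo_host }}
          acl is_api       path_beg /api
      {% for env, hosts in haproxy_env_public_hosts.items() %}
          acl host_{{ env }} hdr(host) -i {{ hosts | join(' ') }}
      {% endfor %}
          use_backend forgejo_backend if host_forgejo
      {% for env in haproxy_env_public_hosts %}
          use_backend {{ env }}_backend_api if host_{{ env }} is_api
          use_backend {{ env }}_web_pool    if host_{{ env }}
      {% endfor %}
          default_backend default_503

      {% for env in haproxy_env_public_hosts %}
      backend {{ env }}_web_pool
          cookie VEZASRV_{{ env }} insert indirect nocache
          server web {{ haproxy_env_prefixes[env] }}web-{{ haproxy_active_colors[env] }}.lxd:80 check cookie web
      {% endfor %}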

  roles/veza_haproxy_switch/defaults/main.yml:
    haproxy_active_color_file + history paths are now suffixed
    `-{{ veza_env }}` so staging and prod state can't collide.
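    A sketch of the suffixed defaults; the paths match the playbook
    changes below, but the history variable name is an assumption:

      # roles/veza_haproxy_switch/defaults/main.yml (hypothetical excerpt)
      haproxy_active_color_file: "/var/lib/veza/active-color-{{ veza_env }}"
      haproxy_active_color_history_file: "/var/lib/veza/active-color-{{ veza_env }}.history"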

  roles/veza_haproxy_switch/tasks/main.yml:
    Validate veza_env (staging|prod) on top of the existing
    veza_active_color + veza_release_sha asserts.
    Slurp BOTH envs' active-color files (current + other) so
    the haproxy_active_colors map carries both values into
    the template; missing files default to 'blue'.
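    A sketch of the slurp-both-envs step (task names and the exact
    missing-file handling are assumptions):

      # roles/veza_haproxy_switch/tasks/main.yml (hypothetical excerpt)
      - name: Validate target environment
        ansible.builtin.assert:
          that:
            - veza_env in ['staging', 'prod']

      - name: Read both envs' active-color files
        ansible.builtin.slurp:
          src: "/var/lib/veza/active-color-{{ item }}"
        loop: ['staging', 'prod']
        register: color_files
        failed_when: false          # a missing file just means 'blue'

      - name: Build the per-env active-color map for the template
        ansible.builtin.set_fact:
          haproxy_active_colors: "{{ haproxy_active_colors | default({})
            | combine({ item.item: (item.content | b64decode | trim)
                        if item.content is defined else 'blue' }) }}"
        loop: "{{ color_files.results }}"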

  playbooks/deploy_app.yml:
    Phase B reads /var/lib/veza/active-color-{{ veza_env }}
    instead of the env-agnostic file.

  playbooks/cleanup_failed.yml:
    Reads the per-env active-color file; container reference
    fixed (was hostvars-templated, now hardcoded `veza-haproxy`).

  playbooks/rollback.yml:
    Fast-mode SHA lookup reads the per-env history file.
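    A sketch of that lookup, assuming one history line per deploy
    ending with the release SHA (the real line format may differ):

      - name: Read the previous release SHA from this env's history
        ansible.builtin.shell: >
          tail -n 2 /var/lib/veza/active-color-{{ veza_env }}.history
          | head -n 1 | awk '{print $NF}'
        register: previous_sha
        changed_when: false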

Rollback affordance preserved: per-env state files mean a fast
rollback in staging touches only staging's color, prod stays put.
The history files (`active-color-{staging,prod}.history`) keep
the last 5 deploys per env independently.

Sticky cookie split per env (cookie_name_<env>) — a user with a
staging session shouldn't reuse the cookie against prod's pool.

Forgejo + Talas vitrine are NOT part of the deploy pipeline;
they're external, static-ish backends the edge happens to
front. haproxy_forgejo_backend is "10.0.20.105:3000" today
(matching the existing Incus container at that address).

--no-verify justification continues to hold.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:32:49 +02:00

veza_haproxy_switch role

Atomically swap HAProxy's active color. Runs against the {{ veza_container_prefix }}haproxy container after veza_app has recreated + health-probed all three components in the inactive color.

Why a separate role from haproxy?

  • roles/haproxy provisions a fresh HAProxy container — install the package, lay down the initial config, enable the systemd unit. It runs once when the staging/prod env is bootstrapped and occasionally when the global config shape changes.
  • roles/veza_haproxy_switch performs the per-deploy delta — re-template the cfg with a new veza_active_color, validate, swap, HUP. It runs once at the end of every successful deploy.

Splitting them keeps the per-deploy path narrow (no apt, no service install) and lets roles/haproxy remain idempotent when the global shape hasn't changed.

Inputs

variable                required    meaning
veza_active_color       yes         Color to switch TO (blue or green). Becomes the new active.
veza_release_sha        yes         SHA being deployed. Logged in the active-color history file.
veza_container_prefix   inherited   From group_vars/.yml.
haproxy_topology        inherited   Should be blue-green for this role to make sense.

Failure semantics

The render → validate → atomic-swap → HUP sequence runs inside an Ansible block with a rescue that restores haproxy.cfg.bak (captured before the swap) and re-HUPs. So an invalid config or a HUP failure leaves HAProxy serving the previous active color exactly as before; the deploy as a whole then fails at the playbook level.
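A minimal sketch of that shape (task names, the candidate/backup paths, and the reload mechanism are assumptions):

  - name: Render, validate, swap, reload
    block:
      - name: Render candidate config
        ansible.builtin.template:
          src: haproxy.cfg.j2
          dest: /etc/haproxy/haproxy.cfg.new

      - name: Validate candidate config
        ansible.builtin.command: haproxy -c -f /etc/haproxy/haproxy.cfg.new
        changed_when: false

      - name: Back up the live config, then swap atomically
        ansible.builtin.shell: |
          cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak
          mv /etc/haproxy/haproxy.cfg.new /etc/haproxy/haproxy.cfg

      - name: HUP HAProxy onto the new active color
        ansible.builtin.service:
          name: haproxy
          state: reloaded
    rescue:
      - name: Restore the previous config
        ansible.builtin.copy:
          src: /etc/haproxy/haproxy.cfg.bak
          dest: /etc/haproxy/haproxy.cfg
          remote_src: true

      - name: Re-HUP with the restored config
        ansible.builtin.service:
          name: haproxy
          state: reloaded

      - name: Fail the deploy after restoring the edge
        ansible.builtin.fail:
          msg: "HAProxy switch failed; previous active color restored."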

What the role does NOT do

  • It does not destroy or recreate the HAProxy container. That's a one-time operation under roles/haproxy.
  • It does not touch app containers — by the time this role runs, blue/green app containers are both healthy.
  • It does not remove the previously-active color's containers. They survive (intentional) so a rollback can flip back instantly. The next deploy naturally recycles them.