
# `haproxy` role — TLS termination + sticky-WS load balancer

Single Incus container in front of the active/active backend-api fleet and the stream-server fleet. v1.0.9 W4 Day 19 — phase-1 of the HA story (single-host LB ; phase-2 adds keepalived for an LB pair).

## Topology

```
                         :80 / :443
                              │
                       ┌──────▼─────────┐
                       │   haproxy.lxd  │   (this role)
                       │  HTTP + WS     │
                       │  TLS terminate │
                       │  sticky cookie │
                       └─┬───────┬──────┘
                         │       │
              ┌──────────┘       └──────────┐
              ▼                              ▼
       ┌──────────────┐              ┌──────────────┐
       │ api_pool     │              │ stream_pool  │
       │ ─────────    │              │ ─────────    │
       │ backend-api-1│              │ stream-srv-1 │
       │ backend-api-2│              │ stream-srv-2 │
       │  (port 8080) │              │  (port 8082) │
       │  Round-robin │              │  URI-hash    │
       │ Sticky cookie│              │  (track_id)  │
       └──────────────┘              └──────────────┘
```

## Why these balance modes

- **api_pool : `balance roundrobin` + `cookie <name> insert indirect`** (the cookie name comes from `haproxy_sticky_cookie_name`, default `VEZA_SERVERID`). The Go API is stateless (sessions live in Redis), so any backend can serve any request. The cookie keeps a logged-in user pinned to one backend through the session, which makes WebSocket upgrades land on the same instance that authenticated the user and avoids a Redis round-trip on every WS hello.
- **stream_pool : `balance uri whole` + `hash-type consistent`.** The Rust streamer keeps a hot HLS-segment cache in process. URI-hash routes the same track_id to the same node ; consistent hashing means adding or removing a node only displaces roughly `1/N` of the keys, not the entire pool.
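
One way to check api_pool stickiness by hand is to read the `Set-Cookie` header that HAProxy inserts on the first response. The helper below is a minimal sketch, not part of the role : the function name `pinned_backend` is hypothetical, the cookie name assumes the `haproxy_sticky_cookie_name` default, and it assumes the cookie value identifies the server, which depends on how the template sets `cookie` on each `server` line.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value of the sticky cookie found in the
# HTTP response headers read from stdin. Cookie name defaults to VEZA_SERVERID.
pinned_backend() {
  local cookie_name="${1:-VEZA_SERVERID}"
  # HTTP header lines end in CRLF; drop the CR before parsing.
  tr -d '\r' | awk -v name="$cookie_name" '
    tolower($0) ~ /^set-cookie:/ {
      sub(/^[^:]*:[ \t]*/, "")        # strip the "Set-Cookie:" prefix
      split($0, parts, ";")           # parts[1] is "name=value"
      n = index(parts[1], "=")
      if (n > 0 && substr(parts[1], 1, n - 1) == name) {
        print substr(parts[1], n + 1)
        exit
      }
    }'
}

# Usage (assumes the LB answers plain HTTP on haproxy.lxd):
#   curl -s -o /dev/null -D - http://haproxy.lxd/api/v1/health | pinned_backend
```

Repeating the request with the returned cookie attached should keep landing on the same backend for as long as that backend is up.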

## Failover behaviour

- Health check `GET /api/v1/health` (or `/health` for stream) every `haproxy_health_check_interval_ms` (default 5000 ms, i.e. 5 s). 3 consecutive failures = down ; 2 consecutive successes = back up. With the defaults, a dead backend is marked down after roughly 15 s (fall × interval).
- `on-marked-down shutdown-sessions` : when a backend is marked down, all its in-flight TCP/WS sessions are cut. Clients reconnect still carrying a cookie that points at the dead backend ; HAProxy ignores the stale pin and re-balances them onto a healthy server. WebSocket clients on the frontend (chat, presence) MUST handle the close + reconnect ; that is already wired in `apps/web/src/features/chat/services/websocket.ts`.
- `slowstart {{ haproxy_graceful_drain_seconds }}s` : when a backend recovers, its weight ramps up linearly over the configured window (default 30 s) instead of taking its full share of the traffic as soon as it is marked up. This smooths the post-restart latency spike.
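
The fall/rise timings above reduce to simple arithmetic. The sketch below is only an illustration : `check_window_ms` is a hypothetical helper name, and the result is an approximation that ignores the check timeout and any jitter.

```bash
#!/usr/bin/env bash
# Hypothetical helper: approximate time (ms) needed for <count> consecutive
# health checks at a fixed <interval_ms>. Ignores per-check timeouts.
check_window_ms() {
  local interval_ms="$1" count="$2"
  echo $(( interval_ms * count ))
}

# With the role defaults (interval 5000 ms, fall 3, rise 2):
#   check_window_ms 5000 3   # time before a dead backend is marked DOWN
#   check_window_ms 5000 2   # time before a recovered backend is marked UP
```

Lowering `haproxy_health_check_interval_ms` shortens both windows at the cost of more health-check traffic on the backends.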

## Defaults

| variable                          | default            | meaning                                   |
| --------------------------------- | ------------------ | ----------------------------------------- |
| `haproxy_listen_http`             | `80`               | HTTP listener                             |
| `haproxy_listen_https`            | `443`              | HTTPS listener (only bound when cert set) |
| `haproxy_tls_cert_path`           | `""`               | path to PEM (cert+key concat). Empty = HTTP only. |
| `haproxy_backend_api_port`        | `8080`             | upstream port for backend-api             |
| `haproxy_stream_server_port`      | `8082`             | upstream port for stream-server           |
| `haproxy_health_check_interval_ms`| `5000`             | active-check cadence                      |
| `haproxy_health_check_fall`       | `3`                | failed checks before "down"               |
| `haproxy_health_check_rise`       | `2`                | successful checks before "up"             |
| `haproxy_graceful_drain_seconds`  | `30`               | post-recovery weight ramp-up              |
| `haproxy_sticky_cookie_name`      | `VEZA_SERVERID`    | cookie name for backend stickiness        |

## Operations

```bash
# Health view (admin socket, loopback only) :
sudo socat /run/haproxy/admin.sock - <<< "show servers state"
sudo socat /run/haproxy/admin.sock - <<< "show stat"

# Drain a server before a planned restart : existing connections continue,
# new requests skip it unless they carry its sticky cookie :
echo "set server api_pool/backend-api-1 state drain" | sudo socat /run/haproxy/admin.sock -
# ...wait haproxy_graceful_drain_seconds, then on the backend host :
#   sudo systemctl restart veza-backend-api
echo "set server api_pool/backend-api-1 state ready"  | sudo socat /run/haproxy/admin.sock -

# Stats UI for a human (browser only ; bound to localhost) :
ssh -L 9100:localhost:9100 haproxy.lxd
# then open http://localhost:9100/stats

# Live log tail (HAProxy logs to journald via /dev/log) :
sudo journalctl -u haproxy -f
```
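
The raw `show stat` output is wide CSV, so a small filter can make it easier to read. The following is a sketch under assumptions : `pool_status` is a hypothetical helper, and it relies on the CSV header naming the `pxname`, `svname` and `status` columns, which it looks up by name rather than by position.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `show stat` CSV on stdin and print
# "<server> <status>" for every server of one backend pool.
pool_status() {
  local pool="$1"
  awk -F, -v pool="$pool" '
    NR == 1 {
      sub(/^# /, "")                          # header line starts with "# "
      for (i = 1; i <= NF; i++) col[$i] = i   # map column name -> index
      next
    }
    $(col["pxname"]) == pool &&
    $(col["svname"]) != "FRONTEND" && $(col["svname"]) != "BACKEND" {
      print $(col["svname"]), $(col["status"])
    }'
}

# Usage on the HAProxy container:
#   echo "show stat" | sudo socat /run/haproxy/admin.sock - | pool_status api_pool
```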

## Failover smoke test

```bash
bash infra/ansible/tests/test_backend_failover.sh
```

Sequence : confirms both backends are UP, sends 5 baseline requests through the LB, force-stops `backend-api-1`, polls HAProxy until the server is marked DOWN (30 s timeout, ~15 s expected), asserts that 5 further requests all return 200 (served by `backend-api-2`), then restarts the stopped container and waits for it to rejoin as UP. Suitable for the W2 game-day day 24 drill.
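
The "poll until DOWN" step can be sketched as a generic retry loop. This is not the code of `test_backend_failover.sh` ; `poll_until` and `server_is_down` are hypothetical names, and the `grep` pattern in the usage comment is a rough match that may need tightening against real `show stat` output.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command once per second until it succeeds
# or <timeout_s> seconds have elapsed. Returns 0 on success, 1 on timeout.
poll_until() {
  local timeout_s="$1"; shift
  local deadline=$(( SECONDS + timeout_s ))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep 1
  done
  return 0
}

# Usage (assumed check, run on the HAProxy container):
#   server_is_down() {
#     echo "show stat" | sudo socat /run/haproxy/admin.sock - \
#       | grep -q '^api_pool,backend-api-1,.*,DOWN'
#   }
#   poll_until 30 server_is_down || echo "backend-api-1 was not marked DOWN in time"
```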

## What this role does NOT cover

- **TLS cert provisioning.** Phase-1 lab : HTTP only. Phase-2 mounts a Let's Encrypt cert from Caddy's data dir or directly via certbot. mTLS to the backends is W5 territory.
- **Multi-LB HA.** Single HAProxy node — if it dies, the cluster is dark. Phase-2 adds keepalived + a floating VIP.
- **Rate limiting.** The Gin middleware does that today ; pushing it to the LB is a v1.1 optimisation.
- **WebSocket auth header passing.** HAProxy passes `Sec-WebSocket-*` headers through unchanged ; Gin's middleware authenticates the upgrade request. No extra config needed.
