veza/infra/ansible/playbooks/haproxy.yml

53 lines
1.4 KiB
YAML
Raw Normal View History

feat(infra): haproxy sticky WS + backend_api multi-instance scaffold (W4 Day 19) Phase-1 of the active/active backend story. HAProxy in front of two backend-api containers + two stream-server containers ; sticky cookie pins WS sessions to one backend, URI hash routes track_id to one streamer for HLS cache locality. Day 19 acceptance asks for : kill backend-api-1, HAProxy bascule, WS sessions reconnect to backend-api-2 sans perte. The smoke test wires that gate ; phase-2 (W5) will add keepalived for an LB pair. - infra/ansible/roles/haproxy/ * Install HAProxy + render haproxy.cfg with frontend (HTTP, optional HTTPS via haproxy_tls_cert_path), api_pool (round-robin + sticky cookie SERVERID), stream_pool (URI-hash + consistent jump-hash). * Active health check GET /api/v1/health every 5s ; fall=3, rise=2. on-marked-down shutdown-sessions + slowstart 30s on recovery. * Stats socket bound to 127.0.0.1:9100 for the future prometheus haproxy_exporter sidecar. * Mozilla Intermediate TLS cipher list ; only effective when a cert is mounted. - infra/ansible/roles/backend_api/ * Scaffolding for the multi-instance Go API. Creates veza-api system user, /opt/veza/backend-api dir, /etc/veza env dir, /var/log/veza, and a hardened systemd unit pointing at the binary. * Binary deployment is OUT of scope (documented in README) — the Go binary is built outside Ansible (Makefile target) and pushed via incus file push. CI → ansible-pull integration is W5+. - infra/ansible/playbooks/haproxy.yml : provisions the haproxy Incus container + applies common baseline + role. - infra/ansible/inventory/lab.yml : 3 new groups : * haproxy (single LB node) * backend_api_instances (backend-api-{1,2}) * stream_server_instances (stream-server-{1,2}) HAProxy template reads these groups directly to populate its upstream blocks ; falls back to the static haproxy_backend_api_fallback list if the group is missing (for in-isolation tests). - infra/ansible/tests/test_backend_failover.sh * step 0 : pre-flight — both backends UP per HAProxy stats socket. * step 1 : 5 baseline GET /api/v1/health through the LB → all 200. * step 2 : incus stop --force backend-api-1 ; record t0. * step 3 : poll HAProxy stats until backend-api-1 is DOWN (timeout 30s ; expected ~ 15s = fall × interval). * step 4 : 5 GET requests during the down window — all must 200 (served by backend-api-2). Fails if any returns non-200. * step 5 : incus start backend-api-1 ; poll until UP again. Acceptance (Day 19) : smoke test passes ; HAProxy sticky cookie keeps WS sessions on the same backend until that backend dies, at which point the cookie is ignored and the request rebalances. W4 progress : Day 16 done · Day 17 done · Day 18 done · Day 19 done · Day 20 (k6 nightly load test) pending. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:32:48 +00:00
# HAProxy playbook — provisions one Incus container `haproxy` and
# lays down the HAProxy config in front of the backend-api +
# stream-server pools.
#
# v1.0.9 W4 Day 19.
#
# Run with:
# ansible-galaxy collection install community.general
# ansible-playbook -i inventory/lab.yml playbooks/haproxy.yml
---
- name: Provision Incus container for HAProxy
hosts: incus_hosts
become: true
gather_facts: true
tasks:
- name: Launch haproxy container
ansible.builtin.shell:
cmd: |
set -e
if ! incus info haproxy >/dev/null 2>&1; then
incus launch images:ubuntu/22.04 haproxy
for _ in $(seq 1 30); do
if incus exec haproxy -- cloud-init status 2>/dev/null | grep -q "status: done"; then
break
fi
sleep 1
done
incus exec haproxy -- apt-get update
incus exec haproxy -- apt-get install -y python3 python3-apt
fi
args:
executable: /bin/bash
register: provision_result
changed_when: "'incus launch' in provision_result.stdout"
tags: [haproxy, provision]
- name: Refresh inventory so the new container is reachable
ansible.builtin.meta: refresh_inventory
- name: Apply common baseline
hosts: haproxy
become: true
gather_facts: true
roles:
- common
- name: Install + configure HAProxy
hosts: haproxy
become: true
gather_facts: true
roles:
- haproxy