Phase-1 of the active/active backend story. HAProxy in front of two
backend-api containers + two stream-server containers; sticky cookie
pins WS sessions to one backend, URI hash routes track_id to one
streamer for HLS cache locality.
Day 19 acceptance asks for: kill backend-api-1, HAProxy fails over, WS
sessions reconnect to backend-api-2 without loss. The smoke test wires
that gate; phase-2 (W5) will add keepalived for an LB pair.
- infra/ansible/roles/haproxy/
  * Install HAProxy + render haproxy.cfg with frontend (HTTP, optional
    HTTPS via haproxy_tls_cert_path), api_pool (round-robin + sticky
    cookie SERVERID), stream_pool (URI-hash + consistent jump-hash).
  * Active health check GET /api/v1/health every 5s; fall=3, rise=2;
    on-marked-down shutdown-sessions + slowstart 30s on recovery.
  * Stats socket bound to 127.0.0.1:9100 for the future prometheus
    haproxy_exporter sidecar.
  * Mozilla Intermediate TLS cipher list; only effective when a cert
    is mounted.
- infra/ansible/roles/backend_api/
  * Scaffolding for the multi-instance Go API. Creates veza-api
    system user, /opt/veza/backend-api dir, /etc/veza env dir,
    /var/log/veza, and a hardened systemd unit pointing at the binary.
  * Binary deployment is OUT of scope (documented in README): the
    Go binary is built outside Ansible (Makefile target) and pushed
    via incus file push. CI → ansible-pull integration is W5+.
- infra/ansible/playbooks/haproxy.yml: provisions the haproxy Incus
  container + applies common baseline + role.
- infra/ansible/inventory/lab.yml: 3 new groups:
  * haproxy (single LB node)
  * backend_api_instances (backend-api-{1,2})
  * stream_server_instances (stream-server-{1,2})
  HAProxy template reads these groups directly to populate its
  upstream blocks; falls back to the static haproxy_backend_api_fallback
  list if the group is missing (for in-isolation tests).
- infra/ansible/tests/test_backend_failover.sh
  * step 0: pre-flight, both backends UP per HAProxy stats socket.
  * step 1: 5 baseline GET /api/v1/health through the LB → all 200.
  * step 2: incus stop --force backend-api-1; record t0.
  * step 3: poll HAProxy stats until backend-api-1 is DOWN
    (timeout 30s; expected ~15s = fall × interval = 3 × 5s).
  * step 4: 5 GET requests during the down window; all must return 200
    (served by backend-api-2). Fails if any returns non-200.
  * step 5: incus start backend-api-1; poll until UP again.
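The polling in steps 0, 3 and 5 could look roughly like the sketch below. It is not the contents of test_backend_failover.sh: the helper names (`server_status`, `wait_for_status`, `fetch_stats`, `expected_detection_s`) are hypothetical, and it assumes the stats socket speaks the HAProxy runtime API over TCP on 127.0.0.1:9100 and that `socat` is installed on the LB node. The only HAProxy-specific fact it relies on is that `status` is the 18th column of the `show stat` CSV output.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not taken from the test script.

# Expected time for a server to be marked DOWN: fall x check interval.
expected_detection_s() {
  local fall="$1" interval_s="$2"
  echo $(( fall * interval_s ))
}

# Print the status column (18th CSV field of `show stat`) for one server.
# Reads CSV on stdin so it can be exercised without a live HAProxy.
server_status() {
  local backend="$1" server="$2"
  awk -F, -v px="$backend" -v sv="$server" '$1 == px && $2 == sv { print $18 }'
}

# Assumption: the stats socket is the runtime API, reachable over TCP.
fetch_stats() {
  echo "show stat" | socat stdio TCP:127.0.0.1:9100
}

# Poll once per second until the server reports the wanted status,
# or return 1 once timeout_s seconds have elapsed.
wait_for_status() {
  local backend="$1" server="$2" want="$3" timeout_s="$4"
  local fetch_fn="${5:-fetch_stats}"
  local deadline=$(( SECONDS + timeout_s ))
  while (( SECONDS <= deadline )); do
    if [[ "$("$fetch_fn" | server_status "$backend" "$server")" == "$want" ]]; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Usage (not run here):
#   incus stop --force backend-api-1
#   wait_for_status api_pool backend-api-1 DOWN 30 || exit 1
```

Passing the fetch function as a parameter keeps the parsing testable against canned CSV; the exact-match comparison deliberately treats transitional states such as `UP 1/3` as "not yet DOWN".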
Acceptance (Day 19): smoke test passes; HAProxy sticky cookie
keeps WS sessions on the same backend until that backend dies, at
which point the cookie is ignored and the request rebalances.
W4 progress: Day 16 done · Day 17 done · Day 18 done · Day 19 done ·
Day 20 (k6 nightly load test) pending.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
41 lines
2.3 KiB
Markdown
# `backend_api` role: runtime baseline for the Go API container

Multi-instance scaffolding for the Go backend API behind HAProxy. v1.0.9 W4 Day 19: phase-1 of the active/active deploy story.

## What this role DOES

- Creates the `veza-api` system user.
- Lays down `/opt/veza/backend-api`, `/etc/veza`, `/var/log/veza`.
- Renders a hardened systemd unit pointing at the binary path.
- Idempotent; safe to re-apply against an already-running instance.

## What this role does NOT do (deliberately)

- **Build / copy the Go binary.** That happens out-of-band: a `make backend-api-deploy` target builds the binary on the dev host and pushes it via `incus file push backend-api-X /opt/veza/backend-api/veza-api`. CI integration (Forgejo job → ansible-pull) is W5+ work.
- **Render `.env`.** Secrets live in `group_vars/backend_api.vault.yml` (encrypted) and are pushed by a separate task in `playbooks/backend_api.yml`; they don't belong in this role's defaults.
- **Run database migrations.** Migrations are gated by a CI job; running them via Ansible would race with multi-instance deploys.

## Deploying the binary (one-shot, until CI lands)

```bash
# On the dev host:
make -C veza-backend-api build   # produces ./bin/veza-api
for ct in backend-api-1 backend-api-2; do
    incus file push veza-backend-api/bin/veza-api "$ct"/opt/veza/backend-api/veza-api \
        --uid 1001 --gid 1001 --mode 0755
    incus exec "$ct" -- systemctl restart veza-backend-api
done
```

Roll one container at a time so HAProxy never sees both backends down.
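One way to enforce that ordering is to gate each restart on the instance's own health endpoint before touching the next container. The sketch below is an illustration, not part of this role: `wait_until`, `probe_health` and `roll_one` are hypothetical helpers, and it assumes `curl` is available inside the containers and that the API answers on the default port `8080` at `/api/v1/health`.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Retry a probe command until it succeeds; give up after max_tries attempts.
wait_until() {
  local max_tries="$1" delay_s="$2"
  shift 2
  local i
  for (( i = 1; i <= max_tries; i++ )); do
    if "$@"; then
      return 0
    fi
    sleep "$delay_s"
  done
  return 1
}

# Assumption: curl exists in the container and the API listens on 8080.
probe_health() {
  incus exec "$1" -- curl -fsS -o /dev/null "http://127.0.0.1:8080/api/v1/health"
}

# Push the binary, restart the unit, then wait up to ~30s for a healthy answer.
roll_one() {
  local ct="$1"
  incus file push veza-backend-api/bin/veza-api "$ct"/opt/veza/backend-api/veza-api \
    --uid 1001 --gid 1001 --mode 0755
  incus exec "$ct" -- systemctl restart veza-backend-api
  wait_until 30 1 probe_health "$ct"
}

# Usage (not run here): stop at the first instance that fails to come back.
#   for ct in backend-api-1 backend-api-2; do
#     roll_one "$ct" || { echo "aborting: $ct unhealthy" >&2; break; }
#   done
```

Breaking out of the loop on the first failure leaves the untouched instance serving traffic, which is the point of rolling one at a time.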

## Defaults

| variable                    | default                          | meaning                         |
| --------------------------- | -------------------------------- | ------------------------------- |
| `backend_api_user`          | `veza-api`                       | system user                     |
| `backend_api_install_dir`   | `/opt/veza/backend-api`          | binary + working dir            |
| `backend_api_binary_name`   | `veza-api`                       | binary basename                 |
| `backend_api_listen_port`   | `8080`                           | matches HAProxy upstream config |
| `backend_api_env_file`      | `/etc/veza/backend-api.env`      | EnvironmentFile= path           |
| `backend_api_log_dir`       | `/var/log/veza`                  | tail-friendly log dir           |