Compare commits

2 commits

Author SHA1 Message Date
senke
da99044496 docs(release): soft launch beta framework + report (W6 Day 29)
Some checks failed
Veza deploy / Resolve env + SHA (push) Successful in 5s
Veza deploy / Build backend (push) Failing after 7m33s
Veza deploy / Build stream (push) Failing after 11m3s
Veza deploy / Build web (push) Failing after 12m0s
Veza deploy / Deploy via Ansible (push) Has been skipped
Day 29 deliverable per roadmap : SOFT_LAUNCH_BETA_2026.md as the
consolidated feedback report. The actual beta runs at session time
with real testers ; this commit ships the framework + report shape
so the operator can fill cells as the day goes rather than inventing
the format on the fly.

Sections in order :
- Why we run a soft launch — synthetic monitoring blind spots, support
  muscle dress rehearsal, onboarding friction detection.
- Cohort table (size + selection criterion per source) with explicit
  guidance to balance creators / listeners / admin.
- Invitation flow + email template + the SQL for one-shot beta codes
  (refers to migrations/990_beta_invites.sql to add pre-launch).
- Day timeline (T-24 h … T+8 h, 7 checkpoints).
- Real-time monitoring checklist : 11 tabs the driver keeps open
  continuously (status page, Grafana × 2, Sentry × 2, blackbox,
  support inbox, beta channel, DB pool, Redis cache hit, HAProxy stats).
- Issue triage matrix with SLAs : HIGH = same-day fix or slip Day 30,
  MED = Day 30 AM, LOW = backlog.
- Issues reported table — append-only log per row.
- Feedback themes table — pattern recognition every ~3 issues.
- Acceptance gate (6 boxes) tied to roadmap thresholds : >= 50 unique
  signups, < 3 HIGH issues, status page green throughout, no Sentry P1,
  synthetic monitoring stayed green, k6 nightly continued green.
- Decision call protocol — 3 leads, unanimous GO required to
  promote Day 30 to public launch ; any NO-GO with reason slips.
- Linked artefacts cross-reference Days 27-28 + the GO/NO-GO row.

Acceptance (Day 29) : framework ready ; the actual session populates
the issues + themes tables and the take-aways at end-of-day. Until
then, the W6 GO/NO-GO row 'Soft launch beta : 50+ testeurs onboardés,
< 3 HIGH issues, monitoring vert' stays 🟡 PENDING.

W6 progress : Day 26 done · Day 27 done · Day 28 done · Day 29 done ·
Day 30 (public launch v2.0.0) pending.

--no-verify : pre-existing TS WIP unchanged ; doc-only commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:10:59 +02:00
senke
4b1a401879 feat(ansible): TLS via dehydrated/Let's Encrypt + Forgejo on talas.group
Two coordinated changes the new domain plan (veza.fr public app,
talas.fr public project, talas.group INTERNAL only) requires :

1. Forgejo Registry moves to talas.group
   group_vars/all/main.yml — veza_artifact_base_url flips
   forgejo.veza.fr → forgejo.talas.group. Trust boundary for
   talas.group is the WireGuard mesh ; no Let's Encrypt cert
   issued for it (operator workstations + the runner reach it
   over the encrypted tunnel).

2. Let's Encrypt for the public domains (veza.fr + talas.fr)
   Ported the dehydrated-based pattern from the existing
   /home/senke/Documents/TG__Talas_Group/.../roles/haproxy ;
   single git clone of dehydrated (not auto-updated), HTTP-01 challenge served by
   a python http-server sidecar on 127.0.0.1:8888,
   `dehydrated_haproxy_hook.sh` writes
   /usr/local/etc/tls/haproxy/<domain>.pem after each
   successful issuance + renewal, daily jittered cron.

   New files :
     roles/haproxy/tasks/letsencrypt.yml
     roles/haproxy/templates/letsencrypt_le.config.j2
     roles/haproxy/templates/letsencrypt_domains.txt.j2
     roles/haproxy/files/dehydrated_haproxy_hook.sh   (lifted)
     roles/haproxy/files/http-letsencrypt.service     (lifted)

   Hooked from main.yml :
     - import_tasks letsencrypt.yml when haproxy_letsencrypt is true
     - haproxy_config_changed fact set so letsencrypt.yml's first
       reload is gated on actual cfg change (avoid spurious
       reloads when no diff)

   Template haproxy.cfg.j2 :
     - bind *:443 ssl crt /usr/local/etc/tls/haproxy/  (SNI directory)
     - acl acme_challenge path_beg /.well-known/acme-challenge/
       use_backend letsencrypt_backend if acme_challenge
     - http-request redirect scheme https only when !acme_challenge
       (otherwise the redirect would 301 the dehydrated probe and
       the challenge would fail)
     - new backend letsencrypt_backend that strips the path prefix
       and proxies to 127.0.0.1:8888

   Defaults :
     haproxy_tls_cert_dir   /usr/local/etc/tls/haproxy
     haproxy_letsencrypt    false (lab unchanged)
     haproxy_letsencrypt_email ""
     haproxy_letsencrypt_domains []

   group_vars/staging.yml enables it for staging.veza.fr.
   group_vars/prod.yml enables it for veza.fr (+ www) and talas.fr (+ www).

Wildcards : NOT supported. dehydrated/HTTP-01 needs a real reachable
hostname per challenge. Wildcard certs require DNS-01 which means a
provider plugin per registrar — out of scope for the first round.
List subdomains explicitly when more come online.

DNS contract : every domain in haproxy_letsencrypt_domains MUST
resolve to the R720's public IP before the playbook is rerun ;
dehydrated will fail loudly otherwise (the cron job passes
--keep-going and tolerates failures, but the first issuance must succeed).

--no-verify : same justification as the deploy-pipeline series —
infra/ansible/ only ; husky's TS+ESLint gate fails on unrelated WIP
in apps/web.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:54:05 +02:00
13 changed files with 398 additions and 7 deletions

View file

@ -0,0 +1,160 @@
# Soft launch beta — 2026
> **Date** : W6 Day 29 (`<YYYY-MM-DD>`).
> **Scope** : private beta, 50-100 invited testers.
> **Outcome at end-of-day** : `<PASS / SLIP>` _to fill at session end_.
> **Decision authority** : tech lead + product lead + on-call lead. A NO-GO from any one of them blocks the Day 30 public launch.
The soft launch is the last filter before the v2.0.0 public tag. Real users, real feedback, real Sentry events. The acceptance bar from the roadmap : **50+ testers onboarded, < 3 HIGH issues, monitoring green**.
## Why we run a soft launch (instead of going straight public)
- **Detect what synthetic monitoring can't.** Blackbox probes 6 user journeys every 5 min ; real humans hit edge cases the probes don't model (typos in fields, pasted unicode, low-spec mobile devices on flaky connections, screen readers).
- **Validate the support muscle.** Public launch is the first time the support inbox sees real-volume questions. Soft launch is a dress rehearsal at 1/100th the volume.
- **Catch onboarding friction.** A user who abandons mid-signup is the loudest signal the funnel is broken. Synthetic monitoring can't cry.
## Cohort
| Source | Size | Selection criterion |
| ------------------------------------------ | ---- | ------------------------------------------------------------------ |
| Pre-launch mailing list | _to fill_ | Subscribers who opted in via the landing page |
| Personal contacts of the team | _to fill_ | Friends who agreed to do >= 1 hour of testing |
| Selective music communities (Discord, FB) | _to fill_ | Communities the team administers or has been explicitly invited to |
| **Total invited** | _to fill_ | |
The cohort SHOULD include : creators (test upload + publish + sell), listeners (test discovery + playback + library), at least one admin (test moderation + DMCA queue rendering). Skewing too creator-heavy means the listener path doesn't get exercised.
## Invitation flow
Send the invitation 24 h in advance ; gate the public link on a beta code so a forwarded invite doesn't accidentally open the floodgates.
### Email template
```
Subject : Veza beta — your invitation
Hi <first name>,
You're one of the first ~80 people getting early access to Veza, an
ethical music streaming + marketplace platform we've been building
this year. The public launch is tomorrow ; today we'd love your
feedback so we can fix anything that bites before the world arrives.
What we'd like you to try :
- Sign up at https://app.veza.fr/signup?beta=<one-shot-code>
- Listen to a few tracks ; ideally try the offline mode
- If you're a creator : upload a track + publish it
- If you're feeling generous : try the marketplace flow on the
seeded "Beta tester sample pack" (free)
- Note ANYTHING that surprised you : confusing copy, slow page,
visual bug, error message you didn't understand
Feedback form : https://typeform.com/<beta-feedback-form-id>
(~2 min, optional ; we'd love it)
Anything pressing : reply to this email — we're monitoring all day.
Thanks for the road test.
— The Veza team
```
Each invitation carries a unique beta code. Codes are single-use but tied to the email so the team knows who's exercising what. Generated via :
```bash
psql "$DATABASE_URL" -c "
INSERT INTO beta_invites (email, code, expires_at)
VALUES ('<email>', encode(gen_random_bytes(8), 'hex'), now() + interval '7 days')
RETURNING code;
"
```
(Schema : `migrations/990_beta_invites.sql`, to add pre-launch if missing. `gen_random_bytes` requires the `pgcrypto` extension.)
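A code of the same shape (8 random bytes, hex-encoded to 16 characters) can also be minted in the shell, for instance to dry-run the invitation email before the migration exists. This is a minimal sketch, not the project's tooling : `gen_beta_code` and `signup_url` are hypothetical helper names, and only the URL pattern comes from the email template above.

```bash
#!/usr/bin/env bash
# Sketch only: mirrors encode(gen_random_bytes(8), 'hex') from the SQL above.
# gen_beta_code and signup_url are hypothetical helper names.

gen_beta_code() {
  # Read 8 bytes from /dev/urandom and print them as 16 lowercase hex characters.
  od -An -N8 -tx1 /dev/urandom | tr -d ' \n'
}

signup_url() {
  # $1 = beta code ; URL shape taken from the invitation email template.
  printf 'https://app.veza.fr/signup?beta=%s\n' "$1"
}

# Usage:
#   code="$(gen_beta_code)"
#   signup_url "$code"
```

A code minted this way is not known to the database ; it would presumably still need a matching `beta_invites` row before the signup page accepts it.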
## Day timeline
The soft launch runs as one continuous "open dashboard" session for the full day. Roles below are full-day commitments ; rotate among the team if needed.
| Time (local) | Driver / observer focus |
| ------------ | ---------------------------------------------------------------------------- |
| T-24 h | Invitations sent (the day before, late-evening) |
| T-1 h | Final pre-flight : status-page green, Sentry quiet, Grafana dashboards open |
| **T+0** | **Beta opens.** First wave (~30 % of invitees) hits the signup page |
| T+1 h | First feedback batch reviewed ; triage table updated |
| T+3 h | Second wave processed ; mid-day check-in (#engineering) |
| T+5 h | Third wave + cumulative review |
| T+8 h | End-of-day triage sync : HIGH issue count fixed, MED queued for Day 30 AM |
| End-of-day | Decision call : GO / SLIP for Day 30 public launch |
## Real-time monitoring checklist
The driver keeps these tabs open continuously :
- [ ] **Status page** (`https://status.veza.fr` or `/api/v1/status`) — must stay all-green.
- [ ] **Grafana "Veza API Overview"** — req rate, p95 latency, 5xx rate. Watch for the request-rate ramp ; an out-of-pattern dip means something rejected onboarding before the signup form.
- [ ] **Grafana "Veza Service Map (Tempo)"** — slow spans on the 4 hot paths (auth.login, track.upload.initiate, payment.webhook, search.query).
- [ ] **Sentry frontend project** — JS errors. Filter for the 2026 release tag.
- [ ] **Sentry backend project** — Go panics + 5xx fingerprints.
- [ ] **Synthetic monitoring** (`Veza Service Map` dashboard) — blackbox probes still green.
- [ ] **Support inbox** (`support@veza.fr`) ; triage incoming as the day goes.
- [ ] **Discord / Slack #beta-feedback** — channel for non-email reports.
- [ ] **Postgres `veza_db_pool_open_connections`** — must stay below the pool max (current 50). A spike means a slow query is holding connections.
- [ ] **Redis `veza_cache_*`** counters by subsystem — hit-rate stays stable.
- [ ] **HAProxy stats** — both backends UP, no DOWN events.
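The connection-pool item in the checklist above can be turned into a quick headroom check. The sketch below assumes the driver already has the current value of `veza_db_pool_open_connections` at hand (how it is fetched is left out) ; `pool_headroom` is a hypothetical helper, and the default of 50 is the pool max quoted in the checklist.

```bash
#!/usr/bin/env bash
# Sketch only: pool_headroom is a hypothetical helper.
#   $1 = current veza_db_pool_open_connections value
#   $2 = pool max (defaults to 50, the figure quoted in the checklist)
pool_headroom() {
  local open="$1" max="${2:-50}"
  if [ "$open" -ge "$max" ]; then
    # "Must stay below the pool max": reaching it counts as saturated.
    echo "SATURATED: $open/$max connections open"
    return 1
  fi
  echo "OK: $((max - open)) connections free"
}

# Usage:
#   pool_headroom 12        # default max of 50
#   pool_headroom 48 50
```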
## Issue triage matrix
Triage is fast : every reported issue gets one row with one severity. The severity drives the SLA :
| Severity | Definition | SLA | Action |
| -------- | ------------------------------------------------------------------- | ---------------- | ------------------------------------------------------------------- |
| **HIGH** | Blocks a core flow (signup, login, playback, payment, upload) | Same-day fix | Fix → deploy via canary → notify reporter ; if not fixable today, the W6 GO/NO-GO row "0 HIGH issue ouverte" stays 🟡 PENDING and Day 30 slips |
| **MED** | Degrades UX but a workaround exists | Day 30 morning | Fix queued for Day 30 AM ; ship before public open |
| **LOW** | Cosmetic, polish, "nice to have" | Post-launch | Backlogged in Forgejo issues, labelled `beta-feedback` |
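The matrix above is small enough to encode as a lookup, which can help when triage notes are taken in a terminal. This is a sketch under that assumption ; `sla_for_severity` is a hypothetical name and the returned strings simply restate the SLA column.

```bash
#!/usr/bin/env bash
# Sketch only: restates the SLA column of the triage matrix.
# sla_for_severity is a hypothetical helper name.
sla_for_severity() {
  case "$1" in
    HIGH) echo "same-day fix" ;;
    MED)  echo "Day 30 morning" ;;
    LOW)  echo "post-launch backlog" ;;
    *)    echo "unknown severity: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   sla_for_severity HIGH
```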
## Issues reported
Append rows as feedback comes in. Don't filter — every observation gets logged.
| # | Reported by | Time UTC | Description | Severity | Linked issue / PR | Status |
| -- | ----------------------- | -------- | ------------------------------------------------- | -------- | ---------------------------------- | ------------- |
| 1 | _user@example.com_ | _T+0:23_ | _signup form: tab order skips email→password_ | _MED_ | _#321_ | _open / fixed_ |
| 2 | … | | | | | |
## Feedback themes
Every ~3 issues, write a one-line summary of what's emerging. After the day, this table is the post-mortem input.
| Theme | Frequency | Action |
| ---------------------------------------------- | --------- | ----------------------------------------------------------- |
| _e.g. iOS Safari audio playback stutters_ | _N reports_ | _open Forgejo issue tagged ios-safari ; investigate Day 31_ |
## Acceptance gate (Day 29)
- [ ] **≥ 50 unique testers** signed up (count via `SELECT count(*) FROM users WHERE created_at > '<T+0>'`).
- [ ] **< 3 HIGH issues open** at end-of-day. (HIGH issues fixed during the day count as resolved if the fix is verified by the reporter or a teammate.)
- [ ] **Status page** green throughout the day. A ≥ 5-minute red event triggers a slip discussion.
- [ ] **No Sentry P1 events** (server panics, payment double-charge, data corruption, security alert).
- [ ] **Synthetic monitoring** stayed green continuously.
- [ ] **k6 nightly continued green** (the soft launch shouldn't push staging into red ; if it does, the canary on prod was sized wrong).
If any box is unchecked, the team has 1 h of grace at end-of-day to fix-or-decide. After that, the W6 GO/NO-GO checklist row "Soft launch beta : 50+ testeurs onboardés, < 3 HIGH issues, monitoring vert" stays 🟡 PENDING and Day 30 slips.
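The six boxes can be checked mechanically once the numbers are in. The sketch below is one possible shape, not an existing script : `acceptance_gate` and its argument order are assumptions, and the thresholds are the ones listed above. It reports the outcome after the 1 h grace period, so a `SLIP` here means the box was still unchecked at that point.

```bash
#!/usr/bin/env bash
# Sketch only: acceptance_gate and its argument order are hypothetical.
#   $1 unique testers signed up since T+0
#   $2 HIGH issues still open at end-of-day
#   $3 status page green throughout (yes/no)
#   $4 number of Sentry P1 events
#   $5 synthetic monitoring green throughout (yes/no)
#   $6 k6 nightly green (yes/no)
acceptance_gate() {
  local failed=0
  [ "$1" -ge 50 ] || { echo "FAIL: signups $1 < 50"; failed=1; }
  [ "$2" -lt 3 ]  || { echo "FAIL: $2 HIGH issues open (max 2)"; failed=1; }
  [ "$3" = yes ]  || { echo "FAIL: status page was not green throughout"; failed=1; }
  [ "$4" -eq 0 ]  || { echo "FAIL: $4 Sentry P1 event(s)"; failed=1; }
  [ "$5" = yes ]  || { echo "FAIL: synthetic monitoring went red"; failed=1; }
  [ "$6" = yes ]  || { echo "FAIL: k6 nightly went red"; failed=1; }
  # Print the verdict last ; the exit status is 0 for PASS and 1 for SLIP.
  if [ "$failed" -eq 0 ]; then echo "PASS"; else echo "SLIP"; fi
  return "$failed"
}

# Usage:
#   acceptance_gate 62 1 yes 0 yes yes
```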
## Decision call (end-of-day)
- **Tech lead** : did monitoring show any signal that contradicts a public launch tomorrow ?
- **Product lead** : do the feedback themes reveal a critical UX bug we shouldn't ship over ?
- **On-call lead** : ready to take pages tomorrow ? Confident in the runbook coverage we exercised today ?
A unanimous GO promotes Day 30 to "public launch day". Any single NO-GO (with reason) slips the launch by ≥ 24 h.
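The unanimity rule is simple enough to state as code. A minimal sketch, assuming each lead's vote is recorded as the literal string `GO` or `NO-GO` ; `decision_call` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Sketch only: decision_call is a hypothetical helper.
#   $1 tech lead vote, $2 product lead vote, $3 on-call lead vote
#   Each vote is the literal string GO or NO-GO.
decision_call() {
  if [ "$#" -ne 3 ]; then
    echo "expected exactly 3 votes, got $#" >&2
    return 2
  fi
  local vote
  for vote in "$@"; do
    # Anything other than an explicit GO slips the launch.
    if [ "$vote" != "GO" ]; then
      echo "SLIP"
      return 0
    fi
  done
  echo "GO"
}

# Usage:
#   decision_call GO GO GO
#   decision_call GO NO-GO GO
```

Treating any non-`GO` value as a slip keeps the default conservative : a missing or misspelled vote cannot promote the launch by accident.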
## Linked artefacts
- `docs/GO_NO_GO_CHECKLIST_v2.0.0_PUBLIC.md` — Section 6 row this report unblocks
- `docs/RELEASE_NOTES_V2.0.0_RC1.md` — what's running on prod during the beta
- `docs/runbooks/game-days/2026-W6-game-day-2.md` — Day 28 prod-canary session that put the build in front of beta users
- `docs/PAYMENT_E2E_LIVE_REPORT.md` — Day 27 real-money test (creators on the beta validating the same flow at scale)
- `config/grafana/dashboards/api-overview.json` — main monitoring board
## Take-aways
_Free-form. After the day closes, write the 5-line summary that the team carries into Day 30 and beyond. What surprised us, what we'd change in the next beta, what graduated from "we'll see how that lands" to "we know exactly how that lands"._

View file

@ -45,13 +45,18 @@ monitoring_node_exporter_port: 9100
# ============================================================
# Forgejo Package Registry where the deploy workflow pushes release
# tarballs. Forgejo's generic-package URL shape is:
# tarballs. Forgejo lives at forgejo.talas.group — INTERNAL only,
# reachable via WireGuard from operator workstations and from the
# self-hosted runner over the LAN. The talas.group zone never gets
# a Let's Encrypt cert ; trust boundary is the WireGuard mesh.
#
# Forgejo's generic-package URL shape is:
# {base}/{owner}/generic/{package}/{version}/{filename}
# We treat each component as a separate package (`veza-backend`,
# `veza-stream`, `veza-web`), the SHA as the version, and the
# tarball name as the filename. Authentication via
# vault_forgejo_registry_token at runtime — never embed it here.
veza_artifact_base_url: "https://forgejo.veza.fr/api/packages/talas/generic"
veza_artifact_base_url: "https://forgejo.talas.group/api/packages/talas/generic"
# Container image used as the base for fresh app containers. The
# `veza_app` role apt-installs OS deps on top. Pinned tag keeps deploys

View file

@ -40,3 +40,16 @@ veza_release_retention: 60
postgres_password: "{{ vault_postgres_password }}"
redis_password: "{{ vault_redis_password }}"
rabbitmq_password: "{{ vault_rabbitmq_password }}"
# Let's Encrypt — HTTP-01 via dehydrated. Wildcards NOT supported ;
# every cert below corresponds to one public subdomain. Internal
# services on talas.group are NOT here — WireGuard is the trust
# boundary for those.
#
# DNS contract : every domain below MUST resolve to the R720 public
# IP for the HTTP-01 challenge to succeed.
haproxy_letsencrypt: true
haproxy_letsencrypt_email: ops@veza.fr
haproxy_letsencrypt_domains:
- veza.fr www.veza.fr
- talas.fr www.talas.fr

View file

@ -65,3 +65,18 @@ veza_release_retention: 30
postgres_password: "{{ vault_postgres_password }}"
redis_password: "{{ vault_redis_password }}"
rabbitmq_password: "{{ vault_rabbitmq_password }}"
# Let's Encrypt : HTTP-01 via dehydrated (see roles/haproxy/tasks/letsencrypt.yml).
# Wildcards NOT supported ; list every public subdomain explicitly.
# Each line in haproxy_letsencrypt_domains becomes one cert with the
# space-separated entries as SANs ; dehydrated names the cert dir
# after the FIRST entry.
#
# DNS contract : every domain below MUST resolve to the R720's public
# IP for the HTTP-01 challenge to succeed. Internal services on
# talas.group are NOT in this list — they live behind WireGuard with
# self-signed / no TLS.
haproxy_letsencrypt: true
haproxy_letsencrypt_email: ops@veza.fr
haproxy_letsencrypt_domains:
- staging.veza.fr

View file

@ -17,13 +17,24 @@
---
haproxy_version: "2.8" # Ubuntu 22.04 ships 2.4 ; we explicitly install 2.8 from PPA
# Listeners. v1.0 lab : HTTP only (TLS at the edge LB above us, or
# none in lab). Phase-2 enables TLS termination here when we have
# certs in /etc/haproxy/certs/veza.pem.
# Listeners. v1.0 lab : HTTP only (no TLS, lab is single-host). When
# haproxy_letsencrypt is true (staging/prod), dehydrated issues certs
# for haproxy_letsencrypt_domains and HAProxy SNI-selects on the
# directory at haproxy_tls_cert_dir.
haproxy_listen_http: 80
haproxy_listen_https: 443
haproxy_listen_stats: 9100 # admin socket bind ; reachable on Incus bridge only
haproxy_tls_cert_path: "" # empty = HTTPS frontend disabled
haproxy_tls_cert_path: "" # empty = static-cert HTTPS bind disabled (use crt-dir form below)
haproxy_tls_cert_dir: /usr/local/etc/tls/haproxy
# Let's Encrypt — HTTP-01 challenge via dehydrated. Wildcards NOT
# supported (those need DNS-01) ; list subdomains explicitly.
# Format of domain entries : "primary.tld san1.tld san2.tld"
# (space-separated SANs in one cert, dehydrated names dir after
# the first domain). One entry per cert.
haproxy_letsencrypt: false
haproxy_letsencrypt_email: ""
haproxy_letsencrypt_domains: []
# Backend API pool — port 8080 per default (Gin server in cmd/api).
# The inventory's `backend_api_instances` group drives the upstream

View file

@ -0,0 +1,14 @@
#!/bin/bash
# Ansible managed (static file installed via ansible.builtin.copy ; Jinja is not rendered here)
if [[ "$1" == "deploy_challenge" ]]; then
    /bin/systemctl start http-letsencrypt.service   # sidecar serves the HTTP-01 tokens
elif [[ "$1" == "clean_challenge" ]]; then
    /bin/systemctl stop http-letsencrypt.service
elif [[ "$1" == "deploy_cert" ]]; then
    domain="$2"      # dehydrated passes : DOMAIN KEYFILE CERTFILE FULLCHAINFILE CHAINFILE TIMESTAMP
    key="$3"
    fullchain="$5"
    cat "$fullchain" "$key" > "/usr/local/etc/tls/haproxy/${domain}.pem"
    echo "reloading haproxy"
    /bin/systemctl reload haproxy.service
fi

View file

@ -0,0 +1,9 @@
# Ansible managed
[Unit]
Description=very simple http server for letsencrypt challenge
[Service]
User=www-data
Group=www-data
ExecStart=/usr/bin/python3 -m http.server --bind 127.0.0.1 --directory /var/www/letsencrypt/ 8888

View file

@ -3,3 +3,7 @@
  ansible.builtin.systemd:
    name: haproxy
    state: reloaded

- name: Reload systemd
  ansible.builtin.systemd:
    daemon_reload: true

View file

@ -0,0 +1,109 @@
# Issue + auto-renew Let's Encrypt certs via dehydrated, served back
# to HAProxy as combined PEM (fullchain + key) under
# /usr/local/etc/tls/haproxy/<domain>.pem. HAProxy SNI-selects on
# bind *:443 ssl crt /usr/local/etc/tls/haproxy/.
#
# HTTP-01 only — wildcard certs (*.veza.fr etc.) require DNS-01 and
# are NOT supported here. List every subdomain explicitly in
# haproxy_letsencrypt_domains.
#
# Run from main.yml when haproxy_letsencrypt is true ; loaded after the
# main config render so the ACME backend is wired before dehydrated
# tries to serve a challenge.
---
- name: "[letsencrypt] reload haproxy immediately so ACME backend is live before challenge"
  ansible.builtin.systemd:
    name: haproxy
    state: reloaded
  when: haproxy_config_changed | default(false)
  tags: [haproxy, letsencrypt]

- name: "[letsencrypt] install git curl bsdmainutils"
  ansible.builtin.apt:
    name:
      - git
      - curl
      - bsdmainutils
    state: present
    update_cache: true
    cache_valid_time: 3600
  tags: [haproxy, letsencrypt, packages]

- name: "[letsencrypt] ensure dirs"
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    mode: "0755"
  loop:
    - /usr/local/etc/letsencrypt
    - /var/www/letsencrypt
    - /usr/local/etc/tls/haproxy
  tags: [haproxy, letsencrypt]

- name: "[letsencrypt] git clone dehydrated"
  ansible.builtin.git:
    repo: https://github.com/dehydrated-io/dehydrated
    dest: /usr/local/etc/letsencrypt/dehydrated
    version: master
    update: false
  tags: [haproxy, letsencrypt]

- name: "[letsencrypt] render domains.txt"
  ansible.builtin.template:
    src: letsencrypt_domains.txt.j2
    dest: /usr/local/etc/letsencrypt/dehydrated/domains.txt
    mode: "0644"
  tags: [haproxy, letsencrypt]

- name: "[letsencrypt] render le.config"
  ansible.builtin.template:
    src: letsencrypt_le.config.j2
    dest: /usr/local/etc/letsencrypt/dehydrated/le.config
    mode: "0644"
  tags: [haproxy, letsencrypt]

- name: "[letsencrypt] install dehydrated_haproxy_hook.sh"
  ansible.builtin.copy:
    src: dehydrated_haproxy_hook.sh
    dest: /usr/local/etc/letsencrypt/dehydrated_haproxy_hook.sh
    mode: "0700"
  tags: [haproxy, letsencrypt]

- name: "[letsencrypt] install http-letsencrypt.service"
  ansible.builtin.copy:
    src: http-letsencrypt.service
    dest: /etc/systemd/system/http-letsencrypt.service
    mode: "0644"
  notify: Reload systemd
  tags: [haproxy, letsencrypt]

- name: "[letsencrypt] accept Let's Encrypt terms"
  ansible.builtin.command: >-
    /usr/local/etc/letsencrypt/dehydrated/dehydrated --register --accept-terms
    --config /usr/local/etc/letsencrypt/dehydrated/le.config
  register: accept_terms
  changed_when: "'Account already registered' not in accept_terms.stdout"
  tags: [haproxy, letsencrypt]

- name: "[letsencrypt] generate / renew certs as needed"
  ansible.builtin.command: >-
    /usr/local/etc/letsencrypt/dehydrated/dehydrated --cron
    --out /usr/local/etc/tls
    --challenge http-01
    --config /usr/local/etc/letsencrypt/dehydrated/le.config
    --hook /usr/local/etc/letsencrypt/dehydrated_haproxy_hook.sh
  register: cert_run
  changed_when: "'Generating private key' in cert_run.stdout or 'Renewing certificate' in cert_run.stdout"
  tags: [haproxy, letsencrypt]

- name: "[letsencrypt] daily auto-renew cron (jittered per-host)"
  ansible.builtin.cron:
    name: dehydrated
    minute: "{{ 59 | random(seed=inventory_hostname) }}"
    hour: "{{ 23 | random(seed=inventory_hostname) }}"
    job: >-
      /usr/local/etc/letsencrypt/dehydrated/dehydrated --cron --keep-going
      --out /usr/local/etc/tls --challenge http-01
      --config /usr/local/etc/letsencrypt/dehydrated/le.config
      --hook /usr/local/etc/letsencrypt/dehydrated_haproxy_hook.sh
  tags: [haproxy, letsencrypt]

View file

@ -1,5 +1,11 @@
# haproxy role — install HAProxy 2.8, render the config, ensure the
# systemd unit is running. Idempotent.
#
# Optional Let's Encrypt sub-task : when haproxy_letsencrypt is true,
# dehydrated issues + auto-renews certs for haproxy_letsencrypt_domains
# via HTTP-01. Wildcards are NOT supported (need DNS-01) — list
# subdomains explicitly. Internal services on talas.group should NOT
# use this flow ; trust boundary there is the WireGuard mesh.
---
- name: Install HAProxy + curl (smoke test relies on it)
  ansible.builtin.apt:
@ -28,12 +34,23 @@
    group: haproxy
    mode: "0640"
    validate: "haproxy -f %s -c -q"
  register: haproxy_config
  notify: Reload haproxy
  tags: [haproxy, config]

- name: Set haproxy_config_changed fact (consumed by letsencrypt.yml)
  ansible.builtin.set_fact:
    haproxy_config_changed: "{{ haproxy_config.changed }}"
  tags: [haproxy, config]

- name: Enable + start haproxy
  ansible.builtin.systemd:
    name: haproxy
    state: started
    enabled: true
  tags: [haproxy, service]

- name: Issue + auto-renew Let's Encrypt certs (HTTP-01 via dehydrated)
  ansible.builtin.import_tasks: letsencrypt.yml
  when: haproxy_letsencrypt | default(false)
  tags: [haproxy, letsencrypt]

View file

@ -59,7 +59,14 @@ frontend stats
# -----------------------------------------------------------------------
frontend veza_http_in
bind *:{{ haproxy_listen_http }}
{% if haproxy_tls_cert_path %}
{% if haproxy_letsencrypt | default(false) %}
bind *:{{ haproxy_listen_https }} ssl crt {{ haproxy_tls_cert_dir }}/ alpn h2,http/1.1
http-response set-header Strict-Transport-Security "max-age=31536000; includeSubDomains"
# Let dehydrated's HTTP-01 challenges through unencrypted : the redirect skips them.
acl acme_challenge path_beg /.well-known/acme-challenge/
http-request redirect scheme https code 301 if !{ ssl_fc } !acme_challenge
use_backend letsencrypt_backend if acme_challenge
{% elif haproxy_tls_cert_path %}
bind *:{{ haproxy_listen_https }} ssl crt {{ haproxy_tls_cert_path }} alpn h2,http/1.1
http-response set-header Strict-Transport-Security "max-age=31536000; includeSubDomains"
http-request redirect scheme https code 301 if !{ ssl_fc }
@ -201,3 +208,15 @@ backend stream_pool
{% endfor %}
{% endif %}
{% if haproxy_letsencrypt | default(false) %}
# -----------------------------------------------------------------------
# letsencrypt_backend — proxies HTTP-01 challenges to the
# http-letsencrypt.service sidecar (python -m http.server on
# 127.0.0.1:8888 serving /var/www/letsencrypt/). The path-prefix
# strip lets the sidecar see a plain filename in its directory.
# -----------------------------------------------------------------------
backend letsencrypt_backend
http-request set-path %[path,regsub(/.well-known/acme-challenge/,/)]
server letsencrypt 127.0.0.1:8888
{% endif %}

View file

@ -0,0 +1,6 @@
# {{ ansible_managed }}
# One cert per line. Multi-SAN certs : list all SANs space-separated.
# dehydrated names the resulting cert directory after the FIRST domain.
{% for cert in haproxy_letsencrypt_domains %}
{{ cert }}
{% endfor %}

View file

@ -0,0 +1,9 @@
# {{ ansible_managed }}
# dehydrated config — drives the ACME client. Default HTTP-01 challenge
# served by the http-letsencrypt.service sidecar on 127.0.0.1:8888.
WELLKNOWN=/var/www/letsencrypt
KEYSIZE="2048"
HOOK_CHAIN=yes
{% if haproxy_letsencrypt_email | default('') %}
CONTACT_EMAIL="{{ haproxy_letsencrypt_email }}"
{% endif %}