From f23d23cf2b9a04f889a6019b6a35a73018fc63aa Mon Sep 17 00:00:00 2001
From: senke <okin.tcs@gmail.com>
Date: Sat, 25 Apr 2026 23:51:33 +0200
Subject: [PATCH] feat(ci): add E2E Playwright workflow + runbook (v1.0.8 C2 +
 C5)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Closes the second-to-last item of Batch C (after C3 reuseExistingServer
and C4 seed --ci flag landed earlier). Wires the existing Playwright
suite (60+ spec files in tests/e2e/) into Forgejo Actions.

Workflow shape (.github/workflows/e2e.yml):
- pull_request → @critical only (5-7min target, 20min timeout)
- push to main → full suite (~25min target, 45min timeout)
- nightly cron 03:00 UTC → full suite, catches infra drift
- workflow_dispatch → full suite, manual trigger

Single job structure with conditional steps based on github.event_name.
The job:
  1. Boots Postgres / Redis / RabbitMQ via docker compose.
  2. Runs Go migrations.
  3. `go run ./cmd/tools/seed --ci` — the lean seed landed in C4
     (5 test accounts + 10 tracks + 3 playlists, ~5s).
  4. Builds + starts the backend with APP_ENV=test plus
     DISABLE_RATE_LIMIT_FOR_TESTS=true and the lockout-exempt
     emails matching the auth fixture.
  5. `playwright install --with-deps chromium`.
  6. `npm run e2e:critical` (PR) or `npm run e2e` (push/cron/dispatch).
  7. Uploads the Playwright HTML report + backend log on failure
     (7-day retention, sufficient for triage).
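
The trigger-to-scope selection behind step 6 can be sketched as a small
function. This is an illustration only, not code from the repo: the
workflow does this inline with `github.event_name` expressions, and
`selectScope` / `Scope` are hypothetical names. The script names and
timeouts are the ones listed above.

```typescript
// Hypothetical helper mirroring the workflow's conditional expressions.
type Scope = { script: "e2e:critical" | "e2e"; timeoutMinutes: number };

function selectScope(eventName: string): Scope {
  // Only pull_request gets the fast @critical subset; push, schedule and
  // workflow_dispatch all fall through to the full suite.
  return eventName === "pull_request"
    ? { script: "e2e:critical", timeoutMinutes: 20 }
    : { script: "e2e", timeoutMinutes: 45 };
}

for (const event of ["pull_request", "push", "schedule", "workflow_dispatch"]) {
  const { script, timeoutMinutes } = selectScope(event);
  console.log(`${event}: npm run ${script} (timeout ${timeoutMinutes}min)`);
}
```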

The `CI: "true"` env var is set workflow-wide so playwright.config.ts
(lines 141 and 155) sees `process.env.CI` and flips reuseExistingServer
to false, guaranteeing a fresh backend + Vite per job.
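
A rough sketch of that switch, assuming the config uses the
`reuseExistingServer: !process.env.CI` form from C3. The function name
`shouldReuseExistingServer` is hypothetical and exists only to make the
behaviour testable outside Playwright.

```typescript
// Hypothetical standalone version of `reuseExistingServer: !process.env.CI`.
function shouldReuseExistingServer(
  env: Record<string, string | undefined>,
): boolean {
  // Environment values are strings, so any non-empty CI value (even
  // "false") is truthy and disables reuse; unset or empty enables it.
  return !env.CI;
}

console.log(shouldReuseExistingServer({ CI: "true" })); // false
console.log(shouldReuseExistingServer({}));             // true
console.log(shouldReuseExistingServer({ CI: "" }));     // true
```

Note that `CI=false` would still disable reuse, which is why an empty
value is the way to get local behaviour back.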

Secrets fall back to dev defaults (devpassword / 40-char dev JWT /
guest:guest@localhost:5672) so a fresh repo runs without configuring
secrets first; production-style runs should set `E2E_DB_PASSWORD`,
`E2E_JWT_SECRET`, `E2E_RABBITMQ_URL` in Forgejo Actions secrets.
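
The fallback relies on `${{ secrets.NAME || 'default' }}`: an unset
secret evaluates to an empty string, so `||` picks the default. A
minimal model of that behaviour, with `withFallback` as a hypothetical
name and Forgejo assumed to follow the GitHub Actions expression rules:

```typescript
// Hypothetical model of `${{ secrets.NAME || 'default' }}`.
function withFallback(secret: string | undefined, fallback: string): string {
  // Empty and undefined are both falsy, so both select the fallback.
  return secret || fallback;
}

console.log(withFallback(undefined, "devpassword")); // devpassword
console.log(withFallback("", "devpassword"));        // devpassword
console.log(withFallback("s3cret", "devpassword"));  // s3cret
```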

Runbook (docs/CI_E2E.md):
- Trigger / scope / target time table.
- Step-by-step explanation of what a CI run does.
- Required secrets + their fallbacks.
- "Reproducing a CI failure locally" — exact mirror of the workflow
  invocation so a dev can rerun without pushing.
- "Debugging a red run" — where to look in the Forgejo UI, what the
  artifacts contain, when to check SKIPPED_TESTS.md.
- "Adding a new E2E test" — fixture usage, when to tag @critical.

Action pin SHAs match the rest of the workflows (consistent supply-
chain hygiene). Go 1.25 (matches ci.yml backend job, NOT the older
1.24 used in the disabled accessibility.yml template).

Remaining Batch C item: C6 — flake stabilisation (~3-5 of the 22
SKIPPED_TESTS.md entries that look fixable). Defer to a follow-up
session — wiring the workflow first means the next push-to-main run
will tell us empirically which @critical tests are flaky in CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .github/workflows/e2e.yml | 155 ++++++++++++++++++++++++++++++++++++++
 docs/CI_E2E.md            | 144 +++++++++++++++++++++++++++++++++++
 2 files changed, 299 insertions(+)
 create mode 100644 .github/workflows/e2e.yml
 create mode 100644 docs/CI_E2E.md

diff --git a/.github/workflows/e2e.yml b/.github/workflows/e2e.yml
new file mode 100644
index 000000000..dce3237db
--- /dev/null
+++ b/.github/workflows/e2e.yml
@@ -0,0 +1,155 @@
+name: E2E Playwright
+
+# v1.0.8 Batch C — Playwright E2E suite triggered on PRs (@critical only,
+# fast feedback) + push to main and nightly (full suite, deeper coverage).
+# Uses the --ci seed flag (cmd/tools/seed --ci) for ~5s seeding instead
+# of the ~60s minimal seed.
+
+on:
+    pull_request:
+        branches: [main]
+    push:
+        branches: [main]
+    schedule:
+        # Nightly full run — 03:00 UTC keeps it off the daytime runner pool.
+        - cron: "0 3 * * *"
+    workflow_dispatch:
+
+env:
+    GIT_SSL_NO_VERIFY: "true"
+    NODE_TLS_REJECT_UNAUTHORIZED: "0"
+    # Forces playwright.config.ts:141,155 to spawn fresh backend + Vite
+    # instead of reusing whatever is on the runner.
+    CI: "true"
+
+jobs:
+    # ===========================================================================
+    # Job: e2e — single job (no matrix) that selects the test scope per trigger.
+    #   - PR              → @critical only (5-7min target)
+    #   - push main / cron / dispatch → full suite (~25min target)
+    # ===========================================================================
+    e2e:
+        name: e2e (${{ github.event_name == 'pull_request' && '@critical' || 'full' }})
+        runs-on: ubuntu-latest
+        timeout-minutes: ${{ github.event_name == 'pull_request' && 20 || 45 }}
+        steps:
+            - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+
+            - name: Set up Node
+              uses: actions/setup-node@1d0ff469b7ec7b3cb9d8673fde0c81c44821de2a # v4.2.0
+              with:
+                  node-version: "20"
+                  cache: "npm"
+                  cache-dependency-path: package-lock.json
+
+            - name: Set up Go
+              uses: actions/setup-go@f111f3307d8850f501ac008e886eec1fd1932a34 # v5.3.0
+              with:
+                  go-version: "1.25"
+                  cache: true
+                  cache-dependency-path: veza-backend-api/go.sum
+
+            - name: Install dependencies
+              run: npm ci
+
+            # Playwright tests reach the frontend via http://veza.fr:5174,
+            # which the browsers resolve via /etc/hosts. Without this entry
+            # the navigation step times out.
+            - name: Add veza.fr to hosts
+              run: echo "127.0.0.1 veza.fr" | sudo tee -a /etc/hosts
+
+            - name: Generate dev JWT keys + SSL cert
+              run: |
+                  ./scripts/generate-jwt-keys.sh
+                  ./scripts/generate-ssl-cert.sh
+
+            - name: Start backend services (Postgres, Redis, RabbitMQ)
+              run: |
+                  docker compose up -d postgres redis rabbitmq
+                  echo "Waiting for Postgres..."
+                  for i in $(seq 1 30); do
+                    if docker exec veza_postgres pg_isready -U veza 2>/dev/null; then
+                      echo "Postgres ready"
+                      break
+                    fi
+                    sleep 2
+                  done
+                  docker compose ps
+
+            - name: Run database migrations
+              env:
+                  DATABASE_URL: postgresql://veza:${{ secrets.E2E_DB_PASSWORD || 'devpassword' }}@localhost:15432/veza?sslmode=disable
+              run: |
+                  cd veza-backend-api
+                  go run cmd/migrate_tool/main.go
+
+            - name: Seed database (CI mode — 5 test accounts + minimal fixtures)
+              env:
+                  DATABASE_URL: postgresql://veza:${{ secrets.E2E_DB_PASSWORD || 'devpassword' }}@localhost:15432/veza?sslmode=disable
+              run: |
+                  cd veza-backend-api
+                  go run ./cmd/tools/seed --ci
+
+            - name: Build + start backend API
+              env:
+                  APP_ENV: test
+                  APP_PORT: "18080"
+                  DATABASE_URL: postgresql://veza:${{ secrets.E2E_DB_PASSWORD || 'devpassword' }}@localhost:15432/veza?sslmode=disable
+                  REDIS_URL: redis://localhost:16379
+                  JWT_SECRET: ${{ secrets.E2E_JWT_SECRET || 'ci-dev-jwt-secret-32-chars-min-padding!!' }}
+                  COOKIE_SECURE: "false"
+                  CORS_ALLOWED_ORIGINS: http://veza.fr:5174,http://localhost:5174
+                  RABBITMQ_URL: ${{ secrets.E2E_RABBITMQ_URL || 'amqp://guest:guest@localhost:5672/' }}
+                  DISABLE_RATE_LIMIT_FOR_TESTS: "true"
+                  RATE_LIMIT_LIMIT: "10000"
+                  RATE_LIMIT_WINDOW: "60"
+                  ACCOUNT_LOCKOUT_EXEMPT_EMAILS: "user@veza.music,artist@veza.music,admin@veza.music,mod@veza.music,new@veza.music"
+              run: |
+                  cd veza-backend-api
+                  go build -o veza-api ./cmd/api/main.go
+                  ./veza-api > /tmp/backend.log 2>&1 &
+                  sleep 10
+                  curl -sf http://localhost:18080/api/v1/health > /tmp/health.json || (echo "Backend health check failed"; tail -50 /tmp/backend.log; exit 1)
+                  jq -e '.status == "ok"' /tmp/health.json || (echo "Health response invalid"; cat /tmp/health.json; exit 1)
+                  echo "Backend healthy"
+
+            - name: Install Playwright browsers
+              run: npx playwright install --with-deps chromium
+
+            - name: Run E2E (@critical, PR scope)
+              if: github.event_name == 'pull_request'
+              env:
+                  PORT: "5174"
+                  VITE_API_URL: "/api/v1"
+                  VITE_DOMAIN: veza.fr
+                  VITE_BACKEND_PORT: "18080"
+                  PLAYWRIGHT_BASE_URL: "http://localhost:5174"
+              run: npm run e2e:critical
+
+            - name: Run E2E (full, push/cron/dispatch)
+              if: github.event_name != 'pull_request'
+              env:
+                  PORT: "5174"
+                  VITE_API_URL: "/api/v1"
+                  VITE_DOMAIN: veza.fr
+                  VITE_BACKEND_PORT: "18080"
+                  PLAYWRIGHT_BASE_URL: "http://localhost:5174"
+              run: npm run e2e
+
+            - name: Upload Playwright report
+              if: failure()
+              uses: actions/upload-artifact@65c4c4a1ddee5b72f698fdd19549f0f0fb45cf08 # v4.6.0
+              with:
+                  name: playwright-report-${{ github.run_id }}-${{ github.run_attempt }}
+                  path: |
+                      tests/e2e/playwright-report/
+                      tests/e2e/test-results/
+                  retention-days: 7
+
+            - name: Upload backend log
+              if: failure()
+              uses: actions/upload-artifact@65c4c4a1ddee5b72f698fdd19549f0f0fb45cf08 # v4.6.0
+              with:
+                  name: backend-log-${{ github.run_id }}-${{ github.run_attempt }}
+                  path: /tmp/backend.log
+                  retention-days: 7
diff --git a/docs/CI_E2E.md b/docs/CI_E2E.md
new file mode 100644
index 000000000..d42afab15
--- /dev/null
+++ b/docs/CI_E2E.md
@@ -0,0 +1,144 @@
+# E2E CI — runbook
+
+> **v1.0.8 Batch C** — Playwright E2E suite running on Forgejo Actions.
+> Workflow: `.github/workflows/e2e.yml`. Tests: `tests/e2e/*.spec.ts`.
+> Skipped tests inventory: `tests/e2e/SKIPPED_TESTS.md`.
+
+---
+
+## Triggers
+
+| Trigger | Scope | Target time | Why |
+|---|---|---|---|
+| PR opened / synced (against `main`) | `@critical` only | ~5–7 min | Fast feedback loop, blocks merge if red |
+| Push to `main` | Full suite | ~25 min | Catches regressions that slipped past `@critical` |
+| Nightly cron (03:00 UTC) | Full suite | ~25 min | Catches infra drift independent of merges |
+| `workflow_dispatch` | Full suite | manual | Re-run after a flaky failure or on a feature branch |
+
+`@critical` is a Playwright `--grep` tag — see `npm run e2e:critical`.
+
+---
+
+## How a CI run works
+
+1. `actions/checkout` + `setup-node` (Node 20) + `setup-go` (Go 1.25).
+2. `npm ci` from repo root.
+3. Adds `127.0.0.1 veza.fr` to `/etc/hosts` so the browsers can hit
+   the dev domain.
+4. Generates dev JWT keys + SSL cert via the existing scripts.
+5. Brings up `postgres / redis / rabbitmq` via `docker compose`.
+6. Runs Go migrations.
+7. **`go run ./cmd/tools/seed --ci`** — the lean seed: 5 test accounts
+   + 10 tracks + 3 playlists, no chat/live/marketplace/analytics. ~5s.
+8. Builds + starts the backend on `localhost:18080`, asserts
+   `/api/v1/health`.
+9. `playwright install --with-deps chromium`.
+10. Runs `npm run e2e:critical` (PR) or `npm run e2e` (push/cron/dispatch).
+    `CI=true` is exported globally so `playwright.config.ts:141,155`
+    spawns its own Vite + backend instance instead of trying to reuse.
+11. On failure: uploads the Playwright HTML report and `backend.log`
+    as artifacts, retained 7 days.
+
+---
+
+## Required secrets (Forgejo)
+
+The workflow falls back to dev defaults so it can still run on a
+fresh repo without secrets configured, but **production-style runs
+should set these in Forgejo Actions secrets**:
+
+| Secret | Default fallback | Purpose |
+|---|---|---|
+| `E2E_DB_PASSWORD` | `devpassword` | Postgres password (must match `docker-compose.yml`) |
+| `E2E_JWT_SECRET` | `ci-dev-jwt-secret-32-chars-min-padding!!` | HS256 signing key (32+ chars) |
+| `E2E_RABBITMQ_URL` | `amqp://guest:guest@localhost:5672/` | RabbitMQ AMQP URL |
+
+Without these, the workflow still passes for everything that doesn't
+exercise WebSocket / RabbitMQ paths under load.
+
+---
+
+## Reproducing a CI failure locally
+
+Mirrors the workflow exactly:
+
+```bash
+# From repo root
+make infra-up-dev                  # postgres + redis + rabbitmq
+cd veza-backend-api
+go run cmd/migrate_tool/main.go
+go run ./cmd/tools/seed --ci       # lean seed: 5 accounts, 10 tracks, 3 playlists
+go build -o veza-api ./cmd/api/main.go
+APP_ENV=test ./veza-api &
+
+# In another shell
+cd apps/web && npm run dev -- --host 127.0.0.1 --port 5174 &
+
+# Run the same tests CI ran
+cd /path/to/repo
+CI=true npm run e2e:critical       # PR scope
+# or
+CI=true npm run e2e                # full suite
+```
+
+If the failure only reproduces under `CI=true`, suspect
+`reuseExistingServer` — set `CI=` (empty) to flip back to local mode
+and bisect.
+
+---
+
+## Debugging a red run
+
+1. **Open the run** in the Forgejo Actions UI.
+2. Find the failing job's "Run E2E" step. Each test failure shows the
+   selector / assertion / screenshot inline.
+3. Scroll to the artifact section: download
+   `playwright-report-<run-id>-<attempt>` (the HTML report — opens in
+   any browser, shows trace viewer + video for retry-on-fail) and
+   `backend-log-<run-id>-<attempt>` (full backend stdout + stderr).
+4. If the failure looks env-related (404 on a known route, 500
+   without a clear cause), check `backend-log` for panics or
+   migration errors before assuming a test bug.
+5. Cross-check `tests/e2e/SKIPPED_TESTS.md` — if the test is already
+   listed as flaky, the right fix may be `.skip()` until the
+   underlying app bug is tracked.
+
+---
+
+## Adding a new E2E test
+
+1. Drop a `*.spec.ts` file under `tests/e2e/`.
+2. Tag it with `@critical` if it must run on every PR (be conservative
+   — every `@critical` test extends the PR feedback loop).
+3. Use the auth fixture from `tests/e2e/fixtures/auth.fixture.ts`
+   (`listenerPage` / `creatorPage` / `adminPage` / `moderatorPage`)
+   instead of writing UI login flows.
+4. If the test needs DB state outside the `--ci` seed (rare), seed it
+   from inside the test via `page.request.post(...)` rather than
+   extending the seed tool — keeps the seed lean.
+5. Run locally with `CI=true npm run e2e:critical -- --grep "your test"`
+   before pushing.
+
+---
+
+## Scaling considerations
+
+- Forgejo runner pool is shared across CI workflows — keep PR runs
+  under 10 min so we don't hold a runner during peak hours.
+- `docker compose up -d postgres redis rabbitmq` reuses the dev
+  compose file; if that file changes, the workflow inherits the
+  change automatically.
+- The full suite is gated to push/cron/dispatch precisely because we
+  don't want to pay 25 min on every PR push.
+
+---
+
+## Related
+
+- `tests/e2e/playwright.config.ts` — base config, `reuseExistingServer:
+  !process.env.CI` (committed in v1.0.8 C3, commit `46d21c5c`).
+- `veza-backend-api/cmd/tools/seed/config.go` — `CIConfig()` and the
+  `--ci` flag (committed in v1.0.8 C4, commit `cee850a5`).
+- `tests/e2e/SKIPPED_TESTS.md` — known flakes + tickets to resolve.
+- `docs/audit-2026-04/v107-plan.md` — historical context for E2E
+  coverage gaps that landed in v1.0.7.