veza/docs/CI_E2E.md
senke f23d23cf2b feat(ci): add E2E Playwright workflow + runbook (v1.0.8 C2 + C5)
Closes the second-to-last item of Batch C (after C3 reuseExistingServer
and C4 seed --ci flag landed earlier). Wires the existing Playwright
suite (60+ spec files in tests/e2e/) into Forgejo Actions.

Workflow shape (.github/workflows/e2e.yml):
- pull_request → @critical only (5-7min target, 20min timeout)
- push to main → full suite (~25min target, 45min timeout)
- nightly cron 03:00 UTC → full suite, catches infra drift
- workflow_dispatch → full suite, manual trigger

Single job structure with conditional steps based on github.event_name.
The job:
  1. Boots Postgres / Redis / RabbitMQ via docker compose.
  2. Runs Go migrations.
  3. `go run ./cmd/tools/seed --ci` — the lean seed landed in C4
     (5 test accounts + 10 tracks + 3 playlists, ~5s).
  4. Builds + starts the backend with APP_ENV=test plus
     DISABLE_RATE_LIMIT_FOR_TESTS=true and the lockout-exempt
     emails matching the auth fixture.
  5. `playwright install --with-deps chromium`.
  6. `npm run e2e:critical` (PR) or `npm run e2e` (push/cron).
  7. Uploads the Playwright HTML report + backend log on failure
     (7-day retention, sufficient for triage).

The `CI: "true"` env var is set workflow-wide so playwright.config.ts
(line 141, 155) sees `process.env.CI` and flips reuseExistingServer
to false, guaranteeing a fresh backend + Vite per job.

Secrets fall back to dev defaults (devpassword / 38-char dev JWT /
guest:guest@localhost:5672) so a fresh repo runs without configuring
secrets first; production-style runs should set `E2E_DB_PASSWORD`,
`E2E_JWT_SECRET`, `E2E_RABBITMQ_URL` in Forgejo Actions secrets.

Runbook (docs/CI_E2E.md):
- Trigger / scope / target time table.
- Step-by-step explanation of what a CI run does.
- Required secrets + their fallbacks.
- "Reproducing a CI failure locally" — exact mirror of the workflow
  invocation so a dev can rerun without pushing.
- "Debugging a red run" — where to look in the Forgejo UI, what the
  artifacts contain, when to check SKIPPED_TESTS.md.
- "Adding a new E2E test" — fixture usage, when to tag @critical.

Action pin SHAs match the rest of the workflows (consistent supply-
chain hygiene). Go 1.25 (matches ci.yml backend job, NOT the older
1.24 used in the disabled accessibility.yml template).

Remaining Batch C item: C6 — flake stabilisation (~3-5 of the 22
SKIPPED_TESTS.md entries that look fixable). Defer to a follow-up
session — wiring the workflow first means the next push-to-main run
will tell us empirically which @critical tests are flaky in CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 23:51:33 +02:00


E2E CI — runbook

v1.0.8 Batch C — Playwright E2E suite running on Forgejo Actions. Workflow: .github/workflows/e2e.yml. Tests: tests/e2e/*.spec.ts. Skipped tests inventory: tests/e2e/SKIPPED_TESTS.md.


Triggers

| Trigger | Scope | Target time | Why |
|---|---|---|---|
| PR opened / synced (against main) | @critical only | ~5-7 min | Fast feedback loop, blocks merge if red |
| Push to main | Full suite | ~25 min | Catches regressions that slipped past @critical |
| Nightly cron (03:00 UTC) | Full suite | ~25 min | Catches infra drift independent of merges |
| workflow_dispatch | Full suite | manual | Re-run after a flaky failure or on a feature branch |

@critical is a Playwright --grep tag — see npm run e2e:critical.
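
The trigger-to-scope mapping can be sketched as a small function. This is an illustration only: `selectSuite` and the `Suite` type are hypothetical names, and the real workflow expresses the same choice through conditional steps on `github.event_name`. The `schedule` event name is the one Actions uses for cron triggers.

```typescript
// Hypothetical helper that mirrors the trigger table; not the workflow's real YAML.
type Suite = { scope: "critical" | "full"; npmScript: "e2e:critical" | "e2e" };

// eventName corresponds to github.event_name in the workflow context.
function selectSuite(eventName: string): Suite {
  // Only pull requests get the fast @critical subset.
  if (eventName === "pull_request") {
    return { scope: "critical", npmScript: "e2e:critical" };
  }
  // push, schedule (nightly cron) and workflow_dispatch all run the full suite.
  return { scope: "full", npmScript: "e2e" };
}

for (const event of ["pull_request", "push", "schedule", "workflow_dispatch"]) {
  console.log(`${event} -> npm run ${selectSuite(event).npmScript}`);
}
```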


How a CI run works

  1. actions/checkout, then setup-node (Node 20) and setup-go (Go 1.25).
  2. npm ci from repo root.
  3. Adds 127.0.0.1 veza.fr to /etc/hosts so the browsers can hit the dev domain.
  4. Generates dev JWT keys + SSL cert via the existing scripts.
  5. Brings up postgres / redis / rabbitmq via docker compose.
  6. Runs Go migrations.
  7. go run ./cmd/tools/seed --ci runs the lean seed: 5 test accounts + 10 tracks + 3 playlists, no chat/live/marketplace/analytics. Takes ~5s.
  8. Builds + starts the backend on localhost:18080, asserts /api/v1/health.
  9. playwright install --with-deps chromium.
  10. Runs npm run e2e:critical (PR) or npm run e2e (push/cron). CI=true is exported globally so playwright.config.ts:141,155 spawns its own Vite + backend instance instead of trying to reuse.
  11. On failure: uploads the Playwright HTML report and backend.log as artifacts, retained 7 days.

Required secrets (Forgejo)

The workflow falls back to dev defaults so it can still run on a fresh repo without secrets configured, but production-style runs should set these in Forgejo Actions secrets:

| Secret | Default fallback | Purpose |
|---|---|---|
| E2E_DB_PASSWORD | devpassword | Postgres password (must match docker-compose.yml) |
| E2E_JWT_SECRET | ci-dev-jwt-secret-32-chars-min-padding!! | HS256 signing key (32+ chars) |
| E2E_RABBITMQ_URL | amqp://guest:guest@localhost:5672/ | RabbitMQ AMQP URL |

Without these, the workflow still passes for everything that doesn't exercise WebSocket / RabbitMQ paths under load.
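
The fallback behaviour can be modelled as follows. This is a minimal sketch, not the workflow's actual expression syntax: `resolveSecrets`, `DEV_DEFAULTS` and the `Env` type are hypothetical names, and the defaults are copied from the table above. It assumes an unset secret reaches the job as an empty string, which is why `||` is used rather than `??`.

```typescript
// Defaults copied from the secrets table; the helper itself is hypothetical.
const DEV_DEFAULTS = {
  E2E_DB_PASSWORD: "devpassword",
  E2E_JWT_SECRET: "ci-dev-jwt-secret-32-chars-min-padding!!",
  E2E_RABBITMQ_URL: "amqp://guest:guest@localhost:5672/",
} as const;

type SecretName = keyof typeof DEV_DEFAULTS;
type Env = Partial<Record<string, string>>;

function resolveSecrets(env: Env): Record<SecretName, string> {
  // `||` treats both undefined and "" as "not configured" (assumption: unset
  // secrets arrive as empty strings).
  const pick = (name: SecretName): string => env[name] || DEV_DEFAULTS[name];
  const resolved = {
    E2E_DB_PASSWORD: pick("E2E_DB_PASSWORD"),
    E2E_JWT_SECRET: pick("E2E_JWT_SECRET"),
    E2E_RABBITMQ_URL: pick("E2E_RABBITMQ_URL"),
  };
  // The table states the HS256 key must be 32+ characters.
  if (resolved.E2E_JWT_SECRET.length < 32) {
    throw new Error("E2E_JWT_SECRET must be at least 32 characters");
  }
  return resolved;
}

console.log(resolveSecrets({}).E2E_RABBITMQ_URL);
```

Failing fast on a short JWT secret is a design choice of this sketch; the source does not say whether the workflow or the backend enforces the length.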


Reproducing a CI failure locally

Mirrors the workflow exactly:

# From repo root
make infra-up-dev                  # postgres + redis + rabbitmq
cd veza-backend-api
go run cmd/migrate_tool/main.go
go run ./cmd/tools/seed --ci       # lean seed: 5 test accounts + 10 tracks + 3 playlists
go build -o veza-api ./cmd/api/main.go
APP_ENV=test DISABLE_RATE_LIMIT_FOR_TESTS=true ./veza-api &

# In another shell
cd apps/web && npm run dev -- --host 127.0.0.1 --port 5174 &

# Run the same tests CI ran
cd /path/to/repo
CI=true npm run e2e:critical       # PR scope
# or
CI=true npm run e2e                # full suite

If the failure only reproduces under CI=true, suspect reuseExistingServer — set CI= (empty) to flip back to local mode and bisect.
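
The toggle is plain JavaScript truthiness on the `CI` variable. The sketch below evaluates the same expression as the documented config (`reuseExistingServer: !process.env.CI`) against a few environments; the function and the `Env` type are hypothetical wrappers so the behaviour can be checked without Playwright.

```typescript
type Env = Partial<Record<string, string>>;

// Same expression as the documented config entry, applied to an explicit env object.
function reuseExistingServer(env: Env): boolean {
  return !env.CI;
}

console.log(reuseExistingServer({ CI: "true" }));  // false: CI mode, fresh servers
console.log(reuseExistingServer({ CI: "" }));      // true: empty string is falsy
console.log(reuseExistingServer({}));              // true: unset behaves like empty
console.log(reuseExistingServer({ CI: "false" })); // false: any non-empty string is truthy
```

The last case is the one to watch when bisecting: `CI=false` still selects CI mode, so the variable has to be empty or unset to get local behaviour.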


Debugging a red run

  1. Open the run in Forgejo Actions UI.
  2. Find the failing job's "Run E2E" step. Each test failure shows the selector / assertion / screenshot inline.
  3. Scroll to the artifact section: download playwright-report-<run-id>-<attempt> (the HTML report — opens in any browser, shows trace viewer + video for retry-on-fail) and backend-log-<run-id>-<attempt> (full backend stdout + stderr).
  4. If the failure looks env-related (404 on a known route, 500 without a clear cause), check backend-log for panics or migration errors before assuming a test bug.
  5. Cross-check tests/e2e/SKIPPED_TESTS.md — if the test is already listed as flaky, the right fix may be .skip() until the underlying app bug is tracked.
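
For step 4, a quick first pass over the downloaded backend log can be scripted. This is a rough sketch under stated assumptions: `findSuspectLines` is a hypothetical helper, the sample log lines are invented for illustration, and the migration pattern is a guess at the log wording. The only format it relies on is that the Go runtime prints `panic: <message>` when a goroutine panics.

```typescript
// Hypothetical triage patterns; adjust them to the real backend log format.
const SUSPECT_PATTERNS: RegExp[] = [
  /\bpanic: /,                  // Go runtime panic banner
  /migration.*(error|failed)/i, // assumed wording for migration failures
];

// Returns the log lines that match any suspect pattern, in original order.
function findSuspectLines(log: string): string[] {
  return log
    .split("\n")
    .filter((line) => SUSPECT_PATTERNS.some((pattern) => pattern.test(line)));
}

// Invented sample, standing in for the contents of backend.log.
const sampleLog = [
  "INFO server listening on :18080",
  "ERROR migration 0042 failed: relation already exists",
  "panic: runtime error: invalid memory address or nil pointer dereference",
  "INFO GET /api/v1/health 200",
].join("\n");

console.log(findSuspectLines(sampleLog));
```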

Adding a new E2E test

  1. Drop a *.spec.ts file under tests/e2e/.
  2. Tag it with @critical if it must run on every PR (be conservative — every @critical test extends the PR feedback loop).
  3. Use the auth fixture from tests/e2e/fixtures/auth.fixture.ts (listenerPage / creatorPage / adminPage / moderatorPage) instead of writing UI login flows.
  4. If the test needs DB state outside the --ci seed (rare), seed it from inside the test via page.request.post(...) rather than extending the seed tool — keeps the seed lean.
  5. Run locally with CI=true npm run e2e:critical -- --grep "your test" before pushing.
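
To make the tagging rule in step 2 concrete, here is a simplified model of tag selection. Playwright's `--grep` option takes a regular expression and runs only the tests whose title matches it, so a tag is just text in the title. The helper name and the test titles below are hypothetical, and the sketch assumes `npm run e2e:critical` boils down to a `--grep @critical` filter.

```typescript
// Simplified model of --grep: keep the titles that match the pattern.
function selectByGrep(titles: string[], grep: RegExp): string[] {
  return titles.filter((title) => grep.test(title));
}

// Invented test titles, for illustration only.
const titles = [
  "login succeeds for a listener @critical",
  "creator uploads a track @critical",
  "admin exports analytics CSV",
];

// Only the two tagged tests would run on a PR.
console.log(selectByGrep(titles, /@critical/));
```

An untagged test still runs on push to main, on the nightly cron, and on manual dispatch, so leaving the tag off delays feedback rather than removing coverage.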

Scaling considerations

  • Forgejo runner pool is shared across CI workflows — keep PR runs under 10 min so we don't hold a runner during peak hours.
  • docker compose up -d postgres redis rabbitmq reuses the dev compose file; if that file changes, the workflow inherits the change automatically.
  • The full suite is gated to push/cron/dispatch precisely because we don't want to pay 25 min on every PR push.

References

  • tests/e2e/playwright.config.ts — base config, reuseExistingServer: !process.env.CI (committed in v1.0.8 C3, commit 46d21c5c).
  • veza-backend-api/cmd/tools/seed/config.go: CIConfig() and the --ci flag (committed in v1.0.8 C4, commit cee850a5).
  • tests/e2e/SKIPPED_TESTS.md — known flakes + tickets to resolve.
  • docs/audit-2026-04/v107-plan.md — historical context for E2E coverage gaps that landed in v1.0.7.