# `scripts/bootstrap/`

Two-host bootstrap of the Veza deploy pipeline. Each script is idempotent, resumable, and read-only by default unless explicitly asked to mutate.

## Files

| File | Where it runs | What it does |
|---|---|---|
| `lib.sh` | sourced by all | logging, error trap, idempotent state file, Forgejo API helpers (honours `FORGEJO_INSECURE=1`) |
| `bootstrap-local.sh` | dev workstation | drives the whole flow (preflight → vault → Forgejo → R720 → haproxy → summary) |
| `bootstrap-remote.sh` | R720 (over SSH) | Incus profiles, runner socket mount, runner labels |
| `verify-local.sh` | dev workstation | read-only checks of local state |
| `verify-remote.sh` | R720 | read-only checks of R720 state (run via `verify-remote-ssh.sh`) |
| `verify-remote-ssh.sh` | dev workstation | scp+ssh wrapper that runs `verify-remote.sh` on R720 |
| `enable-auto-deploy.sh` | dev workstation | restores `.forgejo/workflows/` if disabled, uncomments the `push:` trigger |
| `reset-vault.sh` | dev workstation | recovery from a vault password mismatch (destructive, re-prompts) |
| `.env.example` | template | copy to `.env`, fill in, gitignored |

## State file

Each host keeps a per-host state file with `phase=DONE timestamp` lines, so a re-run is a no-op for completed phases:

```
local : /.git/talas-bootstrap/local.state
R720  : /var/lib/talas/bootstrap.state
```

To force a phase re-run, delete its line:

```bash
sed -i '/^vault=/d' .git/talas-bootstrap/local.state
```

## Inter-script communication

`bootstrap-local.sh` invokes `bootstrap-remote.sh` over SSH by concatenating `lib.sh` + `bootstrap-remote.sh` and piping the result into `sudo -E bash -s` on the R720.
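
The concatenate-and-pipe invocation can be sketched roughly as below. This is only an illustration, not the code in `bootstrap-local.sh`: the helper name `pipe_scripts` and the throwaway stand-in files are assumptions, and the demo pipes into a local `bash -s` so it can be tried without access to the R720.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not taken from bootstrap-local.sh): stream a shared
# library followed by a script into a command's stdin, so the receiving
# `bash -s` reads them as one program.
pipe_scripts() {
  local lib="$1" script="$2"
  shift 2
  cat "$lib" "$script" | "$@"
}

# Local dry run with throwaway stand-ins for lib.sh and bootstrap-remote.sh.
tmp="$(mktemp -d)"
printf 'log() { echo "[demo] $*"; }\n' > "$tmp/lib.sh"
printf 'log "phase from remote half"\n' > "$tmp/remote.sh"
demo_out="$(pipe_scripts "$tmp/lib.sh" "$tmp/remote.sh" bash -s)"
echo "$demo_out"
rm -rf "$tmp"

# Against the real host the call would look like this (not executed here):
#   pipe_scripts lib.sh bootstrap-remote.sh \
#     ssh "${R720_USER}@${R720_HOST}" 'sudo -E bash -s'
```

Because both files arrive as a single stream, functions defined in the library are available to the script without anything being copied to the remote host first. This only works cleanly if the library file ends with a newline.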
The remote script:

* writes `/var/log/talas-bootstrap.log` on R720 (persistent)
* emits `>>>PHASE::<<<` markers on stdout

The local script `tee`s those markers to stderr so the operator sees remote progress in the same terminal as the local logs.

Resumability: thanks to the state file, an SSH disconnect or partial failure leaves whatever work was completed marked DONE. Re-run `bootstrap-local.sh` and it picks up where it stopped.

## Quickstart

```bash
cd /home/senke/git/talas/veza/scripts/bootstrap
cp .env.example .env
$EDITOR .env       # fill in FORGEJO_ADMIN_TOKEN at minimum
chmod +x *.sh

# Set up everything
./bootstrap-local.sh

# Or skip phases you've already done
PHASE=4 ./bootstrap-local.sh

# Verify any time
./verify-local.sh
ssh ansible@10.0.20.150 'sudo bash' < verify-remote.sh
```

## What each phase needs

| Phase | Needs |
|---|---|
| 1. preflight | `git`, `ansible`, `dig`, `ssh`, `jq` locally; SSH to R720; DNS resolved (warning only if missing) |
| 2. vault | nothing; prompts for the vault password and edits `vault.yml` from the template |
| 3. forgejo | `FORGEJO_ADMIN_TOKEN`, as an env var or in `.env` |
| 4. r720 | `FORGEJO_ADMIN_TOKEN` (used to fetch the runner registration token); SSH to R720 with sudo |
| 5. haproxy | public DNS names resolved + port 80 reachable from the Internet; a vault that Ansible can decrypt |
| 6. summary | nothing |

## Troubleshooting

- **Phase 1 SSH fails**: verify `R720_HOST` + `R720_USER` in `.env`. If you use an SSH config alias (e.g. `Host srv-102v` in `~/.ssh/config`), set `R720_HOST=srv-102v` and either set `R720_USER=` (empty, so the alias's `User` wins) or match the alias's user. Test manually: `ssh ${R720_USER}@${R720_HOST} /bin/true`.
- **Phase 2 `cannot decrypt vault.yml`**: the password in `.vault-pass` doesn't match the one used to encrypt `vault.yml`.
  - If you remember the original password, edit `.vault-pass` (`echo "" > infra/ansible/.vault-pass ; chmod 0400 …`).
  - Otherwise: `./reset-vault.sh` (destructive, re-prompts for everything).
- **Phase 3 `Forgejo API unreachable`**: Forgejo on `https://10.0.20.105:3000` serves a self-signed cert. Set `FORGEJO_INSECURE=1` in `.env`. Once the edge HAProxy is up and Let's Encrypt has issued the certificate for `forgejo.talas.group`, switch to that URL and clear `FORGEJO_INSECURE`.
- **Phase 3 `repo not found`**: set `FORGEJO_OWNER` to the actual org/user owning the repo. Confirm with `git remote -v` (the path segment after `host:port/`).
- **Phase 4 SSH timeout / sudo prompt**: the SSH user needs passwordless sudo. Add to `/etc/sudoers.d/talas-bootstrap`:

  ```
  senke ALL=(ALL) NOPASSWD: /usr/bin/bash
  ```

  Or run the remote half manually:

  ```
  scp scripts/bootstrap/{lib.sh,bootstrap-remote.sh} srv-102v:/tmp/
  ssh srv-102v 'sudo FORGEJO_REGISTRATION_TOKEN= bash /tmp/bootstrap-remote.sh'
  ```
- **Phase 5 dehydrated fails**: port 80 must be reachable from the Internet for HTTP-01 (NAT-forwarded and not blocked by the ISP). Test from outside: `curl http://veza.fr/.well-known/acme-challenge/test` should hit HAProxy's `letsencrypt_backend`. It will return 404, which is fine; what matters is that the request reaches the R720.
- **`.forgejo/workflows/` is missing, only `workflows.disabled/` present**: expected when the auto-trigger has been gated by renaming the directory. `enable-auto-deploy.sh` restores it.

## After bootstrap

- Trigger the first deploy manually via the Forgejo UI: Actions → Veza deploy → Run workflow.
- Once it is green, run `./enable-auto-deploy.sh` to re-enable the push trigger.
- `verify-local.sh` + `verify-remote.sh` are safe to run any time.
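
For readers curious how the skip-if-DONE behaviour from the "State file" section might work, here is a minimal sketch. The real helpers live in `lib.sh` and may differ. The function names (`phase_done`, `mark_done`, `run_phase`), the timestamp format, and the single space after `DONE` are assumptions. The demo writes to a temporary file so it never touches a real state file, and it uses GNU `sed -i` as the section above does.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical state-file helpers; names and line format are assumptions.
STATE_FILE="${STATE_FILE:-$(mktemp)}"

phase_done() {   # succeeds if the phase already has a DONE line
  grep -q "^$1=DONE " "$STATE_FILE" 2>/dev/null
}

mark_done() {    # appends "<phase>=DONE <timestamp>" unless already present
  phase_done "$1" || printf '%s=DONE %s\n' "$1" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$STATE_FILE"
}

run_phase() {    # runs the given command only when the phase is not yet DONE
  local phase="$1"
  shift
  if phase_done "$phase"; then
    echo "skip $phase"
  else
    "$@" && mark_done "$phase"
  fi
}

first="$(run_phase vault true)"     # executes `true`, then records vault=DONE
second="$(run_phase vault true)"    # already DONE, so the command is skipped
echo "$second"

# Deleting the line forces the phase to run again on the next invocation.
sed -i '/^vault=/d' "$STATE_FILE"
```

A phase is recorded only after its command succeeds, which is what lets an interrupted run resume from the first phase that did not complete.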