# `scripts/bootstrap/`

Two-host bootstrap of the Veza deploy pipeline. Each script is
idempotent, resumable, and read-only by default unless explicitly
asked to mutate.

## Files

| File | Where it runs | What it does |
|---|---|---|
| `lib.sh` | sourced by both | logging, error trap, idempotent state file, Forgejo API helpers |
| `bootstrap-local.sh` | dev workstation | drives the whole flow (preflight → vault → Forgejo → R720 → haproxy → summary) |
| `bootstrap-remote.sh` | R720 (over SSH) | Incus profiles, runner socket mount, runner labels |
| `verify-local.sh` | dev workstation | read-only checks of local state |
| `verify-remote.sh` | R720 | read-only checks of R720 state |
| `enable-auto-deploy.sh` | dev workstation | flips the `deploy.yml` gate from `workflow_dispatch`-only to `push:main` + `tag:v*` |
| `.env.example` | template | copy to `.env` and fill in; `.env` is gitignored |

## State file

Each host keeps its own state file with `phase=DONE timestamp`
lines, so a re-run is a no-op for completed phases:

```
local : <repo>/.git/talas-bootstrap/local.state
R720  : /var/lib/talas/bootstrap.state
```

To force a phase to re-run, delete its line:

```bash
sed -i '/^vault=/d' .git/talas-bootstrap/local.state
```

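The skip-if-done behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `lib.sh`: the function names `phase_done`, `mark_done` and `run_phase`, the `STATE_FILE` variable, and the exact timestamp format are all assumptions.

```bash
# Hypothetical helpers; the real implementation lives in lib.sh and may differ.
# Assumption: the state file path can be overridden through STATE_FILE.
STATE_FILE="${STATE_FILE:-.git/talas-bootstrap/local.state}"

# Succeeds if the phase already has a "phase=DONE ..." line.
phase_done() {
  grep -q "^$1=DONE " "$STATE_FILE" 2>/dev/null
}

# Appends "phase=DONE <UTC timestamp>" to the state file.
mark_done() {
  printf '%s=DONE %s\n' "$1" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$STATE_FILE"
}

# run_phase <name> <command...>
# Skips the command if the phase is recorded; records it only on success.
run_phase() {
  local name="$1"
  shift
  if phase_done "$name"; then
    echo "skip $name"
    return 0
  fi
  "$@" && mark_done "$name"
}
```

Because a phase is recorded only after its command succeeds, a failed or interrupted phase leaves no line behind and runs again on the next invocation.
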
## Inter-script communication

`bootstrap-local.sh` invokes `bootstrap-remote.sh` over SSH by
concatenating `lib.sh` + `bootstrap-remote.sh` and piping the result into
`sudo -E bash -s` on the R720. The remote script:

* writes `/var/log/talas-bootstrap.log` on the R720 (persistent)
* emits `>>>PHASE:<name>:<status><<<` markers on stdout

The local script `tee`s those markers to stderr so the operator sees
remote progress in the same terminal as the local logs.

Resumability: because of the state file, an SSH disconnect or partial
failure leaves the work already completed marked DONE. Re-run
`bootstrap-local.sh` and it picks up where it stopped.

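As an illustration of the marker protocol, here is one possible way to consume the remote stream on the local side. It is a sketch under assumptions: the function name `relay_phase_markers`, the `[remote]` prefix and the `R720_HOST` variable are hypothetical, and the real script uses `tee` rather than this filter.

```bash
# Reads the remote output on stdin. Phase markers are rewritten as readable
# progress lines on stderr; every other line passes through on stdout.
relay_phase_markers() {
  local line name status
  while IFS= read -r line; do
    case "$line" in
      '>>>PHASE:'*':'*'<<<')
        line="${line#'>>>PHASE:'}"   # drop the leading marker
        line="${line%'<<<'}"         # drop the trailing marker
        name="${line%%:*}"           # text before the first colon
        status="${line#*:}"          # text after the first colon
        printf '[remote] %s: %s\n' "$name" "$status" >&2
        ;;
      *)
        printf '%s\n' "$line"
        ;;
    esac
  done
}

# Possible wiring (not executed here; R720_HOST is a placeholder):
#   cat lib.sh bootstrap-remote.sh \
#     | ssh "$R720_HOST" 'sudo -E bash -s' \
#     | relay_phase_markers
```

Keeping the markers on a separate stream from the ordinary output means the remote log text can still be captured or redirected without losing the progress display.
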
## Quickstart

```bash
cd /home/senke/git/talas/veza/scripts/bootstrap
cp .env.example .env
$EDITOR .env    # fill in FORGEJO_ADMIN_TOKEN at minimum
chmod +x *.sh

# Set up everything
./bootstrap-local.sh

# Or skip phases you've already done
PHASE=4 ./bootstrap-local.sh

# Verify any time
./verify-local.sh
ssh ansible@10.0.20.150 'sudo bash' < verify-remote.sh
```

## What each phase needs

| Phase | Needs |
|---|---|
| 1. preflight | `git`, `ansible`, `dig`, `ssh`, `jq` installed locally; SSH access to the R720; DNS resolving (warning only if missing) |
| 2. vault | nothing; prompts for the vault password and edits `vault.yml` from the template |
| 3. forgejo | `FORGEJO_ADMIN_TOKEN`, as an env var or in `.env` |
| 4. r720 | `FORGEJO_ADMIN_TOKEN` (used to fetch the runner registration token); SSH access to the R720 with sudo |
| 5. haproxy | public DNS records for the domains resolving and port 80 reachable from the Internet; a vault that Ansible can decrypt |
| 6. summary | nothing |

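The tool check in phase 1 could look something like the sketch below. The function names `missing_tools` and `preflight_tools` are assumptions for illustration; only the tool list comes from the table above, and the actual preflight in `bootstrap-local.sh` may be structured differently.

```bash
# Prints each named tool that is not on PATH, one per line.
# Prints nothing when everything is installed.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Fails with a message on stderr if any tool required by phase 1 is absent.
preflight_tools() {
  local missing
  missing="$(missing_tools git ansible dig ssh jq)"
  if [ -n "$missing" ]; then
    printf 'missing tools: %s\n' "$(printf '%s' "$missing" | tr '\n' ' ')" >&2
    return 1
  fi
}
```

Reporting every missing tool in one pass, rather than stopping at the first, saves the operator from repeated install-and-retry cycles.
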
## Troubleshooting

- **Phase 3 `repo not found`**: set `FORGEJO_OWNER` to the actual
  org/user owning the repo (e.g., `senke` instead of `talas`).
- **Phase 4 SSH timeout**: `sudo` may be prompting for a password. Configure
  passwordless sudo for the SSH user, or run the remote bootstrap manually:
  ```
  scp scripts/bootstrap/{lib.sh,bootstrap-remote.sh} r720:/tmp/
  ssh r720 'sudo FORGEJO_REGISTRATION_TOKEN=… bash /tmp/bootstrap-remote.sh'
  ```
- **Phase 5 dehydrated fails**: check that port 80 reaches the R720
  from the Internet (not blocked by the ISP, NAT-forwarded to the R720, etc.).
  dehydrated needs inbound HTTP-01. Test: from outside,
  `curl http://veza.fr/.well-known/acme-challenge/test` should hit
  HAProxy's `letsencrypt_backend` (it will return 404, which is fine; what
  matters is that the request reaches the R720).

## After bootstrap

- Trigger the first deploy manually via the Forgejo UI: Actions → Veza deploy → Run workflow.
- Once it is green, run `./enable-auto-deploy.sh` to enable the push and tag triggers.
- `verify-local.sh` and `verify-remote.sh` are safe to run at any time.