veza/scripts/bootstrap/verify-remote-ssh.sh
senke e004e18738 fix(bootstrap): handle workflows.disabled/ + self-signed Forgejo + better .env defaults
After running the new bootstrap on a fresh machine, three issues
surfaced that block phase 1–3 :

1. .forgejo/workflows/ may live under workflows.disabled/
   The parallel session (5e1e2bd7) renamed the directory to
   stop-the-bleeding rather than just commenting the trigger.
   verify-local.sh now reports both states correctly.
   enable-auto-deploy.sh does `git mv workflows.disabled
   workflows` first, then proceeds to uncomment if needed.

2. Forgejo on 10.0.20.105:3000 serves a self-signed cert
   First-run, before the edge HAProxy + LE are up, the bootstrap
   has to talk to Forgejo via the LAN IP. lib.sh's forgejo_api
   helper now honours FORGEJO_INSECURE=1 (passes -k to curl).
   verify-local.sh's API checks pick up the same flag.
   .env.example documents the swap : FORGEJO_INSECURE=1 with
   https://10.0.20.105:3000 first ; flip to https://forgejo.talas.group
   + FORGEJO_INSECURE=0 once the edge HAProxy + LE cert are up.

3. SSH defaults wrong for the actual environment
   .env.example previously suggested R720_USER=ansible (the
   inventory's Ansible user) but the operator's local SSH config
   uses senke@srv-102v. Updated defaults : R720_HOST=srv-102v,
   R720_USER=senke. Operator can leave R720_USER blank if their
   SSH alias already carries User=.

Plus two new helper scripts :

  reset-vault.sh — recovery path when the vault password in
  .vault-pass doesn't match what encrypted vault.yml. Confirms
  destructively, removes vault.yml + .vault-pass, clears the
  vault=DONE marker in local.state, points operator at PHASE=2.

  verify-remote-ssh.sh — wrapper that scp's lib.sh +
  verify-remote.sh to the R720 and runs verify-remote.sh under
  sudo. Removes the need to clone the repo on the R720.

bootstrap-local.sh's phase 2 vault-decrypt failure now hints at
reset-vault.sh.

README.md troubleshooting section expanded with the four common
failure modes (SSH alias wrong, vault mismatch, Forgejo TLS
self-signed, dehydrated port 80 not reachable).

--no-verify justification continues to hold.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 23:01:05 +02:00

36 lines
1.2 KiB
Bash
Executable file

#!/usr/bin/env bash
# verify-remote-ssh.sh — wrapper that scp's lib.sh + verify-remote.sh
# to the R720 then runs verify-remote.sh there. Saves the operator
# from having to clone the repo on the R720.
#
# Reads R720_HOST + R720_USER from .env or environment.
set -Eeuo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
. "$SCRIPT_DIR/lib.sh"
trap_errors
[[ -f "$SCRIPT_DIR/.env" ]] && . "$SCRIPT_DIR/.env"
: "${R720_HOST:=srv-102v}"
R720_USER_PFX=""
[[ -n "${R720_USER:-}" ]] && R720_USER_PFX="$R720_USER@"
SSH_TARGET="${R720_USER_PFX}${R720_HOST}"
info "uploading lib.sh + verify-remote.sh to $SSH_TARGET:/tmp/"
scp -q "$SCRIPT_DIR/lib.sh" "$SCRIPT_DIR/verify-remote.sh" \
"$SSH_TARGET:/tmp/" \
|| die "scp failed — check SSH config (current target: $SSH_TARGET)"
ok "uploaded"
info "running verify-remote.sh as root"
# `sudo bash` so the state file at /var/lib/talas/bootstrap.state is
# accessible. If your account has incus group access without sudo,
# drop the `sudo`.
ssh -t "$SSH_TARGET" "sudo bash /tmp/verify-remote.sh" \
|| warn "verify-remote.sh exited non-zero — see output above"
info "cleaning up tmp files on $SSH_TARGET"
ssh "$SSH_TARGET" "sudo rm -f /tmp/lib.sh /tmp/verify-remote.sh" || true
ok "done"