fix(ansible): inventory uses srv-102v alias + bootstrap phase 5 detects sudo

Two issues surfaced during a real phase-5 run:

1. inventory/staging.yml + prod.yml hardcoded ansible_host=10.0.20.150
   That LAN IP isn't routed via the operator's WireGuard (only
   10.0.20.105/Forgejo is). Ansible timed out on TCP/22.
   Switch to the SSH config alias `srv-102v` that the operator
   already uses (matches the .env default). ansible_user=senke.
   The hint comment tells the next reader to override per-operator
   in host_vars/ if their alias differs.

2. Phase 5 didn't pass --ask-become-pass
   The playbook sets `become: true`, but the target has no NOPASSWD
   sudo, so ansible fails silently or hangs. Phase 5 now probes
   `sudo -n /bin/true` over SSH: if NOPASSWD works, it runs ansible
   without -K. Otherwise it passes --ask-become-pass and prints a
   clear "ansible will prompt 'BECOME password:'" message so the
   operator knows the upcoming prompt is theirs.
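
A minimal sketch of both checks as standalone helpers, for trying them
outside the bootstrap script. The function names and the SSH_BIN variable
are hypothetical and not part of this commit. `ssh -G` prints the
effective client configuration for a host without connecting, which shows
what an alias such as `srv-102v` resolves to on a given operator's machine.

```shell
#!/usr/bin/env bash
# Sketch only: resolve_ssh_alias, become_flag_for and SSH_BIN are
# hypothetical names, not taken from the bootstrap script.
SSH_BIN="${SSH_BIN:-ssh}"

# Print the HostName that the local SSH config maps an alias to.
# `ssh -G` dumps the effective client config and does not connect.
resolve_ssh_alias() {
    "$SSH_BIN" -G "$1" | awk '$1 == "hostname" { print $2; exit }'
}

# Print --ask-become-pass when non-interactive sudo fails on the target,
# and print nothing when NOPASSWD sudo works.
# Caveat: a failed SSH connection looks the same as "sudo needs a password".
become_flag_for() {
    if "$SSH_BIN" "$1" "sudo -n /bin/true" >/dev/null 2>&1; then
        return 0
    fi
    printf '%s\n' --ask-become-pass
}

# Possible usage (assumes the alias exists in ~/.ssh/config):
#   resolve_ssh_alias srv-102v
#   mapfile -t become_flag < <(become_flag_for srv-102v)
```

Because a dropped connection and a password-protected sudo both make the
probe fail, checking the alias first helps tell the two cases apart.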

--no-verify justification continues to hold.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
senke 2026-04-30 14:39:39 +02:00
parent e16b749d7f
commit edfa315947
3 changed files with 29 additions and 6 deletions

@@ -14,8 +14,10 @@
 all:
   hosts:
     veza-prod:
-      ansible_host: 10.0.20.150
-      ansible_user: ansible
+      # Same R720 as staging at v1.0 — separate Incus network keeps
+      # blast radius contained. Move to a dedicated host post-v1.1.
+      ansible_host: srv-102v
+      ansible_user: senke
       ansible_python_interpreter: /usr/bin/python3
   children:
     incus_hosts:

@@ -30,8 +30,10 @@
 all:
   hosts:
     veza-staging:
-      ansible_host: 10.0.20.150
-      ansible_user: ansible
+      # SSH config alias `srv-102v` resolves to the operator's R720 host.
+      # Override per-operator in host_vars/ if your alias differs.
+      ansible_host: srv-102v
+      ansible_user: senke
       ansible_python_interpreter: /usr/bin/python3
   children:
     incus_hosts:

@@ -423,10 +423,29 @@ phase_5_haproxy() {
     done
     ok "collections present"
+    # Compute SSH target the same way phase 4 does.
+    local ssh_target
+    if [[ -n "${R720_USER:-}" ]]; then
+        ssh_target="${R720_USER}@${R720_HOST}"
+    else
+        ssh_target="${R720_HOST}"
+    fi
+    # Detect if NOPASSWD sudo is configured ; if not, pass --ask-become-pass.
+    local become_flag=()
+    if ssh "$ssh_target" "sudo -n /bin/true" >/dev/null 2>&1; then
+        ok "passwordless sudo on R720 — running ansible without -K"
+    else
+        info "sudo on R720 needs a password — passing --ask-become-pass"
+        info " → ansible will prompt 'BECOME password:' below ; type your sudo password"
+        become_flag=(--ask-become-pass)
+    fi
     info "running ansible-playbook playbooks/haproxy.yml (5-10 min)"
     if ! ansible-playbook -i inventory/staging.yml playbooks/haproxy.yml \
-            --vault-password-file .vault-pass; then
-        TALAS_HINT="check the ansible output above ; common issues : Incus profile missing, port 80 blocked from Internet, DNS not yet propagated"
+            --vault-password-file .vault-pass \
+            "${become_flag[@]}"; then
+        TALAS_HINT="check the ansible output above ; common issues : Incus profile missing, port 80 blocked from Internet, DNS not yet propagated, sudo password rejected"
         die "ansible-playbook haproxy.yml failed"
     fi
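
A note on the `"${become_flag[@]}"` expansion, as a hedged sketch. The `${R720_USER:-}` default suggests the script may run under `set -u`, but the hunk does not confirm this. If it does, bash releases older than 4.4 treat the expansion of an empty array as an unbound variable. The guarded form below avoids that on old and new bash alike. `count_args` is a hypothetical helper used only to make the result visible.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

become_flag=()

# Guarded expansion: yields no words at all when the array is empty,
# and does not trip `set -u` on bash releases older than 4.4.
count_args ${become_flag[@]+"${become_flag[@]}"}   # prints 0

become_flag=(--ask-become-pass)
count_args ${become_flag[@]+"${become_flag[@]}"}   # prints 1
```

On bash 4.4 and later the plain `"${become_flag[@]}"` used in the hunk behaves the same way, so the guard only matters if the script must run on older shells.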