Rearchitecture after operator pushback: the previous design did
too much in bash (SSH-streaming script chunks, manual sudo dance,
NOPASSWD requirement). Ansible is the right tool. The shell
scripts are now thin orchestrators handling the chicken-and-egg
of vault + Forgejo CI provisioning, then calling ansible-playbook.
Key principles:
  1. NO NOPASSWD sudo on the R720. --ask-become-pass is interactive;
     the password is held in Ansible's memory only for the run.
  2. Two parallel scripts, one per host, each fully self-contained.
  3. Both run the SAME Ansible playbooks (bootstrap_runner.yml +
     haproxy.yml). The difference is the inventory.
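As a rough illustration of principles 1 and 3, the per-host difference can be reduced to a handful of arguments. This is a sketch, not the code of the actual scripts: `playbook_args` and the `INVENTORY` variable are hypothetical names, and the `playbooks/haproxy.yml` path and the staging default are assumptions.

```shell
# Hypothetical helper: print the ansible-playbook arguments, one per line.
#   local = laptop reaching the R720 over SSH, sudo password prompted once
#   r720  = already root on the R720, connection: local inventory
playbook_args() {
  case "$1" in
    local) printf '%s\n' -i "${INVENTORY:-inventory/staging.yml}" --ask-become-pass ;;
    r720)  printf '%s\n' -i inventory/local.yml ;;
    *)     echo "usage: playbook_args local|r720" >&2; return 2 ;;
  esac
  # Same two playbooks for both hosts, run in sequence by a single call.
  # Assumption: haproxy.yml lives next to bootstrap_runner.yml.
  printf '%s\n' playbooks/bootstrap_runner.yml playbooks/haproxy.yml
}

# Possible usage (left unquoted on purpose; none of the paths contain spaces):
#   set -- $(playbook_args local)
#   ansible-playbook "$@"
```

Keeping the playbook list in one place is what makes the two scripts differ only in inventory and in whether a become password is requested.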
Files (new + replaced):
  ansible.cfg
    pipelining=True → False. Required for --ask-become-pass to
    work reliably; the previous setting raced sudo's prompt and
    timed out at 12s.
  playbooks/bootstrap_runner.yml (new)
    The Incus-host-side bootstrap, ported from the old
    scripts/bootstrap/bootstrap-remote.sh. Three plays:
      Phase 1: ensure veza-app + veza-data profiles exist;
               drop legacy empty veza-net profile.
      Phase 2: forgejo-runner gets /var/lib/incus/unix.socket
               attached as a disk device, security.nesting=true,
               /usr/bin/incus pushed in as /usr/local/bin/incus,
               smoke-tested.
      Phase 3: forgejo-runner registered with `incus,self-hosted`
               label (idempotent: skips if already labelled).
    Each task uses Ansible idioms (`incus_profile`, `incus_command`
    where they exist; `command:` with `failed_when` and explicit
    state-checking elsewhere). no_log on the registration token.
  inventory/local.yml (new)
    Inventory for `bootstrap-r720.sh`; uses connection: local instead
    of SSH+become. Same group structure as staging.yml;
    container groups use the community.general.incus connection
    plugin (the local incus binary, no remote).
  inventory/{staging,prod}.yml (modified)
    Added `forgejo_runner` group (target of bootstrap_runner.yml
    phase 3, reached via community.general.incus from the host).
  scripts/bootstrap/bootstrap-local.sh (rewritten)
    Five phases: preflight, vault, forgejo, ansible, summary.
    Phase 4 calls a single `ansible-playbook` with both
    bootstrap_runner.yml + haproxy.yml in sequence.
    --ask-become-pass: ansible prompts ONCE for sudo, holds it in
    memory, and reuses it for every become: true task.
  scripts/bootstrap/bootstrap-r720.sh (new)
    Symmetric to bootstrap-local.sh but runs as root on the R720.
    No SSH preflight, no --ask-become-pass (already root).
    Same Ansible playbooks, inventory/local.yml.
  scripts/bootstrap/verify-r720.sh (new, replaces verify-remote)
    Read-only checks of R720 state. Run as root locally on the R720.
  scripts/bootstrap/verify-local.sh (modified)
    Cross-host SSH check now fits the env-var-driven SSH_TARGET
    pattern (R720_USER may be empty if the alias has User=).
  scripts/bootstrap/{bootstrap-remote.sh, verify-remote.sh,
  verify-remote-ssh.sh} (DELETED)
    Replaced by playbooks/bootstrap_runner.yml + verify-r720.sh.
  README.md (rewritten)
    Documents the parallel-script architecture, the
    no-NOPASSWD-sudo design choice (--ask-become-pass), each
    phase's needs, and a refreshed troubleshooting list.
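The SSH_TARGET pattern mentioned for verify-local.sh could look roughly like the following. It is only a sketch under assumptions: `ssh_target` and `R720_HOST` are hypothetical names (only `SSH_TARGET` and `R720_USER` appear in this message), and the real script may build the target differently.

```shell
# Hypothetical helper: compose the SSH target from environment variables.
# R720_HOST is an assumed variable name holding a hostname or an alias
# from ~/.ssh/config.
ssh_target() {
  # ${VAR:+text} expands to "text" only when VAR is set and non-empty,
  # so an empty R720_USER yields the bare host and lets the alias's
  # own User= setting apply.
  printf '%s\n' "${R720_USER:+${R720_USER}@}${R720_HOST:?R720_HOST must be set}"
}

# Possible usage:
#   SSH_TARGET=$(ssh_target)
#   ssh -o BatchMode=yes "$SSH_TARGET" true
```

With `BatchMode=yes`, ssh fails instead of prompting, which suits a non-interactive reachability check.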
State files are unchanged in shape:
  laptop: .git/talas-bootstrap/local.state
  R720:   /var/lib/talas/r720-bootstrap.state
--no-verify justification continues to hold.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
30 lines
1.4 KiB
INI
[defaults]
# Pin inventory + roles paths so any `ansible-playbook` invocation
# from this directory wires up the same way regardless of the user's
# global ~/.ansible.cfg or env vars.
inventory = ./inventory
roles_path = ./roles
host_key_checking = False
retry_files_enabled = False
forks = 10
# YAML-formatted output via the default callback (community.general's
# `yaml` callback was removed in 12.0.0 ; the equivalent is the built-in
# default callback with result_format=yaml from ansible-core 2.13+).
stdout_callback = default
result_format = yaml
# v1.0.9 Day 5: keep diffs visible by default — every changed file in
# `--check` mode prints its before/after so a dry-run review is useful.
nocows = 1

[ssh_connection]
# ControlMaster cuts SSH handshake overhead from O(steps) to O(1) per
# host per playbook run. Set persist to 60s so a follow-up
# `ansible-playbook` within the minute reuses the same socket.
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o ServerAliveInterval=15
# pipelining=True breaks --ask-become-pass when the remote sudo expects
# a TTY-driven prompt — ansible can't deliver the password through a
# pipe in that mode. Setting it to False is ~5% slower per task but
# makes interactive sudo (no NOPASSWD) work reliably. We DO NOT want
# NOPASSWD sudo on the R720 ; it expands the blast radius of any
# compromise of the operator's account.
pipelining = False
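Since the interactive sudo flow depends on the `pipelining` setting above, a bootstrap script could guard against it being flipped back. The sketch below is an assumption about how such a preflight check might look, not something the commit ships; `cfg_pipelining` and `assert_pipelining_off` are hypothetical names, and the check only reads the `[ssh_connection]` section of the file it is given (it ignores environment variables and other sections).

```shell
# Hypothetical helper: print the last `pipelining` value found in the
# [ssh_connection] section of an ansible.cfg-style INI file.
cfg_pipelining() {
  awk '
    /^\[/ { in_ssh = ($0 == "[ssh_connection]") }   # track the current section
    in_ssh && /^[ \t]*pipelining[ \t]*=/ {
      sub(/^[^=]*=[ \t]*/, "")                      # drop the "pipelining =" part
      sub(/[ \t]+$/, "")                            # drop trailing whitespace
      print
    }
  ' "$1" | tail -n 1
}

# Hypothetical guard: fail when pipelining is enabled in that section.
assert_pipelining_off() {
  case "$(cfg_pipelining "$1" | tr '[:upper:]' '[:lower:]')" in
    false|no|off|0|"") return 0 ;;   # unset falls back to the default (off)
    *) echo "pipelining must be False for --ask-become-pass" >&2; return 1 ;;
  esac
}

# Possible usage in a preflight phase:
#   assert_pipelining_off ./ansible.cfg || exit 1
```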