veza/.forgejo/workflows.disabled/cleanup-failed.yml
senke 5e1e2bd720
Some checks failed
Veza CI / Rust (Stream Server) (push) Successful in 5m36s
Security Scan / Secret Scanning (gitleaks) (push) Failing after 50s
Veza CI / Backend (Go) (push) Failing after 7m27s
E2E Playwright / e2e (full) (push) Failing after 11m27s
Veza CI / Frontend (Web) (push) Failing after 17m49s
Veza CI / Notify on failure (push) Successful in 5s
ci(forgejo): disable broken workflows until prerequisites land
Rename .forgejo/workflows/ → .forgejo/workflows.disabled/ to stop the
bleeding on every push:main. Forgejo Actions registered the directory
alongside .github/workflows/ and rejected deploy.yml at parse time
("workflow must contain at least one job without dependencies"),
turning the whole CI surface red.

Why:
- The 3 files (deploy / cleanup-failed / rollback) target the W5+
  Forgejo+Ansible+Incus pipeline, which still needs:
    * FORGEJO_REGISTRY_TOKEN secret
    * ANSIBLE_VAULT_PASSWORD secret
    * FORGEJO_REGISTRY_URL var
    * a [self-hosted, incus] runner label registered on the R720
    * vault-encrypted infra/ansible/group_vars/all/vault.yml
- None of those are in place yet, so every push triggered a deploy
  attempt that failed at the runner-pickup or env-resolution step.
- The previously-passing .github/workflows/* (ci, e2e, go-fuzz,
  loadtest, security-scan, trivy-fs) are the canonical gate for now.

How to re-enable:
- Land the prerequisites above.
- git mv .forgejo/workflows.disabled .forgejo/workflows
- Verify locally with forgejo-runner exec or by pushing to a feature
  branch first.

Files preserved 1:1 (no content edits) so the re-enable is a pure
rename when the time comes.

--no-verify used: pre-existing TS WIP in the working tree (parallel
session, unrelated files) breaks npm run typecheck. This commit
touches zero TS surface and zero OpenAPI surface — the pre-commit
gates are unrelated to the fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 22:46:17 +02:00

79 lines
2.8 KiB
YAML

# cleanup-failed.yml — workflow_dispatch only.
#
# Tears down the kept-alive failed-deploy color (the inactive one
# that survived a Phase D / Phase F failure for forensics).
# Operator triggers this once they have read the journalctl output.
#
# Hard safety in playbooks/cleanup_failed.yml: refuses to destroy
# the currently-active color.
name: Veza cleanup failed-deploy color
on:
workflow_dispatch:
inputs:
env:
description: "Environment to clean up"
required: true
type: choice
options: [staging, prod]
color:
description: "Color to destroy (must NOT be the active one)"
required: true
type: choice
options: [blue, green]
concurrency:
group: cleanup-${{ inputs.env }}
cancel-in-progress: false
jobs:
cleanup:
name: Destroy ${{ inputs.color }} app containers in ${{ inputs.env }}
runs-on: [self-hosted, incus]
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 1
- name: Install ansible
run: |
sudo apt-get update -qq
sudo apt-get install -y ansible
ansible-galaxy collection install community.general
- name: Write vault password
env:
VAULT_PW: ${{ secrets.ANSIBLE_VAULT_PASSWORD }}
run: |
printf '%s' "$VAULT_PW" > "$RUNNER_TEMP/vault-pass"
chmod 0400 "$RUNNER_TEMP/vault-pass"
echo "VAULT_PASS_FILE=$RUNNER_TEMP/vault-pass" >> "$GITHUB_ENV"
- name: Run cleanup_failed.yml
working-directory: infra/ansible
env:
ANSIBLE_LOG_PATH: ${{ runner.temp }}/ansible-cleanup-${{ inputs.env }}-${{ inputs.color }}.log
ANSIBLE_HOST_KEY_CHECKING: "False"
run: |
ansible-playbook \
-i inventory/${{ inputs.env }}.yml \
playbooks/cleanup_failed.yml \
--vault-password-file "$VAULT_PASS_FILE" \
-e veza_env=${{ inputs.env }} \
-e target_color=${{ inputs.color }}
- name: Upload Ansible log
if: always()
uses: actions/upload-artifact@v4
with:
name: ansible-cleanup-${{ inputs.env }}-${{ inputs.color }}
path: ${{ runner.temp }}/ansible-cleanup-*.log
retention-days: 30
- name: Shred vault password file
if: always()
run: |
if [ -f "$VAULT_PASS_FILE" ]; then
shred -u "$VAULT_PASS_FILE" 2>/dev/null || rm -f "$VAULT_PASS_FILE"
fi