Closes the "single-region MinIO" gap. The 4-node EC:2 cluster
tolerates 2 simultaneous drive losses, but a regional outage
(network partition, DC fire, operator error wiping the cluster)
remains a single point of data loss.
New Ansible role minio_replication:
- Wrapper script veza-minio-replicate.sh runs `mc mirror --preserve`
  from the local cluster to a remote S3-compatible target every 6h
  (configurable via OnCalendar); control-flow sketch below.
- Writes textfile-collector metrics on each run:
veza_minio_replication_last_run_timestamp_seconds
veza_minio_replication_last_success_timestamp_seconds
veza_minio_replication_last_duration_seconds
veza_minio_replication_last_status (1/0)
veza_minio_replication_target_bytes
- systemd timer with Persistent=true catches up missed runs after
  reboot (this is the disaster-recovery surface; we can't afford to
  silently skip ticks). Unit sketch below.
- Idempotent: `mc alias set` re-applies cleanly, `mc mb
--ignore-existing` for the target bucket.
- Refuses to run while vault placeholders are still in place, to
  avoid accidentally pointing a prod run at bogus credentials.
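For orientation, the timer half of the unit pair looks roughly like this.
Illustrative only: the role templates the real files, and the OnCalendar
value shown is just the 6h default described above.

cat > /etc/systemd/system/veza-minio-replicate.timer <<'EOF'
[Unit]
Description=Schedule MinIO cross-region replication

[Timer]
# Fires at 00:00, 06:00, 12:00, 18:00; Persistent=true replays missed ticks.
OnCalendar=*-*-* 00/6:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF
systemctl daemon-reload && systemctl enable --now veza-minio-replicate.timer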
Why mc mirror, not MinIO native bucket replication: it works against
any S3-compatible target (Wasabi, Backblaze B2, AWS S3) with just an
access key, whereas MinIO bucket/site replication requires the target
to be MinIO-managed and bidirectionally reachable. mc is the
lowest-common-denominator tool that decouples us from the choice of
target operator.
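Sketch of the wrapper's control flow. This is not the literal script:
the variable names, the "local" source alias, and the placeholder
sentinel are assumptions.

#!/usr/bin/env bash
set -euo pipefail

PROM=/var/lib/node_exporter/textfile/veza_minio_replication.prom

# Refuse unrendered vault values (exact sentinel is an assumption).
case "${MINIO_REMOTE_ACCESS_KEY:-}" in
  ""|*CHANGEME*) echo "placeholder credentials, refusing" >&2; exit 1 ;;
esac

# Idempotent setup: both commands re-apply cleanly on every run.
mc alias set veza-backup "$MINIO_REMOTE_ENDPOINT" \
  "$MINIO_REMOTE_ACCESS_KEY" "$MINIO_REMOTE_SECRET_KEY"
mc mb --ignore-existing "veza-backup/$MINIO_REMOTE_BUCKET"

start=$(date +%s); rc=0
mc mirror --preserve local/veza-prod-tracks "veza-backup/$MINIO_REMOTE_BUCKET" || rc=$?
end=$(date +%s)

# Atomic write (tmp + rename) so node_exporter never scrapes a torn file.
# The real script also carries the last success timestamp forward on
# failure and records target_bytes; both elided here for brevity.
tmp="$PROM.tmp.$$"
{
  echo "veza_minio_replication_last_run_timestamp_seconds $end"
  echo "veza_minio_replication_last_duration_seconds $((end - start))"
  echo "veza_minio_replication_last_status $(( rc == 0 ? 1 : 0 ))"
  if [ "$rc" -eq 0 ]; then
    echo "veza_minio_replication_last_success_timestamp_seconds $end"
  fi
} > "$tmp"
mv "$tmp" "$PROM"
exit "$rc"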
Alerts in alert_rules.yml, group veza_minio_backup:
- MinioReplicationLastFailed (warning, single failed run)
- MinioReplicationStale (CRITICAL, no success in 12h — past RPO)
- MinioReplicationNeverSucceeded (warning, fresh deploy stuck)
- MinioReplicationTargetShrunk (CRITICAL, > 20% drop in 1h —
operator-error guard rail)
Runbook docs/runbooks/minio-replication.md covers triage by alert,
common ops tasks (manual sync, pause, credential rotation), and
the manual restore procedure for DR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Runbook — MinIO cross-region replication
v1.0.10 ops item 8. Owner: ops on-call. Severity routing:
MinioReplicationStale and MinioReplicationTargetShrunk page; the others are warnings.
Architecture (1-minute version)
- Source: 4-node MinIO distributed cluster (EC:2), veza-prod-tracks bucket.
- Target: remote S3-compatible bucket ({{ minio_remote_bucket }} — set in vault).
- Mechanism: mc mirror --preserve driven by a systemd timer firing every 6h (veza-minio-replicate.timer).
- Telemetry: textfile-collector metrics (veza_minio_replication_*).
- RPO target: 6h. RTO target: 2h for ≤500 GB.
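To eyeball freshness on the host itself, without going through a dashboard,
a one-liner like this works (assumes gawk for systime(); the textfile path
matches the triage steps below):

awk '/last_success_timestamp/ { printf "%d s since last success\n", systime() - $2 }' \
  /var/lib/node_exporter/textfile/veza_minio_replication.prom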
Triage by alert
MinioReplicationLastFailed
The last run returned non-zero. Single failures are usually transient (target endpoint blip, DNS hiccup); the next 6h tick retries automatically.
# 1. Inspect the last journal output (full mc-mirror error).
journalctl -u veza-minio-replicate --since "12 hours ago" --no-pager
# 2. Cross-check the timer is still scheduled.
systemctl list-timers veza-minio-replicate.timer
# 3. If the failure looks transient (network), trigger a manual run.
sudo systemctl start veza-minio-replicate.service
journalctl -fu veza-minio-replicate
# 4. Verify the post-run metric is back to 1.
grep last_status /var/lib/node_exporter/textfile/veza_minio_replication.prom
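For reference, a healthy metrics file looks roughly like this; the metric
names come from the role, the values below are invented:

veza_minio_replication_last_run_timestamp_seconds 1721400000
veza_minio_replication_last_success_timestamp_seconds 1721400000
veza_minio_replication_last_duration_seconds 184
veza_minio_replication_last_status 1
veza_minio_replication_target_bytes 412316860416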
MinioReplicationStale (PAGES)
12h with no successful run = past RPO. This is a real incident.
- Confirm scope. Open the dashboard panel "MinIO replication freshness"; confirm the metric and that the last-success timestamp is what the alert says.
- Check the timer + service status:
  systemctl status veza-minio-replicate.timer
  systemctl status veza-minio-replicate.service
  If the timer is inactive (dead) or failed, see "Timer broke" below. If the service is failed, see "Script crashes" below.
- Check the host. If the MinIO host is offline (e.g. taken down for maintenance) the timer is correctly idle; this isn't a replication bug, it's a host-availability bug. Page the on-call for the host first.
- Check the remote target. From the source host:
  mc ls veza-backup/         # does the target alias resolve and list?
  mc admin info veza-backup  # ping the target — auth works?
  If mc admin info fails with an auth error → credentials were rotated; update vault and re-apply minio_replication. If it fails with a network error → target endpoint is down; escalate to the target operator.
- Manual restore of replication health:
  sudo systemctl restart veza-minio-replicate.timer
  sudo systemctl start veza-minio-replicate.service
  journalctl -fu veza-minio-replicate
- If the manual run succeeds, the alert clears in ≤15 min (next scrape + the alert's 30m for: window).
MinioReplicationNeverSucceeded
Fresh deploy of the role that has run at least once but never landed a green run. Almost always a config error.
- Read the last journal output for veza-minio-replicate; the error is usually a clean message from mc.
- Common causes:
  - Wrong remote endpoint: typo, https:// vs http://, wrong port.
  - Bad credentials: key rotated post-deploy, vault not updated (probe sketch below).
  - Target bucket can't be created: IAM policy on the remote denies mc mb. Pre-create the bucket on the remote side and re-apply the role.
- After fixing, re-run the role with ansible-playbook -i inventory/prod -t minio_replication; the playbook is idempotent.
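To rule credentials in or out independently of Ansible, probe the remote by
hand; the variable names here are placeholders, not role variables:

mc alias set probe "$REMOTE_ENDPOINT" "$REMOTE_ACCESS_KEY" "$REMOTE_SECRET_KEY"
mc ls probe/          # connectivity + auth; empty output is fine
mc alias remove probe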
MinioReplicationTargetShrunk (PAGES)
Critical — the backup we hold may have just lost data.
- STOP THE TIMER FIRST. Don't let the next tick propagate the damage:
  sudo systemctl stop veza-minio-replicate.timer
- Investigate the target side (listing diff sketch after this list):
  mc ls --recursive veza-backup/<bucket> | wc -l   # object count vs yesterday
  mc du veza-backup/<bucket>                       # current size
- Cross-check the source size. If the source is intact and the target shrunk, the next run will re-mirror the missing objects (this is the recovery path) — but only after you've confirmed the source is the source of truth.
- If the source is also empty, the data is gone (or nearly so). Escalate to the data-recovery runbook; the latest pgbackrest + MinIO snapshots are the next layer.
- Once root cause is identified and source data is verified, re-enable the timer:
  sudo systemctl start veza-minio-replicate.timer
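To see exactly which objects the target lost before re-mirroring, a listing
diff along these lines can help. The "local" source alias is an assumption,
and the awk field split breaks on object names containing spaces:

comm -23 \
  <(mc ls --recursive local/veza-prod-tracks | awk '{print $NF}' | sort) \
  <(mc ls --recursive veza-backup/<bucket> | awk '{print $NF}' | sort) \
  | head -50          # present on source, missing on target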
Common operational tasks
Trigger a one-off sync (manual catch-up)
sudo systemctl start veza-minio-replicate.service
journalctl -fu veza-minio-replicate
Pause replication (e.g. during a planned migration)
sudo systemctl stop veza-minio-replicate.timer
# … do the work …
sudo systemctl start veza-minio-replicate.timer
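Stopping the timer for more than 12h will fire MinioReplicationStale. For a
planned pause, silence the paging alerts first; this sketch assumes amtool
and an Alertmanager at localhost:9093:

amtool --alertmanager.url=http://localhost:9093 silence add \
  'alertname=~"MinioReplicationStale|MinioReplicationTargetShrunk"' \
  --duration 12h --comment "planned migration, replication paused"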
Rotate the remote credentials
- Update the vault entries (minio_remote_access_key, minio_remote_secret_key).
- Re-run the role: ansible-playbook -i inventory/prod -t minio_replication. The role re-applies mc alias set with the new key (idempotent).
- Trigger one manual run + verify the success metric.
Quarterly DR drill (manual — recommended cadence)
Once per quarter, exercise the restore path on a throwaway MinIO cluster :
- Provision a temporary single-node MinIO somewhere (Incus container, throwaway VM).
- Run the restore commands from the role README's "Manual restore" section against this throwaway target (direction sketch after this list).
- Spot-check 5 random track playbacks via the API pointed at the new MinIO.
- Document the observed RTO in docs/dr-drill-log.md.
- Tear down the throwaway.
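The authoritative restore commands live in the role README; directionally
the drill looks like this sketch, where the drill alias, endpoint, and keys
are all hypothetical:

mc alias set drill https://drill-minio.internal "$DRILL_ACCESS_KEY" "$DRILL_SECRET_KEY"
mc mb --ignore-existing drill/veza-prod-tracks
mc mirror --preserve veza-backup/<bucket> drill/veza-prod-tracks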
Related
- Role: infra/ansible/roles/minio_replication/
- Source role: infra/ansible/roles/minio_distributed/
- Alert rules: config/prometheus/alert_rules.yml, group veza_minio_backup
- pgbackrest equivalent runbook: docs/runbooks/db-failover.md + infra/ansible/roles/pgbackrest/