# Database Failover Runbook

This runbook describes the procedure for failing over from a primary PostgreSQL database to a standby replica.

## Prerequisites

- Standby replica configured and synchronized
- Access to Kubernetes cluster
- Database credentials in Vault/Secrets
- Monitoring alerts configured

## Detection

### Automatic Detection

Monitoring alerts will trigger when:

- Primary database is unreachable
- Replication lag exceeds threshold
- Health checks fail

### Manual Detection

```bash
# Check primary database status
kubectl exec -it postgres-primary -n veza-production -- pg_isready

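# Check replication status from the standby side
# (pg_stat_replication is only populated on the sending server, so query the WAL receiver here)
kubectl exec -it postgres-standby -n veza-production -- \
  psql -U postgres -c "SELECT * FROM pg_stat_wal_receiver;"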
```
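
`pg_isready` reports its result through its exit code, which is easy to misread under pressure. The following is a minimal sketch, not part of the original procedure, that turns the code into a readable status; the helper names `describe_pg_isready` and `check_primary` are hypothetical. Note that if the pod itself cannot be reached, `kubectl exec` fails with its own non-zero code, so treat any non-zero result as "investigate further".

```bash
# Sketch only: describe_pg_isready and check_primary are illustrative names.

# Map a pg_isready exit code to its documented meaning.
describe_pg_isready() {
  case "$1" in
    0) echo "accepting connections" ;;
    1) echo "rejecting connections (for example, during startup)" ;;
    2) echo "no response" ;;
    3) echo "no attempt made (for example, invalid parameters)" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Run pg_isready in the primary pod and print the interpreted result.
check_primary() {
  rc=0
  kubectl exec postgres-primary -n veza-production -- pg_isready >/dev/null 2>&1 || rc=$?
  describe_pg_isready "$rc"
}
```

Calling `check_primary` prints one line, for example `no response` when the server does not answer.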

## Failover Procedure

### Step 1: Verify Standby Status

```bash
# Check standby is synchronized
kubectl exec -it postgres-standby -n veza-production -- \
  psql -U postgres -c "SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();"

# Verify replication lag
kubectl exec -it postgres-standby -n veza-production -- \
  psql -U postgres -c "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())) AS lag_seconds;"
```

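**Expected**: Lag should be < 60 seconds. This value keeps growing while the primary is idle or unreachable, so also confirm that the two LSNs returned by the first query match.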
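
The lag check can be turned into a gate that blocks promotion when the standby is too far behind. This is a minimal sketch under stated assumptions: the function names and the `MAX_LAG_SECONDS` variable are hypothetical, and `psql -tA` is assumed to print the bare number.

```bash
# Sketch only: function names and MAX_LAG_SECONDS are illustrative.
MAX_LAG_SECONDS="${MAX_LAG_SECONDS:-60}"

# Print the standby replay lag in seconds (empty if nothing has been replayed yet).
# No -it here: a TTY would add carriage returns to the captured output.
standby_lag_seconds() {
  kubectl exec postgres-standby -n veza-production -- \
    psql -U postgres -tA -c \
    "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()));"
}

# Succeed only when the given lag is a non-negative number below the threshold.
lag_within_threshold() {
  lag="$1"
  case "$lag" in
    ''|*[!0-9.]*) return 1 ;;   # empty, negative, or non-numeric: not ready
  esac
  awk -v lag="$lag" -v max="$MAX_LAG_SECONDS" 'BEGIN { exit !(lag < max) }'
}

# Combine both: query the standby, then apply the threshold.
standby_ready() {
  lag="$(standby_lag_seconds)" || return 1
  lag_within_threshold "$lag"
}
```

A possible use is `standby_ready || echo "standby lag too high, do not promote"`.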

### Step 2: Promote Standby to Primary

```bash
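# Promote standby (pg_ctl must run as the postgres OS user and needs PGDATA or -D)
kubectl exec -it postgres-standby -n veza-production -- \
  pg_ctl promote

# Alternative on PostgreSQL 12+ (use one or the other): promote through SQL
kubectl exec -it postgres-standby -n veza-production -- \
  psql -U postgres -c "SELECT pg_promote();"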

# Verify promotion
kubectl exec -it postgres-standby -n veza-production -- \
  psql -U postgres -c "SELECT pg_is_in_recovery();"
```

**Expected**: Returns `false` (no longer in recovery mode)
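
Promotion is not instantaneous, so a single check may still report recovery mode. The sketch below polls until the standby reports `f`; the helper name `wait_for_promotion` and its default attempt count and delay are assumptions, not values from this runbook.

```bash
# Sketch only: wait_for_promotion is an illustrative helper name.
# Polls the standby until pg_is_in_recovery() returns "f" or attempts run out.
wait_for_promotion() {
  attempts="${1:-30}"   # number of polls
  delay="${2:-2}"       # seconds between polls
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -tA makes psql print just "t" or "f"
    state="$(kubectl exec postgres-standby -n veza-production -- \
      psql -U postgres -tA -c "SELECT pg_is_in_recovery();" 2>/dev/null)"
    if [ "$state" = "f" ]; then
      echo "promoted"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "still in recovery after $attempts attempts" >&2
  return 1
}
```

Running `wait_for_promotion 30 2` waits up to roughly one minute before giving up.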

### Step 3: Update Service Endpoint

```bash
# Update postgres service to point to new primary
# (takes effect only if the promoted pod carries the label role=primary)
kubectl patch service postgres -n veza-production \
  -p '{"spec":{"selector":{"role":"primary"}}}'

# To choose a different selector, list the standby pod's labels first
# (this command only prints them; it does not change the service)
kubectl get pod postgres-standby -n veza-production -o jsonpath='{.metadata.labels}' | \
  jq -r 'to_entries | map("\(.key)=\(.value)") | join(",")'
```
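
A selector change only helps if the service actually resolves to the promoted pod. As a hedged sketch, the check below compares the pod IP with the addresses behind the service; the helper name `service_points_to_pod` is hypothetical, and it assumes the cluster still serves the `endpoints` resource for this service.

```bash
# Sketch only: service_points_to_pod is an illustrative helper name.
# Succeeds when the pod's IP appears among the service's endpoint addresses.
service_points_to_pod() {
  service="$1"
  pod="$2"
  ns="${3:-veza-production}"
  pod_ip="$(kubectl get pod "$pod" -n "$ns" -o jsonpath='{.status.podIP}')"
  endpoint_ips="$(kubectl get endpoints "$service" -n "$ns" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')"
  [ -n "$pod_ip" ] || return 1
  # jsonpath prints the addresses separated by spaces
  for ip in $endpoint_ips; do
    if [ "$ip" = "$pod_ip" ]; then
      return 0
    fi
  done
  return 1
}
```

For example, `service_points_to_pod postgres postgres-standby || echo "service does not route to the new primary"`.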

### Step 4: Restart Application Pods

```bash
# Restart to pick up new database connection
# (backend-api handles chat since v0.502 merge — no separate chat-server deployment)
kubectl rollout restart deployment/veza-backend-api -n veza-production

# Verify pods are healthy
kubectl rollout status deployment/veza-backend-api -n veza-production
```

### Step 5: Verify Application Health

```bash
# Check application logs
kubectl logs -f deployment/veza-backend-api -n veza-production

# Test database connectivity
# (sh -c makes $DATABASE_URL expand inside the pod, not in the local shell)
kubectl exec -it deployment/veza-backend-api -n veza-production -- \
  sh -c 'psql "$DATABASE_URL" -c "SELECT 1;"'

# Check health endpoint
curl https://api.veza.com/health
```
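
Right after a restart the health endpoint may fail for a short time, so a single `curl` can give a false alarm. The following sketch retries the check; the helper name `wait_for_healthy` and its default attempt count and delay are assumptions chosen for illustration.

```bash
# Sketch only: wait_for_healthy is an illustrative helper name.
# Retries an HTTP health check until it succeeds or attempts run out.
wait_for_healthy() {
  url="$1"
  attempts="${2:-10}"   # number of tries
  delay="${3:-5}"       # seconds between tries
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: fail on HTTP errors, -sS: quiet but report errors, --max-time: bound each request
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_healthy https://api.veza.com/health 10 5 && echo "healthy"`.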

### Step 6: Set Up New Standby

```bash
# Once primary is recovered, set up new standby
# Follow PostgreSQL replication setup guide
```

## Rollback Procedure

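If the failover was triggered in error or the original primary recovers, revert as follows. After promotion the standby accepts writes on a new timeline, so any data written to it since Step 2 will be missing from the old primary; confirm that losing those writes is acceptable before switching back.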

```bash
# Stop applications
kubectl scale deployment veza-backend-api --replicas=0 -n veza-production

# Revert service endpoint
kubectl patch service postgres -n veza-production \
  -p '{"spec":{"selector":{"role":"primary","pod":"postgres-primary"}}}'

# Restart applications
kubectl scale deployment veza-backend-api --replicas=3 -n veza-production
```
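
The rollback above restores a fixed replica count of 3. If the deployment may have been running at a different scale, one option is to record the count before stopping the applications. This is a sketch only: the helper names and the state file path are hypothetical.

```bash
# Sketch only: helper names and the state file location are illustrative.
REPLICA_FILE="${REPLICA_FILE:-/tmp/veza-backend-api.replicas}"

# Record the current replica count, then scale the deployment to zero.
save_and_stop() {
  kubectl get deployment veza-backend-api -n veza-production \
    -o jsonpath='{.spec.replicas}' > "$REPLICA_FILE" || return 1
  kubectl scale deployment veza-backend-api --replicas=0 -n veza-production
}

# Scale the deployment back to the recorded replica count.
restore_replicas() {
  replicas="$(cat "$REPLICA_FILE")" || return 1
  kubectl scale deployment veza-backend-api --replicas="$replicas" -n veza-production
}
```

Call `save_and_stop` in place of the first scale command and `restore_replicas` in place of the last one.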

## Verification Checklist

- [ ] Standby promoted successfully
- [ ] Service endpoint updated
- [ ] Application pods restarted
- [ ] Database connectivity verified
- [ ] Application health checks passing
- [ ] No data loss detected
- [ ] Monitoring alerts cleared
- [ ] Documentation updated

## Troubleshooting

### Standby Not Synchronized

```bash
# Check the WAL receiver on the standby
# (pg_stat_replication is only populated on the sending server)
kubectl exec -it postgres-standby -n veza-production -- \
  psql -U postgres -c "SELECT * FROM pg_stat_wal_receiver;"

# If replication is broken, rebuild standby
# (See PostgreSQL replication setup guide)
```

### Application Cannot Connect

```bash
# Verify service selector
kubectl get service postgres -n veza-production -o yaml

# Check pod labels
kubectl get pod postgres-standby -n veza-production --show-labels

# Verify network connectivity
kubectl run test-connection --rm -it --image=postgres:15-alpine \
  --restart=Never \
  -- psql -h postgres.veza-production.svc.cluster.local -U veza_user -d veza_db
```

## Post-Failover Tasks

1. **Investigate Root Cause**
    - Review primary database logs
    - Check system resources
    - Identify failure reason

2. **Set Up New Standby**
    - Configure replication from new primary
    - Verify synchronization
    - Update monitoring

3. **Document Incident**
    - Record the failover timeline and the actions taken
    - Note any issues encountered
    - Update runbook if needed

4. **Notify Stakeholders**
    - Send incident report
    - Update status page
    - Schedule post-mortem

## References

- [PostgreSQL Replication Documentation](https://www.postgresql.org/docs/current/high-availability.html)
- [Kubernetes Service Documentation](https://kubernetes.io/docs/concepts/services-networking/service/)