# `nginx_proxy_cache` role — phase-1 self-hosted edge cache

Runs in its own Incus container `nginx-cache`, sitting between clients and the distributed MinIO cluster. Caches HLS segments aggressively (1 MiB slices, 7 d TTL) and HLS playlists conservatively (60 s TTL). Disk-backed, capped at 20 GB, with stale-on-error serving.

This is the **phase-1 alternative to a third-party CDN**. It costs nothing in egress, leaks no logs to a third party, and handles thousands of concurrent listeners on a single R720. Phase-2 (W3+) adds a second, geographically distinct cache node plus GeoDNS; phase-3 adds a third-party CDN only if traffic justifies it (Bunny.net is wired in `cdn_service.go` and stays inert until `CDN_ENABLED=true`).

## Topology

```
                 :80
                  │
        ┌─────────▼──────┐
        │  nginx-cache   │   proxy_cache_path /var/cache/nginx/veza
        │  (this role)   │   20 GB disk, 1 MiB slices, 7 d TTL
        └─────────┬──────┘
                  │  keepalive ×32 backend pool
      ┌───────────┼────────────┬────────────┐
      ▼           ▼            ▼            ▼
 minio-1.lxd  minio-2.lxd  minio-3.lxd  minio-4.lxd
            (EC:2 distributed cluster)
```

When `CDN_ENABLED=false` (the default), `TrackService.GetStorageURL` returns `http://minio-1.lxd:9000/...` presigned URLs directly. To route through this cache layer, point the backend at the cache instead:

```env
AWS_S3_ENDPOINT=http://nginx-cache.lxd:80
```

The cache forwards to MinIO; signed URLs still work because the signature lives in the query string. `$args` (and therefore the signature) is part of the cache key, so a re-signed URL for the same object is a new entry; since signatures are short-lived, cache effectiveness only matters within each signed-URL window.

## Defaults

| variable                           | default                  | meaning                                                 |
| ---------------------------------- | ------------------------ | ------------------------------------------------------- |
| `nginx_cache_root`                 | `/var/cache/nginx/veza`  | disk-backed cache root                                  |
| `nginx_cache_max_size`             | `20g`                    | hard cap on the cache directory                         |
| `nginx_cache_inactive`             | `7d`                     | purge entries unused for > 7 d                          |
| `nginx_cache_ttl_segment`          | `7d`                     | TTL for `.ts` / `.m4s` / `.mp4` / `.aac` / `.m4a`       |
| `nginx_cache_ttl_playlist`         | `60s`                    | TTL for `.m3u8`                                         |
| `nginx_cache_ttl_other`            | `1h`                     | TTL for everything else (cover art, originals)          |
| `nginx_cache_stale_error_window`   | `1h`                     | serve stale on origin 5xx / timeout for this window     |
| `nginx_cache_listen_port`          | `80`                     | listener (HTTP). TLS lives at the public LB.            |
| `nginx_cache_minio_port`           | `9000`                   | MinIO upstream port                                     |

## Cache-key policy

```
"$scheme$request_method$host$uri$is_args$args" + $slice_range  (segments only)
```

- **Authorization / Cookie are not in the key.** All access in v1.0 goes through presigned URLs (signature in `$args`), so per-user state is naturally segmented by query string. Adding cookies/auth to the key would either explode cardinality or, worse, leak per-user objects across users.
- **`$slice_range`**: 1 MiB slices. A range request for `bytes=0-512000` is served from the same cached chunks as `bytes=300000-700000`; cache effectiveness stays high even when clients pick odd byte windows.

## Verifying it works

```bash
# Curl the same URL twice through the cache. First should be MISS,
# second should be HIT. The X-Cache-Status header surfaces the verdict.
curl -sI http://nginx-cache.lxd/veza-prod-tracks//master.m3u8 | grep -i x-cache
# x-cache-status: MISS
curl -sI http://nginx-cache.lxd/veza-prod-tracks//master.m3u8 | grep -i x-cache
# x-cache-status: HIT
```

The smoke test `infra/ansible/tests/test_nginx_cache.sh` automates this check.
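
## Rendered nginx config (sketch)

The fragment below is a minimal sketch of the directives this role is expected to render, not the actual template. The upstream name `minio_pool`, the `veza` keys zone, its `64m` size, the `levels=1:2` layout, and the Host-forwarding detail are assumptions; the numeric values mirror the defaults table and cache-key policy above.

```nginx
# Sketch only: names and zone/levels sizing are assumptions;
# values come from the defaults table above.
proxy_cache_path /var/cache/nginx/veza levels=1:2 keys_zone=veza:64m
                 max_size=20g inactive=7d use_temp_path=off;

upstream minio_pool {
    server minio-1.lxd:9000;
    server minio-2.lxd:9000;
    server minio-3.lxd:9000;
    server minio-4.lxd:9000;
    keepalive 32;                      # the "keepalive ×32 backend pool"
}

server {
    listen 80;

    proxy_http_version 1.1;
    proxy_set_header Connection "";    # required for upstream keepalive
    proxy_set_header Host $host;       # forward the Host the presigned URL was signed for
    add_header X-Cache-Status $upstream_cache_status always;

    proxy_cache veza;
    proxy_cache_use_stale error timeout http_500 http_502 http_503 http_504;

    # Segments and audio objects: 1 MiB slices, 7 d TTL.
    location ~ \.(ts|m4s|mp4|aac|m4a)$ {
        slice 1m;
        # proxy_set_header is not inherited once any is set in a location,
        # so the shared headers are repeated here.
        proxy_set_header Range $slice_range;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_cache_key "$scheme$request_method$host$uri$is_args$args$slice_range";
        proxy_cache_valid 200 206 7d;
        proxy_pass http://minio_pool;
    }

    # Playlists: 60 s TTL so updated manifests propagate quickly.
    location ~ \.m3u8$ {
        proxy_cache_key "$scheme$request_method$host$uri$is_args$args";
        proxy_cache_valid 200 60s;
        proxy_pass http://minio_pool;
    }

    # Everything else (cover art, originals): 1 h TTL.
    location / {
        proxy_cache_key "$scheme$request_method$host$uri$is_args$args";
        proxy_cache_valid 200 1h;
        proxy_pass http://minio_pool;
    }
}
```
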
## Operations

```bash
# Disk usage of the cache directory:
sudo du -sh /var/cache/nginx/veza

# Tail access logs (shows HIT/MISS/STALE per request):
sudo tail -f /var/log/nginx/veza-cache.access.log

# Reload after changing TTLs without dropping in-flight requests:
sudo systemctl reload nginx

# Bust the entire cache:
sudo systemctl stop nginx
sudo rm -rf /var/cache/nginx/veza/*
sudo systemctl start nginx

# Per-key purge requires ngx_cache_purge or nginx-plus — not in v1.0.
# Workaround: delete the entry from disk by computing the md5 of the
# cache key and removing the matching file under
# /var/cache/nginx/veza///<...>.

# Stub-status (Prometheus exporter target) — bound to loopback only:
curl -s http://127.0.0.1:81/__nginx_status
# Active connections: 4
# server accepts handled requests
#  12345 12345 67890
# Reading: 0 Writing: 1 Waiting: 3
```

## Hit-rate dashboard

The access log carries `cache=$upstream_cache_status`. Point a Promtail (or Vector) instance at `/var/log/nginx/veza-cache.access.log` and group by `cache` for a hit-ratio panel. Until that's wired, a quick command:

```bash
sudo awk '{print $NF}' /var/log/nginx/veza-cache.access.log \
  | grep -oP 'cache=\w+' | sort | uniq -c | sort -rn
# 18432 cache=HIT
#  1284 cache=MISS
#    16 cache=EXPIRED
```

## What this role does NOT cover

- **TLS termination.** The Incus bridge is the trust boundary in v1.0. Public exposure goes through the existing HAProxy/Caddy LB, which terminates TLS in front of this cache. When phase-2 puts the cache directly on the public internet, switch `nginx_cache_listen_port` to 443 and add `tls_cert_path` / `tls_key_path` defaults.
- **Per-key purge.** OSS Nginx has no native purge; v1.1 adds either ngx_cache_purge (a compiled-in module) or migrates to Varnish.
- **Multi-node coordination.** Single cache node in phase-1. Phase-2 introduces a second node + GeoDNS — independent caches are fine because HLS segments are immutable.
- **Brotli.** Audio is already compressed; gzip is enabled for `.m3u8` only (see the sketch below). Brotli would add CPU for marginal gains.
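
## Supporting config fragments (sketch)

A minimal sketch of three fragments referenced above: the access-log field the hit-rate pipeline depends on, the loopback `stub_status` endpoint from the Operations block, and the playlist-only gzip. The `veza_cache` format name and the exact log fields are assumptions; only the trailing `cache=$upstream_cache_status` field and the log path come from this README.

```nginx
# Sketch only: format name and field order are assumptions. The awk/grep
# pipeline above relies only on the trailing cache=... field.
log_format veza_cache '$remote_addr [$time_local] "$request" $status '
                      '$body_bytes_sent cache=$upstream_cache_status';
access_log /var/log/nginx/veza-cache.access.log veza_cache;

# Loopback-only stub_status endpoint queried in the Operations block.
server {
    listen 127.0.0.1:81;
    location /__nginx_status {
        stub_status;
        access_log off;
    }
}

# Playlist-only compression: audio payloads are already compressed,
# so gzip is limited to .m3u8 manifests.
gzip on;
gzip_types application/vnd.apple.mpegurl application/x-mpegURL;
```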