veza/veza-backend-api/internal/handlers/live_health_handler.go
senke 698859cc52 feat(backend,web): surface RTMP ingest health on the Go Live page
Fifth item of the v1.0.6 backlog. "Go Live" was silent when the
nginx-rtmp profile wasn't up — an artist could copy the RTMP URL +
stream key, fire up OBS, hit "Start Streaming" and broadcast into the
void with no in-UI signal that the ingest wasn't listening. The audit
flagged this 🟡 ("livestream without UI feedback if nginx-rtmp is down").

Backend (`GET /api/v1/live/health`)
  * `LiveHealthHandler` TCP-dials `NGINX_RTMP_ADDR` (default
    `localhost:1935`) with a 2s timeout. Reports `rtmp_reachable`,
    `rtmp_addr`, a UI-safe `error` string (a fixed message rather than
    the raw dial error, which could leak internal hostnames to the
    browser), and `last_check_at`.
  * 15s TTL cache protected by a mutex so a burst of page loads can't
    hammer the ingest. First call dials; subsequent calls within TTL
    serve the cached verdict.
  * Response ships `Cache-Control: private, max-age=15` so browsers
    piggy-back the same quarter-minute window.
  * When the dial fails the handler emits a WARN log so an operator
    watching backend logs sees the outage before a user does.
  * Public endpoint — no auth. The "RTMP is up / down" signal has no
    sensitive payload and is useful pre-login too.
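
As a rough illustration of the reachability probe described in the Backend bullets, the sketch below dials a TCP address with a 2s timeout and reduces the result to a boolean. `probeTCP` and `probeListenerLifecycle` are hypothetical names that do not appear in the commit, and the throwaway local listener only stands in for nginx-rtmp.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// probeTCP reports whether a TCP connection to addr can be opened within
// timeout. Hypothetical helper for illustration; the commit's handler uses
// a *net.Dialer with the same 2s timeout.
func probeTCP(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	_ = conn.Close()
	return true
}

// probeListenerLifecycle opens a listener on an ephemeral local port, probes
// it, closes it, and probes again. The listener is a stand-in for the RTMP
// ingest (assumption: any accepting TCP socket behaves the same for a dial).
func probeListenerLifecycle() (whileUp, afterClose bool, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, false, err
	}
	addr := ln.Addr().String()

	whileUp = probeTCP(addr, 2*time.Second)
	_ = ln.Close()
	afterClose = probeTCP(addr, 2*time.Second)
	return whileUp, afterClose, nil
}

func main() {
	whileUp, afterClose, err := probeListenerLifecycle()
	if err != nil {
		fmt.Println("could not open test listener:", err)
		return
	}
	fmt.Println("reachable while listening:", whileUp)
	fmt.Println("reachable after close:", afterClose)
}
```

A successful dial only shows that something accepts TCP connections on the port; it does not confirm that the peer speaks RTMP.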

Frontend
  * `useLiveHealth()` hook: react-query with 15s stale time, 1 retry,
    then falls back to an optimistic `{ rtmpReachable: true }` — we'd
    rather miss a banner than flash a false negative during a transient
    blip on the health endpoint itself.
  * `LiveRtmpHealthBanner`: amber, non-blocking banner with a Retry
    button that invalidates the health query. Copy explicitly tells the
    artist their stream key is still valid but broadcasting now won't
    reach anyone.
  * `GoLivePage` wraps `GoLiveView` in a vertical stack with the banner
    above — the view itself stays unchanged (the key + instructions
    remain readable even when the ingest is down).

Tests
  * 3 Go tests: live listener reports reachable + Cache-Control header;
    dead address reports unreachable + UI-safe error (asserts no
    `127.0.0.1` leak); TTL cache survives listener teardown within
    window.
  * 3 Vitest tests: banner renders nothing when reachable; banner
    visible + Retry enabled when unreachable; Retry invalidates the
    right query key.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:52:36 +02:00


package handlers

import (
	"net"
	"net/http"
	"os"
	"sync"
	"time"

	"github.com/gin-gonic/gin"
	"go.uber.org/zap"
)

// LiveHealthResponse is returned by GET /api/v1/live/health.
type LiveHealthResponse struct {
	RtmpReachable bool   `json:"rtmp_reachable"`
	RtmpAddr      string `json:"rtmp_addr"`
	// Error is populated when RtmpReachable is false. It is a short, UI-safe
	// message, not the raw dial error (which may leak internal hostnames).
	Error       string    `json:"error,omitempty"`
	LastCheckAt time.Time `json:"last_check_at"`
}

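// liveHealthChecker probes the RTMP TCP port behind a short TTL cache. Every
// call to CurrentStatus either returns the cached value or triggers a
// fresh dial; dials are serialized by a mutex so a thundering herd of
// page loads can't DoS the RTMP port. The mutex is held for the duration of
// the dial, so concurrent callers wait up to the dial timeout.
type liveHealthChecker struct {
	addr   string
	ttl    time.Duration
	dialer *net.Dialer

	mu      sync.Mutex         // guards last and checked
	last    LiveHealthResponse // most recent verdict
	checked bool               // false until the first dial completes
}
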
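// newLiveHealthChecker reads the dial target from NGINX_RTMP_ADDR and falls
// back to localhost:1935 when the variable is unset.
func newLiveHealthChecker() *liveHealthChecker {
	addr := os.Getenv("NGINX_RTMP_ADDR")
	if addr == "" {
		addr = "localhost:1935"
	}
	return &liveHealthChecker{
		addr:   addr,
		ttl:    15 * time.Second,
		dialer: &net.Dialer{Timeout: 2 * time.Second},
	}
}
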
// CurrentStatus returns the cached verdict while it is younger than the TTL
// and dials the RTMP address again otherwise.
func (c *liveHealthChecker) CurrentStatus() LiveHealthResponse {
	c.mu.Lock()
	defer c.mu.Unlock()

	// Serve from cache while the last check is still within the TTL.
	if c.checked && time.Since(c.last.LastCheckAt) < c.ttl {
		return c.last
	}

	conn, err := c.dialer.Dial("tcp", c.addr)
	now := time.Now().UTC()
	if err != nil {
		// Deliberately drop err: the response carries a fixed message only.
		c.last = LiveHealthResponse{
			RtmpReachable: false,
			RtmpAddr:      c.addr,
			Error:         "RTMP ingest server is not reachable",
			LastCheckAt:   now,
		}
	} else {
		_ = conn.Close()
		c.last = LiveHealthResponse{
			RtmpReachable: true,
			RtmpAddr:      c.addr,
			LastCheckAt:   now,
		}
	}
	c.checked = true
	return c.last
}

// LiveHealthHandler returns a gin handler that reports RTMP reachability.
// v1.0.6: the Go Live UI surfaces a warning banner when rtmp_reachable is
// false so artists don't silently broadcast into a void (was the "Go Live
// silent if nginx-rtmp is down" audit finding).
func LiveHealthHandler(logger *zap.Logger) gin.HandlerFunc {
	checker := newLiveHealthChecker()
	return func(c *gin.Context) {
		status := checker.CurrentStatus()
		// Logged on every request that sees an unreachable verdict,
		// including requests served from the cache.
		if !status.RtmpReachable && logger != nil {
			logger.Warn("RTMP ingest unreachable",
				zap.String("rtmp_addr", status.RtmpAddr),
			)
		}
		// Let browsers reuse the response for 15s, the same window as the
		// checker's TTL, so repeated page loads within that window cause
		// neither extra requests nor extra TCP dials.
		c.Header("Cache-Control", "private, max-age=15")
		c.JSON(http.StatusOK, status)
	}
}
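
To make the wire format concrete, the sketch below marshals a struct with the same field names and JSON tags as `LiveHealthResponse` for both verdicts. The `liveHealth` type, the `encodeHealth` and `sampleTime` helpers, and the timestamp are assumptions made for illustration; the real handler serializes through gin.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// liveHealth mirrors the JSON tags of LiveHealthResponse. It is a local
// copy for illustration, not the handler's own type.
type liveHealth struct {
	RtmpReachable bool      `json:"rtmp_reachable"`
	RtmpAddr      string    `json:"rtmp_addr"`
	Error         string    `json:"error,omitempty"`
	LastCheckAt   time.Time `json:"last_check_at"`
}

// sampleTime returns a fixed, made-up check time so the output is stable.
func sampleTime() time.Time {
	return time.Date(2026, 4, 16, 21, 52, 36, 0, time.UTC)
}

// encodeHealth marshals h with encoding/json and returns the body as text.
func encodeHealth(h liveHealth) (string, error) {
	b, err := json.Marshal(h)
	return string(b), err
}

func main() {
	up := liveHealth{
		RtmpReachable: true,
		RtmpAddr:      "localhost:1935",
		LastCheckAt:   sampleTime(),
	}
	down := liveHealth{
		RtmpReachable: false,
		RtmpAddr:      "localhost:1935",
		Error:         "RTMP ingest server is not reachable",
		LastCheckAt:   sampleTime(),
	}
	for _, h := range []liveHealth{up, down} {
		body, err := encodeHealth(h)
		if err != nil {
			fmt.Println("marshal failed:", err)
			return
		}
		fmt.Println(body)
	}
}
```

Because of `omitempty`, the `error` key is absent when the ingest is reachable, so a client should treat a missing `error` as "no message" rather than expect an empty string.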