# =============================================================================
# VEZA BACKEND API - ENVIRONMENT TEMPLATE
# =============================================================================
# This is a template file. Copy to .env and fill in actual values.
# DO NOT commit .env with real secrets to Git!
# =============================================================================

# --- ENVIRONMENT ---
# Options: development, staging, production
APP_ENV=development
APP_PORT=8080
LOG_LEVEL=info

# --- DOMAIN (single source of truth) ---
# All service URLs and CORS origins derive from this in development.
# Change this and /etc/hosts together to switch domains.
APP_DOMAIN=veza.fr

# --- DATABASE (REQUIRED) ---
# PostgreSQL connection string (host port when using docker-compose: 15432)
# In Docker: postgres:5432 | On host: veza.fr:15432
DATABASE_URL=postgres://veza:password@veza.fr:15432/veza?sslmode=disable
# Optional: read replica for scaling read-heavy workloads (same format as DATABASE_URL)
# DATABASE_READ_URL=postgres://veza:password@veza-read-replica:5432/veza?sslmode=disable
DATABASE_MAX_OPEN_CONNS=25
DATABASE_MAX_IDLE_CONNS=5
DATABASE_CONN_MAX_LIFETIME=5m

# --- JWT & AUTHENTICATION (v0.9.1 RS256) ---
# PREFERRED: RS256 with RSA keys (generate with scripts/generate-jwt-keys.sh)
# JWT_PRIVATE_KEY_PATH=/path/to/jwt-private.pem
# JWT_PUBLIC_KEY_PATH=/path/to/jwt-public.pem
# FALLBACK (dev only): JWT_SECRET must be at least 32 characters
JWT_SECRET=dev-secret-key-minimum-32-characters-long-for-testing-only
JWT_ISSUER=veza-api
JWT_AUDIENCE=veza-platform
JWT_ACCESS_TOKEN_DURATION=15m
JWT_REFRESH_TOKEN_DURATION=30d

# --- COOKIES ---
# Set to true in production for HTTPS-only cookies
COOKIE_SECURE=false
COOKIE_SAME_SITE=lax
COOKIE_DOMAIN=

# --- CORS (REQUIRED) ---
# Comma-separated list of allowed origins
# Development: http://veza.fr:5173,http://veza.fr:3000 (or your APP_DOMAIN)
# Production: https://app.veza.com,https://www.veza.com
CORS_ALLOWED_ORIGINS=http://veza.fr:5173,http://veza.fr:3000

# --- REDIS (REQUIRED for CSRF, rate limiting, cache) ---
# Redis (host port when using docker-compose: 16379)
# In Docker: redis:6379 | On host: veza.fr:16379
REDIS_URL=redis://veza.fr:16379
REDIS_ADDR=veza.fr:16379
REDIS_PASSWORD=
REDIS_DB=0

# v1.0.9 W3 Day 11: Sentinel HA. Leave REDIS_SENTINEL_ADDRS empty for
# single-instance dev; set it in prod to enable redis.NewFailoverClient.
# Comma-separated host:port list; the master name must match
# `sentinel monitor` in sentinel.conf.
REDIS_SENTINEL_ADDRS=
REDIS_SENTINEL_MASTER_NAME=veza-master
REDIS_SENTINEL_PASSWORD=
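
# REDIS_SENTINEL_ADDRS is a comma-separated host:port list. A sketch of
# how such a value might be parsed tolerantly (trim whitespace, drop empty
# entries); the function name is illustrative, not the repo's:

```go
package main

import (
	"fmt"
	"strings"
)

// splitSentinelAddrs parses a REDIS_SENTINEL_ADDRS-style CSV value,
// trimming whitespace and dropping empty entries so a trailing comma
// or stray spaces don't produce a bogus sentinel address.
func splitSentinelAddrs(raw string) []string {
	var out []string
	for _, part := range strings.Split(raw, ",") {
		if p := strings.TrimSpace(part); p != "" {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	fmt.Println(splitSentinelAddrs(" redis-1:26379, redis-2:26379 ,,redis-3:26379"))
}
```

# An empty result means single-instance mode; a non-empty one selects
# the Sentinel failover client.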

# v1.0.9 W3 Day 13: CDN edge in front of S3/MinIO. Optional.
# Provider valid values: bunny | cloudflare | cloudflare_r2 | cloudfront | none
# CDN_SECURITY_KEY is only required for bunny (Pull Zone token-auth key).
CDN_ENABLED=false
CDN_PROVIDER=none
CDN_BASE_URL=
CDN_SECURITY_KEY=
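
# For the bunny provider, CDN_SECURITY_KEY is the Pull Zone token-auth key.
# A sketch of Bunny-style URL signing, assuming the SHA-256 token variant
# (base64url of sha256(key + path + expires), padding stripped); verify
# against your Pull Zone's token-auth settings before relying on it:

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"time"
)

// bunnyToken computes base64url(sha256(securityKey + path + expires))
// with padding stripped, so edges can verify before serving.
func bunnyToken(securityKey, path string, expires int64) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%s%s%d", securityKey, path, expires)))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

// signBunnyURL appends the token and expiry as query parameters.
func signBunnyURL(base, securityKey, path string, ttl time.Duration) string {
	expires := time.Now().Add(ttl).Unix()
	return fmt.Sprintf("%s%s?token=%s&expires=%d",
		base, path, bunnyToken(securityKey, path, expires), expires)
}

func main() {
	fmt.Println(signBunnyURL("https://cdn.example.net", "secret-key", "/tracks/abc.m4s", time.Hour))
}
```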

# --- RABBITMQ ---
# Enable message queue for async events (use veza:password, host port 15672 for docker-compose)
# In Docker: amqp://veza:password@rabbitmq:5672/ | On host: amqp://veza:password@veza.fr:15672/
RABBITMQ_ENABLE=true
RABBITMQ_URL=amqp://veza:password@veza.fr:15672/

# --- SENTRY (OPTIONAL - Recommended for production) ---
# Error tracking and monitoring
SENTRY_DSN=
SENTRY_ENVIRONMENT=development
SENTRY_SAMPLE_RATE_ERRORS=1.0
SENTRY_SAMPLE_RATE_TRANSACTIONS=0.1

# --- RATE LIMITING ---
RATE_LIMIT_ENABLED=true
RATE_LIMIT_REQUESTS_PER_SECOND=100

# --- FILE UPLOADS ---
UPLOAD_DIR=./uploads
ENABLE_CLAMAV=false
CLAMAV_REQUIRED=false

# --- HYPERSWITCH (PAYMENTS - OPTIONAL) ---
# Required for real payment processing. Leave empty to use simulated payments.
HYPERSWITCH_ENABLED=false
HYPERSWITCH_URL=http://veza.fr:18081
# From Hyperswitch Control Center (app.hyperswitch.io) > Settings > Developers
HYPERSWITCH_API_KEY=
# For webhook signature verification
HYPERSWITCH_WEBHOOK_SECRET=
# Checkout success redirect (used in return_url)
CHECKOUT_SUCCESS_URL=http://veza.fr:5173/purchases

# --- STRIPE CONNECT (SELLER PAYOUT - OPTIONAL) ---
# Required for seller payout (balance, onboarding, transfers).
# Get keys from Stripe Dashboard > Connect > Settings
STRIPE_CONNECT_ENABLED=false
# Secret key for server-side Stripe API calls (sk_test_xxx or sk_live_xxx)
STRIPE_SECRET_KEY=
# Webhook secret for Connect events (whsec_xxx)
STRIPE_CONNECT_WEBHOOK_SECRET=

# --- TRANSFER RETRY WORKER (v1.0.7) ---
# Drives failed seller_transfers back to completed when Stripe recovers.
# TRANSFER_RETRY_ENABLED=true
# TRANSFER_RETRY_MAX=3
# TRANSFER_RETRY_INTERVAL=5m

# --- REVERSAL WORKER (v1.0.7 item B) ---
# Drives reversal_pending seller_transfers to reversed by calling Stripe
# Connect transfer reversals. Decouples the buyer-facing refund UX from
# Stripe-side settlement health. Backoff is exponential (base * 2^retry),
# capped at REVERSAL_BACKOFF_MAX.
# REVERSAL_WORKER_ENABLED=true
# REVERSAL_MAX_RETRIES=5
# REVERSAL_CHECK_INTERVAL=1m
# REVERSAL_BACKOFF_BASE=1m
# REVERSAL_BACKOFF_MAX=1h

# --- HYPERSWITCH WEBHOOK LOG (v1.0.7 item E) ---
# Every webhook hitting POST /webhooks/hyperswitch is persisted to
# hyperswitch_webhook_log regardless of signature or processing outcome,
# capturing attack attempts alongside legitimate traffic for forensics.
# A daily sweep deletes rows older than this many days.
# HYPERSWITCH_WEBHOOK_LOG_RETENTION_DAYS=90

# --- RECONCILIATION WORKER (v1.0.7 item C) ---
# Periodically sweeps for stuck pending orders and refunds whose
# webhook never arrived. Pulls live status from Hyperswitch and feeds a
# synthesised webhook into the normal dispatcher. Idempotent with
# real webhooks via terminal-state guards on the handlers.
# RECONCILE_WORKER_ENABLED=true
# RECONCILE_INTERVAL=1h
# RECONCILE_ORDER_STUCK_AFTER=30m
# RECONCILE_REFUND_STUCK_AFTER=30m
# RECONCILE_REFUND_ORPHAN_AFTER=5m

# --- EXTERNAL SERVICES (OPTIONAL) ---
STREAM_SERVER_URL=http://veza.fr:8082
# Must match the stream server's INTERNAL_API_KEY for /internal/jobs/transcode (P1.1.2)
STREAM_SERVER_INTERNAL_API_KEY=
CHAT_SERVER_URL=http://veza.fr:8081

# --- HLS STREAMING (OPTIONAL) ---
# Enables the HLS endpoints on the backend. When false, the player falls
# back to the direct /tracks/:id/stream endpoint with HTTP Range support,
# which works for all tracks but without adaptive bitrate. Set to true to
# activate HLS manifests/segments (requires stream server + transcoding).
# See FUNCTIONAL_AUDIT §4 item 5 for the fallback behaviour.
# v1.0.9 W4 Day 17: default flipped to true. Set to false to disable
# the HLS transcoder pipeline in lightweight dev / unit-test envs.
HLS_STREAMING=true
# HLS segment storage directory (used only when HLS_STREAMING=true)
HLS_STORAGE_DIR=/tmp/veza-hls

# --- TRACK UPLOAD STORAGE BACKEND (v1.0.8 Phase 0, default local) ---
# Where TrackService.UploadTrack() writes new track files.
#   local : writes to veza-backend-api/uploads/tracks/ (legacy behaviour,
#           works single-pod, doesn't scale multi-pod)
#   s3    : writes to S3/MinIO via S3StorageService. Requires
#           AWS_S3_ENABLED=true and AWS_S3_BUCKET. In production, setting
#           this to 's3' without S3 enabled fails startup with an explicit
#           error (dev/staging warn and fall back to 'local').
# Phase 1 wires this into TrackService; Phase 2 migrates the read path;
# Phase 3 provides a migration CLI for existing local tracks.
TRACK_STORAGE_BACKEND=local
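
# A sketch of the startup rule described above: 's3' without S3 enabled is
# fatal in production and a warn-plus-fallback elsewhere. Illustrative
# names, not the repo's actual validation code:

```go
package main

import (
	"errors"
	"fmt"
)

// validateTrackStorage returns the effective backend. 's3' with S3
// disabled is a hard startup error in production; dev/staging log a
// warning and fall back to 'local'.
func validateTrackStorage(backend, env string, s3Enabled bool) (string, error) {
	if backend != "s3" || s3Enabled {
		return backend, nil
	}
	if env == "production" {
		return "", errors.New("TRACK_STORAGE_BACKEND=s3 requires AWS_S3_ENABLED=true and AWS_S3_BUCKET")
	}
	fmt.Println("warn: S3 not enabled, falling back to TRACK_STORAGE_BACKEND=local")
	return "local", nil
}

func main() {
	b, _ := validateTrackStorage("s3", "development", false)
	fmt.Println(b)
	if _, err := validateTrackStorage("s3", "production", false); err != nil {
		fmt.Println("startup error:", err)
	}
}
```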

# --- FRONTEND URL ---
# Used for password reset links, email templates, etc.
FRONTEND_URL=http://veza.fr:5173

# --- BASE URL ---
# Public base URL of this backend (used for OAuth callbacks, etc.)
BASE_URL=http://veza.fr:8080
chore(release): v1.0.5.1 — dev SMTP ergonomics hotfix
A fresh clone + `cp veza-backend-api/.env.template .env` + `make dev-full`
booted the backend with `SMTP_HOST=""` — `EmailService.sendEmail` short-
circuits to log-only when the host is empty, so `register` + `password
reset` produced users stuck with no way to verify (or recover) in dev,
and the smoke test caught MailHog empty despite the service being up.
- `.env.template` now ships MailHog-ready defaults (`localhost:1025`,
UI on `:8025`, `FROM_EMAIL=no-reply@veza.local`) so a bare clone +
copy gives a working register flow. Comment rewritten to point at
both the dev path and the prod override.
- Also exports duplicate variable names (`SMTP_USERNAME`, `SMTP_FROM`,
`SMTP_FROM_NAME`) read by `internal/email/sender.go`. The two email
services in-tree disagree on env schema (`SMTP_USER` vs
`SMTP_USERNAME`, `FROM_EMAIL` vs `SMTP_FROM`, `FROM_NAME` vs
`SMTP_FROM_NAME`); until v1.0.6 reconciles them, both sets are
populated so whichever path fires finds its names.
Pure config hotfix. No code change, no migration.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 16:16:54 +00:00
|
|
|
# --- EMAIL (REQUIRED for registration + password reset) ---
|
|
|
|
|
# Local dev: MailHog ships with `make infra-up-dev` — the defaults below
|
|
|
|
|
# point at it on localhost:1025, with the web UI at http://localhost:8025
|
|
|
|
|
# to inspect captured emails.
|
|
|
|
|
# Production: override with your SMTP provider (SendGrid, AWS SES, etc.).
# v1.0.6: unified on canonical SMTP_* names. The legacy SMTP_USER /
# FROM_EMAIL / FROM_NAME are still accepted as a deprecation fallback
# (backend logs a warning on use; the fallback will be removed in v1.1.0).
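# Legacy -> canonical mapping applied by the loader (canonical always wins
# over the deprecated name when both are set):
#   SMTP_USER  -> SMTP_USERNAME
#   FROM_EMAIL -> SMTP_FROM
#   FROM_NAME  -> SMTP_FROM_NAME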
# Dev defaults below target MailHog (SMTP on localhost:1025, web UI on :8025).
SMTP_HOST=localhost
SMTP_PORT=1025
SMTP_USERNAME=
SMTP_PASSWORD=
SMTP_FROM=no-reply@veza.local
SMTP_FROM_NAME=Veza (dev)
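# Example production override (SendGrid shown; all values are placeholders —
# substitute your provider's host, port, and credentials):
# SMTP_HOST=smtp.sendgrid.net
# SMTP_PORT=587
# SMTP_USERNAME=apikey
# SMTP_PASSWORD=<your-sendgrid-api-key>
# SMTP_FROM=no-reply@veza.fr
# SMTP_FROM_NAME=Veza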
# --- MONITORING (OPTIONAL) ---
PROMETHEUS_URL=
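# Example (hypothetical in-cluster address; Prometheus listens on 9090 by
# default — leave empty to disable):
# PROMETHEUS_URL=http://prometheus:9090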
# =============================================================================
# VALIDATION RULES
# =============================================================================
#
# REQUIRED (app will not start without these):
# - DATABASE_URL
# - JWT_SECRET (min 32 chars)
# - REDIS_URL or REDIS_ADDR
# - CORS_ALLOWED_ORIGINS (can be empty for strict mode)
#
# RECOMMENDED for production:
# - SENTRY_DSN
# - COOKIE_SECURE=true
# - COOKIE_SAME_SITE=strict
#
# OPTIONAL:
# - RABBITMQ_* (needed only if async events are used)
# - SMTP_* (needed only if email is used)
# - CLAMAV_* (needed only if file scanning is used)
#
# =============================================================================
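#
# Quick pre-boot check that the REQUIRED variables above are set (example
# only; REDIS needs just one of the two names):
#   grep -E '^(DATABASE_URL|JWT_SECRET|REDIS_URL|REDIS_ADDR|CORS_ALLOWED_ORIGINS)=' .env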