16 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
d5152d89a2 |
feat(stream): HLS default on + marketplace 30s pre-listen + FLAC tier checkbox (W4 Day 17)
Some checks failed
Veza CI / Rust (Stream Server) (push) Successful in 5m28s
Security Scan / Secret Scanning (gitleaks) (push) Failing after 53s
Veza CI / Backend (Go) (push) Failing after 7m59s
Veza CI / Frontend (Web) (push) Failing after 17m43s
Veza CI / Notify on failure (push) Successful in 4s
E2E Playwright / e2e (full) (push) Failing after 20m55s
Three pieces shipping under one banner since they're the day's
deliverables and share no review-time coupling :
1. HLS_STREAMING default flipped true
- config.go : getEnvBool default true (was false). Operators wanting
a lightweight dev / unit-test env explicitly set HLS_STREAMING=false
to skip the transcoder pipeline.
- .env.template : default flipped + comment explaining the opt-out.
- Effect : every new track upload routes through the HLS transcoder
by default ; ABR ladder served via /tracks/:id/master.m3u8.
2. Marketplace 30s pre-listen (creator opt-in)
- migrations/989 : adds products.preview_enabled BOOLEAN NOT NULL
DEFAULT FALSE + partial index on TRUE values. Default off so
adoption is opt-in.
- core/marketplace/models.go : PreviewEnabled field on Product.
- handlers/marketplace.go : StreamProductPreview gains a fall-through.
When no file-based ProductPreview exists AND the product is a
track product AND preview_enabled=true, redirect to the underlying
/tracks/:id/stream?preview=30. Header X-Preview-Cap-Seconds: 30
surfaces the policy.
- core/track/track_hls_handler.go : StreamTrack accepts ?preview=30
and gates anonymous access via isMarketplacePreviewAllowed (raw
SQL probe of products.preview_enabled to avoid the
track→marketplace import cycle ; the reverse arrow already exists).
- Trust model : 30s cap is enforced client-side (HTML5 audio
currentTime). Industry standard for tease-to-buy ; not anti-rip.
Documented in the migration + handler doc comment.
3. FLAC tier preview checkbox (Premium-gated, hidden by default)
- upload-modal/constants.ts : optional flacAvailable on UploadFormData.
- upload-modal/UploadModalMetadataForm.tsx : new optional props
showFlacAvailable + flacAvailable + onFlacAvailableChange.
Checkbox renders only when showFlacAvailable=true ; consumers
pass that based on the user's role/subscription tier (deferred
to caller wiring — Item G phase 4 will replace the role check
with a real subscription-tier check).
- Today the checkbox is a UI affordance only ; the actual lossless
distribution path (ladder + storage class) is post-launch work.
Acceptance (Day 17) : new uploads serve HLS ABR by default ;
products.preview_enabled flag wires anonymous 30s pre-listen ;
checkbox visible to premium users on the upload form. All 4 tested
backend packages pass : handlers, core/track, core/marketplace, config.
W4 progress : Day 16 ✓ · Day 17 ✓ · Day 18 (faceted search) ⏳ ·
Day 19 (HAProxy sticky WS) ⏳ · Day 20 (k6 nightly) ⏳.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
15e591305e |
feat(cdn): Bunny.net signed URLs + HLS cache headers + metric collision fix (W3 Day 13)
Some checks failed
Veza CI / Rust (Stream Server) (push) Successful in 5m12s
Security Scan / Secret Scanning (gitleaks) (push) Failing after 54s
Veza CI / Backend (Go) (push) Failing after 8m38s
Veza CI / Frontend (Web) (push) Failing after 16m44s
Veza CI / Notify on failure (push) Successful in 15s
E2E Playwright / e2e (full) (push) Successful in 20m28s
CDN edge in front of S3/MinIO via origin-pull. Backend signs URLs with Bunny.net token-auth (SHA-256 over security_key + path + expires) so edges verify before serving cached objects ; origin is never hit on a valid token. Cloudflare CDN / R2 / CloudFront stubs kept. - internal/services/cdn_service.go : new providers CDNProviderBunny + CDNProviderCloudflareR2. SecurityKey added to CDNConfig. generateBunnySignedURL implements the documented Bunny scheme (url-safe base64, no padding, expires query). HLSSegmentCacheHeaders + HLSPlaylistCacheHeaders helpers exported for handlers. - internal/services/cdn_service_test.go : pin Bunny URL shape + base64-url charset ; assert empty SecurityKey fails fast (no silent fallback to unsigned URLs). - internal/core/track/service.go : new CDNURLSigner interface + SetCDNService(cdn). GetStorageURL prefers CDN signed URL when cdnService.IsEnabled, falls back to direct S3 presign on signing error so a CDN partial outage doesn't block playback. - internal/api/routes_tracks.go + routes_core.go : wire SetCDNService on the two TrackService construction sites that serve stream/download. - internal/config/config.go : 4 new env vars (CDN_ENABLED, CDN_PROVIDER, CDN_BASE_URL, CDN_SECURITY_KEY). config.CDNService always non-nil after init ; IsEnabled gates the actual usage. - internal/handlers/hls_handler.go : segments now return Cache-Control: public, max-age=86400, immutable (content-addressed filenames make this safe). Playlists at max-age=60. - veza-backend-api/.env.template : 4 placeholder env vars. - docs/ENV_VARIABLES.md §12 : provider matrix + Bunny vs Cloudflare vs R2 trade-offs. Bug fix collateral : v1.0.9 Day 11 introduced veza_cache_hits_total which collided in name with monitoring.CacheHitsTotal (different label set ⇒ promauto MustRegister panic at process init). Day 13 deletes the monitoring duplicate and restores the metrics-package counter as the single source of truth (label: subsystem). All 8 affected packages green : services, core/track, handlers, middleware, websocket/chat, metrics, monitoring, config. Acceptance (Day 13) : code path is wired ; verifying via real Bunny edge requires a Pull Zone provisioned by the user (EX-? in roadmap). On the user side : create Pull Zone w/ origin = MinIO, copy token auth key into CDN_SECURITY_KEY, set CDN_ENABLED=true. W3 progress : Redis Sentinel ✓ · MinIO distribué ✓ · CDN ✓ · DMCA ⏳ Day 14 · embed ⏳ Day 15. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
a36d9b2d59 |
feat(redis): Sentinel HA + cache hit rate metrics (W3 Day 11)
Some checks failed
Veza CI / Backend (Go) (push) Failing after 8m56s
Veza CI / Frontend (Web) (push) Has been cancelled
E2E Playwright / e2e (full) (push) Has been cancelled
Veza CI / Notify on failure (push) Blocked by required conditions
Veza CI / Rust (Stream Server) (push) Successful in 5m3s
Security Scan / Secret Scanning (gitleaks) (push) Failing after 53s
Three Incus containers, each running redis-server + redis-sentinel (co-located). redis-1 = master at first boot, redis-2/3 = replicas. Sentinel quorum=2 of 3 ; failover-timeout=30s satisfies the W3 acceptance criterion. - internal/config/redis_init.go : initRedis branches on REDIS_SENTINEL_ADDRS ; non-empty -> redis.NewFailoverClient with MasterName + SentinelAddrs + SentinelPassword. Empty -> existing single-instance NewClient (dev/local stays parametric). - internal/config/config.go : 3 new fields (RedisSentinelAddrs, RedisSentinelMasterName, RedisSentinelPassword) read from env. parseRedisSentinelAddrs trims+filters CSV. - internal/metrics/cache_hit_rate.go : new RecordCacheHit / Miss counters, labelled by subsystem. Cardinality bounded. - internal/middleware/rate_limiter.go : instrument 3 Eval call sites (DDoS, frontend log throttle, upload throttle). Hit = Redis answered, Miss = error -> in-memory fallback. - internal/services/chat_pubsub.go : instrument Publish + PublishPresence. - internal/websocket/chat/presence_service.go : instrument SetOnline / SetOffline / Heartbeat / GetPresence. redis.Nil counts as a hit (legitimate empty result). - infra/ansible/roles/redis_sentinel/ : install Redis 7 + Sentinel, render redis.conf + sentinel.conf, systemd units. Vault assertion prevents shipping placeholder passwords to staging/prod. - infra/ansible/playbooks/redis_sentinel.yml : provisions the 3 containers + applies common baseline + role. - infra/ansible/inventory/lab.yml : new groups redis_ha + redis_ha_master. - infra/ansible/tests/test_redis_failover.sh : kills the master container, polls Sentinel for the new master, asserts elapsed < 30s. - config/grafana/dashboards/redis-cache-overview.json : 3 hit-rate stats (rate_limiter / chat_pubsub / presence) + ops/s breakdown. - docs/ENV_VARIABLES.md §3 : 3 new REDIS_SENTINEL_* env vars. - veza-backend-api/.env.template : 3 placeholders (empty default). Acceptance (Day 11) : Sentinel failover < 30s ; cache hit-rate dashboard populated. Lab test pending Sentinel deployment. W3 verification gate progress : Redis Sentinel ✓ (this commit), MinIO EC4+2 ⏳ Day 12, CDN ⏳ Day 13, DMCA ⏳ Day 14, embed ⏳ Day 15. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
d03232c85c |
feat(storage): add track storage_backend column + config prep (v1.0.8 P0)
Some checks failed
Veza CI / Backend (Go) (push) Failing after 0s
Veza CI / Frontend (Web) (push) Failing after 0s
Veza CI / Rust (Stream Server) (push) Failing after 0s
Security Scan / Secret Scanning (gitleaks) (push) Failing after 0s
Veza CI / Notify on failure (push) Failing after 0s
Phase 0 of the MinIO upload migration (FUNCTIONAL_AUDIT §4 item 2).
Schema + config only — Phase 1 will wire TrackService.UploadTrack()
to actually route writes to S3 when the flag is flipped.
Schema (migration 985):
- tracks.storage_backend VARCHAR(16) NOT NULL DEFAULT 'local'
CHECK in ('local', 's3')
- tracks.storage_key VARCHAR(512) NULL (S3 object key when backend=s3)
- Partial index on storage_backend = 's3' (migration progress queries)
- Rollback drops both columns + index; safe only while all rows are
still 'local' (guard query in the rollback comment)
Go model (internal/models/track.go):
- StorageBackend string (default 'local', not null)
- StorageKey *string (nullable)
- Both tagged json:"-" — internal plumbing, never exposed publicly
Config (internal/config/config.go):
- New field Config.TrackStorageBackend
- Read from TRACK_STORAGE_BACKEND env var (default 'local')
- Production validation rule #11 (ValidateForEnvironment):
- Must be 'local' or 's3' (reject typos like 'S3' or 'minio')
- If 's3', requires AWS_S3_ENABLED=true (fail fast, do not boot with
TrackStorageBackend=s3 while S3StorageService is nil)
- Dev/staging warns and falls back to 'local' instead of fail — keeps
iteration fast while still flagging misconfig.
Docs:
- docs/ENV_VARIABLES.md §13 restructured as "HLS + track storage backend"
with a migration playbook (local → s3 → migrate-storage CLI)
- docs/ENV_VARIABLES.md §28 validation rules: +2 entries for new rules
- docs/ENV_VARIABLES.md §29 drift findings: TRACK_STORAGE_BACKEND added
to "missing from template" list before it was fixed
- veza-backend-api/.env.template: TRACK_STORAGE_BACKEND=local with
comment pointing at Phase 1/2/3 plans
No behavior change yet — TrackService.UploadTrack() still hardcodes the
local path via copyFileAsync(). Phase 1 wires it.
Refs:
- AUDIT_REPORT.md §9 item (deferrals v1.0.8)
- FUNCTIONAL_AUDIT.md §4 item 2 "Stockage local disque only"
- /home/senke/.claude/plans/audit-fonctionnel-wild-hickey.md Item 3
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
7d03ee6686 |
docs(env): canonicalize ENV_VARIABLES.md + add HLS_STREAMING template
Some checks failed
Veza CI / Backend (Go) (push) Failing after 0s
Veza CI / Frontend (Web) (push) Failing after 0s
Veza CI / Rust (Stream Server) (push) Failing after 0s
Security Scan / Secret Scanning (gitleaks) (push) Failing after 0s
Veza CI / Notify on failure (push) Failing after 0s
Resolves AUDIT_REPORT §9 item #15 (last real item before v1.0.7 final) and FUNCTIONAL_AUDIT §4 stability item 5. docs/ENV_VARIABLES.md: - Complete rewrite from 172 → ~600 lines covering all ~180 env vars surveyed directly from code (os.Getenv in Go, std::env::var in Rust, import.meta.env in React). - 30 sections: core, DB, Redis, JWT, OAuth, CORS, rate-limit, SMTP, Hyperswitch, Stripe Connect, RabbitMQ, S3/MinIO, HLS, stream server, Elasticsearch, ClamAV, Sentry, logging, metrics, frontend Vite, feature flags, password policy, build info, RTMP/misc, Rust stream schema, security headers recap, deprecated vars, prod validation rules, drift findings, startup checklist. - Documents 8 production-critical validation rules (validation.go:869-1018). - Flags 14 deprecated vars with canonical replacements for v1.1.0 cleanup. - Catalogs 11 vars used by code but missing from template (HLS_STREAMING, SLOW_REQUEST_THRESHOLD_MS, CONFIG_WATCH, HANDLER_TIMEOUT, VAPID_*, etc). veza-backend-api/.env.template: - Add HLS_STREAMING=false with documentation of fallback behavior (/tracks/:id/stream with Range support when off). - Add HLS_STORAGE_DIR=/tmp/veza-hls. Closes last blocker before v1.0.7 final tag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
7e180a2c08 |
feat(workers): hyperswitch reconciliation sweep for stuck pending states — v1.0.7 item C
New ReconcileHyperswitchWorker sweeps for pending orders and refunds
whose terminal webhook never arrived. Pulls live PSP state for each
stuck row and synthesises a webhook payload to feed the normal
ProcessPaymentWebhook / ProcessRefundWebhook dispatcher. The existing
terminal-state guards on those handlers make reconciliation
idempotent against real webhooks — a late webhook after the reconciler
resolved the row is a no-op.
Three stuck-state classes covered:
1. Stuck orders (pending > 30m, non-empty payment_id) → GetPaymentStatus
+ synthetic payment.<status> webhook.
2. Stuck refunds with PSP id (pending > 30m, non-empty
hyperswitch_refund_id) → GetRefundStatus + synthetic
refund.<status> webhook (error_message forwarded).
3. Orphan refunds (pending > 5m, EMPTY hyperswitch_refund_id) →
mark failed + roll order back to completed + log ERROR. This
is the "we crashed between Phase 1 and Phase 2 of RefundOrder"
case, operator-attention territory.
New interfaces:
* marketplace.HyperswitchReadClient — read-only PSP surface the
worker depends on (GetPaymentStatus, GetRefundStatus). The
worker never calls CreatePayment / CreateRefund.
* hyperswitch.Client.GetRefund + RefundStatus struct added.
* hyperswitch.Provider gains GetRefundStatus + GetPaymentStatus
pass-throughs that satisfy the marketplace interface.
Configuration (all env-var tunable with sensible defaults):
* RECONCILE_WORKER_ENABLED=true
* RECONCILE_INTERVAL=1h (ops can drop to 5m during incident
response without a code change)
* RECONCILE_ORDER_STUCK_AFTER=30m
* RECONCILE_REFUND_STUCK_AFTER=30m
* RECONCILE_REFUND_ORPHAN_AFTER=5m (shorter because "app crashed"
is a different signal from "network hiccup")
Operational details:
* Batch limit 50 rows per phase per tick so a 10k-row backlog
doesn't hammer Hyperswitch. Next tick picks up the rest.
* PSP read errors leave the row untouched — next tick retries.
Reconciliation is always safe to replay.
* Structured log on every action so `grep reconcile` tells the
ops story: which order/refund got synced, against what status,
how long it was stuck.
* Worker wired in cmd/api/main.go, gated on
HyperswitchEnabled + HyperswitchAPIKey. Graceful shutdown
registered.
* RunOnce exposed as public API for ad-hoc ops trigger during
incident response.
Tests — 10 cases, all green (sqlite :memory:):
* TestReconcile_StuckOrder_SyncsViaSyntheticWebhook
* TestReconcile_RecentOrder_NotTouched
* TestReconcile_CompletedOrder_NotTouched
* TestReconcile_OrderWithEmptyPaymentID_NotTouched
* TestReconcile_PSPReadErrorLeavesRowIntact
* TestReconcile_OrphanRefund_AutoFails_OrderRollsBack
* TestReconcile_RecentOrphanRefund_NotTouched
* TestReconcile_StuckRefund_SyncsViaSyntheticWebhook
* TestReconcile_StuckRefund_FailureStatus_PassesErrorMessage
* TestReconcile_AllTerminalStates_NoOp
CHANGELOG v1.0.7-rc1 updated with the full item C section between D
and the existing E block, matching the order convention (ship order:
A → D → B → E → C, CHANGELOG order follows).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
3c4d0148be |
feat(webhooks): persist raw hyperswitch payloads to audit log — v1.0.7 item E
Every POST /webhooks/hyperswitch delivery now writes a row to
`hyperswitch_webhook_log` regardless of signature-valid or
processing outcome. Captures both legitimate deliveries and attack
probes — a forensics query now has the actual bytes to read, not
just a "webhook rejected" log line. Disputes (axis-1 P1.6) ride
along: the log captures dispute.* events alongside payment and
refund events, ready for when disputes get a handler.
Table shape (migration 984):
* payload TEXT — readable in psql, invalid UTF-8 replaced with
empty (forensics value is in headers + ip + timing for those
attacks, not the binary body).
* signature_valid BOOLEAN + partial index for "show me attack
attempts" being instantaneous.
* processing_result TEXT — 'ok' / 'error: <msg>' /
'signature_invalid' / 'skipped'. Matches the P1.5 action
semantic exactly.
* source_ip, user_agent, request_id — forensics essentials.
request_id is captured from Hyperswitch's X-Request-Id header
when present, else a server-side UUID so every row correlates
to VEZA's structured logs.
* event_type — best-effort extract from the JSON payload, NULL
on malformed input.
Hardening:
* 64KB body cap via io.LimitReader rejects oversize with 413
before any INSERT — prevents log-spam DoS.
* Single INSERT per delivery with final state; no two-phase
update race on signature-failure path. signature_invalid and
processing-error rows both land.
* DB persistence failures are logged but swallowed — the
endpoint's contract is to ack Hyperswitch, not perfect audit.
Retention sweep:
* CleanupHyperswitchWebhookLog in internal/jobs, daily tick,
batched DELETE (10k rows + 100ms pause) so a large backlog
doesn't lock the table.
* HYPERSWITCH_WEBHOOK_LOG_RETENTION_DAYS (default 90).
* Same goroutine-ticker pattern as ScheduleOrphanTracksCleanup.
* Wired in cmd/api/main.go alongside the existing cleanup jobs.
Tests: 5 in webhook_log_test.go (persistence, request_id auto-gen,
invalid-JSON leaves event_type empty, invalid-signature capture,
extractEventType 5 sub-cases) + 4 in cleanup_hyperswitch_webhook_
log_test.go (deletes-older-than, noop, default-on-zero,
context-cancel). Migration 984 applied cleanly to local Postgres;
all indexes present.
Also (v107-plan.md):
* Item G acceptance gains an explicit Idempotency-Key threading
requirement with an empty-key loud-fail test — "literally
copy-paste D's 4-line test skeleton". Closes the risk that
item G silently reopens the HTTP-retry duplicate-charge
exposure D closed.
Out of scope for E (noted in CHANGELOG):
* Rate limit on the endpoint — pre-existing middleware covers
it at the router level; adding a per-endpoint limit is
separate scope.
* Readable-payload SQL view — deferred, the TEXT column is
already human-readable; a convenience view is a nice-to-have
not a ship-blocker.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d2bb9c0e78 |
feat(marketplace): async stripe connect reversal worker — v1.0.7 item B day 2
Day-2 cut of item B: the reversal path becomes async. Pre-v1.0.7
(and v1.0.7 day 1) the refund handler flipped seller_transfers
straight from completed to reversed without ever calling Stripe —
the ledger said "reversed" while the seller's Stripe balance still
showed the original transfer as settled. The new flow:
refund.succeeded webhook
→ reverseSellerAccounting transitions row: completed → reversal_pending
→ StripeReversalWorker (every REVERSAL_CHECK_INTERVAL, default 1m)
→ calls ReverseTransfer on Stripe
→ success: row → reversed + persist stripe_reversal_id
→ 404 already-reversed (dead code until day 3): row → reversed + log
→ 404 resource_missing (dead code until day 3): row → permanently_failed
→ transient error: stay reversal_pending, bump retry_count,
exponential backoff (base * 2^retry, capped at backoffMax)
→ retries exhausted: row → permanently_failed
→ buyer-facing refund completes immediately regardless of Stripe health
State machine enforcement:
* New `SellerTransfer.TransitionStatus(tx, to, extras)` wraps every
mutation: validates against AllowedTransferTransitions, guarded
UPDATE with WHERE status=<from> (optimistic lock semantics), no
RowsAffected = stale state / concurrent winner detected.
* processSellerTransfers no longer mutates .Status in place —
terminal status is decided before struct construction, so the
row is Created with its final state.
* transfer_retry.retryOne and admin RetryTransfer route through
TransitionStatus. Legacy direct assignment removed.
* TestNoDirectTransferStatusMutation greps the package for any
`st.Status = "..."` / `t.Status = "..."` / GORM
Model(&SellerTransfer{}).Update("status"...) outside the
allowlist and fails if found. Verified by temporarily injecting
a violation during development — test caught it as expected.
Configuration (v1.0.7 item B):
* REVERSAL_WORKER_ENABLED=true (default)
* REVERSAL_MAX_RETRIES=5 (default)
* REVERSAL_CHECK_INTERVAL=1m (default)
* REVERSAL_BACKOFF_BASE=1m (default)
* REVERSAL_BACKOFF_MAX=1h (default, caps exponential growth)
* .env.template documents TRANSFER_RETRY_* and REVERSAL_* env vars
so an ops reader can grep them.
Interface change: TransferService.ReverseTransfer(ctx,
stripe_transfer_id, amount *int64, reason) (reversalID, error)
added. All four mocks extended (process_webhook, transfer_retry,
admin_transfer_handler, payment_flow integration). amount=nil means
full reversal; v1.0.7 always passes nil (partial reversal is future
scope per axis-1 P2).
Stripe 404 disambiguation (ErrTransferAlreadyReversed /
ErrTransferNotFound) is wired in the worker as dead code — the
sentinels are declared and the worker branches on them, but
StripeConnectService.ReverseTransfer doesn't yet emit them. Day 3
will parse stripe.Error.Code and populate the sentinels; no worker
change needed at that point. Keeping the handling skeleton in day 2
so the worker's branch shape doesn't change between days and the
tests can already cover all four paths against the mock.
Worker unit tests (9 cases, all green, sqlite :memory:):
* happy path: reversal_pending → reversed + stripe_reversal_id set
* already reversed (mock returns sentinel): → reversed + log
* not found (mock returns sentinel): → permanently_failed + log
* transient 503: retry_count++, next_retry_at set with backoff,
stays reversal_pending
* backoff capped at backoffMax (verified with base=1s, max=10s,
retry_count=4 → capped at 10s not 16s)
* max retries exhausted: → permanently_failed
* legacy row with empty stripe_transfer_id: → permanently_failed,
does not call Stripe
* only picks up reversal_pending (skips all other statuses)
* respects next_retry_at (future rows skipped)
Existing test updated: TestProcessRefundWebhook_SucceededFinalizesState
now asserts the row lands at reversal_pending with next_retry_at
set (worker's responsibility to drive to reversed), not reversed.
Worker wired in cmd/api/main.go alongside TransferRetryWorker,
sharing the same StripeConnectService instance. Shutdown path
registered for graceful stop.
Cut from day 2 scope (per agreed-upon discipline), landing in day 3:
* Stripe 404 disambiguation implementation (parse error.Code)
* End-to-end smoke probe (refund → reversal_pending → worker
processes → reversed) against local Postgres + mock Stripe
* Batch-size tuning / inter-batch sleep — batchLimit=20 today is
safely under Stripe's 100 req/s default rate limit; revisit if
observed load warrants
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
9002e91d91 |
refactor(backend,infra): unify SMTP env schema on canonical SMTP_* names
Third item of the v1.0.6 backlog. The v1.0.5.1 hotfix surfaced that two
email paths in-tree read *different* env vars for the same configuration:
internal/email/sender.go internal/services/email_service.go
SMTP_USERNAME SMTP_USER
SMTP_FROM FROM_EMAIL
SMTP_FROM_NAME FROM_NAME
The hotfix worked around it by exporting both sets in `.env.template`.
This commit reconciles them onto a single schema so the workaround can
go away.
Changes
* `internal/email/sender.go` is now the single loader. The canonical
names (`SMTP_USERNAME`, `SMTP_FROM`, `SMTP_FROM_NAME`) are read
first; the legacy names (`SMTP_USER`, `FROM_EMAIL`, `FROM_NAME`)
stay supported as a migration fallback that logs a structured
deprecation warning ("remove_in: v1.1.0"). Canonical always wins
over deprecated — no silent precedence flip.
* `NewSMTPEmailSender` callers keep working unchanged; a new
`LoadSMTPConfigFromEnvWithLogger(*zap.Logger)` variant lets callers
opt into the warning stream.
* `internal/services/email_service.go` drops its six inline
`os.Getenv` reads and delegates to the shared loader, so
`AuthService.Register` and `RequestPasswordReset` now see exactly
the same config as the async job worker.
* `.env.template`: the duplicate (SMTP_USER + FROM_EMAIL + FROM_NAME)
block added in v1.0.5.1 is removed — only the canonical SMTP_*
names ship for new contributors.
* `docker-compose.yml` (backend-api service): FROM_EMAIL / FROM_NAME
renamed to SMTP_FROM / SMTP_FROM_NAME to match the canonical schema.
* No Host/Port default injected in the loader. If SMTP_HOST is
empty, callers see Host=="" and log-only (historic dev behavior).
Dev defaults (MailHog localhost:1025) live in `.env.template`, so
a fresh clone still works; a misconfigured prod pod fails loud
instead of silently dialing localhost.
Tests
* 5 new Go tests in `internal/email/smtp_env_test.go`: empty-env
returns empty config; canonical names read directly; deprecated
names fall back (one warning per var); canonical wins over
deprecated silently; nil logger is allowed.
* Existing `TestLoadSMTPConfigFromEnv`, `TestSMTPEmailSender_Send`,
and every auth/services package remained green (40+ packages).
Import-cycle note: the loader deliberately lives in `internal/email`,
not `internal/config`, because `internal/config` already depends on
`internal/email` (wiring `EmailSender` at boot). Putting the loader in
`email` keeps the dependency flow one-way.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
070e31a463 |
chore(release): v1.0.5.1 — dev SMTP ergonomics hotfix
A fresh clone + `cp veza-backend-api/.env.template .env` + `make dev-full` booted the backend with `SMTP_HOST=""` — `EmailService.sendEmail` short- circuits to log-only when the host is empty, so `register` + `password reset` produced users stuck with no way to verify (or recover) in dev, and the smoke test caught MailHog empty despite the service being up. - `.env.template` now ships MailHog-ready defaults (`localhost:1025`, UI on `:8025`, `FROM_EMAIL=no-reply@veza.local`) so a bare clone + copy gives a working register flow. Comment rewritten to point at both the dev path and the prod override. - Also exports duplicate variable names (`SMTP_USERNAME`, `SMTP_FROM`, `SMTP_FROM_NAME`) read by `internal/email/sender.go`. The two email services in-tree disagree on env schema (`SMTP_USER` vs `SMTP_USERNAME`, `FROM_EMAIL` vs `SMTP_FROM`, `FROM_NAME` vs `SMTP_FROM_NAME`); until v1.0.6 reconciles them, both sets are populated so whichever path fires finds its names. Pure config hotfix. No code change, no migration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
2df921abd5 | v0.9.1 | ||
|
|
ae81e171c7 | feat(seller): add Stripe Connect config | ||
|
|
b73387af3c |
feat(api): add PostgreSQL read replica support (3.7)
- Add DATABASE_READ_URL config and InitReadReplica in database package - Add ForRead() helper for read-only handler routing - Update TrackService and TrackSearchService to use read replica for reads - Document setup in DEPLOYMENT_GUIDE.md and .env.template |
||
|
|
92f432fb9e | chore: consolidate pending changes (Hyperswitch, PostCard, dashboard, stream server, etc.) | ||
|
|
30f17dfc2a |
chore(backend): config, router, auth, stream service, sanitizer, tests
Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
712bfb6b8c |
config(template): add comprehensive .env.template
Created centralized environment template with all configuration variables documented and categorized. Categories: - REQUIRED: DATABASE_URL, JWT_SECRET (min 32 chars), REDIS - RECOMMENDED: SENTRY_DSN, COOKIE_SECURE, CORS_ALLOWED_ORIGINS - OPTIONAL: RABBITMQ, SMTP, CLAMAV, S3 Features: - Clear documentation for each variable - Default values specified - Validation rules documented - Environment-specific guidance (dev vs prod) - Security notes for sensitive values Impact: Single source of truth for configuration, reduces config drift. Fixes: P3.4 (part 1) from audit AUDIT_TEMP_29_01_2026.md |