Two connected failure modes that silently break multi-pod deployments:
1. `RedisURL` has a struct-level default (`redis://<appDomain>:6379`),
so `c.RedisURL == ""` is always false. An operator who forgot to set
`REDIS_URL` therefore booted against a phantom host: every Redis call
failed, and `ChatPubSubService` quietly fell back to an in-memory
map. On a single-pod deploy that "works"; on two pods it silently
partitions chat (messages on pod A never reach subscribers on pod B).
2. The fallback itself was logged at `Warn` level, buried under normal
traffic. Operators only noticed when users reported stuck chats.
Changes:
* `config.go` (`ValidateForEnvironment` prod branch): new check that
`os.Getenv("REDIS_URL")` is non-empty. The struct field is left
alone (dev + test still use the default); we inspect the raw env so
the check is "explicitly set" rather than "non-empty after defaults".
* `chat_pubsub.go` `NewChatPubSubService`: if `redisClient == nil`,
emit an `ERROR` at construction time naming the failure mode
("cross-instance messages will be lost"). The `Publish` fallback path
gets the same `Warn`→`Error` promotion, since it is runbook-worthy.
Tests: new `chat_pubsub_test.go` uses a `zaptest/observer` to assert
that the ERROR-level log fires exactly once when Redis is nil, plus an
in-memory fan-out happy-path test so single-pod dev behaviour stays covered.
New `TestValidateForEnvironment_RedisURLRequiredInProduction` mirrors
the Hyperswitch guard test shape.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
45 lines
1.1 KiB
Go
package services

import (
	"context"
	"testing"

	"github.com/google/uuid"
	"github.com/stretchr/testify/assert"
	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
	"go.uber.org/zap/zaptest/observer"
)

// TestChatPubSubService_NilRedisLogsError checks that constructing the service
// without a Redis client emits exactly one ERROR-level log entry.
func TestChatPubSubService_NilRedisLogsError(t *testing.T) {
	// Capture only entries at ERROR level and above.
	core, observed := observer.New(zapcore.ErrorLevel)
	logger := zap.New(core)

	_ = NewChatPubSubService(nil, logger)

	entries := observed.All()
	// Stop on a length mismatch so entries[0] below cannot panic.
	if !assert.Len(t, entries, 1, "constructor should emit exactly one ERROR log when Redis is nil") {
		return
	}
	assert.Equal(t, zapcore.ErrorLevel, entries[0].Level)
	assert.Contains(t, entries[0].Message, "cross-instance messages will be lost")
}

// TestChatPubSubService_InMemoryFanout covers the single-pod happy path:
// with no Redis client, a published message reaches a local subscriber.
func TestChatPubSubService_InMemoryFanout(t *testing.T) {
	svc := NewChatPubSubService(nil, zap.NewNop())

	ctx := context.Background()
	roomID := uuid.New()

	ch, cancel, err := svc.Subscribe(ctx, roomID)
	// Stop on a Subscribe error so a nil cancel is never deferred.
	if !assert.NoError(t, err) {
		return
	}
	defer cancel()

	err = svc.Publish(ctx, roomID, []byte("hello"))
	assert.NoError(t, err)

	// Non-blocking receive: the test expects the in-memory Publish to have
	// delivered the message by the time it returns.
	select {
	case msg := <-ch:
		assert.Equal(t, "hello", string(msg))
	default:
		t.Fatal("expected message on in-memory channel")
	}
}