AhsanLab.Tech
Real-Time Backend · 12 min read · June 9, 2026

Scaling WebSockets Horizontally with Redis Pub/Sub (With Benchmarks)

Adding a second WebSocket node is easy. Keeping Redis from melting at four nodes is the part nobody benchmarks. The fan-out amplification trap, the one-line subscribe mistake that costs you 30× the message volume, and real numbers from 1 to 8 nodes.

Ahsan Habib

Adding a second WebSocket node feels like a victory. You wire up Redis pub/sub, watch a message cross from Server 1 to Server 2, and ship it. The pillar guide covers exactly that — the wiring that makes cross-node delivery work at all.

This post is about what happens after that. Because the wiring is not the hard part. The hard part is that the thing which made two nodes work is the exact thing that makes four nodes melt your Redis instance — and you will not see it coming, because at two nodes the cost is invisible.

The trap in one sentence: the obvious way to subscribe, psubscribe('room:*'), makes every node receive every message for every room, so Redis delivery volume grows with publishes × nodes, not with the work you actually need done.

I hit this scaling Matrix Bingo past two nodes. What follows is the amplification math, the one-line fix that flattened it, the Redis-7 feature that helps further, and measured numbers from 1 to 8 nodes on the same hardware.

The fan-out amplification trap

Here is the mistake, and it is the most-copied snippet in every "scale WebSockets with Redis" tutorial:

// The convenient line that does not scale.
// Every node subscribes to EVERY room's channel — including rooms it hosts no one for.
sub.psubscribe('room:*');

At two nodes this looks free. But think about what a single broadcast actually costs. A player sends a move. Their node publishes it to room:abc. With psubscribe('room:*'), every node in the cluster is subscribed to that pattern — so Redis delivers that message to all of them, even the nodes that have zero members in room:abc. Each of those nodes wakes up, deserializes the payload, looks up the room, finds it empty locally, and throws the work away.

[Figure 1: one broadcast, four nodes. Node 1 hosts room:abc and sends one PUBLISH to room:abc. Redis matches the room:* pattern and delivers to all four subscribers. Node 1 fans out to its local members (real work); nodes 2, 3, and 4 each deserialize, find the room empty locally, and discard. Redis deliveries = publishes × nodes, so 3 of every 4 deliveries are wasted, and the share grows with every node added.]
FIGURE 1 — FAN-OUT AMPLIFICATION · With a wildcard subscription, Redis delivery volume scales with the number of nodes, not the number of clients who need the message. At 8 nodes, 7 of every 8 deliveries are discarded.
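
Before changing the subscription strategy, it can help to measure how much of a node's inbound Redis traffic is thrown away. The sketch below wraps the wildcard handler with two counters. createWildcardHandler and wastedFraction are hypothetical names, rooms is assumed to be a roomId → Set<WebSocket> map, broadcast is assumed to be the local fan-out function, and the wiring comment assumes an ioredis-style subscriber that emits 'pmessage'.

```javascript
// Hypothetical helper: counts how many wildcard deliveries this node discards.
// `rooms` maps roomId -> Set of local sockets; `broadcast` does local fan-out.
function createWildcardHandler(rooms, broadcast) {
  const stats = { delivered: 0, wasted: 0 };

  function onPMessage(_pattern, channel, payload) {
    stats.delivered += 1;
    const roomId = channel.slice('room:'.length);
    if (!rooms.has(roomId)) {
      // No local members: the delivery cost Redis and this node work for nothing.
      stats.wasted += 1;
      return;
    }
    broadcast(roomId, payload);
  }

  function wastedFraction() {
    return stats.delivered === 0 ? 0 : stats.wasted / stats.delivered;
  }

  return { onPMessage, wastedFraction, stats };
}

// Wiring, assuming an ioredis-style subscriber:
//   const handler = createWildcardHandler(rooms, broadcast);
//   sub.on('pmessage', handler.onPMessage);
```

Exporting wastedFraction() to your metrics system gives a per-node view of the waste: a value near 0.75 on a four-node cluster would match the picture in Figure 1.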

The math is brutal. If each node publishes P messages per second and you have N nodes, Redis must perform P × N × N deliveries per second: every node publishes, and every published message is delivered to every node. That is quadratic in node count. Doubling your servers, along with the traffic they carry, does not spread the Redis load; it quadruples the total delivery work. Redis executes commands on a single thread, so this is the wall you hit first, long before CPU or memory on the WebSocket nodes.
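
The arithmetic is easy to check with a few lines of code. This is a back-of-envelope model rather than a benchmark: P and N follow the symbols in the text, the rate of 1,000 messages per second is an illustrative value, and hostsPerRoom (how many nodes have at least one local member of a given room) is an assumption introduced for the comparison.

```javascript
// Wildcard subscription: N nodes each publish P msgs/s,
// and every publish is delivered to all N nodes.
function wildcardDeliveries(P, N) {
  return P * N * N;
}

// Per-room subscription: each publish is delivered only to the nodes that
// host the room. hostsPerRoom is an assumed average, capped at N.
function targetedDeliveries(P, N, hostsPerRoom = 1) {
  return P * N * Math.min(hostsPerRoom, N);
}

const P = 1000; // illustrative per-node publish rate, not a measured value
for (const N of [1, 2, 4, 8]) {
  const wild = wildcardDeliveries(P, N);
  const targeted = targetedDeliveries(P, N);
  console.log(`${N} nodes: wildcard ${wild}/s, targeted ${targeted}/s, ratio ${wild / targeted}x`);
}
```

With one hosting node per room, the ratio between the two strategies is simply N, which is why the gap widens with every node. The model counts every publish, so it does not capture the single-node case where no Redis hop is needed at all.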

The fix: subscribe only to rooms you actually host

The insight is simple once you see the waste: a node only needs messages for rooms that have at least one local member. So subscribe to a room's channel when the first local member joins, and unsubscribe when the last one leaves. Redis then delivers each message only to the handful of nodes that genuinely host that room.

// rooms.js — targeted per-room subscriptions. Redis delivers only where needed.
const rooms = new Map(); // roomId → Set<WebSocket>

async function joinRoom(ws) {
  const firstLocalMember = !rooms.has(ws.roomId);
  if (firstLocalMember) rooms.set(ws.roomId, new Set());
  rooms.get(ws.roomId).add(ws);

  // Only subscribe when WE start hosting this room locally.
  if (firstLocalMember) await sub.subscribe(`room:${ws.roomId}`);
}

async function leaveRoom(ws) {
  const room = rooms.get(ws.roomId);
  if (!room) return;
  room.delete(ws);

  if (room.size === 0) {
    rooms.delete(ws.roomId);
    await sub.unsubscribe(`room:${ws.roomId}`); // last local member gone → stop listening
  }
}

// Deliver to local members when Redis notifies us about a room we host.
sub.on('message', (channel, payload) => {
  const roomId = channel.slice('room:'.length);
  broadcast(roomId, payload);   // LOCAL fan-out only — never re-publish here
});

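// Cross-node send: one publish, no direct local broadcast.
// This node is subscribed to every room it hosts, so the 'message' handler
// above receives the publish back and delivers it to local members.
// Calling broadcast() here as well would deliver every frame twice locally.
async function broadcastViaRedis(roomId, payload) {
  await pub.publish(`room:${roomId}`, payload);
}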

Now a broadcast for room:abc reaches only the nodes that host a member of room:abc. For a room living entirely on one node, the publish costs a round-trip to Redis and back to that single node, not a fan-out to all eight. The two non-negotiable rules from the pillar still apply: use separate pub and sub clients (a connection in subscribe mode can manage its subscriptions but cannot issue regular commands such as PUBLISH), and never re-publish from inside the message handler, because that creates a feedback loop that takes the cluster down in seconds.
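
One detail worth handling deliberately is the echo: a node that publishes to a room it hosts is also subscribed to that channel, so Redis delivers its own message back to it. If local members should get the frame immediately instead of after the Redis round trip, one possible approach is to wrap each payload in an envelope that carries the origin node's id and to skip envelopes that came from this node. The sketch below assumes that approach; createRoomBus, nodeId, and the envelope shape are hypothetical and not part of the original code.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: local-first delivery with echo suppression.
// `pub` needs a publish(channel, message) method; `broadcast` is the local fan-out.
function createRoomBus({ pub, broadcast, nodeId = crypto.randomUUID() }) {
  // Deliver locally right away, then publish once for the other nodes.
  async function send(roomId, data) {
    broadcast(roomId, data);
    const envelope = JSON.stringify({ origin: nodeId, data });
    await pub.publish(`room:${roomId}`, envelope);
  }

  // Subscriber callback: local fan-out only, and never for our own echo.
  function onMessage(channel, raw) {
    const { origin, data } = JSON.parse(raw);
    if (origin === nodeId) return; // already delivered locally in send()
    broadcast(channel.slice('room:'.length), data);
  }

  return { send, onMessage, nodeId };
}

// Wiring, assuming an ioredis-style subscriber:
//   const bus = createRoomBus({ pub, broadcast });
//   sub.on('message', bus.onMessage);
```

The trade-off is a few extra bytes per message and a JSON parse on every delivery, in exchange for local members not waiting on Redis. The handler still performs local fan-out only, so the no-re-publish rule is preserved.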

Redis 7: sharded pub/sub for the cluster case

Targeted subscriptions fix the single-Redis case. But if you outgrow one Redis and move to Redis Cluster, classic pub/sub has a nasty default: a PUBLISH is propagated to every node in the cluster, because any client connected to any node might be subscribed. That re-introduces the amplification at the Redis layer itself.

Redis 7.0 added sharded pub/sub (SSUBSCRIBE / SPUBLISH) precisely for this. A sharded message stays on the single shard that owns the channel's key slot, so it never crosses the whole cluster. If you are scaling past one Redis, use the sharded commands for your room channels — the API mirrors the regular ones:

// Redis 7 sharded pub/sub — message stays on the shard owning the key slot.
await sub.ssubscribe(`room:${roomId}`);
await pub.spublish(`room:${roomId}`, payload);

sub.on('smessage', (channel, payload) => {
  broadcast(channel.slice('room:'.length), payload);
});
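
To make "the shard that owns the channel's key slot" concrete: Redis Cluster assigns a sharded channel to one of 16,384 slots with the same rule it uses for keys, namely CRC16 of the name modulo 16384, hashing only the text inside {...} when a non-empty hash tag is present. The sketch below reimplements that mapping for inspection only. Client libraries compute it internally, and crc16 and keySlot are hypothetical helper names.

```javascript
// CRC16 (XMODEM variant: polynomial 0x1021, initial value 0), as used by Redis Cluster.
function crc16(bytes) {
  let crc = 0;
  for (const byte of bytes) {
    crc ^= byte << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 0x8000) ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// Slot for a key or sharded channel name, honoring {hash tags}.
function keySlot(name) {
  let hashed = name;
  const open = name.indexOf('{');
  if (open !== -1) {
    const close = name.indexOf('}', open + 1);
    // Only a non-empty tag counts; "{}" falls back to hashing the whole name.
    if (close > open + 1) hashed = name.slice(open + 1, close);
  }
  return crc16(Buffer.from(hashed, 'utf8')) % 16384;
}

console.log(keySlot('room:abc')); // slot number in the range 0..16383
```

Two channels that share a hash tag always map to the same slot, which is one way to keep related channels on a single shard if that ever matters for your layout.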

The numbers — 1 to 8 nodes, same workload

Same harness as the rest of this series: rooms of ~12 members, a 240-byte JSON frame broadcast on every client message at a 20 Hz game tick, Node 20 LTS nodes behind Nginx with IP-hash sticky sessions, one Redis 7.0 instance on its own 2 vCPU box. The only variable that changes between the two blocks below is the subscription strategy.

[Figure 2: Redis deliveries per second (y axis, 0 to 800k) at 1, 2, 4, and 8 nodes for two strategies. psubscribe('room:*') ≈ P × N², with Redis saturating near 8 nodes; targeted per-room subscribe ≈ P × N, growing with real work only. Annotation: ~10× fewer deliveries.]
FIGURE 2 — The wildcard subscription grows quadratically and saturates a single Redis near 8 nodes; targeted per-room subscriptions grow with actual cross-node work, leaving Redis idle by comparison.
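| Nodes | Connections | Redis deliveries/s, psubscribe('room:*') | Redis deliveries/s, targeted | p99 latency, psubscribe / targeted |
|-------|-------------|------------------------------------------|------------------------------|------------------------------------|
| 1     | 5,000       | 0 (local only)                           | 0 (local only)               | 14 ms / 14 ms                      |
| 2     | 10,000      | ~70k                                     | ~35k                         | 28 ms / 24 ms                      |
| 4     | 20,000      | ~280k                                    | ~70k                         | 61 ms / 27 ms                      |
| 8     | 40,000      | ~1.1M (Redis CPU pinned)                 | ~140k                        | 180 ms / 31 ms                     |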

The story is in the last two rows. With the wildcard subscription, p99 latency goes from fine to unusable between 4 and 8 nodes — not because the WebSocket nodes are struggling, but because a single-threaded Redis is now spending all its time delivering messages to nodes that throw them away. With targeted subscriptions, latency barely moves as you scale, because Redis only ever does work that ends in a real client receiving a message.

Three more things that bite at scale

  • Sticky sessions are mandatory, not optional. A WebSocket lives on one process for its whole life — it cannot be re-routed mid-session like an HTTP request. Use IP-hash or a connection-aware balancer so the upgrade and every subsequent frame land on the same node. Round-robin will break reconnections in confusing, intermittent ways.
  • Pub/sub is fire-and-forget — it has no memory. If a node is briefly disconnected from Redis, every message published during that gap is gone for good; there is no replay. For chat or anything that must survive a blip, put the durable record in Redis Streams or your database and treat pub/sub purely as the live-delivery fast path.
  • One Redis is a single point of failure. The whole cluster's cross-node delivery rides on it. Run Redis Sentinel or a managed HA Redis, and make sure your clients reconnect and re-subscribe on failover — a reconnect that forgets its subscriptions is a silent outage where messages just stop arriving.
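
The re-subscribe step from the last point can be sketched as follows. Check your client first: some re-subscribe on their own after a reconnect (ioredis, for example, has an autoResubscribe option that is on by default), in which case this helper is only a safety net. installResubscribe is a hypothetical name, and the sketch assumes the subscriber emits 'ready' after each successful connect or reconnect and that subscribe() accepts several channels at once, as ioredis does.

```javascript
// Hypothetical helper: when the subscriber connection becomes ready again,
// re-subscribe to every room this node currently hosts.
// `rooms` maps roomId -> Set of local sockets.
function installResubscribe(sub, rooms) {
  sub.on('ready', async () => {
    const channels = [...rooms.keys()].map((roomId) => `room:${roomId}`);
    if (channels.length === 0) return; // nothing hosted locally, nothing to restore
    try {
      await sub.subscribe(...channels);
    } catch (err) {
      // Surface the failure; silence here is exactly the outage described above.
      console.error('re-subscribe failed', err);
    }
  });
}
```

Subscribing to a channel that is already subscribed is harmless, so running this alongside a client's own re-subscribe logic should not cause duplicate deliveries. Messages published during the gap are still lost, which is why the durable record belongs elsewhere.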

Scaling checklist

Before you add the third node, confirm every item. The first two are what separate "scales linearly" from "melts Redis at 8 nodes."
  • ⚠ Subscribe per-room, not psubscribe('room:*'). Subscribe on first local join, unsubscribe on last local leave. This is the single change that makes Redis load scale with work instead of with node count.
  • ⚠ Never re-publish inside the subscribe handler. Local broadcast() only. Re-publishing creates a feedback loop that collapses the cluster.
  • Use sharded pub/sub (SSUBSCRIBE/SPUBLISH) on Redis Cluster. Classic pub/sub re-fans across the whole cluster; sharded keeps each message on one shard.
  • Separate pub and sub clients. A subscribed connection cannot issue normal commands.
  • IP-hash sticky sessions at the balancer. The upgrade and all frames must hit the same node.
  • Re-subscribe on Redis reconnect. A reconnect that drops subscriptions is a silent delivery outage.
  • Graph Redis deliveries/sec, not just WebSocket node CPU. Redis saturates first; if you only watch the app nodes, the bottleneck stays invisible until latency explodes.

The one-line takeaway

Horizontal WebSocket scaling does not fail because nodes run out of CPU — it fails because the message bus does. The difference between a cluster that scales linearly and one that melts at eight nodes is a single architectural choice: deliver each message only to the nodes that need it, never to all of them.

This is Spoke 3 of the real-time backend series, built on the production WebSocket pillar, the memory-leak war story, and the framework comparison. If you have pushed Redis pub/sub further than 8 nodes — or moved to a different bus entirely — I would love to compare notes in the comments.

#WebSocket #Redis #Node.js #Real-Time #Scalability #Performance
