The Wall Every Real-Time App Hits
When I first started building a chat app API, everything worked perfectly on my local machine: one server, a few dozen connections, zero latency. But the moment I deployed to a production cluster behind a load balancer, everything broke. User A on Server 1 couldn’t send a message to User B on Server 2. Why? Because WebSockets are stateful: each connection lives in the memory of one specific server.
To fix this, you need a way to broadcast messages across all your server instances. This is where scaling WebSockets with the Redis adapter comes in. By using Redis as a message broker (Pub/Sub), you ensure that an event emitted on one node is received by every other node in your cluster.
Prerequisites
- Node.js installed (v18+ recommended)
- A running Redis instance (Local, Docker, or Managed like Redis Cloud)
- Basic familiarity with horizontal scaling in Node.js
- Socket.io installed in your project
Step 1: Install the Redis Adapter
First, you need to install the official Socket.io Redis adapter and the redis client. In my experience, using the @socket.io/redis-adapter package is the most stable route.
npm install @socket.io/redis-adapter redis
Step 2: Configure the Redis Client
You need two Redis clients: one for publishing messages and one for subscribing to them. This is a critical detail; you cannot use a single connection for both because once a client enters ‘subscriber’ mode, it can no longer issue regular commands.
const { Server } = require('socket.io');
const { createClient } = require('redis');
const { createAdapter } = require('@socket.io/redis-adapter');

const io = new Server(3000);

// One client publishes, a duplicate of it subscribes. Never share a single
// connection: a subscribed client cannot issue regular commands.
const pubClient = createClient({ url: 'redis://localhost:6379' });
const subClient = pubClient.duplicate();

async function setupRedis() {
  // Both clients must be connected BEFORE the adapter is attached.
  await Promise.all([pubClient.connect(), subClient.connect()]);
  io.adapter(createAdapter(pubClient, subClient));
  console.log('Redis adapter connected successfully');
}

setupRedis().catch(console.error);
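With the adapter in place, your event handlers don’t change at all — io.emit() fans out through Redis automatically. Below is a sketch of a handler shaped so it can be unit-tested in isolation; the event name chat:message and the registerChatHandlers helper are my own conventions, not part of Socket.io:

```javascript
// Register chat handlers for a newly connected socket.
// With the Redis adapter installed, io.emit() reaches clients on EVERY node
// in the cluster, not just the node this socket happens to be connected to.
function registerChatHandlers(io, socket) {
  socket.on('chat:message', (payload) => {
    // Broadcast cluster-wide; the adapter publishes this via Redis Pub/Sub.
    io.emit('chat:message', { from: socket.id, text: payload.text });
  });
}

// Typical wiring:
// io.on('connection', (socket) => registerChatHandlers(io, socket));
module.exports = { registerChatHandlers };
```

Keeping handlers as plain functions like this also makes them trivial to test with mock io and socket objects.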
Step 3: Handling Sticky Sessions
If you are using a load balancer (like Nginx or AWS ALB), you’ll likely see HTTP 400 Bad Request errors during the WebSocket handshake. This happens because Socket.io starts with HTTP long-polling before upgrading to WebSockets. The client must hit the same server for the entire handshake process.
The Redis adapter handles message distribution between nodes, but the initial connection requires ‘Sticky Sessions’ (Session Affinity). In Nginx, you can achieve this with the ip_hash directive:
upstream socket_nodes {
    ip_hash;
    server server1.example.com;
    server server2.example.com;
}
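Note that ip_hash only pins a client to a node; the proxy location still needs the WebSocket upgrade headers, or connections will never leave long-polling. A typical location block looks roughly like this (server names and paths are placeholders):

```nginx
server {
    listen 80;

    location /socket.io/ {
        proxy_pass http://socket_nodes;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```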
Step 4: Testing the Cluster
To verify that the setup is working, I recommend spinning up two separate Node processes on different ports (e.g., 3000 and 3001) and connecting a browser tab to each. When you emit a message from Tab A (Server 3000), Tab B (Server 3001) should receive it almost instantly.
Pro Tips for Production
- Use Redis Sentinel or Cluster: For high availability, don’t rely on a single Redis node. If Redis goes down, your cross-server communication dies.
- Monitor Memory: Redis Pub/Sub is fast, but if you have millions of messages per second, monitor your Redis memory usage closely.
- Namespace Optimization: Only use the adapter for namespaces that actually need to scale.
Troubleshooting Common Issues
Client disconnecting repeatedly
This is almost always a missing sticky session configuration. Check your load balancer logs for 400 errors during the polling phase.
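A quick way to spot these failures is to grep the access log for 400 responses on the polling endpoint. The log path and format below are assumptions for illustration — the snippet writes a sample line to a temp file so you can see the pattern match; against a real deployment you would point grep at your actual Nginx access log:

```shell
# Write a sample access-log line (status 400 on the polling transport) to a
# temp file -- in production, grep your real access log instead.
echo '10.0.0.5 - - [01/Jan/2024:12:00:00 +0000] "GET /socket.io/?EIO=4&transport=polling HTTP/1.1" 400 0' > /tmp/sample_access.log

# Count failed handshakes: 400 responses on the Socket.IO polling endpoint
grep -c 'transport=polling.*" 400 ' /tmp/sample_access.log
```

If that count climbs while WebSocket upgrades never appear, sticky sessions are the first thing to check.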
Messages not reaching other servers
Ensure both your pubClient and subClient are fully connected before io.adapter() is called. I’ve seen many developers call it before the await pubClient.connect() resolves.
What’s Next?
Now that you’ve covered the basics of scaling WebSockets with the Redis adapter, you might want to explore more advanced patterns. I suggest looking into Redis Streams if you need message persistence (so users don’t miss messages while offline), or diving deeper into the Node.js cluster module to maximize single-machine performance.