Engineers sometimes frame WebSocket scaling as "can runtime X hold N connections?"
That is too narrow.
The harder question is usually:
"Can this system manage long-lived connection state, selective fanout, and slow clients without collapsing?"
The Real Costs
A WebSocket system has to deal with:
- persistent connection state
- authentication and reconnect handling
- fanout to many subscribers
- backpressure from slow consumers
- cross-node coordination if you scale horizontally
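The first three items above can be sketched as a minimal topic registry. This is an illustrative sketch, not a real library API; the names `Conn` and `Registry` are invented for the example.

```typescript
// Minimal sketch of per-connection state and topic fanout.
// All names here are illustrative, not from any real library.
type Conn = { id: string; send: (msg: string) => void };

class Registry {
  private topics = new Map<string, Set<Conn>>();

  subscribe(topic: string, conn: Conn): void {
    let subs = this.topics.get(topic);
    if (!subs) {
      subs = new Set();
      this.topics.set(topic, subs);
    }
    subs.add(conn);
  }

  unsubscribe(topic: string, conn: Conn): void {
    this.topics.get(topic)?.delete(conn);
  }

  // Fanout cost is proportional to the subscriber count of the topic,
  // not to the total connection count. Returns how many sends happened.
  publish(topic: string, msg: string): number {
    const subs = this.topics.get(topic);
    if (!subs) return 0;
    for (const conn of subs) conn.send(msg);
    return subs.size;
  }
}
```

Even this toy version makes the cost model visible: `publish` does work proportional to subscribers per topic, which is why topic partitioning shows up later as a design rule.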
Connection count matters. Fanout shape matters more.
Ten thousand mostly idle connections are a different problem from ten thousand subscribers receiving frequent broadcasts.
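The contrast becomes concrete with rough arithmetic. All numbers below are illustrative assumptions, not benchmarks:

```typescript
// Back-of-the-envelope fanout cost. Every number here is an assumption
// chosen for round arithmetic, not a measured workload.
const subscribers = 10_000;
const messagesPerSecond = 10;  // broadcasts per topic per second
const bytesPerMessage = 1_000; // ~1 KB payload

// Mostly idle connections cost roughly heartbeats and memory.
// Broadcast subscribers cost subscribers * rate * size in egress, every second.
const egressBytesPerSecond = subscribers * messagesPerSecond * bytesPerMessage;
// 10,000 * 10 * 1,000 = 100,000,000 bytes/s, i.e. ~100 MB/s of sustained egress
```

Same connection count, wildly different system: the idle case is a bookkeeping problem, while the broadcast case is a sustained throughput problem.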
Why Node Often Struggles
Node can absolutely power WebSocket systems.
The pressure starts when the application does too much work per message on the same event loop that is responsible for keeping sockets healthy. Once broadcast loops, serialization, and per-client bookkeeping pile up, tail latency degrades quickly and visibly.
That is not an indictment of Node. It is a reminder that runtime model and workload shape must match.
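One way the per-message work piles up is serializing the same payload once per client. A minimal sketch of paying that cost once per broadcast instead (the `serialize` wrapper and counter exist only to make the difference observable; real code would call `JSON.stringify` directly):

```typescript
// Sketch: the naive loop serializes once per client; serializing once per
// message removes O(subscribers) stringify calls from the event loop.
type Client = { send: (frame: string) => void };

// Counter is for illustration only, so the two strategies can be compared.
let stringifyCalls = 0;
function serialize(payload: object): string {
  stringifyCalls++;
  return JSON.stringify(payload);
}

// Naive: serialization work repeated for every client, on the event loop.
function broadcastNaive(clients: Client[], payload: object): void {
  for (const c of clients) c.send(serialize(payload));
}

// Better: pay the serialization cost once, then reuse the frame.
function broadcastShared(clients: Client[], payload: object): void {
  const frame = serialize(payload);
  for (const c of clients) c.send(frame);
}
```

For very large subscriber sets, the send loop itself can also be split into chunks (e.g. via `setImmediate`) so heartbeats and other I/O get scheduled between batches.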
Why Elixir/Phoenix Gets Mentioned So Often
Elixir and the BEAM ecosystem are popular in this space because they are designed around large numbers of lightweight processes and message passing. That can make connection-oriented systems simpler to reason about operationally.
The real advantage is not "Elixir can do a million sockets" as a slogan. It is that supervision, isolation, and concurrency are part of the model rather than something bolted on around it.
The Design Rule That Matters
If you need large-scale realtime delivery, focus on:
- keeping per-connection state small
- partitioning rooms or topics cleanly
- applying backpressure
- measuring fanout cost
- choosing a runtime that fits the workload
Sometimes Node is good enough.
Sometimes Go or Elixir is a better fit.
The serious engineering question is not which language wins internet arguments. It is which runtime makes the failure modes manageable for your traffic pattern.
Further Reading