I tried SocketCluster for a highly scalable chat platform and it couldn't handle very big conversations. SocketCluster is good, but the bottleneck here was Node.js.
I finally used Nginx PushStream to handle the subscribe side, while the publish side (including authentication) was handled by Node.js.
It has been used to host chat rooms with more than 70,000 concurrent clients, scaled across 7 m3.large EC2 instances and one relatively small Redis server.
So -- m3.large EC2 instances have a few problems. First, you're only running a 2-core machine (1 worker, and presumably 1 broker to distribute through Redis). That is not nearly enough for a real-time chat server where everyone is spamming it all the time. Another issue is the network performance of an m3.large instance -- it's rated "moderate" (which is another word for not very good).
With 2 c4.8xlarge instances behind ELB, we consistently see between 20K and 50K active connections on live meetings, and the servers sit at < 2% usage. Latency between event cycles is < 0.015ms.
I see a lot of people having problems with this, but the issue is usually the resources they give to it, or the actual handling of socket events. If the socket server is just a relay, there should be no reason a single m4.4xlarge instance can't handle 600K+ connections.
> It has been used to host chat rooms with more than 70,000 concurrent clients, scaled across 7 m3.large EC2 instances and one relatively small Redis server.
Knowing nothing about anything, 70k clients on 7 servers does not sound immediately very impressive. That's roughly 10k clients per server, and this is 15 years after the "C10k problem" was coined. Could someone comment if I'm missing something major here?
From personal experience -- switching from socket.io to socketcluster.io for our conferencing platform was a life saver. It's so much more stable & performant it's crazy. We also use it to run our WebRTC video chat platform, which works great as well. Can't say enough good things about this.
Should say that we horizontally scale this thing pretty heavily using the sc-redis module. Elasticache + ELB + 4 EC2 Instances = support for 5000+ person conferences :D
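For anyone curious what that topology roughly looks like in code, here's a sketch of a SocketCluster master config for this kind of setup. The option names follow the classic SocketCluster boot options; the sc-redis wiring is left as comments because the exact hook differs between versions, and the ElastiCache hostname is a made-up placeholder.

```javascript
// Rough shape of a SocketCluster master config for the setup described
// above (N instances behind ELB, Redis via ElastiCache). The sc-redis
// integration lines are sketched in comments -- an assumption, check
// the sc-redis README for your SC version.
const os = require('os');

const scOptions = {
  port: 8000,
  workers: Math.max(1, os.cpus().length - 1), // leave a core for the broker
  brokers: 1,
  rebootWorkerOnCrash: true
  // brokerEngine: require('sc-redis'),  // hypothetical wiring
  // brokerOptions: { host: 'my-cache.use1.cache.amazonaws.com', port: 6379 }
};

// Booting would look something like:
// const { SocketCluster } = require('socketcluster');
// new SocketCluster(scOptions);

module.exports = scOptions;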
www.talkfusion.com -- specifically, the live meetings & video chat portions use SocketCluster as their backbone. We generally use c4.8xlarge instances to back the socket infrastructure.
Project 1 is submitted and trending, someone thinks "Oh, I have/have seen something similar, let's submit it while people are thinking about the topic?"
This seems very impressive, but I wonder why we would want to switch to it from socket.io in our use case.
We have built an open source platform that takes care of all the usual stuff you need when building realtime social apps: client-side caching down to the pub/sub, message ordering, security, and pushing updates via socket.io back to the client while updating the caches. It is designed to work in a distributed way, so things are partitioned based on the stream you subscribe to. If you aren't online, you can subscribe to get offline notifications delivered to your device or other endpoints (like custom nodes that would act on notifications, IFTTT-style).
So, given that we have the infrastructure -- we use PHP for request handling, MySQL for persistence, and Node for a background service to do socket push and notifications -- what does this offer over socket.io? We implement our own rules for subscribing to streams.
Your system does sound similar to SC -- it seems to be a pretty standard realtime architecture. Those who started building their realtime systems with Socket.io (as you did) often end up with something similar to what you describe, except it takes a lot of work to get there...
A lot of people who use SC decided to make the switch because they started implementing their own pub/sub stack (as you did) and then decided that it would be easier to just use something open source instead of writing their own.
There is a lot more to a realtime stack than just the bidirectional transport.
If you already have a fully working system and you don't need any new features, then you don't necessarily need to migrate to the shiny new tool ;p
I'm curious because I have a relatively new socket.io based application (we went into production in the last month). How difficult is the migration path? (If it's something I can do in an afternoon, I'd be willing to try it out.)
Interesting that these are coming around now. Some services (i.e. Pubnub) have been doing this (for money) for a long time, and what I noticed was that, at the end of the day, the use cases were actually rather limited.
Apart from stock quotes, bitcoin quotes and multiplayer games - polling just isn't that bad. And it's actually a very good place to start when developing as it's simple.
Even if you don't need sub-second updates polling is still very wasteful. Your typical HTTP header is 700-800 bytes, so every 100,000 requests you're sending 75 megabytes of data you don't need to.
Websockets require 2 bytes of framing overhead, so those same 100,000 messages send only 200 kilobytes of additional data. Also, it's worth noting that with Websockets data is only sent when there is new data to send, saving you even more bandwidth.
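The back-of-the-envelope numbers above can be checked in a few lines. This assumes a 750-byte header (the midpoint of the 700-800 byte range) and the 2-byte minimum WebSocket frame header:

```javascript
// Overhead comparison for the figures above: per-request HTTP headers
// for polling vs. per-message frame headers for WebSockets.
function pollingOverheadBytes(requests, headerBytes = 750) {
  return requests * headerBytes;
}

function websocketOverheadBytes(messages, frameHeaderBytes = 2) {
  return messages * frameHeaderBytes;
}

console.log(pollingOverheadBytes(100000));   // 75,000,000 bytes ≈ 75 MB
console.log(websocketOverheadBytes(100000)); // 200,000 bytes = 200 KB
```

And polling pays that 75 MB even when there are no new updates at all, which is where the "wasteful" part really bites.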
Plus, with services like Pubnub & Pusher, websockets are really simple to get started with now as well. I would argue they're actually easier to use than polling.
Polling can be a fine place to start and is better than nothing. The speed of realtime updates does matter though. Even for seemingly boring use-cases like synchronizing CRM screens or something, it's still best if updates happen within a few seconds. And instant (sub-second) updates add an extra level of polish. Similar to non-flickering graphics and smooth scrolling, fast updates are impressive and can increase the appeal of an app.
If anyone opened the console and noticed the socket timing out earlier -- that was because I upgraded to a bigger 4-core Amazon instance and in the process, I accidentally installed a newer version of the SC server (which caused a protocol mismatch with the client version). The load per CPU was around 3% on each core at its peak of 250 concurrent users.
There were occasional spikes to 6% CPU use per core when people started spamming :p
I wonder if socket.io can be mixed with this, using socket.io on the client-server side (for its polling fallback options) while using SocketCluster for the server-to-server scaling side. Also, how is this different from nats.io?
There's always a pedant. :) You're probably right, but I think that ship has sailed. Projects and companies have been using "realtime" to mean "push" or "update a UI without a refresh button" for years.
Our realtime features seem to overlap based on a quick look at their website, except we get distribution for free from the Erlang VM, so you can deploy Phoenix on a cluster and you don't need an intermediate Redis instance or similar for pubsub/IPC across nodes.
The platform is open source and comes with a lot of higher-level features for chat usage: https://github.com/geekuillaume/chatup