
Show HN: SocketCluster.io – flexible open-source real-time server with pub/sub - jondubois
http://socketcluster.io/
======
geekuillaume
I tried SocketCluster for a highly scalable chat platform and it couldn't
handle very big conversations. SocketCluster is good but the bottleneck here
was NodeJS. I finally used Nginx PushStream to do the sub and the pub was
handled by NodeJS (with the authentication). It has been used to host chat
room with more than 70,000 concurrent clients, scaled on 7 m3.large EC2
instances and one relatively small Redis server.

The platform is open source and comes with a lot of higher level features for
chat usage:
[https://github.com/geekuillaume/chatup](https://github.com/geekuillaume/chatup)

~~~
sudovoodoo
So -- m3.large EC2 instances have a few problems. First, you're only
running a 2-core machine (1 worker, and assuming 1 broker to distribute
through Redis). That is not nearly enough to handle a real-time chat
server where everyone is spamming it all the time. Another issue is the
network performance of an m3.large instance -- it's moderate (which is another
word for not very good).

With 2 c4.8xlarge instances behind ELB, we consistently see between 20 to 50K
active connections on live meetings, and the servers sit at < 2% usage.
Latency between event cycles is < 0.015ms.

I see a lot of people having problems with this, but the issue is usually the
resources they give to it, or the actual handling of socket events. If the
socket server is just a relay there should be no reason a single m4.4xlarge
instance can't handle 600K+ connections.

www.jayway.com/2015/04/13/600k-concurrent-websocket-connections-on-aws-using-node-js/

~~~
arrty88
How does nodejs stand up against Java netty NIO for the web socket use case?

------
sudovoodoo
From personal experience -- switching from socket.io to socketcluster.io for
our conferencing platform was a life saver. It's so much more stable and
performant, it's crazy. We also use it to run our WebRTC video chat platform,
which works great as well. Can't say enough good things about this.

~~~
sudovoodoo
Should say that we horizontally scale this thing pretty heavily using the
sc-redis module. Elasticache + ELB + 4 EC2 Instances = support for 5000+ person
conferences :D

~~~
z3t4
This sounds interesting. Do you have a web-site? Open source?

~~~
sudovoodoo
www.talkfusion.com -- specifically, the live meetings & video chat portions use
SocketCluster as their backend. We generally use c4.8xlarge instances to back
the socket infrastructure.

------
0xADADA
How is this different than this?
[https://news.ycombinator.com/item?id=11071916](https://news.ycombinator.com/item?id=11071916)

~~~
nkrisc
I was wondering the same. What are the odds of two very similar projects with
nearly identical headlines appearing on HN at nearly the same time?

~~~
detaro
Project 1 is submitted and trending, someone thinks "Oh, I have/have seen
something similar, let's submit it while people are thinking about the topic?"

------
EGreg
This seems very impressive, but I wonder why we would want to switch to it
from socket.io in our use case.

We have built an open source platform that takes care of all the usual stuff
you need when building realtime social apps. From the client side caching down
to the pubsub, message ordering, security, and pushing updates via socket.io
back to the client and updating the caches. It is designed to work in a
distributed way so things are partitioned based on the stream you subscribe
to. If you aren't online you can subscribe to get offline notifications
delivered to your device or other endpoints (like custom nodes that would act
on notifications like IFTTT).
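
To illustrate the "partitioned based on the stream you subscribe to" idea, here is a minimal sketch of one common way to do it: hash the stream name and use the hash to pick a node. This is an assumption for illustration only, not the platform's actual scheme; the node names, stream names, and the `node_for_stream` helper are all hypothetical.

```python
import hashlib


def node_for_stream(stream_name: str, nodes: list[str]) -> str:
    """Pick the node responsible for a stream by hashing its name.

    Hypothetical helper: the same stream name always maps to the same
    node as long as the node list does not change.
    """
    digest = hashlib.sha256(stream_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical node and stream names, for illustration only.
NODES = ["node-a", "node-b", "node-c"]

for stream in ["user/42/feed", "chat/lobby"]:
    print(stream, "->", node_for_stream(stream, NODES))
```

Plain modulo hashing reshuffles most streams when a node is added or removed; consistent hashing is the usual refinement if that matters.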

So, given that we have the infrastructure - we use PHP for request handling,
MySQL for persistence, Node for background service to do socket push and
notifications... what does this offer over socket.io? We implement our own
rules for subscribing to streams.

~~~
jondubois
I'm the main author of SC.

Your system does sound similar to SC - It seems to be a pretty standard
realtime architecture - Those who started building their realtime systems with
Socket.io (as you did) often end up with something similar to what you
describe except it takes a lot of work to get there...

A lot of people who use SC decided to make the switch because they started
implementing their own pub/sub stack (as you did) and then decided that it
would be easier to just use something open source instead of writing their
own.

There is a lot more to a realtime stack than just the bidirectional transport.
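
As a rough picture of the smallest piece that sits on top of the transport, here is an in-process pub/sub sketch. It is not SocketCluster's API; the `PubSub` class and channel names are hypothetical, and a real stack would add authentication, cross-process brokering, and reconnection handling.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the channel name and the published message.
Handler = Callable[[str, object], None]


class PubSub:
    """Hypothetical in-process channel registry (illustration only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._subscribers[channel].append(handler)

    def unsubscribe(self, channel: str, handler: Handler) -> None:
        if handler in self._subscribers.get(channel, []):
            self._subscribers[channel].remove(handler)

    def publish(self, channel: str, message: object) -> int:
        """Deliver message to every subscriber; return how many received it."""
        handlers = list(self._subscribers.get(channel, []))
        for handler in handlers:
            handler(channel, message)
        return len(handlers)


received = []
bus = PubSub()
bus.subscribe("chat/lobby", lambda ch, msg: received.append((ch, msg)))

delivered = bus.publish("chat/lobby", "hello")
missed = bus.publish("chat/other", "nobody listening")
print(delivered, missed, received)
```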

If you already have a fully working system and you don't need any new
features, then you don't necessarily need to migrate to the shiny new tool ;p

~~~
jadbox
Does SC require the need for a DB?

~~~
jondubois
No but there is an optional CRUD module for RethinkDB
[https://github.com/socketcluster/sc-crud-rethink](https://github.com/socketcluster/sc-crud-rethink)

------
jontammeh
Interesting these are coming around now. Some services (e.g. Pubnub) have been
doing this (for money) for a long time and what I noticed was that, at the end
of the day, the use cases were actually rather limited.

Apart from stock quotes, bitcoin quotes and multiplayer games - polling just
isn't that bad. And it's actually a very good place to start when developing
as it's simple.

~~~
aaronbasssett
Even if you don't need sub-second updates polling is still very wasteful. Your
typical HTTP header is 700-800 bytes, so every 100,000 requests you're sending
75 megabytes of data you don't need to.

Websockets require 2 bytes, so those same 100,000 requests send only 200
kilobytes of additional data. Also it's worth noting that with Websockets data
is only sent when there is new data to send. Saving you even more bandwidth.
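
The arithmetic above can be checked with a short sketch. It assumes 750 bytes per HTTP request (the midpoint of the 700-800 range) and the 2-byte WebSocket frame header quoted above; the 2-byte figure is the minimum for small server-to-client frames, and client-to-server frames carry an extra 4-byte masking key. The names are hypothetical.

```python
def overhead_bytes(messages: int, per_message_overhead: int) -> int:
    """Total header/framing overhead in bytes for a number of messages."""
    return messages * per_message_overhead


REQUESTS = 100_000
HTTP_HEADER_BYTES = 750  # assumed midpoint of the 700-800 byte range
WS_FRAME_BYTES = 2       # minimum frame header, as quoted above

polling_total = overhead_bytes(REQUESTS, HTTP_HEADER_BYTES)
websocket_total = overhead_bytes(REQUESTS, WS_FRAME_BYTES)

print(f"polling:   {polling_total / 1_000_000:.0f} MB")  # 75 MB
print(f"websocket: {websocket_total / 1_000:.0f} kB")    # 200 kB
```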

Plus with services like Pubnub & Pusher, websockets are really simple to get
started with now as well. I would argue they're actually easier to use than
polling.

~~~
illamint
Not to mention things like logging and metrics overhead if you're logging
every request at multiple levels (application, NGINX, HAProxy).

------
zevrox
I wonder if socket.io can be mixed with this for use with client-server side
(for its polling fallback options) while using SocketCluster for server-server
scaling side. Also, how is this different from nats.io?

------
jondubois
If anyone opened the console and noticed the socket timing out earlier - That
was because I upgraded to a bigger 4-CPU core Amazon instance and in the
process, I accidentally installed a newer version of the SC server (which
caused a protocol mismatch with the client version). The load per CPU was
around 3% on each core at its peak of 250 concurrent users.

There were occasional spikes to 6% CPU use per core when people started
spamming :p

------
solipsism
Am I the only one who thinks we need a better name for this kind of thing than
_realtime server_ ?

~~~
jkarneges
There's always a pedant. :) You're probably right, but I think that ship has
sailed. Projects and companies have been using "realtime" to mean "push" or
"update a UI without a refresh button" for years.

~~~
solipsism
I admit naming things is hard. I'm _very_ often pedantic! But I don't think
it's pedantic to say, "Hey, guys, this name we're using is kind of shitty."

------
dfischer
Do Phoenix / Elixir channels compare well with this?

~~~
chrismccord
Our realtime features seem to overlap based on a quick look at their website,
except we get distribution for free from the Erlang VM so you can deploy
Phoenix on a cluster and you don't need an intermediate Redis instance or
similar for pubsub/IPC across nodes.

------
vishalzone2002
What's with the sudden need for "flexible open-source realtime server with pubsub"?
Referring to
[https://news.ycombinator.com/item?id=11071916](https://news.ycombinator.com/item?id=11071916)

------
jzig
Does this support Elasticsearch?

~~~
jkarneges
Not a direct answer to your question but you might also check out
[https://appbase.io/](https://appbase.io/) which is Elasticsearch compatible.

