
Four billion messages an hour: benchmarking Deepstream throughput - wolframhempel
https://deepstream.io/info/performance/four-billion-messages-per-hour/
======
jgrahamc
So, roughly 1.1m messages per second of about 23 bytes.

We're handling 4m to 6m 1.5k log lines per second using Apache Kafka on a
cluster of around 100 nodes.

~~~
yelnatz
1.5k log lines is how many bytes? Would've been helpful if you had stuck with
the same unit.

~~~
gberger
I think he means 4 to 6 million log lines with 1.5 kB each.
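
To put the two figures in the same unit, here is a rough back-of-the-envelope sketch. It assumes 1 kB = 1000 bytes and takes the 23-byte and 1.5 kB sizes quoted above at face value; the function name is only illustrative, and protocol overhead is ignored.

```python
def bytes_per_second(messages_per_second: float, bytes_per_message: float) -> float:
    """Raw payload throughput, ignoring any protocol overhead."""
    return messages_per_second * bytes_per_message

# deepstream benchmark: 4 billion messages per hour, about 23 bytes each
deepstream_rate = 4e9 / 3600  # roughly 1.11 million messages per second
deepstream_bytes = bytes_per_second(deepstream_rate, 23)

# Kafka figures from the comment above: 4m to 6m lines per second, 1.5 kB each
kafka_low = bytes_per_second(4e6, 1500)
kafka_high = bytes_per_second(6e6, 1500)

print(f"deepstream: {deepstream_rate / 1e6:.2f}M msg/s, {deepstream_bytes / 1e6:.1f} MB/s")
print(f"kafka: {kafka_low / 1e9:.1f} to {kafka_high / 1e9:.1f} GB/s")
```

Under these assumptions the deepstream benchmark moves roughly 25 MB/s of payload, while the Kafka cluster moves several GB/s.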

------
matt_oriordan
I'm interested to know what happens when there is a failure or a deployment?
Having static servers handling load is only part of the problem in our
experience. The true complexity and scalability of a system comes when you
consider how it copes under load with unexpected failures (network, hardware),
but more importantly expected maintenance such as regular deploys, scaling up
and scaling down events. Do you have any metrics for that? Those are the
problems we've been focussing most of our energy on at Ably
([https://www.ably.io](https://www.ably.io)), not just messages-per-second
rates, which are often not really the problem.

~~~
wolframhempel
yup, failover times under load / HA setup metrics will come out next week

------
teh
t2 instances have credits that are replenished at a constant rate, and used up
when you use the CPU - is this sustainable for more than 1h?

That's ~25MB of data per second for an in-memory workload over 6 machines. I
think they're missing a zero in the size of the messages?

------
gtirloni
What does this compare against? Having a little trouble navigating the "real
time" apps landscape.

~~~
stemuk
I think the most obvious comparison is Firebase, but as a self-hosted solution
deepstream is much cheaper (5-10x, depending on the use case). The creators of
deepstream.io have actually written a blog post that covers this exact topic:
[https://deepstream.io/blog/realtime-framework-overview/](https://deepstream.io/blog/realtime-framework-overview/)

------
0xmohit

      Deepstream relies on garbage collection to free up dereferenced
      memory. If a machine's CPU is overutilized above 100% for a
      consecutive time, garbage collection will be delayed and memory
      can add up. If this continues for a prolonged period, the
      server will run out of memory and eventually crash - so be
      generous enough when it comes to resource allocation to make
      sure that your processors get some breathing space every once
      in a while.
    

"Be generous when it comes to resource allocation".

--

> The costs of running a six-instance cluster for an hour on AWS are 36 cents
> (6 x t2.medium @ 0.052$/h + 1 x cache.t2.medium @ 0.068$/h)

AFAIK, pricing in the AWS world depends upon the region. Bandwidth, hard disks and
so on also contribute to the price.

I'm not sure what to make of such conclusions.
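
As a sanity check on the quoted figure, the hourly sum can be reproduced from the listed rates. This is only a sketch: the rates are copied from the quote above, the constant and function names are made up for illustration, and region, bandwidth and storage are left out, as noted.

```python
from decimal import Decimal

# Hourly rates exactly as given in the quoted sentence (USD per hour)
T2_MEDIUM = Decimal("0.052")
CACHE_T2_MEDIUM = Decimal("0.068")

def cluster_cost_per_hour(instances: int) -> Decimal:
    """Cost of `instances` t2.medium servers plus one cache node, per hour."""
    return instances * T2_MEDIUM + CACHE_T2_MEDIUM

hourly = cluster_cost_per_hour(6)
print(f"${hourly} per hour")
```

With the rates as listed, the sum comes to about 38 cents rather than 36, so the article may have been working from slightly different prices.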

------
bsbechtel
How does Deepstream compare to Meteor?

~~~
wolframhempel
deepstream solves similar problems as meteor, but is conceptually quite
different. It's a standalone server that you install the same way you would
install e.g. Nginx or a database. Clients connect to it using small SDKs that
come in different languages.

It similarly provides data-sync, pub-sub and request response with no opinion
about your frontend framework or technology stack and has an open ecosystem of
connectors that make it work with all sorts of databases, caches and message
buses.

It's also significantly faster than meteor, making it possible to also use it
for multiplayer gaming, realtime trading etc...

~~~
bsbechtel
Isn't Meteor a stand-alone server? It sits on top of node.js, and has a number
of extra packages included to make it more of a complete framework, but those
packages can easily be removed or replaced.

------
siscia
How does deepstream compare to MQTT?

I see a lot of advantages in using a more standard protocol such as MQTT over
deepstream.

~~~
wolframhempel
deepstream is a very different thing from MQTT. It's a server that provides
high level realtime data structures to clients and backends alike.

It's closer to a self-hosted version of Firebase or Parse than to MQTT.

------
AdamMills
How many concurrent connections can you get with the test nodes?

------
stemuk
Is it possible to cluster deepstream in multiple AWS regions? It may be
beneficial when it comes to latency...

~~~
wolframhempel
yup, you can cluster deepstream across multiple regions. The only thing to
take into account is that its datalayer (e.g. redis cache / some combination
of cache and db) needs to be clustered as well

------
lossolo
If you only need one hour then indeed it's a great price, but if you need it
24/7 then it's $260 a month, which is really expensive compared to the
alternatives (you can easily get a dedicated server capable of the same
throughput for half the price).
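
The monthly figure follows from the article's 36 cents per hour. A minimal sketch, assuming a 30-day month and no charges beyond the hourly instance price; the names are illustrative only.

```python
HOURLY_COST_USD = 0.36  # hourly cluster cost quoted in the benchmark article
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30     # assumption: a 30-day month

def monthly_cost(hourly: float, days: int = DAYS_PER_MONTH) -> float:
    """Cost of running the cluster around the clock for `days` days."""
    return hourly * HOURS_PER_DAY * days

print(f"${monthly_cost(HOURLY_COST_USD):.2f} per month")
```

That works out to a little under $260 for a month of continuous operation.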

~~~
oneloop
Tell us, what's this product that you have that's so successful that it needs
4Bn messages/hr full time, that you can't put together $260/mo?

~~~
lossolo
First of all, you are replying to something I didn't write. Or maybe show me
where I wrote that I HAVE a product that can't put together $260/mo?

It's not about $260; it's about tens of thousands of dollars saved by companies
I work with after migrating from AWS.

AWS is overpriced service for people that do not have time or resources to do
things on their own and pay massive prices for that.

~~~
_puk
"AWS is overpriced service for people that do not have time or resources to do
things on their own and pay massive prices for that."

Isn't that the point of AWS?

The time / resources cost to build (monitor and maintain..) your own
infrastructure isn't zero.

Did these companies have spare teams lying around the place, such that no new
hires were needed to make the transition? If not, then it's not so much of a
saving at the current cost of dev / ops personnel, just now on a different
balance sheet.

~~~
xyzzy123
Yep. AWS is funny. You want to do it at first, then it doesn't make sense at,
e.g., the 10k per month price point. So you do something else.

Then you come back when you have due diligence and compliance and availability
requirements. And you see how much that is really costing you with the
arrangements you have.

Finally if you're super successful, you could self host or otherwise. But
"rack costs" (naively) trade off against people and auditors.

Summary: AWS is expensive, especially for bandwidth. Practically if your
business requirements are not bandwidth intensive, it ends up being
paradoxically cheaper than the alternatives if your business involves legal
agreements and things like availability commitments to customers.

