
Stream and Go: News feeds for 300M users, built on RocksDB and Raft - tschellenbach
https://stackshare.io/stream/stream-and-go-news-feeds-for-over-300-million-end-users
======
th1nkdifferent
What issues did they run into with Cassandra? It's not easy to build a
scalable database like Cassandra. There are good reasons to write your own
distributed storage system but the authors need to add more details about
Keevo and specific issues they ran into with Cassandra.

~~~
JelteF
Main developer of Keevo at Stream here, and you ask a very good question. I've
asked myself the same question quite a lot, and I think it actually warrants
its own dedicated blog post at some point. To give some idea though, the main
reasons are cost, simplicity, control, and wanting to have a very good
understanding of the database internals.

Cassandra is very scalable, but it's not very efficient. The hosting costs for
our Cassandra cluster were so big that it was infeasible to run it in another
region as well.

Apart from that, we've had (partial) downtime a couple of times because one
node just started going crazy for unclear reasons.

Keevo solves this by not trying to be Cassandra, but by being much simpler. It
doesn't do schemas or indexes. All it is is a very fast ordered key-value
store that is persisted to disk and replicated automatically to multiple
servers (using Raft). Any other features we need, we build on top of this,
usually outside of Keevo itself. This simplicity saves us a lot in hosting
costs and makes its performance much more predictable and easier to debug.
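To make "ordered key-value store" concrete, here is a toy sketch (hypothetical API and key layout, not Keevo's actual code): the one property that matters is that keys can be scanned in sorted order, so a feed read becomes a single range scan over keys like `feed:alice:<timestamp>`:

```go
package main

import (
	"fmt"
	"sort"
)

// orderedKV is a toy in-memory stand-in for an ordered key-value store:
// no schemas, no indexes, just keys kept in sorted order plus range scans.
type orderedKV struct {
	keys []string
	vals map[string][]byte
}

func newOrderedKV() *orderedKV {
	return &orderedKV{vals: map[string][]byte{}}
}

// Set inserts the key at its sorted position (real stores use an LSM
// tree or B-tree; a sorted slice keeps the sketch short).
func (s *orderedKV) Set(k string, v []byte) {
	if _, ok := s.vals[k]; !ok {
		i := sort.SearchStrings(s.keys, k)
		s.keys = append(s.keys, "")
		copy(s.keys[i+1:], s.keys[i:])
		s.keys[i] = k
	}
	s.vals[k] = v
}

// Scan returns all keys in [from, to) in sorted order; a feed read is
// one such range scan over the reader's key prefix.
func (s *orderedKV) Scan(from, to string) []string {
	lo := sort.SearchStrings(s.keys, from)
	hi := sort.SearchStrings(s.keys, to)
	return s.keys[lo:hi]
}

func main() {
	kv := newOrderedKV()
	kv.Set("feed:alice:002", []byte("b"))
	kv.Set("feed:alice:001", []byte("a"))
	kv.Set("feed:bob:001", []byte("c"))
	// ';' sorts just after ':', so this bounds the "feed:alice:" prefix.
	fmt.Println(kv.Scan("feed:alice:", "feed:alice;"))
}
```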

Last but not least, a very important advantage is control over, and
understanding of, the database internals. Because we built Keevo ourselves, we
know the performance and consistency tradeoffs it makes and can change or
improve them when needed.

I hope this helps in understanding our choice. It's definitely not something I
would recommend for most companies, but since our product is storage at its
core it makes sense for us.

~~~
pbowyer
Thanks for the comment, and I look forward to the dedicated blog post :)

Was Riak KV ever considered?

~~~
ddorian43
Wasn't Riak, like, less efficient than Cassandra?

~~~
StreamBright
What do you mean? I think Riak is much simpler, and after tuning it beats
Cassandra on the same hardware for the same use case, in my anecdotal
experience.

------
Dowwie
@tschellenbach: congrats on your achievements!

This is yet another Python success story. You created a viable product with
Python and moved critical parts to Go when there was real need for doing so.

Considering Go has _only_ been your primary language for 9 months, you'll
continue to see more performance gains as the beginner-style codebase is
optimized for all that Go has to offer, albeit probably not as significant as
what you saw between v1 and v2. Good on you for not getting tied down in
optimizations and actually releasing.

"With Python we often found ourselves delegating logic to the database layer
purely for performance reasons." -- you mean you wrote postgres functions...
or... views!? _gasp_! Heresy! Why would you migrate something working perfectly
well in postgres away to Go, though? You probably didn't achieve any
performance gains with that. Were postgres functions really gobbling up that
much memory? Or... maybe my assumptions are wrong here.

~~~
alehander42
I actually think that's not a "Python" or "Go" success story. It's a story of
how both of them lack something significant: Python lacks speed and light
memory usage, Go lacks dev speed and expressiveness. There is no reason why a
single modern language can't combine them, and that would be much better than
managing a codebase in two languages. So it's a success story of a useful
compromise.

~~~
hkeide
The article says they saw improved productivity after switching to Go; how
does that point to Go lacking dev speed or expressiveness?

~~~
alehander42
I am disagreeing with the parent's point. His point is that Python is better
for prototyping, which implies that in the beginning people can be quite a bit
more productive with it.

------
dis-sys
The title contains "RocksDB and Raft", but the article doesn't give any
details on how the Keevo key-value store is designed using RocksDB + Raft. You
cannot figure out how they manage data with Raft (how many Raft groups are in
the system? do they split a group when it reaches a certain size?), and there
is no mention of the design choice of not using etcd's Raft, which is the most
popular one in Go. They didn't mention how they test the system: Jepsen, or
in-house monkey testing? They do have a Keevo link in the article, but it says
"Sorry, we couldn't find that page".

On the Go side, since they mentioned that RocksDB is written in C++, I am
interested to know about the cgo issues they might have had when using Go and
RocksDB together, but of course they didn't mention that. Same goes for GC
stop-the-world pauses.

I strongly believe that articles like this, largely used as soft ad material
without much detailed technical meat, should be banned from Hacker News. If it
is all about "we built system X using building blocks Y and Z in language T!",
what is the real value for readers?

~~~
tbarbugli
We currently use the Raft Go implementation from Hashicorp (the article links
to the GitHub project). We found Hashicorp's implementation mature enough and
easier to work with than the one from etcd.

Using RocksDB with Go indeed has its own challenges, because cgo calls are far
from cheap and have important consequences for Go applications. For
performance reasons, we ended up moving some logic from Go to C in a few hot
spots (e.g. moving the retrieval of large numbers of keys from Go to C).

Disclaimer: I work for Stream and I am one of the two co-founders.

~~~
StreamBright
Could you elaborate on Keevo? Could you compare it to Riak or Cassandra for
example? Do you have any plans to open source it?

------
siscia
I envy you guys for building such a great technical product and being able to
sell it.

I would also love to know the "business history" of the product: how you
started, the first sale, the marketing approach, and how you moved on from
there...

------
adventured
Stream always looks interesting, until I hit their pricing page, where I see
this:

Starter, $59/month, 5 million updates

Growth, $269/month, 9 million updates

Wait, what? To increase the updates by 80%, the price jumps by 356%.

I understand they're throwing in higher processing and custom ranking. That's
still approximately the most absurd pricing variance from one plan to the next
on a scaling service that I've ever seen.

~~~
tschellenbach
The difference is large because we have a free tier of 3 million feed updates.
So in practice the first plan is paying for 2 million and the second for 6
million. The free and small plans make it easier for developers and smaller
startups to start using Stream. I see that you're a founder; want to send me
some examples of pricing that you like? (My email is in my profile.)

~~~
adventured
You do realize that what you actually just said is: the difference is large
because your valued customers in the second tier, with the gigantic 356% price
jump, are paying for your perhaps overly generous free plan.

What you maybe should have said (speaking from a sales perspective) is: the
reason there's a 356% jump in price is because of an incredible value
proposition step up of 1,000% in what you're getting!

I think that price gap is a classic, and very large, opening for a competitor.
There's a ~$89-$129 style plan missing in there, that is a logical step up
from $59. That's obviously just my opinion. I also happen to be a big fan of
charging more rather than less, in the service space; that under-charging is a
common mistake young companies make. And I still can't make sense of $59 to
$269 as a step.

I launched a new business in the last quarter. It has a user feed system that
is fairly standard. I've built several feed-like features in the last decade.
I looked at Stream a while back, having run across it while trying to avoid
building my own again. The good news is, I like your product, it got my
attention. That price jump between plans, for a very modest increase in
service volume, made it a non-starter. It's too easy and cheap to build my
own, which then gives me tight control over its evolution over time as well.
It's basically impossible at your price points to compete with what I can just
do for myself relatively quickly; that $269 product can be run on a low-cost
Digital Ocean droplet in terms of its actual resource demands. You're charging
primarily for the quality of your software product, and some support, which
makes perfect sense. The problem is that I can build that and scale it to the
$899 plan and save myself $10k a year. If you could do 10 million updates at
~$50-$75/month, with normal plan stepping from there, it'd be interesting.
250k to 350k updates per day is not an immense sum; just 20,000 daily active
users can saturate that pretty easily.

~~~
JelteF
Engineer at Stream here, and I'm really not the guy to talk to about pricing.

However, I can tell you that the features of ours that are hardest to
implement in a scalable way are aggregated feeds (e.g. one message saying "tom
watched 10 videos") and custom ranking (reddit/HN-style ordering with time
decay and points). If your feed system doesn't have either of these features,
it might indeed be possible to do it cheaper than our more expensive plans.
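For context on what "time decay and points" means, here is the commonly cited approximation of Hacker News's ranking formula (not Stream's actual algorithm): score decays with age, so newer items with fewer points can outrank older items with more.

```go
package main

import (
	"fmt"
	"math"
)

// rank implements the widely cited approximation of the HN formula:
// score = (points - 1) / (ageHours + 2)^gravity, with gravity ~1.8.
func rank(points, ageHours float64) float64 {
	const gravity = 1.8
	return (points - 1) / math.Pow(ageHours+2, gravity)
}

func main() {
	fmt.Println(rank(100, 24) > rank(100, 48)) // same points, newer wins
	fmt.Println(rank(10, 1) > rank(200, 48))   // fresh item beats a stale one
}
```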

However, keep in mind that a lot of the cost is in the availability and
reliability of our service. Basically, for every type of server we run
(including databases), we need to run at least two to be able to stay online
in case one dies. Finally, building and maintaining a feed system is not cheap
in terms of developer cost (try finding a dev for $10k a year).

------
jnordwick
Let's do some math and divide their metrics by their number of servers:

- 6.3 million feed updates per day per server

- 111 API requests per minute per server

If you guess 4 cores per server, it sounds even worse.
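Converting those per-server figures to per-second rates (simple arithmetic on the numbers above) shows why they look low:

```go
package main

import "fmt"

func main() {
	// Per-server figures from the comment above, as per-second rates.
	updatesPerDay := 6_300_000.0
	apiPerMinute := 111.0
	fmt.Printf("%.1f feed updates/sec per server\n", updatesPerDay/86400)
	fmt.Printf("%.2f API requests/sec per server\n", apiPerMinute/60)
}
```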

(Ps, will HN ever support better formatting?)

~~~
latch
They measured their response time using an average. Collecting metrics to look
good.

Since we can only speak in broad terms, those requests/updates aren't going to
be evenly distributed throughout the day and they might not be consistent from
day to day so some over-provisioning is reasonable.

Since they're on EC2, a) they shouldn't be _that_ over-provisioned, and b)
they are getting comparatively (to bare metal) awful performance (say 2x-8x
depending on exactly what they're doing) and paying for the privilege.

I'd like to better understand why "Fanout-on-write" wasn't their go-to
solution. All they said was that it was "expensive".

~~~
xstartup
How does fan out write help?

~~~
latch
Lets you shard and makes your reads dead simple (1 table, 1 predicate, 1 sort)
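A toy sketch of fanout-on-write (hypothetical names, in-memory only): the write copies the activity into every follower's materialized feed, so a read just returns the reader's own pre-built list, which is what makes it trivial to shard by reader.

```go
package main

import "fmt"

type activity struct {
	Actor, Verb string
}

// feedStore: the write pays the cost (one copy per follower),
// so the read is trivially cheap and easy to shard by reader.
type feedStore struct {
	followers map[string][]string   // actor -> followers
	feeds     map[string][]activity // reader -> materialized feed
}

// publish fans the activity out to every follower's feed at write time.
func (s *feedStore) publish(a activity) {
	for _, f := range s.followers[a.Actor] {
		s.feeds[f] = append(s.feeds[f], a)
	}
}

// read is "1 table, 1 predicate": just the reader's own list.
func (s *feedStore) read(user string) []activity {
	return s.feeds[user]
}

func main() {
	s := &feedStore{
		followers: map[string][]string{"alice": {"bob", "carol"}},
		feeds:     map[string][]activity{},
	}
	s.publish(activity{"alice", "posted a photo"})
	fmt.Println(len(s.read("bob")), len(s.read("carol"))) // one copy each
}
```

The tradeoff "expensive" refers to is write amplification: a user with a million followers turns one post into a million writes.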

------
NetOpWibby
Looks like you guys have gotten some surly comments from people here but I
enjoyed reading this. It was just enough information to get me interested in
researching some things for my own project.

------
dio
Interesting, what do you think about rocksplicator
([https://medium.com/@Pinterest_Engineering/open-sourcing-rocksplicator-a-real-time-rocksdb-data-replicator-558cd3847a9d](https://medium.com/@Pinterest_Engineering/open-sourcing-rocksplicator-a-real-time-rocksdb-data-replicator-558cd3847a9d))?
And also pika
([https://github.com/Qihoo360/pika](https://github.com/Qihoo360/pika))?

------
dahx4Eev
The Stream Framework[0] is still in Python with Cassandra. The Go version with
RocksDB is not open source.

[0]: [https://github.com/tschellenbach/Stream-Framework](https://github.com/tschellenbach/Stream-Framework)

~~~
aocvr
This blog post does raise the question of what’s going to happen to Stream
Framework, now that the hosted version is diverging so clearly...

------
noahdesu
I'd be interested in reading, even if very high-level, some additional details
about Keevo. I'm most interested in what the API exposed by Keevo looks like,
how it's utilized, and what parts of it are nice/lacking.

------
pritambarhate
The Jaeger logo is indeed awesome!
([https://github.com/jaegertracing/jaeger](https://github.com/jaegertracing/jaeger))

------
yakitori
Sometimes it's so hard for me to determine whether half of the stuff on HN is
real news or just advertisement.

A news feed for 300M is child's play when you consider that the content is
pretty much static.

"Go and RocksDB and Raft" hit all the keywords but what makes this post
relevant for HN?

------
jnordwick
Yeah! More product ads masquerading as technical blog posts. Even submitted by
the CEO.

