
Finding Kafka’s throughput limit in Dropbox infrastructure - redm
https://blogs.dropbox.com/tech/2019/01/finding-kafkas-throughput-limit-in-dropbox-infrastructure/
======
anotherhue
50 MB/s is tiny; you can push 500 MB/s on commodity disks.

[https://www.slideshare.net/ConfluentInc/kafka-on-zfs-better-...](https://www.slideshare.net/ConfluentInc/kafka-on-zfs-better-living-through-filesystems?ref=https://www.confluent.io/kafka-summit-sf18/kafka-on-zfs)

Full talk: [https://www.confluent.io/kafka-summit-sf18/kafka-on-zfs](https://www.confluent.io/kafka-summit-sf18/kafka-on-zfs) (shameless plug)

~~~
vvanders
Yeah, I've got a Ceph cluster in my homelab that tops out at about 80 MB/s,
and it's pretty darn painful compared to a 2- or 3-disk mirrored ZFS pool on
spinning rust.

~~~
polskibus
Are there any good ceph alternatives? Why can't you use zfs directly?

~~~
vvanders
Yeah, there are a few, but I haven't had a chance to give them a spin yet. I
was mostly using Ceph to explore live migrations in Proxmox, although they
just released support for ZFS-based migrations (you have to migrate the whole
local disk, though, so they aren't nearly as snappy).

There are some pros to Ceph, mostly that you can add/remove drives at will
and the cluster will automatically rebalance. It's just very, very
bandwidth-hungry for cross-cluster communication.

It doesn't matter that much since it's for a homelab; I could move to 10GbE
links between the cluster nodes at some point and it'll get snappier. ZFS is
just such a solid, battle-tested piece of tech that it's hard to beat unless
you have a very different set of constraints.

------
pwaivers
The overload indicators are strange to me. For example, one indicator is "IO
thread idle below 20%: this means the pool of worker threads used by kafka for
handling client requests are too busy to handle any more workload". How did
they determine this? Is it true?

I think a much better way would be to monitor the JMX reporter for dropped
messages or high latency:
[https://docs.confluent.io/current/kafka/monitoring.html](https://docs.confluent.io/current/kafka/monitoring.html).

~~~
pram
There are JMX beans for the socket server and request handler idle. If they
even dip below 80% you’re probably looking at poorly configured producers
timing out.
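
For anyone who wants to sanity-check these without a full metrics pipeline, a
minimal sketch below polls the two idle-percent beans over a remote JMX
connection. The host and port are placeholders and it assumes the broker was
started with JMX enabled (e.g. JMX_PORT=9999); the MBean names are the
standard documented Kafka ones.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class KafkaIdleCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder host/port; assumes the broker exposes JMX on 9999.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi");
        try (JMXConnector conn = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbsc = conn.getMBeanServerConnection();

            // Request handler ("IO thread") idle fraction, 0.0-1.0 (a Meter)
            double ioIdle = (Double) mbsc.getAttribute(new ObjectName(
                    "kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdlePercent"),
                    "OneMinuteRate");

            // Network processor (socket server) idle fraction, 0.0-1.0 (a Gauge)
            double netIdle = (Double) mbsc.getAttribute(new ObjectName(
                    "kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent"),
                    "Value");

            System.out.printf("io idle: %.0f%%, net idle: %.0f%%%n",
                    ioIdle * 100, netIdle * 100);

            // The 20% threshold is the article's; 80% is the one above.
            if (ioIdle < 0.2 || netIdle < 0.8) {
                System.out.println("WARNING: broker is running hot");
            }
        }
    }
}
```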

------
pandemic_region
Line is doing 150 billion msgs/day on their Kafka infra, about 3 million per
second. [https://www.slideshare.net/linecorp/building-a-companywide-d...](https://www.slideshare.net/linecorp/building-a-companywide-data-pipeline-on-apache-kafka-engineering-for-150-billion-messages-per-day)

~~~
buro9
At Cloudflare a year ago we said we were doing a 6M-per-second ingest rate on
a cluster of 106 brokers with a 3x replication factor and 106 partitions:
[https://blog.cloudflare.com/http-analytics-for-6m-requests-p...](https://blog.cloudflare.com/http-analytics-for-6m-requests-per-second-using-clickhouse/)

The more interesting blog post IMHO is this one on Kafka compression:
[https://blog.cloudflare.com/squeezing-the-firehose/](https://blog.cloudflare.com/squeezing-the-firehose/)

~~~
Rebelgecko
The compression article was really useful for me when I was trying to squeeze
a bit more performance out of our brokers at work. Any plans to revisit the
article now that Kafka has zstd support?

~~~
nosequel
They talk about zstd in the post and they were the ones who added support to
Sarama.
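
For anyone wanting to try it: since Kafka 2.1 (brokers and clients both),
switching a producer to zstd is a one-line config change. A minimal sketch
with the stock Java client; the broker address and topic are placeholders,
and the batch/linger values are illustrative rather than tuned.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ZstdProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-host:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // zstd needs Kafka 2.1+ on both the client and the broker
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "zstd");

        // Compression happens per batch, so larger batches compress better
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 128 * 1024);
        props.put(ProducerConfig.LINGER_MS_CONFIG, 50);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("logs", "key", "value")); // placeholder topic
        }
    }
}
```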

------
shanemhansen
I regularly do 200MB/s per node with NSQ (I know, I know, apples to oranges
but people should be aware there are choices other than apples depending on
your use case).

------
Rafuino
60 MB/s per broker at what latency though? Does latency even factor into the
performance consideration?

~~~
halbritt
For some use cases, it absolutely does. I can't remember the exact use case,
but a friend of mine is adopting Kafka with an end-to-end latency goal of
100ms. It absolutely requires SSDs and for the producers and consumers to be
colocated, but it's otherwise not that difficult to achieve.
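
Most of the latency-sensitive knobs live on the producer. A hedged sketch of
a latency-first configuration (broker address and topic are placeholders,
and the acks trade-off depends on your durability needs):

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class LowLatencyProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-host:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());

        // Ship each record immediately instead of waiting to fill a batch
        props.put(ProducerConfig.LINGER_MS_CONFIG, 0);

        // acks=1: only the partition leader must ack (lower latency, weaker durability)
        props.put(ProducerConfig.ACKS_CONFIG, "1");

        // Skip compression to keep CPU work off the hot path
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "none");

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("events", "ping".getBytes())); // placeholder topic
        }
    }
}
```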

------
hsong
It would be great to see an analysis of where the bottleneck is. Is it disk,
network, compression, etc.?

------
Zecar
Dropbox deleted my honeymoon photos

------
deanCommie
"At Dropbox, Kafka clusters are managed by the Jetstream team, whose primary
responsibility is to provide high quality Kafka services. "

And this is why AWS's "Managed Kafka" service has a place.

I get why Dropbox moved off S3 - storage is their core competency and they
thought they could/should do better/cheaper. But I'm surprised they would be
wasting valuable dev time with this kind of overhead.

~~~
toomuchtodo
> But I'm surprised they would be wasting valuable dev time with this kind of
> overhead.

I'm not surprised at all. They have reached a scale where managed services are
looked at with a critical eye, and are no longer the default.

~~~
halbritt
I don't think it's that hard to reach a scale where DIY makes sense. As was
stated up above, the AWS Kafka offering isn't complete. One could do
something similar with a CloudFormation template.

The Elasticsearch offering, on the other hand, is fairly complete. At my
scale, it would cost me ~$15-20k/mo for ES in AWS. However, I DIY my own ES
cluster (in k8s), which costs me something like $5k/mo. The delta is roughly
$120-180k/yr.

An engineer costs me roughly $200k/year with salary and benefits, so let's
say the delta is .75 of an engineer. There are also risks to DIY that are
hard to put a number on.
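
Spelling that arithmetic out, taking the midpoint of the range above:

```
AWS ES:           ~$15-20k/mo
DIY ES (in k8s):   ~$5k/mo
Delta:            ~$10-15k/mo  ->  ~$120-180k/yr
Engineer:         ~$200k/yr (salary + benefits)
Ratio:            ~$150k / $200k  ~=  0.75 of an engineer
```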

ES is definitely a PITA to run, but it's core enough to our infrastructure
offering that dedicating .25-.5 of an engineer to it seems like a worthy
trade-off to me. It is also a unique enough type of workload that it yields an
interesting learning opportunity for my team.

~~~
technion
It also means you're in a position to pick up your stack and deploy it
elsewhere if such a thing makes business sense in future. A lot of
organisations are putting themselves in a position where that's not an option.

~~~
halbritt
If it's a given that cloud providers are aiming to get lock-in and increase
margins, then it makes sense to do the necessary math to avoid that.

