
Akka Streams: A Motivating Example - amzans
http://blog.colinbreck.com/akka-streams-a-motivating-example/
======
amzans
I can very much relate to the pain points and benefits mentioned in the
example.

Last year we had a project that took a stream of data, verified each message
with several calls to an external system, and then partitioned, transformed
and inserted the results in batches into a relational DB. The system processes
~100 million messages per day (each message is ~1.5 KB) on 2 nodes, each with
4 cores and 16 GB of memory.

We built the routing and partitioning logic using Akka Cluster, since we
wanted consistent hashing as we scaled the number of stream workers
horizontally. Initially, the stream processing was also done with Actors,
which are a fantastic concurrency tool, but we soon found ourselves
reimplementing a lot of the features mentioned in the article. We also needed
to handle backpressure between each of the steps, since the producers often
push data into the system in short bursts. Last but not least, we suddenly had
a huge amount of non-business logic that needed to be tested and maintained.

At some point we decided to rewrite the stream processing part using Akka
Streams. I have to admit that I was initially skeptical, given that no one on
the team had worked with it before and some of us already had experience with
Apache Spark, which could also have been used to process the data.

Long story short, it was a breeze to rewrite, as it consisted mainly of
removing the non-business code and using the Akka Streams stages to handle the
batching, rate limiting and end-to-end backpressure for us. Just as
satisfying, our codebase ended up much smaller and more readable.
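
To give a rough idea of the shape of such a pipeline, here is a minimal
sketch in Scala. It is not the code from that project: the Message type, the
verify and insertBatch functions, and all the numbers (rates, batch sizes,
parallelism) are made up for illustration, and it assumes Akka 2.6+, where an
implicit ActorSystem is enough to run a stream.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.duration._
import scala.concurrent.{Await, Future}

object PipelineSketch {
  // Hypothetical message type; the real schema is not described above.
  final case class Message(id: Long, payload: String)

  // Stand-in for the calls to the external verification system.
  def verify(m: Message): Future[Message] = Future.successful(m)

  // Stand-in for a batched DB insert; returns the number of rows written.
  def insertBatch(batch: Seq[Message]): Future[Int] =
    Future.successful(batch.size)

  // Builds and runs the stream; the Future completes with the total
  // number of messages that were "inserted".
  def run(total: Int)(implicit system: ActorSystem): Future[Int] =
    Source(1 to total)
      .map(i => Message(i.toLong, s"payload-$i"))
      // Rate limiting: at most 1000 elements per second flow downstream.
      .throttle(1000, 1.second)
      // Up to 8 verifications in flight; order is preserved.
      .mapAsync(parallelism = 8)(verify)
      // Batching: emit after 500 elements or 1 second, whichever is first.
      .groupedWithin(500, 1.second)
      // Up to 2 batch inserts in flight. If inserts are slow, demand
      // stops propagating upstream and the earlier stages slow down.
      .mapAsync(parallelism = 2)(insertBatch)
      .runWith(Sink.fold[Int, Int](0)(_ + _))

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("pipeline-sketch")
    try {
      val inserted = Await.result(run(2000), 30.seconds)
      println(s"inserted $inserted messages")
    } finally {
      system.terminate()
    }
  }
}
```

Nothing in the sketch manages buffers or acknowledgements by hand. Each stage
only pulls from the previous one when it has capacity, which is the
end-to-end backpressure mentioned above.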

Soon we will begin work on another project of a similar nature, and while we
are considering a couple of alternatives depending on the final requirements,
Akka Streams is definitely at the top of the list for us.

