
The Case for Building Scalable Stateful Services - aarkay
http://highscalability.com/blog/2015/10/12/making-the-case-for-building-scalable-stateful-services-in-t.html
======
packetslave
From a talk by Caitie McCaffrey of Twitter, at the Strange Loop conference.

Video:
[https://www.youtube.com/watch?v=H0i_bXKwujQ](https://www.youtube.com/watch?v=H0i_bXKwujQ)

Slides: [https://speakerdeck.com/caitiem20/building-scalable-stateful-services](https://speakerdeck.com/caitiem20/building-scalable-stateful-services)

------
themartorana
Stateful: you have one web server!

Stateless: you grow to require tens of servers or more, horizontal scalability
is much cheaper than vertical, you look to software solutions to help slow
expenses, move to NoSQL clustered DBs like Riak, Cassandra, Hadoop, etc. 1-2
engineers can still run the whole show; cloud services, SaaS and PaaS are
employed.

Stateful: you run thousands of servers, having since brought many services
back in-house. Many if not most are your own metal, with dedicated staff.
Looking to slow power bills and space requirements, you look once again at
software solutions.

If you stay at the same growing company long enough, what's old will be new
again.

~~~
ousta
Quite frankly, I never understood the craze most companies had with stateless
services. For the last 7 years we moved away from stateful to stateless
without any afterthought, just because it was what everyone was doing, even
though sometimes it just didn't make sense to design something stateless. In
terms of pure design this was always awkward. I understand why it makes sense
in some cases; I just don't understand why it became the new black. If someone
has an answer to that, I'll be happy to hear it.

~~~
acjohnson55
Stateless services have some huge advantages, analogous to pure functional
programming, especially when you truly commit to purity. The surface area for
verification and optimization is far more manageable, without having to worry
about an internal state space. There's a dead-simple horizontal scalability
story and reliability isn't much of a concern due to the fungibility of
individual instances.

Any real service must, of course, have state somewhere. I've had a lot of
success being very intentional about factoring state into the smallest
possible interfaces, ideally contained in well understood products like
databases.

Much like pure functional programming, it's not necessarily easy to go truly
stateless. I've seen plenty of "RESTful" architectures get bogged down in the
complications of seemingly harmless stateful optimizations like caching layers,
often to cover up poor performance of runtimes like Ruby and Python. It's very
tempting to take a couple shortcuts and end up with the worst of both worlds.

Of course, stateless isn't _always_ the best way to go, especially once you
start getting to the exotic territory inhabited by the examples in the
article. I think for 99% of us, those are problems we'll never experience.

I'd be interested in seeing a case where statefulness really let someone down,
who wasn't at Facebook/Twitter/Google scale.

------
jakozaur
Rule of thumb: if you design a service with many servers, you have the
following options:

1. Have a stateless service. You can update it frequently with no downtime...
Relatively easy.

2. Use some off-the-shelf service that provides state and that you don't need
to update that frequently (e.g. ElastiCache, Cassandra, ...). Relatively easy.

3. Write your own stateful service. For some applications it is a must (e.g.
you do your own search service, data processing, game collision engine). You
need to take care of state transitions during restarts/upgrades, and client
routing is also tricky. Hard, but sometimes there is no other way to build
efficient infrastructure.

4. Don't think about state, and you may end up crying after your code hits
prod.

------
rdtsc
A comparison with Erlang/OTP:

[http://christophermeiklejohn.com/papers/2015/05/03/orleans.html](http://christophermeiklejohn.com/papers/2015/05/03/orleans.html)

------
deathtrader666
Haven't Erlang & OTP already solved this?

~~~
reubenbond
The Virtual Actor model implemented in Orleans differs from what OTP & Akka
offer in that Virtual Actors have managed lifecycles and are never explicitly
created or destroyed. Instead, they are activated when needed (fetching state
from storage if necessary) and deactivated when idle.

This helps you to reason about your system. You can never know if a node will
fail at any point in time (in any system), but with Orleans you can be sure
that an actor will be reactivated on a surviving node in the event of a
failure.
In other words, you can continually message actor X without worrying about
ever having to reinstantiate it on another node due to failure.

By default, Orleans maintains an eventually consistent mapping of actors to
nodes (called silos) and relies on the storage layer to give strong
consistency. Most of the default storage providers offer strong consistency.

By the way, Service Fabric includes an implementation of Virtual Actors which
differs in a couple of ways: 1. The actors are placed using consistent
hashing. Actor Ids are mapped to a range of partitions, the replicas of which
can be moved between nodes in case of failure. 2. Actor state is physically
stored on the actor's node by default. Because Service Fabric uses distributed
consensus to implement stateful services, it can persist each actor's state
in the partition it belongs to. The state is replicated to a quorum of
replicas on each write.

~~~
saryant
Akka has added something along those lines with Akka Persistence and the new
cluster sharding features. Actors can passivate (gracefully stop) and the
cluster manager will route messages that would've gone to that actor to
another actor in the cluster. Because the actors are persistent, they can be
spun back up with their previous state on any cluster member.

~~~
reubenbond
Akka persistence is different. It doesn't give you managed lifecycles like in
the Virtual Actor model - so you aren't insulated from failures and you still
have to consider when to create/recreate an actor. It gives you Event Sourcing
/ Command Sourcing (depending on how you use it).

ES is a work-in-progress for Orleans, but many of us are using our own ES
systems.

For an implementation of Virtual Actors on the JVM, check out Orbit from
Electronic Arts / BioWare:
[http://orbit.bioware.com/](http://orbit.bioware.com/)

------
EGreg
I think that, in general, anything that has no persistence can be shared-
nothing. State in shared-nothing consists basically of a cache that is kept
up to date by subscribing to changes in the data store, with only a slight
lag.
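
That subscribe-to-changes cache can be sketched as follows (class names are
hypothetical; a real system would deliver notifications asynchronously, which
is where the slight lag comes from):

```python
class DataStore:
    """Toy data store that notifies subscribers on every write."""
    def __init__(self):
        self._data = {}
        self._subscribers = []
    def subscribe(self, callback):
        self._subscribers.append(callback)
    def put(self, key, value):
        self._data[key] = value
        for cb in self._subscribers:
            cb(key, value)   # synchronous here; async (and lagged) in practice

class NodeCache:
    """A shared-nothing node's local cache, kept fresh by change events."""
    def __init__(self, store):
        self._local = {}
        store.subscribe(self._on_change)
    def _on_change(self, key, value):
        self._local[key] = value
    def get(self, key):
        return self._local.get(key)

store = DataStore()
cache_a, cache_b = NodeCache(store), NodeCache(store)
store.put("user:1", "alice")
print(cache_a.get("user:1"), cache_b.get("user:1"))  # both see "alice"
```

No node ever reads another node's memory, which is what keeps the tier
shared-nothing.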

Shared-nothing can include environments like user agents, proxies and web
servers.

As for the persistence layer / data store, it should support horizontal
partitioning. Especially useful is range-based partitioning based on a primary
key whose prefix contains a Geohash ... because then you can route requests to
the closest Region on AWS or some other host.

If one of your shards gets too large you can split it into two or more shards.
All the monitoring and splitting can be automated with dev ops in the cloud to
provision machines etc. so you don't need to wake up at 3am.
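
A rough Python sketch of that range-based partitioning with shard splitting,
assuming primary keys are prefixed with a geohash (the shard names and split
mechanics are purely illustrative):

```python
import bisect

class RangeShards:
    """Range partitioning over geohash-prefixed primary keys.
    Shards own contiguous key ranges; a hot shard can be split in two."""
    def __init__(self):
        self._bounds = []           # sorted upper-exclusive split points
        self._shards = ["shard-0"]  # one shard owns everything initially
        self._next_id = 1

    def shard_for(self, key: str) -> str:
        # Shard i owns keys between bounds[i-1] (inclusive) and bounds[i].
        return self._shards[bisect.bisect_right(self._bounds, key)]

    def split(self, at_key: str):
        # Insert a new boundary; the right half goes to a fresh shard.
        i = bisect.bisect_right(self._bounds, at_key)
        self._bounds.insert(i, at_key)
        self._shards.insert(i + 1, f"shard-{self._next_id}")
        self._next_id += 1

shards = RangeShards()
# Geohash-prefixed keys cluster nearby users onto the same shard.
print(shards.shard_for("9q8yy:user:1"))   # shard-0
shards.split("c")                         # split on a geohash prefix boundary
print(shards.shard_for("9q8yy:user:1"))   # still shard-0 ("9..." < "c")
print(shards.shard_for("dr5ru:user:2"))   # shard-1 ("d..." >= "c")
```

Because the ranges stay contiguous, a split only moves keys from the one hot
shard, and routing by geohash prefix keeps geographically close data together.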

With this setup you can reliably grow your data store to an arbitrary size,
and literally have only O(log n) growth in latency for any request. However
there is one more issue to solve:

When you need to perform database queries that return a cross product, or
join, do you compute it on the fly for the request (eg with mapreduce) or do
you precompute the result whenever a row is inserted into one of the joined
tables? The second way can be done in the background and uses memory-time
tradeoff to cause the queries to be O(1). This can be really useful for
queries that need to get the answer in realtime.
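
The second approach, maintaining the join incrementally as rows are inserted,
can be sketched like this (the users/orders tables are made-up examples):

```python
from collections import defaultdict

class MaterializedJoin:
    """Precompute a users-orders join as rows arrive, trading memory for O(1) reads."""
    def __init__(self):
        self._users = {}                     # user_id -> name
        self._orders = defaultdict(list)     # user_id -> [item, ...]
        self._joined = defaultdict(list)     # user_id -> [(name, item), ...]

    def insert_user(self, user_id, name):
        self._users[user_id] = name
        # Back-fill the join for orders that arrived before the user row.
        for item in self._orders[user_id]:
            self._joined[user_id].append((name, item))

    def insert_order(self, user_id, item):
        self._orders[user_id].append(item)
        if user_id in self._users:           # maintain the join incrementally
            self._joined[user_id].append((self._users[user_id], item))

    def query(self, user_id):
        return self._joined[user_id]         # no join work at read time

mv = MaterializedJoin()
mv.insert_user(1, "alice")
mv.insert_order(1, "book")
mv.insert_order(1, "pen")
print(mv.query(1))   # [('alice', 'book'), ('alice', 'pen')]
```

The write path does the join work in the background, so realtime reads are a
single lookup, at the cost of storing the precomputed rows.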

I would recommend using evented (eg Node.js) servers for queries that involve
hitting multiple shards at the same time, or mapreduce type things. Evented
I/O lets you wait only as long as the longest query.
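
The same scatter-gather pattern is easy to sketch with Python's asyncio
(shard names and latencies are simulated): all shard queries run
concurrently, so the total wait tracks the slowest shard rather than the sum.

```python
import asyncio
import time

async def query_shard(shard: str, latency: float) -> str:
    await asyncio.sleep(latency)   # stand-in for a network round trip
    return f"rows-from-{shard}"

async def scatter_gather(shards):
    # Fan out to every shard at once; gather waits only for the slowest reply.
    return await asyncio.gather(
        *(query_shard(name, lat) for name, lat in shards)
    )

shards = [("shard-0", 0.02), ("shard-1", 0.10), ("shard-2", 0.06)]
start = time.monotonic()
results = asyncio.run(scatter_gather(shards))
elapsed = time.monotonic() - start
print(results)   # replies come back in shard order
print(elapsed < 0.18)   # concurrent wait ~0.10s, not the 0.18s serial sum
```

`asyncio.gather` preserves argument order, so merging per-shard results back
into one response stays straightforward.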

Finally, I don't think things like socket.io will be horizontally
partitionable easily, eg to a node cluster, so you'll probably want to have
server affinity on a per-room basis.

