
Distributed Systems and the End of the API - mcms
http://writings.quilt.org/2014/05/12/distributed-systems-and-the-end-of-the-api/
======
ChuckMcM
Fun stuff, amusing that the definition of a distributed system used, "Where a
computer that you never heard of can bring your system down," is actually one
of Leslie Lamport's more famous quotes.

When I joined Sun in '86 I thought it was the pinnacle of technological
excellence to be a kernel programmer, and I joined the Systems Group, the
notional center of the Sun universe, in 1987. However, I discovered that the
primary _reason_ you had to be picky about kernel programmers was that their
bogus pointer references crashed the machine (as they occurred in kernel mode
with full privileges), and then discovered that _network_ programmers could
crash the whole world with their bugs. So clearly they must be in a pantheon
above kernel programmers. :-)

The author has come to discover that in the network world things can die
anywhere, and this makes reasoning about such systems very complicated. Having
been a part of the RPC and CORBA evolution I keenly felt the challenges of
making APIs that "looked" like function calls to a programmer but took place
across a network fabric and thus introduced error conditions that couldn't
exist in locally called routines. (like the inability to return from the
function due to a network partition for a simple example).
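
A minimal sketch of that extra error condition, in Python. Everything here is hypothetical: `FlakyNetwork` is a toy in-process stand-in for a transport, `remote_add` is an invented wrapper, and no real RPC library is involved. It only illustrates that a call which "looks" local gains a third outcome, where the work ran but the reply never came back.

```python
class NetworkPartition(Exception):
    """Raised when no reply arrives; the caller cannot tell whether the call ran."""


def local_add(a, b):
    # A local call either returns or raises inside this process.
    return a + b


class FlakyNetwork:
    """Hypothetical transport that drops the reply for selected call numbers."""

    def __init__(self, drop_reply_on):
        self.drop_reply_on = set(drop_reply_on)
        self.calls = 0
        self.executed_on_server = 0

    def call(self, func, *args):
        self.calls += 1
        result = func(*args)          # the server did the work...
        self.executed_on_server += 1
        if self.calls in self.drop_reply_on:
            # ...but the reply was lost, so the caller only sees a timeout.
            raise NetworkPartition("no reply within deadline")
        return result


def remote_add(network, a, b):
    # Looks like local_add, yet has an outcome local_add never has:
    # "unknown", where the work may or may not have happened.
    try:
        return ("ok", network.call(local_add, a, b))
    except NetworkPartition:
        return ("unknown", None)


net = FlakyNetwork(drop_reply_on=[2])
first = remote_add(net, 1, 2)
second = remote_add(net, 3, 4)
print(first, second, net.executed_on_server)
```

The second call reports "unknown" even though the server-side counter shows it executed, which is the case a plain function signature has no way to express.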

Lamport's work in this space is brilliant and inspired. Network systems can be
analysed and reasoned about as physical systems when they exhibit
discontinuities when considered as simple algorithms. The value here is to
realize that a large number of physical systems tolerate a tremendous amount
of randomness and continue to work as intended (windmills for example) while
many algorithms only work consistently given a set of key invariants.

I gave a talk that was inspired by Dr. Lamport's work titled 'Java as Newtonian
Physics' which was a call to action to create a set of invariants, in the
spirit of physical laws, that would govern the behavior and _capabilities_ of
distributed systems. It was way early for its time (AOL dialup connections
were still a thing) but much of the same inspiration (presumably from Lamport)
made it into the Google Spanner project.

As with many things, at a surface level many people learn an API which does
something under the covers across the network but having come up through their
education thinking of everything as an API they don't fundamentally grasp the
notion of distributed computation. Then at some point in their experience
there will be that 'ah ha' moment when suddenly everything they know is wrong,
which really means they suddenly see a bigger picture of things. It makes
distributed systems questions in interviews an excellent litmus test for
understanding where people are in their journey.

~~~
jacquesm
I've never seen an RPC system that I really liked. The closest to a model of
distributed computing that gets me from 'a' to 'b' without going terminally
insane is anything based on message passing. Even though there is significant
overhead I figure that by the time you go distributed and your target of the
RPC call or message lives on the other side of a barrier with unknown latency
that overhead is probably low compared to the penalties that you'll be hit
with anyway.

So then the trick becomes to make sure that a message contains a payload that
is 'worth it'.

Making the assumption that any message may not make it to its destination and
that confirmations may be lost (akin to your return example) is still
challenging but I find it easier to reason about than in the RPC analogy.
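
One common way to live with lost confirmations can be sketched as follows, assuming each message carries a sender-chosen id and the receiver remembers which ids it has applied. The `Receiver` class, the `send_until_acked` helper, and the message shape are all invented for illustration. Loss is simulated by a set of attempt numbers, not by a real network, and only lost acks are modeled (a lost message would be retried the same way).

```python
import uuid


class Receiver:
    """Applies each message at most once, keyed by a sender-chosen message id."""

    def __init__(self):
        self.seen = set()
        self.balance = 0

    def handle(self, msg):
        if msg["id"] not in self.seen:
            self.seen.add(msg["id"])
            self.balance += msg["amount"]
        # Always acknowledge, even for duplicates: the earlier ack may have been lost.
        return {"ack": msg["id"]}


def send_until_acked(receiver, msg, ack_lost_on_attempts):
    """Resend the same message until an ack gets through; returns attempts used."""
    attempt = 0
    while True:
        attempt += 1
        ack = receiver.handle(msg)
        if attempt in ack_lost_on_attempts:
            continue  # simulated: the confirmation never reached the sender
        if ack["ack"] == msg["id"]:
            return attempt


receiver = Receiver()
msg = {"id": str(uuid.uuid4()), "amount": 10}
attempts = send_until_acked(receiver, msg, ack_lost_on_attempts={1, 2})
print(attempts, receiver.balance)
```

The sender needed three attempts, but the receiver applied the payload once, so resending is safe.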

I love that Lamport quote :)

A nasty side effect of all this network business is that what looks like a
function call can activate an immense cascade of work behind the scenes,
gethostbyname (ok, getaddrinfo) is a nice example of such a function. On the
surface it's a pretty easily understood affair but by the time you're done and
you get your results back you've likely triggered millions of cycles on
'machines that you've never heard of'.

~~~
arethuza
"I've never seen an RPC system that I really liked."

I must admit I've never seen a message passing system that I really liked
either :-) Mind you that's possibly because of times making stuff work in
environments where someone made the decision "you shall use message passing
for all inter-system communication" even when it wasn't always the best
option.

These days my practical test for a remote API is whether I can stand using it
through cURL - if I can happily do stuff from the command line then the
chances are that code to do stuff won't be too insane.

~~~
jacquesm
I liked QNX, currently playing around with Erlang. (Erlang has tons of warts
but it gets enough of the moving parts just right that I find it interesting).

~~~
fenollp
One does not often hear about the warts of Erlang. What do you name those?

------
programminggeek
At least the author is wise enough to see that REST/RPC are not so different
from each other.

I actually find it interesting that as I was learning about earlier networked
"objects" type systems, programmers ran into problems where they were treating
the networked objects as if they were local and that the network always works.
Now, when we build REST APIs they always ship with client libraries that feel
like local objects and completely abstract away most notions of network
failure, etc.

I'm not saying we've made an unreasonable tradeoff, it's just interesting that
we seem to be making more refined versions of the same solutions with the same
fundamental problems.

I guess the author was making a similar point.

~~~
steveklabnik
"Layman's REST" is very much RPC, yes.

Fielding's REST is very much not.

~~~
mantrax5
Fielding's REST is pretty much CRUD in HTTP disguise.

Don't get me wrong, this can be great for "hypermedia applications" as
Fielding's paper argues. But "hypermedia applications" just doesn't fit what
many distributed services do these days.

Services are naturally centered around verbs (commands and queries) and not
nouns (resources), so like with any other CRUD system, at some point a REST
API that shoehorns everything into the four standard verbs HTTP commonly gives
us, no longer adequately describes the business requirements of your app. You
can definitely force things to be RESTful, but it's typically not the natural
way to build an API. Feels akin to the ORM kind of impedance mismatch in some
ways.

~~~
steveklabnik
I agree that many services are simply CRUD wrappers. That doesn't have much to
do with the nature of the architecture Fielding proposes.

I would be interested in some citations from Fielding which demonstrate that
RPC is its organizational principle. I don't think they're there, though.

~~~
beamatronic
I'm surprised someone hasn't embraced the idea and built the ultimate generic
CRUD wrapper

~~~
steveklabnik
They often fail. See ActiveResource, for example.

------
jwingy
CALM and CRDTs are interesting stuff.

That being said, I feel like the author is confusing a bit the specific
implementations of modern APIs vs the concept of an API which I see as simply
some (somewhat standardized) interface to a system which you don't own. Those
seem like two different problem domains to me, but perhaps I'm arguing over a
different definition of APIs than from what the author is talking about....

~~~
cemerick
Hi, author here.

(Somewhat standardized) interfaces are _fine_. My contention is that you can
have an interface shared by disparate actors without the problematical bits of
"APIs" (both in spirit and in their particular current best materializations),
which provide no useful data model constraints, do not acknowledge the
realities of the network, and inherently couple client and server.

The point is that you can have a shared "interface" over _data_, in exactly
the same way as producers and consumers share shapes/types of messages routed
via queues — except that there are ways (CRDTs being one) to extend that
dynamic so that data can be replicated along any topology, and shared and
reacted to by N actors, not just a consumer downstream of your producer.
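
As a rough illustration of data that can be replicated along any topology, here is a sketch of a grow-only counter (a G-Counter, one of the simplest CRDTs). The class and the three replica names are assumptions made for the example, not anything taken from the article. Each replica only increments its own slot, and merging takes the per-slot maximum.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def merge(self, other):
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)

    def value(self):
        return sum(self.counts.values())


a, b, c = GCounter("a"), GCounter("b"), GCounter("c")
a.increment(2)
b.increment(3)
c.increment(5)

# State flows around a ring (a -> b -> c -> a -> b); no replica is "the server".
b.merge(a)
c.merge(b)
a.merge(c)
b.merge(a)

print(a.value(), b.value(), c.value())
```

All three replicas end at the same value, and a star or any other connected exchange pattern would give the same result, because the merge does not care who it hears from or in what order.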

I hope that clarifies. :-)

------
cmeiklejohn
I also covered a variety of similar issues when discussing that offline rich-
web applications are perfect for CRDTs, because you are effectively building a
distributed system, in my EmberConf 2014 talk [1][2] called
Convergent/Divergent.

[1] [http://confreaks.com/videos/3311-emberconf2014-convergent-
di...](http://confreaks.com/videos/3311-emberconf2014-convergent-divergent)

[2]
[https://speakerdeck.com/cmeiklejohn/divergent](https://speakerdeck.com/cmeiklejohn/divergent)

* edited to reformat list.

~~~
cmeiklejohn
Also, very relevant:

"A Note on Distributed Computing" 1994, Sun Microsystems Technical Report

* [http://lambda-the-ultimate.org/node/1450](http://lambda-the-ultimate.org/node/1450)

* [http://dl.acm.org/citation.cfm?id=974938](http://dl.acm.org/citation.cfm?id=974938)

------
iadapter
Of course APIs that serve as synchronous endpoints to distributed systems are
a leaky abstraction. But it's not the only one of its kind, there's also
Guaranteed Message Delivery [1].

I find the philosophy behind Akka in this context a better fit - embrace that
networks are unreliable and build your app around this limitation accordingly
[2]. The cost is that it results in more work for the developer just like with
the usage of CRDTs.

[1] [http://www.infoq.com/articles/no-reliable-
messaging](http://www.infoq.com/articles/no-reliable-messaging)

[2] [http://doc.akka.io/docs/akka/2.1.0/general/message-
delivery-...](http://doc.akka.io/docs/akka/2.1.0/general/message-delivery-
guarantees.html)

------
baldeagle
TL;DR: APIs have issues with concurrency and latency, amongst others. Use
Consistency As Logical Monotonicity (CALM) or Conflict-free Replicated Data
Types (CRDTs) instead. Here is a little about how CRDTs work. btw: speech in
NY on the 15th of May.

~~~
lstamour
Thanks for the summary. Via Google, ended up at
[http://www.slideshare.net/jboner/the-road-to-akka-cluster-
an...](http://www.slideshare.net/jboner/the-road-to-akka-cluster-and-beyond)
which I've bookmarked to watch at a future date, since I'm new to these
concepts.

------
dgreensp
As the author of EtherPad I'm familiar with CRDT, which is a cousin of OT.
They don't really replace APIs, unless you are using an API to synchronize
data, which is only one of many things you might be trying to do.

In other words, if you're building EtherPad or Wave, use a fancy data
structure for the collaborative document. Otherwise, don't. Meteor's DDP
provides a nice model, where the results of RPCs stream in asynchronously.

~~~
cemerick
Hi, author here. I'm not sure you read the whole piece. :-) (Modern) APIs are
a very limited mechanism of state transfer that happens to be paired with
often side-effecting operations. Thus, a "synchronization" (I don't think that
word is particularly useful because reasons) mechanism paired with reactive
computational services _does_ replace APIs, and offers the ability to do much,
much more.

OTs (operational transforms) _are_ a related precursor to CRDTs only in that
they are both ways to reconcile concurrent changes, but that's really the
limit of the connection. Unfortunately, the substrate for OTs (text, integer-
indexed sequences of characters) is fundamentally not amenable to commutative
operations. This makes implementing OTs _very_ difficult and error-prone, and
certain combinations of concurrent operations are completely unreconcilable (a
result that came out of a Korean group's study, can't find the cite for it
right now).
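
The index problem can be seen in a few lines of Python. This is only a toy: the `insert` and `transform` helpers are invented here, and `transform` covers a single insert/insert case, not a real OT algorithm.

```python
def insert(text, index, ch):
    return text[:index] + ch + text[index:]


base = "abc"
op_x = (1, "X")  # replica 1 inserts "X" at index 1
op_y = (2, "Y")  # replica 2 concurrently inserts "Y" at index 2

# Each replica applies its own operation first, then the other's, unchanged.
replica_1 = insert(insert(base, *op_x), *op_y)
replica_2 = insert(insert(base, *op_y), *op_x)
print(replica_1, replica_2)


def transform(op, applied):
    """Shift an insert's index past an already-applied insert at or before it.

    Covers only this insert/insert case; ties at the same index and deletes
    need more rules, which is where the difficulty piles up.
    """
    index, ch = op
    applied_index, _ = applied
    return (index + 1, ch) if applied_index <= index else (index, ch)


fixed_1 = insert(insert(base, *op_x), *transform(op_y, op_x))
fixed_2 = insert(insert(base, *op_y), *transform(op_x, op_y))
print(fixed_1, fixed_2)
```

Without the transform the two replicas diverge, because integer positions mean different things after a concurrent edit. With it they agree in this one case, but every additional operation type needs its own pairwise rule.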

~~~
jorangreef
I think the paper you are referencing might be [1]?

It's one of my favorite papers on CRDTs and provides practical pseudocode for
learning how to implement CRDTs yourself.

The structures they present are simple to understand and have good performance
characteristics compared to similar CRDTs [2].

A key insight from the second paper is to write CRDTs that optimize for
applying remote operations over applying local operations, as the ratio of
remote operations to local operations will be greater. i.e. 100 clients making
1 change to a CRDT will require all 100 clients to each apply 99 remote
operations and 1 local operation.

[1] Replicated abstract data types: Building blocks for collaborative
applications -
[http://dl.acm.org/citation.cfm?id=1931272](http://dl.acm.org/citation.cfm?id=1931272)

[2] Evaluating CRDTs for Real-time Document Editing - [http://hal.archives-
ouvertes.fr/docs/00/62/95/03/PDF/doce63-...](http://hal.archives-
ouvertes.fr/docs/00/62/95/03/PDF/doce63-ahmednacer.pdf)

~~~
cemerick
The cite I'm missing at the moment is a multi-year study that catalogued all
known operational transforms over text (there were many more than I imagined
prior), along with proofs showing that certain combinations of concurrent
operations simply could not be reconciled consistently.

Thanks for the other pointers, though!

------
ddp
I believe it was Leslie Lamport who said, "A distributed system is one in
which the failure of a computer you didn't even know existed can render your
own computer unusable."

~~~
cmeiklejohn
The reference is here:

[http://research.microsoft.com/en-
us/um/people/lamport/pubs/d...](http://research.microsoft.com/en-
us/um/people/lamport/pubs/distributed-system.txt)

~~~
cemerick
Dammit, thank you. I've even read that exact message before. :-/ Post updated!

------
josephschmoe
Many API users are not knowledgeable to the intricacies of network
programming. This could definitely use an executive summary at the top with
the following info: "What will these new libraries that replace APIs offer to
single point API users?"

~~~
cemerick
(Author here.) As a start, hopefully removing the blinders that make
statements like "many API users are not knowledgeable to the intricacies of
network programming" oh so true. That that statement can be made without
popular incredulity only reinforces the point that modern network API
technologies have largely been built to sustain the illusion that there is no
network, and you're just making a method call somewhere. Insert this-plt-life
GIF here. :-P

Building systems with things like CRDTs and tools and languages that support
CALM will allow people using point-to-point APIs to continue to do the things
they do now, but remove much of the incidental complexity from the equation.
An example would be that, when you are relying upon N replication mechanisms
to move CRDT state or operations from _here_ to _there_, you don't need
complex timeout, retry, and backoff mechanisms to compensate for the realities
of what's connecting the two parties. The message will arrive when it
arrives...exactly the only guarantee that you can make in the general case of
someone talking to external services.
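
A small sketch of why state-based replication can tolerate a sloppy link, assuming the replicated value is a grow-only set whose merge is set union. The order ids and the delivery sequence are made up for the example.

```python
def merge(local, incoming):
    """Grow-only set merge: union is idempotent, commutative, and associative."""
    return local | incoming


# Successive full states published by a sender (each includes the earlier ones).
s1 = frozenset({"order-1"})
s2 = frozenset({"order-1", "order-2"})
s3 = frozenset({"order-1", "order-2", "order-3"})

# What an unreliable link might actually deliver: s1 lost, s3 early, s2 twice.
delivered = [s3, s2, s2]

receiver = frozenset()
for state in delivered:
    receiver = merge(receiver, state)

print(sorted(receiver))
```

The receiver ends up with the sender's latest state despite the loss, reordering, and duplication, and nobody tracked acks or sequence numbers. Something still has to keep sending until a recent state gets through, so this reduces the bookkeeping without removing the need for eventual delivery.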

------
anuraj
This is a network programmer's view. System and network programmers concern
themselves with systems and topologies. For an application programmer, both
need to be abstracted and only the business logic is important. Money is at
the top of the pyramid now - hence the proliferation of APIs.

~~~
cemerick
The point of the piece, in large part, was to emphasize that _we're all
network programmers_. If you're whacking away with APIs and couldn't care less
about the broader system and its topology, please stay away from me. ;-)

APIs are proliferating because their coupling around client/server process and
data representation makes for high switching costs and thus sweet vendor lock-
in.

------
logn
REST is less platform dependent than SOAP/RPC. I see that as the main benefit.
JSON is easier to work with than XML. The whole idea of service oriented
architectures is that users don't need to care about the tech stack details of
your service. REST and JSON do a better job of realizing that vision than
SOAP/XML. I don't think anyone's claimed that REST is a design pattern to end
all woes. Maybe we haven't given due thought to what new design patterns (or
data structures, architectures, etc) are emerging these days, and in that
light, the article presents a lot of interesting pointers.

~~~
virmundi
Actually JSON makes it damn near impossible to uniformly implement REST. You
need HATEOAS. This means that there has to be a semantic of following links in
resources. JSON lacks this ability. ATOM or RSS, both XML, have linking. Heck,
XML at a language level supports document linking. HTTP + JSON != REST.

~~~
jalfresi
Whilst I agree with you, there is nothing stopping someone defining link
structures in JSON documents, coining a new media type e.g. JSON+Link and
boom, problem solved.
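
For illustration, link structures in a JSON document might look something like the sketch below. The `_links`/`href` layout, the resource URLs, and the `fetch`/`follow` helpers are all assumptions made for this example. The "server" is just an in-memory dict, so no HTTP is performed and no real media type is implied.

```python
import json

# Hypothetical in-memory "server": URL -> JSON body. No HTTP is performed.
RESOURCES = {
    "/orders/1": json.dumps({
        "id": 1,
        "status": "shipped",
        "_links": {
            "self": {"href": "/orders/1"},
            "customer": {"href": "/customers/7"},
        },
    }),
    "/customers/7": json.dumps({
        "id": 7,
        "name": "Ada",
        "_links": {"self": {"href": "/customers/7"}},
    }),
}


def fetch(url):
    return json.loads(RESOURCES[url])


def follow(document, rel):
    """Follow a named link relation instead of constructing the URL by hand."""
    return fetch(document["_links"][rel]["href"])


order = fetch("/orders/1")
customer = follow(order, "customer")
print(customer["name"])
```

The client only knows the relation name "customer"; the URL comes from the document itself, which is the link-following semantic being discussed.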

~~~
virmundi
Well, I'm responding to you and to your siblings. You're right. You can create
a new MIME type and the problem is solved. Fortunately, as the sibling
comments pointed out, there is an extension. What I want to see happen is that
the JSON + Links becomes a standard. Ideally a W3C. Otherwise we're into the
old XKCD comic [https://xkcd.com/927/](https://xkcd.com/927/)

------
kylebrown
Distributed APIs are a big part of Ethereum. I think the Merkle tree of the
bitcoin blockchain (and the Patricia tree of the Ethereum blockchain) might
even qualify as a semilattice.

In fact, it's by the physics of information theory that a cryptographic
blockchain solves the consensus problem. Specifically, information theory
emerges from the laws of thermodynamics: Maxwell's demon is essentially what
secures one's private keys from brute-force cracking attempts.

I'd like to see a comparison of how the blockchain solves the CAP problem,
alongside CRDTs. Are they not both solutions to the same problem?

~~~
marktangotango
FYI information theory entropy and physics entropy really aren't the same
thing:

[http://physics.ucsd.edu/do-the-math/2013/05/elusive-
entropy/](http://physics.ucsd.edu/do-the-math/2013/05/elusive-entropy/)

~~~
kylebrown
Well that seemed excessively pedantic IMHO. It actually didn't touch much on
information theory, and where it did, many of the comments disagree. I'll cite
the Landauer limit[1] as what (yes, arguably) connects the entropy of
information theory to the entropy of physics.[2]

Also, I only mentioned physics because the article did, quoting Lamport "
_Most people view concurrency as a programming problem or a language problem.
I regard it as a physics problem_."

Unfortunately the article didn't elaborate any more on the precise type of
physics problem in question (Maybe Lamport does elsewhere), whether the
physics of computational complexity or the physics of information theory, or
something else. But even those two sub-fields have many connections and
similarities (as does pretty much everything in physics and math. such
connections are the bread-and-butter of theoreticians).

1\. [http://en.wikipedia.org/wiki/Von_Neumann-
Landauer_limit](http://en.wikipedia.org/wiki/Von_Neumann-Landauer_limit)

2\.
[http://en.wikipedia.org/wiki/Entropy_in_thermodynamics_and_i...](http://en.wikipedia.org/wiki/Entropy_in_thermodynamics_and_information_theory)

~~~
kylebrown
Here's an article which discusses Lamport's view: "The physics of distributed
information systems"[1]. The first sentence: "This paper aims to present
distributed systems as a new (interesting) area of applications of statistical
physics, and to make the case that statistical physics can be quite useful
in understanding such systems."

It has several mentions of statistical physics, but (curiously) no mentions of
entropy. It does however discuss the Byzantine Generals problem, which of
course is the problem the bitcoin blockchain solves.

1\.
[http://iopscience.iop.org/1742-6596/473/1/012017/pdf/1742-65...](http://iopscience.iop.org/1742-6596/473/1/012017/pdf/1742-6596_473_1_012017.pdf)

------
sagargv
Joel Spolsky had written along similar lines and argued that it's important to
know what is happening beneath abstractions.

[http://www.joelonsoftware.com/articles/LeakyAbstractions.htm...](http://www.joelonsoftware.com/articles/LeakyAbstractions.html)

------
rooted
I think a distributed system is better defined as a system where timing
becomes an issue to the coordination of components in the system.

------
richm44
I stopped at the point where he claimed that APIs were always synchronous,
this wasn't even true in the 80s. For example XLib is a rather well used API
and is asynchronous (there are many others).

~~~
koide
Two paragraphs later he addresses that point, calling that support limited in
current API designs.

~~~
richm44
Not really, he talks about HTTP which wasn't really designed for that purpose.
There are plenty of protocols that were. Does this actually have anything to
add that isn't covered in
[http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Comput...](http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing)
, if so then I'll read further.

~~~
pjscott
It has some very interesting stuff about CRDTs, which is definitely worth a
look.

------
mantrax5
So he favors exposing a standard set of distributed data models instead of
having APIs.

What a horrible idea.

Exposing implementations is bad because implementations change.

Exposing implementations is bad because as you expose the intricacies of your
data model to your client (which he claims is a benefit) you in turn obscure
and hide the intricacies of your business domain, which will surely not allow
you to patch a service's distributed data tree in an arbitrary fashion.

It's in essence like having SQL as your underlying data model, and replacing
your API with an open read/write/delete access to your SQL server to the
entire world, and hoping everyone will run the right queries and all will be
all right.

It won't be all right.

APIs will become more asynchronous and eventually all APIs will be seen as
protocols, that don't _necessarily_ follow a simple request/response pattern.

But they'll remain in the form of abstract commands and queries modeled after
the _business domain_ of the application, and not the _underlying data model_
of it.

~~~
derefr
> It's in essence like having SQL as your underlying data model, and replacing
> your API with an open read/write/delete access to your SQL server to the
> entire world, and hoping everyone will run the right queries and all will be
> all right.

I find it kind of amusing that this was the original purpose of having an "SQL
server": letting people (e.g. auditors) submit arbitrary queries, so you won't
have to anticipate what exactly they'll want to do with your data. (Write-
access was intended to be segregated to particular database users writing to
particular tables, though--basically parallel to using WebDAV with HTTP Basic
Auth.)

~~~
mantrax5
It was, yes, and to this day read-only SQL access to certain tables is not
that bad of a practice to allow for report-generating apps _within_ a company.

However the idea of exposing SQL databases publicly as an approach never took
hold for many reasons we're today aware of. And the idea of public write
access is ridiculous right from its premise.

The anti-API rant of this author shows us that those who don't know their
history are doomed to repeat it.

------
dreamfactory2
Hmm the article seems based on some false assumptions. I'd argue that the
whole point of REST as an architectural style is to be stateless and async. Of
course you would use an ESB of some kind rather than point-to-point if you
want to protect yourself from failure of a solution component - REST lends
itself well to that or to building error-handling in the client. And isn't
'turning operations into data' what we are doing by switching from a verb-
based model to a noun-based one?

~~~
cemerick
Hi, author here. REST has its set of semantics, but (a) I don't think they're
particularly useful for building computational services with, and (b) it's for
all practical purposes predicated on HTTP, which carries a lot of baggage.
Each _request_ is stateless (barring things like sessions^H^H^H^H^H hack
workarounds), but clients and servers certainly are not; and, how one
maintains that state and orchestrates further REST interactions based on an
intermediate response is entirely on the implementer, _every single time_ a
service or client is built/used.

I'd personally much prefer communication and computational primitives that can
just as easily be used for a point-to-point interaction as they can be used to
_build_ an ESB (enterprise service bus, I believe you mean?) if that's what I
want.

I don't think nouns vs. verbs are a useful distinction. Turning operations
into data is a first step, but all data is not equivalent. Some
representations lend themselves to composition such that you can represent
essentially arbitrary structures (sets, graphs, trees, multimaps, etc), but
most (including the common ones of JSON and XML) do not. Likewise, some data
representations allow for commutative operations so as to reconcile concurrent
actors' activity, but most (again including JSON and XML) do not.
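
A sketch of the distinction between an order-dependent update and a commutative one, assuming a document with a single set-valued field. The `overwrite` and `union` helpers and the tag values are invented for the example; this is not the author's design, only one way to see why the choice of representation matters.

```python
def overwrite(doc, update):
    """Replace fields wholesale, as a naive JSON document update would."""
    return {**doc, **update}


def union(doc, update):
    """Treat each field as a set and merge by union."""
    keys = doc.keys() | update.keys()
    return {k: doc.get(k, frozenset()) | update.get(k, frozenset()) for k in keys}


base = {"tags": frozenset({"draft"})}
from_alice = {"tags": frozenset({"draft", "urgent"})}
from_bob = {"tags": frozenset({"draft", "legal"})}

# Overwrite: the result depends on which concurrent update arrives last.
overwrite_ab = overwrite(overwrite(base, from_alice), from_bob)
overwrite_ba = overwrite(overwrite(base, from_bob), from_alice)

# Union: both arrival orders give the same result.
union_ab = union(union(base, from_alice), from_bob)
union_ba = union(union(base, from_bob), from_alice)

print(overwrite_ab == overwrite_ba, union_ab == union_ba)
```

With wholesale replacement, one of the two concurrent edits is silently lost and which one depends on arrival order. With union, both edits survive regardless of order.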

------
ryanobjc
CRDTs are absolutely fascinating, but sometimes I really wonder. It seems like
you throw words like 'semi-lattice' around ...

Also there is one particular element to the eventual consistency that bothers
me, it's that all these eventually consistent algorithms aren't how high
powered neural nets will work. Our brain is highly eventually consistent, but
it computes without ever needing these algorithms.

~~~
bm1362
I think you're taking the neural net _model_ of the brain too strictly.

