The “API Mandate” memo at Amazon (chrislaing.net)
381 points by laingc on June 21, 2021 | 130 comments



As a current Amazon employee I can confirm this is no longer the case, or only applies to such a select few pieces of software that it doesn't actually mean what you think it does. The amount of day-to-day stuff that is relied upon that runs on Greasemonkey scripts and web scraping is insane. APIs for a ton of things don't exist or have heavy gatekeeping (e.g. contact x to get onboarded to our API, and they just ghost you). Of course you can get a high-level leader to tell you to use a proper API in a mailing list thread where you are trying to get help solving a problem, but they will not actually help you get working API access: all talk, no action. A perfect example: there is this basic CRUD app that I rely upon, and I see the current maintainer is working on all these feature requests. The problem for me is that the site has a ton of AJAX and takes like 20 mouse clicks to retrieve basic information. So I open a ticket / feature request just to get a basic REST API for the reads, and they politely tell me I am an idiot and close the request. So much pointless work is created at Amazon from lack of API access.


Sounds like the maintainer is optimizing for job security and/or work-life balance.

And from all the comments on this post, it sounds like that’s a pretty rational course of action.


Sounds like they're allowing people to treat the systems as "infrastructure" and "applications" with only the infra having exposed APIs? That's super common in software of all sorts. There are a lot of people who can make good use of a decent API and appreciate it, but are not experienced (or smart, or have time, etc) enough to design a good API themselves. Most software is mortar, not bricks.


Almost all of the time-wasting or hacky things like this are there just because people don't have time to improve things they scraped together to provide value somewhere.

Everyone at AWS ran around with their heads cut off back when there were 6 regions because everyone was so overworked or had to drop everything they were doing for a week because a new Xen bug was published to the email list and would go out of embargo at the end of the week. I can only imagine how insane it is now with ~20 regions to support.


This seems so true. A friend of mine got hired in Northern Virginia. He was a Java developer for ten years. But at Amazon it is about the same as you described.


How far does this principle scale? I've heard of this strategy and thought "that's kinda cool." Amazon is obviously successful in its domain, so it would be easy to assume some causality.

And yet, there's another big tech company, fruit symbol I think, that has played the long game of continually fine tuning the interaction points to make the integration of their parts more than the whole. They get credit for attention to detail, leveraging their integration and their ability to move in ways they do because of their thorough integration across their whole hardware/software stack. They've recently been in the press because of just how much they were able to tune their integration to the problem with their first desktop computing chip. Again, we credit their success with some causality due to this strategy, which (to me) is very different than the Amazon strategy.

Is it because Amazon is a cloud company and Apple is a hardware company that they have these different approaches to product development and each enjoy success? Or at the end of the day, are these "do it this way" less responsible than we'd like to believe?


Amazon has created a lot of these "things" (for lack of a better word) that should be taken with a sack of salt. Door Desk, 2-Pizza Teams, Press Release and Frequently Asked Questions (PRFAQ), API Mandate, AWS as utilizing spare capacity, just to name a few. It's made out as if these are religiously practiced at Amazon, but that's definitely not the case.

I worked at one of the first remote dev centres of Amazon, working on an extremely ambitious AWS service. The mandate was imposed with a hammer in the earlier days, to the extent that even the code repository was segregated. We couldn't access any internal service or tools (such as pager-duty). So we ended up building half-assed versions of everything ourselves. People at the HQ built a web service to access customer information, but within a few months it languished with no one to maintain it. Half the time we would be blocked waiting for someone to give us access to a service. Whenever we raised an issue, the answer from HQ was "oh yeah, use this and this service"; they were oblivious to our limitation. We would then play the broken record and they would go "oh, well, let's see what we can do". Eventually someone higher up noticed the massive inefficiency, said fuck it, and opened up all the access for us. But then everything had to be migrated from the half-assed services to the mainstream ones.

During my time there I never heard of this API mandate. Now that AWS has gotten massively successful this API mandate gets paraded as if it was all a grand plan. No, it wasn't. It was an experiment that sometimes worked and sometimes didn't.

Also there's this trope about AWS you keep hearing. The story goes that someone realised all the un-utilized server capacity and decided to rent it out. That's absolutely not how it began. That's typical Amazon marketing speak. Amazon's retail took a very very long time to migrate to AWS. In fact I'm not sure if they are fully on AWS even now.


The “rumor” I know is that Amazon internally had a system broadly similar to S3/EC2, where a centralized infrastructure team made hardware resources available to the other teams through API.

Then Amazon realized that they could use a similar system with external customers. They built S3 and other services in a way similar to their internal infrastructure, but completely separate.

Only years later Amazon started using AWS for their internal services.


It's sort of hard to compare these approaches, since it's not like Amazon and Apple only have 1 principle that they follow. I'll try to analyze them in a generic way. Disclaimer that I don't have experience in marketing or executive leadership.

Apple's focus on design allows it to charge higher for its products and to build new high-margin products that people end up buying. It allows them to "scale" revenue by entering new markets with new products.

Amazon has a similar focus on the customer which is centered around customer support. This approach also gives them the "brand reputation" to build new services (that businesses will pay for). You may note that Apple's approach works better for customers who pay without much planning/budgeting (like consumers/households), scales better with the number of customers (again like consumers/households), and scales more poorly with # of products (making it a worse fit for SaaS).

Amazon's API mandate value is felt in how it allows them to make software development more efficient. Customers do not feel the impact directly. Instead, since data is exposed through well-defined APIs, new service (or product) development can be done with far less human communication, as mentioned in the article. However, while this makes inter-team efficiency better, it reduces intra-team efficiency by forcing developers to build things that they don't need. If services are too small, the APIs are not high-quality, or the service boundaries change too frequently, then it's possible that this approach doesn't make software engineering more efficient at Amazon.


> Amazon has a similar focus on the customer which is centered around customer support

Except that for AWS, paying for the service(s) doesn't get you any support at all. So this idea about Amazon's behavior on its marketplace doesn't really map to AWS. Customers there have to choose to receive customer support, and pay for it separately.


> How far does this principle scale?

I think this principle scales most to platforms, in this case web ones.

There's an interesting thing called the "Bill Gates line": https://stratechery.com/2018/the-bill-gates-line/ (third sub-heading in the article). Basically: "A platform is when the economic value of everybody that uses it, exceeds the value of the company that creates it. Then it’s a platform."

From your example, the Fruit Company is a platform, but not really. I'm not entirely convinced that others capture more value from the Fruit Company ecosystem. Yes, for the AppStore they only get 30% of the value of third party sales, but they also sell the hardware and have their own extras. The Fruit Company is a platform company primarily by virtue of creating hardware, which generally makes one a platform, but other than that, they definitely look like they don't want to be a platform with the way they control everything, break stuff, etc.

AWS is for sure a platform.


Typical Apple products and services are fewer, more complex, and with enormously longer development time than Typical Amazon "products", so they could find processes that target great quality at high cost (e.g. "ask Steve Jobs") more useful than processes that target good enough quality at minimum cost.


> And yet, there's another big tech company, fruit symbol I think, that has played the long game of continually fine tuning the interaction points to make the integration of their parts more than the whole.

There was a recent article about another concise memo making the case to Steve Jobs for an iOS App Store, which Jobs quickly approved.

So I think the same kind of systems thinking and understanding technological ramifications of decisions at the very top, also played a large role in Apple's success.


I think you're trying to compare and contrast things that don't contradict each other. Having APIs is one thing (and Apple has lots of APIs about many things), and integrating the whole experience is another. Amazon and Apple do both.


Which thing at Apple exactly are you talking about that is not an API (except for esoteric things like kexts and all)? Their whole programming model is based on APIs, AFAIK.


Amazon's API mandate is clearly not just about having things "based on APIs", but rather about how teams collaborate and how tightly coupled they are. This goes way beyond "just having" APIs.


You say it yourself at the end, but it's really a case of comparing apples (eh) and oranges.

You can't build an OS or a chipset the same way you build AWS or a warehouse.

> it’s probably the most important single memo in the history of business

Maybe that's what you disagree with? I also find it a bit overblown. The memo worked wonders for Amazon, I'm sure it would work wonders for many other companies, but the world is more diverse than that.


Italy, France, and Portugal are playing a 4-3-3 at the Euros. Germany is using a 3-4-2-1. Hungary is playing a 3-5-2.

They can all be successful if you have the players executing their roles well. The structure and plan is necessary but it is not sufficient for success.


I worked at a company that read the Amazon memo virtually straight after it was published and decided to copy it. The CIO at the time predicated every systems team's bonus on it.

No API, no bonus.

What followed was 18 months of people building enterprise messaging systems and "buses" of various types. Hundreds of pages of documentation on APIs were prepared and released.

I think 2 APIs with maybe 40 methods actually went live.

There were no bonuses

Many people left

The people who stayed were unhappy

It never happened; everyone quietly forgot about it after 18 months.

The CIO stayed for another 3 or 4 years - he only left when a new CEO came in due to internal promotion. The new CEO used to run one of the P&L's/business units and hated the CIO with a passion.

So... it's not the API mandate that made the difference at Amazon.


That's the wrong way to do it though.

You can't just have a free for all mentality when building APIs. The platform has to be there first.

Everyone's API should just need simple descriptive json files and then a client can be generated to consume that API easily. Everyone can declare those json files and it would work everywhere across the business.
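To make that concrete, here is a minimal sketch of the "descriptive json file plus generated client" idea. Everything named here is an assumption for illustration: the descriptor format, the `build_client` helper, and the `orders` service are hypothetical, and a real platform would more likely standardize on something like OpenAPI. The transport is injected so the sketch runs without a network.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical descriptor a team would publish next to its service.
DESCRIPTOR = json.loads("""
{
  "service": "orders",
  "base_url": "https://orders.internal.example",
  "operations": {
    "get_order":   {"method": "GET", "path": "/orders/{order_id}"},
    "list_orders": {"method": "GET", "path": "/customers/{customer_id}/orders"}
  }
}
""")

# A transport takes (HTTP method, URL) and returns the response.
Transport = Callable[[str, str], Any]


def build_client(descriptor: Dict[str, Any], transport: Transport) -> Any:
    """Generate a client object with one method per declared operation."""
    base_url = descriptor["base_url"]

    def make_method(method: str, path: str) -> Callable[..., Any]:
        def call(**params: Any) -> Any:
            # Fill path placeholders such as {order_id} from keyword arguments.
            return transport(method, base_url + path.format(**params))
        return call

    methods = {
        name: staticmethod(make_method(op["method"], op["path"]))
        for name, op in descriptor["operations"].items()
    }
    # Build a class named after the service, e.g. "OrdersClient".
    return type(descriptor["service"].title() + "Client", (), methods)()


def fake_transport(method: str, url: str) -> Dict[str, str]:
    """Stand-in for a real HTTP call; it just echoes the request."""
    return {"method": method, "url": url}


client = build_client(DESCRIPTOR, fake_transport)
result = client.get_order(order_id=42)
```

The point of the sketch is only that consumers never hand-write client code: a team declares its operations once, and anyone in the business can generate a client from the same file.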

This free for all with no standardization is bound to be a disaster.


That is quite the conclusion you've made from a single data point!


Well - you can argue about the opposite as well!

But I would claim that this is an existence proof - if the management will and focus is there for an API enabled business (it was) this case study proves that is not sufficient to create the kind of company that Amazon became. An API program isn't a magic bullet (as everyone knows by now I guess!)


The following can be true at the same time:

- An API program isn’t a magic bullet

- An API program is a necessary, but not sufficient condition to replicate Amazon’s success

You’re right that this case study proves that it alone is insufficient to create the kind of company that Amazon became, but does it also prove that it’s unnecessary to create the kind of company that Amazon became?


yes - agree.

I think that the testimony on this thread from current Amazon employees that they have ditched API requirements is evidence that it is also unneeded to sustain the kind of company that Amazon is.


It's interesting and often missed that this strategy implies a view of human collaboration and organization. Note that the memo says: 'All teams... Teams must... ...another team’s data store...'. It seems that someone figured out that a (5-10 person?) team is the right human collective size to design and build useful stuff, but that collective shall expose what they build to other teams via APIs. It's fine for the team to do things in a tightly coupled way - internally! In a way, tight collaboration between teammates can be reflected in tight coupling of internal components of whatever they build. But across teams the coupling shall be more formal and via APIs, reflecting looser and less powerful collaboration possible across teams.


That someone would be Bezos and his pizza person.

> in time, two-pizza teams evolved into single-threaded leader (STL) teams, a term borrowed from computer science that means to only work on one thing at a time

https://www.inc.com/jeff-haden/when-jeff-bezoss-two-pizza-te...


It’s a form of using Conway’s Law to your own advantage instead of living in denial of it or treating it as an antipattern.


I love this. As others have mentioned I wonder how much input Jeff had, and from who, before making the mandate.

I am experiencing the polar opposite of this mandate. The systems in my organization are always built to require human touch-points. What's worse, our CTO mistakes these menial interactions for "teamwork" and "collaboration" when they are really just toil to compensate for the lack of platform-level thinking. I love how a CEO can put this so bluntly, upend everyone's work for a couple of years and build a juggernaut because of it.

The idea runs parallel to one I have been championing throughout my career with SOAs, which I call "self serve architecture". I want others in the organization to be able to pick up and use my services to their benefit with zero input or help from me or my team. I tell my team to design the API as if it were GitHub's API.

Practically, that means:

- There are up-to-date and easy docs that cover what you need to know.

- People can gain access on their own (via some existing workplace/team based credential). At most we have to add them to a list somewhere.

- The system will protect itself and inform users against problematic use (quotas, throttling, and visibility into this).

- You have visibility into who your users are and what they are doing such that you can assess value, learn from usage, and communicate to consumers when necessary.
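As a rough illustration of the "system protects itself, with visibility" points, here is a minimal sketch of per-caller throttling with usage counters. The `TokenBucket` and `Throttle` classes and the caller names are hypothetical, made up for this example; a token bucket is just one common way to do it, not necessarily what any particular team uses.

```python
import time
from collections import defaultdict
from typing import Callable, Dict


class TokenBucket:
    """Allows `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Throttle:
    """Per-caller buckets plus counters, so owners can see who uses what."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.buckets: Dict[str, TokenBucket] = defaultdict(
            lambda: TokenBucket(rate, capacity, clock))
        self.usage: Dict[str, Dict[str, int]] = defaultdict(
            lambda: {"allowed": 0, "throttled": 0})

    def check(self, caller: str) -> bool:
        """Return True if the caller may proceed; record the outcome either way."""
        ok = self.buckets[caller].allow()
        self.usage[caller]["allowed" if ok else "throttled"] += 1
        return ok
```

A service would call `check(caller)` on each request and return a "slow down" response when it is `False`; the `usage` dict is the visibility part, since it tells you which consumers exist and which are hitting their limits.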


Does anyone know who else was involved in constructing this memo?

"There will be no other form of interprocess communication allowed: no direct linking, no direct reads of another team’s data store, no shared-memory model, no back-doors whatsoever"

Was Bezos deeply enough involved in Amazon's engineering to set those rules himself, or was the text of the memo influenced by a senior engineering group that he was working with?


The "text of the memo" is really just the article's author not understanding the context of Steve Yegge's years-later retelling, from which that text comes. "His Big Mandate went something along these lines" (Yegge's words right before the quoted text) doesn't even imply there was a singular 'memo' involved, and definitely (and obviously) wasn't meant to say that the text was actually what Bezos wrote. So the footnote "Whether or not it existed in this exact form" is unnecessarily ambiguous about whether these were Bezos' words. They're clearly not - besides Yegge not saying they are, they're a dead ringer for Yegge's style and not at all Bezos'.


Came here to say this - the next line made it even more clear that Yegge is making a joke.

Here's a mirror of the original essay: https://gist.github.com/chitchcock/1281611

> Ha, ha! You 150-odd ex-Amazon folks here will of course realize immediately that #7 was a little joke I threw in, because Bezos most definitely does not give a shit about your day.


and then the next line clarifies that only the last part of the whole story was a joke:

> #6, however, was quite real, so people went to work. Bezos assigned a couple of Chief Bulldogs to oversee the effort and ensure forward progress, headed up by Uber-Chief Bear Bulldog Rick Dalzell.


I wish this comment was the top comment.

Not to be too critical, but this blog post is a (in my opinion, poor) rehash of Steve Yegge's infamous Google+ post which he accidentally posted online. It's entertaining and one of the most influential memos I've read in the last twenty years.

His followup memo was great as well.


And the cargo-culting of "everything should be API no exception" was due to Yegge's colorful writing which endorsed it (relative to how things were done at Google at the time he was there) in his original piece. I sometimes wonder if this development philosophy drove the microtizing of services, and then AWS went to sell that philosophy all over the world. There is such a thing as TOO MUCH atomizing of services.


Many people may not know this but Jeff Bezos clearly has a technical background as evidenced by this blurb from his Wikipedia entry:

“graduated from Princeton University in 1986. He holds a degree in electrical engineering and computer science”.

Of course he’s more known for his decision-making ability as an executive but he clearly has a solid understanding of computing fundamentals as his memo on Amazon S3 characterized it as “malloc for the Internet” [0][1].

0: https://aws.amazon.com/blogs/aws/eight-years-and-counting-of...

1: https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-...


I mean, this pic of him from 1995 screams "nerd": https://www.seattletimes.com/business/amazon/the-seattle-tim...

And his first office was literally for one desk. He clearly did some of the tech work himself.


This was less about a memo showing up one day out of the blue and more like an attempt to bring resolution to a series of long, heated, and not terribly productive debates that raged through the development teams over many months. It's worth knowing that the "before" state was that almost every team exposed their functionality via bespoke C/C++ libraries that everybody else linked to, and this resulted in enormous (for the time) spaghetti binaries. Want to write a little script that needs one piece of information from the database with customer information in it? No problem, just link the customer team's client library, and its 100MB of direct dependencies and 800MB of indirect dependencies. A handful of teams (notably those that already had to interface with third parties) were trying a different way (like HTTP services written in Java) but they got a lot of side eye (or even more direct "you're doing it wrong" remarks) from the "core" developers.


My favorite is the control theory anecdote (point 2 here: https://gigaom.com/2013/10/10/5-fun-and-terrifying-facts-abo... ). Some people are just able to grasp the core of a large number of topics really fast, and born-in-1964-Jeffrey seems to be one of them. It’s fairly clear he would be very well versed at various tech architectural designs even if he didn’t have a CS background, I’ve worked with several people (ostensibly not at that level) and it’s some of the most fun times ever. They’re not burdened by any traditions and are often able to make breakthroughs in ideas that people steeped in the field are unable to themselves.


> It’s fairly clear he would be very well versed at various tech architectural designs even if he didn’t have a CS background

He does have a CS background.


> He does have a CS background.

The "even if he didn't" you quoted already implies this.


Ah, true.


I wish GitHub under Microsoft followed this philosophy. So much of their repository management can be done through their APIs, but you hit some painful brick walls around things like enterprise security where you could really use centralized management.

My business area uses around 200 repositories. APIs aren’t really optional at that scale.


There are multiple large Rails services (I'd guess GitHub included) that internally have a majority of contributors in favor of an API first approach, but it's not mandated absolutely.

Shopify has a component boundary interface in Ruby that can be reflected into with GraphQL, and a lot of features are built for GraphQL first anyway. First party client side apps demand it. A lot of internal services use GraphQL to talk server-server as well.

However, there's still a good chunk of monolithic logic left that hasn't been refactored yet. Refactoring efforts are mostly JIT when demanded.


Kind of lame that it doesn’t even mention Steve Yegge or his blog, which is almost certainly the source.


Is this the same doc as "Distributed Computing Manifesto" mentioned in Werner Vogels' blog? [1]

These legendary Amazon memos haven't leaked, unlike Billg's various memos [2]. That is... journalistically unfortunate, I would say. I even wonder whether current Amazonians actually have access to these docs.

[1] https://www.allthingsdistributed.com/2019/08/modern-applicat...

[2] https://lettersofnote.com/2011/07/22/the-internet-tidal-wave...


I posted a comment on libraries vs frameworks yesterday[0], here's the relevant section.

> Frameworks are easier to setup initially, but they do not scale. Why is it that Microsoft Windows has 13 different dialog generations? Because each is a framework on top of a framework on top of a framework. It's amazing that they can even get that done.

> On the other side, OSS is generally built on libraries. When the 2 UNIX devs were in a basement building UNIX and were able to out-compete Multics[1], they did it because they were building libraries that could talk with each-other using pipes around the boundary. Applications that communicate based on input-output with no internal state behave just like pure functions do. Pure functions compose. When Linus built git in 10 days, he was able to do this because the core idea of git isn't actually that much work. The library is built out of composable blocks that neatly come together. Microsoft's TFS Source Control is a framework that acts on your behalf and therefore the bigger the project gets, you need n^2 people to work on it.

Amazon's API mandate is the same exact unification as UNIX's pipe effect was to Operating Systems. Amazon's internal teams are building libraries whereas all other companies started at the same time were building frameworks. Jeff was brilliant to see this at the time, and I expect that this memo will be seen as just as pivotal as the Toyota Production System, and it might already have that prestige to some. Unfortunately for other companies, you can't retrofit into it and rewrites are always a terrible idea[2].

[0]: https://news.ycombinator.com/item?id=27570917

[1]: https://www.youtube.com/watch?v=3Ea3pkTCYx4, thanks to this HN comment(https://news.ycombinator.com/item?id=27494671) for this reference.

[2]: https://www.joelonsoftware.com/2000/04/06/things-you-should-...


If every part of Amazon is really like that, then it'll make the task of breaking them up much easier!


That's actually quite possibly true, if they were indeed broken up. But a hypothetical breakup might also forbid certain (or any) kinds of collaboration between any of them, which might include calling each other's APIs.


It would mean that if two split off pieces of Amazon want to call each others' APIs, other companies could also do so under the same terms (if there is a charge, everyone pays the same rate). This could work.


Otherwise: Later versions of the Hydra story add a regeneration feature to the monster: for every head chopped off, the Hydra would regrow two heads.

https://en.wikipedia.org/wiki/Lernaean_Hydra


>Heracles required the assistance of his nephew Iolaus to cut off all of the monster's heads and burn the neck using a sword and fire.


Always worth reading Yegge's insider take on this - https://gist.github.com/chitchcock/1281611.

The Golden Rule of Platforms, "Eat Your Own Dogfood", can be rephrased as "Start with a Platform, and Then Use it for Everything." You can't just bolt it on later. Certainly not easily at any rate -- ask anyone who worked on platformizing MS Office. Or anyone who worked on platformizing Amazon. If you delay it, it'll be ten times as much work as just doing it correctly up front. You can't cheat. You can't have secret back doors for internal apps to get special priority access, not for ANY reason. You need to solve the hard problems up front.


Background - Oct 12, 2011:

> Google engineer Steve Yegge was trying to start a robust internal discussion, not post a viral hit, when he published a 4,570-word self-styled rant about what he sees as the company’s greatest flaw to Google+. Unfortunately for Yegge, he didn’t check the settings and shared his view on Google’s failure to grasp platforms over products — including Google+ — with everyone.

> He later pulled it down on his own accord but he and Google aren’t asking that the copies already spread across the net be deleted. You can read the full post here and here, among other locations — and you should to get the real flavor about why Yegge thinks the company that does nearly everything right gets this fundamental so wrong. But a large chunk is also about his former employer Amazon, what it does wrong and how Jeff Bezos — Steve Jobs “without the fashion or design sense” — got it so right.

Source: https://gigaom.com/2011/10/12/419-the-biggest-thing-amazon-g... (Note, I think the post I'm citing might be incomplete as on my screen it stops at "Playstation Network" while the Gist continues...)


Tangentially, a great lesson to Google UX designers.


I kind of remember reading this shortly after joining Google in 2015 (have since left), and thinking yeah he's got a point, esp. regarding point 1: "All teams will henceforth expose their data and functionality through service interfaces." and point 5: "All service interfaces, without exception, must be designed from the ground up to be externalizable."

It's like, if we hit upon something useful, it better be available as a network service with a well defined interface from the start. And I do remember looking around at how things were, and thinking, yeah, Google could definitely use some of that philosophy (without, all these years later being able to cite any specific examples). It definitely felt like it hit home at the time.


That Gist is the source of the "email": #7 is fake and was added by Yegge as a joke.


"But I'll argue that Accessibility is actually more important than Security because dialing Accessibility to zero means you have no product at all, whereas dialing Security to zero can still get you a reasonably successful product such as the Playstation Network."

This is gold :D


“Jeff Bezos doesn’t give a f*ck about your day”. I still laugh at that.


> But making something a platform is not going to make you an instant success. A platform needs a killer app.

Which is a big ask, since

> The problem is that we are trying to predict what people want and deliver it for them. You can't do that.


Pretty much the hardest part about SaaS as a business, described in just two sentences.


So is this still correct in regards to the Google doesn't get platforms stuff? I sort of have the feeling it is, but I mean they have significantly more stuff now than they did when Yegge wrote it but maybe that is not good enough.

Hey do these other platforms have developer support? Developer support is sort of the accessibility for developers - Google doesn't have it. MS definitely has it. I'm thinking Amazon does too but I try to avoid them.


This really shows in the products Amazon offers to developers. Everything that isn’t already an entrenched business success is some random internal tooling that got productized because everything does.


> All service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.

I wonder how this is accomplished for event driven architectures built on shared event buses.


Presumably, you expose the ability to inject events onto the bus, and the ability to listen for events on the bus?
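A minimal sketch of what that could look like, assuming the bus itself is treated as a service whose whole interface is two calls. The `EventBus` class and the `order.placed` topic are hypothetical, and this in-memory version stands in for a real broker.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class EventBus:
    """In-memory stand-in for a shared bus, reachable only via two calls."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a consumer for a topic."""
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, event: Dict[str, Any]) -> int:
        """Deliver an event to every subscriber; return how many received it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(dict(event))  # each consumer gets its own shallow copy
        return len(handlers)


bus = EventBus()
shipped: List[str] = []
bus.subscribe("order.placed", lambda e: shipped.append(e["order_id"]))
delivered = bus.publish("order.placed", {"order_id": "o-1"})
```

Producers and consumers only ever see `publish` and `subscribe`, so the same two operations could in principle be put behind a network endpoint and exposed externally without either side changing.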


In that case, the bus, producers, and consumers would all be treated as separate services, no?

The memo does mention pub sub, FWIW.


SQS and SNS are services offered by AWS


Things like these are taken as dogma or a religion -- and are applied to all things in an organization, for better or for worse.

When there's a case for tighter coupling and fewer services (and yes, there are cases for it), this memo gets brought up and microservices win the argument.


What is an organizational case for tighter coupling and fewer services? Downsizing with an intent to not grow in the same direction again?


Each service adds a marginal cost to maintenance, ops, infrastructure and, debatably, development speed (it's easier to refactor semi-atomically deployed code in an IDE than 50 independent services).

If you have a team of 5 people, launching 50 services is probably not as efficient as 10 services.


As someone that tends to work this way, I can tell you that every API adds overhead. This is because they need to be documented, tested, and release-managed.

Each of my APIs is a self-contained project, with its own lifecycle.

That can, potentially, add a lot of overhead, and “concrete galosh”[0] to the project.

And I am not a fan of dogma, in general. I like flexibility, and dogma is anathema to flexibility.

[0] https://littlegreenviper.com/miscellany/concrete-galoshes/


> Downsizing with an intent to not grow in the same direction again?

More like we're not really big enough yet to justify multiple services, since everything still runs on a single beefy machine and we don't have enough experience running the system yet to really know where to put the service boundaries.


>It doesn’t matter what technology they use. HTTP, Corba, Pubsub, custom protocols — doesn’t matter.

So how does Amazon manage this btw? I don't find AWS SDKs to be all that consistent or inconsistent. They are usually good _enough_. Is that all it takes?

By contrast, Google seems to spend a lot of time on their single repo, build the world, approach. For the most part it seems beloved, or at least people try to recreate it outside google with things like Bazel.

I feel like Google's approach is more popularized. Is that because it's actually better, just advertised more, or simply consistent enough to explain?

Can anyone shed light on Amazon's approach?


I don't work at Amazon, but from everything I've heard and read, Amazon values autonomy more than consistency.

I think there's a saying there: "it's better to have two of something than none".

This can often result in inconsistency and duplication.

When that happens, a team would later be formed to unify things if needed.


I've definitely been burned by inconsistency in Amazon's APIs. At least as of 3 years ago, CloudFormation, Data Pipeline and an EMR-specific API all had three differing sets of params for defining an EMR cluster, and just because something was supported in one of them didn't mean it was even in the others.


> I feel like Google's approach is more popularized

I mean, the engineering architecture being described in this memo is, basically, microservices. That's certainly an extremely popular -- I would go so far as to say even vogue -- pattern for solving the problem of building software at scale.


No, they're just services - a service per team.

Microservices are usually overdoing it: you end up with multiple microservices per team, and sometimes even per developer.

Also, microservices are sometimes implemented incorrectly, where they still communicate via shared databases instead of encapsulating them and exposing them via service APIs only.

This memo was born out of a real business need, while many modern microservices deployments are the result of cargo-culting GAFAMs/FANGs.


Indeed, hence "the network is the computer" footnote that used to be written on Sun manuals.

Many cargo cult web services of today can be written in Sun RPC APIs.

But yeah it is C and its IDL isn't as cool as gRPC proto files.


For real, if AWS internal APIs are as good as their outside-facing ones I fear the complexity.

Their APIs are... correct. But have things like usability as a last concern. You also find some things not behaving exactly as you would expect and some things that are barely explained.


When I worked at Amazon there was a framework similar to gRPC for internal APIs. This was a few years before gRPC was released and the framework was already a few years old at that point.


Is this actually still applied in 2021?

Most memos from two decades ago aren't still followed.


> Is this actually still applied in 2021?

Yes, absolutely. Arguably even more so now that Lambda and ECS/Fargate have reduced the cost of standing up a simple service.


A perspective: incentives, social proof and momentum are powerful things which might explain why this could still be applied in 2021 (I have no connection to Amazon and don't know anyone there, so this is a theory). What do I mean?

Imagine you are a newbie who just joined a project that has successfully done what Jeff said in the memo. The success of undertaking that would have meant that your colleagues will want to continue doing this (threat of firing or not :) ) and pretty soon you'll get sucked into it and hopefully see the rewards of such design.

Soon, another team notices your team consistently getting things right and getting rewards so they have an incentive to follow (social proof). This becomes department wide next and so on. This is where momentum comes into the picture. It is 2015 (let's say) and there are a dozen new departments. All of them reasonably want to get going quickly so they take up patterns that worked for others in the org. 6 more years pass by with more successes and there's no real incentive (at least org-wide) to do something different from the one that works. My educated guess: The memo is still followed.


> It doesn’t matter what technology they use. HTTP, Corba, Pubsub, custom protocols — doesn’t matter.

CORBA is quite close to direct linking, with a network in between. The developer does not see it as a service or protocol, but a library call, which is rather the point. And it's not very compatible with the next one:

> All service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.

CORBA/COM never played well over the internet.


Mostly because they got hit by the J2EE rewrite fashion wave.

They are now back via the gRPC fashion wave, until something else "improves" it.


As much as anyone who uses AWS and its various client libraries can attest to the areas this manifesto didn't solve, I think it's still very much the right way to go, and I think the success of AWS was at least significantly attributable to it. I've been part of several projects whose goal was to build Good interfaces on top of services after the fact, and I'm absolutely convinced that the shittiest interface in the world built in from the start trumps whatever you think you're going to be able to do later.


Does Amazon have some standards/conventions for inter-team APIs, i.e. something like Google's AIPs (API Improvement Proposals) [1,2]?

[1] https://google.aip.dev/general

[2] https://google.aip.dev/


There’s an API bar raiser system for APIs called by many teams.


I've worked at a place where this was cargo-culted and it worked horribly. All the worst aspects of 'services-first' design.

I think this works much better when each team is working on what could be regarded as a complete end-to-end product e.g. a database. It works poorly when each team is working on part of a product.


If you don't get the boundaries between the various services right, the effort is going to fail spectacularly.


"HTTP, Corba, Pubsub, custom protocols — doesn’t matter."

That could have backfired pretty badly, haha.


I'm not sure which would be worse: CORBA or some custom protocol...


Presumably the custom protocol would map well to at least one side of the API.


> All teams will henceforth expose their data and functionality through service interfaces.

> Teams must communicate with each other through these interfaces.

I read that as interfaces to the team, instead of interfaces to software the team is responsible for. Something like Mechanical Turk(https://www.mturk.com/) but for internal team communication. I'm not sure if that was what was meant but that sounds interesting.


I had always read this as a metaphorical memo. Not necessarily that every team must have a software service running somewhere in infrastructure serving RESTful routes but rather as a way of thinking about agreements between groups. Your team's "API" could be documents describing: if you want us to do XYZ, send us a request via ABC, and you should expect a response in UVW format in x amount of time.


I wonder how an example from the article like the API to post new listings to Amazon works in practice with the requirement to be designed to be open to outside developers. It seems like that’d force some sort of review process (and I’m not really sure who can review all new listings) between API call and public availability that might not be there if you eg. had a private API for approved employees.


In my experience, most APIs my team designed/built were not meant to ever be publicly available. That is, we never considered public availability as a design factor. So I think this rule doesn't actually apply anymore.

Then again, I don't know how public availability would change the API design really...


API design concerns for public availability (just to name a few):

  - security
  - preventing abuse
  - API Anti-Corruption Layer
    - sanitizing inputs and outputs
    - i.e. not exposing DB IDs/PKs, or pagination cursors directly
  - versioning, backward- and forward- compatibility, deprecation strategy
  - usability, DX, Documentation
  - reducing bandwidth use:
    - caching
    - eliminate over-fetching
    - efficient wire format
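
To make the "not exposing DB IDs/PKs, or pagination cursors directly" item concrete, here is a minimal sketch of an opaque, signed pagination cursor. The function names, the `after` field and the key are all hypothetical, chosen only for illustration. Note that this only makes the cursor tamper-evident; the payload is encoded, not encrypted.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical signing key; a real service would load this from secret storage.
SECRET_KEY = b"example-only-key"


def encode_cursor(last_row_id: int) -> str:
    """Wrap an internal row ID in an opaque, tamper-evident token."""
    payload = json.dumps({"after": last_row_id}).encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(signature + payload).decode()


def decode_cursor(token: str) -> int:
    """Recover the row ID, rejecting tokens this service did not issue."""
    raw = base64.urlsafe_b64decode(token)  # raises ValueError on bad input
    signature, payload = raw[:32], raw[32:]  # SHA-256 digests are 32 bytes
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("invalid cursor")
    return json.loads(payload)["after"]
```

Clients just pass the token back on the next request, so the team owning the API stays free to change how paging works internally.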


I spent 2020 at AWS and I can't think of a single part of their toolchain that exposed APIs as first-class citizens. Not Pipelines, not Apollo, not CRUX, not Brazil, not MCM.

It's all command line applications and browser interfaces (the stuff first-year developers are most familiar with building, I suppose). I was familiar with this memo so it was quite a shock. Wish they'd taken their own advice.


…and what do the cli apps and browser interfaces use to connect with the tools?


Yeah, all of those tools have APIs…


This is the way I tend to work. I even write about it[0] (“Keeping Things Vague”).

APIs (and opaque modules, in general) are key to the way I develop software. It works well.

[0] https://littlegreenviper.com/miscellany/evolutionary-design-...


I don’t think Yegge is a very reliable source in general, even though he was there at the time. I’m not sure why his version of these events is taken as gospel everywhere. I never saw Bezos concern himself with technical matters while I was at Amazon (though I was in AWS, which he seemed happy to leave to ajassy).


> 6. Anyone who doesn’t do this will be fired.

Sounds about right

Amazon turned throwaway compute instances and one-click 2-day delivery ecom into huge cash cows.

They aren't some golden example of how to properly design infrastructure, it just worked out for them. Luck/timing/reinvestment had a lot to do with it.


It is exactly what author Cal Newport is recommending in his new book "A World Without Email" (March 2021): replacing the ping-pong of emails and meetings by standardized processes and requests via tools. It's simply incredible to see that Bezos saw this coming 20 years ago!


I came across an interesting blog discussing such ideas a few years ago. I don't think anything ever came of it though http://thingamy.typepad.com/


I've always wondered how far this extends: "All teams will henceforth expose their data and functionality through service interfaces."

Does it include e.g. the team writing the on-device text renderer for a Kindle? What would such a service interface look like?


Presumably that team calls some sort of package management/build api to publish their new driver or whatever.


> no direct reads of another team’s data store

This reminds me of how App Store used to ban game emulators interpreting ROMs downloaded from the internet, while at the same time allowing XML parsing.

Does it really matter if the interface you're talking to over the network is code written by another team, or data written by another team? I suppose implementation details are often hidden away in data stores, but does it have to be that way?


Direct reading effectively makes the data store an API, i.e. the API you provide is SQL SELECT over multiple tables instead of HTTP GET resource.

This pretty much freezes the database schema, the team owning it can no longer change it, because they don't know how the users use it. It also limits the possibilities to provide optimizations on the data (like caching).
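
A rough sketch of the difference, with a hypothetical `ListingService` and an in-memory SQLite table standing in for the owning team's data store. Callers only see `get_listing`, so the owning team is free to store prices as integer cents and to add a cache without anyone downstream noticing.

```python
import sqlite3


class ListingService:
    """Hypothetical owning team's interface; callers never see the tables."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        # Internal schema detail: prices are stored as integer cents.
        self._db.execute(
            "CREATE TABLE listings ("
            "id INTEGER PRIMARY KEY, title TEXT, price_cents INTEGER)"
        )
        self._cache = {}

    def add_listing(self, title, price):
        cur = self._db.execute(
            "INSERT INTO listings (title, price_cents) VALUES (?, ?)",
            (title, round(price * 100)),
        )
        return cur.lastrowid

    def get_listing(self, listing_id):
        # Caching is possible because every read goes through this method.
        if listing_id not in self._cache:
            row = self._db.execute(
                "SELECT title, price_cents FROM listings WHERE id = ?",
                (listing_id,),
            ).fetchone()
            if row is None:
                return None
            self._cache[listing_id] = {
                "id": listing_id,
                "title": row[0],
                "price": row[1] / 100,
            }
        # Return a copy so callers cannot mutate the cached entry.
        return dict(self._cache[listing_id])
```

If other teams had been running their own SELECTs against `listings`, renaming or retyping `price_cents` would break them; behind the method it is a private change.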


> Does it really matter if the interface you’re talking to over the network is code written by another team, or data written by another team?

Yes, because it means the dependencies between systems will consistently be API dependencies, not a mix of API and datastore structure dependencies, which reduces the constraints on change for the other team.


this is a post breaking down a pseudo-mythological memo from 2002 at a company where things have obviously changed and evolved, and there are multiple upvoted comments in here from amazon people saying this isn't the case or things have changed etc.


Or: how to waste cycles and make your codebase run slow as shit on a networked computer cluster.


> Anyone who doesn’t do this will be fired.

> Thank you; have a nice day!

What a colossal twat.


It's really easy to be a nice guy by reacting the way you did, but if the quote above is real, it should be studied and celebrated.

Let me break it down for you - Amazon bet the farm on this strategy and it worked out amazingly. Not following this strategy is equivalent to sabotaging the most important thing the company is doing.

While I suspect this quote is tongue in cheek, it SHOULD be a fireable offense for someone to ignore company strategy because "they know better" or are too lazy or whatever.


How did we end up in a place where we demand the system of government be democratic, with all the emotional language of freedom, self-determination and so on. And then we carved out an exception and made it so our places where we work are run as oppressive dictatorships...

Don't agree with what the bossman said? "You're fired!"

And the people love it so much, they even democratically elected the poster child of that catchphrase.


// And then we carved out an exception and made it so our places where we work are run as oppressive dictatorships...

That's just a bunch of random words you're saying. Reality: we're building a company so we can create wealth and support our families and achieve something. We have a plan for doing it. If you're not following the plan and just sabotaging everything, you're doing exactly that - sabotaging.

This is different than saying "hey, I don't agree, let's discuss and challenge the plan" - that's great and admirable and you should do that (and I suspect you don't actually do that in your workplace). But to silently read the strategy and then say "ah fuck it, doesn't apply to me" is a huge fuck you to your colleagues. You have chosen to accept the job and the mission, do the job.


> we're building a company so we can create wealth and support our families

Whose families? Most private companies state very clearly that their mission is to "maximize shareholder value". That's the whole point of being privately owned. Hiring and paying employees is a necessary evil.

That's why they need legal regulation, public supervision, even whistleblower employees.


All the people involved in the case at hand (building out Amazon's software paradigms) are highly paid, equity holding employees who have done quite well for themselves through this process.


Famous last words before Boeing green lighting the faulty MCAS system (against engineering warnings) that killed multiple hundreds of people?

Dictatorships are great when the leader is perfect. Problem is that they never are.


Like I said - if you HAVE A CONCERN, VOICE IT.


You have it upside down. Everything else is authoritarian, except for (some) governments, because governments truly have the power of life or death over you.

But everything else? Highly undemocratic.


> How did we end up in a place where we demand the system of government be democratic, with all the emotional language of freedom, self-determination and so on. And then we carved out an exception and made it so our places where we work are run as oppressive dictatorships...

Private enterprise rests on property rights. Thus Bezos, as owner, was free to write that memo, including spelling out the consequences of sabotaging the company's strategy through insubordination or incompetence.

The only way this is dictatorial is if people are coerced to work there without a legal right to quit at any time. Their employment contracts give them that right (I assume) and they spell out the consequences of quitting without giving proper notice---which one could do. Thus, they're employees, not slaves in a dictatorship.

That's not to say that Amazon or other large corporations don't have problems with mistreating workers. In fact, thinking of businesses as machine systems may encourage a mindset among management that risks dehumanizing the people who do the work. And this problem is far broader than Amazon.

However, dehumanizing workers isn't inherent in private property and owners' rights to run their firms as they see fit (within the constraint of law). Look at the Guinness company - privately owned, yet a pioneer in treating workers really well, and people have tasted the quality of their work round the world. Guinness believed people have inherent dignity (as a Christian he knew they were each made in God's image), so as a business owner he knew it was good for them and for his business to treat them well.[1]

Dehumanizing employees is often a result of misaligned incentives in the legal system, of unjust laws that don't fit with reality, or more fundamentally a result of the deep levels of brokenness that exist in every human being. No one is perfect, and no human system is flawless. The distortion of private property and resulting authoritarianism in business that you ask about is a sad result of what the Bible calls sin.

[1] https://www.amazon.com/Search-God-Guinness-Biography-Changed...


This is Steve Yegge paraphrasing and exaggerating, it is not the actual text of the memo.


Whether or not Bezos is a colossal twat (seems like he probably is) is irrelevant.

The point was that this was not up for discussion. Sure, there are less twatty ways of phrasing it, but at least you know where you stand.


Like all the best myth making propaganda this has some sense and truth to it


Try having this mandate at Google.

You'll have every engineer complaining and stopping work to fight for their freedom, and finally the change has to be reverted. And 5 years later: oh my god, Amazon is doing that, we need to move in that direction as well...


Deleted as I misread the year in the article.


So Amazon has no teams that write simple libraries for other teams?

I imagine now that somewhere in amazon there is a "qsort" REST API that does all the sorting.


There are some frameworks, but libraries are generally under a "community support" model where if you're using the library, you fix the bugs you find.

If you want a team supporting that qsort function, you want it behind a REST API.


Libraries exist, too. The glib example is pre-built clients for all those services, but there are also things like retry helpers.
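
For anyone wondering what a "retry helper" looks like, here is a generic sketch (not Amazon's actual library; the name `call_with_retries` and its parameters are made up for illustration). It retries a call with exponential backoff and jitter, and the `sleep` argument exists only so the helper can be tested without real waiting.

```python
import random
import time


def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on any exception with backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Full jitter: wait a random time up to base_delay * 2^(attempt - 1).
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
```

A real helper would usually retry only on errors known to be transient (timeouts, throttling) rather than on every exception.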



