Unless their systems are heavily modularized, I have a bit of a hard time believing that something new at Amazon goes live every 11.6 seconds. Maybe I'm wrong, but I'd love to have a better grasp on the context involved here.
Our systems are extremely modular. We've previously disclosed that in excess of a hundred discrete services may be called to generate a single page on our web site. You can find more info about that at the following link.
When we refer to a deployment at Amazon, we mean a single code push to one or more servers. For example, if you deploy a new piece of code to a thousand hosts, that counts as one deployment. In other words, a distinct update is pushed every 11.6 seconds.
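To put the 11.6-second figure in perspective, here is a back-of-the-envelope sketch. It assumes the interval is an average over a full 24-hour day, which the comment above does not actually state, so treat the result as a rough order of magnitude rather than an official number. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def deployments_per_day(interval_seconds: float) -> float:
    """Deployments in one day if one lands every `interval_seconds` on average.

    Assumption: the interval holds around the clock. A deployment here is one
    code push, regardless of how many hosts receive it.
    """
    return SECONDS_PER_DAY / interval_seconds


per_day = deployments_per_day(11.6)
print(f"about {per_day:,.0f} deployments per day")
```

Under that assumption the rate works out to several thousand distinct pushes per day, each of which may reach anywhere from one host to thousands.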
Hopefully that makes sense.
I guess it's just hard to imagine that kind of situation when I'm on a two-man web dev team that pushes to the testing server 10-20 times a day and to production once a week, if that.
I used to work for Amazon. This is exactly how things are, to a scale that's hard to comprehend.
Knowing how their stuff works internally, a prod deployment every 11.6 seconds is not hard to imagine at all.
Clearly, a benefit is that you can move fast. You don't need permission from someone half a building away to do something. You don't need to touch code that needs another team's approval. There are no committees that decide on global rules. Your team decides on your team's rules.
Like a shared-nothing architecture, there's very little that is shared between teams. Teams are often connected only via their service interfaces. Not much else beyond common tooling.
But even their tooling reflects decoupling. Every tool follows the self-service model ("YOU do what you WANT to do with YOUR stuff"). Their deployment system (named Apollo, mentioned in the slides), their build system, and their many other tools all reflect this model.
Cons. What happens is that you might be reinventing the wheel at Amazon. Often. Code reuse is very low across teams. So there's no shared cost of ownership at Amazon, more often than not. It's the complete opposite at Google w.r.t. code reuse. There are many very high-quality libraries at Google that are designed to be shared. Guava (the Java library) is a great example.
Another con. You may not know what you're doing. But as a team you will still build a rickety solution that gets you to something that works. This is the result of giving a team complete ownership: they'll build what they know with what they have. Amazon is slowly correcting some of these problems by having teams own specific Hard Problems. A good example is storage systems.
And a lack of consistency is a common issue across Amazon. Code quality and conventions fluctuate wildly across teams.
Overall, Amazon has figured out how to decouple things very well.
As for the second question, a page generally doesn't have to make hundreds of requests. You're thinking of a flat architecture. Think of it more like a pipeline: data goes in at A, flows from A->B->C->D, page reads D. So you end up having to call a handful of services. That can be scaled by 1) caching, 2) careful selection of service calls (don't call ordering service unless you're placing an order), 3) asynchronous requests (you're typically going to be IO bound on the latency, so just spin up X service requests and then wait on them all). There are also other tricks that are fairly well known for reducing latency, such as displaying a limited set of information and loading the rest via AJAX.
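A minimal sketch of points 2) and 3) above, using Python's asyncio. The service names and latencies are invented, and `call_service` is a stand-in that sleeps instead of making a real network call; nothing here reflects Amazon's actual services or libraries.

```python
import asyncio
import time


async def call_service(name: str, latency_s: float) -> str:
    """Hypothetical downstream call; the sleep simulates network I/O."""
    await asyncio.sleep(latency_s)
    return f"{name}: ok"


async def render_page(placing_order: bool = False) -> list:
    # Start every request up front, then wait on all of them together.
    calls = [
        call_service("catalog", 0.05),
        call_service("reviews", 0.08),
        call_service("recommendations", 0.10),
    ]
    # Careful selection: only involve the ordering service when needed.
    if placing_order:
        calls.append(call_service("ordering", 0.07))
    return await asyncio.gather(*calls)


def timed_render(placing_order: bool = False):
    """Run one page render and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(render_page(placing_order))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_render()
    print(results, f"{elapsed:.3f}s")
```

Because the calls overlap, the page waits roughly as long as the slowest single call rather than the sum of all of them. Caching (point 1) would sit in front of `call_service` and is left out to keep the sketch short.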
As a disclaimer for the above, my work doesn't involve working with the Amazon.com website directly, so it's based on my limited view in my domain space.
If you own a page or service that calls a bunch of other services, you typically collect metrics on the latency of your downstream services. Amazon has libraries to facilitate this, and a good internal system for collecting and presenting this data. If one service is particularly troublesome, then you can reach out to that other team and get them to lower their latency. The other option is to pull their data in closer to you, in a format that you can consume quickly.
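As a rough illustration of collecting latency per downstream service, here is a small in-memory recorder. It is not Amazon's internal library, which isn't described here; `LatencyRecorder`, its methods, and the service names are all hypothetical, and a real system would ship the samples to a metrics backend instead of keeping them in a dict.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import median


class LatencyRecorder:
    """Collects per-service call latencies in memory (hypothetical helper)."""

    def __init__(self):
        self._samples = defaultdict(list)

    def record(self, service, seconds):
        self._samples[service].append(seconds)

    @contextmanager
    def timed(self, service):
        # Time the wrapped block and record it even if the call raises.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.record(service, time.perf_counter() - start)

    def summary(self):
        return {
            name: {"count": len(s), "median": median(s), "max": max(s)}
            for name, s in self._samples.items()
        }

    def slowest(self):
        """Name of the service with the highest median latency."""
        stats = self.summary()
        return max(stats, key=lambda name: stats[name]["median"])


recorder = LatencyRecorder()
for seconds in (0.020, 0.025, 0.022):   # made-up samples
    recorder.record("catalog", seconds)
for seconds in (0.080, 0.300, 0.095):   # made-up samples
    recorder.record("reviews", seconds)
with recorder.timed("recommendations"):
    time.sleep(0.01)                     # stand-in for a real call

print(recorder.slowest())
```

With data like this in hand, you know which team to talk to first, or which data set is worth pulling closer to your own service.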
Disclaimer: I work for Amazon, and used to work in the presenter's org.
Anyone can create a system that generates a lot of deployments, but what really matters is that you can complete all of those deployments safely. Of course, that is still ~0.001% too many outages due to deployments and we are working hard to make that number zero.
Checking the price history in the AWS console reveals that the prices for spot instances occasionally exceed the on-demand rate. In particular t1.micro instances reached $0.05/hr (vs the on-demand $0.02/hr). One possible explanation is that spot instances are more valuable because you can run more of them at a time than on-demand instances (100 total vs 20 total) without having to get an exemption for your use case. Another possible explanation is that people bid higher amounts to guarantee that their instances will run uninterrupted, knowing that even if the price briefly exceeds the on-demand price, the average should still be lower overall.
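The second explanation can be checked with a toy calculation. This assumes a simplified spot model in which you pay the market price for each hour you run and get interrupted whenever the market price exceeds your bid. The $0.02 and $0.05 figures come from the comment above; the $0.006 baseline price and the 24-hour price series are invented purely for illustration.

```python
ON_DEMAND_RATE = 0.02   # $/hr for t1.micro, from the comment above
SPIKE_PRICE = 0.05      # $/hr peak seen in the price history, from the comment
BASELINE_PRICE = 0.006  # $/hr, hypothetical "normal" spot price

# Hypothetical day: 22 quiet hours and a 2-hour spike.
hourly_prices = [BASELINE_PRICE] * 22 + [SPIKE_PRICE] * 2


def run_spot(prices, max_bid):
    """Return (hours_run, average_price_paid) under the simplified model.

    The instance runs in every hour where the market price is at or below
    the bid, and pays the market price (not the bid) for that hour.
    """
    paid = [p for p in prices if p <= max_bid]
    hours_run = len(paid)
    average = sum(paid) / hours_run if hours_run else 0.0
    return hours_run, average


for bid in (ON_DEMAND_RATE, SPIKE_PRICE):
    hours, avg = run_spot(hourly_prices, bid)
    print(f"bid ${bid:.3f}: ran {hours}h, average ${avg:.4f}/hr")
```

With these made-up numbers, the high bidder runs uninterrupted through the spike and still pays less per hour on average than the on-demand rate, while the bidder who caps at the on-demand rate loses the instance for two hours.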
I guess as we move towards a more global economy it will level out somewhat on a day-to-day basis, but I don't know if that is a realistic expectation. The seasonal spikes probably won't change.