One of my favorite YouTube channels right now is What's Going On With Shipping, hosted by a former merchant mariner. Here's a 101 primer if you are learning too:
Sounds like DynamoDB is going to continue to be a hard dependency for EC2, etc. I at least appreciate the transparency and hearing about their internal systems names.
I think it's time for AWS to pull the curtain back a bit and release a JSON document that shows a list of all internal service dependencies for each AWS service.
I don’t use AWS or any other cloud provider. I've used bare metal since 2012. See, in 2012 (IIRC), one fateful day, we turned off our bare metal machines and went full AWS. That afternoon, AWS had its first major outage. Prior to that day, the owner could walk in and ask what we were doing about it. That day, all we could do was twiddle our thumbs or turn on a now outdated database replica. Surely AWS won’t be out for hours, right? Right? With bare metal, you might be out for hours, but you can quickly get back to a degraded state, no matter what happens. With AWS, you’re stuck with whatever they happen to fix first.
Meanwhile I've had bare metal be a complete outage for over a day because a backhoe decided it wanted to eat the fiber line into our building. All I could do was twiddle my thumbs because we were stuck waiting on another company to fix that.
Could we have had an offsite location to fail over to? From a technical perspective, sure. Same as you could go multi-region or multi-cloud or turn on some servers at Hetzner or whatever. There's nothing better or worse about the cloud here - you always have the ability to design with resilience for whatever happens, short of the internet on the whole breaking somehow.
+1, SREs can spend months during their onboarding basically reading design docs and getting to know about services in their vicinity.
Short of publicly releasing all internal documentation, there's not much that can make the AWS infrastructure reasonably clear to an outsider. Reading and understanding all of this also would be rather futile without actual access to source code and observability.
They should at least split off dedicated isolated instances of DynamoDB to reduce blast radius. I would want at least 2 instances for every internal AWS service that uses it.
I mean, something has to be the baseline data storage layer. I’m more comfortable with it being DynamoDB than something else that isn’t pushed as hard by as many different customers.
I grew up close to Niagara Falls and my dad was a firefighter there for ~30 years who had to practice rappelling down the gorge to save people (which sadly happens too frequently).
His favorite story from the last time they "shut the falls off" was that they found tons of loose change in the rocks around the rapids - people were racing to get it and bringing back buckets of money. (Of course, they also found a few bodies...)
I'm truly hoping for a reasonable resolution on all sides for this situation. IMO Ruby is too small, and shrinking compared to Python and JS/TS, especially in the AI era, to be able to afford any splintering of efforts.
Agreed. I wish the communications would move away from FUD that could scare people away from using Ruby when things are already splintered. A more honest and transparent accounting of what really happened is necessary.
I'm using Firefox 139.0.4 canonical-002 Snap on Xubuntu, and the videos don't play for me. Even when not using private browsing, even when I disable uBlock Origin, even when I disable Privacy Badger (and, of course, I've set NoScript to enable JS for the tab.)
https://www.youtube.com/watch?v=T5FR6_6kpG8