Tell HN: AWS appears to be down again
879 points by thadjo on Dec 15, 2021 | 468 comments
Anyone else seeing this?



I checked their health status page. All is good. /s

https://downdetector.com/status/aws-amazon-web-services/


They did add an update, faster than last time:

"7:42 AM PST We are investigating Internet connectivity issues to the US-WEST-2 Region."

https://status.aws.amazon.com/

Edit: They added US-WEST-1:

"7:52 AM PST We are investigating Internet connectivity issues to the US-WEST-1 Region."

Edit: Found the root cause, maybe?

"8:01 AM PST We have identified the root cause of the Internet connectivity to the US-WEST-1 Region and have taken steps to restore connectivity. We have seen some improvement to Internet connectivity in the last few minutes but continue to work towards full recovery."

"8:01 AM PST We have identified the root cause of the Internet connectivity to the US-WEST-2 Region and have taken steps to restore connectivity. We have seen some improvement to Internet connectivity in the last few minutes but continue to work towards full recovery."


Too bad I am unable to load the status page due to connection timeouts, so I can't see the updates.


someone tripped over the fiber run i bet. Or, a cleaning person unplugged a router to plug in a vacuum (that actually happened, but to a minicomputer iirc)


Unfortunately the vacuum, a shiny IoT connected appliance, didn't work because AWS was down


Usually the problem is "an idiot with a digger".


nah man, it's never the digger that's the idiot. it's always the project manager that told the digger where to dig. just like it's never the dev's fault as the PM made them do it. /s


No way a cleaning person can do that in a datacenter.


I hope that their infra is not that unstable


It's interesting that west-2 was quicker to create the incident (despite the issue starting a bit later there, at least by our experience), and while they both "identified" at the same time, west-2 also waited longer to call it resolved.

I assume there are different teams responsible for each, is the west-2 team just more on top of things?


West-2 also launched many years after us-east-1, so less legacy to deal with.


1. US-East-1 wasn't involved today.

2. They don't really have much "legacy" stuff to deal with, since they likely turn over racks quickly across their whole fleet and software deployments should be standardized. Any us-east-1 flakiness has more to do with the fact that it's where Amazon often houses their control planes.


There's at least one AZ in East-1 that doesn't support Nitro, and that's been around for 4ish years now...

I agree in principle, but clearly something is hobbling them because of (probably) legacy stuff


The issue is not specific to the US; we see the same issues in Europe. Also, it seems it's not only AWS experiencing issues. Unless Google is hosted on AWS haha...


Yes, it could be network peering related. But there's definitely a lot of us-west-1 and us-west-2 users complaining and people saying that us-east-1 seems fine.


Seems to be resolved now. And it seems they hid / took away any mention of possible issues. Sigh.


It's still there now, on the top of the page, just marked resolved:

us-west-1:

7:52 AM PST We are investigating Internet connectivity issues to the US-WEST-1 Region.

8:01 AM PST We have identified the root cause of the Internet connectivity to the US-WEST-1 Region and have taken steps to restore connectivity. We have seen some improvement to Internet connectivity in the last few minutes but continue to work towards full recovery.

8:10 AM PST We have resolved the issue affecting Internet connectivity to the US-WEST-1 Region. Connectivity within the region was not affected by this event. The issue has been resolved and the service is operating normally.

us-west-2:

7:43 AM PST We are investigating Internet connectivity issues to the US-WEST-2 Region.

8:01 AM PST We have identified the root cause of the Internet connectivity to the US-WEST-2 Region and have taken steps to restore connectivity. We have seen some improvement to Internet connectivity in the last few minutes but continue to work towards full recovery.

8:14 AM PST We have resolved the issue affecting Internet connectivity to the US-WEST-2 Region. Connectivity within the region was not affected by this event. The issue has been resolved and the service is operating normally.


That is a shame. Anyone coming in after the fact to investigate an outage or glitch with their systems will need to look harder to find a known AWS outage. We can’t assume everyone looks at HN.


Practice makes perfect


So it is down again.


Ok, so it can't be down then. This is proof!


Yep, when it loads, it's all green. "nine nines!!!"


I thought that sounded ridiculous so I did the math: 99.9999999% uptime allows for about 31.5 _seconds_ of downtime every 1000 years, or roughly 31.5 ms per year. It would take over 100,000 years to accumulate just an hour's worth of allowable downtime. Within a single quarter of a year, the budget is about 7.9 ms: a small fraction of a single blink of an eye [1], only several typical electric capacitor camera flash durations [2], and, interestingly enough, comparable to my current ping to my ISP, let alone Amazon's servers.

So yeah, having done that I now understand that it was probably a joke but it really puts into perspective just how ridiculous things can get with a few 9's.

[1] https://www.verywellhealth.com/why-do-we-blink-our-eyes-3879...

[2] https://en.m.wikipedia.org/wiki/Flash_(photography) (wikipedia won't let me deep link on my phone; it's in the electronic flash section under Types)
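
For anyone who wants to check the arithmetic, here's a tiny script (it assumes 365-day years and ignores leap seconds):

    # Downtime budget for N nines of availability, assuming 365-day years.
    SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000

    def downtime_budget_seconds(nines, years=1.0):
        """Allowed downtime in seconds for e.g. nines=9 -> 99.9999999%."""
        return SECONDS_PER_YEAR * years * 10 ** (-nines)

    for n in range(3, 10):
        print(f"{n} nines: {downtime_budget_seconds(n):.4f} s/year, "
              f"{downtime_budget_seconds(n, 1000):.1f} s per 1000 years")
    # 9 nines comes out to ~0.0315 s/year, i.e. ~31.5 s per 1000 years.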


emphasis on when


60% of the time, it's all-green 100% of the time


Down Detector is just a statistical page; it does not actually detect downtime, and it is in no way AWS's status page.


What does downdetector run on?


User reports — i.e. the number of people who google “is X down” and then click a Down Detector link.

It’s a clever way of getting reasonably accurate data very quickly and easily, though it does have its flaws — the data is pretty noisy and users often attribute outages to the wrong service (e.g. blaming their ISP or Microsoft or something when YouTube is down, or vice versa).


I would guess the user is asking what Down Detector's dependencies are... e.g. can their website function if us-east-2 goes down? Or a GCP equivalent? Or are they on a self-hosted server? What would cause the metrics to be "off"?


They really need to stop requiring SVPs or higher to show non-green status on the status page, as other HNers have revealed in last week's AWS post. It's effectively not a status page, and they could probably be sued if it can be demonstrated that a service was down while the status page showed green (since the SLA is based on the status page). It should be automated and based on sample deployments running in every region and every service. And they should use non-AWS instances to do the sampling, so they can actually sample when, say, we experience the obligatory black friday us-east-1 outage every year.


I think SVP / GM approval is only needed for yellow / red status. From my time in AWS Support, the Support Oncall and Call Leader / GM delegate worked to approve green-i posts.


If my app won't run for reasons that are not my fault for longer than the SLA guarantees, the affected services should be at least yellow status and I should be accumulating free AWS credits.


They were much faster than usual about updating the AWS Status page.


With some lame ass tiny blue "connectivity issues" informational text. Surely broken routing to two entire DCs is full red for all services available therein?

Like what, the networking is broken but if you could send packets, the services would still work so they are green?


I was still able to reach our service running in us-west-1 while the connectivity issue was still ongoing, so I don't know if it was a full interruption.


Our ~four person ops team shouldn't be able to have our status page updated 15 minutes before the upstream status page...


I thought status pages or health pages were designed to automate reporting and check status automatically. That was my impression when I first came across them. Apparently they're not automated and are only updated manually. What is the point of having a status page if it cannot be automated? I'm sure FAANG and the tech conglomerates don't want it to be automated because of SLAs.

I'm surprised that FAANG companies host their stuff on their competitors' cloud services without providing a fallback cloud service if the primary is down. Sure, it costs money, but it would be more effective than putting all their eggs in one basket.


As stated earlier, AWS has financial incentive to not update the status page. Nobody is willing to call them on the conflict of interest in a meaningful, market-changing way.


Perhaps someone could produce an alternate, Patreon-supported status page that accurately reports on the status of AWS services.


Would love to see them called out via new regulations or a lawsuit, however :)


Why is new regulation the answer here? Let everyone move to Azure, if they care that much about status pages and SLAs.


What if everyone has a financial incentive to lie? (They do.) Where do we go then? Also, saying "everyone just leave" is a lot easier than everyone "just leaving", but that's tired and repeated. There's a huge mess and tangle of incentives and drawbacks, and I don't know if we'd ever get enough support to weed out a service that gets us above the nth percentile of greatness. As one falls the other will begin to abuse its power; I don't trust any megacorp to do otherwise. Do you?


Any public communication is handled by people, not machines. No one wants to make an automated status page because there's a shit ton of real noise that users don't need to hear about, and there are a lot of outages that automation won't accurately catch.


> we experience the obligatory black friday us-east-1 outage every year.

Is this a thing?


I tried to monitor service status using https://stop.lying.cloud, but it is also hosted on AWS, and down too.


If they're monitoring AWS downtime they might want to rethink this.


How come? It's accurate.


True, if it is down, then that means AWS is down (not necessarily, obviously). :D But honestly, if they want to monitor AWS, they gotta pick something else for this reason, something that is not down when AWS is.


I guess it depends on whether you like your FALSE's encoded as timeouts :)


Well... Yes. Hahahah


Work smarter, not harder


AWS should monitor itself from Azure or GCP; even DO or Linode would make more sense.

Eating your own dog food shows confidence, but monitoring is a different dimension: there you need to use anything but your own dog food.


It's the only realistic multi-cloud provider scenario I can ever come up with that I would consider actually implementing...


AWS wouldn't monitor itself from a competitor, of course, but they could just as well silo a team and isolate DCs to do independent self-auditing.


I don't know about AWS, but I know a lot of us uptime-monitoring makers use (and pay for) competitors' products to know if we're down.


Rightly so. My point is a company can self-audit without having to pay a competitor.


I think that is inherently riskier because you never know on what axis you will have a failure and it is difficult to exclude all shared axes.


But we're talking about a status page, which should be basically static. In its simplest form you need a rack in 2+ random colos and a few people to manage the page-update framework. Then you make teams submit the tests that are used to validate the SLA. Run the tests from a few DCs and rebuild the status page every minute or two.

Maybe add a CDN. This shit isn't rocket science, and being able to accurately monitor your own systems from off-infrastructure is the one time you should really be separate.
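
As a rough sketch of what that could look like, here's a toy prober, assuming hypothetical probe URLs for the sample deployments and a couple of off-cloud vantage hosts each running it from cron and publishing the resulting static page:

    import datetime
    import urllib.request

    # Hypothetical SLA probes: (service name, URL of a sample deployment).
    # In practice these are the tests each team submits.
    PROBES = [
        ("ec2-sample-app", "https://sample-ec2.example.com/healthz"),
        ("s3-sample-read", "https://sample-bucket.example.com/canary.txt"),
    ]

    def probe(url, timeout=5.0):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return 200 <= resp.status < 300
        except Exception:
            return False

    def render_status_page():
        now = datetime.datetime.utcnow().isoformat() + "Z"
        rows = "".join(
            f"<tr><td>{name}</td><td>{'UP' if probe(url) else 'DOWN'}</td></tr>"
            for name, url in PROBES
        )
        return f"<html><body><h1>Status as of {now}</h1><table>{rows}</table></body></html>"

    if __name__ == "__main__":
        # Run from cron every minute or two on each vantage host,
        # then upload status.html to the (static) status site.
        with open("status.html", "w") as f:
            f.write(render_status_page())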


That applies when you use competitors too.

They could have a related outage, or even a coincidentally timed one


Absolutely. And even if it’s cheaper to use the competition, an expensive custom solution will be found.


They have a bazillion alexa and kindle devices out there that they could monitor from, heh heh. At least let that phone-home behaviour do something useful, like notice AWS is down.


> AWS wouldn't monitor itself from a competitor, of course

Why not? The big tech companies use each other all the time.

For example, set up a new firewall on macOS and you can see how many times Apple pulls data from Amazon or Azure or other competitors' APIs and services.


Apple is not a competitor to AWS or Azure in any way. They offer no infrastructure/platform as a service that I am aware of.


Apple and Amazon are competitors. Apple and Microsoft are competitors.

The postulation was that Apple and Amazon weren't competitors, not that they're not competitors in a specific niche.


But the idea that Amazon or Microsoft or Google would host anything at apple is pretty out there.

Apple uses their competitors' services because they can't build their own cloud and host their own shit. The big boys don't use competitors for services they are capable of building themselves.


And yet video.nest.com (Google) resolves to an Amazon load balancer.


A similar reason drives businesses to host `status.product.bigcorp` on a different server. And if your product is a cloud then your suggestion makes sense.


Yeah, I homed https://stop.lying.cloud out of us-west-2. Oops.


Considering the sea of bright green circles, reds might stand out but blues get lost in a fast scroll. Perhaps fade or mute the green icon to improve visibility of non-green which is the interesting information?


The brand is strong if you’re really the owner


How does this service work?

It seems to have all the look and feel of AWS, and somehow has more up to date info than the official AWS status page?


It's the same info - it just changes all blues to yellows and all yellows to reds. :)


I had no idea!

Pretty funny actually.


Now that they're back up they're not reporting any problems; how is it supposed to work? It looks like it is just repeating the status reported on the Amazon status page.


It is. It's just the AWS status page run through a transformation function to:

1. Remove all the thousand green services that no one cares about when looking at AWS status

2. Upgrade all yellows to reds because Amazon refuses to list anything as "down" no matter how bad the outage is.

3. Insert a snarky legend
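
In other words, roughly this (just a sketch of the idea, not the site's actual code, and it assumes the upstream status has already been parsed into (service, severity) pairs):

    # Severity levels roughly as the upstream page presents them.
    GREEN, BLUE, YELLOW, RED = "ok", "informational", "degraded", "down"

    # Bump everything up one level of honesty.
    UPGRADE = {BLUE: YELLOW, YELLOW: RED, RED: RED}

    def translate(statuses):
        """statuses: iterable of (service_name, severity) from the upstream page."""
        for service, severity in statuses:
            if severity == GREEN:
                continue  # drop the sea of green nobody is looking for
            yield service, UPGRADE.get(severity, severity)

    upstream = [("EC2 (us-west-2)", "informational"), ("S3 (us-east-1)", "ok")]
    print(list(translate(upstream)))  # [('EC2 (us-west-2)', 'degraded')]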


I mean, sounds like it's working as intended then?


Funny I didn't know that and assumed it was okay


That’s hilarious


I wonder if AWS will make more or less money from these outages?

Will large players flee because of excessive instability? Or will smaller players go from single-AZ to more expensive multi-AZ?

My guess is that no one will leave, and lots of single-AZ tenants who should be multi-AZ will use this as the impetus to do it.

Honestly, having events like this is probably good for the overall resilience of distributed systems. It's like an immune system, you don't usually fail in the same way repeatedly.


* Free chaos monkey installed in every AZ


> * Free chaos monkey installed in every AZ

Only during this beta period, AWS will start charging for this feature soon enough.


We (Netflix) begged them for years to create a Chaos Monkey that we could pay for. There were things we just couldn't do ourselves, like simulate a power pull or just drop all network packets on the bare metal. I guess not enough people asked.


CMaaS sounds amazing for resiliency engineering. There's so much I want to be doing to perturb our stack, but I don't know all the ways stuff can go wrong. Sure I can ddos it, kick services and servers offline, etc, but that's what, a few dozen failure modes? Expertise in chaos would be valuable by itself. Not to mention being able to shake parts of the system I normally can't touch.

Side note: terraform is pretty good for causing various kinds of chaos, deliberately or otherwise.


If my company is any indication, they're going to make more money since everyone will simply check the multi-AZ or multi-region checkboxes they didn't before and throw more money at the problem instead of doing proper resiliency engineering themselves.


It doesn't matter how much resiliency engineering you do: having everything in a single AZ is a risk. If that's acceptable then it's fine; if not, you need to think about multi-AZ from day one.


Auth0 ran in six AZs in two regions[1] and went down today[2], because they picked the wrong two regions. How many regions and AZs should someone pay for before they get reliability?

1: https://auth0.com/blog/auth0-architecture-running-in-multipl... 2: https://twitter.com/auth0/status/1471159935597793290


At a minimum they should have chosen regions not in the same time zone or general geographic area. US-West-1 and US-West-2 might well safeguard against a server failure, but that is not a disaster plan. If your customers are global, choosing multiple continents is probably prudent.


Whelp, I guess you're not using Cognito then. It has no user account syncing feature so you can't have a user group in more than one region. Grrrrr!


No one just "moves off" AWS. Once your apps are spaghetti coded with lambdas, buckets and all sorts of stuff, it's basically impossible to get off. More than likely, as you noticed, it will increase spending since multi-AZ/multi-region will become the norm.


>I wonder if AWS will make more or less money from these outages?

There is no possibility that outages are good for AWS. Nor is there more money to be made from "publicity" of the outages.


I think GP has a point with,

> Or will smaller players go from single-AZ to more expensive multi-AZ?


No -- if they needed to they already would have migrated to a multi-region. If they don't need it -- they won't have. The reason is simple -- it's expensive as you say. I'm not a fanboi or evangelist of AWS either -- I do have pet theories they named their products with shit names in order to make more money by making AWS skills less transferable to Google Cloud etc. S3 should be Amazon FTP, RDS should be Amazon SQL etc.


> S3 should be Amazon FTP

I... don't think you know what S3 is. Or maybe what FTP is.

(Also S3, EC2, RDS, etc. were named long before GCP had competing services)


I mean, lots of people put off doing something expensive but safer just because it’s expensive, but rethink after the consequences show.


S3 is nothing like FTP? RDS stands for Relational Database Service. You have a valid point but picked the worst examples.


S3 is Simple Storage Service, RDS is Relational Database Service, EC2 is Elastic Compute Cloud.

All of these make sense.

If you're gonna complain about names, at least pick the really sucky ones, like Athena, Snowball, etc.


You’re saying businesses always make the right decisions and never put them off?


Not at all the case. It was a regional outage that got Netflix to more than double our AWS spend going multi-region, so that outage netted them millions of extra dollars per year just from Netflix.


You’re underestimating the ability of eng leadership to not take these issues seriously. Only when there’s sufficient pressure from the very top or even the customers it takes a priority.


> There is no possibility that outages are good for AWS.

Do you know how many non-technical CEOs/boards/bosses have told their tech people that they need to go multi-region/cloud because that's what the one-paragraph blog post and/or tweet told them to do in response to last week's event?


The actual answer?

In the next 5 calendar years the bottom line will still grow.

However, the brand damage means they permanently lose market share. Which impacts their growth ceiling.


I would not go multiple Availability Zone within the same Infra/Cloud provider...


"Or will smaller players go from single-AZ to more expensive multi-AZ"

Yes! When you have a service interruption pay 2x more! With a region down I am sure other regions wont have any interruptions either! /s


This outage is extremely frustrating to me. My company hosts all our apps in gov cloud. Gov Cloud West 1 is also down, but the AWS Gov Cloud status page indicates that everything is healthy and green. I thought AWS's incident response to the East outage last week was that they'd update the status page to better reflect reality.

Gov Cloud Status Page: https://status.aws.amazon.com/govcloud


We are in the same boat. Finally updated "We are investigating Internet connectivity issues to the US-GOV-WEST-1 Region"


i had multiple govcloud hosted salesforce instances down but they appear to be coming back up now.


Everyone who spent the past week migrating from us-east-1 to us-west-2: this joke is on you. :)


"US-EAST-1 or bust" being manifested right now.


It's not just AWS - check the down reports: https://downdetector.com/

Cloudflare having some significant issues as well on certain domains.


It's possible people are reporting the issue as CloudFlare because that's whose error page they see when a box on EC2 is unreachable.


No, we are not. But customers who use AWS are having trouble.


Thanks for clarifying! Things seem to have settled down.


The list of affected services is a bit all over the place, especially since I highly doubt Xbox Live or Halo is running on AWS.


Down Detector doesn't really detect anything other than people saying "Is [service X] down?" on Twitter. If you believe it, Xbox Live appears to be permanently offline, because the typical Xbox Live user will declare anything that prevents a connection, from tripping over their ethernet cable to a tornado levelling their house, to mean Xbox Live is down.


It’s still useful if you remove units from the graph and treat it as a sparkline. If there are reliably ~100 Xbox Live complaints on Twitter per hour, then suddenly there are 3000, that’s an outage.
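
A rough sketch of that kind of spike detection, with an arbitrary window and threshold:

    from statistics import mean, pstdev

    def looks_like_outage(hourly_reports, window=24, sigma=4.0):
        """Flag the most recent hour if it's far above the trailing baseline."""
        history, latest = hourly_reports[-window - 1:-1], hourly_reports[-1]
        return latest > mean(history) + sigma * max(pstdev(history), 1.0)

    # ~100 complaints/hour normally, then a sudden 3000:
    counts = [100, 95, 110, 102, 98, 105, 97, 101, 99, 103] * 3 + [3000]
    print(looks_like_outage(counts))  # True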


If that were true, the line should be flat-ish, but it and playstation's show the same extreme spike at the same time as aws etc.


lol imagine if azure was just AWS in the backend


Is it bad that I can almost see that being a quick and dirty MVP to get out the door while you built your own cloud solution? Raises serious migration and cost issues, but... would be interesting.


I think for some targeted things there might well be "value added" services you could offer that transparently wrap AWS. E.g. a "write-through" S3 wrapper was something I was actually looking at, because some clients when I was contracting were very reluctant to trust anything but AWS for durability, but at the same time AWS bandwidth costs were so extortionate that renting our own servers from somewhere like Hetzner, proxying writes both to a local disk and to S3, and serving from local disk with a fallback to pull a fresh copy from S3 if missing broke even at quite a small number of terabytes transferred each month.

The nice part about something like that is that, properly wrapped, you can change your durable storage as needed, and can even selectively pick "cheaper but less trusted" options for less critical data. It also allows you to leverage AWS features to ride closer to the wire. To take another example than storage, I've used this to cut the cost of managed hosting by being able to spill over onto EC2 instances, allowing you to run at a much higher utilisation rate than you can safely on managed / colo / on-prem servers alone. As a result, ironically, the ability to spill over onto EC2 makes EC2 far less competitive in terms of cost to actually run stuff on most of the time.
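
For the curious, a stripped-down sketch of that write-through idea (not the code from back then; the bucket name and cache path are made up, and it assumes boto3 credentials are already configured):

    import os
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "durable-backing-store"    # hypothetical bucket name
    CACHE_DIR = "/var/cache/blobstore"  # cheap local disk on the rented server

    def put(key, data):
        """Write through: local disk for cheap serving, S3 for durability."""
        path = os.path.join(CACHE_DIR, key)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
        s3.put_object(Bucket=BUCKET, Key=key, Body=data)

    def get(key):
        """Serve from local disk; fall back to S3 (and re-warm) if missing."""
        path = os.path.join(CACHE_DIR, key)
        try:
            with open(path, "rb") as f:
                return f.read()
        except FileNotFoundError:
            data = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as f:
                f.write(data)
            return data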


> a quick and dirty MVP to get out the door while you built your own cloud solution?

Seemed to work for Dropbox.


For the core services? Definitely. But do we really know that some 3rd party API which doesn't fail gracefully isn't causing this?


HN was also (briefly) down around that same time (roughly 1 hour ago from now).


DownDetector is showing everything down during that period, including Google.

I suspect DownDetector itself suffered some outages during this period, which it shows as outages of every service it monitors.


That's not how DownDetector works. It just relies on reports from users. The real failure case is users not understanding why they can't access whatever end service. Maybe they blame that service, maybe they blame their ISP, maybe they blame something else.


downdetector.com uses user complaints, so it's unreliable as people can blame anything


some sort of widescale attack would be the only explanation right?


This looks weird. At the same time all those services had a spike in outage reports.


can confirm i have multiple salesforce instances down.


Is it AWS or could it be an ISP?

AWS seems to be working for me, but I’ve worked with clients in the US and spectrum internet tended to drop connections to us sporadically, which looks like an outage to our clients but is something we obviously can’t control.


If it's a network issue, it's on their side. I've verified from centurylink, comcast, cogent, he.net, at&t, and verizon - all of them are having issues. This isn't like: Cox is having an outage and just can't get to AWS.


I have an outage way over in the southeast, looks to be affecting the major monopoly ISP. Can't get a tech to our data center until 2PM.


Things were working during the event, but connectivity was pretty messed up

https://imgur.com/a/VsrS0JZ

(This is two similarly spec'd boxes on us-east-2 and us-west-2). Looking at GeoIP of connecting clients, the only pattern I can see is the region itself.


I'm wondering the same thing. We have stuff hosted in us-west-2 and multiple people across the US are reporting that our systems are down, however our system is working fine for me here, which is near Toronto.


When us-east was down recently, our apps were not affected and we host on east. Maybe a similar issue?


The us-east-1 downtime was in the interconnection between AWS-hosted services, including the control plane, so most resources not dependent on AWS APIs stayed up (e.g. non-autoscaled EC2 instances).

https://news.ycombinator.com/item?id=29516482


Currently we're seeing ~40,000 ms response times from CloudFront distributions, we can't hit PagerDuty (probably runs on AWS), etc.

I guess it could be an ISP thing but I guess we're all assuming 80/20.


I wonder if you really dug into most company's tech stacks, how many of their support tools (e.g., PagerDuty) are reliant on overlapping cloud providers.


Oh man, it is insane. During the aws incident last week we couldn't build software because bitbucket pipelines were all down, due to them running lambdas in us-east-1 only haha.

We've taken a massive turn away from a "decentralized" internet.


it's still decentralized...it's just a centralized version of it right?

just like Cavendish bananas are grown in multiple places...


Yeah, a number of people got hit by that. Louis Rossmann found out that every form of contact to his business was reliant on AWS us-east-1. https://www.youtube.com/watch?v=DE05jXUZ-FY


It was an AWS networking issue: 90%+ packet loss pinging to Google & Facebook.


I'm so glad that I'm not still the CTO of a startup. I would be getting dozens of e-mails from people without engineering backgrounds asking "Are we multi-cloud", "why didn't you make us multi-cloud"?


Well, why didn't you? :)

The response is that single-cloud actually works well enough that the investment required has not pushed anyone to do it (with "it" meaning building the core infrastructure to make multi-cloud easy).


We are seeing issues with requests to Auth0, which I believe is hosted on AWS and has historically gone down when AWS has had issues


We see issues with Auth0 too. Other AWS services we use seem to be working fine so far (us-east-1)


AWS is reporting an issue in us-west-2 on their status page.


Auth0 went down for us as well right when AWS did. At least it's not like those two systems run our entire company...


There was a brief period of time back in the early 90's where I felt I understood how Linux worked -- the kernel, startup scripts, drivers, processors, boot tools, etc... I could actually work on all levels of the system to some degree. Those days are long gone. I am far removed from many details of the systems I use today. I used to do a lot of assembly programming on multiple systems. Today I am not sure how most of the systems works in much detail.


To an extent, this is one of the goals, to free up engineers to work on higher level things. Whether it meets that goal in some cases is debatable, and it’s certainly not ideal for us engineers who like to get to the bottom of things.


“working on higher level things” currently implies that depending on many layers of opaque and unreliable lower level hardware and software abstractions is a good idea. I think it is a mistake.


The best conclusion I can come to is "sometimes it works, sometimes it doesn't". Depends on the context. I've seen cases where it works great and other times where it's a huge hassle.


Funny, I feel the exact opposite way. The low level stuff is where all the magic happens, where performance improvements can scale by orders of magnitude rather than linearly with a CTO’s budget. I’d much rather figure out how to condense some over-engineered distributed solution down to one machine with resources to spare.


Seems like ever since Microsoft bought AWS, it's been going down an awful lot.


> Seems like ever since Microsoft bought AWS, it's been going down an awful lot.

What?


Satire.

Every time GitHub went down, multiple people posted on HN saying "ever since they were bought by Microsoft, ...". As annoying as those Rust evangelists on every single memory corruption bug.


> As annoying as those Rust evangelists on every single memory corruption bug.

First of all, how dare you!

Second, shoulda used rust ¯\_(ツ)_/¯


I could have written the OP message a year ago -- I used to feel the same way.

Plz don't disparage Rust evangelism!

Rust is awesome. Yes, it is complex, frequently annoying, easy to learn, difficult to master. I'm speaking from a 30-year dev career.

A few months ago I intended to do a quick investigation into Rust to validate my "I really don't need to learn this" stance, specifically for an embedded project. Within a few hours I found I had become a zealot. Rust has too many "omg, I should tell everybody about this" behaviors that I can't even find my favorite aspect yet.

It's equivalent to a lost soul finding Christianity and accepting the lord's blessing and forgiveness! The weight that is lifted by being forgiven for your sins == no more guilt, it's all forgiven! An immediate reduction of cognitive dissonance. In this example with Rust, it's pointer tracking and memory management, but it's basically the same thing. Rust is for the pious developer.

Those people who are still using C++ for fresh starts are the same folks who love to do things the hard & wrong way, or at least those who don't know any better, infidels, unwashed heathen.

Join us. join rUSt.


While I'm not sure whether you're serious ;), to be clear: what annoys me is they don't really understand why we are having "someone pwned your phone via a series of memory corruption bugs" daily.

Until those Rust evangelists manage to rewrite the world with Rust (and I promise you there will still be a lot of security bugs), we still have to fix our shit in a low-cost way, and their evangelism does not help at all and is pure annoyance.


> what annoys me is they don't really understand why we are having "someone pwned your phone via a series of memory corruption bugs" daily.

No, I understand it. I started out in vuln research and have been in defense for a decade. It's probably fair to even say I'm an expert on it.

I'm going to keep advocating for Rust as one of the highest ROIs for improving security.


Obviously while using Arch btw


Didn't know Tim Dillon is hanging on here in HN.


Haha wtf?


That was fun. Badges weren't working (daily checkin required) so the front desk had to manually activate them.

Slack wasn't sending messages and Pagerduty was throwing 500's.


... because you need to contact a server 1000 miles away to issue badges in your building.

This cloud-for-everything-even-local-devices thing is both hilarious and sad.

I wonder if anyone had trouble doing their dishes or laundry today, because I'm sure someone thought dish washers and washing machines needed cloud.


I don't know if you can say an on-premise badge hosting service would be more reliable than the cloud.


Well, at least you have the agency to do something about it yourself.

Also, building access systems should be hosted in the building they reside in, for security reasons anyway.


This creates some really fun failure cases of the form "I need to enter the building so that anybody can enter the building".

Depending on the cloud is certainly a very stupid decision. Keeping everything inside the building is better, but still not ideal.


Any electronic access system like this requires manual backup. As in, some doors with regular locks using physical keys.


It requires an override anyway, in case of emergencies like a fire.


Taking badges out of the cloud reduces points of failure by several orders of magnitude.

Cloud-based badges make sense if you have locations with small staffs and no HR people or managers. Like if you're controlling access to a microwave tower on the top of a mountain.

But badges-in-the-cloud for an office building full of people who are being supervised by supposedly trusted managers, and all of whom have been vetted for security and by HR, is just being cheap.

Like the 1980's AT&T commercials used to say: "You get what you pay for."


> Taking badges out of the cloud reduces points of failure by several orders of magnitude.

I'm not convinced that's true, or at least certainly not an order of magnitude. Wouldn't a badge system hosted on-prem also need a user management system (database), a hosted management interface, have a dependency on the LAN, and need most of the same hardware? Such a system would also need to be running on a local server(s), which introduces points of failure around power continuity/surges, physical security, ongoing maintenance, etc.


All of those things would also be needed by the cloud provider, too. Just because it's on-prem doesn't mean it doesn't need servers, power conditioning, physical security, etc. "Cloud" isn't magic fairies. It's just renting someone else's points of failure.

In addition, you're forgetting the thousands of points of failure between the building and the cloud provider. Everything from routers being DDOSed by script kiddies to ransomware gangs attacking infrastructure to Phil McCracken slicing a fiber line with his new post hole digger.


The remote solution requires all of those same things, plus in addition it requires internet connectivity to be up and reliable, the cloud provider be available and the third party company be up and still in business.

Adding complexity and moving parts never reduces points of failure. It can reduce daily operating worries as long as everything works, but it can't reduce points of failure. It also means that someday, when it breaks, the root causes will be more opaque.


Within the building’s on premise hosted infrastructure, are they going to buy multiple racks and multiple servers spread far enough apart so that there aren’t many single points of failure that will bring the badge machine down if they fail?


Yes, everyone but you is wrong.

Many logical people have decided to abstract away their soul-crushing anxieties and legal gray area during outages to incredibly stable and well-staffed cloud infrastructure providers.

If you and your team are better at taking care of hardware than an entire building full of highly paid engineering specialists, then that's cool for you, but also, no you're not.

That's not to say you're not capable of running on-prem hardware that is stable.

I'm just saying that the high-handed swiping away of everyone else who's made an incredibly safe and logical decision to host their stuff in the cloud makes me question your general vibe.


> If you and your team are better at taking care of hardware than an entire building full of highly paid engineering specialists

The trade offs aren't quite that simple. Those specialists are necessary because they're building and maintaining infrastructure that's extremely complex since it has a crazy scale and has to be all things to all people. When you're running in-house, your infrastructure is simpler because it's custom tailored to your specific requirements and scale.

There are tradeoffs that make cloud vs local make sense in different contexts and there's no one right answer.


> but also, no you're not.

If you plan to replicate all of AWS I'd agree with you. But if all you need is a handful of servers, you could end up with better uptime doing it in-house just because you don't have all the moving parts that make AWS tick, reducing the chance for something to go wrong.

My bare-metal servers stayed up during both of the recent outages, not because I'm some kind of genius that's better than the AWS engineers but just because it's a dead simple stack that has zero moving parts and my project doesn't require anything more complex.


There is absolutely no reason for a local device (like a door lock or dishwasher as per OP) to depend on any external connectivity. Not to the company on-prem hardware, not to AWS.


Yep, it's broken again. I was trying to install some Thunderbird extensions, and stuff started breaking halfway through. Never thought of an AWS outage borking my mail client I guess...


We lost all public IPv6 in the Linode Newark DC.

This appears to be cross-provider.

Edit: We have IPv6 back.


We're having issues connecting to our EC2 bastions and accessing the us-west-1 dashboard too

EDIT: Cognito auth seems down for us too

EDIT2: our ALBs are timing out as well

EDIT3: us-west-1 looks like working now!


That's the price of PIP culture and burning out your devs. Now no one wants to work at Amazon and they can only hire new grads.


I hear they do get people who want to be able to get experience at AWS's scale; there are only a few places for that.

The thing that really gets me is the reports from the last major outage a few days ago about how pervasive lying inside the company is. This really doesn't work well for engineering and we're possibly seeing the results of that. We should certainly expect to see that becoming visible the more time goes on without a major cultural shift. Which given that the guy who ran AWS now runs all of Amazon.com....


Looks like it's taken down SendGrid, NPM, Twitch, and Auth0 so far.


PlayStation Network went down at the same time.


Stripe as well


Notion as well


How much do you guys think these frequent outages will affect their market share in cloud products?

Is this enough of a push for organizations to actually move over their infrastructure to other providers?


Not at all.

The other cloud providers have had their own outages.


Sadly this. People are entrenched with AWS, and the "we're not the only ones down" thing truly has some effect.

Organizations can more easily swallow an AWS failure when they aren't the only ones hit. If they move elsewhere, those outages look more unique.

Folks may think multi-cloud is a good idea... but you're just as likely to suffer from the extra points of failure as you are to benefit.


Multi-cloud is such an odd idea to me. You're either building abstractions on top of things like cloud-provider specific implementations of CDNs, K8S, S3, Postgres, etc...or using the cloud just for VMs. The latter would be cheaper with just old-school hosting from Equinix, Rackspace, etc. The former feels like a losing battle.


It’s prompted discussions of building multi regional services in my org but not multi cloud. They would have to really really really screw up for that to happen… maybe be down for like a week or something.


Reminder that the internet was literally invented to survive this kind of nuclear attack. But I guess people are herd animals and prefer to die as a group.


More like all these companies ultimately buy into a certain form of vendor lock-in and have no competence or willingness to migrate or even consider the competition. It starts with "oh, I'm just renting a remote virtual server" and in no time it's "oh, all my stack is tied to AWS proprietary products", because convenience. That's what Amazon wants.


Seems like the Internet level networking is quite robust at this point.


We're having troubles in us-west-2.

Discourse is reporting trouble, too. https://twitter.com/DiscourseStatus/status/14711403698992906...


us-west-1 also seems offline, but us-east-1 (ironically) seems fine


AWS status page shows an update:

> AWS Internet Connectivity (Oregon): 7:42 AM PST We are investigating Internet connectivity issues to the US-WEST-2 Region.

Source: https://status.aws.amazon.com


Oh. not again...


It is surprising that their status page is down too:

https://status.aws.amazon.com

Their CDN, CloudFront, always works reliably for me. Couldn't they put the status page on CloudFront?


Takes minutes to update a CloudFront distribution (they say around 5 minutes in their blog post from last year when speed was improved [1]). I think they might want to be able to change it to "everything's back to normal" in an instant, based on the SLA argument I've seen thrown around last time an AWS region was down.

[1] https://aws.amazon.com/blogs/networking-and-content-delivery...


It's minutes to update the distribution settings, but that doesn't have to be the case for the content itself. A much lower cache time can be used.
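
For example, the origin just needs to send a short cache lifetime and the edge will re-fetch frequently without any distribution update, assuming the distribution's cache policy honors origin Cache-Control headers. A minimal sketch with the standard library, where the 30-second TTL is an arbitrary choice:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    STATUS_HTML = b"<html><body>All services operating normally.</body></html>"

    class StatusHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            # Edges re-check every 30 s, so status changes propagate quickly
            # without touching the distribution config at all.
            self.send_header("Cache-Control", "public, max-age=30, s-maxage=30")
            self.end_headers()
            self.wfile.write(STATUS_HTML)

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), StatusHandler).serve_forever()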


The status page is working great for me. Did they make it multi-region after the last failure? I'm on the east coast.


Central EU here, appears to be down.


Northern EU, down as well. AWS Management Console in eu-west-1 opens up just fine though.

Edit: Hitting refresh a bunch finally got it open.


Western EU here, appears to be up for me. Maybe a peering issue?


It's back up for me, too, right now. Rather slow, though, and traceroute shows 25 hops. So it might really be peering.


Works for me. It's the usual static page with everything green.


Maybe it is just a static website. Do they even have CSS for red? :D


Not working for me either in the UK.


Down for me, as well.


I think it's time to face the fact that we all have too many of our eggs in the AWS basket.


I'm seeing outages on us-west-2 too. Customer facing traffic being served through Route53 -> ALB -> EC2 is down and CLI tools are failing to connect to AWS too.


Vercel is down too.

My sites run on Cloudflare and Vercel, and I can't even log in to those right now.

I'm curious — what does Hacker News run on? It seems impervious to any kind of downtime...


> I'm curious — what does Hacker News run on? It seems impervious to any kind of downtime...

On a dirty, disgusting dedicated server.


> On a dirty, disgusting dedicated server.

I'm adding "reliable" into that mix. Too bad they're too expensive and hard to setup for side projects, but HN is probably one of the most stable site I frequently visit, and I don't even think about it.


I disagree that they're expensive. Expensive to own maybe, but you can rent them on a monthly basis from something like Hetzner or OVH for a fraction of the cost of AWS (especially when you include bandwidth which is free and unmetered in this case) and they handle hardware maintenance for you.

Hard to setup is relative. It all depends on what you're doing and how much reliability you need. For a side project or a dev server you can just start with Debian, stick to packaged software (most language runtimes and services such as Postgres or Redis are available) as much as possible and call it a day. You can even enable auto-updates on such a stable distro.

The knowledge you'll gain by dealing with bare-metal is also going to be useful in the cloud even in container environments.


> I'm adding "reliable" into that mix. Too bad they're too expensive and hard to setup for side projects

I mean they're not particularly. Unless the use is extremely minimal, it'll always be cheaper to buy a small server even for a side project.

I use cloud VMs for projects that can live on $5/mo VMs because at that usage rate I'll never break even to buy a machine.

But as soon as your AWS bill is even like $50/mo, it's worth starting to look at alternatives.


HN definitely gets overloaded at times, including during big outages when everyone stampedes here. I got a bunch of "sorry, we can't serve your request" a little while back.

Pobody's nerfect.


Doesn't HN run on top of Firebase? Which ties it to Google? People seem to imply it's run on a dedicated instance.


No; a copy of the HN database is synced regularly to Firebase (https://github.com/HackerNews/API), but IIRC the site itself runs on a single process on a single machine with a standby ready.

edit: Yup. https://news.ycombinator.com/item?id=28479595


DNS A record suggests a dedicated server from this company:

https://www.m5hosting.com/


Wow, yeah, us-west-1 AND us-west-2 are reporting connectivity issues. I'm guessing this is related to the Auth0 outage that's currently going on too.


Tangentially related: On Friday Backblaze and B2 were down for 10+ hours to update their systems for the log4j2 vulnerability. Seemed noteworthy for the HN crowd and I posted a link to their announcement when the outage began. However, the post was quickly flagged and disappeared. Genuinely curious, why is announcing some outages ok and others not?


What would be the ratio of HNers who are Backblaze customers vs. those who are AWS customers? I bet the Backblaze number is small enough that Backblaze employees on HN can downvote you enough for it to matter.


Must be a Y in the day.

It amazes me how many projects exist that don't even have multi-region capability, let alone freedom from single points of failure.


Multi-region is difficult and expensive, and a lot of projects aren't that important. Most of our infrastructure just isn't that vital; we'd rather take the occasional outage than spend the time and money implementing the sort of active-active multi-region infrastructure that a "correct" implementation would use. We took the recent 8 hour us-east-1 outage on the nose and have not reconsidered this plan. It was a calculated risk that we still believe we're on the right side of. Multi-AZ but single-region is a reasonable balance of cost, difficulty, and reliability for us.


Curious if you tell your customers you're totally OK with having lower than 99.9% availability.


We don't have any external customers; they are all internal. We're all on the same side of the table.


Sounds like even worse deal for the customer since there is no refund


Depends on what the service is.

I have some services which can cope with 98.5% downtime, as long as they are available for the specific 1.5% of the time we need them to run; as such, "the cloud" is useless for that service.


Right. When you really want your thing to be up and can't amortize hours of continuous downtime, the cloud has no solution for this. That's something that often gets left out of the sales pitches, though =)


How many 9s can you get from a single-region multi-AZ deployment that's not on us-east-1 and which only uses basic services (EC2, IAM, S3, DynamoDB, etc.)?

Really only 3?


Depends on how critical they are to your stack. In my experience, if you use more than a few products and any one of them can take you down, then yeah, it's less than 3. Just something to ponder, but if S3 didn't meet 99.9% for the month you get a whopping 10% back. Other cloud vendors aren't much better at this (actually worse). Not to mention that you need to leave some room for your own fuckups.
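
A quick way to sanity-check that: serial dependencies multiply, so a handful of 99.95% services already eats your third nine. Toy numbers below, and it assumes independent failures, which events like today's violate:

    def combined_availability(*availabilities):
        """Serial dependencies: the request fails if any one of them is down."""
        total = 1.0
        for a in availabilities:
            total *= a
        return total

    # Hypothetical stack: EC2 + EBS + ALB + S3 + DynamoDB, each at ~99.95%.
    stack = combined_availability(0.9995, 0.9995, 0.9995, 0.9995, 0.9995)
    print(f"{stack:.4%}")                                       # ~99.75%
    print(f"{(1 - stack) * 365 * 24:.1f} hours/year expected")  # ~21.9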


IDK, don't you end up with a bunch of extra costs? Like you're going to literally pay more money because now you have cross region replication charges, and then you're going to pay a latency cost, and then you may end up needing to overprovision your compute, etc.

All to go from, idk, 99.9% uptime to 99.95% (throwing out these numbers)? The thing is when AWS goes down so much of the internet goes down that companies don't really get called out individually.


If you just sat there and took that 8-hour outage, you're barely even 99.9% for the year.


You're saying that as if it's a walk in the park to set up and not cost prohibitive, in terms of opportunity cost and budget, especially for smaller companies.


Right. Downtime (or perception of downtime) is bad for business, so AWS is surely working to improve reliability to avoid more black eyes on their uptime. But at the same time, an AWS customer might be considering multi-region functionality in AWS to protect themselves ... from AWS making a mistake.

As a customer, it's unclear what the right approach is. Invest more with your vendor who caused the problem in the first place, or trust that they'll improve uptime?


This might be a multi-region problem. Auth0 as an example has three US regions and two of them are down.


An honest question. Why do you guys use AWS instead of dedicated servers? It's terribly expensive in comparison, nowadays equally complex, scalability is not magic and you need proper configuration either way, plus now the outages become more and more common. Frankly, I see no reason.


Once you have committed to a certain way of doing things, the transition costs can be very high.

Let's consider RockCo and CloudCo. They both provide a B2B SAAS that is mostly used interactively during the working day, and mostly used via API calls for the rest of the working week. Demand is very much lower on weekends. Both RockCo and CloudCo were founded with a team of six people: a CEO who does sales, a CTO who can do lots of technology things, three general software developers, and one person who manages cloud services (for CloudCo) or wrangles systems and hosting (for RockCo).

In the first year, CloudCo spends less on computing than RockCo does, because CloudCo can buy spot instances of VMs in a few minutes and then stop paying for them when the job is done. RockCo needs a month to significantly change capacity, but once they've bought it, it is relatively cheap to maintain.

In the second year, they are both growing. CloudCo buys more average capacity, but is still seeing lots of dynamic changes. RockCo keeps growing capacity.

In the third year, they're still growing. CloudCo is noticing that their bills are really high, but all of their infrastructure is oriented to dynamic allocation. They start finding places where it makes sense to keep more VMs around all the time, which cuts the costs a little. RockCo can't absorb a dynamic swing, but their bills are now significantly lower every month than CloudCo's bills, and the machines that they bought two years ago are still quite competitive. A four year replacement cycle is deemed reasonable, with capacity still growing. And bandwidth for RockCo is much cheaper than the same bandwidth for CloudCo.

Who's going to win?

Well, you can't tell. If they both got unexpectedly sudden growth surges, RockCo might not have been able to keep up. If they both got unexpected lulls, CloudCo might have been able to reduce spending temporarily. RockCo spent more up front but much less over the long term. CloudCo could have avoided hiring their cloud administrator for several months at the beginning. RockCo's systems and network engineer is not cheap. And so on, and so forth.
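
If you want to play with the shape of that comparison, here's a toy model. Every number in it is invented purely for illustration, not real pricing:

    # Toy cumulative-cost model over four years. All figures are made up.
    def cumulative_by_year(monthly_costs, upfront_at_month=None):
        total, by_year = 0.0, []
        for m, cost in enumerate(monthly_costs):
            total += (upfront_at_month or {}).get(m, 0) + cost
            if m % 12 == 11:
                by_year.append(round(total))
        return by_year

    months = range(48)
    cloud_monthly = [8_000 * 1.06 ** (m // 3) for m in months]   # bill grows with usage
    rock_monthly = [9_000 if m < 12 else 5_000 for m in months]  # colo + bandwidth + eng time
    rock_upfront = {0: 60_000, 24: 60_000}                       # hardware bought up front

    print("CloudCo cumulative by year:", cumulative_by_year(cloud_monthly))
    print("RockCo  cumulative by year:", cumulative_by_year(rock_monthly, rock_upfront))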


It sounds like their systems design interviews aren’t rigorous enough.


I'm guessing lots of people fled us-east-1 for us-west-2, after the last outage, and overwhelmed something there.


"Nobody ever got fired for picking [x]" as applied to cloud zones? Sadly, you are probably right.


I wonder how many are now furiously headed back to us-east-1, building the conditions for the third event :)


At this point they should hire specifically for config management and rollout.

Mostly /s; I wish the aws engineers the best of luck through this.


How about an ad for a "Status Page Engineer"?


Root logins are suffering some kind of "captcha outage." The buzz has just begun https://twitter.com/search?q=aws%20captcha&src=typed_query


looks specific to certain (possibly AWS hosted or partially dependent) services such as Auth0:

https://status.auth0.com/

e.g. our services running on AWS are fine right now, but new sessions dependent on Auth0 are not.


My personal health dashboard on AWS shows "InternetConnectivity operational issue us-west-2"

[07:42 AM PST] We are investigating Internet connectivity issues to the US-WEST-2 Region.


Probably a silly question, but what are you using to get this info?


A browser most likely... this is the "Personal Health Dashboard" one gets for each AWS account

/edit: https://phd.aws.amazon.com/phd/home#/dashboard/open-issues
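
If you'd rather pull the same events programmatically, the AWS Health API is what backs that dashboard (API access has historically required a Business or Enterprise support plan; the filters below are just examples):

    import boto3

    # The AWS Health API is a global service behind the us-east-1 endpoint.
    health = boto3.client("health", region_name="us-east-1")

    events = health.describe_events(
        filter={
            "regions": ["us-west-2"],                  # example filter
            "eventStatusCodes": ["open", "upcoming"],
        }
    )
    for event in events["events"]:
        print(event["service"], event["eventTypeCode"], event["statusCode"])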


Thanks. Didn’t know if it was a custom dashboard or something provided by AWS.


Asking as a non-cloud-developer: why would Crunchyroll's recovery [0] lag so much behind AWS's recovery [1]?

[0] https://downdetector.com/status/crunchyroll/

[1] https://downdetector.com/status/aws-amazon-web-services/


I don't know for sure, but this is generally common because caches get cold.

A lot of websites use a cache in front of databases (or template rendering engines, or many other systems). That cache might evict entries based on time - after 5 minutes, the entry is considered invalid.

But that means that if you have no traffic for 10 minutes, the cache completely empties. Then when traffic returns, it all skips the cache and actually triggers a real hit to the backend - which is now overwhelmed with traffic. The cache protects the backend in normal behavior, but now it's not doing its job, so the backend has many more requests than usual.

In the worst case, those requests are enqueued in a big serial sequence... but the ones at the back of the queue may time out. The client may do something like say "it's taken me 5 seconds and I still don't have a response - I'll abort and retry!" and now you have even _more_ traffic to deal with.

So cold caches and retries can conspire to keep a service down for a long time even after the root cause is fixed.
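
One common mitigation is request coalescing: on a cold key, let a single request refill the cache while everyone else waits for it instead of stampeding the backend. A minimal single-process sketch (real systems usually do this in the cache tier, e.g. with locks or stale-while-revalidate):

    import threading
    import time

    _cache = {}      # key -> (value, expires_at)
    _inflight = {}   # key -> Event that is set when the in-progress refill finishes
    _lock = threading.Lock()

    def cached_get(key, loader, ttl=300):
        """Serve from cache; on a miss, let exactly one caller hit the backend."""
        while True:
            with _lock:
                hit = _cache.get(key)
                if hit and hit[1] > time.time():
                    return hit[0]                # fresh hit: backend never sees this request
                event = _inflight.get(key)
                if event is None:                # we are the one who refills
                    event = _inflight[key] = threading.Event()
                    is_loader = True
                else:
                    is_loader = False            # someone else is already refilling
            if is_loader:
                try:
                    value = loader(key)          # the single real backend call
                    with _lock:
                        _cache[key] = (value, time.time() + ttl)
                    return value
                finally:
                    with _lock:
                        _inflight.pop(key, None)
                    event.set()                  # release everyone who piled up behind us
            else:
                event.wait()                     # coalesce, then loop and re-check the cache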


I'm accustomed to cache-eviction policies based on LRU, age, etc. But in my systems, eviction happens only when (a) the content is known to be invalid, or (b) there's competition for cache space.

IIUC, the parent comment is describing a policy that evicts entries even when (a) and (b) are false. Is that common in the web-hosting / CDN world? Or is age considered a proxy for staleness?


Right, age is used as a proxy for stale, because we often don't have anything better.

A lot of web systems work this way - DNS records for example use a "TTL" which means "time to live." If the TTL is 60, then you throw it out of the cache after 60 seconds even if you have room in the cache, and you have no reason to believe it's invalid. This lets independent entities (like a DNS authority) make a change and get it rolled out everywhere.

I think the reason this is common is that proving cache invalidity is so hard, especially with the typical "dumb" cache appliances that are widely used. They just do stuff like cache the response bytes for a particular URL; they might not even understand HTTP beyond interpreting the request's headers, and certainly don't really understand the response.


Crunchyroll seems to barely work at the best of times, and when it does, it's still a mess.

All sorts of issues still unresolved for years, including the ridiculously annoying "Finishes playing season English sub, autoplays first season of German dub, which then gets stuck". Still no profiles (nerfing their super-premium offering). Auto-resume points are unreliable, the Android app is hot garbage at dealing with network disruption...

I can only imagine their back-end is mostly Visual Basic running on a single AWS-powered VM.


It appears the AWS Status Page is hosted on AWS [0].

Seems like a really bad idea.

[0] https://hostingchecker.com/


I'm on us-east-1 and everything is fine for me including:

* EC2 instances

* AWS Workspaces

* FSx for Windows

* AWS Directory Service

* S3 Buckets


"Everything is fine." - https://status.aws.amazon.com


Everything *is* fine now. The status page reflected the issue much more quickly than it did last time.


Yes, all our stuff in west-2 went down at 7:15 PT.


At what point are these outages a sign that something inside AWS is deeply broken and pretty much unfixable?


Slack seems to be having issues too.


Even as a software engineer, I think I could build a couple of battery-operated transceivers from primitive materials to replace the signal flags or horsemen for critical communications. A little basic physics and materials science goes a long way.



Can't use MFA right now to get into multiple instances due to this outage.


I get the feeling that havoc will ensue when a tornado eventually reaches us-east-1


"Hey boss, that thing that took down us-east-1... that can't take down us-west-1 next week, can it?"

"No, no, of course not"

"Should I check?"

"No, don't waste time checking, get back to your TPS reports"


This is new… Siri hasn’t been able to connect for me since this began


Same thing here.


Seeing this on us-west-1. us-east-1 appears to be functioning for us.


Yes, seeing it too.

Seems to be down in a major way. Lots of various AWS services are down. However, so many things depend on AWS that it could just be EC2 being down and causing a ripple effect.


Some npmjs.com pages are returning 503 Service Unavailable for us


I fucking swear to God.


ListenNotes.com has servers running on us-west-2.

One issue is that outbound requests from our servers in us-west-2 time out. Other than that, it seems that we are running ok so far.


Can someone please update the title to be broader than AWS?


Is that related to the current NPM status (https://status.npmjs.org/)?


Systems Manager in eu-central-1 is giving us some issues now, but I am not sure about their internal architecture for it, so maybe it needs some US resources?


AWS Global Accelerator not working correctly anymore as well, connections dropped worldwide. Seems like it is managed from us-west-2 and not redundant.


This comment taught me about the existence of Global Accelerator and, somewhat ironically given the context, we decided to deploy it today. Pretty neat! I'll have to keep in mind that I learned about it because of a worldwide outage :) Thanks!


Yep, we're also having issues. Hosted on us-west-2


Our systems that talk to S3 in CA and OR are timing out trying to open SSL connections. AWS lists outages in these regions on their status page.


At least this still works: https://livemap.pingdom.com/


Partially; the stats on the right are wrong. For me it shows:

Website outages in the past hour 86,967

Lowest 16,208

Average 16,208

Highest 16,209


I can't log on to the console for us-east-1. But our api gateway seems to be working, so I guess production is still up...for now...


And I kept getting "We're having some trouble serving your request. Sorry!" on HN for the past 10 minutes or something.


Traffic flood to this site for status reports on AWS


My monitoring is on fire, flipping red to green every minute because of connectivity issues with every single LB in us-west-2.


4 hours in, our AWS IoT endpoint (not ATS, Symantec) in us-west-2 is still down according to monitoring, PHD and support.


Auth0 down as well, right at the same time. There goes any sort of productivity today. Whole company in firefighting mode.


Yup. Having issues with IT Glue and Duo here.


Duo issues here as well.


Yeah. It's inconsistent but a number of my production servers appear to be down. Along with my New Relic logging.


AWS appears to be expensive again


Could it be related to a Log4j issue?


I thought the whole point of AWS was that you could fail over to a different location?


7:42 AM PST We are investigating Internet connectivity issues to the US-WEST-2 Region.


I really appreciate seeing these threads. Lets me know I haven't lost it.


It's bad that I come here first to see if I am crazy or AWS is actually down.



Couldn't access Notion, so came to check HN, and boom here is the answer.


Yup, seeing this on us-west-1


This also seems to affect NPM, I can't install packages locally :/


Our IaaS vendor, Aptible, reports us-west-1 is down / throwing errors


We're seeing AWS issues with us-west-2 at [medium-sized tech company]


At least Twitch.tv (Amazon subsidiary) and npmjs.com seems to be affected.


Yeah, I'm getting 2000 player errors in the Twitch video player.


The vehement defenders of AWS are starting to remind me of the cryptobros


QuickBooks Online seems to be down, and they seem to be hosted on AWS.


Twitch video streaming is also down right now:

HTTP Error 500 internal server error


obligatory comment about status page showing seas of green: https://status.aws.amazon.com


The status page appears to be down now as well.


Maybe they got so much flak last time for it being worthless, that they just decided to pull it this time??


yep I'm seeing that too - wow.


For this kind of thing it's usually better to just use a user-driven site like: https://downdetector.ca/status/aws-amazon-web-services/

Some users are clueless, but the clueless users average out over time and the spikes make it clear when there are actual issues.


Love to see the manually updated status page not updating


For me, it's down as well


Confirmed experiencing significant issues in US-WEST-1 as well


Twitch seems to have recovered, is it back now for everyone?


Still getting errors in Houston

edit: some streams back up, chat still buggy as of 09:55 local time

edit2: appears to be back ~10:00 local time


Seems like this is affecting Dropbox paper, at least for me.


Down for us (graphite.dev) as well, running on us-west-2


I guess it is all about log4shell patching in a rush.


Tsheets is also down so I can’t clock my hours LOL


we are having issues with us-west-1 and us-west-2


Experiencing significant issues in US-WEST-1


HOST THE GODDAMN STATUS PAGE ON AZURE FOR FUCKS SAKE.

There is zero excuse for this shit. Be professional. Acknowledge reality. It is logically impossible to run your own status page. Trying to do so just wastes everyone else on the internet's time when you have an outage.


They should host their status page on IPFS instead. If you're never going to change the contents of your status page, you might as well put it into immutable storage!


If the status page is down, you know the system is down. Mission accomplished. Go ahead.


Isn't it on S3 or something? And a few years ago we had that whole S3 is down situation and the status page was also down? xD


No, just change it to a statically-rendered page on CloudFlare with all green lights. :-)

And that, ladies and gentlemen, is how I passed my system architect interview!


I don’t understand, are folks looking at a different status page than me?

This morning we saw some weird behavior in us-west-2, our traffic just _vanished_. I thought: there is no way this is us.

Went to https://status.aws.amazon.com/

Top of the board showed “Internet Connectivity Issues (Oregon)”

And that was that. The board worked exactly as it should - it immediately explained my missing traffic and kept me up-to-date with the status of the outage on their side.


They should automatically update as well. Currently it is a static "all green" page that might be manually changed if a manager gives the go-ahead. Insane.


Seriously.


You don’t even know what the problem is yet. Stop shouting solutions.


The problem is that AWS can't update their status page to reflect that there's a problem. This happens during every AWS incident without fail.


"Can't" and "won't" are different things.

See discussions from the last outage about the VP signoff needed to admit, I mean announce, an outage.


My point is that you’re not even sure that it’s AWS’s problem. I heard that other providers might be affected, perhaps meaning it’s a network issue.


Status page looks like it was updated. Seems more like we have a lot of impatience on this board.


The problem is very clear: the status page is not working as it should.


Given the legal liabilities Amazon has with their SLAs, it may be working exactly as Amazon thinks it should. Whether anybody would agree with that assessment should be obvious.


What if the problem is not an AWS problem? My point is that you don’t know what the problem is, you’re assuming.


I kind of think everyone else here understands this very particular problem: a status page running on the same equipment it's supposed to be monitoring goes down along with that equipment. For whatever reason, you don't.


I understand that. What I’m questioning is whether that is the problem here. Is it? Do you know? I heard it might be an internet provider issue, in which case the status page is not the problem here.


Remember, every 12 secs take one 9.


eh?


It's about calculating the 9s in your uptime. But 365 * 24 * 60 * 60 * 0.000001 == 31s (did I get that right?)
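Checking my own math with a quick sketch (plain Python; the constant name is mine):

    SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000

    for nines in range(1, 7):
        unavailability = 10 ** -nines        # 10%, 1%, ..., 0.0001%
        downtime = SECONDS_PER_YEAR * unavailability
        print(f"{nines} nines -> {downtime:,.1f} seconds/year of allowed downtime")

So six nines allows roughly 31.5 seconds of downtime per year, five nines about 5.3 minutes, and three nines about 8.8 hours.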


Related to Amazon's SLA


Prime video down for me. Australia.


Yup, trouble in us-west-2 for us.


The npm registry is down too.


out of memory again. ;<


yes. Having issues as of a few mins ago reaching us-west-2 ec2.


us-west-2 EC2 looks like it just came back online.


wohoo! ssh'd back in. ty


Oh man. Not again!


us-west-2 stuff is down for me too


Log4jammed ?


Back up


npmjs has problems too :(


seems to be up again


leetcode.com is also down


Oh man, not again!


We are barbarians occupying a city built by an advanced civilization, marveling at the hot baths but knowing nothing about how their builders kept them running. One day, the baths will drain and everyone who remembers how to fill them up will have died.


Many years ago I stood at the window of my comfortable apartment, watching wind and cold rain rage outside.

I thought about my cave men ancestors who during such a storm if they needed water would have to go out and get it, getting themselves soaked.

If I wanted water, the tap in the kitchen would give it to me, in a nice controlled fashion. If I did feel like having water rain down upon me, my shower would do that, again in a controlled fashion, and I could select the water temperature.

If they wanted the cave to be warmer, they had to burn something and deal with the smoke. And they might have to work hard to obtain whatever it is they burn.

If I wanted my apartment warmer, I just had to turn the knob on the thermostat.

They were at the mercy of their environment. My environment is mine to command. I was feeling pretty superior to my cave man ancestors.

Then I realized that I don't know how to build the systems that I was relying on for my supposed superiority, or even how some of them work.

I'm really just a cave man that found a nicer cave.


> Then I realized that I don't know how to build the systems that I was relying on for my supposed superiority, or even how some of them work.

I used to have this joke(?) with my friends: remember Mark Twain's "A Connecticut Yankee in King Arthur's Court"? The titular Yankee basically upends the (faux) medieval society he gets transported to, "inventing" all sorts of technological miracles.

Well, I'm a software developer but don't come from an engineering background (I mean actual engineering, not programming). I don't even understand how electricity or the telephone work (I mean old-fashioned telephones, let alone current mobile networks). If I was transported 2 or 3 centuries into the past, I wouldn't be able to explain modern technology to other people, let alone actually build it.

I sort of understand how steam engines work, and I could "invent" the printing press. I guess. But anything related to circuitry, electricity, chemistry, or engineering of any sort, I wouldn't be able to even begin explaining to King Arthur.

My introduction to the knights of the round table would go something like this:

"We are questing for the Holy Grail, oh noble stranger from a far away land! How can you help?"

"Depends, which version of Python are you running?"


A light, enjoyable read along these lines is Leo Frankowski's "high tech knight" series, starting with The Cross-time Engineer. The main character -- a real engineer -- gets transported back to medieval Poland, and he knows that he's got ten years either to bug out, or help Poland defend itself from the coming Mongol invasion.

[I only liked the first four books, but that's enough to cover the original story arc]


"Deathworld 2" by Harry Harrison has a plot along those lines. Apparently the original name was "The Ethical Engineer".


Do i have the T-shirt for you:

https://topatoco.com/products/qw-cheatsheet

And the T-shirt's companion bandana and spin-off book:

https://www.popularmechanics.com/culture/a23286104/how-to-in...


This shirt annoys me. I get that it is a joke, but the explanations are just so woefully over-simplified, and don't get at the main problem -- materials and manufacturing technology in the past was poor enough that even if you knew the basic physics you'd have no chance of getting, like, material to build a wing out of.


What, not even pinewood and gelatin for ribs and stringers, and some linen cloth plus pine resin and alcohol for doping? Seriously, that's like 1000BC tech level.

A wing is no problem as long as one can calculate how to make it stiff enough and of the right shape.


Inventing the printing press was more difficult than it seems at first. In addition to the idea of using movable type, significant development of the correct alloys for the type was necessary. The alloy needs to cast easily and at the same time be durable enough to be reused for a large number of print runs. In addition, the proper ink needs to be developed...


Cheap durable paper also helps.

Fun fact: printing rates increased from about 120 sheets/hour to over 1 million over the course of the 19th century. Presses went from wooden screw designs that differed little from Gutenberg's to cast-iron, rotary, steam- and later electric-powered, and web (continuous paper feed) presses, and from matrix plates (with individual type set in blocks) to offset Linotype (in which the entire print block was cast as a single sheet through multiple stages from the original matrix characters).

Thought just occurs: the falling characters of the iconic Matrix screen somewhat resemble the individual type elements flowing and falling through a Linotype machine. I don't know if that is a deliberate or incidental reference, but it's an interesting one.


Right, let me amend my statement: I understand how the printing press with movable type works and I would be able to explain it to King Arthur, but I probably wouldn't be able to actually craft the types, inks, etc, and so the annoyed King would have me beheaded.


Even if you knew what to do, convincing the naturally suspicious people back then to trust a strange outsider would be tricky. Then you have to get the right materials.

If I were a bit more clever, or maybe if I was 50 years older and had played with this kind of stuff growing up, I'd probably try to make a spark-gap transmitter. That seems to be in a sweet spot of not requiring too many super clever bits, and having obvious applications.


Also on a similar theme: https://en.wikisource.org/wiki/I,_Pencil (it's intended to be about free market economies, but you can also read it as being about how no single person knows how even simple modern marvels get made).


At a very general level, once you move past subsistence farming you become reliant on society to provide your needs, and in turn you provide some value that can only come from spending your time on things other than farming. That is, I suppose, how civilization advances. It's kind of funny to work backwards, though, because even subsistence farmers are reliant on society for protection -- they are farmers, not soldiers, after all. I think about this a lot: how important trust is to getting anywhere in modern life, and how little choice there is anyway. I also think about how most people don't think about it at all, or very much, and I wonder if knowing how fragile we are makes me happier and more productive, or less so.


There's an interesting misconception that humans developed agricultural societies because they achieved better outcomes as individuals. Research shows that hunter-gatherers were healthier and better nourished than humans in early agricultural settlements.

What's probably closer to truth is that many humans were forced to join farming communities. Stronger individuals or tribes probably enslaved others, and then forced them to build and produce.

The patterns of inequity and the march toward hyper-specialization we still see today make sense in that context.

As a tangent, if anyone is interested in that "cavemanness" deep in our DNA, check out the idea of primitive camping. That was my first experience camping, and I expected an idealized tv-ad experience. The trip was not framed as "primitive camping" to me.

I was dealing with intense burnout, stress, ADHD symptoms, immune problems, trouble sleeping... And I was thrown into the desert in the summer with a tent and some beer. It fucking sucked sooo bad. It fucking sucked sooo bad that I forgot every stupid problem I had, because I spent the entire time in survival mode. Setting up camp. Hauling equipment up and down dunes. Staying hydrated in the 100f+ heat. Making food. Making sure my wife and friends were ok. Strategizing how to defend our camp from bugs and psychos.

I really have not had such an existentially-dense experience as that one. And no, I didn't take any mushrooms, as the rest of the group did. I wanted to be lookout. Maybe I come from a long line of hyperaware sentries.


Forced labor was absolutely the norm for the pre-modern state, and provided the bulk of the workforce [1].

AFAIK humanity has yet to produce a society where the majority of farm laborers are fully free to leave the land they work on (whether via having their papers confiscated, their wages held until the season ends, transport provided to a remote farm with the trip back withheld, etc.). We've seen improvements in the degree of freedom, particularly over the past century and especially the past 50 years, but it's still very low compared to urban dwellers.

1: "Against the Grain", James C Scott


Not to mention that when the power goes out, the illusion fades fairly quickly. Learned that lesson myself in the Texas snowstorm early this year.


Absolutely. My wife and I lived for a year in an off-the-grid cabin in some mountains in Mexico.

We had solar panels and a generator we used only when absolutely necessary. We were never without power, but we lived with the constant anxiety of optimizing our energy consumption. Some stuff we could only do during the day and at night we only used devices with batteries.

For a couple of weeks we didn't have running water in the cabin because we were rebuilding our water deposit tower. We used buckets for everything.

That was almost a decade ago and I still feel grateful at having unlimited energy or running water on demand.

I also feel guilty at times when doing power hungry stuff like playing video games, knowing electricity production is by far the biggest driver of climate change.


> Absolutely. My wife and I lived for a year in an off-the-grid cabin in some mountains in Mexico.

I think everyone ought to do a week in an RV with no connections to utilities. Not to take away from your story, but a similar scenario comes up when we "dry camp" (no water or electrical connections): resources are not unlimited. We have solar panels, big-ass inverter and big-ass battery to go with it. But if we want lights at night, best not run that 1100W microwave for too long, because the panels won't keep up and the battery isn't that big. We have a built-in generator, but unlike most RV owners, we are loathe to use it. It's almost like a game, and if that generator fires up then we've lost.

You want to let the water run while you brush your teeth? Go right ahead, our water tank is plenty big...oh, wait, but the holding tanks aren't. Shut that tap off before there's dirty water coming up through the shower. Speaking of showers, use the outside shower, as the holding tanks won't hold enough for your 30 minute, piping-hot shower.

Point of it all is that it one quickly learns that it all has to come from somewhere, and it has to go somewhere after you've dirtied it. I'd like to think that it has made the both of us more conscious of our usage.


Very similar experience with sailboats.

There's nothing like being at sea, 100+ miles from civilization, reliant on the limited capacity systems on your vessel. You manage your food, you manage your water consumption, fuel, electrical usage, you're closely attuned to the weather, the sea state, the charts. There are no other visible people or people-made objects out to the horizon in all directions. If something breaks, you'd better know how it works and be able to fix it, or go without. It feels very freeing, but also provides a "back to basics" accountability.

Standing under a hot water shower with unlimited water in a spacious home shower afterward feels luxurious.


Or even better, go backpacking in the wilderness. Slightly different set of constraints: you can usually find water (at least where I hike), but carrying all your equipment and food on your back gives you a new perspective on what's "essential".


Third world country here.

Commit and push often.


I lived in Miami during hurricane Wilma and spent like a week without electricity. You realize how quickly things go south without electricity flowing.


The most impressive thing to me is toilets. Just click a button and your waste disappears. Don't know where it goes or how it gets there and pay almost nothing for the privilege.

Toilets are amazing and I feel privileged every time I use one. Girlfriend thinks I'm nuts.


Well, you didn't just find a cave, it was made for you by other people. Interdependence is a hallmark of social species such as Homo Sapiens. Even your caveman ancestors were probably reliant on one another in many ways.

>It seems that someone asked the great anthropologist, Margaret Mead, “What is the first sign you look for to tell of an ancient civilization?” The interviewer had in mind a tool or article of clothing. Ms. Mead surprised him by answering, “a healed femur (thigh bone)”. When someone breaks a femur, they can’t survive to hunt, fish or escape enemies unless they have help from someone else. Thus, a healed femur indicates that someone else helped that person, rather than abandoning them and saving only themselves.


At least the cave man could go out and get water. Or have some reasonable expectation of finding food. Good luck!


Great for you, I hope you enjoy your cave.

> Then I realized that I don't know how to build the systems that I was relying on for my supposed superiority, or even how some of them work.

I'm sure if you just sat down with a pen and paper you could come up with a DIY solution.


Not to mention that many of the skills needed by the original cavemen to survive are gone in today's society. In other words, if we were to compete with the original cavemen in their environment, we would most likely fare rather poorly, at least in the short term.

Not trying to glorify off-the-grid living or anything, but I think it's interesting to think that in some (very specific) ways, the cavemen were actually superior to us.


> I'm really just a cave man that found a nicer cave.

You aren't really - most cavemen didn't even understand that fire is possible, and wouldn't be able to consistently operate a lighter if they found one (it'd probably be put on an altar and worshipped instead, as it should). You might not be able to build your entire cave, but your education alone is a _huge_ advantage!


Surely we are way past the point where someone knows how the whole thing works, all the way down.

I doubt even a very skilled engineer would know how his own machine works all the way down. What I think happens mostly is the skilled dev can use his experience to know where to investigate and where to look for solutions.

The question is organisational. Might it be that certain orgs have gotten so convoluted that they cannot do this investigation on an org level? Essentially, letting the right people look in the right places, unhindered by politics, legitimate security concerns, and practicality?

You'd think there'd be a limit to scale at some point. A bit of redundancy makes sense. There's probably a lot of people with multicloud setups patting themselves on the back at the moment.


I do contracting dev work and my specialty is being able to drill down into any part of the engineering assets, ops, sec, dev. People think someone like me is slow and expensive until they have a problem that no one else wants to touch.


> Surely we are way past the point where someone knows how the whole thing works, all the way down.

So you are disagreeing with this statement and saying that, in fact, you are the person who knows how the whole thing works?

I knew tech work produced some large egos, but sheesh.


This type of person exists, and while rare, not as rare as some seem to assume.


There is nobody who exists who would be able to recreate a modern computer from scratch.


No one person could recreate the pyramids from scratch either, and that's a pile of rocks


But they could teach others the principle and have them follow their directions to recreate it.

I'm saying no such person, even one who built up a team that they taught, exists for the modern computer.


When you say 'modern' do you mean with photolithography?


Yes, I do, keeping with the spirit of re-building the metaphorical "baths" of modern civilization.

But even if I didn't mean that, there still is nobody who could do it.


Not at all diminishing what you do, but surely you have a limit past which you say "that's outside of my expertise, or what's reasonable for me to gain expertise given the scope of this issue"?

For instance, I manage a team that does "full stack" development, where full stack means I regularly interact with mechanical and manufacturing, operations, electrical engineers, battery and radio people, embedded developers, mobile, and most aspects of backend engineering. We had an issue where one of our chip suppliers changed their FW, didn't tell us, and we literally were taking apart units to get to the bottom of why units off the line weren't working properly. We go pretty deep. Still, at some point we throw our hands in the air and say "Hardware is hard, it's in the name."


This was meant to be in the context of hosting software services on AWS. Certainly there is a limit. If a MBP get a crack in the case, I'm not going to figure out how to machine a piece of aluminum into a new case, I'll replace the laptop.


You know how to debug kernel issues? You know how AWS virtualization works and how to diagnose a problem with the AWS networking stack?


You know how to debug kernel issues? Yes

You know how AWS virtualization works and how to diagnose a problem with the AWS networking stack? Yes, and yes assuming that everything AWS is responsible for is operating within spec. Obviously, I don't have access to their switches, and cannot see anything at layer 1 or 2.


These are not as hard things as you make them sound. They are the things traditional sysadmins spent time understanding.

But I am also incredulous at the parent's claim.

I wouldn’t be able to diagnose traces on a motherboard or a defective (but partially functioning) CPU.

I wouldn’t be able to diagnose irregular voltage conditions or drop offs.

The amount of stuff that I know I couldn’t diagnose is absurdly high, but the amount I don’t even know that I can’t diagnose is higher still.

And my job, like the parents, is to drill down and spend time in specifics.


I also specialize in being great at everything.


> I doubt even a very skilled engineer would know how his own machine works all the way down

Knowing how it works, and being able to build a new one, are also two very different problems. For example, there are plenty of Computer Science folks who learned how to design chips (layout the circuits, write the microcode, etc) - but you need a whole extra background in EE and Physics to be able to fab said chip...


Many programmers complete some kind of nand2tetris[1]-style course where they go from basic hardware primitives (the NAND gate) all the way up through a small von Neumann architecture computer that can be programmed with a simple homebrew machine code. Even if they don't, a good CS undergraduate program should cover a lot of it, and since most EEs can program at least a little, they probably get a pretty good top-to-bottom understanding as well.

The problem is that this is really only possible with a toy model of a computer and very simple programs. Modern chips with their branch prediction and caching and threading and advanced vectorized operations and so on are vastly more complex. The 6502[2] was perhaps the last chip that one person could fully grok. Maybe a chip designer at Intel or AMD could understand the whole circuit in detail but no one else has the time - it would literally be a full time job. The same thing is true for operating systems - even if you're Raymond Chen, you can know a lot about Windows, but you can't know everything.

We learn just enough about the other parts of the system to convince ourselves that we understand the principles. We build the basic mental model we need to interact with other systems, but all we can really do is focus on our own specialized areas and hope that everyone else is doing their job. This works well enough until something like Spectre[3] or Meltdown[4] crops up, and that's when we realize that we've been building castles in the sand.

[1]: https://www.nand2tetris.org/

[2]: https://en.wikipedia.org/wiki/MOS_Technology_6502

[3]: https://en.wikipedia.org/wiki/Spectre_(security_vulnerabilit...

[4]: https://en.wikipedia.org/wiki/Meltdown_(security_vulnerabili...


I mean you can (and I have) etch your own circuit boards, but that's obviously not at the scale you need for anything other than primitive processing (70s 8-bit at MHz scale), and even then you're just looking at the next layer down (how to make your own chemical wash and get copper onto a board) as a barrier if we're truly talking about 'from scratch'.

We really depend on three things - knowledge (stored collectively and in various media eg books), materials (tools and manufactured precursor goods, available via active supply chains or existing stores), and most importantly having our basic needs met trivially so that all our time is not sucked up addressing them.

A scenario where someone has a 'wasteland' to pick over for their basic needs, knowledge and materials looks quite different to a return to primitive living where what nature provides is all there is to work with. 'If you want to bake an apple pie, first you must invent the universe,' or however it goes...

Then of course there's the question of why someone would have any interest in obtaining computing power were either of those scenarios to occur. Much like the 'how do we warn future civilisations about our nuclear waste' problem perhaps it is acceptable to not bother, they'll figure it out again on their own eventually given enough time.

This stuff is fun to think about at 6am when hay fever is preventing my sleep :)


> There's probably a lot of people with multicloud setups patting themselves on the back at the moment.

And there's probably an equal number troubleshooting why it didn't failover the way it should, while their upper management starts questioning what they're paying for.


> Surely we are way past the point where someone knows how the whole thing works, all the way down.

I've met a few people who can rightfully lay claim, but yeah, an incredibly rare set of skills.

That said, there is a recent revival in building systems from the ground up. While you can't manufacture your own transistors, it is quite possible to understand everything from simple logic gates to ALUs to older-style CPUs and memory buses.


I built a toy CPU in software once as an exercise. I started with "class Transistor" (wrapping an AND op) and "class Wire" (wrapping a boolean), and wired them together incrementally to make gates, flipflops, registers, etc.

I eventually got a fully-functioning 32-bit cpu with instruction pipelining, two levels of cache, DMA input/output, an asynchronous bus, a custom assembly language with an assembler written in python, and got the Game of Life running on it.

It ran about 2kHz with 8kb of memory or so.
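For anyone curious, the bottom layer of something like this can be sketched in a few lines (a generic illustration rather than my actual code; I'm using NAND here since NAND alone is functionally complete, whereas my original wrapped AND):

    class Wire:
        # Carries a single boolean signal.
        def __init__(self, value=False):
            self.value = bool(value)

    class Nand:
        # Primitive element: output is False only when both inputs are True.
        def __init__(self, a, b, out):
            self.a, self.b, self.out = a, b, out

        def evaluate(self):
            self.out.value = not (self.a.value and self.b.value)

    # Everything above is composed from the primitive:
    def make_not(a, out):
        return [Nand(a, a, out)]                       # NAND(a, a) == NOT a

    def make_and(a, b, out):
        mid = Wire()
        return [Nand(a, b, mid)] + make_not(mid, out)  # AND == NOT(NAND)

Usage is just: set the input wires, call evaluate() on each element in wiring order, read the output wire - and then keep composing upward into latches, registers, an ALU, and eventually a (very slow) CPU.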


So long as you live in a city you are probably forgetting most of the ways to survive.

I could not even try to discover which berries are edible without killing myself.

However, I can teach advanced maths to a largish group of students without much trouble.


> I could not even try to discover which berries are edible without killing myself.

Cluster berries, from raspberries to pineapples, are never poisonous. Avoid berries that resemble blueberries or currants unless you're able to identify the plant: we grew up with blueberries and know the leaves, but we avoid anything currant-like because we'd have no idea if they're actually, say, chokeberries. Avoid anything that looks like baneberries.

Here in Maine, we forage for raspberries, blackberries, wild strawberries, and (mostly low bush) blueberries, but don't risk others.


> Cluster berries, from raspberries to pineapples, are never poisonous.

I know nothing but that still seems too generalized that it doesn’t have an exception somewhere in the world.


You won't find enough berries, never mind edible berries, if everyone in your area suddenly shifted to foraging. Game animals would be exhausted quickly or they would migrate further out from human settlements. Even if you had the skills, hunting or foraging isn't all that useful anywhere around a city, especially if everyone else is doing it.


True, fruit and nut-based food forests, like those in the Pacific Northwest [1], seem to provide a significant, sustainable food source. Berries make a nice dessert + vitamins a few times a year.

Historically here in Maine, the core diet seems to have been seafood, freshwater fish, maize, Capreolinae, game birds, eggs, honey, roots, and greens. While only a tiny fraction of fish/seafood remain, deer are over-populated and make a fine sustainable food source, the limitation mostly being the contemporary appetite for venison.

1. https://www.smithsonianmag.com/smart-news/indigenous-peoples...


The edible himalayan blackberry infestation that plagues all of the coastal PNW is widely available. It's almost impossible to kill and it fruits for long periods of time.


Our Hoopa lived virtually exclusively on shellfish for hundreds of years.


Blueberries are easy: they have a star-shaped "opening" on the bottom. Native Americans called them starberries. No other berry is blue and has that. It's the only berry I trust myself to eat while I'm hiking.


I would recommend you buy a local book on foraging and keep it in case of emergency. But give it a read (at least the first few chapters) so you can get a basic understanding of how to forage without killing yourself. I also recommend keeping viable seeds and a camping shovel around as an insurance policy.

These items aren’t in my earthquake bag (I have enough energy bars to last until the National Guard shows up). Instead these are for a Carrington Event type of solar storm, civil war or some sort of other long-term disaster.


On the seeds front, you really have to be practicing growing food from seed for several years before depending on it for basic caloric needs. After a few years of providing a fraction of our household calories from the property, I can see the pitfalls, effort, and planting diversity that would be needed were we to scale it to that level. The previous me would have had some seeds and a dream, and died real quick. Even now I give myself 50/50 odds that water, weather, pests, poor soil, or something unexpected would lead to starvation.


Yes we provide a few hundred annual calories from seed. Not nearly enough to survive. But hopefully enough to learn from while foraging or enough to link up with actual experts who might just be lacking in seeds or labor.


But you could probably devise a scheme by which you feed all of your students an assortment of berries and figure out which ones are safe based on which students get sick or die.


You are close. You first rub the berry on your skin (or leaf, or whatever). Wait 24 hours to see if a rash develops. Then you taste it, wait another 24 hours. Then you eat one, and see if you get sick after another 24 hours. Now you can eat several, and build up from there.

Yes, that is a lot of time to go hungry and testing just one item. And then you still don't know what actually gives you nutrition vs just not killing you (for example leaves that you can break down such as leaf lettuce, vs eating grass).


Alternatively, let the animals worry about all that and then just eat the animals.


Assuming this is in the context of some apocalyptic event requiring you to do this, relying on animal husbandry seems obviously wrong. Plus you only get ~10% of the energy from the lower trophic level.

The prevalence of meat comes from a society of abundance.


Evidence of early hominids and other less advanced proto-human or human groups shows a pretty significant amount of calories came from meat. Some suggest 60-80% of calories came from proteins, largely meat, at various times in history.

Example source:

https://onlinelibrary.wiley.com/doi/10.1002/ajpa.24247

A contradictory source says meat was less prevalent but still at 40-50%:

https://asu.pure.elsevier.com/en/publications/the-diet-body-...

Both of these estimates are way higher than what we know people eat today, where meat and dairy are 18% of worldwide calorie consumption (27% in the US).

So I think the abundance we see today is actually due to the availability of non-animal dietary sources.


That is the Universal Edibility Test and gets repeated ad nauseam in all the survival circles. You would miss out on some fine choice foods if you did that. Stinging Nettles (Urtica dioica) is one of them. Pokeweed (Phytolacca americana) too.

Source - used to teach these skills before it was cool to be a "survivalist" on TV and social media.


Yes I don't know how someone figured out that if you cook pokeweed and change the water multiple times then you can finally eat it without it killing you.


As someone NYC-born, growing up with bi-monthly Boy Scout meetings and yearly "wilderness camps" (pitching tents in open fields, pit latrines, war games/survival, etc.) really helped fill in that gap :)

i wonder if there's anything like that for adults


There's prepper and survivalist camps and classes


> I could not even try to discover which berries are edible without killing myself.

Eat a small amount and see if you get sick? Science in the wild...



Assume that you'll be in a group - some other members of which will be more optimistic than you.


We are the advanced civilization that built those hot baths. Being an advanced civilization, it's safe to assume that no single person holds all the knowledge necessary to build another hot bath, because that knowledge has long surpassed how much one person can learn in a lifetime.

But somehow there are multiple organizations that "know" how to build another hot bath, and newer and bigger baths are continuously being built all across the Empire.

And occasionally one of them stops working and thousands of citizens are angry, because they feel, being honest citizens of the Empire, they are entitled to enjoy these hot baths. Sometimes their very livelihood depends on the baths running.

Then the bath is fixed, and all is well again.


For the past couple of weeks, I've been a beginner-intermediate mechanic trying to breathe life into an aging car.

Sometime in the next few months, I have to troubleshoot and fix the broken 2-year-old refrigerator. Someone came and fixed it once; now it's out of warranty and fixing it would cost about 50% of the price of a new one. Meanwhile I'm glad I didn't throw away the 10-year-old refrigerator and just moved it to the garage. We just have to keep going to the garage.

I also have to play the accountant for my consulting business pretty soon. This is a task I had outsourced for years and have now started doing myself.

As stuff gets more specialized, I've started noticing that I'm able to do moderately complicated things better than professionals paid at the 50th - 70th percentile. If I want to get a really good job done, my rule of thumb is to be ready to shell out money in the 90th percentile range and look for references.

In the case of AWS, I guess the Greasemonkey scripts are getting too complicated ;)?


The remarkable thing is that today no one knows how to "fill up the baths", or how to do more than a small part of the job. Teams exist with extremely narrow expertise. But if anything, there are more options today for DIY infrastructure - it's way easier to be more advanced than "run Apache on the server box."


Anyone who finds this concept interesting to think about and hasn't seen this video may enjoy it:

https://www.youtube.com/watch?v=ZSRHeXYDLko

See also Foundation by Asimov.


I wouldn't be that negative. As long as certain things need to be on prem, we'll always have some people who can get the internet running again.

Most of these people just happen to be employed at AWS or azure right now.


I don't buy this. I've written some pretty complicated codebases at previous companies that no one knew how to operate except for me. After I left those companies they didn't fold or lose all their customers. They adapted and everything is fine. For whatever reason humans find simplicity through complex processes.


There's this classic article on someone's quest to make a toaster from scratch https://gizmodo.com/one-mans-nearly-impossible-quest-to-make....



I'm reminded instead of the supply chain issue our economy has gotten itself into. As more people pile on AWS it becomes the weak link....

Maybe we need more baskets to distribute our eggs amongst?


This has been true for a long time and it is not a bad thing.

It’s an easy target to romanticize but realistically, any alternative is basically a way of saying: “let’s stop evolving.”


P.S. That Paw Patrol shit hit me right in the emotions. Awesome site.


Disagree.

Wanting to evolve differently doesn't mean halting.

It's a call not to devolve.


I certainly think this is a subjective, “no one right answer” discussion topic.

From my perspective, we require abstractions in order to free our intellectual capacity up for the next layer of complexity.


That's a very dramatic interpretation. In reality, as long as Unix greybeards are around, we are safe enough on the "can we rebuild it" question.


This simply isn't true. There are lots of full stack devs out here, we will rebuild from the AWShes :P


I liked that. It might be even weirder, though, for us in the new age. We'll have robots (AI) running everything, and why things are happening will degrade into unknown unknowns. Engineering and critical thinking may become a lost art.


Or worse, we will be sitting in the hot baths and the hot spring where the water comes from will get hotter and hotter and we won’t notice until suddenly a rapid change in temperature boils the water and burns us to death.


Tangentially related: If you enjoy this sort of idea in fiction-form, I can't recommend Josiah Bancroft's The Tower of Babel series (beginning with Senlin Ascends) enough.


That sounds like the setup of Asimov's Foundation series


It's ok, we can just rebuild everything from scratch if we need to. We know we can cos we already do it every five years anyway without needing to


And a bunch of barbarians will just accept being smelly. And a few barbarians will figure out a different way to fill a bath tub. And life will go on.


More like, they will fill with blood and locusts.


At least you'll have something to eat.


On-prem is much rarer, but it's hardly non-existent. Plenty of people know how to do this sort of thing.


On-prem, maybe, but if you include co-located equipment and managed hosting I don't even think it's more rare in absolute terms. Just smaller as a percentage of overall hosting.


There's (still?) a lot of on-prem and managed hosting. It's probably the majority of hosted services. Otherwise VMWare wouldn't be doing as well as it is.


Well said... but why is the water slowly getting hotter?


Is this a Fifth Season reference?





