My understanding is that this is a difficult problem to solve "perfectly" due to...

btilly · on April 6, 2021

That is an argument for why these overages occur. It isn't an argument for why customers should eat that cost rather than Amazon. In fact Amazon is in a much better position than customers to absorb those costs. Sure, they'd have to increase rates slightly to cover it. But it would give customers peace of mind.

And if the costs become exorbitant, Amazon is in a better position to improve their own systems to reduce the amount of overages that people run into in practice.

In theory, Amazon's first leadership principle is Customer Obsession. (See https://www.amazon.jobs/en/principles for the full list.) If they took that seriously, then setting this issue to rest for their customers would be a no-brainer.

vineyardmike · on April 6, 2021

> In fact Amazon is in a much better position than customers to absorb those costs

And if you call amazon support and talk to them, especially as an individual, you might get your bill cancelled.

onion2k · on April 6, 2021

> you might get your bill cancelled

The "might" in your post is doing a lot of work.

FWIW I know of a startup whose video sharing app was used to reshare a pay-per-view football match, and they incurred a $30k bandwidth bill that AWS did not cancel. That killed the startup. It was largely their own fault for not securing the platform well enough, or moderating popular streams, but being able to cap their AWS bill would have kept them in business.

matkoniecz · on April 6, 2021

I am not going to risk runaway costs in hope that Amazon "might" cancel it.

Though apparently the customers who both care about this and avoid Amazon as a result would bring in less than the cost of implementing it plus the income lost from catastrophic runaways that could no longer be billed.

jgalt212 · on April 6, 2021

I would argue that AWS does not have a "Customer Obsession", and that's exactly why it's Amazon's most profitable business (by far) and the underwriter of all of Bezos's ambitions.

whoknew1122 · on April 6, 2021

Disclosure: AWS employee. Support specifically. There are good things about my employer. There are bad things about my employer.

AWS does indeed obsess about the customer. Every step along the chain there is someone there advocating for the customer. There are mechanisms to keep the customer in mind even for the developers who actually code the service and don't talk to customers on a daily basis.

I've had many, many service team members shadow me as I worked their service's tickets. This is explicitly so they can see in real time customer pain-points. If a customer has a question about a unique use case, the service team will proactively reach out to support engineers to set up a call to discuss the use case further. There are monthly (or twice-monthly) meetings between support service owners (i.e. those people in support who 'own' a service) and service teams to identify the top issues customers are having with the service. AWS is constantly looking for ways to better assist customers, make support less difficult for customers, increase self-service options for customers, etc.

I'm really, really curious what the basis of your argument is. Because from everything I've seen and been a part of, it's simply untrue.

jgalt212 · on April 6, 2021

$0 ingress charges and >> $0 egress charges make AWS a roach motel.

whoknew1122 · on April 6, 2021

So basically you're complaining about pricing? Unlike other cloud providers, AWS has never had a price increase for a service. Just decreases.

I guess 'customer obsession' would be giving away everything for free?

niteshade · on April 6, 2021

Customer obsession would be things like implementing bill caps on new account creation.

Customer obsession would be NOT shipping buggy, unreliable software like AWS Amplify.

Customer obsession would be CloudFormation-first.

Customer obsession would be not forcing me to upgrade to a paid Support account to report a bug.

The list goes on, unfortunately. I do believe AWS employees mean what they say, but the external reality (IMO) is it takes a lot of time and effort to get AWS to notice their customers unless you're one of the big boys.

deanCommie · on April 7, 2021

Bill caps sound great until you leave one on on a production system and your whole business comes crashing down during a spike in customer traffic.

Customer obsession means not shipping bugs? OK, Bob, let's see your code.

CloudFormation is wonderful and essential. But AWS clearly optimizes for delivery speed. SOME service customers want CloudFormation from day 1. Others would rather have the API first, and have CFN a few months later.

niteshade · on April 8, 2021

> Bill caps sound great until you leave one on on a production system and your whole business comes crashing down during a spike in customer traffic.

Hence the "ask on account creation". If I want a dev account, I can choose to cap it. The number of SMBs that would benefit from this is staggering.

> Customer obsession means not shipping bugs? OK, Bob, let's see your code.

A major difference being that my company isn't worth $1T+.

I've used AWS for a long time and spoken at length with many wonderful, intelligent people in the company; and I didn't mean to tread on anyone's toes, I just wanted to express how it feels as a customer who spent >£150k/month for half a decade.

lupire · on April 6, 2021

Amazon is in the news right now for employees making tone-deaf dishonest public statements trying to deflect legitimate criticism. Out of respect for your employer, please stop.

ayberk · on April 6, 2021

I am an AWS employee as well, and I'm definitely one of the biggest critics of the company. That being said, AWS is definitely still customer obsessed.

I feel dirty defending AWS, but this is one case I'd give them the benefit of the doubt. There must be _a_ reason they haven't implemented this yet and that reason must be somehow protecting the customer. "Customers want this" ends the discussion around here. You must have a really good reason to disagree.

lupire · on April 6, 2021

Unpredictable $30K charges protect the customer?

Like $35 bank overdraft fees protect the customer from not getting a candy bar.

_flux · on April 6, 2021

If they simply ate the costs automatically, it would undoubtedly result in abuse. Just look at GitHub Actions.

Oh I didn't think I was starting 1000 instances of miners in the EC2 GPU cloud, it certainly exceeded my configured budget, please give my money back..

mytherin · on April 6, 2021

"You can only launch 1 concurrent EC2 instance per $100 on your max cap, if you want to launch 1000 instances of EC2 your max cap needs to be set to at least $100.000, or to $unlimited".

There are many solutions to this problem to prevent abuse. All of which are way better for consumers than the status quo of everybody having an $unlimited spending limit.
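
A minimal sketch of that kind of rule, assuming the $100-per-instance ratio from the quoted example and using `None` to stand for an "$unlimited" cap. The function names and the constant are hypothetical, not anything AWS exposes:

```python
import math
from typing import Optional

# Ratio taken from the quoted rule above; purely illustrative.
DOLLARS_PER_INSTANCE = 100


def max_concurrent_instances(spending_cap: Optional[float]) -> float:
    """Return how many instances may run at once under a spending cap.

    `None` stands for an explicitly chosen "$unlimited" cap.
    """
    if spending_cap is None:
        return math.inf
    if spending_cap < 0:
        raise ValueError("spending cap cannot be negative")
    return spending_cap // DOLLARS_PER_INSTANCE


def may_launch(requested: int, running: int, spending_cap: Optional[float]) -> bool:
    """Allow a launch only if the total stays within the cap-derived quota."""
    return running + requested <= max_concurrent_instances(spending_cap)
```

With a $1,000 cap this allows 10 concurrent instances, so a request for 1000 miners is refused up front instead of being refunded afterwards.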

yorwba · on April 6, 2021

That's where "if the costs become exorbitant, Amazon is in a better position to improve their own systems to reduce the amount of overages that people run into in practice" comes in.

If starting 1000 instances exceeds your configured budget, they could simply not start them, and shut down whatever number of instances you managed to start as soon as their cost exceeds the budget.

_flux · on April 6, 2021

I guess the question is how precise a monitoring and reaction system Amazon wants to build for this, for an arguably marginal use case anyway; they do already provide AWS Budgets for automatic actions when exceeding budgets, but it's not quite as real-time. And then making the niche cases favor the customer is an invitation to abuse.

But at least all Amazon accounts are tied to a credit card, so abuse on a scale similar to, e.g., the GitHub case is not that easy.

onion2k · on April 6, 2021

It's trivial to add a clause that says "If you add a cap to the cost of your account you can only have 3 concurrent instances."

stefan_ · on April 6, 2021

Don't give your customers unlimited margin and leverage then. See also: Archegos.

sneak · on April 6, 2021

> Oh I didn't think I was starting 1000 instances of miners in the EC2 GPU cloud, it certainly exceeded my configured budget, please give my money back..

All AWS accounts start off without the ability to do this (via the quota system) and being able to start 1000 ec2 instances of any type is a setting that needs to be unlocked via a support request (which can never be done by that support person, but always needs to be escalated to the "service team" and takes about 1 business day).

a-priori · on April 6, 2021

As I posted last time this came up here:

This is a billing question, not a technical question, and looked at through that lens it's easy to put a hard limit on a monthly bill: just don't ever issue bills greater than that amount. If I say I only want to pay a maximum of $1000 a month, and I hit that limit but it takes a bit for the provider to shut everything down so really $1100 of resources were consumed, then the provider eats the $100 overrun and I get a bill for $1000.

With an actual hard limit you create a financial incentive for the provider to minimize this overrun. Yes it might be difficult to fix but I assure you, if hard limits existed, the technical issues would be solved soon enough because now there's a reason to invest in a solution.
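
Seen purely as a billing rule, this is a few lines. A minimal sketch, assuming metered usage is already known at invoice time; `Invoice` and `issue_invoice` are hypothetical names used only for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    billed: float             # what the customer is asked to pay
    provider_absorbed: float  # overrun the provider eats


def issue_invoice(metered_usage: float, hard_limit: float) -> Invoice:
    """Never bill more than the hard limit; the provider absorbs the rest."""
    billed = min(metered_usage, hard_limit)
    return Invoice(billed=billed, provider_absorbed=metered_usage - billed)
```

For the example above, `issue_invoice(1100, 1000)` yields a $1000 bill with $100 absorbed by the provider. That absorbed amount is exactly the number the provider then has an incentive to shrink.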

andrewguenther · on April 6, 2021

GCP doesn't support a cap, only alerts. You can use those alerts to implement your own cap mechanism, same as on AWS, except that AWS billing is only on a one-hour delay, I believe, so I'd say AWS wins here.

Clewza313 · on April 6, 2021

App Engine used to support caps. They're no longer supported, because for every customer pleasantly surprised, there were five customers incandescent with rage that their service had gone down at the worst possible moment due to a spike in actual demand.

Arnt · on April 6, 2021

How do you know?

I can easily believe it, and I can easily believe that AWS employees have heard such stories, but I'd love to have it be more than an anecdote.

pydry · on April 6, 2021

I could equally well believe that they got rid of it because it affected the quarterly earnings report and there was maybe one customer who was "incandescent with rage" that the caps they put in place worked exactly as advertised.

Arnt · on April 6, 2021

Well... most cloudy limits only affect current operations. If you add a limit to the number of VMs running you might experience service degradation for a while, until you learn to cope with new peak demand by increasing your quota or being more efficient.

That raging customer might well assume that because almost all limits are like that, all are, including the new S3 limit, but the S3 limit causes service degradation forever, not during peak load. The writes that failed for a while map to reads that'll fail forever, because that data isn't there.

We can come up with possibilities that sound more or less plausible. I'd love to hear something more factual.

runnerup · on April 6, 2021

Indeed, here is GCP's own guide which confirms what you say and provides a hacky way to implement a cap mechanism.

https://cloud.google.com/billing/docs/how-to/notify#cap_disa...

Agingcoder · on April 6, 2021

We talked about caps with the Google reps at my day job.

The short answer was 'we can, but don't want to' (note: this may be completely unrelated to what Google thinks internally, and is just what the fairly high up the food chain rep told us).

dijit · on April 6, 2021

I'm using caps right now, they work- it shuts down all resources attached to the billing account if it goes over. There are many levels of alerts before it hits that though.

It's at the "total spend" of a billing account level though, and obviously you'd have to be a billing administrator, so working with it is awkward; many billing accounts across disparate projects is basically the only way.

onion2k · on April 6, 2021

> My understanding is that this is a difficult problem to solve "perfectly" due to lag between incurring a cost and recording the cost.

It's impossible to solve perfectly with tech.

Amazon saying "If you put a number in this form we won't charge you more than that, but your account will be limited by <list of limitations> and if you go over those limits then <long list of conditions that will apply if you go over the limits up to and including removing you from the platform if AWS think it was fraudulent>" is a perfect solution.

The only reason why there would need to be a perfect tech solution is if AWS are concerned about giving people a small amount of service that they're not able to charge for. AWS clearly believe protecting themselves from overages is more important than giving customers peace of mind that they're not going to be hit with a huge bill. That's a reasonable position for a business to take, and they're completely free to do it, but you can't also argue that customers are wrong to avoid deploying to AWS because they're scared of a surprise bill.

dvfjsdhgfv · on April 6, 2021

This argument ("it's not implemented because it's too difficult to do properly") was debated ad nauseam in the past. Well, look how others implemented it and you will see how to do it.

A simple example: S3. Currently, what happens when you exceed your credit card limit is that they send you an email they weren't able to charge you but they continue to provide the service for the next two months, and during that time everything works fine: you can access your buckets and you are charged for storage and transfer.

Now, what would happen with hard caps implemented? You didn't pay, so you're locked out of your account. Nobody can access your S3 objects, including yourself. If you care enough about them, you need to make a payment and settle your account. If you don't do it within one month, the whole content will be deleted.

soco · on April 6, 2021

But does it need to be solved perfectly? Change my mind, but I don't believe anybody, garage nerd or small startup, would be affected if they overshot their planned costs by 10%. I really need a solution when, by (my) mistake, my bill gets 100x bigger. And I'm convinced AWS could so easily solve this - if they cared to.

pardner · on April 6, 2021

There ought to be an "emergency shutoff" threshold, period. And there's just no customer-centric excuse for not implementing it after this many years.

Here's how to implement it:

"Amazon, what do you do today if my credit card fails and all the retries fail?"

Do THAT if billing hits <my emergency off switch threshold>.

Will it disrupt the heck out of all my AWS services? Of course. That's the point, if something went so seriously wrong that my billing hits an absurd level that will put me out of business, I'd rather have downtime.
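
A minimal sketch of that idea, where `on_payment_failure` is a hypothetical hook standing in for whatever the provider already does once all card retries have failed. Nothing here is a real AWS API:

```python
from typing import Callable


def check_emergency_shutoff(
    month_to_date_spend: float,
    shutoff_threshold: float,
    on_payment_failure: Callable[[], None],
) -> bool:
    """Reuse the existing payment-failure path once spend hits the threshold.

    Returns True if the shutoff fired, False otherwise.
    """
    if month_to_date_spend >= shutoff_threshold:
        on_payment_failure()
        return True
    return False
```

The point of the design is that no new shutdown behavior has to be invented: the threshold only decides when the already-existing failure path runs.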

m0dest · on April 6, 2021

At one point, I owed a balance of $0.57 to AWS and started to get warning emails about my account being suspended. Just out of morbid curiosity, I waited to see what would happen.

2.5 years later, after dozens of automated mails, they finally suspended it.

lupire · on April 6, 2021

The challenge is that there is a lag between consuming the resource and counting the price.

pardner · on April 7, 2021

Seems like yet more "let perfect be the enemy of the good" thinking.

If you want a cut-me-off set to $X, and some lag might allow charges to reach X+Y before the cutoff took effect, which is the customer-centric answer:

a) don't offer ANY cap, simply let the customer's out of control charges just keep racking up to catastrophic levels that put them out of business?

b) cut it off as soon as you DO detect it exceeded their cut-me-off threshold even if by that point it has reached X+Y?
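
To put a rough size on Y, here is a small simulation. It assumes a constant hourly burn rate and a fixed metering lag, both simplifications chosen for illustration rather than a description of any provider's real billing pipeline:

```python
def cost_at_cutoff(cap: float, hourly_rate: float, lag_hours: int,
                   max_hours: int = 10_000) -> float:
    """Return the actual cost at the moment a lagged meter first sees the cap hit.

    The meter only observes spend from `lag_hours` ago, so real spend keeps
    growing between crossing the cap and the cutoff taking effect.
    """
    for hour in range(max_hours + 1):
        actual = hourly_rate * hour
        observed = hourly_rate * max(0, hour - lag_hours)
        if observed >= cap:
            return actual
    raise RuntimeError("cap was never reached within max_hours")
```

Under these assumptions, a $1000 cap with a $50/hour burn and a 2-hour lag cuts off at $1100, so Y is about the burn rate times the lag: bounded, unlike option (a).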

gogopuppygogo · on April 6, 2021

Just offer to switch existing usage to flat rate services instead of consumption based services. This isn’t rocket science. We know how to control costs.

Cloud vendors obviously don’t want this.

Roark66 · on April 6, 2021

There are no billing limits, but there are resource limits set by AWS upfront. I had to create many support cases to raise this or that limit. (For example, we had a limit of 100 concurrent Lambda workers, then 2.5k if I remember correctly.) Some number of active EC2s, some total TB of storage, etc. We were hitting those limits pretty frequently despite spending over £50k per month with them (mostly dev and test services).

toomanybeersies · on April 6, 2021

I'm pretty sure that AWS' service quotas exist more as a guardrail to prevent customers from accidentally spinning up 1000 instances instead of 10, which would not only leave you with an eye-watering bill, but affect resource availability for other customers.

They're usually quite happy to increase the quota if you contact support.

whoknew1122 · on April 6, 2021

Most limits exist within what AWS has determined is 'normal' usage. Once you pass that, you can request a service limit increase.

Service limit increases are typically only denied when raising the limit would negatively impact the availability of the service (noisy neighbor issues, for example), or if the customer is needing a limit increase because they're trying to use the service in a way it wasn't designed.

rk06 · on April 6, 2021

but Azure has a cap? if MSFT can do it, what is preventing GCP or Amazon from doing the same?

and it takes GCP a day to report billing to consumer. they can monitor the data earlier than that, and stop the services early

ztcfegzgf · on April 6, 2021

are you sure Azure has this?

i looked into this a year ago and there was no such thing available (there is https://docs.microsoft.com/en-us/azure/cost-management-billi... but that is not really a spending limit, it's more like in some cases you get credits from microsoft and when you have spent all the credits they stop things for you). i mean a system where you pay monthly what you consumed, and you can set a limit, and the provider guarantees that you do not have to pay more than the limit.

but maybe i overlooked something, so if you know more about this, please tell.

btw. fly.io has a kind-of spending-limit, you can preload some credit into the account and when that is spent you are relatively safe ( https://community.fly.io/t/can-i-set-a-billing-limit-per-mon... )

hypertele-Xii · on April 6, 2021

Pay up front. Offer services up to total accumulated paid amount.
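
A minimal sketch of such a prepaid model, assuming usage can be checked against the balance before it is served. `PrepaidAccount` and its methods are hypothetical names for illustration:

```python
class PrepaidAccount:
    """Tracks amounts in integer cents to avoid floating-point drift."""

    def __init__(self) -> None:
        self.total_paid_cents = 0
        self.total_consumed_cents = 0

    def top_up(self, cents: int) -> None:
        """Add a payment to the accumulated paid amount."""
        if cents <= 0:
            raise ValueError("top-up must be positive")
        self.total_paid_cents += cents

    @property
    def remaining_cents(self) -> int:
        return self.total_paid_cents - self.total_consumed_cents

    def try_consume(self, cost_cents: int) -> bool:
        """Record usage only if it fits within what has been paid so far."""
        if cost_cents < 0:
            raise ValueError("cost cannot be negative")
        if cost_cents > self.remaining_cents:
            return False
        self.total_consumed_cents += cost_cents
        return True
```

Usage that would exceed the paid total is simply refused until the next top-up, so the worst case is bounded by what has already been paid.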