Hacker News new | past | comments | ask | show | jobs | submit login
Serverless: A lesson learned the hard way (sourcebox.be)
204 points by V3loxy on Aug 10, 2017 | hide | past | favorite | 133 comments

>This is probably the most stupid thing I ever did. One missing return; ended up costing me $206.

No, dear author. Setting up the AWS billing alarm was the smartest thing you ever did. It probably saved you tens of thousands of dollars (or at least the headache associated with fighting Amazon over the bill).
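For anyone reading who hasn't set one up: a minimal sketch of such a billing alarm using boto3. The SNS topic ARN and the $10 threshold are made-up placeholders, and note that AWS billing metrics only exist in us-east-1 and require "Receive Billing Alerts" to be enabled in account preferences:

```python
# Sketch: a CloudWatch alarm on estimated monthly charges.
# The SNS topic ARN and threshold below are placeholders, not real values.

def billing_alarm_params(threshold_usd, sns_topic_arn):
    """Build the keyword arguments for cloudwatch.put_metric_alarm()."""
    return {
        "AlarmName": f"billing-over-{threshold_usd}-usd",
        "Namespace": "AWS/Billing",        # billing metrics live here...
        "MetricName": "EstimatedCharges",  # ...and only in us-east-1
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,                   # billing data updates roughly every 6h
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],   # e.g. an SNS topic that emails you
    }

params = billing_alarm_params(10, "arn:aws:sns:us-east-1:123456789012:billing-alerts")
# import boto3
# boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params)
```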

Developers make mistakes. It's part of the job. It's not unusual or bad in any way. A bad developer is one who denies that fact and fails to prepare for it. A great developer is one like the author.

Just wanted to say that last paragraph is one of the simplest descriptions of professionalism in the job I have read - accept and prepare for your own human failings :-)

This is not a "Serverless" problem; this is a mistake a developer made that used a pay-per-use system. If I write code that launches EC2 instances and I accidentally set it to launch an instance every second instead of minute because I divided wrong, that's my fault.

It is a serverless issue because if you were using your own server, a mistake like this wouldn't have cost money, it would have just degraded your service (or possibly brought it offline).

So I guess the question is, with a mistake like this, is it better to be charged hundreds or thousands of dollars, or to have your service degrade or go offline until you can fix it?

Could you just do serverless where it starts to rate limit once you reach a certain cost? It seems like this is an issue that could be fixed somehow.

Yes, but that isn't always the right answer. If your system is starting to cost "a lot", is that because of a bug (this case), or is it because your idea just "went viral" and you are now getting tons of paying customers signing up?

If it is the latter you do not want any rate limiting, you want everything to scale as fast as possible (I hope there are no bugs on your end). Rate limiting means that your new customers get a poor experience and so they are more likely to ask for a refund, or not renew next time.

It's almost as if... they should offer multiple options so customers could choose based on their business/hobby needs:

1. Warn me at $X but don't throttle me for any reason--I'll pay if I go viral

2. Warn me at $X and start throttling until I get to $Y at which point stop service and stop charging

3. Warn me at $X and stop service/charging immediately

The market has done this, in part, through segmentation.

When you are on shared hosting, the expectation is that you get shut off when you go over.

When you are on "unlimited" shared hosting, the expectation is that you and everyone on the server gets throttled when you go over.

When you are on a VPS, the expectation is that you will be throttled when you go over, and you will be throttled much less than with other options when your neighbor goes over.

With cloud, then, the expectation is that if you go over, you are charged more proportionately, but things continue to work.

Of course, this is a simplification, but I think it accurate enough to be useful.

I do agree that it would be better to choose your api/provisioning and node reliability separately from overage behavior, but most of these behaviors and expectations were based on traditions that were shaped by technical constraints.

To credibly say "we will keep you online and just charge you" you need a lot of spare capacity.

Throttling one customer on a shared host without impacting other customers used to be very difficult. It is still way easier to throttle one VPS customer, and easier still to throttle that one customer when they have their own kernel and reserved memory; it is not as big of a deal as it once was, considering everyone now uses SSDs, but systems that share page cache are notoriously difficult to set up such that heavy users don't impact light users.

AWS has so many services that trying to decide what to stop if you reach a billing threshold would be impossible to automate. Similarly, pricing is not built into the individual services APIs, so adding a per-item billing threshold would not be a trivial task.

> based on their business/hobby needs

AWS is not interested in hobbyists - other vendors are picking up the crumbs there.

You can optimize step 3 away into step 2.

This is splitting hairs though. It's a mistake in the code that caused it to do something unexpected that costs money. In the serverless world, that means invoking a function repeatedly, costing money. In the old-server world, maybe it means your script had a bug that downloaded an image repeatedly, causing you to rack up networking charges.

It is a mistake, yes. But this particular mistake would have behaved very differently on a normal server. Just because there exist mistakes you can make that would have the same consequences on regular server vs serverless doesn't mean you can just shrug your shoulders and say all mistakes are the same.

The fundamental issue here is serverless is great at allowing you to automatically scale to meet demand, but it also is great at automatically scaling to meet unexpected resource usage caused by errors (or poor design). And so this means a mistake on your end can cost you a lot of money, because the system thought that it was real demand.

Isn't there also a third danger with anything that scales your bill as your app scales - the possibility of some black hat ddos-ing you for the hell of it?

Yes, but I guess in that case you would put your lambda function behind an API gateway, and limit the user requests. If it's a static content you would serve it from a CDN. Not a specialist on this, but that's what I would do.
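As a sketch of what that limiting could look like (the API id, stage name, and numbers here are illustrative placeholders, not anything from the article): API Gateway usage plans accept a steady-state rate, a burst limit, and a hard quota, roughly in the shape boto3's create_usage_plan expects:

```python
# Sketch: parameters for an API Gateway usage plan that throttles request
# rate and caps total requests. All ids and limits are hypothetical.

def usage_plan_params(api_id, stage, rate_per_sec, burst, monthly_quota):
    return {
        "name": "budget-guard",
        "apiStages": [{"apiId": api_id, "stage": stage}],
        "throttle": {"rateLimit": float(rate_per_sec), "burstLimit": burst},
        "quota": {"limit": monthly_quota, "period": "MONTH"},
    }

plan = usage_plan_params("abc123", "prod", rate_per_sec=10, burst=20,
                         monthly_quota=1_000_000)
# import boto3
# boto3.client("apigateway").create_usage_plan(**plan)
```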

Wouldn't an API gateway typically limit requests per IP/end user?

I guess it could limit global request rate. But the idea of unbounded elastic services behind a global rate limiter is just funny to me. Like a Ferrari with a 50mph limiter.

Yes, I still don't get it.

> It is a serverless issue because if you were using your own server, a mistake like this wouldn't have cost money.

We dynamically create and instantiate new servers based on load, if it's sustained for a while. Once a server is up, it's added to the load balancer. Once its load goes down, it's spun down after it's spent some time idle (it costs to instantiate, so we might as well keep it out of the queue for a bit before completely removing it).

This all runs automatically. If we don't limit it, it's on us.

How is this not a problem with how he managed it?

> This is probably the most stupid thing I ever did. One missing return; ended up costing me $206.

He clearly mentioned it's his error there.

There's a chance that degradation or unavailability is not free as in beer.

If the degraded or offline system is used by people, and these people cannot work, the cost can be a lot higher. For example, 10 people not able to work could cost something in the range of $250-$750 per hour.

Moreover, if customers are lost due to this degradation of service and CAC is high, then clearly the cheapest thing is a high bill from AWS, which is probably also capped by Amazon (and handled as an alert by Amazon).

Oh sure. That's why I posed it as a question. Service degrading or going offline could be disastrous and cause losses of thousands of dollars or more, depending on what the service is. But there's also plenty of services where it's cheaper to have the service go down than it is to get an outsized AWS bill. This is just something you need to be aware of when deciding if serverless is the way to go.

The developer should be writing unit tests for their code so they can avoid small mistakes like this.

It is impractical to cover every line of code with tests (it would get too expensive). Furthermore, in this case the author would have had to test production config interacting with Amazon's servers rather than a piece of code.

And even 100% code coverage doesn't find all possible errors.

Do you need to cover every _line_ of code, or do you need to test resulting behavior? Also, while nothing is 100% foolproof, the example here would probably have been caught.

I doubt this, a unit test wouldn't have covered the infinite triggering of the created events

> because of a refactor, I forgot the return statement and it just continued overwriting the file again

Unit tests are specifically useful for refactors. You can refactor your code and ensure that it behaves as intended. Integration tests are great, too, don't get me wrong. Either or both would have probably caught this.
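As a sketch of that kind of test (this handler is a made-up stand-in, not the article's actual code): the key assertion is that a file the handler itself already produced does not get written again.

```python
# Hypothetical stand-in for an S3-triggered handler: process an uploaded
# file, but skip files we already processed. The missing `return` in the
# article's story was exactly this kind of guard.

def handle_upload(key, already_processed, store):
    if key in already_processed:
        return  # the forgotten statement: without it, we'd re-write below
    store[key] = f"processed:{key}"
    already_processed.add(key)

store, seen = {}, set()
handle_upload("photo.jpg", seen, store)
writes_after_first = dict(store)
handle_upload("photo.jpg", seen, store)   # simulated self-trigger
# A unit test would assert the second call wrote nothing new:
assert store == writes_after_first
```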

this was a problem involving multiple parts; unit tests normally don't catch this. You need an integration/functional test, and that can be much more time consuming to write for all "integrations" and code paths.

It was a single function that changed behavior after a refactoring. It did work where it did not need to, because the work was already done on the object. This is only hard if you don't test at all and can't already mock the object download/upload or don't have pure functions.

In this case it might have helped, I didn't read the code, but in a more general case these kind of things are rarely found by unit tests. I still doubt that the triggering caused by the file change would have been found in a unit test.

Down vote for advocating for unit tests? That's just good practice in general.

I think integration tests would be more appropriate here, especially since there are different co-operating moving parts: S3 <--> EC2/Lambda.

I notice AWS doesn't have any ability to set limits....

It's "serverless" in the sense that if the developer had provisioned a "server", then the maximum cost incurred would equal the cost of that server, no more, no less.

So yeah, let's blame the developer, but let's not play like mistakes don't happen and they're not costly in the "serverless" world.

Use EC2, set an auto scaling threshold on CPU utilization, do something dumb, you’ll find you ran “a server” x n.

It’s easy to burn tens or hundreds of thousands ‘accidentally’ on “server”, easier than on serverless.

If you’re spending real money, you should have an account team. Talk to them if such a problem happens.

> If you’re spending real money, you should have an account team. Talk to them if such a problem happens

The colloquialism for "real money", at least to me, is "a substantial sum". If that's what you intended, wouldn't it make sense that you wouldn't have an account team if the only time you spent real money was by accident?

An autoscaling group has a max spend of '"a server" x n', where you set what n is. Autoscaling groups don't keep adding servers forever.

LOL WAT? How is it "serverless" if you're provisioning a server?

He's saying that the problem is with "serverless architecture", because the problem could not possibly have happened without the use of a serverless architecture (i.e., it could not have happened had he provisioned a server himself). The problem is exclusive to serverless.

> The problem is exclusive to serverless.

That's simply not true. You can accidentally run up huge bills with EC2 instances too. One typo in your CloudFormation templates could spin up a ton of reserved p2.16xlarge's.

Of course, if you consider EC2, and other AWS services, to be "serverless" too - you're not physically managing your own racks after all - then, yeah, fair enough, it is a problem exclusive to these "serverless" IaaS/PaaS providers.

That’s what he’s saying but that’s not correct.

If I set up a server to run a script each time a file gets updated in a folder, then the costs aren't going to increase.

It's not a "problem" with server/serverless of course, but with no-scaling-by-default vs unlimited-scaling-by-default (which is, imo, the better way to split the server/serverless topic); one is going to cost more when things get thrown for a loop.

You could argue that it's a pay-per-use problem because of inherent unpredictability in that pricing model. When you have "big" resources behind you, it's less impactful but still an issue. The difference between server-based and serverless here is how the cost grows. You can predict what the cost of a server is going to be and strictly control it. Can you actually do that with the serverless option?

It's a pricing model and limits issue from AWS. Same as you can't launch 1000 instances by mistake (there are account limits), you shouldn't be able to run hundreds of thousands of recursive serverless calls by mistake (isn't there an account limit?).

What you say is of course true. However, with EC2 that risk is mostly limited to the code launching EC2 instances. Serverless expands that risk to your application code. There are just way more opportunities to screw up in a way that directly hits your wallet.

The same would apply to auto-scaling with EC2 instances though. If you were scaling resources based on a queue and made the mistake of adding something to the queue every time you finished something on the queue then you would start to use too many resources.

Without autoscaling, you would just have a queue that grows until the machine runs out of disk space. Either way, this was a problem with code and not event based scaling.

Maybe the budget notification was late.

"The actual cost is now $206 and over $1000 forecasted, it makes me think twice about using pay-per-use services in the future."

Never use a pay-per-use service that does not include a reasonable "turn off after $X" feature and appropriate warnings. Also, never use such services without being sure to configure such settings.

I like to think of this as a self-inflicted "DDOC" attack: Distributed Denial of Capital.

Best not to leave yourself exposed.

Amazon refuses to set up bill capping, even though users have been asking for it for many years: https://forums.aws.amazon.com/thread.jspa?threadID=58127&sta...

In a past life as a cloud (VDN) provider, this was a real trade off.

When you have customers doing events, it’s more often that the scale up is from a real event than that someone fat fingered a config.

If they are broadcasting an unscheduled Obama speech from home page of a major paper, that’s not the time to go “Oh, anomalous, shut it down.” By the time that gets fixed and back on, Obama’s left the building - and your customer leaves too.

If you are in the business of offering a service with “elasticity” as a core capability, we found it better for SLOs and better for the bottom line to ‘fix’ this after the fact by discussion than to attempt to tell real spikes from glitches.

If you don’t want elasticity, you might not be looking for “cloud”.

If you really wanted, you could create a script that, after a certain billed amount is reached, switches your site via Route 53 over to a static S3 page that says "down for maintenance" or something, until you figure out why your forecasted billing amount is so high. forecastedSpend is an object you can call via the API: http://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/...

and an SDK like boto3:
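A rough sketch of that script (the threshold, dates, and the commented-out calls are illustrative assumptions; the actual failover would go through Route 53's change_resource_record_sets, and the forecast through a Cost Explorer call):

```python
# Sketch: decide whether to fail over to the static "down for maintenance"
# page based on the forecasted monthly spend. Values are placeholders.
from datetime import date, timedelta

def should_failover(forecast_usd, threshold_usd):
    return forecast_usd > threshold_usd

def forecast_request(today):
    # First day of next month, via the "jump past day 28" trick.
    month_end = (today.replace(day=28) + timedelta(days=4)).replace(day=1)
    return {
        "TimePeriod": {"Start": today.isoformat(), "End": month_end.isoformat()},
        "Metric": "UNBLENDED_COST",
        "Granularity": "MONTHLY",
    }

req = forecast_request(date(2017, 8, 10))
# import boto3
# resp = boto3.client("ce").get_cost_forecast(**req)
# if should_failover(float(resp["Total"]["Amount"]), threshold_usd=50):
#     ...switch Route 53 over to the static S3 maintenance page...
```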


That wouldn't have made much of a difference in the particular case of this article.

EDIT: I've reposted my question as a top-level comment to give it a bit more visibility as I'm interested in seeing answers as to what would be a fair price for such a service.


Amazon has always been happy to DDOS your wallet - one of their whitepapers a few years back on how to survive a DDOS attack boiled down to "out-scale it".

I can't imagine this changing.

OTOH, for anything that's not a toy project, this is one of the viable approaches. Power tool is powerful (which also means you can hurt yourself using it), news at 11.

> Never use a pay-per-use service that does not include a reasonable "turn off after $X" feature and appropriate warnings.

None of the "Cloud providers" offer that. They "claim" that it could impact service - yeah, service of debt that you owe them.

Azure has this. When you hit your spending limit, it shuts down your services.

They didn't a year and a half ago. Created a 3k bill for my employer over RemoteApp. Yeah, charge per user, they said. Oh yeah, min 20 users, and we round up - of course in small print at the time.

Unless I have hard guarantees, I give "cloud providers" re-loadable cards. Can't take more money than what's on there.

The vendor could still assign the debt to a collection agency or sue; a declined charge does not get you off the hook unless they decide it's not worth pursuing.

That's very true. That's why I provide generic usernames and everything. Because the way the current providers offer service is essentially through a debt system. You rack up the $$$, and they tell you after the fact.

I would greatly prefer to pay up front, and have services take my credit. That way, I could control my costs directly and concisely. No surprise billing. DOS'es get stopped by no more funds- they aren't the infinite money piggybank they are now with debt.

I also understand why some clients would want a debt based system where they can expand and contract their costs. I'm cool with that, as long as you know what you're signing up for. The person in this article didn't, and surprise billing is majorly at fault here.

My solution would turn this from "You owe us $20,000 by end of month" into "Your credit is exhausted after 10 minutes; something seems wrong with this account when compared to its history."

If you're out of credit, what happens to your data-at-rest? You know, the stuff you're storing in their block storage, where they charge just for storing it? Should they just purge your data?

It's been around a while, but I looked into it and it looks like it's based on the type of account you have[0]. I've only had the accounts that had spending limits as an option. I'd imagine a lot of people are on Pay-As-You-Go, so many people won't have the ability to set spending limits. Frustrating that they would lock these features behind different plans.

[0] https://azure.microsoft.com/en-us/support/legal/offer-detail...

They can terminate your AWS/Azure/DO account though.

And now we wait for the first reports of production services that were shutdown due to spending limits :)

I would rather production go down for a bit than have the whole company go bankrupt

Or you bankrupt them by not having them shut off. It's a CoS - Cost of Service - attack.

This always comes up but people seem to just ignore the complexity: What exactly is supposed to happen when the spend hits your budget?

Even a few bytes sitting on S3 continue to incur charges and it's hard to be real-time with spend tracking at the scale of these providers so the only option they have is to delete your entire account immediately. Is that what you want? Who would?

For most companies, business continuity matters. The proper solution is to use the budget and reporting features to check your work.

You did well to have billing alerts enabled. Exactly the same thing happened to me, but I didn't notice for three months - no emails because I'd created a separate account and domain for a side project. Didn't notice anything on my card because the charge had been declined, but my bank didn't contact me. Finally found out because I knew the local AWS rep (he was the relationship manager for the accounts we use at work). Had to apologise and explain the situation in detail to AWS, and they forgave the bill. That was tens of thousands of dollars.

For those debating whether this is a 'serverless' problem or not, I think the point is that it's a lot easier to shoot yourself in the foot. Great power + responsibility, etc. On regular servers (outside of unbounded autoscaling), mistakes cost a flat rate.

Not necessarily.

With a regular server you could go viral and your server dies, so you also lose without a bound, in lost business/goodwill/whatever.

Also need to take into account the time/effort spent on making the regular server scale, albeit this is also a relatively flat rate.

TLDR: Deployed buggy code with infinite trigger. Hosting costs increased.

The only difference here is that if he deployed it on his own server, he wouldn't have lost money.

"a $180 actual cost. I was left with a light headed feeling, it's a lot of money for me"

People still play with fire. Limit your losses; go with Digital Ocean or something for $5/mo flat no matter what.

His previous blog post actually said that he moved away from the exact $5/mo digital ocean plan you're talking about, to this.

I think the author meant to do this less of a "play with fire" way but more of experimenting with new tech way. But yes, I agree that for personal sites, running with your own money, you probably want to stick with something safer like the $5/mo digital ocean box.

This is indeed the case, playing around with new tech. I've been a happy customer with DO for years but my own website was the ideal case to try out the whole serverless thing. As my website doesn't get much traffic so it'd cost me next to nothing. I do agree with most of the comments here though and the $5 DO box is the safest choice. I might have been a bit too excited with the new things and failed to think logically :)

"Oh, look, new tech. It shines! It heats! Ouch, it also burns!" The comparison to fire is quite fitting.

That depends; for $5 it breaks after a certain level of traffic. For many applications it's always better to spend $$ instead of having things shut down (hosting is usually insignificant compared to people, revenue, etc.).

If $180 is significant, then this person is probably a student or on a budget, not a business. So they'd probably prefer the downtime.

Oh come on. If this is a side project, it's likely impacting his personal monthly budget. A sudden, unplanned, $180 expense on a personal budget can EASILY be significant - even if you make six figures.

It's better to have the ability to cap $. With something like DO it's way easier to control costs compared to AWS.

The infinitely scalable cloud services have this big problem with surprise bills. In the old days teams often made do with what was available; now it's too easy to spin up more resources.

I'm surprised people still talk about cloud services as being cheaper, especially where developers are free to use whatever they want.

The main issue here is the budget notification emails aren't an adequate mechanism to catch infinite loops. They are too slow and you've already racked up big overages by the time you see it.

Idea: use API Gateway to configure a quota to match your budget projections. That will force a hard stop. Would be nice if AWS made this easier.
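A back-of-the-envelope sketch of sizing that quota (the per-request prices are assumptions from memory, roughly 2017 list prices, and exclude Lambda compute GB-seconds; check the current pricing pages before relying on them):

```python
# Sketch: turn a dollar budget into an API Gateway monthly request quota.
# Assumed per-million-request prices (verify against current pricing):
API_GW_PER_M = 3.50   # API Gateway charge per million requests
LAMBDA_PER_M = 0.20   # Lambda invocation charge per million requests
                      # (compute GB-seconds are extra and excluded here)

def monthly_request_quota(budget_usd):
    per_million = API_GW_PER_M + LAMBDA_PER_M
    return int(budget_usd / per_million * 1_000_000)

quota = monthly_request_quota(10)   # a $10/month ceiling -> ~2.7M requests
```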

I wrote my (small) AWS app so it can run both on AWS and my local machine. Then you can write tests against the higher-level logic like "save this file to S3" and run those tests locally as well.

My main challenge with serverless is using Lambda with API Gateway. Lambda has no database connection pooling, so I end up with a ridiculous number of connections to RDS - one for each simultaneous user. I haven't found a solution to this yet, other than not using API Gateway.

One solution: use an external pooler like pgbouncer or mysql-proxy running on a small instance (or a few).

I'm actually kind of blown away RDS doesn't have pgbouncer installed on the database. That's how we operate... each db server has pgbouncer living right on it. We connect to bouncer, not directly to the DB.
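For reference, a minimal pgbouncer sketch of that setup (hostnames and pool sizes are placeholders; transaction pooling is the usual choice for many short-lived clients like Lambda invocations):

```ini
; Sketch: pgbouncer in front of RDS, funneling many short-lived Lambda
; connections into a small server-side pool. All values are placeholders.
[databases]
appdb = host=mydb.example.us-east-1.rds.amazonaws.com port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction   ; reuse server connections per transaction
max_client_conn = 1000    ; many Lambda clients...
default_pool_size = 20    ; ...share 20 real database connections
```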

There's a lesson in there about how AWS makes a crisp $10m bill for the richest man in the world every day.

I know you're being sarcastic, but I feel it's partly true. They announce so many different services, each month something new appears, but this very basic feature - bill capping - asked for by users from the very beginning, has never been implemented. It's hard to believe they lack the skill, or that it would be much more complicated than the current alert system.

According to the folks I've spoken to, they don't do bill capping because they have no way to safely shut down your workloads in any way - they'd prefer to let you know and have you do it. And having that choice is way better than a capping operation destroying your production database or causing downtime for your users.

I'm sorry, I just don't buy that. It doesn't have to be a hard cap, it could be a soft one. i.e. at £x your servers start shutting down, you'll get billed for a few extra minutes over your cap, before things have finished shutting down. Servers are totally capable of being shutdown without destroying databases.

Besides, we aren't really talking about production databases at large companies. The people who want caps are devs learning and experimenting. It could come with disclaimers that if you enable a cap and exceed it, your services will go offline unexpectedly, and that may leave databases in inconsistent states. But for a large number of usage scenarios that is a completely acceptable tradeoff.

The simple fact is, not having a cap certainly puts me off experimenting with a service, due to a fear of a mistake causing a big bill. And developers learning and investigating a technology is what precedes them recommending that technology to their companies.

Last time I looked, Azure allows a zero spend cap on free accounts, but you can't change the amount to anything else, and once you remove it you can't switch the cap back on. That's limited, but it's perfect for a learning environment.

If Azure can implement a zero spend cap, there is absolutely no reason that either AWS or Azure can't implement an $x spend cap in exactly the same way.

> The people who want caps are devs learning and experimenting.

Then AWS is not focused on their use-case. You can make the argument that they're throwing away potential business here, but AWS is already the gorilla in the room, and people clamour for their products already. A couple of years ago they were rated as being bigger than the next 17 VPS providers combined.

> I'm sorry, I just don't buy that. It doesn't have to be a hard cap, it could be a soft one. i.e. at £x your servers start shutting down, you'll get billed for a few extra minutes over your cap, before things have finished shutting down. Servers are totally capable of being shutdown without destroying databases.

The idea of a payment cap sounds easy, but with something as complex as AWS, it's incredibly difficult, and everyone would demand different behaviour at the cap. So, you hit your cap. Turning off EC2 servers is easy. What about the data you've stored on s3? That costs. Should it be purged? What about your disk drives on EBS, should they be purged? How about items you have queued in SQS, should they be purged? Are you using RDS databases? They can't be stopped, only destroyed (you can do a final snapshot, but that's going to go to block storage, which costs. not much, but it costs).

"Just stop anything that costs" sounds easy, but it's not, not when you have a service as complex as AWS. AWS's current model of "forgive the bill for obvious mistakes" is way more workable.

Yeah, I totally accept that people who want caps may not be in the target audience. That's probably true. But don't pretend it's because the problem is too hard for them to build, which is what was originally claimed. If their big customers wanted it it would get built.

Besides, Azure proves that it's clearly possible. Azure have a cap. MSDN gives you free Azure credits. When you open an account via MSDN you still have to put in payment details, but you have the option to enable a hard cap that prevents you spending past your free credits. So Azure have clearly got a solution for stopping all the services when the credit limit is reached.

All of what you describe as problems are just decisions to be made. S3 data..? Delete it. Make it read only. Pretend to delete it, but make recovery possible for x days. Doesn't really matter, just pick one when you build it, and document what it does. People who want a cap are going to be more concerned with the overspend than any data or service integrity. They could stick up a disclaimer... "If you enable this cap your data may be destroyed or corrupted if your spending reaches the cap". There are solutions to the implementation problem.

Besides, they probably already have all this code in place. If your payment method gets declined, I'm prepared to bet Amazon don't just let all your services continue running indefinitely because shutting them down automatically is too hard of a problem for them to solve. So any cap could be implemented by just triggering the payment-declined logic.

> S3 data..? Delete it.


> ... Make it read only.

The primary use-case of s3 is reading objects. This would not be a deterrent for quite a few use-cases

> ... Pretend to delete it, but make recovery possible for x days.

Still consumes the space that they're charging for in the first place

> People who want a cap are going to be more concerned with the overspend than any data or service integrity.

This is patently not true, and is why I think you don't really grok why implementing a cap is difficult. It's specifically why I said "everyone would demand different behaviour at the cap". Some would want only this or that service to stop, for example.

A small business sets up a payment cap and hits that cap because they went viral? BAM, all their block storage, destroyed. All their backups, their analytics, their RDS databases, just gone. Right at the time they needed it most. That's a much harder lesson to deal with than "oops, our bill's a bit high because we made a mistake, can you please forgive it?". Or even "ouch, okay we'll pay it". The protection you want for hobbyists would destroy small businesses that may not understand what is actually meant when that hard cap is hit. It's not that caps aren't doable at all, it's just that they're a wicked problem, and the more you look at it, the more issues you can see.

As for soft caps, what is the functional difference between a soft cap and the billing warnings they already have?

Also, their claim on s3 is "we don't lose objects". Destroying objects because of billing would utterly undermine that claim.

> If your payment methods gets declined I'm prepared to bet Amazon don't just let all your services continue running indefinitely because shutting them down automatically is too hard of a problem for them to solve.

AWS does not destroy your services because of late payment. Source: we've just been in late payment.

> The protection you want for hobbyists would destroy small businesses

Are you serious? This would mean Azure is not fit for business:


> When your usage results in charges that exhaust the monthly amounts included in your offer, the services that you deployed are disabled for the rest of that billing month. For example, Cloud Services that you deployed are removed from production and your Azure virtual machines are stopped and de-allocated. To prevent your services from being disabled, you can choose to remove your spending limit. When your services are disabled, the data in your storage accounts and databases are available in a read-only manner for administrators. At the beginning of the next billing month, if your offer includes credits over multiple months, your subscription will be re-enabled. Then you can redeploy your Cloud Services and have full access to your storage accounts and databases.

The decision not to implement this in AWS has nothing to do with technical issues - they can all be solved in this way or another.

I think we are talking cross purposes here. I get what you are saying. A cap that junks your services would certainly not be suitable for everyone, I get that.

But this thread was initially about a dev running some experiments and getting a $200 unexpected bill. They would have been very happy with a $30 cap that just deleted everything. That functionality would be easy to build if they wanted to. But they don't. For other reasons, not because it's hard.

The thing I don't buy is Amazon claiming it's too hard to build a cap. A cap that suits some people would be easy. What they really mean is... A simple hard cap is only useful to customers we don't care about because they don't pay us enough. An advanced cap with all the kinds of failover options and thresholds that a medium-sized business might want is complex, and the people who actually pay the big money (those we care about) don't actually want caps anyway.

> A simple hard cap is only useful to customers we don't care about because they don't pay us enough. An advanced cap with all the kinds of failover options and thresholds that a medium-sized business might want is complex, and the people who actually pay the big money (those we care about) don't actually want caps anyway.

That's an honest answer, I'd be happy if they formulated it this way. Fortunately, there are other cloud providers with billing cap implemented properly, and you don't hear horror stories about them (problems with spending too much on AWS are very common though).

There are so many ways of solving this, and some of them have been suggested to Amazon. They do have some very competent engineers able to implement it in a way that would make everybody happy.

If it were charged at a flat rate like the cost of a Digital Ocean $5/mo instance, would developers pay for a service that provides automated notifications of service overages for all the major cloud providers?

All a developer needs to do immediately after adding a credit card to AWS/Azure/GCP would be to create an IAM role with permission to automatically add and track fine-grained billing alarms and notify via email/sms for any potential billing overages.
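A minimal sketch of what the scoped IAM policy for such a watchdog role might look like. The CloudWatch, Cost Explorer, and SNS actions listed are real permission names; the function name and the exact set of permissions such a service would need are assumptions.

```python
import json

def billing_watchdog_policy():
    """Build a hypothetical least-privilege IAM policy document for a
    third-party billing-watchdog role: it can manage billing alarms and
    send notifications, but nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "cloudwatch:PutMetricAlarm",   # create/update billing alarms
                    "cloudwatch:DescribeAlarms",   # track alarm state
                    "ce:GetCostAndUsage",          # read cost data via Cost Explorer
                    "sns:Publish",                 # fan out email/SMS notifications
                ],
                "Resource": "*",
            }
        ],
    }

print(json.dumps(billing_watchdog_policy(), indent=2))
```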

I think a $60/yr service like this would be useful to protect against future events of bill shock.

Looks like it's already been done, at least for AWS, based on a few Google results (not that I've used/tested any of these):

https://github.com/Teevity/ice
https://billgist.com/
http://cloudcheckr.com/

Azure has a feature on their trial account that when you hit your free limit, you can either:

a) go into credit (so they will charge you at end of month)

b) disable services

Maybe AWS/Google also support a hard limit on spending.

I looked at Lambda (we use AWS a lot at work) and decided to simply stay with a flat rate DO server. I know what it costs, no need to worry.

He obviously has a smaller account so Amazon might be less flexible, but it's worth contacting support and explaining the error. Sometimes they do give your money back in my experience.

In my experience, "flexible" starts north of a few million per month in spend, so why all the startups are running on AWS is a mystery to me.

Not sure if someone has mentioned this already, but you should contact AWS support and ask if they will forgive the bill given it was an error which led to the high costs. I’ve had bills forgiven this way in the past (e.g. forgot to disable an instance that wasn’t really doing anything).

I was about to make the same mistake yesterday, but I had written a validation function that would check inside a folder only, and fortunately I did not upload the file to that folder. And the next morning, I read this article... Man, I gotta be careful LOL

The barrier is relatively low: learning just a bit by following a Linode or DO VPS hardening & stack setup guide to get an Ubuntu server going can go a very long way for development and prototyping environments.

It's gotten much, much easier, and is just another form of command line management, similar to the CLI framework tools with your preferred stack.

Once that first setup is done, similar to setting up a serverless environment, you are generally restoring backups of your base image and beginning projects from there.

It also immensely helps to learn about how to build something to scale that isn't completely reliant on the PaaS layer.

I've built a fair number of serverless services over the past two years using the Serverless framework, apex, and straight API Gateway/Lambda.

It's nice not to have to worry about a server, but I feel like there are just as many little things to futz with in serverless architectures, especially before "environment" variables existed in Lambda.

Set up spending alarms for your account. Personally, mine are $5, $10, $15, $20 and so on. At $30, my wife gets paged.
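A sketch of how such escalating alarms could be scripted with boto3 against CloudWatch's `AWS/Billing` `EstimatedCharges` metric (a real metric, published only in us-east-1 and only once billing alerts are enabled for the account). The SNS topic ARN and alarm names are hypothetical; `create_alarms` is not run here since it needs AWS credentials.

```python
def alarm_specs(thresholds, topic_arn):
    """Build one CloudWatch alarm spec per dollar threshold."""
    return [
        {
            "AlarmName": f"billing-{t}-usd",       # hypothetical naming scheme
            "Namespace": "AWS/Billing",
            "MetricName": "EstimatedCharges",
            "Dimensions": [{"Name": "Currency", "Value": "USD"}],
            "Statistic": "Maximum",
            "Period": 21600,                       # 6h; billing metrics update slowly
            "EvaluationPeriods": 1,
            "Threshold": float(t),
            "ComparisonOperator": "GreaterThanThreshold",
            "AlarmActions": [topic_arn],           # e.g. an SNS topic that emails/pages
        }
        for t in thresholds
    ]

def create_alarms(thresholds=(5, 10, 15, 20, 30), topic_arn="arn:aws:sns:..."):
    import boto3  # imported lazily; requires AWS credentials to actually run
    cw = boto3.client("cloudwatch", region_name="us-east-1")
    for spec in alarm_specs(thresholds, topic_arn):
        cw.put_metric_alarm(**spec)
```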

The cost of using AWS is hard to foresee. For years, my S3 storage was charged nothing. But then one month, it was charged several dollars.

I have migrated all my services to GCE. At least GCE provides free decent quotas for every resource.

Serverless is going to make resource usage a focus in a way it hasn't been for years. The quick feedback, the absence of "all you can eat" pricing and the possibility for savings are all factors in this.

Run+know your infra, none of this is a problem. Serverless is a scam.

So run my static-content-only blog on dedicated hardware that I have to administer, rather than throw it in an S3 bucket with CloudFront in front of it? No thank you.

Qualify your statements.

You don't really need to administer it, there are plenty of hosting options, not just bare metal. And you can always add Cloudflare anyway.

You're missing my point. "Serverless" just means "Applications built on cloud services rather than server(s) I have to administer".

The OP said I should run my own infrastructure. I -could- host my blog by running a web server atop a server I administer, sure. I'd have to take on all the infrastructural tasks of doing that, securing it, ensuring any availability/scalability concerns I may have are taken care of, etc, but I -could- do that.

Instead, S3 + Cloudfront (or, sure, any flavor of hosting and edge caching options you care for; I was not implying "Just AWS") means I don't have to worry about any of that. For me, the reduced level of control, increased availability, scalability, and easy "it just works", is worth the tradeoff. As is the pennies per month it costs me given the low utilization and pay-as-you-go model. It's hardly a scam.

I wouldn't call it a scam. You need a certain level of expertise anyway, otherwise you not only screw up your service, but also, in the case of AWS, lose money. Practically speaking, the amount of work needed to set up a blog on S3 with CloudFlare is not that different from the one needed to deploy, say, a WordPress droplet on DigitalOcean. You just click a few times and it works, you can basically forget it. The only difference is that you are guaranteed not to pay more than $5 or whatever your monthly plan is.

Of course scalability is incomparable in both cases, so if it's something that really matters - and matters more than money - of course something like AWS is a better choice.

It's as simple as shared host FTP drop (for static content), or a cheap VPS on DO, Linode, or what-have-you. Takes all of an hour to configure (with some practice), and if you are running a flavor of linux with APT and ufw, that includes setting up a firewall and unattended-upgrades.

And Cloudflare is still an option, since everyone needs their precious caching.

Serverless still is a marketing tool for cloud providers. It will be useful when it really offers advantages over managing your own servers, especially on the cost and debugging.

Hey, at least it was an enterprise cloud scale infinite loop.

You can host a static site on S3 + Cloudfront, so what purpose does Lambda have in this picture?

>Keep an eye on your logs, test everything again and again.

This is the takeaway quote from this for me.

This has nothing to do with serverless itself; it's rather a problem with AWS.

Off topic, but the "serverless" moniker needs to die. I propose "adminless" as in "server I don't have to admin, configure, or patch" as being much more descriptive of whats really going on.

Eh, if I'm deploying a cloud function, the server truly doesn't exist for me. It's more like a Web Worker running in a privileged environment. I'm ok with the name.

No, it's really a server running your function. You just (are told you) don't have to worry about the server or what it's actually doing.

> You just (are told you) don't have to worry about the server

...until you receive the $206 bill for the work done by the server.

But that works on the VPS level too. We're just talking about higher levels of abstraction on top of hardware. Which one stops being a "server"? I'd argue that once you lose the OS, it's no longer a "server".

Well, it's not adminless either, as AWS have lots of them keeping the hosts alive and running.

But if you can ignore that, you can probably also ignore the fact that your code runs on a server.

Fun programming story!

Endless loops are now billing issues, just like a DoS.

If you are big enough and want serverless no-ops, yet don't want to pay per burger when you already breed cows, then consider kubeless.io

I have a $5 AWS billing alarm.


I live in Florida, and even in Miami $200 is a very nice dinner once or twice a year. $200 for a decent lunch is insane.

He's being ridiculous. I live and work in SF. Restaurants here are definitely expensive! But my wife and I get "decent lunches" for $60, not $200. A quite nice dinner for date night is probably $120-$150. (And if we were more price-conscious, I'm pretty sure that we could get "decent" lunches and "quite nice" dinners for less than we pay).

You can of course spend much more than that if you choose. But $200 is nothing remotely close to the minimum you'd spend for a "decent lunch."

You both remind me of Lucille Bluth. $20 is PLENTY for a "decent" lunch, even in SF... you can be much more than "pretty sure" that you can have dinner for less than $150, too.

I didn't say it was the minimum. Just a nice lunch.

That just shows you've got no idea what you're talking about.

All I said was that I find it bizarre. What is it that you think I don't know exactly?
