It is unfortunate that cost management isn’t something most engineers keep an eye on regularly. Spinning up unnecessary resources, not cleaning up resources once they’re no longer needed, writing inefficient code, etc. all quickly add up to hundreds of thousands of dollars per month in big companies.
I once found a “test” db cluster from an engineer who hadn’t worked at the company for 3 years. We were paying $300k yearly for it before discounts. It took me a literal click to shut it down. And I’m not proud of it, but I had to send out an org-wide email on the savings achieved (corporate politics :shrug:).
The huge achievement of Amazon was designing a system, and selling it to people, where developers no longer had to pre-approve spending. Previously developers were hamstrung by purchase order requirements; it could take weeks to authorize a single computer. Now the pendulum has swung in the other direction: developers can spend unlimited amounts of company money without realizing it, billed in arrears.
And in many cases this is a huge net win! After all, there's another way to waste company money invisibly: design a process which requires meetings and waiting while work is held up.
Overall good points, but don't forget that pre-approval processes resulted in asking for resources that exceeded near-term needs, and once approved, ongoing costs were rarely fully reviewed. I have personal experience with "enterprise" clients making a huge, months-long process to get server resources, reminding us that changes would take 30+ days. When the project was over and we did everything we could to let them know that the servers could be spun down or put to other uses, we got back an "ok thanks!" only to find them still running our project code YEARS later. This is infra that was costing them about 1 engineer FTE per year, not even a $10/mo toy env.
Yes but it is up to a company to control its spending. It must have a process and policy in place to deal with this. It's not Amazon's fault if it hasn't.
That's the whole ploy with Agile, isn't it? In the classical SDLC or Waterfall paradigm, everything was pre-approved and signed off, not just the cost or billing but even the software design itself. Any change in the process and the designers had to raise a change request. Agile changed all of that and now we know how bad things can get with that.
No, Agile is about tight development loops. When weaponized/used by large corporations it often turns into a Stay-Puft man of sorts, but as someone who _hates_ processes normally, I actually kinda like it, and Kanban when done well.
It basically helps keep things clean with decomposition and doesn't necessarily hamstring older devs as much while giving a good guideline for younger devs to work in. All things considered, it seems like not a bad system to me, and the team customizing the process to their own needs is nice as well.
There's a million ways for it to go wrong, but it's not too terrible on the whole I thinks. <3 :"))))
There is no such thing as "no process"; it is something you always have whether you talk about it or not. The often-heard "I hate process" is counterfactual then - what it really means is "I hate process that I see as intrusive/wasteful/whatever".
The positives you are listing are what comes from looking at how things are actually done and doing some of it a bit more thoughtfully.
The common negatives often come from stakeholders outside of development injecting their needs ... sometimes this is unavoidable (e.g. regulatory), sometimes it is just political, but either way there are better and worse ways to do it.
But see https://news.ycombinator.com/item?id=39092563 ; the cost of pre-approval can be pretty large, as is the cost of change once you discover that you've frozen in the wrong thing. Agile benefits both consulting situations where the client genuinely doesn't understand their own needs, and also startups where you're building a new product and need to rapidly iterate in response to market feedback.
(Escalating costs of pre-approval, and the need to design around every possible objection, are a big part of why physical infrastructure costs so much more in the West!)
I found that the problem happens mostly when companies
1. Don't ask developers how much something costs. Engineers love optimisation; getting as much as possible out of a system for cheap is great fun.
2. Lock down the UI, so devs can't even find out how much things cost. That's my current situation. Why block the billing dashboard, then expose it through billing dashboard tools that are not really any better, and in many ways worse?
It's rhetorical really, as I know why. Terrible architecture from "enterprise". Stick everything in a single account so it's hard to figure out how much of the spend is yours. All 3000 databases, and make sure your k8s cluster is 5 8XL boxes so no one can scale down excess capacity.
> 2. Lock down the UI, so devs can't even find out how much things cost. That's my current situation. Why block the billing dashboard, then expose it through billing dashboard tools that are not really any better, and in many ways worse?
This is so true. Billing transparency is very important.
In the past, I had a case like this: a dev accidentally enabled a backup policy for a test database with no retention. FinOps thought the DB backup was important and ignored it. The dev had no access to billing and had no idea what was creeping up the bill.
> getting as much as possible out of a system for cheap is great fun
In certain circumstances, absolutely, however it's extremely aggravating to be in the position of being constantly pestered to ship features faster without the authority to overprovision some of the infrastructure the software runs on.
Waking up in the middle of the night because we saved money by allocating too little disk for the primary database or because the latest release included new dependencies that increased memory usage and the OOMKiller is picking off web servers like a wolf in the lamb's pen, or we're just swapping our way to hell while web requests 502...eh. Not for me.
More visibility into costs, though, absolutely agreed. Engineers should know that when they turn on some new cool serverless gizmo and then forget about it, it's costing $ each month.
I didn't mean that engineers love having no control over their systems, I just see the labours of love that get posted here about getting nginx to throw out 1000 pages a second on an Atari 800, or getting LLMs designed for $2000 GPUs running on a phone.
The question should be: we currently cost $X a month and we need to halve it because [reason], what can we do to bring it down? Which might be reducing hardware, or maybe something else, might be both. Puzzles can be fun.
I get paid the same whether or not I spend time going around saving the company money. It’s not like they’re going to share the savings; the shareholders get their juice, management gets their juice, and I’m the clown who went out of my way for them. Who cares.
I do software not cost management
Oh, and with inflation that’s awesome. You know how many interviews I’ve had with engineers saying they were at a place for 10 years, but the company had a raise cap of 2%, then saying that they wouldn’t hire me because I couldn’t give assurances to be a like-minded clown making corporate rich.
This is fun, but these policies are based on the fact that your managers don't care about you and in fact prefer you remain as poor as they can legally make you. So any kind of reasoning along these lines ("well this makes sense") just is not real and won't ever become real.
I consistently tried to push for cost management at my last job, but the product manager just wanted to push new features he could show off to management above him. We let costs inflate to ridiculous levels despite my constant discussion around the topic. Software engineers ultimately had no say in the matter.
I was laid off this week in a mass layoff because the company doesn’t have enough money to pay all of us anymore. It’s disappointing to see, and I wonder how many other teams ignored these optimizations and how much unnecessary total cost it all summed to.
I had to implement a 2nd deploy for a QA environment, and my first question to the infrastructure team was “won’t this be costly? Is there a better way to handle this?” They shrugged off the cost and said they would optimize my deploy once I was done with the initial implementation. 6 months later their optimization was to undo all the work, not because it wasn’t a good implementation but because it revealed how much non-optimization had gone into the QA environment before I even touched it. A lot of cost is probably due to the “we just taped these two things together” strategy for lower environments.
In the organization that I work in, costs are transparent to everyone involved and most people are aware of the need to keep costs as low as possible.
One of the downsides with this approach is that engineers/developers are not very good business people and don't really understand the notion of "the cost of doing business". And from time to time we have issues with "but it costs $70 more per month", and spend $1000 to optimize those $70 :)
In the end, even with some of the wrinkles mentioned above, it helps and saves money when costs are transparent and readily available for anyone.
I think "engineer" isn't really the correct word to use for the artisans who build much of the tooling used by most companies.
An engineer either wears a striped hat and drives a train, or went to a credentialed school and passed a bunch of tests and is allowed to sign documents that state "this thing, if built this way, won't collapse and kill people."
It is expected that an engineer can predict with reasonable accuracy the expense and timeline of a project, and how to maintain the resulting thing, without resorting to voodoo like "scrum velocity." In large part that's because engineers stick to doing things that are well understood and predictable, and if there's risk they resolve the risk before undertaking the project. (Is there bedrock over here upon which to build a foundation? I don't know; let's find out first!). Sure, there are engineering disasters even today -- buildings that unexpectedly lean over and door/wall things that unexpectedly fly off the side of airplanes, but those are typically organizational / process problems not "engineering doesn't work" problems.
“engineers stick to doing things that are well understood and predictable”
I’m calling BS on this. If this were true, we’d still be a ground-bound species. Engineering has been and will always be about creating something electrical, mechanical, computerized, or all of the above, that solves a problem. Understood or not. Engineers are not oracles. They cannot predict whether a tower built in Italy will eventually begin to lean due to erosion. They cannot predict that a steel beam rated for 300T of force will break at 180T. They cannot predict a rogue developer removing a package from underneath their dependency tree.
You can give estimates all you want but you are still guessing.
If engineers were as you say they are, we would never have delays, we would never have traffic jams, we would never have crap software, we would never have flight.
“Engineering is the art of modelling materials we do not wholly understand, into shapes we cannot precisely analyse so as to withstand forces we cannot properly assess, in such a way that the public has no reason to suspect the extent of our ignorance.”
- Dr. AR Dykes
Engineers manage risk and cost. They certainly make mistakes, like those couple buildings that are famously leaning over in SF and NYC, or the Citicorp Center, where they got the wind shear loads wrong and had to hot-patch the building.
But looking at the malarkey that goes on in "software engineering" or whatever -- clearly not engineering, at least not where I've seen it.
Engineering: a process of repeatably solving an understood problem predictably.
Craft: a process of solving an understood problem.
Science: a process of solving a problem without an exactly understood outcome.
Art: a process of working.
These are all made-up definitions.
I'd expect a software engineer to give me a system that locally caches and verifies distribution artifacts and validates changes -- a craftsperson who gives me a tool chain that yeets goo from the internet and builds on that without validation is not, in fact, an engineer. They could be quite practiced at the art of building working systems, but they're not managing risk....
What makes software engineering special is the systems are more complex and are cheaper to test and break. You get a completely different engineering culture when you can roll back a bad change after seeing it fail during the canary push. That, and what's usually on the line is money, not life. I'd feel a lot better making a $1M mistake than making a mistake that killed someone.
> An engineer [] went to a credentialed school and passed a bunch of test and is allowed to sign documents that state "this thing, if built this way, won't collapse and kill people."
Ahhh - that old craptacular definition. You completely ignore mechanical engineers, chemical engineers, electrical & electronics engineers. Not all engineers make bridges.
Secondly, the implied cause and effect even within civil engineering is a fantasy. Signatures on documents by credentialed engineers doesn't prevent disasters as you noted: Bridges fall down, buildings burn. Read the engineering reports on civil engineering disasters, and look at the consequences for the engineers involved.
You do some handwaving about organizational/process problems, but actually that is the key to safe engineering. Organisations deliver engineering projects and they do it across jurisdictional borders using insurance and liability and with a variety of other means that work: "signatures don't prevent disasters".
Lockheed Martin's skunk-works and SpaceX are real engineering. Any good definition of engineering needs to encompass an extremely wide variety of activities.
I would like to know the psychology behind why people wish to believe credentialed signatures are so powerful? Maybe a cross between two concepts #1: "that individual engineers run the world" and #2: "that retributive punishment of individuals works as a deterrent". I think concept #1 comes from the egotist idea of most engineer-types that we are the center of everything (I need a whole article to explain the concept). I think concept #2 is related to beliefs about the value of incarceration and also punishment beliefs derived from religion (especially in the USA where prisons are not fixing problems?).
Edit: issue #3: the idea that we should make rules about what words mean. It takes a certain worldview to think words should be defined rather than evolve (or worse that words should be part of a justice system)
> Ahhh - that old craptacular definition. You completely ignore mechanical engineers, chemical engineers, electrical & electronics engineers. Not all engineers make bridges.
I suppose you've got an engineering degree in pedantic engineering? Engineers manage cost and risk. The skunk-works stuff is marginally "science" not "engineering" given the relatively large budgets and relative lack of "we know this works." Cern is similarly an enormous engineering enterprise in that it's a huge stack of "we know this works" in service of "we're not sure what this will do"
A discussion of how "software engineers" deliver projects with neither cost nor risk as part of the process implies, to me, that they're not engineers.
You are the one trying to push your definition of engineering.
I provided counter-examples that show engineering encompasses a lot more than your definition.
I simply don't understand why anyone thinks writing software is somehow uniquely not "real" engineering. Somehow we are indoctrinated to believe that it isn't but all the evidence seems to show software engineering is a valid description.
I have no lack of experience watching the fuck-ups made by electronics engineers, or the fuck-ups made by mechanical engineers. You appear to want to define engineering only as certified civil engineering. And I've seen enough of their fuck-ups too, with signatures. In fact I'll ask my bridge engineer friend from uni about it! Unfortunately my bridge building grandad is dead so I can't ask him.
The vast majority of people I encounter with some "engineering" title, in software (or the related "Architect") are in fact not trained as engineers or architects, in any field.
A site reliability "engineer" or a software "engineer" is not an actual engineer just because they've got that in their title or job description. If I were to hire a "chemical engineer" position and instead hired a chemist, or a mechanic, or a rando who's cooked meth, I may end up with things working okay, or I might end up with a serious mess, even if those people I hire call themselves "engineers" (but in fact have no formal training as such).
I'm not sure to what degree credentials matter, but do credentials matter more than "not a god damned bit" ?
I'm not saying the title makes you "not an idiot" -- people gonna people -- but attention to "cost" and "risk" is (theoretically) one of the distinguishing characteristics of engineering training vs ... "mather" or "programmer" or "philosopher".
Yeah, the debasement of meaning is annoying - vice-president is one I hate. Another one that surprises me from the US is "licensed nail technician".
I have a bachelor of engineering title I can use with my name, but that is another distinct type of bullshit.
In New Zealand one relevant legal certification is CPEng which you can apply for after receiving your degree and working for a few years: https://www.engineeringnz.org/join-us/cpeng/ And apparently our government agreed in 2022 to introduce a new licensing regime for engineers doing safety-critical work.
But in an international world, how relevant are certified individuals? When I purchase a stove from a US brand and it catches fire, there needs to be other liability/retribution/corrective systems to deal with the problem. It matters little to me who signed off on the product in the US.
Can I import custom structural steel beams? How many New Zealanders have signed off on this steel construction: https://ccc.govt.nz/the-council/future-projects/major-facili... We need a new stadium because the last one broke. Unfortunately it wasn't insured due to some cockup at the city council (which I suspect had zero retribution on the people that cocked up - I wonder if they signed bits of paper?).
Over-credentialisation is a problem too - where is the right balance? The shift to everyone needing credentials is fucked. My friends (nurses, teachers) literally weep at the absolute trash they have to "learn" for their credential. I also vividly remember the crap I needed to disgorge to get my degree.
I don't know what the answer is, but I honestly believe most credentials are pointless waste and adding more credentials is not actually effective. Neither do I believe that the anarchy of libertarian free markets is a workable answer.
And that's how insignificant the costs of cloud providers are in the grand scheme of things. It's a lot of money to a bootstrapping startup, but for the vast majority of these cloud providers' customers, it's a rounding error that's easily forgotten for 3 years.
And that's precisely why you and your little bootstrapper or indie firm should not be using globocloud: you do not have mountains of cash to piss away. Bare metal is trending again. And in this downturn, it's no wonder why. Smaller companies are getting smarter and more efficient. They've decided to chase money instead of cargo cults.
Globocorps burning cash on globocloud is not a signal for small fish to do the same - it's a signal to do the polar opposite. You're not going to become like them by copying what they're doing now. It will not work for you. Globocloud isn't successful because they shovel cash into AWS's shredder, they shovel cash into the shredder because they're successful.
You'd be surprised. I've seen AWS bills well into the 9 figures. It's just that fixing expensive designs is, in itself, quite expensive, and many of those very large corps have hiring practices that don't allow them to compete for the top of the market. Sometimes there are tens, if not hundreds, of millions in savings a year, but corporate sclerosis makes it very difficult for broad cost-saving initiatives to be identified and approved.
It's the same issue in any large organization: Large levels of success somewhere allow for large levels of waste somewhere else, but often the waste is not required for the success to exist: The success just makes the organization complacent.
Well, it’s definitely fathomable. Does my employer have cost control baked into their processes, tooling, and culture, or are they rushing me to get projects out the door leaving barely enough time to make sure they’re production-ready?
Most places I’ve worked had no formal production readiness review before launching infrastructure.
> It is unfortunate that cost management isn’t something most engineers keep an eye out for on a regular basis.
That’s because they were explicitly told not to worry about costs for the last 10 years, so the majority of ICs at this point have never had to do it in their entire careers.
What I've observed is that people don't really keep track of what they are spending. I like to set up weekly newsletters that show costs and whether there has been a decrease or increase. In bigger corps, you also should have team-based tagging of resources so that specific teams see exactly what they are spending. At the very least, managers will look each week and be like "why did costs increase this week? What's going on?" even if the engineers don't care. "What gets measured gets managed," as they say.
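Something like this is enough to drive that kind of weekly report - a rough boto3 sketch, assuming a hypothetical "team" cost-allocation tag and read access to Cost Explorer:

    import boto3
    from datetime import date, timedelta

    # Hypothetical setup: resources carry a "team" cost-allocation tag.
    ce = boto3.client("ce")
    end = date.today()
    start = end - timedelta(days=7)

    resp = ce.get_cost_and_usage(
        TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
        Granularity="DAILY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "TAG", "Key": "team"}],
    )

    # Sum the week's spend per team so the newsletter can show week-over-week deltas.
    totals = {}
    for day in resp["ResultsByTime"]:
        for group in day["Groups"]:
            team = group["Keys"][0]  # comes back as "team$<value>"
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[team] = totals.get(team, 0.0) + amount

    for team, amount in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{team}: ${amount:,.2f}")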
I tend to have the opposite problem. I obsess over the cost of things, and am pretty bashful about bringing it up to my manager, and he's always surprised that scaling some resources doesn't cost more. But I learned the hard way as a contractor about letting these resources run crazy and had to pay out of pocket, so I have PTSD about it, which is why I'm vigilant.
This is entirely artificial: I now work at a company where we know very clearly what our infrastructure costs. Yes, we know the exact costs (what was negotiated, not what is on the public pages).
And we celebrate cost slashing as much as feature delivery and other stuff.
But this is entirely a management problem: at my previous job, only one manager (skip-level manager from my point of view) knew what exactly were we paying for infrastructure.
That moron wouldn't share that information with us engineers managing the infrastructure, of course, so there were a lot of infrastructure choices that didn't really make sense according to the public prices but (I guess?) made sense according to a price sheet we didn't know.
So we didn't know what we were spending, didn't have the basic data to estimate the price of a new solution or a new service, and didn't have the data to determine how much we would be saving by making changes (optimizing stuff etc).
I fought that battle for a bit but then I just said "GFYS, I'm not going to have fights with you so that you can save money" and let go. Later I left the company completely.
Former colleagues tell me it's even worse now: there are consultants from the cloud provider involved, they know the pricing deals, and whenever the topic comes up the manager shushes the consultant so that the engineers don't hear the prices.
tl;dr: it's an entirely artificial problem, and it's most likely a cultural/management problem.
edit: and I'm not even talking about incentives, as somebody else has correctly pointed out.
I recently helped save $150k per year by deleting node_modules.
I noticed that one of our S3 buckets had high data transfer costs, a bucket that our app downloads HTML+JS assets from when we push out a new release. I downloaded the "directory" of files for our latest release and saw it was mostly node_modules. I checked the code and confirmed that, yes, if this file exists in the bucket then it'll be downloaded by the user. I wrote a quick Python script to list out each directory that had this problem, and a quick Slack message to the appropriate team later, we discovered the specific commit that was the cause, a change to our CI that inadvertently uploaded that directory when we wanted to ignore it.
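The script itself was nothing fancy - roughly this shape, with the bucket name and prefix layout obviously being placeholders here:

    import boto3

    BUCKET = "example-release-assets"  # placeholder name

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")

    offenders = {}  # release prefix -> bytes of node_modules accidentally uploaded
    for page in paginator.paginate(Bucket=BUCKET):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if "/node_modules/" in key:
                release = key.split("/node_modules/", 1)[0]
                offenders[release] = offenders.get(release, 0) + obj["Size"]

    for release, size in sorted(offenders.items(), key=lambda kv: -kv[1]):
        print(f"{release}: {size / 1e9:.2f} GB of node_modules")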
A few months later, I checked the billing metrics: the effect was an average reduction of $12,500 per month for this bucket, or around $150k per year, or 4% of our bill. Not bad for one hour of work. Over the course of a quarter I reduced our bill by over $1m, or around 30% of our bill.
I might write a blog post explaining how to go about something like that. A lot of people are not familiar with tools like Trusted Advisor which can easily tell you if you have, for example, unused EC2 instances that can be terminated.
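If you don't have the support plan that Trusted Advisor's cost checks require, a rough do-it-yourself version of its low-utilization EC2 check can look like this (the 2% threshold and 14-day window are arbitrary assumptions, and low CPU is only a hint, not proof, that an instance is unused):

    import boto3
    from datetime import datetime, timedelta, timezone

    ec2 = boto3.client("ec2")
    cw = boto3.client("cloudwatch")

    end = datetime.now(timezone.utc)
    start = end - timedelta(days=14)

    running = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )["Reservations"]

    for res in running:
        for inst in res["Instances"]:
            iid = inst["InstanceId"]
            datapoints = cw.get_metric_statistics(
                Namespace="AWS/EC2",
                MetricName="CPUUtilization",
                Dimensions=[{"Name": "InstanceId", "Value": iid}],
                StartTime=start,
                EndTime=end,
                Period=86400,  # one datapoint per day
                Statistics=["Average"],
            )["Datapoints"]
            if datapoints and max(d["Average"] for d in datapoints) < 2.0:
                print(f"{iid} ({inst['InstanceType']}) averaged <2% CPU for 14 days")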
Not sure yet, but probably nothing. I completely understand the expectations written in this thread to receive something in return, but I've given this thought and I'm not sure how to do this in a fair way in this situation. First, I was given dedicated time that quarter to work on cost savings and other people weren't, if I received a bonus is that fair to other people who didn't have the same opportunity? Not to mention the possibility of people abusing this process.
I would be happy to receive some extra cash, don't get me wrong, but I work for non-monetary benefits as well, and I have received some of those as part of this work. If I worked at a company with a different culture and I was being punished for doing the work, I would demand some bonus.
Just curious, do you feel the same way when someone chooses a job that pays less but has a better work life balance, or a shorter commute, or more opportunities for growth, or similar?
I was thinking about this recently. I work at a large company with untold millions in AWS spend. I'm 100% confident that I could shave a few thousand (maybe even tens or hundreds of thousands) off the bill with a little bit of effort on my side. If I go up the management chain and ask if (1) I can make this an official project and put it on the roadmap or (2) I can do this on my own time and keep some % of the savings for myself as a reward, the answer would be a very clear "no" to both. So overall, as an end developer I really have no incentive to work harder and ensure lower operating costs for my company, and I'm sure most developers in the industry are in the same position as me.
If your company has a spending commitment with AWS in order to get a few percent savings, and it's just barely hitting the contracted amount, it may not be worth the effort to pursue any cost savings. Suppose your company has committed to hit 5M in spend, and they're just barely inching over the line at 5.01M. You might spend a bunch of time and labor expense knocking it down to 4M of usage and not really move the bottom line at all.
> (2) I can do this on my own time and keep some % of the savings for myself as a reward
Theoretically, your company could reduce their commitment to 4M next year, but the AWS sales would start negotiating hard against it, like "you will not get the same discount with less commitment".
Not only will you not get the discount you won't get the commit at all, AWS only reups commits with incremental growth. You can spend less; you just don't get commit pricing.
Ironically, I was asked by a manager some time ago if we can imagine using some (more) resources from AWS to reach the next spending commitment. If you're just below the threshold, it's probably inconvenient.
EDIT: To give some context: you can only do this, of course, when you know that you're not really wasting resources, otherwise you end up with just burning money to save little to no money :)
Can you explain how this is perverse incentives? Wouldn’t this actually be a good way to align incentives of the employee and the company (it makes both of them want to save money)? Or do you mean it makes the employee increase cost unnecessarily from time to time so he can reduce it later?
Spending commitment on AWS? Are you referring to ec2 reserved instances?
Even if you have one-year reservations on your instances, starting service migrations/deprecations now would pay off quickly enough. On average, your commitment has about 6 months left to run.
When you're a large enterprise customer you get private pricing in exchange for committing to spending a certain amount over a few years. This is unrelated to reserved instances and such.
Part of it is that it creates an incentive to create wasteful systems, only to "optimize" them later to rack up a bonus. Even if it gets changed to only pay out for reducing spend incurred by other engineers, it's possible to collude in such a way to extract bonuses from the company.
A better way to have aligned incentives for the company and the employees would be to allocate a bonus pool for the entire company, from which AWS expenses are taken out of, but that might be a bit unorthodox.
> allocate a bonus pool for the entire company, from which AWS expenses are taken out of
Also a perverse incentive.
If we use ec2.small, the customer's query will take 3x longer but be half the price. Let's turn off the nightly security audits. We can live with quarterly backups, right? What do we need all these logs for, anyway? We could hack something that works together in 2 weeks, but if we spend 3 months, it could be really efficient, let's do that...
This is one of many insights that hint at why biz-facing cloud architecture is so popular, wasteful, and profitable.
The incentives are designed to form an enormous cash siphon. From aggressively marketing toward fearful & liable (or maybe just tech-cost-illiterate) upper-management to the silencing effect that the low-rung experts experience when sounding the alarm.
Companies make and spend so much money that this doesn't matter. Thousands, tens of thousands, even hundreds of thousands are pointless in comparison to the potential of building features. A developer costs about half a million a year (salary, bonus, RSUs, taxes, benefits).
If they are paying you to save tens to maybe hundreds of thousands, the company is losing money on you, so they won't do this.
If you're at a public company, look at your company's quarterly reports and see what it would take to make any kind of impact on net income.
Maybe there should be a sort of "anti-saas-sales" role: you get commission on whatever costs you're able to justify as superfluous. After all, the person at AWS makes commission selling you the stuff.
Unfortunately it seems like businesses only wake up and ask for cost reduction on the infrastructure spend side once the problem is out of control. At that point, the level of operations effort to get it under control feels more like a "big rewrite" than a collection of small tweaks.
Nah, that'd imply they think about this objectively. Most companies simply don't.
Most time this "opportunity cost" is then spent on useless hacky features that are never used and forgotten right after release (redesign anyone?).
Of course there are exceptions to all of these, but IME the majority of companies are either focusing on pennies or ignoring it completely. Not sure why you don't see balanced approaches more often. Maybe this will change with less VC money flying around.
That's why in our company we have 2 types of engineering effort: core project impact and improvement. Not all time is spent on impact. A balancing act is needed.
I've mentioned this before, in my company (big media company) I saw some S3 costs creeping up each month. I looked into it and it was a system we abandoned that was still copying files to this bucket.
I reached out to the team and they turned it off, it saved us $1m a year. The higher-ups rewarded me by telling me that a team should have caught this so I should meet with them now.
It's truly fascinating how companies won't bat an eye at spending ungodly amounts of money on things they don't need, but will sweat profusely at the thought of a tiny fraction of that going towards additional compensation.
Hopefully your company doesn't have its head up its ass and has basic things like logs setup. The vast majority of what I do in production is logged through AWS (Cloudtrail) or git.
I work for a company called CloudFix, and we are solely focused on AWS costs. We do automated AWS cost optimization. We find one of two reactions when we deliver savings to customers:
(A) "Hey wow, this is great! We are so excited to be saving from here on out." OR,
(B) "This should have been caught earlier. $TEAM was supposed to be experts..." and then blame game starts.
It is really unfortunate when institutions react in the latter way. Often the engineers are assigned to cost optimization, along with a million other things. And, the incentives aren't really aligned well to reward savings. For example, S3 Intelligent Tiering is the right thing in 99.9% of cases - so it should be your default bucket type. BUT, engineers often face only downside risk for the change, and very little upside reward. And, it isn't their money so they just leave it. The cost of overprovisioned S3 can be staggering!
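For illustration, the lifecycle-rule way of defaulting a bucket into Intelligent-Tiering looks roughly like this (the bucket name is a placeholder, this replaces any existing lifecycle rules on the bucket, and whether an immediate transition suits your access pattern is something to confirm for your own data):

    import boto3

    s3 = boto3.client("s3")

    # Transition all current and future objects into S3 Intelligent-Tiering so
    # rarely-accessed data moves to cheaper tiers on its own.
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-assets-bucket",  # placeholder
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "default-to-intelligent-tiering",
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},  # apply to every object
                    "Transitions": [
                        {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}
                    ],
                }
            ]
        },
    )

New uploads can also simply set the storage class directly at upload time instead.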
What is really needed is to establish a proper FinOps discipline, put someone in charge of cost savings, and make sure incentives are aligned properly. And of course check out CloudFix if you can!
We work with a competitor called Vega, the product seems OK although the UI is very slow and confusing.
The biggest problem they have is that they have no business insight into what these costs are, or whether we can reduce the cost without any engineering effort or loss of performance.
One small issue I have as a developer who can spin up just about anything on AWS is this:
I have zero insight into the costs.
Yes, my company could turn that on for me but it's rare that they do so it's nearly impossible to know if I did something that costs a lot of money (relatively or in general) without access to the cost explorer/billing dashboard.
And before "well can look up what a t2.2xlarge costs and calculate it", sure. In a very contrived example I might be able to see what it costs but so many things are hidden/hard to see in AWS. For example, I recently spun up an RDS customer on my own AWS account. After testing for a while I decided it wasn't what I wanted and I deleted the cluster. Fast forward a month and my bill is well over what I expected (Like $30, no it's not a ton of money but it's my personal account and I wasn't expecting that charge). Come to find out it created a VPC as part of the RDS cluster (I think maybe it was for the RDS proxy? Still not sure) that didn't get deleted. I had to go chase that down and even that process wasn't easy. I had to make sure that it wasn't be used by anything else and then delete other things that were created when I made the RDS cluster before I could remove the VPC.
I was only able to do the above because I had access to the billing info. I would have left that VPC indefinitely on my work's AWS account by accident and been none the wiser.
I'm more than happy to take costs into account but without access to what things are actually costing us I can't help that much. Mostly because I need to know the costs to know what's worth optimizing. Sure I know I could improve X feature but if that costs us pennies a day (or month sometimes) then it's not worth it. Similarly if I know feature/infra Y is costing $XX,000/mo then I know I should rethink or investigate if that's correct/worth it.
> In the past, I had a case like this: a dev accidentally enabled a backup policy for a test database with no retention. FinOps thought the DB backup was important and ignored it. The dev had no access to billing and had no idea what was creeping up the bill.
Exactly, sometimes it's not clear at all what something will cost (and/or if the costs will go up). I'm happy to glance at the monthly costs here and there and if I see a jump I can dive in and see where it's coming from. We all make silly mistakes, like leaving logs on infinite retention in CloudWatch, and that's something I can easily fix/address but only if I have the info.
I've asked, off-hand, a couple times for billing access but nothing has come of it. I don't want to seem pushy but also it feels like data I need to perform my job to the best of my ability (especially at a small company). I don't think it comes from a place of "We don't want to give Josh access" or secrecy as much as it not being a priority but I need to bring it up again.
I’m very aware of that tool but it’s far from perfect. I’ve spec’d things out on that then seen very different prices when I actually create things in AWS. In part because the tool doesn’t take some things into account or because sometimes it’s impossible to guess your usage for a new feature.
I don’t believe the VPC was factored in when I used that calculator, even after selecting RDS Proxy.
I’m convinced that once a company reaches ~$10m/year in AWS spend it becomes entirely reasonable to hire an in-house engineer whose sole job is to find cost savings opportunities. Literally a “find unused stuff and turn it off” engineer.
I've spent some time doing this. There are always old systems people don't really understand, ownership is poorly defined, and no one knows what happens if you turn it off. It's archeology. Understand what the system is doing and how it interacts with other systems and the business. If it looks unneeded, back it up, stop the VM, wait and watch for fallout, and eventually terminate it.
There’s definitely a science to it. To complicate matters, the way you explore those connections, take backups, identify owners, and perform restores is different across pretty much every cloud service.
Readers may find Steampipe's [1] AWS Thrifty Mod [2] useful. It will automatically scan multiple accounts and regions for 50 cost saving opportunities - many of which are looking for over-provisioned or unused resources. For example, it's crazy how much you can save by doing things like just converting your EBS volumes to the newer gp3 type. Combine with Flowpipe [3] to automate checks and actions. It's all open source and extensible.
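As a standalone illustration of the gp3 point, a rough boto3 sketch of the in-place conversion (you'd probably want to review the list before letting anything actually modify volumes):

    import boto3

    ec2 = boto3.client("ec2")

    # gp3 is typically ~20% cheaper per GB than gp2 and includes a 3000 IOPS
    # baseline, so most gp2 volumes can simply be converted in place.
    paginator = ec2.get_paginator("describe_volumes")
    for page in paginator.paginate(
        Filters=[{"Name": "volume-type", "Values": ["gp2"]}]
    ):
        for vol in page["Volumes"]:
            print(f"Converting {vol['VolumeId']} ({vol['Size']} GiB) gp2 -> gp3")
            ec2.modify_volume(VolumeId=vol["VolumeId"], VolumeType="gp3")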
It is interesting to note that the author works at VPBank, which is one of the larger Vietnamese banks. Saving $150k per year on an AWS bill is really nothing to them.
The fact that they even outsource their compute to AWS is kind of surprising when they could just fill up their existing data centers (like VNTT https://vntt.com.vn/) with equipment, and save a whole lot more money.
And it's also interesting that they can outsource their compute to AWS because AWS's nearest data centers are in Hong Kong & Singapore. I didn't realize a bank would allow that.
I thought it, but I wasn't going to say it. Vietnam's internet connection is notoriously unstable. The running joke is that sharks attack the fiber connections [0], because pointing fingers is a national pastime. The fact that a major bank is relying on an external AWS like that makes it even more comical.
My guess is that nobody in corporate approved this guy's post, and if word got back, it would disappear quickly.
Reminds me to forward this to my buddies who run Timo, which VPBank used to own, but then dropped [1]. Timo was the first forward thinking bank in Vietnam with a great tech platform, likely because it was started and run by foreigners... ¯\_(ツ)_/¯.
> The best optimization is simply shutting things off
This is the way.
A similar idea has been bouncing around in my mind for a while now. An ideal, turnkey system would do the following:
- Execute via Lambda (serverless).
- Support automated startup and shutdown of various AWS resources on a schedule influenced by specially formatted tags.
- Enable resources to be brought back up out of schedule when demand dictates.
- Operate as a TCP/HTTP proxy that can delay clients so that a given service can be started when it is dormant or, even better, the service isn't serverless but you want it to be. This can't work for everything, but perhaps enough things such that the need to run always on services is reduced.
Cloud Custodian [1] can purportedly do some of this, but I've been reluctant to learn yet another YAML-based DSL to use it.
So this is my "make things designed to be always-on serverless instead" project and the work AWS has done to make Java apps function on Lambda keeps me thinking about the potential to take things that 1) have a relatively long startup time and 2) are designed to be long running service loops, and find a way to force them into the serverless execution model.
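A bare-bones sketch of the first two bullets - assuming a hypothetical "autostop:schedule" tag like "08-18" (keep running 08:00-18:00 UTC) and an EventBridge rule invoking the function hourly; the proxy piece is the genuinely hard part and isn't shown:

    import boto3
    from datetime import datetime, timezone

    ec2 = boto3.client("ec2")

    def handler(event, context):
        # Stop or start EC2 instances based on their hypothetical autostop:schedule tag.
        hour = datetime.now(timezone.utc).hour
        paginator = ec2.get_paginator("describe_instances")
        for page in paginator.paginate(
            Filters=[{"Name": "tag-key", "Values": ["autostop:schedule"]}]
        ):
            for res in page["Reservations"]:
                for inst in res["Instances"]:
                    tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
                    start_h, end_h = (int(x) for x in tags["autostop:schedule"].split("-"))
                    should_run = start_h <= hour < end_h
                    state = inst["State"]["Name"]
                    if should_run and state == "stopped":
                        ec2.start_instances(InstanceIds=[inst["InstanceId"]])
                    elif not should_run and state == "running":
                        ec2.stop_instances(InstanceIds=[inst["InstanceId"]])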
> Operate as a TCP/HTTP proxy that can delay clients so that a given service can be started when it is dormant or, even better, the service isn't serverless but you want it to be. This can't work for everything, but perhaps enough things such that the need to run always on services is reduced.
My team mostly builds internal stuff and we save tons of $$$ by using Knative + Karpenter, which basically does that on container + EC2 levels.
Everything I've built in AWS is strictly serverless. You can do an incredible amount with a clever DynamoDB pay-per-request setup, S3 and CloudFront. I haven't once felt the need to reach out to EC2 or RDS and I can't imagine building any sort of control plane to spool them up and down for me.
This is especially easy if you can shutdown environments that are only used for dev/staging tasks. With 168 hours in a week - how many hours do those things need to be running?
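Rough arithmetic on that, assuming a dev environment only needs to be up 12 hours a day on weekdays:

    # 12 h/day x 5 weekdays = 60 running hours out of 168 in a week.
    hours_needed = 12 * 5
    always_on = 24 * 7
    print(f"{1 - hours_needed / always_on:.0%} of the on-demand compute avoided")
    # -> roughly 64% saved, before you even touch instance sizes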
I run a little tool for Heroku to make it easy to do this kinda thing.
This assumes they have something like RIs etc. for those resources. Those are typically used for production, but far too often dev/test resources are just turned on ad hoc.
This is why you should start the conversation with "I have drawn up a plan to save the company $14M per year. I will execute this plan in exchange for $7M upon completion."
If they say no then just go back to your regular duties.
Very few companies would make this deal at 0.7M, 0.07M, or even 0.007M. Directly sharing a % of savings with the responsible employee is simply not the way most companies work. Consultants, now, that's different...
But was that your job? Because if so, you really got salary+bonus. And if you'd found nothing you'd still have got salary. So you can look at it several ways.
I worked adjacent some telecom consultants in the 90s whose income was solely driven by a percentage of cost savings they could trim from telephone bills. Seemed like a very brash business model but they clearly knew there was gold to be mined.
I keep thinking I should be doing "cloud optimization" work and being compensated this way. Slicing and dicing output from usage/billing APIs and providing an "optimized spend" probably has the potential for a lot of low hanging fruit.
As someone who's been doing this for the better part of a decade: it's a mirage. No client is going to sign for a "percentage of savings" model when it comes to cloud. Believe you me, I've looked at this up, down, and sideways; neither the math nor the psychology work out the way you'd hope.
I saved my previous company $4000-5000/mo on AWS billing just by auditing the AWS account, turning off unused machines that old devs had spun up, and deleting hard drives after backing them up to S3, just in case. No one had really even bothered looking at it for years, and I did it in my "free time" at work without being tasked with it.
Ironically, I asked for a raise a year later and was denied, despite single-handedly saving the company nearly $50,000/yr. The raise I asked for wasn't close to the cost savings I had brought. I left the company shortly after.
I saw someone else have a similar experience here and a comment to it was saying rewarding this produces a bad incentive...well, honestly why would I have even bothered cutting costs if I felt I wouldn't be rewarded? Not rewarding it just makes me half regret doing it at all.
Looking forward to the kubernetes one - Most kubernetes clusters are designed for high availability, not necessarily for being able to quickly spin up/down and there’s often a lot of hidden complexity there (at least on aws).
Have you done this or attempted this yet? Every kubernetes cluster is different, but in my time working with them the last several years I anticipate the following issues:
- dependent services not coming up in the order you expect/want
- issues draining nodes due to crashlooping/erroring pods (can also be caused by dependent upstream services going down in wrong order)
- Persistent Volume retention/synchronization
- IAC not cooperating
- Configuration annoyances with deployments’ availability/replica settings
- Thundering herd types of problems
I can think of tons of things that can make this extraordinarily difficult. I’ve had many managers over the years pitch this idea of “rapidly deployable/destructable EKS clusters” and the projects always get killed due to the complexity around this. IMHO they simply aren’t really designed for this type of thing, however, I could be misunderstanding exactly what you’re trying to do.
Automatic reconciliation is like half the reason Kubernetes exists. A well-designed system should handle this and not have expectations about ordering, for example.
I’ve seen several clusters where one could kill more or less everything and it would just come back again.
> I’ve had many managers over the years pitch this idea of “rapidly deployable/destructable EKS clusters” and the projects always get killed due to the complexity around this.
This is exactly what we do: blue green eks cluster.
We just thought that if we do it on a monthly basis, DRP will be a piece of cake :)
Sorry for the rant, but this is usually wrong. The number of people that just keep their computer on is noticeable. And when I ask, it's usually "just to avoid having to wait" or "I've always done that".
I personally always hibernate my computer. When I turn it off it takes more time, but I'm already on the other side of the building so I don't care. When I turn it on it takes basically the same amount of time, and it is exactly as I left it. People keep the computer on just for convenience... and I don't think it's a good thing.
I keep mine on because it's my jumphost for working remotely. But I agree many people don't need to do this. My company, though, wants people to leave their PCs on so they can get automatic updates and be centrally managed.
I always leave my PC on, but for different reasons:
- I have a plex server running on it
- I can remote into it from my phone, this comes in handy a lot of the time.
- I can remote into it when traveling through my Fire stick using parsec, which means I don't have to carry a laptop with me everywhere I go ( I also setup my phone so I can use it as keyboard/mouse when I do this).
Regarding energy costs, it's negligible for the benefits it gives me
I always shut down my machine at night and sometimes restart when I leave for lunch. I've noticed that running Docker and other apps for a while makes my machine slower. I'm convinced there's a memory leak somewhere and restarting fixes those issues.
Restart the computer on a daily basis? Like it’s 1998? You need a different system man. Ubuntu can go strong for a month easily, with only sleep for leaving it unused. Only reason for restarts is security updates or that your battery ran totally dry.
Sure, after you reengineer your application. Even then, "serverless" apps often use persistent resources like databases, and your developers will likely spin up those resources for the same reasons as indicated in the article.
Cost savings can be incredible if you use the FaaS product in the most aggressive way possible. For us, this means using functions as a simple translation layer between SSR web forms served directly as text/html and whatever SQL provider (ideally also on a consumption-based tier).
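In practice the handler can be as small as this sketch - assuming a Lambda function URL / API Gateway proxy event, and with run_sql as a stand-in for whichever consumption-based SQL provider sits behind it:

    import base64
    from urllib.parse import parse_qs

    FORM = """<!doctype html>
    <form method="post">
      <input name="email" placeholder="email">
      <button>Subscribe</button>
    </form>"""

    def run_sql(statement, params):
        # Stand-in: call your serverless SQL provider here.
        raise NotImplementedError

    def handler(event, context):
        method = event["requestContext"]["http"]["method"]
        if method == "POST":
            body = event.get("body") or ""
            if event.get("isBase64Encoded"):
                body = base64.b64decode(body).decode()
            email = parse_qs(body).get("email", [""])[0]
            run_sql("INSERT INTO subscribers (email) VALUES (:email)", {"email": email})
            page = "<p>Thanks!</p>"
        else:
            page = FORM
        return {"statusCode": 200, "headers": {"Content-Type": "text/html"}, "body": page}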
90% sounds just about right. We are seeing figures going from $120/m for a VM-based QA environment to $10/m for a consumption-based / serverless stack.
depends on what your utilization looks like, serverless is usually +/- an order of magnitude more expensive. Ideally your workloads are stateless and containerized so you can shuffle them between serverless, container orchestration that you own and dedicated VMs.
You should always calculate if you're actually going to see cost savings. Counterintuitively, running for fewer hours can increase your bill if it causes you to switch to on-demand pricing [1]. There's a break even point you need to get past.
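A toy break-even check with made-up prices (the [1] link presumably has real numbers); the idea is comparing an always-on committed rate against the higher on-demand rate you pay only while running:

    # Illustrative prices only - check your own instance type and region.
    reserved_hourly = 0.06   # effective hourly rate of a 1-year commitment, billed 24/7
    on_demand_hourly = 0.10  # rate paid only while the instance is running

    always_on_monthly = reserved_hourly * 730
    break_even_hours = always_on_monthly / on_demand_hourly
    print(f"Below ~{break_even_hours:.0f} running hours/month, on-demand wins")
    # With these numbers: 43.8/month committed, break-even at ~438 hours (~60%
    # utilization). Run more than that and the commitment is cheaper even though
    # you "waste" idle hours.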
--> "The best optimization is simply not spinning things up!"
At least for local development and testing, as made possible by LocalStack (https://localstack.cloud), among other local testing solutions and emulators.
We've seen so many teams fall into the trap of "someone forgot to shut down dev resource X for a week and now we've racked up a $$$ bill on AWS".
What is everyone's strategy to avoid this kind of situation? Tools like `aws-nuke` (https://github.com/rebuy-de/aws-nuke) are awesome (!) to clean up unused resources, but frankly they should not be necessary in the first place...
In one project we had a testing setup which cost 600k USD annually. It was three times more expensive than the production setup we had for a product that was more than 3 years old.
Nothing special, just Mongo and Kafka of enormous size. If you run automation tests many times per day but do not clean anything up, you'll get a Mongo with terabytes of test data. And then, on top, there was Elasticsearch, which multiplied the bills.
I've for a long time set my cloud VMs to shutdown on idle. I usually use it to also justify running a much larger VM to cut down on build and test times.
Just set a cron to run the shutdown command with a grace period. And then if you're working late, you just cancel the shutdown and the shutdown will be retried in a couple of hours. And have a script or command to just run the cloud API calls to boot the VM in the morning / when needed, and the environment boots in a minute or two.
For other stuff I've been tempted to do a more complicated setup, with something like a micro-vm as a proxy, that will do the shutdown / activation on TCP connection, but haven't gotten around to it.
I’ve mentioned this before, but probably one of the most egregious costs on AWS are NAT gateways and NAT bandwidth pricing. Typically I deploy one NAT gateway per AZ so looking at $99 a month just for three NAT gateway instances with zero traffic.
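For reference, the hourly charge alone roughly accounts for that figure, assuming the common $0.045/hour per-gateway rate (data processing is billed on top):

    hourly_rate = 0.045  # $ per NAT gateway per hour (typical us-east-1 rate)
    gateways = 3         # one per AZ
    print(f"${hourly_rate * 730 * gateways:.2f}/month before any traffic")  # ~$98.55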
Disclosure: I'm CEO of https://www.vantage.sh/ -- a cloud cost observability platform. I previously worked at both AWS and DigitalOcean.
For people looking for how to save money on AWS - I'd [selfishly] recommend connecting up to Vantage. We profile AWS for all sorts of savings and give you the information on how much we can save prior to you paying us. It can be a good gut-check if nothing else on how well optimized you are.
Unfortunately we don't have an option for that route -- but we're happy to help support any paperwork for getting things up and running if you contact me or my team: ben [at] vantage [dot] sh.
If you're running a small setup and don't need any value add products or multi-AZ/multi-region this might work, but Hetzner and major cloud providers are by no means comparable.
Hetzner offers a 99.9% uptime guarantee only on their network. AWS has SLAs for every product offering - EC2 for example starts paying out credits if they fall below 99.99% uptime.
If you're a user of various managed cloud products, these will cost quite a bit to replicate on Hetzner and you'll be spending money on personnel to build these out and maintain them instead of just paying for the cloud product on AWS/GCP/Azure.
Was doing some research this weekend on cloud exit. Hetzner is attractive, but our company is pretty much limited to the US (no international companies due to our current business model). How practical would it be?
Also, I've seen a lot of concern over blocked IPs, especially for lower-cost hosts. Is that an issue with Hetzner?
I imagine one reason I didn't hear about them before is that they don't seem to offer self service. I absolutely don't want to talk to anyone when I am buying object storage, cloud vms, or even dedicated.
When I joined my current company I quickly found our internal Azure environment was effectively unmanaged. Four weeks in I shaved nearly 55k in monthly spend just scripting out VM shutdowns and service pauses.
Cloud revenue in most large companies is at least 25%, maybe up to 40%, pure developer waste, because nobody upstairs knows the difference.
Reacting to events to install security defaults (or any kind of defaults) sounds really error prone. Are people running AWS where devs just click buttons in aws and spin up random stuff? I thought we all decided that was dumb and switched to gitops/terraform?
Does anyone have a good experience with tools / services that track and analyze cloud usage? We don't use any, but could benefit from better visibility in spending patterns.
Here's a startup idea: a profiler for infrastructure-as-code that shows how much each line of code is costing per month, instead of showing where the CPU spends most of its time.
I did this at a previous employer. I leveraged a Lambda function and tags applied to instances to determine when they should be on and when they should be off.
The problem is always around abuse. If it becomes known that you can get a big bonus by wasting a lot of money on useless infra first and then reducing it, other people will start playing the game.
How do you reward cloud cost awareness without creating perverse incentives?
It's always the same answer: managers who pay attention to the details. People familiar with your work should be able to tell if you're gaming the system or not.
Will they also be paying what they owe on the added costs that could have been noticed earlier with due diligence? I'm imagining that what they'll receive instead is the compensation expected and agreed upon by both parties negotiated during either initial hiring or the multiple points in the year that allow for easy communication about changing payroll expectations, instead of hawking for dimes at first sight.