Hacker News
How we got our AWS bill to around 2% of revenue (sankalpjonna.com)
298 points by grwthckrmstr 17 days ago | 232 comments

If you end up in the Lightsail world and you are not utilising Amazon's other services then it's probably cheaper to do this with another provider. Someone like Contabo or Hetzner will give you VMs with similar fixed hardware for a lot less per month. At these low scales, with a completely open source stack, Amazon isn't good value in my opinion. It is only as you grow, and it provides the scale, that it becomes valuable.

AWS initially was never meant for this type of workload.

It started with ephemeral instances, available for a fixed workload, billed by the hour, such as batch jobs.

It turned out people were willing to pay $100 a month for a box worth $30 a month.

Machines provisioned by API were not mainstream in 2006, the "cheap" way to host was shared hosting, and dedicated / cpanel boxes were more common than VMs. Actually, pretty sure "vservers" were a big thing at the time and kind of fit where cheap VMs do now.

AWS was a very different product and not really comparable to other things available at the time.

Also the elasticity, and the backing of a company that had seen the scale themselves and could possibly deliver it for you. I remember that, as a startup at the time, even with boxes costing more than 2x the money, we were somehow attracted to it and kept it as the option we would move to if we ever had revenue. Large companies probably spent more than their AWS bills on software license fees (Oracle/ERP/SAP etc.) and employee salaries, so they don't care.

Linode started in 2003. Just because there was no DigitalOcean shoveling money into the free credits pit of venture capitalism doesn't mean that affordable VPS services didn't exist. According to https://www.programmableweb.com/api/linode, Linode started offering an API before 2008.

DO had one of the best ads, or at least one of the best targeted / positioned ads.

"You've been developing like a beast, and your app is ready to go live"


Or DigitalOcean, which also offers many of the big-cloud perks like managed db, object storage, load balancers and kubernetes.

There's one critical thing that DigitalOcean, Linode, Vultr, etc. don't provide though: multiple data centers in the same region ("availability zones") with automated cross-zone failover and private networking between zones.

You were pre-conditioned to believe that it's a feature. Traditionally, the datacenter itself is supposed to provide HA for both the power and the network.

"Availability zones" exist for AWS convenience, not for yours. It allows for cheaper and simpler DC ops, removes the need for redundant generators, simplifies network design, and makes "under-cloud" maintenance easier. It's a feature for them, a headache for you, and a (brilliantly addressed) challenge for AWS product marketing.

I don’t know about you, but I’ve been in several DC power failures where the fault was in the transfer switch.

It sure is nice to have separate failure domains with low enough latency between them to pretty much ignore it in application architectures.

Colos will rarely lose power, but having your line cut due to a backhoe is pretty common. Even in top tier facilities I observed some loss of service every 6-12 months, add in some misconfiguration risk and Colo failure becomes a frighteningly common affair.

This can be mitigated through redundant service providers, careful checks on shared interconnects and other measures - but having "hard" failure isolation at the facility level will also get you there with less chance of someone doing something dumb.

This kind of thinking is how you end up in a newspaper article where you're in a building in New York babysitting a generator during a hurricane while everyone sane is serving from Atlanta now.

You're doing it wrong. Plan to lose sites. If you plan to never lose a building, you're just setting yourself up for pain by optimizing for the wrong kind of redundancy.

I disagree. AZs are completely independent data centers kilometres apart. For any businesses which may need low latency but still want full HA (e.g. finance systems), it's a blessing. This requirement cannot be fully covered with separate data center regions, something like an airplane crash would still hit everything.

BinaryLane in Australia have VPC across datacenters

Not gonna lie, I have been scared to deploy to DigitalOcean after that debacle [0] posted here last year. It's worth the premium to me not to worry about that at night. Yes, AWS could shut me down too, but the probability seems lower.

[0] "Digital Ocean Killed Our Company", Hacker News, May 2019. https://news.ycombinator.com/item?id=20064169

Yep, now that DO has managed DB instances it’s become my go to provider for most setups.

Droplets (VPS) have generally worked out cheaper than EC2 (with more resources) for me.

The bit that initially sold me on it a while ago was no longer having to worry about CPU credits on the small instances (T2), i.e. consistently maxing the CPU out at 100% and then getting throttled.
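To make the throttling concrete, here is a toy model of CPU credits as a bucket that drains under load. It is only a sketch: the starting balance of 30 credits and the earn rate of 6 credits per hour are assumptions picked to resemble a small T2 instance, not figures from this thread, and one credit is taken to mean one vCPU-minute at 100%.

```python
# Toy model of burstable-instance CPU credits (all parameters are assumptions).
CREDITS_SPENT_PER_HOUR_AT_FULL_LOAD = 60.0  # one credit = one vCPU-minute at 100%


def minutes_until_throttled(start_credits: float, earned_per_hour: float) -> float:
    """Minutes a single vCPU can run at 100% before the credit balance hits zero."""
    net_drain_per_hour = CREDITS_SPENT_PER_HOUR_AT_FULL_LOAD - earned_per_hour
    if net_drain_per_hour <= 0:
        return float("inf")  # credits are earned at least as fast as they are spent
    return start_credits / net_drain_per_hour * 60.0


def baseline_share(earned_per_hour: float) -> float:
    """Fraction of a vCPU that can be sustained once the balance is empty."""
    return earned_per_hour / CREDITS_SPENT_PER_HOUR_AT_FULL_LOAD


minutes = minutes_until_throttled(start_credits=30.0, earned_per_hour=6.0)
share = baseline_share(earned_per_hour=6.0)
print(f"throttled after ~{minutes:.0f} min, then held to ~{share:.0%} of a vCPU")
```

Under these assumed numbers a box that runs hot empties its bucket in roughly half an hour and is then held to a small fraction of a core, which is the behaviour a fixed-CPU droplet avoids.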

Also the web interface isn’t a mishmash of 1000 services with different UX for each section.

If DO isn’t cheap enough for people there’s also Vultr which works out even cheaper ($2.50 a month for 1cpu/512mb) if you want something similar (not baremetal).

If you are extremely cheap like me, only have to run cronjobs at certain times, or only need your instances to be running at predictable times, you can also use the [AWS Instance Scheduler](https://aws.amazon.com/solutions/implementations/instance-sc... ). At my last job we were running SAP on EC2 and we were able to lower our bill by 50% or so by only running the instance 9-5 Mon-Fri. Now I use the Instance Scheduler for running cronjobs every day on a T3a instance and it costs less than $1 USD per month. You can also configure your cronjob to stop the instance once it ends; that way the scheduler will only need to start it and you'll save the most (./myscript && shutdown -h now).
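For a sense of scale, the share of the week an instance is actually up on a 9-5, Mon-Fri schedule can be worked out directly. This is a back-of-the-envelope sketch; the function name is made up for illustration and it counts instance hours only.

```python
# Share of the week an instance runs on a 9-5, Mon-Fri schedule.
HOURS_PER_WEEK = 7 * 24


def running_fraction(hours_per_day: int, days_per_week: int) -> float:
    """Fraction of all hours in a week during which the instance is running."""
    return hours_per_day * days_per_week / HOURS_PER_WEEK


fraction = running_fraction(hours_per_day=8, days_per_week=5)
print(f"instance is up {fraction:.1%} of the week")
```

Instance hours drop by about three quarters, while the bill in the comment fell by roughly half. Presumably the gap is made up of things that keep billing while the instance is stopped, such as attached storage.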

Are bottom-of-the-barrel VMs really what AWS is aiming for, though? Or is it the auto scaling, variable, HA workloads...

Depending on the amplitude of your load cycle, it may be cheaper to just stay fully provisioned all the time on another provider.

> Vultr which works out even cheaper ($2.50 a month for 1cpu/512mb)

I can see that price on the pricing page but when I go to the Deploy new server page, I can't see that price; the minimum is $5.

New York and Atlanta DC only.

I wrote down pros/cons of various ~$5 and below VPS services in a Gist I have been maintaining for a couple of years: https://gist.github.com/frafra/4688b146ca6d55accb768c3557939...

Hope you can add Linode here. I've been a long time user and it's a pretty good & popular option imo.

Not if you need any CPU performance to speak of. Given the misleading "vCPUs" that we are sold, I took some time to benchmark several major cloud providers and the results were… worrying: https://jan.rychter.com/enblog/cloud-server-cpu-performance-...

BTW, I run my SaaS on real servers from Hetzner. I figured I won't need instant auto-scaling anyway, and if you provision with ansible it doesn't really matter if it's an EC2 instance or a real server. What does matter is the price and the performance: I get servers which are significantly faster than anything you can get on AWS, and for a much better price.

To be honest, I do not understand the drive towards AWS. It makes sense in micro-deployments (lightsail and/or lambda, when your usage is sporadic) and in large deployments that need dynamic scaling. But it does not make much sense anywhere in the middle.

Good article, thank you for it. I only heard good things about Hetzner from many people, incl. networking setup.

As an owner of an iMac Pro (the 10c/20t CPU variant) I have to tell you that you are slightly mistaken -- the Xeon-2150B is a pretty strong CPU even compared to the bursty high-end desktop i9 CPUs. Sure I mostly do parallel compilation and work with (and produce) highly parallel programs, and there it really really shines.

But even for the occasional huge JS compilation it still performs pretty well. (Turbo Boost puts it at about 4.3 GHz, which is the max I've seen.)

The iMac Pro's only real downside is that it has only 4 memory channels, so as lightning-fast as its NVMe SSD is, and as efficient as the CPU with its huge caches is, the machine could likely perform anywhere from 30% to 80% faster if it had 8 memory channels.

But in general, iMac Pro is a very, and I really mean very, solid developer / design / professional software user machine.

I mean, all of that doesn't matter much in a world where the Threadripper 3960x / 3970x / 3990x now exist but still. iMac Pro is still the best Mac you can buy (Mac Pro 2019 uses the very last possible high-end desktop Xeons and I don't think Intel will be making many more of these; AMD is definitely on their tails and I don't think Apple will produce that many Mac Pros).

That being said, I am looking forward to buying a Threadripper 3990x monster machine somewhere in the next 1-2 years with dual 4k or 5k displays. Hopefully the Linux community will finally get its act together and make proper sub-pixel font anti-aliasing by then...

> As an owner of an iMac Pro (the 10c/20t CPU variant) I have to tell you that you are slightly mistaken -- the Xeon-2150B is a pretty strong CPU even compared to the bursty high-end desktop i9 CPUs.

I do not think I am mistaken. I haven't yet encountered a Xeon that can beat the desktop i9 in single-core performance. Looking at geekbench scores (https://browser.geekbench.com/mac-benchmarks), my iMac is 10.5% faster in single-core than your iMac Pro, and your iMac Pro is about 15% faster in multi-core. For development work, I will take single-core performance any day.

Use cases and optimizing for them, I suppose.

I do mostly parallel work and for me the Xeon has proved itself as a better CPU compared to several desktop and laptop grade CPUs I also tried.

Words of sense are rare to be heard.

There have been a lot of comparisons and benchmarks and there will be even more.

After some point monthly bills grow to the scale where it makes perfect sense to invest money and time into your own infra. Yet people will still be throwing money into the hype oven.

I use Linode, but it was $10 or more, while I've just seen there's also a $5 VPS called "Nanode". I added it to the list and I will test it more in the future. Thank you!

Your list is missing quite a few major providers in the "low end" VPS space, such as BuyVM, Ramnode, Virmach, HostHatch, and probably others I'm forgetting.

Thanks, I read your comment on the gist too; I added them, but I have no direct experience. My gist was not intended to be complete, but a list of mostly EU, well-known VPS providers. A proper git repository with a nice markdown table would probably be better.

I personally use RamNode (a very small, 128M instance as a backup DNS in their Amsterdam DC) and can't really fault them. They've been rock solid so far.

Heard good stuff 'bout Ramnode.

Ah ok, I am using a $5 DO server as a file server in Singapore. I thought I might save a few more bucks by moving to Vultr, but it's OK, $5 is pretty cheap.

This will work much better as an online spreadsheet. Thank you for the effort though!

CloudSigma is another player in that space. (No affiliation)

You are right, I had never heard of them, thanks. I just added them to the list.

Do you host your entire stack on DO, or split compute and DB (and eat the latency hit)?

They also offer spamming services, phishing & Trojan hosting, and will block abuse reports for weeks at a time :)

Put a different way: they won't assume your service is abusive based on a few reports, and will allow you some time to address complaints yourself.

Is there a better way to handle such incidents (serious question, no sarcasm)? I feel like being too receptive to abuse reports would allow anyone to take down your service by submitting fraudulent abuse reports.

Hello, author here.

We definitely considered using other cloud providers like DO, Linode, etc. But it was important for us to go with AWS because we needed some of the other services that AWS provides, like S3, Route53, etc.

Some of our static websites are in fact hosted entirely using CloudFront + s3 combination which is something I forgot to mention in this post :)

Hi. Also I think you're future proofing your setup in a great way. There's no limit to what you can build. Also you get the fully fledged networking stack I assume, with VPCs, Security groups and so on.

Very good read, thanks for sharing. Have an upvote.

I haven’t used LightSail myself, so that’s the standard disclaimer. But while you do get a VPC, as long as you are in the Lightsail world, it’s invisible.

Once you graduate to full-fledged AWS, you can peer your Lightsail VPC to your full-fledged AWS VPC.

I'm not an expert on AWS, but can't you use all those services without AWS?

If you mean can you use S3 and Route53 etc. while hosting your servers somewhere other than AWS, absolutely - people do that all the time.

However in the Route53 case there are a few automatic integrations that only work with other AWS services. Not that big a deal though.

I'm not sure whether you're confusing AWS for some specific component of AWS, or whether you're just stating that competitors exist for those techs

But for the former: AWS is a whole suite of tools -- the specified technologies (cloudfront, S3, etc) are individual (and mostly decoupled) tools within that suite

One of my many AWS pet hates is that they have a service called Amazon WorkSpaces, i.e. AWS.

Sounds like they are saying that you can have an S3 equivalent and DNS services without being on AWS. Which is true and most of the time less expensive.

You can if your company can afford to pay enough to hire an entire devops team in a short time.

Why would you host a static website using s3? What's wrong with a traditional file system? You already have the load balancer.

Hosting a static website using S3 can be done dirt-cheap. We were hosting our website on S3 for a while now, with a monthly bill of 1,20€ with CloudFlare CDN in front of our bucket. Another big plus is that you do not have to worry about server administration, load balancers etc.

For a static website isn't something like Netlify easier (and cheaper, as in free)?

100GB bandwidth/month for free, it's quite small traffic.

Just curious, why CloudFlare instead of CloudFront?

I'd assume because CloudFlare doesn't charge anything for bandwidth; so their costs are exceptionally predictable

With block storage (a disk), you are going to need some compute capability to host it. Presumably with S3 you don't need that (I know you can do this with Azure Blob Storage, hosting static content directly; assume it's the same with AWS)

It is the same. S3 alone can host a static website (albeit only over http if you want a custom domain name; you need to add CloudFront [or equivalent] to get https hosting with a valid cert under a custom domain name).

My website is very small scale. I'm already using S3 to store some other things so hosting the static site on S3 is pretty straightforward. And it costs me literally pennies a month.

with caching (eg. cloudflare) in front of s3 hosting for static websites, you have almost zero cost hosting.

because it's super cheap and ridiculously easy

hardest part is working out how to go fully-public fully-static

Also, Lightsail is hot garbage. It has the same crazy aggressive throttles as a T2 instance. You'll have roughly 5% of the original CPU if you run hot for half an hour or so. It is not comparable at all to other cheap VPS options.

Just looked at contabo.com

Am I reading this right? 4 cores, 8 gigs of ram, 200 gigs SSD for 4.99 euro a month? That is around $5.60 USD a month. That feels too cheap to be true. What's the catch over ec2 or Digital Ocean?

Admittedly I've only seen a couple of sources but it looks like their tactics are as old as their website design. Good old fashioned overselling.

yeah, that's literally 4x the cost of my box at Hetzner

mind you, it's very much a pet and not cattle

This is interesting. Do you have (or know of) a published comparison somewhere?

What kind of comparison are you expecting?

Go to hetzner.de, create an account, fire up one of their 5€/month VPS and see if it runs your application well. If it doesn't, upgrade the VPS (or whatever) till the max or until you decide that AWS or whatnot might be better suited (or cheaper).

It's nothing personal against you, but those comparisons aren't worth anything. They are highly specific to the test case, which might very likely not be like the stuff you are trying to run.

And I've just so often seen people on StackOverflow ask how to fix their AWS setup to run a simple WordPress site, it's unbelievable.

There's this old saying: "If the only tool you've got is a hammer, everything looks like a nail."

That pretty much works for GC/AWS/Azure, too.

You can't learn everything by just trying the service though. Particularly on issues like downtime, support, and how they handle spikes. Also some providers have been known to artificially elevate resources for new users until they're locked in.

I think hetzner has servers in Germany and Finland only, not the right choice for most people here.

Hosting in Finland and Germany is a great choice for a lot of the right reasons.

Additionally, because it is hosted in Germany, you need to send them a copy of your ID and I found that their support was quite amateurish. So for me the process was: Go to hetzner.de, create an account, it gets instantly closed because you have a UK driving licence but were born elsewhere (perhaps too complicated for them?).

Never had that from AWS or OVH.

This is how I roll out my services. Always wondered if I need AWS, but always lacked the use case.

For Hetzner, just read the list prices for dedicated servers:

https://www.hetzner.com/dedicated-rootserver/ or maybe https://www.hetzner.com/sb

For example, a dedicated "AX51-NVMe" server for 65 euro / month = 73 usd / month gives you: AMD Ryzen 7 3700X Octa-Core (8 physical aka 16 logical cores), RAM: 64 GB DDR4 ECC, NVMe: 2x 1 TB NVMe SSD, unlimited traffic. That's less than half the cost of lightsail per logical core, or less than 1/4 of the cost of lightsail per GB of ECC ram.
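The per-core and per-GB claims can be checked with a few lines of arithmetic. The sketch below uses the 73 USD/month AX51 figure from this comment and, as the Lightsail side, the $20/month plan with 2 virtual cores and 4 GB RAM that the article quotes elsewhere in this thread; treating that plan as the baseline is an assumption, and the `Plan` class is just an illustrative helper.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    usd_per_month: float
    logical_cores: int
    ram_gb: int

    def usd_per_core(self) -> float:
        return self.usd_per_month / self.logical_cores

    def usd_per_gb_ram(self) -> float:
        return self.usd_per_month / self.ram_gb


hetzner = Plan("Hetzner AX51-NVMe", 73.0, logical_cores=16, ram_gb=64)
lightsail = Plan("Lightsail (2 vCPU, 4 GB)", 20.0, logical_cores=2, ram_gb=4)

# Ratios below 1.0 mean the Hetzner box is cheaper per unit.
core_ratio = hetzner.usd_per_core() / lightsail.usd_per_core()
ram_ratio = hetzner.usd_per_gb_ram() / lightsail.usd_per_gb_ram()
print(f"per core: {core_ratio:.2f}x, per GB RAM: {ram_ratio:.2f}x of Lightsail")
```

With those inputs the dedicated box lands under half the Lightsail price per logical core and under a quarter per GB of RAM, matching the comment. Note that a logical core on a dedicated Ryzen and a Lightsail vCPU are not equivalent units of performance, so this compares list prices only.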

That seems like a very powerful machine at an incredibly low cost. The closest equivalent at DigitalOcean is RAM 64 GB, 16 CPUs, 1.25 TB SSD, 9 TB transfer at $320/month. If I understand correctly the DO instance will have lower performance because it's virtual, and not bare metal. That amounts to about 4-5x difference in price, which is tremendous, considering that I thought DO was one of the cheapest providers. Is this comparison correct or am I missing something?
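As a quick sanity check on the "4-5x" figure, using only the two monthly prices quoted in this exchange (the constant names are illustrative):

```python
DO_USD_PER_MONTH = 320.0       # DigitalOcean: 64 GB RAM, 16 CPUs (from this comment)
HETZNER_USD_PER_MONTH = 73.0   # Hetzner AX51-NVMe (from the parent comment)

price_ratio = DO_USD_PER_MONTH / HETZNER_USD_PER_MONTH
print(f"DigitalOcean costs {price_ratio:.1f}x the Hetzner box")
```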

The part you're missing is that basically all dedicated server hosters are this cheap.

DO, Vultr, AWS, GCP, Azure are the odd ones out and extremely expensive.

The only reason ever not to use dedicated servers is if you're in the bay area and ops wages are so overinflated that you literally can't afford an ops or devops person.

For everyone else on the planet, the comparison between the wage of an ops person, and server costs, is always in favor of dedicated hardware with your own ops people.

The thing I don't understand is why would people consider AWS to have lower "ops" costs?

I can deploy my SaaS either to VPS servers (AWS, DigitalOcean, Azure, etc) or to physical servers. Both deployments use the same ansible automation. The only difference is that I additionally use terraform to set up the cloud instances.

In every case, if a machine fails it is my problem.

Where is the savings on ops/devops?

Hilariously, this whole thing is again showing up as issue on the frontpage: https://news.ycombinator.com/item?id=23798347

> Where is the savings on ops/devops?

I don't know either, I've always used european dedicated hosters.

But apparently the typical YC user needs to pay 300'000$/year for a dedicated ops person just to run an ansible script?

Any employee comes with overhead costs ($$ and time and risk). Lots of side projects get off the ground only because people don’t have to cover the fixed quantum of costs that hiring an employee brings.

“If this doesn’t take off, I’m on the hook for ~$100/month when it could have been under $10/month“ just isn’t that compelling.

Ops has a low baseline complexity, and scales more or less linearly with usage on low usage scenarios.

It's something worth optimizing, but the idea that it will break a starting initiative isn't realistic.

You are not missing anything. I wondered about this myself, then started using physical servers from Hetzner for my SaaS. Quite happy with the results over the last 4 years.

what did the interconnection between the dedicated servers cost? in the past, there was no vlan support, i.e. you need to buy the interconnection, which was not cheap.

There are VLANs, but I don't use them. I found it's easier to use vpncloud (https://vpncloud.ddswd.de), that way I can use the same setup for development/staging/production, and it doesn't matter if the specific provider used supports VPCs/VLANs.

Also, with vpncloud I know that my data gets encrypted and is private — not necessarily the case with various VLAN setups.

when I started with hetzner there was no vlan, you needed to pay for an "extra pack" and an additional network card for interconnection, with more than two servers you would also need to pay for a router and of course the time it took to set up. so not really that cheap in the past.

They're not quite comparable, the Hetzner equivalent to that would be the CCX41 plus some extra storage for around $200/month (but you get 11 TB extra transfer). In general I think DigitalOcean VPS's are 1-3x the price of Hetzner ones.

I don't know why Hetzner dedicated servers are so cheap (they're also cheaper than their equivalent VPS's). I guess they take a bit longer to set up but there must be more to it than that.

A dedicated server is a one month commitment, possibly with a setup fee, Hetzner also charges to do things like replace failed drives.

You also don’t get the benefits of a virtualized instance like live migration for host maintenance. You can run your own hypervisor - but you’ll probably want extra hardware like 10Gb switches and the appropriate cross connects, as well as paying the one-time server move fee.

>Hetzner also charges to do things like replace failed drives

Are you sure? If I look at the AX dedicated server page, I can read that the basic support is free:

"Basic support includes the free replacement of defective hardware and the renewed loading of the basic system (in so far as a disk image system can be loaded)."

They’ll replace the drive for free - but likely with another used disk with quite some hours of runtime under its belt. If you want a new disk you need to pony up most of the time.

Dedicated servers are very cheap.

Could you please elaborate why? I thought they should be premium, as they are faster, have more memory and better performance because of bare metal, etc.

The question should be the other way around. Dedicated servers are not cheap per-se, they're just being sold at the market price.

The question should be, why is AWS so expensive on all fronts (not just the hardware but also bandwidth)? And the answer is again because the market lets them get away with it.

AWS is so profitable it turned Amazon from a marginal business into a tech giant. As for Google and Microsoft they don't get out of bed in the morning for anything less than a billion dollar opportunity.

A large part of the cost of a VM on these clouds is paying their super high employee comp, for advanced software development e.g. CosmosDB/DynamoDB, but most of all, contributing to the high revenue growth that drives stock prices.

It's not paying for the actual cost of hardware. Smart people know this and don't run in the cloud unless they get a sweetheart deal. Consider that GitHub, for example, runs/ran in their own datacenter and used the cloud only for spillover capacity. Zoom also (although now Oracle cut them a sweetheart deal). Netflix built their own CDN etc.

You pay them monthly instead of hourly, also you have to manage them yourself.

How to manage a bare metal server running CentOS automatically:

Add this cron job:

yum -y update

It's not quite that easy, is it.

At minimum you need a data replication and backup strategy. You're exposed to things like drive and RAM failures on dedicated hosts, so you'd need to think about RAID at least, unless you're running a system that clusters at a higher level (but then you need multiple machines).

However this is basically a matter of learning or hiring/renting a sysadmin to do it for you.

MTBF stats for different classes of hardware at different hosts would be valuable to have, as it can be affected by things like datacenter temperature. But I never heard of such a dataset.

I don't have a comparison but Hetzner is pretty much unbeatable in price if you can live with having your servers in Europe. If not, OVH, DigitalOcean, Linode or Vultr are good alternatives with datacenters around the globe and prices below that of AWS.

>If not, OVH, DigitalOcean, Linode or Vultr are good alternatives with datacenters around the globe and prices below that of AWS.

Sure they're below AWS, but if you compare lightsail, they're pretty much identical.



I know that hetzner is affordable, but is digital ocean really cheaper than AWS?

That would largely depend on your usecase.

I created a comparison myself, because I couldn't find one. It was for my specific workload, which is CPU-intensive, but could be useful for others, too. An important find was that what we are being sold are not "cores", but "vCPUs", which usually correspond to hyperthreads, on slow Xeons. Which means no performance to speak of if the machine is loaded, which it usually is.

My conclusion was to rent physical servers (I use Hetzner) and get great performance at a fraction of the price.


Great read, thanks! I would definitely consider using dedicated servers instead of VPSs. AWS seems to be ridiculously overpriced.

Would like to see this too

Just to be clear, this would be more appropriately titled "How we saved money moving from AWS to VPS providers".

It comes as little surprise to me that the AWS lightsail offering saved them money over traditional AWS services, just as they’d save even more using two dedicated servers with any reasonable provider and being able to have a hot-failover for everything.

At DNSFilter we’ve gone through the evolution of Heroku -> AWS -> VPSs -> Dedicated -> and now are setting up our first colo rack.

I did a price comparison recently between Dedicated, AWS, and Colo. Dedicated is 15% the cost of AWS for our needs, and colo will be 42% the cost of Dedicated for us.
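Chaining those two percentages gives the colo cost relative to AWS. A minimal sketch, with AWS spend normalised to 1 and the two ratios taken straight from the paragraph above:

```python
AWS_COST = 1.0             # normalise AWS spend to 1
DEDICATED_VS_AWS = 0.15    # dedicated is 15% the cost of AWS
COLO_VS_DEDICATED = 0.42   # colo is 42% the cost of dedicated

dedicated_cost = AWS_COST * DEDICATED_VS_AWS
colo_cost = dedicated_cost * COLO_VS_DEDICATED
print(f"colo is about {colo_cost:.1%} of the AWS bill")
```

So for this particular workload colo comes out at roughly 6% of the AWS cost, before counting the DevOps staff mentioned in the next paragraph.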

Now keep in mind we have dedicated DevOps staff and are running at a very different scale from OP, and such a solution is not for everyone. But I personally have never understood the folks who love to brag about having spent much time and effort optimizing AWS to spin up and spin things down, when for the same cost I could just have a 10x more powerful server sitting there, on all the time to handle a spike in load, and I can utilize the extra resources or get things done faster with faster gear at 1/10th the price.

In the end, the most optimized way to save AWS cost is to call the AWS sales guy for lunch.

I don't ask for anything but usually 2-3 days after that he'd call to inform that I got a few thousands of AWS credit.

100% agreed. I always cringe when I see some hot new startup going all out on AWS or other cloud services when a $60/mo dedicated box would have more than covered their needs. No point spending that money until they actually need to worry about scaling, and even then, the value of dedicated servers / colo is so much better if you have the staff to support it. I manage a top 2k site off of some OVH boxes and our bandwidth alone at any of the major cloud providers would be many multiples of our monthly bill.

Would you mind sharing any of your experiences with OVH that you’re able to? I’ve used them before for hobby projects but was always worried about reliability for anything more serious. Had also heard questionable things about their networking reliability. Anything related to roughly how many machines you run, types, problems encountered, etc would be super helpful. Thanks!

I actually had similar concerns, in my mind OVH was always one of the lesser regarded providers with cheap hardware and attracted not-so-nice customers judging by the amount of spam I used to get from their network.

I ended up moving my personal projects to OVH as a trial (and due to some gross incompetence by Singlehop). After a few months of everything looking great, I moved our company's production services and couldn't be happier. I now consider them one of my tier one options when looking for hosting options. The hardware on the higher end models is all enterprise grade - Xeon, ECC, enterprise SSD / NVME drives, etc. You also get full IPMI access with virtual media, so setting up full disk encryption etc is a breeze. Network has been very reliable, their DDoS protection in particular is almost magical - I barely even see the malicious traffic before their network filters it. I only have a handful of servers, but none of them have experienced any hardware or major network issues. Currently have servers in their Vint Hill, Beauharnois and Roubaix datacenters and it all seems to be equally reliable.

The value-add features like FTP backup space haven't been great, speed / reliability problems but I don't care too much for it anyway, I ended up using Hetzner for backup storage as they have a ridiculously good $/GB ratio. I can't speak much for OVH support as I've rarely had to contact them, the few times I did they resolved the problem satisfactorily. With IPMI access and being an unmanaged service, you're probably on your own for anything software related.

Awesome, thanks so much for the info. Glad to hear it's a viable option since the hardware prices always seemed great and the unlimited bandwidth would be key for certain applications too.

Lots of startups need to worry about scaling pretty early on. I see the HN attitude a lot of "You aren't Google so don't bother with scale until later" but a lot of interesting problems require scale extremely early.

> two dedicated servers with any reasonable provider and being able to have a hot-failover for everything.

Except you have to implement that hot failover yourself, rather than using something like EC2 auto-scaling groups or an RDS multi-AZ deployment. I'm not confident that I'd get it right, and the middle of the night is a hell of a time to discover that you got it wrong.

> RDS multi-zone AZ deployment

I don't know about all RDBMS brands, but if you're running Postgres then setting up a read-only slave is trivially simple. You run one command to promote it to master.

At my work we use both AWS and colo our own gear in two geographically isolated datacenters. We use AWS for highly specialized tasks (e.g., Lex) and our colocated servers for just about everything else. The savings have been tremendous -- our costs would be ~10x higher if everything was on AWS.

I think a lot of people forget just how cost-effective dedicated servers and colocation can be. If you're not allergic to dealing with the occasional hardware failure, then there's no question in my mind that it's the right way to go.

How much uptime do you need, and how much does EC2 provide? Things will depend on that.

But from what I've seen, the point may be moot. Odds are that the EC2 uptime is around the same as you'll get on a single machine VPS anywhere anyway.

> An EC2 instance with 2 virtual cores, 4GB RAM and a storage of 80GB costs roughly 37$ a month and a Lightsail instance with the exact same configuration costs 20$ a month which is almost half the cost!

The author completely failed to do his homework. An a1.large instance is only 37$ if it's an on-demand instance. You pay a high premium to be able to pull the switch in the exact minute you cease to need one.

If he's willing to go with AWS lightsail with its monthly plan, the same a1.medium instance type is about 11.75$/month as a reserved instance, and can be had for about 131$/year as well.

The Lightsail instance comes with 4TB/mo of free transfer. Even using a small fraction of that transfer on EC2 will cost you more than the instance itself.

Also, creating an instance on Lightsail isn't exactly a monthly commitment. You are free to delete your instance at any time and only pay for the hours you used. It's a lot more flexible than the multi-year commitment you need to make in order to get the best pricing out of EC2.

Keep in mind that lightsail charges you for inbound and outbound data transfer; whereas EC2 only charges for outbound ... so depending on your network usage the 4TB isn’t a guaranteed saving.

Yea and 4TB is $360 which dwarfs everything else.
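
To put rough numbers on the transfer point above, here is a back-of-the-envelope sketch. It assumes the ~$0.09/GB egress rate quoted elsewhere in this thread and the article's $37 and $20 instance prices, treats 1 TB as 1000 GB (to match the $360 figure), and ignores free-tier allowances and volume tiers. The function names are made up for illustration.

```python
# Rough monthly cost comparison; every price here is an assumption taken from the thread.
EGRESS_PER_GB = 0.09           # assumed standard EC2 egress rate, USD/GB
EC2_ON_DEMAND = 37.0           # article's EC2 figure, USD/month
LIGHTSAIL_PLAN = 20.0          # article's Lightsail figure, USD/month
LIGHTSAIL_ALLOWANCE_GB = 4000  # 4 TB, treating 1 TB as 1000 GB


def ec2_monthly(egress_gb: float) -> float:
    """EC2 instance price plus per-GB egress (no free tier, no volume tiers)."""
    return EC2_ON_DEMAND + egress_gb * EGRESS_PER_GB


def breakeven_gb() -> float:
    """Egress volume at which the transfer charge alone equals the EC2 instance price."""
    return EC2_ON_DEMAND / EGRESS_PER_GB


if __name__ == "__main__":
    for gb in (100, 500, LIGHTSAIL_ALLOWANCE_GB):
        print(f"{gb:>5} GB out: EC2 ~${ec2_monthly(gb):.2f} vs Lightsail ${LIGHTSAIL_PLAN:.2f}")
    print(f"Transfer alone matches the instance price at ~{breakeven_gb():.0f} GB")
```

Under these assumptions the break-even sits at roughly a tenth of the 4 TB allowance, which is the "small fraction" mentioned above.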

Okay, I don't get this at all!????

If I want to move 400 TB per month out of Amazon, why can't I just proxy it thru 100 Lightsail instances?

Isn't that much cheaper than paying directly per byte? For that much saving it's probably worth figuring out how to do this. It can't be hard.

Isn't it "free" to move data internally to a lightsail instance, as long as both are in the same DC?

> If I want to move 400 TB per month out of Amazon, why can't I just proxy it thru 100 Lightsail instances?

There are some answers in the lightsail faq[1]:

1. You're limited to 20 instances per account. They'll increase that number on a case-by-case basis, but probably not if you're planning on proxying all your traffic through those instances.

2. If you delete an instance and create a new one, they share the same data transfer allowance.

3. All data transfer (both egress and ingress) applies to the data transfer allowance.

It may still be worth doing the proxy setup, but Amazon seems to have pretty clearly set up limits to make it less desirable.
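
A rough sketch of how those limits change the math for the 400 TB idea. The prices are assumptions from this thread ($20 per instance, 4 TB allowance, $0.09/GB standard egress), and the "each proxied byte counts twice" reading of point 3 is my interpretation, not something the FAQ spells out. `proxy_estimate` is a made-up helper.

```python
import math

# All figures are assumptions taken from the thread, not confirmed AWS pricing.
ALLOWANCE_TB = 4.0        # per-instance transfer allowance
PLAN_USD = 20.0           # per-instance monthly price
INSTANCE_CAP = 20         # default per-account instance limit from the FAQ
EGRESS_USD_PER_TB = 90.0  # $0.09/GB with 1 TB = 1000 GB


def proxy_estimate(target_tb, counts_both_ways=True, cap=INSTANCE_CAP):
    """Estimate how much of target_tb a capped fleet of proxy instances could carry."""
    # If ingress and egress both count, a proxied byte uses the allowance twice:
    # once coming in from the origin and once going out to the client.
    usable = ALLOWANCE_TB / 2 if counts_both_ways else ALLOWANCE_TB
    covered = min(target_tb, cap * usable)
    instances = math.ceil(covered / usable)
    remainder = target_tb - covered  # assumed to leave at the standard egress rate
    total = instances * PLAN_USD + remainder * EGRESS_USD_PER_TB
    return {
        "instances": instances,
        "covered_tb": covered,
        "total_usd": total,
        "direct_usd": target_tb * EGRESS_USD_PER_TB,
    }


if __name__ == "__main__":
    print("with FAQ limits:", proxy_estimate(400))
    print("original hope:  ", proxy_estimate(400, counts_both_ways=False, cap=100))
```

With the default cap and double counting, the fleet only covers about a tenth of the 400 TB, so most of the traffic would still go out at the standard rate.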


1. https://aws.amazon.com/lightsail/faq/

There are similarly magic numbers in the database cost. A 2 vCPU/4GB instance is db.t2.medium which comes to about $35/month on demand, plus 120 GB is addt'l ~$13 so it actually comes out cheaper than the lightsail version, and certainly way below the proclaimed $200.

This is one place where Google's cloud platform is better in that it automatically applies continuous use discounts without the guessing games.

In addition to data transfer costs, the Lightsail instance also comes with storage, which will cost you extra on the AWS one.

Not sure if relevant, but A1 is ARM, so not exactly the same.

I dabble extensively in being dirt cheap with my monthly cloud spend on personal projects, and after much experimentation, I have settled on the following:

1. elastic beanstalk, no docker: EB comes with really nice defaults so that you can quickly whip up a flask app, upload it and it just works. It provisions a small ec2 instance by default which iirc costs 10ish a month at best. Importantly, any operation you do with EB will by definition be ready for continuous deployment since you don't get an option to ssh into the machine to deploy. It's extensible enough to add whatever extra stuff you need as well. Only thing it can't do is simple caching (if you scale to more than one instance, that is), but that can be solved by having a separate eb deployment for a worker that can take care of all this aux stuff (elasticache is expensive I think). In a pinch, it also scales well (though the default cheap deployment does not have a load balancer).

2. Just suck it up and go with RDS postgres. Again, I see too many things going wrong with spinning up your own db in an ec2 instance, especially if that instance goes down. I'm too lazy to write backup scripts and keep track of them! The cheapest RDS postgres costs 13 a month or so, but I just suck it up to power whatever side projects I do. Postgres means I'm working with something I know, and I get full text search and pubsub for free. And whatever I write is not locked code in any form, and can be scaled up if needed as well. More importantly I'm only spending my time in technologies that are relevant for me in my day job so that's a win.

3. Github Actions to deploy to eb. just a few lines of yaml and you instantly get continuous deployment directly from your repo for free! Really can't beat that.

I have meant to try out heroku since it could be cheaper from what I have read. But I couldn't figure out what their S3 alternative is or how different they are from the canonical cloud offerings.

I'm sure it can be so much cheaper, but I'm not good at advanced networking or sysadmin, and I'm too lazy / bored / disorganised to write deploy scripts or ssh into remote machines. I'm also always afraid of if/how long I need to re-provision a vps/ec2 if it goes down. Not that they do, but they can, and that scares me.

Have you come across any cheap elasticache alternatives? I'm on AWS free tier and AWS gives you a free 2 vCPU 0.5gb(or 1 vCPU and slightly more memory) instance which suits me for now. But I was wondering if there are any other managed redis alternatives? We are just 2 people and we want to keep our stack open to be able to switch providers and I don't know how easy it'd be to switch db instances.

I have explored redislabs but all in all it seems much more expensive than elasticache. A similar instance from redislabs costs ~$36 vs ~$13 in AWS. My comparison was based solely on capacity

Perhaps you could consider hosting redis on lightsail? It might be the cheapest memory option within AWS (so traffic is free and fast). But it sounds like you're trying to use it as a database? I'd still go with managed MySQL/postgres - you can use the database's backup and restore abilities to switch providers fairly seamlessly - I have done it from gcp to AWS. Which is why I suck it up and use RDS.

I only need ephemeral storage. I really like the pub sub functionality (which is my most urgent need) of redis and the key value pair store suits me for now.

Hello, author here.

1. When I started working on this I was not fully aware of the large array of services that AWS provides and therefore our setup may not be the best possible one. So I would be checking out Elastic beanstalk for sure and see if it is feasible to use for my next venture.

2. Just to clarify, we are not running mysql on a lightsail instance, rather we are using a managed database provided by lightsail which automatically takes backups on its own and also has the option to restore a new DB from an existing backup.

3. Thanks for the insight on EB. Will be looking into it for sure.

The thought of any of these instances going down scares me as well, but I would like to believe that I have set up enough alerts everywhere so I can take immediate action :)

Your setup seems quite good for sure as well! It'll definitely be cheaper than going with EB - once you include the load balancer as well, which you probably need if you don't want a single instance managing all your traffic. However you gain ddos protection and fully automatic scaling for 12 bucks a month.

The biggest benefit for me is that you are not tied to a single EC2 instance anymore. Theoretically this is a moot pain point - in my work we have EC2 instances with uptimes going to four years now. I assume AWS aims for even better stability with lightsail (or not? Can't remember what the SLA for lightsail is). But EB or appengine is definitely the bare minimum of what you need to truly consider yourself cloud native and given the small increase in investment it might be worth it.

How does your traffic scale? Does it peak during work hours? You could potentially setup EB with a small instance that scales up to 2-3 instances just during work hours, with just a few toggles. Also, if you at least estimate having this service for a year, you can prepay and reserve the instance which is again a huge discount.

More importantly, I think the productivity gains you see going into continuous deployment are amazing. Getting a deploy by just merging your PR to master is surreal for sure, and makes it a breeze to just work on your features instead of infra! Often if I'm lazy and want to refresh some environment state I'll just kill the EC2 instance with the confidence that EB will scale a new one up!

This is what we do also, but at a slightly higher level. EB web, and worker environments, t3a.small instances, auto scaling between 10-100 instances. And using Aurora serverless RDS.

How is your experience with serverless RDS? I heard the cold start can be as much as 30 seconds! That will be a bummer for stuff I do where I'm the only user many times lol.

We don't cold start our production environments. But yes, it is slow on our staging ones.

By far the biggest issue is the relatively small connection limits.

One problem of VPS solutions is that they're easy to maintain until you have to do it. At some point the technical debt comes back and you need to upgrade the stack components and eventually the OS as well, without sacrificing uptime.

... if you did things well, it's just starting another instance and installing / copying things over with some minimum downtime. But if you don't have a 100% documented stack, you don't know which configurations you touched to make things work, you have files lying around, then you're probably going to pay back all your savings and more in the workdays needed to migrate.

At my current employer we never have time for maintenance, and we don't have professional expertise (the "I worked exclusively at this for many years" kind, I mean) at webhosting, security, system admin or dba, so I heavily lean towards more "managed" and cloud solutions.

I completely agree (it's always the small changes that make things work that cause problems), but for this product which is greenfield and has a very small team, it seems like a good fit.

Like, I wouldn't run Shopify like this, or anything really large, but it may get them to a better place right now.

What I really liked about this post was the discussion of what they actually needed and how to achieve their goals. It's a tradeoff of time against money, and I wish more people documented their thought processes and approaches to this kind of stuff.

But cloud solutions can change their UIs and APIs anytime they want. They can also drop features as they please.

Has any major cloud provider ever done a major change like that without 6+ months of advanced notice?

A bit off topic, but I bet someone here knows. When running an EC2 instance and not using all of the cores on the socket (for example, using a c5.large instance instead of a c5.12x large, which gives you all the cores on the socket), you presumably are sharing your L3 cache with your neighbors on the same socket, because that's how the processor is designed.

Is there a way that the hypervisor allocates a dedicated portion of the shared L3 cache to just your instance, or is it a free for all for all of the L3 cache space against potentially noisy neighbors?

Yeah that is true, you are sharing L3 cache. In order to mitigate some of the recent Intel issues I think AWS actually has their own chip now on newer motherboards to handle the hypervisor duties securely.

Otherwise, they'd do it in software patches for older CPUs and take the performance hit of the patch.

I'm not sure how much the hypervisor would reserve off of the L3; it is likely to be a free for all, however you'd still have quite a bit of dedicated L2 and L1 on most Xeons. With AMD's first gen EPYC it's a little bit different because clusters of cores share a cache and you can get weirdly high latencies depending on which cores you're using (i.e. cores 8 and 9 being too far apart).

Also according to this anandtech article, the average total CPU load for physical aws machines is ~60% and is actively balanced out by them. And yes, running benchmarks on a machine without noisy neighbors yields very significant improvements, up to 2x better on the benchmark scores. They measured this by comparing renting out all the cores of a machine vs only renting out the 4 or however many they needed.


I'm assuming they'd put a CPU into sleep/hibernate mode in order to save power instead of having it only run at 5% utilization.

Without any dedicated hypervisor tricks, can't a typical L3 cache eviction algorithm also evict memory that is assigned to another core and currently residing in its L2 or L1 cache? (Thereby flushing even the higher level caches if another core is really noisy.)

I'm not entirely sure, it looks like for Skylake CPUs the L3 cache is no longer inclusive but instead acts as an extension of the per core L2 cache.


I remember a while ago reading about storing your encryption keys in L2 instead of ram and deliberate "abuse" of the L3 cache on VPS hosts however can't find that article and haven't kept up with the news on it.

Given that the Intel CPU patches have reduced CPU performance by ~15% (again, sorry, don't have the exact source) I'd say there have been quite significant changes in cache management in the name of security.

Thanks for the info. It's a dilemma when allocating instances because I want the full per-core performance but I don't need a full socket's worth of cores, so I just have to hope my neighbors aren't running huge jobs all the time.

Sounds like a textbook example of what a side channel attack would look like.

I read a lot of "you are wrong", "you didn't think about this", etc, which I'm not going to get into. I embrace these posts as an invitation to re-evaluate, with your own data and use cases, your technical decisions and for that I'm always grateful.

On the same note, in case someone is digging into how to reduce CDN bills, I wanted to share that we are quite happy with BelugaCDN. It distributes objects stored in S3 using, in a hacky way, referrals as authentication method. Lots of money saved there.

Their setup is trivial - they could do it at 0.2% of revenue with a cheap vps on Linode or other...

> Their setup is trivial - they could do it at 0.2% of revenue with a cheap vps on Linode or other...

Lightsail has pretty comparable pricing to Linode. I'm sure they could re-architect their app to use fewer instances, but they could do that and stay on Lightsail as well.

Moving to linode isn't going to give them a 90% savings.

Pretty interesting, I didn't realize the pricing was the same and you get bandwidth.

This bandwidth pricing is very interesting because if you have apps that ship a lot of bytes, you can just run like haproxy on lightsail and get those bytes very very cheap instead of paying standard egress data pricing at like 9 cents per gig...

I just asked the same question https://news.ycombinator.com/item?id=23671787 before I saw your comment.

Does this work? It seems like using lightsail is a way to drastically reduce egress data pricing??? It's like a free lunch, what's the catch?

Only thing I can think of is if they charge you egress out of a vpc to lightsail or something - would like to know if anyone has tried this.

The price seems on par. Are there hidden costs on aws?

I'm assuming that you might not be aware of AWS pricing.

It's actually incredibly complicated and often very difficult (if not impossible) to predict. We have a dedicated part of our organisation which exists solely to figure out costs ahead of time for projects.

I'm not sure if it's intentionally obfuscated, I would suspect not, because "pay for what you use" can be broken down into many areas.

    It's actually incredibly complicated and often very
    difficult (if not impossible) to predict.
Not a joke: there are people that optimize aws bills for a living.

There are startups that build specialized AI to optimize AWS bills for a living.

Azure is the same and Microsoft spent 20 years making their software licencing more and more unpredictable.

It absolutely is a strategic choice and not an accident.

This is the main reason I don't use it. I like having predictable monthly pricing that remains consistent.

What is incredibly complicated about Lightsail? That’s kind of the point of it.

Egress charges.

Lightsail and digital ocean give you 1TB egress minimum included in billing. Now try to input that number into EC2 instance. If I am not wrong, that turns out to be 130$ approx

Hello, Author here.

While this is true, we wanted to go with AWS for 2 primary reasons

1. We use some of the other services provided by AWS like s3 and Route53.

2. Just the reliability and brand that AWS has is something that we had to take into consideration

Is Lightsail also available for Graviton? Could you also compare your savings/losses if you switched to ARM instead? That would be an interesting comparison

Disclosure: I work for AWS building cloud infrastructure.

No, Lightsail doesn't use Graviton2 powered instances today.

It's the difference between pets and cattle.

They are using Lightsail to create pets, which doesn't take much configuration. When you need cattle, you need more config and a more complex setup, which costs money.

As they say in the end, this is because they are a micro startup and don't need a huge scalable infrastructure.

It's funny as it seems to be the inverse of economy of scale. The more you grow, the higher the marginal cost. But I think that Amazon gives discounts to large users.

For context, I know the ins and outs of most of the core AWS services really well from the dev, Devops, and ops side.

But, my advice tends to be Lambda first if it is really low volume, LightSail second, and full AWS third.

As far as Lambda, I often recommend proxy integration, where you can just use the standard API framework for whichever language you choose (Django, Flask, Express, ASP.Net, etc) add three lines of code and push the entire thing into Lambdas. This gives you the flexibility to deploy to a VM later with no code changes.

For your static assets use S3, except for the case of Lightsail where you get plenty of free bandwidth.

Hello, author here.

I totally agree on Lambda first. The only reason we did not do that is because when we started this product I was not well versed with Lambda and serverless and preferred to work with something that I dealt with previously.

If I could go back in time, I would set up all our applications on serverless.

Disclaimer: I work for AWS Professional Services but I just started. All of my experience comes from working at outside companies.

From the perspective of an outside, boots on the ground Developer/architect, I’ve never worried about “vendor lock in”, I believe you should choose your infrastructure wisely and go all in. But, I do worry about “Lambda lock-in” for APIs. I like the optionality of being able to deploy my APIs anywhere just by changing the CI/CD pipeline.

That’s why I recommend using proxy integration. Every language supported by Lambda has a method to just throw your standard API in lambda without tying yourself to it.

Here is an example for Node/Express


Python/Flask (ignore the DDB part):


C#/Web API


> I often recommend proxy integration

Could you go further into what this is? Do you mean the API Gateway proxy integration? Is there a code sample somewhere

See previous comment.


Short version, instead of letting APIGW do the routing and applying its own intelligence, it just passes the event straight to your lambda. Your lambda has to process it. The plugins I cited, will translate the Lambda event to a form that your standard API framework expects.

Those plugins also work if you put your lambda behind an application load balancer.
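
To make the translation step concrete, here is a hand-rolled sketch of the idea using only the standard library. It is not the API of any of the plugins mentioned above; the event fields (`httpMethod`, `path`, `headers`, `queryStringParameters`, `body`, `isBase64Encoded`) follow the API Gateway REST proxy format, and `app` is a stand-in for whatever WSGI application (Flask, Django) you would normally run. It skips multi-value headers, binary responses and other edge cases the real adapters handle.

```python
import base64
import io
import sys
from urllib.parse import urlencode


def app(environ, start_response):
    """Stand-in WSGI app; a Flask or Django app object exposes the same interface."""
    body = f"hello from {environ['PATH_INFO']}".encode()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def handler(event, context):
    """Lambda entry point: turn a proxy event into a WSGI request and back."""
    raw = event.get("body") or ""
    body = base64.b64decode(raw) if event.get("isBase64Encoded") else raw.encode()
    environ = {
        "REQUEST_METHOD": event["httpMethod"],
        "PATH_INFO": event["path"],
        "QUERY_STRING": urlencode(event.get("queryStringParameters") or {}),
        "SERVER_NAME": "lambda",
        "SERVER_PORT": "443",
        "SERVER_PROTOCOL": "HTTP/1.1",
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "https",
        "wsgi.input": io.BytesIO(body),
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": False,
        "wsgi.run_once": False,
    }
    # HTTP headers become HTTP_* keys, except the two WSGI treats specially.
    for name, value in (event.get("headers") or {}).items():
        key = name.upper().replace("-", "_")
        if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            environ[key] = value
        else:
            environ["HTTP_" + key] = value

    captured = {}

    def start_response(status, headers, exc_info=None):
        # Simplified: records status and headers, ignores the legacy write() callable.
        captured["status"] = int(status.split()[0])
        captured["headers"] = dict(headers)

    payload = b"".join(app(environ, start_response))
    return {
        "statusCode": captured["status"],
        "headers": captured["headers"],
        "body": payload.decode("utf-8"),
    }
```

Because the application only ever sees a WSGI request, the same `app` object can later be served by gunicorn or similar on a VM without code changes, which is the portability argument made above.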

Maybe I missed something, but how do you handle the fact that your nginx server is a single point of failure? If that goes down, traffic can’t get to your web servers.

Do you have more than one, and DNS load balance, or do you just live with the risk?

One of the main reasons why I use an ALB/ELB is so that I don’t have that SPOF. If you found a way around that, please share, I would love to know, so I can save some money :)

His database also is.

I think it's highly unprofessional to use a setup like this in production. Looking at his product, it seems like a product whose downtime has a big impact on their clients.

I hate to admit it but the nginx server in fact is a SPOF.

We currently handle it by setting up alerts all over the place so I can take quick action if something goes sideways, but other than this I have not really found a way around it.

We also have latest snapshots ready of all our instances so that I can get another server running ASAP during a calamity.

I’m wondering if Lambdas and DynamoDB would be competitive to this setup.

Since they are serving about 250 requests per second, Lambdas might end up being more expensive, factor in an expensive DynamoDB and that monthly AWS bill looks scary.
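
A rough way to sanity-check that: the sketch below estimates the Lambda compute bill for a sustained 250 requests per second. The per-request and per-GB-second prices are the commonly published list prices as I remember them and should be treated as assumptions; the 100 ms duration and 128 MB memory size are hypothetical, and the free tier, API Gateway and DynamoDB charges are all left out.

```python
# Back-of-the-envelope Lambda estimate; prices below are assumptions, check current pricing.
SECONDS_PER_MONTH = 30 * 24 * 3600
USD_PER_MILLION_REQUESTS = 0.20   # assumed request price
USD_PER_GB_SECOND = 0.0000166667  # assumed compute price


def lambda_monthly_usd(rps: float, avg_ms: float, memory_mb: int) -> float:
    """Request charge plus compute charge for a constant request rate."""
    requests = rps * SECONDS_PER_MONTH
    request_cost = requests / 1_000_000 * USD_PER_MILLION_REQUESTS
    gb_seconds = requests * (avg_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * USD_PER_GB_SECOND


if __name__ == "__main__":
    # Hypothetical workload: 100 ms average duration at 128 MB.
    print(f"~${lambda_monthly_usd(250, 100, 128):.2f}/month before API Gateway and storage")
```

The result is very sensitive to the duration and memory assumptions, so plug in your own measurements before drawing conclusions either way.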

Hello, author here. I just want to confirm that this is one of the reasons why we did not go with Lambda and the other reason being I was not fully aware of and versed with serverless when we started this venture

I’m thinking otherwise. If the payloads are small, DynamoDB is going to be cheaper and nothing will come close to the savings from Lambda compute.

Almost certainly. A lot of groups have shown that low volume systems can be remarkably cheap using Lambdas compared to trying to do the same thing in a VM with EC2/lightsail. Whether that is genuinely because it is that much cheaper for Amazon or whether it is trying to lock people into the platform I don't know.

> I’m wondering if Lambdas and DynamoDB would be competitive to this setup.

Depends, but it's important to know the details.

Lambdas are used for 2 things in AWS:

1) Glue for managing services and events. (Required for advanced AWS mgmt.)

2) User applications. Lambdas have improved over time, but initially had all kinds of limitations (startup time, database access, 15 minute job limits, etc.)

DynamoDB has also improved a lot. Initially hot keys could cost you a lot of money. Some companies have designed their applications to use it in a way that absolutely no administration is required, reducing DBAs and Ops people.

Source: DBA.

Dynamo is great. All new projects I work on that require a db use it. We're slowly moving away from SQL databases entirely. Far more scalable, no more horrible SQL issues, at the cost of not being as flexible. But that's alright in the context of micro-services that do one thing and do it well/fast.

You can shop for deals on VPSes at lowendbox.com. But if you're trying to run a business, this is a waste of time. Find a provider which is highly reliable, which can also automatically rebuild all failed infrastructure with no intervention needed. That eliminates 99.999% of the providers out there. You're paying a premium to never have to think about your tech again, so you can focus on the business.

Besides using the free tier and other AWS services which are practically free at low uses, you can use cost effective options like Fargate Spot Instances and EC2 Reserved Instances. I highly recommend Fargate over running instances. Use Lambda with CloudWatch triggers if you need to schedule occasional jobs (or use Fargate's feature for that). Try to avoid heavy reliance on caches, ElastiCache is kind of a rip off. Move as much content to static as possible, use CloudFlare to reduce bandwidth costs. If you're gonna serve over S3, you might as well front it with CloudFront as it's actually cheaper due to caching at the edge, and also more reliable. ALBs are expensive but very useful for APIs as well as autoscaling (if you have to run instances, run them with an ASG, which also means having versioned AMIs)

For the mathematically disinclined (such as myself), the total savings are at least $229/mo .. not bad! Curious if you could use Cloudformation to provision these instances and setups? https://lightsail.aws.amazon.com/ls/docs/en_us/articles/amaz...

I'm interested in the managed DB part of this writeup, specifically that OP chose Lightsail. The last time I looked into it, Lightsail was MySQL only, so that was good to know.

I wrote a PostgreSQL DBaaS Calculator that got some traction here a while ago, and I just updated it this evening to add Lightsail to see how it stacks up: https://barnabas.me/articles/postgres-dbaas.html#calculator

No surprise, Lightsail is similar pricing to RDS with a 1-year commitment, but it's month-to-month. It's a pretty good deal until you need more than 8 GB memory, 240 GB storage, 2 cores ($115/month Standard plan). But Azure or AWS RDS are the ones to beat.

When I did data center stuff for a large company the biggest cost was network - not servers. Specifically network security and redundancy.

It's amazing to me that this is all included for basically free by these providers, except for application level filtering.

Yep, I was consulting for a firm with an already sizeable infrastructure investment/dependence on AWS, and their architects were proud of their best practice usage of multiple AZs for redundancy, but weren't aware that inter-AZ traffic costs money too. So I imagine they're going to have fun trying to incorporate that into their infrastructure layout.

In my day job we saved $600 a day by cutting down on needless inter-AZ traffic.

> An EC2 instance with 2 virtual cores, 4GB RAM and a storage of 80GB costs roughly 37$ a month and a Lightsail instance with the exact same configuration costs 20$ a month which is almost half the cost!

In general I'm not sure you can really compare instances from different services by just looking at cores, RAM, and disk.

I used to have my small website and email server at Rackspace, on the smallest, cheapest instance. I ended up getting out of mail hosting (moved that to Fastmail), and putting the small website on Lightsail.

The Lightsail instance and the old Rackspace instance have the same nominal specs for virtual cores, RAM, and disk. (Actually, Lightsail may be better on disk--I don't recall if the Rackspace one was SSD like Lightsail).

The main thing the website actually does is host some graphs showing the temperature in my house. An ESP8266 temperature monitor I made uploads a sample once a minute. A script was running once a minute on the server that used gnuplot to make graphs of temperature over the last hour, 3 hours, 12 hours, 48 hours, and over all samples.

At Rackspace it ran that fine, while at the same time gathering my mail from several services via fetchmail, receiving mail for my domain on its smtp server, and running spam filtering.

On Lightsail just handling the website it was locking up every few hours. I managed to catch one pre-lockup and found the load average was something like 300.

What was happening was sometimes that once a minute graph generation task was taking longer than a minute, and that would slow down the next one, and so on. Oddly, it didn't seem to be gnuplot that was taking too long, but rather the script that took the file containing all the samples, and extracted just the samples newer than a specific threshold. At least, running everything by hand that was the only step I ever saw take unusually long.

I changed it to every 5 minutes to temporarily stop the frequent hanging, and then added a check to my script to skip regenerating the graphs if a previous instance of the script was still running to fix things permanently. I also changed it from storing the data in just a big file of samples sorted by time to an sqlite DB.
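
For anyone hitting the same pile-up, the two fixes can be sketched roughly like this. It is not my actual script: the paths, table layout and function names are made up for illustration, and it uses `fcntl.flock`, so it is Unix-only.

```python
import fcntl
import sqlite3
import time

LOCK_PATH = "/tmp/temperature-graphs.lock"  # hypothetical path
DB_PATH = "/tmp/temperature-samples.db"     # hypothetical path


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if another run holds it."""
    handle = open(path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def open_db(path):
    """Open the sample store; the primary key on ts gives an index for range queries."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS samples (ts INTEGER PRIMARY KEY, temp_c REAL)")
    return db


def samples_since(db, seconds_back, now=None):
    """Fetch only the samples newer than the cutoff instead of scanning a flat file."""
    now = int(time.time()) if now is None else now
    cutoff = now - seconds_back
    return db.execute(
        "SELECT ts, temp_c FROM samples WHERE ts >= ? ORDER BY ts", (cutoff,)
    ).fetchall()


def main():
    lock = try_lock(LOCK_PATH)
    if lock is None:
        print("previous run still going, skipping this one")
        return
    try:
        db = open_db(DB_PATH)
        for hours in (1, 3, 12, 48):
            rows = samples_since(db, hours * 3600)
            print(f"{hours}h window: {len(rows)} samples")  # hand rows to gnuplot here
        db.close()
    finally:
        lock.close()  # closing the file releases the lock


if __name__ == "__main__":
    main()
```

The non-blocking lock means a slow run makes later runs exit immediately instead of queueing up behind it, which is what was driving the load average into the hundreds.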

TLDR: Use Lightsail, Amazon's approximation of a dedicated server.

In my opinion, that approach sacrifices pretty much all the benefits that cloud proponents usually talk about, like only paying for what you actually use or scaling up and down on demand.

In fact, I'm not sure what the actual difference would be between Lightsail and a good dedicated hoster that provides backup and failover services.

The difference would be:

- They could have much better service

- And bring down their expenses by at least half again

I used lightsail, and it's a terrible service.

It is very expensive (still) compared to other hosting providers.

And if your instance ever runs out of memory, it becomes unresponsive until you go and manually restart it.

On other providers that I use, the OS would just "sacrifice child" (kill the process) and restart it.

It's not ideal, but much better than having to go there yourself to restart the whole thing.

I've been using cgroups since way before docker made them cool exactly for this: if one process goes rogue on memory use, make sure only exactly that one crashes.

And yes, I fully agree with you. In addition to the other disadvantages, Lightsail also shifts the burden of process monitoring and management onto you.

I use docker. And yes, it normally just takes care of restarting the app whenever something goes wrong.

But somehow on lightsail (only), the machine just goes completely unresponsive instead.

Sounds like OOM killer is disabled by default on Lightsail. Shouldn't be hard to turn it back on by modifying a line or two in sysctl.conf.

I disagree. Start with LightSail, customize later.

Too many startups start with Kubernetes and microservices, putting a lot of engineering into infrastructure when they should worry about product market fit and a simple server would do just fine.

Complex solutions to simple problems pad resumes.

It's important.

I've been looking for a job recently and my lack of experience with excitedly building expensive overly engineered big piles of stuff to do simple things is really hurting me.

When I suggest simple solutions proportional to the problem size people look at me like I'm some kind of simpleminded ninny.

And then when I describe working systems I've made using the approach on companies that exited north of 10 million, then they usually just think I'm a bullshitter.

10,000 users? Just do a simple lamp stack on a $10 month vps and don't code it like a moron, it's not hard. And it can be delegated to almost anyone, that's important for longevity. That's it, move on with life.

I've even offered to show logs, I mean, it's remarkably bad how hard it is to find work as a cost-cutting product-focused engineer, nobody wants that shit, really.

They want autoscale, prometheus, travis, blah blah blah... as fancy and crazy as possible, an empire of infrastructure instead of infrastructure for an empire... It's a damn cult.

I'm thinking about giving up and giving in. Maybe spend a month and drop $10,000 or whatever on a bunch of the tech just so I can talk about these things.


The second thing I'm constantly surprised by is how many obvious things are formalized with fancy names, math equations, papers, etc. Stuff I'd think like literally anyone with half a brain would do.

Recently I found out about "CRDT". Apparently I've been doing it for a decade... It's just super simple stuff wrapped in needless formalism, like back when design "patterns" were all the rage. I used to think "why is there a name for that?" it's such basic stuff.

So saying "dude, the emperor has no clothes, here, take my coat" - bad interview strategy and it's hard to rectify because there's like 50 naked emperors and silicon valley culture is on a constant rotating basis of seeing the nakedness of 5 of them and ignoring the other 45.

Maybe what I should do is just a couple throwaway interviews to focus on finding out the current fads and then be able to give the right cultural signals for the ones I actually want.

Great response. We’re all obsessed with infrastructure and over-engineering, myself included. Startups like to think they’re working on the next big thing, so they want to use the tech that the big boys use and bask in their anticipation of scale. You can easily save cost and time to validate your business with smaller hosting approaches, though it does look amateur from the outside. It’s not amateur, it’s smart.

Yeah I've been in LA for 12 years which is much more practical for the most part.

And it's been fine, profitable, revenue generating, sure. Mysql, php, a bit of python, easy stuff.

Honestly I've been wanting to move and try out silicon valley again though and it's been hard because I don't speak that language and frankly I'm skeptical and hostile to it - bad idea. That bad attitude needs to go.

I've been in practical startup land too long so I keep getting stonewalled from getting in to the sv world.

I'm trying to contribute to some companies' open source projects recently and then basically earn my way in through labor since I can't do the virtue signaling. It's only been a few weeks, nothing yet.

I gotta learn how to be along for the ride without always trying to reach for the steering wheel.

Hopefully by the end of summer I'll find a lead somewhere.

It's not about the money, it's kinda actually how I want to spend my time for the next year or two.

You could just as well start with one cheap dedicated server and then migrate to the cloud once you know that you need it.

For example, one of my companies is doing hosting and image delivery services for photographers. I started it on one dedicated server, and now 10+ years later it's running on 3 dedicated servers.

My hoster does backups and OS upgrades, so there was never any reason to go into the cloud. Except maybe, if I wanted to burn cash on slow performance ;)


> We are a team of two people

Articles like this reflect the fact that one size does not fit all.

> a good dedicated hoster that provides backup and failover services.

In theory that sounds reasonable. In practice the ability to completely manage the system via a web console and APIs makes a big difference.

I have never needed it, but the hosting provider which I use actually has an API that you can use to reboot servers into different OS images, trigger hardware resets, or get KVM access.


But honestly, if you have only one server, what do you need an API for? It'll be much faster to just configure things by hand. It's not like you'll switch servers very often. Or at least the original article says that they committed to monthly cancelation terms.

Even Hetzner as the cheapest VPS and dedicated server provider offers an API to completely manage servers.

You can definitely do cheaper than Hetzner (like OVH).

> all the benefits that cloud proponents usually talk about, like only paying for what you actually use or scaling up and down on demand.

That's a trope that I have never seen in practice.

A company might have some small percentage of clusters that autoscale, but usually there's a bunch of servers that don't.

Also, when AWS has control plane problems, autoscaling doesn't work, you get swamped, your health checks fail, and you lose all your servers and you're hosed. So it's better to always over-provision in the first place.

Same here. Heroku offers auto-scaling for the web dynos, but they advise you to do near-realtime tasks like the thumbnails for image uploads using background workers, for which there is no Auto-scaling.

Slightly off-topic, but is there a book to learn about this kind of stuff, preferably without having to sign up for AWS, but that does have code samples and whatall?

There is one book that I would highly recommend - https://gumroad.com/l/aws-good-parts

There's only so much you can learn about this without signing up and actually getting your hands dirty unfortunately.

I don't understand why this is so special? Ours is about 4% for a very compute heavy service handling millions of events per day across four continents.

It's no secret that I've had my fair amount of criticism towards AWS's billing and how unpredictable it becomes, even if you read every fine print and take everything into account. Especially if you are a heavy user, "correction invoices" are not uncommon. That said, cloud services can be incredibly cheap if you are very careful and smart about it. Especially if you combine several of them.

Another approach to consider, that's so straightforward and quick that most will consider it boring and overlook it.

- kill services with low usage

- downgrade instance size

- downgrade instance type

- merge databases

- schedule services with obvious usage patterns to shut down when not used

- use EC2 spot instances

Most importantly, it requires an aggressive culling mindset. If drastically reducing the AWS bill means staying afloat, then make bold choices.
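The "schedule services with obvious usage patterns" item above can be sketched as a small decision function. This is only an illustration under stated assumptions: `UptimeWindow`, `should_be_running` and `weekly_uptime_fraction` are hypothetical names, the window is assumed not to cross midnight, and the actual start/stop call to the cloud provider is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class UptimeWindow:
    """Hours during which a service is actually needed (hypothetical policy)."""
    start: time
    end: time            # assumed to be later than `start` on the same day
    weekdays: frozenset  # 0 = Monday ... 6 = Sunday


def should_be_running(now: datetime, window: UptimeWindow) -> bool:
    """Return True if `now` falls inside the service's uptime window."""
    if now.weekday() not in window.weekdays:
        return False
    return window.start <= now.time() < window.end


def weekly_uptime_fraction(window: UptimeWindow) -> float:
    """Fraction of a 168-hour week the service stays up under this window."""
    start_h = window.start.hour + window.start.minute / 60
    end_h = window.end.hour + window.end.minute / 60
    return (end_h - start_h) * len(window.weekdays) / (7 * 24)


# Assumed example: a service only needed 08:00-20:00 on weekdays.
office_hours = UptimeWindow(time(8, 0), time(20, 0), frozenset(range(5)))
```

A cron job or scheduler could call `should_be_running` every few minutes and start or stop the instance accordingly; `weekly_uptime_fraction` gives a rough idea of how much of the always-on bill would remain.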

As a summary: They use AWS, see huge bills, then decide not to use AWS services and switch to redis, nginx etc.

A lot of this is penny wise, pound foolish. Managed services cost more, but they provide more. RDS is just such a simple slam dunk for what you get for the cost. How much does it cost to spend 8-20 hours a week handling maintenance and upgrades and scaling for these services? I guess I'm bitter from working on a project that was built with all sorts of crazy custom deployment stuff to conserve costs and that's a total ops nightmare. Moving from Redis on EC2 to ElastiCache was such an upgrade. We constantly had little issues here and there.

Edit - Also containerize your app from the start so that if it does take off you can slam it in ECS or k8s.

So Lightsail is the AWS equivalent of DO/Linode/etc?

Yes, AWS probably realized that they were losing money to DO/Linode so to capture some of that market they launched Lightsail.

The general rule of thumb for cutting an AWS bill is to use spot instances, which is very hard to do unless you build your application to be tolerant of server kills.
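One way to read "tolerant of server kills" is checkpointed, resumable work. The sketch below is a minimal illustration, not anything from the article: `run_job`, `load_checkpoint` and `save_checkpoint` are hypothetical names, and the local JSON file stands in for durable storage that would have to live off the instance in a real spot setup. Because a kill can land between handling an item and saving the checkpoint, `handle` is assumed to be idempotent (at-least-once processing).

```python
import json
import os


def load_checkpoint(path):
    """Return the index of the next unprocessed item (0 if no checkpoint exists)."""
    try:
        with open(path) as f:
            return json.load(f)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_index):
    """Write the checkpoint via a temp file and rename, so a kill mid-write
    cannot leave a half-written checkpoint behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp, path)


def run_job(items, handle, checkpoint_path):
    """Process items in order, resuming from the last saved checkpoint."""
    start = load_checkpoint(checkpoint_path)
    for i in range(start, len(items)):
        handle(items[i])                          # must be safe to repeat
        save_checkpoint(checkpoint_path, i + 1)   # progress survives a kill
```

If the instance disappears, a replacement simply calls `run_job` again with the same checkpoint location and picks up where the previous one stopped.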

Definitely off topic, are you using Twilio/Whatsapp at superlemon.xyz ?

I wonder how do you cover the cost for unlimited outbound messages in your free plan...


You said you basically use Shopify as CDN, what’s stopping everyone from just setting it up and abusing it?

Hello, author here.

Like I mentioned in the post, there is no way to programmatically upload anything to the Shopify CDN and the CDN cache cannot be invalidated once you upload a file. If you want to update an existing resource, all you can do really is upload a new one.

This of course still does not completely prevent people from abusing it, but it does restrict usage to a large extent. There is also the issue of giving out a cdn.shopify.com link to your customers instead of something that has your company branding on it. This is not a problem for us because our customers do not have to manually add this snippet to their website and we do it via an API instead, so this link is not apparent to our customers.
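When a CDN cache cannot be invalidated and every update means uploading a new file, a common way to live with it is to put a content hash in the file name, so each new version gets a new URL. The sketch below illustrates that general technique only; it is not confirmed to be what the author does, and `versioned_name` is a hypothetical helper.

```python
import hashlib
import os


def versioned_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Return a file name with a short content hash inserted before the extension.

    Identical content always maps to the same name; any change in content
    produces a different name, so a stale cached copy is never reused.
    """
    stem, ext = os.path.splitext(filename)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return f"{stem}.{digest}{ext}"
```

The page or API that references the asset then has to be updated to point at the new name, which is the part that still needs to happen on every release.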

Do Hetzner or xneelo reflash the firmware on server hardware when they recycle them across users?

How do you solve high availability with the nginx setup? What software for heartbeats?

Another day, another SaaS that I can’t believe has value.

Really good one.

Here is how I do it: no revenue, no bill. ;D

I've witnessed first hand people having no revenue and a long AWS bill. A direct consequence of using all the hip services when they didn't need any. I'd say you're doing great (to be clear this is not sarcasm)

That’s significantly more profitable than most startups on AWS without revenue ;)


The author is talking about a few hundred dollars here.

At that rate a raspi on starbucks wifi would get them to 0.2%

I appreciate the disclaimers at the end:

> I would like to put emphasis on the fact that we are a micro-SaaS product that solves a small and specific use case and therefore this kind of AWS setup worked for us. This may not work for big organisations or products where the traffic is erratic.

> This setup will also not work for folks who have a ton of stuff to do already and would prefer to use managed services and not take the additional headache of monitoring, maintaining and provisioning hardware resources on a regular basis because this has a time cost to it.

It is refreshing to see a post that doesn't pretend as though its recommendations are the holy grail for anyone and everyone.

I personally prefer saving time and effort, over a ~$500 monetary discount. But it's nice to read posts like these and learn more about alternatives such as Lightsail.

> I personally prefer saving time and effort, over a ~$500 monetary discount.

Completely agree. As the general "Ops" manager, I hate getting the call at 2am because something broke... I hate it even more when it's not really fixable by me in the first place (e.g. black box appliance or software).. but I feel a heck of a lot better when I can point the finger at an external team that's paid and supposed to be experts... If you outsourced Redis or your DB, and someone like AWS or MongoDB themselves has an issue... Well, they pay people a lot of money to simply provide that service reliably... I get paid to keep 20 things connected, not be the subject matter expert in all of them.

> but I feel a heck of a lot better when I can point the finger at an external team

But how would you feel if you were not just the Ops manager, but the CEO as well?

I think if I were the CEO I'd have even more on my mind than managing servers.

I totally get that it can be cheaper to do things yourself. And I also get that sometimes you literally can't afford to go the easier route.

But I also think a lot of people on forums like this overly discount time spent or even just distractions that take away from doing the many other things you could be working on as part of a business.

> I personally prefer saving time and effort

I guess it depends where you live and how much your time costs (and if you have any time to spare).

Yup, This is some high quality jugaad. ;)

Text from about me:

> where I learnt how to scale an application to serve ads to millions of users a day

Please let's stop this 'race'.

AWS expense/ revenue is an insane anti-metric to track.

I have a question (it may not be related to this article in any other sense aside from "really lightweight" use of AWS):

I would like to have a workflow of really lightweight QR/barcode transactions... but with what seems to me currently a complex implementation:

1. you have a prescribed barcode ParentA

2. you scan that barcode and create a QR code ParentB

2.a: you ascribe that QR to an object ChildA

3.n: you have that ChildA go through many iterations

3.n+: ChildA may be conjoined with ChildB,C,D etc...

-- but I want an iterative QR code that follows the family tree back to parent A and keeps history of all transitions...

(I know I am wording this poorly... just thinking out loud...)
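The workflow described above looks like a small lineage graph: every code records which codes it was derived from, plus its own transition notes. A minimal in-memory sketch, assuming the payload of each barcode or QR code is just a unique ID; `Lineage`, `CodeRecord` and the method names are hypothetical, and persistence and QR generation are left out.

```python
from dataclasses import dataclass, field


@dataclass
class CodeRecord:
    code_id: str                                     # payload encoded in the barcode / QR code
    parent_ids: list = field(default_factory=list)   # codes this one was derived from
    history: list = field(default_factory=list)      # transition notes, oldest first


class Lineage:
    def __init__(self):
        self.records = {}

    def register(self, code_id, parent_ids=(), note="created"):
        """Create a new code; parents must already exist, so no cycles can form."""
        if code_id in self.records:
            raise ValueError(f"duplicate code: {code_id}")
        missing = [p for p in parent_ids if p not in self.records]
        if missing:
            raise KeyError(f"unknown parent codes: {missing}")
        self.records[code_id] = CodeRecord(code_id, list(parent_ids), [note])

    def transition(self, code_id, note):
        """Append one iteration / state change to an existing code."""
        self.records[code_id].history.append(note)

    def lineage_order(self, code_id):
        """Return the code and all its ancestors, every parent before its children."""
        order, seen = [], set()

        def visit(cid):
            if cid in seen:
                return
            seen.add(cid)
            for parent in self.records[cid].parent_ids:
                visit(parent)
            order.append(cid)

        visit(code_id)
        return order

    def full_history(self, code_id):
        """Return (code_id, note) pairs for the whole family tree of a code."""
        return [(cid, note)
                for cid in self.lineage_order(code_id)
                for note in self.records[cid].history]
```

Conjoining ChildA with ChildB is then just registering a new code with both as parents; scanning that code and calling `full_history` walks back to ParentA.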

So what about the grandparents, and how would ChildN go to kindergarten?

What a fucking stupid response.

The workflow that I am talking about would apply to taking the nightmare out of METRC tracking system for cannabis -- the METRC system is a piece of shit - but wants complete chain of history of how plantA becomes productA - and it is a bullshit workflow - and I have a way to make it easier - and by using a system similar in the way described in this post, it could be done very cheaply...

For what is this workflow? Just curious.

If it's barcodes being used by humans in the real world, get a software package to handle it, check the solution from Zebra https://www.zebra.com/us/en/products.html
