'Merchants of Complexity': Why 37Signals Abandoned the Cloud (thenewstack.io)
87 points by msolujic 5 months ago | 27 comments



He’s absolutely correct about the ridiculous complexity of cloud tooling. The number of services, features, knobs, etc., available in AWS is borderline comical. I’m skeptical that half of the products they have really fill a gap and solve problems people had. Who’s asking for all of these new services that are constantly being rolled out? There’s so much cruft in their product offerings. Promotion at AWS is kind of like how it is at Google now, where a product is launched because someone needs to get their L8, not because customers are asking for it.


To be fair, you don't have to use those features. Stick to the commodities like EC2, S3, and maybe managed databases. You might, however, notice that you're being ripped off and a crater is forming in your bank account.

Those specialized services are often about trading off vendor lock-in in exchange for cost savings. Sometimes these savings are real, but more often they're just perceived, because the disparate billing models (some services bill per hour, some per request, some per GB of traffic, etc.) make it hard to estimate or understand what exactly you're paying for and which areas can be optimized.
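A back-of-envelope tally shows why that's hard; the rates and usage figures below are made-up placeholders for illustration, not real AWS prices:

    # Hypothetical rates and usage, for illustration only -- not real AWS prices.
    line_items = {
        "compute (per instance-hour)":  (0.05, 3 * 24 * 30),  # 3 instances, one month
        "queue (per million requests)": (0.40, 250),           # 250M requests
        "egress (per GB)":              (0.09, 4_000),         # 4 TB out
    }

    total = 0.0
    for name, (rate, units) in line_items.items():
        cost = rate * units
        total += cost
        print(f"{name:31s} {cost:9.2f} USD")
    print(f"{'total':31s} {total:9.2f} USD")

Three different units (hours, requests, GB) roll up into one dollar figure, which is exactly why it's hard to see which line item is worth optimizing.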

At the end of the day there is no such thing as a free lunch. AWS' margins must come from somewhere, and they would not be offering a product at a loss.


>Stick to the commodities like EC2, S3, and maybe managed databases.

The problem is they are increasingly not a commodity. Not only EC2 and S3 but also bandwidth/transfer. The price disparity between EC2 and second-tier cloud providers like DigitalOcean or Linode continues to grow every year. And the gap between DO and the third tier continues to grow as well.


AWS employs ~60k people for sales and marketing and has an operating margin of ~30%. There is room for margin compression.


It’s all about lock-in. AWS has a bunch of features that you could implement yourself in an afternoon with a few EC2 instances, but it’s much easier to click the checkbox. Repeat this across an entire org for a few years and you are deeply coupled to AWS. They are anti-features in that sense.


In some cases you end up writing more glue code to shoehorn some AWS service into your workflow than just implementing the functionality yourself.


Basecamp (née 37Signals) are in a significantly different position than many companies in two ways:

- Unlike most B2C companies, they have effectively zero unpaid traffic. Ad-supported content needs to scale far higher to be able to make useful revenue.

- Unlike many B2B companies, they have remarkably low compute/storage needs. They're not doing ML, they're not doing anything with video or images, they're not doing anything to serve end customers at scale, they don't have any low-latency requirements, they don't have strict uptime requirements, they're not doing builds. They're a CRUD app. This is not to say there aren't hard problems, and not to downplay their good solutions, but compute is just not one of their major problems and likely never will be. Their scaling will therefore be much slower. They are about as good a fit for bare metal as there can be.


So you are saying they don't need much compute juice and storage? But compute juice and storage are what cost the money at cloud providers, so their financial reasons are weaker than those of other companies.


My point was that by not needing many resources, and not scaling quickly, it is much easier for Basecamp to run on their own machines.

That said, compute and storage tend to be the cheapest products from cloud providers, as they are the foundation that the other, higher-margin "value add" products are built on top of. For example, EC2 is much cheaper than Lambda for the compute you get, because Lambda does a lot more for you.
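As a rough illustration of that gap (the prices below are approximate on-demand list prices for us-east-1 and should be treated as assumptions; they drift over time):

    # Rough cost of one hour of "always busy" 4 GB compute: Lambda vs a small EC2 instance.
    # Both prices are approximate on-demand list prices (us-east-1) -- assumptions, not quotes.
    LAMBDA_PER_GB_SECOND = 0.0000166667   # USD per GB-second
    T3_MEDIUM_PER_HOUR   = 0.0416         # USD per hour (2 vCPU / 4 GB)

    lambda_hour = 4 * 3600 * LAMBDA_PER_GB_SECOND   # 4 GB busy for a full hour, ~$0.24
    print(f"Lambda: ${lambda_hour:.2f}/hr  EC2: ${T3_MEDIUM_PER_HOUR:.3f}/hr  "
          f"ratio: {lambda_hour / T3_MEDIUM_PER_HOUR:.1f}x")

The comparison only holds if the workload keeps the instance busy: an idle EC2 instance still bills while idle Lambda does not, which is part of what Lambda's premium buys you.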


So what you're saying... because they're not ad-supported, their products are cheaper to run and better for the environment because they're not burning countless megawatt-hours of electricity to find data to sell to third parties?


Not at all. I'm saying that by not being ad-supported, they don't need 10 million users to be profitable. There's no profitable ad network with 1000 users, because no one cares about showing ads to 1000 users. However, if you charge those 1000 users $100 a month, that's often a very nice business!

Ad supported businesses need to scale to millions of users in order to be able to sell ads. The selling of the ads isn't necessarily a resource intensive process, but you need so many users to even have a market.


Something that doesn’t get talked about much is that the apparently limitless capacity of cloud computing has very real impact on the world.

Go visit any location where cloud providers locate their data centers and you’ll see sprawling campuses of giant windowless concrete buildings surrounded by asphalt and tall fences. These sit on land carved out of huge destroyed forested areas. And all of them require extensive electrical networks which also clear natural areas and break up animal habitats.

The loss of these natural areas isn’t factored into cloud costs at all so companies spin up many more computing resources than they need which necessitates the building of yet more data centers.

Maybe if companies were made more cognizant of the physical impacts of their computing by moving back to self hosted environments the cancer of data centers on the countryside could be slowed and one day reversed.


>Go visit any location where cloud providers locate their data centers and you’ll see sprawling campuses of giant windowless concrete buildings surrounded by asphalt and tall fences. These sit on land carved out of huge destroyed forested areas. And all of them require extensive electrical networks which also clear natural areas and break up animal habitats.

Really? Compared to all the other space we use to put our human things, data centers are literally nothing. Insignificant. I'd grant you the electrical consumption argument, maybe, but your point about their vastly destructive use of space is absurd.


IMO it's probably better for everyone, including the environment, to concentrate these data centers in random out-of-the-way places rather than trying to, say, further cram them into pieces of San Francisco or Seattle. People like to live by nature, and the nice places are already super impacted.

These centers aren't going to make a meaningful difference vs your average industrial factory, solar farm, farm farm, etc. But spreading them out would mean less efficient cooling, less efficient transportation of goods and labor, less efficient routing, etc., which has its own climate impact.

The industrialization of the countryside is also providing jobs in areas that would typically be extremely depressed or resort to resource extraction economies to survive.


Eh, aren't the footprints of these datacenters really low compared to all the other things we use space for? Also, they don't really need premium land; they can be built basically anywhere, so you could put them where nothing could grow anyway. I don't think the space is a concern, and including it weakens your argument.

However, electricity is a more serious concern. Data centers do use a significant amount of the world's electricity, and the production of that electricity does come with concerns and tradeoffs.


Electricity and water.


Could the solution to complex clouds like AWS and Google be a simpler cloud like the one DigitalOcean is attempting to create? I find their offerings much easier to decipher and their onboarding journey developer-friendly. This is as opposed to AWS, where I reckon a certification is needed just to navigate the mess of services it offers and to deploy those services without creating a security vulnerability for your organization.


It's all about finding the minimum of costs between headcount and infra-spend while taking into account risk of failure and a few other factors.

If you need to hire someone to manage stuff you have to build or run yourself because DO doesn't support a solution yet, then that's money you could have spent on AWS instead. Do you really have spare devops talent just waiting around?

For example, SQS is pretty good and reliable, but on DO you could use something like a RabbitMQ droplet: https://marketplace.digitalocean.com/apps/rabbitmq

How much traffic can that droplet handle? What happens if it fails? How is the droplet maintained and patched?

Or I could just call SQS from boto3 and be done with it... are queues really the top cost of infra? Probably not; your top cost is probably the database.
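For reference, "just call boto3 and be done with it" really is about this much code; the queue name and region below are placeholders, and it assumes AWS credentials are already configured:

    import boto3

    # Assumes credentials/region are configured (env vars, ~/.aws/config, or an instance role).
    sqs = boto3.resource("sqs", region_name="us-east-1")
    queue = sqs.get_queue_by_name(QueueName="my-work-queue")  # placeholder queue name

    # Enqueue a job.
    queue.send_message(MessageBody='{"job": "resize-image", "id": 42}')

    # Dequeue and acknowledge.
    for message in queue.receive_messages(WaitTimeSeconds=10, MaxNumberOfMessages=1):
        print("got:", message.body)
        message.delete()  # deleting the message is the "ack"; otherwise it becomes visible again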

People probably build up a solution using AWS tech and see where things suck. Then they move off those bottlenecks or cost tent-poles. If there are none, then stuff is left as-is.


This stuff has been making the rounds in my social circle who mostly do infra things.

DHH is a salesman/marketer above all else so you need to take anything he says with a grain of salt.

That said he is right on several crucial advantages of running your own kit:

1. It's a lot cheaper: once you reach the scale of 2-3 racks (roughly 30-45kW of gear or so), you are well in the green even if you need to hire additional staff to manage the HW side of things.

2. Performance is on a completely different level, especially if your workload benefits from fast I/O. Being able to stack a 2U box with an absolutely silly number of NVMe drives makes light work of what would otherwise be tricky scaling challenges.

3. Access to real raw networking makes a whole bunch of things more optimal, namely real BGP (yay anycast) but also better load balancing primitives like Maglev.

Now these are all very good reasons to want to be on your own gear but DHH completely glosses over the very real disadvantages.

1. As time goes on you need new hardware, either to replace gear that is out of service life or has otherwise failed, or, most commonly, for expansion. This will require doing the same validation you did when you first ported off (or built out, if you were greenfield), because it's very likely you aren't going to be able to get the same SKUs with the same BOM. So over time your infrastructure becomes less and less homogeneous, which increases the cost of management and becomes a burden in terms of operational knowledge. TL;DR: introducing new hardware never gets easier; the more unique SKUs you add, the less fun things become.

2. Networking is generally entirely glossed over in posts like these. Running your own network with your own AS, IP space, edge routing, dealing with peering and transit, balancing upstreams, etc., is work. You might be able to offload this to your DC provider in some cases, but usually you are going to need real network engineers on staff; you can't repurpose a curious programmer into a network engineer (at least not quickly). That is also ignoring the fact that most network gear still hasn't caught up to the age of IaC. If you want that, you will be spending either more money (Arista and friends) or paving the road ahead of yourself (unless things have changed in the last few years I have been away from gear).

3. The financial aspect is as a whole a lot cheaper, but you trade total net $ for complexity of financial arrangements. Oftentimes you will want to lease servers, you need DC leases, you need transit agreements, etc. You went from paying a single entity in "The Cloud" to many different vendors. Vendors suck, of course, and lots of vendors suck even more. If you are a larger company with a procurement office this is a non-issue, or can even be an advantage, because said procurement team can put the screws to vendors individually... so YMMV depending on the size of your company.

4. While most older software engineers and infra folk generally have bare-HW experience, the newer generation (last 10 years) for the most part have never touched anything lower level than a VPS. The days when every infra guy knew IOS, megacli, etc. like the back of their hand and wrote Perl one-liners instinctively have sadly passed. You will need to seek out actually qualified engineers. This can put off businessy/less-technical types, because they don't like the idea of not being able to hire a bunch of juniors who know close to nothing and will accept peanuts for pay. If you want to do this but you are coming from the technical side (you want those sweet IOPS), this is probably one of the bigger organisational barriers you will face.

5. Things fail. Often at very inopportune times. Runbooks, backups, DR, etc. are all much higher stakes on your own gear. Culturally, if you want to go this route you also need to be willing to accept a higher level of investment in these and associated activities, like running proper drills. You should do this anyway, but in the case of "The Cloud" you are getting higher durability guarantees that do reduce risk vs your own spinning rust/SSDs and operator error.

All that being said I'm still on team "run your own gear after $200k/mo bills start showing up" but I don't like DHH painting such a one-sided picture of what this really entails.


Fully agree, but I am not sure $200k/month is high enough of a cost to start looking outside of the cloud. A Sr Engineer will probably be $300-400k per year and you need a few of those to do your own infra; $2.4m per year is not much of a budget to build a proper team with backups and 24/7 support, and then also pay for servers, networking, etc. Now, if you can afford downtime, fixing issues only during working hours, etc., you might have a case.
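The arithmetic is easy to sanity-check; the figures below just restate the numbers from the thread (the $200k/mo bill, $300-400k engineers), plus a placeholder for hardware, colo and transit spend:

    # Back-of-envelope break-even check using the thread's own figures.
    cloud_per_year = 200_000 * 12            # the "$200k/mo" threshold from the parent comment
    engineers = 4                            # assumption: a small infra team
    cost_per_engineer = 350_000              # midpoint of the $300-400k figure above
    hardware_colo_transit = 600_000          # placeholder: leases, colo, transit, spares

    self_hosted_per_year = engineers * cost_per_engineer + hardware_colo_transit
    print(f"cloud:       ${cloud_per_year:>9,}/yr")
    print(f"self-hosted: ${self_hosted_per_year:>9,}/yr")

With those assumptions the two land in the same ballpark, which is the point: at $200k/mo the savings are marginal once you price in the team.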


What's your opinion on old-school dedicated server hosting providers? They would handle all the hardware side for you, with an experience close to a virtual machine provider. No need to worry about the details of owning & running actual racks in a DC.


I worked for 7 years for a company that was mostly bare-metal rented hardware. The service provider handled half of this stuff, but not everything. Off the top of my head...

- Networking was still an issue. We didn't get much control over the network we were on; I think we were on a VLAN, but we weren't in control of it. We needed quite a robust firewall setup on all our machines. All machines were on the internet, again not something we could control.

- Hardware did become varied over time. We had to do things like creating a new database primary and using the old machine, which was now too slow, as a read replica. Eventually we found that the read replica was slower than just reading from the primary!

- One machine kept eating SSDs; they had a life expectancy of about 3 months in that machine. Neither we nor the provider ever figured out why. The replacement process was a hassle and required a lot of manual work on both ends (even though it was in a RAID array designed for hot-swappability).

- Once the provider gave us a machine with a BMC, but didn't tell us about the BMC or give us any credentials for it. The BMC was on the internet with default creds, it got owned, the machine got owned, and we were running a Monero miner for someone for a while. Thankfully no other damage, but that could have been company ending if we hadn't lucked out on a few details.

- The DC had bad peering with an ISP (could have been a customer ISP, or maybe our office, can't remember). That caused a bunch of hassle. Again, not our fault, but definitely our problem. Hard to get someone interested in fixing it.

- The provider got acquired by another bigger DC provider. All the experienced staff left, and service dropped off a cliff. This got us to finally move to the cloud.

- Hardware lead times are rarely less than a few days, often weeks or months. You have to capacity plan far out, and often make trade-offs like taking a spec you don't want (over-spec and more expensive, or under-spec and less useful) to get hardware when you need it.

We might have been able to get the provider to do more for us here, but anything they did operated at the level of email and someone on their end manually configuring things. It was error prone, it was slow, it was unpredictable. None of this is any criticism of the provider, they were good at what they did, but this is just the inevitable outcome of that model as far as I can tell.


As one of those "newer engineers", the thought of dealing with server racks and bundles of cables inside a huge & gloomy server room actually scares me. I really appreciate the guys who actually knew how to set up that HW, despite maybe being clueless about the latest, shinier things. Having the will to actually wrestle with messy stuff to make the life of other engineers better is commendable. Kudos to all those guys.


But the industry does not appreciate hardware guys.

Young DevOps is much more in demand, has a higher salary, and zero liability for their fuckups.


Didn’t they make a big hoopla about hosting Basecamp on one big self-hosted server years ago? I can’t find it, but I’m surprised to hear they were in the cloud at all.

Edit: This is what I was thinking of. It was from 2012, and it was just the RAM for the machine they were building for Basecamp Next's caching.

https://signalvnoise.com/posts/3090-basecamp-nexts-caching-h...


It's linked from the source blog post for the OP (linked there as 'divest entirely from cloud'):

https://world.hey.com/dhh/why-we-re-leaving-the-cloud-654b47...

So yes, I thought that too (and that perhaps this was old), but it turns out they said back then that they were going to do it; now they're saying they've done it.


Hmm, the post I’m thinking of was much older. I’ll have to keep looking.




