Hacker News
Unbundling AWS (tclauson.com)
331 points by taylorwc 8 days ago | 175 comments

One major thing this doesn't consider is the technical limitations, namely latency and bandwidth.

(1) You save a ton of money on bandwidth when you move data from AWS to AWS

(2) Your stack, in most cases, needs to be near each other to minimize latency. Databases get wrecked by this.

This is why cloud database providers often have to transparently show you which cloud you're launching on [1], which effectively means AWS is going to get a good share of it anyway. My uninformed guess is that EC2 & S3 are by far their biggest moneymakers, which is what unbundlers are going to target.

I'm all for the unbundling and will probably take part in some of it, but I don't think it will be that easy.

[1] https://www.cockroachlabs.com/product/cockroachcloud

> You save a ton of money on bandwidth when you move data from AWS to AWS

This is only because AWS grossly overcharges for bandwidth. If you move all services that have high bandwidth requirements to providers with reasonable prices you'll save a significant amount of money.

Within a single AZ. Inter-AZ traffic is charged. It's also frustrating that AWS infra doesn't optimize for same-AZ traffic (falling back to another AZ only when the local one is impaired). For example, a client in us-east-1a can hit the Aurora reader endpoint and be directed to an instance in 1b even though there's a healthy instance in 1a.
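One client-side workaround is to do the AZ-affinity routing yourself rather than relying on the reader endpoint. This is a sketch, not an AWS feature: it assumes you can enumerate the reader instances and their AZs (e.g. via the RDS DescribeDBInstances API); here they're just plain dicts, and all endpoint names are made up.

```python
# Hypothetical client-side AZ-affinity routing for Aurora readers.
# In reality the reader list would come from the RDS API; here it is
# hardcoded to show only the selection logic.

def pick_reader(readers, client_az):
    """Prefer a healthy reader in the client's AZ; fall back to any healthy one."""
    healthy = [r for r in readers if r["healthy"]]
    same_az = [r for r in healthy if r["az"] == client_az]
    chosen = same_az or healthy  # same-AZ first, then anything healthy
    if not chosen:
        raise RuntimeError("no healthy readers")
    return chosen[0]["endpoint"]

readers = [
    {"endpoint": "reader-1b.example.rds.amazonaws.com", "az": "us-east-1b", "healthy": True},
    {"endpoint": "reader-1a.example.rds.amazonaws.com", "az": "us-east-1a", "healthy": True},
]
# A client in us-east-1a gets the 1a reader, avoiding cross-AZ traffic.
print(pick_reader(readers, "us-east-1a"))
```

The trade-off is that you now own health checking and failover logic that the managed endpoint would otherwise handle for you.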

A consultant on Twitter recently posted how a client got a bill for more than $100,000 because they unwittingly had a multi-AZ Kubernetes setup and moved a ton of data that way.

Third parties like Snowflake get around this by having you pick where their service is hosted so that bandwidth and latency aren’t a concern.

Further, Snowflake is a good example of an unbundled service that has more capabilities than AWS Redshift (for instance, zero copy clones). They use AWS for infrastructure - the entire warehouse is stored on S3 - and value add on top of it.

If a company needs near-commodity software, as most companies do, it's clear that over the long term AWS will win on reliability, and probably on price, over a startup unbundling AWS.

So why choose differently?

Because AWS knows that many of their customers think like this and pursues ever more minimal Minimum Viable Products accordingly. Their quality is dismal and it stays that way over the long term.

AWS's killer feature has nothing to do with tech, it's the smooth billing process (techies choose, management pays and supervises). If you can put together a smooth process for paying for 3rd party software in your organization, you can unlock massive improvements in quality for a pittance.

There is nothing smooth about AWS's billing; it's notoriously byzantine, and so difficult to interpret that there are multiple third-party services whose entire offering is based on ingesting and parsing your AWS bill.

The value of AWS billing isn't that it's easy, it's that the spending decisions are in the hands of the right people so software development moves faster.

Yes, exactly. The billing tools are awful, have always been awful, and will always be awful, but bad tooling can be worked around much more readily than bad bureaucracy.

I would argue the killer feature is actually IAM: technically better than anything any other vendor offers, even if there are a million and one problems with it and not a single annoyance has been fixed since 2013.

I could see that.

I almost mentioned IAM in my post as a co-killer-feature, but I've only ever witnessed its power in AWS learning material / conferences, not being leveraged IRL, so I decided not to speculate. I'm more dev than ops, so even though I've seen a decent number of AWS environments, the fact that they have all used IAM in a clumsy, coarse manner doesn't really mean much.

Do you have a feel for how frequently ops manages to actually leverage the fine-grained power features in IAM?

Essentially every large scale environment I’ve seen (as an ex-core maintainer of Terraform, that is quite a few) either does an ok job of this or has projects to do better.

The issue I have with IAM is that it is not possible to be sufficiently fine-grained. For example, I cannot grant an instance permission to read its own tag values but not those of other instances, since the EC2 IAM API is stuck at the 85%-done state at which most AWS services eventually seem to plateau.
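To illustrate, here is the policy one would want to write, sketched as a Python dict (the account and instance IDs are made up). As of this writing, ec2:DescribeTags does not support resource-level permissions, so in practice the Resource element has to be "*" and a policy scoped like this cannot actually be enforced:

```python
import json

# The policy you'd *want*: let an instance read only its own tags.
# ec2:DescribeTags does not support resource-level permissions, so
# scoping Resource to one instance ARN like this does not work in
# practice -- it has to be "*". IDs below are placeholders.
desired_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "ec2:DescribeTags",
        "Resource": "arn:aws:ec2:us-east-1:123456789012:instance/i-0abc123def456",
    }],
}
print(json.dumps(desired_policy, indent=2))
```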

> AWS will win on reliability and probably price over an AWS startup

From my experience, Snowflake is both more performant AND cheaper than Redshift or other RDS options from AWS.

Bandwidth/latency was my first thought too. Still, for higher-level services, it is entirely possible to compete with AWS running on lower-level AWS infrastructure, rendering the network issues moot.

Good points.

I'd add that edge compute running cloud paradigms (code instead of config, automation, management abstractions) partially addresses these limitations for many use cases.

Costly to move the data off, but longer-term ROI for those orgs that are willing to make long-term decisions.

Meanwhile, as edge matures, greenfield apps should be edge-centric, rather than cloud-centric (doesn't mean they won't have cloud components...they will do the processing and storage where it best makes sense).

I thought the article was talking about avoiding cloud vendor lock-in by using software that keeps you portable between clouds. (Instead of software that dictates that you buy from a certain cloud provider.)

Couldn't you build your software stack in a cloud-portable way but, at any given time, still have 100% of it on whatever your current cloud provider is? And then switch to another cloud provider from time to time if/when the costs make it worthwhile.

Yes you can, but it’s very difficult and not worth the effort IMO.

The first problem is that you wind up using the least common denominator. You’re paying public cloud providers a premium, so this essentially equates to throwing away money.

The second is that there are some cases where what seems like the same thing isn’t, or best practice is wildly different between providers. Get in a room with an AWS expert and an Azure expert and talk about what an account is.

You raise a good point. Upon re-reading, portability to a different cloud is probably not the point. It's more likely about being on the cloud you're already on, but having the choice to take only parts of the platform instead of taking Amazon everything at every layer of the stack. So you're still on Amazon or whoever's cloud, but you may not use (for example) their database if you like some other database.

In other words, it's not about moving your entire stack off Amazon's cloud, it's about moving parts of your stack off Amazon's software even if all of it may still run inside Amazon's cloud.

>avoiding cloud vendor lock-in

You're simply moving from being locked in by AWS to being locked in by a bunch of much smaller cloud vendors.

1) Doesn't Amazon charge for ingress and egress transit, and isn't this unusual, and isn't it more expensive that way? I could have sworn I recently calculated AWS transit to be something like 20x the cost of that in a normal colo datacenter. Maybe there's something in AWS billing where this only comes into play when you cross certain technical boundaries, I don't know enough to say, but I'm thinking it would have to be something like that in order to "save a ton of money."

2) Contemporary site/app development renders the latency of individual requests irrelevant. The whole page is going to boop and bounce around for 20 seconds anyway, why make a big deal about it?

The web is a measurably worse experience now than it was 10 years ago (maybe even 5), so it's ironic that Goodhart's Law has led so many astray toward measuring smaller and smaller trees within a growing forest. "Well but we have to get the request for number of friends below 10ms" while the other 400 requests on the page are dilly-dallying and experiencing their own latencies. Then the CSS gets applied.

AWS egress prices are so insanely high that for bandwidth heavy apps I've had consulting clients where we cut their hosting costs by 90 percent by moving them off AWS. That's extreme, but their bandwidth costs are easily 5x-50x that of alternatives, so it doesn't take much egress before it dominates the cost.

If you don't mind me asking, which cloud provider did you move your client to? Or, moved them on-premise, you mean?

Depends on their need; I tended to move them to managed hosting, which used to reduce their hosting costs massively and reduce their devops costs. Hetzner for anyone with most traffic in Europe. DigitalOcean in some cases. Sometimes we just put caching proxies (external to AWS) in front of EC2.

I've managed racks for customers too, but managed hosting at Hetzner is now usually cost-effective vs. colo-hosting in London where I am. Since they also offer cloud services (though pretty basic) now, there's the option of mixing and matching.

Amazon consistently charges for egress only. Ingress is free.

Live and learn, that seems to be the exception. That sounds like it might be an accident though?

Their inter-AZ pricing is certainly no accident. I'm betting that for a large chunk of customers it nets them something on a par with egress.

Amazon and Azure both charge vastly higher rates for egress than colo datacenters, but I believe both charge only for egress and not ingress.

Well, almost every other AWS service is built on top of EC2/S3, so it's safe to say they drive all of AWS revenue either directly or indirectly. Some services probably use EC2 instance hours as their revenue measure.

I think the real moneymaker is that AWS keeps making its servers more efficient but doesn't lower its prices accordingly.

Isn’t AWS fairly widely reputed for having lowered its prices dozens of times since first launching? https://aws.amazon.com/blogs/apn/new-research-from-tso-logic...

Aren't there many providers with unmetered bandwidth in nearby datacenters?

AWS always charges for outgoing bandwidth. Unfortunately, they aren’t part of this: https://www.cloudflare.com/bandwidth-alliance/

All of the unmetered bandwidth offers I've seen come with low link speeds like 100Mbps or sometimes even 10Mbps.

Paying for use isn't awful, as long as you're not paying AWS list prices which are rather high.

All Hetzner servers are guaranteed 1 Gbit.

Yes: Hetzner supposedly pushes 324TB/month for something like $30/month, INCLUDING the server itself. So let's say $15/month for the bandwidth.

AWS is, let's say, $0.08 per GB, or roughly $26K/month for the same bandwidth.

And if you believe you can run your business on their $15/month network, then they would own the market. But oddly they peer with basically NO ONE of any quality, because THEY don't actually pay for the bandwidth either and just totally oversaturate their peering links.
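The back-of-envelope math above, spelled out (the $0.08/GB egress figure and the saturated 1 Gbit/s link are assumptions taken from this thread, not quoted prices):

```python
# Back-of-envelope: a fully saturated 1 Gbit/s link for a month,
# priced at an assumed AWS egress rate of $0.08/GB.
gbit_per_s = 1
seconds_per_month = 30 * 24 * 3600                  # 2,592,000 s
gb_per_month = gbit_per_s / 8 * seconds_per_month   # 324,000 GB = 324 TB

aws_egress_per_gb = 0.08                            # assumed $/GB
aws_monthly = gb_per_month * aws_egress_per_gb

print(f"{gb_per_month / 1000:.0f} TB/month")        # 324 TB/month
print(f"${aws_monthly:,.0f}/month on AWS")          # $25,920/month
```

So "about $26K/month" versus a flat-rate server in the tens of dollars, which is where the 90% savings stories come from.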



My experience is a bit different. Peering works pretty well to many providers, especially with Cloudflare. The big disadvantage of Hetzner is that they're located in Germany and Finland only, so not great if your customers sit in the US. But OVH Canada provides reasonable options for that.

Once you have significant data on AWS it costs you so much to transfer it you are stuck with them. Their data fees are insane, and so are their storage fees.

Also, AWS is slow. If I have a filer with 500TB of disk I should be getting 5-10 GiB/s reads, and burst writes should be that fast too. With EFS I get 200 MiB/s max. Likewise, EBS and ephemeral SSDs top out at 250 MiB/s, which is just abysmal.

AWS security is so complicated now, at the control plane layer, that hardly anyone understands it.

So if there was a competitor that gave real hardware performance and a simpler security validation at a reasonable price, they could win business. But I don't see Oracle/Azure/GCP etc doing that.

(I work for AWS. Opinions expressed here are my own and not necessarily those of my employer.)

> If I have a filer with 500TB of disk I should be getting 5-10 GiB/s reads

That kind of performance requires a 50-100 GbE SAN fabric. I'm unaware of any cloud provider that offers this yet, let alone an on-prem fabric that isn't extremely expensive. (Our customers can get close with the u-24tb1.metal instance type, which offers 28 Gbps dedicated EBS throughput.)

> Likewise EBS and ephemeral SSDs top out at 250 MiB/s, which is just abysmal.

There are plenty of EC2 instance types that offer 1750MiB/s of EBS throughput - see https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptim... for a list. You'll need to stripe some 1.3TB EBS volumes together to achieve it, but it's absolutely attainable. (I did just that when I worked at a well-known YC company to build their high-throughput Kafka clusters.)

As for instance storage, I just ran a fio test on an i3.large's NVMe SSD and it performed at 80k read IOPs, 311MiB/s sustained sequential read. These too can be striped for higher performance -- I was able to achieve nearly 24GiB/sec (at 644k IOPs) on an i3.16xlarge instance.
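The striping arithmetic behind the parent's numbers, assuming the ~250 MiB/s per-volume throughput cap mentioned upthread and the 1750 MiB/s instance-level EBS ceiling:

```python
import math

# Assumption: each EBS volume tops out around 250 MiB/s of throughput,
# so reaching an instance's 1750 MiB/s EBS ceiling means striping
# (RAID-0) several volumes together.
per_volume_mib_s = 250
instance_ceiling_mib_s = 1750

volumes_needed = math.ceil(instance_ceiling_mib_s / per_volume_mib_s)
print(volumes_needed)  # 7
```

That is why you end up with a handful of ~1.3TB volumes in a RAID-0 set rather than one large volume.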

Thanks for doing this. Many times in my career I've had to deal with long-standing assumptions about storage performance in cloud infrastructure without having done any testing.

It's trivially easy to spin up an instance or three and run FIO.

Thanks for the EBS tips, I'll give it a go with these instances. Still limited by the EFS speed though, which means multiple EFS volumes to shard writes, additional code to handle that etc. (If anyone knows how to use a VFS/fuse adapter to stack/union these into one volume please shout out.)

Since 40-100 GbE is extremely cheap now, I don't really see how the bandwidth can be so low. When we're building compute clusters it's multiple rails of 25/50/100 GbE (or another fabric like IB). Is Amazon using anemic 2x10 GbE LOM or something?

Relatedly, it's always mysterious debugging on AWS because you have to theorise about the underlying hardware and topology. Maybe there is a VIP level of CloudWatch where you get told how it works underneath...

> AWS security is so complicated now, at the control plane layer, that hardly anyone understands it.

YES! Here is an example: I wanted to use Elastic Container Registry and have Fargate pull from it. I had to figure out that ECR is actually implemented like a private S3 bucket, and my app needed read access to S3. Why should I be exposed to the implementation details as someone granting permissions to ECR?! Figuring this out is not trivial, either: the only way to get a clear error message is to run CloudTrail and wait 10-15 minutes for the error to be reported there, since there is a delay. There was also another instance where something needed permission to a weird internal S3 endpoint for gateway access on a VPC....

I have experience deploying to Fargate from ECR and we didn't need to configure any sort of S3 access to do it, certainly for the ExecutionRoleArn. ecr:BatchCheckLayerAvailability, ecr:BatchGetImage, and ecr:GetDownloadUrlForLayer on the ECR repos was sufficient.

I recently learned GCP GCR does this same thing with Cloud Storage permissions.

You miss why folks pick AWS I think.

When you say "get a filer with 500TB that can do 10 GiB/s," as a developer you might hear that as simple. Here's the reality:

Talk to the infra group, who will sit with a bunch of salespeople and go round and round, then get some quotes, then go back and forth with the dev group trying to really spec out what is needed, then get finance/accounting approvals, more questions and answers, then put the order in, then have salespeople apologize for this and that delay, then get the filer onsite, then a big to-do to install it, then find the gigabit Ethernet can't actually push 10 GiB/s, then have to find add-in cards for all the servers and do a big network redo to get some kind of speed to the actual servers. Then after the developers play with this a bit they decide to go a different route, and the dance starts again.

Contrast with this AWS. I can try out my flow with a few clicks on console.

Security is painful though - no question.

TBF that is a problem with your company (many companies and govt), not with non-cloud hardware. If you have good management practices then all the groups involved will share ownership of delivering the successful outcome (whatever you need your giant high-performance filer for), which also includes empowering the lead engineer to hire/train (which is often better IMHO) people with the requisite skills.

So what you are arguing is that many places have limited competency, and you can outsource the need for that to AWS -- at significant cost. I dealt with a fair number of turf guarding drongos so I'm sympathetic to this idea, but it remains true that you can get much more peak performance, and perf/$ from your own hardware.

In case my point wasn't clear - for many situations AWS is FASTER in terms of development, and once you've built your app on AWS it can be painful to switch to an on-prem solution and rebuild on the "easy" 10GB/s EBS. And by the time you are done dealing with add-in cards / driver bugs etc...

It sounds like the ease of using AWS means the technical staff capability has atrophied. If you're doing on prem all the time it isn't an issue. Of course you have to have the scale to have a tech staff, including not losing critical capability when someone leaves.

On prem cloud is still mayhem, VMWare is very very bad, Openstack is baroque. Moving anything from AWS or Azure to on prem cloud would be very frustrating.

> AWS security is so complicated now, at the control plane layer, that hardly anyone understands it.

IAM and security management are the worst parts of AWS. It's complex and unintuitive. I always second-guess my choices, wondering if I've left a gaping security hole somewhere.

AWS has an automated reasoning group working on tools to help answer the high level questions about complicated policy configurations. [0] has been submitted four times to Hacker News, and received a total of 4 points and 0 comments.

[0] https://aws.amazon.com/blogs/security/protect-sensitive-data...

[1] https://aws.amazon.com/security/provable-security/

I too have been wondering the same. Does anyone else see a need for some external service that examines resources/permissions and give a clear picture and ongoing monitoring for changes? Are there any services that already do this well?

What I'd like to see is more room for custom rules. You can't have a single policy that defines, per user, a whitelist of the actions they can perform on an EC2 instance.

> AWS security is so complicated now, at the control plane layer, that hardly anyone understands it.

Yep, and in the context of OP asking to pull out authN and ID, it's a lot clearer how to pop in an alternate DB than an alternate IAM. Even pulling out Cognito makes some sense, but not IAM. Notice OP didn't even mention authZ: because IAM roles and policies are so intertwined with every AWS service, there's another major disincentive to pulling anything out. It seems any pull-out solution would need to wrap IAM, such that it incurs treble complexity and performance hits.


What would you be willing to pay for "physical-like" performance?

The same prices that are common for physical hardware.


For the lazy among us, how much does Hetzner charge for 500TB of disk?

you can get 150TB for 310€/month: https://www.hetzner.com/dedicated-rootserver/sx292

2000 eur/mo

Physical-like prices?

What happens when the SSD in your colo server malfunctions? What happens when the (an) SSD attached to your cloud VM malfunctions? Why do you think those two setups should cost the same?

Because it happens at a low enough frequency that dealing with it at places I've managed physical racks is pretty much a rounding error relative to the hardware costs.

And AWS has other complexities, so the devops costs with AWS are different, but rarely lower than managing physical hardware.

I'd never pick AWS based on total cost - it's an expensive convenience that first starts to pay off in terms of cost once you're big enough to negotiate huge discounts.

If you can afford it, it's great, but a lot of people who use AWS have no idea how much of a premium they pay for it.

If you go with dedicated servers instead of colo, the provider will still take care of swapping faulty SSDs. Prices are similar to colo; the only downsides are that you're restricted to the hardware the vendor offers and can't guarantee that all servers are in the same rack. But cloud providers don't give you either of those things anyway.

Yes, which are mostly SRE salaries.

I like how disk throughput scales with size on Google Cloud. It's simple compared to dealing with the crazy multitude of options, striping, and provisioned IOPS AWS uses.

The analogy in the article is wrong. AWS is not a marketplace and does not enjoy network effects.

I.e., if developer A uses AWS, this does not affect developer B.

What AWS enjoys is economies of scale (huge capex), which should help reduce prices (though I'm not sure that's happening), and being first to market ("Nobody got fired for choosing IBM").

Basically the main value prop today, as I see it, is saving on operations costs (human cost) by offloading them to Amazon.

This will be solved by Kubernetes operators.

Moreover, most of the new computationally intensive workloads (e.g. IoT/AI) are better done on the edge.

There is more than one type of network effect. This one is called an "indirect network effect"; what you have written is a direct network effect.

As more developers start using AWS, there forms a large developer ecosystem around AWS tools and services, and therefore "developer B" will find it much easier and cheaper to use AWS, indirectly.

I don't think that indirect network effect is particularly strong in this case, though.

There are similar ecosystems around Azure and Google Cloud, and there are plenty of tools that support all 3 clouds equally.

I really don't think AWS is getting any kind of meaningful competitive advantage here these days. And if there's any at all, it's dwarfed by technical, financial, and strategic factors.

In BigCorp, which is ~half of all business, there is consistent pressure to standardize: cost, security, education, etc. There are all sorts of other levers too, like negotiated group deals -- at places like Netflix, AWS is effectively free, while any other vendor requires going through procurement. Competing with a negotiated loss leader - free - sucks. Azure benefits a LOT here due to other MS bundling.

I'm increasingly convinced all this means we are entering a telco-like monopoly era for Big Cloud software, starting at the infra layers and steadily moving up. I'd love a more savvy approach from Elizabeth Warren to be less 'break them up' to more 'these are the anti-competitive bundling violations.'

>there are plenty of tools that support all 3 clouds equally.

there are plenty of tools that support all 3 clouds, but most of those suffer from some sort of impedance mismatch and/or do not fully support the features which those clouds supply.

> there are plenty of tools that support all 3 clouds equally.

“Equally badly”, you forgot ;-)

Cloud agnostic tools really mean “lowest common denominator” which means you end up paying premiums but not using the features. Just say no to Terraform!

So they will not find it much cheaper. AWS is priced the same regardless.

They might enjoy knowledge sharing, but at the speed that AWS evolves, the half-life of such knowledge is very short.

If I would bet on anything it would be Kubernetes. At least my knowledge is the same across clouds/on-prem. And there is no gatekeeper.

Except the technical complexity. I work with some very experienced k8s devops engineers these days and they screw up catastrophically every so often because of the million switches and buttons on k8s.

Right. Kubernetes is supposed to be managed by machines not by humans.

So how does a k8s engineer retire a node? This recently fooked up our cluster and it came down to a DNS issue... I don't see how that is "managed by a machine" am I missing something?

>AWS is not a marketplace and does not enjoy network effects. I.e. if developer A uses AWS, this does not affect developer B.

It damn well does. If everybody else is using AWS it pays to use it too, because:

* The IBM/CYA effect - i.e. "nobody got fired for picking AWS".

* The supply of experienced users - it's easier to hire AWS skills than GCP skills.

You're definitely correct. I tried to cover this in the post:

> There are obvious differences between Craigslist and AWS. The most important is that Craigslist (and each of the category spawn) is a marketplace, and so has the powerful advantage of network effects. Another distinction is that AWS has relative cost advantages over its unbundlers when it comes to the fundamental components of infrastructure (compute, bandwidth, storage, etc.), and I can’t see a parallel to Craigslist. So it’s not a perfect analogy, but the premise of unbundling certain categories still holds.

Scale effects are obviously at work in AWS. My premise is basically that higher-level things may be advantaged over AWS offerings because their scale advantages dissipate the farther away you are from their 'primitives.' I don't deny that there are some holes in the logic and comparison.

> Aws is not a marketplace and does not enjoy network effects.

I won't quibble over the definition of "marketplace" but it most certainly does enjoy network effects. AWS is the LAST place I ever look to host apps and software, and yet I am almost always forced to use it because everyone I work with knows it. The result is that I use AWS even when I don't want to use AWS, because everyone else does.

I'm quite sure I am not alone in this.

But it does have network effects on several important dimensions. And if you think that k8s-as-a-service can replace AWS infra and reach, I'd like to understand how you think that is the case.

Don't forget economies of scope as well, where cross-product costs are decreased by a joint offering.

I am all for unbundling AWS, but I think it's different from Craigslist.

Craigslist is a consumer facing app, which is much easier to "unbundle" than something like AWS which faces enterprises. Even if some "unbundle wannabe" starts getting traction, I think Amazon will simply catch up by lowering the price as much as possible and putting more resources into improving the developer experience for the corresponding service.

But if anyone has some great insight, please share. I would love to see this "great unbundling of the AWS" happen.

A frequently overlooked aspect in enterprise is billing, accounting and supplier agreement overhead.

In enterprise, Joe developer can often add / use more AWS services and it just goes on the giant enterprise bill, no questions asked.

If you want to use a third party service, waay more work. Supplier assessment needs to be done (by the department that does that) looking at security, company stability, data sovereignty etc. Procurement get involved to negotiate supplier agreement with the vendor and your project needs a specific budget line item which might require a trip to accounting. To get signoff and justify your vendor selection you might also have to do a stupid internal evaluation thing / bakeoff (even though of course you know the thing you really want) where you build feature matrix and carefully adjust the rows so your preferred vendor gets the most ticks.

All this paperwork can take weeks or months.

While a lot of the AWS service offerings are sub-par compared to alternatives, the technical work of papering over this is usually less risky than dealing with all the internal departments which will delay your project if you try to use another vendor.

I think this is a big driver towards bundling being a win for them.

Heck, I work at a 20 person company and this is still generally true. No one is stopping me from using a new service, but using AWS means no new paperwork and no questions.

Actionable advice on this would be: If you're B2B, have an option for yearly billing via invoice.

Billing yearly via invoice vs a monthly direct debit means our finance people deal with payment, and just ask Engineering once a year if they should pay this weird bill they got. Billing via direct debit means I have to do paperwork every month. I don't enjoy things like paperwork. I do enjoy things like building a single customer MVP version of your service on AWS in a couple hours. You can guess which I'm going to choose if you don't offer yearly billing.

Seems a bit risky to wait a whole year to collect what's due. I don't believe I ever heard of yearly invoicing like this.

Of course, if your small company would like to pre-pay towards an approximation of your yearly invoice, I'm pretty sure some company would be accommodating.

I figured paying for a year up front was implied.

Not sure how that could work since most services are based on consumption

Consumption buying is mostly done through the AWS Marketplace. It's a good model for companies to try a new vendor, but AWS is going to take a big wet bite out of the vendor's bottom line. Buying licenses directly from vendors is going to benefit both sides if they have a known demand. You'll generally see an "all you can eat" license after a certain threshold.

>>> While a lot of the AWS service offerings are sub-par compared to alternatives

If you try to get some services in an enterprise, maybe a physical server or a database, you may easily find that it takes months of paperwork and waiting. Then it's poorly operated and broken half the time (why are they provisioning a hundred databases on a single shared host? Or using a shared NAS as storage?)

AWS services may not be perfect, but they're miles ahead of the shitty services you get in an enterprise.

It goes even further. Even if you are a third party, it makes sense to use the AWS marketplace to allow enterprises to consolidate billing. ACloudGuru offers its subscriptions on AWS just because it makes procurement easier.

Author here. You are completely right. There are a million ways the analogy breaks down: consumer vs enterprise, marketplace vs IaaS... It was more just that looking at the console jogged my memory in a visual way and I do believe there are axes on which AWS doesn't compete well in the long run, i.e., offerings where price is not the primary consideration.

Historically, AWS hasn't done this.

They haven't lowered prices on RDS, Elasticsearch, etc. There are far superior options available. The lock-in on billing explains the adoption of these services, but there a tremendous number of accounts of folks struggling to make any of the secondary services scale worth a damn.

AFAIK, I can't remember an instance where AWS raised prices. So if they were to undercut someone on price, they would not be able to raise it again, so I find this scenario unlikely to happen.

In the tech circles I hang out we think of AWS as the next operating system. In 10-15 years it will be mostly invisible to the application programmer due to abstractions built on top of it. This is a new frontier for startups.

But it's way more than an operating system, it's managed. K8s is a kit car, AWS is a Toyota with a warranty.

I agree that the future is distributed operating systems. I think we desperately need to load in the core features of a D-OS into Linux for it to stay relevant, or a competitive startup will create a tailored environment for programmers that will not even be recognizable as Linux, yet it will be used everywhere because of its ubiquity. (Docker, anyone?)

For example, Terraform is kind of garbage usability-wise, but it just works with every cloud provider feature you can think of; it's a no-brainer. If a D-OS also naturally bundled hooks for every single feature of every managed cloud provider, along with a simplified way to hook custom applications and bundle the whole thing, literally everyone would use it for all their apps and run it everywhere.

That is very true. It is an operating system now.

Operating systems get copied. Would be great (but unlikely) see full copies of AWS by other datacenters. But it's possible for certain subsets of it.

Possibly, but it's a dangerously fragile ecosystem where absurd concessions are the norm.

alternatively in 15 years there will be less than 0.01% workloads that a mainstream 2-way Epyc XXXX Server with 512 physical cores 64TB RAM and 10,000,000 IOPS will not handle so 90% of cloud services will become meaningless :).

> 90% of cloud services will become meaningless

As will that Epyc XXXX server, absent a well-peered and hardened services mesh for it to plug into.

Cloud shouldn’t be used for just someone else’s computer. It’s not the compute or the storage, it’s the network.

That's cloud ad copy :) >90% of cloud services are there to deal with workloads that need to span machine boundaries. The only thing you need for that Epyc XXXX server(s) is a decent colo service + Cloudflare. Cloud is the new mainframe, and with things being cyclical it might repeat its fate to a degree.

People said this about computers 15 years ago, too.

We've shown a remarkable ability to keep increasing the computational requirements for the shit we run on computers to keep up with the computers themselves.

Do not worry. Someone will promptly write an interpreter of Lisp in Python running on VM written in JavaScript.

You make me sad, but somehow, it is a truth.

Kind-of alluded to in the article, but if you’re going to unbundle AWS, you’re going to have to figure out how to get the bandwidth 0-rated. One of amazons biggest moats (imo) is that they can realize monumental cost savings on bandwidth while making it very pricy to other service providers. So if you need to architect bandwidth-hungry or “high scale” applications, going outside of AWS becomes way too expensive. Even privatelink tacks on bandwidth charges.
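To put rough numbers on the bandwidth point (the per-GB rates below are assumptions based on public list prices, not figures anyone in this thread has verified):

```python
# Back-of-the-envelope egress cost comparison.
# Both rates are assumptions: ~$0.09/GB is roughly AWS's internet egress
# list price at typical volumes; flat-rate hosts work out to a fraction of a cent.
AWS_EGRESS_PER_GB = 0.09
FLAT_RATE_PER_GB = 0.005

def monthly_egress_cost(terabytes, rate_per_gb):
    """Cost in USD of pushing `terabytes` out to the internet per month."""
    return terabytes * 1000 * rate_per_gb

tb_per_month = 50  # a modest media-heavy workload
aws_cost = monthly_egress_cost(tb_per_month, AWS_EGRESS_PER_GB)    # ~$4,500/mo
flat_cost = monthly_egress_cost(tb_per_month, FLAT_RATE_PER_GB)    # ~$250/mo
```

At that spread, egress alone can dwarf the compute bill, which is exactly why the 0-rating question matters for anyone trying to unbundle.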

You’d have to look at DB providers like Elastic to see how to do this, I think. But it’s going to involve a very complex and high touch deployment & support environment. This effectively “prices out” certain classes of applications as too expensive for value returned. Again I could be looking at this very wrong, and figuring out how to build profitable unbundled AWS services would be a great business, but it’s hard to see how to successfully execute.

You can do what snowflake did and actually build on top of AWS

Same is true for databricks and others as well

I think the Cloud is a result of a gradual move towards centralization of networked software, built by software developers living in countries where DSL and cable broadband was rolled out early. It was a bit like mainframe developers back in the day. I remember people discussing “fat clients” vs “thin clients”.

We forgot what it’s like to have dialup, and the innovative generation of decentralized software protocols (IRC, Usenet etc) faded into obscurity. Even Email and the Web are now centralized!

Well the companies won and now we have a feudal kingdom again where we are the data-serfs. We have been well-trained like pavlov’s dogs using notifications and rewards (likes, comments) to till the datafields and plant fresh crops, use our social capital to reshare articles etc. And you know that you’re a serf when that notification lights up on your master’s device and you pause talking to your girlfriend, wife or child to look at it and maybe act on it.

And of course governments love having a central place they can implement censorship and spy on everyone.

This has got to stop. It’s not just a technological topology problem, it’s become a sociological problem and even a problem of democracy and society. THE SOLUTION IS OPEN SOURCE SOFTWARE that will disrupt the landlords of the Web (Facebook, Google, Amazon etc.) as much as the Web once disrupted the landlords of the Internet (AOL, MSN, Compuserve).

The first HTTP browsers weren’t better than AOL at anything except one thing... ANYONE could permissionlessly host a web server.

What realistic alternatives do we have to Facebook or Google Docs today? Something anyone can easily host themselves, to share stuff with friends and collaborate easily? It would have to let people’s identity and contacts stay consistent across domains, without owning them and storing them on centralized servers the way Facebook does.

I know a few attempts. Diaspora, Solid, Mastodon and Matrix. I also liked Sandstorm a lot, and OwnCloud was ok. They have not reached mass adoption. Scuttlebutt is slightly better in that it can interoperate with everything. The biggest winner in the last few years has been the DID standard, although it solves one problem only.

There is also an innovative project called MaidSafe, which raised around $10M and has spent 12 years so far building this back-end. I have high hopes for it once it is released, and plan to make Qbix Platform interoperate with it. Check it out: https://maidsafe.net/

We will eventually replace the idea of a startup running a database that has to scale with an open source startup that releases software anyone can pay any hosting company to run, and that makes money via tokens. That was the dream of cryptocurrency. The first kind of startup extracts rents via a SaaS model and forcibly bundles infrastructure management (Slack, Salesforce, Twitter, Etsy, you name it). The other doesn’t care so much about cheaper “cloud” costs, because anyone can host it (Wordpress, Magento, Drupal etc.) and the hosting costs are minuscule for each individual client business. Wordpress, Magento, and Drupal are all valued in the billions as open source companies. Maybe people on HN should strive for this rather than “Zero to One”, “competition is for losers”, “build a monopoly and extract rents” Peter Thiel-style Facebook startups.

Do we need something like this? Or am I just ranting?

EDIT: why is it that whenever I post about this subject, it is heavily downvoted with no actual replies or explanations showing me why my position is wrong? Is there a coordinated effort to downvote this position without further engaging with it?

I generally agree, and I think there are often business and security cases to be made for why buying into the centralized cloud paradigm is bad or might have long-term hidden costs. The problem is that so many people hear everyone else push it and jump on the bandwagon, often due to professional peer pressure. Another factor is that, let's be honest, those of us who remember those days are getting older, and there are techies working in the industry who haven't known anything different.

A big fear of mine is that many of the younger hackers have lost, or were never trained in, that cypherpunk-phreak-hacktivist attitude that many of us had and learned on/with in our and the internet's younger years. I know they still exist, but the sheer number of users drowns that type out as a much smaller percentage of the net... yet it was largely that group that fought for the rights of users, and the major holdouts that influenced the patterns that made the internet what it is today. I suspect the increase in the types who downvote topics like you bring up is just that majority, middle-of-the-road, corporate-placation user type, and it has gotten noticeably worse over the last few years on HN.

You are missing the biggest key to everything: The network effect doesn't happen if people have to install software. FB and Twitter et al. are so pervasive because the 90% of the population that is tech illiterate can use them by just signing up. When you talk about things people can pay any host to run, you've already lost the argument. Aside from the need to install, now people also have to pay for hosting, which is a huge negative.

I disagree, because Wordpress powers 34% of all websites in the world. Surely that is an example of adoption? Showing it’s possible for non techies to own their own stuff?

What about the Web? Why did people install web servers instead of hosting on AOL? Because web Browsers existed, and let you consume that data. Couldn’t you likewise argue that no business would want to mess around with hosting their own website and a bevy of competing hosting providers would never displace AOL?

> I disagree, because Wordpress powers 34% of all websites in the world. Surely that is an example of adoption? Showing it’s possible for non techies to own their own stuff?

This is a very bad example, since most of them are insecure, out of date, and not actually run by the non-techies. Most people just contract it out once, to someone who said he could set up a Wordpress site for cheap, and then they just pay them or someone else to make any major changes. If it isn't that, it's another company, or Wordpress itself, hosting their instance; again, the software isn't being run by the "non-techie".

> Why did people install web servers instead of hosting on AOL?

I don't think a lot of people did, again, they paid someone else to host it. Shared web hosting was, and still is, a very large market.

That’s what I said. There are companies in the business of hosting, and they compete with each other, unlike Facebook’s monopoly on hosting Facebook.

People pay them to take care of hosting. But they have a choice.

> The network effect doesn't happen if people have to install software

Plenty of people installed the facebook, twitter, and instagram apps on their phone - and in many cases don't even use the web interfaces. Local software installation is not an issue when it's as streamlined as the current mobile experience.

yep, we sure need this stuff... small stuff is a lot easier to manage too. Also, while not very related, I love the SQRL login idea, which is also decentralized and also not in any final stage yet. It would allow kind-of anonymous login (no identifiable usernames), and could, for example, optionally open a page on desktop while you authenticate with your phone by scanning a QR code, avoiding keyloggers and potentially network sniffers. (The app shows info, so you can't fake the code either, and even then that would only hijack a single session, unlike passwords.)

I didn't down vote you but I did find your post shrill and confronting thanks to your use of crazy capitalization.

> THE SOLUTION IS OPEN SOURCE SOFTWARE that will disrupt the landlords of the Web (Facebook, Google, Amazon etc.) as much as the Web once disrupted the landlords of the Internet (AOL, MSN, Compuserve).

Using capitals like this implies to me that you are not making a reasoned argument, otherwise your points would stand on their own rather than BEING BLASTED INTO MY FACE.

I dislike that this meme from IRC became so commonly repeated. I did not agree with it 20 years ago when just a few people repeated it, and I do not agree with it now.

Capitals were wonderfully used by Pratchett as the voice of one of the characters in written text. And the late Christopher Lee did a wonderful interpretation of that character.

Cyrillic letters for example resemble capitals in that they have a constant height and are generally more square and if anything that doesn't make them more difficult to read, just different.

Actual shouting hurts my ears. It is about a thousand times more annoying than reading capital letters.

I think the real equivalent of shouting in a text media is the hated blink tag. With some bright neon colours. On black background. In a dark room.

I think it's not about readability but rather about context and the audience.

On HN all caps is uncommon and, I would venture, considered somewhat aggressive/rude. IMO HN is somewhat of a bastion of courtesy and reasoned debate, a rarity on the internet today. Long may it remain so.

Reasoned arguments can involve EMPHASIS. It’s only a choice of typography how to express it.

You are the one who asked about downvotes.

By all means use all bold, comic sans font, and flashing red text. But don't be surprised when people react negatively to it.

Most of the people on HN work for these monopolies. Of course you're downvoted.

> As an early stage investor, I’m hard-pressed to name any tectonic shifts that have had as much impact on startup formation.

Gross exaggeration. Most startups should worry less about 100% uptime and scaling than about sales. A dedicated server or two goes a long way, and it isn't/wasn't really that much harder.

It’s really crazy the myopia I see with posts like this. I’ve worked in the B2B space, mostly in healthcare. When you make a sale to a business that is outsourcing a major piece of its functionality to you, you can’t shrug and not worry about uptime. If you’re building Twitter it doesn’t matter.

Right, but at least where I work (Austria/Germany) if a healthcare provider is outsourcing their data to you, they won't allow you to use AWS anyway (data has to be on your servers, in the same country, run by a company owned in that same country). So even if you are in the situation where you have to worry about uptime, AWS isn't a solution.

AWS and Azure both partner with local companies to meet the data residency/sovereignty requirements, IDK about healthcare regulation though.

> It’s really crazy the myopia I see with posts like this.

I said "most startups", not critical medical care. I've dealt with startups for nearly two decades and most startups can begin operations without the cloud, and with the possibility of downtime.

Even in healthcare there are plenty of systems where varying levels of downtime are acceptable.

In any small B2B company where your business is based on a few big spenders, uptime is important. If you lose one customer you lose a noticeable share of revenue.

The assumption there is that if you're down, you'll lose the customer. But that depends on the industry. At least in one industry I've worked in (B2B accountancy and reporting software), the product can be down for a day and it doesn't matter. A few employees at the customer are using our software to produce monthly reports, and if they have to wait until the next day to produce the reports, they're annoyed but they can get on with other work in the meantime. I mean, for sure, it's not ideal, and you'll get emails to complain, but so far nobody has cancelled on us due to something like that. I'm not saying I like the product being down, but I can understand the CEO not wanting to pay extra to invest in redundant infrastructure, it's probably the correct business decision.

There were cloud provider outages that lasted a long time as well. They are far from frequent, but it's not like you are immune from it ever happening if you're using aws or gcp.

Anyway, I don't think uptime is the main reason to use a cloud provider. For me, the main reason is the ease of expanding. You can add and remove infrastructure quite easily. Of course, you can do the same in a datacenter, but in my experience it takes a bit more planning ahead than you need with a cloud provider.

The ElastiCache SLA is 99.9%. That's roughly 43 minutes of allowed downtime per month. Do you think that's hard to achieve given the tools we have today?
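The arithmetic behind that figure, for anyone checking:

```python
# Downtime permitted by an uptime SLA over a 30-day month.
def allowed_downtime_minutes(sla, days=30):
    return days * 24 * 60 * (1 - sla)

three_nines = allowed_downtime_minutes(0.999)    # ~43.2 minutes/month
four_nines = allowed_downtime_minutes(0.9999)    # ~4.3 minutes/month
```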

> It’s really crazy the myopia I see with posts like this.

The myopia comes from experience. Startups (which aren't solely B2B Healthcare) are more often doomed by lack of customers than by other factors.

I've never understood B2C. Why would anyone go to the trouble of starting a company without $1M in ARR commitments?

To me, that sounds like a death wish, but it worked for all the new high-tech consumer facing companies we read about every day!

My point is not that 100% uptime was a tectonic shift; it's that the ability to enter a credit card and click deploy is a tectonic shift.

This was doable well before the advent of what we now call the "cloud".

My first company hosted the product on a shared hosting server, this was back in 1999. Availability was immediate once you paid. Since the early 2000s, dedicated servers were often available within an hour of payment. VM-based hosting took only minutes to get started with. Credit cards were the usual mode of payment, but PayPal was also quite popular.

What I'm saying is that cloud hasn't dramatically changed life for startups in their earliest phase of existence.

For monolithic web apps, this is almost certainly true (in my experience), but the ability to rapidly turn d/c configurations, and instantly scale to match spot demand has been a game changer (for me)

> some of their offerings shift from being best-in-class to being very reliable and with a ‘just-ok-but-well-integrated’ user experience.

I would go further and say that some of their experiences are buggy and often non-functional. Ever tried to search for a service in ECS? The results load from the backend page by page, but the search is a frontend filter over whatever has been loaded so far. Meaning it will say your service does not exist when you're really only searching the current page of results.

Then there's the Elasticsearch fiasco... https://spun.io/2019/10/10/aws-elasticsearch-a-fundamentally...

There are many examples where basic AWS functionality is broken.

One missing point is the support for customers and partners. AWS and Azure provide an amazing support experience, which emulates and improves on what Cisco, HP, and IBM did in the enterprise space many years before: TAC, forums, certifications, a partner ecosystem. When you have an issue, someone will actually respond to it. I was interested in Google Cloud, but their support was non-existent; it seems their leadership came from the consumer world, where a person is just dumb and their product is always perfect. Specifically, I had an issue with Firebase for which I eventually found the answer on Stack Overflow (the app was down for a few hours because Google blacklisted my environment by mistake). Since my company relies on high availability, we needed to work with AWS on the design from day 0 and have a person available to respond to queries directly. It looks like with the new Google CEO that's changing, but they only care about large enterprises. With that in mind, AWS and Azure to me will become a binary option.

A 1GB hosted Redis instance is $40/mo on Azure (which I’m told is competitive with AWS/GCP), yet a “bare-metal” instance with 1GB of memory (on which you can run Redis yourself) is $5/mo.

Are all hosted infrastructure prices this inflated?

Backups, configuration and failover. Which traditionally meant a DB guy or an ops guy, who cost money. I've done it for multiple database technologies, and doing it right is not at all easy or cheap on time.

With that said, Kubernetes might fix this, because operators have access to more parts of the system through a generic API. With tech like KubeDB[1] and other operators[2] we might see these prices coming closer to bare metal.

[1] https://kubedb.com/ [2] https://github.com/zalando/postgres-operator
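For a sense of what the operator approach looks like, here is a minimal cluster spec for the Zalando operator, built as JSON so it stays stdlib-only (the field names follow the operator's documented examples as I recall them, so verify against the current docs before relying on this):

```python
import json

# Minimal postgres-operator cluster manifest: the operator watches for this
# custom resource and provisions a primary plus replica, with failover handled.
cluster = {
    "apiVersion": "acid.zalan.do/v1",
    "kind": "postgresql",
    "metadata": {"name": "acid-minimal-cluster"},
    "spec": {
        "teamId": "acid",
        "numberOfInstances": 2,           # one primary + one streaming replica
        "volume": {"size": "10Gi"},
        "postgresql": {"version": "12"},
    },
}

# kubectl accepts JSON as well as YAML: `kubectl apply -f cluster.json`
manifest = json.dumps(cluster, indent=2)
```

One declarative document replacing what used to be a DBA's runbook is the whole pitch.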

Yep, this is exactly it. By the way, even SQL Server can use Kubernetes for HA nowadays.[1]

[1] https://docs.microsoft.com/en-us/sql/linux/tutorial-sql-serv...

Does running Postgres in a container hurt performance?

Not inherently, but if you want to use provisioned storage then your storage driver[0] will have some cost (unless you use local storage).

Of course, if you're on AWS then you'd be paying the EBS tax either way.

[0]: https://kubernetes.io/docs/concepts/storage/storage-classes/...


It used to be unreliable. Is it not now?

I don't think so. We run a bunch of things in k8s / docker and you can achieve bare metal performance.

The idea with the hosted version of the service is that you're paying more but (in theory) you're also getting more. You can certainly host your own Redis but Azure's hosted offering probably automatically takes care of a bunch of "housekeeping" things that you would have to manage yourself otherwise, like provisioning, HA, scaling, backups, and automatic integration with other Azure services.

For a lot of companies, an extra $35 per month to save hours of work is a no-brainer.

A rudimentary price comparison tends to show AWS/Azure costs much higher than bare metal or VPS, sometimes an order of magnitude higher, but the reality is more nuanced.

Many people still prefer hosted services because (a) total cost of ownership can be much lower once you factor in the cost of a qualified engineer installing it, ensuring it's online 24/7/365, and keeping it up to date, and (b) comparisons tend to be apples-to-oranges, since many AWS services are charged on a per-transaction basis. With a VPS, it often appears a lot cheaper by comparison, but you're also paying for a lot of capacity (RAM, CPU, storage) that you're not using, and generally can't use most of the time because you need to reserve enough for peak periods.
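A crude break-even model makes point (a) concrete (the $100/hr fully-loaded engineer rate is an assumption for illustration, not a figure from the thread):

```python
# How many engineer-hours per month does the managed premium need to save
# before it pays for itself?
MANAGED_PREMIUM = 35.0   # e.g. $40/mo managed Redis minus a $5/mo VPS
ENGINEER_RATE = 100.0    # assumed fully-loaded hourly cost in USD

breakeven_hours = MANAGED_PREMIUM / ENGINEER_RATE  # 0.35 hours, ~21 min/month
```

If patching, backups, and monitoring eat more than about 21 minutes a month, the managed service wins on TCO at those assumed rates.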

The 1GB redis instance on hosted infrastructure typically comes with opinionated defaults - and tight integration with the hosting provider's other offerings - for security, monitoring, resiliency, support and license management. If you're only going to run one or two Redis instances, $500/year compared with the additional engineering time required to handle anything other than just tossing the box on the Internet starts to look pretty reasonable.

There is a cost to manage the instance: upgrading, monitoring, etc. Even though there is no SLA on the basic tier, I would imagine Azure does some maintenance and monitoring. Whether it's worth $35, though, I am not sure. Also, the basic tier does not have replicas; if you want replicas you have to pay $100 per month.

Not all of them, Digital Ocean starts at $15/mo for a 1GB Redis instance.

A lot of the value of AWS is its integration within itself. (even if it can be full of rough edges). I think the danger with unbundling is that your competing project needs to be clearly so much better that people are going to be willing to go through the hassle of configuring a lot of complicated networking to work with what they already have, along with separate billing etc. Not saying it can't be done, but it means the barrier to entry is really really high. You're not just convincing them your service is better, you're also convincing them that it's so much better that they should be willing to take on extra headaches for it.

If I were to compete with aws, I think the area I'd really go after them with is better kubernetes support. I've yet to see any cloud provider really do it well, to be honest (which isn't surprising, development on kubernetes moves so fast and it's so advanced, it's kind of amazing, but that does mean it's really hard to make it nice outside of a "works for a demo" kind of deal). Azure and GCP do kubernetes a little bit better, but I think if someone were to come in and say, like, I don't know, we can do kubernetes on bare metal so it's much faster than through a VM, and all our services are natively integrated, that would be a cool story.

The other area I might compete with AWS is in finding niches where organizations might be hesitant to build an operations department, but they really need cloud computing. So, for instance, maybe scientific computing or something like that. If you could make a really useful cloud that can be administered by someone who barely knows linux, that could be a thing.

Hm, I don’t think the craigslist analogy works.

For the same reason it’s difficult to argue there’s opportunity for “unbundling Facebook” by launching a new image sharing service, building a news aggregator to compete with newsfeed, building a new messaging app to compete with Messenger, etc.

The main reason I disagree with the premise is security / compliance requirements of large customers (a large percentage of the market). The largest purchasers of AWS-like services prefer to work with as few vendors as necessary to minimize the number of vendors they need to worry about during security audits, contracting, etc.

The other obvious problem is, similar to Facebook, AWS has a “network effect”, in that all AWS services live in the same physical data centers, which results in potentially competing services suffering from higher latency than “native AWS” services which might make it harder to compete.

Consumers did not have the same level of loyalty to Craigslist as business have to AWS.

LAMP was a simple stack. It was performant, and not bloated. That's why 1000 different datacenters could provide the same service. Are there such sub-bundles in AWS that are cheap enough to run without Amazon's scale? Do people even need that huge scaling capability, or is AWS just convenient?

I honestly don’t get this myself. I understand AWS for a fast-moving startup, but I’m seeing friends start hobby projects that will never need more than one $5/mo shared host, and pay $60/mo for worse performance.

Personally I use DigitalOcean for this sort of thing. But this is why AWS has Lightsail https://aws.amazon.com/lightsail/ to service that use case and then let people upgrade later to full AWS.

They're doing that to learn AWS!


> "selling AWS at a loss" is crisp shorthand for a lot of startups' business models!

It is usually easier to dump costs to the already hefty AWS bill than push a completely new service provider through procurement process.

How does the unbundling work if the services you are unbundling run on AWS?

I don't understand how Craigslist and AWS relate. Craigslist... is a... list, or a marketplace. AWS is literally like a Fortune 2000 business that is in itself fully vertically integrated, from hardware to software and network.

And then you have economies of scale, and good enough is the enemy of best. The barrier to entry in recreating Craigslist is less than a rounding error compared to recreating AWS or even any of its sub-components.

I don't have any numbers to back me up on this, so take this with the biggest grain of salt. I do think there is a market for a provider that sits underneath AWS or GCP, something like DigitalOcean. AWS and GCP are like enterprise products with millions of features being offered to all, while DO is a simple-to-use product that is scaling up its feature offering. It offers the essentials of cloud hosting with a better experience (comparatively speaking).

Or a Heroku with their own infrastructure (or cheaper pricing).

I made a similar argument [1][2] about tipping the cloud computing market from vertical to horizontal integration. One important aspect of this transformation is to maintain the feel of an integrated cloud provider, and not let the customer/end-user deal with the cost of heterogeneity [3].

[1] https://www.youtube.com/watch?v=GOssXrkNYxM

[2] https://speakerdeck.com/bassam/opening-up-the-cloud-with-cro...

[3] https://crossplane.io

There's definitely a market for unbundling services from Amazon, but the problem is that people forget what the actual killer features of AWS are: IAM and the Instance Metadata Service.

Can you elaborate? After your comment I started thinking I must not really get IAM.

IAM = dynamic role based access to and among all your things, through policy-as-code.

If you just Allow: *, you might not care. If you practice least privilege, this is hard to do another way.
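As a sketch of what least privilege looks like in practice, here's a policy document granting read-only access to a single bucket (the bucket name "my-app-data" is hypothetical; the document structure is standard IAM policy JSON):

```python
import json

# Least-privilege IAM policy: read objects from one bucket and list it,
# nothing else. Contrast with Action: "*", Resource: "*".
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::my-app-data",      # for ListBucket
                "arn:aws:s3:::my-app-data/*",    # for GetObject
            ],
        }
    ],
}

policy_json = json.dumps(policy)  # ready to attach to a role
```

Attach that to an EC2 instance role and the Instance Metadata Service hands the instance short-lived credentials scoped to exactly this, with no secrets baked into the AMI; that pairing is what's hard to replicate elsewhere.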

Small note: I wouldn't count the Serverless Framework as cloud agnostic. Yes, you can use the framework for all major clouds, but you have to adjust your project for each. Serverless does not abstract the events, function context, or other service APIs, which it would need to for it to be really cloud agnostic, i.e., able to deploy your project to different providers without changes.
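To illustrate (simplified handler shapes; the real event formats carry many more fields), the same "hello" function has a different contract on each provider, and porting that part is on you, not the framework:

```python
# AWS Lambda behind API Gateway: (event, context) in, dict with
# statusCode/body out.
def aws_handler(event, context):
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {"statusCode": 200, "body": "hello " + name}

# Google Cloud Functions (HTTP): a Flask-style request in, response body out.
# Same logic, different signature and a different way to read the query string.
def gcp_handler(request):
    name = request.args.get("name", "world")
    return "hello " + name
```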

Craigslist didn't try to improve their product, which left more opportunity for other startups than there would have been otherwise.

> There are obvious differences between Craigslist and AWS. The most important is that Craigslist (and each of the category spawn) is a marketplace, and so has the powerful advantage of network effects.

Isn't AWS also a marketplace?

No. It's only amazon selling services, not everyone. Amazon is a service provider.

This is... Kind of true, kind of false.

The AMI marketplace is probably the biggest and most long running example of a marketplace where others are selling services (software licenses/support), but privatelink allows people to make a SaaS and not only sell it on a marketplace on AWS, but also do so with an endpoint in your VPC so you don't have to go out over the public internet.

I have no idea how well utilized that sort of thing is, so in reality it might be similar to there not being an option in the first place, and I'm sure in general Amazon sells far far far more services themselves, but it is possible for people besides Amazon to sell services on AWS.

Most profits are obtained by bundling closed-source or costly infrastructure together with stuff that is cheap but inseparable from it, so you can extract rents.

That’s why open source unleashes an explosion in innovation.

author's focus on datadog & metrics tools as the place to innovate is a smart way in

in particular, declarative dashboards (from code) and declarative alerts (from code) would make my life a lot easier

feedback-based / ML alerting thresholds might also hit the spot -- this is an area where black box isn't safe enough and some innovation is needed

getting any piece of information from datadog or amzn / goog's in-house dashboards is like pulling teeth -- they're so slow and clunky
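for concreteness, here's the kind of thing I mean by declarative alerts -- the `Alert` type below is invented for illustration, not any vendor's actual API:

```python
from dataclasses import dataclass, field

# Hypothetical alerts-as-code spec: lives in git, gets code review,
# and a CI step would diff it against the monitoring vendor's state.
@dataclass
class Alert:
    name: str
    query: str            # metric query in the vendor's query language
    threshold: float
    window_minutes: int = 5
    notify: list = field(default_factory=list)

alerts = [
    Alert(
        name="api-latency-p99",
        query="p99:request.latency{service:api}",  # assumed query syntax
        threshold=500.0,  # milliseconds
        notify=["#oncall"],
    ),
]
```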

AWS is expensive for what it is. I really don't understand the hype machine around what they provide, since openstack has been around for a few years.

1. AWS offers the business advantage of buying 100+ services at once without going through procurement, invoicing, GDPR Data Processing Agreements, or on/offboarding for each, plus the same support, good security and uptime, and the credibility of keeping backward compatibility for decades.

There should be an open platform that eliminates that advantage.

2. EC2 and S3, when used properly, would be harder to beat. You can get reserved discounts, spot pricing, elasticity... Moving off them would require paying for bandwidth, which easily makes the playing field uneven.

3. The higher-layer services (CloudWatch, Elasticsearch, Cognito) have much higher margins, yet lack functionality and quality. They are much easier targets for disruption.

I've not come across the term JAMstack before, why doesn't this make sense for using with AWS as the author suggests?

AWS is best for big projects with lots of data processing in different ways that need scale, that's what all the features are for. JAMstack usually implies a simple text API which you could use on AWS AppSync for example but there are much simpler horizontal alternatives since a tiny websocket server can handle many 100,000s of users.

another point: you don't pay for services you don't use, so AWS is not "bundled" to begin with. you get to pick and choose. in such a model, i don't see any benefit in fragmentation tbh.
