The complexity of the cloud exists because the cloud vendors allow a user to do advanced things if the user understands how. Using AWS, GCP, and Azure as Infrastructure-as-a-Service (IaaS) means there's no easy mode.
If you want easy (or easier) mode, you'll have to use a Platform-as-a-Service (PaaS).
The major cloud vendors might have problems with quirky designs and poor documentation, but beyond that is necessary complexity.
You want a high-availability website that allows user-uploaded files and does asynchronous task processing? You're probably going to have to get familiar with servers, load balancers, queues, and object storage, at a minimum.
You want it all to be secure? You're going to have to configure network rules/firewalls and set up lots of access policies.
How many totally-different-yet-common sets of requirements for user uploads or task processing do you think there are that necessitate the ultimate flexibility and complexity? I suspect vendor lock-in is a more likely cause of the complexity.
> I suspect vendor lock-in is a more likely cause of the complexity.
I think people give large organizations credit for being mustache twirlingly evil when the collective consciousness that makes up AWS is simply not smart enough to be this evil. If AWS had the coordination to do this the product would be better.
It's much more likely that the complexity is the result of a huge number of teams working independently and integration complexity being 2^n. Like AWS had one good, transformative idea to make coordination easier, which is to be API-first, but that only forces superficial consistency.
AWS and Amazon for that matter didn't just happen. There absolutely is a lot of enterprise architecture and strategy built into them with the goal of capturing markets and extracting rent. Corporations exist for the exact reason that this is possible.
This seems seriously naive. It's not mustache-twirling evil, it's business, and a big part of business is building moats and preventing yourself from unintentionally filling them in. Businesses absolutely will maintain worse functionality if improving it can aid a competitor.
"You want a high-availability website allows user-uploaded files and does asynchronous task processing? You're probably going to have to get familiar with servers, load balancers, queues, and object storage, at a minimum."
Really? I disagree. I could probably build that with Rails and Heroku in an afternoon, after creating a single S3 bucket and an access key for presigned POST. AWS has "necessary complexity" in the same way a giant hole in your head improves your brain's cooling potential. (i.e. maybe, in some very rare cases, but you almost certainly don't need it)
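(For the curious, the presigned-POST piece really is tiny. A rough sketch, in Python/boto3 rather than Rails purely for illustration - the bucket name and size limit are made up:)

    import boto3  # bucket name and limits below are hypothetical

    s3 = boto3.client("s3")

    # The browser uploads straight to S3; the app server never touches the file.
    post = s3.generate_presigned_post(
        Bucket="my-uploads-bucket",
        Key="uploads/${filename}",  # S3 substitutes the client's filename
        Conditions=[["content-length-range", 0, 10 * 1024 * 1024]],  # cap at 10 MB
        ExpiresIn=3600,  # form is valid for one hour
    )
    # post["url"] and post["fields"] go into the HTML form the client submits.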
Are you kidding me? Amazon, on their biggest day of the entire year (Prime Day), reached peak traffic of approximately 290m requests per minute through CloudFront. I would bet that loading a single Amazon page uses much less than a minute and serves more than 10 requests per page load.
If you're not going to be as big as the entirety of Amazon you don't need to serve 20 million concurrent users. Ever.
All of these are basic requirements. Yet they're available only after you build AWS-specific proficiency. Why do I need an AWS setting to permit DB access when the DB already has that feature? I don't need the extra layer of complexity AWS puts on most things.
Tangent - their docs are abysmal. Written like a novel that I'm meant to cross-reference with their SDK.
I only partly agree; definitely not all the complexity there is necessary. Part is vendor lock-in, and another part is their own home-grown complexity from rewiring/wrapping their own stuff for reuse in different forms - complexity that maybe made sense at some point but grew too much, as almost everywhere.
> The complexity of the cloud exists because the cloud vendors allow a user to do advanced things if the user understands how.
The complexity of the cloud exists because it wasn't designed very well and it all grew reactively.
You look at AWS and it feels like things are getting tacked on because there's "demand" instead of thinking about what the platform should look like and building it out. Every service is built by a different team, and the teams don't talk to each other either. There's no consistency anywhere.
> If you want easy (or easier) mode, you'll have to use a Platform-as-a-Service (PaaS).
That line was blurred a long time ago, so how do you make this distinction? They all have PaaS features / services.
The "free lunch" was sold in a form of the hybrid cloud orchestrator. These pretended to make all clouds look the same, but were more shallow layers of abstraction that didn't add much value.
> When I started programming, I used Borland C++. It used to take about 100ms to compile and run a program on an IBM PC AT machine (TURBO ON). An average iteration cycle in the cloud takes minutes. Minutes! Sometimes dozens of minutes!
I'm a fast-feedback fan myself, and my weapons of choice in refuge from a dark decade of c++ are the Python notebook and Clojure REPL. With that as it is, the lurching tedium of cloud development (infrastructure especially) makes me want to pull my skin off.
What is so galling about it is that, for dev purposes, almost none of these SaaSes and cloud services are really so 'big' that they couldn't be run on a beefy local workstation for development. The galling reason that I have to wait N minutes for terraform or cdk or whatever to rebuild some junk and deploy it to a bunch of nigh un-remote-debuggerable-without-firewall-shenanigans lambdas and docker containers is commercial moat-keeping for the services.
At least Azure and GCP put some token effort into local emulators of their services. Anyone doing AWS work has to rely on the valiant but incomplete efforts of LocalStack if they want a fast and disposable way to test infra.
Yeah I miss the days when I used to run a web site for a niche crowd of about ten thousand users on my bedroom pc. Now I'm pushing buttons on heroku and getting grief for dyno overages.
I don't understand why there's so much negativity here. From my cursory perusal of the docs this looks like a simplified, vendor-agnostic re-imagining of something like CDK, with cool tooling including visualisations and out-of-the-box support for local dev. Where's the beef?
You think cloud is too expensive or unnecessary? Fair enough, this tool is not for you.
You think cloud infra is necessarily complex because you need to support <insert use case here>. You're right! This tool is not for you (yet?).
You don't need this because you already know <CDK / Terraform / whatever abstraction is already in your repertoire>? I agree, the juice is probably not worth the squeeze to learn yet another tool.
Are you approaching cloud for the first time or have been managing existing simple infra (buckets, queues, lambdas) via ClickOps and want to explore a feature constrained (hence easy to grok) Infrastructure as Code solution? Maybe give this a look.
While it's still early days, I suspect there will be many who will find this useful, and congratulate the authors for their efforts!
I just don't see this being true. Being "cloud agnostic" likely means it's an incredibly leaky abstraction where at the first sign of trouble, you're going to have to understand winglang + the specific provider API. Any IaC product requires you to intimately understand what it's actually doing if you care about security and performance. Just because it's a managed service doesn't mean you get to ignore all its implementation details, right?
All the cloud providers give you a function as a service, or a nosql database, or a file bucket: ignoring all the nuance as an agnostic is at a minimum leaving optimisation on the table and more likely dangerous and expensive, surely?
To add, CDK is quite distant from this - it's AWS proprietary, has a painfully slow testing cycle, lots of limitations & faults leak in due to being based on CloudFormation, and it has nothing like the compiler plugins, etc. Plus the stuff in https://docs.winglang.io/faq/why-a-language
The annoying thing about using cloud infra is finding that all your skills and knowledge have to be relearned N times over for N different cloud vendors for a huge array of their services, mostly to do basic things that you already know how to do anyway in traditional environments.
The fact they all offer similar but subtly different versions of every type of product and that cross-platform tools like Terraform etc have some ability to paper over these only makes it worse. (Your Google Cloud bucket is just like your S3 bucket, right? Until it's not). When I rant about platform independence people think I have a philosophical objection to lock-in, but it's really much more basic than that. I just don't have time to learn thousands of vendor-specific APIs, bugs, constraints etc on top of the perfectly good built up knowledge I have from 25 years of working with software systems already. I am busy using all that time and brainspace trying to keep up with the fundamental knowledge that is actually important.
I heard this described as the "200% problem": when you introduce a new layer to solve your annoyance with learning the layer below it, new users likely have to do 200% of the learning because now they have to learn your new layer and the layer below it because your layer is leaky. I first heard this from a great recent talk by the creator of Chef, Adam Jacob -- What if Infrastructure as Code never existed [1]. This applies to everything with an error / abstraction-leakage rate that is observable on human timescales, definitely including everything cloudy/devopsy/sysadminy. An example of an abstraction that lands on the other side of that definition is 'digital' computers, which are really analog underneath but present an abstraction with an error rate like 1e-15, which is well below the threshold where we suspend disbelief.
Really the best you can hope for with a layer on top of system configuration is to just reduce boilerplate in the lower layer.
A colleague once coined the term "horizontal abstractions" (referring to a lot of the java "enterprise" pomp and ceremony) referring to much the same effect (there's no pyramid of abstractions - just more layers of similar complexity).
I believe businesses like Microsoft, AWS, etc. all make their enterprise stuff super complicated as an excuse to sell support contracts. I remember trying to set up Windows Deployment Server, and the whole thing was very unreasonably complicated. I got it working, but I can see why businesses would pay for support contracts just so they have someone to call to walk them through all the esoteric steps, menus, options spread across the whole OS. The other side of it is, if you make your cloud offering super complex then you're pretty much locked into that platform because of sunk cost. Switching from Azure to AWS or whatever would be a huge undertaking.
That's really what frustrates me the most about modern technology. It's like tech and software developers love to rube-goldberg things, and everything is vastly more complicated than it needs to be. The thing is, tech people seem to like complexity for the sake of complexity. It gives them a fun [read: masochistic] puzzle to work on and makes them feel smart.
Heroku is great until you need to do something non-standard in your deployment, like compile node and configure a python virtualenv in the same deploy. Then you have to learn their Buildpack framework and that is a nightmare. It's a hugely complicated abstraction layer that, like one of the parent comments says, requires me to learn 200%. I have to learn all the steps to install/configure nodejs, then python, then how they decided to slap together a layer on top of it that made assumptions like you would only be doing one of these things.
While I don't disagree entirely, I feel like you're misrepresenting what they're offering. Their language is an attempt at simplifying orchestration and application code. The diagram you're showing only proves that point - that's an AWS diagram, and the code on the same page is what creates it:
    bring cloud;

    let queue = new cloud.Queue(timeout: 2m);
    let bucket = new cloud.Bucket();
    let counter = new cloud.Counter(initial: 100);

    queue.addConsumer(inflight (body: str): str => {
      let next = counter.inc();
      let key = "myfile-${next}.txt";
      bucket.put(key, body);
    });
If they can deliver on their promise, it's actually promising. (I haven't evaluated it; just taking things at face value)
There's likely a clever way to do that despite Wing's explanation, but it'll be a whole separate project.
Wing will pave the way and come up with a good abstraction or two, then become obsolete once general purpose languages with better ecosystems can do the same.
Of course not. There are some powerful options like CDK and others.
An additional goal seems to be to create a simplified language that doesn't have the surface area and dependency issues another language might have. While I understand the reasoning, I've been in tech long enough to know that usually doesn't work, and just creates another real world example of XKCD 927 (https://xkcd.com/927/)
Just configure *nix “user” like they’re containers, or queues, or buckets.
How many DSLs do we need for the single domain of managing electron state?
I don’t mean to say other abstractions are unnecessary, I mean for that realm of platform/sre/ops/sysadmin metrics, telemetry, observability. Really though why isn’t it just fork() all the way down and code thread counts reacts to eBPF? Jettison Docker, k8s. Just release “user” profiles that are dotfiles, namespaces, cgroups rules, and a git repo to clone.
Better yet just boot to something better that mimics CLI for the lulz but isn’t so fucking fickle
And why would I base my entire application on a framework by a tiny developer?
And honestly, ChatGPT is so well trained on AWS CLIs, Terraform, CloudFormation, the SDKs for Python and Node, I can throw most of my problems at it and it does well.
I’m not saying their work is bad. But every abstraction is leaky and it’s a lot easier to find someone who knows how to use the native SDKs for AWS/Azure/GCP than someone who knows an obscure framework that doesn’t cover everything.
So the top contributors of modern Linux are large corporations. cURL is not a framework
And isn’t that the ultimate in survivorship bias? How many other languages and frameworks would you have left you screwed if you jumped into whole hog in before they had popular uptake?
With all due respect this is a trivial and misleading point of view. Of course if you have the staff you could do all the things they do in the cloud, on-premises. If you had the staff. And the skills. And the money. And you wanted to spend your finite devops resources deploying and monitoring a data center.
Yes it’s possible to buy your own building, and your own DS3/OC3. And HVAC. And electrical. And backup generators for the HVAC. And the personnel to design and specify the racks and the hardware in them (all of the different configs you need). And to assemble and connect the equipment. And to maintain it when something breaks. And the network engineers to design your network and deploy and maintain it.
And do it again in a different place for geographic redundancy.
And, if you have any money and personnel left, then you can think about a virtualization infrastructure (because of course who would be stupid enough to buy VMware when you could build your own open source equivalent around HVM or whatever).
And now you’ve got like a tiny fraction of what the cloud can offer. And I guarantee you that the TCO is way higher than you expected and that your uptime is a 9 or two short of what a cloud provider would give you.
If you are running a single cloud-scale workload (Google Search or Dropbox or Outlook.com) then you probably can do better financially with your own optimized data center. But you almost certainly can’t beat cloud for heterogeneous workloads.
And the biggest benefit of all is savings in opportunity cost as your tech people can focus on your own unique business problems and leave the undifferentiated heavy lifting to others.
> Yes it’s possible to buy your own building, and your own DS3/OC3. And HVAC. And electrical.
> And do it again in a different place for geographic redundancy.
With all due respect this is a trivial and misleading point of view.
Most people who do cloud don't need or employ redundancy across machines; much less so in different regions. But they do have devops teams or programmers who are required to learn bespoke cloud dashboards and AWS products. Even though most of the time they could just ssh into a box and run nodejs in screen and be 99% of the way there. Cloud providers convinced everyone that it's really hard to run a computer program. And companies set money on fire because spending signals growth, especially to investors.
Literally everything you said is the opposite of how I've seen people use "cloud" in the real world. I don't know what universe you're living in where things are as wonderful as you proclaim but I wish I was in it because mine is a nightmare.
> Literally everything you said is the opposite of how I've seen people use "cloud" in the real world.
I’ve been working in SaaS since 2008 and in cloud providers since 2012. And the use case I see over and over is people who don’t think they need or want geographical redundancy, until they do, and then they want it yesterday. Typically they are running fine for months or years and then there is an outage - maybe an AZ or a network partition - and then all of a sudden they’re scrambling for failover. Cloud often (usually?) has higher availability than the infra they migrated off of, and they grow addicted to it while not wanting to pay the dev cost for true high availability.
> Cloud often (usually?) has higher availability than the infra they migrated off of
I've seen more region wide outages than just an AZ e.g. where AWS us-west-2 goes out and so it doesn't make a difference.
> and then they want it yesterday
Some great SaaS you've experienced. I've been at ones that say they want it yesterday but then realize it's too much work (already in the cloud) and just let it go until the next outage, when they complain again but yet again nothing gets done.
If you actually need so many 9's in the availability that it makes sense to invest in well tested robust failovers and resulting distributed-system application complexity (which most people don't test, they end up just paying for redundancy but getting maybe even reduced availability), you are in a very small minority and you will be highly skeptical of outsourcing this to a cloud vendor since their HA also goes tits up regularly. Witness how often cloud vendors have unexpected outages and read the post mortems.
Build your application to run on a few boxes with a slow (say 5 second) failover. They can have a single power supply running somewhere in the back of an office. If the power goes, oh well - one of the other nodes will carry on.
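Even something as dumb as client-side retry over a short host list covers a lot of cases. A toy sketch (hostnames are made up; uses the requests library):

    import requests  # hostnames below are made up

    HOSTS = ["http://box-a.internal:8080",
             "http://box-b.internal:8080",
             "http://box-c.internal:8080"]

    def fetch(path):
        # Try each box in turn; a dead node costs one 5-second timeout, then we move on.
        last_error = None
        for host in HOSTS:
            try:
                return requests.get(host + path, timeout=5)
            except requests.RequestException as exc:
                last_error = exc
        raise last_error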
Of course you’re right a VPC or a CoLo would be better. However that’s not what most people think of “cloud”
I don’t think you deserve a downvote but I can give you my take having worked at a lot of places that did their own infra as well as cloud.
You don’t need to physically rack servers, lots of systems integration vendors and remote hands will gladly put servers together for you. Most colos will gladly help you figure out connectivity. And there are lots of vendors, like Cisco, who will deliver a rack to your datacenter with virtualization software installed and everything, plug and play..
My point is there isn’t an either/or choice of using the cloud or building the universe from scratch, there are so many available options in-between. And while those options aren’t available conveniently behind an API and might require a few old school phone calls, you can save millions of dollars, get access to better performing hardware, have better control over data sovereignty, and 90% less lock-in if you choose to go down that road. It’s not for everyone but a /lot/ of workloads that people are running in the cloud can be done better elsewhere.
The person you're replying to is definitely overly flippant, but you've taken a sort of Gish gallop approach where you think if you list enough individual things that have to be done, that'll be overwhelming evidence that it's impossibly difficult. But the things you've listed aren't as hard as you want them to be on reasonable small business scales.
We are a company with 4 IT employees including myself, and two of us alone (both full-time programmers) handled our hybrid cloud migration. We rented a rack in a colocation facility. I learned how to design racks in a couple days and did the rack design myself. We bought servers from Dell and network equipment from Meraki. The colo facility found us an inexpensive contractor who racked and stacked everything to my design, and remote hands does any ongoing hardware maintenance. The other guy had an old, outdated CCNA and he designed the network. We got a fiber connection to AWS up and running for a hybrid cloud approach. All of this was very doable for a part-time two-man team with other job responsibilities and we're saving a ton of money for database and workstation hosting--big, expensive, totally static workloads. Perfect for on-prem. The ongoing savings vs. pure AWS exceeds my own salary.
It was clear from the outset that we could accomplish this. I wouldn't have signed us up for a boondoggle. Certainly, there are more demanding configurations where the complexity would be too high, but people act like on-prem is literally impossible without a team of dedicated staff in every case. It's not. It can be doable.
A recurring anti-pattern I have seen over and over is the app that is built and runs on a developer desktop, then becomes business critical and needs to scale and evolve, and eventually fails catastrophically. Maybe the SQL database design or storage system isn’t capable of handling the new I/O requirements. Maybe the creator quits and nobody else knows how it works. Whatever.
On-prem is like that. Yes, you have all the skills to originally stand it up. But you don’t know what you don’t know, and you make a bunch of resource trade-offs, usually by not implementing stuff that you’ll never need (until you do).
That was the point I was trying to make.
As I said though, the unique value of cloud is letting you focus on a business specific problem instead of reinventing wheels that have already been invented many times over.
As other a have pointed out, other benefits are scale-on-demand, pay only for what you use, and agility - if you have a great idea you don’t have to do a PO and wait months for a server.
How many more years do you suppose we need to run on-prem before you can accept that we understand our needs? It's been 10+. We've been running on-prem for a lot longer than we've been running in AWS. Again, there's this sort of paternalistic suggestion that people can't possibly understand how to run on-prem. But we've always been on-prem. We've always maintained a SQL Server cluster. AWS is the new thing.
AWS vs. on-prem is always a tradeoff. You have to look at the costs and benefits for your particular situation to decide which is best. We decided to go with both, because AWS has benefits for our dynamic workloads and on-prem has savings for our static workloads.
The cloud has the same problem but in a different color. One employee created and managed everything in AWS and then quits. You have the same problem that nobody really knows how to continue from that point. Why was it built that way? Why are the IAM rules like that? etc.
Yeah complexity doesn't go away, you just move it around. The trick is not standing up stuff, that's pretty straight forward with cloud or metal. The real talent lies in understanding that you need to pump the entropy somewhere so you should be aware of the trade-offs and make things that are well organized, explicit and have contingency plans for the future.
Exactly. It actually makes this problem worse because cloud APIs lower the skill level requirement to the point that people who don’t know what they’re doing can create a lot of unmanageable headache in a very short time span.
> On-prem is like that. Yes, you have all the skills to originally stand it up. But you don’t know what you don’t know, and you make a bunch of resource trade-offs, usually by not implementing stuff that you’ll never need (until you do).
But what you described sounds like a packaging / software distribution issue.
Like, someone writes a one off Python script or program to do a thing and a year later it doesn't work because the host machine is using a newer version of Python and the dependencies need to be reinstalled to the new site-packages and they didn't document if they used the package manager or a virtualenv and a pip requirements file or setup.py or whatever.
The "it works on my machine" thing isn't really a "cloud" thing? It doesn't really solve the issue of having a weird bespoke service that nobody understands. Even if it's so abstracted from a normal computer that it has some esoteric requirement like an OCI image to run software, if the Dockerfile/Containerfile or whatever that generates the image doesn't exist/work/make sense then you have the same problem.
> As I said though, the unique value of cloud is letting you focus on a business specific problem instead of reinventing wheels that have already been invented many times over.
Reinventing the wheel like with docker ansible terraform kubernetes nomad aws?
Recently I was asked to help a company receive out of office replies to their web service that sent mail from Amazon SES. The client was sending mail from app.foo.org (with MX SPF for amazon) and wanted to receive them to foo.org (MX and SPF for outlook). Setting Reply-To or some other headers to foo.org worked in testing but not in practice. I maneuvered the amazon product menagerie and set up SES to get notifications on out of office replies and that also worked in testing but not in fact. Even then it would not store a list or provide details in the dashboard about replies without further using lambda or SQS or something. Every deficiency in an amazon product is "solved" by another amazon product. You're swallowing a horse to swallow a fly. In the end I just added AWS to the foo.org SPF records along with outlook's and set the From header accordingly; way simpler, didn't need any more AWS products, and knowledge of DNS is more portable than knowledge of AWS. AWS is in the business of inventing wheels and trying to get you to stay in their wheel ecosystem.
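For the record, the end state was roughly one combined SPF record on foo.org, something like this (the include mechanisms shown are the standard Microsoft 365 and SES ones; the exact qualifier depends on your policy):

    foo.org.  TXT  "v=spf1 include:spf.protection.outlook.com include:amazonses.com -all"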
Not to contradict everything you're saying like you're wrong or something. I wonder what the circus is like for those of you who run it. Everything you say reads like high-level manager/sales engineer marketing talk from someone who spends all day in meetings. Not to say I'm an authority and that your voice is illegitimate; I'm just a resentful out of touch NEET waiting for the world to change to the point that I have nothing left to offer it.
I work in one of those pesky HIPAA companies. For us, all of our AWS infrastructure up to and including nifty shit like Cognito is HITRUST certified. We get to offload an ass of compliance headache, which is worth our 2-4x multiple, as the server overhead is far less than a team of compliance and security wonks.
> The cloud is renting someone else's computer at a way higher price than it would cost you to use your own.
This is a simplistic view that doesn't discuss any of the trade-offs inherent in the choice between running your own hardware and using a cloud service.
Yes, it's a higher price, but it allows you to stop paying for it when you stop needing it. You can scale up rapidly. You don't have to deal with buying, maintaining, or replacing hardware.
That's basically just the definition of the word "renting". If the owner is doing their job right, it's always more expensive than owning, but it still makes financial sense to rent in many scenarios.
until you realize you have to build a datacenter yourself. Or order purestorage appliances for 3 datacenters you have your hardware in. And then realize dc 1 does not physically have enough space to add more storage servers. But you need that third site. There is no convenient 4th “spare” site. Let alone 100 across the world.
Or that your business has to replace a router on one of the dcs and then you have to do all the work to ensure nothing goes down yourself. You cant blame anyone if it goes bad.
Then you realize how much work that is. The cloud is really convenient.
Source:
Always worked “in the cloud”. Current client is on premise(s) for solid reasons. A very unusual case though. Fun. Inconvenient. Makes you respect the big clouds even more.
> until you realize you have to build a datacenter yourself
Have you never heard of a colo? Rent 1-2 racks in one of those. And you probably won't need more than 1-2 racks because that's what Stack Overflow runs on.
And guess how long it takes me to set up an entire data center with Terraform on any of the three major cloud providers? (disclaimer: I work for one of them)?
It’s much less maintenance than my days of maintaining servers myself.
And not to mention half the reason I went to cloud was not that I didn’t want to deal with administering servers, I didn’t want to deal with server administrators.
When I was at the 60 person company where I got my start in “cloud”, I could experiment with different types of databases, scaling, and other technologies just by throwing something together and deleting the entire stack.
I worked for a company that aggregated publicly available health care provider data (ie no PII) for major health care providers. They used our APIs for their own websites and mobile apps.
When we got a new customer (ie large health care provider), our systems automatically scaled.
When a little worldwide pandemic happened in 2020 and our traffic spiked by 100%+, guess how long it took us to provision new servers.
Hint: we didn’t, everything just scaled by itself.
I compare that to the old days when it took us weeks to provision an MySQL server.
Managing infrastructure doesn’t provide a competitive advantage unless you’re something like Backblaze, DropBox or another company where your entire reason for existing is your infrastructure expertise.
> And guess how long it takes me to set up an entire data center with Terraform on any of the three major cloud providers? (disclaimer: I work for one of them)?
And the discussion is how much extra do you pay for it.
> Hint: we didn’t, everything just scaled by itself.
Again it's not free so what's the surprise? Are you surprised that you get water out of your tap? Hint: it just flows!
> I compare that to the old days when it took us weeks to provision an MySQL server.
Sounds like you've burnt in the past is all. So your on-prem is slow does not equal all on-prem is bad?
> Managing infrastructure doesn’t provide a competitive advantage
How do you know it doesn't? You've only looked at it from your use case and based on it making you happy and saving you time. Nothing to do with the business needs at all.
> How do you know it doesn't? You've only looked at it from your use case
So you didn’t see the rest of the paragraph that you snipped?
“unless you’re something like Backblaze, DropBox or another company where your entire reason for existing is your infrastructure expertise.”
> So your on-prem is slow does not equal all on-prem is bad?
How fast can you spin up a dozen VMs? A message bus? A scalable database with read replicas? An entire redundant data center in another region? A few terabytes of storage? A redis cluster? An ElasticSearch cluster? A CDN? A few load balancers?
The procurement process to get an extra server provision in a colo will by definition be slower than my deploying a CloudFormation stack.
> How fast can you spin up a dozen VMs? A message bus? A scalable database with read replicas? An entire redundant data center in another region? A few terabytes of storage? A redis cluster? An ElasticSearch cluster? A CDN? A few load balancers? The procurement process to get an extra server provision in a colo will by definition be slower than my deploying a CloudFormation stack.
Your examples here are just examples of situations where you basically need a cloud solution by definition. If these are your requirements, then yes obviously you should use cloud for it. That said, your points are a bit confusing. It's not an either-or. For situations like you're describing, you use cloud. For situations where you don't need to use cloud, you can consider something else like on-prem or colo or ...
You seem to have a (literally) extremist position where it's all cloud or nothing. It's not.
Well then I'm a little confused. You wrote this earlier which contradicted my post:
> Managing infrastructure doesn’t provide a competitive advantage unless you’re something like Backblaze, DropBox or another company where your entire reason for existing is your infrastructure expertise.
You don't need to be a company "where your entire reason for existing is your infrastructure expertise" in order for managing your own infrastructure to be a competitive advantage. Managing (some of) your own infrastructure can be a competitive advantage even if managing infrastructure is not your core competency or even your goal. It is a competitive advantage if the TCO is lower. It sometimes is.
But if you're now saying you agree with my statement, then I guess well we're in agreement.
But, really how often do you need to do that and what % of users really need to?
Also, once on the cloud some business management take so long to "approve" new expenses that in reality it may not really be feasible to do things fast enough for it to be a benefit.
I've quite often seen the need for 5-10 meetings or 2-3 written documents to get approval for 10 new VMs for developers or new servers for backups.
> But, really how often do you need to do that and what % of users really need to?
When testing something or you want to spin up your own isolated environment for yourself or for your team? Very often.
> Also, once on the cloud some business management take so long to "approve" new expenses that in reality it may not really be feasible to do things fast enough for it to be a benefit.
And that gets back to my other point about doing a “lift and shift”: if you don’t change your processes, both IT and technical, you won’t see any benefit from the cloud and you will end up spending more.
There are so many ways that you can both give developers freedom and still have the necessary guardrails. I’m speaking about AWS because that’s the one I know best (and where I work). But I’m sure there are equivalent services on other providers.
For instance you can have a vending machine type of setup where you allow department heads to set up non prod accounts with organization controlled service control policies. You can use a Service Catalog approach where you surface Terraform or CloudFormation defined products where the users can only provision infrastructure defined by their administrators. But they can do it themselves.
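For example, a minimal service control policy that keeps a non-prod OU pinned to one region is only a handful of lines (a sketch via boto3; the region and exempted services are illustrative, not a recommendation):

    import json
    import boto3  # the policy content below is illustrative only

    orgs = boto3.client("organizations")

    # Deny everything outside eu-west-1 for accounts attached to this policy,
    # except a few global services that have no region.
    orgs.create_policy(
        Name="nonprod-region-lock",
        Description="Keep non-prod accounts in eu-west-1",
        Type="SERVICE_CONTROL_POLICY",
        Content=json.dumps({
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Deny",
                "NotAction": ["iam:*", "organizations:*", "route53:*", "support:*"],
                "Resource": "*",
                "Condition": {"StringNotEquals": {"aws:RequestedRegion": ["eu-west-1"]}},
            }],
        }),
    )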
Depending on which level of the organization I’m working with, I try to convince the IT department to give individual departments their own organizational unit to monitor and to embed someone from IT into their team - ie a “DevOps” philosophy.
> And not to mention half the reason I went to cloud was not that I didn’t want to deal with administering servers, I didn’t want to deal with server administrators.
I bet it's true for many. I extrapolate from what I see in backend/frontend teams - they don't even deal with each other, let alone with system administrators.
Luckily [in the current project] devs don't have access to production and have only very limited access to the dev environment in terms of ssh/db endpoints.
At my n-2 job (2017-mid 2018), I was the dev lead when management decided to “move to the cloud”. They hired a bunch of “consultants” who were old school operations people who only knew how to do lift and shifts.
I didn’t know cloud from a hole in the wall. But the internal IT department treated AWS just like they did their Colo. I thought AWS was just a bunch of VMs and I treated it as such for a green field implementation.
I studied for the AWS Solution Architect certification just so I would know what I didn’t know and to be able to come up with some intelligent ideas for phase 2.
I ended up leaving that job and working for a startup. The CTO knew I had only theoretical knowledge of AWS. But I had good system design instincts and he liked my ideas. I was hired as a senior developer. But that rapidly morphed into a cloud architect role. I took advantage of AWS and all of its locked in goodness including moving everything to either Lambda and Fargate (serverless Docker).
I had admin rights to everything until I voluntarily gave myself the same constraints to production that everyone else had when we hired a couple of operation guys.
We scaled without any issues as the company grew and Covid happened - we worked in the healthcare industry.
Now I work for AWS. But I’ve done my share of managing servers since the mid 90s as part of my job. That’s a life I don’t ever want to go back to.
The difficulty with cloud is Joel's rule: "All non-trivial abstractions are leaky." Cloud services just abstract complexity, and eventually the abstraction breaks and you actually need to know some amount of Linux, networking, security, or distributed systems engineering to fix it.
The easiest way to not get bitten by this is to avoid the abstractions and keep it simple as long as possible. Most apps can probably do fine with a single beefy box and a local sqlite database - this can likely scale vertically indefinitely with Moore's law and still probably have less downtime than if you relied on all the fancy cloud technology.
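A minimal sketch of that single-box starting point, using nothing but the standard library (the file path and schema are made up):

    import sqlite3  # database path and schema below are made up

    # One beefy box, one local file: WAL mode lets readers and the writer overlap,
    # which is plenty for most apps before any "cloud architecture" is needed.
    conn = sqlite3.connect("/var/lib/myapp/app.db")
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uploads (id INTEGER PRIMARY KEY, key TEXT, created_at TEXT)"
    )
    conn.execute(
        "INSERT INTO uploads (key, created_at) VALUES (?, datetime('now'))",
        ("myfile-1.txt",),
    )
    conn.commit()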
What I don't get in these discussions is, why not just target containers and after that do the least amount you need to have your container online somewhere? At most you'd need to do Kubernetes if it gets complex enough, but by then you have manifests that would work on any cluster anyway, no matter which cloud you use. Am I being too naive?
All big tech platforms make their riches mostly this same way over the last ~40 year: solve 80-90% of the problem in a super simple, slick manner that is priced competitively. Then as customers build out the unique parts that match their needs (the last 10-20%), bleed them dry with features that are more expensive and establish lock-in.
Assuming you're being serious, the advantage of the cloud is scalability, avoiding managing hardware, reliable power, and reliable network.
You want to run your web app off your basement server? Go ahead, but if you have a blog post that hits the front page of HN, there's no way to scale up. If your home server has hardware failure then you're out of luck until you can get new hardware. Your home has a power outage or ISP outage? Your site is down.
If you can tolerate those things, great! I wish my employers and their customers would tolerate it.
> Assuming you're being serious, the advantage of the cloud is scalability, avoiding managing hardware, reliable power, and reliable network.
Or in reality, it's weird transient failures you can't debug, and unexpected bills for some asinine reason. And scalability sure seems like something to avoid until you actually need it (as in the demand on the site is large enough, not that your performance is so bad you can't handle traffic).
Obviously I can be more cavalier with my uptime, as it is a personal server.
My first job out of college was a hybrid computer operator /programmer for a company that ran a state lottery. This was the mid 90s.
They had a complete backup site with redundant servers, modems (for point of sales systems) and a redundant staff because the cost of being down was so high.
We never had to use the backup site the entire three years I was there.
Well, if you're serving reasonable amounts of text and not including a javascript kitchen sink. I mean yes, you probably should be serving libraries off an edge network, but I've seen stranger.
Then that goes out the window the moment you have any media on that Pi... ISP upload speed in the US is balls.
Oh, did I mention that most ISPs in the US don't allow hosting websites? So are you putting that Pi in a datacenter?
And, I forgot to mention, what happens with you piss off some bored troll and get DDOSed?
None of what you mentioned relates to "traffic to a blog post from HN front page". I am just saying "traffic from HN front page" is a poor example of something that needs "scale up".
Not disagreeing with the challenges you mentioned. Just that "traffic from HN front page" is so easy it is a poor example.
> I'm sorry, what's the advantage to the cloud again?
For-most-purposes-infinite on-demand resources, provided in a format where you don't have to pay for when you aren't using it. And yes, we all know that this does mean you pay a premium for it when you need it, and (good) cloud folks understand that you have to design for that reality.
If you don't need that advantage, that's fine, but it's silly to act like it doesn't exist. Find the point where the delta in variable costs outweighs additional capex and steer your solution to that point, make peace with inefficient spending, or get out. You don't have to be weird about either extreme position.
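A toy break-even calculation with completely made-up numbers, just to show the shape of that decision:

    # All numbers are hypothetical, purely to illustrate the capex-vs-variable-cost tradeoff.
    server_capex = 12_000      # buy and rack a box, amortized over its life
    server_monthly = 400       # colo rent, power, remote hands
    cloud_monthly = 1_500      # equivalent always-on cloud footprint

    months_to_break_even = server_capex / (cloud_monthly - server_monthly)
    print(f"Owning pays for itself after ~{months_to_break_even:.0f} months")  # ~11 months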
This matters when you're an op in an IRC channel and someone joins the channel, starts spamming racial slurs, so you ban them, and they respond by DDoSing you.
If I had a dollar for every time it happened, I'd have $2, which isn't a lot of money but it's pretty annoying that it's happened twice. The first time, they only ran the attack for a few minutes. The second time it happened, they ran it for over an hour. Releasing and Renewing my WAN IP didn't stop the attack, because I still got the same WAN IP. I had to call my ISP support and spend way too long on the phone trying to talk to someone who knew what I was asking for to get a new IP address before I was able to dodge the attack.
Using AWS, I can configure security groups so that my VM never even sees the packets from a DDoS if I've configured them to only allow connections from my home IP.
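Roughly like this (boto3 sketch; the group ID and home IP are placeholders):

    import boto3  # security group ID and home IP below are placeholders

    ec2 = boto3.client("ec2")

    # Only packets from my home IP ever reach the instance; everything else is
    # dropped before the VM sees it.
    ec2.authorize_security_group_ingress(
        GroupId="sg-0123456789abcdef0",
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 22,
            "ToPort": 22,
            "IpRanges": [{"CidrIp": "203.0.113.7/32", "Description": "home IP only"}],
        }],
    )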
I currently run a micro-saas product on a $4 a month namecheap server using the LAMP stack. It runs fast for all 34 companies, or 400+ employees, using it.
I've looked into moving it to Google cloud or AWS and it just seems daunting. Honestly, I use ftp, cpanel, and phpmyadmin.
Is there a way to get this product into the 'cloud' in case it grows, easily?
"Run your own container" offerings also scale to zero, all clouds have something to that effect and it doesn't require you to embedded your logic into their tar pit.
(Please take my opinion with a grain of salt, as I might be biased - I'm a founder at a startup that solves a very similar problem).
The cloud (and by "cloud" I mostly mean AWS) in general is indeed insanely complex. Not only is it complex and hard to use for dedicated and trained DevOps/Cloud experts, it's even more overwhelming for developers wanting to just deploy their simple apps.
This statement is in my opinion almost universally accepted - during our market research, we've interviewed ~150 DevOps/Cloud experts and ~250 developers that have been using AWS. Only ~2.5% of them have said that the complexity of AWS is not an issue for them.
That being said, I understand that AWS has to be complex by design. Not only does it offer ~250 different services, but the flexible/configurable way it's designed simply requires a lot of expertise and configuration. For example, the granularity and capabilities of AWS IAM are unparalleled. But it comes at a cost - the configurational and architectural complexity is just beyond what an average AWS user is willing to accept.
An alternative to the cloud complexity are the PaaS platforms (such as Heroku or Render). But they also have their disadvantages - mostly significantly increased costs, lower flexibility and far less supported use-cases.
At https://stacktape.com, we're developing an abstraction over AWS that is simple enough for any developer to use, yet allows you to configure/extend anything you might need for complex applications. Stacktape is like a PaaS platform that deploys applications/infrastructure to your own AWS account.
We believe that Stacktape offers the perfect mix of ease-of-use, productivity, cost-efficiency and flexibility. It can be used to deploy anything from side projects to complex enterprise applications.
I'll be very happy to hear your thoughts or to hear any feedback.
Applications developers often tell me about (to quote another post here) "the lurching tedium" of cloud infrastructure development.
Growing up in a datacenter, opening tickets and checking them weekly, hoping for the vendor to finally ship the right backplane; datacenter engineering used to take weeks, months, years. Waiting 30 seconds for terraform plan to check 120 resources which corresponds to thousands of pounds of metal and enough wattage to blow up the city I live in... doesn't seem too bad. That said, I understand where you javascript folks are coming from with your iteration loops, but still, you've got to understand: it's so easy now.
Leaky abstraction, sure. But it's always great to see innovation in cloud infra.
> It doesn't make sense that every time I want to execute code inside an AWS Lambda function, I have to understand that it needs to be bundled with tree-shaken dependencies, uploaded as a zip file to S3 and deployed through Terraform. Or that in order to be able to publish a message to SNS, my IAM policy must have a statement that allows the sns:Publish action on the topic's ARN. And does every developer need to understand what ARNs are at all?
The Terraform AWS provider is a very thin abstraction. If your needs are not too specific, there are probably a few higher level abstractions out there that you can use. This is one of the main reasons PaaS are so popular.
It’s also weird not to know about terraform modules. There are a lot of things you _can_ configure because there are a lot of things people need to configure but if you’re using something like https://github.com/terraform-aws-modules/terraform-aws-lambd... it’s only a couple of lines of config for a Lambda.
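And for what it's worth, the sns:Publish statement the article complains about is only a few lines however you express it - sketched here with boto3 instead of Terraform, with a made-up role name and topic ARN:

    import json
    import boto3  # role name and topic ARN below are made up

    iam = boto3.client("iam")

    # Attach an inline policy that lets the role publish to exactly one topic.
    iam.put_role_policy(
        RoleName="my-lambda-role",
        PolicyName="allow-sns-publish",
        PolicyDocument=json.dumps({
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Action": "sns:Publish",
                "Resource": "arn:aws:sns:us-east-1:123456789012:my-topic",
            }],
        }),
    )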
I am finding "cloud" to be a pleasant experience at the moment.
We are building a B2B service in Azure using Az Functions & Az SQL Database as the primary components. That's about it. We figured out you can "abuse" Az Functions to serve all manner of MVC-style web app (in addition to API-style apps) by using simple PHP-style templating code. Sprinkle in AAD authentication and B2B collaboration and you have a really powerful, secure/MFA auth solution without much suffering. Things like role enforcement is as simple as taking a dep on ClaimsPrincipal in the various functions.
The compliance offerings are really nice too. Turns out if you use the compliant services without involving a bunch of complicated 3rd party bullshit, you wind up with something that is also approximately compliant. For those of us in finance, this is a really important factor. 10 person startups don't have much bandwidth for auditing in-house stacks every year. If you do everything "the Azure way", it is feasible for you to grant your partners (or their auditors) access to your tenant for inspection and expect that they could find their own way around. If you do it "my way" you better be prepared to get pulled into every goddamn meeting.
I am starting to wonder if not all clouds are made equal anymore. We also have some footprint in AWS (we used to be 100% AWS), but it's really only for domain registration and S3 buckets these days. GCP doesn't even fly on my radar. I've only ever seen one of our partners using it.
Similar for Deno deploy and Cloudflare Workers. In the end, the layers you get on AWS are not the same for all providers. It varies a lot from platform to platform in terms of "cloud".
I'm working on a relatively complex cloud solution centered around AWS Lambda, SQS/SNS and DynamoDB with many different lambda endpoints and isolated databases. It works, but it's incredibly hard to test. The fortunate thing is there's a system in place to stand up an environment at the PR level, but even that takes almost an hour to test/build/deploy/test/deploy for the PR and every commit after the PR is made. Local runs are sorely lacking. And I can only imagine the number of environments with similar issues.
I've been playing with Cloudflare Pages/Workers and CockroachLabs (CockroachDB Cloud) on a personal/side project and it's quite a bit different. Still in the getting groundwork done and experimenting phase, but wanting to avoid the complexity of the work project while still providing enough scale to not have to worry too much about falling over under load.
Not every application needs to scale to hundreds of millions of users, and it all comes at a cost that may not be worth the price of entry. The platform at work is at well over 1.6 million in terms of the story/bug/feature numbers at this point... it's a big, complex system. But working in/on it feels a few steps more complex than it could/should be. It'll absolutely scale horizontally and is unlikely to fall over under load in any way... but is it really worth it, given the relatively slow turnaround, compared to something that could have still used Dynamo but kept the service layers in a more traditional monolith with simply more instances running?
I've noticed the frequency of K8s headlines has diminished. (Very) roughly two years ago you never saw a HN front page without one or more K8s headlines. I suspect it has saturated the market that it appeals to.
>To be honest, give me the developer experience of the 90s. I want to make a change, and I want to be able to test this change either interactively or through a unit test within milliseconds, and I want to do this while sitting in an airplane with no WiFi, okay? (we didn't have WiFi in the 90s).
I'm going to make an impossible request and ask that any readers ignore everything else they know about "crypto", but...this is one of the things that feels right in EVM development compared with normal cloud applications. Especially with frameworks like Foundry, unit tests for distributed applications run very quickly on local copies of the whole production environment. It's a lot more fun than anything which touches AWS.
Obviously, there are some major downsides (such as Ethereum being comparable to a late 70s microcomputer in computing power). But the model of a single well specified execution + storage protocol might be worth porting over to non-financialized cloud application development.
> But the model of a single well specified execution + storage protocol might be worth porting over to non-financialized cloud application development.
To a first approximation, it exists, and it's called Cloudflare Workers (and KV).
If I had to bet money, mine would be on the bet that Workers represents an early example of what will be in-the-main development in a decade.
“In existing languages, where there is no way to distinguish between multiple execution phases, it is impossible to naturally represent this idea that an object has methods that can only be executed from within a specific execution phase.”
This is not true. Several languages (Haskell, OCaml, F#, Scala, etc) allow you to define and use monads. Granted, monads are not something many developers know about … but it may make sense to learn about them before writing a new language.
Great point. This also reminds me of discussion I saw somewhere else about programming languages with execution capabilities support for security, the author saying that no languages supported it, and a commenter saying that this was basically the same as an effects system.
I'm personally a professional Haskell programmer and quite like it, but I think we are circling a core notion: There are many problems with having programming languages where any code can do literally anything at all, and being able to restrict it is extremely powerful.
Constraints liberate, liberties constrain. You can always loosen strictures, but once loosened, they are extremely hard to reintroduce.
Thanks, we'll rephrase to say that it is impossible in most languages, not all.
BTW, the company that is behind the project is called Monada (https://monada.co), so we've heard of monads :)
Would be happy to hear what else you think about the language
I hate to say this, but this is straight up rediscovering monads.
They have a notion of phases of execution which are different execution contexts with an ordering. And yes most languages don’t have a decent facility for expressing this or staged computation. Let alone a notion of computation phases that map to distributed systems state.
I would love something like networking-as-a-service. My ignorant ass does not understand the specifics. I would love the option to specify that service A should be able to call service B regardless of IP schemes, peering, firewalls, service discovery and 1000 other layers.
Tailscale [1] can do this. I've always had a limit of how much networking I can grok, and Tailscale basically lets me run a 90s-style LAN across clouds and my local network like they're all connected to the same switch and VLAN. No port forwarding, firewall rules, or subnet management.
It can be made even more secure with some relatively incompatible ACLs sprinkled in too.
The problem with your request is there's a million ways to do that, some much less secure than others - I could satisfy your request by putting service B's security group wide open to 0.0.0.0/0, but then every other possible service could also reach it.
Utilizing an infinite amount of computing resources is probably always going to be an inherently difficult thing. It's a task made more difficult by all the snake oil salesmen promising perfect easy solutions. The only workable solution is to find out what works for everyone else and, even if it doesn't fit your needs perfectly, figure out how to make it work for you. Never underestimate the power of strength in numbers. If the entire industry is utilizing a technology or paradigm you can either get on board or get left behind.
I don't understand what this solution brings when you already have Terraform. Also, linking your code with your infra code - who thought this was a good idea?
Last question: how does the simulator actually work? Is it one of those cases where it tries to emulate some high-level concept but then your prod code breaks because the simulator was 40% accurate?
About the simulator, it is a functional simulator to be able to test and interact with the business logic of the application.
There are other solutions, like LocalStack to simulate the non functional parts too.
What complexity are we talking about? If your app needs queues, databases or shared storage, it is because of how you designed it. Now just imagine you’re doing this yourself: install servers, os, software, configure integrations, do patches, upgrades, repair hardware. Would it be any simpler?
The comparison should be this vs. aws rather than this vs. bare metal … but I get your point.
Apps need some kind of persistent state. That requires some thought and annoyances to deal with however you do it. There is no leakproof abstraction. Takes 5s to retrieve the data? Get digging!
We decided to use DynamoDB as a cache for our expensive-to-compute statistics from the data warehouse so we can show them to users without putting a big read load on the data warehouse. DynamoDB scales really well right? And big statistics tables don't belong in the main app database. I'll build the statistics in our postgres data warehouse and use a little ruby script I wrote to push them from postgres to Dynamo with batch_write_item, shouldn't take long.
After spending a couple of days in terraform (I'm no infra expert) creating roles to assume, cross account permissions, modifying my script to assume those roles, figuring out the ECS run-task api syntax and some other things I'd rather forget about I kicked off the jobs to copy the data and left for the weekend. Sunday cost alert email put that thought out of my head: I just spent 8k USD on writing 2 billion rows to two tables because I misunderstood how Dynamo charges for write units. I thought I was going to spend a few hundred (still a lot, but our bill is big anyway) because I'm doing batch write requests of 25 items per call. But the dynamodb pricing doesn't care about API calls, it cares about rows written, or write capacity units, or something. OK, so how do we backfill all the historical data into Dynamo without it costing like a new car for two tables?
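For anyone who wants to avoid the same surprise, the gotcha in sketch form (Python/boto3 here rather than the Ruby script; table name and rows are made up):

    import boto3  # table name and sample rows are made up

    table = boto3.resource("dynamodb").Table("stats-cache")
    rows_from_postgres = [  # stand-in for the real query results
        {"company_id": "123", "stat_date": "2023-01-01", "clicks": 42},
    ]

    # batch_writer() packs items into batch_write_item calls of up to 25 items,
    # but DynamoDB bills per item written (1 WCU per KB, rounded up), not per
    # API call - so batching saves round trips, not write units.
    with table.batch_writer() as batch:
        for row in rows_from_postgres:
            batch.put_item(Item=row)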
Apparently you can create new dynamodb tables via imports from S3 (can't insert into existing tables though) for basically no money (the pricing is incomprehensible but numbers I can find look small).
Now I just need to write my statistics to line-delimited dynamodb-flavored json in S3 (statements dreamed up by the utterly deranged). You need to wrap each value in an object whose key is the DynamoDB type you want the value to have.
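One line of that file ends up looking something like this (attribute names invented; note that even the numbers are strings):

    {"Item": {"company_id": {"S": "123"}, "stat_date": {"S": "2023-01-01"}, "clicks": {"N": "42"}}}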
A little postgres view with some CTEs to create the dynamodb-json and use the aws_s3.query_export_to_s3 function in RDS Postgres and I had a few hundred GB of nice special-snowflake json in my S3 bucket. Neat!
But the bucket is in the analytics account, and I needed the dynamo tables in the prod and staging accounts.
More cross account permissions, more IAM. Now prod and staging can access the analytics bucket, cool!
But they aren't allowed to read the actual data because they don't have access to the KMS keys used to encrypt the data in the analytics bucket.
OK, I'll create a user managed KMS key in analytics and more IAM policies to allow prod and staging accounts to use them to decrypt the data. But the data I'm writing from RDS is still using the AWS managed key, even after I setup my aws_s3_bucket_server_side_encryption_configuration in terraform to use my own managed key.
Turns out writes from RDS to S3 always use the S3 managed key, no one cares about my aws_s3_bucket_server_side_encryption_configuration. "Currently, you can't export data (from RDS) to a bucket that's encrypted with a customer managed key.".
Great. So I need to manually (yes I could figure out the aws api call and script it I know) change the encryption settings of the files in S3 after they've been written by RDS to my own custom key. And now, 4 hours of un-abortable dynamodb import jobs later, I finally have my tables in prod and staging in DynamoDB.
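(For the record, the script is roughly a copy-in-place per object - bucket, prefix and key ARN being placeholders:)

    import boto3  # bucket, prefix and KMS key ARN are placeholders

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")

    # Re-encrypt each exported file under the customer-managed key by copying
    # it onto itself with new encryption settings.
    for page in paginator.paginate(Bucket="analytics-exports", Prefix="stats/"):
        for obj in page.get("Contents", []):
            s3.copy_object(
                Bucket="analytics-exports",
                Key=obj["Key"],
                CopySource={"Bucket": "analytics-exports", "Key": obj["Key"]},
                ServerSideEncryption="aws:kms",
                SSEKMSKeyId="arn:aws:kms:eu-west-1:123456789012:key/placeholder",
            )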
Now I just need to figure out the DynamoDB query language to actually read the data in the app. And how to mock that query language and the responses from dynamo.
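That last part, at least, is short (boto3 again; table and key names are hypothetical), and it can be exercised against a stubbed table or a local mock rather than the real thing:

    import boto3
    from boto3.dynamodb.conditions import Key  # key/attribute names are hypothetical

    table = boto3.resource("dynamodb").Table("stats-cache")

    # Fetch one company's stats for a date range via the partition/sort key.
    resp = table.query(
        KeyConditionExpression=Key("company_id").eq("123")
        & Key("stat_date").between("2023-01-01", "2023-01-31")
    )
    items = resp["Items"]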
lots of learning! also sounds like a bit of pre-existing aws state.
for round two, try:
- spinup a new subaccount to ensure you have total control of its state.
- data goes in s3 as jsonl/csv/parquet/etc.
- lambdas on cron manage ephemeral ec2 for when heavy lifting is needed for data ingress, egress, or aggregation.
- lambdas on http manage light lifting. grab an object from s3, do some stuff, return some subset or aggregate of the data.
- data granularity (size and layout in s3) depends on use case. think about the latency you want for light and heavy lifting, and test different lambda/ec2 sizes and their performance processing data in s3.
lambda is a supercomputer on demand billed by the millisecond.
ec2 spot is a cheaper supercomputer with better bandwidth on a 30 second delay billed by the second.
bandwidth to s3 is high and free within an AZ, for ec2 and lambda.
bandwidth is so high that you are almost always bottlenecked on [de]serialization, and then on data processing. switch to go, then maybe to c, for cpu work.
dynamodb is great, but unless you need compare-and-swap, it costs too much.
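a sketch of the "lambdas manage ephemeral ec2" piece (ami, instance type, and tags are placeholders):

    import boto3  # AMI, instance type, and tags below are placeholders

    ec2 = boto3.client("ec2")

    def handler(event, context):
        # Cron-triggered Lambda: launch a one-off spot box to do the heavy lifting,
        # tagged so another scheduled Lambda can reap it when it's done.
        ec2.run_instances(
            ImageId="ami-0123456789abcdef0",
            InstanceType="c6i.4xlarge",
            MinCount=1,
            MaxCount=1,
            InstanceMarketOptions={"MarketType": "spot",
                                   "SpotOptions": {"SpotInstanceType": "one-time"}},
            TagSpecifications=[{"ResourceType": "instance",
                                "Tags": [{"Key": "job", "Value": "nightly-aggregation"}]}],
        )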
Putting a network in the middle of your system, and making everything go through API requests (or RPC shudders) is almost never a good idea. If you have planet scale systems and the budget to hire enough bright engineers and still turn record breaking profits, sure. If not, you're just draining value from your bottom line to the vendor's
There's no free lunch.