I suspect the same is true for cloud. Real portability has real costs, and if you aren't incurring all of them up front and validating that you're doing the right things to make it work, then incurring part of them up front is probably just a form of premature optimization. At the end of the day, all else being equal, it's easier to port a smaller codebase to new dependencies than a larger one, and attempting to be platform-agnostic tends to result in more code as you have to write a lot of code that your platform would otherwise provide you.
Running on Lambda, one day you’ll get an email saying that we’re deprecating node version x.x so be sure to upgrade your app by June 27th when we pull the plug. Now you have to pull the team back together and make a bunch of changes to an old, working app just to meet some 3rd party’s arbitrary timeframe.
If you’re running node x.x on your own backend, you can choose to simply keep doing so for as long as you want, regardless of what version the cool kids are using these days.
That’s the issue I find myself up against more often when relying on Other People’s Infrastructure.
This way you have a good argument to take to management, and if you do it regularly, or even plan it ahead of time, it's usually not much work.
During a product planning meeting:
"Dear manager, for the next weeks/sprint the team needs X days to upgrade the software to version x.x.x, otherwise it will stop working"
Imagine a world where you didn't need to spend a whole week every year, per project, just keeping your existing software alive. Imagine not having to put off development of the stuff you want to build to accommodate technical debt introduced by 3rd parties.
That's the reality in Windows-land, at least. And I seem to remember it being like that in the past on the Unix side too.
But developers don't need to make any code changes or redeploy anything to mitigate those security issues. It all happens through patches on the server, 99% of which happen automatically via windows update.
So many open source hackers do not know the basic techniques for backwards compatibility (e.g. don't rename a function, just introduce a new one, leaving the old one available).
I'm spending very significant effort maintaining an OpenSSL wrapper because OpenSSL constantly removes / renames functions. I hoped to branch based on version number, but they even changed the name of the function that returns the version number.
And that's only one example; a lot of people make mistakes like this, costing their users huge effort.
And then there's the popular semantic versioning myth: that bumping the major version number when you change the API incompatibly is enough to save your clients from trouble.
I'd dispute this, or at least I think this doesn't capture the whole picture. Microsoft makes money with backwards compatibility and can afford to spend significant effort on the ever-growing burden of remaining backwards-compatible indefinitely. Open source volunteers are working with much more limited resources, and I think it comes down much more to intentional tradeoffs between ease of maintenance and maintaining backwards compatibility.
If you have a low single-digit number of long-term contributors, maybe the biggest priority to keep your project moving at all is to avoid scaring off new contributors or burning out old contributors, and that might require making frequent breaking changes to get rid of unnecessary complexity asap. Characterizing that as "they don't know that you can just introduce a new function" doesn't seem like it yields instructive insights.
The mistake here is that in 99% of cases backwards compatibility costs nothing - no effort, no complexity.
Given two choices of equal cost, the people breaking backwards compatibility are simply making the wrong one.
> maybe the biggest priority to keep your project moving at all
When you rename function SSLeay to OpenSSL_version_num, where are you moving? What does it give to your project?
Ok, if you like the new name so much, what prevents you from keeping the old symbol available?
unsigned long (*SSLeay)(void) = OpenSSL_version_num;  /* old symbol kept as an alias */
When developers do such things, they break other open source libraries, which in turn break other. It's a huge destructive effect on the ecosystem. It will take many man-days of work for the dependent systems to recover. And it may take years for the maintainers to find those free days to spend on recovery, and some projects will never recover (e.g. no active maintainer).
With a lift of a finger you can save humanity from significant pain and effort. If you decided to spend your efforts on open source, keeping backwards compatibility by making the right choice in a trivial situation will make your contribution an order of magnitude bigger and more efficient.
So, I believe people don't know what they are doing when they introduce breaking changes.
So please, just keep the old function name. It will be cheaper for you and for everyone.
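The same trick works in any language, not just C. Here's a minimal sketch in Python of keeping the old name alive as a thin deprecated wrapper; the function names are illustrative, not OpenSSL's actual bindings:

```python
import warnings

def openssl_version_num():
    """The new, preferred function (return value made up for illustration)."""
    return 0x1010107F

def ssleay():
    """The old name, kept as a one-line wrapper so existing callers survive."""
    warnings.warn("ssleay() is deprecated; use openssl_version_num()",
                  DeprecationWarning, stacklevel=2)
    return openssl_version_num()
```

Old callers keep working, new callers get nudged toward the new name, and the maintenance cost is a couple of lines.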
That's a good way of putting it, and it gets to a key difference between open source and proprietary software.
In the open source world where a million eyes make all bugs shallow, developer hours are thought of as free. So if you change something it's no big deal because all the developers using your thing can simply change their code to accommodate it. It doesn't matter how many devs or how many hours, since the total cost all works out to zero.
In the proprietary world, devs value their time in dollars. The reason they're using your thing is because it's saving them time. They paid good money because that's what your thing does. Save time. Get them shipped. As a vendor, you're smart enough to realize that if you introduce a change that stops saving your customers time or, worse, costs them time or, god forbid, un-ships their product, they'll do their own mental math and drop you for somebody who understands what they're selling.
In the end, all we're talking about here is the end product of this disconnect in mindset.
Every time I run an audit (which is monthly) I see at least a dozen advisories in NPM packages we use. Sure, some of them don't apply to our usage, and others can't really impact us, but occasionally there is one we should be concerned about.
We server admins can push buttons to upgrade, but that doesn't mean developer code will keep working.
Many developers live in this world where they think server admins will protect their app... But we're more likely to break things by forcing upgrades of your neglected packages.
I don't believe it. Most security issues are not just an implementation issue in the framework but an API that is fundamentally insecure and cannot be used safely. Most likely those developers' programs are rife with security issues that will never be fixed.
The discussion is about the difference between updates when there's a valid reason and updates that are imposed by cloud providers, nobody advocates sticking with old software versions.
In my experience that's what any policy that doesn't include staying up to date actually boils down to in practice. Auditing old versions is never going to be a priority for anyone, and any reason not to upgrade today is an even better reason not to upgrade tomorrow, so "understanding and mitigation" tends to actually become "leave it alone and hope it doesn't break".
You think about these things in layers. If X, then Y; and if Y, then Z; and if X, Y, and Z all fail, do we just accept that some problems are more expensive than they're worth, or get some kind of insurance?
Keeping stuff up to date is exactly the kind of "simple thing" that no amount of sophistry will replace; in practice it has a better cost/benefit ratio than any amount of "thinking holistically". Those who only keep their things up to date and do nothing else may be foolish, but those who don't keep their things up to date are even more foolish.
Right, so all deployed Active X based software magically became both secure and continued working as before after everyone installed the latest Windows patches?
The trivial patching only works for security issues due to implementation not design defects. If you have a design defect, your choice is typically either breaking working apps or usage patterns or breaking your users security. Microsoft has done both (e.g. Active X blocking, vs continued availability of CSV injection) and both have negatively affected millions.
If they're doing security patches and bug fixes it's a maintained codebase.
As opposed to:
2011: deploy website, turn on windows update
2011-2019: lead life as normal
2019: website is up and running, serving webpages, and not part of a botnet.
That's reality today, and if it helps to refer to it as "maintained", that's fine. The point is that it's preferable to the alternative.
Installing security patches for a Microsoft stack requires turning on windows update.
There's a BIG difference. Once you write your msft stack app, it's done. Microsoft apps written decades ago still work today with no code changes.
Is it reasonable to postpone the upgrade for later?
Example: the software uses python requests. A new version fixes CVE-2018-18074 about the Authorization header, but you're sure you don't use this header. Is it reasonable to upgrade a little bit later?
Or is it going to take less time/effort to upgrade each time?
Or is the code so trivial you can immediately make the decision to skip that patch?
There's no perfect answer - you have to decide what's reasonable for your teams.
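For the requests example, at least the triage step can be mechanized: flag any environment running a version older than the release carrying the fix (2.20.0, per the CVE-2018-18074 advisory). This is a naive sketch; a real check should use `packaging.version` rather than hand-rolled tuple comparison:

```python
def needs_upgrade(installed: str, fixed: str = "2.20.0") -> bool:
    # Naive dotted-version compare: "2.19.1" -> (2, 19, 1).
    # Breaks on pre-release suffixes; use packaging.version in real code.
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(installed) < as_tuple(fixed)
```

Whether you then upgrade now or schedule it for next sprint is the judgment call; knowing you're exposed shouldn't be.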
If your software runs on an unmaintained platform there won't be any security fixes, and that's why Amazon forces you to upgrade at some point.
I mean a few years ago there was a huge security flaw in Apache Struts, whose impact was big because it had been used in a lot of older applications - meaning a LOT of people had to be summoned to work on old codebases to fix this issue.
The problem isn't a changing runtime - even if you self-host it you should make sure to keep that updated regularly.
Mild disagreement: My philosophy is that you should choose technologies that aren't likely to introduce breaking changes in the future.
As an example, I have sites that were built using ASP.NET version 1.1 that have survived to this day with nothing more than Windows Update on the server and the occasional version bump in the project config when adding a feature that needed the latest and greatest.
Compare that to the poor soul who decided to build on top of React when it first came out, and has been rewarded by getting to rewrite his entire application four times in as many years.
To return to the point, rather than rewriting around breaking changes from Node x.x to Node y.a, I'd be shopping around for the LTS version of Node x that I could keep the thing running on without intervention from my team.
You are right; but in my experience those ASP applications also had security holes (CSRF etc) that were never patched. They ultimately either became botnets or faded away when the corp simply faded away.
A business that can't afford to pay for cleaning up its business applications is likely to be unable to pay for general upkeep as well. It is simply past the point of being a viable business and is either in limbo or in the grave!
See the "maintenance free" approach to software as a canary in the coal mine and run away as fast as possible.
If you work on any non-msft stack I know of, you're constantly updating code for any version upgrade, sometimes even a minor one.
(That's the PCI announcement in 2015 that despite everyone knowing about problems with TLS1.0, they would continue to allow it through 2018 because of all the companies who deferred their technical debt in the manner you seem to be advocating.)
I know it's hard to believe it's that simple, but it is.
1. Know what to do.
2. Have approval to do it.
3. Do it.
... which is to say, maintenance. The fact that maintenance is simple and/or easy doesn't mean it happens by itself.
Yes, there will always be _some_ maintenance. The point is it should be as simple and easy as possible.
Someone isn't getting the point, and I don't think I can make it any clearer.
If we made things harder to upgrade would that encourage more people to upgrade?
I know, I know... It's almost never the case.
What do you mean by your own backend? Your own physical hardware, your own rented hardware, your own EC2 box, your own Fargate container?
What if you get an office fire, a hardware failure, a network outage, a required security update, your third party company goes out of business, etc.?
There's no such thing as code that doesn't need to be maintained. Lambdas (and competitors) probably require the least maintenance of the lot.
GP is talking about unplanned maintenance, which is a huge problem in many industries, like air travel (any transportation, really), or software.
Until you get audited, and that raises a flag, and now you have to deal with it.
Using an old version of Node is just going to leave you with worse performance and potentially security holes.
-What would the security issue be on outdated lambda code?
-Wouldn’t the performance of the code be equal to the performance when you first deployed?
In software development, a lot of the new stuff that is supposed to solve our problems just shifts those problems somewhere else -- at best. At worst, it hides the problems or amplifies them. This isn't unique to software development, but it seems to be particularly pervasive because so much is invisible at the onset of use.
Best advice, be very suspicious when someone can't explain what the bad parts are.
In business you should never depend on just one provider. This provider could be the best in the world, now, because it has amazing management. But people die, or are hit by a bus, or get crazy after a divorce, or just retire, and they get replaced by their antithesis.
In my personal experience, real portability has great benefits of its own. The design is easier to understand and you constantly find bugs thanks to running your software on different architectures, compilers and so on.
At the end of the day, it is the quality of your team. If your team is bad, your software will be bad and vice versa.
Funny thing is, the fact alone that you create a provider-agnostic system can give you enough performance or cost penalties that you want to change providers in the first place.
You can't use provider specific optimizations, so the provider seems bad and you don't wanna use it.
In Django urls point to a function. Your request matches that url, that function gets run.
Lambda must have some similar sort of mapping of urls to functions, so what exactly are you saving with it? Ok, Django includes an ORM, but if you are using any sort of persistence you will need a database layer as well.
Can someone explain what all the fuss is about or what I am missing?
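For concreteness, the url-to-function mapping being described is roughly this (a toy sketch, not Django's actual implementation). With Lambda plus API Gateway, this table and the server process around it move into the platform's configuration; your functions stay the same:

```python
# A toy version of Django-style url-to-function dispatch.
routes = {}

def route(pattern):
    # Decorator that registers a handler function under a URL pattern.
    def register(func):
        routes[pattern] = func
        return func
    return register

@route("/health")
def health(request):
    return {"ok": True}

def dispatch(path, request=None):
    # The piece the framework (or the cloud platform) runs for you.
    return routes[path](request)
```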
And if you're not dealing with purely web requests, then they're very different. Most of my lambdas trigger off of Kinesis, SNS, and SQS events. Work gets sent to these queues/notification endpoints, and then the Lambda function does work based off the data received there, and scales to handle the amount of data automatically.
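An SQS-triggered handler ends up looking something like this sketch in Python; the batch shape (`Records`, `body`) is AWS's SQS event format, while `process` stands in for whatever the business logic is:

```python
import json

processed = []

def process(payload):
    # Stand-in for real business logic.
    processed.append(payload)

def handler(event, context):
    # SQS triggers deliver a batch of messages under event["Records"];
    # each record's "body" is the raw message string.
    for record in event["Records"]:
        process(json.loads(record["body"]))
    return {"batchSize": len(event["Records"])}
```

The scaling the comment describes is that AWS invokes this function with more concurrency as the queue deepens; none of that appears in the code.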
The fact that there’s three major cloud providers (who will at the very least compete on customer acquisition) is a point in favor of this, but it’s definitely an experiment. In my mind, this is as much about negotiating power as it is about technical tradeoffs.
Whether it's DB independence or cloud provider independence, from my experience it's cheaper to pay the cost of the migration when you know you want to migrate (even if it involves rewriting some parts) rather than paying it everyday by writing portable code.
Most of the time the portable code you write becomes obsolete before you want to port it.
Agreed, on principle and based on experience. This does require you have reasonable abstractions in place as otherwise you’ll end up refactoring before you can migrate. But that’s a good thing in any case.
Using this template.
I’ve been able to deploy the same code as a regular Node/Express app and a lambda with no code changes just by changing my CI/CD Pipeline slightly.
You can do the same with any supported language.
With all of the AWS services we depend on, our APIs are the easiest to transition.
And despite the dreams of techies, more than likely after a while you aren’t going to change your underlying infrastructure.
You are always locked into your infrastructure choices.
Connecting to AWS managed services (s3, kinesis, dynamodb, sns) doesn't have this overhead, so you can actually perform some task that involves reading/writing data.
Lambda is basically just glue code to connect AWS services together. It's not a general purpose platform. Think "IFTTT for AWS"
It’s a standard CRUD REST API used by our website.
The only thing slow is the infamous cold-start time when running within a VPC, because it has to create an ENI.
AWS pinky promised they were going to fix this soon.
If I put the node service and a database on the same box I'd get the same performance, and actually probably better since Amazon would still have them on separate physical hardware.
If you use open source interfaces, or even proprietary interfaces that are portable, it's easier to take your app with you to the new hosting provider.
The non-portable interfaces are the crux of the matter. If you could run lambda on Google, Azure or your own metal, folks wouldn't feel so locked-in.
But, I could still take advantage of hosted Aurora (MySQL/Postgres), DocumentDB (Mongo), ElasticCache (Memcached/Redis) or even Redshift (Postgres compatible interface) without any of the dreaded “lock-in”.
My position isn't don't use AWS as a hosting provider, it's that you ought to avoid being locked into a proprietary non-portable interface when possible.
Or the Twitter model - very bad architecture that always crashed, find “product market fit” and then get funding to fix any issues.
Or the company goes out of business, I put X years of AWS experience on my resume and make out like a bandit as an overpriced consultant.
I don’t see the downside....
If writing code to be able to move to a different cloud isn't considered lock-in, then nothing is since anyone can write code to do anything themselves.
There are two kinds of lock-in. One is high switching cost because no competitor does as good a job - this is good lock-in, and trying to avoid it just means you’re not building the thing optimally in the first place.
The other is high switching cost because of unique interface and implementation requirements that don’t add any value over a more interoperable standard. This is the kind that’s worth avoiding if you can.
"Connecting to AWS managed services (s3, kinesis, dynamodb, sns) doesn't have this overhead, so you can actually perform some task that involves reading/writing data."
That is due to network and colocation efficiencies. The overhead of managing such services yourself is another matter.
I’ve done stress testing by spinning up and tearing down multiple VMs, played with different-size databases, autoscaled read replicas for performance, ran a spot fleet, etc.
When you need things now you don’t have time to requisition hardware and get it sent to your colo.
And this doesn’t count all of the third party hosted services.
Aurora (Mysql) redundantly writes your data to six different storage devices across multiple availability zones. The read replicas read from the same disks. As soon as you bring up a read replica, the data is already there. You can’t do that with a standard Mysql read replica.
You connect to S3, and:
a) You can build an abstraction service if you care about vendor lock-in so much
b) It has an API that plenty of open source projects are compatible with (I believe Google's storage is compatible as well)
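Option (a) usually amounts to hiding the vendor behind a narrow interface your application codes against. A minimal sketch of the idea (class and method names are illustrative; an S3-backed implementation would wrap boto3 behind the same two methods):

```python
from abc import ABC, abstractmethod

class BlobStore(ABC):
    """The narrow seam the application depends on instead of a vendor SDK."""
    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...
    @abstractmethod
    def get(self, key: str) -> bytes: ...

class InMemoryStore(BlobStore):
    # Stand-in backend; an S3Store or GCSStore would implement
    # the same two methods against the vendor's client library.
    def __init__(self):
        self._blobs = {}
    def put(self, key, data):
        self._blobs[key] = data
    def get(self, key):
        return self._blobs[key]
```

Migration then means writing one new subclass rather than hunting vendor calls through the codebase, at the cost of forgoing vendor-specific features the interface doesn't expose.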
Maybe you use something like SQS or SNS. Bummer, those are gonna "lock you in". But I've personally migrated between queueing solutions before and it shouldn't be a big deal to do so.
It's really easy to avoid lock-in; lambda really doesn't make it any harder than EC2 at all.
As I said, at least in the cases of your database and your storage, being cloud-agnostic is trivial. Managed postgres is easy to migrate from, S3 shouldn't be hard to migrate from either.
Certainly lambda doesn't impact this too much.
> Serverless technology is only starting out still and I'm pretty sure 5 years from now Lambda won't be the go-to platform anyway. Plus honestly we've learned so much from the first big project on Lambda that writing the next one with all of that in mind will be pretty great (and agnostic).
I realize it isn't entirely on-topic, but could you elaborate? I'm curious to hear more about your opinion on this, I'm not sure what the future of Serverless is.
Heck, consultants get paid lots of money just to do a lift and shift and migrate a bunch of VMWare images from on prem to AWS.
Yes, you can build an abstraction layer. And maintain it. And hope that you don't get feature divergence underneath it.
That's really, really expensive.
I’ve had to explain to a CTO before why I had my team spending time on a CI/CD pipeline. Even now that I have a CTO whose idea of “writing requirements” is throwing together a Python proof of concept script and playing with Athena (writing Sql against a large CSV file stored in S3), I still better be able to articulate business value for any technological tangents I am going on.
Any time spent on work that doesn't directly create value for customers is work that the business should be weighing in on. I'm not saying that you should never spend any time doing anything else - but these are trade-offs that the product manager should be involved in, and one of their primary jobs is being able to weigh the technical and business realities and figure out where resources should be going.
My only point is that vendor lock-in is not a significant issue on AWS, and that it requires virtually no effort to avoid it.
Of course it requires effort. A lot of effort, not to mention headcount. The entire value of cloud-managed services is what they save you vs. the trade-offs, and it's disingenuous to pretend that's not the case.
There are tradeoffs, of course. Cost at scale is the really big one - at some point it's cheaper to bring ops/ hardware in-house.
I just don't agree that lock-in is a huge issue, and I really disagree with the idea that lambdas make lock-in harder.
- DBA's & DevOps
- Procurement management & spare parts
- Colocation w/multihoming
- Leasing agreements
- Paying for power usage
- Disaster recovery plan
- CapEx & depreciation
- Uncomfortable meetings with my CFO explaining why things are expensive
- Hardware failure
- Scaling up/out
Not even worth going on because the point is obvious. Going "all in" reduces cost and allows more time to be focused on revenue-generating work. The "migration" boogeyman is just that, something we tell other programmers to scare them around the campfire. You're going to be hard-pressed finding horror stories of companies in "cloud lock-in" that isn't a consultant trying to sell you something.
> at some point it's cheaper to bring ops/ hardware in-house.
It depends. It's not always a scale issue, and as with all things it starts with a model and collaboration with your finance team.
Using a company that bypasses Amazon for 99.999% of its traffic isn't exactly an Amazon success story.
Everything was built on Amazon and video is largely hosted on S3. Yes, there’s a large CDN in the mix too. That doesn’t take away from the achievement.
Except for some proprietary light AWS proxy code, the bulk of the Lambdas delegate to pre-existing Java POJO classes.
The cold start issues and VPC configuration were a painful learning curve, but nothing I would consider proprietary to AWS. Those are universal deployment tasks.
This is false. I've seen entire Lambda APIs backed by MySQL on large, consumer-facing apps and websites. As another poster pointed out, the cold-start-in-a-VPC is a major PITA, but it can (mostly) be worked around.
Or is it when deploying new code?
"Why are you throwing rocks at that machine?"
"It makes it respond more quickly to actual client requests. Sometimes."
"Well, most of the time."
"Why's that? What's causing the initial latency?"
"Yeah, but what's that mean?"
"The machine is booting or caching runtime data or, you know, warming up and serverless. Anyway, don't think about it too much, just trust me on this rock thing. Speaking of which, I got to get back to bastardizing our core logic. Eddie had great results with his new exponential backoff rock toss routine and I'm thinking of combining that with a graphql random parameter generator library that Ted said just went alpha this afternoon."
Also, this conversation is very different if you are an early stage company racing to market as opposed to an established organization where P&L often takes precedence over R&D.
Because so much had been invested (sunk cost fallacy), no-one could really get their heads around a shift to SES, even though it would have been a slam dunk in improved reliability and developer productivity.
Whereas if we were on, say, Mailgun, and then someone wanted a shift to SES, that debate could probably have been a more rational one.
I just point this out to say that investing in your own infrastructure can be a very real form of lock-in itself.
For small companies, you may not be able to afford infrastructure people, and moving fast makes way more sense. There's little point in paying for an ops person when you have very little infrastructure.
At a certain scale though, AWS stops being cost effective. You begin to have room in your budget for ops people, you get room to afford datacenter costs, and you can start paying for a cloud architect to fill out internal or hybrid cloud offerings using openshift or openstack.
It's all about the right tool for the right job.
I know for a fact GE is moving a lot of their workload to AWS.
Yeah, Netflix's opinion is to use Amazon as little as possible. Their critical infrastructure (the CDN) is not anywhere near the slimy grip of AWS.
This is simply untrue. Everything but their CDN uses AWS.
>Their critical infrastructure (the CDN) is not anywhere near the slimy grip of AWS.
The streaming website and app aren't critical infrastructure? Databases containing all of their business and customer details aren't critical infrastructure? Encoding content so it can be delivered by the CDN isn't critical infrastructure?
That's like saying I don't trust Ford because I buy Michelin tires while I drive a Fusion.
Netflix does presentations every year at ReInvent about how they use AWS and they have a ton of open source tooling they wrote specifically tied to AWS.
This ease of use philosophy goes way back to the one-click patent. If I want DNS, why wouldn't I go to Amazon, which has all my finance details and a decent interface (and even API), rather than choosing another DNS provider, setting up my credit card, and having to maintain an account with them? So I choose DNS via AWS. Then I want a VPS, but why go to Linode and have the overhead of another account when I could use Lightsail instead?
If you use provider agnostic solutions, things get expensive quick.
Stuff like SNS and Cognito is much cheaper in terms of TCO.
If you don't use them you can switch providers more easily, but if you use them you wouldn't need to switch providers.
On top of that. I lose the “easy button” of depending on our AWS business support contract if something gets wonky.
Can he do all that across multiple geographically dispersed redundant data centers?
Having relatively easy to spin up alternatives is a great thing. I can run my application entirely on a local kubernetes cluster or one on Amazon, DigitalOcean or Google's cloud services. That sort of flexibility is excellent and has allowed us to scale into situations where we otherwise couldn't have affordably done so (being able to buy some bandwidth from Joe Entrepreneur has its benefits sometimes).
Which compliance regulation requires you not to use a cloud provider? At most they may require you not to share a server with another company - that can be done with a drop down - or require that the data be hosted locally - again, that can be done by selecting a region.
The policies that say not to use a company controlled by the US government. Or the ones that say under no circumstances should the data be sent over the Internet to a third party "because OPs are hard".
Even if you chose to use AWS’s OLAP database, Redshift, it uses standard Postgres drivers to connect. You could move it to a standard Postgres installation. You wouldn’t get the performance characteristics of course.
If you don’t want to be “locked in” to DynamoDB, there is always the just announced Mongo compatible DocumentDB. Of course ADFS is used everywhere.
Why in the world would I want to manage a colo with all of those services myself and still not have any geographic redundancy - let alone any place near our outsourced developers?
isn't the _entire context_ of this article (and your response, experience, etc) 'implications of changing an underlying infrastructure' ?
yes, this isn't something one does on a whim, but it does happen, as your own post suggests.
Yeah, nowadays it's done by dissolving the company or being bought out, then shut down.
Still, it's the same thing. Use of inadequate infrastructure being, let's say, terminated.
What do you consider the lock in when using RDS? S3? EC2?
If I did want to move from AWS that’s the first thing I would do. Put an API end point in front of my business logic, change the event from triggering lambda to an SNS message and move my API off of AWS. Then slowly migrate everything else.
People aren’t using the same frameworks and infrastructure they used 10 years ago.
The cool kids don’t do large apps, we do microservices. (Said ironically I’m in my mid 40s - far from a kid.)
I didn't include post-deployment end-to-end tests in the 5 minute figure, but technically speaking, we do deploy that quickly.
Maybe I'm overstating it, but I don't think I am...
It's all a lot harder than you make it out to be, but at least with lambda (and something like Zappa) you don't have to figure anything out beyond how you get your first environment up. There's just no second step, and that's huge.
Using a scripting language like Python or Node, it literally is just adding one function that takes in a JSON event and a context object as your entry point.
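For instance, a minimal sketch of such an entry point in Python; the event shape here assumes API Gateway's proxy integration, and the names are illustrative:

```python
import json

def lambda_handler(event, context):
    # With API Gateway's proxy integration, the HTTP request arrives
    # as a JSON event; the returned dict becomes the HTTP response.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

Everything else - TLS, routing, scaling, process management - is the platform's problem.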
With Google Firebase Functions I was able to start writing REST APIs in minutes.
Compare that to setting up a VM somewhere, getting a domain name + certs + express setup + deployment scripts, and then handling login credentials for all of the above.
I had never done any of that (eventually I grew until I had to), so serverless let me get up and running really quickly.
Now I prefer my own express instance, since deployment is much faster and debugging is much easier. But even for the debugging scenario, expecting everyone who wants to Just Write Code to get the horrid mess of JS stuff up and running in order to debug, ugh.
(If it wasn't for HTTPS, Firebase's function emulator would be fine for debugging, as it is, a few nice solutions exist anyway.)
But, to be clear, on day 1 the option for me to write a JS rest endpoint was:
1. Follow a 5-10 minute tutorial on setting up Firebase Functions.

Or:
1. Pick a VM host (Digital Ocean rocks) and setup an account
2. Learn how to provision a VM
3. Get a domain
4. Get domain over to my host
5. SSH into machine as root, setup non-root accounts with needed permissions
6. Setup certbot
7. Learn how to setup an Express server
8. Setup an nginx reverse proxy to get HTTPS working on my Express server
9. Write deployment scripts (ok SCP) to copy my working code over to my machine
10. Setup PM2 to watch for script changes
11. Start writing code!
(12. Keep track, in a secure fashion, of all the credentials I just created for the above steps!)
I am experienced in a lot of things, and thankfully I had some experience messing around with VMs and setting up my own servers before, but despite what everyone on HN may think, not every dev in the world also wants to run a bunch of VMs and manage their setup/configuration just to write a few REST endpoints!
So yeah, instead I can type 'firebase deploy' in a folder that has some JS functions exported in an index.js file and a minute later out pops some HTTPS URLs.
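For a sense of how little that index.js needs to contain, here's a sketch (the `firebase-functions` wiring is shown in a comment since it needs the SDK installed in a real project; the function names are made up):

```javascript
// Sketch of an index.js for Firebase Functions. In a real project you'd
// wire the handler up with the SDK:
//   const functions = require("firebase-functions");
//   exports.hello = functions.https.onRequest(handleHello);

// Plain Express-style (req, res) handler -- testable without the SDK.
function handleHello(req, res) {
  const name = (req.query && req.query.name) || "world";
  res.status(200).send(`hello, ${name}`);
}
```

After `firebase deploy`, each exported function gets its own HTTPS URL, which is the whole point: the TLS, domain, and proxy steps from the list above simply don't exist.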
It's funny but when I learned to code basically all ISPs provided you with free hosting and a database, and you just needed to drag and drop a PHP file to make it live. It's like we have gone backwards not just in terms of openness but also in terms of complexity.
I was a bit shocked at how asinine things had gotten.
Every React Native tutorial out there has a section on setting up user auth with Firebase, and then putting a few REST endpoints up.
It is simple enough that a beginner "never touched mobile or web" development tutorial can go through it in under an hour.
Firebase is incredibly simple to get started with.
Another solution is to use one of the HTTPS port forwarding services that takes a localhost server and gives it a public HTTPS endpoint, but that is more work to explain than the Firebase setup.
Lambdas basically require zero maintenance. SQS requires zero maintenance. An EC2 load balancer is zero maintenance. And the setup is trivial too, and there's no migration time down the line. If you start off with native cloud for everything you can keep your maintenance and setup costs down drastically.
However, a lot can be done with the old school unsexy tech.
So I'm mixed.
I've just found in my experience that maintaining a web server or a database server, keeping security in mind, upgrades, scaling, etc. is a lot more work than simply spinning up RDS and a Lambda with API Gateway. Or even hosting static sites on S3 or Netlify.
Like I said I don't know enough to say one is better.
Also, is your name Jason?
Then I configured CloudWatch alarms with thresholds on certain metrics that send a webhook to PagerDuty.
The benefit is that my monitoring system never goes down, never needs to be patched manually (AWS even patches my Postgres database, and fails over to the warm standby during patching), and never needs any system administration.
Have you ever worked at a company where you had a serious outage that you didn't detect quickly enough because a monitoring system was down? Having a Serverless monitoring system means this has happened 0 times despite our app running in production for almost a year now.
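For scale, the alarm half of that setup is one API call per metric. A hedged sketch (the function name, ARN, and thresholds below are invented; `putMetricAlarm` is the real CloudWatch call in the AWS SDK):

```javascript
// Sketch: build the parameters for a CloudWatch alarm whose action is an
// SNS topic wired to PagerDuty. All names and ARNs below are invented.
function errorAlarmParams(functionName, snsTopicArn, threshold) {
  return {
    AlarmName: `${functionName}-errors`,
    Namespace: "AWS/Lambda",
    MetricName: "Errors",
    Dimensions: [{ Name: "FunctionName", Value: functionName }],
    Statistic: "Sum",
    Period: 60,
    EvaluationPeriods: 1,
    Threshold: threshold,
    ComparisonOperator: "GreaterThanThreshold",
    AlarmActions: [snsTopicArn],
  };
}

// In a real project, with the AWS SDK installed:
//   const { CloudWatch } = require("@aws-sdk/client-cloudwatch");
//   new CloudWatch({}).putMetricAlarm(errorAlarmParams("checkout", topicArn, 5));
```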
That wouldn't be a great reason to choose serverless indeed. However, that doesn't mean serverless isn't still the right choice.
We've tried both the traditional approach you describe and serverless, and from experience the latter is 10x less infrastructure code than the former (we compared both Terraform implementations).
If serverless fits your use case, saving time and effort is a very good reason to go for it IMHO.
You don't have to write much code to implement a lambda handler's boilerplate, and that boilerplate is at the uppermost or outermost layer of your code. You could turn most libraries or applications into lambda functions by writing one class or one method.
A lambda's zip distribution is not proprietary and is easy to implement in any build tool.
Basically, AWS has so much damn stuff under their belt now, and it all integrates so nicely, every time they add a new feature it lifts up all the other features as a matter of course.
Vendor lock-in is a thing, but Lambda is lower on the lock-in ladder than other services.
It's the hyper-specific things that imply heavier lock-in, especially those that bleed into other systems.
There are business reasons to go multi-cloud for a few workloads, but understand that you're going to lose time to market as a result. My best practice advice is to pick a vendor (I don't care which one) and go all-in.
And you'll forgive my skepticism around "go multi-cloud!" coming from a vendor who'll have precious little to sell me if I don't.
Pick a vendor and go all in.
Don't expect a vendor to always stay competitively priced, especially once they realize a) their old model is failing, and b) everybody on their old model is quite stuck.
Ask anybody about Oracle features, they'll tell you for days about how their feature velocity and set is great. But then ask them how much their company has been absolutely rinsed over time and how the costs increase annually.
Oracle succeeds by being only slightly cheaper than migrating your entire codebase. To offset this practice, keep your transition costs low.
Personal note: I'm currently experiencing this with Microsoft; all cloud providers charge an exorbitant premium for running Windows on their VMs, but Azure, of course, is priced very well (in an attempt to entice you to their platform). Our software has been built over a long period of time by people who were forced to run Windows at work -- so they built it on Windows.
Now we have a 30% operational overhead charged by Microsoft through our cloud provider. But hey.. at least our cloud provider honours fsync().
I guess those of us cautioning ourselves and others are aware of the pitfalls, but others also have valid points about going all in.
There is a matrix of different scenarios let's say.
You can go all in on a vendor and get Oracled.
You can go all in on an abstraction that lets you be vendor agnostic and lose some velocity while gaining flexibility.
You can go for a vendor and perhaps it turns out that no terrible outcome results because of that.
You can go all in on vendor agnostic and have that be the death of the company.
You can go all in on vendor agnostic and have that be the reason the company was able to dodge death.
You might know all the theory on aviation and be a really experienced pilot and one day a sudden wind shear might still fuck you.
Speaking from first hand running-a-cloud-platform experience, it's because running Windows on a cloud platform is not easy, and comes with licensing costs that have to be paid to Microsoft for each running instance (plus a bunch of infrastructure to support it). It's not even a per-instance-per-time-interval cost; there's all sorts of stuff wrapped up in it that impacts the effective cost. It requires a bunch of administrative work and specific coding to try to optimise the costs to the cloud provider.
In addition, where Linux will happily transfer between different hardware configurations, you'll often need separate Windows images for each hardware configuration, which means even more staffing overhead for both producing and supporting them. So e.g. SR-IOV images, PV images, bare metal images (for each flavour of bare metal hardware), etc. While a bunch of this work can be fully automated, it's still not a trivial task, and producing that first image for a new hardware configuration can take a whole bunch of work, even where you'd think it would be trivial.
Amdocs, ESRI, Microsoft too... Their commercial strategy is a finely tuned parasitic equilibrium.
Sales training that emphasizes knowing one's customer is all about that: if the salesperson understands the exit costs better than the customer does, the customer is going to be milked well and truly!
I'm thinking the sales side may benefit from hiring people with experience in the customer's business to game out the technical options in an actual study... I guess they do it already - I'm not experienced on the sales side.
For me this is key.
Yeah I know I am showing my age....
Vendor lock-in was just as much of a problem then as now.
More than that. They picked open source vendors that didn't (a) vanish in a puff of smoke, (b) get bought out by a company that slowly killed the open source version, or (c) who produced products that they were capable of supporting without the vendor (or capable of faking support for).
In general I agree with you. In practice, the more practical approach may be to focus on making more money & not fussing too much w/ vendor costs & whatever strategy you choose to use. It's easy to pay a big vendor bill when there's a bigger pile of cash from sales.
My best-practice advice is to do the math: what is the margin on infrastructure vs. the acquisition and management costs of the engineers needed to operate that infrastructure?
Serverless doesn't scale very well on the axis of cost. At some point that's going to become an issue. If one has gone "all-in" on vendor lock-in, then that vendor is going to enjoy those margins for as long as possible while the attempt to re-tool to something else is underway.
Best practice, generally speaking is to engineer for extensibility at some point, fairly early on.
There are a number of reasons that self-hosting doesn't make sense which have very little to do with scale and more to do with the lack of scale. For very little investment, one can get highly available compute, storage, networking, load-balancing, etc. from any of the major cloud providers. Want to make that geographically distributed? In your average cloud provider that's easy-peasy for little added cost.
Last time I had to ballpark such a thing, which is to say, what is the minimum possible deployment I'd be willing to support for a broad set of services, I settled on three 10kw cabinets in any given retail colo space with twenty-five servers per cabinet, each consuming an average of 300W. Those servers were around $10k and were hypervisor class machines, i.e. lots of cores and memory for whatever time that was. Some switches, a couple routers, and 4xGigE transit links.
Of course I'd want three sets of that spread in regions of interest. If I were US focused, east coast, west coast and Chicago or thereabouts. All the servers and network gear come to around $1.5m CapEx. OpEx is $200/kw for the power and space and around $1/mbps for the transit. Note that outside the US, the price per kw can be much, much higher.
So, $6k MRC for the power and $4k MRC for the intertubes. $10k OpEx on top of ~$42k/month in depreciation ($1.5m/36) on your CapEx multiplied by three gives you $156k/month.
Let's assume my middle of the road hypervisor class machine has all the memory it needs and two 16 core processors with hyperthreading, so 64 vCPU each, or 14400 vCPU across your three data centers, all for only around $2m/yr with nearly $5m of that up front.
That's a boat load of risk no startup or small enterprise is going to take on. You still have to staff for that and good luck finding folks that aren't asshats that can actually build it successfully. They're few and far between. That said, it does scale. It scales like hell, especially if you can manage to utilize that infrastructure effectively. I wager that if you were to look at what it would cost to hold down that much CPU and associated memory continuously in AWS then you'd be paying roughly 6x as much.
14400 vCPU of R4 for 3yr reserved, monthly, is $300k MRC. I'm guessing you'd run ceph or rook on your bare metal and have ~8 1TB SSDs per server, so 75 servers * 8 SSDs / 3 (for replication) is 200TB with decent performance per data center, times three data centers, for 600TB usable; compare EC2 GP2 at $0.10/GB, which comes to roughly $60k MRC.
Less any network charges that's $360k vs. $156k self-hosted. Guess I'm wrong. It's only twice as much.
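The arithmetic above holds up; re-running it (every figure is the parent's ballpark estimate; the parent rounds depreciation to $42k/month, hence $156k vs. the exact $155k):

```javascript
// Re-running the ballpark numbers from the comment above.
// Every figure is the commenter's estimate, not a quote.
const regions = 3;

// Per-region monthly OpEx: 3 cabinets x 10 kW x $200/kW, plus 4x GigE at $1/Mbps.
const powerMrc = 3 * 10 * 200;            // $6,000
const transitMrc = 4000 * 1;              // $4,000
const opexMrc = powerMrc + transitMrc;    // $10,000

// Per-region CapEx depreciated over 36 months.
const depreciationMrc = 1_500_000 / 36;   // ~$41,667

const selfHostedMrc = regions * (opexMrc + depreciationMrc); // ~$155,000/month

// Cloud side: 14,400 vCPU of 3yr-reserved R4, plus 600 TB of GP2 at $0.10/GB.
const cloudMrc = 300_000 + 600_000 * 0.10; // $360,000/month

const ratio = cloudMrc / selfHostedMrc;    // ~2.3x
```

So "only twice as much" is right: roughly 2.3x, before staffing costs on the self-hosted side or network egress charges on the cloud side.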
That's just not true. There are plenty of companies that self-host highly scaled infrastructure, Twitter being just one of them. They've only recently started thinking about using the cloud, opted for Google, and that's only to run some of their analytics infrastructure.
> There is very little reason for a good sysadmin to work for International Widgets Conglomerated when they can work for a cloud provider instead, building larger scale solutions more efficiently for higher pay
That's not true either (speaking from personal experience working for companies of all sizes, from a couple dozen employees up to and including AWS and Oracle on their cloud platforms). For one thing, sysadmin is far too broad a job role for such sweeping statements.
A whole bunch of what I do as a systems engineer for cloud platforms is a whole world of difference from general sysadmin work, even ignoring that sysadmin is a very broad definition that covers everything from running exchange servers on-prem, to building private clouds or beyond.
These days I'm not sure I've even got the particular skills to easily drop back in to typical sysadmin work. Cloud platform syseng work requires a much more precise set of skills and competencies.
All that aside, I can point you in the direction of plenty of sysadmins who wouldn't work for major cloud providers for all the money in the world, either for moral or ethical reasons; or they're just not interested in that kind of problem; or even just that they don't want to deal with the frequently toxic burn-out environments that you hear about there.
> I'd rather buy an off the shelf service used by the rest of the F500 than roll my own.
Nowhere near as much of the F500 workload is on the cloud as you apparently believe. It's a big market that isn't well tapped. Amazon and Azure have been picking up some of the work, but a lot of the F500 don't like the pay-as-you-go financial model; it plays sweet merry havoc with budgeting, for starters. It's one reason why Oracle introduced a pay model with Oracle Cloud Infrastructure that allows CIOs to set fixed IT expenditure budgets. Many F500 companies have only in the last few years started talking about actually moving to the cloud (when OCI launched at Oracle OpenWorld, a startling number of CIOs from large and well known companies came up to the sales team saying "So.. what is this cloud thing, anyway?").
> Successful companies outsource their non core competencies
Yes.. and no. Successful companies outsource their non-core competencies where there is no value having them on-site. That's very different.
Twitter was founded in 2006, the same year AWS was launched, so in the early days Twitter didn't have a choice - the cloud wasn't yet a viable option to run a company.
And, if you remember in the early days, Twitter's scalability was absolutely atrocious - the "Fail Whale" was an emblem of Twitter's engineering headaches. Of course, through lots of blood, sweat and tears (and millions/billions of dollars) Twitter has been able to build a robust infrastructure, but I think a new company, or a company who wasn't absolutely expert-level when dealing with heavy load, would be crazy to try to roll their own at this point unless they wanted to be a cloud provider themselves.
That's because Twitter was:
1) A monolith
2) Written in Ruby
They started splitting components up in to specialised made-for-purpose components using Scala atop the JVM, and scaling ceased being a big issue. The problems they ran into couldn't be solved by horizontal scaling. There wasn't any service that AWS offers even today that would have helped with those engineering challenges.
Twitter is both unusually big and unusually unprofitable. You're unlikely to be as big as them, and even if you were I wouldn't assume they've made the best decisions.
> All that aside, I can point you in the direction of plenty of sysadmins who wouldn't work for major cloud providers for all the money in the world, either for moral or ethical reasons; or they're just not interested in that kind of problem; or even just that they don't want to deal with the frequently toxic burn-out environments that you hear about there.
It's best to work somewhere you're appreciated (both financially and for job-satisfaction reasons), and it's harder to be appreciated in an organization where you're a cost center than one where you're part of the primary business. There are good and bad companies in every field, and good and bad departments in every big company, but the odds are more in your favour when you go into a company that does what you do.
Of course, we won't switch providers, because they're offering great value right now.
I feel this vendor lock-in business is a phase that will pass. We were vendor locked when we paid for Unix or Windows servers, then we got Linux and BSD. Then we got vendor locked by platform providers like AWS and such, and now that grip is loosened by open source infrastructure stacks like Kubernetes.
NT was released in '93; FreeBSD 2.0 (free of AT&T code) was released in '94. GNU/Linux also saw professional server use in the mid-to-late 90s. People still lock themselves in though.
If an established company is moving to the cloud, the equation is not as simple. The established company presumably has the money and time to make its systems vendor agnostic. Is avoiding the vendor lock-in risk worth spending more now? How large is the risk? What are the benefits (using all of the AWS services is pretty nice)?
>"benefits (using all of the AWS services is pretty nice)"
Is that your personal experience? Or are you simply assuming that interop / efficiencies obtain when going all-in on AWS? I ask because I've had multiple client conversations in which these presumed benefits fail to materialize to a degree that offsets the concern about lock-in.
For instance, putting your EC2 instances in a VPC has been the preferred way of operating since 2009. But, if you have an account old enough, you can still create an EC2 instance outside of a VPC.
You can still use the same SQS API from 2006 as far as I know.
Maybe "AWS and cloud infrastructure" will be to modern companies what COBOL and mainframes were to the big companies of 50 years ago.
No doubt somebody will be happy to charge you to support it for a long time...
Still, I don't know what the solution is. Increasingly it seems to me that building high-quality software is literally not feasible, we can't afford it, all we can afford is crap.
It's true though that we never wanted to work with managed services, so there was literally no need to redo any of the tooling.
In other words you can decide to not bother with worrying about lock-in when it costs too much.
This will make your code base easier to port to multi-cloud in the future if you should ever want to.
At any kind of scale, though, one is loosely coupled to the cloud provider in any case for things like persistent disk, identity management, etc.
And then watch the CTO throw you out of his office.....
Atlassian is bad, but Oracle is on a whole different level.
Or you don't bother replicating the API, even if you don't want vendor lock-in, because you realize that if a cloud provider evaporates, there will be a lot of other people in the same boat as you and surely open-source and off-the-shelf solutions will pop up immediately.
But "pick a vendor and go all-in" can work for Netflix-sized companies, or ones with static assets on the cloud. All cloud providers have their own rough edges, and if you get stuck on one you might lose your business edge. Case in point - not going to name the provider since we are partners with them - we found a provisioning bug: a VM based on a custom Windows image took 15 minutes to provision, and exporting a custom image across regions has rigid size restrictions. The provider acknowledged the bugs but is not going to address them this quarter; if we were Netflix-big, maybe they would have addressed them sooner.
We have automated our cluster deployment so we can get clusters up and running on most major cloud providers. We are careful to avoid vendor lock-in as much as possible, since business edge cannot be left at the mercy of the big cloud providers, who heed your cry only if you are big and otherwise care not at all about your business impediments. When the cloud resources you need aren't static, you need flexibility, so the above recommendation doesn't suit everyone.
I'll note that I have no objection to abstractions per se, especially in cases where a community solution exists, e.g. Python's sqlalchemy is good enough that I'd seldom recommend directly using a database driver, Node's nodemailer is in many cases easier to use than lower level clients, etc.
I'm currently working on a system that has several multi-vendor abstractions - for file storage across AWS, GCP, NFS and various other things; for message queues across Kafka, GCP Pubsub, and direct database storage; for basic database storage (more key-value style than anything else) across a range of systems; for deployment on VMs, in containers, and bare metal; and various other things.
All of these things are necessary because it's a complex system that needs to be deployable in different clouds as well as on-prem for big enterprise customers with an on-prem requirement.
None of the code involved is particularly complex, and it's involved almost zero maintenance over time.
That would less be the case if you were trying to roll your own, say, JDBC or DB-agnostic ORM equivalent, but there are generally off the shelf solutions for that kind of thing.
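To illustrate how thin such an abstraction can be, here's a sketch of a two-method storage interface (the class names and interface are invented for illustration; a real S3 or GCS backend would wrap the vendor SDK's put/get object calls):

```javascript
// Sketch of a minimal multi-vendor storage abstraction: any backend that
// implements put/get satisfies the contract. Names here are invented.
class InMemoryStore {
  constructor() { this.objects = new Map(); }
  async put(key, data) { this.objects.set(key, data); }
  async get(key) { return this.objects.get(key); }
}

// An S3Store or GcsStore would expose the same two methods, wrapping the
// vendor SDK; the calling code never sees which backend is in use.
async function roundTrip(store, key, data) {
  await store.put(key, data);
  return store.get(key);
}
```

The calling code depends only on the two-method contract, which is why this kind of layer tends to need almost no maintenance: the vendor SDKs churn, the interface doesn't.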
Perhaps standardizing on something like Terraform allows you to reduce the risk of going all-in on one vendor.
Similarly with Kubernetes; if you go all in on k8s, do you care where it's hosted, or can you maneuver quickly enough to the best provider?
There is however a large number of painfully learned lessons of vendor locked in systems... no one got fired for buying IBM, right?