I have worked with database code that was meant to work only with the database it was running on, and database code that was meant to be agnostic to what database you used. I always thought the costs of the second were underappreciated relative to its benefits. And unless you're actively maintaining support for more than one database (as in, shipping a product where your users run more than one database) you tend to miss all the implicit ways you come to depend on the implementation you're on -- yes, the syntax may be the same across databases, but the performance impacts are different, and so you tend to optimize based on the performance of the database you're on.
I suspect the same is true for cloud. Real portability has real costs, and if you aren't incurring all of them up front and validating that you're doing the right things to make it work, then incurring part of them up front is probably just a form of premature optimization. At the end of the day, all else being equal, it's easier to port a smaller codebase to new dependencies than a larger one, and attempting to be platform-agnostic tends to result in more code as you have to write a lot of code that your platform would otherwise provide you.
It’s not just portability that’s an issue with lambda. It’s also churn.
Running on Lambda, one day you’ll get an email saying that we’re deprecating node version x.x so be sure to upgrade your app by June 27th when we pull the plug. Now you have to pull the team back together and make a bunch of changes to an old, working app just to meet some 3rd party’s arbitrary timeframe.
If you’re running node x.x on your own backend, you can choose to simply keep doing so for as long as you want, regardless of what version the cool kids are using these days.
That’s the issue I find myself up against more often when relying on Other People’s Infrastructure.
It's not about using what the cool kids use these days.
I can't stress enough that unmaintained software should not run in production.
This way you have a good argument to make to management, and if you do it regularly, or even plan it in ahead of time, it's usually not much work.
During a product planning meeting:
"Dear manager, for the next weeks/sprint the team needs X days to upgrade the software to version x.x.x otherwise it will stop working"
I guess we have different philosophies then. My take is that software in production should not require maintenance to remain in production.
Imagine a world where you didn't need to spend a whole week every year, per project, just keeping your existing software alive. Imagine not having to put off development of the stuff you want to build to accommodate technical debt introduced by 3rd parties.
That's the reality in Windows-land, at least. And I seem to remember it being like that in the past on the Unix side too.
Your vision is only workable for software for which there are no security concerns. This might improve to the extent industry slowly moves away from utterly irresponsible technologies like memory-unsafe languages and brain damaged parsing and templating approaches and more or less the whole web stack. I wouldn't hold my breath though. And even software that's not cavalierly insecure will have security flaws, albeit at a lower rate.
Keep in mind that you're arguing against an existence disproof. The Microsoft stack, for example, is a pretty big target for attack, and has seen its share of security issues over the years.
But developers don't need to make any code changes or redeploy anything to mitigate those security issues. It all happens through patches on the server, 99% of which happen automatically via windows update.
So many open source hackers do not know the basic techniques for backwards compatibility (e.g. don't rename a function, just introduce a new one and leave the old one available).
I'm spending very significant effort maintaining an OpenSSL wrapper because OpenSSL constantly removes / renames functions. I hoped to branch based on version number, but they even changed the name of the function that returns the version number.
And that's only one example; a lot of people make such mistakes, costing their users huge effort.
And then there's the popular semantic versioning myth: that you just need to bump the major version number when you change the API incompatibly to save your clients from trouble.
> So many open source hackers do not know the basic techniques for backwards compatibility (e.g. don't rename a function, just introduce a new one and leave the old one available).
I'd dispute this, or at least I think this doesn't capture the whole picture. Microsoft makes money with backwards compatibility and can afford to spend significant effort on the ever-growing burden of remaining backwards-compatible indefinitely. Open source volunteers are working with much more limited resources, and I think it comes down much more to intentional tradeoffs between ease of maintenance and maintaining backwards compatibility.
If you have a low single-digit number of long-term contributors, maybe the biggest priority to keep your project moving at all is to avoid scaring off new contributors or burning out old contributors, and that might require making frequent breaking changes to get rid of unnecessary complexity asap. Characterizing that as "they don't know that you can just introduce a new function" doesn't seem like it yields instructive insights.
Yes, this is exactly the wrong reply I often hear when complaining about backwards compatibility.
The mistake here is that in 99% of cases backwards compatibility costs nothing -- no effort, no complexity.
Given two choices that cost the same, the people breaking backwards compatibility are simply making the wrong one.
> maybe the biggest priority to keep your project moving at all
When you rename the function SSLeay to OpenSSL_version_num, where are you moving? What does that give your project?
Ok, if you like the new name so much, what prevents you from keeping the old symbol available?
    unsigned long (*SSLeay)(void) = OpenSSL_version_num;
(Sorry for naming OpenSSL here, it's just one of many examples)
When developers do such things, they break other open source libraries, which in turn break others. It's a hugely destructive effect on the ecosystem. It takes many man-days of work for the dependent systems to recover. And it may take years for the maintainers to find those free days to spend on recovery, and some projects will never recover (e.g. no active maintainer).
With the lift of a finger you can save humanity significant pain and effort. If you've decided to spend your effort on open source, keeping backwards compatibility by making the right choice in a trivial situation makes your contribution an order of magnitude bigger and more efficient.
So, I believe people don't know what they are doing when they introduce breaking changes.
I've seen developers introduce breaking changes, then find the projects depending on them and submit patches. So they really do have good intentions and end up spending more of their volunteer open source energy than necessary. And when the other project cannot review and merge their patch (no maintainers), they get disappointed.
So please, just keep the old function name. It will be cheaper for you and for everyone.
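To make it concrete, here's a minimal sketch of the same idea in Python (the names are made up for illustration, not any real library's API): the renamed function exists, and the old name stays importable as a thin deprecated alias, so existing callers keep working.

    import warnings

    def openssl_version_num():
        """New, preferred name (illustrative stand-in for a renamed API)."""
        return 0x1010100F  # made-up value

    def ssleay():
        """Deprecated alias kept so existing callers don't break."""
        warnings.warn(
            "ssleay() is deprecated; use openssl_version_num()",
            DeprecationWarning,
            stacklevel=2,
        )
        return openssl_version_num()

One line of aliasing per rename is usually the whole cost.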
> Microsoft makes money with backwards compatibility
That's a good way of putting it, and it gets to a key difference between open source and proprietary software.
In the open source world where a million eyes make all bugs shallow, developer hours are thought of as free. So if you change something it's no big deal because all the developers using your thing can simply change their code to accommodate it. It doesn't matter how many devs or how many hours, since the total cost all works out to zero.
In the proprietary world, devs value their time in dollars. The reason they're using your thing is because it's saving them time. They paid good money because that's what your thing does. Save time. Get them shipped. As a vendor, you're smart enough to realize that if you introduce a change that stops saving your customers time or, worse, costs them time or, god forbid, un-ships their product, they'll do their own mental math and drop you for somebody who understands what they're selling.
In the end, all we're talking about here is the end product of this disconnect in mindset.
Microsoft also isn't your average developer that imports libraries from strangers.
Every time I run an audit (which is monthly) I see at least a dozen advisories in NPM packages we use. Sure, some of them don't apply to our usage, and others can't really impact us, but occasionally there is one we should be concerned about.
We server admins can push buttons to upgrade, but that doesn't mean developer code will keep working.
Many developers live in this world where they think server admins will protect their app... But we're more likely to break things by forcing upgrades of your neglected packages.
> Keep in mind that you're arguing against an existence disproof. The Microsoft stack, for example, is a pretty big target for attack, and has seen its share of security issues over the years.
> But developers don't need to make any code changes or redeploy anything to mitigate those security issues.
I don't believe it. Most security issues are not just an implementation issue in the framework but an API that is fundamentally insecure and cannot be used safely. Most likely those developers' programs are rife with security issues that will never be fixed.
Which is fine, as long as they are understood and mitigated against. If your security policy consists entirely of "keep software up to date", you don't have a security policy.
In practice trying to "understand and mitigate against" vulnerabilities inherent in older APIs is likely to be more costly and less effective than keeping software up to date.
If there is a problem in an older API, it's probably time to update. That's understanding and mitigation.
The discussion is about the difference between updates made for a valid reason and updates imposed by cloud providers; nobody advocates sticking with old software versions.
> nobody advocates sticking with old software versions.
In my experience that's what any policy that doesn't include staying up to date actually boils down to in practice. Auditing old versions is never going to be a priority for anyone, and any reason not to upgrade today is an even better reason not to upgrade tomorrow, so "understanding and mitigation" tends to actually become "leave it alone and hope it doesn't break".
In practice you don't mitigate against specific vulnerabilities at all, you mitigate against the very concept of a vulnerability. It would be foolish to assume that any given piece of software is free from vulnerabilities just because it is up to date, so you ask yourself "what if this is compromised?" and work from the premise that it can and will be.
Let's say I have a firewall. If we assume someone can compromise the firewall, what does that mean for us? Can we detect that kind of activity? What additional barriers can we put between someone with that access and other things we care about? What kind of information can they gather from that foothold? Can we make that information less useful? etc.
You think about these things in layers. If X, then Y, and if Y, then Z, and if X, Y, and Z all fail, do we just accept that some problems are more expensive than they're worth, or get some kind of insurance?
I've found that kind of approach to be low security in practice, because it means you don't have a clear "security boundary". So the firewall is porous but that's considered ok because our applications are probably secure, and the applications have security holes but that's considered ok because the firewall is probably secure, and actually it turns out nothing is secure and everyone thought it was someone else's responsibility.
I think you're projecting. The whole point is reminding yourself that your firewall probably isn't as secure as you think it is, just like everything else in your network. This practice doesn't mean ignoring the simple things, it just means thinking about security holistically, and more importantly: in the context of actually getting crap done. Regardless, anyone who thinks keeping their stuff up to date is some kind of panacea is a fool.
Personal attacks are for those who know they've lost the argument.
Keeping stuff up to date is exactly the kind of "simple thing" that no amount of sophistry will replace; in practice it has a better cost/benefit ratio than any amount of "thinking holistically". Those who only keep their things up to date and do nothing else may be foolish, but those who don't keep their things up to date are even more foolish.
> But developers don't need to make any code changes or redeploy anything to mitigate those security issues
Right, so all deployed ActiveX based software magically became both secure and continued working as before after everyone installed the latest Windows patches?
Trivial patching only works for security issues caused by implementation defects, not design defects. If you have a design defect, your choice is typically either breaking working apps or usage patterns, or breaking your users' security. Microsoft has done both (e.g. ActiveX blocking vs. the continued availability of CSV injection) and both have negatively affected millions.
We're using the definition a few notches upthread: "Dear manager, for the next weeks/sprint the team needs X days to upgrade the software to version x.x.x otherwise it will stop working"
As opposed to:
2011: deploy website, turn on windows update
2011-2019: lead life as normal
2019: website is up and running, serving webpages, and not part of a botnet.
That's reality today, and if it helps to refer to it as "maintained", that's fine. The point is that it's preferable to the alternative.
I think that the parent commenter is referencing node 4.3 being past EOL and therefore unmaintained software unfit for prod, unlike the MS stack, which is receiving patches
I was referring to comments that MS is good at backwards compatibility and “if you write an application, it will run forever”, and I pointed out that MS also breaks backward compatibility when it comes to languages.
Installing security patches for a Ruby stack takes a full code-coverage test suite, days of planning, and even more time to update code for breaking changes.
Installing security patches for a Microsoft stack requires turning on windows update.
There's a BIG difference. Once you write your MSFT stack app, it's done. Microsoft apps written decades ago still work today with no code changes.
What if the new node version fixes a bug / issue / CVE that doesn't concern the software?
Is it reasonable to postpone the upgrade for later?
Example: the software uses Python requests. A new version fixes CVE-2018-18074, which is about the Authorization header, but you're sure you don't use this header. Is it reasonable to upgrade a little bit later?
Depends on how mature your security team/process is. Can you spend time tracking individual announced bugs and making a case-by-case decision for each CVE? How much would you trust that review? Do you review dependencies which may trigger the same issue?
Or is it going to take less time/effort to upgrade each time?
Or is the code so trivial you can immediately make the decision to skip that patch?
There's no perfect answer - you have to decide what's reasonable for your teams.
The cool thing about serverless infrastructure is that it does not really concern you. As long as you are on a maintained version of the underlying platform your provider will take care of the updates.
If your software runs on an unmaintained platform there won't be any security fixes, and that's why Amazon forces you to upgrade at some point.
You are looking to save yourself a week of time a year, and then 3 years later, for some reason or another, you will HAVE to upgrade, and good luck making that change when the world has moved past you.
You're describing traditional sysadmin vs devops. Devops means repeating the stress points so that they are no longer stressful, and automating as much as possible.
I like it way better than the classic "don't touch this, it's working and the last guy that knew how to fix it is gone."
you don't need maintenance to remain in production, you need maintenance to reduce the tech debt in the infrastructure you decided to use (code, frameworks, third party libraries, security issues).
Even just vanilla languages get upgraded every X months/years etc.
Not maintaining the code is just a bad gift you are giving to your (or someone's) future.
I have been involved in upgrades of perfectly working software, written in an old version of Java (almost Java 4), that were needed to add new features; it took a hell of a long time and I never saw it working at the end.
I don't think it's a safe choice to "let it be" when it comes to software.
Imagine a world where new exploits and hacks didn't come along every day and compromise the systems your app sits on because you didn't keep up with patches and upgrades...
That also describes my (very small in scope) PHP and Javascript things. They all still work, and I love that to bits. Admittedly, the price of that probably is keeping it simple, but if I needed to update it all the time just to keep it from not sinking under its own weight or the ground shifting beneath it, that would be no fun for me.
I completely agree with this. Starting a new position and coming into infra running 5-year-old software is not fun; it's generally neglected and full of deprecated features/code that has been improved in later versions.
Not to mention the security risk running old software can often create.
Isn't that always the case though? I mean, outside of e.g. lambdas or other platforms / runtimes as a service?
I mean a few years ago there was a huge security flaw in Apache Struts, whose impact was big because it had been used in a lot of older applications - meaning a LOT of people had to be summoned to work on old codebases to fix this issue.
The problem isn't a changing runtime - even if you self-host it you should make sure to keep that updated regularly.
> if you self-host it you should make sure to keep that updated regularly.
Mild disagreement: My philosophy is that you should choose technologies that aren't likely to introduce breaking changes in the future.
As an example, I have sites that were built using ASP.NET version 1.1 that have survived to this day with nothing more than Windows Update on the server and the occasional version bump in the project config when adding a feature that needed the latest and greatest.
Compare that to the poor soul who decided to build on top of React when it first came out, and has been rewarded by getting to rewrite his entire application four times in as many years.
To return to the point, rather than rewriting around breaking changes from Node x.x to Node y.a, I'd be shopping around for the LTS version of Node x that I could keep the thing running on without intervention from my team.
> As an example, I have sites that were built using ASP.NET version 1.1 that have survived to this day with nothing more than Windows Update
You are right; but in my experience those ASP applications also had security holes (CSRF etc.) that were never patched. They ultimately either became botnets or went away when the corp itself faded away.
A business that can't afford to pay for cleaning up its business applications is likely to be unable to pay for general upkeep as well. It is simply past the point of being a viable business and is either in limbo or in the grave!
See "maintenance free" approach to software as canary in the coal-mine and run away as fast as possible.
The reality I know is that the people who wrote their apps on Windows XP left the company some years ago, but the apps live on: perfectly functioning, but unable to make requests via anything more secure than TLS1.0 and forcing other people to run servers that continue to accept TLS1.0 years after it was deprecated by everyone else.
(That's the PCI announcement in 2015 that despite everyone knowing about problems with TLS1.0, they would continue to allow it through 2018 because of all the companies who deferred their technical debt in the manner you seem to be advocating.)
I'm telling you that the entire infrastructure of the world is held back by companies who don't do simple maintenance, and your response is to tell me it should be easy to do maintenance.
Someone isn't getting the point, and I don't think I can make it any clearer.
Or is your point that we're held back by people who don't do simple maintenance, and that trying to make maintenance simpler, while it might help, won't solve 100% of the problem?
Since late last year, you can use old versions if you want to. [0] The provider doesn't enforce the runtime anymore. But I don't think it's the provider issue in the first place. At some point "node x.x" will be EOL and you won't get an email. You'll just stop getting maintenance patches.
Good point, but no. If you have architected your apps properly, decommissioning or sunsetting services or individual components should already have been designed and planned.
You really just don't have to think like this on the msft stack. I'm so glad I chose msft ASP 20 years ago, instead of php, or RoR or python, or node or any of the myriad other stacks that have come and gone since.
Do you ever need to update your servers to a newer version, like say 2016 or now 2019? There are definitely issues with using, say, old VB6 libraries when you need to upgrade your servers from 2008 to 2019. Not to mention that if you keep using those old technologies and do have a new feature or change, you end up with an unmaintainable mess. I am an MSFT stack programmer, but to claim there are no issues and you just need to patch a server is flat out wrong.
They didn’t pull the plug. You can’t create new lambdas or update existing ones with older versions of Node, but if you have existing code, it won’t just stop working.
> If you’re running node x.x on your own backend, you can choose to simply keep doing so for as long as you want, regardless of what version the cool kids are using these days.
What do you mean by your own backend? Your own physical hardware, your own rented hardware, your own EC2 box, your own Fargate container?
What if you get an office fire, a hardware failure, a network outage, a required security update, your third party company goes out of business, etc.?
There's no such thing as code that doesn't need to be maintained. Lambdas (and competitors) probably require the least maintenance of the lot.
The unifying lesson I've been appreciating after reaching my 30s is that in order to make good decisions, an individual needs to have a very clear understanding of how something works. If you know how something works, then you can foresee problems. You can explain the correct use cases for a particular tool. You know the downsides just as well as the upsides for a particular choice.
In software development, a lot of the new stuff that is supposed to solve our problems just shifts those problems somewhere else -- at best. At worst, it hides the problems or amplifies them. This isn't unique to software development, but it seems to be particularly pervasive because so much is invisible at the onset of use.
Best advice, be very suspicious when someone can't explain what the bad parts are.
There is also a terrible cost to running on just one database: the dependency on a single provider.
In business you should never depend on just one provider. This provider could be the best in the world, now, because it has amazing management. But people die, or are hit by a bus, or get crazy after a divorce, or just retire and get replaced by their antithesis.
In my personal experience, real portability has great benefits of its own. The design is easier to understand and you constantly find bugs thanks to running your software on different architectures, compilers and so on.
At the end of the day, it is the quality of your team. If your team is bad, your software will be bad and vice versa.
Funny thing is, the fact alone that you create a provider-agnostic system can give you enough performance or cost penalties that you want to change providers in the first place.
You can't use provider specific optimizations, so the provider seems bad and you don't wanna use it.
The big difference with something like PL/SQL is that it’s a proprietary language, whereas lambda and the other faas options are based on open languages. Makes portability somewhat easier to achieve.
For sure; the external interface of a lambda is trivial, that is, a single entrypoint function with an 'event'. It's relatively easy to create a simple wrapper around that to make it either provider agnostic or self-hosted.
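A minimal sketch of that wrapper idea in Python (the function names and payload shape are made up for illustration): the business logic takes a plain dict, and a thin per-platform adapter maps the provider's event onto it.

    import json

    # Core logic: knows nothing about AWS, API Gateway, or any HTTP framework.
    def handle_order(payload):
        return {"status": "accepted", "order_id": payload.get("order_id")}

    # Thin AWS Lambda adapter: unwraps the API Gateway proxy event and wraps
    # the response in the shape API Gateway expects.
    def lambda_handler(event, context):
        body = json.loads(event.get("body") or "{}")
        return {"statusCode": 200, "body": json.dumps(handle_order(body))}

Porting to another FaaS or to a self-hosted web framework then means writing another ten-line adapter, not touching handle_order.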
I haven't used lambda but it sounds remarkably similar to a Django function.
In Django urls point to a function. Your request matches that url, that function gets run.
Lambda must have some similar sort of mapping of urls to functions, so what exactly are you saving with it? Ok, Django includes an ORM, but if you are using any sort of persistence you will need a database layer as well.
Can someone explain what all the fuss is about or what I am missing?
If you're dealing purely with web requests, then yeah, API Gateway + Lambda sounds pretty similar to a Django function. But having used both, it's a lot easier and faster to set up API GW and Lambda than it is a Django app.
And if you're not dealing with purely web requests, then they're very different. Most of my lambdas trigger off of Kinesis, SNS, and SQS events. Work gets sent to these queues/notification endpoints, and then the Lambda function does work based off the data received there, and scales to handle the amount of data automatically.
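For a sense of scale, here's roughly what one of those handlers looks like in Python, using SQS as the example (process is a hypothetical stand-in for the real work):

    import json

    def process(message):
        # Hypothetical business logic.
        print("handling", message)

    def handler(event, context):
        # SQS delivers records in batches; Lambda scales the number of
        # concurrent invocations with queue depth, so each invocation
        # only ever sees one batch.
        for record in event["Records"]:
            process(json.loads(record["body"]))

There's no queue-polling loop, no worker process management, and no autoscaling config in the code itself.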
Good points, but it remains to be seen whether the portability of large serverless apps is sufficient to prevent the providers from squeezing their customers once they are deeply established.
The fact that there’s three major cloud providers (who will at the very least compete on customer acquisition) is a point in favor of this, but it’s definitely an experiment. In my mind, this is as much about negotiating power as it is about technical tradeoffs.
Well, there is a cost to be portable and the cost to adapt an app that's not portable.
Whether it's DB independence or cloud provider independence, from my experience it's cheaper to pay the cost of the migration when you know you want to migrate (even if it involves rewriting some parts) rather than paying it everyday by writing portable code.
Most of the time the portable code you write becomes obsolete before you want to port it.
> it's cheaper to pay the cost of the migration when you know you want to migrate
Agreed, on principle and based on experience. This does require you have reasonable abstractions in place as otherwise you’ll end up refactoring before you can migrate. But that’s a good thing in any case.
You're only thinking about the _input_. Technically, yes, I can host an express app on lambda just like I could by other means, but the problem is that it can't really _do_ anything. Unless you're performing a larger job or something you probably need to read/write data from somewhere and connecting to a normal database is too slow for most use-cases.
Connecting to AWS managed services (s3, kinesis, dynamodb, sns) doesn't have this overhead, so you can actually perform some task that involves reading/writing data.
Lambda is basically just glue code to connect AWS services together. It's not a general purpose platform. Think "IFTTT for AWS"
We have been connecting to mongo db from without lambda for the past year and sure you don't get single digit latency but r/w data happens under 30ms in most cases, we even use paramastore to pull all secrets and it's still without that time frame.
You can run your Lambda function within the same network as your other servers. It just appears as a new IP address inside your network and can call whatever you permit it to.
he probably meant 'within' instead of 'without'.
myself to use aws-serverless-express and connect my lambda hosted in us-east-virginia to a mlab (now Mongo Atlas) mongo database hosted in the same us-east-virginia amazon region
But that is the whole point of using cloud services that are tightly integrated with each other. The fact that I cannot do it myself as efficiently as Amazon can not be called "proprietary lock-in".
Said efficiencies are not due to Amazon, just that the services are colocated in the same facility.
If I put the node service and a database on the same box I'd get the same performance, and actually probably better since Amazon would still have them on separate physical hardware.
The infrastructure, or rather the interfaces, is where the lock-in comes in. Each non-portable interface adds another binding, so it's not as easy as swapping out the provider, as the OP pointed out, once you've been absorbed into the ecosystem of non-portable interfaces. You have to abstract each service out to be able to swap out providers.
If you use open source interfaces, or even proprietary interfaces that are portable, it's easier to take your app with you to the new hosting provider.
The non-portable interfaces are the crux of the matter. If you could run lambda on Google, Azure or your own metal, folks wouldn't feel so locked-in.
As I said. I can run the Node/Express lambda anywhere without changing code.
But, I could still take advantage of hosted Aurora (MySQL/Postgres), DocumentDB (Mongo), ElasticCache (Memcached/Redis) or even Redshift (Postgres compatible interface) without any of the dreaded “lock-in”.
It sounds like you have a preference for choosing portable interfaces when it comes to storage. And you've abstracted out the non-portable lambda interface.
My position isn't don't use AWS as a hosting provider, it's that you ought to avoid being locked into a proprietary non-portable interface when possible.
I don't really see cloud-provider competition lessening or hardware getting more expensive and less efficient or the VMs getting worse at micro-slicing in the next 5 years. So why would I be worried about rising costs?
I think spending one of the newly-raised millions over a year or so can help there, including hiring senior engineers talented enough to fix the shitty architecture that got you to product-market-fit. This isn’t an inherently bad thing, it just makes certain business strategies incompatible with certain engineering strategies. Luckily for startups, most intermediate engineers can get you to PMF if you keep them from employing too much abstraction.
Isn’t employing too many abstractions just what many here are advocating - putting a layer of abstraction over the SDK’s abstractions of the API? I would much rather come into a code base that just uses Python + Boto3 (AWS’s Python SDK) than one that uses Python + “SQSManager”/“S3Manager” + Boto3.
That is indeed what many here are advocating. There are only so many possible interfaces or implementations, and usually abstracting over one or the other is an effort in reinventing the wheel, or the saw, or the water clock, and not doing the job as well as some standard parts glued together until quite far into the endeavor.
Stop scare-quoting "lock-in". Lock-in means development effort to get out of a system, regardless of how trivial you think it is.
If writing code to be able to move to a different cloud isn't considered lock-in, then nothing is since anyone can write code to do anything themselves.
Lock in is an economic concept, it’s not just about code but about “switching costs”. Ecosystem benefits, data gravity etc all come into play.
There are two kinds of lock-in: high switching cost because no competitor does as good a job - this is good lock-in, and trying to avoid it just means you’re not building the thing optimally in the first place.
There is also high switching cost because of unique interface and implementation requirements that don’t add any value over a more interoperable standard. This is the kind that’s worth avoiding if you can.
"Connecting to AWS managed services (s3, kinesis, dynamodb, sns) don't have this overhead so you can actually perform some task that involves reading/writing data."
That is due to network and colocation efficiencies. The overhead of managing such services yourself is another matter.
Not just the network overhead, the maintenance and setup overhead. I can spin up an entire full stack in multiple accounts just by creating a CloudFormation template.
I’ve done stress testing by spinning up and tearing down multiple VMs, played with different size databases, autoscaled read replicas for performance, ran a spot fleet, etc.
When you need things now you don’t have time to requisition hardware and get it sent to your colo.
And then you still have more stuff to manage now, based on the slim chance that one day, years down the road, you might rip out your entire multi-AZ redundant infrastructure, your databases, etc. with all of the read replicas and move to another provider....
And this doesn’t count all of the third party hosted services.
Aurora (Mysql) redundantly writes your data to six different storage devices across multiple availability zones. The read replicas read from the same disks. As soon as you bring up a read replica, the data is already there. You can’t do that with a standard Mysql read replica.
OK. So you connect to Postgres on RDS - cloud agnostic.
You connect to S3, and:
a) You can build an abstraction service if you care about vendor lock-in so much
b) It has an API that plenty of open source projects are compatible with (I believe Google's storage is compatible as well)
Maybe you use something like SQS or SNS. Bummer, those are gonna "lock you in". But I've personally migrated between queueing solutions before and it shouldn't be a big deal to do so.
It's really easy to avoid lock-in; lambda really doesn't make it any harder than EC2 at all.
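As a rough sketch of (a) in Python with boto3 (the class and method names are illustrative, not a real library): the rest of the codebase talks to a tiny interface, and only one class knows about S3.

    import boto3

    class BlobStore:
        """The small interface the rest of the app codes against."""
        def get(self, key):
            raise NotImplementedError
        def put(self, key, data):
            raise NotImplementedError

    class S3BlobStore(BlobStore):
        """The only class that knows about AWS."""
        def __init__(self, bucket):
            self._s3 = boto3.client("s3")
            self._bucket = bucket

        def get(self, key):
            return self._s3.get_object(Bucket=self._bucket, Key=key)["Body"].read()

        def put(self, key, data):
            self._s3.put_object(Bucket=self._bucket, Key=key, Body=data)

Swapping providers then means adding one more subclass (GCS, local disk, whatever), not touching every call site.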
As long as you write your own wrappers to the SDKs changing cloud providers is definitely doable. We started full AWS stack with Lambda but have now been slowly refactoring our way into more cloud-provider agnostic direction. It's definitely not an existential threat level lock-in. Serverless technology is only starting out still and I'm pretty sure 5 years from now Lambda won't be the go-to platform anyway. Plus honestly we've learned so much from the first big project on Lambda that writing the next one with all of that in mind will be pretty great (and agnostic).
I don't believe that writing wrappers is particularly important, though I think that anyone who uses SQS is likely to build an abstraction over it at some point (as with all lower level communication protocols, at some point you build a "client library" that's more specific).
As I said, at least in the cases of your database and your storage, being cloud-agnostic is trivial. Managed postgres is easy to migrate from, S3 shouldn't be hard to migrate from either.
Certainly lambda doesn't impact this too much.
> Serverless technology is only starting out still and I'm pretty sure 5 years from now Lambda won't be the go-to platform anyway. Plus honestly we've learned so much from the first big project on Lambda that writing the next one with all of that in mind will be pretty great (and agnostic).
I realize it isn't entirely on-topic, but could you elaborate? I'm curious to hear more about your opinion on this, I'm not sure what the future of Serverless is.
And that goes back to developers using the repository pattern because one day the CTO might decide that they want to get rid of their 6-7 figure Oracle installation and migrate to Postgres. There is a lot more to migrating infrastructure at scale than writing a few facades.
Heck, consultants get paid lots of money just to do a lift and shift and migrate a bunch of VMWare images from on prem to AWS.
> a) You can build an abstraction service if you care about vendor lock-in so much
> ...
> It's really easy to avoid lock-in; lambda really doesn't make it any harder than EC2 at all.
Yes, you can build an abstraction layer. And maintain it. And hope that you don't get feature divergence underneath it.
Have you ever asked the business folks or your investors whether they care about your “levels of abstraction”? What looks better on your review? I created a facade over our messaging system, or I implemented this feature that brought in revenue/increased customer retention/got us closer to the upper right of Gartner’s Magic Quadrant?
Why should they care, or even be in the loop for such a decision?
You don’t ask your real estate agent for advice on fixing your electrical system, I guess?
Of course your business folks care whether you are spending time adding business value and helping them make money.
I’ve had to explain to a CTO before why I had my team spending time on a CI/CD pipeline. Even now that I have a CTO whose idea of “writing requirements” is throwing together a Python proof-of-concept script and playing with Athena (writing SQL against a large CSV file stored in S3), I still better be able to articulate business value for any technological tangents I am going on.
Sure. Agree totally, maybe I misread your previous comment a bit.
What I meant is that run-of-the-mill business folks do not necessarily know how business value is created in terms of code and architecture.
I don't know of any business where they wouldn't be involved. Not in the "Let's talk directly about implementation details" way, but in the "Driving product development and roadmap" and "ascertaining value to our customers" way.
Any time spent on work that doesn't directly create value for customers is work that the business should be weighing in on. I'm not saying that you should never spend any time doing anything else - but these are trade-offs that the product manager should be involved in, and one of their primary jobs is being able to weigh the technical and business realities and figure out where resources should be going.
> and that it requires virtually no effort to avoid it
Of course it requires effort. A lot of effort, not to mention headcount. The entire value of cloud-managed services is what they save you vs. the trade-offs, and it's disingenuous to pretend that's not the case.
Sorry, I don't agree, and I feel like I provided evidence why in my first post. To summarize, choosing services like Postgres and S3 doesn't lock you in. SQS and SNS might, but I think it's an exaggerated cost, and that has nothing to do with Lambdas (EC2 instances are just as likely to use SQS or SNS - moreso, given that SQS wasn't supported for Lambdas until recently).
There are tradeoffs, of course. Cost at scale is the really big one - at some point it's cheaper to bring ops/ hardware in-house.
I just don't agree that lock-in is a huge issue, and I really disagree with the idea that lambdas make lock-in harder.
There's a big difference between AWS RDS and self-managed. Huge difference.
- DBA's & DevOps
- Procurement management & spare parts
- Colocation w/multihoming
- Leasing agreements
- Paying for power usage
- Disaster recovery plan
- CapEx & depreciation
- Uncomfortable meetings with my CFO explaining why things are expensive
- Hardware failure
- Scaling up/out
Not even worth going on because the point is obvious. Going "all in" reduces cost and allows more time to be focused on revenue-generating work. The "migration" boogeyman is just that, something we tell other programmers to scare them around the campfire. You're going to be hard-pressed finding horror stories of companies in "cloud lock-in" that isn't a consultant trying to sell you something.
> at some point it's cheaper to bring ops/ hardware in-house.
It depends. It's not always a scale issue, and as with all things it starts with a model and collaboration with your finance team.
While I could probably answer that, I don't think it's relevant to my central point - that lock-in is not as big of a deal as it's portrayed as, and that lambdas do not make the problem considerably worse.
That’s an incredibly ignorant and misleading statement. It’s sort of like saying a database isn’t valuable because 99.999% of requests hit the cache, and not the disk.
Everything was built on Amazon and video is largely hosted on S3. Yes, there’s a large CDN in the mix too. That doesn’t take away from the achievement.
Well, what do you think Netflix is doing to be AWS’s largest customer? Have you seen any of their presentations on YouTube from AWS reinvent? Where do you think they encode the videos? Handle sign ins, etc?
That’s just the CDN. Netflix is still by far AWS’s biggest customer and its compute is still on AWS. I don’t think most companies are going to be setting up colos at ISPs around the world.
Our Lambda deployments handle REST API Gateway calls, SQS events, and Step functions. Basically the entire middleware of a classic 3-tier stack.
Except for some proprietary light AWS proxy code, the bulk of the Lambdas delegate to pre-existing Java POJO classes.
The cold start issues and VPC configuration were a painful learning curve, but nothing I would consider proprietary to AWS. Those are universal deployment tasks.
> Unless you're performing a larger job or something you probably need to read/write data from somewhere and connecting to a normal database is too slow for most use-cases.
This is false. I've seen entire Lambda APIs backed by MySQL on large, consumer-facing apps and websites. As another poster pointed out, the cold-start-in-a-VPC is a major PITA, but it can (mostly) be worked around.
And there is always DynamoDB, where you aren’t in a VPC, and Serverless Aurora, where you don’t have to worry about the typical database connections and can use the HTTP-based Data API.
How is the Aurora Serverless Data API now? On preview release it was a bit sketchy: horrible latencies (pretty much ruining any benefit you could get from avoiding the in-VPC cold start) and a dangerous sql-query-as-a-string API (no prepared statements or placeholders for query params that would get automatically escaped IIRC).
Unfortunately, we require the ability to load/unload directly from S3 and Aurora serverless doesn’t support that. We haven’t been able to do anymore than a POC.
Dynamo is really the hardest lock-in in the ecosystem for me. Serverless Aurora is still easy to kill with too much concurrency/bad connection management compared to Dynamo
When lambdas haven't been hit for 15 minutes, the first hit after that has a noticeably longer start time. It's due to deprovisioning/reprovisioning underlying resources like network interfaces. Some people do crazy stuff like a scheduled task that hits their own service to combat this, so AWS promised to solve it.
Even if you invoke your lambda function to warm it up in anticipation of traffic, you'll still hit cold starts if the lambda needs to scale out; the new machines are exposed to inputs "cold." Those patterns trying to warm the lambda up are really crazy if you think about it, because no one using them is really aware of the underlying process(es) involved.
"Why are you throwing rocks at that machine?"
"It makes it respond more quickly to actual client requests. Sometimes."
"Sometimes?"
"Well, most the time."
"Why's that? What's causing the initial latency?"
"Cold starts."
"Yeah, but what's that mean?"
"The machine is booting or caching runtime data or, you know, warming up and serverless. Anyway, don't think about it too much, just trust me on this rock thing. Speaking of which, I got to get back to bastardizing our core logic. Eddie had great results with his new exponential backup rock toss routine and I'm thinking of combining that with a graphql random parameter generator library that Ted said just went alpha this afternoon."
In addition to what the sibling reply said. There is also the issue of your choice of runtimes. Java has the slowest startup time and Go the fastest. C# is faster than Java but still slower than Python and Node.
Clever uses of dependency injection without reflection (think dagger not spring) and reducing code size as much as possible can give you fairly fast cold start times with a Java execution environment. Bloated classpaths and reflection filled code will destroy your initialization time.
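Whatever the runtime, the other cold-start lever (besides trimming dependencies) is keeping expensive initialization at module scope, so it runs once per container and warm invocations reuse it. A Python sketch, assuming boto3 and a hypothetical DynamoDB table and event shape:

    import json
    import boto3

    # Module scope runs once, at cold start; warm invocations reuse the
    # same client and its connections.
    _table = boto3.resource("dynamodb").Table("example-table")  # hypothetical table

    def handler(event, context):
        # Only per-request work happens inside the handler itself.
        _table.put_item(Item={"pk": str(event["id"]), "payload": json.dumps(event)})
        return {"ok": True}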
I think it probably isn't purely the use of lambda/serverless but the creep of other things that make it more difficult to leave. Stuff like cognito or SNS or other services. Once you start using AWS, each individual choice in isolation looks cheaper and easier to just use something AWS provides. Once you start utilizing 7+ different AWS services it becomes very expensive to transition to another platform.
Also, this conversation is very different if you are an early stage company racing to market as opposed to an established organization where P&L often takes precedence over R&D.
At my previous place I saw this in reverse. Over the previous years we had invested in building up our own email infrastructure.
Because so much had been invested (sunk cost fallacy), no-one could really get their heads around a shift to SES, even though it would have been a slam dunk in improved reliability and developer productivity.
Whereas if we were on, say, Mailgun, and then someone wanted a shift to SES, that debate could probably have been a more rational one.
I just point this out to say that investing in your own infrastructure can be a very real form of lock-in itself.
Other way around, you'd be surprised at how much infrastructure you can buy by forgoing AWS's offerings (for large work scales.)
For small companies, you may not be able to afford infrastructure people, and moving fast makes way more sense. There's little point in paying for an ops person when you have very little infrastructure.
At a certain scale though, AWS stops being cost effective. You begin to have room in your budget for ops people, you get room to afford datacenter costs, and you can start paying for a cloud architect to fill out internal or hybrid cloud offerings using openshift or openstack.
>Yeah, Netflix's opinion is to use Amazon as little as possible.
This is simply untrue. Everything but their CDN uses AWS.
>Their critical infrastructure (the CDN) is not anywhere near the slimy grip of AWS.
The streaming website and app aren't critical infrastructure? Databases containing all of their business and customer details aren't critical infrastructure? Encoding content so it can be delivered by the CDN isn't critical infrastructure?
That's like saying I don't trust Ford because I buy Michelin tires while I drive a Fusion.
The CDN is in the ISPs data center. You can’t get much lower latency than that. But if Netflix is AWS’s largest customer, I doubt they are using it as little as possible.
Netflix does presentations every year at ReInvent about how they use AWS and they have a ton of open source tooling they wrote specifically tied to AWS.
This is the same model of buying things on amazon too -- once you've bought once, it's easier to buy again. Why spend time going to another shop when you can just use amazon to buy multiple things.
This ease of use philosophy goes way back to the one-click patent. If I want DNS, why wouldn't I go to amazon, which has all my finance details, and a decent interface (and even API), rather than choosing another DNS provider, setting up my credit card, and having to maintain an account with them. So I choose DNS via AWS. Then I want a VPS, but why go to linode and have the overhead of another account when I could do lightsail instead?
And then I can use Amazon Certificate Manager, create a certificate, attach it to my load balancer and CloudFront distribution, and never have to worry about expiring certificates - and they are free.
You don't necessarily have to host this yourself. With this, there's a relatively straightforward (but obviously not necessarily easy) way for Joe Entrepreneur to set up a hosted service to compete with AWS Lambda, thus helping the community avoid vendor lock-in.
So can Joe Entrepreneur also host my databases? my storage? My CDN around the world? My ElasticSearch cluster? My Redis cluster? All of my objects? Can he provide 24 hour support? Can he duplicate my data center for my developers in India so they don’t have lag? What about my identity provider? My Active Directory?
Can he do all that across multiple geographically dispersed redundant data centers?
I think you're missing the point - you may need to move to your own infrastructure for security, privacy, regulatory or accountability reasons. You may encounter new needs which AWS may no longer meet, cheap bandwidth for example (AWS bandwidth is both way overpriced and on some rather poor networks). Or Amazon may decide that the price of all their services is now going up by a factor of 5, and if you have your attitude, well, you're stuck paying for it.
Having relatively easy-to-spin-up alternatives is a great thing. I can run my application entirely on a local kubernetes cluster or one on Amazon, DigitalOcean or Google's cloud services. That sort of flexibility is excellent and has allowed us to scale into situations where we otherwise couldn't have affordably done so (being able to buy some bandwidth from Joe Entrepreneur has its benefits sometimes).
> I think you're missing the point - you may need to move to your own infrastructure for security, privacy, regulatory or accountability reasons.
Which compliance regulation requires you not to use a cloud provider? At most they may require you not to share a server with another company - that can be done with a drop-down - or that the data has to be hosted locally - again, that can be done by selecting a region.
>Which compliance regulation require you not to use a cloud provider?
The policies that say not to use a company controlled by the US government. Or the ones that say under no circumstances should the data be sent over the Internet to a third party "because OPs are hard".
Which regulation or policy? Which certification? Name names. It’s not any of the financial, legal, or health care compliance regulations that I’m aware of.
In short, most German laws make it incredibly risky (but not forbidden) to use any American company for any kind of data that can be resolved to the underlying person. (E.g. a lot of companies got their warning shot when "Safe Harbor" exploded; if the same happens with https://www.privacyshield.gov/welcome, a world of shit awaits.)
Yeah I can see explaining to our business customers who grill us about the reliability of our infrastructure that we host our infrastructure on Digital Ocean....
You realize the databases I’m referring to are hosted versions of MySql/Postgres, the ElasticSearch cluster is just that - standard ElasticSearch - and Redis is just that - Redis - and you can set up a CDN anywhere?
Even if you chose to use AWS’s OLAP database, Redshift, it uses standard Postgres drivers to connect. You could move it to a standard Postgres installation. You wouldn’t get the performance characteristics of course.
If you don’t want to be “locked in” to DynamoDB, there is always the just announced Mongo compatible DocumentDB. Of course ADFS is used everywhere.
Why in the world would I want to manage a colo with all of those services myself and still not have any geographic redundancy - let alone any place near our outsourced developers?
None of my list is esoteric - a website with a caching layer, a database, a CDN, a search index, and a user pool that can survive a data center outage and some place to store assets.
Seeing that, in all the time Y Combinator has existed, only one company has ever gone public and the rest are still either humming along unprofitably, have been acquired, or have gone out of business, the chances of most businesses having to worry about over-reliance on AWS as their largest business risk seem slim.
It’s easy enough to separate your event sources from your event destinations. Anything that can trigger lambda can also trigger an SNS message that can call an API endpoint.
If I did want to move off AWS, that’s the first thing I would do. Put an API endpoint in front of my business logic, change the event from triggering lambda to an SNS message, and move my API off of AWS. Then slowly migrate everything else.
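A sketch of that interim step in Python (the endpoint URL is hypothetical and can be hosted anywhere): the Lambda becomes a thin forwarder with no business logic in it, and once the SNS subscription points at the HTTP endpoint directly, the function can disappear entirely.

    import urllib.request

    API_ENDPOINT = "https://api.example.com/events"  # hypothetical endpoint

    def handler(event, context):
        # No business logic here: just relay each SNS message to the API.
        for record in event.get("Records", []):
            req = urllib.request.Request(
                API_ENDPOINT,
                data=record["Sns"]["Message"].encode("utf-8"),
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req)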
In most cases, very few companies have products that need to scale to extreme load day 1 or even year 1. IMO, instead of reaching for the latest shiny cloud product, try building initially with traditional databases, load balancing, and caching first. You can actually go very far on unsexy old stuff. Overall, this approach will make migration easier in the cloud and you can always evolve parts of your stack based on your actual needs later. Justify switching out to proprietary products like lambdas, etc once your system actually requires it and then weigh your options carefully. Everyone jumping on the bandwagon these days needs to realize: a LOT of huge systems are still rocking PHP and MySQL and chasing new cloud products is a never ending process.
In my case switching to an AWS API Gateway + Lambda stack means I have zero-downtime, canary deployments that take less than 5 minutes to deploy from version control. API Gateway is configured using the same swagger doc that autogenerates my request/response models (go-swagger) and (almost) completely removes routing, request logging, throttling and authentication concerns from the request handlers. Combined with a statically hosted front-end and SNS/SQS+lambda pub-sub for out-of-process workers, I never have to worry about auto-scaling, load-balancing or cache-invalidation, and we only pay for what we use. It may not suit every use case, but in our case we have bursty, relatively low-volume traffic, and the hosting bill for our public-facing site/service that comprises most of the main business revenue is the same as a rounding error on our BI service bill.
We use golang lambdas, binaries are built in our CI pipeline. Build stage takes ~10 seconds, tests (integration + unit) take ~30 seconds. We use AWS SAM for generating our CFN templates, we package and deploy using the AWS Cloudformation CLI and this takes the remaining 3-4 minutes.
I didn't include post-deployment end-to-end tests in the 5 minute figure, but technically speaking, we do deploy that quickly
We have a fair number of endpoints, so due to the CloudFormation 200-resource limit per stack, we end up creating about 10 different stacks that frankenstein themselves onto a main API gateway stack.
Try deploying changes to Google Cloud load balancers. They update within a few seconds, but the changes take several minutes to be applied. The first time, I was scratching my head over why my changes didn't work as expected...
This (pay for what you use, so many fewer scalability issues) is so big that it, by itself, can give you a competitive advantage over anyone who isn't doing it, which is almost everyone.
Maybe I'm overstating it, but I don't think I am...
I think you're overstating it. Why do people care so much about scalability issues, anyway? Given that (a) a stateless server plus an SQLite instance is much, much easier to set up than the proprietary, poorly documented mess that is Lambda, (b) that server can easily be horizontally scaled, and the SQLite instance can be swapped out with any other SQL database with some effort, and (c) a single server with SQLite will easily handle up to 100K connections, it doesn't seem like scaling was ever an issue for most websites.
I don't think you've used the tools that can work with lambda or had to actually scale something in production, based on your response...
It's all a lot harder than you make it out to be, but at least with lambda (and something like Zappa) you don't have to figure anything out beyond how you get your first environment up. There's just no second step, and that's huge.
Using a scripting language like Python or Node, it literally is just adding one function that takes in a JSON event and a context object as your entry point.
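In Python, that entry point really is this small (the greeting logic is just a placeholder):

    def handler(event, context):
        # 'event' is the JSON payload from whatever triggered the function;
        # 'context' carries runtime metadata (request id, remaining time, ...).
        name = event.get("name", "world")
        return {"message": "hello, " + name}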
With Google Firebase Functions I was able to start writing REST APIs in minutes.
Compare that to setting up a VM somewhere, getting a domain name + certs + express setup + deployment scripts, and then handling login credentials for all of the above.
I had never done any of that (eventually I grew until I had to), so serverless let me get up and running really quickly.
Now I prefer my own express instance, since deployment is much faster and debugging is much easier. But even for the debugging scenario, expecting everyone who wants to Just Write Code to get the horrid mess of JS stuff up and running in order to debug, ugh.
(If it wasn't for HTTPS, Firebase's function emulator would be fine for debugging, as it is, a few nice solutions exist anyway.)
But, to be clear, on day 1 the options for me to write a JS REST endpoint were:
1. Follow a 5-10 minute tutorial on setting up Firebase Functions.
OR
1. Pick a VM host (Digital Ocean rocks) and set up an account
2. Learn how to provision a VM
3. Get a domain
4. Point the domain at my host
5. SSH into the machine as root, set up non-root accounts with the needed permissions
6. Set up certbot
7. Learn how to set up an Express server
8. Set up an nginx reverse proxy to get HTTPS working on my Express server
9. Write deployment scripts (ok, SCP) to copy my working code over to my machine
10. Set up PM2 to watch for script changes
11. Start writing code!
(12. Keep track, in a secure fashion, of all the credentials I just created in the above steps!)
I am experienced in a lot of things, and thankfully I had some experience messing around with VMs and setting up my own servers before, but despite what everyone on HN may think, not every dev in the world also wants to run a bunch of VMs and manage their setup/configuration just to write a few REST endpoints!
So yeah, instead I can type 'firebase deploy' in a folder that has some JS functions exported in an index.js file and a minute later out pops some HTTPS URLs.
If you don't want to learn DevOps why not use a PaaS like Heroku? That way when you want to learn DevOps, you can move your application without rewriting large swathes of it.
It's funny but when I learned to code basically all ISPs provided you with free hosting and a database, and you just needed to drag and drop a PHP file to make it live. It's like we have gone backwards not just in terms of openness but also in terms of complexity.
This seems a bit exaggerated. You definitely don't need to start with certbot or even with a domain name. Why can't you just start with a regular server on localhost, which can be set up in way less than ten minutes, and publish it only when you're ready? Not to mention that Firebase Functions is way less beginner-friendly than Express.
Mobile app development requires an HTTPS connection.
Every React Native tutorial out there has a section on setting up user auth with Firebase, and then putting a few REST endpoints up.
It is simple enough that a beginner "never touched mobile or web" development tutorial can go through it in under an hour.
Firebase is incredibly simple to get started with.
Another solution is to use one of the HTTPS port forwarding services that takes a localhost server and gives it a public HTTPS endpoint, but that is more work to explain than
firebase deploy
so the tutorial authors go with Firebase. Auth being super simple is icing on top.
I'm generally a proponent of DIY, but this doesn't make sense. When an org needs to scale, it makes sense to cultivate skill in the underlying infrastructure. On day 1, serverless makes a lot of sense because it encourages development patterns that will eventually scale nicely.
I have mixed feelings about this, and I don't have enough experience to label one method better than the other.
Lambdas basically require zero maintenance. SQS requires zero maintenance. An EC2 load balancer is zero maintenance. The setup is trivial too, and there's no migration time down the line. If you start off with native cloud for everything, you can keep your maintenance and setup costs down drastically.
However, a lot can be done with the old school unsexy tech.
Zero maintenance? What about standing up multiple environments for staging and prod? Sharing secrets, keys, and env vars? Deployments? Logging? No migration time down the line? I'm pretty sure GCP and Azure don't have Lambda, SQS, or EC2 load balancers, so you absolutely will have migration time if you have to retool your implementation to switch cloud providers or products.
You make really good points about that. The classic things do migrate providers the easiest of all.
I've just found in my experience that maintaining a web server or a database server, keeping security in mind, handling upgrades, scaling, etc. is a lot more work than simply spinning up RDS and a Lambda with API Gateway. Or even hosting static sites on S3 or Netlify.
Like I said I don't know enough to say one is better.
Honestly, I've developed the last few projects with the Serverless framework and deployed to AWS Lambda. My biggest project is a custom Python application-monitoring solution that has 4 separate Lambda functions/handlers, populates an RDS PostgreSQL database with monitoring data (it runs in a VPC), then reads from that database using complex joins across multiple application metric tables to send time-series data to CloudWatch metrics.
Then I configured CloudWatch alarms with thresholds for certain metrics that send a webhook to PagerDuty.
The benefit is that my monitoring system never goes down, never needs to be patched manually (AWS even patches my Postgres database, and fails over to the warm standby during patching), and never needs any system administration.
Have you ever worked at a company where you had a serious outage that you didn't detect quickly enough because a monitoring system was down? Having a Serverless monitoring system means this has happened 0 times despite our app running in production for almost a year now.
> In most cases, very few companies have products that need to scale to extreme load day 1 or even year 1.
That wouldn't be a great reason to choose serverless indeed. However, that doesn't mean serverless isn't still the right choice.
We've tried both the traditional approach you describe and serverless, and from experience the latter is 10x less infrastructure code than the former (we compared both Terraform implementations).
If serverless fits your use case, saving time and effort is a very good reason to go for it IMHO.
Of all the AWS features to criticize for lock-in, Lambda seems like the weakest choice.
You don't have to write much code to implement a lambda handler's boilerplate, and that boilerplate is at the uppermost or outermost layer of your code. You could turn most libraries or applications into lambda functions by writing one class or one method.
A lambda's zip distribution is not proprietary and is easy to implement in any build tool.
I'd include the triggers as part of that analysis, like being able to invoke a function every time something is pushed to an S3 bucket for example. Just being able to run arbitrary functions without caring about the OS is the core product, but the true value is that you can tie that into innumerable other services that are so helpfully provided.
Basically, AWS has so much damn stuff under their belt now, and it all integrates so nicely, every time they add a new feature it lifts up all the other features as a matter of course.
I tend to agree, although with most things I think it depends on how heavily invested you are. Migrating a handful of mission-critical Lambdas is no biggie, but if you've really taken the bait and implemented your entire web services architecture on AWS API Gateway and Lambda -- for some reason -- you've got a much tougher job untangling yourself. Perhaps it suffices to say it's worth keeping an eye on how much friction you're building for yourself as you go.
"I'm scared of vendor lock-in, so I'm going to build something that's completely provider agnostic" means you're buying optionality, and paying for it with feature velocity.
There are business reasons to go multi-cloud for a few workloads, but understand that you're going to lose time to market as a result. My best practice advice is to pick a vendor (I don't care which one) and go all-in.
And you'll forgive my skepticism around "go multi-cloud!" coming from a vendor who'll have precious little to sell me if I don't.
That sounds like the perspective of someone who's picked open source vendors most of the time, or has been spoiled by the ease of migrating Java, Node, or Go projects to other systems and architectures. Having worked at large enterprises and banks who went all in with, say, IBM, I have seen just how expensive true vendor lock-in can get.
Don't expect a vendor to always stay competitively priced, especially once they realize a) their old model is failing, and b) everybody on their old model is quite stuck.
I am incredulous that people wouldn't be worried about vendor lock-in when the valley already has a 900lb gorilla in the room (Oracle).
Ask anybody about Oracle features, they'll tell you for days about how their feature velocity and set is great. But then ask them how much their company has been absolutely rinsed over time and how the costs increase annually.
Oracle succeeds by being only slightly cheaper than migrating your entire codebase. To offset this practice, keep your transition costs low.
--
Personal note: I'm currently experiencing this with Microsoft; all cloud providers charge an exorbitant premium for running Windows on their VMs, but obviously Azure is priced very well (in an attempt to entice you onto their platform). Our software has been built over a long period of time by people who were forced to run Windows at work -- so they built it on Windows.
Now we have a 30% operational overhead charged by Microsoft through our cloud provider. But hey.. at least our cloud provider honours fsync().
I think perhaps not all vendor lock-in is created equal. I too shudder at the thought of walking into another Oracle like trap, but it's also an error in cognition to make the assumption that all vendors will lock you in to the same degree and in the same way.
I guess those of us doing the cautioning are aware of the pitfalls, but others also have valid points about going all in.
There is a matrix of different scenarios, let's say.
You can go all in on a vendor and get Oracled.
You can go all in on an abstraction that lets you be vendor agnostic and lose some velocity while gaining flexibility.
You can go for a vendor and perhaps it turns out that no terrible outcome results because of that.
You can go all in on vendor agnostic and have that be the death of the company.
You can go all in on vendor agnostic and have that be the reason the company was able to dodge death.
Nobody can read the future and even "best practices" have a possibility of resulting in the worst outcomes. The only thing for it is to do your homework, decide what risks are acceptable to you, make your decision, take responsibility for it.
Vendors have 2 core requirements to continue operating: get new customers and keep the existing ones. Getting new customers requires constant innovation, marketing spend, providing value, etc. Keeping existing customers only requires making the pain of leaving greater than the pain of staying.
Sure. And even from that you still can't infer which outcome will materialize. If you made the technically correct decision and your business went under because of it, that is still gonna hurt no matter which way you look at it. Hence the advice: do your homework, figure out which risks are acceptable to you, make your choice, and take the responsibility. There is no magic bullet for picking the right option, only picking the option you can live with, because that's what you're going to have to do regardless of the outcome.
You might know all the theory on aviation and be a really experienced pilot and one day a sudden wind shear might still fuck you.
> all cloud providers have an exorbitant premium when it comes to running Windows on their VMs
Speaking from first-hand running-a-cloud-platform experience, it's because running Windows on a cloud platform is not easy, and it comes with licensing costs that have to be paid to Microsoft for each running instance (plus a bunch of infrastructure to support it). It's not even a simple per-instance-per-time-interval cost; there's all sorts of stuff wrapped up in it that impacts the effective cost. It requires a bunch of administrative work and specific coding to try to optimise those costs for the cloud provider.
In addition, where Linux will happily transfer between different hardware configurations, you'll often have to have separate Windows images for each hardware configuration, which means even more staffing overhead for both producing and supporting them. So e.g. SR-IOV images, PV images, bare metal images (for each flavour of bare metal hardware), etc. While a bunch of this work can be fully automated, it's still not a trivial task, and producing that first image for a new hardware configuration can take a whole bunch of work, even where you'd think it would be trivial.
> Oracle succeed by being only slightly cheaper than migrating your entire codebase
Amdocs, ESRI, Microsoft too... Their commercial strategy is a finely tuned parasitic equilibrium.
Sales training that emphasizes knowing one's customer is all about that: if the salesperson understands the exit costs better than the customer does, the customer is going to be milked well and truly!
I'm thinking that the sales side may benefit from hiring people with experience in the customer's business, to game out the technical options in an actual study... I guess they do that already - I'm not experienced on the sales side.
At least if you went with IBM and followed the old adage “no one ever got fired for buying IBM” in the 80s, you can still buy new supported hardware to run your old code on. If you went with their competitors, not so much.
The people who designed their systems such that they could be easily transitioned off of IBM did so long ago. Those systems now run exponentially cheaper and have access to more resources.
Vendor lock-in was just as much of a problem then as now.
Yes because people in the 80s writing COBOL were writing AbstractFactorySingletonBeans to create a facade to make their transitions to newer platforms easier....
Well, our next iSeries is a whole lot cheaper than our current iSeries and quite a bit more powerful. The Power 9 is not something I would call a dinosaur.
> That sounds like the perspective of someone who's picked open source vendors most of the time
More than that. They picked open source vendors that didn't (a) vanish in a puff of smoke, (b) get bought out by a company that slowly killed the open source version, or (c) who produced products that they were capable of supporting without the vendor (or capable of faking support for).
Vendor lock-in can be expensive, but spreading yourself across vendors can also be expensive. There are lots of things that are free if you stay in the walled garden, but the second you start pulling gigs and gigs of data from AWS S3 into GCP (or whatever), it can get pricey real fast.
In general I agree with you. In practice, the more practical approach may be to focus on making more money and not fuss too much over vendor costs, whatever strategy you choose to use. It's easy to pay a big vendor bill when there's a bigger pile of cash from sales.
My best-practice advice is to do the math: what is the margin on infrastructure versus the acquisition and management costs of the engineers necessary to operate that infrastructure?
Serverless doesn't scale very well on the axis of cost. At some point that's going to become an issue. If you've gone "all-in" on vendor lock-in, then that vendor is going to spend as much time as possible enjoying those margins while your attempt to re-tool to something else is underway.
Best practice, generally speaking, is to engineer for extensibility fairly early on.
Self-hosting doesn't scale. There is very little reason for a good sysadmin to work for International Widgets Conglomerated when they can work for a cloud provider instead, building larger-scale solutions more efficiently for higher pay. I'd rather buy an off-the-shelf service used by the rest of the F500 than roll my own. Successful companies outsource their non-core competencies.
Self-hosting doesn't scale? Have you done the math? If so, I'd be curious to see it.
There are a number of reasons that self-hosting doesn't make sense which have very little to do with scale and more to do with the lack of scale. For very little investment, one can get highly available compute, storage, networking, load-balancing, etc. from any of the major cloud providers. Want to make that geographically distributed? In your average cloud provider that's easy-peasy for little added cost.
Last time I had to ballpark such a thing, which is to say, the minimum possible deployment I'd be willing to support for a broad set of services, I settled on three 10kW cabinets in any given retail colo space with twenty-five servers per cabinet, each consuming an average of 300W. Those servers were around $10k apiece and were hypervisor-class machines, i.e. lots of cores and memory for whatever year that was. Some switches, a couple of routers, and 4x GigE transit links.
Of course I'd want three sets of that, spread across regions of interest. If I were US-focused: east coast, west coast, and Chicago or thereabouts. All the servers and network gear come to around $1.5m CapEx. OpEx is $200/kW for the power and space and around $1/Mbps for the transit. Note that outside the US, the price per kW can be much, much higher.
So, $6k MRC for the power and $4k MRC for the intertubes. That's $10k OpEx on top of ~$42k/month in depreciation ($1.5m/36) on your CapEx; multiplied by three, that gives you $156k/month.
Let's assume my middle-of-the-road hypervisor-class machine has all the memory it needs and two 16-core processors with hyperthreading, so 64 vCPU each, or 14,400 vCPU across your three data centers, all for only around $2m/yr with nearly $5m of that up front.
That's a boat load of risk no startup or small enterprise is going to take on. You still have to staff for that and good luck finding folks that aren't asshats that can actually build it successfully. They're few and far between. That said, it does scale. It scales like hell, especially if you can manage to utilize that infrastructure effectively. I wager that if you were to look at what it would cost to hold down that much CPU and associated memory continuously in AWS then you'd be paying roughly 6x as much.
14,400 vCPU of R4, 3yr reserved, comes to about $300k MRC. I'm guessing you'd run Ceph or Rook on your bare metal and have ~8 1TB SSDs per server, so 75 servers * 8 SSDs / 3 (for replication) is 200TB with decent performance per data center, times three data centers for 600TB usable; the equivalent in EC2 GP2 at $0.10/GB comes to roughly $60k MRC.
Less any network charges that's $360k vs. $156k self-hosted. Guess I'm wrong. It's only twice as much.
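Recapping the arithmetic above in one place, using only the figures the comment assumes (a sketch, not anyone's actual price list):

    package main

    import "fmt"

    func main() {
        // Self-hosted: three sites, $1.5m CapEx each depreciated over 36 months,
        // plus ~$10k/month per site for power, space and transit.
        const sites = 3
        capexPerSite := 1_500_000.0
        depreciationPerSite := capexPerSite / 36 // ~$41.7k/month (rounded to $42k above)
        opexPerSite := 6_000.0 + 4_000.0         // power/space + transit
        selfHosted := sites * (depreciationPerSite + opexPerSite)

        // Cloud: 14,400 vCPU of 3yr-reserved R4 (~$300k/month per the comment)
        // plus ~600TB of GP2-class storage at $0.10/GB-month.
        compute := 300_000.0
        storage := 600_000 * 0.10 // 600TB expressed in GB
        cloud := compute + storage

        fmt.Printf("self-hosted: ~$%.0fk/month\n", selfHosted/1000) // ~$155k; the comment rounds to $156k
        fmt.Printf("cloud:       ~$%.0fk/month\n", cloud/1000)      // ~$360k
    }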
That's just not true. There are plenty of companies that self-host highly scaling infrastructure. Twitter being just one of those companies. They've only recently started thinking about using the cloud, opted for Google, and that's only to run some of their analytics infrastructure.
> There is very little reason for a good sysadmin to work for International Widgets Conglomerated when they can work for a cloud provider instead, building larger scale solutions more efficiently for higher pay
That's not true either (speaking from personal experience working for companies of all sizes, from a couple dozen employees on up to and including AWS and Oracle on their cloud platforms). For one thing, sysadmin is far too broad a job role to make such sweeping statements about.
A whole bunch of what I do as a systems engineer for cloud platforms is a whole world of difference from general sysadmin work, even ignoring that sysadmin is a very broad definition that covers everything from running exchange servers on-prem, to building private clouds or beyond.
These days I'm not sure I've even got the particular skills to easily drop back in to typical sysadmin work. Cloud platform syseng work requires a much more precise set of skills and competencies.
All that aside, I can point you in the direction of plenty of sysadmins who wouldn't work for major cloud providers for all the money in the world, either for moral or ethical reasons; or they're just not interested in that kind of problem; or even just that they don't want to deal with the frequently toxic burn-out environments that you hear about there.
> I'd rather buy an off the shelf service used by the rest of the F500 than roll my own.
Nowhere near as much of the F500 workload is in the cloud as you apparently believe. It's a big market that isn't well tapped. Amazon and Azure have been picking up some of the work, but a lot of the F500 don't like the pay-as-you-go financial model. That plays sweet merry havoc with budgeting, for starters. It's one reason why Oracle introduced a pay model with Oracle Cloud Infrastructure that allows CIOs to set fixed IT expenditure budgets. Many of the F500 companies have only really started talking about actually moving to the cloud in the last few years (when OCI launched at Oracle OpenWorld, a startling number of CIOs from large and well-known companies came up to the sales team and said "So.. what is this cloud thing, anyway?").
> Successful companies outsource their non core competencies
Yes.. and no. Successful companies outsource their non-core competencies where there is no value having them on-site. That's very different.
Honestly, I think Twitter is a counterpoint to your argument.
Twitter was founded in 2006, the same year AWS was launched, so in the early days Twitter didn't have a choice - the cloud wasn't yet a viable option to run a company.
And, if you remember in the early days, Twitter's scalability was absolutely atrocious - the "Fail Whale" was an emblem of Twitter's engineering headaches. Of course, through lots of blood, sweat and tears (and millions/billions of dollars) Twitter has been able to build a robust infrastructure, but I think a new company, or a company who wasn't absolutely expert-level when dealing with heavy load, would be crazy to try to roll their own at this point unless they wanted to be a cloud provider themselves.
> And, if you remember in the early days, Twitter's scalability was absolutely atrocious - the "Fail Whale" was an emblem of Twitter's engineering headaches.
That's because Twitter was:
1) A monolith
2) Written in Ruby
They started splitting components up into specialised, made-for-purpose components using Scala atop the JVM, and scaling ceased being a big issue. The problems they ran into couldn't be solved by horizontal scaling. There isn't any service that AWS offers even today that would have helped with those engineering challenges.
Yes, based on their own detailed analysis and extensive technical blogs on the subject. Ruby was doing a lot of processing of the tweets' contents, etc., and at the time Ruby was even worse for performance than it is today. (Ruby may be many things, but until the more recent JIT work, fast was not one of them.)
> That's just not true. There are plenty of companies that self-host highly scaling infrastructure. Twitter being just one of those companies. They've only recently started thinking about using the cloud, opted for Google, and that's only to run some of their analytics infrastructure.
Twitter is both unusually big and unusually unprofitable. You're unlikely to be as big as them, and even if you were I wouldn't assume they've made the best decisions.
> All that aside, I can point you in the direction of plenty of sysadmins who wouldn't work for major cloud providers for all the money in the world, either for moral or ethical reasons; or they're just not interested in that kind of problem; or even just that they don't want to deal with the frequently toxic burn-out environments that you hear about there.
It's best to work somewhere you're appreciated (both financially and for job-satisfaction reasons), and it's harder to be appreciated in an organization where you're a cost center than one where you're part of the primary business. There are good and bad companies in every field, and good and bad departments in every big company, but the odds are more in your favour when you go into a company that does what you do.
It doesn't have to be that way; for a client I recently set up GitLab Auto DevOps on Google Kubernetes Engine. Feature-velocity-wise it is nearly as painless as Heroku (which to me is the pinnacle of ease of deployment), but because every layer of the stack is open source we could switch providers with a flick of the wrist.
Of course, we won't switch providers, because they're offering great value right now.
I feel this vendor lock-in business is a phase that will pass. We were vendor locked when we paid for Unix or Windows servers, then we got Linux and BSD. Then we got vendor locked by platform providers like AWS and such, and now that grip is loosened by open source infrastructure stacks like Kubernetes.
> We were vendor locked when we paid for Unix or Windows servers, then we got Linux and BSD.
NT was released in '93, and FreeBSD 2.0 (free of AT&T code) was released in '94. GNU/Linux also saw professional server use in the mid/late '90s. People still lock themselves in, though.
If a company is going from zero to something, then you are absolutely right. In that case, dealing with vendor lock in later means success!
If an established company is moving to the cloud, the equation is not as simple. The established company presumably has the money and time to make its stack vendor-agnostic. Is avoiding the vendor lock-in risk worth spending more now? How large is the risk? What are the benefits (using all of the AWS services is pretty nice)?
>"benefits (using all of the AWS services is pretty nice)"
Is that your personal experience? Or are you simply assuming that interop / efficiencies obtain when going all-in on AWS? I ask because I've had multiple client conversations in which these presumed benefits failed to materialize to a degree that offsets the concern about lock-in.
I can't claim to have used all of the AWS services, but whenever I need something done I check if it's offered in AWS first. SQS, SNS, ECS, ALBs/ELBs, SFNs, Lambdas, media encoding pipelines, Aurora RDS, etc... have all made my job easier.
If your time horizon is short (months to a couple of years) going all-in with a vendor can work quite well. Over longer time horizons (several years or more), it's often not so great. Forgetting about the costs involved (i.e. they know they've got you) and the fact that if that one vendor ever becomes truly dominant that improvements will slow to a crawl (i.e. they know they've got everyone), there is a very real risk of whatever technologies you are depending on disappearing as their business needs will always outweigh yours. Feature velocity tends to evaporate in the face of a forced overhaul or even rewrite.
Pricing changes are a real issue. Google Maps spiked in cost, and those who were using Leaflet JS just had to change a few settings to switch to another provider. Those who built on Google's Maps JS are locked in.
Yeah... We had a fixed yearly price for a Google Maps API Premium account - when they switched us to PAYG, our costs increased 8x... We spent 2 weeks of unplanned urgent work switching to Mapbox at 3x the original cost...
The problem with that is there's only one thing, some product manager somewhere, sitting on the light switch of your company's or project's success or failure. They could even unintentionally end you with a price change.
For instance, putting your EC2 instances in a VPC has been the preferred way of operating since 2009. But, if you have an account old enough, you can still create an EC2 instance outside of a VPC.
You can still use the same SQS API from 2006 as far as I know.
At first this seems reasonable from a 'technical debt' perspective. Building in vendor-agnosticism takes extra resources (true), that you could spend on getting features to market (true), you can always spend those extra resources later if you succeed and need them... sort of true, not really. Because the longer you go, the _more expensive_ it gets to switch, _and_ the more disruptive to your ongoing operations, eventually it's barely feasible.
Still, I don't know what the solution is. Increasingly it seems to me that building high-quality software is literally not feasible, we can't afford it, all we can afford is crap.
There's a middle ground here. You can decide on a case by case basis how much lock-in you are willing to tolerate for a particular aspect of your system. You can also strategically design your system to minimize lock-in while still leveraging provider specific capabilities or "punting" on versatility when you want to.
In other words you can decide to not bother with worrying about lock-in when it costs too much.
This will make your code base easier to port to multi-cloud in the future if you should ever want to.
Obviously, there's a huge cost associated with the learning curve, but this the part of the reason that Kubernetes is so attractive. It abstracts away the underlying infrastructure, which is nice.
At any kind of scale, though, one is loosely coupled to the cloud provider in any case for things like persistent disk, identity management, etc.
This, and if you really really don't want vendor lock-in, instead of inventing your own infra APIs, find the API the cloud vendor you chose exposes, replicate it, make sure it works, and then still use the vendor with the comfort of knowing you can spin your own infra if the cloud provider no longer meets your needs.
Or you don't bother replicating the API, even if you don't want vendor lock-in, because you realize that if a cloud provider evaporates, there will be a lot of other people in the same boat as you and surely there will be open-source + off-the-shelf solutions that pop-up immediately.
Agree that time to market shouldn't be impeded by any unnecessary engineering.
But "pick a vendor and go all-in" can work for Netflix-y big companies or ones with static assets in the cloud. All cloud providers have their own rough edges, and if you get stuck on one you might lose your business edge. Case in point: I'm not going to name the provider since we are partners with them, but we found a provisioning bug where a custom-Windows-image-based VM took 15 minutes to get provisioned, and exporting a custom image across regions has rigid size restrictions. The provider acknowledged the bugs but isn't going to address them this quarter; if we were Netflix-big, maybe they would have addressed them sooner.
We have automated our cluster deployment so we can get our clusters up and running on most major cloud providers. We are careful to avoid vendor lock-in as much as possible, since our business edge can't be left to the mercy of the big cloud providers, who only heed your cry if you are big and otherwise care not at all about your business impediments. When the cloud resources you need aren't going to be static, you need flexibility, so the above recommendation doesn't suit everyone.
I'll echo this sentiment, and while it may sound like a philosophical position it really is pragmatic experience: it is nearly impossible to realize a gain from proactively implementing a multi-vendor abstraction. I've found this to hold in databases, third party services like payments and email, and very much so in cloud services. I instead recommend using services and APIs as they were designed and as directly (i.e. with as little abstraction) as possible. Only when and if you decide to add or switch to another vendor would be the time to either add an abstraction layer or port your code. I've never seen a case where the cost to implement two direct integrations was significantly more than the cost to implement an abstraction, and many cases where an abstraction was implemented but only one vendor was ever used.
I'll note that I have no objection to abstractions per se, especially in cases where a community solution exists, e.g. Python's sqlalchemy is good enough that I'd seldom recommend directly using a database driver, Node's nodemailer is in many cases easier to use than lower level clients, etc.
It very much depends on the nature of the services and the abstractions.
I'm currently working on a system that has several multi-vendor abstractions - for file storage across AWS, GCP, NFS and various other things; for message queues across Kafka, GCP Pubsub, and direct database storage; for basic database storage (more key-value style than anything else) across a range of systems; for deployment on VMs, in containers, and bare metal; and various other things.
All of these things are necessary because it's a complex system that needs to be deployable in different clouds as well as on-prem for big enterprise customers with an on-prem requirement.
None of the code involved is particularly complex, and it's involved almost zero maintenance over time.
That would less be the case if you were trying to roll your own, say, JDBC or DB-agnostic ORM equivalent, but there are generally off the shelf solutions for that kind of thing.
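As a rough illustration of the file-storage abstraction described above, here is a Go sketch of what the narrow interface can look like. The interface and type names are mine, not from the comment, and only a filesystem/NFS backend is shown; an S3 or GCS backend would wrap the respective SDK behind the same interface:

    package storage

    import (
        "context"
        "io"
        "os"
        "path/filepath"
    )

    // BlobStore is the narrow surface the rest of the application codes against.
    type BlobStore interface {
        Put(ctx context.Context, key string, r io.Reader) error
        Get(ctx context.Context, key string) (io.ReadCloser, error)
    }

    // FSStore is the on-prem/NFS implementation: keys map to files under a root.
    type FSStore struct{ Root string }

    func (s FSStore) Put(ctx context.Context, key string, r io.Reader) error {
        path := filepath.Join(s.Root, filepath.Clean(key))
        if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
            return err
        }
        f, err := os.Create(path)
        if err != nil {
            return err
        }
        defer f.Close()
        _, err = io.Copy(f, r)
        return err
    }

    func (s FSStore) Get(ctx context.Context, key string) (io.ReadCloser, error) {
        return os.Open(filepath.Join(s.Root, filepath.Clean(key)))
    }

Business logic depends only on BlobStore; which backend gets constructed is a startup/configuration decision, which is what keeps this kind of code low-maintenance.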
I would never argue against doing it in your case, but implementing an abstraction because multi-vendor support is an actual requirement is quite different from implementing an abstraction on top of a single vendor because you are trying to avoid "vendor lock-in".
I agree with this, but would note that building an abstraction layer is not the only way to approach this issue. Just building the thing with half an eye on how you would port it to a different platform if you needed to can make the difference between a straightforward like-for-like conversion and having to rearchitect the entire app...
I've been at places where they were so vendor locked to a technology that there was a penalty clause for leaving that was in the tens of millions. It obviously wasn't cloud but the point still stands. If you don't have options you pay what they tell you or go out of business.
Yeah, I can't help but wonder about offerings like AppSync; on one level it seems cool, but I recoil at the thought of introducing a critical dependency on AWS for a core piece of the application layer.
This has been my company's approach.
There's always going to be some provider-specific stuff you have to deal with. The networking has been a major difference between the clouds I've noticed. But I'm guessing in most cases our Helm charts would deploy unchanged to a different provider.
At the expense of losing what little reputation I have on HN I will say this:
As many others on here seem to be correctly saying, I think this article amounts to fear-mongering about vendor lock-in. The modern public cloud is very different from the Oracle/IBM mainframes of yesteryear.
The whole point of the public cloud is to leverage managed services to their fullest extent so you can move incredibly fast. As a startup, you'll run laps around competitors who do all of this from scratch simply to avoid vendor lock-in.
The notion that removing the glue code that ties your code to AWS or Azure managed services amounts to vendor lock-in is no more true of serverless than of any other code running on any VM that talks to those same managed services. The main difference here is that you're not wasting time writing the glue code.
Additionally, calling Azure Functions or AWS Lambda, or even functions on Kubernetes (which, used correctly, are meant to be the smallest unit of work, similar to a microservice, and should contain only your application logic) "vendor lock-in" is absolutely ridiculous to me. If anything, when you do decide to move vendors this will be the easiest code in the world to migrate: inputs and outputs.
I will concede that it is hard to see this the way I’m describing if you haven’t actually worked on the modern public cloud and are not actively taking advantage of managed services on there for speed of delivery.
A little self-promotion: as an example of what's actually possible with these serverless frameworks, I recently built a cross-platform app as a side project in just a few months of nights and weekends, with the entire backend running as serverless functions. The app can read any article to you, using some open-source ML models for text-to-speech, and can be found at https://articulu.com if you want to check it out.
There is a certain amount of arrogance in always being afraid of vendor lock-in. Most companies don't survive; even the best ones might only be around for 20-25 years. The big worry should be building a business that won't immediately die.
And even with Oracle (probably the primo example of lock-in), it's not like there aren't firms whose sole specialty is pumping data out of the Oracle DB and transforming it magically into T-SQL. It's never the end of the world with vendor lock-in.
NOTE: now vendor lock-out does scare me like no other ironically
> It’s never the end of the world with vendor lock-in.
I think that's a matter of scale. Using Oracle for your counter example isn't very persuasive, as that's a huge vendor, so there's a market to extract you. Not so for many other vendors.
> Most companies don’t survive
...and ensuring they can control their costs and pivot away from solutions that prove too complex is part of being able to survive.
I mean, I agree, vendor lock-in doesn't have to be the end, but it also makes sense to extricate yourself when you can. When I worked for Virginia state govt, they made a deal with Northrop Grumman... one that the legislative audit group later said was such a terrible deal that we should drop it... except we couldn't afford the exit costs, so we had to stick with a deal that was bad in terms of both money and quality.
That's a position you don't want to be in, and that result "It's bad but we can't afford to get out of it" is what the fear of vendor lock in is all about.
The only Y Combinator company to go public - Dropbox - started life completely dependent on AWS, proved product-market fit, got funding, and slowly moved off AWS.
Other companies like Netflix decided that they didn’t want to manage infrastructure and proudly announced they closed down their final data center and moved to AWS.
Twitter desperately needs to move everything as quickly from self hosting as possible.
Not the person you're responding to, but I worry about (and have experienced) both with my tech stack, even as I've purposefully switched vendors multiple times with minimal headaches.
Locking yourself into a single vendor is easier to voluntarily work your way out of than your vendor shutting down or shutting you out unexpectedly. But the good news is if you plan for one you get the other for free.
I would argue that small to medium web services need neither Kubernetes nor serverless. They don't even need to be split into services. Build a tidy monolith and see how far that takes you first. Have fewer moving parts.
Yes, serverless ties you to platform specifics, but by their nature the functions you create should be small and easy to reimplement elsewhere.
Kubernetes on the other hand is arguably also a certain lock-in, by virtue of being complicated. No wonder vendors love it, it’s an offering that is hard to do right in-house. And when Kubernetes releases updates only the most seasoned in-house teams will be able to keep up. It creates job security by being a lot to learn and manage. Yes there are good abstractions but when something breaks you’ll need to delve into that complexity below. (Makes me think of ORM abstractions vs SQL.)
Yes, Kubernetes is an awesome vehicle for orchestrating a swarm of containerized services. But when you’re not Netflix or Twitter scale it’s ridiculous to worship this complexity.
Frankly I keep coming back to appreciate Heroku's abstractions and its twelve factor app philosophy https://12factor.net/. Heroku runs on AWS but feels like a different world than AWS to develop on. I can actually get projects flying with a 2-3 person team me included.
Actually 'serverless' is where small shops might want to start.
A single Lambda can encompass a whole variety of functions, and if you're using a datastore that scales as well, you don't need to worry about much. Once it's set up, it should be very easy to monitor and change.
I'd rather have a simple Lambda than manage a couple of EC2s with failover scenarios and the front-end networking pieces for that.
Also, small scale is where Lambda really shines in terms of cost. If you have some API endpoint that gets hit 100 times per hour and does some execution, this is actually way cheaper than even the cheapest EC2 instance in a production setup with an ELB.
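For a rough sense of the gap, here is a back-of-the-envelope sketch of that "100 hits per hour" case. The per-request and per-GB-second rates are ballpark public Lambda list prices; the memory, duration, and EC2/ELB figures are assumptions of mine, and the free tier (which would cover this entirely) is ignored, so treat every number as illustrative:

    package main

    import "fmt"

    func main() {
        requestsPerMonth := 100.0 * 24 * 30 // ~72k invocations
        memoryGB := 0.128                   // assume a 128MB function
        durationSec := 0.2                  // assume ~200ms per invocation
        gbSeconds := requestsPerMonth * memoryGB * durationSec

        // Ballpark list prices: $0.20 per million requests, ~$0.0000167 per GB-second.
        lambdaCost := requestsPerMonth/1_000_000*0.20 + gbSeconds*0.0000167

        // Rough assumption: a small always-on instance plus a load balancer.
        ec2WithELB := 8.0 + 18.0

        fmt.Printf("lambda:  ~$%.2f/month\n", lambdaCost) // pennies
        fmt.Printf("ec2+elb: ~$%.2f/month\n", ec2WithELB) // tens of dollars
    }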
There's a world of organizations between Netflix and Twitter scale and small startups with 2-3 person infrastructure teams in which Kubernetes makes good sense.
At some point, it becomes cost effective to have a two-pizza team of engineers dedicated to it. Once an org hits that level, then the scale that can be achieved with Kubernetes is pretty immense.
I'm using k8s for a side project on Digital Ocean. I'm familiar with k8s, and deployment and maintenance are easy. The only devops I've gotten into so far is setting up DNS for the external load balancer. I think people are mistaken when they say k8s is too complex for the smallest apps.
"We've heard from our customers, if you cross $100,000 a month on AWS, they'll negotiate your bill down," said Polvi. "If you cross a million a month, they'll no longer negotiate with you because they know you're so locked that you're not going anywhere. That's the level where we're trying to provide some relief."
I've also negotiated with AWS, and both your position and theirs strike me as equally true.
There's certain products of theirs that they just aren't going to negotiate on, because they know they've got you, whereas others the clouds part and discounts rain down.
It certainly used to be this way when AWS had less competition, these days there's an Azure/Oracle Cloud/Rackspace/Google/etc alternative to most of their greatest hits, which gives a greater negotiating edge.
Lambda certainly has more alternatives than Dynamo for example. But I guess the true lock-in is the integration. If you use Lambda, chances are you'll end up choosing S3 and SQS and Dynamo and API Gateway etc.
"If you cross a million a month, they'll no longer negotiate with you because they know you're so locked that you're not going anywhere"
If they think you are unable to move at all, that may be true. The tone changes once you realize that either you can move, or you can't, but there's a lot more business coming and they won't get it.
I’m not buying that. Lambda is merely an execution environment. In most Lambda functions I write, the Lambda-specific bit is tiny, and could be easily replaced without affecting business logic.
On the other hand, most Lambdas I write interact with other AWS APIs, which is where the real lock-in is. The effort to eg. move the data off Dynamo is substantially higher than what’s required to switch that bit of code to run on k8s and consume a Kafka topic.
Cool. What part aren't you buying, the title of this post, or the actual five paragraphs in the article that actually give you the context of that title?
I think best practice is to think of serverless deployment as a technical operations technology, rather than as a methodology to eliminate the need for technical operations. Don't lose track of what you're effectively outsourcing to your serverless provider. Have a backup plan, just like in the old days you wanted a backup plan in case your datacenter provider had issues.
The problem I see with this kind of vendor lock in is you can get screwed in several different ways if you let yourself get locked in enough.
The 'good': a competitor overtakes AWS and is able to offer vastly cheaper or better value services than you have access to, rendering you less competitive than people who are able to move to that platform easily.
The 'bad': Amazon starts deprecating services you rely on and you're forced to port things anyway.
The 'ugly': Amazon decides that it's happy with its market share, or its shareholders start demanding more revenue, and they realise that those who are locked into AWS are easy targets. It'd be easy to just jack up prices on the higher tiers of things like Lambda, DynamoDB, API Gateway, etc., and on customers they have bespoke agreements with, without even necessarily affecting their market share.
It's really a risk/reward thing when going for these platform specific serverless systems. It's like asking if you trust a big company enough that you want to give up all of your bargaining power with them, and that you're going to put thousands or even millions of dollars where your mouth is on that.
After using lambda and serverless myself over the past year, I really struggle to see where this lock-in is. If you're already writing a stateless API as most are these days, and the cloud platforms support many language options, going between say EC2 and Lambda really isn't that much difference in code. If that changeover time is too costly for you, that's far more likely a sign of changing infrastructure too often.
In my opinion, the real lock-in is not the stateless API, but the tie-ins with other AWS services that may end up being required to accomplish what you need.
Like, if you're trying to provide a calculator API, you can definitely run that in Lambda and then easily move it somewhere else when AWS does something to upset you. But, let's say you're trying to do something a little more complicated (a common example is validating and transforming profile pictures for some sort of app), you might end up using AWS Step Functions and SQS. Your code is still portable but it relies on a bunch of managed services.
> He elaborated: "It's code that tied not just to hardware – which we've seen before – but to a data center, you can't even get the hardware yourself. And that hardware is now custom fabbed for the cloud providers with dark fiber that runs all around the world, just for them. So literally the application you write will never get the performance or responsiveness or the ability to be ported somewhere else without having the deployment footprint of Amazon."
It's almost as if you're paying to use someone else's massive investment in technology so you don't have to reinvent the wheel, enabling you to just get business done quickly and at ridiculous scales. Kind of like using Windows tech stacks, or buying a Ford F-350. Who could possibly build a business on such terrible lock-in devices?
Serious question for all on this thread: have you personally encountered a deal-breaking issue while actually implementing a significant application on "Lambda and serverless"? Whether that's lock-in, scaling issues, cost, performance, or whatever. Has there been something that's caused you to go "yeah, no, this was a bad idea; should've rolled my own infra."
I'm not asking this disingenuously; I legitimately want to know.
There is absolutely no lock in whatsoever with Lambda. The features provided by Lambda are also provided by Google Cloud Functions and Azure Functions.
The lock in comes from the ecosystem you use them in. If you make code that just returns the time, you can run that anywhere. If you make code that uses a database, your database choice provides lock in, but not Lambda.
And it's the same lock in you get using any service from AWS.
But your trade off is that you can make something that's super portable, but must cater to the lowest common denominator of features amongst all the providers you want to be compatible with.
I'd rather have lock in than be hamstrung by the velocity of the slowest provider.
I've built a number of serverless systems over the past few years on AWS and GCP. None were too extreme, but ranged from moderately complex SPA to silly chat bot. Some saw light, but real, usage.
To echo what others have already said: the lock in isn't in the compute, it's in the ecosystem, which also happens to be where all of the value is.
Like everything else in our industry, serverless is a series of trade offs. There are a number of classes of problems where it is absolutely worth trading the downsides of serverless for the agility and velocity the ecosystem can provide you. As with anything, the key is knowing when it is the right tool and how to use it properly.
Somebody should make a movement called 'serverful' that builds technologies that allow you to deploy a web service on any arbitrary server in any cloud that scales to the amount of resources the server is capable of consuming. You could just reskin Apache and call it a day.
This is why I can see why Kubeless [1], Fission [2] and OpenFAAS [3] are gaining traction.
But my take is always that it depends on the size of your company, your cloud strategy and how much serverless you are using.
* If you are small company dabbling in small serverless scripts, just use Lambda.
* If you are a medium+ company but have gone all in on AWS or GCP, and serverless is still a limited, small part of your stack, then also just use Lambda or Google Cloud Functions. But consider the options.
* If you are a multi-cloud company, or more invested in serverless, then you should definitely consider OpenFaaS etc. and not use Lambda etc. for anything but minute parts of your stack.
* If you use Kubernetes and are fairly Cloud agnostic, then use Kubeless etc so that you have full serverless support in local and staging clusters and any cloud provided clusters you expand and migrate to as well.
I use the Serverless framework and have been able to successfully redeploy functions from AWS to GCP (all Node; this was in 2018) with only a few changes to the Provider section of serverless.yml. We are adopting Kubernetes now and I'm feeling out the landscape, so I'm planning on trying the same thing with Kubeless. AFAIK it should be pretty seamless - I'm more worried about Ingress working properly than about Kubeless not being able to run my code.
FaaS has an important role to play: we often prototype things with Zapier, then redeploy them as FaaS functions when we need to scale them or process any PII. I can't imagine trying to make a full app with them with the current state of the dev/testing workflow, but for internal systems, integration, and stream processing they are pretty tough to beat.
You'll start seeing slowly increasing rates... and as people leave, the rates will increase further. Eventually, you'll start seeing Snowball no longer supported for getting your data off S3.
There’s a whole generation of developers that didn’t come up in IBM and then later MS days of vendor lock-in. Open source is default and it’s easy to only see the benefits and positive side of one vendors vision. Only now it’s harder as proprietary tech is often cloaked in “open” culture and only when you go to rip the bandaid off do you see where the real lock-in is
Apache OpenWhisk (https://openwhisk.apache.org) is open source and can run on any cloud platform. It will run your serverless apps. You'll maintain some infrastructure to run it on unless you go for a hosted service like the one IBM provides.
I'm building the infrastructure for Libr (Tumblr replacement, https://librapp.com) on a serverless platform that I won't name which will be hidden behind a reverse proxy. There won't be any vendor lock in. It's an express/Vue app and will run on any serverless platform or CDN.
If the app is censored by my first cloud provider (perhaps due to pressure on the provider from SESTA/FOSTA), I'll move to a new one. It's likely I'll build parallel copies of the production infrastructure on different cloud providers at some point for rapid migration capabilities.
It's a boilerplate Go, Lambda, API Gateway, DynamoDB, SNS, X-Ray, etc. app.
Personally I embrace the "lock-in". This architecture is faster, cheaper, and more reliable than anything I've seen in my 15 years of web development.
Most importantly it is less code. Most time is spent writing Go functions. A little time goes into configuring the infra but the patterns are simple. No time goes into building infra or a web framework.
I think Go is the antidote to true lock-in.
I have a ‘Notify’ function that uses SNS that I recently replaced with a Slack implementation. With well defined interfaces you can swap out DynamoDB for Mongo if you have to move.
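A minimal sketch of that "Notify behind an interface" idea; the interface, type names, and webhook payload are illustrative rather than the commenter's real code, and the SNS side is reduced to a stub where the AWS SDK call would go:

    package notify

    import (
        "bytes"
        "fmt"
        "net/http"
    )

    // Notifier is all the rest of the code knows about.
    type Notifier interface {
        Notify(subject, message string) error
    }

    // SlackNotifier posts to an incoming-webhook URL.
    type SlackNotifier struct{ WebhookURL string }

    func (s SlackNotifier) Notify(subject, message string) error {
        payload := fmt.Sprintf(`{"text": %q}`, subject+": "+message)
        resp, err := http.Post(s.WebhookURL, "application/json", bytes.NewBufferString(payload))
        if err != nil {
            return err
        }
        defer resp.Body.Close()
        if resp.StatusCode != http.StatusOK {
            return fmt.Errorf("slack webhook returned %s", resp.Status)
        }
        return nil
    }

    // SNSNotifier would publish to a topic via the AWS SDK; swapping it for
    // SlackNotifier only touches the place where the Notifier is constructed.
    type SNSNotifier struct{ TopicARN string }

    func (s SNSNotifier) Notify(subject, message string) error {
        // sns.Publish call elided; only the interface matters for portability.
        return nil
    }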
It is also easy to turn a function into an HTTP handler. There is a smooth path from function to container to server if the cost or performance of Lambda doesn't work out. It's hard or impossible to go the other way.
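And a sketch of that "smooth path from function to server" point: the same plain Go function exposed either as a Lambda or as a normal net/http handler. The aws-lambda-go package is real; the greet function, event shape, route, port, and RUN_AS_SERVER switch are made up for illustration:

    package main

    import (
        "context"
        "encoding/json"
        "net/http"
        "os"

        "github.com/aws/aws-lambda-go/lambda"
    )

    // greet is plain business logic with no AWS or HTTP types in its signature.
    func greet(name string) string { return "hello " + name }

    // greetEvent is the JSON event shape when running as a Lambda.
    type greetEvent struct {
        Name string `json:"name"`
    }

    func main() {
        if os.Getenv("RUN_AS_SERVER") != "" {
            // The same logic behind a plain HTTP server, e.g. in a container or on a VM.
            http.HandleFunc("/greet", func(w http.ResponseWriter, r *http.Request) {
                json.NewEncoder(w).Encode(map[string]string{"msg": greet(r.URL.Query().Get("name"))})
            })
            http.ListenAndServe(":8080", nil)
            return
        }
        // Default: run as a Lambda; the handler is just a thin wrapper over greet.
        lambda.Start(func(ctx context.Context, ev greetEvent) (string, error) {
            return greet(ev.Name), nil
        })
    }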
"Serverless" is the most ridiculous example of bullshit marketing in recent history. It truly took me a good twenty minutes to understand what it is supposed to represent because I kept thinking, "There HAS to be more to this than vendor-supplied CGI."
People make many arguments for designing WITHOUT portability (and cwyers even calls portability "premature optimization"). What they're implicitly stating is that they can't code to abstractions, aren't effective at coding without using edge cases, and require package specific optimizations to barely get acceptable levels of performance. If the edge cases and package specific optimizations weren't considered necessary, there'd be no real case for making something non-portable.
The fact that people can even rationalize non-portability boggles the mind. It just seems like a poor attempt at job security or something equally silly.
I am thinking of my own SaaS offering, but combined with open source. Basically, you will be able to publish a certain kind of application with a little bit of JavaScript coding, plus several lambdas written in Go. There will be an entire miniature server cluster running as goroutines, which you will be able to download off of GitHub and run locally. You will also be able to take the same server cluster and run it on a service like AWS. (On my roadmap, I'm going to remove all dependencies outside of the project, so you will pretty much be able to fill out the config file, run the executable, and have it scale according to the number of processors.)
However, you will also be able to sign up for an account on my website, then use a command line facility to "inject" your lambdas into my system, which takes care of the autoscaling, database backup, and staging for you.
When discussing lambda/serverless/<whatever flavor of pay per request> setup, people don’t often seem to stop and think about the usage/access patterns, the associated costs and performance.
I’ve seen such setups being recommended for APIs that have predictable and fairly constant load, for which you are a lot better off having an actual running set of processes that can be reused. For Google that could be AppEngine, for AWS ElasticBeanstalk. It’s a question of the right tool for the job.
One tech that I haven't played with that's really interesting is Knative, where you can run underlying infra with predictable cost/performance but allocate it like a lambda, per request. Performance of the requests themselves may still be worse, though, compared to a more traditional setup.
I made the OSS https://github.com/rynop/aws-blueprint partially to address this problem: easily migrate from Lambda to ECS (removing Lambda lock-in). It abstracts all the difficult and time-consuming-to-learn AWS idiosyncrasies into a best-practice, production-ready harness.
You could argue that if you have lots of lambdas, this is non-trivial. I would argue that tons of lambdas is a poor architecture: what you gain in isolation you lose in management, complexity, nimbleness, and attack surface.
You could also argue that my harness locks you into AWS, as it is AWS-specific. However, I'd argue that it is the other AWS services (ex: Dynamo) or your code/architecture that lock you in.
The argument makes no sense: because it is deployed in AWS, you can't get performance without using AWS services, therefore it is lock-in.
This seems like no more of a lockin than, say, choosing a DNS server that's giving me lower latencies.
My AWS Lambdas talk to a Postgres database, S3 (which has many open source API implementations), and SQS, which, yeah, I'm "locked into".
The work to move to another service would be absolutely trivial. All of the AWS stuff like Postgres, S3, and SQS is totally abstracted from the business logic. I could rip it out at any time.
I just don't get what anyone means when they say lambdas lock you in; I don't feel locked in at all. I could move to GCP in, I don't know, two weeks probably.
I would guess ecosystem as a whole is a bigger deal.
Porting lambdas alone probably isn't a huge deal. But then there are the CloudFormation templates, SQS configs, Dynamo tables, RBAC configs, S3 access settings, CloudWatch logging and alerts, etc. It all adds up.
Is using the features of a tool 'lock-in'? You could do the things the tool does yourself, but you choose to leverage the tool. That's why you use it. You're aware of this. AWS Lambda functions can be just pure Python or other code. Anything logged there can be a metric in CloudWatch. Users are blissfully unaware of how much it's doing for them, from security to monitoring. And to be honest, companies with enough money would prefer these solutions to something you can knock up yourself. I feel so locked in by my million free requests a month to a service that won't choke and that I didn't even have to set up.
Yeah, absolutely, if your engineers decide to adopt serverless due to hype or just to improve their own CVs, you are going to spend a lot more on infrastructure than you would by provisioning VMs or running containers. By being selective about which workloads are eligible to become FaaS and doing a little optimization, however, you can cut costs and avoid overprovisioning, while still getting automatic and efficient scalability.
I believe that, in most cases, it is better to control the exit costs of your architectural decisions than to avoid lock-in at all costs.
This is why the Serverless Framework (which aims to make it possible to deploy to any cloud provider) is so important. Some simple, repetitive tasks are extremely well suited to lambdas/functions/whatever the name, and some tasks are suited to big machines with gazillions of teraflops and terabytes of RAM. The job of the engineer is to know what their software needs and what the optimal path to that is.
Business is ever-changing. This is just another step.
I wrote a flaky test management system called https://www.flaptastic.com/ on AWS Lambda... my first AWS bill was $2.50. I love it. I also used serverless.com's wrapper to deploy it for free. If this gets expensive and I want to raise money to have expensive DevOps engineers set up Kubernetes for months, then fine... but I really don't need that, and I can easily port this to any other platform later if I want to.
I use lambda to perform simple, stateless units of work like autoscaling event post-hooks, chat bots, routine scheduled tasks, etc. They're mostly cloud-agnostic and I could move them to a server if I had to.
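A handler for that kind of job really is a thin shim around an ordinary function. A hedged sketch, with illustrative names and a deliberately empty body where the real work would go:

    // Sketch of a stateless scheduled task as a Lambda-style handler.
    // The business function is plain TypeScript and would run unchanged
    // as a cron job on a normal server.
    async function pruneExpiredSessions(now: Date): Promise<number> {
      // ...talk to your datastore here; return how many rows were removed.
      return 0;
    }

    // Thin provider-facing wrapper: all it knows about "the cloud" is the
    // signature it is invoked with.
    export async function handler(_event: unknown): Promise<void> {
      const removed = await pruneExpiredSessions(new Date());
      console.log(`pruned ${removed} expired sessions`);
    }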
I don't think anyone can really build a full scale application using serverless. It's just not performant or predictable enough. I've seen people try, and it always ended in frustration. A properly configured docker scheduler is better for this type of work anyway.
You can definitely build a fully functional web app using just serverless. For an example, take a look at https://acloud.guru . Where I work, we almost exclusively use serverless, and I have found it to be incredibly reliable, and way more hands-off than a docker deployment.
Zeit is a great alternative here in my opinion. You don't need to make any changes to your code; it will just run on Zeit. I've used it to host a few microservices in the past and the experience was much more pleasant than Lambda (I needed to do PDF generation through a browser and had to use Zappa and modified PhantomJS binaries).
Clear abstraction, anyone? We make it mandatory to separate the managed-service code into distinct interfaces and implementations. We have the same code running on both Azure and AWS, each leveraging its respective managed services. Just implement the interfaces for your needs. It's not difficult.
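Roughly, that discipline looks like the following (the names are made up; the SDK usage follows the public AWS SDK v3 and @azure/storage-blob packages): one interface, one implementation per cloud, and application code that only ever sees the interface.

    // Sketch: one interface, two provider-specific implementations.
    import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
    import { BlobServiceClient } from "@azure/storage-blob";

    interface BlobStore {
      put(key: string, data: Buffer): Promise<void>;
    }

    class S3BlobStore implements BlobStore {
      constructor(private client: S3Client, private bucket: string) {}
      async put(key: string, data: Buffer): Promise<void> {
        await this.client.send(
          new PutObjectCommand({ Bucket: this.bucket, Key: key, Body: data })
        );
      }
    }

    class AzureBlobStore implements BlobStore {
      constructor(private service: BlobServiceClient, private container: string) {}
      async put(key: string, data: Buffer): Promise<void> {
        const blob = this.service
          .getContainerClient(this.container)
          .getBlockBlobClient(key);
        await blob.upload(data, data.length);
      }
    }

    // Application code never mentions a provider.
    async function saveReport(store: BlobStore, report: string): Promise<void> {
      await store.put(`reports/${Date.now()}.txt`, Buffer.from(report));
    }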
Worth mentioning that CoreOS was bought out by Red Hat, which makes a lot of sense given where they were going with OpenShift. Red Hat, in turn, was bought (is being bought) by IBM.
In the end, the tooling in use is crossing a lot of lines and becoming very common in a lot of ways.
Is it lock-in? It’s just your code being packaged in a container and run on an API invocation. I thought one could move between any of the big three’s serverless function offerings pretty easily?
On Azure it's not a Docker container, or any other kind of "standard" container.
That said, you can get a Docker container with the Azure Functions runtime, so you could in theory port your functions elsewhere, but I don't think you'd get the monitoring benefits you'd have keeping it on Azure.
Yes, a lock-in that can be avoided with a 20-to-30-line piece of Javascript that just handles the messages and passes them to your cloud-agnostic business code.
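Something in this shape, written here in TypeScript with made-up names, is roughly what that thin layer amounts to: unwrap the provider-specific event and hand plain data to business code that knows nothing about the cloud.

    // Sketch of the thin provider-facing layer described above.
    // The event shape matches what AWS delivers for SQS-triggered Lambdas;
    // handleOrder is a stand-in for your cloud-agnostic business code.
    import type { SQSEvent } from "aws-lambda";

    async function handleOrder(order: { id: string; total: number }): Promise<void> {
      // ...pure business logic, no cloud SDKs in sight...
    }

    export async function handler(event: SQSEvent): Promise<void> {
      for (const record of event.Records) {
        // Unwrap the provider envelope and pass plain data along.
        await handleOrder(JSON.parse(record.body));
      }
    }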
A lot of embittered Ops engineers shaking their fists in the comment thread of that article. "Real engineers write assembly, on punchcards, blindfolded" etc.
How do you do that? With all the stories in play, and all the comments being added, how do you notice within four minutes that one of them says to add the date to a title?
I wonder if a regexp for a short comment containing a single date would work. I expect comments matching that regex are streaming by, Matrix style, on one of dang's many monitors¹, as he sips his morning cold-pressed flat grey². In fact, with a bit of practice, there is probably a regexp that catches all of them.
¹ I haven't seen dang's workstation, I'm just imagining.
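For what it's worth, a purely speculative sketch of such a filter (this is not how HN moderation actually works; it's just the regexp idea above made concrete):

    // Speculative sketch: flag short comments that contain exactly one
    // year-like token and mention the title, e.g. "Should have (2014) in the title."
    const yearToken = /\b(19|20)\d{2}\b/g;

    function looksLikeDateCorrection(comment: string): boolean {
      const isShort = comment.length < 120;
      const years = comment.match(yearToken) ?? [];
      return isShort && years.length === 1 && /\btitle\b/i.test(comment);
    }

    console.log(looksLikeDateCorrection("Title should have (2014) in it."));       // true
    console.log(looksLikeDateCorrection("I deployed this back in 2014 and 2016.")); // false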
Why do we park in a driveway and drive on a parkway? The key is to understand what people mean by the words they say. Yeah, there can be things that get under our skin, but at the end of the day, just roll with it. (I say this as much to remind myself as to you.) Looking for absolute consistency in language is a recipe for needless, self-inflicted frustration.
"The key is to understand what people mean by the words they say."
Yes, it is -- and I'm not entirely sure I understand what "serverless" means when used this way, unless it's just a term that marketing people invented because their old term "the cloud" has lost its sizzle.
> Why is it called "serverless" when it is not, in fact, serverless?
It is serverless from the perspective of an IT department that, by adopting it, no longer has to manage servers as a distinct resource.
It's like a product sold as having “worry-free interoperation”. There's still worry in the interoperation, you are just paying someone else to do the worrying.
Likewise, a serverless product still has servers underneath, you are just paying someone else to abstract them so that they aren't a concern for you.
But someone has to manage the things that manage the lambdas, right? You're just moving IT server operations to devops... My understanding of Lambda is limited to "you write functions and you pay per function compute", so please correct me if I'm wrong.
No, the cloud is a term for dynamically provisionable resources, some of which (IaaS, for instance) still require traditional server management.
It's true that the invention of the term, originally for Amazon's Functions-as-a-Service, wasn't particularly distinguishing from lots of existing cloud SaaS categories (classical PaaS, DBaaS, etc.) which are just as free of server management as FaaS services, but the term has subsequently been broadened in use (AFAICT, Google Cloud Platform was the main driver here) so that it now makes more sense than Amazon's original use did.
Hmm. Your reply has left me even more confused about the nomenclature. Oh well, I guess it doesn't matter if I actually understand what these names mean or not.
Why is wireless networking called wireless when there are wires connecting the wireless router to the modem? Or the modem to the coax? Because it's wireless at the last hop, and that is what is important to the user.
Because with wireless networking, there is a major portion of the link that actually is wireless (and with some setups, it's entirely wireless). With "serverless" computing, there is no point at which a server is not being used.