
"I'm scared of vendor lock-in, so I'm going to build something that's completely provider agnostic" means you're buying optionality, and paying for it with feature velocity.

There are business reasons to go multi-cloud for a few workloads, but understand that you're going to lose time to market as a result. My best practice advice is to pick a vendor (I don't care which one) and go all-in.

And you'll forgive my skepticism around "go multi-cloud!" coming from a vendor who'll have precious little to sell me if I don't.



    Pick a vendor and go all in.
That sounds like the perspective of someone who's picked open source vendors most of the time, or has been spoiled by the ease of migrating Java, Node, or Go projects to other systems and architectures. Having worked at large enterprises and banks who went all in with, say, IBM, I have seen just how expensive true vendor lock-in can get.

Don't expect a vendor to always stay competitively priced, especially once they realize a) their old model is failing, and b) everybody on their old model is quite stuck.


I am incredulous that people wouldn't be worried about vendor lock-in when the valley already has a 900 lb gorilla (Oracle) serving as a cautionary tale.

Ask anybody about Oracle features and they'll tell you for days how great the feature set and velocity are. But then ask them how badly their company has been absolutely rinsed over time, and how the costs increase annually.

Oracle succeeds by being only slightly cheaper than migrating your entire codebase. To offset this practice, keep your transition costs low.

--

Personal note: I'm currently experiencing this with Microsoft; all cloud providers charge an exorbitant premium for running Windows on their VMs, but obviously Azure is priced very well (in an attempt to entice you to their platform). Our software has been built over a long period of time by people who were forced to run Windows at work -- so they built it on Windows.

Now we have a 30% operational overhead charged by Microsoft through our cloud provider. But hey.. at least our cloud provider honours fsync().


I think perhaps not all vendor lock-in is created equal. I too shudder at the thought of walking into another Oracle like trap, but it's also an error in cognition to make the assumption that all vendors will lock you in to the same degree and in the same way.

I guess those of us cautioning ourselves and others are aware of the pitfalls, but the people advocating going all in have valid points too.

There is, let's say, a matrix of different scenarios:

  You can go all in on a vendor and get Oracled.
  You can go all in on an abstraction that lets you be vendor agnostic and lose some velocity while gaining flexibility.
  You can go for a vendor and perhaps it turns out that no terrible outcome results because of that. 
  You can go all in on vendor agnostic and have that be the death of the company.
  You can go all in on vendor agnostic and have that be the reason the company was able to dodge death.
Nobody can read the future and even "best practices" have a possibility of resulting in the worst outcomes. The only thing for it is to do your homework, decide what risks are acceptable to you, make your decision, take responsibility for it.


Vendors have 2 core requirements to continue operating: get new customers and keep the existing ones. Getting new customers requires constant innovation, marketing spend, providing value, etc. Keeping existing customers only requires making the pain of leaving greater than the pain of staying.


Sure. And even from that you still can't infer what outcome will materialize. If you made the technically correct decision and your business went under because of it, that is still gonna hurt no matter which way you look at it. Hence the advice: do your homework, figure out which risks are acceptable to you, make your choice, and take the responsibility. There is no magic bullet for picking the right option. Only picking the option you can live with, because that's what you're going to have to do regardless of the outcome.

You might know all the theory on aviation and be a really experienced pilot and one day a sudden wind shear might still fuck you.


> all cloud providers have an exorbitant premium when it comes to running Windows on their VMs

Speaking from first-hand running-a-cloud-platform experience, it's because running Windows on a cloud platform is not easy, and it comes with licensing costs that have to be paid to Microsoft for each running instance (plus a bunch of infrastructure to support it). It's not even a simple per-instance-per-time-interval cost; there's all sorts of stuff wrapped up in it that impacts the effective cost. It requires a bunch of administrative work and specific coding to try to optimise the costs to the cloud provider.

In addition, where Linux will happily transfer between different hardware configurations, you'll often have to have separate Windows images for each hardware configuration, which means even more staffing overhead for both producing and supporting them. So e.g. SR-IOV images, PV images, bare metal images (one for each flavour of bare metal hardware), etc. While a bunch of this work can be fully automated, it's still not a trivial task, and producing that first image for a new hardware configuration can take a whole bunch of work, even where you'd think it would be trivial.


> Oracle succeed by being only slightly cheaper than migrating your entire codebase

Amdocs, ESRI, Microsoft too... Their commercial strategy is a finely tuned parasitic equilibrium.

Sales training that emphasizes knowing one's customer is all about that: if the salesperson understands the exit costs better than the customer does, the customer is going to be milked well and truly!

I'm thinking that the sales side may benefit from hiring people with experience in the customer's business to game out the technical options in an actual study... I guess they do that already - I'm not experienced on the sales side.


I know a whole bunch of people who complain about Oracle lock-in but are fine with moving everything to DynamoDB/Lambda.


>keep your transition costs low

For me this is key.


At least if you went with IBM and followed the old adage “no one ever got fired for buying IBM” in the 80s, you can still buy new supported hardware to run your old code on. If you went with their competitors, not so much.


You certainly can, but the prices haven’t decreased (and probably increased) from their 80s values, even though the thing is now probably a dinosaur.


Which of their competitors can you still buy new faster hardware from? Does anyone sell Stratus VOS or DEC VAX VMS compatible systems?

Yeah I know I am showing my age....


The people who designed their systems such that they could be easily transitioned off of IBM did so long ago. Those systems now run far more cheaply and have access to more resources.

Vendor lock-in was just as much of a problem then as now.


Yes because people in the 80s writing COBOL were writing AbstractFactorySingletonBeans to create a facade to make their transitions to newer platforms easier....


> Does anyone sell... DEC VAX VMS compatible systems?

https://www.avtware.com/cloud


Someone should check if it can run OpenGenera.


Well, our next iSeries is a whole lot cheaper than our current iSeries and quite a bit more powerful. The Power 9 is not something I would call a dinosaur.


IBM's prices haven't changed at all since the 80s. They're still four times as expensive as they should be.


Sounds like you've described IBM being priced exactly right, given their tenacious longevity.


> That sounds like the perspective of someone who's picked open source vendors most of the time

More than that. They picked open source vendors that didn't (a) vanish in a puff of smoke or (b) get bought out by a company that slowly killed the open source version, and (c) whose products they were capable of supporting without the vendor (or of faking support for).


Vendor lock-in can be expensive, but spreading yourself across vendors can also be expensive. There are lots of operations that are free if you stay in the walled garden, but the second you start pulling gigs & gigs of data from AWS S3 into GCP (or whatever), that can get pricey real fast.

In general I agree with you. In practice, the more practical approach may be to focus on making more money and not fussing too much over vendor costs, whichever strategy you choose. It's easy to pay a big vendor bill when there's a bigger pile of cash from sales.


IBM is awful on the way out. So is any firm bought by Oracle or CA.


That's terrible best-practice advice.

My best-practice advice is to do the math: what is the margin on infrastructure vs. the acquisition and management costs of the engineers necessary to operate that infrastructure?

Serverless doesn't scale very well on the axis of cost. At some point that's going to become an issue. If one has gone "all-in" on vendor lock-in, then that vendor is going to spend as much time as possible enjoying those margins while your attempt to re-tool to something else is underway.

Best practice, generally speaking, is to engineer for extensibility at some point, fairly early on.


Self-hosting doesn't scale. There is very little reason for a good sysadmin to work for International Widgets Conglomerated when they can work for a cloud provider instead, building larger-scale solutions more efficiently for higher pay. I'd rather buy an off-the-shelf service used by the rest of the F500 than roll my own. Successful companies outsource their non-core competencies.


Self-hosting doesn't scale? Have you done the math? If so, I'd be curious to see it.

There are a number of reasons that self-hosting doesn't make sense which have very little to do with scale and more to do with the lack of scale. For very little investment, one can get highly available compute, storage, networking, load-balancing, etc. from any of the major cloud providers. Want to make that geographically distributed? In your average cloud provider that's easy-peasy for little added cost.

Last time I had to ballpark such a thing - which is to say, the minimum possible deployment I'd be willing to support for a broad set of services - I settled on three 10kW cabinets in any given retail colo space, with twenty-five servers per cabinet each consuming an average of 300W. Those servers were around $10k apiece and were hypervisor-class machines, i.e. lots of cores and memory for the time. Some switches, a couple of routers, and 4xGigE transit links.

Of course I'd want three sets of that spread across regions of interest. If I were US focused: east coast, west coast, and Chicago or thereabouts. The servers and network gear come to around $1.5m CapEx per set. OpEx is $200/kW for the power and space and around $1/Mbps for the transit. Note that outside the US, the price per kW can be much, much higher.

So, $6k MRC for the power and $4k MRC for the intertubes. That's $10k OpEx on top of ~$42k/month in depreciation ($1.5m/36) on your CapEx; multiplied by three, that gives you $156k/month.

Let's assume my middle-of-the-road hypervisor-class machine has all the memory it needs and two 16-core processors with hyperthreading, so 64 vCPU each, or 14400 vCPU across your three data centers - all for only around $2m/yr, with nearly $5m of that up front.

That's a boat load of risk no startup or small enterprise is going to take on. You still have to staff for that and good luck finding folks that aren't asshats that can actually build it successfully. They're few and far between. That said, it does scale. It scales like hell, especially if you can manage to utilize that infrastructure effectively. I wager that if you were to look at what it would cost to hold down that much CPU and associated memory continuously in AWS then you'd be paying roughly 6x as much.


Lessee...

14400 vCPU of R4 on a 3yr reservation comes to $300k MRC. I'm guessing you'd run Ceph or Rook on your bare metal with ~8 1TB SSDs per server, so 75 servers * 8 SSDs / 3 (for replication) is 200TB of decent-performance storage per data center, or 600TB usable across all three; the equivalent in EC2 GP2 at $0.10/GB comes to roughly $60k MRC.

Less any network charges that's $360k vs. $156k self-hosted. Guess I'm wrong. It's only twice as much.
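For anyone who wants to check the arithmetic, the figures from these two comments can be reproduced in a few lines. Every number below is the thread's own ballpark estimate (server price, power rates, R4 reserved pricing, GP2 at $0.10/GB), not a current price list:

```python
# Back-of-the-envelope reproduction of the self-hosted vs. AWS math above.

servers_per_site = 3 * 25          # three 10 kW cabinets, 25 servers each
sites = 3
capex_per_site = 1_500_000         # servers + network gear per set, USD
depreciation_months = 36

# Self-hosted monthly run rate: depreciation plus power/space and transit
depreciation_mrc = capex_per_site / depreciation_months   # ~$41.7k/site
opex_mrc = 6_000 + 4_000                                  # power + transit
self_hosted_mrc = (depreciation_mrc + opex_mrc) * sites   # ~$155k; the
                                                          # thread rounds
                                                          # up to $156k

# Capacity: 2 sockets x 16 cores x 2 threads = 64 vCPU per server
total_vcpus = servers_per_site * 64 * sites               # 14400

# The AWS comparison from the follow-up comment
aws_compute_mrc = 300_000                  # 14400 vCPU of R4, 3yr reserved
usable_tb = servers_per_site * 8 // 3 * sites  # 8x 1TB SSD, 3x replication
aws_storage_mrc = usable_tb * 1_000 * 0.10     # GP2 at $0.10/GB-month
aws_mrc = aws_compute_mrc + aws_storage_mrc    # 360000.0

print(total_vcpus, round(self_hosted_mrc), round(aws_mrc))
print(f"AWS / self-hosted: {aws_mrc / self_hosted_mrc:.1f}x")
```

Running it gives 14400 vCPU, roughly $155k/month self-hosted versus $360k/month in AWS, about 2.3x, which matches the "only twice as much" conclusion above.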


> Self hosting doesn't scale

That's just not true. There are plenty of companies that self-host highly scaled infrastructure, Twitter being just one of them. They've only recently started thinking about using the cloud, opted for Google, and even then only to run some of their analytics infrastructure.

> There is very little reason for a good sysadmin to work for International Widgets Conglomerated when they can work for a cloud provider instead, building larger scale solutions more efficiently for higher pay

That's not true either (speaking from personal experience working for companies of all sizes, from a couple dozen employees up to and including AWS and Oracle on their cloud platforms). For one thing, sysadmin is far too broad a job role to make such sweeping statements.

A whole bunch of what I do as a systems engineer for cloud platforms is a whole world of difference from general sysadmin work, even ignoring that sysadmin is a very broad definition that covers everything from running Exchange servers on-prem to building private clouds and beyond.

These days I'm not sure I've even got the particular skills to easily drop back in to typical sysadmin work. Cloud platform syseng work requires a much more precise set of skills and competencies.

All that aside, I can point you in the direction of plenty of sysadmins who wouldn't work for major cloud providers for all the money in the world, either for moral or ethical reasons; or they're just not interested in that kind of problem; or even just that they don't want to deal with the frequently toxic burn-out environments that you hear about there.

> I'd rather buy an off the shelf service used by the rest of the F500 than roll my own.

Nowhere near as much of the F500 workload is in the cloud as you apparently believe. It's a big market that isn't well tapped. Amazon and Azure have been picking up some of the work, but a lot of the F500 don't like the pay-as-you-go financial model; it plays sweet merry havoc with budgeting, for starters. That's one reason Oracle introduced a pay model with Oracle Cloud Infrastructure that allows CIOs to set fixed IT expenditure budgets. Many F500 companies have only really started talking about moving into the cloud in the last few years (when OCI launched at Oracle OpenWorld, a startling number of CIOs from large and well-known companies came up to the sales team saying "So.. what is this cloud thing, anyway?").

> Successful companies outsource their non core competencies

Yes.. and no. Successful companies outsource their non-core competencies where there is no value having them on-site. That's very different.


Honestly, I think Twitter is a counterpoint to your argument.

Twitter was founded in 2006, the same year AWS was launched, so in the early days Twitter didn't have a choice - the cloud wasn't yet a viable option to run a company.

And, if you remember in the early days, Twitter's scalability was absolutely atrocious - the "Fail Whale" was an emblem of Twitter's engineering headaches. Of course, through lots of blood, sweat and tears (and millions/billions of dollars) Twitter has been able to build a robust infrastructure, but I think a new company, or a company who wasn't absolutely expert-level when dealing with heavy load, would be crazy to try to roll their own at this point unless they wanted to be a cloud provider themselves.


> And, if you remember in the early days, Twitter's scalability was absolutely atrocious - the "Fail Whale" was an emblem of Twitter's engineering headaches.

That's because Twitter was:

1) A monolith

2) Written in Ruby

They started splitting the monolith up into specialised, made-for-purpose services using Scala atop the JVM, and scaling ceased to be a big issue. The problems they ran into couldn't be solved by horizontal scaling. There isn't any service AWS offers, even today, that would have helped with those engineering challenges.


It had more to do with the consistency model of relational databases, though Ruby definitely didn't help.


You honestly think that the main performance problem with a 2yo app is that it's a monolith?


Yes, based on their own detailed analysis and extensive technical blogs on the subject. Ruby was doing a lot of processing of the tweets' contents, etc., and at the time Ruby was even worse for performance than it is today. (Ruby may be many things, but until the more recent JIT work, fast was not one of them.)


> That's just not true. There are plenty of companies that self-host highly scaling infrastructure. Twitter being just one of those companies. They've only recently started thinking about using the cloud, opted for Google, and that's only to run some of their analytics infrastructure.

Twitter is both unusually big and unusually unprofitable. You're unlikely to be as big as them, and even if you were I wouldn't assume they've made the best decisions.

> All that aside, I can point you in the direction of plenty of sysadmins who wouldn't work for major cloud providers for all the money in the world, either for moral or ethical reasons; or they're just not interested in that kind of problem; or even just that they don't want to deal with the frequently toxic burn-out environments that you hear about there.

It's best to work somewhere you're appreciated (both financially and for job-satisfaction reasons), and it's harder to be appreciated in an organization where you're a cost center than one where you're part of the primary business. There are good and bad companies in every field, and good and bad departments in every big company, but the odds are more in your favour when you go into a company that does what you do.


It doesn't have to be that way. For a client I recently set up GitLab Auto DevOps on Google Kubernetes Engine. Feature-velocity-wise it is nearly as painless as Heroku (which to me is the pinnacle of ease of deployment), but because every layer of the stack is open source we could switch providers with a flick of the wrist.

Of course, we won't switch providers, because they're offering great value right now.

I feel this vendor lock-in business is a phase that will pass. We were vendor locked when we paid for Unix or Windows servers, then we got Linux and BSD. Then we got vendor locked by platform providers like AWS and such, and now that grip is loosened by open source infrastructure stacks like Kubernetes.


> We were vendor locked when we paid for Unix or Windows servers, then we got Linux and BSD.

NT was released in '93; FreeBSD 2.0 (free of AT&T code) was released in '94. GNU/Linux also saw professional server use in the mid/late '90s. People still lock themselves in though.


If a company is going from zero to something, then you are absolutely right. In that case, having to deal with vendor lock-in later means you succeeded!

If an established company is moving to the cloud, the equation is not as simple. The established company presumably has the money and time to make their systems vendor agnostic. Is the vendor lock-in risk worth spending more now to avoid? How large is the risk? What are the benefits (using all of the AWS services is pretty nice)?


I have to ask, is this

>"benefits (using all of the AWS services is pretty nice)"

your personal experience? Or are you simply assuming that interop / efficiencies obtain when going all-in on AWS? I ask, because I've had multiple client conversations in which these presumed benefits fail to materialize to a degree that offsets the concern about lock-in.


I can't claim to have used all of the AWS services, but whenever I need something done I check if it's offered in AWS first. SQS, SNS, ECS, ALB/ELBs, SFNs, Lambdas, media encoding pipelines, Aurora RDS, etc. have all made my job easier.


If your time horizon is short (months to a couple of years) going all-in with a vendor can work quite well. Over longer time horizons (several years or more), it's often not so great. Forgetting about the costs involved (i.e. they know they've got you) and the fact that if that one vendor ever becomes truly dominant that improvements will slow to a crawl (i.e. they know they've got everyone), there is a very real risk of whatever technologies you are depending on disappearing as their business needs will always outweigh yours. Feature velocity tends to evaporate in the face of a forced overhaul or even rewrite.


Pricing changes are a real issue. Google Maps spiked in cost, and those who were using Leaflet JS just had to change a few settings to switch to another provider. Those who built on Google's Maps JS are locked in.


Yeah... We had a fixed yearly price for a Google Maps API Premium account - when they switched us to PAYG, our costs increased 8x. We spent 2 weeks of unplanned urgent work switching to Mapbox, at 3x the original cost...


The problem with that is that a single person - some product manager somewhere - sits on a light switch for your company or project's success or failure. They could unintentionally end you with a price change.


AWS supports everything...forever.

For instance, putting your EC2 instances in a VPC has been the preferred way of operating since 2009. But, if you have an account old enough, you can still create an EC2 instance outside of a VPC.

You can still use the same SQS API from 2006 as far as I know.


> AWS supports everything...forever.

Maybe "AWS and cloud infrastructure" will be to modern companies what COBOL and mainframes were to the big companies of 50 years ago.

No doubt somebody will be happy to charge you to support it for a long time...


They even still offer "reduced redundancy storage" even though it's been made obsolete (and is more expensive than the regular S3 storage).


And we see this here regularly, as with the Google Maps API price hike.


No, you just see that with Google... a company not exactly known for its customer relations.


And Oracle, and IBM, and every company that doesn't pour every dollar of profit into growth marketing to continue redoubling down with investor money.


AWS drives most of Amazon’s profits these days. It isn’t running at a loss.



In the immortal words of @vgill, "You own your availability."


At first this seems reasonable from a 'technical debt' perspective. Building in vendor-agnosticism takes extra resources (true), that you could spend on getting features to market (true), you can always spend those extra resources later if you succeed and need them... sort of true, not really. Because the longer you go, the _more expensive_ it gets to switch, _and_ the more disruptive to your ongoing operations, eventually it's barely feasible.

Still, I don't know what the solution is. Increasingly it seems to me that building high-quality software is literally not feasible, we can't afford it, all we can afford is crap.


GitLab.com did a switch from Azure to GCP, so it is realistic.


We had to migrate ~100 instances from Azure to GCP. It took us one month. At the end of the month we changed the dns entries and flipped the switch.

It's true though that we never wanted to work with managed services, so there was literally no need to redo any of the tooling.



There's a middle ground here. You can decide on a case by case basis how much lock-in you are willing to tolerate for a particular aspect of your system. You can also strategically design your system to minimize lock-in while still leveraging provider specific capabilities or "punting" on versatility when you want to.

In other words you can decide to not bother with worrying about lock-in when it costs too much.

This will make your code base easier to port to multi-cloud in the future if you should ever want to.
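One low-cost way to keep that option open is to route the handful of provider-touching calls through a thin seam of your own. The sketch below (not any particular provider's API) shows the idea: the local filesystem backend is real and runnable, and the S3-backed sibling it alludes to is hypothetical:

```python
import tempfile
from pathlib import Path


class BlobStore:
    """The only storage interface the rest of the application sees."""

    def put(self, key: str, data: bytes) -> None:
        raise NotImplementedError

    def get(self, key: str) -> bytes:
        raise NotImplementedError


class LocalBlobStore(BlobStore):
    """Filesystem-backed implementation, useful for dev and on-prem."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, key: str, data: bytes) -> None:
        (self.root / key).write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


# A hypothetical S3BlobStore would wrap boto3 behind these same two
# methods; provider-specific code stays below the seam, and everything
# above it stays portable.
store: BlobStore = LocalBlobStore(tempfile.mkdtemp())
store.put("greeting.txt", b"hello")
print(store.get("greeting.txt"))  # b'hello'
```

The point is that the seam costs a few dozen lines, not a multi-cloud framework; you only write the second implementation when a real migration forces the question.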


Obviously, there's a huge cost associated with the learning curve, but this is part of the reason that Kubernetes is so attractive. It abstracts away the underlying infrastructure, which is nice.

At any kind of scale, though, one is loosely coupled to the cloud provider in any case for things like persistent disk, identity management, etc.


Or the old “I use the Repository Pattern to abstract our database so sure we can move our six figure Oracle installation to Postgres”.

And then watch the CTO throw you out of his office.....


He didn’t want to hear it? Why anyone would still build on top of Oracle eludes me.

Atlassian is bad, but Oracle is on a whole different level.


This, and if you really really don't want vendor lock-in, instead of inventing your own infra APIs, find the API the cloud vendor you chose exposes, replicate it, make sure it works, and then still use the vendor with the comfort of knowing you can spin your own infra if the cloud provider no longer meets your needs.

Or you don't bother replicating the API, even if you don't want vendor lock-in, because you realize that if a cloud provider evaporates, there will be a lot of other people in the same boat as you and surely there will be open-source + off-the-shelf solutions that pop-up immediately.


Agree that time to market shouldn't be impeded by any unnecessary engineering.

But "pick a vendor and go all-in" can work for Netflix-sized companies or ones with static assets in the cloud. All cloud providers have their own rough edges, and if you get stuck on one you might lose your business edge. Case in point - I'm not going to name the provider since we are partners with them - we found a provisioning bug: a VM based on a custom Windows image took 15 minutes to get provisioned, and exporting a custom image across regions has rigid size restrictions. The provider acknowledged the bugs, but they are not going to address them this quarter; if we were Netflix-big, maybe they would have addressed them sooner.

We have automated our cluster deployment so we can get our clusters up and running on most major cloud providers. We avoid vendor lock-in as much as possible, since our business edge cannot be left hostage to the big cloud providers, who only heed your cry if you are big and care not at all about your business impediments. When the cloud resources you expect to use aren't static, you need flexibility, so the above recommendation doesn't suit everyone.


I'll echo this sentiment, and while it may sound like a philosophical position it really is pragmatic experience: it is nearly impossible to realize a gain from proactively implementing a multi-vendor abstraction. I've found this to hold for databases, for third-party services like payments and email, and very much so for cloud services. I instead recommend using services and APIs as they were designed, and as directly (i.e. with as little abstraction) as possible. Only when and if you decide to add or switch to another vendor is it time to either add an abstraction layer or port your code. I've never seen a case where the cost to implement two direct integrations was significantly more than the cost to implement an abstraction, and I've seen many cases where an abstraction was implemented but only one vendor was ever used.

I'll note that I have no objection to abstractions per se, especially in cases where a community solution exists, e.g. Python's sqlalchemy is good enough that I'd seldom recommend directly using a database driver, Node's nodemailer is in many cases easier to use than lower level clients, etc.
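Python's DB-API 2.0 (PEP 249) is a good example of this kind of community abstraction: every conforming driver exposes the same connect/cursor/execute surface, so code written against it moves between databases with little more than a new connect call and dialect tweaks. A minimal sketch using the stdlib sqlite3 driver:

```python
import sqlite3

# The same DB-API 2.0 calls would work against psycopg2 (Postgres),
# modulo the connect() arguments, the SQL dialect, and the paramstyle
# placeholder ("?" here, "%s" there).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vendors (name TEXT, exit_cost INTEGER)")
conn.execute("INSERT INTO vendors VALUES (?, ?)", ("BigCo", 9000))
row = conn.execute(
    "SELECT name FROM vendors WHERE exit_cost > ?", (100,)
).fetchone()
print(row)  # ('BigCo',)
```

That is the appeal of community abstractions over home-grown ones: the porting seam already exists and is maintained by someone else.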


It very much depends on the nature of the services and the abstractions.

I'm currently working on a system that has several multi-vendor abstractions - for file storage across AWS, GCP, NFS and various other things; for message queues across Kafka, GCP Pubsub, and direct database storage; for basic database storage (more key-value style than anything else) across a range of systems; for deployment on VMs, in containers, and bare metal; and various other things.

All of these things are necessary because it's a complex system that needs to be deployable in different clouds as well as on-prem for big enterprise customers with an on-prem requirement.

None of the code involved is particularly complex, and it's involved almost zero maintenance over time.

That would less be the case if you were trying to roll your own, say, JDBC or DB-agnostic ORM equivalent, but there are generally off the shelf solutions for that kind of thing.


I would never argue against doing it your case, but implementing an abstraction because multi-vendor support is an actual requirement is quite different from implementing an abstraction on top of a single vendor because you are trying to avoid "vendor lock-in".


I agree with this, but would note that building an abstraction layer is not the only way to approach this issue. Just building the thing with half an eye on how you would port it to a different platform if you needed to can make the difference between a straightforward like-for-like conversion and having to rearchitect the entire app...


I've been at places where they were so vendor locked to a technology that there was a penalty clause for leaving that was in the tens of millions. It obviously wasn't cloud but the point still stands. If you don't have options you pay what they tell you or go out of business.


Yeah, I can't help but wonder about offerings like AppSync; on one level it seems cool, but I recoil at the thought of introducing a critical dependency on AWS for a core piece of the application layer.


Is there a middleground?

Perhaps standardizing on something like Terraform allows you to reduce the risk of going all-in on one vendor.

Similarly with Kubernetes: if you go all in on k8s, do you care where it's hosted, or can you maneuver quickly enough to the best provider?


This has been my company's approach. There's always going to be some provider-specific stuff you have to deal with - the networking has been a major difference I've noticed between clouds - but I'm guessing in most cases our Helm charts would deploy unchanged to a different provider.


Most systems out there are not in the cloud (and multi-cloud is even more far-fetched).

There is however a large number of painfully learned lessons of vendor locked in systems... no one got fired for buying IBM, right?



