It's hard for me to know whether to feel bad for ES in this case. Did they bring it on themselves? Is Amazon too big and a bully?
From my perspective, Amazon has made most of its profit price gouging consumers on bandwidth after vendor locking them into their ecosystem, where they bootstrap new services by wrapping open source software with some provisioning scripts, management dashboards and cookie-cutter API / console templates. Indeed, most of this is templated -- AFAIU, for example, each AWS service autogenerates its Boto bindings and parts of its console frontend via code generators. Amazon has really mastered the factory process of churning out new services, and when they find a popular one, they can invest more resources into developing it than the original team ever could.
And therein lies the rub. If Amazon is improving the software in a way that the original team couldn't, it's hard to say that the community isn't benefiting. I think what strikes me the wrong way is that Amazon is not doing it for any altruistic reason. In fact, Amazon contributes very little to open source in general, considering how much they take from it. Compare them to Facebook (React, etc) or Google (tons of dev tools) or Microsoft (VSC, TypeScript). What does Amazon have? Firecracker, kind of? And now a fork of ES because that's the only way they could continue making money off it without violating the license a small startup put in place to stop them?
Well, good for Amazon, I suppose, but I find myself instinctively disliking them for this. I'm not sure what the solution is. Hopefully technologies like Kubernetes and Terraform will encourage big customers to become at least cloud-agnostic, if not cloud-independent. At the very least it would be great if Amazon / Google / Microsoft stopped gouging bandwidth at such absurd margins. Or not. Maybe it will be their downfall as startups differentiate along those lines. That would be ironic, coming from the originators of "your margin is my opportunity."
Personally I'm doing my part by not building anything with vendor lock-in. It's great to be able to deploy to any cloud, if you value either robustness or flexibility.
One thing which surprised me: Elastic has a market capitalization of ~$11B.
I think that changes some of the more floaty ethical concerns. This is not a David vs Goliath situation. This is Goliath vs Super-Goliath.
At this point, I'm much less interested in the drama of which mega-corp is screwing over the other. I'm more interested in: how does it affect me? When the titans are done trampling over the rest of us, which side benefits me the most?
It's too early to tell, but it seems like it'll be Amazon. The product is more open. They have a demonstrated history of great support. Yeah, they gouge us on networking and everything else, but at least they're the devil we know, and buying into the OpenSearch ecosystem has a greater probability of being the more open solution into the next decade.
Uhmmm I’m pretty sure David vs Goliath is talking about scale between competition. Saying that $11B is Goliath just because you’re sitting at $1M doesn’t mean they’re not in a crazy mismatched fight against a $2T company. In the same way you could be in a David vs Goliath situation yourself if you with $1M in wealth tried to sue someone with $25K of wealth. Everything is relative. Doesn’t mean it’s not a crazy unfair mismatch that doesn’t deserve sympathy and regret.
It may be relative but it isn't proportionate. An $11B market cap company can field a legal department about as competent as a $2T market cap company can. I, with $25k, absolutely could not afford the same lawyer as someone with $1M.
Have you considered that guy with $1M won't be willing to drop like $50k (2x your net worth) on a lawyer just to squash you?
He would need the possibility of earning $100k in the process to do that. Well, unless you really pissed him off.
Say you earn $25k a year and he spends $50k in one year to take your cake. Unless he really is a competitor who can make use of your defeat, he still needs 2 years to get even. And it probably isn't easy money, because there is always a risk he will not win. Maybe he can find better ways to earn more money than squashing some $25k guys.
Quite. I run a company with roughly £1.25M turnover per annum, so seem to fit the $1M guy example.
There is no way on earth that I would look at my "value" and compare it with a £50K "value" to decide on my legal "battle-worthiness." It is quite likely that we both have similar (to the same power of 10) insurance for whatever it is we are duking it out over. Anyway this is all a bit circumspect.
Against a company with billions to play with? I'll just keep my head down and crack on and hope not to be noticed 8)
It seems like the theme here is asymmetry in legal firepower & resources between entities.
I do personally feel that the asymmetry is smaller between Elastic and AWS than between, say, an indie dev and a company with $1M revenue, for various reasons.
This doesn't mean that there still isn't a meaningful asymmetry between elastic and AWS.
When there are two large entities (and I take this to mean way larger than you or your company) then you’re better off rooting for the one that at least releases code you can use.
If OpenSearch is truly open then in theory you can find another provider. But for ElasticSearch you’re stuck with them.
I don't think it's the freedom to run it anywhere. It's the freedom to run it anywhere and make changes to it that you don't contribute back:
1. Amazon wants to make private changes to the management layer for their cloud offering and not share those.
2. ES doesn't want that, so the 7.11+ license restricts it.
3. Amazon doesn't want to have to explain to their customers why their ES offering is stuck on v7.10, so they're changing the name of it.
4. Elastic was really hoping this wouldn't happen, but they overestimated the value of their brand and Amazon called their bluff.
So yeah, nominally OpenSearch is unrestricted, but realistically few other entities are in a position to make or benefit from the private modifications Amazon will be making. For us normal people, ES and OS are equivalent today, so it's more about how they're going to diverge over time in terms of fixes, features, whatever.
The SSPL directly prohibits offering the software as a service without releasing the source code of your entire operation regardless of if you change it or not:
> If you make the functionality of the Program or a modified version available to third parties as a service, you must make the Service Source Code available via network download to everyone at no charge...
> “Service Source Code” means the Corresponding Source for the Program or the modified version, and the Corresponding Source for all programs that you use to make the Program or modified version available as a service, including, without limitation, management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software, all such that a user could run an instance of the service using the Service Source Code you make available.
AKA someone must be able to create aws.example.com if aws uses SSPL code. Not exactly the 'freedom to run it anywhere'.
Doesn't a lot of that hinge on what the "service" is? Like, if I run some generic knowledgebase website and use my own ES installation to provide user-facing search, does Elastic want to see my Ansible scripts? Or just the ones related to the search function?
If this interpretation holds, I feel like it could be a pretty big problem for companies like GitLab, who not only have deep integration with ElasticSearch, but offer it as a premium-tier feature.
Maybe this really is just a cash grab from the Elastic side, like, "it's too easy to self-host rather than paying for our SaaS offering, so now you need to pay us for many common self-hosting use cases also."
There are several lawyers who believe the wording implies the problematic interpretation: /dev/lawyer is one on the internet, and one from my own anecdotes is the lawyer who advised the company I used to work at.
SSPL doesn't come up under free or open source licenses partly for this reason, actually.
> Making the functionality of the Program or modified version available to third parties as a service includes, without limitation, enabling third parties to interact with the functionality of the Program or modified version remotely through a computer network, offering a service the value of which entirely or primarily derives from the value of the Program or modified version, or offering a service that accomplishes for users the primary purpose of the Program or modified version.
This probably qualifies it as a service, but since it looks like gitlab doesn't install it or redistribute it Elastic isn't going to come after them (if they don't use SSPL code then SSPL doesn't apply). Plus, GitLab is effectively open-source, so they might meet the qualification for using SSPL anyways. Of course, I am not your/a lawyer.
> I run some generic knowledgebase website and use my own ES installation to provide user-facing search, does Elastic want to see my Ansible scripts?
According to Elastic, "it depends" - if you can just search by entering a term in the searchbox, you're probably fine; If you allow users to use Elastic's query language, you're very likely in violation. If you let your customers define Kibana dashboards, you're definitely in violation.
It's worth noting that you can do a custom deal with Elastic, as long as you pay whatever they ask & accept their terms (or renegotiate them, which is a fairly painful process) - so maybe GitLab did just that.
Great point. SSPL is basically the open source equivalent of “we had to destroy the village in order to save it.” In order to protect their IP from being “taken away” because of their original licensing decision, they basically chose a new license that’s effectively no longer open source (although it is “source available”).
Err, SSPL which Elasticsearch is licensed under isn't AGPL - it's a viral, proprietary license - not quite the copyleft we'd want if we wanted a fair playing field between all actors working on it.
If you think this is about Elastic caring about openness and freedom, ask yourself why they don't drop the CLA for a DCO and let themselves be beholden to the same terms
Except that FSF and Mozilla are not the usual for-profit companies with a primary mission of making as much money as possible. Elastic, on the other hand, is, and it's funded by VC money that wants a return on its investment.
It's a viral open source license, it literally requires open sourcing code, there's nothing proprietary about it except that we allow a council of elitist snots to decide what is and isn't Open Source(TM), and they have decided Google and Amazon support is more important than viable businesses which are building open source businesses.
> If you can't use the software because you're a cloud provider, that's rules 1, 5 and 6 easy.
I may be mistaken but I thought cloud-providers can use it and offer it for others to use as well, they just can't charge for that. But I may be wrong.
by definition sspl only qualifies as a source available license but it restricts stuff in so many ways that it is basically proprietary, even for normal uses.
btw. even open source licenses are not free or libre, they also restrict usages, just not as hard as most source available licenses.
source code is not open if I'm restricted from using it in most cases without making everything open. it's available of course, but that limits my use case by a huge margin. even open source (osi) licenses do that, but in a way more fair manner, which does not discriminate between usages.
heck I'm not even a fan of gpl 3 (especially agpl), because I think they also discriminate between usages and are also poorly written (too much stuff that is hard to understand without a lawyer)
This is incorrect, you are free to use the program if you wish, it just also, like the GPL, conveys requirements on open sourcing code you use with it. Open sourcing services without the necessary tools to run it isn't much good, so SSPL ensures the freedom to run your own better than GPL based licenses.
no. get it right. if you are not free to run the program without additional stipulations, especially stipulations that dictate how you license _your_ code, then you are not free to use the program as you wish.
Literally, "as you wish". Stipulations is "as we wish".
The AGPL was the first step in addressing a critical need to protect the users' right to run the software they use, the SSPL just continues that, as many services wouldn't be runnable without the support code around them.
I hear what you're saying, but those whose mission is to encourage open source software don't call the SSPL anything but a source-available, proprietary software license. Like literally the organization that defined what "open source software" is.
While you may disagree, you aren't going to convince me to take your word over theirs, especially when I have come to the same conclusions they did on my own.
On a moral principle, AGPL levels the playing field fairly, much like the GPL license, just with the terms of distribution updated for an age of software as a service. No one is exempted from the rules in AGPL the way they are in SSPL. It is intentionally viral, even if you'd dislike using it in many of your projects (though I think it's fair to say lawyers don't like the ambiguity of the language).
Just like some projects are best under LGPL instead of GPL, AGPL has its place in the toolkit of free software licensing.
Could Elastic's business model work if lucene were licensed SSPL?
I think it probably could not. And if lucene were licensed GPL, it would not be possible for ElasticSearch to use this new SSPL license, it would have to be GPL too.
The principle in the GNU Manifesto would be that the software is available for anyone to use in the same way, such that Elastic isn't given special standing to withhold its closed source additions to the software.
At this point, they are directly violating that core principle as well as the uncontroversial OSI directive 6 which copyleftists like myself don't really have a problem with... So I'm not sure what the issue here would be other than you dislike the larger companies' involvement in OSS? I think myself and others would appreciate clarification.
To be realistic and fair, I don't believe the part of Amazon that is doing the fork of Elasticsearch is worth $2T; they have other stuff to do.
Just like the guy who is worth $1M is not going to blow all of his money just to squash some guy. He probably has more powerful friends and maybe some connections, but I don't see him dumping $0.5M on lawyers just because he can take a piece of cake from the other guy that will make him $10k a year.
Imagine the head of the department storming into Jeff Bezos's office saying that he needs 20% of all Amazon's worth to squash Elastic, that would be funny. Quick calc with 20% I assume would be $20B which would be only x2 of Elastic, so I don't see something like that happening.
This argument is part of Amazon's PR campaign to tell devs not to feel sorry for Elastic because it's now a big company and makes money in the market. So, if you build a successful OSS project and start to make money, then it's ethical to clone any OSS and push projects out of the market, because now it is "Goliath vs Super-Goliath".
If Elastic's own SaaS isn't good enough to generate revenue that keeps its investors happy, and it didn't survive, that'd be a shame. Making anti-OSS moves to salvage things, though, is a disgrace.
Quite a few companies have proprietary products but have learned to make money selling services. One of them even has a 200B market cap and is called Oracle.
Right. But the question here is not about companies' value in the market. Why change the subject? This should be about the ethics behind Amazon's aggressive actions against OSS and their effect on the OSS industry.
The value in the market is the subject of the parent thread. Anyways, like I said, Elastic could've made money with a proprietary product but they chose OSS instead (and used Apache Lucene to build on).
AWS is just offering customers what they want and there are many other companies doing the same thing (IBM's Compose, Aiven, Instaclustr, etc). How is this against OSS? This is the OSS industry operating as intended.
Mentioning lucene raises an interesting question... what if lucene adopted the SSPL license that Elastic is adopting for their own product... could Elastic's own business model actually survive that?
This can't happen because Lucene is controlled by the ASF, not a commercial entity.
Allowing one of the various foundations to take control of an open source project can be beneficial for the community, as its licensing is unlikely to change in the future. However, it does present challenges for any future commercialization.
A good example of this is Confluent, which was founded by the creators of Kafka. LinkedIn, where Kafka was originally developed, transferred control of Kafka to the ASF. As the original developers, the Confluent team still has a lot of influence and contributes a lot of code to Kafka, but they do not wield absolute control. While this has presented some challenges in building a Kafka-centric business, and even led to them creating their own Kafka fork ("Confluent Server"), they have still been successful. The community also has long-term confidence in Kafka's license.
They would have to triple-quintuple-backflip-down on "open." So, maybe? It depends on how much value is being created besides the code, in squishier parts of the business like service, support, pricing models, marketing, and so on.
But it's moot, since Apache Lucene is part of the Apache Software Foundation and has much stronger promises about its licensing and governance. Which is not a small reason why Lucene is the de facto standard for search technology.
Yeah. It just seems to say something if the license they are insisting is the "spirit of open source" that everyone downstream of them should be okay with... they are counting on not having to deal with upstream...
Elastic contributed to Lucene. (They have committers on the PMC.)
If Lucene had adopted SSPL they would have been forced to fork. But basically nothing really interesting happens at the Lucene level for ES anymore. (Sure, there's always a lot to speed up, optimize, etc. But anyone who buys ES needs the fancy stuff, security/audit/management, not a few percent more RAM efficiency.)
Some of the biggest changes within ES come from Lucene, like a _massive_ reduction in memory footprint, enabling ES use cases not even possible before.
Your comment reads as if ES was naive in choosing open source, which allows AWS to fork the code. I read this argument all the time about OSS. Where does this conversation lead? I'll tell you: to the end of any meaningful attempt to create open-source code with a business model, aka the end of OSS.
OSS doesn't have anything to do with business models. Anyone can create open-source software and many do. The whole point is that it's free to create, see and modify. Some companies choose business models which include open-source software, and that comes with advantages (increased popularity and growth) and disadvantages (free usage, forks of codebase).
The reality though is that tech has changed and customers don't want to buy software, they want services now. ES has been slow to offer this, and they still offer it poorly, which is why AWS and other companies have filled in the gap. This is how business competition works.
Please answer these questions: Do you have a problem with the other companies (which are smaller than Elastic) that offer managed elasticsearch? Do you have a problem with multiple companies offering hosted Wordpress? Do you have a problem with any company that sells software or services that use open-source components?
I hear you. This is a valid argument. But I see it both ways. I will offer open source and get something else in return. My problem is, Amazon is a taker and it's getting greedy and not giving back. So do you think it is legal? Yes, it is. Is it ethical? No. You are rooting for the big guy, many people do. I simply care about the collective good of the industry and I am for open source. Just because you can, it doesn't mean you should.
I find it perfectly ethical to offer a service to customers based on open-source software. Again, nothing prevented ES from doing this for years, or sticking to only making proprietary addons. And again, there are several other companies offering the same services and many startups have built massive businesses on various open-source components.
Forks are a celebrated part of OSS and we only have more choice as customers now. The industry is better than ever. You keep ignoring these facts so at this point it's clear that you just have a personal bias against Amazon rather than a cohesive argument regarding OSS.
> The value in the market is the subject of the parent thread.
As I said, it's a tactic to change the subject. Instead of focusing on actions with bad faith, change the subject by saying they have a lot of money so Amazon is allowed to be competitive and fork the code.
> AWS is just offering customers what they want.
No. It's not about AWS improving some services. It's about a multi-billion dollar company launching a campaign against OSS. They did it with MongoDB and it continues. Some see these actions as justifiable, because OSS should be MIT and maintainers should live on donations. Others, however, disagree. It can be OSS and profitable (without Amazon's actions).
Forking OSS isn't bad faith. If you want to make people pay you, make your software proprietary. In that case, however, you wouldn't get the free ride into the market that being OSS gives you.
Yes. This is how you discourage people from doing OSS. The "you can't stop me" argument leads to a slippery slope, and if you care about open source you see it differently. It's not about whether you can, but about the consequences of doing it.
By people you mean the big tech companies who literally made zero effort to support OSS and do everything in their power to fully control the market, even if this means pushing the open-source business model out of the market? Or abusing the OSS legal license?
The license is not complex, but maybe the Wikipedia summary explanation of the license will help explain how it’s straightforward?
> The Apache License is a permissive free software license written by the Apache Software Foundation(ASF).[6] It allows users to use the software for any purpose, to distribute it, to modify it, and to distribute modified versions of the software under the terms of the license, without concern for royalties
That's the entire premise of OSS. Would it make a difference if they did it years ago? If not then what's the problem?
Again, the value AWS adds is in operational services because that's what customers want. Are you taking issue with all the other companies offering managed elasticsearch too?
I mean, Elastic was successful because of license arbitrage; to complain about said arbitrage when Amazon does it is ... well, it's hard to feel a lot of sympathy.
Exactly this. Elastic search took open source code and turned it into a service to profit from it while making improvements to the code. Amazon is doing the same thing, except they’re keeping their contributions truly open source.
Quick point, as yours is valid: these days an $11Bn market cap does not represent cash in the bank for development and R&D. It just reflects what the market thinks it's worth. R&D, of which there has been a lot, and which continues, is hugely expensive.
> they can invest more resources into developing it than the original team ever could.
I know this is a popular narrative, but as someone who works on AWS, I think you would be shocked by how small the individual dev teams are that build and maintain the services that everyone uses.
I'm not going to downplay the network effects involved. Of course AWS has a tremendous advantage in being able to standardize the customer billing, IAM, and EC2 Usage.
And there are economies of scale.
But individual AWS service teams are:
* incredibly lean and focused
* still have to make a profit on their own terms based on the infrastructure they build and the fees they charge customers
* laser customer obsessed to solve people's (developer's) direct needs.
I understand the community's concern about AWS investment and approach to OSS. But I can assure you (though you have no reason to believe me) that the goal is never to embrace, extend, then extinguish. It's all in the service of going where the customers are, and solving problems that they tell us they have. The profits are a byproduct. The "working backwards" process is no joke. We spend a lot of time figuring out what is the right thing for customers to build, start building it, and THEN we think about how do we make money from it.
> and THEN we think about how do we make money from it
Do you really need to think? Looking at the on-demand pricing in US East, a m5.4xlarge.elasticsearch instance costs $1.133 an hour, while a m5.4xlarge instance costs $0.768 an hour. That's 47.53% of extra money. And like you said, it only requires a small team to build and maintain the service.
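For anyone who wants to check that figure, here is a quick sketch of the arithmetic. The two hourly prices are the ones quoted above; the function names are my own, and the second number is just the same price gap expressed as a share of the managed price, not a claim about AWS's actual profit.

```python
def markup_pct(base_hourly: float, managed_hourly: float) -> float:
    """Premium of the managed price over the base price, as a % of the base."""
    return (managed_hourly - base_hourly) / base_hourly * 100


def share_of_price_pct(base_hourly: float, managed_hourly: float) -> float:
    """The same gap, expressed as a % of the managed price instead."""
    return (managed_hourly - base_hourly) / managed_hourly * 100


EC2_M5_4XLARGE = 0.768  # $/hour, as quoted in the comment
ES_M5_4XLARGE = 1.133   # $/hour, as quoted in the comment

print(f"markup over EC2: {markup_pct(EC2_M5_4XLARGE, ES_M5_4XLARGE):.2f}%")
print(f"share of ES price: {share_of_price_pct(EC2_M5_4XLARGE, ES_M5_4XLARGE):.2f}%")
```

The first line reproduces the 47.53% above. Measured against the managed price the gap is about 32%, which is the number to use if you want to call it a "margin" rather than a markup.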
It is no coincidence that all cloud providers are trying to ramp up their hosted services for open source software, even GCP, who historically only focused on their own proprietary stack. There's a lot of money to be made.
What people often forget is that cross-AZ traffic is included in those .elasticsearch instance prices.
In a 3-AZ deployment and given the frequent rebalancing/replication traffic that is a significant line item if you run it yourself.
I'm not arguing it's not expensive, but when you look at the TCO it's less obscene than if you just look at EC2 pricing alone.
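A rough way to sanity-check the TCO point, as a sketch only: the per-GB cross-AZ rate and the node count below are placeholders I made up (plug in your own bill), and storage, ops time and everything else are ignored. The hourly prices are the ones quoted upthread.

```python
HOURS_PER_MONTH = 730  # rough average month length


def self_managed_monthly(nodes, ec2_hourly, cross_az_gb, per_gb_rate):
    """Plain EC2 instances plus cross-AZ traffic billed separately."""
    return nodes * ec2_hourly * HOURS_PER_MONTH + cross_az_gb * per_gb_rate


def managed_monthly(nodes, managed_hourly):
    """Managed instances, with cross-AZ traffic assumed to be included."""
    return nodes * managed_hourly * HOURS_PER_MONTH


def break_even_gb(nodes, ec2_hourly, managed_hourly, per_gb_rate):
    """Cross-AZ GB per month at which the two totals are equal."""
    premium = nodes * (managed_hourly - ec2_hourly) * HOURS_PER_MONTH
    return premium / per_gb_rate


NODES = 3            # hypothetical: one node per AZ
PER_GB_RATE = 0.02   # hypothetical $/GB for cross-AZ traffic; check your own pricing

gb = break_even_gb(NODES, 0.768, 1.133, PER_GB_RATE)
print(f"managed is cheaper above ~{gb:,.0f} GB/month of cross-AZ traffic")
```

Below that volume the managed premium is not covered by the traffic savings alone; above it, it is. Whether a real cluster lands above or below depends entirely on its replication and rebalancing traffic.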
Do you really think that all it takes to operate an elasticsearch instance is to white-label an equivalent ec2 instance and execute some shell scripts on it?
Of course not, don't be ridiculous. My point is that whatever manpower and resources are required to run the elasticsearch service can be considered a fixed cost, while the same 47.53% markup can continue to "scale" indefinitely, regardless of whether you have 10 customers or 10,000 customers. That's the beauty of it.
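To make the fixed-cost argument concrete, here is a toy model. It assumes one instance per customer and treats the whole price gap as contribution, which is the simplification made above; the fixed monthly cost is a number I invented purely for illustration.

```python
import math

HOURS_PER_MONTH = 730  # rough average month length


def monthly_profit(instances, premium_per_hour, fixed_monthly_cost):
    """Premium collected across all instances minus a flat team/tooling cost."""
    return instances * premium_per_hour * HOURS_PER_MONTH - fixed_monthly_cost


def break_even_instances(premium_per_hour, fixed_monthly_cost):
    """Smallest number of instances at which the premium covers the fixed cost."""
    return math.ceil(fixed_monthly_cost / (premium_per_hour * HOURS_PER_MONTH))


PREMIUM = 1.133 - 0.768  # $/hour gap between the two quoted prices
FIXED = 500_000          # hypothetical monthly cost of the service team

for n in (10, 1_000, 10_000):
    print(n, round(monthly_profit(n, PREMIUM, FIXED)))
print("break-even instances:", break_even_instances(PREMIUM, FIXED))
```

Past the break-even point every additional instance adds the same premium while the fixed cost stays flat, which is the "scaling" being described.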
As we can see elsewhere in the comments, your biggest competitor appears to be the various kubernetes operators. With kubernetes as the common stack across multiple cloud providers as well as on-prem datacenters, the 10,000 customers can now contribute to the same operator project, to possibly get somewhere close to, or even exceed, what AWS can do with a small team of engineers.
Perhaps anticipating this trend, GKE now has an "autopilot" mode, such that a 30% profit margin is already included in the node pricing. That's one hell of a moneymaker.
they have as many people dedicated to the capability as needed. at least the capabilities have owners, which from my experience is not trivial for an organization to achieve.
curious if others have noticed this as well (capabilities without clear owners)? what does this depend on? time? company size? both?
Could you please shed some light on how many people would be behind a product like AWS Lambda or AWS CloudWatch?
As an outsider, I would guess huge swaths of developers with a massive hierarchy. Buildings full of folks working on AWS services. I have no idea and am extremely curious.
Interesting. I wonder if the market can only support managed services when it’s provided directly by the cloud provider like AWS. I assume Elastic has to add margins to cover for their existence, which make them less competitive with AWS.
This sounds a little unfair, even if I agree with the argument that they’re free to fork OSS software and do whatever the fuck they want.
It sounds trivial to "wrap open source software", but surprisingly it is a big value-add to thousands of companies. We can't just look at successful companies like Netflix to downplay the challenges of operating a service. Not every company knows how to operate complex systems at a manageable cost. How many companies can really manage a Kafka cluster, let alone scale it, for instance? Indeed, even companies that people deem powerful may screw up if they don't get their culture or process right. Take Uber for example: for five goddamn years they couldn't offer a service like EC2, let alone support persistent volumes. They couldn't make their database provisioning on demand via an API. Their MySQL-based NoSQL solution was still based on FriendFeed's architecture and the APIs were hard to use. Yet they spent millions building a k8s replacement, building a GPU database, switching from mysql to postgres and back to mysql, etc. So, yes, cloud companies like AWS build mere control planes to wrap open-source software, yet such a seemingly mundane offering does bring value to many customers.
A key reason for Netflix to have an easy-to-operate infrastructure is that Netflix prioritizes productivity and scalability. They specifically did three things:
1. No fixed deadline, with a few exceptions of course, for platform-related projects.
2. Promotion/salary negotiation was not tied directly to release of external features.
3. A single engineer could be responsible for more than one service for the entire company, with 24x7 oncall.
With Netflix establishing such incentives, engineers naturally focus on getting infrastructure right, to the point that oncall 24x7 is a non-issue.
So, yeah, culture matters, big time.
Edit: another incentive was that a service was measured by its adoption. The more people praised it, the more successful the service would be. Requiring meetings to get buy-in for a new service was considered a sign of potential failure. As a result, every single team focused on making the value proposition of their services obvious. Path of least resistance was a given instead of a debated topic.
This is the most important point, IMO. Amazon's value add is not the software itself, it's the operation of the software. That includes a LOT of stuff, not just making sure it's running. It's security modeling and patching, compliance, DDOS protection, etc. Amazon's product is an army of ops engineers working 24/7 to keep your stuff secure and online.
With that in mind, their behavior here makes a lot more sense, and comparing it with companies who have dramatically different products, like Facebook and Google, takes a lot of effort to understand the differences and what impact they have.
You can rent a car but nobody would suggest the car manufacturer be paid nothing for the privilege just cause the rental company cracked car maintenance and how to fill the tank.
The AWS "value add" is only value add in the context of being locked into AWS in the first place.
Elasticsearch is not a car manufacturer. They are an after market manufacturer for a car which they got without paying themselves.
Too many people completely ignore that the heart of Elasticsearch is Lucene. Elasticsearch adds things on top of it, but without Lucene it would be useless.
Elasticsearch is the same as Amazon here, but mentioning that doesn't make for such a nice narrative.
I'm not sure if I like your comparison to be honest... not only is it not _just_ maintenance and filling the tank, if we stick with your picture, there are also things like buying the car, paying for it when no one uses it, insuring the car, repairing the car, general logistics of moving it around when someone has a one way rental, etc. - basically making it convenient for consumers to rent a car.
Looking at what "as a service" providers of open source software do though, that is taking it a step further, since they wrap the software in a layer that _smoothes_ out changes for the user. Going back to the rental company that would equate to the car manufacturer deciding the indicator needs to be on the right side of the steering wheel now and the rental company installing an adaptor so that it remains on the left for you, to keep the look and feel for the user the same.
A lot of people on HN have very simplistic views of industries. I sometimes wonder how they're not all billionaires. Everything is easy, simple, *just* this one thing. You'd imagine they're out there killing it.
90% of the companies do not know how to manage software. They have weird dogmas, no KPIs, no ability to measure performance or debug problems. This is why they hire external consultants and cloud vendors. What is really funny is how they think internally about these issues. If Netflix and Amazon were publishing efficiency numbers and we were able to compare them with the bottom 95% of tech users, people would be shocked. The difference between the numbers I am aware of (number of computers / engineer for example) is 100x.
As a longtime ElasticSearch cluster admin/developer and Elastic Cloud customer, I don't feel bad for Elastic in the slightest and I'm psyched about this fork.
The way they operate their cloud service leaves a lot to be desired and encourages maximum spend if you end up wanting to use it for anything demanding in production.
My favorite moment as a customer dealing with all of the random times _they_ would take us down (even with triple redundancy) was their suggestion of operating a duplicate 3 datacenter cluster in production as a hot spare.
They must think their customers are cash machines.
Personal experience is that AWS elasticsearch has often been missing some really useful stuff, index rebalancing, some of the utility endpoints that avoid me having to spend forever rebuilding indexes, etc. I'd be a little spooked to run something really huge and customer-facing (maybe it's possible and I'm a n00b, but for the money we pay, it should be n00b friendly).
It's always great (really, it was quite easy to get started and usually works) until it's not (a couple times have had indexes break, or had to reindex to a fresh cluster to fix balancing problems).
I'm sure LOTS of people use aws elasticsearch for big, user-facing stuff, but I often feel you'd be better off managing it yourself if it were truly critical.
Also my saying that is extremely colored by experiences with pre-ES6 versions, where AWS's offering didn't have many of the configuration knobs available that you really _need_ to operate a decent cluster.
It's pretty bad when you reach the 40-50 node scale with 10s or 100s of TB of data. I've had about half a dozen calls with their service team about this over the last year.
This is really a shame to hear. There was once an Elastic SaaS company from Norway called found.io that was pretty sharp and customer-centric. They were acquired by Elastic pretty early on[1]. I believe Elastic Cloud was built from this. I guess found.io's culture of delivering a good product didn't survive?
Perhaps telling that none of the Found team are still at Elastic, from what I can tell on LinkedIn. I pay attention to that kind of stuff because (full disclosure) I operate the _other_ small customer-focused managed Elasticsearch company (bonsai.io) that _didn't_ get acquired by Elastic back in 2015.
Honestly the amount of negativity towards Elastic in this thread is jarring. I’ve been an ES customer (self hosted, some cloud) for years and have only good things to say about them. Maybe I’m less a fan of the number of features they try to bake in and the direction of the company towards using logs for metrics (which causes heavy disk load instead of just storing metrics in time series) but. Yeah.
I truly believe the negative voices are coming to the fore in this thread or they are paid Amazon folks (or they have a vested interest in AWS succeeding here).
No, I said “either these negative opinions are coming to the fore”- as in, I haven’t heard them before and they’re suddenly quite loud- which is entirely possible as people do not have a place to vent I guess.
Or people who have some form of vested interest in AWS succeeding (whether directly financial or indirectly benefiting from AWS being a monopoly) are influencing the discussion.
I was mostly thinking that in previous threads on this same subject these voices were not quite so loud.
Are you even a customer if you don't pay Elastic any money? You're a user, sure, but not a customer.
These other opinions and negative voices you reference come from actual customers who pay Elastic large sums of money, and they feel they don't get good value and service for that money they're forking over.
Because you said you're "self hosted" on 'some cloud'. You don't need to pay Elastic if you're self-hosted and not using the Elastic Cloud - you can just download it and use it for free.
They could be paying Elastic for gold/plat/etc licenses on their self hosted instances or “self hosted, some cloud” could mean they run mostly self hosted with some Elastic cloud usage.
They all give you levers. I'm more referring to being required to overspend to overcome Elastic's incompetence (i.e., you're already aware of the levers, they're maxed out and the provider doesn't have a way forward).
Referring to my other example, in more detail: Elastic Cloud suggests operating your cluster across 3 datacenters for redundancy. This is a good idea. Then Elastic does maintenance in all three datacenters that your infrastructure is in at once and takes your clusters hard down. This is fucking stupid. Elastic's suggested solution to the problem: Operate a _duplicate_ cluster in 3 datacenters in a different region/provider. No guarantees that they won't do the same there either so it's not actually a solution.
This! The company I work for spends over $750,000 with Elastic Cloud every year and the quality of service we get from them leaves a lot to be desired. I don't feel bad for Elasticsearch corp, they have done this to themselves.
> In fact, Amazon contributes very little to open source in general, considering how much they take from it
I don’t think this is a fact. Amazon seems to contribute pretty significantly according to the pages [0,1] they put out that describes their contributions. Not to mention their membership in OSS foundations like Linux Foundation. [2]
You do add the caveat that it’s relative to the benefit they gain, but that’s pretty hard to measure. And I don’t think it’s really a good measure.
I’d like to learn more about why you make such an absolute claim and maybe you have some better measure.
I remember back in the 90s when big orgs (Microsoft, IBM) didn’t contribute to open source and can’t even think of any big orgs today that don’t contribute to open source. Even Oracle has big open source projects.
The absolute claim, aside from being at home in a rant on HN, comes from a cursory glance at https://github.com/amzn, weighted by contributors and popularity, and compared to companies of similar size. Google, Microsoft and Facebook all build and maintain multiple open source projects that are hugely popular with people who use them outside of the company sandboxes. For example, people benefit from React without Facebook gaining much directly. (Facebook! If Facebook has any redeeming qualities, it's their open source contributions to the frontend ecosystem, although I promise you I could ascribe malicious intent to those as well...) Contrast that with Amazon. On their GitHub page, I see a few obscure projects amongst a bulbous array of AWS SDKs.
To the sibling comment that asked about Firecracker -- I think Firecracker is awesome, and I did mention that in my original complaint. They even created it themselves! Well, a team of amazing engineers in Romania did. I have no personal insight into the matter, but it seems like they operate relatively independently from the AWS profit machine. Good for them too, it's incredible software. But I'm sure if they were to tell the story of how they got buy-in at Amazon to open source it, the same themes would come up -- how does Amazon benefit from this? In the case of Firecracker, the more people test it / harden it / run Doom on it, the more value Amazon can provide on its serverless platform. So again, unlikely to be purely altruistic intentions... but that's not to say there's anything wrong with that. I just find it all a bit distasteful in aggregate.
Yes that's true. Also interesting is, AFAIK, DynamoDB does not follow the Dynamo paper. DynamoDB uses an architecture based on single-leader replication, and the Dynamo paper describes a process where a coordinator node is in charge of replicating to each node. If I am not confused, any node among the top N nodes in the preference list can act as the coordinator.
(This information may be slightly out of date at this point)
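For anyone who hasn't read the paper, here is a minimal sketch of what a "preference list" means in the Dynamo-paper sense: hash the key onto a ring and walk clockwise to collect the first N nodes, any of which may coordinate the request. The node names and helper functions are made up for illustration, virtual nodes are left out for brevity, and this says nothing about how DynamoDB the service actually works.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string onto the ring using MD5 (the paper hashes keys with MD5)."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


def preference_list(key: str, nodes: list[str], n: int) -> list[str]:
    """Return the first n distinct nodes found walking clockwise from the key."""
    ring = sorted((ring_position(node), node) for node in nodes)
    positions = [pos for pos, _ in ring]
    # First node whose position is past the key's position, wrapping around.
    start = bisect.bisect_right(positions, ring_position(key)) % len(ring)
    count = min(n, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


if __name__ == "__main__":
    # Hypothetical node names, purely for illustration.
    nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
    prefs = preference_list("user:42", nodes, n=3)
    print("preference list:", prefs)
    print("any of these may act as coordinator; a typical choice is", prefs[0])
```

With single-leader replication there would instead be exactly one node allowed to accept writes for a given partition, which is the contrast the comment above is drawing.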
But I think it's important to note that these companies don't contribute to open source out of any moral obligation, do they?
I think they do it to tie more developers and development around their eco-systems and products.
Maybe Amazon should get smart and start doing something similar. Or maybe they don't need that. But in any case I don't hold it morally against them that they don't. I think a bigger issue is it seems they pay and have been paying very little or no taxes.
No I would not add Kotlin to Google's credit. All the initial work and exponential adoption started with JetBrains. Google only greenlit it as an official Android dev language eventually (barring whatever OSS work they're doing on it only now).
Google is surely pumping lots of money into Kotlin; had it not been for them, Kotlin would just be yet another runner-up in the long list of JVM guest languages.
It is also due to Android Studio performance issues that Kotlin compiler improvements came to be, and also the compiler plugins for stuff like kapt and Jetpack Compose.
Android Java might be stagnant, not kept up to date with standard Java, yet it still rules in Studio tooling performance.
Kotlin was created by the JetBrains team behind IntelliJ/ReSharper. It's named after Kotlin Island, near St. Petersburg, Russia. It has nothing to do with Google...
I really don't understand this argument. Why do you think Facebook and Google have so many open source contributions? Is it really out of the goodness of their hearts? Or is it because that was part of their DELIBERATE STRATEGY to attract talent and use OSS as part of marketing outreach.
Microsoft's core business is developer tooling. In the 90's and early 2000's that could be closed-source and proprietary. By the 2010s it was clear that the only way to operate with the kind of tools they have is to be open source, so they pivoted. But their goal is still business.
Google built Kubernetes as a platform play to compete with AWS and Azure - brilliantly - by feeding engineer fears about "lock-in", giving them a set of tools that they could justify feeling "free", and then when the engineers invariably said "this is too complicated to build, maintain, and operate" they turned around and sold a GCP managed kubernetes solution! After all, who better to operate Kubernetes than the team that built it, amirite?
Android is the same play just competing with closed-source iOS instead of AWS.
Facebook built GraphQL for developers on THEIR Platform.
Apple built Swift for developers on THEIR platform.
Examples like this are just as cynical and capitalistic profit-driven as AWS "open sourcing" an SDK for interacting with AWS.
> Well, a team of amazing engineers in Romania did. I have no personal insight into the matter, but it seems like they operate relatively independently from the AWS profit machine.
Amazon, AWS included, is a massive multinational corporation with development teams around the world, including Romania, where we have had a dev center for a very long time: http://romania.amazon.com/#/
There is no such thing as the AWS profit machine. All dev teams around the world operate with similar levels of autonomy and responsibility. It's just that some of them are working on super internal systems, some on super external, and some open source. Some make a ton of money, and some don't make any, but are beneficial to the overall developer experience/ecosystem, and so make sense.
I have no idea if Amazon does this or not, but maintaining forks of projects is no fun, so it's in a company's best interest to contribute bug fixes and improvements that aren't part of their secret sauce.
Did the BSDs do it wrong, too? Apple uses a lot of FreeBSD software that they turn around and sell for profit. How about PostgreSQL, didn't Amazon fork that as well? My point is, there's nothing wrong with forks nor companies forking if the license allows for it. It's up to the developers to choose an appropriate license to not be forked/ripped off, if they so desire.
I personally am against modern day corporate America, but I can't blame them for this. The software is given away free/libre/gratis to be forked by whomever.
Perhaps to combat this, one should choose not an "Open Source(TM)" license but a source-available license, e.g. https://mariadb.com/bsl11/ (not my personal favorite, just an example).
Also, I definitely agree with/do the same:
> Personally I'm doing my part by not building anything with vendor lock-in. It's great to be able to deploy to any cloud, if you value either robustness or flexibility.
My favorite "conspiracy theory" is that AWS intentionally creates stupidly verbose and numerous headers in all of their APIs just to up the bandwidth usage a few bytes per request at a time.
The problem is, that AWS is not really sharing the internals of their fork. Big corps always say that there's no point in open sourcing their fork, because it depends so much on internal systems. Which might be true, but in case of AWS and infrastructure the possibility of an AWS competitor similarly implementing those internal services would probably help the market...
MySQL for example took a proprietary API (the mSQL API) and implemented an open software for it. Then Oracle bought MySQL. MariaDB started and it quickly dropped full compatibility. Options for the developers/users. Great.
AWS/GCP/Azure on the other hand distorts the market. Just as Google/Apple/MS fund Chrome/Safari/Edge development from other sources, thereby distorting the market, forcing browsers to be free as in beer. (Which arguably hurts the market.)
People want to pay for services, not software and licenses. They want turn-key solutions that are available via API and GUI, instantly and on-demand. This is the fundamental reason why AWS is so successful and the demand is constantly proven with every new product launch.
Elastic (and other vendors) complaining about this instead of using it for their own success is a problem of their own making. At least a few companies are finally learning.
What ES wants of course is for Amazon to give them a cut of revenue from hosting ES.
We already know Amazon isn't interested in doing that (either at all, or at whatever price ES wanted, we don't know that).
They had no legal requirement to when ES was open source. So ES changed the licensing to no longer be open source.
So, Amazon could... a) decide to give ES a cut after all, b) decide to stop hosting ES, or c) fork the last open source version.
I don't think anyone is surprised they chose c? Presumably ES isn't either? Maybe ES thinks this will be good for them/bad for Amazon anyway, because they are hoping potential customers will abandon the Amazon fork and stay on the original ES fork?
Not sure why they'd be confident in that exactly. Maybe they know what they're doing.
As users/customers, we would rather have a choice of hosted vendors/platforms, and that it remain un-forked (so we can use/write software compatible with either vendor/platform). Competition is good for us as users/customers, that's in fact one of the reasons we choose open source, so no one vendor can set the hosting price all on their own without competition. We want to be able to choose among competitors for hosting, based on price, customer service, performance, uptime, whatever.
But ES didn't want that, they didn't want hosting competition to exist -- at least not without permission and agreed upon cut for them -- because, I guess, hosting was how they planned to make money as a company to fund development as well as profits for investors etc. So they changed their license to no longer allow it. So of the possible outcomes remaining... this one seems as good as any for the user/customer, I guess?
So, when you say "I'm doing my part by not building anything with vendor lock-in" -- I'm not sure which course you are suggesting. In fact, between ElasticSearch and the new OpenSearch fork.. it's OpenSearch that is the one without vendor lock-in, right? OpenSearch is Apache licensed, and can be hosted by any vendor and still forked by anyone. It's ElasticSearch that has a license limiting what vendors can host it (without permission of ES), so it's the one with vendor lock-in, right? So not building anything with vendor lock-in means... ?
Good interesting points. Now Amazon will be the good guy because they will run open-source version, whereas ElasticSearch is not, if I understand you correctly.
No single capitalist wants competitive markets. They want monopoly, for themselves. It is only when they don't have the monopoly or an easy way to get it that they cry for competitive markets. And that is good of course.
How is it price gouging if the price is on the tin? It isn't like there's a "surprise" as to how much they charge, and it isn't like there aren't a dozen alternatives including DIY.
I'm the first to say that AWS is too expensive, and I vote with my wallet (and the company I work for by proxy). But I'll never claim that there's any gouging involved.
Price gouging is the practice of using outsized leverage in a particular market to charge excessive prices. Like snow shovels doubling in price after a snow storm. Or $10 water bottles after a hurricane.
So for AWS the term is arguably correctly applied.
But I'd be more worried about the market if AWS was artificially undercutting pricing because it would kill the incentive to create competitors or innovation in the space.
> Price gouging is the practice of using outsized leverage in a particular market to charge excessive prices. Like snow shovels doubling in price after a snow storm. Or $10 water bottles after a hurricane.
> So for AWS the term is arguably correctly applied
What outsized leverage have AWS had for a decade? There are multiple competitors at different levels, AWS are just better in terms of coverage/redundancy and amount of services.
Point was just that having the "price on the tin" is irrelevant to whether there's price gouging going on.
But, I guess the argument could be similar to the case against Apple for its iOS App Store: there are lots of competitors, but the lock-in arguably creates a market definition of Apple customers. AWS customers are largely locked-in and at the mercy of AWS prices.
That's the argument. I'm not sure I buy it, but it's one perspective.
"without violating the license a small startup put in place to stop them?"
Elasticsearch was first released over a decade ago. ElasticSearch, now just Elastic, the company was founded over 9 years ago and now is public. Are they still a "small startup"? If so when does a company graduate from that status?
> Amazon is not doing it for any altruistic reason
The beauty of OSS is that motives don't matter. If Amazon contributes and it's not detrimental in some way to the code, then it's a plus for anyone else who wants to use it.
Precisely! While their business itself may need to be broken up, a community governed OSS project isn't bad for OSS when the alternative is a proprietary license that gives a single corp the ability to not contribute back or be exposed to virality.
All this being said, progressive corporate taxes seem more enticing year after year.
When a product that was previously Open Source changes to a non-open license, it's not uncommon for someone to pick up the last Open Source version and fork it, and release that for others to use and collaborate on. That's always going to be a good thing; that means people who care about Open Source licensing continue to have a version to use and collaborate on.
Is it really price gouging for bandwidth? Or is bandwidth just really expensive in general? I honestly don't know. I would assume if it was actually much cheaper one of the clouds would undercut the others to get customers.
It's absolutely price gouging. I'm not going to rant about this for the 100th time, but at least I'm in good company [0]. Do the math on the cost you pay if you saturate 1gbps for a month vs. the cost you pay for 1gbps IP transit at basically any colocation provider.
Really this is the secret sauce of the cloud. Create new abstraction layers where you can charge for logical separation on a physical basis. First VMs, then containers, then serverless... Would be cool if somebody did it with bandwidth (looking at you, Cloudflare). Why can't I buy an elastically sized pipe? Why do I need to pay for the stuff I put through it instead of reserving a size for the time I'll need it?
That Twitter thread is only including pure bandwidth. What about all the highly redundant networking equipment (firewalls, routers, switches, Nitro offloading, DDoS protection, attack detection), software for all those abstractions you get (VPCs, subnets, security groups, VPC peerings, Elastic IPs etc.) and engineers? You don't pay directly for any of what I listed, and bandwidth seems to be the most reasonable product to lump it all in.
It's like going to a restaurant and complaining about the price of steaks because beef should cost a lot less. There's a ton of other things involved, and yes, they probably have a decent margin, but not as much as you initially imply.
AWS has advantage due to economies of scale. (And naturally some disadvantages and challenges due to sheer size, which increases fixed costs, but that means that they can't easily downsize, which doesn't apply since they are still growing at an incredible rate.)
So they should be able to reach the lowest amortized cost for bandwidth with all of those costs included.
They price bandwidth so high because they can and they are still growing.
boy howdy, with all the flak i hear about this and the awesome talent in tech, you'd figure an entrepreneur or 1000 would take a stab at this, make it better, charge less. apparently there's gazillions to be made by even charging 50 percent of what AWS does.
so, when should we expect this gloriously efficient competitive market to kick in to action?
my guess is that the AWS ecosystem, despite "price gouging", is simply the best and will be because this is really hard, non-glorious engineering, where solid reliability actually matters. anyone who wants to can go ahead and co-lo, so, whatever. people who want cloud will pay, and those who can't or won't, will not.
There's tons of businesses that are happy to charge less for bandwidth; so it's clear Amazon (and some of the other high tier cloud services) are overcharging on this maybe by a factor of 10, although since transit bits are not all equal, someone with more detail could make a case that the overcharging is less.
It's easyish to compete on bandwidth costs, but Amazon has a lot of other features many people want; it's harder to replicate all of those, especially the part about having a long history of operating such services and not making a lot of changes to make things more expensive or otherwise more difficult. Having to pay a much higher than market price for an easily replaced good in order to get a good that's less easily replaced is textbook anti-competitive bundling.
If your bandwidth usage is high enough, maybe it makes sense to send it all through AWS direct connect, and pay for transit yourself; although even then, the AWS direct pricing seems a bit high.
If you ask the fancy restaurant to cater a steak dinner for 1000 people they will charge you a lot closer to supermarket beef.
The point is that if you charge absurd prices for what has basically no marginal cost your pricing model is broken and 1) you are excluding customers that are particularly sensitive to this price or 2) you are liable to undercharge other customers that primarily use other services for which you are not charging what it costs you to provide.
For AWS, given the generally inflated prices, it's probably a lot more of 1) than 2).
As I explained in another comment, AWS wants to disincentivize dumb bandwidth usage. They want you to use your bandwidth for traffic that needs it to EC2, and you get much better rates for static data from CDNs, S3, etc.
Your point that "AWS has a bunch of other benefits to where people just accept the bandwidth costs so they can leverage those other benefits" doesn't actually counter the original claim that "the bandwidth costs are outrageously overpriced".
You're replying to a comment that includes a tweet from the CEO of Cloudflare, which is quite literally providing that competition with free bandwidth and an increasing suite of computing products.
There are plenty of other platforms as well, like Digitalocean, that have much lower bandwidth pricing.
Nobody else can give you bandwidth out of Amazon data centers. Amazon's advantage is having a ton of services that work together, and they take advantage of it to price gouge on bandwidth.
If you're buying a standalone CDN service you can get massively better rates.
I'm kind of surprised that people are this upset about how much AWS charges for bandwidth. They may charge more for bandwidth than a colo would but they're not a colo. At a colo you get a network port and that's it: you provide everything else yourself, with its attendant cost, and you roll that up into your total bill.
If a colo provides you a 1 Gbps connection if you use less, you don't get a refund. And most of the time you don't get 24/7 saturation, or you get charged on some 95th percentile billing, and their networks are almost always oversubscribed anyways.
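For readers who haven't dealt with it, 95th percentile ("burstable") billing is commonly described like this: the provider samples your port usage at fixed intervals, sorts the month's samples, throws away the top 5%, and bills at the highest remaining sample. A rough sketch, assuming 5-minute samples and a 30-day month; the function name and the traffic numbers are made up, and real providers differ in the details (e.g. whether inbound and outbound are measured separately).

```python
def billable_rate_95th(samples_mbps):
    """Return the 95th-percentile sample: the highest value left after
    discarding the top 5% of samples."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # ceil(0.95 * n) - 1, done in integer arithmetic to avoid float rounding.
    index = (95 * len(ordered) + 99) // 100 - 1
    return ordered[index]


if __name__ == "__main__":
    samples_per_month = 30 * 24 * 12  # one sample every 5 minutes, 30 days

    # Hypothetical traffic: a 100 Mbps baseline with bursts to 1000 Mbps.
    short_bursts = [100] * (samples_per_month - 400) + [1000] * 400
    long_bursts = [100] * (samples_per_month - 500) + [1000] * 500

    print("400 burst samples ->", billable_rate_95th(short_bursts), "Mbps billed")
    print("500 burst samples ->", billable_rate_95th(long_bursts), "Mbps billed")
```

With 8640 samples in the month, the top 432 (36 hours' worth) are ignored, so the first case bills at the baseline and the second bills at the burst rate.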
AWS is trying to disincentivize using it as a dumb pipe. They want you to use it smartly and if you just want to push static data there are much more cost effective ways to do it, such as CDNs, which are more cost effective for both you AND AWS.
Comparing AWS bandwidth costs and Colo and even other clouds like Oracle isn't fair because different things are associated with that cost.
It really is price gouging - bandwidth is actually cheap.
A couple of comparisons:
Oracle Cloud give you 10TB of bandwidth for free, with overage charged at around €7.5/TB.
You can rent a VPS from the likes of Hetzner, and they will throw in 2-20TB of bandwidth for free, with overage charged at something like €1/TB - AWS charge an eye-watering €125 for each TB!
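To make those comparisons concrete, here is a back-of-the-envelope sketch using only the per-TB figures quoted above. The 50 TB monthly usage is an arbitrary example, the Hetzner allowance is assumed to be the upper end of the quoted 2-20 TB range, and AWS is assumed to have no free allowance and no volume discounts, which is a simplification.

```python
def egress_cost(tb_used: float, free_tb: float, price_per_tb: float) -> float:
    """Monthly egress cost in EUR: only traffic above the free allowance is billed."""
    return max(0.0, tb_used - free_tb) * price_per_tb


# (free allowance in TB, overage price in EUR per TB), taken from the comment above.
PLANS = {
    "Oracle Cloud": (10.0, 7.5),
    "Hetzner": (20.0, 1.0),   # assumption: upper end of the quoted 2-20 TB
    "AWS": (0.0, 125.0),      # assumption: no free allowance, flat rate
}

if __name__ == "__main__":
    usage_tb = 50.0  # arbitrary example workload
    for name, (free_tb, price) in PLANS.items():
        cost = egress_cost(usage_tb, free_tb, price)
        print(f"{name}: EUR {cost:,.2f} for {usage_tb:.0f} TB")
```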
I think the reason the big 3 (AWS, Azure, GCP) still charge such huge amounts is that they profit so greatly from it, and there is more than enough business to go round.
> I think the reason the big 3 (AWS, Azure, GCP) still charge such huge amounts is that they profit so greatly from it, and there is more than enough business to go round
Or that their networking does vastly more and is of better quality? Hetzner can't provide you with everything around a VPC that AWS do. Just a tiny example - peering VPCs across regions, which is free.
1. High bandwidth charges are an effective form of lock-in. Once your data is there, it's prohibitive to move it out again.
2. Bandwidth use is very poorly understood in many businesses, compared with simpler metrics like storage, memory, and CPU. AWS can run razor-thin margins on things that people easily compare to on-prem or VPS-style offerings, and then make the money back in areas like network traffic, fine-grained monitoring, and other items that are less obvious.
I totally agree, but was stunned to see DigitalOcean charging $0.10/GB for outbound transfer for their new (quite cool imo) apps service. You do get 40GB-100GB included, but it means it's unusable for bandwidth heavy apps. They include 1TB (which is pooled across all VMs) and charge $0.01/GB on their standard platform.
I suppose you have object storage which is still $0.01/GB (plus includes 1TB to start with), so for most apps your bandwidth will come out of that for most files, so your $0.10/GB price is only for html/api/etc transfer. But still, it's annoying and seems completely arbitrary.
Why would players in an oligopoly undercut each other when their implicit agreement around pricing makes all of them richer? Also, second tier cloud providers like Oracle give deep discounts and still can't compete with AWAZGO so pricing isn't necessarily a main competitive advantage.
Oracle's problem is that nobody wants to work with Oracle. If I was managing a high-bandwidth service I would avoid Oracle Cloud purely to avoid legal risk to my clients.
While I agree that's why no one uses Oracle, corps not competing for price at the tier 1 provider level still isn't great for the consumer/market, which is what I think the original commenter was getting at.
Keep in mind bandwidth gets cheaper as AWS gets bigger. If you are some random tiny colo provider, people don't necessarily care to peer with you unless you pay them for the privilege. If you are originating 20% of internet traffic, now people need to peer with you or their customers won't have a great experience.
IP transit costs something like $350-$700/mo for a Gbit. Amazon are certainly getting better rates, so even with equipment costs I doubt they're spending much more than $0.005/GiB. Their pricing starts at $0.15/GiB. (Not to single out AWS, the other big providers are much the same.)
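Doing the math with the figures in this comment: a rough sketch that converts a flat monthly price for a saturated 1 Gbit/s link into an effective per-GiB cost and compares it with metered pricing. The $350-$700 transit range and the $0.15/GiB rate are the numbers quoted above, not verified price-sheet values; a 30-day month and 100% utilization are assumptions that flatter the flat-rate side.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month
GIB = 2**30


def gib_per_month(gbps: float) -> float:
    """GiB transferred in a month by a link saturated at `gbps` gigabits/s."""
    bytes_per_second = gbps * 1e9 / 8
    return bytes_per_second * SECONDS_PER_MONTH / GIB


def transit_cost_per_gib(monthly_price: float, gbps: float = 1.0) -> float:
    """Effective $/GiB of a flat-rate link that is kept fully saturated."""
    return monthly_price / gib_per_month(gbps)


def metered_cost(price_per_gib: float, gbps: float = 1.0) -> float:
    """Monthly bill for the same traffic at a per-GiB metered price."""
    return price_per_gib * gib_per_month(gbps)


if __name__ == "__main__":
    volume = gib_per_month(1.0)
    print(f"1 Gbit/s saturated for a month: {volume:,.0f} GiB")
    for price in (350, 700):
        print(f"${price}/mo transit -> ${transit_cost_per_gib(price):.5f}/GiB")
    print(f"metered at $0.15/GiB -> ${metered_cost(0.15):,.0f}/mo")
```

That works out to roughly 300,000 GiB a month, so the quoted transit prices land well under $0.005/GiB, while the metered rate comes to about $45,000 for the same traffic. Real links are rarely saturated, so the effective flat-rate cost per GiB will be higher in practice.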
That sounds like a great deal: you're paying less than half of Hurricane's advertised minimum price. And I had understood HE were a mid/low price carrier.
Can I ask, are you in a data center? US or Europe?
What do you mean? Hurricane's current special is $90/1Gbit/m.
US.
Some of the larger bandwidth transit resellers like FDC will do Cogent for $0.02/Mbps. They have a marketing site at epyc100gig.net (not an affiliate or employee; just an FDC customer).
I believe that kind of low-for-1G-commit pricing is for their fully owned FMT1/FMT2 US CA facility. You have a lot easier peering in EU (AMSIX, DECIX, etc) that will help compared to the US's love of commercial exchanges like Any2/Coresite/Equinix where peering costs practically more than transit.
There are also ways out of vendor lock-in. Alternator comes to mind as a way of migrating DynamoDB workloads out to other cloud vendors or your own servers: https://docs.scylladb.com/using-scylla/alternator/
The main thing I look at in this situation is their approaches to security. ES decided authentication is a paid-only feature implemented via closed source proprietary code, and the result has been countless PII leaks; now sure you could say that’s the developers’ fault for making the endpoints internet-accessible, but when the system has been designed from the start to both be insecure and hold PII, you have to place some blame on the provider as well.
Amazon on the other hand developed a free and open source auth plugin and anyone is able to deploy it no problem.
There is absolutely nothing “cloud agnostic” about using Terraform. Every provisioner is specific to a cloud provider. If you are at any scale, moving a k8s cluster is the least of your issues.
It seems reductionist to say Amazon primarily wraps around open source. What about EC2? S3? Glue? DynamoDB? Many of the services that provide the most value are services Amazon has built out.
On top of these, many of the core services that AWS themselves rely on, like SQS, SNS, Kinesis, Lambda, CloudFront, ECS, Fargate, Elastic Beanstalk, are mostly homemade.
This is Amazon's playbook. Make a direct competitor and squeeze the originals out. They did that with jewelry early on, and then anyone they couldn't buy out they would undercut until they capitulated, like diapers.com. The Everything Store by Brad Stone goes over this in detail. Clearly anti-competitive monopolistic actions are taken constantly by Amazon. The only reason they aren't trust-busted is that the common line of reasoning is that consumers pay less for goods, but this is being looked at, because shouldn't competition be lowering prices? I.e., if Amazon hadn't killed diapers.com, wouldn't diapers be cheaper overall? And the answer is they should be, but the government hasn't caught up. Once they start getting into the weeds, they'll see example after example of monopoly behavior destroying competitors and ultimately raising prices on consumers.
You described the cloud lock in model very well, especially the bandwidth part. What they charge for outbound is nuts, and the other clouds are not much better. Inbound is free of course to make it easy to send your data in but costly to get it out.
As someone who tried buying services from ES and had to deal with their smug salespeople, who had total disdain for those who wanted to give ES money, I am happy I will never need to deal with them in the future.
Coincidentally, AWS hasn't open-sourced anything that they use internally. Zilch. Nada. And yet they are using 'open source' developed by another firm (smaller is inconsequential here) to market themselves.
IMO, FOSS licensing is completely broken. Its definitions (of what is free/open) are from a boomer's era that is no longer sustainable. At least I wouldn’t want any of my FOSS projects to become the “corporate strategy” of any particular proto-nation.
That made me wonder which megacorp did? Google published a lot of papers about their architecture, but no code. They have many open source projects modeled after internal tools (eg. bazel), but there's no search/GFS/mapreduce/monorepo OSS project from them. MS has VSCode, that's great, and they even open sourced .NET. But Azure is a full black box, just like (almost) every Windows component. (Finally calc.exe is OSS!)
What's worse is the second order repercussions. Future open core/open source SaaSes will go straight to something like the Business Source License or the MongoDB license instead of traditional libre licenses. Amazon has done an incredible amount of damage to the open ecosystem.
"having subsumed the opensource version of ES, we are now relicensing, calling it our own, and would be really happy if the opensource community would like to contribute, because actually we don't totally understand how this product works. Many thanks to all who help us"
The issues with the AWS version are manifold, but the main one is that it forces extra usage of expensive EC2 units, for the following reasons:
1. Blue/Green updates -> Start a new cluster with the new version, lock the current cluster, copy over all the data (can take over a day), at the same time write to both clusters, when finished, lock both clusters while endpoints are swapped over, unlock the new cluster, trash the old cluster.
During this process the customer pays for both.
- Solution, done properly: fire up a new node with the new version, swap it in, wait for 'green', take out the old one... wait for 'green'.. rinse and repeat. Result.. the system never goes down, endpoints remain the same, less cost to the customer.
2. There are a minimum of the following node types:
- Master (small) x3
- Data (heavy on storage, medium on memory)
- Hot Data (very heavy on memory as shards have to be held in memory)
- Coordinating (query) nodes, heavy on memory, light on storage (cos there is no real storage)
- Ingest (same as Query).
- voting only.. tiny
- AI/ML heavy on memory and storage.. cos they do real work
In the AWS world they have:
- Masters
- Data
- Hot Data (cos they are really pricey)
The Data nodes do all the functionality of the data, ingest and query nodes. The main query always goes to a data node, which passes it to other data nodes to get the data out, then aggregates the results locally.. so it's incorrect usage of the system. A data node should only ever deal with its own data, and pass the results it finds back to a node away from the data.
As your system is squeezed.. you add data nodes... not coordinating/Query or Ingest nodes, which you probably need. But that's more money into the coffers.
3. Their user hashing function is also old (2011 version of bcrypt), and fails. Any static password produces different results every time you use it.. at least on the current opendistro. So you are forced into either not using security, or using proper security, which can be cumbersome across the cluster (if rolling your own).
However.. on the cloud version they have base-level security working, so that's OK.. they aren't using their own 'open' software.
I could go on.. but it's all flaky, and misunderstood at the core developer level. To put it on record, I have spoken twice to the core development team to describe how updates should happen.. the last time 18 months ago (by 'spoken' I mean a face-to-face video call).. so they know how it should be done.. but don't do it.
4. I have noticed that there are some security-related functions/methods available on the setup in Amazon Linux that don't exist on other Linux versions (CentOS/Debian). AWS Linux is 'lifted' from Red Hat.. so another piece of software that they didn't write.. but obviously Red Hat are happy with this. Maybe they got the licensing deal sorted.. who knows.
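On point 3 above: a different hash for the same password on every run is, by itself, the expected behavior of salted schemes such as bcrypt, since each hash embeds a fresh random salt. What matters is whether verification against the stored hash succeeds. Below is a minimal sketch of that idea. It is not the opendistro code; bcrypt is not in Python's standard library, so PBKDF2 from hashlib stands in for it, and the function names and iteration count are made up for illustration.

```python
import hashlib
import hmac
import os

# Illustrative work factor only, not a recommendation.
ITERATIONS = 100_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is drawn on every call."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

Hashing the same password twice gives two different digests, yet both verify. If verification itself fails, that would be a real bug rather than a property of the salt.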
Basically it goes like this:
"Who would actually install, in production, what essentially is very close to pirated software?"
Roll your own.. it's a bit of grunt work up front, then 50% the cost of the cloud version.
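A rough sketch of the node-by-node replacement from point 1, compared with blue/green on peak node count. Everything here is a toy model: `Node`, `cluster_is_green` and the version strings are hypothetical, and a real cluster would have to wait for shard relocation rather than just counting nodes.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: str


def cluster_is_green(nodes, min_nodes):
    """Stand-in health check: 'green' here only means enough nodes are serving."""
    return len(nodes) >= min_nodes


def rolling_upgrade(nodes, new_version):
    """Replace nodes one at a time; return (new node list, peak node count)."""
    nodes = list(nodes)
    target = len(nodes)
    peak = target
    for old in list(nodes):
        if old.version == new_version:
            continue
        # Fire up a new node with the new version and swap it in.
        nodes.append(Node(f"{old.name}-new", new_version))
        peak = max(peak, len(nodes))
        if not cluster_is_green(nodes, target):
            raise RuntimeError("cluster not green after adding node")
        # Take out the old node, then check for green again.
        nodes.remove(old)
        if not cluster_is_green(nodes, target):
            raise RuntimeError("cluster not green after removing node")
    return nodes, peak


def blue_green_peak(nodes):
    """Blue/green runs a full second cluster alongside the first."""
    return 2 * len(nodes)
```

In this toy model a three-node cluster peaks at four nodes during a rolling upgrade, against six for blue/green, which is the cost difference being complained about.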
So VSCode, which is basically slomo Sublime Text for retards and Typescript which is usable only for prototyping are a good enough compensation for having most of their income from Linux (on Win and Azure) after all the almost criminal FUD they poured on Linux[0]? Come on, get real.
Is Starbucks too big and a bully? Sure, they force Mom and Pop shops to close by out-competing them, and that sucks. But bullying? I believe that's just Capitalism.
Or say you're 6'6 and weigh 230LBS and you join a football team full of people who are 5'9 and 175lbs. Are you a bully just because you're bigger?
ES basically handed them a platter with a goose laying golden eggs and a sign that read "Free Goose" and hoped they wouldn't try to make money off the eggs.
> Is Starbucks too big and a bully? Sure, they force Mom and Pop shops to close by out-competing them, and that sucks. But bullying? I believe that's just Capitalism.
I'd say it's more like a mom-and-pop coffee shop giving free coffee to patrons hoping to make money on cookies, and Starbucks coming in, taking the free coffee, opening a nice stand right next to the shop and selling the coffee they got for free.
How is it unethical if ES chose to explicitly allow it with the license they picked? The analogy would have to include a sign under the free coffee that literally said "Feel free to sell this free coffee for profit!", for it to be accurate.
I don't see how it's unethical if you do something that someone said was perfectly fine to do. "Obvious chosen outcome" comes to my mind long before "unethical".
You could always look at the Oracle version (wait.. please...)
They started selling consultancy.. the solution was an Oracle DB...
To expand across the world they franchised it. The agreement was: you take the franchise, and if you hit x target in 4 years, we will pay you y amount (large)... and take it over.
That worked well. Don't see any reason why Starbucks, or anyone else, could not take that on board, then everyone is a winner.
If they're giving you free coffee, and you sell the coffee you got for free... where's the harm? Starbucks is giving it away. Why would anyone buy your coffee rather than get it for free from the source?
Now, assuming Starbucks charged you for the coffee, and you then re-sold it, this should also be fine. Starbucks is charging you presumably a rate with which they can recoup their costs. But if they are selling it below cost, they are clearly putting themselves at risk. A lot of businesses take this kind of risk as a strategic part of their business, like with making the coffee free. But you have to do it in a way that a competitor can't turn around to their advantage, or you're screwing yourself.
Enter the concept of "not for resale". If a seller enters into a contract with a buyer, that contract can stipulate that the buyer can't resell the goods. Starbucks could theoretically require you to sign a contract saying you will not resell their coffee. That's pretty standard with licenses, even software licenses.
ES must have known that their license did not forbid reselling. Yet they based their business model on this resellable coffee that they were giving away in order to make money on cookies.
Is it Amazon's fault that ES chose a business model where they were selling coffee at a loss? Does Amazon have an ethical responsibility to keep ES's business afloat? Should we find any business unethical that tries to undercut the customer base of a rival, or take advantage of a rival's shaky business model?
I think you have to come up with a whole framework for ethical competition, because one rule at a time isn't gonna capture it. (But I also think Capitalism is inherently unethical)
I think the claim that Amazon is winning through "vendor lock-in" is pretty silly. Honestly anyone who can't quickly migrate the stuff they're hosting on AWS onto one of the many other cloud platforms is pretty bad at DevOps. If you're using K8S/Docker/etc it should be trivial. But even if you're not, the vast majority of AWS offerings were either built to be API-compatible with other existing tools (e.g. postgres-compat Aurora RDS), are literally identical to other services you can self-host (e.g. ElasticSearch) or others have built services compatible with AWS services (e.g. DigitalOcean's "Spaces" aim to be API identical to AWS S3 – you can literally use Amazon's S3 client libs to interact with various S3-compat services from other clouds).
It's not "lock-in", it's providing a great all-in-one solution. You can host everything you want to host on AWS, which has good stability, good latency, etc. People are locked in because the DX around using AWS for everything your platform needs is just better than other platforms / having different services on different cloud providers (at least for many people).
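To illustrate the S3-compat point: with boto3, pointing the stock S3 client at another provider is mostly a matter of passing a different `endpoint_url`. A minimal sketch, assuming boto3 is installed and credentials are configured; the `ObjectStore` helper and the example Spaces region endpoint are illustrative assumptions, not a tested migration recipe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectStore:
    """Hypothetical description of an S3-compatible provider."""
    name: str
    endpoint_url: Optional[str] = None  # None means "use the AWS default endpoint"


def client_kwargs(store: ObjectStore) -> dict:
    """Build keyword arguments for boto3.client(); only the endpoint differs."""
    kwargs = {"service_name": "s3"}
    if store.endpoint_url is not None:
        kwargs["endpoint_url"] = store.endpoint_url
    return kwargs


def make_client(store: ObjectStore):
    """Create the client. boto3 is imported lazily so client_kwargs works without it."""
    import boto3  # third-party; assumed to be installed

    return boto3.client(**client_kwargs(store))


AWS = ObjectStore("aws")
# Example endpoint for illustration; check the provider's docs for the real one.
SPACES = ObjectStore("spaces", "https://nyc3.digitaloceanspaces.com")
```

The same `put_object` / `get_object` calls should then work against either client, at least for the subset of the API the other provider implements.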
If you're not using any of the AWS services, that might be true but then you're also leaving a lot of potential on the table.
If you're "cloud-agnostic" and could migrate away from AWS in the blink of an eye then you're paying for an overpriced VM offering and should probably migrate to a cheaper hosting provider immediately.
> If you're "cloud-agnostic" and could migrate away from AWS in the blink of an eye then you're paying for an overpriced VM offering and should probably migrate to a cheaper hosting provider immediately.
No, this is what everyone suggesting this does not get. The offerings are not equivalent.
There is a baseline of services that the cloud providers offer that can be made functionally equivalent. It's not just EC2 but more like EC2, S3, Lambda, RDS, DynamoDB, ECS, EKS (plus some others and of course the other clouds' equivalents). The secret sauce is in the APIs and permissioning, and in all of these being available within the same VPC (talking to each other without paying bandwidth costs).
"Cloud-agnostic" has _never_ meant "just VMs". Some of these services are majorly hard to duplicate on your own VMs as well. Feel free to implement "cheap VM hosting + S3" and burn cash on transit costs.
Cheaper hosting providers do not give you this by miles.
I buy the main reason being the all-in-one solution; the comprehensiveness is attractive. However, I think you're underplaying the lock-in: migrating clouds is non-trivial, mostly due to stuff that's not running in k8s/docker/etc: stateful apps (Postgres, etc.) and/or just static data like S3. This takes time, careful planning and sometimes downtime, and is mostly avoided due to it being hard.
You're conflating the difficulty of migrating a complex system to anywhere with "vendor lock-in". It would be harder to migrate an Aurora RDS Postgresql instance to an Aurora RDS MySQL instance than it would be to migrate from Aurora RDS (postgres) to a hosted postgres anywhere else.
Everything on AWS is accessible via API, which means you can easily automate the migration process. So do that?
Obviously the more complex a system is the harder it will be to migrate, but that has nothing to do with Amazon trying to "lock you in" and everything to do with it just being a complex system that there is no industry standard solution for.
It appears you are using one definition of lock-in while others are using another. What your version of lock-in seems to entail is deliberate actions by the hosting provider to force you to stay with their offering. When I (and others, it seems) use it, it just means the natural friction that keeps you with the provider. It doesn't have to be a deliberate strategy to be lock-in. That doesn't make it any less difficult/painful to migrate.
What would you say they lock you into? What is AWS doing specifically that makes it harder to move between providers than if they were not doing that thing?
What do you mean when you say managed services? Because to me managed services are definitely not vendor lock-in because you can self-host most of those services and just migrate your data over.
With regards to permissions, I don't feel like it's possible to permission across an entire platform and not make it difficult to migrate over – do you know of any provider or platform that allows for cross-platform authentication with permissions?
Is there any good done by Amazon to support OSS? Like, ever. They started with cloning MongoDB and now Elastic, with actually zero contribution to the community regardless of their insane profits. This is a clear signal. Amazon can always clone and redistribute any open source software, then lock it in for AWS. If we've started to witness a decline in OSS, well, at least we know now who started the wave.
Do people assume that a company as large as AWS would automatically have a lot to contribute to OSS? Maybe most Amazon teams don't have much to open source yet. Contributing to OSS is a bottom-up effort. An engineer needs to be motivated to generalize her project, to peel the code from Amazon's vast internal infrastructure, and to go through an approval process to open source her project. Given that many teams have razor-sharp focus on delivering features, for good or for bad, I wonder how many engineers are really motivated enough to open source something internal.
I don't think it's a bottom-up process. I feel it's often a top-down process. Most companies with lots of open source activity normally have management that has decided it is something they want to encourage, and then it comes down to people making their code open-sourceable.
It is both a top down process and the culture of the company itself. In the real world, we know how big tech open-sourced some of the most efficient technologies that empowered hundreds, if not millions, of startups.
You don't need to assume. Amazon's Open Source team regularly talks about how much they do for open source in my company's Slack. It's their own words.
Amazon will of course have a PR statement about how it cares about open source. It's not like Apple is known for its commitment to open source but it has a similar page: https://developer.apple.com/opensource/
Firstly, Apple is a hardware company and AWS is a services company; these two things are quite different. I'm not sure why you think mentioning them in this discussion brings anything helpful to the conversation. Secondly, WebKit is hugely influential as an open source project.
Well, Apple actually makes quite a lot of software, doesn't it? Such as its own series of operating systems and applications to run on their hardware? I'm not really sure how you think they aren't relevant to a conversation about OSS.
Maybe a dual-company setup like Mozilla's would at least make it more clear that the big guys who are just taking advantage of the free lunch are doing more harm than good?
Elastic should have a for-profit company and a non-profit one taking donations, which would actually control the open source code and hire the core part of the team working on it.
I mean, we know how badly Amazon is behaving here, but at least they should have an option where they could realistically invest.
Asking a company to invest in a competitor that can grow and eat their lunch with the money they invest is not realistic, not least because the company investing the money would want to know that the money is actually being invested back into the open source software and not used by a competing company.
If, given this choice, they didn't invest back in the foundation, it would be much more clear that they are doing it in bad faith.
I like to think, or at least I hope, that OSS isn't going to decline, it's just going to evolve. The existing OSI licenses weren't meant for a world of clouds and SaaS. I believe we'll have to find the next generation of licenses that can succeed where AGPL tried and failed.
As far as I know DocumentDB is closed source. Btw: I can name zero OSS projects from Amazon and this is not the same for Microsoft (VScode, TS) and FB (React, Jest).
On top of that AWS has done a lot of contributions upstream, particularly around the linux kernel, https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin..., (particularly the virtualisation stacks involved), some around OpenJDK at what looks like an increasing pace, and the like.
I know several employees contributing to the Linux kernel, and I know they're hiring more Rust contributors. You're right though, I think Amazon lags compared to MSFT, FB, and G. Even Netflix or IBM/RedHat, for that matter.
A start may be to open source some technology they use that could profit and help other startups. Maybe open source their own version of "React". That would be a good start.
I have mixed feelings about this server side license stuff that MongoDB started. Imagine where the internet would be today if the creators of Apache and MySQL had tried to prevent shared hosting providers in the early days of the web from using their software.
The time when Apache or MySQL started out was very different. Imagine where the internet would be if cloud computing itself didn't take off.
Do you remember a time when there were hundreds of hosting providers? Do you remember WebHostingTalk where admins would go to check hosting offers from suppliers around the world?
The monopolies finished that era. So I don't think that software companies trying to adapt now can be seen through the lens of what was 15 years ago.
There are more hosting providers today than in the era you are talking about. AWS has a "monopoly" simply because large companies are using it, and back in the day those companies would have run their own datacenters not used a shared PHP host. For a personal site or startup you have a thousand other options.
I wouldn't say AWS has a monopoly. There are tons of providers ranging from Digital Ocean to Google Cloud. AWS is just the largest because they've been around the longest. They're so deeply entrenched in cloud that most people automatically think of them. I think things such as Netflix talking about their engineering and how they use AWS were a major boost at the start. Nowadays people are literally studying to become AWS certified. We even have an AWS specialist working at my company. AWS and other cloud providers are also very smart in locking in startups by offering them thousands upon thousands in free credits. Build your MVP on there and then end up vendor locked in, but think it's a good thing.
I heard a tidbit that I don't know if it's true or not but I would like to think it is. AWS has become so expensive for some companies that they're starting to migrate back to their own datacentres.
I think you need to get a reality check. I have worked as an early engineer to 12 or more startups in the last 15 years and am very active in communities ranging from HN, IndieHackers, OnDeck or a dozen others.
I have not come across a single founder in the last 5-6 years who does not start with AWS credits or is not craving for them.
AWS is nothing but a monopoly and they have utilized access to their cash to buy early customers. The same goes for Google and MS. The smaller hosting providers that you believe exist are actually stuck with whatever customers they used to have or a chance new WordPress blog.
Not everything runs on AWS. In Europe, where a company might only sell to one country (like Sweden, where I live), many won't see the traffic required to scale beyond a single box, and those same smaller companies might also not want to pay with their kidneys to host that one VM. Cloudflare is making this possible, so for smaller businesses and sites one could argue that Cloudflare is the real monopoly (entirely different markets).
Now companies like Shopify are enabling people to run a shop without any ops for peanuts.
Or smaller SMB IT environments: we run a small datacenter at my company running VMware software to run our customers' domain controllers, ERPs, fileservers and such, though this is decreasing. We used to run Exchange too, but migrated every customer to Office365 because it's cheaper than on-premise licensing, and we have 0 ops. The fact that we're local means we can offer dark fibre to many customer sites, giving them 0ms latency to us, making even the chattiest shit system run like things were on their premises. I guess we're "computing at edge" :p
Not saying you're wrong that AWS is a cloud monopoly, but not everyone is purchasing in the cloud market (the majority of the world's money doesn't flow through startups going global). And there's also Azure, which is growing at an incredible rate, challenging AWS at migrating legacy workloads to the cloud.
I'm not even that sure this is true in absolute numbers, but I'm pretty sure it is not in relative figures. There are waaaay more customers/businesses on the Internet today than there were 15 years ago, and not that many more providers.
Do you have a source for the data that there aren't significantly more hosting providers today? The ease of setting up Plesk or cPanel to become a shared hosting provider today is incredible, essentially 0 maintenance, so I doubt your statement holds true, considering how many more people are on the Internet today compared to 15 years ago.
Depends on the license, but if it was SSPL then Elastic wouldn't be able to run their hosted service without open-sourcing every bit of it, including all of the security, infrastructure monitoring, and authentication that is the backend of their business model. See https://news.ycombinator.com/item?id=26784552
Yes, but would we have TimescaleDB, CockroachDB, ElasticSearch, Docker (containers at all) and projects like these if there wasn't any money at the table?
Not saying you're wrong, but it's a multidimensional problem. One could argue AWS is nothing like shared hosting providers(compare scale), and a webserver is essentially "stateless" which means easier to build than say... A database holding sensitive information that doesn't break. I assume this is why postgres HA and horizontal scaling still really isn't a thing, while CockroachDB funded by VC "solved" this problem.
I think it's fair to let companies monetize the service they built, while allowing people to run it on their own if they can. A problem here, though, is that the company's incentives mismatch the open source project they're running: CockroachDB enterprise has killer features that the open source version doesn't have and that no one will be able to PR, because the company will reject it.
TimescaleDB went ahead and opensourced all their features, aligning their incentives with the project, but I don't know of anyone else who has done this.
The server side license stuff doesn't prevent shared hosting providers from using software. It just requires hosting providers to open source their infrastructure.
I think the internet would be even better today, if shared hosting providers had been sharing infrastructure technology since 25 years ago.
I think that's missing the forest for the trees. The license is designed to prevent hosting providers from selling the software as a service to their customers. The requirement to open source their entire infrastructure and operations is just a means to do that.
Technically it doesn't just prevent selling SaaS. It prevents selling SaaS without having the consent of the open source project first, which means negotiating a deal.
The value in AWS is not the software. (From what I heard about it a few years ago, it's very explicitly not the cobbled-together software.) It's the whole enterprise, the structure, the teams, the operations staff.
I mean... there's very obviously a limit, right? You don't use an infinite number of programs. And it's not "all programs you use", it's "all programs that you use to make the Program ... available as a service". You don't need to provide the code for Microsoft Solitaire, but you do need to provide the code for Apache httpd and your Linux distro. Which is pretty easy to do!
If there is any limit it's certainly not very obvious. For example if you manage/develop the service from your Macbook then are you using macOS "to make the Program available as a service"? And the rabbit hole just keeps going deeper. Running it on EC2 or ESXi? Whoops. Wrote any ad-hoc shell scripts? Better have them archived. What about all the firmwares and driver blobs, are those also programs that are used?
Sure, you could probably fight it in courts to get some reasonable delineation. But the license itself is horribly vague. So the safe bet is to avoid it like the plague. It's just not a very well written license.
Why? Dockerized versions of countless server-side software are there for free. In a lot of cases the value is that someone maintains it and operates it for you.
I feel Amazon took the feedback from the DocumentDB/MongoDB fiasco to heart and made positive change in their approach.
DocumentDB is a closed source proprietary database created by Amazon to emulate the MongoDB API. Think Google's Dalvik runtime vs Sun/Oracle's JVM.
This time around we have an open source fork of ES with big backers all contributing and very permissive licensing.
In both cases, Amazon gets to implement AWS-specific upgrades to management to depend heavily on EBS replication rather than application-layer replication. Would it be nice to have that secret sauce that makes Aurora/DocumentDB so nice to use compared to self-hosting or RDS? Of course. Do we have to have it to consider using or contributing to the open source software? No.
On the other hand, MongoDB was already sort of obsolete and trending towards death by the time that all ended up happening, while ElasticSearch is hot and "new".
Where do you get the impression MongoDB is trending towards death? It seems to be growing by some metrics; the stock price has more than doubled in the last year. Not a fan myself, but it still seems a long way from death to me, and they seem to be doing something right in the enterprise market.
There's a difference between trending towards death and dead.
IBM has been trending towards death for decades now and it's nowhere near dead.
MongoDB is certainly on the road to death, IMO. As has Oracle DBMS since the 90s.
Most companies tend to make more money as their product's growth stalls out (extracting more money from existing customers).
The fact that Mongo had to create a shitty license points to huge revenue problems. If you look at the major trends and surveys, the ones targeted at people who actually drive database adoption within companies, MongoDB has been sliding YoY for 3 years now.
The place you see Mongo growing is Atlas. Yes, as their competitors can no longer offer MongoDB in their clouds, revenue shifts to MongoDB. That does not mean that use of the database itself is growing.
Find me a bunch of developers and infrastructure people who are EXCITED about running MongoDB. I guarantee you're going to find a 7-10+ year old infrastructure if you do.
Yes, because Mongo's license prevents them from using alternatives to Atlas (read: other cloud providers). Of course Atlas growth looks good. It's them or self-hosting.
That's not the same as starting a new product on MongoDB.
It's been said that the best way to fortify your business is to use your clout to make the world inhospitable for adjacent businesses.
As someone who is not currently in the cloud, that idea strikes me as being very pertinent to what's happening here. Increasingly many technologies are becoming cloud-only, or have non-cloud offerings that are decidedly second-class. Elastic offers on-prem support. I doubt Amazon will be doing the same with OpenSearch.
It may be a subtle effect, but it's pushing the world in a direction that makes me uncomfortable. If it's harder for non-cloud-based companies to maintain non-cloud-based offerings, then that will push the industry even more toward being dominated by SaaS products. And these products often leave clients and users locked in, with limited control over their own data, and, by extension, reduced ability to control their own fates. What I worry about is that we may be witnessing a return of Embrace, Extend, Extinguish, only in a new form that's even more dangerous because it's harder to see.
I appreciate the discomfort people have about the SSPL. It is a departure from the original ideas behind FOSS. But, at least as I see it, those open source principles were never an end in and of themselves. They're a means to a greater end: digital autonomy. To the extent that very large companies seem to be learning how to co-opt FOSS in order to re-assert control, FOSS's ability, in its current incarnation, to serve that end may be waning.
Elastic's on prem support amounts to little more than an onsite where they explain to you how the Java Garbage Collector works. EVERY detail about tuning your clusters derives from keeping the Java GC from ruining your day.
There's some index template optimizations but any semi-competent engineer or dba should be able to figure all that out (it's literally all in the documentation about what not to do).
You'll still be able to pay a consultant to come help you -- they don't have to be from Elastic.
In fact, it seems like Amazon just created an industry for third party consultants here.
From the announcement: "You should consider the initial code to be at an alpha stage — it is not complete, not thoroughly tested, and not suitable for production use. We are planning to release a beta in the next few weeks, and expect it to stabilize and be ready for production by early summer (mid-2021)."
Given that Amazon announced the fork in January and they don't expect it to be production-ready until summer, I'm guessing they've underestimated the amount of work required to package and distribute a product as complex as Elasticsearch. Given that, I doubt they will be well-equipped to keep pace with new feature development.
I would question the assumption that “not suitable for production use” means “everything is broken and we're way behind” rather than, say, “we are being extremely conservative because our customers will expect support as soon as we say it's production ready and we need to test every upgrade scenario for our large number of existing customers”. The AWS-managed ElasticSearch seems to be pretty popular and I would expect them to be as conservative about new offerings as they are with, say, RDS.
> they've underestimated the amount of work required to package and distribute a product as complex as Elasticsearch
The bulk of the work thus far has been to strip out the non-OSS components ("X-pack") and the many references to it, nothing to do with packaging, distributing, or even maintaining and developing features.
I for one will be happy when those are taken out. So many headaches trying to get bloated Kibana to start as a docker container before realizing that some random x-pack-disable flag needs to be set for it to start without a random error.
I'm not sure I agree with that assessment. Now that the fork is publicly available, others can contribute to get it ready, which wasn't possible until now.
Yes, others can contribute, but significant feature development on large-scale OSS projects tends to be driven by developers paid to work on the project full-time and coordinated by an organized steering committee with clear governance (or company if the product is owned by a single company). I don't see any of that in place for OpenSearch and getting that all started up is not at all a trivial endeavor.
The fork was announced as a response to the Elastic stuff. I don't think they made any predictions about when it'd be ready in that blog post, so I'm not sure why they would've underestimated anything?
I have been perfectly happy with ES cloud services. Is this done with honest intentions by AWS, or is it simply based on the fact that ES is making a lot of money off the cloud services?
Well, now nobody can provide a competitor to ES cloud services for newer versions. If you upgrade to v7.11 or above, you're locking in your choice for 'managed hosting for ES/Kibana' to ES cloud services.
> Is this done with honest intentions by AWS, or is it simply based on the fact that ES is making a lot of money off the cloud services?
Is wanting to make money considered honest intentions? ES released v7.10 under the Apache license on purpose, so they knew that any form of license change would mean people can still use earlier releases without having to adhere to the SSPL, and anyone could legally fork it or run it on non-ES hardware without having to pay the piper.
Will be interesting to see the resources that AWS will throw at this. You can get a sense of the resource that elastic.co is throwing at elasticsearch at
I scrolled back through the commits. It looks like they've been removing traces of x-pack, Elastic branding, licensing checks, etc. since the beginning of March. So far it looks like one person is doing the bulk of all that work.
If there are new features, I haven't seen any. The real question is whether they are putting together a team for new feature work, or whether this is just a fork that is doomed to fall behind as Elastic's huge team continues to develop their code base: fixing bugs that will never get fixed on the Amazon fork, adding features that will never get ported, eventually releasing the 8.0 release that has been in the works for two years, etc.
I don't see any evidence that they have that team so far. They're paying a few people to go through the moves of forking but I don't really see a grand vision beyond that so far.
I really need to add a compare feature to my tool as it would make analysis a lot easier. Having said that, there is no denying there is a huge difference in work being done in both projects over the past 30 days.
Amazon does have 16 open pull requests though, with about 7 having 20 or more file changes, but I didn't dig into them to understand their significance. Maybe it's another feature I'll need to work on.
If you look at the one year window for elasticsearch
its churn and activity has been extremely consistent and I'm not sure if this is an investment Amazon can and/or is willing to make.
However, knowing enterprise, I'm not sure if this will make a huge difference as those making the decisions might not really care and they'll just accept whatever Amazon tells them.
A lot of the projects look like they were created to make it easier for Amazon to manage Elasticsearch on their infrastructure and/or to overcome license limitations for features that already exist in Elasticsearch.
Are there specific repositories that you know of that would contain functionality that Elasticsearch does not have, that would be a strong differentiating factor? I'd be curious to index these projects to get a better idea of the investment that Amazon is putting into opendistro-for-elasticsearch that could have an impact on OpenSearch.
Most of the modules replace modules which Elastic provides, but not in the free distribution. For example, alerting, SAML auth/SSO, field and document filtering by user/role, etc.
However note that the most substantial module, the authentication module, is actually a fork of another product called Search Guard.
The differentiating factor in my case is simply the price. I think Elastic's X-pack modules actually provide a more complete overall experience, it's just not worth the cost. The Open Distro modules provide me with a budget alternative.
Elastic spends so much time and effort on making sure that their search is performant (and they are not shy about deprecating and removing features that are slow). I think this is where Elastic will continue to shine. It's one thing to add features, it's another to make it so they work well and make sure the integrating product team doesn't shoot themselves in the foot.
What may hurt them, though, is the number of customers that feel things are currently "good enough". I don't know what their sales engagement looks like, so I'm not sure if this will really hurt them or not.
Perhaps! My money is on Elastic including an approximate nearest neighbor search in 8.0 which uses the new HNSW feature in Lucene 9, which is going to be hard to do in a distributed capacity and will be a significant feature if they can pull it off.
PS - I've never seen gitsense before and it's really cool. I especially like the focus quadrant!
Since search is your domain expertise, I'll take your word for it :-)
As for GitSense, check out the impact section and sort contributors by "First Commit". With this, you'll get a very good idea of the developer's expertise level.
I think even with their "permissive usage guidelines" of the OpenSearch trademark, their own way of how they've been doing Amazon ElasticSearch would not be allowed... For example, you can't do: `Microsoft OpenSearch`, you have to do "for OpenSearch" or "with OpenSearch compatibility".
From their "permissive" trademark policy [1]
> You may also use the “OpenSearch” word mark to make accurate statements about compatibility and interoperability using relational phrases such as “works with,” “runs on,” “compatible with,” and the like (e.g., “Foocorp Software powered by OpenSearch” or “Foocorp Software for OpenSearch” or “Foocorp Software with OpenSearch compatibility”).
Hey All, if you're interested in getting a good understanding of this vs Elasticsearch, we invited the team to give a Haystack LIVE talk where they outlined the details and goals of the project: https://www.youtube.com/watch?v=J_6U1luNScg
OpenSearch was once an initiative founded at A9, an Amazon subsidiary, to create a personalized, cross-service search engine: https://archive.is/PCKWq
OpenSearch is from an era when Amazon and Google were covertly competitive. Google didn't get anywhere with Froogle and AppEngine; whilst Alexa and A9 didn't move any mountains.
I'm happy to see a couple of good choices made here:
- Sticking with Apache 2.0
- Asking for a Developer Certificate-of-Origin rather than a copyright assignment
This bodes well for the future of this fork. Amazon also has the resources necessary to keep up consistent and quality maintenance of a project on this scale.
Elastic would definitely like you to view AWS as the Big Bad here, but their response to the Elastic betrayal is very good, and I would like to see more like this in the future.
I think this thread is much about shared source licenses like SSPL vs. "orthodox" open source licenses like GPL.
Based on the link below it seems to me the difference is that SSPL etc. have a clause which prevents me from making money by selling the use of the licensed software over the network for instance.
GPL puts some rather strict rules on users of copyleft software, mainly that you MUST distribute your modifications with the same license.
What I don't quite get is why adding a rule that says "if you make this software usable over the network you must make it usable for free" would be considered categorically less ethical than GPL.
GPL says you must give out your modifications for free.
SSPL says you must also give out the rights to use that software for free as well.
Isn't SSPL more ethical in the sense that it requires you to give out more for free?
Ethics requires a framework. Just because something is free it doesn't become good. For example free heroin samples!
It's a complex problem to even phrase the question of what we mean by having a healthy software/IT ecosystem. Do we simply count the number of users? GDP of the Internet? Number of git repos? Naturally those don't even begin to capture the self-balancing dynamics we are after. We want to encourage folks to start new ventures, but also to give back. But by giving back what if they eliminate old ventures? (Eg. Google "giving back" Chrome might make the Firefox venture non-viable.) How can we describe healthy competition? (It'd be good if the browser market wouldn't be cross-financed from ads, but - let's say - every user would tell their ISP to direct some of their subscription fee to one of the browser vendors.) Okay, but what does this have to do with licenses!? Yeah, it's a fairly hard problem.
I feel this is a very scary trend starting. I have not come across a single founder in the last 5-6 years who does not start with AWS credits or is not craving for them.
AWS is a monopoly and they use their cash to buy early customers. Initially it was Amazon's money, but now AWS has enough cash of their own to push whatever they wish to. The same goes for Google and Microsoft.
AWS directly building up the software side of what started out as IaaS (Infrastructure as a Service) is only going to hurt software vendors. We can only expect new software players or ones with low capital to restrict their licenses even more.
Open source licenses are not only for ideological freedom, but very necessary for companies (end users) to integrate and modify products on their own. We will migrate more toward source-available licenses instead since big giants are going to corner the small companies.
I find it fine for software to not be open source. Source-available but closed-source (in terms of freedom) is perfectly fine from a commercial standpoint and should be the gold-standard for mission-critical software in the backend. The problem comes from companies touting their software as open source purely for the marketing aspect in order to bring in customers and get free work done by people who don't get paid.
OpenDistro for ES is the surrounding tools for ES => plugins, index-state-management, basically a suite clone of ES Enterprise offerings (X-pack) because AWS can't ship AWS ES with X-Pack.
Open Distro for ElasticSearch was not a fork but rather an Apache 2.0-licensed distribution of Elasticsearch enhanced with enterprise security, alerting, SQL etc... OpenSearch is a community-driven, open source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 & Kibana 7.10.2.
I want to love elasticsearch and they keep making it harder. On top of this open source backflip, their sales staff would put Oracle to shame. Pressure tactics, making clients pay 5 figures for a few hours of ES consultant time, very expensive training/certification (and the cert expires every 2 years), to name a few.
AWS gave us consultant time for no charge, guess who we picked to run our ES load.
Yeah the original OpenSearch project is a different enough domain that I think confusion will be minimal. We have talked to the maintainer and he is supportive. We have also posted disambiguation in case anyone does get confused. https://opensearch.org/disambiguation.html
In general I'm supportive of an Amazon open source fork here.
But the name re-use is unfortunate.
Amazon's argument seems to be "Don’t worry, we own the trademark for ‘OpenSearch’ cause it came out of Amazon originally, so it’s cool!”
That is really poor stewardship of the trademark of a name that was part of a standard that was meant to be an open multi-vendor standard. Amazon owned the trademark to protect its use under that standard, not to re-use it for something totally different, harming the standard further.
But it's just another indication that the original Opensearch, like the era of believing in open web standards for inter-operability that it was part of, is dead.
Turns out that was developed by Amazon according to Wikipedia. So maybe they’re merging that usage into this offering (since that is a spec for search results)?
In all three cases, Lucene is used as a "low level" (Java) API which provides search capabilities. OS, ES and Solr turn Lucene into a server, with features like horizontal scaling (ES Cluster, Solr Cloud). The major differences are in how well that all works, how easy it is to administer, how much caching and optimization is done on top of Lucene, etc.
I haven't extensively used ES, but I've used Solr a lot (and contributed to it), and I can say that it's a mess. The community is not one of the better ones I've seen. Bugs and stability issues are often ignored. Patches sit around gathering dust. There are some gems and very clever people in the community, of course, but it seems like there are too few of them to cope with the large beast that Solr has become. If I were starting a new project in the search space, Solr would not make my shortlist.
ElasticSearch seems to have more mindshare. It can be easier to find resources online to help solve your problems.. though ES moves through versions fast (and they do break backwards compatibility on major version bumps) so sometimes this can still be an issue.
Other benefit is that you don't have to rely on Zookeeper if you're horizontally scaling.
I don't have a ton of experience with Solr but they seem pretty comparable.
The real vendor lock-in for ES (and now AWS OS) is the REST query API; if Solr implemented ES's API, I bet $1 a lot more people would have moved or at least considered moving over
IIRC Solr also has some weird stuff about schemaless indices, whereas ES took the very Mongo-y approach of "yeah, just throw content at the index, don't worry about it" but then separately the approach of "I am angry with your new conflicting field in that document" and throws an exception; so you don't have to worry about the schema right up until you do have to worry about it
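To make the "schemaless until it isn't" behaviour concrete, here is a toy model in Python. It is only a sketch: `ToyIndex`, `infer_type`, and `MappingConflict` are hypothetical names invented for illustration, and the type rules are far cruder than what a real engine does (real dynamic mapping has more types and may coerce some values rather than reject them).

```python
class MappingConflict(Exception):
    """Raised when a document's field type clashes with the inferred mapping."""


def infer_type(value):
    # Crude stand-in for dynamic type detection (assumption: not the real rules).
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    return "text"


class ToyIndex:
    def __init__(self):
        self.mapping = {}  # field name -> type fixed by the first document seen
        self.docs = []

    def index(self, doc):
        # Check the whole document first so a rejected doc changes nothing.
        new_fields = {}
        for field, value in doc.items():
            seen = infer_type(value)
            known = self.mapping.get(field)
            if known is None:
                new_fields[field] = seen
            elif known != seen:
                raise MappingConflict(
                    f"field {field!r} is mapped as {known}, got {seen}"
                )
        self.mapping.update(new_fields)
        self.docs.append(doc)
```

The first document silently decides the type of every field it contains; a later document that disagrees is rejected, which is the moment you suddenly do have to worry about the schema.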
Not familiar with Solr but I believe the analogy to linux would be choosing the preferred distro while they all use the same kernel. There are a bunch of long comparison lists if you search for it.
It's nice that they announce it and that there's some sort of future effort promised. From my perspective we might not upgrade the elastic-stack (with current Elastic projects) too far, so that we don't become accidentally incompatible in case we want to make a switch.
One of the more discouraging aspects of being an OSS developer is that successful companies that use your software never consider contributing to the OSS developers. I suppose that is the nature of business though. Take what you can get.
Are people actually required to use ELK? What are your use cases?
The interface is completely cluttered, it takes loads of resources, and it feels like it's waiting to be replaced with lighter and more focused products.
Graylog (though it uses Elasticsearch internally) does a decent job at log handling and creating all the visual items out of logs, and Grafana/Loki can do quite well at it too with a very small memory footprint.
Besides, most of the "business intelligences" aren't actionable but just some visual arts you wouldn't need but to stare at when you're bored.
I wonder if ES had originally been AGPL licensed would that have helped them? If Amazon adapts AGPL code to integrate it with their own infrastructure, doesn't that in fact mean that all of Amazon's software-based infrastructure would become AGPL as well, and thus easily reproduced by Google Cloud, MS Cloud, Oracle Cloud etc.? Or even in-house? In other words, wouldn't it mean it would be easy to replicate the Amazon Cloud business (on a smaller scale)?
Amazon just wouldn't do that. They would either not offer it as a service, or make a clone from the beginning like they did with MongoDB. In general none of the cloud providers are actually willing to comply with the AGPL license.
Am I missing something here? Elastic says this is a free sw, which you can install and use, but if you want someone else to manage the hosting, we are practically the only option.
How did that become an 'ethical position'? If I am an OSS dev, why should I contribute to them vs OpenSearch?
> and we don’t ask for a contributor license agreement (CLA)
Makes me wonder what these are for (copyright transfer) and why they decided it’s not needed. It also makes me wonder if this sort of thing has ever been taken/tested in court or if it’s paranoid friction with little value add.
> Makes me wonder what these are for (copyright transfer) and why they decided it’s not needed. It also makes me wonder if this sort of thing has ever been taken/tested in court or if it’s paranoid friction with little value add.
Some companies/projects might use them purely to avoid possible future legal headaches (I think GNU does this), and I'm not sure to what degree that has actually been tested, but they can also allow re-licensing under a different license which is more clear cut and I think that's more the issue here
Amazon is trying to say that they'll never relicense the code, so they have no need to take ownership over contributions.
Indeed that is likely, but I wonder why they didn't at least require a Developer's Certificate of Origin [0] like the one kernel.org uses. This is really lightweight (just append one line to the git commit message) and supposedly provides a minimum legal base for the change. IANAL.
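For anyone who hasn't seen a DCO sign-off: the "one line" is a `Signed-off-by:` trailer at the end of the commit message, which `git commit -s` (`--signoff`) adds for you. A minimal Python sketch of what that amounts to; `sign_off` and `is_signed_off` are hypothetical helper names for illustration, not part of git or any DCO tooling, and the check is deliberately simplistic.

```python
def sign_off(message: str, name: str, email: str) -> str:
    """Return the commit message with a Signed-off-by trailer appended once."""
    trailer = f"Signed-off-by: {name} <{email}>"
    body = message.rstrip()
    lines = body.splitlines()
    if trailer in lines:
        return body + "\n"  # already signed by this person; nothing to add
    # Keep an existing sign-off block contiguous, otherwise add a blank line.
    last = lines[-1] if lines else ""
    sep = "\n" if last.startswith("Signed-off-by: ") else "\n\n"
    return body + sep + trailer + "\n"


def is_signed_off(message: str) -> bool:
    """Rough check a CI bot might run: is there any sign-off line at all?"""
    return any(line.startswith("Signed-off-by: ") for line in message.splitlines())
```

The point of the design is that the contributor keeps their copyright; the trailer only asserts they have the right to submit the change under the project's license.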
Eric S. Raymond is against them, but also argues that they are harmful--as opposed to just useless--because if they ever got to court, a jurist would look at the practices of the community to decide whether such a thing is common enough that they should be required. [1]
I know GNU does it (at least for Emacs) under the reason that the FSF can go after any GPL violation only if it is the clear copyright holder, but no such case exists, to my knowledge.
A copyright transfer would easily smell to people "Amazon is going to change the license at some point in the future to duck us over Elasticsearch-style". They're trying to avoid that smell.
anyone know if OpenSearch still uses "/" as a special character? Largest pita when trying to use ES for logging web applications and quite frankly, made it near unusable.
If Amazon fixed that, I would be firmly on their side. Also, any improvement over Kibana would be welcome.
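Assuming the pain here comes from the `query_string` syntax (where `/` delimits a regular expression), the usual workaround is to backslash-escape reserved characters before building the query. A hedged sketch in Python: `escape_query_string` is a hypothetical helper, and the reserved list is the one I remember from the Elasticsearch `query_string` documentation, so check it against the version you run.

```python
import re

# Reserved characters as I recall them from the query_string docs (assumption):
# + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /
_RESERVED = re.compile(r'([+\-=!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def escape_query_string(text: str) -> str:
    """Backslash-escape characters that query_string treats as operators."""
    # '<' and '>' cannot be escaped at all, so the only option is to drop them.
    text = text.replace("<", "").replace(">", "")
    return _RESERVED.sub(r"\\\1", text)
```

For example, `escape_query_string("/api/v1/users")` yields `\/api\/v1\/users`. If the result goes into a JSON request body, the backslashes need JSON escaping as well, which `json.dumps` handles.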
They took over FreeRTOS for good, CBMC for good, with Xen they were a bit unlucky, but it still has much better security than KVM, and now they take over ElasticSearch.
Good Open Source efforts, much better than until a few years ago.
ElasticSearch will need to relatively quickly come out with a feature that OpenSearch doesn’t replicate or people will just use the minimum that both support (see MySQL vs MariaDB).
Just want to mention that "OpenSearch" is/was also an AWS^H^H^HAmazon initiative for websites to expose a search URL to browsers in HTML metadata, similar to exposing an RSS feed URL. They may want to consider renaming it to avoid complete and utter confusion, like searching "OpenSearch" (no the other one) using "OpenSearch" (no the ES fork).
Atlas is a virtual monopoly for Mongo solely due to SSPL, and it has created a ridiculously overpriced ecosystem for hosted and managed services, and tooling around it.
Parking the technical merits to one side, considering the sheer number of devs and early-stage products that are built on Mongo, I'd love for someone to go after them next.
Amazon already has DocumentDB, which clones the Mongo API. I don't think it's forked though; they just use a barely Mongo-compatible wrapper around their own db engine.
True, but it's not quite the same as what they've done with OpenSearch/Elastic. Also, from what I've read, despite claims, the compatibility isn't complete, esp with stuff like aggregations.
There are a few use-cases where you'd want the ability to have a managed/hosted vanilla Mongo setup vs an emulated experience.
"Amazon DocumentDB implements the Apache 2.0 open source MongoDB 3.6 and 4.0 APIs by emulating the responses that a MongoDB client expects from a MongoDB server, allowing you to use your existing MongoDB drivers and tools with Amazon DocumentDB."
Perfect case for a megacorp destroying open source plus business models. I start to hate amazon with a passion. Craziest thing is they are not paying taxes in Europe though they dominate the market.
Amazon needs to be broken up. It's too big and too mighty.
Isn't this an example of an open source business model? Amazon is supporting development of this Apache 2.0 licensed OSS, which they plan to make money off of...
Indeed. But they are not the company that is supposed to make money off that project. They are starting to dominate several markets through cross-financing and therefore need to be broken up.
What is the sell for ES over something like the fulltext search built into Postgres, considering that the cost of adding another dependency is not insignificant?
Maybe Amazon treating open source developers like it treats its blue collar workers will open people's eyes about working conditions in 21st century American capitalism.
You're starting from the wrong place if you're comparing Elasticsearch with a database. And you're also arriving at the wrong place if you think that any database can be distributed.
"Elasticsearch is a distributed, free and open search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. Elasticsearch is built on Apache Lucene and was first released in 2010..."
I suggest understanding what it is first before comparing it to other databases.
Elastic gives you a lot of the fancy stuff that SQL kinda needs extras and hard work for... but it's just a document store with fancy weighting features.
Elasticsearch is a no-sql database that optimizes for full-text searches, built atop Apache Lucene. If you're doing any kind of full-text search, for example, if you're trying to index a university library and make it searchable, then elasticsearch is for you. If you're not, I'd look elsewhere.
Personally I'm doing my part by not building anything with vendor lock-in. It's great to be able to deploy to any cloud, if you value either robustness or flexibility.