For those who prefer the ease provided by the AWS ES service, consider Elastic Cloud, which affords most of the same capabilities but is run by Elastic themselves (it was previously known as Found, which Elastic purchased a few years ago). There's also an Enterprise offering. If you're looking for a hosted Elasticsearch solution, it's probably better than what AWS is offering. Side note: they update about as often as Elastic releases, whereas AWS ES is consistently behind.
The only other options are completely public access or an IP-based whitelist, the latter of which is untenable in most cloud environments.
It can be easily tested with AWS Elasticsearch.
While AWS ES can be cheaper in some configurations, Elastic Cloud is actually quite competitive in pricing for larger clusters when compared to AWS' ES-service. This post compares the two services, and there's an example price comparison at the end of the post: https://www.elastic.co/blog/hosted-elasticsearch-services-ro...
We support most official plugins, and if you get a gold or platinum subscription you can upload your own plugins. Elastic's X-Pack, which provides security features like role-based access control, is included in every cluster.
It's not possible for external service providers to integrate with IAM at this point.
Or, alternatively, am I mistaken about how configuration works?
(A bit of history: When Found (the company Elastic acquired and which is now Elastic Cloud) was in private beta in early 2012, we actually did allow custom cluster topologies. We ultimately disabled that as it was overwhelmingly used to make sub-optimal cluster configurations, such as 5 x 1GiB memory nodes)
Got a source?
Likewise with the doubling of nodes: this is obviously a blue-green style deployment. In-place updates would be quicker, but ES can get into all sorts of weird states that require manual debugging to fix; with blue-green, for most of the deployment you can simply flip back.
I've been pretty impressed with AWS ES compared to running it myself (other than the poor fit of IAM auth)
If this is a reality of using cloud based ES then clearly it's something to seriously consider before using it - which is all the author is saying. The article is titled 'things to consider' not 'things AWS needs to fix'.
ES is a big, complex beast of a Java app. This is good advice regardless, coming from someone who has used both approaches (self-hosted vs. AWS) in production.
I did not get the impression that he's saying that AWS can resolve this easily.
This is a bad assumption. Loggly is a shared service.
> Any API that AWS ES exposes has to be there forever
This is a bad assumption. No API is forever. Maybe you meant a different timescale. AWS has removed and made breaking changes to APIs over the years (e.g. random breaking change: https://forums.aws.amazon.com/message.jspa?messageID=513640).
Elasticsearch is honestly pretty simple to set up; save yourself money and trouble and just do it.
I do infrastructure engineering for a small startup and really I think with any of these managed systems you need to step back and evaluate them within the context of TCO, lock-in, security, reliability, performance and flexibility/customizability. I've heard ES isn't that much of a PITA to manage on its own, but on the flip-side I'd never sign up a small team to run PgSQL at scale.
I just run ES for my logstash setup, and ES is lovely and rock-solid... except when it isn't. For example ES deciding to just silently refuse input when its disk is 90% full - that was a bit hard to find when it happened. ES looked alive, but hunting down the reason why it stopped wasn't trivial. I've had a couple of similar but lesser gotchas as well.
I guess you could say of my experience that it's not that much of a PITA (as you say), but it is still a bit of a PITA.
Disclaimer: if these things weren't a bit of a PITA, there'd be no need for us sysadmins, so I should be grateful...
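A rough sketch of the kind of check that would have surfaced the full-disk problem earlier. It assumes you have already fetched per-node rows shaped like the JSON output of the `_cat/allocation` API (with `node` and `disk.percent` fields, the latter arriving as a string); the function name and the 85% default are hypothetical choices for illustration, picked only so the alert fires before the 90% mentioned above.

```python
def nodes_over_disk_threshold(allocation_rows, threshold_percent=85):
    """Return (node, percent) pairs whose disk usage is at or above the threshold.

    allocation_rows is assumed to be a list of dicts with "node" and
    "disk.percent" keys; that shape is an assumption for this sketch.
    """
    flagged = []
    for row in allocation_rows:
        raw = row.get("disk.percent")
        if raw is None:
            continue  # rows without disk figures (e.g. unassigned shards) are skipped
        percent = int(raw)
        if percent >= threshold_percent:
            flagged.append((row.get("node"), percent))
    # Fullest nodes first, so the worst offender is at the top of the alert.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical sample data, not real cluster output.
rows = [
    {"node": "es-1", "disk.percent": "91"},
    {"node": "es-2", "disk.percent": "42"},
    {"node": "UNASSIGNED", "disk.percent": None},
]
print(nodes_over_disk_threshold(rows))
```

Wiring this into whatever already pages you is the important part; ES looking alive is exactly why a separate disk check helps.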
AWS has many cool toys and I use a subset of them every day. However, there's no way in hell I'd entertain the idea of going in fully for everything we do. Not only are there a bunch of inadequate services, they can also be nasty to debug and cause more problems than they're worth.
It sounds like you may have an inexperienced guy getting overly enthusiastic about what he could achieve instead of focusing on what's required (I don't mean to insult him, it just sounds like he may not know enough about infrastructure to be making these decisions properly). Being provider-agnostic (at least as much as you can be) is currently how I see a lot of companies starting to leverage the great tools that cloud providers have while staying free enough to chop and change as the company's needs evolve.
Maybe point him towards things like Terraform and get him looking at what Google cloud and Azure can do as well as AWS?
"AWS ES 5.3 officially supports Curator now. Documentation has been updated to reflect this."
The change is trivial, so I get the sense that Elastic is just fucking with Amazon.
But if you need ES to support a backend system that is relatively small, won't grow too fast, and isn't business-critical (one where problems would only make your life inconvenient for a while), then the AWS managed service is fine.
It's possible it's not MT and they just didn't write the facade APIs. That'd be pretty crazy.
My biggest complaint would be lack of plugin support.
Also of note: Amazon's documentation on HTTP limits is wrong. There are some instance types listed as having a 100MB max payload that are actually limited to 10MB. We found that out when Logstash recorded a crapload of errors citing the 10MB limit on what was allegedly a 100MB instance type.
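If you send bulk requests yourself rather than through Logstash, one way to stay under whatever the real payload limit turns out to be is to split batches by byte size before sending. A minimal sketch, assuming newline-delimited action/source pairs as used by the bulk API; `chunk_bulk_lines` and the tiny 60-byte limit in the example are hypothetical, and in practice you would pass the limit you actually observe (e.g. 10 * 1024 * 1024).

```python
def chunk_bulk_lines(action_source_pairs, max_bytes):
    """Group (action_line, source_line) pairs into NDJSON payloads of at most max_bytes."""
    payloads = []
    current = []
    current_size = 0
    for action, source in action_source_pairs:
        entry = f"{action}\n{source}\n".encode("utf-8")
        if len(entry) > max_bytes:
            # A single document that is too large cannot be fixed by batching.
            raise ValueError("a single document exceeds the payload limit")
        if current and current_size + len(entry) > max_bytes:
            # Adding this entry would overflow the batch, so close it out first.
            payloads.append(b"".join(current))
            current, current_size = [], 0
        current.append(entry)
        current_size += len(entry)
    if current:
        payloads.append(b"".join(current))
    return payloads


# Hypothetical documents; each pair encodes to 25 bytes.
pairs = [('{"index":{}}', '{"msg":"a"}')] * 5
payloads = chunk_bulk_lines(pairs, max_bytes=60)
print([len(p) for p in payloads])
```

Sizes are measured after UTF-8 encoding, since the limit applies to bytes on the wire rather than to character counts.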
It depends on your use case. If you are already familiar with Solr and it is good enough for your use case, then use it. Solr and ES are about the same feature-wise. Scaling is easier for ES because it is built-in. Here is a good comparison of their APIs.
part 1: http://opensourceconnections.com/blog/2015/12/15/solr-vs-ela...
part 2: http://opensourceconnections.com/blog/2016/01/22/solr-vs-ela...
I have been using it on a new project the last couple of weeks and it seems to be working well.
Had to provision our own EC2 instances.
It was 2 years ago though, things might be different now.
We hit that limit and had to ruthlessly prune live data.
You can now add 1.5TB per node (with very large and expensive instance types) as well as scale past 20. Requesting the limit increase was a lot more difficult than most other limit increases.
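For a sense of where that ceiling sits, here is a back-of-the-envelope calculation. It assumes one replica per primary shard (so every byte is stored twice) and ignores filesystem and ES overhead; `usable_capacity_tb` is a hypothetical helper, and the 20-node, 1.5TB figures are simply the ones quoted above.

```python
def usable_capacity_tb(node_count, tb_per_node, replicas=1):
    """Approximate index data that fits in the cluster, ignoring overhead.

    Each replica is a full extra copy of the data, so raw storage is
    divided by (1 + replicas).
    """
    raw_tb = node_count * tb_per_node
    return raw_tb / (1 + replicas)


print(usable_capacity_tb(20, 1.5))              # one replica
print(usable_capacity_tb(20, 1.5, replicas=0))  # no replicas, raw capacity
```

Real headroom is lower still, because you want to stay well clear of a full disk.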