
Elasticsearch – the definitive guide - donretag
http://www.elasticsearch.org/blog/elasticsearch-definitive-guide/
======
rcfox
Glad to see this. The reference documentation _is_ pretty opaque.

So far, the best resource I've found is "Exploring Elasticsearch".
[http://exploringelasticsearch.com/](http://exploringelasticsearch.com/)

~~~
pan69
What version of Elasticsearch is this guide targeting? I believe there are
some fundamental differences between e.g. 0.9 and 1.0 releases if I'm not
mistaken.

~~~
rcfox
I haven't been able to determine when the content was last updated, but some
of the comments on the site are ~8 months old, so it might be a bit out of
date.

That said, the basic concepts should be more-or-less unchanged.

~~~
andrewvc
Hi, I'm the author. I actually just (soft) relaunched the book using a new
backend, so the content's a little stale. Now that I've revamped all the
formerly poor quality ebook generation I can focus on content once again.

Aside from not mentioning aggregations it should be fairly accurate, and is
still a useful guide for a beginner.

I'm hoping to revamp it to discuss aggregations over the next couple weeks.

I lost most of my disqus comments, unfortunately, when I relaunched the site
last week as the URL structure changed. I revamped the book to have more
content per page. It's probably a little less SEO friendly, but makes for a
nicer reading experience.

~~~
satyampujari
Andrew, exploringelasticsearch is cool, thanks for the time & effort. Just
wanted to let you know that the github audio interview link has been broken (or
not publicly available) for a few weeks:
[https://soundcloud.com/andrewvc-1/github-interview-edited](https://soundcloud.com/andrewvc-1/github-interview-edited)

I would really appreciate it if you could put it back, it's really a good listen.

~~~
andrewvc
Ack, soundcloud took it down when I uploaded another interview and went past
their free plan. Just upgraded, it should now be available.

------
ilaksh
I'm looking at the guide and I see a lot of explanation but little to nothing
in the way of clear, simple instructions that will cover the most typical
basic use cases.

Of course Elasticsearch has many features and many different types of
interfaces, but most people don't need to use most of those features, and
having some example code available for a few common languages/platforms would
be very useful.

Elasticsearch has done a great job of streamlining the use of Lucene and of
course generally making many improvements, but based on the documentation I
have seen including this new book, Elasticsearch must derive most of its
income through consulting or support, and providing simple instructions
obviously is a direct conflict of interest.

I believe that the average user is like me: they want to index some documents
and then search the full text. They want a straightforward way to connect one
or two search boxes on their web application to Elasticsearch and then
retrieve some useful results. They do not want to learn the nuances of
different engines or search interfaces. They do not want to read a book.
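
For what it's worth, the "index some documents, then search the full text" case can be sketched with nothing but the Python standard library. This is only a rough sketch under assumptions: a node on `localhost:9200`, the typed URL layout (`/index/type/id`) of the 1.x releases discussed in this thread (later releases dropped the type segment in favor of `_doc`), and made-up index, type, and field names. The helper names are hypothetical, not part of any client library.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local node on the default port


def index_request(index, doc_type, doc_id, doc):
    """Build a PUT request that stores one JSON document under an explicit id."""
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def search_request(index, field, text):
    """Build a full-text search request that runs a match query on one field."""
    url = f"{BASE_URL}/{index}/_search"
    query = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send a prepared request to the node and decode its JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Typical use against a running node (index and field names are illustrative):
#   send(index_request("articles", "article", "1", {"title": "Hello", "body": "full text search"}))
#   hits = send(search_request("articles", "body", "search"))["hits"]["hits"]
```

One thing to keep in mind: a freshly indexed document only becomes searchable after the next index refresh, which happens about once a second by default, so a search issued immediately after indexing can come back empty.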

~~~
arafalov
Well, you could use Solr. As with ES, the basic examples run out of the box.
But frankly, whichever search engine you use, at some point you will have to
read a book or dig hard to understand underlying issues such as tokenization.
But starting is easy.

And the Solr community gets its income from all sorts of directions. So, the
user mailing list is quite helpful.

~~~
ilaksh
Starting with ES is not easy. The documentation index is confusing. The basic
examples that I saw only showed half of the equation: searching. And that
example left out a lot of typical use cases without links to any examples for
implementing those use cases.

So what you are suggesting, that I should use Solr since the Elasticsearch
documentation is difficult to navigate or perhaps incomplete, is almost like
saying "I think it should be difficult to figure out how to use ElasticSearch,
since it's so powerful, it's not for kids. Therefore, the documentation is
deliberately opaque. If you want something easy, use Solr."

~~~
AznHisoka
I agree. Searching is actually the easy part. Setting up a distributed
cluster, allocating shards, replicas, rebalancing, etc. is the hard part.

~~~
arafalov
And that's supposed to be the hard part. There are no workarounds for the hard
parts. You need to understand the complex issues involved. However, you may
want to read the blog from Found.no, they have a lot of great material:
[https://www.found.no/foundation/](https://www.found.no/foundation/)

------
mikecx
What I'd really love to see is an Elasticsearch administration guide and a
common recipes guide; both would be super helpful for getting started and for
more advanced tasks. Having run into issues with Elasticsearch data scaling, trying
to pull answers out of the current guides or the IRC channel is like pulling
teeth... from an alligator... with a laser attached to its head.

------
baghali
Easiest way for me to learn Elasticsearch was through queries generated by
Kibana [1] and playing with them inside Sense [2]. Kibana relies on facets,
but Elasticsearch 1.0 introduces aggregations, so I would suggest using
aggregations for your projects.
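
To make the difference concrete, here is a small sketch of the same "count documents per term" request written both ways. The `by_tag` name and the `tags` field are made up for illustration, and the helper functions are hypothetical; only the request body shapes are meant to mirror the query DSL.

```python
import json


def terms_facet(name, field):
    """Search body in the older facet style (what Kibana generated at the time)."""
    return {"size": 0, "facets": {name: {"terms": {"field": field}}}}


def terms_aggregation(name, field):
    """Equivalent search body using the aggregations introduced in 1.0."""
    return {"size": 0, "aggs": {name: {"terms": {"field": field}}}}


# Print the aggregation body so it can be pasted into Sense or sent to _search.
print(json.dumps(terms_aggregation("by_tag", "tags"), indent=2))
```

Setting `size` to 0 skips the search hits so that only the per-term counts come back. Either body is sent to an index's `_search` endpoint.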

[1]
[https://github.com/elasticsearch/kibana](https://github.com/elasticsearch/kibana)

[2]
[https://chrome.google.com/webstore/detail/sense/doinijnbnggo...](https://chrome.google.com/webstore/detail/sense/doinijnbnggojdlcjifpdckfokbbfpbo)

------
nemothekid
I recently started with elasticsearch for a hobby project, and my biggest
issue with it is really finding things in the documentation.

For example, the documentation brings up relationships, like parent/child, but
it isn't clear how to do the simplest case : return a parent's child
documents.
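
As far as I can tell, that case is handled by a `has_parent` query run against the child type. The sketch below only builds the search body; it assumes the child type's mapping already declares its `_parent` type, and the `blog` type and the id `1` are made-up examples. Treat the exact field names as something to verify against the docs for your version.

```python
import json


def children_of(parent_type, parent_id):
    """Search body matching child documents whose parent has the given id."""
    return {
        "query": {
            "has_parent": {
                "parent_type": parent_type,
                # The inner query selects the parent(s); here, a single id.
                "query": {"ids": {"values": [parent_id]}},
            }
        }
    }


# Body to send to the child type's _search endpoint.
print(json.dumps(children_of("blog", "1")))
```

The inner query can be any query over the parent documents, so the same shape also covers "children of every parent matching some condition".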

Or the fact that _all_ the parameters to a request aren't listed in one
clear place. It's all hidden in a dozen separate examples, but if I want to
know the options I can set for a _mapping, I can't find them.

------
jackhoy
If you want to set up Elasticsearch and have a quick play around with some
queries I have written a beginner friendly intro which you may find useful:

[http://red-badger.com/blog/2013/11/08/getting-started-with-e...](http://red-badger.com/blog/2013/11/08/getting-started-with-elasticsearch/)

Looking forward to reading this book! Elasticsearch has been a great tool for
us.

------
brickcap
There is also a free ebook, "Exploring Elasticsearch":

[http://exploringelasticsearch.com/](http://exploringelasticsearch.com/)

------
trustfundbaby
So excited to finally start seeing more books about Elasticsearch!

"Elasticsearch Server" is another really good book for anyone who's
interested.
[http://www.packtpub.com/elasticsearch-server-for-fast-scalab...](http://www.packtpub.com/elasticsearch-server-for-fast-scalable-flexible-search-solution/book)

------
chatman
Solr Cloud is more mature and more customizable. ES is perhaps easier to start
with, but you could get a lot more customization and power with Solr Cloud.
The Apache community will ensure Solr always remains at the bleeding edge and
stable for large scale deployments.

~~~
cmiles74
I have used and spent time with both products and I couldn't disagree with
this more. In my opinion, Elasticsearch is the more mature product and has
been around longer than Solr Cloud (ES 2/2010, Solr Cloud 10/2012)[0][1]; if
you count Compass development (ES' precursor), then even longer. Where ES was
built from the ground up with clustering in mind, this is a feature that is
certainly a primary focus for Solr Cloud but not a key feature for much of
Solr's development. I would also contend that, in my opinion, there is no
basis for the assessment that Solr Cloud provides "more power" than ES.

Elasticsearch has been an open project for some time, and they have recently
formed a for-profit company in order to push the project forward even more.
Certainly the Apache Project has an investment in Solr, and I certainly hope
that the project will continue even if Lucid Works (who employs 25% of the
Solr developers)[2] were to close up shop.

[0]:
[http://en.wikipedia.org/wiki/Elasticsearch](http://en.wikipedia.org/wiki/Elasticsearch)

[1]:
[http://en.wikipedia.org/wiki/Apache_Solr](http://en.wikipedia.org/wiki/Apache_Solr)

[2]: [http://www.lucidworks.com/about-us/](http://www.lucidworks.com/about-us/)

------
vacri
What snake is on the cover? I thought it was a tapeworm at first - a rather
odd choice for a marketing animal :)

------
phirschybar
Is it odd that there is no search on the elasticsearch site / docs / guide?

------
PaulHoule
ElasticSearch >> Solr

~~~
djKianoosh
for the uninitiated, is there a decent blog/post somewhere that goes into some
details comparing the two?

~~~
btb
We have been running solr in production for 6 years or so. But I have been
following the ES development blog with interest, since I think they have a
nice api/features.

The momentum/community behind elasticsearch seems to be building rapidly (to
get an idea, just follow the "This week in Elasticsearch" posts on
[http://www.elasticsearch.org/blog](http://www.elasticsearch.org/blog)).

Having said that, solr has been rock solid for us so we have no real pressing
need to switch. Also we don't really have the problem (massive scaling) that
elasticsearch seems to be built to handle; we just have 15 million pageviews or
so per month. Our setup is 1 solr master and 5 solr slaves (one on each of our
5 webservers), and we do nightly dataimports from our SQL server database. That
last part is indeed where I think solr currently has a nice advantage over
elasticsearch. The solr dataimporthandler is really nice if your primary
datastore is a SQL server, and allows you to do all sorts of nifty javascript
and other transforms on the data in-flight as you stream it from your SQL
server. For elasticsearch there is a jdbc-river thingy that sorta lets you do
the same, but it isn't as polished or usable as the solr
dataimporthandler (IMO). And if you want to install it you have to do it via a
plugin link that points to a bit.ly address, which makes me feel uneasy.

I also like that solr comes with an admin GUI out of the box. There exist some
ES equivalent plugins (mobz/elasticsearch-head), but like the jdbc river
it's a third-party plugin and I guess you have to trust that it doesn't screw
with your server. With solr all you need comes with the distribution, so you
don't need to spend mental energy on whether or not you can trust this or that
plugin to run on your server.

Also the .NET client for ES seems very polished and more sexy than the solr
equivalent.

Anyway my non-scientific gut feeling is that with the current momentum behind
ES, it will over time be a better choice than solr. But unless you really need
to scale massively, plain old solr is available and works just fine. And seems
to me to be somewhat easier to get running than ES (but then again I'm probably
biased after having run solr a long time).

------
hughes
They could not have chosen a more terrifying creature to put on the cover.

~~~
AznHisoka
It reflects the brand/product. ElasticSearch is terrifying to use :)

~~~
polyfractal
FWIW, we didn't choose the animal. O'Reilly's design department does a voodoo
incantation and chooses an animal... the authors/editors have zero input :)

I quite like the snake though, think it looks nice

------
shanmoorthy
Congrats @clintongormley!

------
hernan604
great job guys, looking forward to it!

------
notastartup
I hear a lot about this but never really had a chance to incorporate it in my
projects because I didn't understand what it was for and why

~~~
arafalov
If you need people to navigate through the information you provide and it is
more than a couple of pages of links, you need a search engine.

If your stuff is generic text then Google will find it for you. However, if
you have categories, unusual languages, business logic, geographic locations
and other non-pure-text content, custom search engines like ElasticSearch,
Solr, etc will give your customers much better results than generic Google
search.

As an example, do a search at LinkedIn and see all those categories and limits
popping up on the left. That's what you can get from the search engines, but
not from Google.

~~~
collyw
That's a really nice explanation.

Now if someone could come up with a site with explanations like this
(including when it is overkill compared to a basic alternative), that would be
really useful. The problem is that when you go to the sites' homepages, they will
tell you how you can do almost everything with the given technology.

------
antocv
Awesome, just awesome. Finally. Yay.

Thanks dudes and dudettes.

