
SQL interface for Elasticsearch - taf2
https://github.com/NLPchina/elasticsearch-sql/blob/master/README.md
======
rxin
If you use Spark SQL to query Elasticsearch as a data source, it already has
better SQL support with joins and you can push predicates down into ES.

------
amai
If you want real SQL on top of Elasticsearch have a look at
[https://crate.io/](https://crate.io/) ([https://crate.io/blog/sql-for-elasticsearch/](https://crate.io/blog/sql-for-elasticsearch/)).

P.S.: I'm not affiliated with that company.

------
johnnymonster
I don't get the point of this. Do people not want to learn ES so badly that
they will use something like this without understanding how to build an ES
query object? All this plugin does is convert SQL to an ES query...
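
For anyone who has not seen the two side by side, here is a rough Python sketch of what such a translation involves. It is not the plugin's code or its exact output: `sql_to_es` is a hypothetical helper, it only handles `SELECT * FROM <index> WHERE ...` with `AND`-ed comparisons, and it assumes string fields are indexed so that a `term` filter matches them exactly.

```python
import json
import re

# Hypothetical helper for illustration only; not the plugin's implementation.
RANGE_OPS = {">": "gt", ">=": "gte", "<": "lt", "<=": "lte"}


def sql_to_es(sql):
    """Translate `SELECT * FROM <index> WHERE <cond> [AND <cond> ...]`
    into (index, query body). Only =, >, >=, <, <= joined by AND are handled."""
    m = re.fullmatch(r"\s*SELECT \* FROM (\w+) WHERE (.+?)\s*", sql, flags=re.IGNORECASE)
    if m is None:
        raise ValueError("unsupported statement")
    index, where = m.group(1), m.group(2)

    filters = []
    for cond in re.split(r"\s+AND\s+", where, flags=re.IGNORECASE):
        c = re.fullmatch(r"(\w+)\s*(>=|<=|=|>|<)\s*(.+)", cond.strip())
        if c is None:
            raise ValueError(f"unsupported condition: {cond!r}")
        field, op, raw = c.groups()
        # Single-quoted literals become strings; anything else is parsed as a number.
        value = raw[1:-1] if raw.startswith("'") and raw.endswith("'") else float(raw)
        if op == "=":
            filters.append({"term": {field: value}})
        else:
            filters.append({"range": {field: {RANGE_OPS[op]: value}}})

    # Every condition goes into a bool filter, i.e. all of them must match.
    return index, {"query": {"bool": {"filter": filters}}}


if __name__ == "__main__":
    index, body = sql_to_es("SELECT * FROM accounts WHERE state = 'NY' AND age >= 30")
    print(index)
    print(json.dumps(body, indent=2))
```

The resulting body is the kind of JSON you would otherwise write by hand and send to the index's `_search` endpoint.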

~~~
dragonwriter
Not having to learn a different query language for every product is exactly
the point of SQL, and naturally also the point of pretty much every "SQL
interface for _X_ ".

~~~
threeseed
If only this were the case. Except that every database has its own flavour of
SQL with plenty of proprietary extensions. Not to mention each database will
support a different subset of SQL.

~~~
dragonwriter
> Except that every database has its own flavour of SQL with plenty of
> proprietary extensions. Not to mention each database will support a
> different subset of SQL.

It's a lot easier to become familiar with a new SQL dialect than a new query
language, especially since a lot of basic querying will be the same between
different dialects.

~~~
threeseed
Learning the basics is easy with any query language, e.g. MongoDB/ES JSON. It's
when you start writing more advanced queries that you realise that SQL's "write
once, run anywhere" premise is an illusion.

------
jermo
No joins.

I wonder why you wouldn't use PrestoDB to connect to Elasticsearch. It
provides you with an SQL engine and you just need to write a connector that
knows how to get data.

A similar thing has been done in Crate.io.

~~~
rmsaksida
It does have joins.

[https://github.com/NLPchina/elasticsearch-sql/blob/5cd6ab639...](https://github.com/NLPchina/elasticsearch-sql/blob/5cd6ab63919bd4f366fc721638b31a54e3555b5c/src/test/java/org/nlpcn/es4sql/JoinTests.java)

~~~
lobster_johnson
Interesting — looks like the join isn't pipelined. The entire right-hand side
evaluates synchronously. So it has to wait for the entire right-hand result
set before it can evaluate the join operator, instead of streaming it
concurrently. I'm surprised anyone would do it this way in Java, which has
good support for concurrency.

Edit: Actually the file you linked to was a test file. Hash join code is here
[1], and it uses ES' scrolling feature to incrementally join, though it's not
pipelined. Not sure scrolling is entirely appropriate for this; it will
potentially hold an unpredictable amount of memory on the server end.

[1] [https://github.com/NLPchina/elasticsearch-sql/blob/5cd6ab639...](https://github.com/NLPchina/elasticsearch-sql/blob/5cd6ab63919bd4f366fc721638b31a54e3555b5c/src/main/java/org/elasticsearch/plugin/nlpcn/HashJoinElasticExecutor.java)
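
To make the build-then-probe shape concrete, here is a minimal Python sketch of a hash join whose probe side arrives in pages. It is not the plugin's Java code: `batches` is a hypothetical stand-in for a scroll cursor, the rows are plain dicts, and which side gets built into the hash table is an assumption made for the example.

```python
from collections import defaultdict


def batches(rows, size):
    """Hypothetical stand-in for a scroll cursor: yield `rows` in pages of `size`."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def hash_join(build_rows, probe_rows, build_key, probe_key, page_size=2):
    """Inner equi-join of two lists of dicts, yielding merged rows."""
    # Build phase: nothing is emitted until one whole side is held in memory.
    # This is the part that is not pipelined.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: the other side is read one page at a time, and joined rows
    # are yielded as each page is processed.
    for page in batches(probe_rows, page_size):
        for row in page:
            for match in table.get(row[probe_key], []):
                yield {**match, **row}


if __name__ == "__main__":
    people = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    pets = [{"owner": 1, "pet": "cat"}, {"owner": 1, "pet": "dog"}, {"owner": 3, "pet": "fish"}]
    for joined in hash_join(people, pets, "id", "owner"):
        print(joined)
```

In this sketch the build side and the probe side never overlap in time. A pipelined version would start fetching probe pages while the table is still being built.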

