Anybody who is using Elasticsearch as the basis for persistent data storage is in for some scary surprises down the road in production. Without violating my NDA(s) I can't give significant details, but I've personally witnessed weird split-brain multi-master behavior, dropped writes, etc. in response to network partitions and other common failure scenarios for a distributed data store. Elasticsearch is a fantastic tool for high-speed full-text queries, but it is NOT and should never be used as a reliable persistent data store. Crate looks doomed to failure on this alone.

Without violating your confidentiality, can you share what version of ES you experienced these failures with? I have also experienced this first hand, but since 0.9x and 1.0x a TON of work has been done on both OOM (one of the leading causes of split brain scenarios) and split-brain from network partitioning. As I mentioned below, all of these issues are being addressed in an open and transparent way, and while there's still work to be done, a non-trivial amount of progress has already been made. I hope you can share at least some about your experiences.

I wish someone would revive the Zookeeper election and discovery backend. It's orders of magnitude better than the Zen ping system or anything like it (JGroups springs to mind). Unfortunately it's been left for dead for quite a while now. :(

When it comes to consensus, real consensus servers are the only way, and Zookeeper is the only production consensus server available outside of Google (sorry etcd, you aren't quite there yet).

Speaking of which, this fork has been posted on the README of the original Sonian repo: https://github.com/grmblfrz/elasticsearch-zookeeper

Seems to be compatible with ES 1.4 :D

Full disclosure: I'm one of the co-founders of Crate :)

In December 2010 we found out about Elasticsearch and were truly amazed by its simplicity, speed and elegance. We built our service and consultancy business around it.

In 2011 we built some of the largest ES applications at that time (6TB, 120-node cluster, http://2012.berlinbuzzwords.de/sessions/you-know-search-quer...) and started to develop a set of plugins, such as the in-out-plugin to allow distributed dump/restore.

With this background, and the mission to build a datastore that is as easy to use and administer as Elasticsearch, we founded Crate in 2013 and raised some seed money. Since then we have been working hard to make this vision come true. We're often confronted with the results of the so-called Jepsen test (https://aphyr.com/posts/317-call-me-maybe-elasticsearch) that Aphyr published in 2014. Don't forget: Lucene, Netty, Elasticsearch and Crate are all Open Source products (APL) and rely on all kinds of contributions, such as this analysis, whether the news is bad or good. We can only improve based on hard testing and feedback. The analysis caused a lot of rumblings in the Elasticsearch ecosystem, and the reaction of Elasticsearch was exemplary:

1) explain the reasoning and make an official statement: https://www.elastic.co/blog/resiliency-elasticsearch/

2) list all the issues and hunt them down, one by one: http://www.elastic.co/guide/en/elasticsearch/resiliency/curr... (and add new ones as they occur).

3) stay in contact with the community that reported the issues: https://twitter.com/aphyr/status/525712547911974913

All that being said, we see many people using Crate as a primary store (and of course backing up their data), but we also see people who don't put that much trust in a younger database, keep all their primary data in another location, and sync/index to Crate.

ALWAYS make backups (COPY TO / COPY FROM), make sure you have replicas, and most importantly, configure minimum_master_nodes correctly to avoid split brain.
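For anyone wondering what "correctly" means here: the usual rule of thumb for the Elasticsearch 1.x setting discovery.zen.minimum_master_nodes is a strict majority of the master-eligible nodes, i.e. (N / 2) + 1 with integer division. A rough sketch of that arithmetic follows; the function names are made up for illustration and are not part of Crate or Elasticsearch.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes.

    Hypothetical helper: it only computes the value you would then put
    into the cluster configuration yourself.
    """
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def can_elect_master(partition_size: int, master_eligible: int) -> bool:
    """True if one side of a network partition still holds a quorum."""
    return partition_size >= minimum_master_nodes(master_eligible)


if __name__ == "__main__":
    # Print the quorum for a few common cluster sizes.
    for nodes in (1, 2, 3, 5, 7):
        print(nodes, "master-eligible ->", minimum_master_nodes(nodes))
```

With a strict majority, two disjoint halves of a partitioned cluster can never both reach the quorum, which is what keeps two masters from being elected at once. Note that with only 2 master-eligible nodes the quorum is 2, so losing either node leaves the cluster without a master; 3 is the smallest size that tolerates one failure.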

At Crate we stand on the shoulders of these great Open Source projects, try to be the best citizens we can, and focus mainly on our query engine (Analyzer, Planner, Execution Engine).