Hacker News new | comments | show | ask | jobs | submit login
How Algolia (YC W14) Built Their Realtime Search (leanstack.io)
70 points by redox_ on Mar 24, 2014 | hide | past | web | favorite | 12 comments

For the search index, I wonder why they did not go with ElasticSearch or Solr, and decided to roll their own replication/distribution layer. RAFT is much simpler than Paxos, but it's still something pretty difficult to get right (as with any distributed system). Given that both ES and Solr have already got this figured out, what were the specific reasons for rolling your own C++/Nginx layer?

Search engine needs have evolved a lot since Lucene was designed (Lucene is the search library behind ElasticSearch and Solr). Users now want an instant search that search at the first keystroke and are able to tolerate typos. Lucene does not propose anything special for that, so you have mainly two options: - slow down indexing: index all ngrams of words (this also require a lot of disk) - slow down search: you need to find all words that match the prefix query in the dictionary and then retrieve all inverted lists. ES and Solr have added autocomplete module that is basically a compiled Trie that associate a string to a results set, but this is not a real instant-search engine, there is no intersection of inverted lists and this produce mismatch between the autocomplete and search results.

On our side, we have redesigned all search algorithms and data-structures from scratch to run on a mobile devices. We are for example able to search all cities from Geonames (> 3M) on a old iPhone 3GS in less than 50ms (including prefix search & typo tolerance) with only 1KB or RAM and a flash disk. You can't do that with Lucene. So I let you imagine what we can do with a high-end server ! For the High-availability & Consensus, this is something complex and we have spent a lot of time to test it by killing machines of a cluster, and we still continue to test it all the days since every deployment kill index builder process !

I believe this has to do with performances - ie. performing in few milliseconds rather than in tens/hundreds of milliseconds (at least that's what their benchmarks claim: http://blog.algolia.com/full-text-search-in-your-database-al...)

I use Algolia on my site, and I believe it has to do with speed. Algolia is significantly faster than ElasticSearch.

From my reading it seems that they are fully consistent between the three nodes in their cluster. This seems to suggest that any failure will block writes. It also seems like since each node has a full copy of the index, that means your entire index must exist within the confines of one machine. So how large of an index can you have in reality? Am I missing something?

It seems they accept writes as long as at least two clusters agree to accept it (which the majority in the 3 clusters etting the have).

I would think that each cluster has the whole index, so any cluster can be used for reads.

I would also think they have multiple such triple-clusters; one triple for each important customer.

(It is difficult to know if a (logical) node is actually a cluster or not.)

We can continue to index if one host is down and we can continue to search if two hosts are down. If a host is down, we are able to continue indexing because of consensus (2/3). When the host will be up again, it will retrieves the missing jobs from the 2 others servers and apply them to be exactly at the same level.

The SSD option intrigues me. Lately I find that using many consumer class HDs in more servers is better for this kind of application than using enterprise grade HDs or SSDs

Consumer disks would not give the same performance, we build & read a lot of indices in parallel, this produces a lot of random IO on disks. With two disk in raid 0, we are able to reach 1GB/s of random IO.

I see, but what I found interesting was that the price/performance ratio of this solution proved better than getting, say, four HDs in each machine or an extra machine, which is far more common in today's clusters (e.g.: Hadoop/Vertica clusters uses I guess the profile of the IO (as you said, lots of random IO) contributed a lot for this. Could you elaborate on the decision process that lead to this

Why do you think consumer disks are better for a search engine?

Everything is a balance between CPU/RAM/Disk. Having one of this resource that is far superior than the other would be a waste of resource. Our balance is globally to have about 16 cores/32threads per host with latest XEON >= 3.5Ghz, 256G of RAM and about 1TB of SSD (2X512G)

We perform a lot of IO, so the compromise is between price and number of IOPS we can get. Having the same IOPS than our two SSD in raid0, would require a lot of consumer disk, which leads to two problems: 1) We would have too many disk space and we will not have enough CPU to use all this space 2) It would not fit in a 1U server and would increase a lot the price of the server

I was thinking more of an architecture comprising either:

1. more servers with less capacity (CPU,RAM,Disk) each 2. same number of servers, but six + Hard Drive Raid0 in each

Both of which could be better for other apps but include their own set of problems and, as you pointed out, could result in higher costs.

I think I was misunderstood. I was not questioning the architecture chosen, sorry if I sounded that way. I want to learn why was it chosen since I don't see many like it. So far you have provided me with some very satisfying answers, and for that I thank you.

Applications are open for YC Summer 2018

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact