Hacker News

I work on an open source alternative to Algolia called Typesense.

Algolia is a great product but can get quite expensive at even moderate scale. If I had a dollar for every time I’ve heard this from Algolia users switching over…

I recently put together this comparison page, comparing a few search engines, including Algolia, you might find interesting: https://typesense.org/typesense-vs-algolia-vs-elasticsearch-...




It's missing the most important thing: speed. We moved to Algolia mainly because of this. Elasticsearch and Solr could not compete.


Oh yes. Speed is an important point. Elasticsearch & Solr use disk-first indexing (with RAM as just a cache), whereas Algolia and Typesense use a RAM-first approach where the entire index is stored in memory. This is what lets Algolia/Typesense return results much faster than ES/Solr, and lets you build search-as-you-type experiences that query on each keystroke.

I was thinking about adding a row about speed to the comparison matrix, but couldn't find a way to express the comparison clearly... Imagine a row that said:

Search Speed | Super-fast | Super-fast | Slow? ...

That felt a little off. So I resorted to just mentioning primary index location as a proxy.

Open to suggestions on how to express this succinctly.


What index sizes are we talking about? If it's a few hundred gigs, there's always the option of putting the entire Elasticsearch index on a ramdisk, or even just leaving lots of "free" RAM so the underlying OS uses it to speed up I/O transparently. Bare-metal machines with insane amounts of RAM are a thing, and at massive scale they could make sense.

I've had great success at a client where simply upgrading a DB to an instance with enough RAM to fit 80% of the entire data set fixed all performance problems and significantly reduced I/O "pressure" at least for reads (writes were never a problem).


I haven’t tried to do this myself so I can’t speak to it.

But one thing I would add is ElasticSearch is quite versatile and flexible, so I wouldn’t be surprised if you can contort it to get it to work for a wide variety of use cases. This is a blessing and a curse - blessing because it’s so flexible, curse because the flexibility breeds complexity and brings with it a steep learning curve and operational complexity.

Where I think Algolia / Typesense help is that things work out of the box without the learning curve or operational overhead.


That table seems... fine? Creating multiple data sets and comparing the various aspects of the products' speed is a lot of additional work that you may not have signed up for, and is far from succinct (or easy). It might feel more empirical, but "feels faster" is fine. You're providing a free service - a review of available products - and can use whatever metrics you choose.


What happens when the index can’t fit into memory with Typesense?

Does it OOM?


Correct, the OS's OOM killer will kick in to try and protect core OS operations. So you don't want to let it get to that stage - you'd typically want to keep at least 15% of memory free for the OS to do its thing.

Commercially available servers go all the way up to 24TB of RAM these days, which should be sufficient for a good number of search use cases. Beyond that you'd have to shard data across multiple clusters.
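The sizing rule of thumb above (keep ~15% of RAM free for the OS) works out to a quick back-of-the-envelope calculation; here's a minimal sketch, with illustrative numbers:

```python
def required_ram_gb(index_size_gb, os_headroom=0.15):
    """RAM needed so the index fits while keeping ~15% free for the OS."""
    return index_size_gb / (1 - os_headroom)

# e.g. a 100 GB in-memory index needs roughly 118 GB of RAM
print(round(required_ram_gb(100)))
```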

Similarly with Algolia, they use 128GB RAM clusters, and recommend you keep your Algolia index size below 100GB: https://www.algolia.com/doc/guides/sending-and-managing-data...


Why not list the main algorithm each engine uses for search speed, so users can look up the difference on another page?


Looks pretty interesting. There never really seemed to be any good alternatives to ES for a long time. Apart from building out the feature set, how do you target quality of search results? Do you have a test bed for measuring this, and do you benchmark against other solutions to try and understand how everyone fares?


We have search relevancy tests baked into the automated test suite that runs on every commit. We keep adding to it as we get feedback about edge cases and new cases.

We don’t have comparative relevancy benchmarks. But we do have performance benchmarks here: https://typesense.org/docs/overview/benchmarks.html


Any particular reason why Typesense can't handle:

> Exact Keyword Search ("query")

Any plans on adding it in future?


Just a question of priority, based on number of asks for it. We do plan to support it.

In the meantime, we introduced a way to turn off typo tolerance and prefix search on a per-field basis. This has helped users search fields containing model numbers, for example.
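As a rough sketch of what that per-field setting looks like (the collection and field names here are hypothetical; in Typesense's search parameters, num_typos and prefix accept comma-separated per-field values matching the order of query_by):

```python
from urllib.parse import urlencode

# Hypothetical collection with a free-text "name" field and an exact
# "model_number" field. Per-field values line up with query_by order:
# typo tolerance + prefix search on name, exact matching on model_number.
search_params = {
    "q": "TS-1000",
    "query_by": "name,model_number",
    "num_typos": "2,0",
    "prefix": "true,false",
}
print(urlencode(search_params))
```

So a query like "TS-1000" against model_number won't get typo-corrected or prefix-expanded into unrelated models.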


Price is 100% why we're looking at typesense.


Does Typesense support searching in non-latin languages?


Yes it does - all languages except Chinese, Japanese and Korean (CJK), which we are actively working on: https://github.com/typesense/typesense/issues/228


Okay, those are the ones I meant ;) - thanks! This is the main limitation with our Postgres implementation right now.


I heard a joke about FTS engines, but Whoosh !



