Date: 2016-05-03
You just include a js file and add an input where you want the search and it just works(tm).
Reading this I get the impression that they do the indexing themselves or something, but in the Quick Start the first thing you need to do is import your data. What am I missing?
If you have any use case other than technical documentation, we recommend indexing your data through the API.
However, it's easy to exceed their record limits, especially since you need to duplicate your index every time you want to 'sort by' something.
For instance, if I wanted an option to sort my results by date, and to then search these, I'd need to create 2 new slave indexes for date ascending and descending respectively. Sorting by anything else, like price, means creating yet more indexes and suddenly it's easy to turn 30K records into 150K. This happened to me and I ended up having to roll something custom instead (Vuejs frontend and Sphinx Search backend), since my client balked at the extra cost.
But, if you have a small dataset, or are fine with the costs, then Algolia is spectacular.
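To make the record arithmetic in the sorting example above concrete, here is a rough sketch. It assumes every sort order needs its own full copy of the index and that each copy counts toward the record limit; the function name and parameters are made up for illustration and are not part of any Algolia API.

```python
def total_records(base_records: int, sort_attributes: int, directions: int = 2) -> int:
    """Records counted when each sort order needs its own full copy of the index.

    Assumption: one primary index plus one copy per (attribute, direction) pair.
    """
    sorted_copies = sort_attributes * directions
    return base_records * (1 + sorted_copies)


# Date and price, each ascending and descending: 1 primary + 4 copies.
print(total_records(30_000, sort_attributes=2))  # 150000
```

With 30K records, two sortable attributes are enough to reach the 150K figure mentioned above.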
If you pay for a million records, you should be able to store a million records.
This brings up a related point.
How strong are the defensibility and separately the network effects in Algolia's business?
I guess once they have a customer, it might be annoying for that customer to switch. Is that accurate?
But is there any reason for the 101st customer to use Algolia other than the brand? My hunch says no, and that there aren't any network effects.
I signed up for your beta and will follow your progress.
Is your solution based on solr/lucene/elasticsearch?
especially if you have records that don't even change often.
If you have 300,000 items in an index that you want to sort in 4 ways and want to update, e.g., the price daily, you have already consumed 36 million operations per month (300,000 * 4 * 30) on the biggest non-enterprise plan, which includes 50 million operations.
Just by testing and tweaking the index every other day, we already use up to 500,000 operations.
But then again, setting up search infrastructure in different countries and syncing it in realtime also comes at a hefty price. So we will stick with Algolia for now; the speed is breathtaking, and we will never be able to achieve 20ms responses with, e.g., an Elasticsearch cluster.
To make this easy to implement, we provide a way to create index replicas, read-only indices that can have different settings from the master index.
When using replicas, every record added to a master index also gets added to the replica index. The same goes for deletion and update operations.
Indexing operations done on replicas are not billed.
By using replicas, you can adjust your calculation by removing the factor of four you included for each index,
meaning that 300K * 30 days = 9 million operations/month. This assumes you update the entire index daily,
whereas you could also update only the prices that changed, which would in turn further reduce the number of operations.
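For anyone who wants to check the numbers, here is a small sketch of the operation counts discussed above. It assumes one billed operation per record update per billed index; the function name is hypothetical, and the 10% change rate in the last case is an invented figure, not something from this thread.

```python
def monthly_operations(records_updated_per_day: int, billed_indices: int, days: int = 30) -> int:
    """Billed indexing operations per month.

    Assumption: one operation per updated record per billed index per day.
    """
    return records_updated_per_day * billed_indices * days


# Four separately billed indices, full reindex every day.
print(monthly_operations(300_000, billed_indices=4))  # 36000000

# One billed master index; replica writes are not billed.
print(monthly_operations(300_000, billed_indices=1))  # 9000000

# Hypothetical: only 10% of prices change on a given day.
print(monthly_operations(30_000, billed_indices=1))  # 900000
```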
Why not? Assuming good hardware, why isn't that possible?
I'd love to see if someone has done any of the "realtime" Algolia demos backed by ElasticSearch.
In any case, ES excels at very different use cases - I've only seen Algolia provide "basic" search.
I think ES can get there, but it depends a lot on what hardware you deploy (SSDs!), how you build your index, and whether you can geographically distribute your search engine close to your users.
We have one ES cluster with hundreds of queries per second that gives median 9ms response times and 99th percentile around 160ms. Another cluster with 100x more data that gets 20-25ms median response times and 99th percentile at 360ms.
Now both of these are just the ES response time, there is additional overhead in responding to an API request and then you also start to get into where your data centers are located relative to the end users.
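If you want to compare your own cluster against median and 99th percentile figures like the ones above, a minimal sketch follows. It uses the nearest-rank definition of a percentile, which is one of several common definitions, and the sample latencies are invented purely for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical engine-side response times in milliseconds.
engine_ms = [7, 8, 9, 9, 10, 12, 9, 8, 160, 11]

print(percentile(engine_ms, 50))  # 9
print(percentile(engine_ms, 99))  # 160
```

These are engine-side numbers only; API overhead and network distance to the user come on top, as noted above.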
More (slightly out of date) background on our config: https://data.blog/2016/05/03/state-of-wordpress-com-elastics...
Did not know Google is discontinuing its custom search engine. Looks like there may be a business opportunity here.
Hope you find it useful
Thanks for the heads up, I will investigate.
However, the market is very very tough here.
The problem here is consolidation: all big customers (successful websites) will end up being part of mega-corporations (Amazon, Microsoft, Google, Salesforce, etc.). And all these companies have their own search engine (which they use for other machine learning, etc. - not just search). For example, Twitch will probably switch to Amazon A9.
So the question here is: what is the game plan to return this $53M? Acquisition?
Algolia is better, so they need to stick with it. But Amazon will get there... And if Amazon finds out that writing code is harder than writing a check, they will write the check. But they are still trying.
What I notice with their results is that they start strong, but fall off very rapidly into partial gibberish.
For example, say I want to make a battery-powered project. I search "lipo." The results start out promising; charging boards, including a few options tailored to specific popular boards. But by the time we get to the end of page 1, only every other result or so is actually relevant. Useful results like batteries and connectors are interspersed with random microcontrollers, LEDs, motors, etc.
Still, it works and you can always refine by category if that sort of thing really bothers you. It seems like a pretty solid solution, and the issues I'm mentioning are probably caused by the implementation, which I'll bet uses the full text description of entries to weight things towards not excluding a potentially-relevant result.
But sites like Digikey/Mouser/etc have searching down. If I want a capacitor, first I pick what kind (electrolytic, ceramic, tantalum, etc.), and then I am presented with dozens of menus representing specific attributes that I care about. Capacitance, temperature coefficient, size/packaging, manufacturer: whatever you could possibly want to select on, you can.
Sometimes I wish that other digital distribution platforms would take inspiration from that 'catalog' model. Discovery is difficult, these days.
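For what it's worth, the 'catalog' model described above is basically faceted filtering: narrow by exact attribute values, then show counts for the remaining values of each attribute. A toy sketch, with an invented catalog and made-up field names (this is not how Digikey or Mouser actually implement it):

```python
from collections import Counter

# Hypothetical parts catalog; manufacturers and values are invented.
PARTS = [
    {"type": "ceramic", "capacitance_uf": 0.1, "package": "0603", "manufacturer": "Acme"},
    {"type": "ceramic", "capacitance_uf": 10.0, "package": "0805", "manufacturer": "Bolt"},
    {"type": "electrolytic", "capacitance_uf": 100.0, "package": "radial", "manufacturer": "Acme"},
    {"type": "tantalum", "capacitance_uf": 10.0, "package": "0805", "manufacturer": "Crest"},
]


def filter_parts(parts, **selected):
    """Keep only parts whose attributes exactly match every selected value."""
    return [p for p in parts if all(p.get(attr) == value for attr, value in selected.items())]


def facet_counts(parts, attribute):
    """Count how many parts carry each value of an attribute (the numbers shown next to a menu)."""
    return Counter(p[attribute] for p in parts if attribute in p)


ceramics = filter_parts(PARTS, type="ceramic")
print(len(ceramics))                      # 2
print(facet_counts(ceramics, "package"))  # one 0603 part and one 0805 part
```

The point is that every step is an exact match on a structured attribute, so results never drift into loosely related items the way full-text ranking can.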
Seeing a huge raise like this actually makes me feel better about the possibility of using Algolia rather than rolling my own Solr containers.
Because...they won't be pressured to be acquired?
EDIT: Looks like they do support this..
https://www.algolia.com/doc/api-client/ruby/api-keys/#genera...
However, I think for a small project it will probably be extra work to keep these keys synchronized vs. just doing a Postgres search for my project.
https://www.algolia.com/doc/guides/security/api-keys/
My only complaint, if I had to make one, is that pricing is a bit steep for our use case but I can't imagine how much time I'd need to spend to get ElasticSearch running comparably.
Kudos to the team at Algolia. Hope you're bathing in pride this week. Looks like the praise is widespread and well earned.
This means that creating a defensible USP in that space isn't easy. Algolia did it by essentially developing their own search software from the ground up. It isn't necessarily better than competitors like Elasticsearch in every measure but it certainly has some interesting properties and it can be pretty fast, especially for things like autocomplete.
Lucene is great, but, it does have limitations: tough to embed in non-JVM applications and poor support for tokenizing things like emojis are two immediate things that come to mind.
Elasticsearch is maybe my favorite data storage product in the last 15 years, but the Algolia folks have taken a different approach that is also quite interesting.
Algolia is _not_ built on top of lucene at all.
Indexes are easy to set up and maintain, and the SDKs support a variety of languages. Querying is fast enough to feel seamless.
Algolia wrote some good blog posts, for example:
https://stories.algolia.com/how-the-founders-of-algolia-thin...
https://stories.algolia.com/how-algolia-built-a-culture-firs...
Just their typo acceptance alone makes Algolia, imo, the best 3rd party search service available currently.
(We're currently using MySQL's full-text search and are finding it to be a long way short of the experience we'd like our customers to have.)
Algolia Pros:
* very nice UI/dashboard, stats, lots of options, flexible
* documentation is nice, 'onboarding' was not an issue
* it worked reliably most of the time
Cons:
* lock-in/proprietary. Think twice if you want to base your business on it.
* you have to trust them with very sensitive customer data
* it is expensive. We reduced costs from $800/month to $40/month by moving to a self-hosted open source solution
with the same level/quality of customer experience.
* the guys at Algolia like to rewrite client libraries and make them completely backwards incompatible.
Good luck rewriting almost everything.
* support is incompetent and honestly it was useless in our case. I don't want to share details here in public as lots of people were involved
in the case, but their support is a disaster and for us that was a major reason to migrate away.
We had major issues for 2 weeks and in the end our developer had to debug and fix the problem in their client library.
Long story, but that one was a disaster.
In short, it wasn't an extremely horrible experience overall and the product is nice in general, but Elasticsearch is very, very good and Algolia just couldn't justify the price tag, sorry.
* we have never discontinued a feature in the API since the launch
* We never broke our API clients. We proposed a new version when a new feature required a big change, but we kept the previous version (and this happened on only two API clients in 5 years)
For support, it is the engineering team working on the product that handles it, and we put a lot of effort into making sure all our customers are satisfied and get relevant answers.
Then, if you got the same customer experience with a $40 machine, you have probably not used all the features/power of the engine. I am sad to see such feedback, and you can hold me accountable for making sure we do everything we can to satisfy all our users.
Let me know if you wanna know more!
highly recommend checking them out. Congrats to the Algolia team!
This is a place where there are tons of unstructured data: emails, slack, atlassian, code base. Search seems like a useful tool for employees. Permission control is another essential feature.
Security is super important, but we can provide encryption-at-rest and have a secured API key [1] feature that allows you to segment your user base.
[1] - https://www.algolia.com/doc/guides/security/api-keys/#secure...
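To illustrate the general idea of a key that carries its own restrictions, here is a generic HMAC-signing sketch. It is only an assumption about how such a scheme can work, not Algolia's actual key format or client API; the function names and the `filters` restriction are hypothetical.

```python
import base64
import hashlib
import hmac
from typing import Optional
from urllib.parse import parse_qs, urlencode


def make_restricted_key(parent_key: str, restrictions: dict) -> str:
    """Sign the restrictions with the parent key and pack signature + restrictions together."""
    params = urlencode(sorted(restrictions.items()))
    signature = hmac.new(parent_key.encode(), params.encode(), hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode((signature + params).encode()).decode()


def verify_restricted_key(parent_key: str, token: str) -> Optional[dict]:
    """Return the embedded restrictions if the signature checks out, else None."""
    try:
        decoded = base64.urlsafe_b64decode(token.encode()).decode()
    except (ValueError, UnicodeDecodeError):
        return None
    # A SHA-256 hex digest is always 64 characters long.
    signature, params = decoded[:64], decoded[64:]
    expected = hmac.new(parent_key.encode(), params.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    return {name: values[0] for name, values in parse_qs(params).items()}


# Hypothetical usage: the backend issues a key that can only see one user's records.
token = make_restricted_key("parent-secret", {"filters": "user_id:42"})
print(verify_restricted_key("parent-secret", token))
print(verify_restricted_key("wrong-secret", token))
```

Since the restrictions are signed, a client cannot widen them without invalidating the key, which is what makes per-user segmentation possible without creating one key per user on the server.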
(their Paris desk is located virtually across the street, but it's not so great)
