Hacker News

I wonder how legacy search players like Elastic/Solr compete against the new-age startups combining semantic and regular search?



Lots of reasons:

1) switching search engines is hard when you’ve built your information needs around one. I’ve led lots of search engine migrations and they’re not fun. I even gave a talk on the problems companies face when doing so. https://haystackconf.com/us2020/search-migration-circus/

2) lots of the new search startups don’t offer full feature coverage. So just because a company is the new hotness doesn’t mean it can fill the needs of someone entrenched in Solr/Elastic

3) why risk going to a startup when they haven’t proven they’ll be around in 3 to 5 years?

4) incumbent search engines eventually catch up at the speed of the enterprise market. Why spend a year migrating when the engine you’re using will implement the feature for you within that timeframe?


By adding the features that those new age startups launch: https://www.elastic.co/guide/en/elasticsearch/reference/curr...

Building a classic text search engine is way harder than building a KNN engine, and bolting a KNN engine into a term search engine is easier than the other way around.


Reading "legacy" near "elastic" makes me feel a little bit old :D :D

BTW, if you are one of the leaders of the market, you don't need to continuously improve: just wait, let your competitors do the research, and implement only once the feature is mature.


:D :D

Sorry, my question was about the quality of the results. Simply put: how do players with good semantic search fare against "legacy" players who have good text search?


They are part of the hype. Lucene has vector search capabilities. Elasticsearch and Opensearch have support for that (slightly different implementations). I assume solr has similar capabilities. The combination of traditional search and vector search makes a lot of sense from a cost control point of view. Vector search at scale is expensive. The smaller the result set, the cheaper it is to do vector search over it. So using a cheap traditional search to limit the results before you run vector search makes a lot of sense.
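The cost-control idea described above can be sketched in plain Python: a cheap lexical stage shrinks the candidate set before the expensive vector stage runs. This is a toy illustration with made-up embeddings, not any particular engine's API:

```python
import math

# Toy corpus: each doc has text plus a (pretend) embedding vector.
# In a real system the vectors would come from an embedding model.
docs = [
    {"id": 1, "text": "tuning elasticsearch bm25 relevance", "vec": [0.9, 0.1, 0.0]},
    {"id": 2, "text": "vector search with knn indexes",      "vec": [0.1, 0.9, 0.2]},
    {"id": 3, "text": "postgres full text search basics",    "vec": [0.4, 0.3, 0.5]},
]

def keyword_filter(query, docs):
    """Cheap lexical stage: keep only docs sharing a term with the query."""
    terms = set(query.lower().split())
    return [d for d in docs if terms & set(d["text"].split())]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_search(query, query_vec, docs, k=2):
    # Expensive similarity scoring only runs over the filtered subset.
    candidates = keyword_filter(query, docs)
    scored = [(cosine(query_vec, d["vec"]), d) for d in candidates]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [d["id"] for _, d in scored[:k]]

print(hybrid_search("vector search", [0.2, 0.8, 0.1], docs))  # → [2, 3]
```

Doc 1 never gets a vector comparison at all, because it shares no terms with the query; that pruning is exactly where the cost savings come from at scale.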

Also, BM25 holds up well against vector search. A well-tuned model can outperform it, but many off-the-shelf models struggle to do that. Vector search is a useful tool, but so far it's not a one-size-fits-all solution that "just works". It's something that can work really well if you know what you are doing and with a lot of tuning. With things like Elasticsearch you can try both approaches.
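For reference, BM25 itself is compact enough to write out. A minimal pure-Python scoring sketch (toy corpus, standard k1/b defaults; real engines add many refinements on top):

```python
import math
from collections import Counter

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """Score one document (a list of tokens) against a query using BM25."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)        # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)  # rare terms weigh more
        f = tf[term]
        # Term-frequency saturation plus length normalization.
        score += idf * (f * (k1 + 1)) / (f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [
    "the quick brown fox".split(),
    "search engines rank documents by relevance".split(),
    "vector search complements keyword search".split(),
]
scores = [bm25_score(["search", "relevance"], d, corpus) for d in corpus]
best = max(range(len(corpus)), key=lambda i: scores[i])  # doc matching both terms wins
```

Note how the second document outranks the third even though the third repeats "search": the IDF weighting and saturation terms reward matching both query terms over repeating one, which is part of why BM25 remains a strong baseline.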


pg_bm25/ParadeDB author here. What we're doing is building an opinionated alternative within PostgreSQL. If you are not using Postgres, or want your system to be separate, Elastic is still the best choice and will likely remain so.

Other people have brought up great points for why or why not to switch. Our vision for this is that ParadeDB is not merely "better" than Elastic, but rather different. Elastic will never be a PostgreSQL database, and we'll never be a NoSQL search engine. If you want one or the other, you'll pick either ParadeDB or Elastic.


Who is the competition besides Algolia? Last I checked most of the competition is either very expensive or very feature limited compared to Elastic/Solr.


Meilisearch seems like it is the best open source option.

https://www.meilisearch.com/


I think pretty much all the companies that provide vector search are indirect competitors.



