
BM25 – The Next Generation of Lucene Relevance - based2
http://opensourceconnections.com/blog/2015/10/16/bm25-the-next-generation-of-lucene-relevation/
======
asjo
Xapian has been using BM25 forever[1] - what made Lucene switch now?

[1] [https://xapian.org/docs/bm25.html](https://xapian.org/docs/bm25.html)

~~~
apatry
BM25 has been available for Lucene since version 4, thanks to a refactoring
that allowed anyone to plug in a custom relevance score, if I remember
correctly. The big news is that starting with version 6, BM25 is the default.

~~~
asjo
Yes, so I am asking: why is Lucene switching now?

------
rpedela
BM25 has been available for a while, so is there a reason it is becoming the
default soon, besides being better in certain scenarios? Have there been
improvements to the implementation recently?

~~~
eva1984
I went to the Elastic conference last month, and there was a great talk about
BM25. It mentioned that BM25 is more robust against common words than plain
tf*idf.

[https://www.elastic.co/elasticon/conf/2016/sf/improved-text-...](https://www.elastic.co/elasticon/conf/2016/sf/improved-text-scoring-with-bm25)
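
To make the "more robust against common words" point concrete, here is a rough Python sketch comparing a textbook tf*idf term weight with a textbook BM25 term weight. It is not Lucene's actual code: the function names and the toy corpus numbers are made up for illustration, and k1=1.2 and b=0.75 are just the commonly quoted defaults. The thing to notice is that the tf*idf weight keeps growing with term frequency, while the BM25 weight levels off below idf * (k1 + 1).

```python
import math


def tfidf_weight(freq, doc_count, doc_freq):
    """Textbook tf*idf: raw term frequency times log inverse document frequency."""
    return freq * math.log(doc_count / doc_freq)


def bm25_weight(freq, doc_count, doc_freq, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Textbook BM25 weight for one term in one document (illustrative only)."""
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalisation: longer-than-average documents get a larger denominator.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Saturating term frequency: approaches (k1 + 1) as freq grows.
    return idf * freq * (k1 + 1) / (freq + norm)


# Hypothetical corpus: 1000 documents, and a common word found in 900 of them.
DOC_COUNT, DOC_FREQ = 1000, 900

for freq in (1, 5, 25, 125):
    tfidf = tfidf_weight(freq, DOC_COUNT, DOC_FREQ)
    bm25 = bm25_weight(freq, DOC_COUNT, DOC_FREQ, doc_len=100, avg_doc_len=100)
    print(f"freq={freq:4d}  tf*idf={tfidf:8.3f}  bm25={bm25:6.3f}")
```

With these assumptions, a document that repeats a common word over and over cannot pile up an arbitrarily large score under BM25, whereas it can under the plain tf*idf form above.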

------
CephalopodMD
Doesn't Lucene let you choose your own relevance algorithms, smoothing,
metrics, etc. anyway?

