I just made a couple of searches with Teclis. I have to say, it's not bad. It's clearly not complete and I get several empty searches, but the content of the results is of higher quality than what I get with Google or DDG. Nice work!
Thanks. The index is tiny and it is just a proof of concept of what a single person can do with the technologies available nowadays. I felt it was better for it to return zero results than bad results.
As the site says, this demo is by no means meant as a replacement for Google, but rather as a complement to it. I would say Teclis is good for content discovery and for learning new things outside the typical search engine filter bubble. A few examples of good queries are listed on the site.
Not the author, but at work we've had indexes in the hundreds of millions of vectors. Faiss can certainly scale.
If you do have a tiny index and want to try Google's version of vector search (as an alternative to Faiss), you can easily run ScaNN locally [1] (linked in the article; that's the underlying tech). At small scale I had better performance with ScaNN.
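In case it helps anyone, here is a minimal local ScaNN sketch. This is my own toy setup, not the author's configuration; the tree/score_ah parameters are illustrative values loosely following ScaNN's README example.

    import numpy as np
    import scann

    # Toy corpus of unit-normalized embedding vectors.
    dim = 128
    corpus = np.random.rand(100_000, dim).astype(np.float32)
    corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

    # Tree partitioning + asymmetric hashing + exact re-ranking of the top candidates.
    searcher = (
        scann.scann_ops_pybind.builder(corpus, 10, "dot_product")
        .tree(num_leaves=1000, num_leaves_to_search=100, training_sample_size=50_000)
        .score_ah(2, anisotropic_quantization_threshold=0.2)
        .reorder(100)
        .build()
    )

    # Single-query search returns the approximate nearest neighbors and their scores.
    neighbors, distances = searcher.search(corpus[0])
    print(neighbors[:5], distances[:5])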
This demo is only about a million vectors. The largest index I had in Faiss was embeddings of the entire Wikipedia (in the neighborhood of ~30 million vectors). I know people running a few billion vectors in Faiss.
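For anyone curious what a multi-million-vector Faiss setup can look like, here is a rough sketch using an IVF index with product quantization to keep memory in check. The parameters and the random data are placeholders, not what this demo actually uses.

    import numpy as np
    import faiss

    dim = 768                                             # BERT-sized embeddings
    xb = np.random.rand(100_000, dim).astype(np.float32)  # stand-in for a large corpus
    xq = np.random.rand(5, dim).astype(np.float32)        # stand-in queries

    # Inverted file with 1024 clusters + 64-byte product quantization codes per vector.
    quantizer = faiss.IndexFlatL2(dim)
    index = faiss.IndexIVFPQ(quantizer, dim, 1024, 64, 8)
    index.train(xb)          # learn the coarse clusters and the PQ codebooks
    index.add(xb)
    index.nprobe = 32        # how many clusters to scan per query

    distances, ids = index.search(xq, 10)
    print(ids)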
So one vector per article? Doesn’t this skew results? A short article with 0.9 relevance score would rank higher than a long article containing one paragraph with 1.0 relevance. Am I mistaken?
Also, BERT on cheap hardware? I thought that without a GPU, vectorizing millions of snippets or doing sub-second queries was basically out of the question.
CPU BERT inference is fast enough to embed 50 examples per second. Your large index is built offline; the query is embedded live, but it's just one at a time. Approximate similarity search complexity is logarithmic, so it's super fast on large collections.
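To make the offline/online split concrete, here is a rough CPU-only sketch with sentence-transformers. The model name and corpus are placeholders, not what Teclis uses, and the throughput you measure will depend on your hardware and model.

    import time
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")

    # Offline: embed the whole corpus in batches (the slow, one-time part).
    corpus = ["example snippet %d about search engines" % i for i in range(500)]
    t0 = time.perf_counter()
    corpus_emb = model.encode(corpus, batch_size=64, convert_to_numpy=True,
                              normalize_embeddings=True)
    print("docs/sec:", len(corpus) / (time.perf_counter() - t0))

    # Online: embed a single query at a time.
    t0 = time.perf_counter()
    query_emb = model.encode(["how does approximate nearest neighbor search work"],
                             convert_to_numpy=True, normalize_embeddings=True)
    print("query latency (ms):", (time.perf_counter() - t0) * 1000)

    # With normalized vectors, cosine similarity is a dot product; a real setup
    # would hand this off to an ANN index (Faiss, ScaNN, ...) instead.
    scores = corpus_emb @ query_emb[0]
    print("best match:", corpus[int(np.argmax(scores))])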
It's about choosing the right Transformer model. There are several models that are smaller, with fewer parameters than bert-base, which give the exact same accuracy as bert-base and can run on a modern CPU in single-digit milliseconds, even with a single intra-op thread. See for example https://github.com/vespa-engine/sample-apps/blob/master/msma...
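A quick way to sanity-check that on your own machine (the model below is a hypothetical pick for illustration, not necessarily the one in the linked sample app):

    import time
    import torch
    from transformers import AutoModel, AutoTokenizer

    # Restrict PyTorch to a single intra-op thread, as described above.
    torch.set_num_threads(1)

    # Hypothetical small (6-layer) model; substitute whatever distilled or
    # pruned model you are comparing against bert-base.
    name = "sentence-transformers/all-MiniLM-L6-v2"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name).eval()

    inputs = tokenizer("what is dense retrieval?", return_tensors="pt")
    with torch.no_grad():
        model(**inputs)                        # warm-up run
        t0 = time.perf_counter()
        for _ in range(50):
            model(**inputs)
    print("average forward pass: %.1f ms" % ((time.perf_counter() - t0) / 50 * 1000))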
I compared BERT[1], distilbert[2], mpnet[3] and minilm[4] in the past. But the results I got "out of the box" for semantic search were not better than using fastText, which is orders of magnitude faster. BERT and distilbert are 400x slower than fastText, minilm 300x, and mpnet 700x. At least if you are using a CPU-only machine. USE, xlmroberta and elmo were even worse (5,000 - 18,000x slower).
I also love how fast and easy it is to train your own fastText model.
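For reference, a minimal fastText training and sentence-embedding sketch; the file name and hyperparameters are made up for illustration.

    import fasttext

    # Train an unsupervised model on a plain-text corpus, one sentence or
    # document per line.
    model = fasttext.train_unsupervised("corpus.txt", model="skipgram",
                                        dim=100, epoch=5, minCount=2)

    # fastText sentence vectors are just averaged (subword-aware) word vectors,
    # which is why it is so much faster than transformer encoders.
    vec = model.get_sentence_vector("semantic search with fasttext")
    print(vec.shape)  # (100,)

    model.save_model("semantic.bin")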
Vector models are nothing but representation learning, and applying a model out-of-domain usually gives worse results than plain old BM25. See https://arxiv.org/abs/2104.08663
A concrete example is DPR, a state-of-the-art dense retriever trained on Wikipedia for question answering: when applied to MS MARCO passage ranking, it performs worse than plain BM25.
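For context, BM25 here just means a plain lexical scorer over tokenized text. A tiny illustration with the rank_bm25 package (my own example, not the implementation used in the BEIR or DPR papers):

    from rank_bm25 import BM25Okapi

    corpus = [
        "dense retrievers need in-domain training data to work well",
        "bm25 is a strong lexical baseline for passage ranking",
        "faiss performs approximate nearest neighbor search over vectors",
    ]
    bm25 = BM25Okapi([doc.split() for doc in corpus])

    query = "lexical baseline for passage ranking".split()
    print(bm25.get_scores(query))           # one BM25 score per document
    print(bm25.get_top_n(query, corpus, n=1))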