
This tutorial is very complex. Here's how to get free semantic search with much less complexity:

  1. Install sentence-transformers [1]
  2. Initialize the MiniLM model - `model = SentenceTransformer('all-MiniLM-L6-v2')`
  3. Embed your corpus [2]
  4. Embed your queries, then search the corpus
This runs on CPU (~750 sentences per second) and on GPU (~18k sentences per second). You can embed paragraphs instead of sentences if you need longer chunks of text. The embeddings are accurate [3] and only 384 dimensions, so they're space-efficient [4].

Here's how to handle persistence. I recommend starting with the simplest strategy, and only getting more complex if you need higher performance:

  - Just save the embedding tensors to disk, and load them if you need them later.
  - Use Faiss to store the embeddings (it will use an index to retrieve them faster) [5]
  - Use pgvector, an extension for postgres that stores embeddings
  - If you really need it, use something like qdrant/weaviate/pinecone, etc.
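The first strategy really is just a save and a load; brute-force cosine search over the reloaded array is usually fast enough at this scale. A sketch using NumPy (the array here is a random stand-in for real embeddings, and the filename is an example):

```python
import numpy as np

# Stand-in for real 384-dim MiniLM embeddings.
embeddings = np.random.rand(1000, 384).astype('float32')

# Strategy 1: save the tensors to disk, load them when needed.
np.save('embeddings.npy', embeddings)
loaded = np.load('embeddings.npy')

# Brute-force cosine search: normalize rows, then one matrix-vector product.
# Switch to Faiss/pgvector only once this gets too slow for your corpus.
norms = loaded / np.linalg.norm(loaded, axis=1, keepdims=True)
query = norms[0]                      # pretend row 0 is the query embedding
scores = norms @ query
top5 = np.argsort(-scores)[:5]        # indices of the 5 nearest neighbors
```

Since the query here is row 0 itself, it comes back as its own nearest neighbor with cosine similarity 1.0.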
This setup is much simpler and cheaper than using a ton of cloud services to do embeddings. I don't know why people make semantic search so complex.

I've used it for https://www.endless.academy, and https://www.dataquest.io and it's worked well in production.

[1] https://www.sbert.net/

[2] https://www.sbert.net/examples/applications/semantic-search/...

[3] https://huggingface.co/blog/mteb

[4] https://medium.com/@nils_reimers/openai-gpt-3-text-embedding...

[5] https://github.com/facebookresearch/faiss



I also recommend this approach when you want to understand every step. I recently did a presentation about this topic: https://www.youtube.com/watch?v=hGRNcftpqAk

It covers end-to-end, including ClickHouse as a vector database.


TBH, it does not look "less complex" at all. :) Install, install... but where do you install and run all of this? The topic is "serverless": you do not need to run anything yourself, just two cloud APIs and a Lambda script.


If you are going to shill for your project, the proper thing to do is disclose that.

https://github.com/azayarni is a contributor (andre-z on twitter).


+1, disappointing to see.


15 minutes of install, install, install beats getting into the vicious SaaS vendor cycle of pay, pay, pay with heavy lock-in.


I'd rather run ~10 lines of code locally than set up 3 cloud services and a Lambda function, but to each their own...


How would you host the sentence-transformers model for free? You need it to vectorize each query, so it has to be hosted somewhere. Is there any way to do that for free?


Just run it on CPU, on your own machine. That's the cheapest way. You could also use a free or cheap VPS, and even parallelize across multiple machines/cores if you need it.


Maybe I'm grumpy today but I am shocked at how many responses you are getting where people think this is a novel idea. Has the engineering mindset really shifted into a default of "buy" even when build could take less than a week?


I was surprised, too, but then I realized they all work at Qdrant.

But the general dialogue around AI-related tools surprises me. The production parts of LangChain, embedding services, and similar tools can usually be rebuilt in a few hours with better observability, performance, and maintainability.


How does sentence-transformers compare to OpenAI embeddings? How long does it take to generate an embedding on a CPU?



