Show HN: USearch – Smaller and Faster Single-File Vector Search Engine (github.com/unum-cloud)
21 points by ashvardanian 11 months ago | 2 comments
Last week was insane for vector search. Weaviate raised $50M, and Pinecone raised $100M... That's a lot and makes you believe that vector search is hard. But it's not.

I have spent the last couple of days implementing a single-file vector search engine from scratch, which is at least the tenth in the twenty years of my career. But this time, it's different. Instead of inventing a brand new algorithm and doing some crazy optimizations on the GPU, I:

1. took the standard HNSW algorithm,
2. fit it into 1000 lines of C++11 for portability,
3. added quantization and hardware-accelerated metrics,
4. wrapped it for Python, JavaScript, Rust, and Java, and
5. open-sourced it!
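
For readers unfamiliar with step 3, here is a minimal sketch of what quantization plus a distance metric can look like. It is not USearch's code (that is C++ with hardware-accelerated metrics); the names `quantize_i8` and `cosine_distance` are hypothetical, and the scaling scheme is just one simple choice, assumed here for illustration.

```python
import math

def quantize_i8(vector):
    """Normalize a float vector to unit length, then map each
    component to an integer in [-127, 127] (fits in one byte)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [round(127 * x / norm) for x in vector]

def cosine_distance(a, b):
    """Plain scalar cosine distance; works on floats or quantized ints."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0  # convention chosen for this sketch: zero vectors are "far"
    return 1.0 - dot / (na * nb)

a = [0.3, -1.2, 4.0]
b = [1.0, 0.5, -0.25]
exact = cosine_distance(a, b)
approx = cosine_distance(quantize_i8(a), quantize_i8(b))
print(exact, approx)  # the two values should be close
```

The point of the trade is that each component shrinks from a 4-byte float to 1 byte, at the cost of a small error in the computed distance.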

It was fun, and to my surprise, it performed well, reaching 300K QPS on Amazon c7g instances. I never had to use third-party vector search products, but the first testers of USearch suggested a 3x performance improvement over their existing solutions.

My colleagues and friends are also adding bindings for GoLang and the Wolfram language. We will soon add Windows support, a standalone server, and a distributed version based on UCall, which we shared a month ago. There is, of course, more to do, but you can already use it!

One of the apparent use cases is Semantic Search platforms. The example at the end of the GitHub page shows how to use USearch, UCall, and the UForm transformers together to build up a text-to-image semantic search platform in just 20 lines of Python.

Try it and join the development! We also have a lot of open positions, especially for those who want to work with us on next-gen algorithms and AI infra rather than polishing and repackaging existing ideas :)

Wow, this is amazing and 1K lines is very tempting for me to look deeper and hack.

If I were to write my own implementation, where should I start?

I always prefer to start with theory or a paper, but it seems to be an outdated approach. You can try checking out FAISS, which has good performance but is overly complex to navigate. You can look up HNSWlib, the original, but it looks more like a university project than a production library... So if you want to see code - start with USearch :)
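
For anyone who wants a feel for the core idea before reading any of those codebases, below is a minimal single-layer proximity-graph sketch in pure Python. It is a simplification under stated assumptions, not HNSW proper and not how USearch, FAISS, or HNSWlib are written: there is one layer instead of a hierarchy, neighbor lists are never pruned, and the entry point is always node 0. The class `TinyGraphIndex` and its parameters `connectivity` and `ef` are hypothetical names for this example.

```python
import heapq
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class TinyGraphIndex:
    """One flat proximity graph; HNSW stacks several such layers."""

    def __init__(self, connectivity=8):
        self.connectivity = connectivity
        self.vectors = []    # node id -> vector
        self.neighbors = []  # node id -> list of linked node ids

    def search(self, query, k, ef=32):
        """Best-first graph search keeping up to `ef` best nodes seen so far."""
        if not self.vectors:
            return []
        entry = 0
        d0 = l2(query, self.vectors[entry])
        visited = {entry}
        candidates = [(d0, entry)]  # min-heap: closest unexplored node first
        best = [(-d0, entry)]       # max-heap via negation: worst result on top
        while candidates:
            d, node = heapq.heappop(candidates)
            if len(best) >= ef and d > -best[0][0]:
                break  # nothing left that can improve the result set
            for nb in self.neighbors[node]:
                if nb in visited:
                    continue
                visited.add(nb)
                dn = l2(query, self.vectors[nb])
                if len(best) < ef or dn < -best[0][0]:
                    heapq.heappush(candidates, (dn, nb))
                    heapq.heappush(best, (-dn, nb))
                    if len(best) > ef:
                        heapq.heappop(best)
        return sorted((-nd, n) for nd, n in best)[:k]

    def add(self, vector):
        """Link the new node to its approximate nearest neighbors, both ways."""
        new_id = len(self.vectors)
        links = [n for _, n in self.search(vector, self.connectivity)]
        self.vectors.append(list(vector))
        self.neighbors.append(links)
        for n in links:
            self.neighbors[n].append(new_id)
        return new_id
```

A useful first exercise is to compare its results against a brute-force scan over all vectors and watch how recall changes as `ef` and `connectivity` shrink.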
