Hacker News | carsonpoole's comments

I wholeheartedly agree :) I've mostly put this together this weekend, so I haven't had time to do everything yet, but that's definitely on the todo list.

I fine-tuned some SigLIP models (see here: https://huggingface.co/carsonpoole/binary-siglip-text) that should be amenable to binary quantization, so tomorrow I'm going to get an inference server running with them and then hopefully run the typical MTEB benchmarks for embeddings.
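For anyone unfamiliar with binary quantization, here is a minimal sketch of the general idea, not the project's actual code. It assumes the simplest scheme, thresholding each dimension at zero, and the helper names `binarize` and `hamming` are made up for illustration.

```python
def binarize(vec):
    """Sign-threshold quantization (assumed scheme): one bit per dimension,
    packed into a single Python int, first dimension in the highest bit."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming(a, b):
    """Number of bit positions where the two packed codes differ."""
    return bin(a ^ b).count("1")


# Two toy 4-dimensional embeddings (made-up values).
a = binarize([0.3, -1.2, 0.8, 0.1])   # bits 1011
b = binarize([0.5, -0.4, -0.9, 0.2])  # bits 1001
distance = hamming(a, b)              # the codes differ in one position
```

Each float dimension collapses to a single bit, so similarity becomes an XOR plus a popcount, which is why an embedding model tuned to tolerate that loss matters.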


There are also FAISS binary indexes[0], so it'd be great to compare binary index vs binary index. Otherwise it seems a little misleading to say it is a FAISS vs not FAISS comparison, since really it would be a binary index vs not binary index comparison. I'm not too familiar with binary indexes, so if there's a significant difference between the types of binary index then it'd be great to explain what that is too.

[0] https://github.com/facebookresearch/faiss/wiki/Binary-indexe...


Not yet, but it scales roughly linearly, since it's a brute-force algorithm. So with the current version it'd probably be about 22 seconds for a 1B-vector search. The whole point of having metadata queries is to prevent those kinds of searches from being necessary, and hopefully with some FTS interspersed it can reduce the number of similarity comparisons required even more.
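To illustrate that point, here is a rough sketch of a brute-force search with a metadata prefilter. It is an assumption-laden toy, not the project's implementation: the record layout, the `search` and `estimate_seconds` helpers, and the sample data are all hypothetical, and the per-vector cost is just the ~22 s per 1B estimate taken at face value.

```python
def hamming(a, b):
    """Number of differing bits between two packed binary codes."""
    return bin(a ^ b).count("1")


def search(records, query_code, k, predicate=None):
    """Brute-force top-k by Hamming distance.

    records: list of (id, code, metadata) tuples (hypothetical layout).
    predicate: optional metadata filter applied BEFORE any comparison.
    Returns (ids of the k nearest, number of comparisons performed).
    """
    if predicate is None:
        candidates = records
    else:
        candidates = [r for r in records if predicate(r[2])]
    ranked = sorted(candidates, key=lambda r: hamming(r[1], query_code))
    return [r[0] for r in ranked[:k]], len(candidates)


def estimate_seconds(n_vectors, seconds_per_billion=22.0):
    """Linear extrapolation from the ~22 s per 1B vectors estimate."""
    return n_vectors * seconds_per_billion / 1_000_000_000


records = [
    ("a", 0b1011, {"lang": "en"}),
    ("b", 0b1001, {"lang": "de"}),
    ("c", 0b0100, {"lang": "en"}),
    ("d", 0b1111, {"lang": "en"}),
]

# The filter drops record "b" before any distance is computed.
ids, compared = search(records, 0b1010, k=2,
                       predicate=lambda m: m["lang"] == "en")
```

Because the cost is linear in the number of candidates, a filter that keeps 10% of a 1B-vector collection would bring the estimate from about 22 seconds down to roughly 2.2 seconds, ignoring the cost of evaluating the filter itself.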


IMO there are a lot of things that other vector DBs are missing and that should exist. I want to make this work on petabyte-scale data, with the features on the README's roadmap plus some others I only have nebulous ideas about.


It works right now, but I'm actively adding a lot of additional things that might make your life easier. The roadmap on the readme shows what I'm working on adding. Feel free to shoot me an email at carson at poole.ai and I'd be happy to give some guidance, but a quickstart is definitely at the top of my priorities also. :)


This is really cool! Thanks for writing it!


thank you for the kind words!

