Hacker News | jn2clark's comments

With binary representations you still get 2^D possible configurations, so it's entirely possible from a representation perspective. The main issue (I think at least) is around determining the similarity. Hamming distance gives an output space of only D+1 possible scores. As mentioned in the article, going to 0/1 with cosine gives better granularity, as it now penalizes embeddings that have differing amounts of positive elements (i.e. living on different hyper-spheres). It is probably well suited to retrieval where there is a 1:1 correspondence between query and document, but if the degeneracy of queries is large then there could be issues discriminating between similar documents. Regimes of binary and (small) dense embeddings could be quite good. I expect a lot more innovation in this space.
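A quick numpy sketch of the score-granularity point: Hamming distance between D-dimensional binary vectors can only take D+1 integer values, while cosine over the same 0/1 vectors is continuous and also reflects differing counts of positive elements.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64

# Two random binary embeddings.
a = rng.integers(0, 2, D)
b = rng.integers(0, 2, D)

# Hamming distance: an integer in [0, D], so only D + 1 distinct scores.
hamming = int(np.count_nonzero(a != b))

# Cosine on the same 0/1 vectors: a continuous score that additionally
# penalizes vectors with differing numbers of positive elements
# (their norms differ, i.e. they live on different hyper-spheres).
cosine = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(hamming, cosine)
```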


That's a great question. I think regimes like that could offer better trade-offs of memory/latency/retrieval performance, although I don't know what they are right now. It also assumes that going to larger dimensions can preserve more of the full-precision performance, which is still TBD. The other thing is how the binary embeddings play with ANN algorithms like HNSW (i.e. recall). With Hamming distance the space of similarity scores is quite limited.


I would love an LLM agent that could reliably generate small API examples from a repo like this for the various models and ways to use them.


What is accuracy in this case? Is it meant to be recall, or is it some other evaluation metric?


Yeah, it is recall.


We (Marqo) are doing a lot on 1 and 2. There is a huge amount to be done on the ML side of vector search and we are investing heavily in it. I think it has not quite sunk in yet that vector search systems are ML systems, with everything that comes with that. I would love to chat about 1 and 2, so feel free to email me (email is in my profile).


Take a look here https://github.com/marqo-ai/local-image-search-demo. It is based on https://github.com/marqo-ai/marqo. We do a lot of image search applications. Feel free to reach out if you have other questions (email in profile).


That does look pretty interesting. But I still feel that it's not very convenient for use in a desktop environment with local files. This is of course not the project's fault, since I assume that it simply targets different use cases and audiences.

I also researched in the meantime whether such functionality could be implemented at all for the GNOME Shell and, more specifically, for its file browser. But the search and extension APIs either would not allow it or would require many hacks.


Can anyone comment on an open-source multi-modal LLM that can produce structured outputs based on an image? I have not found a good open-source one yet (this one included); it seems to be only closed-source models that can do this reliably well. Any suggestions are very welcome!


Something like this?

https://imgur.com/a/hPAaZUv

https://huggingface.co/spaces/Qwen/Qwen-VL-Plus

You can also ask it to give you bounding boxes of objects.


I've only used LLaVA / BakLLaVA. It falls under the Llama 2 Community License. Not sure if you consider that open source or not.


That sounds much longer than it should take. I am not sure of your exact use case, but I would encourage you to check out Marqo (https://github.com/marqo-ai/marqo - disclaimer, I am a co-founder). All inference and orchestration is included (no API calls) and many open-source or fine-tuned models can be used.


> That [pgvector index creation time] sounds much longer than it should... I would encourage you to check out Marqo

Your comment makes it sound like Marqo is a way to speed up pgvector indexing, but to be clear, Marqo is just another Vector Database and is unrelated to pgvector.


Fair enough, apologies for the confusion!


The reason I would use pgvector is because I am uninterested in another piece of infrastructure.


Try this: https://github.com/marqo-ai/marqo, which handles all the chunking for you (and is configurable). It also handles chunking of images in an analogous way. This enables highlighting in longer docs (and likewise for images) in a single retrieval step.
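To illustrate the general idea (this is a generic sliding-window sketch, not Marqo's actual implementation or API): overlapping chunks mean each retrieved vector maps back to a specific span of the document, which is what makes highlighting possible in a single retrieval step.

```python
# Minimal sketch of configurable, overlapping text chunking.
# chunk_size and overlap are in words; each chunk maps back to a
# contiguous span of the source document (enabling highlighting).
def chunk_text(text: str, chunk_size: int = 128, overlap: int = 32) -> list[str]:
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, max(len(words) - overlap, 1), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
    return chunks
```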


As others have correctly pointed out, making a vector search or recommendation application requires a lot more than similarity alone. We have seen HNSW become commoditised and the real value lies elsewhere. Just because a database has vector functionality doesn't mean it will actually service anything beyond "hello world" type semantic search applications. IMHO these have questionable value, much like the simple Q and A RAG applications that have proliferated.

The elephant in the room with these systems is that if you are relying on machine learning models to produce the vectors, you are going to need to invest heavily in the ML components of the system. Domain-specific models are a must if you want to be a serious contender to an existing search system, and all the usual considerations still apply regarding frequent retraining and monitoring of the models. Currently this is left as an exercise to the reader - and a very large one at that.

We (https://github.com/marqo-ai/marqo, I am a co-founder) are investing heavily in making the ML production-worthy and in continuous learning from feedback as part of the system. There are lots of other things to think about: how you represent documents with multiple vectors, multimodality, late interactions, the interplay between embedding quality and HNSW graph quality (i.e. recall), and much more.


> IMHO these have questionable value

In general I find they're incredibly good for rapidly building out search engines for things that would normally be difficult to do with plain text.

The most obvious example is code search, where you can describe the function's behavior and get a match. But you could also make a searchable list of recipes that lets a user search for something like "a hearty beef dish for a cold fall night", or search support tickets where full text might not match, e.g. "all the cases where users had trouble signing on".

Interestingly, Q & A is ultimately an (imho fairly boring) implementation of this pattern.

The really nice part is that you can implement working demos of these projects in just a few lines of code once you have the vector db set up. Once you start thinking in terms of semantic search rather than text matching, you realize you can build old-Google-style search engines for basically any text available to you.

One thing that is a bit odd about the space, from what I've experienced and heard, is that setup and performance on most of these products is not all that great. Given that you can implement the demo version of a vector db in a few lines of numpy, you would hope that investing in a full vector db product would get you an easily scalable solution.
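For the record, the "demo version of a vector db in a few lines of numpy" really is about this small: a brute-force store of normalized embeddings where search is one matrix-vector product. (Embeddings here would come from whatever model you use; the class name and shape are just for illustration.)

```python
import numpy as np

class TinyVectorDB:
    """Brute-force cosine-similarity search over normalized embeddings."""

    def __init__(self, dim: int):
        self.vectors = np.empty((0, dim))
        self.docs = []

    def add(self, doc, embedding) -> None:
        v = np.asarray(embedding, dtype=float)
        v = v / np.linalg.norm(v)          # normalize so dot product == cosine
        self.vectors = np.vstack([self.vectors, v])
        self.docs.append(doc)

    def search(self, query_embedding, k: int = 3):
        q = np.asarray(query_embedding, dtype=float)
        q = q / np.linalg.norm(q)
        scores = self.vectors @ q           # cosine similarity to every doc
        top = np.argsort(-scores)[:k]
        return [(self.docs[i], float(scores[i])) for i in top]
```

Real products earn their keep beyond this with ANN indexes (e.g. HNSW), filtering, persistence, and scaling, which is exactly where the setup and performance complaints tend to appear.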

