Yeah, that's exactly right. You take the embedding of your query and then compare it to the embeddings of your documents (or chunks of documents).
You can use vector databases that implement the Approximate Nearest Neighbors (ANN) algorithm, or if you don't have too many documents you can just brute force the similarity comparison.
I'm not sure at what point you _really_ need or want a vector database, but anecdotally, calculating the Hamming distance for a couple thousand vectors seems pretty negligible.
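To give a feel for how cheap the brute-force case is, here's a minimal sketch of Hamming-distance search over a couple thousand binary embeddings. The sizes (2,000 docs, 1024-bit vectors packed into uint8) are made-up illustration values, not anything from the thread:

```python
# Brute-force Hamming-distance search over packed binary embeddings.
# Assumed sizes: ~2k documents, 1024-bit vectors stored as 128 uint8s each.
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_bytes = 2000, 128
docs = rng.integers(0, 256, size=(n_docs, n_bytes), dtype=np.uint8)
query = rng.integers(0, 256, size=n_bytes, dtype=np.uint8)

# XOR the query against every doc, then count the set bits per row:
# that count is exactly the Hamming distance.
dists = np.unpackbits(docs ^ query, axis=1).sum(axis=1)
top5 = np.argsort(dists)[:5]  # indices of the 5 nearest documents
```

On this scale the whole scan is a single vectorized pass, which is why a dedicated index often isn't worth it yet.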
Very cool, thanks. I am using pgvector right now. I suspect (and hope) I'll run into scaling issues at some point.
I've been thinking about a setup on object storage. The data I have could be grouped per account. If you can compress the total index to ~1% of the size of traditional float vectors, you can probably get a long way by fetching the binaries from S3 only when necessary into some sort of warm cache.

Or just get a hetzner machine with 20TB of storage and not worry about it I guess