SPLADE for Sparse Vector Search Explained (pinecone.io)
52 points by gk1 on Feb 23, 2023 | 14 comments



I am not sure what algorithmic "secret sauce" Pinecone is implementing beyond being a wrapper around FAISS, but I have to hand it to James: he is doing god's work in helping people understand ANNS and how to apply it.


You might be surprised to learn there’s no Faiss inside Pinecone. All the indexes are proprietary and have features you won’t find in Faiss, such as metadata filtering, real-time index updates, and horizontal/vertical scaling. Oh and of course sparse-dense hybrid indexing, which went live today.


I'll believe it once Pinecone becomes open source. This lack of transparency is why I prefer other vector databases such as Vespa, Milvus, etc.

Edit: While I do appreciate Pinecone's free tier and understand that it's great for small projects and the vector search community in general, the performance really sucks.


Do you plan to publish details of what is being implemented soon, or to take part in ANN Benchmarks or BigANN Benchmarks?


We'll participate in BigANN this year or next. It's a question of engineering/research resources.


Completely false. You guys replicated and beat years of research by multiple teams at Facebook? With a few people?


What are you saying is false? That they aren't using FAISS? Their whole startup is built around a proprietary ANN solution, which isn't using FAISS.


Just out of curiosity - how do you know this if you don't work there?


When a founder says "You might be surprised to learn there’s no Faiss inside Pinecone" I'm inclined to believe them. They seem like a good team. I don't think they would lie about this.

I also just had a browse through some of the employee profiles on LinkedIn and their team absolutely seems to have the background you would need to build something like this from scratch. https://www.linkedin.com/company/pinecone-io/people/


Correction: I'm not the Founder, but I've been at Pinecone for two years, from when we were ~10 people. Long enough to know what's under the hood.

I've also been on HN long enough to expect a few cynics and conspiracy theorists in the comments.

This reminds me... The creator of Faiss, Matthijs Douze, will speak at our NYC meetup on March 21st. Join up and ask him if he's aware of any sneaky business going on! https://www.pinecone.io/events/meetup-march-2023/


If an engineer or product manager working there mentions it in a casual setting, I'm inclined to believe them. If a founder/executive says it, it's a toss-up.

Why? Look at companies like FTX.


I’m a paying customer. They have been building up a system with new features over time that Faiss doesn’t support. At least from a customer's point of view, I’d be very surprised if they were using it. I may be wrong, but I thought it was originally started by some academics in Israel.


FAISS is open source and highly extensible. I don't think it would be very difficult to build atop it.


At least out of the box, I'm pretty sure FAISS doesn't support sparse matrices.



