I really don't want another database. I just want to have a solution built in fo...

killthebuddha · on March 26, 2023

FWIW supabase recently added support:

https://supabase.com/docs/guides/database/extensions/pgvecto...

bottlepalm · on March 26, 2023

Very cool, thanks.

fzliu · on March 26, 2023

Totally with you on that - it's annoying to need multiple pieces of infrastructure, especially for solo developers or small teams. Oftentimes, you want to filter further based on scalar fields/metadata such as timestamps, strings, numeric values, etc.

That's why we built attribute filtering into Milvus via a Mongo-esque interface. No SQL and not as performant as an RDBMS, but it's an option: https://milvus.io/docs/hybridsearch.md
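To make "attribute filtering" concrete, here is a minimal brute-force sketch of the general idea: drop records whose metadata fails a predicate, then rank the survivors by vector similarity. This is not Milvus's actual interface; `filtered_search`, the record layout, and the sample data are all hypothetical, and a real engine would use an index rather than a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def filtered_search(records, query, predicate, k=3):
    """Hypothetical helper: filter on scalar metadata first, then rank by similarity."""
    candidates = [r for r in records if predicate(r["meta"])]
    candidates.sort(key=lambda r: cosine_similarity(r["vector"], query), reverse=True)
    return candidates[:k]

# Made-up sample data: each record carries a vector plus scalar metadata.
RECORDS = [
    {"id": 1, "vector": [1.0, 0.0], "meta": {"year": 2021, "lang": "en"}},
    {"id": 2, "vector": [0.9, 0.1], "meta": {"year": 2023, "lang": "en"}},
    {"id": 3, "vector": [0.0, 1.0], "meta": {"year": 2023, "lang": "de"}},
    {"id": 4, "vector": [0.7, 0.7], "meta": {"year": 2023, "lang": "en"}},
]

hits = filtered_search(
    RECORDS,
    query=[1.0, 0.0],
    predicate=lambda m: m["year"] >= 2022 and m["lang"] == "en",
    k=2,
)
print([h["id"] for h in hits])
```

Record 1 is the closest vector to the query, but the metadata filter excludes it before ranking, which is the whole point of filtering on scalar fields.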

bayan1234 · on March 26, 2023

Yes, exactly. My company has asked AWS whether they will add pgvector support to RDS, but they haven't been able to confirm whether that will happen any time soon.

If the vectors are in the same database as the tabular/structured data, then text-to-SQL applications of LLMs become so much more powerful. The generative models can then form complex queries that find similarity as well as perform aggregation, filtering, and joining across datasets. Doing this today with a separate dedicated vector DB is quite painful.
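As a rough illustration of what a single query over co-located vectors and structured data can look like, here is a small runnable sketch. It uses SQLite from Python with a user-defined `cosine_distance` function as a stand-in, not pgvector itself; the tables, column names, and data are made up, and storing embeddings as JSON text is only an assumption to keep the example self-contained.

```python
import json
import math
import sqlite3

def cosine_distance(a_json, b_json):
    """Stand-in distance function: 1 - cosine similarity of two JSON-encoded vectors."""
    a, b = json.loads(a_json), json.loads(b_json)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0

conn = sqlite3.connect(":memory:")
conn.create_function("cosine_distance", 2, cosine_distance)

# Hypothetical schema: embeddings live next to ordinary relational columns.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tickets (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    created TEXT,
    embedding TEXT
);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "acme"), (2, "globex")])
conn.executemany("INSERT INTO tickets VALUES (?, ?, ?, ?)", [
    (1, 1, "2023-03-01", "[1.0, 0.0]"),
    (2, 1, "2023-03-10", "[0.9, 0.1]"),
    (3, 2, "2023-03-12", "[0.0, 1.0]"),
    (4, 2, "2022-11-05", "[1.0, 0.0]"),
])

# One statement combines similarity, a date filter, a join, and an aggregate.
QUERY = """
SELECT c.name, COUNT(*) AS similar_tickets
FROM tickets AS t
JOIN customers AS c ON c.id = t.customer_id
WHERE t.created >= ? AND cosine_distance(t.embedding, ?) < ?
GROUP BY c.name
ORDER BY similar_tickets DESC
"""
rows = conn.execute(QUERY, ("2023-01-01", json.dumps([1.0, 0.0]), 0.1)).fetchall()
print(rows)
```

With a separate vector DB, the similarity step and the join/aggregate step would have to be stitched together in application code instead of being expressed in one query.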

CuriouslyC · on March 26, 2023

You could write an FDW that reads/writes to a vector database using vectors tagged with Postgres IDs. You can write to it from Postgres, reference it in queries, join on it, etc. That cuts out a lot of the pain of having separate databases; the only remaining issues are additional maintenance overhead and hidden performance cliffs.

qwick23 · on March 26, 2023

With Postgres, you can do almost everything, including full-text search, but you still have Elasticsearch, Meilisearch, etc. for when you need performance and advanced features. The multitool approach is suboptimal in most cases.

adobrawy · on March 26, 2023

In small teams, the infrastructure often isn't fully utilized, so performance is not an issue. However, feature richness lets such a team deliver higher-level features faster. Think of an early-stage startup (one or two engineers) or a hairdresser-type business (they use a ready-made framework that targets a popular database and limits its features to reach a wide range of users). As a result, you get a lot of such businesses, creating a very long tail.

andre-z · on March 26, 2023

For small startups, it's better to just use a managed solution like Pinecone or Qdrant and not think about infra at all.

vonmoltke · on March 26, 2023

SaaS products are infrastructure. Each different SaaS used is another piece that needs to be connected to the system and maintained; it thus becomes part of the system infrastructure. Each new SaaS piece has costs (time and money) associated with it.

That said, it's up to the individual company to decide if the added cost is worth it. Just because the cost exists doesn't mean it isn't worth it.

ChocoluvH · on March 27, 2023

Open source software nowadays is very easy to use.

If your guy couldn't get a single piece of open source software straight, you had the wrong guy :(

I can only see a managed service being useful if I had 100x the traffic and a strong SLA were required.

Seattle3503 · on March 26, 2023

I'm with you there. It seems like an extension to existing DBs would be better. I would like something like this for a file-based DB like SQLite.

simonw · on March 26, 2023

sqlite-vss is an extension that adds vector search to SQLite: https://observablehq.com/@asg017/introducing-sqlite-vss

txtai · on March 26, 2023

txtai combines SQLite and Faiss to enable vector search. It also does a lot more than that.

https://github.com/neuml/txtai