
We provide this functionality in Lantern Cloud via our Lantern Extras extension: <https://github.com/lanterndata/lantern_extras>

You can generate CLIP embeddings locally on the DB server via:

  SELECT abstract,
         introduction,
         figure1,
         clip_text(abstract) AS abstract_ai,
         clip_text(introduction) AS introduction_ai,
         clip_image(figure1) AS figure1_ai
  INTO papers_augmented
  FROM papers;
Then you can search for embeddings via:

  SELECT abstract, introduction
  FROM papers_augmented
  ORDER BY clip_text(query) <=> abstract_ai
  LIMIT 10;
This approach significantly decreases search latency and results in cleaner code. As an added bonus, EXPLAIN ANALYZE can now report the percentage of time spent on embedding generation vs. search.
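For example, you might inspect the split like this (a sketch; the query string is just a placeholder):

  EXPLAIN ANALYZE
  SELECT abstract, introduction
  FROM papers_augmented
  ORDER BY clip_text('evolution of deep learning') <=> abstract_ai
  LIMIT 10;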

The linked library enables embedding generation for a dozen open-source models and proprietary APIs (list here: <https://lantern.dev/docs/develop/generate>), and adding new ones is really easy.


Lantern seems really cool! Interestingly, we did try CLIP (OpenCLIP) image embeddings, but the results were poor for 24px by 24px icons. Any ideas?

Charlie @ v0.app


I have tried CLIP on my personal photo album collection and it worked really well there: I could write detailed scene descriptions of past road trips, and the photos I had in mind would pop up. The model is probably better suited to everyday photos than to icons.


For similar isolation-level anomalies in real-world applications, check out this SIGMOD '17 paper:

ACIDRain: Concurrency-Related Attacks on Database-Backed Web Applications: http://www.bailis.org/papers/acidrain-sigmod2017.pdf


Not sure what the approach of this library is, but can't you generate a nonce from a larger alphabet, hash the column values with the nonce `hash(nonce || column)`, and crypto-shred the nonce at the end?

Then, during hashing, you just need a constant immutable state, which effectively expands the hash space without incurring the mutable-state overhead of the replacement-strings strategy.
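A minimal sketch of the idea in Postgres with pgcrypto (table and column names are hypothetical):

  -- pseudonymize values with a per-dataset nonce kept in a secrets table
  SELECT encode(digest(s.nonce || u.email, 'sha256'), 'hex') AS email_hash
  FROM users u, secrets s;

  -- crypto-shred: once the nonce is deleted, the small input domain can
  -- no longer be brute-forced to recover the original values
  DELETE FROM secrets;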


This is cool!

Does this only collect logs from the frontend, or can it also collect backend and DB latency data related to a frontend interaction?


We collect logs across the stack. Here are some docs on our backend logging integrations (we also have connectors for major cloud providers): https://www.highlight.io/docs/getting-started#for-your-backe...


My understanding is that Trusted Language Extensions refer to extensions written in PL/Rust, a Postgres extension mechanism for writing user-defined functions and using them in SQL queries.

PL/Rust is a more performant and more feature-rich alternative to PL/pgSQL, which is the traditional UDF scripting language for Postgres.
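For example, a PL/Rust UDF looks roughly like this (a sketch, assuming the plrust extension is installed; note that arguments arrive as Option values):

  CREATE FUNCTION add_one(x integer) RETURNS integer
      LANGUAGE plrust AS
  $$
      // x is an Option<i32>; map over it and wrap the result in Ok
      Ok(x.map(|v| v + 1))
  $$;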

Building a vector index (or any index, for that matter) inside Postgres is a more involved process and cannot be done via the UDF interface, be it Rust, C, or PL/pgSQL.

So, I think even if Lantern was written in Rust, it would not be a Trusted Language Extension under this definition.


(Disclosure: I work at Supabase and have been developing TLEs with the RDS team.)

Trusted Language Extensions refer to extensions written in any trusted language. In this case that's Rust, but it also includes plpgsql, plv8, etc. See [0].
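For instance, pg_tle lets you register an extension whose body is written entirely in a trusted language (a sketch; the extension name and function are made up):

  SELECT pgtle.install_extension(
    'my_demo',
    '0.1',
    'tiny demo extension',
  $_tle_$
    CREATE FUNCTION add_one(x integer) RETURNS integer
        LANGUAGE plpgsql AS $f$ BEGIN RETURN x + 1; END; $f$;
  $_tle_$
  );

  CREATE EXTENSION my_demo;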

> PL/Rust is a more performant and more feature-rich alternative to PL/pgSQL

This is only partially true. plpgsql has bindings to low-level Postgres APIs, so in some cases it is just as fast as (or faster than) Rust.

> Building a vector index (or any index for that matter) inside Postgres is a more involved process and can not be done via the UDF interface, be it Rust, C or PL/pgSQL

Most PG Rust extensions are written with the excellent pgrx framework [1]. While it doesn't have index bindings right now, I can certainly imagine a future where this is possible [2].

All that said, I think there are a lot of hoops to jump through right now, and I doubt it's worth it for your team. I think you are right to focus on developing a separate C extension.

[0] TLE: https://supabase.com/blog/pg-tle

[1] pgrx: https://github.com/pgcentralfoundation/pgrx

[2] https://github.com/pgcentralfoundation/pgrx/issues/190#issue...


Cofounder here.

You are right that there are many trade-offs between HNSW and IVFFLAT.

E.g., IVFFLAT requires a significant amount of data in the table before the index is created, and it assumes the data distribution does not change with additional inserts (since it chooses centroids during initial creation and never updates them).
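For context, that's because pgvector's IVFFLAT computes its centroids from whatever rows exist at build time (sketch; table and opclass are illustrative):

  -- the 'lists' centroids are derived from rows present when the index
  -- is built, so the table should already be populated and representative
  CREATE INDEX ON items
      USING ivfflat (embedding vector_cosine_ops)
      WITH (lists = 100);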

We have also generally had a harder time getting high recall with IVFFLAT on vectors from embedding models such as ada-002.

There are trade-offs, some of which we will explore in later blog posts.

This post is about one thing: HNSW index creation time across two systems, at a fixed 99% recall.


External index creation also requires a significant amount of data in the table for it to be worthwhile, along with all the other potential issues.


It is strange that there is no mention of Ruben Vardanyan, even though the FAST foundation is mentioned.

But still cool to read about Armenia's tech sector here.


We have not run microbenchmarks to see which dimension ranges perform best, but those are coming soon! Below is an anecdotal answer:

We run our CI/CD benchmarks on 128-dimensional SIFT vectors. We have some demos using CLIP embeddings (512-dimensional) and BAAI/bge 768-dimensional embeddings.

Generally, smaller vectors allow higher throughput and result in smaller indexes, but the effect on performance is small. Once we merge the PR implementing vector element casts to 1- and 2-byte floats, that effect on throughput should be even smaller.


This sounds like a very useful feature, and we will prioritize it.

You're correct that IVFFLAT would be faster for your use case. However, IVFFLAT's shortcoming is poor recall, which means less relevant results for your application. We believe that our HNSW implementation (or other indexes) can handle use cases like yours.

We currently handle a similar use case by rerunning our index searches with exponentially increasing LIMITs and dropping the results that are not needed (sketched below). Could you send us an email at support@lantern.dev? We can generate the numbers by this weekend and get back to you with concrete results.
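Roughly, the pattern looks like this (a sketch with placeholder names; :query_vec stands in for the query embedding):

  -- fetch a batch, apply the post-filter, and if too few rows survive,
  -- rerun with LIMIT 20, 40, 80, ... until enough results remain
  SELECT id
  FROM docs
  ORDER BY embedding <=> :query_vec
  LIMIT 10;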

By the way, not sure if you saw it in our blog post: if you're using pgvector in production and switch to Lantern, we'll help you every single step of the way. It's very quick, and we'll also send you some free AirPods Pro at the end of it!


Our index access method will be called lantern_hnsw if pgvector or any other provider has already taken the hnsw access method name.

Btw, we did not create our own vector type; we just use size-enforced real[] arrays to represent embeddings. However, you can use our index with pgvector's vector type. So if you already have a table with pgvector's vector column type, you can start using Lantern by just creating an index on that column (see the sketch below).
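For example (a sketch; dist_cos_ops is assumed here, and the table/column names are placeholders):

  -- embedding is a pgvector 'vector' column; index it with Lantern
  CREATE INDEX ON papers
      USING lantern_hnsw (embedding dist_cos_ops);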

