PaliGemma (ai.google.dev)
145 points by tosh 14 days ago | 15 comments


This is an impressive amount of public AI work coming out of Google. The competition we're seeing here is really pushing things forward.


Anyone here have experience extracting image embeddings from these models? All the image embedding models I've tried so far were quite bad for my use cases, and I suspect the hidden representations of models like these might be much better.


Have you tried CLIP image embeddings?
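For reference, a minimal sketch of pulling CLIP image embeddings with Hugging Face transformers. The checkpoint name is the standard base model; the image path is a placeholder:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor


def clip_image_embedding(image_path: str) -> torch.Tensor:
    """Return one CLIP image embedding (shape (1, 512) for the base model)."""
    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
    inputs = processor(images=Image.open(image_path), return_tensors="pt")
    with torch.no_grad():
        emb = model.get_image_features(**inputs)
    # L2-normalise so cosine similarity reduces to a dot product
    return emb / emb.norm(dim=-1, keepdim=True)
```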

Yes, that's what I am mainly trying to replace, as the performance is just not there for my needs.

Just from the name, my mind raced to LLMs trained on the Pali canon.

I had the same assumption!

It refers to images, but would that extend to diagrams, like engineering drawings?


How does this model compare to the 3B Gemma if I were to use it only for text?


Well, to start with, there is no regular 3B Gemma. There are 2B and 7B Gemma models. I would guess this model is adding an extra 1B parameters to the 2B model to handle visual understanding.

The 2B model is not very smart to begin with, so… I would expect this one to not be very smart either if you only use it for text, but I wouldn’t expect it to be much worse. It could potentially be useful/interesting for simple visual understanding prompts.


Anyone found a good recipe to run this on a Mac yet?


You can run it by installing transformers from source: https://huggingface.co/google/paligemma-3b-mix-448

Have you seen that work on a Mac? I've had very bad luck getting anything complex to work with transformers on that platform.

Yes, I was able to run inference on the unquantized model in CPU land on Apple Silicon.

Is this related to Project Astra?

Google markets its new tech like arXiv articles. They have a lot to learn from OpenAI.