
I never understood the point of the pelican-on-a-bicycle exercise: an LLM coding agent doesn't have any way to see the output. That means the only thing this test measures is the LLM's ability to memorise.

Edit: just to illustrate my point, a regular human on a bicycle comes out way worse with the same model: https://i.imgur.com/flxSJI9.png





Because it exercises thinking about a pelican riding a bike (not a common subject) and then describing that using SVG. It's quite nice imho and seems to scale with the power of the model. I'm sure Simon has his own actual reasons, though.

> Because it exercises thinking about a pelican riding a bike (not a common subject)

It is extremely common, since it's used to benchmark every single LLM.

And there is no reasoning involved: LLMs are never trained on graphics tasks, and they don't see the output of the code they write.


I mean that real-world examples of a pelican riding a bike are not common. It's common in benchmarking LLMs, but that's not what I meant.

The only thing it exercises is the ability of the model to recall its pelican-on-bicycle and other SVG training data.

It's more for fun than as a benchmark.

It also measures something LLMs are good at probably only because of cheating.

I wouldn't say any LLMs are good at it. But it doesn't really matter, it's not a serious thing. It's the equivalent of "hello world" - or whatever your personal "hello world" is - whenever you get your hands on a new language.

Memorise what exactly?

The coordinates and shapes of the elements used to form a pelican. If you think about how LLMs ingest their data, they have no way to know how to draw a pelican in SVG.

I bet their ability to form a pelican results purely from someone having already done it before.
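
To make concrete what "knowing the coordinates and shapes" means, here's a minimal hand-written sketch of the kind of SVG the model has to emit. Every element, coordinate, and colour below is made up for illustration; it is not taken from any model's output:

    <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 120">
      <!-- bicycle: two wheels and a rough frame -->
      <circle cx="50" cy="90" r="25" fill="none" stroke="black"/>
      <circle cx="150" cy="90" r="25" fill="none" stroke="black"/>
      <path d="M50 90 L90 60 L150 90 M90 60 L110 60" stroke="black" fill="none"/>
      <!-- pelican: body, head, beak -->
      <ellipse cx="95" cy="45" rx="20" ry="12" fill="white" stroke="black"/>
      <circle cx="115" cy="30" r="7" fill="white" stroke="black"/>
      <path d="M122 30 L145 35 L122 38 Z" fill="orange" stroke="black"/>
    </svg>

Picking those numbers so the parts line up is exactly the spatial judgement in question.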


> If you think about how LLMs ingest their data, they have no way to know how to draw a pelican in SVG.

It's called generalization and yes, they do. I bet you could find plenty of examples of it working on something that truly isn't "present in the training data".

It's funny, you're so convinced that it's not possible without direct memorization, but you forgot to account for emergent behaviors (which are frankly all over the place in LLMs; where have you been?).

At any rate, the pelican thing from simonw is clearly just for fun at this point.





