A lot of things are called "world models" that I would consider just "models" so...

gjm11 · 2023-05-30T22:36:06

I agree that the Othello paper isn't, and couldn't be, strong evidence about what sort of model of the world (if any) something like GPT-4 has. However, I think it is (importantly) pretty much a refutation of all claims along the lines of "these systems learn only from text, therefore they cannot have anything in them that actually models anything other than text", since the paper's model learned only from text (transcripts of game moves) and seems to have developed something very much like a model of the state of the game.

Again, it doesn't say much about how good a model any given system might have. The world is much more complicated than an Othello board. GPT-4 is much bigger than their transformer model. Everything they found is consistent with anything from "as it happens GPT-4 has no world model at all" through to "GPT-4 has a rich model of the world, fully comparable to ours". (I would bet heavily on the truth being somewhere in between, not that that says very much.)

skybrian · 2023-05-31T00:00:54

Yeah, I think it leaves us back at square one: we don't know much about how it really works.

I don't follow it too closely, but I've seen papers on mechanistic interpretability that look promising.

nopinsight · 2023-05-31T03:18:50

I'd say stronger evidence than the Othello paper is the ability to answer what-if questions coherently and plausibly.

I just asked GPT-4 "What would happen to Northern Thailand if elephants behave like kangaroos?"

The answers are probably better than what 90+% of humans could give after spending an hour researching on the internet.

The Sparks of AGI paper provides much more evidence and examples: https://arxiv.org/abs/2303.12712