A lot of things are called "world models" that I would consider just "models", so it depends on what you mean by that. But what do you consider to be strong evidence? The Othello paper isn't what I'd call strong evidence.
I agree that the Othello paper isn't, and couldn't be, strong evidence about what sort of model of the world (if any) something like GPT-4 has. However, I think it is (importantly) pretty much a refutation of all claims along the lines of "these systems learn only from text, therefore they cannot have anything in them that actually models anything other than text": the paper's model learned only from text, and it seems to have developed something very much like a model of the state of the game.
Again, it doesn't say much about how good a model any given system might have. The world is much more complicated than an Othello board. GPT-4 is much bigger than the Othello paper's transformer model. Everything the authors found is consistent with anything from "as it happens GPT-4 has no world model at all" through to "GPT-4 has a rich model of the world, fully comparable to ours". (I would bet heavily on the truth being somewhere in between, not that that says very much.)