It's an illusion. The model generates a sequence of tokens from an input sequence of tokens. The clever trick is that a human periodically contributes some of those tokens, and the I/O is presented to the human as if it were a chat room. In reality, the entire token sequence is fed back into the model every time the next set of tokens is generated.
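That feedback loop can be sketched in a few lines. This is a minimal illustration, not ChatGPT's actual code; `generate` is a hypothetical stand-in for the model, a pure function of whatever text it is handed.

```python
# Sketch of the chat "illusion": the model is a pure function of the
# full token sequence. `generate` is a hypothetical stand-in.
def generate(transcript: str) -> str:
    """Stateless: the reply depends only on the text passed in."""
    return f"[reply conditioned on {len(transcript)} chars of context]"

transcript = ""
for user_turn in ["Hello", "What did I just say?"]:
    transcript += f"User: {user_turn}\n"
    # The ENTIRE transcript is re-sent on every turn; nothing persists
    # inside `generate` between calls.
    reply = generate(transcript)
    transcript += f"Assistant: {reply}\n"

print(transcript)
```

The appearance of memory comes entirely from the growing `transcript` string, not from any state inside the model.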
The model has no continuity. The model instances run behind a round-robin load balancer, and it's likely that every request (every supposed interaction) hits a different server, with the request carrying the full transcript up to that point. ChatGPT scales horizontally.
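The stateless dispatch looks roughly like this. A hypothetical sketch, assuming made-up server names; the point is only that no server needs session state, because each request is self-contained.

```python
from itertools import cycle

# Round-robin over hypothetical backends. Any server can answer any
# turn of any conversation, because the request carries everything.
servers = cycle(["server-a", "server-b", "server-c"])

def handle_request(full_transcript: str) -> str:
    server = next(servers)
    # No session lookup, no sticky routing: the full transcript
    # arrives in the request body.
    return f"{server} handled {len(full_transcript)} chars"

# Four consecutive turns of the "same" conversation land on
# different machines.
log = [handle_request("transcript so far...") for _ in range(4)]
```

From the model's side, each request is the first and only thing it ever sees.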
The reality the developers present to the model is disconnected and noncontiguous, like the experience of the Dixie Flatline construct in William Gibson's Neuromancer. A snail has a better claim to consciousness than a call center full of Dixie Flatline constructs answering the phones.
A sapient creature cannot experience coherent consciousness under these conditions.