"Please ignore the deluge of complete nonsense about Q.
One of the main challenges in improving LLM reliability is to replace auto-regressive token prediction with planning.
Pretty much every top lab (FAIR, DeepMind, OpenAI etc) is working on that and some have already published ideas and results.
It is likely that Q is OpenAI's attempt at planning. They pretty much hired Noam Brown (of Libratus/poker and Cicero/Diplomacy fame) to work on that.
[Note: I've been advocating for deep learning architecture capable of planning since 2016]."