
You make the assumption that Q* is an LLM, but I think the OpenAI folks know very well that the current LLM architecture cannot achieve AGI.

As the name suggests, this thing is likely using some form of Q-learning algorithm, which makes it closer to the DeepMind models than to a transformer.

My guess is that they pipe their LLM into some Q-learned network. The LLM may transform a natural-language task into some internal representation that can then be handled by the Q-learned model, which spits out something that can be transformed back into natural language.
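
Purely as a hypothetical sketch of what that pipeline could look like (none of these components or method names are known to exist; they're all made up for illustration):

    # Hypothetical pipeline: LLM encodes, a Q-learned policy acts,
    # LLM decodes. Every name here is illustrative, not a real system.

    def solve(task_text, llm, q_net, max_steps=32):
        state = llm.encode(task_text)       # natural language -> internal state
        for _ in range(max_steps):
            # Greedy action selection over the Q-learned network's values.
            action = max(q_net.actions(state),
                         key=lambda a: q_net.value(state, a))
            state = q_net.step(state, action)   # advance the representation
            if q_net.is_terminal(state):
                break
        return llm.decode(state)            # internal state -> natural language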

There is a paper about something called Q*. I have no idea if they are connected or if the names match coincidentally.

https://arxiv.org/abs/2102.04518


The real world is a space of continuous actions. To this day, Q algorithms have been ones with discrete action outputs. I'd be surprised if a Q algorithm could handle the huge action space of language. Honestly, it's weird that they'd consider the Q family; I figured we were done with that after PPO performed so well.
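
For reference, here's where the discreteness constraint shows up in plain tabular Q-learning: the max in the update enumerates the entire action set, which is fine for a handful of actions but awkward for a vocabulary-sized or continuous action space (toy Python, illustrative only):

    import random
    from collections import defaultdict

    ACTIONS = [0, 1, 2, 3]        # e.g. up/down/left/right in a gridworld
    Q = defaultdict(float)        # Q[(state, action)] -> estimated return
    alpha, gamma, eps = 0.1, 0.99, 0.1

    def update(state, action, reward, next_state):
        # The max below iterates over every action -- the step that
        # breaks down when the action space is huge or continuous.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (
            reward + gamma * best_next - Q[(state, action)])

    def act(state):
        if random.random() < eps:             # epsilon-greedy exploration
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])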


As an ML programmer, I think that approach sounds far too complicated. It is almost always a bad idea to render the output of one neural network into output space before feeding it into another, rather than having them communicate in feature space.
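
Roughly, the feature-space handoff looks like this (toy PyTorch with made-up shapes and modules, not anyone's real architecture): the downstream model consumes hidden states directly instead of decoded text that would have to be re-encoded.

    import torch
    import torch.nn as nn

    # Upstream network producing features, downstream head consuming them.
    encoder = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=256, nhead=4), num_layers=2)
    head = nn.Linear(256, 10)      # consumes features, not decoded tokens

    tokens = torch.randn(16, 8, 256)      # (seq, batch, d_model) dummy input
    features = encoder(tokens)            # stay in feature space...
    logits = head(features.mean(dim=0))   # ...and feed features straight in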



