What Is Q*? [video] (youtube.com)
5 points by peter_d_sherman 10 months ago | 1 comment

Specific points in the video:

"What is Q*?":


"LLMs [AIs] today still don't reason very well":


Peter Liu, GSM8K/STaR tweet:


"STaR: Bootstrapping Reasoning With Reasoning"


(arXiv:2203.14465, Computer Science > Machine Learning; Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman, 2022)

>"We propose a technique to iteratively leverage a small number of rationale examples and a large dataset without rationales, to bootstrap the ability to perform successively more complex reasoning. This technique, the "Self-Taught Reasoner" (STaR), relies on a simple loop: generate rationales to answer many questions, prompted with a few rationale examples; if the generated answers are wrong, try again to generate a rationale given the correct answer; fine-tune on all the rationales that ultimately yielded correct answers; repeat.


Thus, STaR lets a [AI] model improve itself by learning from its own generated reasoning."
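
For anyone who wants to see the shape of that loop, here is a minimal sketch in Python. It is not the paper's code. `ToyModel`, `fine_tune`, and `star` are hypothetical names made up for illustration, and the "model" is just a set of arithmetic operators it knows how to apply, standing in for a real language model. The only thing it is meant to show is the control flow from the abstract: generate a rationale, retry with the correct answer as a hint if the first attempt was wrong, keep only the rationales that ended in the correct answer, fine-tune, repeat.

```python
from typing import List, Optional, Set, Tuple

# (question, rationale, answer) triple kept as fine-tuning data
Example = Tuple[str, str, int]

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


class ToyModel:
    """Hypothetical stand-in for a language model: it 'knows' some operators."""

    def __init__(self, skills: Set[str]):
        self.skills = set(skills)

    def generate(self, question: str, hint: Optional[int] = None) -> Tuple[str, Optional[int]]:
        """Return (rationale, answer); answer is None if the model has no idea."""
        a, op, b = question.split()
        if op in self.skills:
            answer = OPS[op](int(a), int(b))
        elif hint is not None:
            # Rationalization: the correct answer is given, so the toy model
            # simply writes a rationale that arrives at it.
            answer = hint
        else:
            return f"{a} {op} {b} = ?", None
        return f"{a} {op} {b} = {answer}", answer


def fine_tune(model: ToyModel, kept: List[Example]) -> ToyModel:
    """Toy 'fine-tuning': learn every operator that appears in the kept data."""
    learned = {question.split()[1] for question, _, _ in kept}
    return ToyModel(model.skills | learned)


def star(model: ToyModel, dataset: List[Tuple[str, int]], iterations: int) -> ToyModel:
    for _ in range(iterations):
        kept: List[Example] = []
        for question, gold in dataset:
            rationale, answer = model.generate(question)
            if answer != gold:
                # Wrong (or no) answer: try again, this time given the correct answer.
                rationale, answer = model.generate(question, hint=gold)
            if answer == gold:
                # Keep only rationales that ultimately yielded the correct answer.
                kept.append((question, rationale, gold))
        model = fine_tune(model, kept)
    return model


dataset = [("2 + 3", 5), ("4 * 5", 20)]
base = ToyModel({"+"})
tuned = star(base, dataset, iterations=2)
```

In this toy setup `base` cannot answer "4 * 5" on its own, while `tuned` can, because the hinted rationale for that question made it into the fine-tuning data. A real implementation would swap `ToyModel.generate` for few-shot prompting of an actual model and `fine_tune` for actual training.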

