Smells like reinforcement learning in real life. Master sets ups the task enviro... | Hacker News

Hacker News new | past | comments | ask | show | jobs | submit

login

		aldanor 2 days ago \| parent \| context \| favorite \| on: Focus on decisions, not tasks Smells like reinforcement learning in real life. Master sets ups the task environment, collects samples from students, picks the best and maybe even augments them. Students watch the master and learn... and the cycle continues. And then the master becomes a grandmaster (unless entropy explosion occurs).

gtirloni 2 days ago [–]

Yea, and it's what ChatGPT does sometimes even if you don't ask it for multiple options. It shows two options and asks you to select the best one. Essentially using customers to help with training.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact