Hacker News | DicksonX's comments

I think it's an obvious first step toward AGI that isn't limited to fine-tuning with static biases and human feedback. It's an idea I've had in mind for the last two to three years: take trees of thoughts, chain them, and use a massive Q-learning table to find the best path for decision making. It seems like a common-sense concept and has been a known idea for a long time. OpenAI is now moving from static rewards to dynamic rewards. That's a step toward AGI, and agents will converge on the truth by themselves. A good step in mimicking us.
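For what it's worth, here's a toy sketch of the idea as I read it: tabular Q-learning over chained "thoughts", where the learned Q-values pick the best path through the tree. Everything here (the thought set, the reward, the hyperparameters) is illustrative and my own assumption, not anything from OpenAI:

```python
import random
from collections import defaultdict

# Toy setup: a "state" is a partial chain of thoughts; an "action" appends one
# more thought. The Q-table learns which thought to append next so the final
# chain maximizes a reward signal.

THOUGHTS = ["A", "B", "C"]   # candidate next thoughts (illustrative)
MAX_DEPTH = 3                # length of a complete thought chain

def reward(chain):
    # Toy reward: count positions matching the target chain ("A", "B", "C").
    return sum(1.0 for got, want in zip(chain, ("A", "B", "C")) if got == want)

def q_learn(episodes=5000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)   # Q[(state_tuple, action)] -> estimated value
    for _ in range(episodes):
        chain = ()
        while len(chain) < MAX_DEPTH:
            # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
            if rng.random() < eps:
                a = rng.choice(THOUGHTS)
            else:
                a = max(THOUGHTS, key=lambda t: Q[(chain, t)])
            nxt = chain + (a,)
            terminal = len(nxt) == MAX_DEPTH
            r = reward(nxt) if terminal else 0.0
            best_next = 0.0 if terminal else max(Q[(nxt, t)] for t in THOUGHTS)
            # Standard Q-learning update toward r + gamma * max_a' Q(s', a').
            Q[(chain, a)] += alpha * (r + gamma * best_next - Q[(chain, a)])
            chain = nxt
    return Q

def best_chain(Q):
    # Greedily follow the highest-Q action at each step of the tree.
    chain = ()
    while len(chain) < MAX_DEPTH:
        chain += (max(THOUGHTS, key=lambda t: Q[(chain, t)]),)
    return chain

Q = q_learn()
print(best_chain(Q))
```

The point is just that the "best path" selection the comment describes falls out of a plain value table once you treat each partial chain of thoughts as a state; no claim that this is how any production system actually works.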


