Reinforcement learning trains a model against a reward function. The suggestion is that real-world accountability could be translated into such a reward function.

Also, OP explicitly mentioned "online learning", which means training continues after standard pre-training instead of stopping there.

For what it's worth, I don't think this would work. Rewards would come in too sporadically to be useful.
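
To make the sparsity concern concrete, here is a toy sketch of my own (not anything OP proposed). It assumes a learner that keeps a single running value estimate and can only update on steps where feedback actually shows up. The names `online_updates` and `feedback_prob`, the learning rate, and the fixed reward of 1.0 are all made up for illustration.

```python
import random


def online_updates(n_steps, feedback_prob, lr=0.1, seed=0):
    """Toy online learner that only updates when a reward signal arrives.

    feedback_prob is the (assumed) chance that any given step produces
    real-world feedback. Returns (final_estimate, number_of_updates).
    """
    rng = random.Random(seed)
    true_reward = 1.0  # hypothetical value of the feedback when it arrives
    estimate = 0.0     # learner's running estimate of that value
    updates = 0
    for _ in range(n_steps):
        # Most steps produce no feedback, so there is nothing to learn from.
        if rng.random() < feedback_prob:
            estimate += lr * (true_reward - estimate)
            updates += 1
    return estimate, updates


if __name__ == "__main__":
    for p in (1.0, 0.01, 0.001):
        est, n = online_updates(1000, p)
        print(f"feedback_prob={p}: {n} updates, estimate={est:.3f}")
```

With feedback on every step the estimate converges quickly. With feedback on roughly one step in a thousand, the learner gets only a handful of updates over the same number of interactions, which is the problem I mean.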





