Hacker News new | past | comments | ask | show | jobs | submit login

How do you propose that punishment make the AIs "get better"? From their perspective they're already as good as they can be, based on their training.





Reinforcement Learning can train a model based on some reward function. The suggestion is that real-world accountability could be translated into such a reward function.

Also, OP explicitly mentioned "online learning", which is a continuous training process after standard pre-training.

For what it's worth, I don't think this would work. Rewards would come in too sporadically to be useful.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: