Hacker News new | past | comments | ask | show | jobs | submit login

One could argue that "cheating" is a valid solution.



Only when the learning environment is the same environment it will use when deployed. Agents for robots are often trained in simulation, where it's a common problem for the agent to exploit a physics bug in the simulator.




Applications are open for YC Winter 2020

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: