No it's not. They ran it longer instead.



The 2022 paper pretty explicitly says that runtime is not a substitute. They say their best result "can only be achieved in our 8-GPU setup".


I assume you mean Fig. 6 here?[0]

But that figure was explicitly limited to 8 hours for all setups. Do they have another paper showing that you can't run a smaller GPU setup for more hours to compensate?

[0] https://dl.acm.org/doi/pdf/10.1145/3505170.3511478
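
To spell out the compute-matching idea behind that question: naively, total compute is GPUs times hours, so fewer GPUs could in principle be offset by more wall-clock time. A minimal sketch of that arithmetic (the 4-GPU setup is hypothetical, and whether RL training actually scales this way is exactly what's in dispute):

    # Naive GPU-hour matching; the smaller setup below is hypothetical.
    paper_gpu_hours = 8 * 8                  # paper's setup: 8 GPUs for 8 hours
    smaller_gpus = 4                         # assumed smaller setup
    hours_to_match = paper_gpu_hours / smaller_gpus
    print(hours_to_match)                    # 16.0 hours to equal the raw GPU-hours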


They also changed the ratio of RL experience collectors to GPU workers (~1/20th the RL experience collectors, 1/2 the GPUs). I don't know what impact that has; maybe each GPU episode has less experience? Maybe that makes for an effectively smaller batch size and therefore more chaotic training? But either way, why change things when you can just match them exactly?
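
To make that concrete, here's a rough sketch of why the ratio matters (the collector counts and steps-per-collector below are made up for illustration; the thread doesn't give the actual numbers):

    # Hypothetical numbers: how the collector-to-GPU ratio changes the
    # amount of experience each GPU sees per training step.
    def experience_per_gpu(num_collectors, num_gpus, steps_per_collector=64):
        return num_collectors * steps_per_collector / num_gpus

    original = experience_per_gpu(num_collectors=160, num_gpus=8)   # 1280.0
    replication = experience_per_gpu(num_collectors=8, num_gpus=4)  # 128.0
    # ~1/20th the collectors but only 1/2 the GPUs -> ~10x less
    # experience per GPU, i.e. an effectively smaller batch.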



