
> Scientists do this stuff on purpose to make their papers harder to read.

Not just that, but if your paper has math the reviewer doesn't understand, they're more likely to assume the work is rigorous. It's not like they read it anyway.

> and training data be released for ML papers.

Other than checkpoints and hyper-parameters, what do you want? The wandb logs? I do try to encourage people to save all relevant training parameters in checkpoints (I personally do). This even includes seeds.
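The habit described here, saving all relevant training parameters (including seeds) inside the checkpoint itself, can be sketched framework-free. This is a minimal illustration, not any particular library's checkpoint format; `make_checkpoint` and all its field names are hypothetical.

```python
import json
import random

def make_checkpoint(model_state, args, seed, step):
    """Bundle the weights with everything needed to rerun training.

    Hypothetical helper: the point is that a checkpoint should carry
    the full training config, not just the weights.
    """
    return {
        "model_state": model_state,      # the weights themselves
        "args": args,                    # all hyper-parameters used
        "seed": seed,                    # seed training started from
        "step": step,                    # current iteration
        "rng_state": random.getstate(),  # live RNG state at save time
    }

random.seed(0)
ckpt = make_checkpoint(
    model_state={"w": [0.1, 0.2]},
    args={"lr": 1e-3, "batch_size": 64},
    seed=0,
    step=1000,
)

# Everything except the live RNG state round-trips through JSON:
meta = {k: v for k, v in ckpt.items() if k != "rng_state"}
print(json.dumps(meta, sort_keys=True))
```

A few extra dictionary entries at save time cost nothing and spare the next person from reverse-engineering the run.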




> what do you want?

Hyperparameters yes, but also the data used for training. I should be able to reproduce the checkpoint bit-for-bit by training from scratch. If their training process is not deterministic, also release the random seed used.
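Seeded reproduction is easy to demonstrate with a toy loop. This is a sketch only: `train` is a hypothetical stand-in, and real frameworks additionally need their own seeds set and deterministic kernels enabled before two runs match bit-for-bit.

```python
import random

def train(seed, steps=100):
    """Toy 'training' loop: with the seed fixed, the final value is
    bit-for-bit reproducible across runs. Hypothetical stand-in for
    a real training script."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        w += rng.uniform(-1, 1) * 0.01  # stand-in for a gradient step
    return w

# Two from-scratch runs with the released seed match exactly:
assert train(seed=42) == train(seed=42)
# A different seed gives a different checkpoint:
assert train(seed=42) != train(seed=43)
```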


Oh yeah, I agree with that. I'm kinda upset that Google keeps pushing papers that pre-train on JFT (and 30 different versions of it) and draw conclusions from it. That isn't really okay for publication, and it breaks double blind too. I'd be okay if, say, CVPR required training on public datasets, with proprietary data only added after acceptance (but you've seen my views on these venues anyways).

All ML training is non-deterministic. That's kinda the point. But yeah, people should include seeds AND random states. People forget the latter. I also don't know why people don't just throw args (including current iteration and important metrics) into their checkpoints. We share this frustration.


Being pseudorandom is often the point. That's very far from deliberately being nondeterministic.



