*> what good is it if it's not reproducible?* I am one of those people who *actu...

KirinDave · on June 5, 2019

> High-quality or easy to install code is not necessary for a result to be reproducible. True reproduction would mean coding the algorithm from scratch by following the written description, and that's what it literally means in most other fields of science.

Which failed here. However, you're going to have a difficult time convincing many people that the attributes you described would be bad properties, just that they might not justify their expense.

> You can't download & install a Large Hadron Collider in an afternoon. Does that mean the LHC experiments are "not reproducible"? Of course not.

By the same token though, the LHC repeats experiments and solicits feedback on how to improve their methods, which they go to great lengths to publish and simulate, because they're aware of this problem.

> IMO, the "blindly rerun the code" definition of reproduction is actually a HUGE barrier to creating a true culture of reproducibility in computer science. It results in super lazy reviewing where "public source code that's easy to install and puts the correct-sounding shit into STDOUT" becomes a stand-in for "paper actually describes a novel idea in enough detail that it can be truly reproduced".

Ah, yes. Yes. "If this code is TOO reproducible then people might reproduce it, and handwave handwave the quality of papers would decline."

That's certainly NOT the case in pure CS papers, which have only improved since the days when folks felt that "Lenses, Bananas and Barbed Wire" was how folks should go about writing papers.

Now, physics might be different. But there is surely a middle ground between "I've shipped you an LHC, just plug it in lol" and "This paper doesn't even remotely describe how we achieved the results."

If you believe that wasting the time of scientists is bad, then surely you're for clear papers with accurate descriptions of the methods so that those who go and reproduce your work are not sent on wild goose chases?

> As a scientist who has actually done that leg work, I don't think packaging code so that it runs with a single click is the best use of public money in science in 99.9999% of cases.

No, we got that part. But surely someone does think it's a good use of their time, and maybe you can design your work to leverage that rather than reproducing and discarding scaffolding. My big concern here is that a lot of scientists (like you claim to be) are underqualified and unpracticed at software, and thus are surely seeing at least some aspect of their work distorted by software and hardware issues.

> Which I guess is just another way of saying that scientists should spend their time on science, not engineering.

Scientists are not going to be able to escape engineering. No one else is going to build what they need besides them.

> please remember who's going to be doing the actual work you're demanding. It's mostly phd students who make $30K/yr. And they have to do this work in their free time because their 60 hr/wk day job is fully allocated to doing the actual science.

Yeah, I'm aware. I suspect their lot would be better if your attitude wasn't that their work is disposable and unimportant.