Why are we singling out climate scientists here? Of the three articles linked, only the one from RealClimate was solely about climate scientists; the other two make it more than clear that these issues span the full scientific spectrum.

And why dismiss so casually the argument that running the code used to generate a paper's result provides no actual independent verification of that result? How does running the same buggy code and getting the same buggy result help anyone? As long as a paper describes its methods in enough detail that someone else can write their own verification code, I would actually argue that it's better for science for the accompanying code to not be released, lest a single codebase's bugs propagate through a field.
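To make that concrete, here's a toy sketch (hypothetical data and function names, nothing from the linked articles). Re-running the author's script just reproduces its bug; only an implementation written independently from the paper's description exposes it:

    # Toy example: a paper reports the mean of some station readings,
    # but the author's code accidentally drops the first reading.
    readings = [0.3, 0.5, 0.1, 0.7]  # made-up data

    def authors_mean(xs):
        # Bug: skips xs[0] but still divides by the full length.
        return sum(xs[1:]) / len(xs)

    def independent_mean(xs):
        # Written from the paper's prose description of the method.
        return sum(xs) / len(xs)

    print(authors_mean(readings))      # 0.325
    print(authors_mean(readings))      # 0.325 -- same code, same wrong answer
    print(independent_mean(readings))  # 0.4   -- the disagreement is the signal

Anyone who "verifies" the paper by re-running the first function learns nothing; the bug only shows up when an independent implementation disagrees.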

The real problem, if there is one here, is the idea that a scientist's career could go anywhere if their results aren't being independently validated. A person with a result that only they (or their code) can produce just isn't a scientist, and their results should never get paraded around until they're independently verified.




Why are we singling out climate scientists here?

Because this recent rash of articles is a result of "ClimateGate". Clearly the issues raised are more general.

And why dismiss so casually the argument that running the code used to generate a paper's result provides no actual independent verification of that result? How does running the same buggy code and getting the same buggy result help anyone?

I think it's a bogus argument because it's one scientist deciding to protect another scientist from doing something silly. I like your argument about a codebase's bugs propagating, but I don't buy it. If you look at CRUTEM3 you'll see that hidden, buggy code from the Met Office has resulted in erroneous _data_ propagating through the field even though a detailed description of the algorithm was available (http://blog.jgc.org/2010/04/met-office-confirms-that-station...). It would have been far easier to fix that problem had the source code been available. It was only when an enthusiastic amateur (myself) reproduced the algorithm in the paper that the bug was discovered.
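For anyone curious what that kind of check looks like, it's roughly this shape (a hypothetical sketch; the real CRUTEM3 formats and algorithm are more involved than a plain average, and these names and numbers are made up):

    # Recompute a published value from raw station data per the
    # paper's description, then compare against the released file.
    published = {("cell_A", 1998): 0.412}  # value read from the released data

    stations = {("cell_A", 1998): [0.35, 0.40, 0.45]}  # raw station anomalies

    def recompute(values):
        # Independent implementation of the documented algorithm
        # (simplified here to an average of station anomalies).
        return sum(values) / len(values)

    for key, vals in stations.items():
        mine = recompute(vals)
        if abs(mine - published[key]) > 1e-3:
            print(key, "recomputed", round(mine, 3), "published", published[key])

When the independently computed number and the released number disagree, one of the two implementations has a bug, and that's exactly the kind of discrepancy that surfaced here.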


It was only when an enthusiastic amateur (myself) reproduced the algorithm in the paper that the bug was discovered.

But that's the actual problem: nobody else tried to verify the data before accepting it into the field. If you could reproduce the algorithm from the paper without the source code, why couldn't they?

And while releasing the code might have meant the Met Office's own code got fixed faster, I don't buy that having the code available would necessarily have meant the errors in the resulting data were discovered faster. That would imply that people would actually have dived into the code looking for bugs, but we've already established that the people in the field are bad programmers who feel they have more interesting things to do. Why isn't it just as plausible that they would have run the code, seen the same buggy result, and labored under the impression they had verified something?


I'm torn on this issue, but I certainly don't think the argument that giving out the code will decrease the chances of independent verification is "bogus", and it's not about "protecting" anyone from anything.

Writing your own code for anything but trivial analysis is a huge time sink. If I can take someone else's code instead of writing my own, I'll do so. There is a very real chance that making all code public would seriously increase consolidation and decrease independent verification. (Independent verification is a problem anyway: funding agencies are unlikely to fund redoing the same experiment, and journals are less likely to publish replications.)



