I can't count how many times I've been shown results and, when I asked how to reproduce them, only after several notes (and sometimes complaints to higher-ups) eventually received a series of command-line arguments or a barely functioning R script.
These conclusions are too important to be so sloppily produced. We need verification, validation and uncertainty quantification for any result provided to decision makers.
I was very happy to have that background when I took part in developing statistical models used for wind hazard analysis of nuclear power plants in my first job out of college.
Which is to say, I think starting with the idea that you're aiming for 10x is pernicious and tends to create dysfunctional teams. The claim that some developers in some circumstances are ten times more productive than others may or may not be true, but software development needs processes whose goal is to help an entire team rather than to lift an individual to that "level".
Developers should strive to better themselves, but it's important not to fool yourself, too. Having a strong team is almost always better from a business point of view.
I shun "rockstar" and "10x" (and whatever other bullshit moniker they will come up with next) team members. Give me a group of smart people that gel well together, and are highly self-confident without egos getting into the way, and we can move mountains.
And this definitely never happened even after the people working on tools to help said analysts produce reproducible results encouraged the analysts to use said tools... No, that would be crazy.
Reproducibility, however, is, as you say, an issue far more particular to data science, and worth more serious consideration and discussion. Hand-in-hand with that goes shareability. I'm a fan of what Airbnb has open-sourced to address some of those issues in their knowledge-repo project: https://github.com/airbnb/knowledge-repo
Docker could be that stable build process, but it requires the additional assertion that there wasn't, say, a truckload of changes made using `docker exec`, or a bunch of customizations to files that were copied into the image. Simply putting a note on the source repo saying as much might be enough.
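To be concrete, here's a minimal sketch of what that fully-scripted image might look like (the base image, file names, and layout are made up for illustration): every step that matters is captured in the Dockerfile, so nothing depends on hand-applied `docker exec` tweaks or files copied in after the fact.

```dockerfile
# Everything needed to regenerate the result lives in version control
# and is applied here; no post-hoc `docker exec` edits, no hand-copied files.
FROM python:3.10-slim

# Pin dependencies so the environment can be rebuilt years later.
COPY requirements.txt /work/requirements.txt
RUN pip install --no-cache-dir -r /work/requirements.txt

# The analysis code itself comes from the repo, at a known commit.
COPY analysis/ /work/analysis/

WORKDIR /work
# Running the image re-runs the whole analysis from the raw inputs.
CMD ["python", "analysis/run_all.py"]
```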
(I really like what C Titus Brown has written about reproducibility in computational research over the years: http://ivory.idyll.org/blog/tag/reproducibility.html)
There needs to be an immutable, high-performance read data store with a 30+ year plan for survival if we're really going to retool our world around expert systems.
Our use of data has grown so much faster than our network capacity (and indeed, it seems like we're going to hit a series of physical laws and practical engineering constraints here). "Data has gravity", but the only way to "sustainably" hold a non-trivial volume of data for 20 years right now is to run a data center with a big DHT that detects faults and replicates data.
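For what I mean by immutable, here's a toy content-addressed store sketched in Python (the class and paths are invented, and it obviously ignores sharding across machines): objects are keyed by their SHA-256 digest, never overwritten, replicated across several roots, and checksummed on read so a corrupted copy is detected and another replica is used instead.

```python
import hashlib
from pathlib import Path

class ImmutableStore:
    """Toy content-addressed store: write-once objects keyed by SHA-256,
    replicated to several roots and verified on every read."""

    def __init__(self, roots):
        self.roots = [Path(r) for r in roots]
        for root in self.roots:
            root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        for root in self.roots:          # replicate to every root
            path = root / key
            if not path.exists():        # immutable: never overwrite
                path.write_bytes(data)
        return key

    def get(self, key: str) -> bytes:
        for root in self.roots:          # fall back across replicas
            path = root / key
            if path.exists():
                data = path.read_bytes()
                if hashlib.sha256(data).hexdigest() == key:
                    return data          # checksum catches silent corruption
        raise KeyError(f"no intact replica of {key}")

# e.g. store = ImmutableStore(["/mnt/disk1/blobs", "/mnt/disk2/blobs"])
#      key = store.put(raw_measurements)
#      ... years later ...
#      data = store.get(key)
```

A real 20-30 year system would scrub replicas continuously and spread them across machines and sites, but the shape is the same.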
A lack of reproducibility is a major problem for DSEs and practitioners right now. In fact, I'd argue it's the single biggest problem.
If you force people to produce reproducible, easy-to-deploy models, they are going to hate it. It just kills the fun and/or makes it obvious if they made a mistake.
The fun part I almost get, since fun is obviously good for productivity (though I think you'd really be sacrificing productive output for non-productive output), but I just don't get where you're even coming from with the "making mistakes more obvious" angle.
I don't understand why we couldn't have some system, perhaps using strace and friends, which tracks everything I ever do, and how every file was created. Then I could just say "how did I make X?"
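As a rough sketch of the strace idea (the helper name and the log parsing are mine; a real tool would have to follow renames, mmap writes, and many more syscalls), something like this already answers a crude version of "which files did that command create?":

```python
import re
import subprocess
import sys

def files_written_by(cmd, logfile="provenance.trace"):
    """Run `cmd` under strace and return the files it opened for writing.

    Very rough: only looks at openat() calls with write/create flags in
    the trace; a real provenance tracker would also follow renames,
    mmap writes, child processes calling other binaries, and so on.
    """
    subprocess.run(
        ["strace", "-f", "-e", "trace=openat,creat,rename", "-o", logfile] + list(cmd),
        check=True,
    )
    written = set()
    opened_for_write = re.compile(r'openat\(.*?"([^"]+)"[^)]*(O_WRONLY|O_RDWR|O_CREAT)')
    with open(logfile) as trace:
        for line in trace:
            match = opened_for_write.search(line)
            if match:
                written.add(match.group(1))
    return sorted(written)

if __name__ == "__main__":
    # e.g. python trace_writes.py Rscript analysis.R
    for path in files_written_by(sys.argv[1:]):
        print(path)
```

Stitch those per-command logs together over time and you'd have at least a first cut at "how did I make X?".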