
Principles of good data analysis - gjreda
http://www.gregreda.com/2014/03/23/principles-of-good-data-analysis/
======
westurner
Helpful; thanks!

"Ten Simple Rules for Reproducible Computational Research"
[http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fj...](http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003285)

* Rule 1: For Every Result, Keep Track of How It Was Produced

* Rule 2: Avoid Manual Data Manipulation Steps

* Rule 3: Archive the Exact Versions of All External Programs Used

* Rule 4: Version Control All Custom Scripts

* Rule 5: Record All Intermediate Results, When Possible in Standardized Formats

* Rule 6: For Analyses That Include Randomness, Note Underlying Random Seeds

* Rule 7: Always Store Raw Data behind Plots

* Rule 8: Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected

* Rule 9: Connect Textual Statements to Underlying Results

* Rule 10: Provide Public Access to Scripts, Runs, and Results

Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for
Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285.
doi:10.1371/journal.pcbi.1003285
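
A few of these rules (1, 3, 6 and 7) are easy to show in code. Below is a minimal Python sketch, not anything taken from the paper: the function name `run_analysis`, the file names `samples.json` and `provenance.json`, and the toy "analysis" (the mean of seeded Gaussian samples) are all made up for illustration.

```python
import json
import platform
import random
import sys
import tempfile
from pathlib import Path


def run_analysis(seed, out_dir):
    """Run a toy analysis and write its provenance next to the result.

    Everything here is illustrative: the analysis is just the mean of
    1000 seeded Gaussian samples, and the file names are assumptions.
    """
    rng = random.Random(seed)  # Rule 6: use an explicit seed and record it
    samples = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    mean = sum(samples) / len(samples)

    record = {
        "result": {"mean": mean},
        "seed": seed,
        "python_version": platform.python_version(),  # Rule 3: tool version
        "command": sys.argv,                          # Rule 1: how it was run
        "raw_data_file": "samples.json",              # Rule 7: data behind any plot
    }

    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "samples.json").write_text(json.dumps(samples))
    (out_dir / "provenance.json").write_text(json.dumps(record, indent=2))
    return record


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        first = run_analysis(seed=42, out_dir=Path(tmp) / "run1")
        second = run_analysis(seed=42, out_dir=Path(tmp) / "run2")
        # Same seed, same code, same interpreter: the results should match.
        print(first["result"] == second["result"])
```

A real project would likely also record library versions and the commit hash of the scripts (Rule 4), but the idea is the same: the result never gets written without the information needed to regenerate it.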

~~~
needacig
This is way more useful than the original post, thanks.

------
ethikal
Great post! You mention that it is important to "be skeptical" - I concur and
would add that it's helpful to approach the analysis from an unbiased
standpoint. Even if you go into your analysis with certain goals in
mind, it is not only more ethical but also more persuasive to point out any
inconsistencies in your findings.

------
zengr
I think for "Profile your data", some tools like OpenRefine really help.
[http://openrefine.org](http://openrefine.org)
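
If you just want a quick look before opening a tool like that, a rough profile is a few lines of Python. This is only a sketch and not how OpenRefine works internally: `profile_columns` and the sample CSV are hypothetical, and it only counts missing and distinct values per column.

```python
import csv
import io
from collections import Counter


def profile_columns(rows):
    """Summarize each column: missing count, distinct count, most common values."""
    profile = {}
    columns = list(rows[0].keys()) if rows else []
    for col in columns:
        # Treat None and whitespace-only cells as missing.
        values = [(row.get(col) or "").strip() for row in rows]
        present = [v for v in values if v]
        profile[col] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "top": Counter(present).most_common(3),
        }
    return profile


# Made-up sample data with two typical problems:
# inconsistent capitalization and a missing value.
SAMPLE_CSV = """city,temp_c
Chicago,21
chicago,19
Chicago,
Boston,18
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = profile_columns(rows)
for column, stats in report.items():
    print(column, stats)
```

Even this much surfaces things like "Chicago" vs. "chicago" being counted as separate values, which is the kind of inconsistency profiling is meant to catch.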

------
gjreda
I forgot to mention reproducibility. Show your work (share the code).

