

Best Practices for Scientific Computing - punchagan
http://arxiv.org/abs/1210.0530v1

======
johnx123-up
Recent version is v2 <http://arxiv.org/abs/1210.0530v2>

Summary:

    
    
      1. Write programs for people, not computers.
      2. Automate repetitive tasks.
      3. Use the computer to record history.
      4. Make incremental changes.
      5. Use version control.
      6. Don’t repeat yourself (or others).
      7. Plan for mistakes.
      8. Optimize software only after it works correctly.
      9. Document design and purpose, not mechanics.
      10. Conduct code reviews.
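
A couple of these points ("write programs for people, not computers" and "plan for mistakes") can be made concrete with a tiny sketch. The function and its name are invented for illustration:

```python
def rescale(values, lower=0.0, upper=1.0):
    """Map values linearly onto [lower, upper] (point 1: readable names and a docstring)."""
    # Point 7: plan for mistakes -- check assumptions instead of silently misbehaving.
    assert upper > lower, "upper bound must exceed lower bound"
    lo, hi = min(values), max(values)
    assert hi > lo, "cannot rescale a constant sequence"
    span = (upper - lower) / (hi - lo)
    return [lower + (v - lo) * span for v in values]

print(rescale([1, 2, 3]))  # → [0.0, 0.5, 1.0]
```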

~~~
akashshah
This sounds like best practices for computing in general, not just scientific
computing.

~~~
3amOpsGuy
Yeah, for scientific computing specifically I was expecting to see concerns
like ensuring intermediate working state is persisted to a durable medium, and
making it easy to restart from a working-state "dump" instead of from the top
each time.
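
That checkpoint-and-restart idea can be sketched roughly like this (a minimal illustration; the dump filename and the step structure are made up for the example):

```python
import os
import pickle

CHECKPOINT = "state.pkl"  # hypothetical dump file on durable storage

def load_state():
    """Resume from the last saved state, or start fresh."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "results": []}

def save_state(state):
    """Persist intermediate state, atomically, so a crash can't corrupt it."""
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CHECKPOINT)

state = load_state()
for step in range(state["step"], 10):
    state["results"].append(step * step)  # stand-in for real work
    state["step"] = step + 1
    save_state(state)  # rerunning the script skips completed steps
```

Rerunning after a crash picks up at `state["step"]` rather than recomputing everything from the top.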

~~~
allenp
Around the end of summer there was a post on just this topic. I can't find it
now but it had really solid practices in this area. If anyone remembers it /
finds it I'd love to be able to read it again.

~~~
datapraxis
You may be talking about a post by S. M. Ali Eslami called Patterns for
Research in Machine Learning[1]. There was some discussion here as well[2].

The patterns he pointed out were:

    
    
      1. Use version control.
      2. Separate code from data.
      3. Separate input data, working data and output data.
      4. Modify input data with care.
      5. Save everything to disk frequently.
      6. Separate options from parameters.
      7. Do not use global variables.
      8. Record the options used to generate each run of the algorithm.
      9. Make it easy to sweep options.
      10. Make it easy to execute only portions of the code.
      11. Use checkpointing.
      12. Write demos and tests.
    

EDIT: Added list of patterns, formatting.

[1] <http://arkitus.com/PRML/>

[2] <http://news.ycombinator.com/item?id=4384317>
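
Pattern 8 above (record the options used to generate each run) can be sketched like this; the directory layout and option names are invented for the example:

```python
import json
import time
from pathlib import Path

def start_run(options, base="runs"):
    """Create a per-run output directory and record the exact options used."""
    run_dir = Path(base) / time.strftime("%Y%m%d-%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)
    with open(run_dir / "options.json", "w") as f:
        json.dump(options, f, indent=2, sort_keys=True)
    return run_dir

# hypothetical options for one run of an algorithm
opts = {"learning_rate": 0.01, "seed": 42, "iterations": 500}
run_dir = start_run(opts)
# ... run the algorithm, writing its outputs into run_dir ...
```

Every run then carries its own `options.json`, so any result can be traced back to the exact settings that produced it.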

------
simgidacav
Very good points (indeed, I totally agree with each and every one of them!),
but there's the risk that someone writing software just for the sake of the
task will simply argue: «I'm not interested in doing this, but in the task».

This is exactly what happens with many users when you suggest they install
GNU/Linux and enjoy freedom. They say: «I don't want to learn another OS. I
just use the computer to get things done». Then they keep using Excel.

~~~
pif
> there's the risk that someone writing a software just for sake of the task
> will just argue ...

It's not a risk, it's a certainty! That's why the last point is about
conducting code reviews.

------
zerostar07
In practice, scientific software is generally a perpetual work in progress:
the people who write it are often novices, and the software goes unmaintained
after the results have been published. Although scientists should learn to
write good programs, we shouldn't expect them to write the best software; a
lot of science is not engineering.

------
pfortuny
Honestly,

I shall never understand why a work like this (useful and to the point) needs
a 'Conclusion' section. There are no conclusions, just a list of practices.

I know this is an absurd rant but can we stop this nonsense?

Conclusion: please stop concluding something which is not an argumentation.

~~~
merijnv
Because scientists are aggressively selective in what they read (you have to
be, given the volume of text produced) and skip large parts of the paper to
see if they're interesting. A very common paper reading approach is:

    
    
      1. Read abstract. Interesting? Go to 2, else stop reading.
      2. Read introduction. Still interesting? Go to 3, else stop reading.
      3. Read conclusion. Still interesting? Read the rest of the paper, else stop reading.
    

The conclusion is there to summarise everything and let people decide whether
they want to bother reading the entire thing.

