
The Recomputation Manifesto - ajdecon
http://www.software.ac.uk/blog/2013-07-09-recomputation-manifesto
======
pnathan
Going to bang a small drum here and proclaim:

When doing science, don't use a tech stack that will be obsolete next year or
in 3 years. Use something that will be AROUND; something that is well defined,
something that other people can use down the road on a different system.

Virtual machines are "okay". Running them requires having the hypervisor
working correctly and - hopefully - the original architecture supported.

Personally, I used Common Lisp for my Master's work. Other examples of tech
stacks that are very stable are C, C++, and Fortran.

~~~
lifebeyondfife
For certain research problems, different and often obscure tool sets are
required. That, to me, is a great argument for recreating everything your
environment needs in a VM - unlike chemists, who can't bundle up their lab into
a little transportable box, computer scientists can.

In making the argument for recomputation to aid its adoption, it's important
not to add too many constraints on the researcher, e.g. "Your results should be
reproducible _and_ written in C because it should still be around ten years
from now."

Even an obsolete language in a working VM sandbox can be edited and inspected
to make changes, verify correctness, etc. If you want to extend the results of
that research you may have a porting project to update the code to a more
current language, but that's leaps and bounds ahead of how things are now:
read and re-read the paper until it makes sense, then implement the algorithms
from scratch.

Agree with your hypervisor/architecture point but I suppose there has to be
some broad architectural choice made.

(The author of this article was my PhD supervisor. I've always found him to be
an inspiring, forward thinking researcher so as much as I love the
recomputation idea on merit, I'm also a little biased.)

~~~
gngeal
How are all these VMs going to react if people switch, say, to desktops with
ARMv8, or whatever comes next?

~~~
CJefferson
I can emulate a PDP-11, a Nintendo Game Boy, or any other old system.

There are perfectly functional x86 emulators for ARM already. Many people used
to run Windows on PPC macs via emulation too.

~~~
gngeal
Emulating a Gameboy is fine, but emulating a large-scale scientific
computation may be just a little bit more demanding, I'm afraid. :/

~~~
CJefferson
It will be slower, but I don't think unusable. Actually, the Game Boy (and
other consoles) is not really a fair comparison, as those are systems where
getting all the processors exactly in sync is important and difficult.
Emulating a PC is much easier, as no one expects to get exactly equal speeds
out of processors; every PC model is slightly different.

As I say, I (and many other people) emulated x86 Windows on PPCs for years. I
ran Visual Studio and it dragged a little, but was fairly usable. Certainly it
won't be as fast, but then again by the time x86 has died and people have
moved on, hopefully systems will have got fast enough to make up the difference!

------
turingfan
Hi, this is Ian Gent, author of the Recomputation Manifesto.

Many thanks for all the comments.

If anybody wants to get in touch to work in any way on recomputation, please
do! You can find me very easily on Google.

Special thanks to @lifebeyondfife, I worked out who you are and you were a
pleasure to supervise too. Hope all is going well.

~~~
nkurz
Nice article, and thanks for coming by!

What are your thoughts on how to treat performance benchmarks, or really any
claims that one algorithm is "better" than another? Since these are often
extremely hardware specific, I've been wondering whether, instead of a VM, it
makes sense to offer a full bootable image.

Are there virtual image formats that can be either run within a VM or copied
to a USB stick and booted?

~~~
turingfan
I've thought about this quite a lot because people often ask me.

My main comment is that for one algorithm to be judged against another, the
more different environments it is tested in, the better: we get a deeper
understanding of its performance profile. If it's always better, then that's
clear. If it is sometimes better and sometimes worse, either it's not
really better, or there's some dependency on the hardware or other
environment: but that is an interesting result in itself.

However there is always a place for very hardware specific claims. E.g. "For
this chip/motherboard combination this flag is better", and for that we might
always have a problem.

Interesting thought about the either-VM-or-booting option. Somebody suggested
to me making live images (as in live DVDs), which would serve this purpose.

------
tigroferoce
TL;DR: any computer science paper that presents practical work without
disclosing source code should not be accepted to any scientific conference or
journal.

Agreed on all points. I spent a few years in computer security research before
quitting for industry. I must report that papers that present some kind of
algorithm (I would say the majority) very rarely also provide the source code
of the implementation. I have always thought, and advocated, that any computer
science paper that presents practical work without disclosing source code
should not be accepted to any scientific conference or journal.

I know (because I did it many times) that opening the source takes an
incredible amount of time, but it is mandatory for being able to 'stand on the
shoulders of giants'. Writing code and keeping it private in research just
makes no sense.

~~~
EzGraphs
Your TL;DR should read "...source code _and a virtual environment that allows
the process to be repeated_..."

From the article:

"There has been significant pressure for scientists to make their code open,
but this is not enough. Even if I hired the only postdoc who can get the code
to work, she might have forgotten the exact details of how an experiment was
run."

also

"The only way to ensure recomputability is to provide virtual machines"

To that end, the site [http://recomputation.org/](http://recomputation.org/)
is mentioned as a repository for recomputable experiments.

Point being: source code alone does not specify the process or workflow in
which it was used.

~~~
tigroferoce
Sorry... the TL;DR was intended for those who were not interested in reading
my whole comment.

Actually, the idea of providing VMs preconfigured to run the tests is very
good, since it saves time both for those who write the code and for those who
want to test it.

> Point being: source code alone does not specify the process or workflow in
> which it was used.

I completely agree! When we released the code, we spent hours clearly defining
the environment the code had to run in.

------
ics
For those who may skim the article without reading the actual manifesto, the
closing paragraph is rather keen:

> A manifesto is a call that people reading it should vote for your point of
> view. Don’t vote with a signature or a petition. Vote by making your
> computational experiments recomputable. Do it at
> [http://recomputation.org](http://recomputation.org), or at your own web
> site, or at another repository. But make your experiments recomputable.

Full manifesto linked from the article:
[http://arxiv.org/pdf/1304.3674v1.pdf](http://arxiv.org/pdf/1304.3674v1.pdf)

Before even reading the article I was thinking to myself "gee, this might
actually be one of the best use cases I've heard for Vagrant/etc". Turns out
that's exactly what this is :)
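In that spirit, here's a minimal sketch of what a Vagrantfile for a recomputable experiment might look like. Everything here is a hypothetical placeholder, not from the article or recomputation.org: the base box, the package list, and the build/run commands would all depend on the actual experiment.

```ruby
# Hypothetical Vagrantfile for a recomputable experiment.
# Pinning a specific base box means every reader boots the same environment.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"   # assumed base image; pin one and never change it

  # Provisioning script: install the toolchain, then build and run the
  # experiment from the shared /vagrant directory (the repo checkout).
  config.vm.provision "shell", inline: <<-SHELL
    apt-get update
    apt-get install -y build-essential     # toolchain the experiment needs
    cd /vagrant && make && ./run-experiment > results.txt
  SHELL
end
```

The point is that the provisioning steps, not just the source, are recorded: `vagrant up` replays the whole environment construction.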

------
jgrahamc
This is great. I see that the first reference made in the paper is to my joint
paper in Nature arguing for release of source code. Even that seems like a
radical step too far to some scientists, goodness knows what they'd think
about this, but it's a great idea.

~~~
tome
Since John seems too modest to link to his own paper, here it is for those who
were as curious as I:

[http://www.nature.com/nature/journal/v482/n7386/full/nature1...](http://www.nature.com/nature/journal/v482/n7386/full/nature10836.html)

------
dasmoth
I hope "Recomputability" emerges as a distinct term.

At least in the biological sciences, I'm seeing the term "reproducibility"
used a lot where the meaning is much closer to "recomputability", i.e. "you
can repeat the exact computational steps we performed" -- without necessarily
saying much about either the lab-work and/or sample-collection parts of the
project, or the possibility of performing similar analyses using different
tools/platforms.

(I'd also like to see a bit more recognition of the importance of full
reproduction -- i.e. someone starts with the same hypothesis or idea and does
their _own_ experiment -- in modern science.)

~~~
_delirium
That's where I see recomputation as not _quite_ pushing the same goals as
reproducibility, even though its advocates often couch them as the same goal.
Recomputation can be useful, but re-running the exact same code in the same
virtual machine isn't really an independent reproduction of the claimed
scientific result. That often benefits from _not_ using the original source
code; two independently written implementations claiming to implement the same
approach and achieving the same results is a much better reproduction.

In the natural sciences, independent reproduction often finds subtle
dependencies on the original apparatus that change the interpretation: when
lab B tries to reproduce lab A's results on slightly different equipment and
can't, it can highlight an unnoticed dependency on some specific feature of
the original equipment, and may throw into question the original paper's
conclusions. You would never have found that if, instead of lab B
independently reproducing the result, lab A just packed up their equipment
into a shipping container and shipped it to lab B, who unpacked and ran it
unchanged. That's what the VM approach is arguing for, and that's not really
reproducibility.

~~~
turingfan
I completely agree. A recomputation of an experiment is not ensuring
reproducibility of the scientific result. It's ensuring reproducibility of the
individual experiment.

The analogy I have given is with cold fusion. If we could reproduce their
exact lab setup, then we could find out whether the results were real - i.e.
not misread or anything - and, assuming they were, have a chance of explaining
the anomaly.

But no, it's not the same as reproducibility.

------
lifeisstillgood
This is fantastic - and a serious challenge.

Recomputability is, to all intents and purposes, the goal of devops and
testing. And we are stumbling around at the edges of proving one environment
is the same as another.

This is one to watch - hell, one to join in.

~~~
turingfan
Yes, that is a good point. There is one advantage to recomputability, which is
that - at least in the first instance - all that matters is being able to
recompute the specific experiment for a paper. So testing in the sense of "it
works on other examples" is less critical. But indeed, as you say, there are
close links with testing.

------
kephra
IMHO, he is missing the most important point, and walking in the wrong
direction instead.

He is right that science requires recomputation: the ability to verify or
falsify results. But recomputation in science is more than just the ability to
run the black magic box again. A black magic box makes it worse, because the
box might change and fail over time, and it's a black-magic VM. Recomputation
requires source code that is human readable.

So my suggestion is to use a combination of Gentoo and Linux Containers
instead. Gentoo ensures that everything on the machine has source code that
has been through the compiler, and Linux Containers encapsulate the project in
a way that a simple backup can preserve.

_Well_, I normally prefer Debian because of its lower maintenance cost. But
Gentoo could play to its strengths in this edge case.

~~~
CJefferson
I think both are important. My real hope, in the long term, is that people
will package source in the recomputable VM, and recompile it as part of the
recomputation.

However, particularly when academics are gluing together multiple pieces of
software, often which are themselves quite fragile, just trying to reassemble
working software can be almost impossible.

I have a number of things I wrote myself from when C++11 was first coming out
(yes, I possibly shouldn't write software with compilers for unfinished
languages, but I like to live at the state of the art). Now that C++11 support
in gcc is finalised and some corners have been cleaned up, these programs
don't compile any more. I know how to fix them, and have. I wouldn't want
someone else to have to do that.

~~~
kephra
Languages change, and their compilers change; a Gentoo backup would also come
with the sources of the compiler you used to compile your C++11-pre-beta
project.

------
justincormack
A VM is fine, but at the least it should be minimal so you can see which parts
of the 400MB matter. A minimal environment (boot to TeX?) would be good.

~~~
turingfan
For the sample chess problem experiment, we do also provide a tarfile or zip
of the experiment directory, which is just a few MB. So if that works in your
environment, you're good to go. If not there's the 400MB to fall back on.

Obviously it would be nice to know exactly which bits of that are unnecessary
to save space, but for now I'm happy enough to be able to give you something
that works.

~~~
justincormack
Sure, just thinking longer term. Dependencies are important to understand for
replication. E.g. your result might only be due to a dodgy random number
stream (say). What do you need to rebuild? What should it be robust to?

~~~
im3w1l
If you have a working VM, and non-working tarball, you can "binary search" for
the right environment.
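That "binary search" can be made concrete: order the differences between the working VM and the failing tarball environment, then bisect over prefixes of that list to find the change that breaks things. Below is a minimal sketch, assuming exactly one difference is the culprit; the `works` predicate and the package names are hypothetical stand-ins for "rebuild the environment with these changes applied and rerun the experiment".

```python
def find_breaking_change(changes, works):
    """Bisect an ordered list of environment differences to find the culprit.

    `changes` lists the differences between the working VM and the failing
    environment; `works(prefix)` reports whether the experiment still runs
    with only that prefix of changes applied. Assumes an empty prefix works,
    the full list fails, and exactly one change is to blame, so only
    O(log n) environment rebuilds are needed.
    """
    lo, hi = 0, len(changes)  # invariant: prefix of length lo works, of length hi fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(changes[:mid]):
            lo = mid   # still works: the culprit is among the later changes
        else:
            hi = mid   # already broken: the culprit is among the earlier changes
    return changes[hi - 1]  # the change whose addition first breaks the run

# Toy usage: six hypothetical "changes", where introducing libfoo-2.0 breaks things.
changes = ["gcc-4.8", "glibc-2.19", "python-2.7", "libfoo-2.0", "libbar-1.1", "texlive"]
culprit = find_breaking_change(changes, lambda prefix: "libfoo-2.0" not in prefix)
```

In practice each `works` call is expensive (a rebuild and rerun), which is exactly why bisecting beats checking the differences one by one.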

------
xfax
This should apply to papers in Economics as well. The Reinhart-Rogoff Excel
debacle was embarrassing.

