
Mozilla, GitHub and Figshare team up to fix the citation of code in academia - matthewmacleod
http://thenextweb.com/dd/2014/03/17/mozilla-science-lab-github-figshare-team-fix-citation-code-academia
======
roel_v
"For another, the DOI is a persistent link. Broken links are a growing problem
for academia, as link structures are changed and online content is edited. “If
your persistent link is pointing to something on Figshare, which is ‘this
GitHub repository at that version, at that release,’ then even if the GitHub
repository changes or Figshare changes its link structure, that DOI will
always point to that object,” Mark Hahnel, founder of Figshare said."

Yeah... well, there is the Achilles heel of this whole thing. "Always" in
internet time means "2 or 3 years, or until we fail to close another round of
funding" in real-world time. I'm probably being a cynical old fogey again, but
let's see in 5 years' time if this whole thing still exists (Mozilla, for
example, doesn't have a very solid track record of keeping projects up, to put
it mildly) before I start putting URLs to this in papers that people will
probably still reference every once in a while 5 years from now.

Then again, if everybody thinks like I do, it'll never get off the ground -
classic catch-22.

~~~
trurl42
DOI has existed since 1997, which is pretty much forever in internet time.

Figshare on the other hand, not so much.

~~~
CJefferson
Yes, I don't trust either Figshare or GitHub to be here in 10 years (there was
a time when I thought SourceForge would be around forever; now I keep
expecting to see it disappear any day).

~~~
toomuchtodo
Why don't they partner with the Internet Archive? It's specifically structured
to be a long-term archival/reference system.

------
yeukhon
The lab I work for tried to do this for a few years. But this combination
(cloud computing + storing in a remote repository + trying to do reproducible
science + collaborative scientific computing) is always a tough sell. The idea
is neat. Everyone likes it. But transitioning to a remote platform, letting
others host and keep your data, and not always having that data accessible
from your own machines is hard to accept.

That's why some scientists are going to use Google's Compute Engine. They just
need the machines. The researchers have their own C++ and Python scripts. They
can live with some complexity. They are happy with HTCondor which is awesome
for running big computational jobs.

Sharing data and results with the world is awesome. But transitioning to a new
platform is again a big problem.

~~~
mcguire
Right, sharing code is always a lot more work than simply telling people what
happened when you ran it.

------
gjuggler
This is a very cool technical integration between two services that are — or
should be — used by most scientists working in code. But what exactly is the
"problem" of citation of code that this solution fixes?

Let's say you release your project to GitHub & figshare and now have a DOI in
hand. What are you supposed to do with it? Do you ask your users to cite this
DOI if they use your software? If so, what text should accompany the citation?
How do you track citations to your code? Will they show up in Google Scholar,
Scopus, Web of Science?

And what if the journal one of your users is submitting to doesn't accept
figshare / github citations? It's unfortunate but true that many publishers
disallow citations to unpublished / non-academic works. This is why many
scientific software projects have resorted to publishing papers on their
software — it's a hack to make a software project fit into the traditional
social system of scientific credit.

DOIs are a technical glue that binds together the thousands of academic
publishing outlets, but they do not solve the scientific or cultural issue of
what is the minimum viable citable scientific product, and how those citations
are generated, propagated, or valued.

Securing a DOI only solves a small slice of the problem of scientific credit —
a point most colorfully expressed by this blog post from CrossRef, the largest
DOI registrar for academic work:
[http://crosstech.crossref.org/2013/09/dois-unambiguously-and...](http://crosstech.crossref.org/2013/09/dois-unambiguously-and-persistently-identify-published-trustworthy-citable-online-scholarly-literature-right.html)

------
Fomite
I'm rather pleased - I had emailed GitHub a month or two ago asking about the
potential to get DOIs for repos, and here they are.

Worst case, GitHub and Figshare both go under and we're back to where we
started. The one hesitance I have is about the Figshare/DOI'd repo being
frozen in time - I keep making arguments to myself about how this is a good
idea or a bad idea.

~~~
jrochkind1
Well, the thing with git is that it keeps history anyway. There's no need to
actually freeze the repo in time -- the DOI can point to a URL representing
(and linking to) a particular moment-in-time of the repo.

Is that what they're doing? That actually seems like a pretty good idea, if it
is, I hope they are! And if they're not, it would be trivial to do.

GitHub's UI still makes it easy to see what happened after (or before) that
point (including the 'latest' version), but if you're citing software used as
a tool for research results, it makes sense to be able to cite the actual
software that really was used, not its hypothetical future evolution.
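
For what it's worth, here is a rough sketch of what "a URL representing a particular moment-in-time" could look like. It assumes GitHub's usual `/tree/<commit>` URL layout; the function name, owner, repo and commit id are made up for illustration, and this is not necessarily what Figshare actually stores behind the DOI.

```python
import re


def pinned_repo_url(owner: str, repo: str, commit_sha: str) -> str:
    """Build a URL for a repository as it looked at one specific commit."""
    # Only accept a full 40-hex-character commit id. Branch names such as
    # "master" move over time, so they are rejected here on purpose.
    if not re.fullmatch(r"[0-9a-f]{40}", commit_sha):
        raise ValueError("expected a full 40-character commit SHA")
    return f"https://github.com/{owner}/{repo}/tree/{commit_sha}"


# Hypothetical owner, repo and commit id, for illustration only.
print(pinned_repo_url("example-lab", "analysis-code",
                      "0123456789abcdef0123456789abcdef01234567"))
```

The DOI would then resolve to (or at least record) that pinned URL, while the repo itself keeps moving.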

------
csense
If you use git, there is a way to produce, in a single line, a permanent
immutable citation to the current state of your code's master branch. Are you
ready for this revolutionary command?

    cat .git/refs/heads/master

Why do three large-ish organizations feel it necessary to combine their
powers for something this trivial?

~~~
rspeer
...Are you actually under the impression that you can take a hash of some code
and _retrieve the code_ from it?
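
To make that concrete, here is a minimal sketch of how git derives the id of a file ("blob"): a SHA-1 over a short header plus the content. As far as I know commit ids are built the same way, over a "commit" header and the commit's text. The helper names are mine, not part of git.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object id git assigns to a file with this content."""
    # git hashes "blob <size>\0" followed by the raw bytes of the file.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def matches(content: bytes, expected_id: str) -> bool:
    """Check a copy of the content you already have against a cited id."""
    return git_blob_id(content) == expected_id


# The id of an empty file is a fixed, well-known value.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

So the hash lets you verify a copy you already hold, but it only goes one way: if nobody is still hosting the objects, the id alone gives you nothing back. That hosting is the part an archive has to supply.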

------
T-A
Somewhat related: [http://www.webcitation.org/](http://www.webcitation.org/)

