

Why Scientists Need to Learn How to Share - return0
http://www.psmag.com/navigation/nature-and-technology/scientists-need-learn-share-77100/

======
kastnerkyle
Open data and data sharing are key components of the reproducible
science movement. Some teams have even gone so far as to share their entire
virtual image with data, code, and associated libraries (when data size is
small). Docker would be a great choice for this, and some teams have moved to
that as well from what I hear.

This is definitely NOT the majority, but I think the people who take the extra
effort to help other researchers "stand on their shoulders" should be
commended. However, in the era of "big data research", it is not always
possible to share the data openly due to public file server size limitations.

------
Grieverheart
The problem stems mostly from the way research funding works, which promotes a
competitive stance instead of a collaborative one; scientists try to publish
to the highest impact factor journals possible and often hide important
details so that other research groups do not publish any new findings before
them. On the other hand, I'm not sure what a good solution to this problem
would be but I hope someone comes up with one soon.

~~~
wslh
I think ego and the patent system play a big part in it. Imagine a country
funding research related to cancer. The best outcome will be to cure it. It
doesn't matter if you share some of your research and scientists from another
country or group solve "the last mile".

The issue in this scenario is how the patent system currently works. Instead
of giving the ownership to the scientists who make the breakthrough, the
discovery must be shared between the different research groups.

~~~
stargazer-3
Not sure how important the role of the patent system is here. In theoretical
physics and astronomy, despite having no need for patenting, the problem still
persists. I would say that the standard university publish-or-perish model
seems to be the main obstacle in the road to open research.

------
return0
I agree that forced sharing of data is an absurd requirement. It would make
sense if the data itself were attributable and citable, so that scientists
get recognition when their data is used by other studies, and if those
citations had the same kind of impact on their careers as article citations
do. Journals try to enforce data sharing because
they want to maintain their position as arbiters of academic affairs.

~~~
gjuggler
It's important to recognize that PLOS' new data sharing policy is only about
data directly tied to the publication of a traditional journal article. So
while PLOS authors are being forced to share their data, it's _only_ done in
the context of a published article, which will accrue citations, recognition,
and attribution in the standard way.

So any data shared by this policy WILL absolutely be tied to the traditional
methods of scientific credit, via the linked journal article. To me, requiring
that reasonable data is published and archived alongside scientific literature
doesn't seem absurd at all.

> Journals try to enforce data sharing because they want to maintain their
> position as arbiters of academic affairs.

Is there any evidence behind this claim? PLOS seems to behave in exactly the
opposite way. It's true that SOME journals use extreme selectivity or control
over copyright to maintain their position as arbiters of science. But PLOS,
whose largest journal PLOS One is both open access and makes no judgment on
the impact of the science it publishes, seems to be actively reducing the
amount of control it exerts over academic activities.

------
_delirium
One problem in CS is that a lot of interesting datasets are encumbered. When
work is done in collaboration with a company (which is common), the company
may often not agree to publish the dataset, and retains a veto right on what
gets made public.

Among many areas, this is getting quite normal in natural-language processing
and machine translation, where a lot of the good datasets are owned by
companies. A lot of research lately has come either from, or in collaboration
with, Google in particular, because they have a treasure-trove of data that
powers Google Translate. They are not likely to agree to release that, because
it's one of the keys to their competitive advantage in that area.

Even for in-house data, there are increasingly explicit demands from
universities that scientists work with the university technology-transfer
office to commercialize their work. Everyone wants to be the next Stanford,
with spinoff startups and a stream of licensing revenue. So even if you
collected a nice dataset in-house, there is financial pressure not to just
give it away, but instead to talk to your local venture
capitalist about your unique, hard-to-replicate dataset with promising
commercial value...

------
kleiba
While this may be true in some areas, I'd like to point out that a lot of data
is already being shared amongst scientists. I know, for instance, that in the
field of natural language processing, there are a large number of data
collections (corpora) available -- although not all of them free of charge.
But they are there and people use them. Competing research is performed using
the same corpora to compare the different outcomes, and you want comparability
so that the impact of your work becomes evident to reviewers.

Some of these corpora came out of research projects that explicitly stated the
creation of such a shared resource as one of their goals. So sharing is a topic
that's well on the community's agenda. While it's certainly possible to
improve sharing even more, it's not true that scientists are not already doing
it.

Of course, there might be different attitudes in different fields.

~~~
return0
What you're referring to is closer to materials than to the results of
research.

~~~
kleiba
That's certainly true in a number of cases, but I also think that distinction
can sometimes be hard to make. And the mere sharing of results (in the form of
papers) is basically the bread and butter of research anyway.

------
untilHellbanned
Science publishing is no different from Hollywood movie making. The same
self-serving human nature is the driver in both.

I work in the OP's field and appreciate his view. That being said, PLOS has
been a considerable disappointment. Noble-minded efforts like this without
real rewards won't work. Just like nobody cares about some famous
actor's or actress's pet charity cause.

What will work in science publishing is a system that incentivizes people
both economically and in terms of their reputation. That system doesn't exist
right now.

