
Open science: why is it so hard? - tedlee
http://lemire.me/blog/archives/2012/01/10/open-science-is-hard
======
roel_v
It's not hard, it's that most researchers just don't care. Most researchers
have access to all papers they need through their universities, and if they
don't, most papers are only an email away. Publishing in those journals for
which you need to pay (which is only a minority) is paid for by grants or the
university. How many people do you know who couldn't get their papers
published _only_ because they couldn't afford the journal charges?

The truth is that only a small (albeit vocal) minority cares about 'open
science'. For most researchers, it just doesn't matter, and there is no
incentive to pursue it. Actually having time to write papers worth publishing
is more of an issue than those that do get published being accessible to
people who, in all honesty, have no interest in them (i.e., the general
public).

(It's quite interesting to note that (in my statistically undoubtedly non-
representative experience) the demographic that advocates 'open access' the
most is quite homogeneous - mostly PhD students with a new professor here and
there. Maybe it's the cynical me, but I've been in the game long enough to see
people I meet at conferences convert from being advocates for open access to
not caring about it any more once they get some years of experience and
realize it doesn't matter that much.)

~~~
kitsune_
I think you underestimate the potential interest in open science (and by
extension, open data).

The access limitations act as a barrier when it comes to the application of
research. Yes, of course, a multi-billion dollar company with huge R&D
departments will have all the subscriptions available, but what about the
smaller companies and start ups? What about individual people?

Also, the current system rewards incestuous island thinking. There isn't a
giant catalog of knowledge, and it's a shame. Imagine one library with all the
world's research and data, open, categorized, expandable, for everyone to use.

~~~
roel_v
"I think you underestimate the potential interest in open science (and by
extension, open data)."

Probably. Like I said, my impression is based on personal observations only.
That said, I don't really have objective reasons (off the cuff) to believe
otherwise, nor do I have data (or even an idea of how one would measure it
correctly). For example, I don't know anyone, except for undergrads who don't
know how to use their libraries, who has ever been unable to get a certain
paper.

Likewise, how hard is it really for a small company or individuals to get
access? They can do exactly what everybody else can do - shoot the author an
email asking for the paper. I've _never_ had anybody not send it to me, once I
got a hold of their current contact details, which with the web nowadays is
very easy. How many people really do independent real research, completely
disconnected from anybody involved with universities? They may exist but
they're such corner cases that they certainly don't have the critical mass to
make 'open access' a real issue.

My point is: while 'open access' is a theoretically nice idea, the _practical_
realities are that it's just not a real problem for the vast majority of
people. It's a philosophical objection to certain practices at best. Which is
not enough to instill enough urgency in those whose support is needed to
change things.

~~~
jamii
A good fraction of the freelance work I have done has required looking up
papers (eg the current implementation for <https://github.com/jamii/texsearch>
is based on a bioinformatics paper).

It's fine when you only want a single specific paper, but I'm often looking
for a solution to a problem I don't know the name of, which means scanning
through a couple of hundred papers an hour to see which ones are relevant.

~~~
Radim
But paper abstracts are always free (even with paid journals).

If you have to scan random papers to find a solution to a problem you cannot
name, I fear for your well-being :-)

~~~
jamii
> If you have to scan random papers to find a solution to a problem you cannot
> name, I fear for your well-being :-)

The disadvantage to being a generalist is that I don't know the terminology of
the field I need to research. Take the texsearch example above. Given a LaTeX
string I want to find similar strings from a large corpus so I start by
googling things like "code search" and "syntax tree search". After scanning a
few dozen papers and following links I find that the magic search term is
"approximate string matching" which nets me an overview paper. I scan through
the links from that paper and dismiss most of the algorithms as unsuitable for
my particular problem until I'm left with a few candidates for prototypes.

Back then I had access through my university. Today I wouldn't be able to read
half of those papers. For recent math/CS papers I can usually find a preprint
but anything else is a struggle.

------
oscilloscope
Astronomers are incredibly dedicated to outreach. You can get data used by
professional astronomers in publicly-accessible databases on the web. Here's a
collection of these resources:

<http://ned.ipac.caltech.edu/help/links.html>

The challenge is you have to learn a bit about the databases and visualization
software to get going. Then you might need to learn some electromagnetism,
relativity, and astrophysics to interpret the data. There are excellent
tutorials, some many years old, that you can find.

Open science is hard because science is hard. It takes knowledge to interpret
the data. It takes effort to transform the data into something that can be
interpreted. And there's a lot of data.

~~~
andreadallera
> Astronomers are incredibly dedicated to outreach.

I really don't want to sound overly critical - I'm sure that there is good
will behind whoever made the site you linked - but if that site is what
represents _astronomers' dedication to outreach_ then I don't really know what
to say...

See, if researchers _really_ wanted to make their research known to the public
the efforts on their part would be much, much more focused than what it took
to put together that site. Take your average startup as an example - they're
_seriously_ dedicated to outreach, their success depends on it - would you
give them even a 1% chance of success if their website looked like that?

~~~
Vivtek
The point is that the data is there _at all_. You won't find that in many
sciences. What it looks like is profoundly unimportant.

------
flashingleds
I think it's difficult to refute that if the funding was public, the results
should be public. But the scientists have to publish in big journals to win
public funding grants, and the big journals aren't motivated to go open access
and surrender their cash cow.

So you're not likely to budge either the scientists or the journals by arguing
about what's ethical. It seems to me like the best approach is to change the
way public grants are awarded. If grants become conditional on you ONLY
publishing in open access journals, well, you don't have much choice, do you?
Ultimately this whole game was only ever about attracting the money you need
to do your job. Pretty soon the expensive publishers stop getting submissions
because they're all diverted to open-access journals.

Of course it would never be so easy in reality. There is a pretty entrenched
chicken-and-egg situation with science publications, and it will be
unavoidably messy to break it.

------
kitsune_
I find it really aggravating that schools such as the ETHZ get billions in
taxes a year yet refuse to publish the majority of their papers online.

~~~
polyfractal
To be fair to the university, many journals have legal contracts that forbid
you from publishing your manuscript online within X days of publishing in
their journal. Basically, they get first publication rights.

After a certain period of time most journals let you publish on your own site
(although few authors actually do, which is a different issue)

------
larrydag
I am an Engineer and an Operations Research professional. Yet I have mostly
worked for companies that do not do engineering or are very small
organizations. They do not want to allocate me the resources to research or
software. So I've had to rely on what is open. I've used Free Software as a
toolset to perform a lot of my work. Yet access to the research behind a lot
of those tools is very hard to come by. Places like Penn State's Citeseer
<http://citeseer.ist.psu.edu/index> and R-Project <http://r-project.org> have
been a great haven for open research for my work.

I do believe science should be open. Yet I also believe that the scientists
behind the research should be compensated. But all the while there is great
research out there sitting behind a publisher that limits access to us
practitioners who do not have the resources to gain access. I think there is
definitely an opportunity to find somewhere in-between where the two can meet.

------
kghose
What open access does do is help researchers in smaller institutions and
institutions in poorer countries to have easier access to papers.

This is HUGE. If you work/study at a place with fewer resources it becomes
annoying to try and get papers from journals your institution doesn't subscribe to.
This is a barrier to research that just should not be there.

------
mkag
One of the reasons the status quo is very hard to change is that academia is
built on reputation and prestige, and there is really no other measure of
success. That means that, if we are at some stable steady state, going outside
the system and doing something like opening up your data to everyone, rather
than trying to publish in a brand-name journal, will be a disadvantage to you,
since the number of publications in these types of journals is the way that
you are judged. The issue isn't about whether the journals charge for content or
not. Journal subscriptions are cheap compared to labor and reagents and, as
always with third-party payer systems, the incentives aren't really aligned to
skimp on them. The real question is: are there better ways of giving people credit
for their work in a way that enhances their career in a proportional way to
their achievement? Are alternative systems better for rewarding the right
people faster, and thus moving research faster? The answer may be yes, but
there is a significant activation energy barrier to making any kind of
switch from the publishing-as-a-measure-of-achievement model.

------
j45
Open is not profitable?

------
andreadallera
> Repeat after me: scholarship is not a publishing business.

Nope - it's much more than that. It's a way for a caste of too-often
incompetent and self-absorbed university dwellers to perpetuate itself.

I've had the misfortune of working in a university for a while. People in the
department are living in a bubble - they write books and they publish in
journals that are read only by other professors "studying" the same "subject".
They treat subjects like "the history of mining in Scandinavia". Why? Because
they're often the _only_ person in the world (or one of the 2-3 people in the
world) who is studying that. What does it mean? A grant, and later a position
as a full-time professor. Every one of them has his small little niche in which
he's the best specialist in the world... that is used to both justify their
research and to avoid real world competition.

Publishing rules will never change if the underlying ecosystem doesn't. If
professors ever become interested in _expanding_ their audience (now they're
interested in the _opposite_) then publishing will change accordingly.

