
Why the world of scientific research needs to be disrupted - llambda
http://gigaom.com/2011/10/31/why-the-world-of-scientific-research-needs-to-be-disrupted/
======
mironathetin
Two comments:

1\. Scientists publish in slow peer-reviewed journals because scientific
publication has to be peer reviewed. The reviewers are a very small selection
of other scientists who are experts in the field of the publication. As long
as the reviewers don't send up white smoke, a publication is not science by
definition. This way of working is important to keep quality excellent. We
certainly don't need more quickly published junk that stirs the world but is
found to be inaccurate later.

Can we accelerate the process of peer review? I doubt it. As I have written,
the reviewers are other scientists who have to find the time to read and
comment on new articles. The more time they spend on this, the less time they
spend doing research. So it is usually the small number of experts that is
responsible for the slowness.

You may be quicker with internet publishing if (IF!) you work in a highly
dynamic field where other scientists are hungry for your results. As the
author writes, this is the exception. Most scientific results will be useful
only a long time after publication, if they are useful at all. In this
respect, science is already much faster than its applications.

2\. The author writes that we could make progress faster. Please: why? It
seems to be a sign of the times that we want to accelerate everything. It is
most likely that this will only produce more noise. We need to slow down our
lives and our thinking to stay accurate and produce real value. I am glad that
scientific results, at least in my field, are still reliably scrutinized by
many. That way I know that reading and spending the time to understand the
material is worth my while.

~~~
_delirium
On #2, it varies by field, but I agree somewhat, in that I generally think
there's already a significant problem of people being too quick-on-the-draw
without first familiarizing themselves with the literature. There is a _lot_
of stuff published that rehashes (without citation) existing published results
or minor variations on them, sometimes from decades earlier, that the authors
and reviewers just didn't happen to be familiar with. If you accelerate
everything and distribute the peer-review to people with less deep knowledge
of the literature, I suspect this will only increase, adding more noise to the
literature as we spend the 2010s rediscovering the 1970s, accompanied by
triumphant press releases and blog posts about Breakthrough Discoveries.

I notice this particularly in computer science, where it seems nobody is
capable of locating or reading journal articles published before around 1995,
a handful of canonical "classic" papers excepted.

~~~
quanticle
How much of that is because research from the 1970s is effectively locked away
behind ACM and IEEE paywalls? How much research gets rehashed because the
researcher can't find the result he or she is looking for via the atrocious
search tools provided by the likes of Elsevier?

I think that more openness in scientific publication will in fact mitigate the
problem of people being quick-on-the-draw, as contradictory findings and
previous publications of similar results will be easier to find, even by the
non-scientific audience.

~~~
kd0amg
_> > I notice this particularly in computer science, where it seems nobody is
capable of locating or reading journal articles published before around 1995,
a handful of canonical "classic" papers excepted._

 _How much of that is because research from the 1970s is effectively locked
away behind ACM and IEEE paywalls?_

If you're serious about keeping up with research, a couple hundred dollars
every year to get through a paywall doesn't sound so bad. The real problem
with the stuff from the 1970s and earlier is that it's not readily available
even to paying members. When I went looking for an article from 1970, the ACM
Digital Library had an entry for it but not the document itself.

~~~
impendia
"A couple hundred dollars"? I see that you are not familiar with the pricing
schemes of Elsevier and others.

~~~
kd0amg
I thought what I quoted was about ACM/IEEE. $198 per year (less than that for
students) to the ACM gets you unlimited access to their online archives. I
didn't check the IEEE rate, as I rarely encounter a paper I have to get from
them instead of from ACM -- is it much more? What am I missing out on by not
checking Elsevier's stash?

~~~
impendia
I am in math rather than CS/EE. I am not precisely sure, and indeed Elsevier
goes to some effort to make their pricing complicated, but I am pretty sure it
runs well into the thousands.

I am a professor at a reasonably good state university, having just come from
Stanford. At Stanford they subscribed to everything, and here our department
picks and chooses so that I constantly run into paywalls despite a university
subscription.

I don't know how much it would be to upgrade to Stanford-level access, but if
it were $200 a professor I assume they'd do it. (Certainly I'd pay $200 from
my salary for that.) I'm guessing high four or low five figures per prof in
the department.

I'm guessing ACM/IEEE are nonprofits? Kudos to them for making their prices
reasonable. There are some professional organizations in math (e.g. the AMS)
that do something similar. But unfortunately a lot of our journals are
published by for-profit companies.

~~~
pnathan
The ACM and the IEEE are the key professional societies for computer science,
computer engineering, and electrical engineering research.

Wikipedia says the IEEE is non-profit, not sure about the ACM.

IEEE weights towards the EE and high-math end of things, ACM weights towards
the CS end of things.

The Journal of the ACM is about $300USD/yr for print/online access
(nonmember).

I have a wide variety of interests and would prefer to have full access to
about 5-7 journals (Some Wiley, some Elsevier). Assuming $300/year prices
that's a pretty decent sized price to keep up with research interests. :-(

------
e_g
In my personal experience, the main problem with scientific research today is
that publication counts are the sole metric of academic success.

Research is 'optimized' for publications and segmented according to conference
schedules. That in itself I consider detrimental to the research effort.
However, the major damage in terms of advancing research is, in my opinion,
caused by the induced tendency of scholars to stick closely to mainstream
paradigms, trends, and topics. Naturally it is much 'harder' to do research
that does not directly build on the current state of the art. Such research
usually takes much longer and therefore results in a lower publication
frequency. Secondly, it carries a much higher risk of not producing positive
results. Thirdly, I deem the chances of acceptance at peer-reviewed
conferences and journals to be much lower, since such venues are often biased
towards current mainstream research trends. In my observation there is also an
aversion to research that questions the state of the art. People who have
spent years or decades mastering every aspect of the state of the art have a
strong incentive to shoot down anything potentially disruptive.

As such the incentives are misaligned with regard to the aim of science of
figuring out _new_ things.

~~~
jpdoctor
> _In my personal experience the main problem with scientific research today
> is caused by publication counts representing the sole metric for academic
> success._

It also leads to bloat in the literature: People learn the "least publishable
increment" and go publish that.

~~~
e_g
Exactly. A study by Tulving and Madigan (<http://alicekim.ca/AnnRev70.pdf>)
outlines this nicely. In the words of David T. Lykken
([http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.118...](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.118.2655&rep=rep1&type=pdf)):

 _In their 1970 Annual Review chapter on Memory and Verbal Learning, Tulving
and Madigan reported that they had independently rated 540 published articles
in terms of their "contribution to knowledge". With "remarkable agreement",
they found that they had sorted two-thirds of the articles into a category
labelled "utterly inconsequential"._ _"The primary function these papers serve
is to give something to do to people who count papers instead of reading them.
Future research and understanding of verbal learning and memory would not be
affected at all if none of the papers in this category seen the light of day"_

 _About 25 percent of the articles were classified as: "run-of-the-mill ..
these articles also do not add anything really new to knowledge... [such
articles] make one wish that at least some writers, faced with the decision of
whether to publish or perish, should have seriously considered the latter
alternative"_

 _Only about 10% of the entire set of published papers received the modest
compliment of being classified as "worthwhile"._ _Given that memory and verbal
learning was then a popular and relatively 'hard' area of psychological
research, attracting some of the brightest students, this is a devastating
assessment of the end product. Hence, of the research ideas generated by these
psychologists, who are all card-carrying scientists and who liked these ideas
well enough to invest weeks or months of their lives working on them, less
than 25% of 40% of 10% = 1% actually appear to make some sort of contribution
to the discipline._

Given that this work was published in the 1970s, when the "publish or perish"
pressure was less strongly developed, I expect today's numbers to be worse.
Concerning Information Retrieval, the field of my PhD research, I would
certainly attest that this applies.

~~~
pnathan
I have been trawling through large numbers of papers recently as part of my
computer science Master's background research, and some papers are quite
frankly unintelligible garble... I can't even understand how they passed peer
review.

I would say that of the articles I've read, maybe 5% are meaningful
contributions that _advance_ the science; maybe 25% contribute something to
the current understanding, perhaps 60% are "we did something cool", and 10%
are "are these reviewers awake".

------
markkat
To disrupt scientific research you need to change the way that science is
funded. Science is not for-profit, unlike the disrupted industries mentioned
here. If you have to show how determining the workings of a flagellum is going
to produce revenue, then you are going to stifle science in another way.

Good science is slow, deliberate, and motivated by open-ended questions and
discovery, not by end points.

We need to fund the process of science, and make sure that the knowledge
gained is shared publicly. No other disruption is necessary.

~~~
dalke
"Science is not for-profit"

Science is also not not-for-profit. Plenty of for-profit businesses do
research, and publish in scientific journals. Some research fields (drug
development comes first to mind) have a very large industry presence.

------
bdhe
Tim Gowers, Fields Medallist and the person who started the first Polymath
project, just wrote a couple of blog posts (the latter including suggestions
prompted by the former) on a new model of math publishing. They are highly
detailed and well thought out, and they also discuss things one might not
understand unless one works in academia.

[https://gowers.wordpress.com/2011/10/31/how-might-we-get-
to-...](https://gowers.wordpress.com/2011/10/31/how-might-we-get-to-a-new-
model-of-mathematical-publishing/)

[https://gowers.wordpress.com/2011/11/03/a-more-modest-
propos...](https://gowers.wordpress.com/2011/11/03/a-more-modest-proposal/)

------
jules
Another vector for disruption is statistics. Scientists use statistics to
analyze the results of their experiments all the time. Unfortunately, most
scientists are not statisticians. They spend a lot of time on statistics that
they would rather spend doing things they're good at, and they make lots of
mistakes: using the wrong statistical test, deciding which test to use after
the experiment, and so on.

With today's computer speeds there is no reason to do statistics the old way.
You no longer need to make strong and unjustified assumptions about your data
just so you can use a simple t-test. With simulation you can run pretty much
any statistical test you want, or do full Bayesian inference.

However, most of these scientists are not capable of programming that
themselves in R. What is needed is a simple GUI where they can state their
assumptions and enter their data; the program then calculates their posteriors
and runs whatever hypothesis test they want. The statistics scientists use
today are optimized for pen and paper, and those constraints no longer hold.
Who cares if the computation takes 500 milliseconds on a computer instead of
3?

------
iqster
A very practical problem with making data available publicly is privacy. This
is likely not an issue for data coming out of the LHC, but it comes into play
in CS research. I know a bunch of researchers who work with large datasets
containing user locations, cell tower communications, social network data,
etc. Of course, researchers work with data that is anonymized, but almost
nothing is truly anonymous. You have to assume that researchers working with
the data are going to be sensible about how they use it; if not, they will
face serious consequences (i.e., lose their jobs and damage their
reputations). A dataset made public typically cannot be controlled in this
manner.

Researchers must also go through IRBs (institutional review boards) at their
institutions prior to engaging in research that deals with human subjects. If
the collected data is going to be made publicly available, that makes the
process more arduous.

Btw ... there is a decent dataset repository for CS researchers doing mobile
computing called CRAWDAD.

~~~
brainid
"Researchers must also go through IRBs (institutional review boards) at their
institution prior to engaging in research that deals with human subjects. If
the collected data is going to be made available publicly, it makes the
process more arduous."

That is not really my experience. Getting IRB approval to release human data
just means proper de-identification which one should do anyway. For example,
generally subjects are given randomly generated IDs with only the PI having
the master list.
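The random-ID scheme described here is straightforward to sketch: generate an unguessable ID per subject, publish only the IDs, and keep the name-to-ID master list with the PI. A minimal illustration with hypothetical subject names (`deidentify` is not from any particular IRB toolkit):

```python
import secrets

def deidentify(names):
    """Assign each subject a random ID.

    Returns (public_ids, master): the IDs that may leave the lab,
    and the master list linking names to IDs, which only the PI keeps.
    """
    master = {}
    for name in names:
        sid = secrets.token_hex(4)       # random 8-hex-char subject ID
        while sid in master.values():    # guard against rare collisions
            sid = secrets.token_hex(4)
        master[name] = sid
    public_ids = sorted(master.values())  # sorted, so order leaks nothing
    return public_ids, master

public_ids, master = deidentify(["alice", "bob", "carol"])
```

`secrets` (rather than `random`) matters here: the IDs are unpredictable, so they cannot be regenerated by someone who guesses the seed.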

------
kia
_Unless scientists and researchers start to put the interests of collaboration
and “open science” ahead of their desire to be promoted or win tenure the
system will not change_

This is the main problem because most won't.

~~~
markkat
I absolutely would if I could do it and still get funded. I'd love to share
everything openly. Just pay my bills as I do it.

Currently, science is funded based on individual results. You can't blame folk
for wanting to be promoted or to win tenure.

~~~
_delirium
> Currently, science is funded based on individual results.

Yeah, there's a direct tension between the move over the past few decades
towards very competitive grant funding (10% funding rates, short grant terms,
far fewer long-term block grants for centers) and the goal of having everyone
share everything altruistically. If you purposely set up science funding so
that it encourages cutthroat competition, people are going to have to behave
like cutthroat competitors, and those who try not to will (on average) lose to
those who do.

------
grep2
Actually, there are quite a few points where modern technology would allow
significant streamlining of research processes. Off the top of my head (as I
work in the field):

1\. Data acquisition - you wouldn't believe in how many labs devices with a
computer interface (GPIB or whatever) are run in standalone mode, costing
countless grad-student hours as parameters are changed and values are read off
by hand - in the best case people duct-tape something together with a
_labview_ program. No. Just no.

2\. Collaborative data sharing - if you want to show your boss a graph, you
email him a JPEG - where is the site where you upload a CSV and show a graph
to other people or edit it together?

3\. Writing papers: the state of the art is emailing a LaTeX(!) or Doc(!) file
to your colleagues with .v1.edited appended... Reviewing the published
material is just the last step.

PS: I'm working on a solution for #3 (Etherpad + LaTeX preview + export in the
appropriate journal format). Drop me a line if you're interested in details.
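The automation gap in point 1 is essentially a missing control loop: step the instrument through setpoints, log each reading, write a CSV. A sketch with a stub instrument (in a real lab the stub would be replaced by a GPIB/VISA driver; the class and method names here are hypothetical):

```python
import csv

class StubInstrument:
    """Stands in for a bench instrument; a real driver would send
    commands over GPIB/USB instead of computing values locally."""
    def set_voltage(self, v):
        self.v = v
    def read_current(self):
        return self.v / 100.0  # pretend the device under test is 100 ohms

def sweep(inst, voltages, path):
    """Step through voltage setpoints, log (voltage, current) pairs to CSV."""
    rows = []
    for v in voltages:
        inst.set_voltage(v)
        rows.append((v, inst.read_current()))
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["voltage_V", "current_A"])
        writer.writerows(rows)
    return rows

inst = StubInstrument()
rows = sweep(inst, [0.0, 1.0, 2.0], "sweep.csv")
```

Twenty lines like this replace an afternoon of reading dials and transcribing numbers by hand, and the CSV output feeds directly into the sharing problem of point 2.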

~~~
davidblondeau
My company, Collaborative Drug Discovery
(<https://www.collaborativedrug.com>), has been developing and offering a
secure, collaborative data-sharing environment for drug discovery data
(chemistry and SAR data). There are other tools in the space as well, though
labs and companies are slow to change. The sector is very secretive and
closed, so it takes time for habits to change.

Some factors have been promoting more collaboration and data sharing:

* The increasing cost of research

* Specialization and the emergence of micro-biotech (5 people biotech startups)

* Foundations like the Bill and Melinda Gates Foundation that push for more collaboration amongst recipients of their grants (disclaimer: Collaborative Drug Discovery has received grants from the Gates foundation as well)

------
eor
I think the major funding agencies are aware of the problem and the potential,
and are working on solutions in their own way (for example, the NSF Office of
Cyberinfrastructure's DataNet program,
<http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=503141>). But major cultural
change in the way scientific research is conducted just isn't going to happen
overnight.

------
necolas
There are already some efforts to speed up peer-review and make more
scientific publications available for free to anyone, e.g.,
<http://www.plosone.org/>

> Unless scientists and researchers start to put the interests of
> collaboration and “open science” ahead of their desire to be promoted or win
> tenure the system will not change

Something similar could be said about so many industries, including our own.

------
chalst
Note that the nature of research varies quite a bit from discipline to
discipline. Physics has a preprint-oriented culture that reduces the conflict
between individual interests and the wider ends of research.

------
galactus
I wish people would stop using the word "disruptive" all the time...

~~~
tacoe
Why?

------
mhb
Society for Amateur Scientists:

<http://www.sas.org/>

------
irollboozers
bum ba da bum

<http://beta.microryza.com/>

