
The Growing Impact of Old Scientific Papers - denismars
https://medium.com/the-physics-arxiv-blog/the-extraordinary-growing-impact-of-the-history-of-science-642022a39d67
======
pervycreeper
The extent to which the lack of universal open access is impeding human
progress seems impossible to measure, but this gives us a tiny hint.

~~~
PythonicAlpha
Yes, the speed of innovation is growing with accessibility to knowledge. In
the Middle Ages, access to knowledge and science was very limited. Thus
innovation was also limited to a few people.

This should make people understand that hiding knowledge away in obnoxiously
expensive science publications is one of the biggest hindrances to innovation
today.

Scientific knowledge (when it is true) never grows old. Or would you think
for one moment that the wheel is an invention so old that we don't need it any
more?

~~~
thaumasiotes
Well, a traditional wheel, made from three wooden boards (|), doesn't see much
application today. ;)

~~~
PythonicAlpha
But if it had been patented back then by one of today's patent attorneys
and the patent were still valid, you could trash our modern life!

------
a3_nm
Very interesting result, and it makes you feel really sorry that most of those
old papers are behind paywalls. (Incidentally, this is a problem which is not
going to get solved even if everyone switched to open-access venues today.)

Just a comment though: not all citations are equal, so just counting them is
quite a crude metric. For instance, a lot of citations in my field
(theoretical CS) are "attribution citations" that point the reader to the
original paper that introduced a concept or proved a result; and these are not
the same as citations of work that you actually extend, or to which you
compare.

As in theoretical CS things progress fast and people usually improve upon
fairly recent works, my feeling (not backed up by data) is that most citations
of work older than 10 years are attribution citations; and for attribution,
you don't really need to have read the original paper, you just need to know
what it introduced or proved. So maybe the Web is making it easier to look up
older papers and cite them, but it doesn't mean that the older paper will
influence your research beyond adding a bibliographical entry.

You could say those additional cites may be useful to the reader, but even
then, readers unfamiliar with a concept would often do better to find a recent
survey about the concept, rather than try to understand the original paper
that introduced it. (The original paper is usually hard to read because it is
old and language and notation have changed; and people probably didn't have a
good understanding of the concept when they introduced it.) So those citations
are mostly for courtesy.

~~~
gioele
> Just a comment though: not all citations are equal, so just counting them is
> quite a crude metric. For instance, a lot of citations in my field
> (theoretical CS) are "attribution citations" that point the reader to the
> original paper that introduced a concept or proved a result; and these are
> not the same as citations of work that you actually extend, or to which you
> compare.

And these are not the same as citations of other works that your work is about
to disprove: "A recent experiment on mice (Doe et al., 2014) completely failed
to take into account the variance of X". There we go, Doe just got one more
citation.

------
privong
_“Our analysis indicates that, in 2013, 36% of citations were to articles that
are at least 10 years old and that this fraction has grown 28% since 1990,”_

Put another way, in 1990, about 28% of citations were to articles that were at
least 10 years old (28 * 1.28 ≈ 36). So, in 1990 a significant share of
citations already went to older papers.
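
A minimal sketch of that back-calculation, assuming the quoted "grown 28%" is a relative increase rather than 28 percentage points (the function name is just illustrative):

```python
def baseline_share(current_share: float, relative_growth: float) -> float:
    """Share before a relative increase, from current = baseline * (1 + growth)."""
    return current_share / (1.0 + relative_growth)


# 36% of citations in 2013, after a 28% relative increase since 1990
share_1990 = baseline_share(0.36, 0.28)
print(f"Implied 1990 share: {share_1990:.1%}")  # Implied 1990 share: 28.1%
```

If the 28% were percentage points instead, the 1990 share would have been only 8%, so the reading of the quote matters.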

I wonder if there is a way to weight individual citations within each work
(e.g., by age of the paper cited?) to further strengthen the signal.

Also, at some point, the fraction of old papers cited should approach 0% (the
fields had to start sometime). It would be interesting to reproduce this
analysis for time bins which are older. I presume one would find that the
fraction of citations that are to papers 10 years or older would be a
monotonically increasing function of time. So one then needs to ask if this
increase is due to better access to articles or if it is simply due to there
being a larger body of work which is older than 10 years?

~~~
yxhuvud
Fields generally don't start - they branch off as a specialization of
something else. So no, there will be no 0% unless you go really far back.

~~~
privong
Right, the aside about "starting" was hyperbole. And I purposefully said
"approach 0%", not "be 0%".

As an example, there are astronomy publications stretching over more than 100
years, which could be used in a study like this. Analyzing the citation data
in 10 year bins may be able to see if the increase in citations to "old"
papers (> 10 years old at the time of the citation) is due to an increased
corpus of papers (the citation fraction should rise with time, likely related
to the total number of prior papers in existence) or due to improved access to
older papers (the change in the past 10–20 years should be significantly
greater than in the previous time bins).
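
As a rough sketch of what that binned analysis could look like, assuming the citation data is available as (citing year, cited year) pairs. The function name and the toy data are made up for illustration; this is not the method used in the paper:

```python
from collections import defaultdict


def old_citation_fraction_by_bin(citations, bin_width=10, old_threshold=10):
    """Fraction of citations to "old" papers, grouped by decade of the citing paper.

    citations: iterable of (citing_year, cited_year) pairs.
    Returns {bin_start_year: fraction}, ordered by bin start year.
    """
    totals = defaultdict(int)
    old = defaultdict(int)
    for citing_year, cited_year in citations:
        bin_start = (citing_year // bin_width) * bin_width
        totals[bin_start] += 1
        # "Old" means at least old_threshold years old at the time of citation
        if citing_year - cited_year >= old_threshold:
            old[bin_start] += 1
    return {b: old[b] / totals[b] for b in sorted(totals)}


# Made-up toy data, for illustration only
toy = [(1975, 1970), (1978, 1960), (1985, 1984),
       (1988, 1970), (1989, 1975), (1995, 1980)]
fractions = old_citation_fraction_by_bin(toy)
for bin_start, fraction in fractions.items():
    print(f"{bin_start}s: {fraction:.0%} of citations are to old papers")
```

With real data, a steady rise across all bins would point to the growing corpus, while a jump confined to the last one or two bins would point to improved access.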

------
vanderZwan
So here is one definition of culture, emphasis mine:

> "Culture refers to the _cumulative deposit of knowledge_ , experience,
> beliefs, values, attitudes, meanings, hierarchies, religion, notions of
> time, roles, spatial relations, concepts of the universe, and material
> objects and possessions acquired by a group of people in the course of
> generations through individual and group striving."[1]

With that in mind, this development can only be a good thing. I wonder if it
measurably speeds up scientific developments? If time and energy don't have to
be spent rediscovering something, the more it can be spent on building on the
existing knowledge instead.

[1]
[http://www.tamu.edu/faculty/choudhury/culture.html](http://www.tamu.edu/faculty/choudhury/culture.html)

~~~
privong
> With that in mind, this development can only be a good thing. I wonder if it
> measurably speeds up scientific developments? If time and energy don't have
> to be spent rediscovering something, the more it can be spent on building on
> the existing knowledge instead.

It will be interesting to see. Though, in order to reap the benefits, time and
energy need to be spent reading and familiarizing oneself with the older
literature. An unfortunate result of the "publish or perish" culture of
science today is the explosion in the number of papers being published. This
makes it difficult to keep up with, and digest, the new results that are
coming out. Given that, it may be difficult for people to add to that the
older literature.

Certainly one could argue that understanding the old literature first is the
correct way to go about it, but one cannot sacrifice an understanding of the
recent literature. Papers can get dinged for only citing old results, which
can have the unintended side-effect of suggesting: a) that the particular topic
is dead, as a field of study, or b) that the authors are unable to place their
work in the context of recent results, which shows a lack of knowledge of the
field. So, keeping up with the current/recent literature is necessary, and
there are only so many hours in a day.

It is certainly no excuse for not knowing the older literature, but it is a
realistic constraint on what can be expected.

Edit: I should add, the electronic access to old Astronomy articles has been
of great help to me, and has resulted in my finding and reading older papers
which are relevant to the work I am doing. It would have been much more
difficult when the only way to read the papers was to find a physical copy of
the journal.

~~~
seanmcdirmid
Publish and perish creates more noise, especially in more mature fields. And
those papers have citations, adding to the citation count. That older
citations are preferred in such an environment is not very surprising.

There are plenty of dubious reasons to cite newer papers that happen in a
competitive publishing environment; e.g. you might try to ingratiate yourself
with the PC by citing their most recent papers (and protect yourself from
stupid rejections). This further distorts the results, making citations often
not a very useful measure of progress and impact (some fields are worse than
others in this regard).

------
vegabook
Perhaps the quality of recent scientific research has not increased as much
as its quantity, so the importance of each individual research piece is lower.
Also, selection bias is at work: like hit music, only the good stuff still gets
played, and if there is a sea of lower quality new work, the old foundational
classics will be favoured now that they are more easily found.

This helps with a dilemma I often face - that of buying recent or older works
on Amazon or elsewhere. Instead of always buying the most recent publications
on a topic, why not buy the axiomatic decades old works.... indeed in my
collection I often find these display more information density, and higher
clarity of thought: the latter being inversely proportional to ease with which
a document can be produced.

------
spindritf
The alternative explanation to easier access is "great stagnation." Discovery
is slowing down, low hanging fruit has been picked, so older papers are
relatively more important than they used to be.

------
wolfgke
Relevant:
[https://archive.org/stream/GuerillaOpenAccessManifesto/Goamj...](https://archive.org/stream/GuerillaOpenAccessManifesto/Goamjuly2008_djvu.txt)

------
frozenport
>>this fraction has grown 28% since 1990

Is 28% growth over 25 years significant? How did the growth in the last 10
years look? Somehow the algebra in the article makes for a far more moot
point.

------
bsder
It could also point to the fact that there are a lot of new papers that don't
do anything worth citing.

~~~
raverbashing
Oh they do, they fill publishing quotas

More papers, more grants

However, if I'm not mistaken, 10 years ago the buzzword was telemedicine, so
if your article was about cryptography you would just stick in somewhere that
it could make telemedicine safer or something.

------
SFjulie1
It could also have been renamed: the diminishing impact of modern papers,
because every scientist knows they are not worthy.

~~~
joelthelion
Or more plausibly, because they are behind a paywall.

~~~
simoncarter
Most academic institutions pay to have access to the large academic sites. If
you do need to read a paper that's still behind a paywall, you usually email
someone at a different university who does have access, or even just look up
the author's academic page, which will often have a PDF. So I can't see
paywalls being an issue. It certainly never was for me or my former
colleagues.

~~~
dalke
As you point out, not all academic institutions have this access. For example,
a college which only teaches undergraduates is unlikely to subscribe to the
specialist journals, even though a couple of the professors will be interested
in those topics. (One common solution is for, say, the chemistry professors to
get a personal ACS membership, which gives access to a limited number of ACS
journal articles per year.)

There are researchers at companies. There are researchers with no affiliation.
Many have an issue with paywalls even though you haven't.

I'm a self-employed software developer in cheminformatics who also does
research in the history of the field. I can do this because the local(ish)
chemistry library has most of the papers on paper in the basement. It's a
public library, supported by my taxes. Otherwise it would be very expensive to
get copies of the hundreds of papers I've read or looked through.

As an example, one of the papers from the 1960s has information I wanted in
'figure 2'. Only it turns out that figure 2 was swapped with figure 2 from the
next paper in the journal. Both papers were by the same author. I don't know
if it's an author error or a layout error by the journal. It would have been
much harder to figure that out if I had to ask friends at another site for a
copy of the paper in the first place.

So yes, I am a researcher whose research is restricted by the cost of reading
the latest journals. My decision to look at the history of the field, rather
than the present, is partially influenced by the fact that I have better (read
"cheaper") access to the old materials than the new. Interlibrary loan is
amazing.

~~~
simoncarter
Your response implies that you've not read the thread of comments I was
contributing to. I wasn't taking a position on paywalls, nor on current
publication/journal practice. Rather, joelthelion tried to argue that paywalls
could possibly explain the lack of citations for more recent publications. I
gave counterarguments. Not sure what your comments on the rights or wrongs of
paywalls and journal practices have to do with this?

> There are researchers at companies. There are researchers with no
> affiliation. Many have an issue with paywalls even though you haven't.

I never made any comment either way. I really don't see how you can make that
comment. I just mentioned that researchers I have known find ways to get round
paywalls, if they ever happen to encounter one, including even emailing the
authors of the papers. If you want to publish you can't submit a
paper for review without having demonstrated knowledge of the related
literature, and where your work fits within that. Researchers will find a way
to read and cite the relevant literature that they need to, and thus can't be
used as an excuse for lower citations for more recent papers.

EDIT: Didn't intend for my post to be harsh.

~~~
dalke
The thread is "the diminishing impact of modern papers because ..." with the
alternative suggestions that a) 'every scientist knows they are not worthy',
and b) 'because they are behind a paywall'.

You replied to (b), saying that that likely wasn't the case because "I can't
see paywalls being an issue. It certainly never was for me or my former
colleagues [at academic institutions]."

My reply is two-fold. First, it affects me. I am writing a paper. I have
excellent citations from the 1950s to 1980s because all of that is on paper,
which is easily ("cheaply") accessible to me. I don't have good citations for
the 1990s and onwards because those cost something like $30 each from the
publisher. (It's actually cheaper to get most of them through Interlibrary
Loan, which has much lower page charges than the publisher.)

Hence, just like you have observations that it doesn't affect your research, I
have a counter-observation that it does affect my research.

The second point was to highlight your implicit suggestion that nearly all
research is done at academic institutions. While you didn't say it explicitly,
your counter-argument is very weak unless you make that assumption. I don't
think you meant to make a weak argument. With the same weak argument, I could
say that it affects me, and friends of mine who are self-employed or working
in small companies, so therefore everyone must be affected by it.

Now, I think it's true that most published research is from academics, though
since I work in mostly pharmaceutical chemistry I can say that many
publications in my field come from industrial research.

"If you want to publish you can't submit a paper for review without having
demonstrated knowledge of the related literature, and where your work fits
within that."

Yes, I know that. The cost of doing the literature research has made it very
hard for me to publish. Indeed, that's my point. As a self-funded researcher,
I can say that science is an expensive endeavor.

"and thus can't be used as an excuse for lower citations for more recent papers"

Strictly speaking that's not true. It could be that more people are publishing
historical reviews. It could be a modern trend to include older citations.
If you look at papers from the 1950s, you'll see that there might only be a
few citations. By comparison, modern papers sometimes seem to use citations
as a badge of honor, or proof that the person is scholarly.

Based on the evidence in the paper, therefore, you cannot make the conclusion
you did. Nor can the paper's authors, since all the paper did was observe a
trend that is in alignment with the hypothesis. The next set of tests might be
to pick out a selection of papers and ask people now to judge which items need
a citation. If the same set of judges say that papers from the 1990s should
have had more citations, then this would suggest that there's been a cultural
change.

The paper suggests that multiple factors may be involved. It does not identify
which of those are the most important. They point out that the chemistry field
is one of the few which hasn't changed. This happens to be my area of
experience.

Mechanical search of chemical documentation started in the 1940s with punch
card machines. Organizations like CAS, from the American Chemical Society,
have long existed to make chemical documentation more searchable. Companies
like ISI (now Thomson Reuters) started in the 1960s to computerize entry and
keyword-based search, with online searches by the 1980s, though not full-text
search.

The lack of change may indicate that the search technology of the 1980s, based
on human indexing and keyword searches, is sufficient for the gains seen. To
be fair, chemical publications are more open to keyword indexing than, say,
Health & Medical Sciences. (I know this from reading some of the ISI
publications, dating from when they entered the Health & Medical Sciences
field.)

This paper says "between 1990 and 2013, the number of scholarly articles
published per year grew close to 3-fold. As a result, there is much more
recent work for researchers to learn from, build upon and cite". I've been
reading the chemical documentation literature from the 1960s and 1970s. They
were talking about exponential growth back then.

For example, I have a chemical information textbook from the early 1970s
saying that the doubling time for chemical documentation is about 13 years. I
checked the modern numbers, and it still holds.

If the exponential rate of growth is the same then and now, then a paper in
1990 would be equally biased towards recent decade papers, on a percentage
basis, as someone writing now. That's how exponential curves work.
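
A small sketch of that claim, under the idealized assumption that yearly output has grown at one steady exponential rate (13-year doubling) for a long time. It describes only the age mix of the existing literature, not how people actually choose what to cite, and the function names and the 1700 start year are illustrative:

```python
def share_older_than(age_years: float, doubling_time_years: float) -> float:
    """Share of all papers published so far that are older than age_years,
    for steady exponential growth with the given doubling time."""
    return 2.0 ** (-age_years / doubling_time_years)


def simulated_old_share(current_year: int, doubling_time_years: float = 13,
                        age_years: int = 10, start_year: int = 1700) -> float:
    """Same quantity, computed by summing hypothetical yearly output."""
    output = {
        year: 2.0 ** ((year - start_year) / doubling_time_years)
        for year in range(start_year, current_year + 1)
    }
    total = sum(output.values())
    old = sum(n for year, n in output.items() if current_year - year >= age_years)
    return old / total


closed_form = share_older_than(10, 13)
share_1990 = simulated_old_share(1990)
share_2013 = simulated_old_share(2013)
print(f"closed form: {closed_form:.3f}")
print(f"1990: {share_1990:.3f}, 2013: {share_2013:.3f}")
```

Under these assumptions the share of the literature that is more than 10 years old comes out at roughly 59% in both 1990 and 2013, so corpus growth at a constant rate would not by itself shift the balance between old and recent papers.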

So, I'm not all that convinced about the paper. They aren't able to
distinguish between a cultural change and an access to information change, or
if it's due to improved search technology (eg, 1970s tech but with auto-
indexing) or due to easier access to the literature.

------
tikhonj
At least for me, the title was a bit confusing. It's not the "History of
Science" as a subject of study in and of itself or a specific academic field,
but rather the large corpus of existing results and publications that's having
an effect.

I find this distinction very important, because it helps separate the
immediate process of science--the people, the historical quirks--from the
actual results. It's not a perfect division, of course, but I think it's
pretty good in most scientific fields and also very important. It helps
distance science, the cumulative understanding of our world, from the people
who produced it who are, after all, just human. In my view, this is the main
goal of the scientific process, so it's just another component of what makes
science _science_.

This is also not to say that the history of science is not interesting or
worth studying on its own, merely that it is something largely distinct from
the underlying science itself and should ideally be kept that way.

~~~
_delirium
The claimed stat seems more ambiguous to me on which of those possibilities is
the case. It's specifically citations to old papers that are increasing. That
is the _original_ old papers, not just their results. Citing old results
"updated" or "collected", in e.g. a modern textbook or survey article is one
thing, and could more plausibly be divorced from the history of science. But
if people are really reading (vs. just blindly citing, which may be the case)
old scientific papers, that seems more to me like the history of science, in
at least a somewhat broader sense, is poking its way into modern science more
significantly than it used to.

When you read the original papers, they come along with their whole historical
era, all the way down to the quirks of language, notation, problem framing,
epistemological assumptions, etc. You even, sometimes, need to know something
about the history-of-science as a field to correctly read and interpret a
paper from a different era, which you probably want to do if you are really
citing it for significant content. (Admittedly, a lot of the citations are
probably throwaway cites to old papers the authors haven't themselves read,
along the lines of "This was first studied in 1927 [1]". In that case someone
still needs to do some history-of-science work to read this 1927 paper and
characterize it accurately as "first" to study some problem... but that work
might be done by someone else.)

