The most-cited papers (nature.com)
101 points by harleyk on Oct 29, 2014 | 20 comments

Really enjoyed the relationship between this paper's title and the graph of its citation frequency:


I think it would have been useful to color the graph by discipline.

The Perdew-Burke-Ernzerhof functional, and the generalized gradient approximation more broadly, are a Big Honking Deal for accurate quantum-mechanical simulation of solids.

For more: http://en.wikipedia.org/wiki/Density_functional_theory

(I used to be a happy customer, back when I was a scientist.)

Flipped through to find whether genomics/bioinformatics would be well represented; there were at least 2 papers on BLAST in the top 15, and the 4th paper is the Sanger method. But where is Watson & Crick?

My guess is that all of the subsequent papers that use BLAST as a tool have to cite it. Similarly, all sequencing papers cite Sanger as a tool, which is why its citation rate dropped when next-gen sequencing methods replaced it, which goes to show citation is not an accurate measure of scientific impact because it is the equivalent of citing "Git/compiler/IDE" for a software project.

>which goes to show citation is not an accurate measure of scientific impact because it is the equivalent of citing "Git/compiler/IDE" for a software project.

The key difference is that BLAST is not a commodity, or at least was not at the time of its release. If there were only one compiler, or one best compiler, we would cite it in CS papers. In fact, "David A. Wheeler's sloccount" is commonly cited in CS papers, as are Weka, LLVM's Klee, and Z3.

On the other hand, nobody cites Dijkstra's algorithm, or optimizing compilers, because those are considered foundational and have not changed in a long time. If BLAST had never been supplanted, it would eventually stop being cited because that would simply be how sequencing was done. But since it represents a new practice, citing it is necessary to place your work in the context of other work.

I don't see how you can say that BLAST or Klee were not scientifically impactful -- being the force multiplier that enables other avenues of research is possibly the most valuable thing a scientist can do. For example, should a paper that proves P!=NP instantly become the most cited paper in computer science? Though the question is one of the deepest open problems in the field, it isn't relevant at a lot of levels of study, and so it probably wouldn't ever become more cited than Klee.

Further still, I'd guess that the top 1% of cited papers are mostly methods papers, as the article bears out. The next tier are the top-notch findings papers that you mention, because they spark other research. The C4.5 paper probably doesn't have as many citations as Weka, because Weka is relevant beyond C4.5, but it certainly has more citations than a less-meaningful machine learning paper.

To carry all of this back to a software project, software projects in their Readme.md always cite their language (like a key method), always cite their dependent libraries (like specific methods used in that work), but never cite things like binary search (a foundational finding). A project MIGHT cite a compiler if that compiler is the only compiler that compiles it! In that case the compiler is not a commodity, and thus worth citing. Version control and editing are totally orthogonal to the actual software, unless you're working in a visual language or something, in which case you would in fact cite your IDE.

Thanks for your detailed response. I see your point. Tangentially, I used to use Weka and liked it a lot, and it's good to know that it is still very popular despite all of the Big Data/Data Science hype around Hadoop/Apache Spark.

An interesting follow-up would be to run PageRank on the citation graph. That should lower the importance of the scientific methods papers, since they are likely to be cited by a very large number of random papers of limited importance, while bumping up papers that have led to further work that is itself important.
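For what it's worth, here is a rough sketch of that idea in plain Python. The toy graph, the paper names, and the 0.85 damping factor are all made up for illustration; this is not the article's data, and a real analysis would need the full citation graph. The "methods" paper gets three citations from otherwise uncited papers, while the "finding" paper gets only two, but from follow-ups that are themselves cited.

```python
def pagerank(citations, damping=0.85, iterations=100, tol=1e-10):
    """Power-iteration PageRank; citations maps a paper to the papers it cites."""
    # Collect every paper that appears as a citer or as a cited work.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    papers = sorted(papers)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}

    for _ in range(iterations):
        # Papers that cite nothing spread their rank uniformly over all papers.
        dangling = sum(rank[p] for p in papers if not citations.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in papers}
        # Each citing paper splits its damped rank evenly among the papers it cites.
        for citer, cited in citations.items():
            if cited:
                share = damping * rank[citer] / len(cited)
                for p in cited:
                    new_rank[p] += share
        delta = sum(abs(new_rank[p] - rank[p]) for p in papers)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: keys cite the papers listed in their values.
toy = {
    "a": ["methods"], "b": ["methods"], "c": ["methods"],
    "followup_1": ["finding"], "followup_2": ["finding"],
    "d": ["followup_1"], "e": ["followup_1"], "g": ["followup_1"],
    "h": ["followup_2"], "i": ["followup_2"], "j": ["followup_2"],
}
ranks = pagerank(toy)
raw_counts = {p: sum(p in cited for cited in toy.values()) for p in ranks}

for paper in ("methods", "finding"):
    print(paper, "citations:", raw_counts[paper], "pagerank:", round(ranks[paper], 4))
```

In this toy case "finding" ends up with the higher PageRank despite the lower raw citation count, which is the effect described above. Whether that holds on real citation data is a separate question.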

Citation count and co-authorship apparently do a better job than PageRank at pointing out good papers, even though PageRank was in part inspired by citation studies. http://link.springer.com/article/10.1007%2Fs11192-007-1908-4 (Paywalled)

Two of the top 10 are different editions of a laboratory manual: Sambrook, J., Fritsch, E. F. & Maniatis, T. Molecular Cloning (1989), and Maniatis, T., Fritsch, E. F. & Sambrook, J. Molecular Cloning: A Laboratory Manual (1982).

Citing Laemmli et al. was de rigueur for many years. While it was certainly an influential technique, it doesn't rank above the discovery of DNA.

What I find interesting is that many papers cite references from the '70s, and sometimes ones even 100 years old, in which very crude tools were used to reach certain conclusions.

I'm not saying we should discard old scientific discoveries, but it would be interesting to redo the experiments with today's technology.

I think that one usually cites the paper where the idea was first proposed, and also some of the latest advances in the field (including, as in your example, papers where the experiment is redone with the newest technology).

The thing is that "the latest paper" on the topic is very likely to change over time, but the first paper that proposed the idea is likely to remain the same.

Nobody discards "old" science discoveries, any more than they discard the "old" inventions of steel, or indeed the wheel; they build upon them.

Re-visiting previous studies with newer approaches is a common theme in archaeology, where sites or artifacts are sometimes incompletely investigated on purpose, with the view to leaving undisturbed material that can be investigated in the future using technology not yet conceived of.

I hope this doesn't make me sound like a hipster, but I can't remember the last time I saw a wheel.

How did you get to work today? Even if you walked, are there really no vehicles along the road?

So, I was actually making a joke :)

How did you scroll this page?

Touchpad/touchscreen swipe?


This is purely anecdotal, but one of my favorites is "Medical researcher discovers integration, gets 75 citations."

