
Hundreds of extreme self-citing scientists revealed in new database - dmckeon
https://www.nature.com/articles/d41586-019-02479-7
======
cbanek
As someone who now works in astronomy, I'm not at all surprised at the high
self-citation rate for the field. It is true that a lot of papers are
published by large consortiums. For example, at LSST (where I work), if you
have been working on the project for 2 years, you are considered a "builder"
and added as an author to all major project wide papers.

Those papers, which tend to be long and full of great stuff, are cited a lot,
and have hundreds of authors.

I wonder how many of these cases involve the first author citing other
papers where they were also the first author (or really, at least among the
first few authors). For the data shown, it seems a citation counts as a
self-citation if anyone in the author list appears anywhere in the author
list of the cited paper?
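
To make the distinction concrete, here's a rough sketch in Python (with
made-up paper records; the study's actual author matching is presumably more
involved):

    # Hypothetical records: authors are ordered lists, cites are paper ids.
    papers = {
        "A": {"authors": ["Alice", "Bob", "Carol"], "cites": ["B", "C"]},
        "B": {"authors": ["Alice", "Dave"], "cites": ["C"]},
        "C": {"authors": ["Eve"], "cites": []},
    }

    def is_self_cite_any(citing, cited):
        """Loose definition: any overlap between the two author lists."""
        return bool(set(papers[citing]["authors"]) &
                    set(papers[cited]["authors"]))

    def is_self_cite_first(citing, cited, k=1):
        """Strict definition: overlap among the first k authors only."""
        return bool(set(papers[citing]["authors"][:k]) &
                    set(papers[cited]["authors"][:k]))

    for citing, record in papers.items():
        for cited in record["cites"]:
            print(citing, "->", cited,
                  "any-author:", is_self_cite_any(citing, cited),
                  "first-author:", is_self_cite_first(citing, cited))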

Also for some research niches, you may be one of the few people writing papers
on a subject. There's no one else to cite.

I do think there are some very valid points about bringing the reader up to
speed on the previous research that led to the current paper. But I don't
think those citations should really count in metrics for how successful a
scientist is.

To be honest, I find all the metric gaming about number of papers and
citations to be ridiculous. I don't hear many people saying they want to write
the best paper in their field, or something new. It all seems to be a numbers
game these days. Academic career growth hacking, if you will.

~~~
o09rdk
This probably varies by field, but the "large project" thing can be gamed too.

So, for example, in biomedicine you often have lots of people on a paper who
might only read a draft, make some trivial suggestions, and then be added as
an author.

As a result, there's this pressure for large groups to form, where everyone is
added and everyone can cite each other.

This doesn't mean the projects are bad, but it does lead to individuals with
large citation counts primarily because they find ways to add themselves to
everything, regardless of their level of effort. People should get credit
where it's due, and large projects involve lots of people. But what defines a
"project" has become very vague.

I've become extraordinarily disillusioned with academia. Science gets done,
but the rewards seem to filter preferentially to those who are able to game
the system, and the system exists out of a need to make oneself look as
productive as possible, in areas where contributions are generally,
necessarily, tiny or nonexistent even among very competent people, because
the problems are hard and because so many people see the same things at the
same time.

~~~
cbanek
I don't disagree with what you're saying, but I think LSST's builder concept
is actually quite amazing and the opposite side of that coin.

For the people building the telescope (think hardware, software, logistics,
everything before the science can be done), many of whom are not academics
and don't typically get authorship or write papers, it's great to get credit
for working on the project in a formal, public way. You don't even have to
edit the paper or perform some task directly related to it, which I agree
can otherwise get somewhat clique-ish.

~~~
Bartweiss
The builder concept is actually really appealing. Academia can tend towards
the same problem as consulting, where tasks get sharply split between "credit-
producing" and "not worth doing".

Answering questions about your past papers; looking over someone else's
proposed methodology; or cleaning up an internal tool into one you can share
are all great tasks for advancing the field, but none of them bolster a CV,
earn grants, or help you get tenure. If you want credit for them, you usually
have to commit lots _more_ time to the task, like running a formal discussion,
becoming an author, or polishing the tool into an OSS contribution. All too
often, the result is siloed projects and work abandoned as soon as it's
published. (How many papers offering some novel twist on priming or ego
depletion could have been turned into replication-and-extension if past
authors had been involved?)

Especially in astronomy, with large projects and lots of non-PhD team members,
this makes so much sense. (I believe something similar may happen at LIGO - if
not formally then at least in practice?) If work is going to be judged by
authorship, it's only fair to recognize that at a certain point the
groundwork and ad-hoc help people give are comparably valuable to the act of
writing up some chunk of the text.

~~~
godelski
This definitely happens on LIGO. You have hundreds of authors. My optics
professor in undergrad was never a first author but he sure is an author on a
lot of papers.

------
melling
"In 2017, a study showed that scientists in Italy began citing themselves more
heavily after a controversial 2010 policy was introduced that required
academics to meet productivity thresholds to be eligible for promotion"

Cobra Effect

[https://en.wikipedia.org/wiki/Cobra_effect](https://en.wikipedia.org/wiki/Cobra_effect)

~~~
MaxBarraclough
I don't see that it's the Cobra effect. Did it actively make them less
productive?

~~~
duskwuff
If it means that some academics are choosing to cite their own papers rather
than keep up to date on the literature of their field, then some potential
insights are likely being missed.

~~~
lonelappde
There's no reason to assume that adding a self-citation affects how much
they read others.

------
not2b
Self-citation is appropriate for a new paper that builds on the results of a
previous paper. But in evaluating how influential a researcher is, it makes
sense to exclude self-citation, while being careful to avoid any implication
that self-citation is wrong.

~~~
n1231231231234
When self-citation is OK and when it is not should be part of one's
academic/PhD training.

PLOS gives reasonable citation guidelines, and in this context their Rule 5
is particularly relevant:
[https://journals.plos.org/ploscompbiol/article?id=10.1371/jo...](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006036)

~~~
ccalf
I find this approach strangely condescending. For example the author says:

> Understanding the value attributed to X, Y, and Z in that particular text
> requires assessment of the rhetorical strategies of the author(s).

They could've just said, if you want to know _why_ the author thinks XYZ are
important, you need to look at _what_ they are saying about it.

I'm a hardcore postmodern leftist, but I don't see how writing in such a
contorted way helps practicing scientists. In fact I would argue that this
kind of listing obscures a politics of its own; it is so busy prescribing
citation practices that it won't examine its own politics.

That said, this is the first time I've seen this guide, so maybe I need to
read up on the issues; still, a list of dos and don'ts isn't the best way to
introduce the issues and help people understand them.

~~~
EGreg
What is a hardcore postmodern leftist? You don’t believe in objective truth
but make claims as if it exists?

------
hannob
The core problem here is that universities think citation statistics are a
useful metric for evaluating the quality of a scientist's work. There's
plenty of evidence that this is not the case, or even that the reverse may
be true [1], but the idea refuses to die.

[1]

~~~
PeterisP
It sucks as a metric, but it does have _some_ rough correlation in most
cases, and I'm not aware of any better, easily measurable metric - if you
have one in mind, it'd be great to hear it. The alternative of having a
bureaucrat "simply judge quality" is IMHO even worse, even less objective,
and even more prone to being gamed.

The main problem is that there is an objective need (or desire?) by various
stakeholders to have some kind of metric they can use to roughly evaluate
the quality or quantity of a scientist's work, with the caveat that people
_outside_ your field need to be able to use it. I.e., let's assume that we
have a university or government official who, for some valid reason (there
are many of them), needs to be able to compare two mathematicians without
spending excessive time on it. Let's assume that the official is honest,
competent, and in fact a scientist him/herself, and so can do the evaluation
"in the way that scientists want" - but that official happens to be, say, a
biologist or a linguist. What process should be used? How should that person
distinguish insightful, groundbreaking, novel, and important research from
pseudoscience or a salami-sliced paper that brings nothing new to the field?
I can
evaluate papers and people in my research subfield, but not far outside of it.
Peer review for papers exists because we consider that people outside of the
field are not qualified to directly tell whether that paper is good or bad.

The other problem, of course, is how do you compare between fields - what data
allows you to see that (for example) your history department is doing top-
notch research but your economics department is not respected in their field?

I'm not sure that a good measurement _can_ exist, and despite all their deep
flaws it seems that we actually can't do much better than the currently used
bibliographic metrics and judgement by proxy of journal ratings.

Saying "metric X is bad" doesn't mean "metric X shouldn't get used" unless a
better solution is available.

~~~
gdhbcc
Why not simply use replication as a measure? Have your studies been
replicated? How many other studies have you replicated?

Would both help solve the replication crisis, and resolve this problem.

Of course, then you might have 10,000 studies replicating the same
easy-to-do study... which is why the "score" should be reduced based on how
many times that study has already been replicated.
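
A minimal sketch of that diminishing-returns idea (the scoring rule here is
purely hypothetical; any real scheme would need field-specific weighting):

    from collections import defaultdict

    # Hypothetical records: each replication names the study it replicates.
    replications = [
        ("rep1", "studyA"), ("rep2", "studyA"), ("rep3", "studyA"),
        ("rep4", "studyB"),
    ]

    # The k-th replication of the same study earns 1/k credit, so the
    # 10,000th replication of an easy study is worth almost nothing.
    seen = defaultdict(int)
    credit = {}
    for rep, study in replications:
        seen[study] += 1
        credit[rep] = 1.0 / seen[study]

    print(credit)  # {'rep1': 1.0, 'rep2': 0.5, 'rep3': 0.333..., 'rep4': 1.0}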

~~~
newacctjhro
This might work for hard sciences, but not for mathematics.

Or, I dunno, paleontology or sociology or other stuff.

~~~
CrazyStat
Indeed. My research (in statistics) is primarily methodological: I invent and
describe methods that might be useful, and on a good day prove some
theoretical results demonstrating that they might be useful. There's nothing
to replicate there.

Citations can be a useful metric here, particularly if you can identify
citations of people actually using the method (as opposed to people just
mentioning it in passing, or other methodological researchers comparing their
own methods to it).

~~~
didibus
Wouldn't replication here just be peer review?

------
einpoklum
I'm not Italian... and am not meeting any productivity threshold.

But my work is incremental, and I obviously don't want to repeat what I said
in a different paper, so I cite earlier work in later work. TBH, I don't think
it's possible to avoid self-citation unless:

1. Your research is so popular that by the time you need to cite it, it's
been surveyed, improved upon, or otherwise adapted.

2. You switch research subjects relatively often.

3. You publish "blocks" of work, each based on fundamentals in your field
established by others - and the blocks themselves are not incremental.

~~~
maaaats
It doesn't say that self-citation is wrong per se, but that some people use it
to game their citation count to the extreme.

------
tony
If you narrow yourself to a specific niche well enough, you'll see the same
names in citations. To be fair, the areas I dig into don't feel nearly as
competitive as, say, physics, which I couldn't make heads or tails of.

The whole reason the internet and wikis took off is that we were very liberal in
how we linked. If we disallowed inbound citations, wouldn't it be a lot harder
to backtrack and grasp contextual underpinnings?

Anecdote: In the field of adult attachment theory <-> love there are a few
prominent scholars who cite each other: Shaver, Hazan, Mikulincer. They
write papers citing their own work and each other's [1]. There's also a book
by Mikulincer that highlights Shaver's upbringing with his parents, his past
as a hippy, etc. They're delivering very nice content, and they cite others
outside their ("circle"?).

Are there potentially scholars in the field with valuable contributions that
go unnoticed? Possibly. It doesn't make self-citations in their papers any
less helpful. Also I worry that regulating citations through some system may
affect the quality of content and fix something that's not broken.

Which brings me to another issue, aren't we supposed to be helping each other?

[1] Example:
[http://adultattachmentlab.human.cornell.edu/HazanShaver1990....](http://adultattachmentlab.human.cornell.edu/HazanShaver1990.pdf)

~~~
LeonB
When you say ‘“circle”?’ I think “clique” is appropriate as ‘a network where
every node is connected to every other node’.

------
ChuckMcM
Perhaps it would be useful for reviewers to point out which citations do not
contribute to the paper? It really is a tough problem. If someone is toiling
along in some niche they have carved out, they and their colleagues may be
the only ones working in that space. That leads to a lot of cross-citation
and self-citation.

That said, if you publish paper A and then cite it in paper B, which builds
on that work, then in paper C you really only need to cite paper B if you're
building on B's work - not both B and A. It might make for an interesting
data set to plot out those sorts of relationships.
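
That plot would essentially be the transitive reduction of the citation
graph: keep the edge C -> A only if it isn't already implied by a path
C -> B -> A. A toy sketch with networkx (assuming, as an approximation, that
the citation graph is acyclic):

    import networkx as nx

    # Toy citation graph: C builds on B, B builds on A, and C also cites A.
    G = nx.DiGraph([("C", "B"), ("B", "A"), ("C", "A")])

    # Transitive reduction drops C -> A, since C -> B -> A implies it.
    R = nx.transitive_reduction(G)
    print(sorted(R.edges()))  # [('B', 'A'), ('C', 'B')]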

~~~
_delirium
As a reader I personally prefer if they do a more complete set of citations,
instead of making me follow up a multi-step chain to dig them up, as if I'm a
compiler resolving transitive dependencies. I like little history-map
sentences like: "This technique was introduced by Foo (1988) and recast in the
modern computational formalism by Bar (2009); the present work uses an
optimized variant (Bar 2012)."

You _could_ just cite the last paper here, which is the only one used
directly, and which presumably itself cites the earlier papers. But it's more
useful to me if you include the version of the sentence that cites all three
and briefly explains their relationship.

~~~
jrumbut
That kind of sentence is gold.

Often half or (much) more of the value of a paper is in the references, and
that's not a bad thing. Sometimes it is the first thing I read.

There's no ink shortage, no link limit on the Internet, and every paper has an
abstract for quick filtering. As a curious person I want everything that
serves to establish the argument cited so I can be guided to papers of
interest and get a better idea of where an idea fits in the broader field.

------
std_throwaway
You need a source of trust in these systems. Journals used to have that role.
They had high standards that were upheld by editors selecting only worthy
publications. Today it seems that many journals aren't as trustworthy as
they seemed to be in the past. It's also easier to spam journals with your
manuscript and to bullshit your way into publication. The incentives to
publish a lot are also way higher now that grant money is highly dependent
on citation counts. Journals can publish more, and more easily, and can
lower their standards for submission to earn more money. The system is
basically eating itself, and we haven't found a cure yet.

Filtering for self-citations is useful for identifying the bubbles. But it
is not sufficient to determine whether those bubbles contain only hot air or
whether these scientists are actually working on something with substance in
a narrow field where few others publish.

~~~
snarf21
"It Is Difficult to Get a Man to Understand Something When His Salary Depends
Upon His Not Understanding It" -Upton Sinclair

------
throwawaywego
The opposite of extreme self-citing is self-plagiarism (either out of
ignorance, to avoid extreme self-citing on ground-breaking research, or with
malicious intent: passing the same paper to multiple journals as a new
result).

> The rate of duplication in the rest of the biomedical literature has been
> estimated to be between 10% to 20% (Jefferson, 1998), though one review of
> the literature suggests the more conservative figure of approximately 10%
> (Steneck, 2000).
> [https://ori.hhs.gov/plagiarism-13](https://ori.hhs.gov/plagiarism-13)

If work by another author was enough to inspire you and add a reference, then
your own previous work should certainly qualify, if it added inspiration to
the current paper. Self-citing provides a "paper trail" for the reader when
they want to investigate a claim or proof further.

(As with PageRank, it is quite possible to discount internal references
relative to external ones, and if you also take into account the authority
of the citing source, you avoid scientists accumulating references from
non-peer-reviewed arXiv publications.)
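
A rough sketch of what such discounting could look like (toy graph,
arbitrary 0.2 weight; a real implementation would also need author
disambiguation):

    import networkx as nx

    # Toy citation graph: (citing, cited, is_self_citation).
    edges = [("p1", "p2", True), ("p3", "p2", False), ("p4", "p2", False)]

    G = nx.DiGraph()
    for citing, cited, is_self in edges:
        # Influence flowing over self-citation edges is discounted
        # relative to independent, external citations.
        G.add_edge(citing, cited, weight=0.2 if is_self else 1.0)

    scores = nx.pagerank(G, alpha=0.85, weight="weight")
    print(scores)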

------
bifrost
I ran into this situation regularly when going down the rabbit hole of the
anti-vaxx or anti-5G people. One "scientist" makes a highly dubious claim,
thousands of nutjobs cite this one scientist, and the "scientist" then goes
on to cite articles that cite their work. I'm basically waiting to find Alex
Jones cited in a serious article at this point.

~~~
std_throwaway
If you work in a very narrow field of science you're basically on your own and
have to cite yourself because there's nobody else to cite.

~~~
jedberg
In which case you have to ask yourself, are you so brilliant that you’ve found
an important topic that no one has considered yet, or have all the brilliant
people already figured out that topic isn’t worthy of study?

It’s the same with the startup world. If you’re the only one doing a thing,
are you brilliant or foolish?

~~~
agdpf
>In which case you have to ask yourself, are you so brilliant that you’ve
found an important topic that no one has considered yet, or have all the
brilliant people already figured out that topic isn’t worthy of study?

Imagine how many inventions we would have missed if all inventors had shared
your mindset.

~~~
jedberg
I think you’ve misinterpreted what I said. I’m not suggesting everyone is a
fool, quite the opposite.

It’s just an important question to ask yourself.

~~~
reallydude
The purpose of a PhD is to move human knowledge forward. You have to do an
analysis of something that, in all likelihood, nobody has done before (or at
least not thoroughly enough to be considered settled).

~~~
nashashmi
But then your analysis has to be challenged as well, and the challenges
should be published, whether they succeed or fail.

If you live in your own bubble, the needle doesn't move forward.

------
khawkins
Went into the data and took the top 1000 individuals with self-citation
percentages over 40%, then sorted by institution. Nearly every major
institution had individuals in this group: Johns Hopkins (4), Cal Tech (4),
Georgia Tech (2), MIT (5), each of the Max Planck Institute campuses (3-7),
Moscow State (7), Penn State (6), Stanford (1), Utrecht (2), University of
Zurich (4), ETH Zurich (1), DLR (3), Imperial College London (3), University
of Tokyo (2), Princeton (5), Kyoto University (4)...

I feel like if this problem were very concerning we'd see the distribution
concentrated at certain institutions, but I'm not sure any single
institution has more than 10 researchers in this group. We hear a lot about
questionable Chinese journals, but the highest-ranking institution on this
list is the Chinese Academy of Sciences, with 3 individuals.

I think the more likely case is that there are a few bad apples, some bad
practices we can't ever fully get rid of, and some research that simply
lends itself to self-citation.

------
jacquesm
Given that PageRank had its origins in citation analysis, it should not be
surprising to find link farms and other spam in scientific publications.

~~~
mathgenius
This is not PageRank. But they really should have used PageRank in this
Nature analysis, because (very likely) it would blow away this self-citation
problem. People should be able to self-cite as much as they want; not using
PageRank is the problem.

------
chiefalchemist
For those who wonder why there are climate-change deniers who won't listen
to "the science", this article is for you. The sad and honest fact is,
science has become a spin factory in the way mainstream media has.

Yes, in theory, the scientific method/process is a wonderful standard.
Unfortunately, once it's exposed to egos and profits, it becomes something
else, far less worthy of praise and honor.

I'm not doing a takedown of science; science has already done that to
itself. The sooner the rest of us come to terms with that, the better.

~~~
SubiculumCode
Says ChiefAlchemist? Twas Newton's chief failing, alchemy.

------
PeterStuer
What surprises me is how naive the scientific community publicly pretends to
be on these matters.

Due to our socioeconomic dogmas, vaguely based on a completely
misunderstood, caricatured Darwinian theory, we 'marketed' science as if
there could be no alternative. We turned science into a quantitative metrics
game and, by golly, act all surprised that scientists do game the system?

How useful do you think Google Search would be if they had just stopped
after PageRank v0.1 and called it a day, then let all the websites 'vote
with their links'?

------
your-nanny
People who are working out their ideas far outside the mainstream have no
one to cite but themselves. Some are quacks, to be sure, but sometimes a
field is just not ready for their work, because the utility of the idea is
not easily apparent or because it's perceived as too risky. A field needs a
healthy mix of the curmudgeonly, stubborn thinkers going their own way no
matter the cost and those making steady progress on solvable problems.

~~~
guntars
Do you have any good examples, historical or otherwise?

~~~
pkaye
Possibly Barry Marshall and Robin Warren, who discovered that H. pylori
causes peptic ulcers and subsequently won a Nobel Prize. They were ridiculed
by doctors and scientists for their theory.

~~~
jacquesm
I can see them citing themselves once or twice, but not tens of times or more.

------
joe_the_user
On a semi-related note, I occasionally look at the news items Google News
suggests for me, and these include a significant portion of
climate-change-denialist propaganda, including one piece shocked, shocked by
Nature "suppressing academic freedom with this list" (and my searches are
never _for_ climate denialism).

Which is to say, these may be only a few scientists, but it seems like there
are significant resources behind them, somehow.

~~~
zdragnar
Strange; I use Google News on a semi-regular basis and have the opposite
experience: most items are fairly bland coverage of climate-science research
announcements, with the occasional hyperbolic doomsday piece.

------
jeffwass
My undergrad college physics professor (Fay Ajzenberg-Selove) introduced
this metric back in the '50s. She faced major sexism and bullshit claims
against her productivity, and had to use this metric to prove her detractors
wrong, showing that she was as good as or better than most of her male
colleagues at performing useful and interesting research, in order to earn
herself a faculty position.

[https://en.m.wikipedia.org/wiki/Fay_Ajzenberg-
Selove](https://en.m.wikipedia.org/wiki/Fay_Ajzenberg-Selove)

~~~
jeffwass
Did I describe this wrong? Not sure why it got moderated down.

Basically, my professor was one of the first female physicists trying to make
it in a clearly male-dominated field nearly 60+ years ago.

She was not offered jobs, or was offered them at reduced salary versus her
male colleagues, being told her research was not as productive.

She literally invented the concept of using the number of citations as proof
of impact, relevance, and productivity, and showed that her work was better
than that of most of her male colleagues.

Only after she jumped through this hoop herself was she able to get a faculty
position. Damn impressive of her.

------
YeGoblynQueenne
Counting citations is a rubbish metric in general. It's supposed to be a
proxy for research quality, but it's so easy to "game" (in the sense of
optimising for citation count rather than research quality) that a high
number of citations doesn't mean anything.

Neither does a low number of citations. For example, my field is small and
kind of esoteric, so we don't get lots of citations either from the outside or
the inside (one of the most influential papers in the field has... 286
citations on Semantic Scholar; since 1995).

With a field as small as a couple hundred researchers it's also very easy to
give the appearance of a citation mill. Given that papers will focus on a very
specific subject in the purview of the field, it is inevitable that each
researcher who studies that specific subject will cite the same handful of
researchers' papers over and over again - and be herself cited by them, since
she's now publishing on the subject that interests them.

As to self-citations, as Ioannidis himself says, there are legitimate
reasons: for instance, a PhD student publishing with her thesis advisor as a
co-author. The student will most probably be working on subjects that the
advisor has already published on and in fact will most likely be extending the
advisor's prior work. So the advisor's prior work will be cited in the
student's papers.

So I'm really not sure what we're learning in the general case by counting
citations, other than that a certain paper has a certain number of citations.

------
SubiculumCode
To be fair to some researchers, in certain specializations there may be only
a handful of scientists publishing on the topic. Self-citation proceeds
naturally from such circumstances.

------
tikiman163
Having conducted a reasonable amount of academic and scientific research,
I'd say this metric is more likely to mischaracterize research than to
reveal any issues. It doesn't even establish a causal link between
self-citation and poor research quality; it just assumes one.

Most researchers continue to do new research on the same concept after a
publication, and they will of course cite their earlier work when
continuing. Additionally, post-graduate researchers often have their names
placed on the research of the grad students they supervise, even though they
often have minimal involvement in the research or the conclusions drawn.

You might be able to tell something from the ratio of citations by other
authors to the number of self-citations, but only if you could exclude
self-citations that are merely inclusion by proxy, or cases where the
authors are simply continuing research on the same topic with new
methodologies.

There are already methods for identifying bad research, none of which can be
achieved through non-human-assisted data analysis of author lists. The only
way to be sure is critical review and third-party verification of results
through repeated experiments.

------
jonny383
I worked as an RA in my university days at a large, globally reputable
university in Australia. I can safely say there was a culture of: 1.
dismissing research from other staff who had "less than X" citations as
nonsense, and 2. self-referencing by new academics trying to break into the
"exclusive, trusted" club. It was genuine madness that corrupted a lot of
good people.

------
holy_city
Interestingly enough, the published work [1] has two self-citations. So maybe
there's something to that phrase, “the next work cannot be carried on without
referring to previous work.”

[1]
[https://journals.plos.org/plosbiology/article?id=10.1371/jou...](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3000384)

------
carusooneliner
It's regrettable that the article begins by outing a prof from an unheard-of
university in India, who probably publishes in low-repute journals and
conferences. Ideally the citation-malfeasance score should be weighted by
journal and conference reputation.

~~~
Vinnl
And how do you measure journal and conference reputation? Conventional
measures there also lean on citations, and journals have been known to do
quite a bit of manipulation to get those scores up.

~~~
carusooneliner
Start with Google Scholar's list of top publications:
[https://scholar.google.com/citations?view_op=top_venues](https://scholar.google.com/citations?view_op=top_venues),
and detect self-citing and citing rings within the universe of these
publications. Almost certainly the guy with the record for self-citing doesn't
have a paper in any of the top publications.

------
19ylram49
A particular interest of mine is database systems (and by extension,
distributed systems) and I’ve noticed this pattern a fair amount when reading
database-related CS papers. It tends to feel like a small circle of
researchers citing each other. Don’t get me wrong, that doesn’t say much about
the actual quality of said papers, but it’s a pattern that I’ve definitely
noticed. As a result of this, I tend to also make sure that I pay attention to
seemingly interesting papers with low citations; sometimes, it immediately
makes sense why they have low citations, while, other times, I find myself
fascinated by the research despite the low citations.

~~~
seieste
Some fields are very small, and some authors only focus on one small part of a
field.

So, high citation count isn't always a direct indicator of quality.

------
tilolebo
Now the question we all want to have answered: what's the database engine?

------
tezthenerd
A few months ago a couple of physics postdocs set up a website (vanityindex
dot com) and proposed a few "vanity metrics". It can be fun to check out
your own vanity index and that of your colleagues.

------
InfinityByTen
This is news? I thought the "publish or perish" idiom made it obvious.

Two weeks of effort go into doing something, and you spend two months
writing a paper on the slightest of results. Many of these papers introduce
an infinitesimal increment to knowledge at best, and you can tell how long
it would have taken to get it working. And this is in numerical mathematics.
There are clans of mathematicians who just go around citing each other, and
a quality paper comes around every 5 years or so.

------
techie128
Not defending people who self-cite, but here is an alternative explanation.
There are many areas of science that have very few researchers working on
the same or related problems. In addition, papers tend to build on prior
work by the same or related researchers. Over time we may see clusters of
what look like "self-cited" papers. This is not abnormal.

~~~
lilott8
Yup, I self-cite quite a bit, but not to demonstrate productivity or some
other arbitrary metric. I self-cite because the images I use in my research
are _really_ difficult to make, and I'm in a niche enough field that using
the images really helps people who are not familiar with the area understand
what is going on. These images were arduous to make, so I reuse them. (I
should note that I've gotten in "trouble" for reusing them without a
citation.)

------
sytelus
PageRank was invented precisely to solve this issue. I have never understood
why Google Scholar took the stance of not even computing it, sticking
instead to the h-index. Google Scholar is the de facto standard for looking
up researchers, and whatever metric it adopts would be adopted by the rest
of the world.
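
For comparison, the h-index it does use is trivial to compute: the largest h
such that h of your papers have at least h citations each. A minimal sketch:

    def h_index(citation_counts):
        """Largest h such that h papers have >= h citations each."""
        h = 0
        for i, c in enumerate(sorted(citation_counts, reverse=True), start=1):
            if c >= i:
                h = i
            else:
                break
        return h

    print(h_index([10, 8, 5, 4, 3]))  # 4: four papers with >= 4 citations
    print(h_index([25, 8, 5, 3, 3]))  # 3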

~~~
tannhaeuser
It's frustrating to see science emasculating itself and relying on
commercial, even monopolistic, services when there is no lack of research in
document search and comprehension. Ten or fifteen years ago there was
CiteSeer for tracking citations. It was a messy Perl program/site that
scanned through mostly TeX files for metadata and references, yet it worked
relatively well and was kept reasonably up to date. Then they rewrote it as
CiteSeerX, such that it became useless. It never recovered.

------
weq
The fundamental problem with science today is that it needs funding, and
funding means vested interests.

If a scientist is trying to game the system, they are doing it to secure
future funding: either they are gaming the system to make their funder look
good, or to appeal to a future funder.

------
wbillingsley
As an indication that this is a hard problem to solve, remember that these
papers have usually gone through anonymised peer review. They cite "Smith &
Jones 1982", but it might not be obvious from the writing that they are
themselves Smith or Jones.

~~~
wbillingsley
I completely agree with this. Sometimes you have to look closer than people
often do to spot self-citation.

------
bluetwo
Like many other good things, you've got to figure out what to do when the
@ssholes show up.

------
bitL
Number of citations in academia == lines of code as a metric for a software
developer?

~~~
holy_city
More like GitHub stars.

~~~
Vinnl
I think lines of code is a better metaphor, because the number of lines of
code is a necessary artefact of writing code, but easy to game without
actually improving quality once they become a measure of evaluation.

------
o_p
Maybe there exists an academic equivalent to blackhat PBNs: groups of
academics referencing each other in loops to boost their citations. The
number of citations seems like an awful measure of academic contribution.
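
One crude way to look for such rings is to treat authors as nodes in a
directed citation graph and flag strongly connected components, i.e. groups
in which every member can be reached from every other through citations. A
toy sketch (made-up author graph; real detection would need subtler
statistics):

    import networkx as nx

    # Toy author-level graph: an edge (a, b) means a's papers cite b's.
    G = nx.DiGraph([
        ("a1", "a2"), ("a2", "a3"), ("a3", "a1"),  # a closed citation loop
        ("a4", "a5"),                              # ordinary one-way citing
    ])

    # Strongly connected components of size > 1 are candidate rings.
    rings = [c for c in nx.strongly_connected_components(G) if len(c) > 1]
    print(rings)  # e.g. [{'a1', 'a2', 'a3'}]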

------
choeger
Hmmm. There is also the problem that you cannot distinguish between a citation
and a mere reference. I used to cite earlier work a lot for context, not for
mining citations.

------
____smurf____
A lot of researchers build upon their previous work, and the work of their
peers. I don't see how researchers can avoid citing their old work in this
case.

------
vernie
I know it's a consequence of conference and journal guidelines but nothing
bothers me more than researchers who self-cite in the third person.

------
otakucode
If you try to reduce a complex intellectual task to a metric or a checklist,
you are begging for exploitation.

------
tdhz77
Like a Las Vegas show rated #1 by the casino-owned magazine. They were voted
#1, however.

------
higherkinded
"...so again, like I said earlier,.."

------
bencollier49
Did they assess this list against GDPR and harassment law in the UK? IANAL,
but if I were naming and shaming this way via a derived data set, I'd be
scared of getting into quite a lot of trouble with the law.

