
Goodhart’s Law: Are Academic Metrics Being Gamed? - ubac
https://thegradient.pub/over-optimization-of-academic-publishing-metrics/
======
acgan
Glad to see this important piece here (disclosure: I am one of the editors of
The Gradient).

[https://ieeexplore.ieee.org/document/5089308](https://ieeexplore.ieee.org/document/5089308),
from RCIS 2009 (Beel and Gipp) noted that "Google Scholar seems to be more
suitable for searching standard literature than for gems or articles by
authors advancing a view different from the mainstream."

Unrelated, but interesting: scraping Google Scholar is remarkably annoying if
you want to actually use the data. The easiest way (in my experience) seems to
be regex hacking on the BibTeX files, but this seems truly broken.
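
For anyone curious what that regex hacking looks like, here is a minimal
sketch. The sample entry, the two patterns, and the `parse_bibtex_entry` helper
are all made up for illustration; this is not an official Scholar format or
API, and it assumes flat `field={value}` pairs with no nested braces.

```python
import re

# Hypothetical BibTeX entry in the flat style of an exported citation.
# This is illustrative sample text, not real scraped data.
SAMPLE = """@inproceedings{doe2009example,
  title={An example title},
  author={Doe, Jane and Roe, Richard},
  booktitle={Proceedings of an Example Conference},
  pages={1--10},
  year={2009}
}"""

# Matches the "@type{key," header of an entry.
ENTRY_HEAD = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# Matches "name={value}" where the value contains no braces of its own.
FIELD = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")


def parse_bibtex_entry(text):
    """Return the entry type, citation key, and flat fields as a dict."""
    head = ENTRY_HEAD.search(text)
    if head is None:
        raise ValueError("no BibTeX entry found")
    body = text[head.end():]
    fields = {name.lower(): value.strip() for name, value in FIELD.findall(body)}
    return {"type": head.group(1).lower(), "key": head.group(2), **fields}


entry = parse_bibtex_entry(SAMPLE)
# BibTeX separates author names with the word "and".
authors = [name.strip() for name in entry["author"].split(" and ")]
print(entry["key"], entry["year"], authors)
```

The fragility is easy to see: a value with nested braces, such as
`title={{GPU} computing}`, does not match the `FIELD` pattern and is silently
skipped, which is why a real BibTeX parser is the safer choice.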

~~~
throwaway2048
Blocking scraping is the norm for Google. For instance, the public YouTube API
allows you to view a grand total of 3 or so videos per key per day before it
starts blocking you.

Google has basically got as bad as Twitter in terms of giving a big middle
finger to third-party devs, but they have been smart enough to maintain a
completely useless public/free tier for most things.

~~~
acgan
That makes sense. I'd hope that Scholar would be different, though.

A piece on how a researcher spent a summer filling out CAPTCHAs / scraping:
[https://www.nature.com/articles/d41586-018-04190-5](https://www.nature.com/articles/d41586-018-04190-5)

~~~
buboard
Scholar should be different, considering that they are the only ones in the
world who are given access to everything.

------
radioactivist
Something seems a bit wrong with the graph "Publication rate by career length"
-- should the y-axis be "Average number of published papers _per year_"? (I
can't imagine that someone whose first paper was in the 1950s only published 1
additional paper in the next 30 years.)

~~~
papreclip
They got their PhD and entered industry. Publication was no longer a priority
and maybe not even an option.

~~~
radioactivist
If the axis labels are correct (which, after referring to the full paper, they
seem to be) then I think this is the only reasonable explanation. I.e. that
the dataset tracks _all_ authors of scientific publications, rather than those
who are active researchers (or had research careers and subsequently passed).
[Note: I did not see any attempt to take this into account upon skimming the
full paper]

Given the high attrition rate in many academic fields (and the small number of
publications typical in early years) this would then rationalize the low
numbers seen here. Though, I would say the meaning of these numbers would be
quite different than if this were presented for those who had research careers
(the more relevant number I think).
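
To make the dilution effect concrete, here is a toy calculation. The cohort
sizes and paper counts are invented for illustration and are not taken from the
paper's dataset; `mean_papers` is just a hypothetical helper.

```python
def mean_papers(paper_counts):
    """Average number of papers per author in a cohort."""
    return sum(paper_counts) / len(paper_counts)


# Hypothetical cohort (made-up numbers): 90 authors who published once and
# then left research, plus 10 who stayed and published 60 papers each.
leavers = [1] * 90
career_researchers = [60] * 10

# Averaging over every author mixes the two groups together.
all_authors = mean_papers(leavers + career_researchers)
# Averaging only over those with research careers gives a very different view.
active_only = mean_papers(career_researchers)

print(f"all authors: {all_authors:.1f}, career researchers: {active_only:.1f}")
```

With these assumed numbers the all-author average is roughly a tenth of the
career-researcher average, even though no individual researcher became less
productive.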

------
adipandas
Fantastic overview of current trends in academia. There is truly a huge bias
in research publications. I think blogging is a better way to put forth
your ideas and research rather than getting a publication in some cases.

~~~
andreyk
Why not both? Berkeley AI Blog, Stanford AI Blog, CMU ML blog... all show that
you can do both. Reviews (part of the paper submission process) are
legitimately useful, if done well. As a researcher, having arXiv as a standard
medium and conferences to help filter interesting papers (if imperfectly) is
also useful.

~~~
adipandas
Yes, agreed. arXiv is also an awesome platform.

------
xorand
Excellent article. A relevant link would be the San Francisco Declaration on
Research Assessment (DORA) [https://sfdora.org/](https://sfdora.org/)

Back to the article, lots of gems, like:

> Today's researchers can publish not only in an ever-growing number of
> traditional venues, such as conferences and journals, but also in electronic
> preprint repositories and in mega-journals that offer rapid publication times.

Did I just read a very normal point of view of a researcher putting on equal
footing electronic preprint repositories and mega-journals?

The BOAI Open Access preachers surely can't believe their eyes :) Heresy! (No
researcher was involved in the BOAI's flawed definition of green OA as
archiving and gold OA as publishing.)

------
tgv
Do people benefit from gaming the system? Then it surely is being gamed. And
they do. Funding and tenure depend on these metrics.

> Overwhelmed by the volume of submissions, editors at these journals may
> choose safety over risk and select papers written by only well-known,
> experienced researchers.

There is a bit of a "circle jerk" in this process: if you know the right
people, you can get better reviews. In return, you review their papers or
requests favorably. That also leads to repeating authors.

~~~
hughzhang
Sometimes people idealize academia though. It's important to realize that
everything has its flaws :).

~~~
erichocean
The thing to realize about academia is that the vast majority of people in it
have never experienced the real world. They went straight from grade school to
college to _working_ in academia, and stay there until they retire or die.

Academia is literally all these people know, and they've been sheltered their
entire lives. No one should be surprised that this produces strange results.

Obviously there are exceptions, but there's no denying it produces some
interesting world views.

------
acollins1331
Yes. But a good department will know you for who you are. Plenty of great
science takes a long time to do, and it is known that it's hard to get funding
for long-term monitoring experiments. Most grant money is for new and
innovative ideas, and no one is pushing out one of those every year. And if you
are, then you're not doing 90% of the work on any of them.

------
buboard
They should have limited the analysis to the most popular journals. There are
tons more journals nowadays because it's so easy to run one, but it's more
important to know what’s happening at the well-known ones. The lesser-known
ones are largely ignored.

------
spodek
Tough call: Goodhart's Law says yes, metrics will be gamed, but Betteridge's
law says the answer is no.

~~~
Nuzzerino
One of those is a fundamental condition of a game-theoretic scenario,
independent of environment. The other is a marketing trend.

------
caoilte
No. We're all being gamed. Academics were just nearer the front of the queue.

------
dr_dshiv
I wish there were a Google Ngrams for Google Scholar...

