
Detecting manuscripts and publications from paper mills - blopeur
https://febs.onlinelibrary.wiley.com/doi/full/10.1002/1873-3468.13747
======
DyslexicAtheist
> _Paper mills are believed to be fueled by unrealistic publication
> requirements or quotas, combined with monetary publication rewards to authors
> [2-5]. Where unrealistic publication requirements are widely applied over long
> periods of time, this could create a broad and growing base of paper mill
> clients. To both meet client demand and their capacity to pay, paper mills
> will likely aim to generate large numbers of manuscripts at minimum cost. This
> will likely require at least some falsified or fabricated data, as performing
> genuine experiments could render manuscripts unaffordable._

> _At the same time, externally supplied manuscripts should resemble genuine
> manuscripts if they are to be accepted for publication, so paper mills need
> to balance their requirements for efficiency and volume with the requirement
> that their manuscripts also appear to be genuine._

^^

The paper doesn't say much about the state of research around CS / BigTech-
related topics. I have been browsing arXiv (and arxiv-sanity.com) for more
than a decade, on at least a weekly basis, both for work and to satisfy my
private curiosity. My interests are mostly classic CompSci topics and
security. The genre seems increasingly polluted by papers that can only be
described as BS research.

You can usually smell the bad ones with a few simple rules that tell you what
to look for[1]. It increasingly feels like dumpster diving in the domains
around AI/ML. Whenever I complain, my partner, who works in healthcare, tells
me that medicine is even worse (though I have no evidence for that beyond the
anecdotal).

[1] _Hanson 1999: Efficient Reading of Papers in Science and Technology_ :
[https://www.cs.columbia.edu/~hgs/netbib/efficientReading.pdf](https://www.cs.columbia.edu/~hgs/netbib/efficientReading.pdf)

Recent (published in the past month) random examples of what fall, IMO, into
the category of BS papers (these aren't outliers but are becoming the norm!):

 _P4-to-blockchain: A secure blockchain-enabled packet parser for software
defined networking:_
[https://www.sciencedirect.com/science/article/pii/S016740481...](https://www.sciencedirect.com/science/article/pii/S0167404819301762)

 _Security & Privacy in IoT Using Machine Learning & Blockchain: Threats &
Countermeasures_
[https://arxiv.org/abs/2002.03488v1](https://arxiv.org/abs/2002.03488v1)

 _REST: A thread embedding approach for identifying and classifying user-
specified information in security forums_
[https://arxiv.org/abs/2001.02660v1](https://arxiv.org/abs/2001.02660v1)

 _Artificial Design: Modeling Artificial Super Intelligence with Extended
General Relativity and Universal Darwinism via Geometrization for Universal
Design Automation_
[https://openreview.net/forum?id=SyxQ_TEFwS](https://openreview.net/forum?id=SyxQ_TEFwS)

~~~
drongoking
I'm surprised it's taken this long. There is no peer review on arXiv. If
you've ever reviewed for a conference or a journal (10-30% acceptance rates),
you've seen the "raw feed" of submitted papers, and you realize the average
isn't very good. Many get rejected because they simply have a few flaws, but
some are badly flawed (or just plain wrong) and shouldn't be published
anywhere. Since arXiv has no peer review, it's essentially that raw feed. I
cringe whenever I see arXiv papers cited as if they were published work;
doing an end-run around peer review can't be good for science.

