Using Pokémon to detect scientific misinformation (the-scientist.com)
207 points by wizeman 7 months ago | 47 comments

This looks like a very similar approach to that taken by Boghossian, Pluckrose, and Lindsay in using a prank to test and illuminate journals engaged in poor scholarship.

Entertaining, but clearly very necessary.


That paper is a hoot, especially the bibliography:

> 5. Joy MP, Joy DD, Joy TP (2006) Rabies outbreak in Pokémon daycare center. Infectious Diseases 33: 377-378

> 9. Mazzetti D (2016) Fraud in exercise science journals: Do you even peer review? Archives of Broscience 112(7): 896-902.

> 11. Crichton M (2013) Origin and Defeat of the Andromeda Strain. Journal of Cryptovirology 116(6): 1360-1363.

> 18. Joy SM (2006) Pangolins do not cure cancer any other diseases. Journal of Please Stop Eating Endangered Species 21: 420-430.

and on and on and on XD

There are over ten citations there to real articles about predatory journals, all linked in the paper. Even this:

> 28. Stromberg J (2014) ‘Get Me Off Your Fucking Mailing List’ is an Actual Science Paper Accepted by a Journal. Vox 21: 10-11.

(See also HN discussions on the abovementioned paper: https://news.ycombinator.com/item?id=22052019, https://news.ycombinator.com/item?id=11098655)

Numbers 9 and 18 seem legitimate in their own ways.

> To make matters worse, my Pokémon-inspired paper on the novel coronavirus has already been cited. A physicist based in Tunisia published “The COVID-19 outbreak’s multiple effects,” which claimed that COVID-19 was human-made and is treatable with “provincial herbs,” in another predatory journal, The International Journal of Engineering Research and Technology. He not only cited my article, but also cited one of my made-up references, “Signs and symptoms of Pokérus infection,” as the paper that first identified SARS-CoV-2.

The next paragraph is what got me.

> When I asked the author how this happened, he failed to see any problem with citing a paper he never read while writing a paper outside his field, and was unaware of the difference between open access and predatory journals.

This is an interesting solution to a problem I hadn't given much thought to in the past.

On the one hand, it's demoralizing that we've reached this point where misinformation spreads at the same rate as good information. On the other, it does seem like a decent way to expose predatory journals.

Could this method be extended to other fields?

There's that classic double joke about fad technology names being gibberish and hapless tech recruiters:

"I typically ask recruiters to point out which of these are pokemon" https://imgur.com/gallery/r0SEEoh

The trick is that a lot of Pokémon names make good tech names. Metapod would be a great name for a Perl documentation suite, or a container orchestration tool. And that makes Caterpie a questionable, but valid, name for a partial container orchestration tool; and maybe Butterfree could be containers evolved.

Onix is clearly a Unix-like or Linux distribution that's shaped around some software with first letter O (Opera? OpenOffice? I dunno), in addition to a Pokémon.

Onix is shaped around Operator Framework. It's all containers.

Vulpix could be something to do with pictures.

Farfetch could be a database library.

Hmmm that onyx one almost had me, was gonna say it's both a pokemon and a language...but the Pokemon's spelled Onix.

At the time that was doing the rounds (2015 or so) someone expanded it into a "Pokemon or Big Data" quiz game, which is startlingly difficult: https://pixelastic.github.io/pokemonorbigdata/

I thought that was extremely easy, but only because the Pokemon didn't go past Generation 3. Flink sounds a whole lot like Klink, a Pokemon from Generation 5.

> only because the Pokemon didn't go past Generation 3

That's about two generations too many for me to be able to know :P

Yeah...the first one's pretty much burned permanently in my mind forever, I can recognize most of the second ones, but after that I don't really know them.

i can easily understand and accept the idea of an infinite gender spectrum and associated naming conventions but there are ONLY 150(+1) pokémon and nothing anyone says will change my mind

This is fantastic. You should submit this as a post on HN - it deserves its own thread :)

It's easy if you know your Pokémon well. It's really hard if you kinda know names because many of them have lookalike names ;)

Damn, got the first one wrong:

> ADABAS was NoSQL from a time when there was no SQL. The technology now is owned by Software AG. "Software AG: We're not sure what we do either."

Damn, I seem to get only half of them right, no better than random chance! EDIT: OK, 59% correct. (Eventually you get a 'victory' screen.)

I've seen many people saying "Onyx was the Pokemon, not the language" and I have to resist the urge to reply in all caps going "IT'S SPELT ONIX NOT ONYX"

I failed this miserably last year. Now I got 93% correct. I blame the quarantine and all the publicity Pokémon has done.

Ekans belonged after Python.

> it's demoralizing that we've reached this point where misinformation spreads at the same rate as good information.

This has always been true. It's a fundamental part of the human story, an emergent property that arises from the cost to produce misinformation compared to the cost to discover the truth.


I would expect that generally misinformation spreads faster. People that verify authenticity before spreading are slower because they're verifying authenticity.

Verifiers may be slower to repeat, but those with a good track record of authenticity presumably have larger networks willing to take them at their word. It may take longer for Nature to publish something than a random crank journal, but Nature's impact factor is orders of magnitude higher.

> One should not automatically trust all documents formatted as a scientific paper.

Yup, I recently got fooled there...


One of my friends in college managed a medical journal, which was owned by a moderately well-known professor trying to quickly cash in on his fame. His job consisted mainly of formatting Word documents while pretending to be a team of reviewers.

Sleazy conferences are quite common too. I think this is a deeper, broader problem than many scientists would care to admit.

Funny stuff! Gotta love Pokemon papers!

That said, since people shouldn't just believe whatever they read, even if it's published in Nature or Science or something, it's a tad redundant to point out that papers published in academic formatting can be complete nonsense.

A solid peer-review can help reduce quality issues, but it doesn't mean that whatever's published is true, reliable, or trustworthy. It's weird that so many folks seem to think that journal articles are gospel. That's just not what peer-review does.

Related: People shouldn't believe things just because they're written in math or Latin.

I think the point of the article here is not to imply that peer review is infallible, but that peer review, in this particular case, isn't happening in any form, despite the journal advertising:

> American Journal of Biomedical Science & Research is a peer reviewed open access journal dedicated to publish (sic) high-quality research in all areas of the medical, pharmaceutical, health and engineering sciences.

This really reminded me of SCIgen, made to generate nonsensical CS papers with the same intention: https://news.mit.edu/2015/how-three-mit-students-fooled-scie...

The intrepid Dr. Herbert Schlangemann.

>> This paper, “Cyllage City COVID-19 outbreak linked to Zubat consumption,” blames a fictional creature for an outbreak in a fictional city, cites fictional references (including one from author Bruce Wayne in Gotham Forensics Quarterly on using bats to fight crime), and is cowritten by fictional authors such as Pokémon’s Nurse Joy and House, MD.

Dammit, now I want to read the Gotham Forensics Quarterly paper.

I'd like to add the Stone louse to this list: https://en.wikipedia.org/wiki/Stone_louse

The value of one single article, however "scientific" its presentation and whichever magazine publishes it, is practically 0. The sooner one realizes that, the sooner one can stop debunking each and every homeopathy "paper" and move on to doing real science.

Yes, it's sad, you will not be able to detect small effects because they will have a very hard time getting reproduced above noise level, but there is just no alternative.

Several articles, and one of them was then cited at least once.

I think that's a very naive way of looking at it. You can't say "if everybody would just educate themselves on this topic", because the vast, vast majority of people do not.

Singular papers can do real harm, as we can see from a whole movement of "vaccines cause autism" stemming from one fallacious paper.

And even without going to those extremes, it's clear that confirmation bias will cause people to find even a single paper confirming their scientifically bankrupt ideas as a way to bolster and legitimize their campaigns to spread harmful misinformation.

> You can't say "if everybody would just educate themselves on this topic", because the vast, vast majority of people do not.

> Singular papers can do real harm, as we can see from a whole movement of "vaccines cause autism" stemming from one fallacious paper.

The vast, vast majority of people don't read the papers you like either. That includes the "vaccines cause autism" movement; 0% of members have read any related paper.

You mention some real problems, but they aren't addressed by the peer review system.

If you publish a paper with fake 'good' data, why would some spurious animal names even be relevant? 'Zubat'? Epidemiologists probably don't know the English-Chinese translation for the million species in China.

This is not really spoofing, this is just plain fraud, right?

When they fill papers full of jargon, data that doesn't make sense, crazy political posturing so that the paper reads as something fantastical - and it gets by - that's a problem.

But this seems to be merely a real concern that it's quite a bit of work to reproduce scientific results and that someone doing a quick review of a paper has no material way of validating 'everything'.

The paper 'looked promising' because it was promising, assuming those sending it in weren't completely making everything up.

You could make a fake passport that might fool a lot of people, that doesn't mean the system is broken.

Science may have problems, but I'm not sure this one strikes at them very well.

From the article:

> Some would argue that editors cannot recognize Pokémon names, but lines in the text such as “a journal publishing this paper does not practice peer review and must therefore be predatory” or “this invited article is in a predatory journal that likely does not practice peer review” would have tipped off anyone who bothered to read the articles. These papers did not slip in under the radar; they were welcomed in blindly.

Data is not meaningful in and of itself. What matters is what that data signifies, i.e. how it changes the reader's understanding of the world.

It doesn't matter if the data is made up if what it signifies is complete nonsense anyway. If their goal was to commit fraud they would obfuscate their deceit, not make it a Google search away from clearly being a hoax. Your definition of fraud is so broad that it removes any responsibility of due diligence from the reader, which is the entire point of having a peer review process in the first place.

Is it the author or the journal that is fraudulent?

