And pre-print servers have even accelerated science. Because frankly this is a M...

EvgeniyZh · 2023-07-30T05:34:19

> We aren't going to journals

I do. Some of my peers do and some don't. It differs between fields, and you're probably from CS (peer-reviewed conferences), where progress is often faster than the publication cycle, but it is not like that everywhere.

> they are helping us advance our careers

And it is an important role which cannot be disregarded just because it doesn't match your idea of the best possible promotion scheme.

> but we already live in the system that you're suggesting is silly and it's been this way for centuries.

Centuries? Can you elaborate? Peer review in journals as we know it is less than 100 years old.

godelski · 2023-07-30T08:17:01

> And it is an important role which cannot be disregarded

I disagree. I think academia has fallen for Goodhart's law. While I would say that CS is probably on the worse side of things (especially ML), I don't think we're special in having critical problems. The publish or perish paradigm combined with the noisy signal of journal publication is a big reason we are in the replication crisis, because novelty is held in high regard while replication isn't of any concern. The irony is that novelty is quite rare and that the faster you publish, the less novel the work will be (in general; there are exceptions, of course). You just can't do novel work fast.

> Centuries? Can you elaborate? Peer review in journals as we know it is less than 100 years old.

What I was referring to is that we've been just sending papers to one another for centuries. Uploading to arxiv is just an easier form of that (and was the explicit reason for its existence). We scientists have been communicating with one another for centuries, "figuring it out" without journals just fine. As far as I'm concerned, the journal system is a failed experiment. It was okay in the beginning, had some good ideas, but then the rot grew and took over. It only works when everyone acts in good faith, and at scale only requires a few bad actors to spoil the barrel. Maybe I'm biased because of the problems of my field, but my friends in other areas still frequently hit problematic walls that just slow things down rather than improve them.

EvgeniyZh · 2023-07-30T10:42:01

> I disagree. I think academia has fallen for Goodhart's law.

It may or may not have, but my claim was that we need some mechanism to decide who gets a position and who doesn't, and it should be scalable to the current scale of science and beyond, i.e., at least hundreds or thousands of applications. Saying that journals are used for career decisions, that this is not good (enough?), and thus that no replacement mechanism is needed is weird.

> a big reason we are in the replication crisis.

The stated reasons are the same for different branches of science, while replication crisis is far from that. Replication of 100-year-old experiments is also not always easy/successful. Publish or perish motivates people to cut corners, but I think it is far from being the main reason.

> What I was referring to is that we've been just sending papers to one another for centuries. Uploading to arxiv is just an easier form of that (and was the explicit reason for its existence).

(Almost) nobody reads arxiv (edit: I mean (the relevant part of) arxiv as a whole; of course most people read specific papers on arxiv). In fact the snr of arxiv is so low, even compared to journals, that just by filtering garbage you can get quarter a mil followers [1]. So instead of random experts of varying quality and motivation I'll get a single rando on twitter (no offense to A K, he does a good job)? No thanks.

> It only works when everyone acts in good faith, and at scale only requires a few bad actors to spoil the barrel.

Citation needed? You could claim the same for science as a whole, but self-correction turns out to be extremely robust, even if slower than you may want.

[1] https://twitter.com/_akhaliq

_aavaa_ · 2023-07-30T16:58:44

> Replication of 100-year-old experiments is also not always easy/successful.

It's a struggle to replicate 3-year-old papers, forget 100 years.

godelski · 2023-07-30T21:37:59

> my claim was that we need some mechanism to decide who gets a position and who doesn't

I don't disagree. My claim is that the current mechanisms are nearly indistinguishable from noise but we pretend that they are very strong. I have also made the claim that Goodhart's Law is at play and made claims for the mechanisms and evidence of this. I'm not saying "trust me bro," I'm saying "here's my claim, here's the mechanism which I believe explains the claim, and here's my evidence." I'll admit it's a bit messy because this is an HN conversation, but it's all here.

What I haven't stated explicitly is the alternative, so I will. The alternative is to do what we already do but just without journals and conferences. Peers still make decisions. You still look at citations, h-indices, and such, even though these are still noisy and problematic.

I really just have two claims here: 1) journals and conferences are high entropy signals. 2) If a signal has high entropy, it should not be the basis for important decisions.

If you want to see evidence for #1, then considering you publish in ML, I highly recommend reading several of the MANY papers and blog posts on the NeurIPS consistency experiments (there are 2!). The tl;dr is "reviewers are good at identifying bad papers, but not good at identifying good papers." In other words: a high false negative rate, i.e., acceptance is noisy. This is because reviewing is highly subjective and reviewers default to reject. I'd add that they have an incentive to, too.

If you want evidence for #2, I suggest reading any book on signals, information theory, or statistics. It may go under many names, but the claim is not unique and generally uncontested.
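To make claim #1 concrete, here is a toy simulation of what a consistency experiment measures. It is only a sketch and does not use the NeurIPS data: the paper count, acceptance rate, noise level, and the function name `simulate_overlap` are all assumptions chosen for illustration. Each paper gets a hidden "true quality", two independent committees each see that quality plus their own random noise, and each accepts its top-scoring fraction. The returned number is the share of committee A's accepted papers that committee B also accepts.

```python
import random


def simulate_overlap(n_papers=1000, accept_rate=0.25, noise_sd=1.0, seed=0):
    """Share of committee A's accepted papers that committee B also accepts.

    All parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    # Hidden "true quality" of each paper (standard normal, by assumption).
    quality = [rng.gauss(0.0, 1.0) for _ in range(n_papers)]
    n_accept = int(n_papers * accept_rate)

    def committee_accepts():
        # Each committee observes quality plus its own independent noise.
        scores = [q + rng.gauss(0.0, noise_sd) for q in quality]
        ranked = sorted(range(n_papers), key=lambda i: scores[i], reverse=True)
        return set(ranked[:n_accept])

    accepted_a = committee_accepts()
    accepted_b = committee_accepts()
    return len(accepted_a & accepted_b) / n_accept


if __name__ == "__main__":
    for sd in (0.0, 0.3, 1.0, 3.0):
        print(f"noise_sd={sd}: overlap={simulate_overlap(noise_sd=sd):.2f}")
```

With zero noise the two committees agree completely, and agreement falls toward the chance level (the acceptance rate itself) as the noise grows. How much noise real reviewing has is exactly what the consistency experiments try to estimate; this sketch does not answer that.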

> The stated reasons are the same for different branches of science, while replication crisis is far from that.

My claim about journals being at the heart of the replication crisis is that they chase novelty. Since we have a publish or perish paradigm (a coupled phenomenon), we're disincentivized from reproducing work. I'm sure you are fully aware that you get no credit for confirming work.

> (Almost) nobody reads arxiv

You claimed _aavaa_ was trolling, but are you sure you aren't? Your reference to Ahsen is literally proof that people are reading arxiv. Clearly you're on twitter and you're in ML, so I don't know how you don't see that people are frequently posting their arxiv papers. My only guess is that you're green and that, when you started, these conferences had a 1-year experiment in which they asked authors not to publicly advertise their preprints. But before this (and starting again), this is how papers were communicated. And guess what, it still happened, just behind closed doors.

And given that you're in ML, I'm not sure how you're not aware that the innovation cycle is faster than the publication cycle. Waiting 4-8 months is too long. Look at CVPR: submission deadline in early November, final decision end of February, conference in late June. That's 4 months to just get Twitter notifications, by authors, about works to read. The listing didn't go live till April, so it is 6 months if you don't have peer networks. Then good luck sorting through this. That's just a list of papers, unsorted and untagged. On top of this, there's still a lot of garbage to sort through.

> the snr of arxiv is so low... that just by filtering garbage you can get quarter a mil followers

You and I see this very differently; allow me to explain. CVPR published 2359 papers for 2023 and 2066 for 2022. Numbers from here[0] but you can find similar numbers here[1] if you want to look at other conferences. Yes, conferences provide curation, but so do Ahsen and Twitter. The conferences cover many areas and don't tag or sort, and I know you aren't just going down that list and reading each paper title to determine if you should read it or not. I follow several accounts like Ahsen's. I also follow several researchers because I want to follow their work. This isn't happening because conferences are telling me who to listen to; this is because I've spent years wading through the noise and have peers who tell me what to read and what not to read. Additionally I get recommendations from my advisor and his peers, from other members of my lab, from other people in my department, from friends outside my university, as well as emails from {Google,Semantic} Scholar. I think it is a well-known problem that one of the most difficult things in a PhD is to get your network established and figure out how to wade through the noise to know what to read and what not to. It's just a lot of fucking noise.

The SNR is low EVERYWHERE

And I'd like to add that I'm not necessarily unique in many of my points. Here's Bengio talking about how the pressures create incrementalism[2]. In this interview Hinton says "if you send in a paper that has a radically new idea, there's no chance in hell it will get accepted"[3]. Here's Peter Higgs saying he wouldn't cut it in modern academia[4]. The history of science is littered with people who got rocketed to fame because they spent a long time doing work and were willing to challenge the status quo. __Science necessitates challenging the status quo.__ Mathematics is quite famous for these dark horses actually (e.g. Yitang Zhang, Ramanujan, Galois, Sophie Germain, Grigori Perelman).

> Citation needed? You could claim the same for science as a whole, but self-correction turns out to be extremely robust, even if slower than you may want.

This entirely depends on the latter part. Yeah, things self-correct since eventually works get used in practice and thus held up to the flames. But if you're saying that being given infinite time to retract a work is proof that the system is working, then I don't think this is an argument that can be had in good faith. It is better to talk about fraud, dishonesty, plagiarism, how a system fails to accept good works, how it discourages innovation, and similar concepts. It is not that useful to discuss that over time a system resolves these problems, because within that framework the exact discussion we are having would constitute part of that process, leading us into a self-referential argument that is falsely constructed such that you will always be correct.

There are plenty of long term examples of major failures that took decades to correct, like Semmelweis and Boltzmann (both of whom suffered for it: Semmelweis died in an asylum and Boltzmann killed himself). But if you're looking for short term examples in the modern era, I'd say that being on HN this past week should have been evidence. In the last 8 days we've had: our second story on Gzip[5], "Fabricated data in research about honesty"[6] (which includes papers that were accepted for over a decade and widely cited), "A forthcoming collapse of the Atlantic meridional overturning circulation"[7] (where braaannigan had the top comment for a while and which many misinterpreted), and "I thought I wanted to be a professor, then I served on a hiring committee"[8] (where the article and comments discuss the metric hacking/Goodhart's Law). Or with more ML context, we can see "Please Commit More Blatant Academic Fraud"[9], where Jacob discusses explicit academic fraud and how widespread it is, as well as referencing collusion rings. Or what about last year's CVPR E2V-SDE memes[10] about BLATANT plagiarism? Or the subsequent revelation of other blatant plagiarisms[11]? Or what about the collusion ring we learned about last year[12]? If you need more evidence, go see retractionwatch or I'll re-reference the NeurIPS consistency experiments (which we should know are not unique to NeurIPS). Also, just talk to senior graduate students and ask them why they are burned out. Especially ask both students who make it and those struggling, who will have very different stories. For even more, I ask that you go to CSRankings.org and create a regression plot for the rank of the universities against the number of faculty. If you need even more, go read any work discussing improvements on the TPM or any of those discussing fairness in the review process. You need to be careful and ask if you're a beneficiary of the system, if you're a success of the system, a success despite the system, or where you are biased (as well as me).
There's plenty out there and they go into far more detail than I can with a HN comment.

Is that enough citations?

[0] https://cvpr2023.thecvf.com/Conferences/2023/AcceptedPapers

[1] https://github.com/lixin4ever/Conference-Acceptance-Rate

[2] https://yoshuabengio.org/2020/02/26/time-to-rethink-the-publ...

[3] https://www.wired.com/story/googles-ai-guru-computers-think-...

[4] https://www.theguardian.com/science/2013/dec/06/peter-higgs-...

[5] https://news.ycombinator.com/item?id=36921552

[6] https://news.ycombinator.com/item?id=36907829

[7] https://news.ycombinator.com/item?id=36864319

[8] https://news.ycombinator.com/item?id=36825204

[9] https://jacobbuckman.com/2021-05-29-please-commit-more-blata...

[10] https://www.youtube.com/watch?v=UCmkpLduptU

[11] https://www.reddit.com/r/MachineLearning/comments/vjkssf/com...

[12] https://twitter.com/chriswolfvision/status/15452796423404011...