
I guess you're just trolling, but in case you're serious: yes, being able to group together as scientists (a university) and pay a single entity (or a small number of them) to get all the scientific results, peer reviewed, has done wonders. Being able, as a university or funding agency, to evaluate someone's scientific output without actually reading their papers (because other experts did it) allowed scaling. And to be honest, I do not know a single person who pays $40 for a paper; there are so many ways to get it for free (including just asking the authors). There is room for improvement, but "get rid of journals and conferences and hope scientists will somehow figure it out" is not a solution



I was very serious.

The journals take works others have done (work often paid for with public money) and get unpaid volunteers to peer-review it. Next they require the authors to sign away their copyright to it. And then they turn right back around and sell it to the very same universities it came from for thousands of dollars a year.

And for what? So they can have profit margins rivaling big tech? (Their major expenses are lawyers and server costs for PDF documents…)

And why do researchers publish there? Because they need to in order to keep their jobs, or to get funding. You need those high-impact-factor and prestigious journals. And so you sign away your research.

Most 1st world researchers don’t care or don’t know about the costs. But most of the world is not so lucky. Yes you could go around begging the authors for a free copy (but definitely don’t go to scihub or library genesis and download them for free, that would be illegal…).

The “space for improvements” starts with burning down the journals and scattering their ashes among the fields to fertilize new science.


> burning down the journals and scattering their ashes among the fields to fertilize new science.

Oh yeah, destroy a working system and then maybe figure it out, works like a charm every time.


Again, we had scientific progress (and problems with it) before the current system.

The only currency that matters is if your results can be reproduced, and as far as I'm aware the overwhelming majority of journals make no attempt at reproducing the results before accepting them. Hell, it doesn't seem like they even make an attempt at making sure there's enough information provided to even attempt to reproduce them.

When I come across a paper that does any sort of modelling, I simply assume I won't be able to reproduce it if the code isn't included. They all seem to leave out important information.


> Again, we had scientific progress (and problems with it) before the current system.

Again, the scale and accessibility of science is different now than it was then. We had commuting before the steam engine; it doesn't mean we can get rid of all engine-based transportation now without consequences.

> The only currency that matters is if your results can be reproduced

This is extreme reductionism. LHC results cannot be replicated unless you have a multi-billion-dollar particle accelerator in your backyard. JWST cannot be replicated. Many other large scale experiments are nearly impossible to replicate. In fact, these are often considered the "important" ones. And this is not necessarily due to price, but rather a combination of price, unique expertise, and willingness to spend the effort.

You know who will be able to tell you if the experiment description makes sense and has chances to be replicated if you ask them? Peers. Peer review at will has never worked (and there have been attempts) and never will because nobody likes to do peer review. I'm yet to see any proposal for peer review without editorial invitations which has at least theoretical chances to succeed.

> I won't be able to reproduce it if the code isn't included.

So you do mean "reproduction", not "replication". This is a very low threshold and definitely not the "only currency that matters". If I send you the code that alters the results somewhere deep inside, you'll be able to reproduce them.


> Many other large scale experiments are nearly impossible to replicate.

Thankfully, the LHC is not what all of science looks like. There is a great deal of science that can be reproduced without needing billions of dollars.

> If I send you the code that alters the results somewhere deep inside, you'll be able to reproduce them.

But the current status quo for most papers is that you don't even have that. If you publicly put out code that manipulates data in this way, then it can be scrutinized.

The alternative is to attempt to replicate the results, waste months of time trying to guess what the authors did (since they won't tell you), and in the end never be able to reproduce it, not knowing whether key information was missing or the results were falsified.

That is, if you can spend the time attempting to reproduce the results rather than publishing your own novel and revolutionary results. And if you can find a journal that's interested in publishing your failure to replicate.


I've tried emailing peers for code and they would ignore me. I then got my advisor to talk to their advisor because I couldn't replicate their program and I got code sent to me. It was nowhere near what was explained in the paper and didn't give the results. Nothing happened. No retraction, no nothing. Just made it harder for me to publish my paper because it is very hard to beat results that were made up.

So what I'm saying is, I agree with your point here. The proof is in the pudding. Journals are convincing everyone that anything not in them is false and garbage, but once garbage gets into them it creates an impossible bar to pass. Now repeat this for 50 or so years in a highly competitive space and what do you get? Well I think we all know the old saying: garbage in...


The LHC has multiple teams that will replicate an experiment. The facility is so big that you have different groups. You also have different detectors (such as ATLAS and CMS) where you replicate the experiments because each one is a bit different. And except for the very high energy stuff, other labs around the world can reproduce the results. Not to mention that particle physics has a 5 sigma significance threshold...
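
For a rough sense of what that threshold means, here's a quick sketch; it's just the standard normal tail, nothing specific to any one experiment, and scipy is only an easy way to check the number:

    # What "5 sigma" means as a p-value: the one-tailed tail probability
    # of a standard normal distribution beyond 5 standard deviations.
    from scipy.stats import norm

    p_value = norm.sf(5.0)                   # survival function, P(Z > 5)
    print(f"5 sigma ~ p = {p_value:.2e}")    # ~2.9e-07, roughly 1 in 3.5 million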

But for a lot of this, what happens is that they make the claim, they contact peers, send out the data, and have their peers confirm. You ever wonder why experiments like these have hundreds of authors[0,1]? That's because that shit is peer review. [0] has literally 8 pages of authors, in the format of first initial, last name. There are also 5 pages of university/lab affiliations.

Everything you're talking about here is done outside journals. And guess what, there are two versions of the arxiv work for [0], July 31st 2012 and August 31st 2012. Guess when it was "published". September 17th of that year. The announcement? July 4th! Btw, the journal is showing v2.

This is how big science is done. The replication and verification is happening intra-paper. You of course have to trust the machine, but that's true for everyone and a limit that can't be bypassed until a later date when a new one is built. And frequently the first tasks of a new machine are to confirm other observations, such as the JWST confirming exoplanets that were previously observed.

So to sum up: replication happens, on big important experiments, outside journals.

> [reproduction] is extreme reductionism

I'd still argue, with _aavaa_, that reproduction is the only currency that matters. Your experiment isn't worth shit if it can't be reproduced (i.e. it is indistinguishable from fraud)

> LHC results cannot be replicated

Most of the work is replicated, prior to publication.

> JWST cannot be replicated.

Observational data is confirmed. And JWST replicates its predecessors (an important part of its mission). (I'll even add LIGO in here and repeat the LHC and JWST comments)

> You know who will be able to tell you if the experiment description makes sense and has chances to be replicated if you ask them? Peers.

Everyone agrees with this. We are just saying the reviewers aren't necessarily your peers or the people you should be asking.

> because nobody likes to do peer review.

Lots of people like to do peer review and it happens all the time. If you're talking about reviewing for a journal, then yes I agree with you. But if you're talking about replicating work, building on, doing it to learn, or for fun, then this happens all the time. In fact, grad students do this professionally. How many works did you rebuild and try to replicate?

> I'm yet to see any proposal for peer review without editorial invitations which has at least theoretical chances to succeed.

You haven't dismissed __aavaa__'s claims or mine, only asserted that they are wrong. If you would like to say why we are wrong, we're clearly open to hearing it and responding. But you've just asked us to trust you.

> So you do mean "reproduction", not "replication".

Reproduction is a form of replication. Replication is a spectrum: reproduction is the minimum, with a lot in between up to confirming experiments within a different context frame. But I don't think anyone is saying that just running your code constitutes reproduction. I'd say it is confirmation of your code. The source availability is about verification, as in I can probe it and look for errors, not so much reproduction.

[0] page 25 https://arxiv.org/abs/1207.7214

[1] appendix C page 41 https://arxiv.org/abs/2207.00043


And pre-print servers have even accelerated science. Because frankly this is an MCMC system: the nature of the game is everyone stumbling around in the dark.

I am a scientist and I'm all for "get rid of journals and conferences BECAUSE we already do figure it out." It isn't "somehow", we're already fucking doing it. We aren't going to journals and searching them, we go to arxiv. We read what our peers share with us. We read what comes out of other groups we know that are doing similar work to us. Journals and conferences aren't helping us do science, they are helping us advance our careers, because we live in a publish or perish ecosystem that quantifies our work based on the prestige of a conference, which is prestigious because it rejects 80% of works. Nowhere in here is a measure of the actual quality of a work. That is often a difficult task and can only be done by peers, which is already happening. Identifying bad works is easy but identifying good works is hard.

I'm sorry, but we already live in the system that you're suggesting is silly and it's been this way for centuries. Journals are just people capitalizing on our work and labor. Conferences at least put us in the same room, but they have no business claiming that they can accurately ascertain the quality of a work.


> We aren't going to journals

I do. Some of my peers do and some don't. It differs between fields, and you're probably from CS (peer-reviewed conferences) where progress is often faster than the publication cycle, but it is not like that everywhere.

> they are helping us advance our careers

And it is an important role which cannot be disregarded just because it doesn't match your idea of the best possible promotion scheme.

> but we already live in the system that you're suggesting is silly and it's been this way for centuries.

Centuries? Can you elaborate? Peer review in journals as we know it is less than 100 years old.


> And it is an important role which cannot be disregarded

I disagree. I think academia has fallen for Goodhart's law. While I would say that CS is probably on the worse side of things (especially ML), I don't think we're special in having critical problems. The publish or perish paradigm, combined with the noisy signal of journaling, is a big reason we are in the replication crisis. Because novelty is held in high regard while replication isn't of any concern. The irony being that novelty is quite rare, and that the faster you publish the less novel the works will be (in general; there are exceptions of course). You just can't do novel work fast.

> Centuries? Can you elaborate? Peer review in journals as we know it is less than 100 years old.

What I was referring to is that we've been just sending papers to one another for centuries. Uploading to arxiv is just an easier form of that (and was the explicit reason for its existence). We scientists have been communicating with one another for centuries, "figuring it out" just fine without journals. As far as I'm concerned, the journal system is a failed experiment. It was okay in the beginning, had some good ideas, but then the rot grew and took over. It only works when everyone acts in good faith, and at scale only requires a few bad actors to spoil the barrel. Maybe I'm biased because of the problems of my field, but my friends in other areas still frequently hit problematic walls that just slow things down rather than improve them.


> I disagree. I think academia has fallen for Goodhart's law.

It may or may not have, but my claim was that we need some mechanism to decide who gets a position and who doesn't, and it should be scalable to the current scale of science and beyond, i.e., at least hundreds or thousands of applications. Saying that journals are used for career decisions, that this is not good (enough?), and that therefore no replacement mechanism is needed, is weird.

> a big reason we are in the replication crisis.

The stated reasons are the same for different branches of science, while the replication crisis is far from that. Replication of 100-year-old experiments is also not always easy/successful. Publish or perish motivates people to cut corners, but I think it is far from being the main reason.

> What I was referring to is that we've been just sending papers to one another for centuries. Uploading to arxiv is just an easier form of that (and was the explicit reason for its existence).

(Almost) nobody reads arxiv (edit: I mean (the relevant part of) arxiv as a whole; of course most people read specific papers on arxiv). In fact the SNR of arxiv is so low, even compared to journals, that just by filtering garbage you can get a quarter of a million followers [1]. So instead of random experts of varying quality and motivation I'll get a single rando on Twitter (no offense to A K, he does a good job)? No thanks.

> It only works when everyone acts in good faith, and at scale only requires a few bad actors to spoil the barrel.

Citation needed? You could claim the same for science as a whole, but self-correction turns out to be extremely robust, even if slower than you may want.

[1] https://twitter.com/_akhaliq


> Replication of 100-year-old experiments is also not always easy/successful.

It's a struggle to replicate 3 year old papers, forget 100 years.


> my claim was that we need some mechanism to decide who gets a position and who doesn't

I don't disagree. My claim is that the current mechanisms are nearly indistinguishable from noise, but we pretend that they are very strong. I have also made the claim that Goodhart's Law is at play and made claims for the mechanisms and evidence of this. I'm not saying "trust me bro," I'm saying "here's my claim, here's the mechanism which I believe explains the claim, and here's my evidence." I'll admit it's a bit messy because it's a HN conversation, but it's all here.

What I haven't stated explicitly is the alternative, so I will. The alternative is to do what we already do but just without journals and conferences. Peers still make decisions. You still look at citations, h-indices, and such, even though these are still noisy and problematic.

I really just have two claims here: 1) journals and conferences are high entropy signals. 2) If a signal has high entropy, it should not be the basis for important decisions.

If you want to see evidence for #1, then considering you publish in ML, I highly recommend reading several of the MANY papers and blog posts on the NeurIPS consistency experiments (there are 2!). The tldr is "reviewers are good at identifying bad papers, but not good at identifying good papers." In other words: a high false negative rate, i.e., acceptance is noisy. This is because it is highly subjective and reviewers default to reject. I'd add that they have an incentive to, too.

If you want evidence for #2, I suggest reading any book on signals, information theory, or statistics. It may go under many names, but the claim is not unique and generally uncontested.
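
To make those two claims concrete, here's a toy sketch. The numbers are made up, chosen only to mirror the "good at spotting bad papers, noisy on good ones" pattern; they are not taken from the NeurIPS experiments. The point is just that if acceptance of good papers is close to a coin flip, the accept/reject bit carries only a small fraction of the information about quality:

    # Toy illustration with hypothetical numbers: how much information does
    # an accept/reject decision carry about "true" paper quality?
    import math

    def entropy(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    p_good = 0.5                 # assume half the submissions are "good"
    p_accept_given_good = 0.5    # good papers accepted like a coin flip (noisy)
    p_accept_given_bad = 0.05    # bad papers almost always rejected

    joint = {
        ("good", "accept"): p_good * p_accept_given_good,
        ("good", "reject"): p_good * (1 - p_accept_given_good),
        ("bad", "accept"): (1 - p_good) * p_accept_given_bad,
        ("bad", "reject"): (1 - p_good) * (1 - p_accept_given_bad),
    }

    p_accept = joint[("good", "accept")] + joint[("bad", "accept")]
    h_quality = entropy([p_good, 1 - p_good])
    h_decision = entropy([p_accept, 1 - p_accept])
    h_joint = entropy(joint.values())
    mutual_info = h_quality + h_decision - h_joint   # I(quality; decision)

    print(f"I(quality; decision) = {mutual_info:.2f} of {h_quality:.2f} bits")
    # ~0.21 of 1.00 bits under these assumptions: a noisy signal.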

> The stated reasons are the same for different branches of science, while the replication crisis is far from that.

My claim about journals being at the heart of the replication crisis is that they chase novelty. And that since we have a publish or perish paradigm (a coupled phenomenon), we're disincentivized from reproducing work. I'm sure you are fully aware that you get no credit for confirming work.

> (Almost) nobody reads arxiv

You claimed __aavaa__ was trolling, but are you sure you aren't? Your reference to Ashen is literally proof that people are reading arxiv. Clearly you're on Twitter and you're in ML, so I don't know how you don't see that people are frequently posting their arxiv papers. My only guess is that you're green and that when you started, these conferences had a 1-year experiment in which they asked authors to not publicly advertise their preprints. But before this (and starting again), this is how papers were communicated. And guess what, it still happened, just behind closed doors.

And given that you're in ML, I'm not sure how you're not aware that the innovation cycle is faster than the publication cycle. Waiting 4-8 months is too long. Look at CVPR: submission deadline in early November, final decision end of February, conference in late June. That's 4 months just to get Twitter notifications, from authors, about works to read. The listing didn't go live till April, so it's 6 months if you don't have peer networks. Then good luck sorting through this. That's just a list of papers, unsorted and untagged. On top of this, there's still a lot of garbage to sort through.

> the SNR of arxiv is so low... that just by filtering garbage you can get a quarter of a million followers

You and I see this very differently, so allow me to explain. CVPR published 2359 papers for 2023 and 2066 for 2022. Numbers from here[0], but you can find similar numbers here[1] if you want to look at other conferences. Yes, conferences provide a curation, but so do Ashen and Twitter. The conferences cover many areas and don't tag or sort, and I know you aren't just going down that list and reading each paper title to determine if you should read it or not. I follow several accounts like Ashen. I also follow several researchers because I want to follow their works. This isn't happening because conferences are telling me who to listen to, this is because I've spent years wading through the noise and have peers that communicate with me what to read and not to read. Additionally I get recommendations from my advisor and his peers, from other members of my lab, from other people in my department, from friends outside my university, as well as emails from {Google,Semantic} Scholar. I think it is a well known problem that one of the most difficult things in a PhD is to get your network established and figure out how to wade through the noise to know what to read and what not to. It's just a lot of fucking noise.

The noise is high EVERYWHERE

And I'd like to add that I'm not necessarily unique in many of my points. Here's Bengio talking about how the pressures create incrementalism[2]. In this interview Hinton says "if you send in a paper that has a radically new idea, there's no chance in hell it will get accepted"[3]. Here's Peter Higgs saying he wouldn't cut it in modern academia[4]. The history of science is littered with people who got rocketed to fame because they spent a long time doing work and were willing to challenge the status quo. __Science necessitates challenging the status quo.__ Mathematics is quite famous for these dark horses actually (e.g. Yitang Zhang, Ramanujan, Galois, Sophie Germain, Grigori Perelman).

> Citation needed? You could claim the same for science as a whole, but self-correction turns out to be extremely robust, even if slower than you may want.

This entirely depends on the latter part. Yeah, things self-correct since eventually works get used in practice and thus held up to the flames. But if you're saying that a work eventually getting retracted, given infinite time, is proof that the system is working, then I don't think this is an argument that can be had in good faith. It is better to talk about fraud, dishonesty, plagiarism, how a system fails to accept good works, how it discourages innovation, and similar concepts. It is not that useful to discuss that over time a system resolves these problems, because within that framework the exact discussion we are having would constitute part of that process, leading us into a self-referential argument that is falsely constructed such that you will always be correct.

There are plenty of long term examples of major failures that took decades to correct, like Semmelweis and Boltzmann (both of whom were driven insane and killed themselves). But if you're looking for short term examples in the modern era, I'd say that being on HN this past week should have been evidence enough. In the last 8 days we've had: our second story on Gzip[5], "Fabricated data in research about honesty"[6] (which includes papers that were accepted for over a decade and widely cited), "A forthcoming collapse of the Atlantic meridional overturning circulation"[7] (where braaannigan had the top comment for a while and many misinterpreted it) and "I thought I wanted to be a professor, then I served on a hiring committee"[8] (where the article and comments discuss metric hacking/Goodhart's Law). Or with more ML context, we can see "Please Commit More Blatant Academic Fraud"[9] where Jacob discusses explicit academic fraud and how prolific it is, as well as referencing collusion rings. Or what about last year's CVPR E2V-SDE memes[10] about BLATANT plagiarism? The subsequent revelation of other blatant plagiarisms[11]. Or what about the collusion ring we learned about last year[12]? If you need more evidence, go see retractionwatch, or I'll re-reference the NeurIPS consistency experiments (which we should know are not unique to NeurIPS). Also, just talk to senior graduate students and ask them why they are burned out. Especially ask both students who make it and those struggling, who will have very different stories. For even more, I ask that you go to CSRankings.org and create a regression plot for the rank of the universities against the number of faculty (a sketch of what I mean is below). If you need even more, go read any work discussing improvements on the TPM or any of those discussing fairness in the review process. You need to be careful and ask if you're a beneficiary of the system, if you're a success of the system, a success despite the system, or where you are biased (as am I). There's plenty out there and they go into far more detail than I can in a HN comment.
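
On that CSRankings regression plot, here's a minimal sketch of what I mean; the CSV name and column names are hypothetical, you'd export the ranks and faculty counts yourself first:

    # Hypothetical sketch: regress CSRankings rank against faculty count.
    # Assumes you've exported the data into a CSV with these two columns.
    import pandas as pd
    import matplotlib.pyplot as plt
    from scipy.stats import linregress

    df = pd.read_csv("csrankings_export.csv")   # hypothetical file: columns 'rank', 'faculty_count'
    fit = linregress(df["faculty_count"], df["rank"])

    plt.scatter(df["faculty_count"], df["rank"], alpha=0.6)
    plt.plot(df["faculty_count"], fit.intercept + fit.slope * df["faculty_count"], color="red")
    plt.xlabel("Number of faculty")
    plt.ylabel("CSRankings rank")
    plt.title(f"r = {fit.rvalue:.2f}")
    plt.gca().invert_yaxis()                     # rank 1 is best, so flip the axis
    plt.show()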

Is that enough citations?

[0] https://cvpr2023.thecvf.com/Conferences/2023/AcceptedPapers

[1] https://github.com/lixin4ever/Conference-Acceptance-Rate

[2] https://yoshuabengio.org/2020/02/26/time-to-rethink-the-publ...

[3] https://www.wired.com/story/googles-ai-guru-computers-think-...

[4] https://www.theguardian.com/science/2013/dec/06/peter-higgs-...

[5] https://news.ycombinator.com/item?id=36921552

[6] https://news.ycombinator.com/item?id=36907829

[7] https://news.ycombinator.com/item?id=36864319

[8] https://news.ycombinator.com/item?id=36825204

[9] https://jacobbuckman.com/2021-05-29-please-commit-more-blata...

[10] https://www.youtube.com/watch?v=UCmkpLduptU

[11] https://www.reddit.com/r/MachineLearning/comments/vjkssf/com...

[12] https://twitter.com/chriswolfvision/status/15452796423404011...



