Slowly and almost imperceptibly if you look at it day to day, public research repositories like arXiv and bioRxiv, along with public code repositories like GitHub and GitLab, are becoming (or maybe already are) the world's most important academic "journals."
All research and code posted on them gets a quick once-over; good work gets the attention it deserves; bad work is quickly ignored. Reviews take place over the Internet via both public and private forums.
Gatekeeping power lies more and more in the hands of a global, distributed scientific community open to anyone willing and capable of doing and reviewing the work. It's fabulous IMHO.
> All research and code posted on them gets a quick once-over; good work gets the attention it deserves; bad work is quickly ignored.
What is the basis for all these claims? Who is giving it the quick once-over?
Crowd-sourced review and information, despite some strengths and high initial hopes, has a record of extraordinary misinformation and disinformation. Why would we want to use that system for scientific research?
I prefer careful peer review, standards for announcing funders, etc.
arXiv has 200 expert moderators spread across different fields to filter out papers that are blatantly misleading, unoriginal, non-substantive, or in need of significant review and revision. It's not peer review, but it's not a complete free-for-all either.
This is completely false! Please stop telling people that arXiv moderators judge the technical content of papers. That's not their role and it leads to people trusting arXiv when they should not.
It's a complete free-for-all.
arXiv moderators do not judge the technical content. They don't filter misleading submissions. They don't filter work in need of revision. This is literally described on the arXiv website: https://arxiv.org/help/moderation under "What policies guide moderation before public announcement?"
They filter spammers and check for total obvious nutjobs ("My grilled cheese said P=NP"), crazy formatting, and blatant copyright violations.
Publishing junk on arXiv is trivial if you're not too crazy and know a little about how to use the right words. You can publish anything.
> They don't filter work in need of revision. This is literally described on the arXiv website: https://arxiv.org/help/moderation under "What policies guide moderation before public announcement?"
The page you're linking backs up what I've listed. The third subheader in that section for example:
> A submission may be declined if the moderators determine it lacks originality, novelty, or significance.
> Submissions that do not contain original or substantive research, including undergraduate research, course projects, and research proposals, news, or information about political causes (even those with potential special interest to the academic community) may be declined.
> Papers that contain inflammatory or fictitious content, papers that use highly dramatic and misrepresentative titles/abstracts/introductions, or papers in need of significant review and revision may be declined.
---
> it leads to people trusting arXiv when they should not
I'm only claiming that their moderation is a quick once-over to filter out papers blatantly in violation of those policies (like the "total obvious nutjobs" you describe), while being clear that it's not a peer review.
“May” is the key word in that long sentence. bioRxiv certainly does not review the content of submissions under the standard definition of the word “review”. They may scan for style and general content type, for example rejecting review articles.
But I agree with the parent comment that these archives are extremely valuable.
I disagree that informal community comments are of much critical value. In most cases twitter comments and micro-reviews are relatively trivial and are usually based on quick reads rather than deep perusals.
I'm not sure I'd go that far with the optimism, at least in my field (artificial intelligence). Some obviously bad work does immediately fade into obscurity, and better work probably does on average get more attention, but the variation is huge. There is so much stuff on arXiv that to get attention you need some kind of PR push so people notice it in the firehose, or a dice-roll on a viral tweet or a science journalist noticing it. Some of the better-funded university and corporate research groups have actual professional PR and science-comm teams doing coordinated social-media blitzes, press releases, and blog posts around new arXiv papers! That's a huge factor in determining whether a given paper gets attention.
Thanks for the perspectives. One important nitpick:
> Some obviously bad work does immediately fade into obscurity, and better work probably does on average get more attention
You don't know what you haven't seen, which is just a restatement of the core problem. You'd need a study of the entire population to know about the correlation between paper 'quality' and outcomes.
The problem is that careful (conference) reviews don’t scale. Large conferences end up suffering from highly stochastic behavior where excellent work is borderline-rejected on a regular basis while mediocre/incorrect work gets accepted every so often. GitHub/arXiv are no silver bullet but offer an interesting alternative (with their own set of challenges, though).
(I'm not saying any of these are needed to make it "respectable" or even that they should be... just wondering how arxiv does its thing)
Personally I like the overall concept of arXiv. Even if no one ever reviews a given paper (which is probably unlikely), the fact that it's even accessible for later review when necessary is worthwhile.
Speaking of which, if anyone knows someone at arXiv, would be great if you could prod someone to get back to me about this PR or the associated email I sent them about it: https://github.com/arXiv/arxiv-browse/pull/197
It would add the ability for people to state that they have reviewed a given work. It might be the direction they want to go in, or it might not be - but so far I'm not even sure anyone's seen it, unfortunately.
As far as I know the moderator isn’t shown, but that’s probably a good thing: things get approved on a roughly overnight schedule, so publicly shaming moderators for mistakes seems counterproductive. Additionally, the moderation is very light; “April fools” papers are common and I’ve never heard of something being rejected.
To submit, you need to be endorsed by someone who already has several submissions in that field.
I much prefer a system where connected insiders push through unreplicable papers in an opaque quid pro quo system so they can pad each others' tenure committee packets!
The current publication system doesn't necessarily avoid that problem either. Many papers are accepted due to the prestige of certain authors and conformity to other papers in the same field. Filters like these can lead to a false sense of security in the quality of the work.
Reed Elsevier and the other rent seekers got no reason to live.
That said: in the legal world, where PACER extracts $0.10 per "page" even now that you get things electronically and the cost is near zero:
There's a service RECAP (PACER spelled backwards), where you can install a browser extension which automatically copies anything you download from PACER to a free archive. So one person pays, and the rest get it free.
It is amazingly complete, much more than you would think. For instance, here's the entry for the Elizabeth Holmes (Theranos) trial:
Reading this comment, I assumed PACER was a private provider that sells access to otherwise freely available material which might originally be scattered and unorganised, as government things usually are. Charging for access would therefore be reasonable.
But a quick search reveals that this is in fact a government-provided service. So your government charges you 10 cents per page to download PDFs from a glorified file server.
The US has around 1.5M people working in the legal field. Let's say each of them does 10 document lookups every day. 15 million requests in an 8-hour workday comes out to roughly 500 requests per second. Don't quote me on this, but I'm pretty sure a modern workstation could handle that, let alone a dedicated server or even a few of them with good caching and client-side load balancing. SCOTUS spends $16M yearly just on building maintenance; I think it's safe to assume they can afford a handful of servers without having to resort to microtransactions...
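(Spelling that arithmetic out, using only the assumptions above; it comes out to a bit over 500 requests/second:)

```python
# Back-of-envelope check of the PACER load estimate above.
workers = 1_500_000            # people working in the US legal field
lookups_per_day = 10           # assumed document lookups per person
workday_seconds = 8 * 60 * 60  # 28,800 seconds in an 8-hour workday

requests_per_day = workers * lookups_per_day    # 15,000,000
avg_rps = requests_per_day / workday_seconds    # ~520.8

print(f"~{avg_rps:.0f} requests/second on average")  # ~521
```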
Going down the cost rabbit hole is problematic; the costs are not just for the service infra. Even for the service itself you need project managers, product owners, devs, SREs, QAs, and data-entry operators who collate or at least update the DB. Those costs can easily run into millions per year.
Typically such revenue streams also cover other holes in the budget not related to this service, and there are indirect overhead costs (contracts, control structures, HR, vendors, etc.) that are harder to amortize.
Any cost for what should be free public access is not right. The argument should be that we already pay taxes; the public should have free and easy access to this information.
I generally agree and my estimates were of course the bare minimum, but the "we already pay tax" argument doesn't really work since, well, this service clearly isn't tax-funded given the fact that it isn't free. It might be tax-subsidised, which often makes sense for things that are very expensive and have limited usefulness to individual citizens (like many services governments render to businesses). If we consider accessing court records to be something with very limited usefulness to most people and primarily used by companies to make money (which I do, despite having done it myself on a few occasions), how expensive it actually is to run becomes very important when convincing "the people" to fund it 100%.
> Reed Elsevier and the other rent seekers got no reason to live.
Listen to me, HN. Stop listening to this meme. You aren't using your brains.
Everyone has a hard-on for Elsevier, and no other publisher. Why is that? Because they don't know anything about publishing. They've just heard the name Elsevier (and it kind of looks evil) and so they just parrot it ad nauseam. What about Springer? Taylor & Francis? Wiley-Blackwell? And what about the hundreds of smaller publishers that control major journals? Everyone gives Elsevier shit about suing Sci-Hub, but nobody gives the American Chemical Society shit for suing Sci-Hub. The fact is that people only hold up Elsevier as the great evil because they are trying to over-simplify a complex problem by finding a "single evil", because then they don't have to think about a more complex, nuanced problem.
The fact is that there is a reason that paid journals keep existing, and it's not the profit margins of the Big Four. It's the academic research industry. Every single academic research institute in the world that publishes papers depends on the reputation of journals. Getting your paper published in a "prestigious journal" is literally the only way to progress a researcher's career, and thus get more funding. Without funding, there is no research! And the journals provide real due diligence in the process of creating those journals, and somebody has to pay for that process.
If the paid journals went away tomorrow, researchers would be fucked, and academic institutions would have no idea what to do with themselves. So please stop with this ridiculous meme that Elsevier is The Great Satan holding back science. Sure, they should profit a lot less! But getting rid of them entirely with no system to replace them will be destructive to scientific research.
With due respect, if Elsevier and Springer (and IEEE etc.) went away tomorrow Science would not “be fucked”. We would be forced to sit down and work through a set of new open access journals and conferences. It would be an annoying few months and some publishing activities would be modestly disrupted. But all of the reviewing and editing and conference chairing is already run by volunteers. The only reason we don’t replace the journals now is because of (huge amounts of) path dependence and because nobody can solve the coordination problem of getting everyone to drop everything and do it. But if those publishers went away tomorrow, you’d solve both those problems in an instant.
Absolutely agree. The loss of prestige journals would be salutary and force researchers to focus on content, not cover.
GK Marinov, BJ Wold, and colleagues analyzed the quality of nearly 800 ChIP-seq datasets in the NIH GEO database (see figure S8 in their 2014 G3 paper: https://doi.org/10.1534/g3.113.008680 ). They independently scored the quality of data generation and analysis and then compared their summary quality scores to the impact factor of the journal in which each dataset was published.
Can you guess the polarity of the correlation? Yes, it was negative, and consistently so over a five year span.
Good point, but I think the opposite is true: scientific research is already broken, and blowing up the publishers would force us to rebuild it proper.
I co-authored a peer reviewed paper. Because our English is bad and our context is different, we called a system operating at 100Hz a "high frequency sensor" instead of a "fast sample rate (context) sensor". They gave that paper to an HF (radio) engineer to review it. He said "I think I was given this paper by mistake, this is not HF. Anyway, nice paper, change the color of this graph please."
Pair that with the general replication problem (no one has the money, time, or incentive to replicate anything), the publish-or-perish mentality, the idiotic bias against publishing negative results - jeez, the situation is baaad.
> And the journals provide real due diligence in the process of creating those journals, and somebody has to pay for that process.
Both things can be true. They can be providing valuable due diligence and also sucking a lot of value (much of it funded by public/taxpayer money) along the way.
I worked for Reed Elsevier for a few months in the 1990s (on an outsourced IT contract). They were already internally distributing propaganda encouraging their employees to promote extensions of copyright to factual information (so-called sui generis database rights). They are a profoundly evil institution to their core, in a way that the American Chemical Society just is not.
Your position seems to be that impact factors and bibliometrics are crucial to the progress of science. Nothing could be further from the truth.
As for Springer, Taylor & Francis, and Wiley-Blackwell: yeah, them too. Forgot to mention those.
The paid journals take research which is mostly paid for with public or non-profit money, and hide it behind paywalls. You can't avoid this fact.
Their "reputation" is mostly just a legacy, like the New York Times'. At one time they sent out paper journals, which was the only way information could be disseminated, and charged libraries reasonable fees. There was a manageable number of such journals so a library could get most or all of them. That world is gone.
> The paid journals take research which is mostly paid for with public or non-profit money, and hide it behind paywalls. You can't avoid this fact.
Commercial publishers have paywalls, but for new research they no longer have to be the only source for publications.
Nowadays, I think it’s often the indifference/laziness/whatever of the authors that prevents accepted manuscripts (same text, but different layout) from also being freely accessible.
- you have to mention the DOI, which, I guess, resolves to Elsevier’s site.
Reading https://www.elsevier.com/about/policies/hosting (“Sites or repositories that provide a service to other organizations or agencies, even if those other organizations or agencies are themselves non-commercial entities, are considered to be providing a commercial service, and this service activity will also require a commercial arrangement with Elsevier”), they make special exceptions for arXiv and RePEc.
So, that’s not optimal, but (for new papers) also not as bleak as it typically is described.
Who is paying and who has access is only one part of the puzzle. Most people only care about access, but they don't understand why that access continues to be limited.
Why do journals exist? It's not to provide information. Ever since the Internet was invented, we can distribute information virtually for free. Everybody knows this. Yet the journals persist for decades. So their purpose is not to provide information.
Yes, of course, their reputation is invented and legacy. It's been shown time and again that papers published in "reputable journals" can be quite problematic. But everybody knows this too. It's not like academic institutes are completely brainless. They know they could have somebody "less reputable" publish their information and it would have the same scientific merit, or that they could even publish it themselves on a blog. But they don't; they publish on the "reputable journals", even though the reputation is clearly not impacting the research results.
So why do these publishers exist? The true purpose of paid, "reputable" journals is to provide an excuse for research institutions to dole out money to people who meet a quasi-arbitrary barrier to the money. They know they don't have any good system of how to assign money, or who to promote, because in general it's hard to quantify. So they hide behind "the reputable journal" and thus the "reputation" of their researchers. This way they can receive more money (because "our scientists are published by reputable journals") and they can dole it out just as easily.
Opening access and reducing cost is a great idea. But shunting money away from journals will result in the entire research industry scrambling to put together a replacement that will allow them to continue being funded, determine how to dole out that funding, organize journals "for free", and retain some sort of rigor/due-diligence/quality from the publishing process. Can this be done? Sure! But we should make that our goal and not ignore the big pink elephant in the room, which is that journals are still a necessary evil for research funding. If we want to get rid of paid journals, let's actually think through the resulting impact and build a resilient system to replace them. Imploding them and "hoping for the best" is just going to hurt research.
It's been going on for years, I know that. I don't know if the rent-seekers have ever tried to shut it down. The fact that it's so complete tells me that a huge number of mainstream law firms look the other way when their employees use the browser extension.
In the case of PACER, they are public records and have no copyright attached to them, as they are government documents; the government cannot hold copyrights, and by definition everything the US government produces is public domain.
So was the FBI. They harassed my friend Aaron for a year when he started snarfing PACER documents from public libraries. In the end they had to conclude it was legal, so they couldn't prosecute him for it; instead he was persecuted by different US Attorneys for downloading papers from JSTOR until he committed suicide.
A good development - publishing papers in OA journals is extremely expensive, and internationally, most funding bodies requiring OA don't give you money to publish OA. For example, Nature Communications charges around $5,000 for a single paper; I can send a student to an overseas conference for that kind of money!
Public repositories like arXiv or bioRxiv are free.
I could not agree more: this is a good thing. Studies funded by public money should be freely available by default. This is a great decision and I hope major funding agencies in the USA follow this example.
As a practicing scientist, I ostensibly have access to a rich variety of journals from my employer. However, I've found journal access to be difficult during WFH/COVID. My default is to look for an open access paper, before jumping to the digital library. I've also had a few cases where the digital access seemed not to be working, which only put up barriers to my work.
The supposed advantage of commercial archiving is duration, whereas self-hosting tends to depend on the author. But we could create solutions where, if works are being used by clients (hospitals, research groups), the client would be in charge of automatically downloading all the dependencies. Like a Maven repository, but with P2P resharing.
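To make that idea a bit more concrete, here's a minimal sketch of such a client. It assumes Crossref's public works API (which does expose a paper's reference list as JSON); the local content-addressed store is a hypothetical stand-in, and the P2P resharing layer is left out entirely:

```python
# Hypothetical sketch: a client that mirrors a paper's "dependencies"
# (its references) locally, so archival no longer depends on any one
# author or publisher staying online.
import hashlib
import json
import pathlib
import urllib.request

MIRROR = pathlib.Path("mirror")  # local content-addressed store (stand-in)

def fetch_json(url: str) -> dict:
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def store(content: bytes) -> pathlib.Path:
    # Address content by hash so peers could verify and reshare it.
    digest = hashlib.sha256(content).hexdigest()
    path = MIRROR / digest[:2] / digest
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)
    return path

def mirror_dependencies(doi: str) -> None:
    # Crossref lists a work's references; many entries carry their own DOI.
    work = fetch_json(f"https://api.crossref.org/works/{doi}")
    for ref in work["message"].get("reference", []):
        ref_doi = ref.get("DOI")
        if not ref_doi:
            continue
        # Store the referenced work's metadata; a real client would also
        # fetch an open-access copy and reshare both over P2P.
        meta = fetch_json(f"https://api.crossref.org/works/{ref_doi}")
        store(json.dumps(meta["message"]).encode())

# e.g. the G3 paper cited elsewhere in this thread:
mirror_dependencies("10.1534/g3.113.008680")
```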
In similar vein, software used by public institutions should be required to be made open-source. Open source is already the VIIth marvel of humanity, and needs consolidation.
Can we get ISO to make all papers free next...? You know, since most industries need to abide by their standards, yet they refuse to make those standards freely available...
It might even be argued that the de facto mandatory nature of their standards constitutes a barrier to (international) competition, because their prices sure aren't trivial, particularly to smaller businesses.
Especially when you take into account that in other economies (outside the USA), their prices can be downright show-stoppers.
But then, it isn't particularly a new thing for larger economies to abuse standards and intellectual property regulations, to "compete" with smaller economies in ways that diametrically oppose the supposed (or at least often hailed) purpose and goals of both standards and intellectual property regulations.
ISO exists at the whim of the sovereign entities. You are most likely a citizen of one (or perhaps even more) of the entities that ensure ISO continues to exist, and could ask them to get ISO to give away the standards documents.
Of course, if the money does not come from ISO selling documents, it would need to come from tax revenue. In reality the real money would come from the richest countries, as they're most able to afford it, and your politicians may fear that even a modest sum expended in this way would be painted by their opponents as profligate...
Seemingly as part of the same push, the UKRI announced funding for an open research project I've worked on before: Dr Alex Freeman's Octopus [0], a more radical software platform that splits research projects into smaller components. It's good to see that the openness of research is finally becoming a priority for the funding bodies.
Alex has been working on that for so long; it was really gratifying to see that she finally got some traction and serious funding to get it to go somewhere. Really looking forward to seeing what will come out of it.
> “When versions of articles are made available too soon, this undermines the need to subscribe [to journals],” wrote a spokesperson, Amy Price, in an email to Science.
A remarkable thing about the COVID crisis is the unprecedented close cooperation and data sharing by widely disparate scientific communities.
The obvious global benefits may also be driving cooperation and sharing in other fields, including (and perhaps especially) publishing. Closed data has no beneficiaries except commerce - it's being seen as a dead model. Finally.
This is so right. It used to be theoretically justified to charge for the editing, review management, and printing costs of a paper journal, but most paper journals have disappeared and the paper format is useless today. High tech has made these journal companies giant vampire squids poking their blood funnels into academia to suck the life out of mankind's progress!
But down the road I can see that academic research publication will become like Facebook - loaded with bullshit promoted by hypesters - wasting time and killing thousands - and no one to enforce sanity in that lunatic wilderness!
If the 19th-century journals would get with the program they would become science repos with a much smaller staff and light fees (enough to support one staff managing editor per journal) for the review process! Publishing a paper in this model should probably cost about $1,000-$2,000, about the same as it cost the last time I was in academia...
In case someone has some direct knowledge or credible sources: I'm not sure what is stopping reform at a political level. Who is the political constituency that obstructs open access? The publishers seem too small, and the science and higher ed constituency too large.
I can't give you direct links, but I know that large scientific/engineering organisations have not been pushing for free open access. For example, the IEEE has been party to an open letter (which included Elsevier) lobbying against Plan S. These organisations derive a lot of their money from publishing, so while many (most) of the members are strongly in favor of OA, the bureaucrats in the organisations often are not. Which gives mixed messages to politicians.
On top of that I believe that non-scientific publishers have also been part of the lobby campaign as they see OA as weakening copyright.
Funders have been relying on publishers to tell them (by means of who they publish) what researchers are worth funding. This is keeping up the pressure for academics to keep publishing in paywalled journals.
There is some pressure to both make research not paywalled (Plan S is the strongest) and less costly. However, given that most of the funders' researchers can access most relevant research (albeit rather clumsily, thanks to the access mechanisms, and not 100%, hence Sci-Hub even being popular among those who do have access), and that the extracted rent on the scale of a country budget isn't that much, the pressure isn't particularly strong. Additionally, it's hard for a single funder/country to move on its own without ruining their academics' careers (hence Plan S's focus on signing on more funders).
Now we need to get the Nobel Committee to say they will only read open-access papers. This is a valid option, as the paywalls have denied access to scientists in most of the world's 'second tier' countries (to say nothing of interested science readers wherever they are), so going open access will only add to the common pool of knowledge.
It’s truly horrific that as an interested layman I am completely UNABLE to read most scientific papers. And to think about what they did to Aaron Swartz…
They usually send you a 'pre-print' which (in their opinion - I'm not a lawyer) is not subject to the same copyright as the final version which was sent to review.
Thanks. Any sources for their opinion? Probably I should read the copyright release form in more detail, but if you think in terms of copyright, this won't fly.
“Authors publishing via subscription models may also self-archive a copy of the accepted version of their manuscript (post-peer review, but prior to copy-editing and typesetting) in an institutional or subject repository, where it can be made openly accessible after an embargo period, in accordance with the relevant Springer Nature self-archiving policy (Nature, Springer, or Palgrave Macmillan)”
“Authors can share their accepted manuscript immediately:
- via their non-commercial personal homepage or blog
- by updating a preprint in arXiv or RePEc with the accepted manuscript
- via their research institute or institutional repository for internal institutional uses or as part of an invitation-only research collaboration work-group
- directly by providing copies to their students or to research collaborators for their personal use
- for private scholarly sharing as part of an invitation-only work group on commercial sites with which Elsevier has an agreement
After the embargo period
- via non-commercial hosting platforms such as their institutional repository
- via commercial sites with which Elsevier has an agreement”
(Seems a bit less constrained than SpringerNature)
Even if it is potentially technically illegal, I don't know about anyone who tried to stop it. Note that some release forms explicitly allow author's versions of the covered work for download on the author's website. There's usually a stipulation that they need to be different in some way, for example by not using the same formatting template as the journal/conference version of the paper.
I don't see any new insights on the page that you linked. I personally signed copyright transfer forms that allowed for author's versions. There were some conditions that I don't remember in detail. But the gist is that a commercial publisher still allowed me to have a version of the article text online.
They make you go through hoops and remember you - after a few requests the paywall emerges.
In this modern day, with publishing costs steadily falling, their demands for payment have become ever more aggressive.
It is truly awful that publishers do this, however there is a certain website-that-shalt-not-be-named that is papering over that gap while we researchers get our shit together.
I know it's in the original title, so I'm not criticising the OP/submitter, but this has to be the first time I've ever seen 'U.K.' as dotted initials like that! Seems oddly funny, I don't know why really.
Wiktionary notes it's customary in legal case notes, but gives no other usage, which I suppose supports it being a first for me.
Is there any incentive for researchers to not make their papers free? AFAIK most prestigious journals allow researchers to post their preprints on arXiv or the researcher's own site.
High quality, prestigious journals are not free, charge you to publish, charge you to read, charge you for copies of your own paper. However, they only publish good work - so your work is not surrounded by low quality papers.
I would like to think the world has changed, but most academics would probably still hand over the bulk of their grant for a Nature publication under their name.
It sounds good but has become hopelessly corrupted in many fields. Cronyism ("I'll pass your paper if you pass mine"), conformity ("This isn't what the Cool Kids have agreed to, so you can't publish it") and also "This is too readable. Needs more jargon!" have taken over. Does this mean it should be thrown out altogether, or just reformed? I'm not sure.
One more point, pretty orthogonal to my others:
I was a tech assistant in Google Patent Litigation. Part of my job was to bust patents (a dream job, right?), and for that, I would scour the Internet for literature to invalidate a patent being asserted against us.
I would constantly see articles behind some paywall, and I would never, never click through. There are reasons you can imagine, like (1) I didn't want the hassle of justifying the expense, and (2) I hate those publishers. Both true.
However, the biggest reason was:
I don't know if it's any good until I read it.
The vast majority of articles are not helpful for my purpose. I can't tell if they really are until I read them. If it turns out that some article is the killer, then of course Google would pay for it. But for the thousands that are not -- well, why waste the money? I can almost always find the same information somewhere else, for free.
Do peer reviewed papers make the reviewers public?
That would be an interesting angle if not. Allow anyone to publish, but have them compete for high-status reviewers. Also, the reviewers would have some skin in the game as far as correctness goes.
"Pay journals for “gold” open access, which makes a paper free to read on the publisher’s website, or choose the “green” route, which allows them to deposit a near-final version of the paper on a public repository, after a waiting period of up to 1 year."
> But starting in April 2022, that yearlong delay will no longer be permitted: Researchers choosing green open access must deposit the paper immediately when it is published.
If you read further you would come to understand that is the current state of things, and the significance of the submitted article is that a new policy means the papers must be immediately made free.