Hacker News
Should All Research Papers Be Free? (nytimes.com)
644 points by mirimir on Mar 13, 2016 | 309 comments

If it is funded by government in any way (public university, research project), I think it is borderline defrauding the taxpayer that research funded by tax money is not free by default. Since close to all research is government funded in some way, shape or form, my answer would be yes in the general case.

I think the long term answer is decentralized publishing. Publish everything you do on a university or private website and let others decide if it's good or not when they want to cite it instead of a peer review that is set in stone. I think people reading papers deciding if they want to cite you are smart enough to figure out if it's good research or not. The peer review process is overrated (and quite often suffers from insider networks). If you decentralize publishing you can also have other researchers upvote a paper to basically approve of the academic standards in the paper. I also think the static nature of papers is a problem. I'd much rather cite a specific version of the paper. I'm thinking about git and pull requests along the lines of "want to cite, fixed layout" or "new research disproves this" etc.

I don't have a ton of time to search each university's publication database or every 2nd tier research team's private home-grown web site.

Journals provide a filtering intermediary that helps me better use my time. Hopefully I can figure out which editorial teams are going to publish good vs ok vs crap research and pick from those, relying on the curative capabilities of their staff to provide interesting and useful reading.

Journals should still want to publish great research, and publicly funded research should still be available to the public. Maybe there's indeed a more relaxed middle-ground to "just put it all out there and hope folks find and share it." We all agree the journal industry needs some help, but I don't think society is well-served by completely decimating its economic model.

> I don't have a ton of time to search each university's publication database or every 2nd tier research team's private home-grown web site. Journals provide a filtering intermediary that helps me better use my time.

I don't have time to scavenge through all the tech news every day, but that's partly why I visit HN. Also, I use Google when I'm looking for some specific topic.

In other words, filtering can be done perfectly well by communities and third-party organizations.

I can certainly rely on social-network-style filtering to surface interesting reading.

It comes with its own trade-offs, particularly the tendency toward "viral" titles and abstracts so that a particular paper would be more likely to be clicked / skimmed / shared. I'm not sure that we want to encourage that as I think it would injure the quality of published works in general.

There are merits to being able to put faces, and reputations, behind the editorial direction of the information stream. Social-network-style consumption would necessarily anonymize or obscure the effect.

As before, there's definitely agreement that more access is better and I'd bet there's a good middle ground available.

For example, a journal I worked with while I was an undergraduate changes their articles to open-access nine months after publication. You still get the editorial process and curation, the team has some financial support to evangelize their journal's mission and most folks who won't have needed the most immediate access can get to it for free. If you're an individual and want the articles immediately, the "subscription" is in the form of a $40 or more donation each year, which receives as a gift the biannual physical volumes and full digital access.

They're not the majority of course but options do exist that straddle both stakeholders' needs. Flipping the market completely on its head simply might throw out useful qualities that wouldn't completely transfer over.

One difference between HN and, say, math is that in this community most readers are probably interested in at least a sizable minority of the topics that get posted here, whereas in a hypothetical community of mathematicians discussing papers, I find it hard to believe that most people will be interested in more than about 1% of the literature.

Then you just use more specialized content aggregators or more specific searches. To make an analogy to reddit - rather than following r/math, you'd follow a specific topic like r/compling.

In that case, you probably know 100% of your peers, and it would be easy to find their work :)

We already have research hubs for free publications. Speaking as a former systems librarian: "Journals provide a filtering intermediary that helps me better use my time" is true for professors, period. Students just want articles.

> I don't have a ton of time to search each university's publication database or every 2nd tier research team's private home-grown web site

Directory of Open Access Journals - https://doaj.org/

Advance Public Access to Articles (Chorus) - http://www.chorusaccess.org/

But they provide a lousy and expensive filtering.

So-called "high impact" journals suffer more retractions. http://peterdedmonds.blogspot.dk/2015/06/retractions-and-hig...

Society is most definitely well-served by completely decimating [legacy publishers'] economic model. Legacy journals stand in the way of progress and they are laughably inefficient in what they do (filtering / online storage / layout (really!)).

Low impact journals may be significantly worse, but if fewer people look for defects fewer defects get found.

They don't suffer from retractions; they benefit from them. No one bothers to find errors in weak papers and weak journals don't bother to publish retractions. You would expect more retractions for the best journals because they still have decent standards.

Related anecdote: I had a conference publish a completely plagiarized article of mine, changing only the names of the authors. The paper still contained recognizable photos of my students performing the experiments. I reported this to the conference and to the IEEE, which owned the copyright to our original paper. No action of any kind was taken. Not even a 'knock it off' email.

We're there already, at least in my field. Many research groups and individuals do actually host copies of their papers on their own websites, whether or not the journals are officially OK with it. Failing that, most Ph.D. dissertations are available for free to the public through university websites. If you want to find out what a research group has been up to, the chapters in a group member's dissertation are often substantially the same as the paywalled journal articles with his name on them.

Journals can still perform their review, critique, filtering, and indexing functions without also acting as sole gatekeeper for the source material.

I don't have time to read every newspaper. Aggregator sites (like HN) provide me with a handy list of links to articles relevant to a specific interest or theme. Thus, I have not been overwhelmed by celebrity gossip when I go looking for nerd news.

And many of those aggregator sites are community curated, meaning that there is no editor to pay--only the site hosting bills.

But on the other hand, I'm not sure I want to see what karma whoring looks like for academics.

If there is truly economic value to be had in filtering papers, we would expect an eventual growth in service businesses to help sift through the glut of openly published papers. Since we're currently in a long transitional period from closed to open publishing, there's plenty of time to evolve these third-party filters - from both the business and community sides - without incurring a chaotic "filter vacuum" period.

The major difference, however, would be that the research would be at least accessible to those who can't afford to pay the intermediary. That's what seems odious (to me, at least) about the current arrangement: if I want to access most of the research my taxes have funded, I have to pay again. This is on a moral par with being forced to pay for access to read laws and court decisions that should be a matter of public record, especially in an era where the marginal distribution cost is rapidly heading to zero, and I can't fathom an argument against having basic, immediate, and perpetually guaranteed access in either case.

I agree with you that it's ridiculous in both cases, but I have to point out that PACER, the Federal Judiciary database, isn't free.



Journals provide a valuable service, but I don't think that exclusivity is a necessary component of that service. The people buying the journals will (probably) keep buying them even if they can legally get the papers for free elsewhere, because getting a physical copy of the paper was never the point - curation and peer review are what subscribers are really paying for. Of course, this goes both ways: if journals start publishing crap to cut costs, people will stop buying them.

Couldn't we have a wikipedia equivalent for research? The peer reviewers do voluntary work anyway. I'm not in the system, so this is a genuine question.

You're describing an Overlay Journal. https://en.wikipedia.org/wiki/Overlay_journal

The publishing of academic journals generates billions of profit per year. One could easily decimate these profits and still have a viable industry, whatever definition of decimate one cares to use.

Profit margins around 40% if I recall...

Profit margins don't mean much in this case. Most science-related work (e.g. peer review) is done for free. Typesetting costs around $30 per article [1]. It looks like a big part of the spending is administration and lawyers for taking down copies of the articles.

[1] https://discrete-analysis.scholasticahq.com/post/40-welcome-... (ctrl+f $30)

If there's a demand, there will be a solution. With a proper review and sorting mechanism, I'm sure that can be solved.

Science has been, for hundreds of years, the most open and transparent institution. But I think a new contender has recently appeared that is even more transparent: open software development. I think science could use the same approach for research, from step zero.

One scientist has an idea, he publishes his hypothesis and intended methodology. Others can jump in and tell him, for example, that the hypothesis has been proved wrong in a recent paper, or suggest improvements in his methodology. Others can chip in and offer to replicate the experiment to increase the sample size. Mathematicians could observe and correct the statistical analysis before the conclusions are published. I think that will improve the quality of the results.

Of course, the bigger problem is that most papers would have dozens or hundreds of authors, diluting the individual contribution of each one. On the other hand, finished research would already be peer reviewed and corrected.

The parallels are greater than even those you described. I'm helping a startup focused on bringing out an integrated research environment to help fix many of the issues presented in the comments here and the article. Happy to chat with anyone who would like to see a demo. Email Lane (at) MyIRE dot com.

Have you thought about writing an article about it? It'd be very interesting to read.

That's coming soon.

You're definitely on to something; the parallels between science and open software are tantalizingly close.

From my experience a main problem is real estate; you can't do most science experiments without a proper lab. We could probably brainstorm some creative solutions (like operating an "open" contract research organization to be the hands for everyone's ideas), but at the moment this is too big of a barrier.

With real estate comes cost, then funding, then jockeying for funding, and finally journal impact factors as a measurement of worthiness.

It's also why one can't make a science-related "hackathon" without changing the concept.

that's a great idea

someone should figure out a way to make that happen

Lately I have been thinking about this, and I figured that publishing/citing can be described exactly using a Merkle tree. Things like ipfs have been on the hn front page quite a lot, I wonder whether anybody thought about using them to decentralize paper publishing.
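To make the Merkle idea a bit more concrete, here is a minimal sketch. It assumes each paper is identified by a hash of its own text plus the IDs of the papers it cites (strictly a Merkle DAG rather than a tree). The `paper_id` helper and the sample papers are hypothetical, and this is not how ipfs itself addresses content.

```python
import hashlib
import json


def paper_id(text, cited_ids):
    """Content address: SHA-256 over the paper text plus the IDs it cites."""
    # Sorting the citations makes the ID independent of citation order.
    payload = json.dumps({"text": text, "cites": sorted(cited_ids)}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical papers: a lemma and a survey that cites it.
lemma_v1 = paper_id("Lemma: result X holds.", [])
survey = paper_id("Survey building on the lemma.", [lemma_v1])

# Revising the lemma produces a new ID, so a survey that pins lemma_v1
# still refers to exactly the version its authors read.
lemma_v2 = paper_id("Lemma: result X holds (typo fixed).", [])
survey_updated = paper_id("Survey building on the lemma.", [lemma_v2])

print(lemma_v1 != lemma_v2)      # a revision changes the ID
print(survey != survey_updated)  # and so does re-citing the revision
```

Because a citation is just the cited paper's hash, "cite a specific version" falls out for free. The reputation question raised in the replies is untouched by this.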

This is not a technology problem. People send their papers to reputable journals because of their reputation and journals have a good reputation because everybody wants to publish there. Getting papers into a big journal is nearly the only benchmark academics have when it comes to getting a job or tenure.

AFAIK from my colleague doing a PhD in Europe, one of the main issues is Impact Factor [1]. The universities play a "gather the points" game, and you get, say, 35 points for publishing in a major (paywalled, expensive) title while you may get only, say, 18 or 10 points if you published in another (non-paywalled, less expensive, less reputable) journal.

If you have fewer points than other people, you don't get a grant, or you're more likely to not have your contract extended etc.

The ministry reevaluates the (journal's IF<->points per publication) mapping each year, but it takes a few years for a new journal to get a significant number of points. Publishing in low-IF journals if your paper could be accepted by a major one is career suicide.

The hidden implication is that the system reinforces the power of the all-powerful top journals.

[1] https://en.wikipedia.org/wiki/Impact_factor

Another important point is that lower IF journals generally have a smaller audience than high IF journals. This is really important for how many times your articles are expected to be cited. Your H-index [1] is now sometimes requested by funding institutions, and I'm starting to see it on resumes. It's all kind of sad.

So, in a lot of ways, the competitive research culture isn't helping the inaccessible/exclusive journal problem, since researchers are actively seeking out those journals.
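For anyone who hasn't run into the H-index mentioned above: it is the largest h such that an author has h papers with at least h citations each. A minimal sketch follows; the citation counts are made up for illustration.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for one author's five papers.
print(h_index([10, 8, 5, 4, 3]))  # 4: four papers have at least 4 citations
```

The metric only counts citations, which is why publishing where the audience is larger feeds directly into it.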


I completely agree (I am currently a scientist). It's a very hard problem to solve. I'd love working on it and in many ways it can be framed as a tech startup type of problem since it is essentially trying to build a consensus network. In a theoretical world in which I have retirement level money that would probably be the issue I'd try to fix for the rest of my life (unfortunately expecting no progress).

Mathematics seems like the most promising field to start with if you follow a bowling pin type of strategy. There have been some boycotts of Elsevier etc. from math departments and they have somewhat of a "free publishing" tradition at least from my outside point of view. It's also an excellent viral starting point because you can infiltrate other disciplines that tend to cite math papers (physics, machine learning, economics etc.)

I think it is a technology problem, because the journals exist in the first place to solve a technology problem: How to distribute, quality assure etc a paper. Without internet this obviously cost a fair amount of money. Nobody was going to just print copies of journals and distribute them for free.

I doubt a system like the journals would have developed in a world where internet already existed.

I think a variation of the system used to build the Linux kernel could have worked for scientific papers as well. In the kernel you have a pyramid of trust. Linus trusts some people directly, who again each trust another set of people. This continues downwards. In this fashion code can be signed off and bubble up through the hierarchy.

I suspect in similar fashion a scientific paper could bubble up by being reviewed by ever more respected scientists in a hierarchy.
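A rough sketch of what that "bubbling up" could look like, assuming the hierarchy is stored as a simple map from each reviewer to whoever trusts them directly. All names here are hypothetical, and the real kernel process is based on signed-off patches and maintainer trees, not a lookup table.

```python
# Hypothetical trust hierarchy: reviewer -> the person who trusts them directly.
TRUSTED_BY = {
    "maintainer_a": "root",
    "maintainer_b": "root",
    "reviewer_1": "maintainer_a",
    "reviewer_2": "maintainer_b",
}


def sign_off_chain(reviewer, trusted_by=TRUSTED_BY, root="root"):
    """Return the path from a reviewer up to the root, or None if none exists."""
    chain = [reviewer]
    seen = {reviewer}
    while chain[-1] != root:
        parent = trusted_by.get(chain[-1])
        # Stop if nobody vouches for this person, or if the map loops.
        if parent is None or parent in seen:
            return None
        chain.append(parent)
        seen.add(parent)
    return chain


print(sign_off_chain("reviewer_1"))  # ['reviewer_1', 'maintainer_a', 'root']
print(sign_off_chain("stranger"))    # None: no chain of trust to the root
```

A paper reviewed by `reviewer_1` would then carry the whole chain as its provenance, much like a patch accumulating sign-offs on its way upstream.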

Thanks for clarifying, I was replying to the technological insight in kriro's post however. Btw, is technology here completely irrelevant? The way to access content is pretty important after all. In a decentralized publishing system, what is reputation? Maybe it could be an intrinsic measure with a ranking system of some sort. But this is pretty hard. Maybe the "natural" way to publish content has no unique or sensical way to quantify prestige. The following question arises: is prestige a good criterion to judge science?

I am not sure what percentage of pay walled papers are gov't funded. I suspect many come from private universities or large companies. As for paper versioning, this is already done to some extent. Not as explicit as code on github, but many authors will release multiple versions of a paper. That being said, it is trivial to do versioning as described if you use latex and put your paper's source on github.

U.S. private universities get plenty of public funding.

Almost all published biomedical research is government-funded. "Private" universities in the U.S. get massive federal grants for research.

Companies and private universities obviously publish a lot of research papers. These are still, more often than not, somehow funded by governments, e.g. DARPA, EU funds (Horizon 2020) etc. I am not aware of many projects that are truly completely privately funded.

lots of private university papers are still paid for with govt research funds

I agree with your first premise - yes, they should be free. In this sense, the peer review process should be paid for by whoever funds the research, rather than whoever wants to read it.

And what do you think about companies that perform large-scale medical analysis on genomic data of customers, and keep their research behind closed doors?

(Probably not a big issue right now, but it could be in the future).

Well, they are the ones funding the research teams and equipment, so it's a different scenario IMHO.

I think we can agree though that knowledge should converge into being easily accessible, all while reducing the collateral effects of doing so (as the article explains: tenure, promotions, prestige…).

Some research grants are industry subsidies. Taking this away results in pure, accessible research that is absolutely useless.

Full disclosure: I'm a founder of a company called Scholastica that provides software that helps journals peer-review and publish open-access content online. One of our journal clients, Discrete Analysis, is linked to in the NYT article.

It is incredibly obvious that journal content shouldn't cost as much as it does.

- Scholars write the content for free

- Scholars do the peer-review for free

- All the legacy publishers do is take the content and paywall PDF files

Can you believe it? Paywalling. PDFs. For billions.

Of course the publishers say they create immense value by typesetting said PDFs, but as technologists, we can clearly see that this is bunk.

There's a comment in this thread that mentions the manual work involved in taking Word files and getting them into PDFs, XML, etc. While that is an issue, which you could consider a technology problem, it definitely doesn't justify the incredible cost of journal content that has been created and peer-reviewed at no cost. Keep in mind that journal prices have risen much faster than the consumer price index since the 80s (1).

The future is very clear, academics do the work as they've always done and share the content with the public at a very low cost via the internet.

PS. If you want a peek into how the publishers see the whole Sci-Hub kerfuffle, check out this post from one of their industry blogs - the comment section is a doozy: http://scholarlykitchen.sspnet.org/2016/03/02/sci-hub-and-th...

1. https://cdn1.vox-cdn.com/thumbor/jtj2dzMfklULQipRZt_3xaLoFxU...

Scholars actually pay journals to publish their papers. No, this isn't vanity press or a bribe. Scholars must work very hard to reach the point where they can pay those fees.

For example, say you submit a paper to Nature. Odds are it will be rejected, because Nature has a high impact factor. Say your paper is both accepted and passes the review process. Now you have to pay Nature to publish your paper. Depending on the options you ask for (e.g. making it free to download costs extra) you can wind up paying thousands.

No university is going to say to one of their researchers, "Hey, could you maybe not get accepted by Nature quite so much? It's expensive!". Nature knows this.

Also, the format of submission varies by field. While word files may be standard in some fields, in others latex is expected. Obviously, typesetting a latex document is pretty gosh-darned easy for a script to do.

> Scholars actually pay journals to publish their papers.

Depends on the journal and the field, really.

The fact that it's a thing at all is outrageous enough.

Indeed, and as further commentary to the previous two posters -- Nature doesn't charge publishing fees, because its popularity allows for high funding revenues.

This is not true for the vast majority of journals in the biomedical sciences.

I think this is very different. Nature has created a strong brand and they can charge whatever they please. I understand why publicly funded institutions would want to pay to have their research papers published in Nature, it could be part of their marketing budget, like billboards.

It's just that publicly funded institutions should be forced to make their papers widely available free of charge.

> It's just that publicly funded institutions should be forced to make their papers widely available free of charge.

Exactly. This problem needs a big push from outside the system, from government and people, because the publisher-academic careerist marriage is, unfortunately, working well for both.

It seems the real BS is paying to publish in "big name journal" because the Cabal will place a high value in said "big name journals"

It makes just as much sense as wearing a fashion accessory. "This person must be important because of his fancy watch"

I mean, not really. It's a big name because the best research is published there. Therefore, if you are published there it's because you are doing some of the best research.

> It's a big name because the best research is published there.

The fact they charge the author to be in the journal immediately brings into question if it is truly the best research.

I mean, maybe there is better stuff somewhere. Nonetheless, its position as a selective and influential journal with excellent research is pretty secure, which is precisely why they can charge. I still don't see it as quite analogous to an expensive handbag.

You see my point though. Charging to be in the journal means that there may be amazing research that simply refused to pay. Charging to be published in the journal is at odds with it always containing the best research.

Exactly that

Papers published there are certainly at a high quality level. In propositional logic:

A - Published in Nature

B - Good Research

A -> B is true, which does not imply !A -> !B
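A quick way to check that claim is to enumerate the truth table and look for an assignment where A -> B holds but !A -> !B fails. This is only a sketch of the logic, with `implies` as a hypothetical helper name.

```python
from itertools import product


def implies(p, q):
    """Material implication: p -> q is false only when p is true and q is false."""
    return (not p) or q


# Assignments (A, B) where A -> B holds but !A -> !B does not.
counterexamples = [
    (a, b)
    for a, b in product([False, True], repeat=2)
    if implies(a, b) and not implies(not a, not b)
]

print(counterexamples)  # [(False, True)]
```

The single counterexample is A false and B true: good research that was not published in Nature.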

Sure, but the claim I disagreed with was the claim that big-name journals were nothing more than a brand with cachet. That is not true; research in one of them is almost certainly of the highest standard.

> Of course the publishers say they create immense value by typesetting said PDFs, but as technologists, we can clearly see that this is bunk.

Amusingly, when I was a grad student, some friends showed me how they could use LaTeX to produce a manuscript that looked like a spitting image of a Physical Review article.

That was in 1992.

Well Phys.Rev. makes their LaTeX style file freely available, so it's not surprising. But yeah, it lifts quite an amount of burden from their editors if you already use their style for typesetting.


In maths everyone typesets their own papers. Good publishers let you use their (La)TeX styles beforehand to make sure it will compile cleanly; after that they do little else.

Check out Overleaf - here's a referral link that gives you and me more disk space: https://www.overleaf.com/signup?ref=64cf3f9ff138

Web based LaTeX editing, makes it very easy to collaborate. But also they have many standard journal and university templates, and direct submission links to lots of journals to make the whole process of formatting and submitting a paper much simpler.

Yes, this of course raises the question of why some journals charge so much.

I still believe publishers have value, but recently some of them have figured out how to do as little work as possible while making the most money.

Who can blame them in a capitalist society. Anyway, I'm glad to see people respond via places like Arxiv. Sci-Hub is more like civil disobedience. Great that it happened for all the learners out there, but it's going to make some people very angry because they have bet their careers or invested based on the idea that the government protects and enforces copyright agreements.

We want people to invest in research. Academia has been very lucky in this regard as higher level institutions have continually grown in America. Should people begin to feel that investing in research has no return value, there might not be so many tenureships around in the future. I don't think this is happening I'm just conjecturing and suggesting that while a correction is definitely needed, I'm not sure leaping to the "all free" model is going to work for every scientific community. I could be 100% wrong though. I like the direction in which we are headed. It's more trusting, and I think that means growth for a society.

The returns that we expect and want from research have nothing to do with publishing models, though. And most of the people profiting from the current publication system aren't what we would call great scientists, so much as opportunistic capitalists. They aren't advancing society; they're just pulling in a paycheck, extracting rent from people trying to make actual scientific impacts, or figure out how to transform those advances into productive realities.

From the kitchen link,

    "A PDF is a weapons-grade tool for piracy"

    "This is despite the fact that the publishers’
    action directly addressed a very real
    economic problem."
This "very real economic problem" that the author describes is not making enough money off textbooks. The publishers have lost all relevance by having contorted themselves over the years to fit into a model which no longer makes sense.

Journals can run themselves (with proper software) and historical impact factor needs to be abolished.

> Can you believe it? Paywalling. PDFs. For billions.

Wow, I'm amazed that scientists, of all people, would tolerate such a scam. Why don't they come together, leave these pay-walled journals en masse, and switch to free open ones?

In politics, you hear about how uneducated voters are tricked by politicians, but this is far worse. Here you've got some of the most intelligent and creative people on the planet knowingly allowing themselves to be screwed over. Wow.

Here's a hint: some perfectly qualified scientists pursue work outside academia because they're unwilling to put up with publishers and journals. The ones who stick around base their career advancement almost entirely on publication-derived credentials, so it isn't in their best interest to rock the boat. There's a strong selection bias at play.

To be fair, many perfectly qualified scientists pursue work outside of academia because they couldn't get a permanent job in academia. Which is another sort of selection bias at play, particularly wrt their commentary on academic practice.

Publish or Perish is a highly accurate description of the career. Improving the publishing experience is not the same as publishing, so while you and a couple of your colleagues band together to improve the process, another colleague continues to publish and now looks better than you on paper.

Hmm.. If only governments required that any research it funds be published in open access journals. If enough people around the world are made aware of this issue, they could petition their government to enact such a policy...

Why are you under the impression scholars write the papers for free? That is what researchers are employed at universities to do ... Yes, it is not like publishing a book but these papers are not really like books anyway.

> Why are you under the impression scholars write the papers for free? That is what researchers are employed at universities to do ...

Researchers are employed to, drumroll please, research. (inc. teach, and mentor, and advise, and admin, etc.)

Disseminating your research requires writing it up, yes. So, no, they don't technically write up the research for free, but those who benefit (the publishers) are not the same entities who pay the researchers' salaries, so it's as if they are getting something for practically nothing, i.e. "free".

> Yes, it is not like publishing a book but these papers are not really like books anyway.

Nobody is making that comparison.


It's very simple, the internet has shone a strong light on a very profitable business model and people don't like what they see.

They're not getting paid by the publisher.

No, we don't get paid by the publisher, but we are paid to write papers. I can be downvoted for being correct but it doesn't change anything. Researchers are paid by research institutions to do research, the product of which is papers.

You’re being downvoted because this technicality / POV has no real bearing on the main question and doesn’t add to the discussion.

You're totally right. He does make some other great points in his comment which could be acknowledged in a measured response...

The point is that publishers make immense profits from publishing the work of others whom they don't compensate. Do you think it is 'fine' that the output of publicly funded research is locked away and rented out to other scientists by 3rd parties who contribute very little?

The majority of scientific research in the U.S. and elsewhere is funded by tax-payers money in the form of research grants. The outcome of this research (the paper) should be free.

I feel especially strongly that papers that result from taxpayer-funded research should be free.

I also think code used to produce results should be published as well.

Have you worked with biologists? If they could figure out how to use git, let alone github, I would be astounded. Don't forget, these are incredibly smart people; they are just bench-top scientists, not laptop ones. If you want their code, it will be, at best, a copy-paste of badly hardcoded Matlab, years in the making, with for-loops for the sake of for-loops alone, and without ANY comments or documentation whatsoever. Honestly, it would take less time for you to write it yourself and compare, rather than to try to peek under the hood of a spaghetti mess.

Honest to god and on the grave of my mother, this happened to a friend. She was going over some old Fortran 88 code, filled with GOTO statements. The code, at best, was a rat's nest. Only a deep and long fight could get it into your brain. At about 3 am after a long day in the lab, she finally gets to a line in the code that says GOTO LINE 12345 with the comment 'HAHA MADE YOU LOOK'. Her boss bought her a new computer after she threw that one off the roof.

That experience is universal with biologist code.

> Honestly, it would take less time for you to write it yourself and compare, rather than to try to peek under the hood of a spaghetti mess.

That's not actually the point. The point is reproducibility. If you write your own code and get a different result, why? If the reason is the code then having the other code, no matter how terrible it is, allows you to figure out what the difference is. And then somebody is wrong and you know why and can publish that result.

Exactly. IMO, this is a big problem in some areas of computational biology right now. Code to a famous project gets released, and instead of independently validating the results, other researchers (rationally) choose to just build on top of the famous project. It's faster, easier, and the "collaboration" leads to social prestige for everyone involved.

The end result is that empires are built, reproducibility is (paradoxically) harmed, and we're someday going to end up finding out that some big, high-profile projects were built on pillars of sand.

My experience is that it takes a certain kind of twisted intelligence to write code that is below a certain level of quality, but still works. Someone like me who is a mediocre programmer? If my code gets too terrible, it simply stops working. I'm forced to use some base level of software engineering if I want to write a large program that runs, just because there's a limit on how many random side effects I can keep track of.

From what I've seen when supporting EEs? People who are that much smarter than I am don't have this limitation. They can write thousands of lines of spaghetti Perl and bash and as long as nobody touches the damn thing, it works fine. God help you, though, if you make a change.

But... there is an established job role for this; I mean, you don't want to make your EEs sysadmin their own LSF cluster, either.

Nope, in my opinion it's always below-average programmers who don't seem especially smart, and who don't want to learn, who produce the worst code.

There's a difference between 'being smart' and 'being able to program well', and that difference is precisely why academic code has its reputation.

My observation is that to make a program run, a less intelligent person needs to organize that program better. But, intelligence is a controversial subject in and of itself, so I don't expect to see agreement.

> God help you, though, if you make a change.

Or need to add functionality. Or, just wait a year.

Sounds to me like there's an opportunity to propel biology forward by teaching some biologists how to manage their code in a way that won't cost large amounts of productivity.

This is exactly the goal of http://software-carpentry.org/workshops/

Ha, it could be worse. I have a friend who's getting his PhD in microbiology and he and his team were running into some trouble with these massive multi-gigabyte CSV files from NIH (gene sequencing data, if I recall). Turns out they were trying to open the little buggers in Excel and wanted to stay in Excel since it was familiar. At some point, I wound up just suggesting they use split on the stupid (but very important) things and people acted like I just blew their minds.

I think sometimes you just get so caught up in your field of expertise that you entirely miss some of the tools that could drastically help you. You have to wonder how much of an efficiency drain that adds up to over the course of even just individual research projects. And that's before you get to bad code.
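For the multi-gigabyte CSV case above, here is a minimal Python sketch of the same idea as `split`, with one difference: it repeats the header row in every chunk (as far as I know, plain line-based splitting does not), so each piece can be opened on its own. The function name `split_csv`, the `chunk_00000.csv` naming, and the default chunk size are all my own assumptions, not anything the team actually used.

```python
import csv
from pathlib import Path


def split_csv(src, out_dir, rows_per_chunk=100_000):
    """Split src into numbered CSV files of at most rows_per_chunk data rows.

    The header row is repeated at the top of every chunk. Rows are streamed
    one at a time, so memory use does not grow with the size of the input.
    Returns the list of chunk paths in the order they were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    out_file = None
    writer = None
    with open(src, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return chunk_paths  # empty input: nothing to write
        for i, row in enumerate(reader):
            if i % rows_per_chunk == 0:
                # Start a new chunk: close the previous one, write the header.
                if out_file is not None:
                    out_file.close()
                path = out_dir / f"chunk_{len(chunk_paths):05d}.csv"
                out_file = open(path, "w", newline="")
                writer = csv.writer(out_file)
                writer.writerow(header)
                chunk_paths.append(path)
            writer.writerow(row)
    if out_file is not None:
        out_file.close()
    return chunk_paths
```

Calling `split_csv("big.csv", "chunks")` (hypothetical file names) would leave a directory of smaller files. The default of 100,000 rows per chunk is arbitrary; it just needs to stay under Excel's limit of 1,048,576 rows per sheet.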

What is Fortran 88? I was under the impression it jumped from Fortran 77 to Fortran 90?

That would also have the benefit of making sure the code actually works. From my time in CS grad school I'm pretty sure there are many published papers where the author never had code that achieved the results they described.

Or code that worked on the specific test case described in the paper but was flaky as hell on anything else.

The most disillusioning thing for me as an undergrad was trying to replicate results from a paper on a relatively complicated problem, and not achieving anywhere near the paper's level of results.

I got access to the source code, and the super complicated algorithm added almost nothing to the results; the glossed-over, hand-waved-past data normalization worked so well that there wasn't a need for any further classification. This paper was pretty well received and cited, despite the headline algorithm basically not working.

I am afraid that what you describe is a pretty fundamental limitation of peer review. Currently it is not expected of reviewers to try to replicate experiments, but only to check if the methodology of the paper seems sound and the relevant previous work is cited. Should this be expected of reviewers? Should the authors provide all code and data so that the results can be replicated at the push of a button? That seems a pretty tall order. Seems like we'll be stuck with a modus operandi of lots of papers being published, peer review only catching obvious errors, and replication being very rare.

I get that, and I don't even think that that should be the reviewer's job. But for CS in particular, I do wish more people published source code and build procedures.

I think a central hub for public comments, with source, and available PDF would be incredibly useful for CS as a field. For most papers I had access to, to get the source I'd have to go through at least one person. So I tried for quite a while to replicate the result before even trying to get access. If it was available, on say github, I'd have grabbed the source code as a resource to understand the paper, and with something like publicly available commenting, I think it'd have been a non-issue.

The problem is, I can see how such a system would be incredibly constructive to the field and the community, but it could be a liability to the authors, and would make publishers meaningless, so I doubt it happens anytime soon.

At least in the engineering fields, many of the papers require at least a discussion of the verification and validation of the codebase. So it is improving, if admittedly far too slowly.

The problem is it's very difficult to fit a full and proper validation in the appendix of a paper. What's really needed is to standardize the inclusion of "additional material" with the paper, including source code, test cases, and results.

Wow, I never considered that this might fall under the same umbrella. It seems very reasonable because it isn't very hard to publish that now. Even if the shared code wouldn't run without a ton of requirements and a very specific setup, it's still something. I know we all hate to release code that doesn't compile or run outside our own setup, because we then feel we will be responsible for helping someone else get it to run on their machines, to which we have no physical access. Not to mention it is always a mess figuring that stuff out. But other research fields do not feel compelled to hold the hands of other researchers trying to replicate their experiments. Presumably, whichever code is easiest to run will more likely serve as the basis for future research, thereby enhancing the career growth of the original publisher. I'm not a researcher by the way, and this just dawned on me, so if this is obvious to all of you, then I apologize for the block of text.

I agree with this in theory... but do you really want my garbage, ad hoc, messy code that I used to pipe output from one program to another?

Yes, please.

To be fair, a lot of people do publish their working code with the paper. It helps people like me understand how to do the work.

Bad code is fine, good code is better, great code is great. Stick to fine and I can do my job and you can write papers knowing that your work will be used by many.

I'm not actually recommending this license, but it seems relevant to mention:

The CRAPL: An academic-strength open source license http://matt.might.net/articles/crapl/

"the Community Research and Academic Programming License"

Definitely yes, then I can see, somewhere in that garbage, ad hoc, messy code, why I can't replicate the result myself.

If it's that garbage, how do you know your results are correct? It should be at least non-garbage enough that someone else could theoretically check it, or you're not doing science at all.

You don't. But there are 5x-100x more bugs lurking in the methodology so triage is the name of the game.

A much bigger problem is that grantsmanship strongly incentivizes against verifying your methodology and being diligent in your construction of null hypotheses.

This is like saying "I don't want to file my taxes because my handwriting is bad".

Well, if you're embarrassed by your terrible handwriting, maybe it's time to put some effort into improving it.

Ok I don't mean this in a mean way but you don't seem to appreciate what CS research is like. It is not about handwriting.

Filing taxes is not about handwriting either - I guess you missed the analogy. Research isn't about writing good code, but publishing your code is useful (for reproducibility, which is part of research), and if you're embarrassed by your code quality, then you should put some effort into it.

Absolutely yes. Science isn't supposed to be clean and tidy, but reproducible.


Especially because if you are forced to publish your code, it will be better.

Considering that a mistake in such a program could completely skew results, yeah that seems pretty important.

To be honest, I'm surprised anybody even takes research seriously which depends heavily on a custom program whose source code is kept hidden.

Yes, because if it doesn't even compile then I will call BS on your results.

See, that's exactly what you shouldn't do if you want to encourage academics to release the source code.

Academic code sometimes only has been made to work on one machine (the author's). Someone reproducing the results should still have access to the source, even if they have to hack it to work on their system.

It's better than nothing.

I don't know... Did you do it right?


But seriously.

I think you are creating a large barrier unnecessarily. Running a research project and publishing a paper are more complicated than that. It is good when the researcher can do that but it is not always practical.

Really, a lot of the comments on this and the downvotes seem to betray a lack of understanding of what CS research is like, which is understandable, I guess. CS research is not like creating physics experiments; there are methods, techniques, and algorithms. When there is an experimental result, CS researchers do not treat it as sacrosanct and do have an informed skepticism, but in most cases we are generally able to discern the bulk of what is important from the content of the paper itself.

Yeah, if you look at the experimental results section from papers in SOSP, you could come up with all kinds of complaints probably all day long but there is more to it and the community than is being taken for granted here.

What about all the IP generated as a result of taxpayer-funded research?

That too, although a case could potentially be made that licensing of university-owned patents reduces the need for tax subsidies. E.g. the University of Wisconsin recently won a $234 million judgment against Apple just as their state funding was being cut by $250 million. http://news.wisc.edu/warf-wins-patent-infringement-lawsuit-a...

Wouldn't it be more efficient to just tax Apple by the same amount of money and then use it to fund research, rather than turning universities into patent litigants and then having a large proportion of the money go to lawyers rather than research?

I'm not really interested in those particulars but offered it as a possible rationale that isn't there for academic publishing. I don't think patents (or copyrights) should be granted for publicly funded research at all, as it's effectively work-for-hire already owned by the public. But it gets more difficult when public grants are only a minor part of the funding.

It seems like the easy way to solve that would be to not have research where the government funds only a small part. If the government is going to fund it, fund it the whole way.

The alternative has way, way too much potential for rent seeking. You'll get private interests who connive a way to have the government pay to do the hard/uncertain/expensive part while they don't put in their dime until they're already sure it's going to pan out, at which point they swoop in and claim the full IP monopoly to the detriment of the public.

"Easy"? Certainly in biology that would overturn the entire system for drug/treatment discovery. The government is good at funding basic research, and the private sector is good at bringing products to market. There is no clean separation between the two activities.

The government is good at funding research but it doesn't fund FDA trials and neither will anybody else if they aren't provided some incentive to. That's a pretty clean separation.

The second is a completely separate problem from the research. If you have a promising drug that people have been using in other countries for a thousand years but has never been FDA approved, it never gets FDA approved because nobody can patent the prior art so nobody will pay for the drug trials even if no research is required whatsoever.

Two possible solutions are to have the government fund the FDA trials or to give the party who does a temporary monopoly on the drug regardless of patentability. And if two parties offer to fund the trials in exchange for the monopoly, let them bid for the right.

There are three types of funding going into US labs: money, green cards, and IP rights. Science is already horribly under-funded and removing IP rights, one of the sources of non-conventional funding, would make it even more so unless the removed non-conventional funding was replaced with conventional funding. Which is unlikely. Given that context, I'm all for letting government funded labs keep their IP.

From a government / national perspective should it be free world wide?

If so, why should nations give away their expensive research that potentially gives them an edge, either militarily or commercially?

Personally I almost never traded technology when I played Civilization, let alone gave it away!

> From a government / national perspective should it be free world wide?

Yes. The answer is unequivocally yes. Your country will benefit vastly more from the research being freely available than other people having a better world on your dime will disadvantage you.

And here's the thing. Other countries face the same calculus. Scientific research brings the greatest return on investment of anything ever. It's why we have electricity and computers and satellites and penicillin. At the national level you can pay for it and give it away and still come out ahead regardless of what anybody else does.

Since every country benefits from doing it by more than it costs them, everybody can "cooperate" by doing research and giving it away without having to worry about anybody else defecting -- because defectors can't hurt you, they can only not help you. The sooner everybody realizes that the sooner everybody can choose to cooperate and the better off the whole world will be.

And the returns when every country publishes research for free are... large.

Nationally-funded research that's published in expensive journals is still very much available to the rest of the world. That's true with about anything that's published...

I was gonna say that the rest of the world purchases from these journals, but of course the journal might just as well be foreign, so that argument doesn't hold.

Good point about the current situation.

Journals don't buy the science. It's given to them for free by scientists in return for "prestige". Most of the time this is damaging to their government's interests, since they give away tax-funded research, usually to a foreign entity.

No, it's rather important, vital indeed, that code is published. Ironically Phil Jones of the UK Met Centre made this point himself:

    "We have 25 or so years invested in the work. Why should I make the data available to you, when your aim is to try and find something wrong with it."

Some of us always thought that having your ideas and results discussed and examined in a public forum was what science, an intensely cooperative activity, was about.

- And especially when public policy often demanding massive input from the taxpayer depends on it.

Stuff with military applications should not be (and isn't) published in journals either.

> Personally I almost never traded technology when I played civilization, let alone give it away!

This is one of the easiest ways to manipulate the AI players...

Are you, as a taxpayer, willing to pay (your share of) the extra $1000 or so per paper that it would take?

Yes, as a taxpayer, I am 100% ok with that. The $1000 is a small fraction of the total cost, and I'd be paying close to or more than that amount for all the journal subscriptions that a federally funded research institution carries anyways. The benefit being, that the work that I am paying for is accessible to everyone instead of just the elite institutions.

Bingo. Putting the $1000 in context of the total cost makes it much more palatable.

$1000 for what? The research and paper writing was already funded by the tax payer. Peer review is done for free by other researchers today. Journals provide very little value for the money they extort for paper access.


>But that financial model requires authors to pay a processing charge that can run anywhere from $1,500 to $3,000 per article so the publisher can recoup its costs.

Edit: also, I don't expect margins of 30% from companies producing very low value. Do you really think that if they'd lower their margin to 0, with prices of around 30% less, everyone would stop complaining?

Please tell me what someone like Elsevier does then? Basically they have a monopoly on various prestigious journal names and extract rent from both sides. Finally, researchers are taking a stand and slowly moving towards open journals. See here: http://openaccess.commons.gc.cuny.edu/2015/05/20/elsevier-ev...

If they're merely extracting rent, you should expect two things to be true:

1. Their margins are extremely high, in the 90% range.

2. Open access journals can charge orders of magnitude less.

Neither is broadly correct.

Regarding the claim that ~30% margins are excessive: do you think that if their prices went down by 30%, everybody would be happy?

Margins for Elsevier are closer to 40%, but I don't care about margins. What I care about is that we the tax payer have already paid for the research. Research often happens with government grants, done by researchers working at public universities. Editing and peer review is done for free by other researchers (who are also usually paid by grants and public universities). Elsevier does nothing but extract rent and pay for lawyers to help them keep extracting.

It is insane that they go after people who wrote the paper when they try to give it away on their own website.

A little reading goes a long way: https://en.wikipedia.org/wiki/Elsevier

You've paid for the research, but not for the publishing cost. If you don't care about the cost, then the decisions you support will obviously be skewed.

Claiming that they're just extracting rent ignores the cost, which even at a 40% margin is still 60% of what they charge.

Do you have any information on their legal costs that would reduce the cost above?

Do you think everyone complaining about prices would be happy if they would reduce prices by 35% across the board, thus making no profit? Or would you still want it to be free? If the latter, your complaint is not solely based in Elsevier's "rent extraction", but in something else.

But the author also pays for the publishing costs. Submitting a paper already costs 1-2k to have it in the journal. PLoS has shown that the 1-2k range is the publishing cost. For even more information that you'll likely not read see here:


It's hardly surprising that publishers would fight dirty to hang on to a business model where scientists do research that is largely publicly funded, and write manuscripts and prepare figures at no cost to the journal; other scientists perform peer-review for free; and other scientists handle the editorial tasks for free or for token stipends. The result of all this free and far-below-minimum-wage professional work is journal articles in which the publisher, which has done almost nothing, owns the copyright and is able to sell copies back to libraries at monopolistic costs, and to individuals at $30 or more per view.

http://www.researchinfonet.org/wp-content/uploads/2013/09/We... also doesn't find such fees, saying only one Elsevier journal charged page fees.

https://academia.stackexchange.com/questions/18625/do-spring... suggests otherwise. Where do you see closed journals charging 1-2k to submit papers?

> so the publisher can recoup its costs.

Given that peers review a paper for free and it costs fractions of a penny to host a pdf, why should we care about the publishers? What value do they ultimately bring to the table?

We should be rethinking how this is done entirely.

This is off the cuff, but...

Perhaps researchers applying for public funding should be required to register as peers. Each researcher submits their publication to a public service like data.gov (maybe research.gov?). Peers are automatically assigned according to area of expertise. The peering process proceeds much like it does today.

On completion of review and approval, the paper and any materials needed to reproduce the research are made public and available to anyone.

The cost for all of that would be relatively trivial.

Also, many of those journal fees are paid for with tax dollars to begin with. An expense that would no longer be necessary.

Some portion of the other 70% is going toward legal fees to protect "their" intellectual property, developing new ways to monetize the windfall, lobbying to obstruct copyright reform, running lower-margin businesses unrelated to academic publishing, and so on.

40% profit margins now, not 30%, but I completely agree: it's critical to keep an eye on what the company is actually doing rather than sweeping it under the skirt of the economic term "value," which never ceases to amaze in its ability to diverge from the more common definition of the word.

Yes, I suspect creative accounting.

In CS, conferences (which organise peer review and pretty much all publish open-access proceedings) hardly ever charge more than something like $200 for attendance, and this also pays for the conference venue, logistics and a slush fund of financial support for student attendees. Why would the peer review and publication part cost more in other fields?

It seems much more likely that PLOS simply charges the "$1000 or so" because they can, and it's easy enough to piggyback on the traditional publisher narrative that they are actually providing this amount of added value.

PLOS is a non-profit, and employs full-time professional editors.

People forget the weird workflow in any journal that does biomedical work (including most of the PLOS family): manuscripts are submitted as Word files, then converted to XML and reformatted by a semi-automated process. There's manual labor involved to format the references and cross-references, mark up tables and figures, and proofread the HTML, PDF, and XML versions. The XML is then submitted to PubMed Central for archival storage.

There's a lot of manual labor involved in the process. You can't just hand-wave it away, say "peer review is provided for free", and assume the rest of the process is free.

For all that matters, a badly formatted "Times 12" Word document is just as fine as a "polished" research article. This is science we are talking about, it's not an infomercial. Besides, all journals have strict and long guidelines that all submissions follow. I just don't buy the manual labor argument.

The federal NIH open access standards require JATS XML: http://jats.nlm.nih.gov/

Proper formatting makes it possible to automatically extract bibliographic reference data, text-mine the contents for whatever interesting purposes you might think of, display it in apps and new formats, and store it for decades in a stable format.

I've read enough journal articles that I've come to appreciate proper formatting, rather than "Times New Roman 12" Word documents or "Generic LaTeX Template #17".

DJB has quite a different view on journal-provided LaTeX styles: https://cr.yp.to/writing/visual.html

Nice value added by the MSRI style right here.

Then they should tell authors: "give us your material conformant to our schema and we'll cut the price in half." That way the author can decide. I bet that most would still submit Word documents.

Then just make the Word document open access. Someone at Google or Facebook can come up with some machine learning tools to automatically convert these into the proper format and add cross references.

Then require that the papers be written in fucking markdown and have pandoc handle the 1" borders and 12pt Times New Roman font.

Markdown does not support cross-referencing of figures, tables, and sections, and doesn't have numbered "environments" like LaTeX (e.g. theorem, definition, example). It would need to be extended.

There's a slow-moving Scholarly Markdown project, but I don't know what its status is. http://scholarlymarkdown.com/

I agree with the sentiment, though. If there were some markup language that supported what academics do with LaTeX and Word, and converted cleanly to JATS XML and to nice-looking PDFs, it would be enormously useful.

$200 is a lot lower than I've seen for top CS conferences recently, e.g. CCS 2015 was over a grand for the full conference last year: http://www.sigsac.org/ccs/CCS2015/pro_reg.html

Yikes, and I have a paper in the oven I'm hoping to submit to CCS this year... I suppose my mental image was influenced by less flashy conferences along the lines of https://cie2013.wordpress.com/registration/. For what it's worth, I don't think there is any reason to assume the peer review and publishing effort put in by CiE is of lower quality - though its acceptance rate is likely to be higher than that of CCS, which may play a role if attendance is closer to being proportional to how much they accept than to how much is submitted.

CS conference proceedings published by our friends at ACM and IEEE are usually for-pay just like their journals.

As a taxpayer, you already pay state universities to access those papers. If every paper was published as open-access, I am sure the price of open-access could be reduced.

>If every paper was published as open-access, I am sure the price of open-access could be reduced.

Do you have any evidence of such economies of scale in existing publishers? As the article says, their margins are around 30%. That means the maximum they could reduce the price without losing money (with simplifying assumptions) is in the 30% range. If you have a way to be significantly more efficient than the status quo, go ahead, I'm sure everyone will be thrilled.
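The margin argument above is simple enough to write down. This is only a sketch of the arithmetic, under the same simplifying assumptions the comment makes: margin is profit as a share of price, and costs do not change when the price does. The function name `break_even` is my own, and the $3,000 fee is just the upper end of the range quoted from the article, used as an illustration.

```python
def break_even(price, margin_pct):
    """Return (cost, max_cut_pct) for a price and a profit margin in percent.

    Margin is profit as a share of price, so cost = price * (100 - margin) / 100.
    Cutting the price all the way down to cost removes exactly margin_pct
    of the original price, which is the most that can be cut without a loss.
    """
    cost = price * (100 - margin_pct) / 100
    max_cut_pct = 100 * (price - cost) / price
    return cost, max_cut_pct


# Illustrative only: a $3,000 fee at the 30% margin cited from the article,
# and at the 40% figure other commenters give for Elsevier.
cost_30, cut_30 = break_even(3000, 30)
cost_40, cut_40 = break_even(3000, 40)
```

Either way, the remaining 60-70% of the price is cost in this model, which is why the disagreement in the thread is really about whether those costs are necessary, not about the margin itself.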

>As a taxpayer, you already pay state universities to access those papers.

That's a fair point. Is the government paying more in access fees than they're saving by not using open access journals?

You can't take the costs provided by journals as a minimum because they are likely providing services that are no longer necessary or could be made much more efficient.

Here's an example of how to do it on the cheap:

"Scholastica charges us $10 per submission. We have a grant to cover this and our other costs, which are very low. Our total costs probably average about $30 per accepted article. Therefore, there are no charges for authors, and obviously there are no charges for readers either. A typical article processing charge for one article published in a traditional journal would keep us going for around 50 published articles."
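The two numbers in that quote imply a figure for the "typical article processing charge" it has in mind. A quick sketch of the arithmetic, using only the $30 per accepted article and "around 50 published articles" from the quote; the function names are mine, and the $3,000 input is the upper end of the range quoted from the article elsewhere in this thread.

```python
COST_PER_ACCEPTED_ARTICLE = 30  # USD, "about $30 per accepted article"
ARTICLES_PER_TYPICAL_APC = 50   # "around 50 published articles"


def implied_apc(cost_per_article, articles_funded):
    """Processing charge implied by how many low-cost articles it would fund."""
    return cost_per_article * articles_funded


def articles_funded_by(apc, cost_per_article):
    """Whole number of articles a given processing charge would pay for."""
    return apc // cost_per_article


typical_apc = implied_apc(COST_PER_ACCEPTED_ARTICLE, ARTICLES_PER_TYPICAL_APC)
at_upper_end = articles_funded_by(3000, COST_PER_ACCEPTED_ARTICLE)
```

So the quote is implicitly assuming a charge of about $1,500, the low end of the article's $1,500 to $3,000 range; at the high end, one charge would cover about 100 articles at this journal's cost.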


If authors find that such journals fulfil their needs, then the traditional publishers will have been outcompeted. But make that case directly; don't yell at traditional publishers for being expensive, when they also provide far more service. If you don't think it's needed, don't use it.

That's the problem with the current system, though. The authors aren't choosing a journal for the extra services the publishers provide, they're choosing it for the prestige associated with e.g. Nature. It's similar to the reason why any new social network, no matter what features it has, would have a hard time competing with Facebook.

This isn't entirely true. Authors try to publish in Nature not just because of its prestige, but also because it provides a wider audience. If you have a strong result that you think would be interesting to a wider audience, then publishing in Nature or Science, rather than a specialized journal, makes sense.

Additionally, these journals really do provide more services than most specialized journals. In particular, the editorial service at Nature or Science is of much higher quality than your typical specialized journal. The editing includes editing the text for language and clarity, typesetting, and commissioning third-party high-level summaries (Nature's "News and Views" section). In contrast, at least in computer science and mathematics, most journals do minimal, if any, editing. The editors of prestigious journals do add value, and this helps to maintain the journals' relatively broad audiences.

Even if it did cost $1000, which it doesn't: tax pays the salaries and materials to do the research, which costs more like 10k to 100k, so an additional 1k of overhead is a drop in the ocean.

It's actually more than $1000. PLOS ONE starts at $1350 and tops out over $2000.

And for a lab that's publishing a lot, it can add up. Some labs are putting out 10+ pubs a year. And depending on the field, that $10-20k can be a big deal.

FTA: >But that financial model requires authors to pay a processing charge that can run anywhere from $1,500 to $3,000 per article so the publisher can recoup its costs.

Given academic salaries and researcher publication rates, I am confident that it is far less than the $1000 you pulled out of thin air. How much did your tax go up when NIH mandated free publication after a two-year period?


Other open access journals are similar. Publication rates are high, because the vast majority of published research is NOT open access.


>But that financial model requires authors to pay a processing charge that can run anywhere from $1,500 to $3,000 per article so the publisher can recoup its costs.

They may have pulled it out of thin air (although I've seen similar numbers elsewhere, this doesn't seem to be in dispute).

They should be. It's better to pay for open access than lock science behind paywalls. There is vast knowledge that could be mined by machines, indexed by AIs, processed for finding novel patterns, but it's impossible when it's all locked behind paywalls. Want to buy "prestige"? Pay for it. That's fairer than "give us exclusive rights to your science for free or you don't get prestige".

Universities already pay comparable subscription fees.

If most papers can be accessed for free, why would universities still pay?

They can't be accessed legally. Most universities are too invested in the current system to stop paying. That's why publishers bundle their journals in extremely expensive packages they sell to libraries.

Harvard is starting to complain. Harvard.

Maybe public funders ought to claim copyright on any work produced, and require open access publication.

That's a perfectly fine policy, if they want to implement it. Just note that it will increase the cost of every paper, because they now need to pay for open access.

Publication costs are minuscule relative to research costs. Rounding error.

And furthermore, they're minuscule compared to costs of journal access.

Edit: I'm wrong on the second point. They're actually comparable. See comment below.

According to the article, they're 70% of publishers' revenue (straightforward extrapolation from 30% margin). I can't think of any sense in which that's minuscule.

Research costs may be correct. Do you know where I can find data on average funding for published studies? Or how many government funded studies were published at what budget?

Sorry, I was wrong about publication cost vs the cost of published articles. They're comparable.

Let's say that I'm writing a paper. Maybe the underlying research cost $100K. So paying $3K for publication isn't much. And let's say that I need to review 100 papers for background, discussion, etc. At $30 each, that's also $3K. Even less, for subscription-based access.

So maybe it's a wash for grant-funded academics who publish in journals that charge $3K to publish and $30 per copy. But it certainly isn't a wash for academics with marginal funding, not-for-profits, etc, who publish at much lower cost.
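For anyone who wants to poke at the numbers, here is a back-of-the-envelope sketch of the arithmetic above in Python. The figures ($100K of research, a $3K publication fee, 100 background papers at $30 each) are just the illustrative values from this comment, not measured data, and the variable names are made up for the example.

```python
# Illustrative figures taken from the comment above; none are measured data.
research_cost = 100_000   # assumed cost of the underlying research, USD
publication_fee = 3_000   # assumed fee to publish one paper, USD
papers_read = 100         # assumed number of background papers to review
price_per_paper = 30      # assumed per-copy price, USD

# Cost of buying every background paper individually.
access_cost = papers_read * price_per_paper

# Both costs expressed as a fraction of the research budget.
publication_share = publication_fee / research_cost
access_share = access_cost / research_cost

print(f"access cost: ${access_cost:,}")
print(f"publication fee vs research cost: {publication_share:.0%}")
print(f"access cost vs research cost: {access_share:.0%}")
```

With these inputs the two costs come out equal, which is the "wash" described above. Changing `publication_fee` or `price_per_paper` shows how quickly that stops holding for cheaper venues.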

When I finished my PhD at Northwestern, part of the university's procedure involved going to the ProQuest Web site. ProQuest is a journal and dissertation publishing company.

They asked if I wanted my dissertation to be available, free of charge, to anyone interested in reading it.

Clicking on "yes, I want to make it available for free" would cost me something like $800.

Clicking on "no, I'll let you charge people to see it" would cost me nothing.

Having just finished, and being in debt to do so, it shouldn't come as a surprise that I wasn't rushing to pay even more. So now, if people want to see my dissertation, they have to pay -- or be part of an institution that pays an annual fee to ProQuest. (BTW, e-mail me if you want a copy.)

My guess is that it's similar with other journals. And while professors have more money than PhD students, they have limited enough research funds that they'll hold their nose, save the money, and keep things behind a paywall.

Which is totally outrageous. It's about time that this change, and I'm happy to see what looks like the beginning of the end on this front.

There is an alternative: you host your thesis (and/or papers) on your own website, or you put them on arXiv.org. It's called self-archiving and it's allowed by most publishers. Funny thing is: making your papers available for free also increases the number of citations, something academics really care about.

And another alternative at some universities is the university thesis open archive itself. In Northern Europe I'm told some universities won't even give you your diploma if you don't upload your thesis in their open archive.

Frankly, every university in the world should have an open archive for theses (and mutualized is even better). It's just the continuation of keeping thesis copies at the university library.

There's nothing that stops you from uploading the PDF elsewhere, right? Why not just push the file to some free hosting service (or even GitHub in a GitHub Pages repo)?

Some journals forbid doing that.

Pro tip: check http://www.sherpa.ac.uk/romeo/ to review journal self-archiving policies.

Some things, like dissemination of knowledge, are truly in the interest of all humanity. It seems criminal that a few hundred people at the publishing houses should benefit at the expense of billions' welfare.

"It seems criminal that a few hundred people at the publishing houses should benefit at the expense of billions' welfare"

To me it doesn't seem criminal. It is criminal, for to keep knowledge in the hands of the few and privileged to the detriment of the majority of all humanity is a crime. But not only does the Industry keep this knowledge to itself, it prevents even those who directly fund it (tax payers in developed countries) from accessing it themselves.

I honestly despise those journals who manipulate the current system for profit, just as I, and I believe most anyone, would despise any organization or individual manipulating a system at the expense of the well-being of the many.

And I laud sci-hub and Elbakyan for something I consider a service to humanity.

While it is the scientists who are writing the papers and editing them, the journals provide a valuable service as intermediaries between the scientists. Whether this service is best provided "for free" or for profit is another question. In some fields, the leading journals are already published by not-for-profit organizations such as the ACM, but I am not sure that they are doing a better job than Springer or Elsevier.

In a language that's understandable to the HN crowd, journals:scientists are as VCs:(founders + investors).

What exactly is the valuable service they provide as intermediaries? Why couldn't all taxpayer-funded scientists just publish their (anyway voluntarily peer-reviewed) research on whatever free online platform?

As a taxpayer-funded scientist I publish all of my stuff on arXiv before submitting it to a journal, and it is 100% worthless for my career. If my research is not published in respected journals, other researchers will not bother looking at it, and justly so. Have you tried reading a paper selected at random on arXiv? You'll spend many long hours reading it only to end up realizing that it's garbage (unless you're lucky to realize this from reading the abstract).

Completely agree with your point. But the value here is brought by the reviewing process, which is (at least for CS as far as my knowledge goes) done voluntarily by other researchers, not by the journal itself. So the prestige of particular publishers could easily be shifted towards a free online platform as long as the same people continue to review the publications.

They are extracting rents from research far beyond the value they add to it.

Does ACM sell individual papers? If so, what's the price?

It's about the same as everyone else: about 15 USD per paper, e.g.


They also have a somewhat restrictive licensing on their software:


This part is kind of weird, as they have a noncommercial clause on their copyrighted algorithms, which makes them non-free.

On a slightly different note, I was somewhat saddened by how the International Lisp Conference talks were recorded but unavailable because the ACM will not allow their publication.

How the hell can that be "non-profit"? Where is the money going? Free conferences in Dubai?

The ACM publishes annual reports with a high level breakdown of their finances. The most recent one is here:


The finances are on pages 12 and 13. The short version: they're spending money on all the things you'd expect, but they seem to be slightly cash flow positive and have accrued around $100m in net assets.

What I'm wondering about is the economic justification for charging $15 per article copy. Other than "we can", I mean.

The usual "economic justification" for such things is that people are paying it.

In reality, very few people pay the individual article prices. The real market is in site licensing, as it were. The individual article prices are probably set to support the sales story on these institutional access licenses, not the other way around.

I think you can also get access to the ACM library via membership. When I was in college I paid for a student membership; I think it was $25/year or something generally reasonable. Probably a better value proposition than my actual CS degree, not that it's a great point for comparison.

That's true. Many of these sorts of memberships are at least a few hundred dollars when you are no longer a student, so they add up pretty quickly (but maybe have tax credits).

Paying $25 annually for membership would be entirely fair. That's far less than site license for even one journal.

But what are the criteria for student memberships?

Most definitely, here's a list of journals: http://dl.acm.org/pubs.cfm?CFID=591346815&CFTOKEN=85546644

And here's an article that, for instance, costs $15 to download: http://dl.acm.org/citation.cfm?id=954342&CFID=591346815&CFTO...

The ACM is paywalled for the majority of the publications it manages.

I found the ACM paywall to be one of the obstacles that limit the new wave of code educators' access to older results in teaching programming: they are not in the academic sector and lack both knowledge of and access to this enormous body of papers.

So, in this case, the paywall keeps a new generation naive, as they independently reinvent research that was conducted decades earlier.

I agree with you that knowledge should be accessible to all.

But, do realize that there most certainly is value being contributed by journals who are making you pay. For example, the editorial work, the typesetting, the selection, the whole system, etc. So they do need a paycheck from somewhere.

So, I think this is actually like the common software/music/movies piracy situation... but a lot better!

My (simplified) feelings about piracy are: do it if you're dirt-poor, don't do it if you can easily afford it. By this account, Kanye was a d-bag for pirating.. whatever software he pirated, because he, of all people, should be able to afford it. A starving artist with lots of debts, in my opinion, can't be accused of doing a great moral wrong when he's stealing a thing which has an effective zero marginal cost (arguably). The biggest problem here is that... there do exist people like Kanye who really should pay for things but end up not doing so.

What's different in this scenario is that... I am almost 100% certain that all institutions that should be paying... will be paying! I mean, can you imagine some Harvard or MIT lab skipping payment on the journals? No, that's just not going to happen. The Harvards and MITs of the world will keep paying... the rest -- the public, or schools in suffering areas and nations, who can't afford it as easily, will be able to get it using scihub. It works out wonderfully in this way.

Journals do minimal typesetting, and practically no editorial work. Paper reviews are done by academics, for free. Editors are mostly academics, not depending on a paycheck from publishers. There are costs, like marketing, but none that justify $35 per article to the general public.

That's either a bad lie, or you don't understand how this works. As a small example: Science, Nature, and various other journals produce high quality videos (like this: https://www.youtube.com/channel/UCv0aU2eKry3kdSTnFa8QAWA ). And they have many in-house designers and editors who do important work like identifying from a sea of work what is important and what is not, etc.

It's neither, I think it's just that different fields operate differently. In CS most papers are published in conferences. Authors do their own typesetting, editing, figure making, etc. Peer review is organized on a volunteer basis by the program chair, who is (as far as I know) not usually compensated for their efforts.

This means we don't get many fancy videos unless we or our university's PR people make them, but it still seems to work fine for getting science done.

Here's the breakdown of the current job postings at Elsevier (not to say that this reflects accurately how the $35 / article is divided up):

* 112 IT

* 10 customer service

* 3 data analysis

* 28 editorial

* 3 accounting

* 4 general MGMT

* 3 HR

* 35 marketing

* 18 manufacturing

* 22 product MGMT

* 3 program MGMT

I would really like to see an official breakdown of the cost per article and especially the processing cost paid by authors.

edit: formatting.

Everything 'should' be free. At least, that which is not scarce.

The correct question to ask is 'can' all research papers be free - does the world continue to spin, will research still happen, will we still progress, if they are free?

The only reason we even have this debate to begin with is because the producers of this information require scarce/controlled resources in order to survive.

This is so not true. Closed-access journals do not pay authors. There's no advance. There are no royalties. At best, they don't charge authors for publishing their work. They don't pay reviewers, and typically not even editors. Maybe they do pay for editing. But that is a minuscule cost, relative to $30 per copy.

What happened here is that jerks hijacked the academic publishing industry. They turned a system that was largely pro bono publico into an intellectual pyramid scam. Academia has been slow to respond, mired in the web of prestigious citation. But maybe this is the end game.

Producers are funded by grants or private research funding. They get none of the money that is paid for access to articles, which is why they publish preprints for free.

It's not exactly post-scarcity, it's just zero marginal cost. It's true that, given the product has already been created and has zero marginal cost, it "should" be free. But for setting the expectations that producers of future products have about what compensation they'll receive we can't simply say it should be free.

The marginal units absolutely can be free, just pay for the research itself. Fixed cost for fixed work.

That is how private research is funded, say at Microsoft Research. You agree on "deliverables" and get your money for each stage of your work. Deliverables include tech reports, source code, prototypes and things like that. You get nothing after you stop working on the project. For producers, i.e. researchers, this scheme is no worse than grants and getting money for published articles, citations, etc.; they don't get money for years after the publication anyway.

Posting in a separate comment here as I was mainly aiming to provoke debate with the original one.

This generally chimes with my feeling. It seems to me that the users of research can derive vastly different value from it and so the flat price model doesn't really work.

Consider some new research on light emitting diodes and the value Samsung might get from that vs. me reading it out of curiosity.

For that reason, to me it makes sense to treat academic research as infrastructure and have free access to all funded via taxation.

I think Elbakyan should do everything to make sci-hub easily replaceable. Once it's hosted on multiple places it would be much harder to shut down.

Maybe completely free research papers are not the future but there should be a Spotify for research papers that is affordable for everyone. I hope that Elbakyan will reach her goal and ultimately change the whole industry.

Taxpayer funded research must be free to read.

Also, research that has been at least partially tax-funded and results in a publication must not be usable as a necessary ingredient for a commercial patent.

That is, a patent can include this type of research, but it cannot be a 'necessity' for the patent to be viable. Or, if the particular research is necessary for a given patent to be viable, the patent must grant no-fees, no-commercial-strings-attached use.

This allows a corporation to establish patents as a means to protect itself, while allowing the tax-funded research to be used by others without commercial strings attached.

Something I'd like to see here: results published in research papers aggregated and released as open data.

There must be a lot of interesting meta-analyses that aren't getting done because the necessary data is locked away behind paywalls, and usually not in an easily machine readable format into the bargain.

> “The real people to blame are the leaders of the scientific community — Nobel scientists, heads of institutions, the presidents of universities — who are in a position to change things but have never faced up to this problem in part because they are beneficiaries of the system,” said Dr. Eisen. “University presidents love to tout how important their scientists are because they publish in these journals.”

For me, this is the crux of the problem. People who are in a position to change things should push for it.

There must be thousands of people who could use free access to research papers: PhDs and Masters now in industry trying to apply the state of the art, engineers who have worked their way into a subject, concerned citizens who want to read the source material.

I am a PhD who'd love to be working in industry, but I'm shit scared that once I leave the gates of the university I'll simply lose touch with the state of the art because the papers will no longer be accessible.

Many university library systems have programs where alumni (and sometimes outsiders) can purchase access to their resources for a yearly fee. If that's unavailable to you or cost prohibitive, most academic libraries are open to the public. When I worked at an engineering library for a large university, a good part of my day was spent helping working professionals access these kinds of resources.

Depends on your field, but my local library has the latest Cell, Science and Nature magazines for perusal, and anything older than 1 year is usually released gratis. If you need more than this to stay up to the state-of-the-art, chances are your future "industry" job will also have access to this, or it will become more of a hobby.

Just my opinion as a PhD, working in industry, who does not read nearly as much as he used to.

Yes. They should.

It is in the best interests of humanity to make the knowledge obtained through research available to anyone looking for that knowledge. There is a clear consensus among scientists that the current publishing model is at best inexpedient and at worst hostile to that end.

Most people are asking what good the current publishing model provides, but I think to answer that question we need to ask: "compared to what?" It seems clear to me that the current model is better than having no publishing mechanism at all, but I doubt that anyone seriously thinks that the "none" model is the only alternative.

I think that if we sat down today and thought up a new publishing model from scratch, we would be able to outdo the status quo on just about every "good" people have mentioned here, as well as provide features that the current model is incapable of. I think it is highly likely that we could make a system that ran on donated resources alone.

Some things we might want/have in a "from scratch" model:

1. Direct access to data-as-in-a-database instead of data-as-a-graph-in-a-PDF

2. Blockchain-based reputation system for scientists

3. P2P storage and sharing of scientific data

4. Tiers of scientific information, e.g. an informal forum-of-science, semi-formal wiki-of-science, and formal publications

5. Automated peer review process

6. A better and more consistent authoring tool for scientists

The main problem with tax-funded research and grants is that money is given in return for citations in journals with high "impact factor". As a result, publishers of those journals are indirectly supported by the state. Instead, government or funding organizations should review the results of the work for themselves, but they are unable to do it, because they usually don't understand a thing about the research subject.

That's the problem: organizations don't want to do the work, so they outsource it. But the journal market is not free; it's dominated by incumbents who earned their position centuries ago. As the experience with open access journals shows so far, it's near impossible to get scientists to volunteer their free labor to a new journal. Regulatory action (like requiring open access for all govt-funded science) should be taken here.

But the publishers outsource the same work to referees. Why not cut out the middle man and have the funding agencies orchestrate the peer-review? That's what they do for grants anyway.

Can someone explain to me why the researchers themselves don't publish their work for free? The article says they are not paid for the articles so I don't see why they couldn't do that.

For the $1000-3000 that a single OA application costs I can pay for a new computer, or send one or two students to conferences, or pay for part of a stipend, finance a temporary web design guy, etc. Now I try to publish 10 papers per year - that's $10-30,000 out of the window for no "return" to our group. For $30,000 I can send everyone to several conferences!
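A quick sketch of that annual figure in Python, using only the numbers from the comment above (10 papers per year, $1000 to $3000 per OA fee). The variable names are invented for illustration, and real fees vary by journal.

```python
# Figures quoted in the comment above; actual OA fees vary by journal.
papers_per_year = 10
fee_low, fee_high = 1_000, 3_000  # per-paper OA fee range, USD

# Annual spend at the low and high end of the fee range.
annual_low = papers_per_year * fee_low
annual_high = papers_per_year * fee_high

print(f"${annual_low:,} to ${annual_high:,} per year")
```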

Conferences are more important to a scientific career than an OA paper. The people who hire you will in all likelihood have access to your paper even if it's behind a paywall.

> Conferences are more important to a scientific career than an OA paper

Depends on your field of research, maybe? Paper publications make or break early biomedical research careers.

Further, the grants that these funds are paid out of often have travel funds built-in, so conferences are still available.

>Paper publications make or break early biomedical research careers.

Yes, but that has nothing to do with the OA-status of those publications - if it's a high-impact journal, then it makes the career, if it's not, then not.

%s/application/publication, can't edit anymore

For some it is because the publishers take the copyright. For others it is just because the authors don't really care, don't know about it, or don't really think about it.

I am a professor and I was at a meeting a couple weeks ago that talked about open access publishing. There were people from many departments and there was one professor there who was asking questions that made it clear he didn't know anything about online publishing (he was asking things like "where are the papers stored?" and said something about how "all this online stuff is like Big Brother").

Because researchers depend on funding which depends on publications in prestigious journals which charge fees for access.

Every researcher I know would publish for free if it wouldn't ruin their career to do otherwise. They want their results disseminated as widely as possible.

> Every researcher I know would publish for free if it wouldn't ruin their career to do otherwise.

Why can't the national/international associations in each field set up simple publishing operations (or pay a U. press to do it), with the same people doing peer review, and use their power to designate the prestige of each journal: 'this will be our tier 1 journal, this one a tier 2, etc.'?

Whatever it costs, it would cost far less than the existing setup. I'd think the U. libraries would be happy to share some of their enormous savings to fund the operation.

Most journals withhold the copyright, and the scientists happily relinquish it. They can distribute preprint copies of their article (very few do in life sciences) but not the article itself.

I would not say that most scientists happily relinquish it, but they trade off the copyright for the prestige gained from publishing in the journal.

The real problem is the use of journals as an arbiter of research quality. If it didn’t matter to your career if you published in Nature or PLOS One then everyone would publish in PLOS One. If you are more likely to get a job or grant by publishing in Nature then people will do almost anything to get their work into Nature. We are trying to solve the wrong problem by worrying about journals controlling copyright.

Sorry, I find it hard to imagine a scientist feeling sad that he will publish a Nature paper. You are spot on that we need a new arbiter. It's 2016 and we rely on this ancient system; it's no wonder that the Elseviers of the world take advantage of it.

But the control of copyright is a real problem, and I think we underestimate how much it hampers science (esp. considering the wonderful things one could do with machine analysis of the texts).

I have not met a scientist (including myself) who was happy about giving up the copyright, but it is viewed as the cost of publishing in a “prestige” journal. Nearly all scientists would prefer to retain the copyright if they could and make their papers widely available to everyone interested, provided it did not exclude publishing in a career-advancing journal.

Publishing used to cost money when it required physical printing/distribution/storage of journals. Now all of this is basically free, but they still charge. Most theoretical physicists, for example, only care about "publishing" on the arXiv (all free, open source). Traditional publishing is ridiculous.

Publishing cost today is smaller -- yet publishers actually charge more.

Libraries used to get a physical copy of the papers, which would grant lifetime access to the research.

Today, they pay for subscription, and as soon as they stop paying, they lose access to everything.

I hope this publicity doesn't lead to a swift shutdown of scihub. She provides us with a great service that helps many researchers work faster. We should also commend her for stirring the most lively debate about an anachronistic and dumb publishing system.

This isn't the right question. The question is, "Who should be profiting from research papers?" The Journal performs quality control for the sake of consistency and prestige, but the papers and their reviews are put together by researchers, commonly at great cost for marginal personal gain. The article's hero doesn't really care. She needs to read papers, and needs other people to be able to read them, so she built sci-hub (demo: https://sci-hub.io/10.1038/nature16990).

WRT > This isn't the right question. The question is, "Who should be profiting from research papers?"

I am not sure that the way you put it is right either.

Because "who should be profiting from research papers?" is too generic of a question, and does not appear to necessarily supersede the question 'should tax-funded publication be readable for free?'

If I may rephrase your question to be: "Quality control of a research paper must necessarily be funded (either by money or a form of barter). Therefore, question a) who should fund it, question b) who should receive funding to do the quality control"

Then, obviously, this is an important question. And I do not believe it has been clearly answered either in policies or on this forum.

My answer to ( a ) would be -- the same entity that funds the research (therefore in this case the tax payers)

My answer to ( b ) would be -- a licensed or otherwise professionally certified group, independently selected (that is not selected by the researcher that authors the publication).
