I want to go off on a tangent here, though. Now that open access (whether arXiv or SciHub style) is becoming the norm, I wonder what can be done to improve the format of scientific papers? Like e.g. making them more like this:
instead of regular PDFs?
Personally I'm a big fan of blog posts: they're written for a broader audience in understandable language. I don't have any trouble reading English, but papers are almost universally written in a way that is more complicated than necessary. And in blog posts, if someone made a "novel new algorithm", at least you can be sure there's code, raw test data, screenshots, or whatever else is needed to DIY. And someone on the web is talking about it, perhaps even with comments below the post itself.
I edited the link to the more "lightweight" version (i.e. just the page, without loading his entire homepage as well, which is kind of heavy), so please consider trying again.
I agree with you RE blog posts, and the point of me linking to Bret Victor's work is to give an example of a form that's better-suited to the media we operate with today. Papers still live in the paper age (pun not intended).
Since Bret has liberated the scientific article from paper, we may as well take advantage of the more flexible layout possible with digital media. Interactive diagrams are so nice for learning that I think it's worth pursuing Victor's idea further. Fred Akalin's blog has a couple great examples! https://www.akalin.com/quintic-unsolvability
The purpose of a paper is to be rigorous and exact and to move the field of study forward. That tends to make papers dense and full of the field's jargon, which is shorthand for massive amounts of information.
The problem isn't the jargon itself, it's the fact that most labs have built up decades of institutional knowledge that they hoard for fear of their research being "scooped." I'm just thankful I don't work in downstream medical research anymore, where the number of replicable papers goes from one in two to one in ten, at best.
Your comment about reagents from one manufacturer failing while another succeeded brings back fond memories of trying to make our own botulinum neurotoxin to study neurons, and the rest about hoarding institutional knowledge reminds me why I tried to get out of that. I was one of those ambitious undergrads, but I permitted myself to wilt under the insanity of a system in which I KNEW that a partnered laboratory knew exactly what to do, but they were forbidden to share the knowledge because of politics. By the time I had cut through the wasteful uselessness, I discovered that the only sane person there, the one who had been willing to share knowledge, had actually DIED.
And so I merrily skipped over to the bioinformatics department, which has issues as well but in which I was able to manage them fairly effectively. :D
Unfortunately it's the same for software and algorithms (at least in bioinformatics, which is my field). Everything is treated like closely guarded secrets, at least in the smaller environments.
Thankfully the tendency is slowly being reversed and some things are even developed out in the open; the "problem" with those is that, being in the open, it's harder to get publications out of them.
But imagine if the incentives of scholars somehow shift so that they look less to funding committees and more to the audiences of blogs and magazine articles. I think such a shift would do more good than harm.
Scholars would still write serious papers to establish their reputation amongst each other, but insofar as they rush to publish, I'd rather they pump out accessible blog-style articles than the kind of gibberish that gets produced today to inflate citation metrics.
you mean something like
Then yes, I think these are a very valuable and increasingly important part of the scientific process. I think you might benefit from labelling the blog as a 'math blog', which is quite distinct from the original link.
This is a function of two things. The first is that venues often have strict page limits for their submissions. Authors typically have a lot to say and not a lot of space to say it, so they pick language that is precise, dense, and typically colorless. The second is that every domain has a lot of style and nomenclature that the layman won't know. It can be learned, but you must remember scientists write for each other, not for the layman.
The second: I was talking about reading papers from my own field. That I don't understand a Biology paper (which indeed I don't -- I've tried) is perfectly understandable, but infosec papers shouldn't (and don't) contain any lingo unknown to me. It's just a very convoluted way of writing.
The whole thing is 283 kB and half of that is due to three small images that are already compressed as much as is possible (but that are actually part of the comment about the paper formatting, not the paper itself).
That seems very reasonable for a scientific paper which would have been about the same size as a PDF.
which loads the article... plus the entire homepage of Bret Victor, which is kind of heavy.
The primary shortcoming of current scientific papers is that they're too short. They're written to physically fit into a paper journal, but their brevity massively impairs efforts to validate and replicate research. We're taught in high school science class that a paper should include all the information needed to replicate a study, but in practice that simply isn't true. Our methods of publication haven't kept pace with the complexity of modern research and analysis methods. Failure to replicate and research misconduct are eroding the foundations of science.
In 2017, the appendices of a paper should include everything - the full dataset, complete lab notes, full source code, the works.
(Disclaimer: Posting in a personal capacity.)
If you could get the Windows source code with some paid subscription program, so that it's not pirated, it still wouldn't be open source. It would be more like "open access".
Open source means we can modify it and share our modified version, and even charge for it.
Most users of scientific papers aren't looking to extend those papers and redistribute modified copies; they just read them privately and make citations (which nobody can tell whether they were from a licensed or pirated copy).
And this is exactly why Richard Stallman doesn't like the term.
> The official definition of “open source software” (which is published by the Open Source Initiative and is too long to include here) was derived indirectly from our criteria for free software. [...]
> However, the obvious meaning for the expression “open source software”—and the one most people seem to think it means—is “You can look at the source code.” That criterion is much weaker than the free software definition, much weaker also than the official definition of open source. It includes many programs that are neither free nor open source.
> Since the obvious meaning for “open source” is not the meaning that its advocates intend, the result is that most people misunderstand the term.
That is, Stallman is criticizing "open source" for being, among other things, easy to misunderstand as not requiring free software licensing — even though it does, in fact, require free software licensing.
It's saying that one of the problems with "open source" is that you can make the same mistake you just made (thinking that being able to see the source code makes it "open source").
Richard Stallman's primary issue with "open source" is that its advocates intentionally avoid talking about issues of software freedom, and have consistently muddied the waters by co-opting free software as "open source".
Looks like they've gone in the opposite direction. Access to the torrent repository is no longer available. The only way to download articles now is by clicking through ads.
If the torrent repository were gone permanently, it would be extremely sad, since it's (AFAIK) the only way of having automated and/or bulk download of papers, for example for data mining — sci-hub.io is understandably not really suited for this.
No I tried to pose the same question recently on the forum but it's locked down for new users. At best it has something to do with recent litigation. At worst they're being hypocritical and seeking that sweet ad dollar. I agree that the current situation is only marginally better than the status quo.
See the Distill initiative, for one direction:
HTML (in some reasonable form) is my favourite format for reading text.
It's plain text so you can throw parsers at it to extract text, data, and formulae. It can be automatically converted into HTML for online display, PDF for downloading, etc. It's not pretty, of course, but we already have plenty of experience parsing and extracting useful information from another widely used format that mixes semantics and presentation all over the place, so why not?
PDFs are a danged sight easier to organize for research as well, IMO.
 because it's Bret Victor, of course
(1) MOOCs - student does not need to pay exorbitant tuition fees
(2) Sci-Hub - student does not need access to a university library
(3) ? universal accreditation - this piece is missing
idea for start-up: aggregate MOOCs, package instructions on how to retrieve information online, connect instructors and tutors to students, arrange accreditation, offer BA, MA, PhD programs, we have start-ups in every industry but higher level education, could it be possible? …
This will turn out to be harder than it appears. The accreditation process is complex, expensive, and bureaucratic - possibly by design.
PhD accreditation would be particularly complicated. PhDs are partly a form of academic hazing and lineage development and very much not just about the research/content.
Unfortunately education is more of a political problem than an information redistribution problem.
I hope scihub has a well thought out exit plan, in case of seizure.
~ 40 M files, at 1 MB each (my own estimate). I hope they have backups / insurance files with all the papers collected this far. Given the data volume it's not trivial but surely not impossible.
I should say I'm biased, as I run ipfsstore.it, but hosting all 26 TB there would not be expensive.
I like ipfs, but in this situation specifically it's got problems. It's not private and it's not secret. It means that, by hosting a mirror of the data you're announcing: I'm publicly offering these documents, likely illegally as far as copyright is involved.
(And without a country/region limitation, which may be another issue)
Neither is BitTorrent.
Quite a big fee :p
Although I don't know much about creating 'DApps', or how Ethereum or IPFS works, or what SciHub contains beyond a primitive grasp, just sayin :p
 I watched a video of a talk "Distributed Apps with IPFS (Juan Benet) - Full Stack Fest 2016" https://www.youtube.com/watch?v=jONZtXMu03w
Another option is 13 2 TB drives at $70 a pop. You may run into some network congestion sharing them, however :)
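For what it's worth, the back-of-envelope math checks out; a quick sketch (assuming a ~26 TB corpus, $70 drives, and no redundancy):

```python
# Back-of-envelope: mirroring a ~26 TB corpus on $70 2 TB drives,
# with no redundancy (all figures assumed, not measured).
corpus_tb = 26
drive_tb = 2
price_per_drive = 70

drives_needed = -(-corpus_tb // drive_tb)  # ceiling division
total_cost = drives_needed * price_per_drive
print(drives_needed, total_cost)  # 13 drives, $910
```

Add a second copy of everything for redundancy and you're still under $2k in hardware.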
26TB is a massive amount of data.
Having downloaded (very) large dumps in the past with torrents, I've always found it a bit annoying, especially when attempting to access a subset; torrent client interfaces are less than ideal here. Not that someone couldn't build a good document browser on top of torrents, à la those movie torrent streaming apps.
I'm not familiar with how IPFS works in detail but being able to access it via a file system sounds much better UX wise. Hopefully it can support that scale.
But to your point the primary end-goal of "censorship-free" persistent access is largely the same AFAIK.
Most of the trackers seem to be down, so it's just getting data from tracker2.wasabii.com, and even that is pretty slow, so it'll take a while to fetch for all 1693 torrents.
I wrote a script that pings the tracker and keeps regenerating a static HTML file, so it'll keep updating itself, though not very fast.
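For anyone curious, the core of such a script is tiny. A minimal sketch of the HTML-regeneration half (the tracker-scrape half is stubbed out here, since the wire-protocol details depend on the tracker; all names are illustrative):

```python
import html

def render_status_page(stats):
    """Render {torrent_name: (seeders, leechers)} as a static HTML table."""
    rows = "\n".join(
        f"<tr><td>{html.escape(name)}</td><td>{seeders}</td><td>{leechers}</td></tr>"
        for name, (seeders, leechers) in sorted(stats.items())
    )
    return (
        "<html><body><table>\n"
        "<tr><th>Torrent</th><th>Seeders</th><th>Leechers</th></tr>\n"
        + rows +
        "\n</table></body></html>"
    )

# In the real script this dict would come from periodic "scrape"
# requests to the tracker; here it's stubbed with fake numbers.
page = render_status_page({"example_torrent": (4, 17)})
```

Run it from cron every few minutes, write the output to a file your web server serves, and you have a self-updating status page with no server-side code.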
True, current publishers don't compensate peer reviewers, limit curation to the set of people who will pay their submission fee, force editing and typesetting back on the authors, and charge egregious fees disproportionate to the cost of distribution. But the concept of the business isn't invalid.
This isn't throwing the baby out with the bath water, this is people trying to stop this nasty ass bath water from being thrown out because there used to be a baby in it.
Prices are a different thing, but I'm definitely spending my day here adding value.
I suppose the general notion is that the only problem publishers solve is distribution, which is kind of non-value-adding nowadays, while overlooking the editorial process, which requires actual work.
Sure, I think you guys add value. My critique was aimed at the cost of old papers and the ownership journals take over published work. You guys need to get paid, but the world also needs its open access to the scientific corpus.
The journals with very little value-adding editorial input do have a harder time legitimizing their slice of the pie, though.
But in theory it's possible, e.g. Facebook's Open Compute Project, it just needs the right kind of funding model.
It's just a matter of finding a way to do it that doesn't jeopardize the entire business.
> Peer review
Organized and performed by volunteers.
> curating for quality
> editing, typesetting
Done by authors (LaTeX isn't perfect, but it gets you "good enough" typesetting. And the last time a journal edited a paper of mine they mangled it because the person editing wasn't familiar with technical topics & jargon).
> creating physical copies
No one needs these any more; they just print the PDF if necessary.
Yes, they can. But they don't.
> curating for quality, editing, typesetting,
They don't do this. You are expected to do it all. Your papers are sent back for revisions and/or rejected by editors and peer reviewers, who, again, work for free.
> and creating physical copies could be expensive.
Physical copies cost additional money to acquire anyway.
I don't think this is a good solution, because in reality reviewing these papers is complicated work for people who have spent years, often decades, in their field. They do deserve to be paid.
I do think publishers provide tangible, real value, but often the value they provide is significantly outweighed by the problems that arise from the way they run their businesses.
I absolutely agree. But this is not how it works and suggestions to pay reviewers is usually met with haughty academic nonsense.
Publishers in academia (and in the normal book industry, to boot) add near zero value while extracting huge fees.
That's like saying that dictatorships can be a superior form of government, because they provide centralized planning and decision making at a much lower overall cost. True, current dictatorships don't do so... but the concept isn't invalid.
Lower impact journals are basically glorified FTP front ends. They don't create content, validate that content, editorialize that content, or even select that content. They just host and charge an enormous premium for it.
- Strip download watermarks ("Downloaded by Wisconsin State University xxx.xxx.xxx.xxx on January 12, 2017 13:45:12"). Many times, journals published by the same publisher do the watermarking similarly, so you need to write just one pdftk (or other PDF manipulation software) script for every journal under their banner. At worst, it's a one-script-per-journal effort.
- PDF optimization. A lot of publishers produce un-optimized PDFs that could be 25% (or more) smaller with a completely lossless space optimization pass. This should save storage/network costs for access to individual papers and, more importantly, reduce the burden for bulk mirrors.
(I'd contribute the scripted passes myself if I had contacts within sci-hub.)
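To illustrate the kind of scripted pass I mean, here is a minimal sketch. It assumes the PDF's content streams have already been expanded with pdftk's uncompress mode, so the watermark appears as a literal text-drawing line; the marker string and the sample input are placeholders, not any publisher's actual format:

```python
def strip_watermark_lines(pdf_bytes, marker=b"Downloaded by"):
    """Drop content-stream lines containing the watermark marker.

    Assumes the PDF was first expanded to plain-text streams, e.g.
        pdftk in.pdf output plain.pdf uncompress
    so the watermark shows up as a literal operator line like
        (Downloaded by ...) Tj
    Re-compress afterwards:
        pdftk stripped.pdf output out.pdf compress
    """
    return b"\n".join(
        line for line in pdf_bytes.split(b"\n") if marker not in line
    )

# Toy demonstration on a fake content-stream fragment:
cleaned = strip_watermark_lines(
    b"(Hello) Tj\n(Downloaded by Example University) Tj\n(World) Tj"
)
```

Real publisher PDFs will need per-publisher tweaks (split text runs, watermarks drawn as XObjects, etc.), but the shape of the pass is the same.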
If you run
java tool.pdf.Compress file.pdf
it will generate file-o.pdf and often trim quite a bit of weight.
That's probably what I would suggest sci-hub use too because it's easy to use and automatic.
EDIT: this looks quite promising too: https://github.com/pts/pdfsizeopt
But it has bugs with handling some images... converts them to black boxes. I'd only use this one if you manually verify the result first (so not a fully scripted process). (Or this could be due to me not installing all the bleeding edge dependencies from source as instructed.)
You could tweet at them with a link to a GitHub repo of your scripts; they are active on Twitter.
Thanks for the tip about tweeting. I need to set up another GitHub account disconnected from my professional identity, but that shouldn't be too hard.
EDIT: and maybe a separate Twitter account. And maybe both accounts used only from public wifi, too. I wouldn't put much past the publishers when it comes to getting revenge on people who help sci-hub. Unlike Elbakyan, I am not out of their reach.
Around 60 German universities have cancelled their contracts with Elsevier: http://www.the-scientist.com/?articles.view/articleNo/49906/...
Obviously, the stated reason for cancellation would be that costs are too high, not that everyone downloads papers from sci-hub.
They have since introduced open access options, and universities have renewed their subscriptions.
If we aren't paying researchers for peer review, the journal pay wall is just a net loss in productivity. It's just one more useless middleman.
You've misunderstood something there. With the "open access" model the costs are merely shifted around. In the traditional model the library would pay for a subscription; now it's the researcher paying page charges for publication. In the past, with society publishers, the society had a reputation to lose, but Hindawi has no such restraints on it.
20 years ago, the process involved the creation of a physical artifact -- bound paper, handed out to participants, mailed to subscribers, etc. This is a fairly non-trivial undertaking, certainly costing quite a few thousand dollars.
As of a few years ago, the last of the major conferences in my area stopped creating printed proceedings. Along with that goes away the need (desire?) to ensure precise formatting rules, etc. -- the job of the typesetter is gone. We just submit PDFs, and everyone does their own.
The other "costs" of a journal or conference were already borne by the research community -- most of our conferences and journals are all-volunteer. It's a decentralized cost spread across the salaries of everyone in the community.
In our model, at least, the role of the publisher is greatly diminished. They provide some hosting, some indexing... and not much else.
Not all fields are like that. Science and Nature, for example, have paid editors. Fine. Happy to pay for that.
The shift in costs happened a while ago - the costs are probably in the range of $10-$100 per paper now, compared to quite a bit more before.
Not necessarily. In my own field (a branch of linguistics), many of our journals are open access, but they are still free to publish in. Some of the publishing costs are covered by the fact that these journals are put out by learned societies that are sitting on large endowments or receive state funding, and even for journals that are open access online, libraries often still want to pay for hard copies.
Hindawi is half that, but they will publish anything.
And more lawyers! If the journals did sue the universities it would be a one time legal cost to defeat them (thereby driving them out of business) versus the ongoing and growing cost of journal subscriptions.
I totally agree with the position that for-profit journals are bad for the research community, but is there any doubt that legally they have the right to charge universities for the content that they own the rights to?
How so? The university will just say that they terminate the subscription to save costs (which is true), and they are not responsible for legal transgressions of each researcher. They just can't tell them "get your papers from sci-hub". Even the researchers themselves are only downloading, not distributing, which in many legislations is looked at quite favorably with comparatively small punishments.
So no one will drop their top-10 subscription, while expensive tail journals are probably doomed (tail journals are either very cheap and run by schools, or VERY expensive and managed by Elsevier).
Elbakyan's work has inspired me to only publish my work in journals that embrace open access and open data. I'll be damned if I am a slave to impact factor and other haughty metrics.
While I do not use Sci-Hub, I think that users who use it are doing so morally and ethically (in the sense of conscientious objection). I hope they are also willing to pay penalties if they are found to be violating copyright (this is generally considered a requirement for intentional protest).
The other day in a HN discussion, someone cited a paper in response to my comment. I was able, in 30 seconds, to get to the full text of that paper, which allowed me to reevaluate my opinion in the context of what I read.
This is how science can, and should be useful for individuals. And beyond arXiv and SciHub, it generally isn't.
Granted, the current journal publishers spend too much money on overhead, and I certainly don't support the for-profit ones like Elsevier that rake in huge profits. But I also don't see much allowance made for the fact that publishing research in a peer-reviewed format involves labor which should continue to be compensated in some way (incidentally, it's not entirely true that researchers themselves aren't compensated - publishing offers an indirect benefit but a real one in the sense that publications are directly linked to salary increases down the line).
Peer review is done by others in the field (peers), typically for free. Online publication is "where do I stick this PDF", and does not require per-paper work (or if it does that can be done by the author). That leaves us with copyediting, and in many cases, copyediting issues are caught by peer review rather than any paid editor.
I also think that you underestimate the importance of copy-editors. It's not the job of a peer reviewer to make sure that the author uses an apostrophe properly, etc. There needs to be a specialist who is dedicated to catching those errors, particularly since, as you note, peer-reviewers aren't directly compensated so it's unfair to burden them with a whole other set of responsibilities.
Many Elsevier and Springer journals no longer perform copy-editing. Authors are expected to have their paper checked by a native English speaker and copy-edited at their own expense, and then provide the journal with a camera-ready PDF. In my own field, it is the open-access journals published by non-profits that actually have the best language quality and typesetting.
This is a problem that goes beyond journals into for-profit scholarly publishing more generally. In my field, Brill is an infamous publisher for this: it demands camera-ready PDFs for most of the monographs it puts out. So, your library ends up having to spend 400€ on a book where about all the publisher contributed – besides unpaid peer review – is printing, binding, and mailing it out.
I think "needs" might be too strong here. We (the field of CS) get by without it just fine -- the standard practice is to include copy-editing "nits" at the end of one's review. A shepherd, assigned during the final phase, does a final pass on the paper before approving it.
Yup. Typos slip through. There are papers with poor English.
Oh well. For the most part, it works out pretty well.
(And as a reviewer, I have little objection to also noting writing fixes while I'm reading your paper. I'm going to spend anywhere from 30 minutes to 5 hours reading the thing -- the writing fixes are a small additional cost. If you've done a decent job on the writing in the first place. If it's totally botched, I'll reject your paper and tell you to fix it before submitting it again. :-)
Now, would I prefer that the authors of submitted papers had to pay $50 for someone to do a copy-editing pass before I reviewed it? Heck yes. But perhaps we'll get DNNs to fix this for us one of these days. :)
Likewise, it's normal for editors in my field to make meta-level suggestions about writing style. I'd wager that history journals place a greater emphasis on prose style than CS journals do, since making an historical argument often depends on telling a compelling narrative. Hence a publishing model that works for a CS journal might not work for humanities journals and vice versa.
Maybe I gave the wrong impression by mentioning apostrophes, however. Professional copy editing is, in my view, super important in terms of differentiating a good journal from a bad or mediocre journal. As is the vetting procedure of finding appropriate peer reviewers and coordination between them. Whether or not you think of that as "real" work compared to producing actual research, it is still work that should be compensated in some way, in my view.
An analogy might be the professor teaching a class vs the maintenance guys who make sure the projectors are working and the lights turn on. One might be more important than the other, but I don't think either should be working for free.
I'm confused that you aren't aware of this, but peer review and copy-editing are by and large done by other academics, not journal publishers, and, at most, for what amounts to an honorarium.
And online publication is not a job, per se. We automated it a long time ago. Teenagers have Tumblr blogs now. There is no reason why the practical aspects of online journal curation could not be handled by a very small team of people working in the IT departments of a few universities.
Journal article publication may contribute to academic careers, but that has nothing to do with the publishers' involvement: it has to do with the quality of the journal, which is driven by the academics in writing and reviewing for it.
Simply put, journal publishers are >entirely< overhead.
And although peer review relies on specialists volunteering their time, finding and vetting those specialists still requires work.
I'm not trying to say that the current publishing model is defensible, but I am pointing out that running a high quality peer reviewed journal still requires at least one or two dedicated workers. Whether you think they should be paid for their time or expected to volunteer is a different issue, but we shouldn't pretend that the whole apparatus is an illusion created by greedy publishers.
Also, no editor at any journal (I've published ~20 papers) made any edits to my papers.
If the editor sent some copy edits to the text, I'd just send it back to them unedited and ask them to publish it. In fact, I love pushing back against unreasonable editorial requests.
I'm having trouble expanding this into a sensible-sounding general moral principle. Does this hold regardless of how large the penalties are? (Relevant considering the often very high punitive damages for copyright infringement at least in the case of entertainment media.) Is it specific to some kinds of laws or government system or fully universal in the sense that somebody protesting the policies of a stereotypical dictatorship is also morally obliged to be willing to be shot/flogged/whatever the corresponding punishment under that system is?
I see people commenting that just because of this release, universities won't cancel their subscriptions to the journals. Well, that would be great - let them keep paying, while the content also gets out for free.
This is like the trend where you can pay what you want for stuff, or nothing. I wonder if that model would apply to scientific research - pay what you want for the paper, or nothing, but pay if you want to support that research... hopefully people would still pay.
Just thinking out loud... probably already been thought of or wouldn't work (or I'm just self-defeatist). :)
- outside the top 500 scientists across all fields combined, no one produces quality papers.
- the manuscripts received by EICs are garbage. EICs themselves rarely touch anything, or edit anything.
- the "peer review" process in anything but the most upper-tier publications is a joke.
- copy-editing and proofreading are non-existent.
About 15 years ago a lot of journals that are now owned by the major publishers were society publications. This means that societies owned the title, owned the copyright and owned the process. Societies could not find people capable and willing to run a journal for $5k or so a paper including distribution and printing. So the societies went to the likes of Springer, Informa, T&F, Elsevier etc. Those said "Ok, sure. $2k per paper and we produce and distribute it, or $5k per paper and we do everything". That was too much for societies. Instead the societies said "Hey, what if we sold you the rights to the journal? Would you in that case do all of it for free, and let our EIC suck his or her thumb while still being the man/woman on the title page?" "Of course" said the publishers.
And so now we are here. Production costs did not go down. Distribution costs did not go down. In fact, distribution costs increased, because before that the journals were printed by small print shops in the US, as the volume of most journals is microscopic (later printing went to China, in the same microscopic quantities). Now, however, journals also need to be distributed in all kinds of wacky XML formats at all stages of production, which needs to be coded, in most cases by hand.
So unless every Joe, Jack and Jill, the scientist, wants to go back into the publishing world nothing is going to change.
[Source: pillow talk]
Just to make sure I read this correctly: Are you saying that, when judging by paper quality, there are only 500 good scientists on Earth?
The equation today isn't the same - with open source / community effort, I suspect the cost/effort of publication today doesn't even come close.
I was just randomly looking at what it would cost (rough estimate) for 25TB of S3 storage - 2,000,000 put requests a month, 2,000,000 get requests, 25TB bandwidth OUT and 25TB bandwidth IN - and it's less than $3,500 per month.
So the cost of hosting the published item seems actually quite reasonable... and reducing the cost of the rest of the "publication" plumbing is a problem that technology also helps address.
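Spelling out that estimate (the per-unit prices below are my own assumptions based on 2017-era S3 list pricing, not exact figures; inbound transfer to S3 is free, so it adds nothing):

```python
# Rough monthly S3 bill for hosting ~25 TB with heavy traffic.
# Per-unit prices are assumed (2017-era us-east list prices).
GB_PER_TB = 1024
storage_gb = 25 * GB_PER_TB
egress_gb = 25 * GB_PER_TB

cost = (
    storage_gb * 0.023             # storage, $ per GB-month
    + egress_gb * 0.09             # outbound transfer, $ per GB
    + 2_000_000 / 1000 * 0.005     # PUT requests, $ per 1,000
    + 2_000_000 / 1000 * 0.0004    # GET requests, $ per 1,000
)
print(round(cost, 2))  # comfortably under the $3,500/month figure
```

Nearly all of the bill is bandwidth out, so a CDN or mirror network in front would cut it further.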
Late addition: The cost of distribution of a scientific journal in electronic format is minuscule - put it on a blog. Medium.com is free.
Another addition: I have seen a few of the author contracts. All of the ones that I saw said the author could publish his original manuscript on the web for free. The author just could not publish the result of the publisher's processing of the manuscript. Someone should ask the authors why they aren't publishing their original submissions.
"The annual budget for arXiv is approximately $826,000 for 2013 to 2017, funded jointly by Cornell University Library, the Simons Foundation (in both gift and challenge grant forms) and annual fee income from member institutions."
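Dividing that budget by arXiv's submission volume gives a useful yardstick. The submission count below is my own rough assumption (arXiv took in on the order of 100k papers a year around 2016), not a figure from the quote:

```python
# arXiv's budget divided by its yearly intake. The budget figure is
# from the quote above; the submission count (~105k/year) is my own
# rough assumption.
annual_budget = 826_000
submissions_per_year = 105_000

cost_per_paper = annual_budget / submissions_per_year
print(round(cost_per_paper, 2))  # single-digit dollars per paper
```

That is two to three orders of magnitude below typical article-processing charges, which is the whole point of the comparison.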
> Medium.com is free.
Until it's not, or until, as a commercial enterprise, Medium decides to completely pivot, or decides a controversial paper is too much bad publicity.
If I publish to Medium, will future researchers be able to find my article in 10 years? In 50?
Also I don't see how you've taken peer review into account here.
That said, Wikipedia mostly works. Between professional editors and the community, decent knowledge is categorized, edited/peer-reviewed and interlinked. Why the same cannot be done with academic papers is beyond me, other than to assume people simply don't want to change the status quo.
I also agree with your last point - so long as authors can contractually publish their own work, they should - and the Internet should make it as easy as humanly possible. I've published a paper to an academic journal before and had that clause, and then put it online on my own site as well (until I took my site down).
Random thing - I do think the standards of what constitutes knowledge matter, but boy do we hold them high. I think knowledge comes in many forms, and expecting them all to be polished perfection is perhaps too lofty a goal. Knowledge is always a work in progress. Discerning between fact and fiction is the same whether you have a polished paper or a garbage one; it's just that one is a lot easier to consume, and you have more inherent trust by virtue of the reputation associated with the publisher, author, citations, etc.
This is an exceptionally incorrect statement to make. The other points are fine, which only makes it more baffling you would make this point.
Sure, there's an ethics problem during reviews to push additional citations for people in your university or something but that's not too different from the current situation with grant applications, just more direct.
Combining the two thoughts, "suggested payment to cite" with a "pay what you want" model could be interesting too.
I suppose this would be a direct transfer of value from those who want to build knowledge on other people's work, to those who did the work. As long as it was not unduly expensive to do so, maybe - after all, the saying goes "paying my dues". I would think you wouldn't want to inhibit or slow down people from building on knowledge either (which is what things like Sci-Hub seem to be trying to avoid).
I wrote a few published papers some time ago, and I was told that without citations, a paper would be rejected, as academic research should always be based on enhancements or references to prior work. So it seems kind of built into the mechanism that citations are the requirement for building knowledge on top of prior work when creating such papers.
It would be interesting to see how a model like this could work. Since the value created by the researcher is essentially part of a web of value, it might be hard to determine who should get credit.
For example, we have 3 existing papers, A, B and C, and one new paper, D. Paper A cites paper B, which cites paper C - and all of them are additive to our knowledge base. Therefore, as the author of paper D, if we were to pay to cite paper A, should that also include some payment to the authors of paper B and C too?
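One way to make that question concrete is a toy model where each cited paper keeps part of the fee and passes the rest upstream with a decay factor. The citation graph, the `payouts` function, and the 50% decay rate below are all hypothetical illustrations, not a proposed real scheme.

```python
# Toy "pay to cite" model with upstream sharing. All names and numbers
# here are hypothetical illustrations.
citations = {"D": ["A"], "A": ["B"], "B": ["C"], "C": []}

def payouts(payer, fee, decay=0.5):
    """Split `fee` along the citation chain: each cited paper keeps
    (1 - decay) of what reaches it and passes the rest upstream,
    divided evenly among its own citations. Papers with no citations
    keep the full amount that reaches them."""
    shares = {}
    queue = [(cited, fee) for cited in citations[payer]]
    while queue:
        paper, amount = queue.pop(0)
        upstream = citations[paper]
        passed = amount * decay if upstream else 0.0
        shares[paper] = shares.get(paper, 0.0) + (amount - passed)
        for cited in upstream:
            queue.append((cited, passed / len(upstream)))
    return shares

# D pays 100 to cite A: A keeps 50, B keeps 25, C (end of chain) keeps 25.
```

A nice property of a rule like this is that the fee is conserved (the shares always sum to the original payment), so "who gets credit" becomes a mechanical property of the citation graph rather than a judgment call.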
I suppose one refinement you could add to this - payment only to LIVING researcher(s)/author(s). If the researcher/author is deceased, then the payment stops. In this regard, it sounds like royalty/licensing fees.
Students would pay a little fee, if that doesn't impede their research too much.
Also I was thinking of a time based value. Papers / Journals older than X years, or that have already received a fair amount of fees could fall into free access. That could make students looking for alternative or forgotten ideas more motivated.
But it's unfortunate that Sci-Hub is also disrupting non-profit scholarly associations that cover their own budgets through journal subscriptions. In these cases, the fact that libraries and readers have to pay for access to an article is somewhat balanced out by the fact that those fees are going to pay for staff, conferences, and the other worthwhile activities of the non-profit associations.
EDIT: For those not familiar, Oink was a torrenting site but what distinguished it from the tons of other sites was how highly curated it was. High quality audio, proper grouping and genres, and best of all you could request anything that was missing and the community would magically add it.
Side-stepping the politics and legality for a moment, I find it interesting how this is a very natural example of the fault-tolerant design of networks.
A 'DOI' is a resource locator. There's a 'fault' in the network in that some people cannot locate the resource when providing the DOI. Sci-hub has sprung up to route around this fault, and provide people with the resource.
Practically, there are real people making decisions who are involved in this process, but on an abstract level it feels like this is somehow inevitable? It feels like an emergent behavior from any sufficiently advanced network, and any attempt to stop it seems futile.
Does this happen in realtime? I read that sci-hub sometimes uses credentials from legitimate users, but in that case they would be easily identified.
So, perhaps someone with access submits papers on request? And if so, what about watermarking?
Years ago I recommended to Sci-Hub that they use https://github.com/kanzure/pdfparanoia, but they weren't too concerned.
It's already passed peer review, but the final version will be tidied up with additional editing and will have e.g. the line-by-line numbering stripped out. I see sci-hub versions of papers from months ago, or older, that never update to the final edition. The content doesn't differ much but the final version may have better-placed figures, corrected typos, and other minor improvements.
I consider it a heresy that some sites (fuck ResearchGate, Academia.edu) consider it fine to alter the PDFs. Document integrity should be the holy grail. Files should be final and never get touched. If they are, they are different.
Different versions of a paper should have different DOIs. A DOI is fundamentally quite abstract, and there is no reason that an object-appropriate versioning scheme can't be laid on top. Figshare, for example, uses a versioning system where there is a single DOI referring to the overall artefact (e.g. 10.6084/m9.figshare.2066037) and a versioned-by-suffix DOI referring to a specific version (e.g. 10.6084/m9.figshare.2066037.v16). That works really well IMO.
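That suffix convention is easy to handle mechanically. Here's a small sketch that splits a versioned DOI into its base and version number; the regex is my own approximation of the pattern, not an official DOI grammar.

```python
import re

# Sketch of a versioned-DOI convention like Figshare's: a base DOI plus
# an optional ".vN" suffix for a specific version.
VERSIONED = re.compile(r"^(?P<base>10\.\d{4,9}/\S+?)(?:\.v(?P<version>\d+))?$")

def parse_doi(doi):
    """Split a DOI into (base, version); version is None if unversioned."""
    m = VERSIONED.match(doi)
    if not m:
        raise ValueError(f"doesn't look like a DOI: {doi}")
    v = m.group("version")
    return m.group("base"), (int(v) if v else None)

# parse_doi("10.6084/m9.figshare.2066037.v16")
#   -> ("10.6084/m9.figshare.2066037", 16)
```

With a scheme like this, the bare DOI can always resolve to the latest version while the suffixed forms stay stable for exact citation.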
(or arXiv:1707.08475v1 [stat.ML] for this version)
Sci-Hub is not like it, but it's more curated than TPB. You can enter a paper's URL or DOI and it'll take you right to it.
I think there is curation that could be built on top of it, especially if the concept of "review" could be moved to an online system.
This is sad to say, but in reality I think this isn't going to massively impact things for the publishers. Academia at its core is where the problem lies. Sure paid subscriptions are a big part of things, but it's the stuff most don't realize (the authorship fees and institution sub fees) that give the publishers power.
The hardware is ancient. It's really overpriced. https://xkcd.com/768/
But they're still selling lots! Why? They're approved by the examination boards, and have an effective monopoly. (oligopoly if you include HP and Casio, but most schools choose one brand and require all students to fall in line).
The academic market is not remotely capitalist, so even though there are cheaper & better options available that people will use in everyday life, the "official" results will still use the officially-sanctioned products.
Right now, publishers offer take it or leave it bundles of hundreds of journals at prices the publisher set. If a library doesn't want to pay, the library/university system loses access to everything. And many people at the university can't do their work without such access.
SciHub existing allows negotiators for the universities and libraries to shrug and say whatever when Elsevier threatens to cut access to every Elsevier journal.
Not to mention Elsevier, for example, WILL be cracking down on piracy, especially after the US courts ruled in their favor against Sci-Hub. Sci-Hub hasn't doomed anything and the publishers still control academia. You should be mad about this, and you should blame academia itself. Push for open journals like PLoS One when possible and try to convince people to move off that god-awful Impact Factor metric, and you might inflict damage on the big publishers.
Anyone know if this is a typical sentiment? I'm just curious if it's true that many researchers are offended by this movement, and what the reasons are.
I firmly believe that there are always two sides to any topic, so we should explore the flipside. What are some arguments against blatantly opening up access to paywalled articles?
My university paid 500-2000€ / article in extortion errrrr i mean publishing fees. That should be more than enough.
It seems like the sentiment is in the other direction, though. Maybe most researchers don't care.
Individuals and institutions in poor countries may well turn to sci-hub. I certainly have. But I would venture that not much of the journals' revenue came from individuals or poor institutions in the first place. I didn't pay to read paywalled papers before sci-hub either; I got them via authors' sites or personal contacts, or just didn't get to read them at all.
Elsevier, et al, can now make a legitimate claim that they are not restricting access to those who cannot afford a sub. But their tax on academia will continue.
Publishers are tracking mass downloads (see the Aaron Swartz case) so given some of the very obscure papers I've retrieved from Sci-Hub I assume it's unlikely they downloaded them beforehand. My go-to assumption for how it works is that a bunch of people have donated access to their university network access and Sci-Hub is just a load-balancing / cache layer.
"Science" and "subscription" (or any monetary incentive) don't compute in a single sentence. Aren't scientists funded by governments and/or corporations? Why should anyone pay them a royalty above that?
It's a legit question and not trolling, don't mistake my slightly angry tone for degrading please.
I am curious whether universities at one time published these independently, and whether they were more accessible to the public. When did the practice of restricting access to papers via subscriptions begin?
Is it reading experience? Site performance? Difficulty in navigating publishers' sites?
Are there any good experiences you can point to? I'm really interested in making this better.
Open access is probably a net good, but it just shifts the costs around.
I suppose that's partly a backlash against predatory open-access journals that publish any rubbish as long as (significant) fees are paid.
The focus is on degrees, not on true learning. So much of what occurs in universities is total waste. But people put up with it to get the paper. As long as people keep blindly giving absurd sums of money to get the paper, these expensive publications will last. The answer is for people to wake up and value learning over a diploma. Heck, as a bunch of people have pointed out, many of these papers aren't even for real learning. They are worded in such a way as to sound smart to their peers, but unintelligible to the public.