Here, the government is pursuing criminal charges for someone who has taken research papers (funded, by and large, through public research grants) to presumably make them available to the public.
In doing so, he wronged various publishers, who have not financially supported any research, who have not financially supported the scientific review of said research, who have financially gained by (likely) not only charging the original author for the submission but also the universities which provided the infrastructure critical to the majority of research.
Spearheading this noble effort to right the wrong and restore law and order is US attorney Carmen Ortiz, who is cited for the wise words: "Stealing is stealing".
Let's be reasonable here: a decently sized server farm could probably keep the entirety of documents hosted on JSTOR in RAM. Today alone, imgur burned through 50TB of traffic; it delivers a petabyte a week. I'm not going to believe a sob story about how distributing 100KiB PDFs to someone running 'wget -r' is DoSing their systems.
But he recklessly damaged a protected computer. They're going to have to replace the fifty cent lock with a new fifty cent lock. That will cost over fifty US cents!
(Yes folks, it's true: pretty much everything is a federal felony if someone doesn't like you.)
Add in bureaucratic paperwork, a multitude of companies bidding the lowest price to replace said lock, and the salaries of the team of people who are going to be spending the next 2 years replacing it, and it really starts to add up.
Well, the article does note that the documents were 'returned'...
>The indictment alleges that Swartz, at the time a fellow at Harvard University, intended to distribute the documents on peer-to-peer networks. That did not happen, however, and all the documents have been returned to JSTOR.
Should I infer from this that they were found by the swat team in a bag labeled 'Swag'?
> Lets be reasonable here: a decently sized server farm could probably keep the entirety of documents hosted on JSTOR in RAM.
One largeish science publisher I worked with had 9-10TB of (pdf) document data. In addition to that there's a search engine, and image versions (of everything) to allow for look inside. Then there's dynamically generated HTML for online view. Electronic publishers deal with large amounts of data and it isn't wholly static content. I don't know much about JSTOR but I think it's safe to assume that the implementation is non-trivial.
> I don't know much about JSTOR but I think it's safe to assume that the implementation is non-trivial.
It's non-trivial, and it's also one of the smallest chunks of their budget. JSTOR is a (bloated, inefficient, high-salary) non-profit, so you can read its financial filings for yourself:
They spend ~$4m a year on all computer costs. (To put that in perspective, they spend $1.3m a year on 'travel' & 'conferences, conventions, and meetings'.)
I believe Swartz's point is that 10TB of data is not something you can build a self-perpetuating bureaucracy around, anymore, and that this implementation is, in fact, simple enough that it can be made available and maintained for a tiny fraction of the cost of JSTOR.
I'm not really talking about what Swartz's argument is, but you seem to have missed the point that there's more to being a publisher than holding data.
But I don't see the complication on the tech side.
Getting the scanned copies is likely the most complex portion of this and is outside the realm of the site. The rest is just displaying static content and filling a Solr cluster with OCR data for search which is a trivial task with off the shelf OSS tools. Seems like a 2 week MVP project given how clunky the website is (why on earth does it move to the top of the page when I click the 'next' arrow when viewing the article, arg).
However, he did 'break into' an MIT switch closet to run 'keepgrabbing.py' over a 1Gbit/s connection. He wasn't just downloading 100KiB PDFs, either. He downloaded at least two million documents. The indictment isn't clear exactly how many, and it sounds like he downloaded a lot more than 2 million, to boot. Not all of JSTOR's documents are neat 100KiB PDFs, either: a substantial portion are scanned images (1+ MiB PDFs) from old journals. So, we're looking at the TB range of data.
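A rough sanity check of that "TB range" estimate, as a minimal sketch. It uses only the figures quoted above (two million documents, 100 KiB for a typical PDF, 1 MiB for a scan). Treating the whole download as all-small or all-scanned is an assumption made here to bracket the total, not a number from the indictment.

```python
# Back-of-the-envelope size of the download, using the figures from the comment above.
KIB = 1024
MIB = 1024 * KIB
TIB = 1024 ** 4

DOCUMENTS = 2_000_000  # "at least two million documents"


def total_tib(documents: int, bytes_per_document: int) -> float:
    """Total corpus size in TiB for a given average document size."""
    return documents * bytes_per_document / TIB


# Assumed bounds: every document is a small PDF, or every document is a scan.
low = total_tib(DOCUMENTS, 100 * KIB)
high = total_tib(DOCUMENTS, 1 * MIB)

print(f"all 100 KiB PDFs: {low:.2f} TiB")
print(f"all 1 MiB scans:  {high:.2f} TiB")
```

So two million documents land somewhere between roughly 0.2 TiB and 2 TiB, and more if the real count is well above two million or the scans run larger than 1 MiB.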
This is not to say that his intentions were ignoble...
So there are at least 2 million scientific documents that publishers are profiting from withholding.
I'm not generally anti-copyright, but I believe the profits publishers make on scientific publishing are unconscionable - not only do they impede progress, but in many cases (eg, medical research) they cost lives.
> “The criminal investigation and today’s indictment of Mr. Swartz has been directed by the United States Attorney’s Office,” said a statement released by JSTOR on July 19. “It was the government’s decision whether to prosecute, not JSTOR’s. As noted previously, our interest was in securing the content. Once this was achieved, we had no interest in this becoming an ongoing legal matter.”
Yes and hopefully it also means that many people take note and speak up against the law and against his harsh treatment.
And this is not just about the law by the way. The fact that governments around the world can be blackmailed or corrupted by a small number of ruthless publishers is a political issue as well. It's also a credibility issue for researchers to some degree, particularly those who have already made a name for themselves and still play along with this.
Academics are completely free to submit their papers to open access journals; the reason they don't do so is that they want the brand value from publishing in prestigious closed-access journals, and submitting to those journals is free.
Building that brand value has risks and costs money (editorial costs, etc.), hence the publishers need to recoup that value. To recoup that cost, open-access journals either charge the authors or rely upon subsidies from wealthy benefactors (universities, industrial sponsors).
There are no editorial costs. Editors are not paid. There are no authorial costs. Authors are not paid. There are a few copy-editing costs. Copy-editors get paid a few hundred bucks per article. There are a few administrative costs. The administrators get paid huge amounts. There is also no risk - the journals were set up a long time ago, and are the way they are due to inertia.
If you imagine for-profit academic journal publishers (especially Elsevier) as anything other than vicious monopolists defending their entrenched positions via copyright law and extortion of university libraries, then you have the wrong idea.
> There are a few administrative costs. The administrators get paid huge amounts.
I'll step over the part about these sentences contradicting each other. A colleague of mine worked as a part-time "managing editor" for the top journal in his field, meaning he was paid to coordinate the flow of submissions to the (unpaid) senior editors, managing the responsibilities/schedule of the editor-in-chief, and a host of other work I haven't quizzed him about. He worked hard, and for not much money, which is somewhat reasonable, as he was effectively an executive assistant. The EIC is expected to regularly travel to annual conferences (in one case, because that's when the journal's editorial board meets), so these costs are paid by the journal (to my understanding). He also does a great deal of evangelist work for the journal internationally. While some of these costs may be paid by honorariums (I'm not clear), they must be paid by somebody, and the journal is a likely candidate for a portion of those costs.
My point is that every time we talk about costs of journals, someone mentions either a) administration is a no cost event or b) administration is an overpriced event. Yes, the reviewing and editing is typically performed by volunteers (the costs of which are borne by their respective employers). But there is a substantial amount of work that goes on behind the scenes. I've had only the slightest of peeks behind the curtain, and I'm blown away by how much occurs that I wouldn't have initially guessed.
I'm not saying the repository companies (and some publishers) are not money-grubbing parasites; I'm not sure the evidence is in their favor. But little though it's been, my limited view of journal administration is that it's a non-trivial set of tasks costing more than I originally suspected. Claiming otherwise without experience to the contrary is really just FUD.
I was intending to imply that the massive dollar flow to these administrative costs is not proportional to the value provided. The cost for your friend to fly around was not in any way proportional to the price levied on the journal purchasers and users.
I am very interested to hear more about the activities "behind the curtain", because (being a junior academic at a teaching-oriented institution) I haven't had any interaction with a journal publisher that provided actual value beyond providing a website for me to submit my work, and a branded stamp to certify that my work was correct. In every case, I have had to submit a camera-ready final copy that was typeset by me and typo-checked by either me or the unpaid reviewers, and at no point has anyone approached me or anyone else I know asking us to publish our work in journal X as opposed to journal Y. As far as I know, to the degree anything I have ever written has been read at all, it's because I put the PDFs online myself (technically in violation of the copyright agreement, but a behavior that seems to be tacitly accepted in practice). Of course, even after people read the online PDFs (to the degree those papers get read at all), people cite the paper as if they read it in the paper journal or proceedings...
Typically the commissioning and production editors (as opposed to the academic editors) are paid, as are the copyediting staff, the design staff, the sales and marketing teams (brands don't build themselves), etc.
The average academic journal has a 25%-30% profit margin (higher than the rest of the publishing industry), but it can take up to 5-7 years for a new STM journal to build enough of a reputation to break even. The profits from the successful journals have to cover the failure of the others (not unlike the startup industry from an investor perspective).
If you look at an open access journal like PLOS Biology the fundamental economics aren't that different. PLOS Biology charges authors $2900/paper, at about 20 papers an issue that means an issue generates just under $60,000 in revenue (excluding revenue from print sales). Overall PLOS runs at a 20% profit rate but it's not significantly different from other mid-tier closed-access journal publishers.
But what do journals do? I know what they did: they were clearinghouses of current scientific knowledge. Now, in the days of the ArXiv, what do they do?
The only answer I can come up with is: help academics gain tenure and promotion. Except for a few flagships (Science, Nature, Cell, etc) most journals are not current, and are not read. Issues are not received with bated breath by people rushing to find out what the cutting edge is. Instead, everyone who cares has gotten the preprints, and everyone else doesn't care.
Arguing about the profitability of journals feels like arguing about the business models of buggy whip manufacturers.
Here, the government is pursuing criminal charges for someone who broke into locked facilities and committed a textbook case of computer fraud to do something that might be morally defensible. If he weren't illegitimately accessing these articles, it'd be a different story.
If you have a tip about an unsolved murder that you want to report anonymously, and you break into someone's house to do so, good on you for reporting the tip, but you still broke into someone's house.
Are you suggesting the case against Swartz is as serious as a burglary in Cambridge, Massachusetts? Or less serious, since he did not damage a lock or deprive the owner of the use of anything? Or is your argument that this is legitimately a high-profile federal case?
Whatever Aaron's intentions were, he clearly knew he was doing something illegal. He broke into a building (closet?), set up a computer illegally and transferred information illegally. In addition to all of that, he tried to evade security when he knew he had been discovered.
It seems like there were many other (more reasonable) avenues that he could have taken to accomplish his goal. He's really the only one to blame for getting himself into this mess. I sincerely hope that he gets the punishment he deserves which should be a firm slap on the wrist. I'm uncomfortable with the idea of him getting any jail time, but if it comes to that then he'll have to deal with those consequences.
I disagree that there are other, more reasonable avenues to achieve the goal of putting as much information as possible into the public's hands. Breaking into a closet that you think you can access and downloading a lot of it is undeniably faster and more effective than lobbying or voting or petitioning companies or building alternative publishing mechanisms (all of which are also avenues which he has personally taken, to great effect.)
If ten thousand people took Aaron's approach and simply got caught less, society would be significantly better off for it.
Indeed, one of the facets of civil disobedience is that you have to be willing to accept the consequences. As much as we may like the activist, we can't support making an exception to the law, because that would invite anarchy.
However, this logic rapidly breaks down when the prosecution is vindictive and the consequences disproportionate, as they are in this case. Before the indictment was expanded, he was facing up to 35 years in prison.[1] Even if he doesn't get that, he'll likely bankrupt himself and his family defending himself, and suffer consequences for years to come. No one should have to "deal with" this. It's the government that deserves the blame for his fate, not him.
> we can't support making an exception to the law, because that would invite anarchy.
Exceptions to the law are made every day - it is hardly applied at all when the violators are politicians, big media organizations, and bankers. I'd agree we don't already make exceptions to the law when high profile connected public figures (eg the ex-head of MFG and the guy who authorized F&F) are indicted for anything.
That's a great point and yet another reason why it's becoming increasingly difficult to accept the argument that activists who perform civil disobedience must accept the consequences.
I'm not familiar with the specifics in this case, but if it's like most other cases they're not pushing for 35 years. They're pushing for him to take a much shorter sentence by threatening 35 years. (It's still ridiculous, but my point is they probably don't think 35 years is appropriate for his crime -- they think 35 years is appropriate for getting him to accept their deal)
While I do not disagree, one point I would like to add regarding the notion of "breaking into a computer closet, etc.". Someone, a Harvard student, recently posted to HN a "love letter" to MIT. The letter went into detail about how liberal MIT is with its resources for students. And how that really has benefitted her studies. It seems MIT is somewhat unique, at least vis-a-vis other universities in the region, in their approach to making resources available to "almost anyone" (i.e. you do not need to be an MIT student) for academic purposes.
Is it possible that if he were to have tried this stunt at another institution he would not have so easily succeeded? Was he simply taking advantage of MIT's liberal policies with respect to computer resources? Or is MIT's "do whatever you need to do" environment irrelevant... as we ponder thoughts of "breaking and entering". Just a thought. Maybe it's irrelevant. What do you think?
MIT alum here. Lots of MIT resources are accessible to almost anyone. That doesn't mean they all are -- it's easy to get into our computer labs, but the network closets are actually off-limits to anyone other than network admins. Some things are more liberal, but that's not a free license to do whatever you want.
Nor does it mean it's acceptable to abuse MIT's trust. In particular, presumably as a result of this case, JSTOR now requires strong authentication from the individual MIT account holder, instead of permitting access from MIT's IP address space as they used to.
Finally, yes, MIT does have an "it's better to ask forgiveness than permission" culture. But that very clearly only applies to legitimate MIT affiliates. I know of at least one other legal case (of perhaps equivalent importance) where MIT's lawyers said, if this guy were an MIT student or staff member, we'd go to bat for him, but since he's not, take the content down.
Thanks for this. I was just curious. "Abusing trust" is exactly the type of thought I had when I first read about this case. It sounded to me like MIT is very generous with letting people use the computer labs and he really took these privileges a little too far. But being far from MIT I can only form a picture from what I read. Thanks for the color.
> He's really the only one to blame for getting himself into this mess.
Not historically, due to principles of solidarity. In decent leftist movements, activists expect assistance from others. I heard one way to assess a leftist movement is how much support incarcerated people get.
(Imagine blame-oriented workplaces where people are completely on their own if something goes wrong. Who'd want to work in such a toxic, self-defeating environment?)
After all, prison is often kind of a limited death penalty, imposed by the state. You're stripped away from social bonds and freedom. (Particularly in the US; though fortunately Aaron is wealthy and white, a great advantage. Not to mention that he was engaged in a rather elite crime.)
"Leftist" is completely redundant in the above comment. If you don't have a committed hard core of true believers you have a talking shop, not a political movement, left or right.
> I sincerely hope that he gets the punishment he deserves which should be a firm slap on the wrist.
A felony, even without any jail time, is an irreversible and life changing punishment. He will no longer be able to vote or (possibly?) leave the country, or work at various organizations and corporations that automatically do not hire felons.
Edit: didn't know that felons could still get passports. They can have trouble getting visas for sure though.
This is one of the biggest misconceptions about voting out there. The suspension of your voting rights is dependent on the state in which your felony is committed. There are only 12 states where it can be suspended for life and even then it's usually dependent on the crime (i.e. in Nebraska only those convicted of treason will lose the right for life.) Voting rights are restored in all other states under different conditions.
> He will no longer be able to vote or leave the country,
Why would leaving the country be a problem? I haven't heard of that one before. Say he wants to fly to Europe for vacation: he can't because he was convicted of a felony?
Swartz is smart and has interned for (law professor) Larry Lessig.
I doubt he's even going to try to make the case he didn't know it was illegal.
This is civil disobedience against unjust laws at its best.
To quote Wikipedia's summary of Martin Luther King's "Letter from Birmingham Jail"
Against the clergymen’s assertion that the demonstration was against the law, he argued that not only was civil disobedience justified in the face of unjust laws, but that "one has a moral responsibility to disobey unjust laws."[1]
A TRUE BILL
b r e n d a l h a n n o n
_________________________________
Foreperson of the Grand Jury
scott l garland 9 - 12 - 12
_______________ ___________
Assistant United States Attorney
DISTRICT OF MASSACHUSETTS
September 12, 2012
Returned into the District Court by the Grand Jurors and filed.
Granted: grand juries rarely refuse to indict. But it's not just the prosecutors behind this decision.
As someone who has served on a federal grand jury this is true (the rarity of a false bill), but is most likely due to two factors I saw over and over again:
1) The government generally doesn't bring borderline or weak cases in front of a grand jury. They have limited time and budget and don't want to pursue a case they don't feel is a sure thing (or, more accurately, a sure deal with the defendant as most cases do not go to trial).
2) Grand jury decisions need not be unanimous, only a simple majority. I voted against indictment in a couple of cases I felt were weak, but they were indicted nonetheless. Further, the burden of proof is not beyond a reasonable doubt but rather a preponderance of evidence[1], which is a lower bar.
Part of this is because the federal criminal case load is so heavy that there's no use to even waste money and time taking a case to a grand jury that can't lead to an indictment.
As Aaron is a founder of http://demandprogress.org I am not surprised by this. Their campaign emails are highly critical of the current political establishment fighting SOPA, PIPA, NDAA and related issues quite effectively through social media. This is deliberate over-reaction to petty protest crime. Like WikiLeaks, I imagine he has developed enemies due to these efforts. This entire situation reminds me of the treatment of political protest punk band Pussy Riot in Russia recently.
You need only reflect that one of the best ways to get yourself a reputation as a dangerous citizen these days is to go about repeating the very phrases which our founding fathers used in the struggle for independence.
To me the Largest Sadness is that he did not succeed in releasing the corpus to the public, which I presume was his real goal. Making this information accessible would be one of the largest contributions an individual could make to the world, possibly worth the trade of spending the rest of one's life in prison. The tragedy is doing so without the reward.
The only greater contribution I can immediately come up with that is definitely accessible to some of the readers of this board would be leaking the Google Books archive. Depriving the world of life saving knowledge for purpose of profit is as great a sin as genocide. Sometimes conscience must trump the law.
Sincerely,
A long time member resorting for the first time to an anonymous account.
Swartz's conspiracy to release JSTOR articles to the public led the company to make many articles free for public viewing. A lot of good has come about from what Swartz did, even if the morality of his illegal actions is under dispute. There has been a lot of recent criticism of academic journals and their publishing companies, whose business model is keeping all science and research behind gated walls and away from the public. Even if the research was funded with public money!
They've decided to make pre-1923 articles from American journals, and pre-1870 articles elsewhere, available publicly (w/o any subscription or even a free account needed): http://about.jstor.org/service/early-journal-content-0
Their original FAQ on it used to have a question about whether this was related to the "Swartz situation", where the answer could be paraphrased as "no but sort of yes", basically that they had been planning it all along but may have moved up some initiatives in response to the publicity. Doesn't seem that their current FAQ has that question anymore.
One may well ask: "How can you advocate breaking some laws and obeying others?" The answer lies in the fact that there are two types of laws: just and unjust. I would be the first to advocate obeying just laws. One has not only a legal but a moral responsibility to obey just laws. Conversely, one has a moral responsibility to disobey unjust laws. I would agree with St. Augustine that "an unjust law is no law at all."
Martin Luther King, 16 April 1963, "Letter from Birmingham Jail"[1]
Could someone who knows about such matters comment on whether this is the usual consequence for what he did, or if it's likely that he's been given "preferential" treatment for his political activism?
It's likely the government doesn't like him because a few years ago he released 18 million pages of public court records that the government ordinarily charges 8 cents per page to access.[1] As these were public records, what he did was not illegal nor, in my opinion, immoral. But the government didn't like being undercut, and now they have a good opportunity for some payback.
I'm pretty confident all the data in PACER will be made public at no cost at some point in time. I think the $.08/page charge is supposed to cover the administrative costs (like paying for photocopies) but the PACER program actually runs at a surplus. I think this may have even been part of the rationale for the trial they ran which Swartz used to do the downloads. The government wants to open up PACER.
But what struck me as particularly stupid about what Swartz did (besides the fact that the data will probably be released anyway, in due course, without the need for "activism") is that he installed stealth code on a computer in a Federal Court Law Library. Of all places he chose a federal building, and a Federal Court Law Library. This just sounds idiotic.
And the irony of it all, at least to me, is I just downloaded his Superseding Indictment for free from archive.org. It appears others are succeeding in making court documents publicly available without installing stealth code on federally-owned computers. Maybe they do not have everything in PACER yet, but I think it's only a matter of time. Courts are perhaps a little slow to change with new technology but despite their budget constraints they are definitely making progress. And publicresource.org seems to be getting bulk data with the blessing of the courts and without installing any stealth scripts on Law Library computers.
If one really wanted to engage in some sort of activism to free up (what should be free) legal documents, maybe a better focus is Lexis-Nexis. A true monopoly, founded on a dubious interpretation of copyright law. Can you copyright court decisions? They managed to do it. And the founder is on the Forbes list.
PACER is partially open now; for example, I went looking a little while ago for filings related to modafinil prosecutions, and wound up paying nothing at all because I fell below their $10/monthly cap or whatever.
JSTOR have said that once they had a guarantee the articles had been secured (which I believe means that Aaron claims he no longer has them, has not distributed them, and will not distribute them) they had no more interest in pursuing the case. This is the feds acting on their own.
JSTOR said that publicly, and they could well have said that privately. But I'd be interested to know what the various journal companies have said, either to the prosecutor or to the various politicians they lobby and contribute to.
Yeah, it would definitely be interesting to know that.
But even so, is it really that hard to believe that the feds have their own agenda and are acting on their own for this one? I realize I have no proof to back up my assertion that this is all them, but it's not outside the realm of possibility, that's for sure.
They could have their own agenda, sure. But given how much money there is to influence American government these days, I think it's also worth asking "who could benefit financially from this action?"
JSTOR is a non-profit. They dropped their charges once it was clear the data had not been re-distributed. The state of Massachusetts also dropped its charges. Only the federal government is pursuing him now.
Having to pay for access to papers and being forced to pick cotton in the blinding sun every day of your life under pain of the whip are not the same thing.
I'm going to go out on a limb and assume that you completely missed the point of my analogy, which was a reference to the exploitative, all-but-mandatory unpaid 'research assistant' jobs or 'internships' and the like. We're fortunate that this is unheard-of in our industry, but in some fields, it's the norm.
Though I would thank you for deleting the completely out-of-proportion reference to Hitler that was in your original comment ("Hitler liked cakes, therefore anyone who likes cakes is a Nazi"). It seems even Hacker News is not immune to Godwin's law.