The repeated use of "stole" in the indictment is interesting, even beyond the usual metaphorical usage to discuss copyright infringement.
In this case, the indictment alleges that the documents were stolen from JSTOR, which does not even own them! In the vast majority of cases JSTOR scanned documents whose copyright is owned by someone else, and acquired or was donated a non-exclusive license to distribute copies via its service. In many cases the documents are even public domain. The indictment continues the theft metaphor by discussing the effort and expense JSTOR incurred in scanning the documents, and the alleged attempt to render this less valuable by redistributing "its" documents, analogizing this to the loss someone suffers in a theft.
But effort expended to build a private repository consisting of copies of things you don't own doesn't give you ownership of the result, any more than Google Books doing the same has given them ownership of the documents that they've scanned. If you scraped Google and "stole" their scans, you would be violating Google's Terms of Service, and Google might indeed feel subjectively like you've taken something of value (their exclusive access to this repository of scans), but I think it would be a stretch to say that you've "stolen" "their" documents.
They are talking about theft of services, not copyright infringement. In any event these charges are going to be very difficult to beat since they're federal, even though there are some obvious holes in the indictment. It will be almost impossible to get any of the evidence thrown out even if there was an illegal search and seizure. His best bet is probably to get the Harvard legal team to go to bat for him, although it's difficult to say how likely that is.
I wasn't really commenting on the legal sufficiency of the indictment, just the rhetorical dishonesty of accusing someone of "steal[ing] well over 4,000,000 articles from JSTOR" (quote from the indictment) when JSTOR didn't own those articles. They could've just alleged violation of JSTOR's TOS and thereby theft of network services. I suspect JSTOR or people sympathetic to them had a hand in writing the indictment, though; JSTOR has a long history of attempting to spread the misinformation that it somehow "owns" its archive.
Well, if someone stole $100,000 of property from a storage facility, that wouldn't mean the storage facility claimed ownership of the property, just that it was the location of the theft. Maybe you're overthinking this a bit.
Well, this is more akin to breaking into the storage facility and making a copy of all the paintings stored there. The value of the goods being stored has not been reduced.
Not to torture this analogy any further, but would you feel safe storing your stuff at such a facility after something like that? No, you'd probably look elsewhere for your storage needs. Breaking in is still bad and would be the subject of criminal charges. If the US attorneys decide that use of the word 'stole' is somewhat over the top, then guess what? They can amend the indictment - just as the defense can amend their motions.
My point is not that the government is correct or morally justified in bringing this indictment, but that getting hung up on terminology like this obscures the legally problematic issue of having (allegedly) bypassed the security systems to download material he was not supposed to have access to, regardless of who actually owns said material.
But: (1) JSTOR isn't a storage facility in that sense; the copyright holders do not pay JSTOR to store their items, so this is a bad analogy.
(2) If the outrage is supposed to be about bypassing security systems, why is the government hung up on the "theft" terminology? Especially when JSTOR, the party arguably injured (in some way not specified), has asked the government not to prosecute?
No, this is clearly a convenient way to get a politically inconvenient person labeled a felon.
1. The analogy is only to point out that a 3rd party repository can be negatively affected by a break-in event if it doesn't have an ownership interest in the materials it stores.
2. The government is not hung up on the 'theft' terminology. The words 'steal' or 'stole' only appear three times in the 15-page indictment, and the actual offenses he is charged with are wire fraud, computer fraud, unlawfully obtaining information from a protected computer, and recklessly damaging a protected computer.
Except that copyright infringement is definitely not theft. The owner doesn't lose his documents. It's fine if your opinion is that copyright infringement is wrong, but let's not call it by inappropriate names.
He is accused of stealing bandwidth from JSTOR, not the documents: "theft of services", not theft of property. Theft of bandwidth is almost as absurd as theft via copying. JSTOR apparently isn't interested in free transmission of knowledge.
If you read the indictment you'll see that they very much are not interested in free transmission of knowledge.
They charge >$50k/yr for access: "For a large research university, this annual subscription fee for JSTOR’s various collections of content can cost more than $50,000."
That price actually seems pretty reasonable for a large research university.
The real question is how much they charge individuals who want to get an article. My first google search (http://www.jstor.org/pss/27757488) results in $12/article. This is very steep when you're trying to do research and don't even know if the article is what you're looking for.
Well, you wouldn't want any old rabble getting access to valuable knowledge. Far better for that access to be safely controlled by the major research institutions, who can clearly be trusted to pursue knowledge in a responsible manner.
How is that reasonable? Sounds like Mr. Swartz was willing to host them for free! And he would have gotten away with it too if it wasn't for those meddling police.
But seriously, $12/article is ludicrous. That must be way above cost recovery or they're not doing a very efficient job of running JSTOR. Perhaps the co-founder of Reddit would do a better job...
Most public libraries have relationships with JSTOR that allow members to access the articles online. I use the Boston Public Library and look up articles via Google Scholar. All free.
I admit that I don't have statistics [edit: on libraries], but most libraries in the world are not large or in the US, and JSTOR's prices for a "small" library in "the rest of the world" are much, much larger than [edit: wrong — comparable to or perhaps a bit larger than, but not much, much larger than] their entire budget. Check out http://support.jstor.org/csp/PriceCalculator/. This code (for Chrome) gives me a yearly price of $81162.70, although it hangs the browser for a while first:
    // Build a synthetic left-click event (legacy initMouseEvent API).
    function mouseEvent() {
      var event = document.createEvent("MouseEvents");
      event.initMouseEvent("click", true, true, window, 0, 0, 0, 0, 0, false, false, false, false, 0, null);
      return event;
    }
    // Copy the node list into a plain array, then call thunk on each item.
    function each(list, thunk) {
      list = Array.prototype.slice.call(list);
      for (var ii = 0; ii < list.length; ii++) { thunk(list[ii]); }
    }
    // Click every 'expand' link, then every 'e-only' option.
    each(document.getElementsByClassName('expand'), function(link) { link.dispatchEvent(mouseEvent()) })
    each(document.getElementsByClassName('e-only'), function(link) { link.dispatchEvent(mouseEvent()) })
It's sad that you have to write javascript code to do that! (But also cool that you did. :)
"Complete Current Scholarship Collection" for 22751.90 is a duplicate of all the things above it. So I think some of the entries have been double counted.
The real price for most libraries may be about 1/2 or less of your estimate (they won't be interested in everything). And 20,000 to 40,000 isn't (well, shouldn't be) a lot of money for a public library.
That's the salary for a single employee! I would expect a library to have at least 5 employees, plus a budget to buy books.
Also I would expect a small library to have only a subset of the papers, and for serious research you would need to "go into the city".
I think you're thinking very much of US salaries. $40,000 a year shouldn't be a lot of money for a public library in the US, because it's the salary for a single employee (or the total costs for half an employee!), and the wonderful public library system in the US does indeed have multiple libraries. But world GDP per person is about US$10k per year, compared to the US's US$47k — and the bulk of that GDP comes from a few rich countries with only a small fraction of the population. An average country is something like Jamaica, Thailand, or the Dominican Republic, where the per-capita GDP is something like US$8.8k.
So US$40k per year is the salary for almost five employees. Except that within Jamaica or Thailand (or, to a lesser extent, the US) the median salary is much lower. And it's probably not the prime minister's niece who's working the librarian job. So maybe it's more like eight to ten employees.
So, yeah, most libraries — even measured numerically, but especially measured by the number of people who rely on them — are a lot poorer than what you're used to.
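The back-of-the-envelope comparison above can be written out as a quick sketch. All figures are the ones quoted in this comment (not independently checked), and `salariesCovered` is just an illustrative helper name:

```javascript
// Figures as quoted in the comment above; treat them as rough assumptions.
const subscriptionUsd = 40000;    // upper end of the 20,000 to 40,000 estimate
const usGdpPerCapitaUsd = 47000;  // "the US's US$47k"
const midGdpPerCapitaUsd = 8800;  // "something like US$8.8k"

// How many per-capita-GDP "salaries" one yearly subscription corresponds to.
function salariesCovered(priceUsd, perCapitaUsd) {
  return priceUsd / perCapitaUsd;
}

const inUs = salariesCovered(subscriptionUsd, usGdpPerCapitaUsd);
const inMid = salariesCovered(subscriptionUsd, midGdpPerCapitaUsd);

console.log("US:", inUs.toFixed(2), "salaries");            // a bit under one
console.log("Mid-income:", inMid.toFixed(2), "salaries");   // roughly four and a half
```

Using median salaries instead of per-capita GDP, as the comment suggests, would push the second number higher still.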
I haven't checked yet to see if the National Library here in Buenos Aires has JSTOR access.
I don't know this for sure, but I suspect that if you contacted JSTOR from a low income country they may give a better deal.
BTW, if you really do need JSTOR, it's not hard to find a library card number from a US library and use that for access anywhere. (Well, I don't know JSTOR specifically, but all the other databases I've used from my library are available to me at home after I put in my library card number.)
Their price schedule divides "Public Library – Small" into "US", "Canada", and "Rest of the World". It's possible that someone phoning them up from Senegal or Paraguay would be able to negotiate a lower price, but it's not as if their existing price list doesn't recognize the existence of different countries. (Still, lumping Switzerland and Malawi into the same category might not represent a deep level of consideration of the issues.)
For what it's worth, I was using their web site from my house here in Argentina, which is usually classified as a "middle-income country," but where you can hire a full-time employee illegally for US$4000 per year.
I was rebutting a factual claim ("Most public libraries have relationships with JSTOR that allow members to access the articles online"), not a normative one. An analogous factual claim might be that most Zimbabweans drive Mercedes. Even without having access to Mercedes's sales figures by nation, that ought to appear unlikely to you?
Yes, let's agree on and further reason from the premise that it is not currently true that most Zimbabweans drive Mercedeses ;)
My point was: your argument seems to be based on refuting the argument that the JSTOR subscription is not expensive for the average library because it is only about one yearly salary of the average rank-and-file employee, by saying that that only holds for the libraries in the US (maybe some parts of Europe, but let's say the US for the sake of this argument), and that in many other countries salaries are lower and therefore the relative cost of a JSTOR subscription higher.
So, my (perhaps naive) interpretation of this is that your ulterior argument is that JSTOR is too expensive for many libraries outside of the US, and that they therefore don't have access to its contents.
I further deduce, from the context in which you bring it up, that you don't find it a problem that people take the content from JSTOR and redistribute it to people who don't have easy access to libraries that do have a subscription. Now I'll grant that this is a fairly big leap to make, and maybe you're not holding that position; but within the given context (of people arguing pro and con the actions of the Reddit guy what's-his-name), I think it's not unreasonable of me to assume so, either.
So, to close the circle, my 'question' was (but of course it is a 'question' that is, in the end, a way of stating my position in the discussion...) if it is reasonable to hold that when something is too expensive for people, it is OK to circumvent the rights holders' restrictions on the use of something. (I'm deliberately being vague on issues like 'moral ought' vs 'legal ought', if JSTOR really has a common-law variation of a database right on their collection, jurisdiction etc. - I don't really think they're important for the question at hand).
There is no non-exclusive copyright. J-STOR is not the copyright owner, period.
They do own the right to the composition of their collection, so someone who got the whole collection would be liable to infringing their right on the composition of the collection; in contrast, a random sample of articles would infringe on the publishers' IP rights rather than J-STOR's.
The subtle point is that J-STOR is absolutely not interested in the original copyright owners having to hunt down abusers, because that would (in all likelihood) appear like an additional, avoidable hassle to the latter and would make them less likely to agree to have J-STOR distribute their content. [edit: apparently it's the US Attorney General more than J-STOR who is pushing this case forward]
In comparison: if someone sneaks into a cinema to see a movie, you would accuse him of cheating them of the entrance fee, and not of "stealing the movie". If someone sneaks into a cinema and uses his camcorder to record the movie, he is cheating the movie theater of the entrance fee and misappropriating the production company's movie (with the suspicion that he might pirate it later), but he did not steal the movie from the theater. That would involve something like walking away with the movie theater's copy of the movie, which would fulfill the criterion that what's stolen is not there afterwards.
Misappropriation of IP is not stealing. It's unauthorized copying - that certainly has the potential to harm the bottomline of the copyright owner, but with an impact that is much harder to quantify than the stealing of an actual physical thing.
IP owners and friends of them who use the word 'stealing' want to frame the situation in such a way that appeal to the nonexistence of monetary loss is excluded - mostly because these same owners are investors in, and not creators of, the IP and do not have any other perspective than squeezing whatever value they can out of their investment.
(The authors of the original articles probably couldn't care less about some punk illegally downloading their texts, because they don't see any money from it anyways).
Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use.
Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship.
Yeah, fortunately they don't appear to actually enforce that regularly through technical measures. As a researcher with legitimate paid access (via my institution) to JSTOR, it would be absurd if this were enforced. If there is a special issue of a journal exactly in my research area, I pretty much need to read all the articles in it, or at least skim them. To comply with the terms, do I really have to choose an article to avoid reading, so I only download (N-1) of the articles in the issue?
I agree with most of your comment, but there are a couple of points where I wanted to add some commentary.
Compilation copyright only applies to compilations where some creativity is employed in selecting the items to be included. There is no "sweat of the brow" database right under US law. JSTOR almost certainly does not have a compilation copyright on their collection, since any creativity being employed in selection is being employed by the journals they archive, not JSTOR employees.
At any rate, the indictment does not include any charges of copyright infringement.
§ 103. Subject matter of copyright: Compilations and derivative works
(a) The subject matter of copyright as specified by section 102 includes compilations and derivative works, but protection for a work employing preexisting material in which copyright subsists does not extend to any part of the work in which such material has been used unlawfully.
(b) The copyright in a compilation or derivative work extends only to the material contributed by the author of such work, as distinguished from the preexisting material employed in the work, and does not imply any exclusive right in the preexisting material. The copyright in such work is independent of, and does not affect or enlarge the scope, duration, ownership, or subsistence of, any copyright protection in the preexisting material.
Excerpt:
In the terminology of the copyright law, a database is a “compilation.” The Copyright Act defines a compilation as “a work formed by the collection and assembling of preexisting materials or of data....” (1) Compilations were protected as “books” as early as the Copyright Act of 1790.
I hate to quibble over words, but a painting can be stolen "from the Louvre" even if it's there on loan from a private collection, just as you could say, "The necklace was stolen from my jewelry box," without implying that your jewelry box was the legal owner of the necklace.
This is more like taking a photograph of a painting on loan to the Louvre, though--- they're alleging that a copy was made, in violation of their terms of service, of a document that they don't even own (but do host). In that case, I would think that you might be violating the Louvre's camera policy, and you might even be colloquially "stealing" something from the painting's author (e.g. if you go on to publish illicit copies from your photo), but you aren't plausibly stealing anything from the Louvre.
The argument over whether something digital can be stolen at all is a different argument than the one you made in the comment I replied to. The question of taking photographs of artwork (which are not exact copies, but which have their own cultural, educational, and commercial value) is a third question which is interesting in its own right. At this pace, I'm having a hard time keeping track of what we're arguing about.
In any case, my intention was not to signal my support for one of two predefined sides in a battle over the concept of intellectual property, it was just to point out stealing "from" doesn't have to have the meaning you read into it.
My point being, I have no idea what service they provide; I didn't read the fine article. But if they have an all-you-can-eat plan for, like, $300 a month per seat, then the charges should have been "John Doe hacks MIT network and steals $300 worth of replaceable goods".
Along the same lines, if someone breaks into a car wash and uses 5 minutes of their water, he will hardly be convicted of organized crime and causing damages over $300 (because that's what they charge for the premium wash)... Why is it so messed up when you put a computer in the middle?
Yeah, that bothered me too. Especially on page 14, where they demand that he give back the "proceeds obtained." How is that going to be determined in such an unrealistic sense?
You're precisely incorrect as far as the law is concerned.
If I make a painting, I own the copyright. If you take a picture of said painting, you own the copyright on the picture. If someone makes a collage of your picture they own the copyright on the collage. Both of you are liable for copyright infringement against my rights, but this is independent of your own rights as described above.
What Aaron did sounds seriously sketchy (sneaking into MIT wiring closets, trying to download the entire database, etc.), a fact that Demand Progress and several commenters here seem to be ignoring.
Defending his actions would require a very strong, multi-pronged version of the argument "if it's physically / technologically possible, it must be ok." Can MIT legally limit guest access to its network? Can JSTOR limit access to its content? Well, technically, their software didn't limit it, right? He just changed his IP address and they let him right back on, gave him permission. And then he had to change his MAC address. And then physically move to a different building.
But it doesn't matter anyway, because legal restrictions are legal restrictions. It's impossible to enforce every legal restriction in software. Put another way, we don't have to read JSTOR's server code to figure out if there's a violation of policy here -- the policy is written out as a legal document.
In the hacker world, there's a tendency to think that if something's possible, even easy, then it shouldn't be considered "breaking in" or "stealing." If my Gmail password is "password," then of course you're going to read my email! I had it coming. In the real world, though, this is still a crime.
Right, because the standard penalty for trespass onto campus property is a federal indictment. Good thing no MIT student has ever snuck into a restricted area before.
Even with MIT's charitable (but declining) tolerance towards student hacking/pranking, they take a very dim view of non-students doing the same (even alumni).
There used to be considerable tolerance of Harvard students doing pranks at MIT also, since Harvard/MIT had a bit of a prank-exchange rivalry (though MIT students engaging in shenanigans on the Harvard campus is more common than vice-versa).
Is this worse than the sort of thing that goes on in the early days of most startups, including our most revered? People around here have a lot of respect for pg, rtm, tlb and their startup Viaweb - go back to "Founders at Work" and read how they got computer time needed to get the startup going. That kind of thing is practically universal in startups. The good ones, anyway.
So Aaron appears to have cut some corners in getting an interesting project off the ground. Slap him on the wrist.
Well, this again is a very hacker-centric perspective. Even if it's truly important that someone be able to surreptitiously copy all of JSTOR once in a while for the sake of innovation, who is in a position to let him off based on that?
Plus, putting everything else aside and evaluating him as a hacker, he doesn't come out looking too good. If he'd scraped all of JSTOR without getting caught, it would make a better story. As it is, he attracted a lot of attention, and the report of how he was caught has an air of inevitability to it. JSTOR called up MIT, MIT was looking for him, and he gave them a lot of time to find him.
This is the most technically competent charging document I've ever read. I guess there must have been some hackers on the grand jury.
Paragraphs 35 & 36: which "protected computer" on MIT's network did he access? Certainly they're not trying to claim his laptop was a protected computer? Are they talking about the DHCP server or whatever registration frontend MIT has for the DHCP assignments? I have trouble with the concept that a violation of a computer use agreement (when there are no operative security barriers in place) constitutes a violation of the Computer Fraud and Abuse Act. Then again, I've always thought that act was vague and therefore overbroad.
Obviously what he did was bad in some sense (at least from the perspective of JSTOR and MIT), but even if it should be a crime rather than a civil dispute or internal disciplinary action at MIT, I don't like the fact that just about any misbehavior on the internet becomes a federal case because the probability of no interstate resources being used is very low.
Finally, I take issue with the notion that someone who is accessing a service through a public interface is criminally responsible for downtime if too high an access rate causes service degradation or an outage. The claims that JSTOR's servers were overloaded and (one?) even went down at some point are clearly there to set up a later claim of damages. Haven't they heard of rate limiting (in this case, since it was a rogue laptop stashed in a data closet, rate limiting by IP)? That wouldn't work against a concerted denial of service attack, but this was no denial of service attack. JSTOR seems to have been relying on manual intervention to stop article leeching that could lead to a (partial) outage. That's naive, and not a good idea.
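For what "rate limiting by IP" might look like, here is a minimal sketch of a fixed-window limiter. It is purely illustrative: `createRateLimiter`, its parameters, and the example address are made up for this sketch, and nothing here reflects how JSTOR's servers actually work. It also only helps against a client that keeps the same address, which (per the indictment as described elsewhere in this thread) was not the case here.

```javascript
// Hypothetical fixed-window rate limiter keyed by client IP address.
// maxRequests: requests allowed per window; windowMs: window length in ms.
function createRateLimiter(maxRequests, windowMs) {
  const windows = new Map(); // ip -> { start, count }

  // Returns true if the request at time nowMs should be served.
  return function allow(ip, nowMs) {
    const entry = windows.get(ip);
    if (!entry || nowMs - entry.start >= windowMs) {
      // First request from this IP, or the old window has expired.
      windows.set(ip, { start: nowMs, count: 1 });
      return true;
    }
    if (entry.count < maxRequests) {
      entry.count += 1;
      return true;
    }
    return false; // over the limit for the current window
  };
}

// Example: at most 3 requests per second from any single address.
const allow = createRateLimiter(3, 1000);
const results = [0, 100, 200, 300, 1200].map((t) => allow("192.0.2.10", t));
console.log(results); // the fourth request is refused, the fifth starts a new window
```

A real deployment would more likely use a token bucket or the web server's built-in throttling, but the idea is the same: refuse the excess automatically rather than rely on manual intervention.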
* Prosecutor presents evidence to Grand Jury. This may include witnesses or documents.
* Grand Jury votes on whether there is enough there to approve the indictment
If they've got a computer crimes division, then they're going to have hacker types in the prosecutor's office to do this stuff and get the details right.
The indictment is going to be the most slam dunk part of the evidence that there is, as it's written by the prosecutor and there's no counter to it. If it doesn't look airtight, then it's probably a very weak case.
Though, looking at it here, it's not looking very good for aaronsw. The combination of MAC address spoofing and a locked wiring cabinet show physical and electronic security that was bypassed, repeatedly. That's easy to explain to a jury.
A funny omission is that the indictment never actually says that the wiring cabinet was locked, or how Aaron supposedly broke into it. I infer that it wasn't locked.
But, given the rest of the indictment, I'd think that the prosecutor would be sure to throw in "Swartz picked the lock on the restricted wiring closet in order to introduce the Acer computer," since it would incline the grand jury to be more likely to hand down an indictment — unless she knew this was false.
But there is an element of permission inherent in DHCP. Your device is actively configuring my device specifically to allow network access. It's not an open door; it's a sign saying "This way please". That said, it's obvious this was an attempt to circumvent access controls.
There's an element of permission inherent in a door! I mean, it's a breach in a wall, specifically put there at great additional expense, just to allow people entry! In either case, an enabling technology isn't inherently an invitation. Again, just because you CAN use a technology to do something, doesn't mean you MAY.
And with respect to the comment about "this way please", and the "actively configuring my device", please note that the client initiates the DHCP conversation with a Discovery message.
Discovery: "Can anyone give me DHCP info so I can set up my host, please?",
Offer: "Yes, I can, here's one configuration option!"
Request: "Yes please, that sounds good, I'll take it"
Ack: "Okay, you got it".
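The four-step exchange above can be modelled as a toy simulation. This is only a sketch: the message type names are the standard DHCP ones, but `createServer`, `runClient`, the addresses, and the MAC are invented for illustration, and no real networking happens. The point it shows is that the client sends the first message.

```javascript
// Toy model of the DHCP Discover/Offer/Request/Ack exchange.
// Addresses and MACs are made-up examples; lease checks are deliberately minimal.
function createServer(pool) {
  const leases = new Map(); // mac -> ip
  return {
    handle(msg) {
      if (msg.type === "DHCPDISCOVER") {
        // Offer the client's existing lease, or the first unleased address.
        const taken = [...leases.values()];
        const ip = leases.get(msg.mac) || pool.find((a) => !taken.includes(a));
        return ip ? { type: "DHCPOFFER", mac: msg.mac, ip } : null;
      }
      if (msg.type === "DHCPREQUEST") {
        leases.set(msg.mac, msg.ip);
        return { type: "DHCPACK", mac: msg.mac, ip: msg.ip };
      }
      return null;
    },
  };
}

function runClient(server, mac) {
  const log = [];
  const discover = { type: "DHCPDISCOVER", mac }; // the client speaks first
  log.push(discover.type);
  const offer = server.handle(discover);
  if (!offer) return { log, ip: null };           // pool exhausted
  log.push(offer.type);
  const request = { type: "DHCPREQUEST", mac, ip: offer.ip };
  log.push(request.type);
  const ack = server.handle(request);
  log.push(ack.type);
  return { log, ip: ack.ip };
}

const server = createServer(["192.0.2.10", "192.0.2.11"]);
const session = runClient(server, "02:00:00:00:00:01");
console.log(session.log.join(" -> "), session.ip);
```

Note that nothing in this exchange expresses terms of use or authorization; the server simply answers whoever asks, which is why the protocol by itself says little about permission.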
I'm unfamiliar with MIT's guest setup, but I assume they let you get an IP address, but before you can access anything, you have to acknowledge their terms of service / acceptable use policy. If you fail to abide by this, you'd be accessing the network without permission.
You're right that (as alleged), this would be an obvious attempt to circumvent access controls.
Given the way things are worded, I'm guessing that the MIT computer that was improperly accessed was a router or switch. Hell, just plugging into the switch directly could be construed as unapproved access to a computer device. I think the federal law treats anything with a processor as a computer.
Yeah, I think it's a bad argument to personify network protocols or imbue them with intent. Legally speaking, what is more important is the intent of the person using them.
A stranger would probably be allowed to enter the property unless you had posted No Trespassing signs. They'd be required to leave upon request but I'm not sure that simply entering your house via an open door would constitute a crime or tort. An open door could likely be construed as implied consent.
>I take issue with the notion that someone who is accessing a service through a public interface is criminally responsible for downtime if too high an access rate causes service degradation or an outage
Ah... well surely you've heard of people being prosecuted for denial of service attacks? Most recently, members of anon getting raided because they used LOIC? If you use a network in a way that is intended to degrade others' quality of service, even if you are just accessing things via normal protocols at a really high rate, you are breaking the law. In this case, it does not look like they are alleging that he intended to cause a service disruption, but they claim that he repeatedly circumvented measures that JSTOR put in place to halt his unauthorized activities, which caused service disruptions for other legitimate users and therefore denial of service.
I think 18 USC 1030 is pretty broad in its definition of a "computer". Back in the MBTA hacking case, MBTA claimed a magnetized piece of paper was a computer under this clause, and the first judge that looked at it bought that (sanity prevailed and that decision was later overruled). I wouldn't be surprised if they are considering the MIT network, or at least the routers that were configured to prevent his access, to be the protected computer in this case.
>I don't like the fact that just about any misbehavior on the internet becomes a federal case because the probability of no interstate resources being used is very low.
So is that why Comcast routes traffic to networks 30 miles away across three states and back?
“It’s even more strange because the alleged victim has settled any claims against Aaron, explained they’ve suffered no loss or damage, and asked the government not to prosecute,” Segal added.
JSTOR is being very vague about their role in this, so that might unfortunately be wrong about just how settled JSTOR considers things on their side. Their statement feels extremely carefully worded: http://about.jstor.org/news-events/news/jstor-statement-misu...
MIT (and the alleged disruption to other MIT JSTOR users' access) may also be relevant to the decision to charge.
My understanding is that MIT has a freewheeling attitude to information, but also deep organizational and funding links to national security institutions. So their hacker ethos might tell them to brush it off, while their federal relationships require them to take a tougher stance.
In terms of broader implications, I would actually have many fewer problems with MIT pressing straightforward trespassing charges. If he broke into a server room and messed with equipment there without the owner's permission, they could prosecute that under very ordinary state criminal law long predating the computer age.
It's the weird "stealing documents from JSTOR" federal case under the Computer Fraud and Abuse Act that's more worrying, because it's extremely vague what those kinds of charges can cover (in some interpretations, essentially any violation of a ToS).
"A place where all academic research that has been funded in part by public funds is published, journals be damned. Hopefully with deep pockets to fight off the lawsuits."
For NIH-funded studies, PubMed Central is actually doing a pretty good job. It would be great if PMC actually had a mandate to collect papers for all US federal grant-funded research, though.
This all hinges on what he was going to do with the documents. If he was looking to perform some large-scale analysis (such as he has done before) and publish the results academically, then this would fall under the academic mission of MIT, and therefore be legit. But if this were the case, why go through the hassle of hacking the system? Why not just ask JSTOR for cooperation? Or maybe he did, and they rejected it?
There has got to be more to this story, because I just can't for the life of me believe that he would download the documents to "free" them on the internet (as is alleged).
I wonder what they'll push for. He sounds pretty screwed if this evidence pans out. Looks like he could even end up with a few years' time if the prosecutors want.
And you have a right to demand it: by federal law, all publicly funded research should be in the public domain (unless national security blah blah blah...). Unfortunately this law isn't enforced except in some aspects through provisions of all recent NIH grants (and maybe others).
It is alleged that he signed up for guest accounts on their network with different laptops, changed his MAC address and re-registered if the IP he was using was blocked (by JSTOR) or cut off of the network (by MIT), and finally connected a laptop in a basement networking closet.
I guess you could say that is 'hacking' in the unauthorized access sense, but not in any meaningful sense. It isn't breaking and entering if someone repeatedly trespasses somewhere (say, banned from a store) even if they change their clothes to avoid detection.
He found ways to get around the (minor) protections put in place using a computer. That fits the colloquial definition of hacking. We don't own the term anymore - if we ever did.
This was struck down by the courts by the way. Why? If unauthorized access is a federal offense, and unauthorized access can be determined by (e.g.) a EULA, then it basically allows a business to dictate criminal law. For example:
1. This would allow all of those 'You agree that you are not a law enforcement official' B.S. 'terms of use' on warez servers to actually have teeth.
2. "If you are a {black,hispanic,gay,etc} person, you are not allowed to access my website."
Do any of these sound reasonable? I should hope not.
* How about "You are not allowed to hyperlink to this page?"
* How about that 'MySpace Hacking' case? From Wikipedia:
Judge Wu summed up his opinion by stating that allowing a violation of a
website's Terms of Service to constitute an intentional access of a computer
without authorization or exceeding authorization would "result in
transforming section 1030(a)(2)(C) into an overwhelmingly overbroad enactment
that would convert a multitude of otherwise innocent Internet users into
misdemeanant criminals." For these reasons, Judge Wu granted Drew's motion
for acquittal. Government eventually decided not to appeal [16].
There are two separate issues as well, whether or not a restriction is valid and the ease with which it is circumvented. Lots of EULA/TOS terms are unreasonably broad. But let's say the TOS says "use of service without a password is unauthorized". Seems reasonable.
Somebody runs a program to guess passwords until they find one that works. "Ah ha," they say, "I have a password, so my usage is now authorized." I don't think a judge or jury will buy that. Even if the password was easy to guess.
I believe that was SpikeGronim's point. Assuming the definition of unauthorized is legit, there's no such "it was too easy" defense. Or "I could do it, therefore I must have been allowed to do it."
How long until posting a negative comment to a blog is "unauthorized access" to that blog? Gaining access was easy: all you had to do was type a comment and hit submit. But some Powers That Be decided they didn't really want you to post that, so now it's a federal computer crime.
Speaking of "TV drama levels of understanding of criminal justice"... :)
With only a few exceptions, persons accused of crimes are not presumed to have mens rea. Statutory rape, for instance, has "strict liability"; even if you don't know you're committing a crime, you're liable. Most criminal offenses are not like this. The state is required to establish mens rea.
A prosecutor could say that a ToS-infringing blog comment is a criminal violation, but unless that prosecutor can establish that the comment was made in purposeful, knowing, or reckless violation of the ToS, they'd be wasting their time.
That's currently unsettled, and yes a lot of computer-crime experts (even typically law-and-order-leaning ones) consider it a huge problem--- it might actually be a federal crime to post a comment on a blog in violation of the blog's Terms of Service.
You're getting hung up on the "access" part, it's the "authorization" that's important. In your example it would likely be almost impossible for the blog owner to prove that someone intent on posting negative comments was not authorized to do so.
As a side note, I think we all should be more careful with "omg, we're losing our freedoms!" comments. They're usually not as clever as they feel at the time, and have the dangerous property of inuring us to such claims.
It isn't breaking and entering if someone repeatedly trespasses somewhere (say, banned from a store) even if they change their clothes to avoid detection.
It might not be breaking & entering, but it's still trespassing, which is a crime.
I'm not sure what you're appealing to. Real-world analogies don't always transfer readily to digital law. Even in the words of the ageing CFAA, "exceeding authorization" to steal commercial information is most certainly a punishable crime. You don't have to pull out nmap and zero days to be convicted of hacking.
22. On October 8, 2010, Swartz connected a second computer to MIT’s network and registered as a guest, using similar naming conventions: the computer was registered under the name “Grace Host,” the computer client name “ghost macbook,” and the throw-away e-mail address “ghost42@mailinator.com.”
23. The next day, October 9, 2010, Swartz used both the “ghost laptop” and the “ghost macbook” to systematically and rapidly access and download an extraordinary volume of articles from JSTOR. The pace was so fast that it brought down some of JSTOR’s computer servers.
Also mentioned is that Swartz used the network closet to take 2 IP addresses. Are we to infer he hooked up both laptops with an IP each? Or is the real scenario that he was using one laptop under 2 identities? (Not that impossible.)
How is this not "unauthorized access" "in any meaningful sense"?
They blocked his IP, they blocked his MAC, and he hid a machine in a wiring closet to get on MIT's network. What would he have to do to make it "meaningful"?
Crack a password. Use SQL injections. Steal a credit card. Spoof someone else's MAC and IP. Steal a cookie. Something like that.
He was accused of using a guest network account on MIT, with a fake name, new MAC and IP, and throw-away email address. From there, he used a script to download lots of JSTOR documents.
This isn't the internet equivalent of "checking out too many library books". It's the internet equivalent of "checking out too many library books whilst wearing a false mustache".
Nah. It should only be illegal if normal people can't do it. Most of what Aaron did is stuff lots of people do.
There's a difference between wearing a dummy badge that says "I am Gary Host", a badge that incorrectly says "I am Bill Gates" (as that would be some kind of identity theft) and forging a passport in "Gary Host"'s name. What aaronsw did was far closer to the first.
Now, you could argue that scripts are power tools, and using them requires a higher standard of behavior. If you are flying a plane, giving dummy credentials over the radio is a lot more serious than a kid with a toy CB radio.
Even then, the dummy credentials didn't really cause any damage. The damage was done by the script itself. Even if he used his real name, the damage would have been done.
I did not say that it was not unauthorized access, just that hacking (in the breaking into computer systems sense) involves actually breaking in.
Changing a MAC address (on a device that you own and control) does not constitute hacking. It is as simple as ifconfig(8) and if you have a consumer router it probably has an option in the web interface to do it.
The indictment asserts that Mr. Swartz intended to distribute the files downloaded but did not substantiate this claim. I wonder what proof they have of this? (There are, of course, a great many laws dealing with probable intent that need only convince a jury of said intent without demonstrating its validity.)
That was what he did in that case, but it wasn't what he did with the four hundred thousand law review articles he analyzed for conflicts of interest more recently.
The indictment doesn't have to provide all the evidence presented to the Grand Jury, and the Grand Jury process itself is secret (the actual court case won't be).
I've heard that, in drug law, possession of more than a certain amount is automatically interpreted as intent to distribute. Maybe the prosecutors in this case intend to argue something similar based on the nature or quantity of the documents retrieved.
Right... the real question seems to be what was he going to do with the files. The indictment doesn't really offer any evidence to this, and I doubt Aaron will be doing much talking until his defense.
This is not the first time he has done something like this if memory serves me. In late 2008 Mr. Swartz and Carl Malamud went to select libraries, ones with free PACER access, and proceeded to download ~700 GB of information that was behind a paywall. After which they made all of it available on Mr. Malamud's website.
Cambridge, MA– Moments ago, Aaron Swartz, former executive director and founder of Demand Progress, was indicted by the US government. As best as we can tell, he is being charged with allegedly downloading too many scholarly journal articles from the Web. The government contends that downloading said articles is actually felony computer hacking and should be punished with time in prison.
“This makes no sense,” said Demand Progress Executive Director David Segal; “it’s like trying to put someone in jail for allegedly checking too many books out of the library.”
“It’s even more strange because the alleged victim has settled any claims against Aaron, explained they’ve suffered no loss or damage, and asked the government not to prosecute,” Segal added.
James Jacobs, the Government Documents Librarian at Stanford University, also denounced the arrest: “Aaron’s prosecution undermines academic inquiry and democratic principles,” Jacobs said. “It’s incredible that the government would try to lock someone up for allegedly looking up articles at a library.”
Demand Progress is collecting statements of support for Aaron on its website at …URL…
“Aaron’s career has focused on serving the public interest by promoting ethics, open government, and democratic politics,” Segal said. “We hope to soon see him cleared of these bizarre charges.”
Demand Progress is a 500,000-member online activism group that advocates for civil liberties, civil rights, and other progressive causes.
About Aaron
Aaron Swartz is a former executive director and founder of Demand Progress, a nonprofit political action group with more than 500,000 members.
He is the author of numerous articles on a variety of topics, especially the corrupting influence of big money on institutions including nonprofits, the media, politics, and public opinion. In conjunction with Shireen Barday, he downloaded and analyzed 441,170 law review articles to determine the source of their funding; the results were published in the Stanford Law Review. From 2010-11, he researched these topics as a Fellow at the Harvard Ethics Center Lab on Institutional Corruption.
He has also assisted many other researchers in collecting and analyzing large data sets with theinfo.org. His landmark analysis of Wikipedia, Who Writes Wikipedia?, has been widely cited. He helped develop standards and tutorials for Linked Open Data while serving on the W3C’s RDF Core Working Group and helped popularize them as Metadata Advisor to the nonprofit Creative Commons and coauthor of the RSS 1.0 specification.
In 2008, he created the nonprofit site watchdog.net, making it easier for people to find and access government data. He also served on the board of Change Congress, a good government nonprofit.
In 2007, he led the development of the nonprofit Open Library, an ambitious project to collect information about every book ever published. He also cofounded the online news site Reddit, where he released as free software the web framework he developed, web.py.
Press inquiries can be directed to demandprogressinfo@gmail.com or 571-336-2637
“This makes no sense,” said Demand Progress Executive Director David Segal; “it’s like trying to put someone in jail for allegedly checking too many books out of the library.”
No it's not. It's like sneaking into the library at night and making photocopies of all the books. Then, upon getting caught, the perpetrator sneaks back into the library in a different disguise and continues to photocopy more books. Repeat this action of getting caught and sneaking back in a few more times and combine this with the fact that his downloading of documents affected JSTOR performance for other legitimate users of the archive and you get a sense of what he's really done.
How is this excusable?
I'm completely onboard with those who claim that we need some reform in scientific publishing, but Aaron's actions smack of low ethical standards to me, not to mention extremely poor judgement on his part.
EDIT: Hi downvoter! Can you please explain why you think I'm wrong?
Not a downvoter, but the way governments are reacting to those demanding transparency is what I call draconian. This is no longer symmetric opposition, this is a way of terrorizing those who want a little more freedom. Faced with obstacles as the one Aaron faced, I would do the same. So sue me.
Disobedience is not the same as terrorism although they would like you to think so. Disproportionate punishments are what I term a terrorist act. Stop being such a conformist and stop using their language. Every time you test these boundaries and fight for it you will be fighting for your freedoms.
Fair enough. I agree with most of what you're saying here. I definitely believe that any punishment involving more than a fine and/or some community service would be disproportionate to the crime here.
What I didn't like was the statement portraying him as some kind of hero. I'm just pointing out that he's not. He didn't have a legal right to be using the documents, he shouldn't have been trying to download the documents using a guest account at MIT obtained by submitting false information, and he certainly shouldn't have tried to get back into the network after being banned multiple times.
If his goal was to put a lot of scientific papers into the public domain, I can think of many other ways he could have achieved this. So I'm also a bit puzzled by his approach here.
Stop being such a conformist and stop using their language.
This attack seems rather uncalled for.
Every time you test these boundaries and fight for it you will be fighting for your freedoms.
I can see where you're coming from on this, but I'm not entirely convinced that you're right.
Anyone on MITNet has access to JSTOR articles free of charge to the user. Similarly the ACM, IEEE, etc. all have agreements like this with major universities.
JSTOR offers packages, and any articles not included in those packages can be purchased by individuals. Most choose instead to request the article via interlibrary loan, and you'd usually have it within 24 hours.
Yeah, I was laughing about how an Acer laptop took down their service and did damage to their network. If I was JSTOR, I wouldn't prosecute just because it makes our company look ridiculous.
Because, of course, JSTOR plans for 100x usage spikes and performs neither logging nor accounting, so serving a file is just as simple for them as downloading it is for the client.
> Does anyone think it's odd that an Acer laptop could write these files to disk faster than JSTOR could serve them?
Why would that be odd? SATA = 3 Gbps throughput with minimal overhead, Ethernet = 1 Gbps with lots of overhead (IP headers, Ethernet headers, HTTP headers)
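To put rough numbers on that comparison, here is a back-of-the-envelope sketch. The link rates are the ones quoted above; the efficiency factors are my own illustrative assumptions (8b/10b encoding for SATA II, and a guessed allowance for Ethernet/IP/TCP/HTTP overhead), not measurements. Note that this only compares interface ceilings: a laptop drive's sustained write speed can be well below what the SATA link allows.

```python
# Nominal link rates in bits per second (figures quoted in the comment above).
SATA_BPS = 3e9
ETHERNET_BPS = 1e9

# Assumed, illustrative efficiency factors (not measured values).
SATA_EFFICIENCY = 0.8       # SATA II uses 8b/10b encoding: 8 data bits per 10 line bits
ETHERNET_EFFICIENCY = 0.9   # hypothetical allowance for Ethernet/IP/TCP/HTTP headers


def usable_bytes_per_second(link_bps, efficiency):
    """Convert a nominal bit rate into an approximate payload rate in bytes/s."""
    return link_bps * efficiency / 8


disk_ceiling = usable_bytes_per_second(SATA_BPS, SATA_EFFICIENCY)
network_ceiling = usable_bytes_per_second(ETHERNET_BPS, ETHERNET_EFFICIENCY)

print(f"SATA interface ceiling:     {disk_ceiling / 1e6:.1f} MB/s")
print(f"Ethernet payload ceiling:   {network_ceiling / 1e6:.1f} MB/s")
print(f"Disk/network ceiling ratio: {disk_ceiling / network_ceiling:.2f}")
```

Under these assumptions the disk interface has more than twice the headroom of the network link, so the network (or the server on the other end) would be the bottleneck rather than the laptop's disk interface.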
It sounds like JSTOR's servers aren't really optimized for high article download rates, to the point that his one laptop accounted for a significant part of normal continental US load. They probably had some bottleneck in the system that they never noticed before — maybe their logging infrastructure was absurdly slow or something.
Did it specify that he was writing to local disk? If I did a stunt like this, I'd probably try to write to some cloud storage like Amazon S3. I'd bet that Amazon's servers can drown JSTOR's, particularly if there's a big pipe like MIT's in-between.
It's hard to trivialize downloading 4 million articles using a web scraper's bag of tricks and then some. If the information was publicly accessible these charges wouldn't stand unless he tried to distribute it. If it was something so commendable, why would you cloak your activities or go to a different university to do your dirt instead of Harvard (where you're a fellow of some sort) or Stanford (where you attended)? Regardless of the motives and ideals or the excess of the charges, this isn't one of those hapless grandma versus the RIAA stories. He must have known what he was doing.
The pricing and restrictions on the dissemination of academic papers is by any rational evaluation nothing short of ridiculous and contradicts the academic ideal of free exchange of ideas for the advancement of knowledge. However, history of scholarship is also a history of patronage, academic politics and in-fighting for greater prestige.
It's sad that someone like Aaron has to be treated like a domestic terrorist. It's sad that we have a vindictive justice system willing to flout the Constitution in this day and age with what effectively amounts to cruel and unusual punishment so they can "make an example" out of someone.
However, it's no one's fault that Aaron was so emboldened to take this initiative without sufficiently ensuring that he would be free from criminal prosecution.
Am I alone in thinking that these "hacktivists" will only prompt government to push more frivolous data theft laws and heavier punishment for offenses that may one day victimize hapless, innocent people? It's going to get a lot worse before it gets better.
> However, it's no one's fault that Aaron was so emboldened to take this initiative without sufficiently ensuring that he would be free from criminal prosecution.
Maybe he did it knowing full well what the consequences might be. He seems to be a pretty principled guy.
Demand Progress is an organization Aaron co-founded. They've done some great watchdog work on things like PROTECT IP, the Patriot Act, the Internet Blacklist Bill etc.
There is a riser closet in my office with various internet service providers wiring in it feeding the entire building.
If I were to enter this riser closet and plug my laptop into one of these lines, I would be charged with theft of service and deservedly be sent to jail. It doesn't matter if the door is locked or not. It doesn't matter what kind of security they put in place or not. It doesn't matter if I only sent a few bytes of data on their network and didn't harm anyone else's service. It is still theft of service.
Why the hell is MIT stashing information in closed systems in the first place? I thought the idea (OCW etc.) was to enable more people to learn, participate and benefit from work of academics and researchers. Hell I even donate a few hundred bucks every now and then to OCW.
It is mind boggling how the supposedly smart people are not getting their heads out of their asses so late in a world frighteningly short on distribution of knowledge that can be effectively used to solve the wicked problems that are crippling it for so long.
We really need a global, openly accessible knowledge network and a platform where all eligible can contribute and collaborate to research at least when it comes to areas that impact human society at large - medicines, natural resources etc. It is hard otherwise to see how things like Cancer and Energy shortage can be tackled.
But is it clear that MIT does not also store its research papers in JSTOR instead of making them publicly available? Reading the article I did not get that impression.
While many researchers at MIT publish in journals that sometimes have restrictions on the distribution of their papers (exclusivity, etc.), most MIT publications are issued via:
1) DSpace http://dspace.mit.edu (an open publishing platform for academic material. Most all theses produced by MIT researchers can be accessed by the public here)
2) as an MIT technical report (mostly published via dspace)
Certainly, a huge number of MIT publications do end up being mirrored in JSTOR, but they are usually published via whatever journal or conference proceeding they are accepted to first. If an author cannot find a journal that is willing to publish their paper, then they will probably issue it as an MIT technical report.
Ok, that's actually useful information. I find MIT's approach fairly reasonable - maybe all others should follow suit and JSTOR-like walls would not be necessary some day.
I donate to OCW as well (LOVE OCW) but I don't think any of this information is on a closed system. I believe JSTOR allowed all MIT IP addresses access (for free) to every article in their system.
Aaron's downloads came from an MIT address to the JSTOR database.
I agree with you that knowledge needs to be more accessible, but this is not the method to achieve it.
can someone please explain what the deal is here for us uninitiated? sounds like they are throwing the book at him for stealing books? seriously? why is the prosecution being so aggressive? did he profit from it or something? this sounds so petty
The cynic in me suspects that he has annoyed someone in power with his many other political activities (Demand Progress etc). When I first had contact with Aaron he was just this guy who wrote rss2email, it's been very inspiring to see him move on to hacking politics.
He violated Terms of Service for a large database of journals at an egregious rate. If he had stolen say...10,000 articles this may be a non-issue. The fact that they can tie 4,000,000 articles to him is what makes this so bad.
The other part that is extremely bad is his measure to continually evade MIT MAC banning through spoofing. That measure of evasion proves, to some extent, his intent to steal and is a form of computer fraud.
I wouldn't be too surprised if him causing a month (months?) long interruption of JSTOR for an entire campus deeply involved in government contracts was also part of the ire.
Well, I'm going to guess that the prosecution is putting that in there just in case they can get the jury to agree. They can claim anything they want; the jury has to decide if it's true beyond a reasonable doubt.
Um, I don't know, maybe because he wrote a manifesto to that effect and has circulated various methods for doing so? He was sponsoring a google group for article requests for a while. Seems to be gone now.
Wait a minute. All he needed was a guest account to access JSTOR? That's like saying, ANYONE IS ALLOWED TO DOWNLOAD FROM JSTOR. This isn't just bad security, this is no security.
Most academic journal repositories grant institutional access based on IP address blocks. Some institutions keep this narrow and force you to use an HTTP proxy, which gives them the ability to put additional institutional-level authentication in place. The upside is that you can access the journals from off-network, the downside is using the proxy can be a major pain. Other networks, e.g. MIT, are permissive with the IP restriction and don't mandate the use of one centralized proxy as long as you are on-network. I suppose that may change after this case.
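For anyone wondering what "access based on IP address blocks" amounts to in practice, here is a minimal sketch using Python's standard ipaddress module. The address blocks are reserved documentation ranges standing in for a campus allocation, and the function name is made up for illustration; I have no knowledge of how JSTOR or any particular vendor actually implements the check.

```python
import ipaddress

# Hypothetical institutional allowlist. These are reserved documentation
# ranges (RFC 5737), used here as stand-ins for a real campus allocation.
LICENSED_BLOCKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_licensed(client_ip: str) -> bool:
    """Return True if the client address falls inside any licensed block."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in block for block in LICENSED_BLOCKS)


# A request from inside a licensed block is served with no login at all.
print(is_licensed("192.0.2.17"))    # on-network address
# A request from anywhere else would be sent to a paywall or proxy login.
print(is_licensed("203.0.113.5"))   # off-network address
```

The point of the sketch is that nothing in such a check identifies a person: whoever holds an address in the licensed range, including a guest who was just handed one, looks the same as any other campus user.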
"Our ultimate long-term objective is to make JSTOR available to everyone who wants access to it, while doing so in a way that ensures sustainability of the service."
Cynically, it seems like the bit about "ensures sustainability" can be translated as "we will aggressively prosecute in order to protect our bloated salaries."
I'm probably Advocating Crime but, couldn't a bunch of people coordinate and do what he was doing, over a year or two, distributed, from several universities? You totally could. The more, the less noticeable, and punishable.
edit: it wouldn't surprise me if something like that showed up; the problem has been highlighted, the legal issues made clearer, and JSTOR bloodied. Go Aaron.
There's a loosely-organized project at Wikimedia Commons to liberate at least the public-domain works locked up in JSTOR. This almost certainly violates JSTOR access policies, but once downloaded and stripped of their JSTOR title page, it's probably not illegal for the Wikimedia Foundation to host the result, since the result is in the public domain, even if ToS were violated in the document's acquisition.
So he allegedly goes out and buys a laptop just to do this heist...and then he blows his cover by doing a scrape fast enough to apparently bring down some of the MIT servers? Why was he in such a rush?
Most likely the servers crashed for whatever reason, and they looked at the logs and saw lots of recent accesses from a particular computer, and post hoc ergo propter hoc. Sysadmins often blame outlier users for crashes.
JSTOR Statement: Misuse Incident and Criminal Case
The United States Department of Justice announced today the criminal indictment of an individual, Aaron Swartz, on charges related to computer fraud and abuse stemming from his misuse of the JSTOR database. We have been subpoenaed by the United States Attorney’s Office in this case and are fully cooperating. While we cannot comment on this case, we would like to share background information about the incident and about our mission and work with the academic community and the public.
What Happened
Last fall and winter, JSTOR experienced a significant misuse of our database. A substantial portion of our publisher partners’ content was downloaded in an unauthorized fashion using the network at the Massachusetts Institute of Technology, one of our participating institutions. The content taken was systematically downloaded using an approach designed to avoid detection by our monitoring systems.
The downloaded content included over 4 million articles, book reviews, and other content from our publisher partners’ academic journals and other publications; it did not include any personally identifying information about JSTOR users.
We stopped this downloading activity, and the individual responsible, Mr. Swartz, was identified. We secured from Mr. Swartz the content that was taken, and received confirmation that the content was not and would not be used, copied, transferred, or distributed.
The criminal investigation and today’s indictment of Mr. Swartz has been directed by the United States Attorney’s Office.
Our Mission and Work
Our mission at JSTOR is supporting scholarly work and access to knowledge around the world. Faculty, teachers, and students at more than 7,000 institutions in 153 countries rely upon us for affordable and in some cases free access to content on JSTOR. Since our founding in 1995, we have digitized the complete back runs of nearly 1,400 academic journals from over 800 publishers. Our ultimate objective is to provide affordable access to scholarly content to anyone who needs it.
It is important to note that we support and encourage the legitimate use of large sets of content from JSTOR for research purposes. We regularly provide scholars with access to content for this purpose. Our Data for Research site (http://dfr.jstor.org) was established expressly to support text mining and other projects, and our Advanced Technologies Group is an eager collaborator with researchers in the academic community.
Even as we work to increase access, usage, and the impact of scholarship, we must also be responsible stewards of this content. We monitor usage to guard against unauthorized use of the material in JSTOR, which is how we became aware of this particular incident.
"Our mission at JSTOR is supporting scholarly work and access to knowledge around the world. Faculty, teachers, and students at more than 7,000 institutions in 153 countries rely upon us for affordable and in some cases free access to content on JSTOR. Since our founding in 1995, we have digitized the complete back runs of nearly 1,400 academic journals from over 800 publishers. Our ultimate objective is to provide affordable access to scholarly content to anyone who needs it."
AHAHAHAHAHAHAHA, at 50K a year? When they won't even sell it to institutions they don't consider proper schools? Somebody needs to re-evaluate their mission statement...
50k is cheap as hell. UMass Boston had 125,000 JSTOR downloads last year. Let's assume we pay $50k (close enough), that's 40 cents an article. This is cheap as hell compared to other databases that charge more than that for a single search!
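Spelling out that arithmetic, using the two figures from the comment above (the $50k fee is itself described as an approximation, so treat the result as a ballpark):

```python
# Figures taken from the comment above; the fee is an approximation ("close enough").
annual_fee_usd = 50_000
downloads_per_year = 125_000

# Average cost per downloaded article.
cost_per_article = annual_fee_usd / downloads_per_year

print(f"${cost_per_article:.2f} per article")
```

This is an average over actual downloads, so it drops further for an institution that uses JSTOR more heavily and rises for one that uses it less.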
It's further harming the possibilities of being an independent scholar or even reading academic literature outside academia, though, which I think is a significant detriment to academia. I'm not as worried about the $50k a university library has to pay, but if you're an independent scholar for even one year, it becomes clear how harmful to the research community JSTOR is.
Journals that used to sell archive access to individuals for, say, $50 or $100 annually, now won't sell you a membership at all, because to save money on hosting they've moved their subscription infrastructure to JSTOR, and JSTOR refuses to sell individual subscriptions. So you end up "stealing" your access; at various times I've gotten my JSTOR access via ssh -D proxies to a friendly grad student. Now I try to return the favor by providing such access to independent scholars where needed. This sort of gaming shouldn't be necessary with a non-profit organization that is supposed to be working in the public interest, though. Hell, public domain journals from the 18th century that were scanned using public grant money are locked up behind a JSTOR paywall!
Many university libraries allow you to join as a "friend" of the library, which typically costs ~$100/year and gives you full access to all library resources.
I see your point, but you could just as easily walk into your local university, use their network as a guest, and access millions of dollars worth of content for free. Most databases allow you to download PDFs so you can download a bunch of articles and take them home, legally. Proxying through your grad student friends is breaking the law.
Even among state universities access to electronic journals for guests is becoming increasingly rare (usually prohibited in the license agreement). Also if you look too much into how scholarly communications works, the idea of charging 50k is insane since the vast majority of the cost of journals is actually paid for by universities in the first place.
edit: although I do agree with your point that with the current state of things the cost per usage of JSTOR is significantly less than most other electronic journal collections
I would be impressed if it's an actual trend that guests were being blocked from accessing licensed content.
I work in this field and I can say that almost all databases and journals have IP authentication, generally with the entire campus IP range white listed. If the vendors don't implement that, they generally use password protected accounts. There are some alternatives such as uploading a list of barcodes, Referring URLs (I'm not kidding ProQuest for example allows this), Athens, Shibboleth, and a few others. These, however, are not commonly implemented by libraries because they are not widely adopted by vendors and require additional IT support that libraries simply don't have.
Therefore, unless libraries proxy all their users through their proxy servers (They don't for on-campus users. Usually, all links will go through the proxy, but if on campus the proxy will redirect you directly to the database to avoid the overhead), it would be near impossible to enforce the restriction of guests. I would wager that in many of these scenarios, it's all smoke and mirrors, and that access is really there if you know what you're doing (e.g. just visit the vendor site while on-campus without going through the library links).
I've been at at least one university (a smallish liberal-arts college) that used a separate foo_guest wifi network for guests, which dumped people into an IP range that wasn't in the database-access whitelist (enrolled students and faculty would connect to a different wifi network, using their username/password, that was whitelisted). I agree that that isn't usual practice, though, despite being technically required by some database agreements.
The configuration I've seen is to have not only guest accounts but also guest computers which are on a separate IP range, same with the guest wireless network. The important issue is actually what the details of your license agreement say regarding guest users. Some state universities will actually request that this be specifically allowed, since in many states it's considered, even if unofficially, a right of taxpayers to have access if they need it.
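To make the IP-range approach described in the comments above concrete, here is a minimal sketch of how a vendor-side whitelist check might look. The address ranges are reserved documentation blocks chosen purely for illustration, and the names `LICENSED_RANGES`, `GUEST_RANGES`, and `is_licensed` are hypothetical; nothing here reflects how any particular vendor or campus actually does it.

```python
import ipaddress

# Hypothetical ranges (reserved documentation addresses), not any real campus.
LICENSED_RANGES = [ipaddress.ip_network("192.0.2.0/25")]   # authenticated campus network
GUEST_RANGES = [ipaddress.ip_network("192.0.2.128/25")]    # guest wifi and guest computers


def is_licensed(client_ip: str) -> bool:
    """Return True only if the client address falls inside a whitelisted range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in LICENSED_RANGES)
```

With this layout, a machine on the guest range is simply never matched by the whitelist, so no per-user check is needed; the enforcement comes entirely from which network a person is allowed to join.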
But many state schools let you join the library even if you're not a student. e.g. http://www.lib.iastate.edu/info/6197 - $20/year for a non-affiliated member.
They are probably bound by contracts with the actual content owners which limit who they can give access to. Not saying it's right but even the price is probably not all their doing.
This seems contrary to Demand Progress's statement:
“It’s even more strange because the alleged victim has settled any claims against Aaron, explained they’ve suffered no loss or damage, and asked the government not to prosecute,” Segal added.
I got the impression Demand Progress was referring in that sentence to MIT as the "alleged victim", which seems slightly disingenuous to me. It also seems plausible that MIT would behave in the manner described. However I have not yet seen a statement from MIT that would corroborate this, either.
And even if it were true, the "alleged victims" don't get to decide whether the government prosecutes. The only cases where the victim can force the prosecution to abandon their case are those crimes such as spousal abuse where the only hard evidence is the testimony of the victim themselves. In other cases, the government can get all of the evidence it needs through subpoena.
That sentence may have been updated to clarify:
“It’s even more strange because JSTOR has settled any claims against Aaron, explained they’ve suffered no loss or damage, and asked the government not to prosecute,” Segal added.
This sounds very hypocritical. If their "Mission and Work" is to support scholarly work and academic usage, and encourage usage of large sets of data, why are they suing Aaron's ass instead of working with him to further understand what were his needs that their system was lacking and how they could reduce that gap? Why else would Aaron hack their system if not for research?
Sounds like instead of spending money on improving their system and security, they found a more beneficial (in their eyes) approach: charging Aaron to set an example.
Edit: see @mbreese's correction.
Edit 2: looks like my comment missed some important pieces of the puzzle. See further down the thread.
Here's what's supposed to happen: A student on campus wants to do a research project analyzing a large number of journal articles, or even just the metadata around a large number of journal articles. They approach their librarian, the librarian approaches the journal vendor (JSTOR or someone else), and everyone works together to find a way to get the student their data. Maybe the vendor hands over a special dataset. Maybe they give them back-end access to their database. Maybe they allow the student to run a Python script against their website, but only in the middle of the night so as not to slow down service for other users.
Here's what sometimes happens instead: The student writes and runs a clever script, the vendor notices that their servers have slowed due to automated script activity on their webpages, shuts down access from that IP address, and lets the school know. University IT staff and librarians drop what they're doing and try to track down the party responsible. Once they've been identified, the nice librarian has to have a talk with the student about what's permitted under the university's license agreement with the publisher, and together they go to the data vendor to ask forgiveness and permission. They usually get it.
Here's what Aaron did: He walked onto the MIT campus, set up a script not to analyze metadata but to actually download large numbers of documents, and when his IP was blocked, he used traditional hacking as well as Johnny Long-style "no-tech hacking" to get around it.
http://video.google.com/videoplay?docid=-2160824376898701015
Publishers, rightly or wrongly, assume that someone systematically downloading entire journal runs is intending to set up a shadow database to give away their content for free. The feds seem to agree, in this case.
The thing that gets me is that JSTOR was more than reasonable in this case. They didn't immediately shut down the whole campus (like some overly-aggressive publishers do), but started with the IP addresses involved. When it didn't stop, they had to cut off access completely. Aaron, who isn't even a student at MIT, managed to kick the whole campus off JSTOR for weeks, and as soon as they restored access, he went at it again. If I were a librarian at MIT, I'd want the book thrown at him.
It's excessive, and I don't think that's what he'll serve. But he was warned multiple times that MIT and JSTOR didn't want him doing what he was doing, and he continued doing it, so as far as I can tell he's getting exactly what he asked for. He probably thinks he's the Rosa Parks of scholarly communication. I think he's reckless and grandiose.
Swartz's organization seems to pretty clearly claim that both MIT and JSTOR asked the feds not to charge him. While JSTOR's statement doesn't actually say thatº, it certainly seems plausible to me since they very clearly say that they privately engaged and dealt with the matter with Swartz. You simply don't do that if you're pursuing criminal charges.
º I would probably be hesitant to say that publicly as well, considering the feds came down with a monster 35-year indictment against a juicy target. The "victim" announcing on Day 1 that they don't support the charges would not be accepted very charitably by federal prosecutors.
Unless JSTOR is itself at risk of prosecution for something, it doesn't make a whole lot of sense to me that they'd care how Federal prosecutors felt about their public statements.
I immediately thought of that book as well. As Silverglate frequently points out, the federal attorney is using extreme penalties to force a plea deal in this case. It's de facto denial of a fair trial when the Sword of Damocles (i.e. 35 years in prison) is hanging over your head.
That does shed new light on their statement. Seems like both parties made some lousy calls which led to a very ugly situation.
I don't know what legal mechanisms, if any, would keep JSTOR from showing public support for Aaron, but I hope they will show some support during the process (and MIT as well). The government shouldn't put digital theft in the same basket as physical theft.
There are no legal mechanisms, but showing support for someone that intended to destroy the value of their publisher's copyright would probably displease their partners...
Why did Aaron hack their system instead of working with them to further understand what were his needs that their system was lacking and how they could reduce that gap?
I don't understand why he was so desperate to access it via MIT. There are dozens of libraries in the Boston area with access to JSTOR with guest access to their network. If he wasn't in such a rush, he could have easily bounced around campuses and likely have avoided detection.
If the indictment is accurate, he did throttle back his script a bit, and continued to run it for several months after the last JSTOR server crash or blockage.
He is the author of numerous articles on a variety of topics, especially the corrupting influence of big money on institutions including nonprofits, the media, politics, and public opinion. In conjunction with Shireen Barday, he downloaded and analyzed 441,170 law review articles to determine the source of their funding; the results were published in the Stanford Law Review. From 2010-11, he researched these topics as a Fellow at the Harvard Ethics Center Lab on Institutional Corruption.
He has also assisted many other researchers in collecting and analyzing large data sets with theinfo.org. His landmark analysis of Wikipedia, Who Writes Wikipedia?, has been widely cited.
Interestingly, while the Grand Jury Indictment does discuss evidence for many of the other accusations, they do not discuss any evidence for this intent-to-distribute claim.
If all academic papers were released under a Creative Commons license, would this even be an issue?
Aside from the allegations about breaking into various physical hardware infrastructure at MIT, wouldn't that be like being charged with downloading too many Jonathan Coulton albums?
> The indictment alleges that Swartz, at the time a fellow at Harvard University, intended to distribute the documents on peer-to-peer networks. That did not happen, however, and all the documents have been returned to JSTOR.
This is beyond belief. If the prosecutor tries enough charges, some of them will stick, especially before a jury who may think this is a hacking case or a file sharing case. He'd do well to avoid the "hacktivist" label in court.
There was some poor judgment on display and I generally don't condone breaking the law, but it would be a shame to lose Aaron's epic productivity and ingenuity for any period of time.
I don't understand why the US Attorney is suggesting he be charged with such ridiculously inflated charges.
Reviewing the Indictment report, "JSTOR did not permit users... to download all of the articles from any particular issue of a journal." Further, "JSTOR notified its users of these rules, and users accepted these rules when they chose to obtain and use JSTOR’s content."
So basically JSTOR is claiming Aaron violated their terms of use. Their terms of use are likely an adhesion contract with a passive shrinkwrap notification. (Remember ReasonableAgreement.org?)
It is not certain that Aaron did in fact agree to such terms, or what consequences doing so and then violating said terms should have. Regardless, JSTOR proclaiming certain terms may not be sufficient to deny Aaron his rights as a consumer, citizen, and human.
The report goes on to claim that Aaron took action to "avoid MIT’s and JSTOR’s efforts to prevent this massive copying". MIT and JSTOR allowed users to access their network, with no system in place to ensure that a user was a student (by design, as MIT admits) or that they were using their real name (or a single MAC address). A researcher accessing JSTOR is really less of a concern than other potential types of access, so perhaps this is not a good system. The report suggests Aaron took action to "elude detection and identification", but courts have held that anonymous speech and action are valid parts of society. They take issue with his using a Mailinator address, but such an email address is just as valid as any other and simply allows others to read one's mail.
The report whines that the "rapid and massive downloads and download requests impaired computers used by JSTOR to service client research institutions". This inconveniencing of other users could have been avoided and the blame for how JSTOR allocates resources lies with the architects of JSTOR.
MIT acted to ban the IP ranges that they believed were in violation of their rules. Users were to use the network to support MIT’s research, or at least not obstruct it. However, very likely Aaron was conducting research. Any hindrance to other users may have been the responsibility of MIT's infrastructure team. They further request that users "maintain the system’s security and conform to applicable laws, including copyright laws", seemingly suggesting Aaron was in violation of copyright. Very importantly, MIT should remember that when it comes to copyright "Reproduction for purposes such as criticism, comment, news reporting, teaching, scholarship, or research, is not an infringement of copyright." The last point of MIT's rules is that users "conform with rules imposed by any networks to which users connected through MIT’s system", which makes little practical sense and is certainly selectively enforced. Assuming a JSTOR web server is now a network, so is my personal web server. On all HTML files on my webserver I link to a ReasonableAgreement-style notification that no user may browse such files between 8am and 11pm EST. Any MIT student, faculty member, or guest who connects during those hours is in violation and should be kicked by MIT, for if a rule is to be fair it should be consistently enforced. This third rule is simply a CYA clause and its selective enforcement is arrogant.
To conclude, the document does suggest that perhaps Aaron did violate the JSTOR terms of use for their website. When a normal business decides to deal with a violation of their terms of use they deactivate that customer's accounts. However, Ithaka Harbors Inc., a “non-profit” organisation (President's yearly compensation is over $400,000 – see their 2009 Form 990 at http://www.guidestar.org/FinDocuments/2009/133/857/2009-1338... ) funded by the Mellon Foundation and the Gates Foundation and committed to “the core values of higher education” and to a “deep understanding of technology” (ha), decided to alert the feds. As to MIT's claims that Aaron broke their rules for using their internet connection, it seems he neither obstructed MIT's research nor violated laws or copyright laws. As to their third claim, he may have violated the terms of a "network" but so do a significant portion of MIT’s users every day.
Why is the Obama administration pursuing an investigation in Wire Fraud and Computer Fraud?
Frankly, I think most of the claims in your post are bonkers, but this one sticks out:
This inconveniencing of other users could have been avoided and the blame for how JSTOR allocates resources lies with the architects of JSTOR.
"During November and December, 2010, Swartz used the "ghost laptop" (i.e., the Acer laptop) at MIT to make over two million downloads from JSTOR. This is more than one hundred times the number of downloads during the same period by all the legitimate MIT JSTOR users combined."
So we're entirely clear on this: you are saying this is reasonable and that the majority of the blame for any disruption lies with JSTOR. You are saying that JSTOR should have anticipated that a single legitimate user would have placed a hundred times the demand on the system than the entire population of a major university.
Deciding whether or not Aaron's actions were "wire fraud" based on quantity doesn't really make sense. Saying he downloaded more than other users at MIT is really just saying other people requested fewer files than he did. I remember when I needed academic papers, I stopped using my institution's access to JSTOR because frequently the same paper could be found (legally) using Google Scholar or similar. As academia switches to things like arXiv, JSTOR will be used less and less. I understand that 2 million is orders of magnitude greater than an average individual might request, but basing the legality on quantity is flawed. If one file is accessed illegally, you have broken the law. If 2 million are, you have also broken the law.
As to being entirely clear, yes, I would have expected the employees at JSTOR (like the VP of technology making $300k) to have thought of possible ways to rate limit a connection. Just as you can prevent DDOS with something like iptables hashlimit, JSTOR could have done something better than simply refusing all MIT traffic. http://serverfault.com/questions/211135/how-to-prevent-a-loi...
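As a rough illustration of the per-client rate limiting suggested above, here is a minimal token-bucket sketch. It is not how iptables hashlimit is implemented and says nothing about what JSTOR actually ran; it only shows the general idea of throttling one noisy address without cutting off everyone else. The class names and the rate and burst values are hypothetical.

```python
import time


class TokenBucket:
    """Allow up to `burst` requests at once, refilled at `rate` tokens per second."""

    def __init__(self, rate, burst, now=None):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerClientLimiter:
    """Keep one bucket per client address so one heavy user cannot starve the rest."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.buckets = {}

    def allow(self, client_ip, now=None):
        bucket = self.buckets.get(client_ip)
        if bucket is None:
            bucket = self.buckets[client_ip] = TokenBucket(self.rate, self.burst, now)
        return bucket.allow(now)
```

A server would call `limiter.allow(client_ip)` before serving each request and return an error (or delay) when it comes back False. Note that a limiter keyed on IP address alone would not stop someone who keeps changing addresses, which is part of what the indictment alleges happened here.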
As an analogy, if I host 1 million images of cats on my webserver, and I keep everything open and allow indexes in the Apache configuration, expecting the average person to only download 10 pictures and then you use 'a little WGET magic' to download 1 million jpgs, it seems odd for me to complain when my server crashes and my friend Erica can't download 1 single cat image that she had been wanting to. Yes, your downloads caused the tie up but how angry can I be? Your browser sent a request to my server and my server fulfilled the request.
Like David Segal says, the United States government wants to punish someone for an action akin to allegedly checking out too many books from the library. Over the objections of both MIT and JSTOR, presumably. And you're saying I'm bonkers.
This is what I mean by bonkers: nobody is "basing the legality on quantity". You made that up. I didn't say that. The indictment didn't say that. The law doesn't say that.
As it seems your concern for fact starts and ends with the considerably less relevant quantity of JSTOR employee compensation, I see no reason to go on.
One of the summary findings of the indictment does seem to suggest that quantity was a factor:
"[The] defendant, AARON SWARTZ, knowingly and with intent to defraud, accessed a protected computer... in excess of authorized access, and by means of such conduct furthered the intended fraud".
JSTOR's salary rate is really the least of my concerns (though wasteful bureaucracy in the non-profit industry does concern me in general and tangentially relates to this story). My concern is over the US federal government pursuing felony charges when a researcher violated a website's terms of service statement. I'm sorry you received any other impression.
Well, it does seem that if JSTOR serves 7000 institutions around the world, having that number effectively jump to 7100 for a couple of months shouldn't be a big drama. Also, it's pretty surprising to me that the average number of downloads at MIT was under one per person per month (2 million / 100 / 12,000 people / 2 months ≈ 0.8). I would have expected at least one order of magnitude higher.
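Working through the figures in the parenthetical directly, as a rough check only: the indictment says "over two million" and "more than one hundred times", so the inputs are approximate, and the 12,000 headcount is the comment's own assumption.

```python
swartz_downloads = 2_000_000   # "over two million" per the indictment
ratio = 100                    # "more than one hundred times" legitimate MIT usage
people = 12_000                # the comment's assumed number of MIT users
months = 2                     # November and December 2010

per_person_per_month = swartz_downloads / ratio / people / months
print(round(per_person_per_month, 2))  # prints 0.83
```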
if JSTOR serves 7000 institutions around the world, having that number effectively jump to 7100
I understand what you're thinking, but that's not what happened. He downloaded a hundred times MIT, not a hundred times the average institution. You're assuming that MIT is close to average, but there is no reason to think that. I strongly suspect that MIT is a bigger user than (for example) Rye Neck High School or the Scottish Agricultural College.
One hopes he is smart enough to have a solid legal argument for what he has done. If he loses the legal battle then it's going to set a bad precedent for all these academic document stores to continue keeping hold of this information. On the other hand, if he were to win then it may make it harder for groups like JSTOR to continue restricting access to their data.
A solid legal argument for switching your MAC address, hiding a laptop in a closet, and deliberately concealing your face to try to avoid recognition? He isn't going to win that defense.
Hypothetically: wiring closets are normal places for computers engaged in always-on network services. Putting a box over a laptop can keep it from getting stolen. Your bike helmet can end up in front of your face for a while while you're taking it off or putting it on. Switching your MAC address is reasonable if there's a MAC address conflict, or if you've gone and apologized to the network infrastructure guys for causing problems and they accepted your apology but didn't have the password for the DHCP server handy.
Try to remember you're seeing the picture painted entirely by the prosecution, without even the evidence that the grand jury saw to support it.
"...or if you've gone and apologized to the network infrastructure guys for causing problems..."
Read the indictment. He started on Sep 25th. He was blocked by IP on Sep 25. He switched his IP on Sep 26. The entire netblock was then blocked by admins. To try to turn the service back on for normal users, admins tried to block his MAC address on Sep 27, which worked until he changed it on Oct 2, and then added a second machine on Oct 8.
Your "hypothetical" defense doesn't even come close to flying. This isn't a little misunderstanding; this is network admins blocking access, and being circumvented, multiple times.
Thanks, but I know what the indictment alleges. But the facts of the indictment may be false, and there may be many other highly relevant facts that are omitted.
The man is a heroic martyr, who risked everything to set knowledge free. (Knowledge most of which was produced at the public's expense!)
He may very well die in prison.
Or perhaps he will be forced to publicly recant and merely be forbidden from using computers. I hope that in the latter case he will have the good sense to emigrate.
One day, his tormentors will be harshly punished. Unless, of course, "the future is a boot stamping on a human face — forever."
"Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th."
Can we be done with this thread? I know for a fact that I am not the only one who has flagged it. No good can come of this. In a likely worst case given who we're talking about, Aaron will be prodded into commenting publicly on a criminal case.
You are, however, free to believe that I was trolling. I can't stop you.
I invite the moderators to delete my account, if they believe that I am a troll. As things are, I merely refuse to go along with the herd-think on this site.
You seem to have a lot of comments for someone who doesn't even have a TV Drama-level understanding of the criminal justice system. You know that even people accused of murder are often free during their trials, right?
For some of us, memory of how Mitnick was treated (or for a more modern example, Manning), still taints how we expect accused "hackers" to be treated by the criminal justice system.
Physical detainment of someone accused of such crimes is hardly unheard of.
Mitnick's story is absolutely nothing like Aaron's. Mitnick was a fugitive after already having done prison time and had to be tracked down via cell phone triangulation.
A political prisoner is not usually granted bail. It isn't hard to see why.
Murder is a genuine crime recognized by all nations. One accused of murder will find few people who are automatically sympathetic.
On the other hand, the "crime" Mr. Swartz is accused of would be seen as an entirely moral act by a great many people around the world, myself included. It is quite possible that if he were to board a plane, he could find a place where he would not only be able to continue in his line of work, but would be respected as a hero who flipped the finger at the Evil Empire and its fat rent-seeking parasites.
The prosecutor would be quite foolish to grant bail.
Heading off a message board nerd response: the prosecutor does make influential bail recommendations. But don't you get the impression that this particular commenter would be disappointed if Swartz wasn't imprisoned immediately? That sure would cut down the drama and cost the commenter some opportunities to howl at the moon about political prisoners.
Thanks for the reply. I had been thinking of submitting the NYT article but based on your response decided to hold off. Not sure I think that flagging it is the right thing but I don't want to pour gasoline on the flames.