
The Craigslist Lawsuit - squigs25
https://3taps.com/the-craigslist-lawsuit.php
======
tomasien
Greg Kidd, the founder of 3taps, did not have to keep fighting this fight - AT
ALL. He is one of the top execs at Ripple Labs, was in the first round of
Twitter (and Square), and doesn't have his net OR self worths tied up in
3taps. He continued this because he believes it was right - and I, for one,
thank him for it.

~~~
briandear
Right for repurposing someone else's data without permission? Nothing right
about that.

~~~
thaumaturgy
You can't usually convince anyone that they're wrong just by rephrasing some
situation in a way that's more favorable to you.

"Right for using information freely provided by Google to help people find
homes?" See? Now we have both found a way to describe the same situation in
ways that make it sound like two entirely different situations.

------
a-dub
Back when the first bubble burst in late 2001, I scraped a bunch of historical
craigslist data from a secondary archive and built an interactive gnuplot
webpage of post-traffic by category over time. At the time, it got
slashdotted, a couple hundred thousand people looked at it and it was all fun
and good.

So I thought afterwards, hey, the economy is kinda sketchy still and looking
at this stuff sure is neat... I should build a real tool that robustly and
respectfully logs daily post totals for more locales, and maybe build out a
cool little graph portal. Maybe I can even do a little NLP to make it smarter.
Hey, it's craigslist, they're community minded... they thank me when I post,
they won't mind. They give pencils to teachers even.

So I email them, and Craig responds in a cc'd message with a 'hey cool, can
this guy use our RSS feeds'? At which point, the assholes that worked there
started inventing every excuse under the sun as to why doing so would totally
damage their infrastructure (because you know, polling RSS every half an hour
is total abuse.)

Anyway, that's when I realized that all the hippie-dippie stuff was just
window dressing and that I really truly was dealing with a really special
species of asshole.

I put the project down and walked away. The end.

~~~
Rapidwire
Hey, can you post some quotes from those at Craigslist who were against you
using their data for various graphs?

~~~
a-dub
In the end they shut me down with this (names redacted, of course):

--snip--

We already have all this done in house. If our CEO decides to publish it we
can do so easily.

Please understand it's not that I think you are trying to do something wacky,
it's just that I really am in the business of not working our servers any
harder than they need to.

Sorry,

--snip--

------
tptacek
_the same statute that led to the demise of Aaron Swartz_

For fuck's sake.

CFAA criminal sentencing guidelines may very well have contributed to Swartz's
suicide. They incentivized prosecutors to create complex, showy indictments
cross-linking multiple felony charges (because exploiting unauthorized access
in furtherance of other felonies is an accelerator in the CFAA). CFAA may be
broken in several ways.

 _But CFAA is also the sole federal statute governing unauthorized access_. In
civil litigation, CFAA is the only statute that provides a civil cause of
action relating to unauthorized access to computers of any sort.

People like to write about civil CFAA as if it was some sort of nuclear
option. But civil and criminal cases are worlds apart. If you're going to sue
someone for misusing your computer systems, or even just violating your terms
of use, CFAA is merely the statute that enables that. That has nothing
whatsoever to do with overzealous prosecution.

Invoking Aaron Swartz in an argument over who's allowed to show apartment ads
where is manipulative and grotesque.

~~~
gojomo
As the EFF argues in the linked brief, there shouldn't be _any_ "civil cause
of action related to unauthorized access" when the data in question is made
publicly available on the internet.

Craigslist was abusing the CFAA with an expansive interpretation – treating
unapproved _use_ as if it were the same thing as _unauthorized access_ –
similar to that of overzealous federal prosecutors. Craigslist's argument, if
embraced by the courts, would make other cases imposing penalties on the reuse
of otherwise-public data easier.

The reference is fair to make these points to a mass audience, although a bit
macabre.

~~~
rayiner
> As the EFF argues in the linked brief, there shouldn't be any "civil cause
> of action related to unauthorized access" when the data in question is made
> publicly available on the internet.

Why?

The principle underlying the Craigslist lawsuit is centuries old. Craigslist is
like a shop open to shoppers. The shopkeeper makes the premises open to the
public, but the scope of that access is limited by the shopkeeper's purpose in
granting that access. If a member of the public accesses the property for
improper purpose, a civil action for trespass arises.

The fact that the premises is an Internet website changes nothing.

~~~
nitrogen
_The fact that the premises is an Internet website changes nothing._

It absolutely should. There is no _accurate_ physical analogy for an HTTP
server that responds with "200 OK" and valid data for a given set of "GET
_whatever_ HTTP/1.1" requests. Rather than contort existing trespass law to
match the Internet, we have to derive meaningful boundaries for the Internet
from ethical first principles if we want the conclusions to be remotely sane.

~~~
rayiner
The law of trespass reflects a basic ethical principle: owners of private
property may invite the public to access their property for a proper purpose
(the scope of which may even be implied) but nonetheless retain the right to
deny access and to sue for trespass those who access that property for some
other purpose. That ethical principle applies just as much to websites as
coffee shops. Just because Craigslist makes its website available to the
public for a defined purpose does not mean it's not trespass for a company to
access that data for a different purpose.

~~~
nmrm2
(I am going to assume we are still talking about the article.)

First of all, trespass is a red herring in this case. Defendant used Google's
Cache -- _NOT_ the Craigslist website -- to gather data. Craigslist's claim is
that accessing a Google Cache of its website constitutes unauthorized use.

If we're going to resort to ill-fitting metaphors, it's closer to the owner
of a piece of artwork posting public notice that their art can only be viewed
in their studio, _permitting_ a public gallery to show that art, and then
suing you for trespassing because you looked at their art while it was in a
public gallery.

Craigslist asserting copyright claims in this case is plausible if precarious
(they very quietly changed their ToS just 4 days prior to filing suit -- that's
bullshit if I've ever smelled it. Furthermore, as an aside, I wonder whether
the rise of walled data gardens shouldn't give us ethical pause).

However, Craigslist's claim to CFAA violation in this case is absurd and
dangerous. Period.

~~~
rayiner
I didn't know about the Google Cache thing. The Wiki page doesn't mention it.
That definitely changes at least the CFAA part of the case. Did they ever
scrape CL directly?

~~~
x0x0
yes, both padmapper and 3taps were scraping CL directly for some of the time

padmapper was scraping, got C&D, so moved to 3taps

3taps got C&D (and was playing dumb games like switching ip addresses to avoid
blocks), then moved to scraping CL posts out of google's cache, claiming they
therefore weren't bound by terms and conditions of CL. After CL blocked google
from caching posts, 3taps went back to scraping CL.

~~~
sangnoir
Was blocking caching on Google what he referred to as "interfering with Google
and other search engines"? That's laying it on thick.

------
x5n1
We really need a non-profit organization that provides a data store with an
api for common things like classified listings, sms messages, pictures, likes,
etc.

That can help us move away from this sort of chicken and egg problem with user
generated data. These companies are basically hogging it because they were
able to build the user base.

If we can get the data in a non-profit store with a licensing scheme that
basically says you must as a part of using this data add any user-generated
data submitted to your website back to this store so other developers can
build products on top of it, we could really innovate in classifieds and
social networks.

Perhaps something like that can be funded by EFF or related organization...
because then we can potentially apply governance to that user generated data
which has not been possible with private companies.

The chicken and egg problem can be solved if big non-profit tech and civil
rights brands like the ACLU, EFF, Wikipedia, etc. all get behind this and
market it.

~~~
blatherard
At least in the US, I think satisfying the 'purposes' requirement for non-
profit status would be difficult. Just providing a free service isn't enough.
Here's the IRS brief description:

"The exempt purposes set forth in section 501(c)(3) are charitable, religious,
educational, scientific, literary, testing for public safety, fostering
national or international amateur sports competition, and preventing cruelty
to children or animals."

source: [http://www.irs.gov/Charities-&-Non-Profits/Charitable-
Organi...](http://www.irs.gov/Charities-&-Non-Profits/Charitable-
Organizations/Exempt-Purposes-Internal-Revenue-Code-Section-501\(c\)\(3\))

UPDATE: I confused non-profit and charitable organizations. Disregard.

~~~
dragonwriter
You seem to be confusing non-profit status with charitable (501c3) status; a
charity is a specific subtype of tax-exempt nonprofit (most notably different
from most other nonprofits in that, in addition to the organization being tax-
exempt, contributions to the organization are tax-deductible for the donors.)

------
brownbat
Excellent update on one of the hard cases EFF has been fighting.

There's a link to an interesting law review article on how the CFAA can make
it a criminal act for arbitrarily banned users to even browse to a public
webpage:
[http://digitalcommons.law.umaryland.edu/cgi/viewcontent.cgi?...](http://digitalcommons.law.umaryland.edu/cgi/viewcontent.cgi?article=3216&context=mlr)

It's an absurd result and frustratingly unaddressed by the courts.

------
jxm262
> and will make its API source code, the settlement agreement, and other legal
> filings and public policy resources available.

This is interesting to me. A couple of years ago, being young and naive, I
received a cease and desist order from craigslist's legal team demanding I
remove my craigslist scraper from github. It was largely a toy project to play
around with an html parser library I wanted to learn and thought could be
useful. Of course I now understand it was against their tos, and from an
ethical standpoint I avoid scraping anything without getting permission, but at
the time I was terrified I'd be sued for a ton of money. It felt incredibly
aggressive to go after me, a student at the time.

So I'm curious.. is it illegal to scrape but ok to release the source code?
Where is the line drawn?

~~~
chmike
That's why I would be tempted to create a craigslist competitor with really
free access to the data.

~~~
thaumaturgy
Network effects: it's difficult to dislodge an entrenched competitor once
they've accumulated a large enough dependent userbase.

However, a number of other services are chipping away at Craigslist,
developing their own userbases in niches that Craigslist used to occupy. (e.g.
Tinder)

------
tsycho
I don't understand.

>> The Court has ruled that users—not craigslist—own the copyrights in their
postings.

>> ... Craigslist finally conceded in Court that no such harm or impairment
ever occurred.

>> Craigslist completely rewrote its Terms of Use, removing many of the most
abusive clauses.

Everything above seems to be against Craigslist. Then why does 3taps have to
agree to a settlement to pay Craigslist $1 million?

And if there are other parts of the court ruling that went against 3taps which
this blog post doesn't mention, then how can Craigslist be forced to forward
that money to EFF?

~~~
mjn
Wikipedia's summary of the case doesn't make it sound as positive for 3taps,
especially as the status of the case (their motion to dismiss the case was
denied) is not what they wanted:
[https://en.wikipedia.org/wiki/Craigslist_Inc._v._3Taps_Inc.#...](https://en.wikipedia.org/wiki/Craigslist_Inc._v._3Taps_Inc.#Opinion_of_the_Court)

Two significant problems for them: the Court sided with Craigslist's view that
3taps knew its authorization to access the website was revoked when it
received Craigslist's cease-and-desist letter, so scraping past that date
might constitute unauthorized access; and Craigslist's change to its ToS on
July 16, 2012 to claim copyright on posts was valid, so reuse of their
material after that date could constitute a copyright violation subject to
statutory damages.

------
melvinram
It's not clear why they are shutting down if "the Court has ruled that
users—not craigslist—own the copyrights in their postings."

Maybe "3taps lacks the resources to continue the fight" implies that the
lawsuit has drained their bank accounts and they are out of money.

~~~
noir_lord
It does say that 3taps has to pay $1m to craigslist, which might have wiped
out their on-hand cash.

------
thinkcomp
The actual lawsuit docket is here:

[http://www.plainsite.org/dockets/k5ulex5l/california-
norther...](http://www.plainsite.org/dockets/k5ulex5l/california-northern-
district-court/craigslist-inc-v-3taps-inc-et-al/)

------
guelo
Wow that's quite a spin on the fact that they lost the lawsuit and had to fork
over $1 million.

------
Buge
So 3taps has to pay craigslist $1M, and craigslist then has to pay that to the
EFF. That seems pretty odd.

~~~
fencepost
3taps may not have been able to keep fighting it, but they were able to keep
Craigslist from profiting by driving them under. I wouldn't be surprised if
there was some element of "agree to this or we spend it all making you pay
your lawyers then fold the company with all assets completely exhausted -
we'll even sell the name and spend that against you."

I believe the term is pyrrhic victory.

And there may also be a (future) kneecapping element to it with the release of
their scraping source code.

~~~
thaumaturgy
Right, and CL's lawyers would still be OK with the outcome because now they
have established precedent for a million-dollar settlement of a lawsuit for
scraping data. If anybody is dumb enough to use that open source code to
provide a service that uses scraped CL data, CL's lawyers will be able to open
with a settlement offer well in excess of a million dollars.

------
fadzlan
Not sure if I understand this correctly. Does this mean that if Instagram's or
Twitter's terms do not allow scraping of their users' generated content, any
developer can just go ahead and do it because the copyright holder of the post
is the user?

Since the site does not hold the copyright (and rightly so), the site owner
does not have the right to say what can be done with the data. That belongs
to the users who generate it.

In that case, how do we know if all the individual users consent to the
scraping? If you scrape 10,000 posts and one user complains, would you be in
trouble? And does the user have the right to know who is accessing the data
outside of the normal use (since if they don't know, they can't object)?

------
jasimq
"3taps replied that it did not access craigslist and instead obtained the data
from Google" What does this mean? How do they get that data from Google?

~~~
gsharma
Probably scraped Google's cache pages, so they would never touch Craigslist
servers.

------
jister
Company A made a chocolate fountain for the "public" to see. People enjoyed
it. Company B thought this is a great opportunity to make cakes out of it.
Because the fountain is "public" they made this as the source of their cake
business. Company A complained to take down the fountain and....

Well you know the rest of the story. :)

~~~
hayksaakian
If only the chocolate fountain was infinitely copy-able :)

------
mtw
Interesting outcome. Do users own the copyright to their pictures and postings
on Facebook? Twitter?

Can I build a Facebook scraper and redistribute it to other sites?

~~~
tedunangst
You can read the various terms of service and user agreements to find out.

------
j_lev
Would a "cannot use for commercial purposes" clause have nipped this one in
the bud? I'm still on the fence with this one. I think Craigslist could have
played it a lot better but I find it hard to believe no-one here can empathise
with the founder.

------
shawnee_
_3taps built a data exchange that aggregated user-generated data housed on
various websites and then made that data available through an API to
developers, including PadMapper and Lovely._

Craigslist discovered that it had become (has become) the "MLS" of rentals...
and perhaps even more accurately -- it's a brokerage of _housing_ data -- both
rentals and sales. So when property management companies (PMCs) discovered how
darn easy it was, for example, to flood craigslist with multiple ads for the
same unit, or to flood it with units that were never available to begin with and
thus alter market perception -- certain people got exactly what they wanted:
hyperinflation in rents, or the subsequent upward pressure on housing prices,
or both.

 _As recently as 2010, craigslist welcomed innovative uses of the publicly
available data ... Over the next two years, as innovators like PadMapper and
AirBnB began to thrive, craigslist reversed course, and punished the
innovators it previously welcomed to use the data. In February 2012,
craigslist rewrote its Terms of Use, abandoning its long-articulated position
that users own their own content which was freely available on the “public”
part of craigslist's website._

As outraged as everybody was about this, it is exactly what the real MLS does
when you decide to sell your house. You sign a contract promising to pay some
Realtor's brokerage company 6 percent of whatever your house goes for -- in
that contract you are essentially giving them the "copyright" of your house
listing; they own it on the MLS and that is why you have to pay them the big
bucks. Never mind that they do basically NOTHING other than simple photography
and data entry to post on the MLS... but now they require you give them ~$66K
of your equity for their 3 hours of work. (Source:
[http://www.mercurynews.com/business/ci_28512250/report-
silic...](http://www.mercurynews.com/business/ci_28512250/report-silicon-
valleys-housing-affordability-crisis-worsens) Median price of "entry level"
home in San Mateo County = $1.1M).

Same thing is happening in rentals / property management co's (PMCs), but
slightly different symptoms.

Nobody is attacking the problem the right way, though. 42Floors tried the
experiment and found it to be a failure, too. (Source:
[https://news.ycombinator.com/item?id=9881213](https://news.ycombinator.com/item?id=9881213))

The market should be putting more pressure on brokers to compete with each
other ... damn that 6 percent. (Right, but the NAR signed a non-compete
agreement with itself so it gets to do that)

Hackers should stop building tools that make it easier and cheaper for the
PMCs and real estate agents to steal everybody's equity.

~~~
jacquesm
> You sign a contract promising to pay some Realtor's brokerage company 6
> percent of whatever your house goes for

For comparison: NL is roughly at 1.85 percent (negotiable).

------
thomasrossi
Do you think the outcome has an impact on other scraping scenarios, say
scraping for travel data, for instance?

------
trhway
Why did they take on CL instead of, say, FB? Or do they think it is better to
start with an easier/smaller guy and ramp it up to the bigger fish?

