

3taps says CL blocking "all general search engines" - sigmadelta

homepage pop-up: "At approximately noon on Sunday August 5th, Craigslist instructed all general search engines to stop indexing CL postings -- effectively blocking 3taps and other 3rd party use of that data from these public domain sources. We are sorry that CL has chosen this course of action and are exploring options to restore service but may be down for an extended period of time unless we or CL change practices. As soon as we know more, we will share it here and on our Twitter account."
======
storborg
I don't think this is accurate. As far as I can tell, there is nothing in CL's
robots.txt, meta tags, or response headers that prevents Google from indexing
them. Further, requesting a CL post with the Googlebot user agent yields the
same content. This only leaves the possibility that they are excluding Google
via specific IP blocks, which seems unlikely. Is there something I'm missing?
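
For anyone who wants to repeat the robots.txt part of this check offline, here is a minimal sketch using Python's standard `urllib.robotparser`. `is_allowed` is a hypothetical helper, and both rule sets below are made up for illustration; they are not copied from Craigslist's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule set that only hides one path prefix from all crawlers.
PERMISSIVE = """\
User-agent: *
Disallow: /reply
"""

# Hypothetical rule set that would actually block every crawler.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

POST_URL = "http://sfbay.craigslist.org/boa/3191071012.html"

if __name__ == "__main__":
    print("permissive rules allow Googlebot:", is_allowed(PERMISSIVE, "Googlebot", POST_URL))
    print("block-all rules allow Googlebot:", is_allowed(BLOCK_ALL, "Googlebot", POST_URL))
```

This only covers robots.txt. Meta tags, response headers, and IP-level blocks would each need a separate check.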

~~~
andrewcooke
i don't know, but if i understand
<http://3taps-statistics.qatro.com/craigslist/index.pl> correctly then they
are missing lots of posts. the numbers seem to be percentages.

~~~
dangrossman
<http://3taps.com/stats>

Scroll down to the "daily report" and they've dropped from almost 2 million
posts a day to 153.

~~~
storborg
Yeah. This is all talking about 3taps, not "general search engines". 3taps
seems to be claiming that Craigslist has cut off Google, but I think it's just
that Craigslist has cut off 3taps.

*Edit: Craigslist added nocache directives to their posts, which means that 3taps can't scrape the Google cached copies. They're not blocking anyone. Interestingly, this also reveals that 3taps was previously violating the Google TOS, which prohibits automated access of the Google cache.
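
To illustrate the difference between blocking the cached copy and blocking indexing, here is a rough sketch that reads robots meta directives out of a page with Python's standard `html.parser`. The sample HTML is invented, and the helper names are hypothetical. The thread doesn't show the exact directive Craigslist used, so treating `nocache` as equivalent to `noarchive` is an assumption.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # Directives are comma-separated and case-insensitive.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> set:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def blocks_cached_copy(html: str) -> bool:
    # Assumption: "nocache" is handled like "noarchive".
    return bool(robots_directives(html) & {"noarchive", "nocache"})


def blocks_indexing(html: str) -> bool:
    return bool(robots_directives(html) & {"noindex", "none"})


# Invented page: cached copies are suppressed, indexing is not.
SAMPLE_HTML = '<html><head><meta name="robots" content="NOARCHIVE, nocache"></head></html>'
```

A page like the sample can still show up in search results; it just won't have a cached copy for anyone to scrape.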

~~~
dangrossman
Do some site:city.craigslist.org searches and limit to 24 hours. You'll get no
results. You can't cut off 3taps without cutting off Google.

~~~
sigmadelta
Just tried again, and I got a bunch of results:

[http://www.google.com/search?q=site%3Asfbay.craigslist.org+b...](http://www.google.com/search?q=site%3Asfbay.craigslist.org+boat&btnG=Search)

Of the 10 results on the first page, 7 are within the last hour.

(1) Boat sfbay.craigslist.org/sby/boa/3191071012.html 2 hours ago ...
408-726-8722 i don't know about the motor or about the boat but if you want to
see it call.... that is the reason that i can't wrote about the ...

SF bay area boats - by owner classifieds - craigslist
sfbay.craigslist.org/boa/ - Cached - Similar SF bay area boats - by owner
classifieds - craigslist.

(2) 1964 Fabuglas 16ft fishing boat
sfbay.craigslist.org/nby/boa/3191162483.html 1 hour ago ... This was my dads
fishing boat. It runs good. But could use a little TLC. Its a 1964 16 foot
Fabuglas out of Nashville Tennessee. Most of the ...

SF bay area marine services classifieds - craigslist sfbay.craigslist.org/mas/
- Cached Sat Aug 04. Shipwright/Boat Work - (berkeley) ... Boat & Marine
Related Service - (hayward / castro valley) ... SF Charter Boat, Book Now -
(San Francisco Bay) ...

(3) SIDEWINDER 16' SPEED BOAT trade for services or...??? - Craigslist
sfbay.craigslist.org/eby/bar/3191164088.html 1 hour ago ... 1980 BLUE
sidewinder motor boat. Seats 4. 35+mph. 70 hp 2 stroke VRO Evinrude motor, no
need to premix fuel. Runs strong. Starts right up.

Fishing / Hunting Boat sfbay.craigslist.org/sby/boa/3176240765.html 6 days ago
... 2004 War Eagle Boat, semi v front flat bottom(17ft.), with a 2003 40hp
Mercury(4 stroke) motor with 20- 25 hours on it. The boat is on an EZ ...

(4) Sailboat rudder off a 22' boat
sfbay.craigslist.org/sby/boa/3191237551.html 14 minutes ago ... 4'10'' tall
rudder from a 22' boat. It is in great condition and is a solid Bay rudder.
Can be brought up in case of a grounding by pulling on a rope ...

(5) MB Sports V Drive Ski & Wakeboard Boat
sfbay.craigslist.org/eby/boa/3187986181.html 1 day ago ... 2002 MB SPORTS 220
V-Drive. This boat is an excellent for both skiing and wakeboards. We have
used it and enjoyed it for slalom skiing and ...

(6) * BAYLINER* NICE BOAT, NICE PRICE... BEST OFFER MOVING ...
sfbay.craigslist.org/eby/boa/3191163056.html 1 hour ago ... VERY CLEAN BOAT
New wheel bearings on trailer. 3.0 Mercruiser 135 H.P. Great on gas. 40-45 mph
top speed. Just registered! Garaged for 7 ...

(7) wanted boat polisher sfbay.craigslist.org/eby/boa/3191196564.html 50
minutes ago ... wanted boat polisher (pittsburg / antioch) ... I am looking
for someone to polish and wax my 28 ft boat. topside only dont have to do the
hull.

------
true_religion
Pretty brilliant. I don't think Craigslist needs Google traffic at all
anymore. People _know_ to go there to buy and sell.

~~~
calbear81
Not to mention that local classifieds are not really meant to have any
permanence, so oftentimes when I saw a CL listing on Google for an item I was
looking for, it was already sold or the posting was deleted. I'm still looking
forward to a viable Craigslist competitor though.

------
sigmadelta
[http://blog.sfgate.com/techchron/2012/08/10/craigslist-backs...](http://blog.sfgate.com/techchron/2012/08/10/craigslist-backs-off-exclusive-rights-to-ads/)

"One data harvester, 3taps, said earlier this week that Craigslist had blocked
search engines such as Google from including Craigslist pages in search
results. But that report was inaccurate.

3taps’ product and quality assurance leader, Meg Nakamura, acknowledged
Wednesday in a chat with The Chronicle that something fishy was taking place,
but developers there haven’t fully figured out what’s going on."

------
sigmadelta
[http://www.sfgate.com/technology/businessinsider/article/Cra...](http://www.sfgate.com/technology/businessinsider/article/Craigslist-Is-Definitely-Blocking-Search-3769297.php)

Not sure I agree with most of the conclusions drawn in that article.

The article does say that "sure enough, Google displays recent listings from
Craigslist right now," which does seem to be true for me, too, when I try.

------
sigmadelta
<https://twitter.com/markmilian/statuses/233015694432813057>

Mark Milian @markmilian 7 Aug

Contradicting earlier statement, 3Taps spokeswoman emails to say, "Craigslist
is still allowing indexing of pages." Still nothing from CL PR

------
sigmadelta
Actually the part about search engines doesn't seem to be true... I just
performed searches using Google, Yahoo, and Bing and got links to CL postings
that were made within the last hour.

~~~
mooism2
Google, and I presume other big search engines too, cache robots.txt for a
week by default. We're well within the window for them to still be indexing
CL.

~~~
sigmadelta
If that's true, how does 3taps know that CL is blocking search engines? Also:

$ wget -q -O- --save-headers http://www.craigslist.org/robots.txt | fgrep Last-Modified

Last-Modified: Fri, 04 Nov 2011 18:13:24 GMT
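
The same comparison can be done programmatically. This is a small sketch, not a definitive test: `modified_since` is a hypothetical helper, the cutoff is taken as the start of August 5th, 2012 in UTC (the pop-up gives no time zone, so that is an assumption), and `Last-Modified` is only what the server chooses to report.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def modified_since(last_modified_header: str, cutoff: datetime) -> bool:
    """Return True if an HTTP Last-Modified value is at or after cutoff."""
    return parsedate_to_datetime(last_modified_header) >= cutoff


# Header value reported by the wget command above.
ROBOTS_LAST_MODIFIED = "Fri, 04 Nov 2011 18:13:24 GMT"

# Assumed cutoff: start of the day 3taps says the block began (UTC).
CLAIMED_CHANGE = datetime(2012, 8, 5, tzinfo=timezone.utc)

if __name__ == "__main__":
    changed = modified_since(ROBOTS_LAST_MODIFIED, CLAIMED_CHANGE)
    print("robots.txt changed on or after the claimed date:", changed)
```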

------
sigmadelta
[http://www.theverge.com/2012/8/7/3225476/craigslist-blocks-3...](http://www.theverge.com/2012/8/7/3225476/craigslist-blocks-3taps-padmapper)

