
Hi there, allow me to correct this misconception. I've debunked that idea often enough that I wrote a blog post about this four years ago: http://www.mattcutts.com/blog/toolbar-indexing-debunk-post/ I wrote an earlier debunk post in 2006 too: http://www.mattcutts.com/blog/debunking-toolbar-doesnt-lead-...

I noticed a new twist in your post though: you're saying that because of Safe Browsing (which checks for e.g. malware as users surf the web), those urls are sent to Google. The way that Chrome and Firefox actually do Safe Browsing is that they download an encrypted blob which allows the browser to do a lookup for dangerous urls on the client side--not by sending any urls to Google. I believe that if there's a match in the client-side encrypted table, only then does the browser send the now-suspect url to Google for checking.

Here's more info: https://developers.google.com/safe-browsing/ I believe the correct mental model of the Safe Browsing API in browsers is "Download a hash table of believed-to-be-dangerous urls. As you surf, check against that local hash table. If you find a match/collision, then the user might be about to land on a bad url, so check for more info at that point."
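To make that mental model concrete, here is a minimal sketch of the lookup flow as described above. It is not the actual Safe Browsing protocol or wire format: the 4-byte SHA-256 prefix, the function names, and the `remote_check` callback are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Iterable, Set

# Assumption for illustration: the local table stores short hash prefixes.
PREFIX_LEN = 4  # bytes


def url_prefix(url: str) -> bytes:
    """Return a short hash prefix of the URL (hypothetical scheme)."""
    return hashlib.sha256(url.encode("utf-8")).digest()[:PREFIX_LEN]


def build_local_table(bad_urls: Iterable[str]) -> Set[bytes]:
    """Stand-in for the blob the browser downloads ahead of time."""
    return {url_prefix(u) for u in bad_urls}


def check_url(url: str, local_table: Set[bytes],
              remote_check: Callable[[str], bool]) -> str:
    """Check a URL locally; contact the server only on a local match."""
    if url_prefix(url) not in local_table:
        # No match in the local table: nothing leaves the browser.
        return "safe"
    # Match or collision: only now ask the server for more info.
    return "dangerous" if remote_check(url) else "safe"
```

In this sketch, `remote_check` is never called for a URL whose prefix is absent from the local table, which is the point of the design: ordinary browsing generates no per-URL traffic to the server.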

Hope that helps. Further down in the discussion, someone posted this helpful link with more explanation: http://blog.alexyakunin.com/2010/03/nice-bloom-filter-applic...




Sorry, but I don't believe you about the Google Toolbar. I had a private page with no links in or out, and yet it appeared in Google search. It was not guessable and there was no chance for a referrer link. The page was never shared with friends nor accessed outside my own computers.

I only found out when a friend searched for his name and the page appeared, since it was my phone list.

-----


Multiple people have run controlled experiments like I described in http://www.mattcutts.com/blog/debunking-toolbar-doesnt-lead-...

The most common way such "secret" pages get crawled is that someone visits the secret page with referrers on and then goes to another page. For example, are you 100% positive that every person who ever visited that page had referrers turned off on every single browser (including mobile phones) they used to access that page?

-----


Are you sure that it is the referrer headers? PP clearly stated there were no outgoing links on the secret page. I think there's a much more mundane explanation: JavaScript stuff downloaded from Google's CDN. People nowadays are so used to just plopping jQuery etc. into their web pages that they forget that this stuff has to come from somewhere. If it's from Google, I'm quite certain that their CDN loader phones home right before it gives up any of the good stuff.

EDIT: Confirmed, though I was wrong in that there's no loader: simply requesting jQuery from ajax.googleapis.com gives them a nice fresh Referer header pointing at your secret site for their spiders to crawl. Be mindful!
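A rough sketch of why a script include is enough to leak the page address. It models the older browser default (send the full page URL minus the fragment, except on an HTTPS-to-HTTP downgrade); the function name and the example URLs are hypothetical, and a page's referrer policy or a newer browser default can reduce what is actually sent.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def referer_for(page_url: str, resource_url: str) -> Optional[str]:
    """Referer value attached to a subresource request, under the
    assumed 'no-referrer-when-downgrade' style behavior."""
    page = urlsplit(page_url)
    resource = urlsplit(resource_url)
    # Assumed rule: secure pages send no Referer to insecure resources.
    if page.scheme == "https" and resource.scheme == "http":
        return None
    # The fragment (#...) is dropped; path and query are kept.
    return urlunsplit((page.scheme, page.netloc, page.path, page.query, ""))


# Hypothetical secret page that includes jQuery from a CDN:
print(referer_for(
    "http://example.com/secret-page.html#top",
    "http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js",
))
```

No link has to be clicked for this to happen: any image, stylesheet, or script fetched from another host carries the header in the same way.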

-----


I'm 100% sure. That page was for me and me alone. It was never accessed by anyone but me. I never shared the URL with anyone.

Referrers only get shared through links. There were no links to or from that page. Going to a page and typing in a new URL does not provide a referrer.

-----



