
XSS attacks on Googlebot allow search index manipulation - fanf2
https://www.tomanthony.co.uk/blog/xss-attacks-googlebot-index-manipulation/
======
apo
Here are the steps to reproduce the attack from what I gather:

1\. Find a vulnerable site. The author picked Revolut, a 3-year-old, well-
funded fintech startup. Others might be found at
[https://www.openbugbounty.org](https://www.openbugbounty.org).

2\. Inject the script. The author did so by tacking a URL parameter containing
script content onto a link he obtained from the Revolut site.

3\. Preview the attack with Google's Web Rendering Service, which apparently
uses the same version of Chrome used by Googlebot.

4\. Submit the link to Googlebot for crawling.

5\. View the cached page from the Google results page.
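
For anyone unclear on what step 2 means in practice, here is a minimal sketch
of a reflected parameter, assuming a page that echoes a query value into its
HTML. The host `example.com`, the parameter name `q`, the payload, and both
render functions are hypothetical; this is not the Revolut code or the
author's actual payload.

```python
from html import escape
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical payload, for illustration only.
PAYLOAD = '<script>document.title = "injected"</script>'


def build_url(base: str, param: str, value: str) -> str:
    """Append one query parameter, percent-encoding its value."""
    return f"{base}?{urlencode({param: value})}"


def read_param(url: str, param: str) -> str:
    """Return the decoded value of a query parameter ('' if absent)."""
    return parse_qs(urlsplit(url).query).get(param, [""])[0]


def render_vulnerable(url: str, param: str) -> str:
    """Echo the parameter into HTML unescaped: this is the XSS bug."""
    return f"<p>Results for {read_param(url, param)}</p>"


def render_escaped(url: str, param: str) -> str:
    """Echo the parameter after HTML-escaping it: the usual fix."""
    return f"<p>Results for {escape(read_param(url, param))}</p>"


url = build_url("https://example.com/search", "q", PAYLOAD)
print(render_vulnerable(url, "q"))  # script tag survives intact
print(render_escaped(url, "q"))     # script tag is rendered as text
```

Any renderer that executes JavaScript, Googlebot included, would run the
script in the first output and see only inert text in the second.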

> I reported this to Google in November 2018, but after 5 months they had made
> no headway on the issue (citing internal communication difficulties), and
> therefore I’m publishing details such that site owners and companies can
> defend their own sites from this sort of attack. Google have now told me
> they do not have immediate plans to remedy this.

Translation: Google declares open season on this attack.

~~~
ryanlol
>Google declares open season on this attack.

This has always been the case, people have been exploiting this for _at least_
a decade.

~~~
picklemorty
That was the first thing I thought.

------
jefftk
Related: "In the past 3 months we surveyed all internal XSS bugs that
triggered the XSSAuditor and were able to find bypasses to all of them."
\--[http://nakedsecurity.sophos.com/2019/07/18/google-chrome-
is-...](http://nakedsecurity.sophos.com/2019/07/18/google-chrome-is-ditching-
its-xss-detection-tool/)

(Disclosure: I work for Google)

------
geophertz
The thing with this vulnerability is that it is just an XSS.

It has nothing to do with Google apart from the fact they don't run Googlebot
using a recent version of Chrome.

The other thing is that if I understand correctly, this could work without
JavaScript. You could just inject HTML <a> tags to place links in an
XSS-vulnerable website.
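
A rough sketch of that HTML-only variant, assuming a page that reflects user
input unescaped. `LinkCollector`, `extract_links`, and the `attacker.example`
URL are made up for illustration; the parser only mimics what a crawler
without JavaScript might extract, it is not how Googlebot works internally.

```python
from html import escape
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, roughly as a non-JS crawler might."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html_text: str) -> list[str]:
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


# Hypothetical injected markup: no script involved at all.
injected = '<a href="https://attacker.example/">cheap loans</a>'

vulnerable_page = f"<p>You searched for {injected}</p>"
escaped_page = f"<p>You searched for {escape(injected)}</p>"

print(extract_links(vulnerable_page))  # the injected link is found
print(extract_links(escaped_page))     # no links: the markup is plain text
```

So an XSS auditor that only looks for script execution would not help here;
escaping the reflected value is what removes the link.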

PS: Apparently Googlebot has been updated to the latest version of Chromium,
which means it is even less of a vulnerability on Google's side.

~~~
quanticle
Exactly. The breathlessness of the article made no sense to me. It's like
someone writing, "Did you know that if someone breaks into your home, they can
rearrange the books on your bookshelf‽" Well, yes, of course they can but if
someone has broken into my house, unauthorized alphabetization is the least of
my worries.

Similarly, if there's an XSS vulnerability on my site, Google search index
manipulation is pretty far down on the list of things I'm going to be worried
about.

~~~
Dylan16807
Think of it more like a DDOS. If they break into your house, you have bigger
worries. If they break into the houses of a hundred thousand strangers, and
use them to demonstrate that their site should get all the search results and
your site goes on page 5, what can you do?

------
tk2
I accidentally found this circa 2008. I found an XSS vulnerability on a local
news website, where I could inject JS into the page via the URL.

I then posted a PoC on my blog. The PoC created an image link to my blog.

To my surprise, my blog ranked 2nd for the news site's name. The only
explanation that I can think of is that Googlebot followed the PoC link and
thought the news site had a link to my blog. Of course, this is only possible
if Googlebot executes JS, which was not standard for other crawlers at that
time.

I believe the blackhat term for this is Google Bowling. Correct me if I'm
wrong.

------
methyl
> Googlebot is based on Google Chrome version 41 (2015), and therefore it has
> no XSS Auditor

This is no longer true: [https://searchengineland.com/google-will-ensure-
googlebot-ru...](https://searchengineland.com/google-will-ensure-googlebot-
runs-the-latest-version-of-chromium-316534)

------
3xblah
"Since Googlebot executes Javascript, this allows an attacker to craft XSS
URLs that can manipulate the content of victim sites."

In some imaginary future with ubiquitous headless browser-based bots, having
Javascript disabled might actually be a good test for "Are you human?"

------
inian
Google is deprecating the XSS Auditor anyway -
[https://portswigger.net/daily-swig/google-deprecates-xss-
aud...](https://portswigger.net/daily-swig/google-deprecates-xss-auditor-for-
chrome)

------
coldcode
No fix, citing "internal communication difficulties". Perhaps Google could use,
I don't know, some kind of tool. I too work for a huge company, and oddly
enough we also have communication difficulties sometimes despite all of our
tools. But we are not a technology company; it seems like Google should be
better at this.

~~~
the_duke
"internal communication difficulties" does not mean "I can't talk to the
person"; it means political infighting, or that the responsible team doesn't
care or doesn't see it as a problem.

~~~
AJ007
There is probably a long-winded explanation somewhere of how this is actually
a feature and not an exploit.

~~~
chrismorgan
Well, how are you going to fix it without removing important features like
submitting pages to the index? There are no perfect XSS auditors; you can
fairly consistently work around Chrome’s, which is why they’re giving up on it
now.

I don’t see any realistic solution to this problem.

~~~
londons_explore
I can imagine the search team's response now...

"If your site has an XSS, then anyone can make any content appear on that
domain... So why's it so bad if anyone can also make anything appear on the
domain in Google Search? - we're just reflecting reality.

Next you guys will report that you wrote a comment on Hacker News and that
appeared on Google Search too!"

------
akerro
Mining Monero on googlebot anyone?

~~~
sametmax
It's not going to stay long enough on one page to do so.

~~~
londons_explore
It stays on a page for 30 seconds and has a pretty decent CPU.

I'm going to guess you could get a decent amount of Monero mined...

Googlebot will only visit a page if it thinks that (on average across the
domain) there are ~99 human visitors for every bot visit. So you'll have to
hide the Monero miner on a popular domain.

~~~
codedokode
That definitely is not true. I know sites that have a higher share of bot
visits than 1%.

------
toto007
Even if Googlebot has been fixed and Chrome blocks the URL, you can still
exploit the XSS bug through a web proxy: [https://www.hidemyass-
freeproxy.com/proxy/it-it/aHR0cDovL2Jl...](https://www.hidemyass-
freeproxy.com/proxy/it-
it/aHR0cDovL2JlZGJyZWFrZmFzdC5iZS9lbi90dW5pcy02NjI_c2x0X2NpdHk9JTNDL3NjcmlwdCUzRSUzQ3NjcmlwdCUzRXZhciUyMHglMjA9JTIwZG9jdW1lbnQuZ2V0RWxlbWVudHNCeUNsYXNzTmFtZSglMjJwYWdlLXRpdGxlJTIyKTt4WzBdLmlubmVySFRNTCUyMD0lMjAlMjJDaWFvJTIwVmluY2Vuem8lMjI7YWxlcnQoJTI3dmluY2Vuem8lMjBzaSUyMG51JTIwc3RydW56JTI3KTslM0Mvc2NyaXB0JTNFJnNsdF9uZWFyYnk9Njcy)

Or open the original URL in Firefox:
[http://bedbreakfast.be/en/tunis-662?slt_city=%3C/script%3E%3...](http://bedbreakfast.be/en/tunis-662?slt_city=%3C/script%3E%3Cscript%3Evar%20x%20=%20document.getElementsByClassName\(%22page-
title%22\);x\[0\].innerHTML%20=%20%22Ciao%20Vincenzo%22;alert\(%27vincenzo%20si%20nu%20strunz%27\);%3C/script%3E&slt_nearby=672)
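
To see what those links carry, here is a small sketch that decodes only the
fragments visible above, using the Python standard library. The variable names
are mine; the two encoded strings are copied from the URLs in this comment.

```python
import base64
from urllib.parse import unquote

# Start of the slt_city value from the direct URL (percent-encoded).
encoded_prefix = "%3C/script%3E%3Cscript%3E"
decoded_prefix = unquote(encoded_prefix)
print(decoded_prefix)  # </script><script>

# Visible start of the proxy path; it looks like URL-safe base64.
proxy_prefix = "aHR0cDovL2Jl"
decoded_proxy = base64.urlsafe_b64decode(proxy_prefix)
print(decoded_proxy)  # b'http://be'
```

The first decode suggests the parameter is reflected inside an existing
script block: the payload closes that block and opens its own. The second
suggests the proxy link simply wraps the same target URL in base64.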

------
codedokode
This is actually not bad. Maybe it will make site developers aware of XSS and
motivate them to fix it?

It isn't Google's job to fix your broken sites.

------
jaksc09
This issue no longer applies since Googlebot now runs the latest Chrome:
[https://searchengineland.com/google-will-ensure-googlebot-
ru...](https://searchengineland.com/google-will-ensure-googlebot-runs-the-
latest-version-of-chromium-316534)

------
tripzilch
I'm curious: if the author had just injected a URL as plaintext, with no JS
involved, would Googlebot also follow strings that just look like URLs?

------
liveoneggs
Brilliant use of XSS here. This is just great.

------
quickthrower2
Nicely laid out for the black hatters!

~~~
anaphor
They are probably already aware of this if they do this sort of black hat SEO
type stuff.

~~~
ryanlol
Yeah, this particular technique is as old as the hills.

CVE-2007-1287 was abused for this specific purpose over a decade ago, and
remnants of that still occasionally show up in Google's index.

------
abadabadingdong
Looks like Tom's blog is toast atm - 502 error. This is why we can't have nice
things.

------
amelius
> This presumably manipulates PageRank

Is PageRank still used by Google?

~~~
kevlened
This was on a post a few days ago [0]:

> Ex-Google-Search engineer here

> The comments here that PageRank is Google's secret sauce also aren't really
> true - Google hasn't used PageRank since 2006.

[0]
[https://news.ycombinator.com/item?id=20442044](https://news.ycombinator.com/item?id=20442044)

