

How we got read access on Google’s production servers - detectify
http://blog.detectify.com/post/82370846588/how-we-got-read-access-on-googles-production-servers
tl;dr: How we got $10.000. We were able to upload an XML file to Google. The XML parser was vulnerable to XXE. We got full read access to their production servers, including the /etc/passwd.
======
mixmax
In large production environments it's almost impossible to avoid bugs - and
some of them are going to be nasty. What sets great and security conscious
companies apart from the rest is how they deal with them.

This is an exemplary response from Google. They responded promptly (with humor
no less) and thanked the guys who found the bug. Then they proceeded to pay out
a bounty of $10.000.

Well done, Google.

~~~
cdelsolar
Hmm a pretty cheap road trip for just ten dollars, and I'm also not sure why
they thought it necessary to include an extra significant figure for cents.

~~~
pmcjones
Some countries reverse the role of period and comma in numbers. The author
meant ten thousand.

~~~
barsonme
I'll admit, it threw me off at first too.

------
msantos
A few web crawlers[1] out there follow HTTP redirect headers and ignore the
change in URL scheme (this method is different from OP's but achieves the same
goal).

So anyone can create a trap link such as

    
    
        <a href="file:///etc/passwd">gold</a>
    

Or

    
    
       <a href="trap.html">trap</a> 

Once trap.html is requested, the server issues the header "Location:
file:///etc/passwd".

Then it's just a matter of sitting and waiting for the result to show up
wherever that spider shows its indexed results.

[1]
[https://github.com/scrapy/scrapy/issues/457](https://github.com/scrapy/scrapy/issues/457)
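A minimal sketch of that trap in Python's stdlib (the port and the file:// target are illustrative, not taken from the comment): every request is answered with a redirect whose Location switches schemes, so a crawler that follows redirects without re-checking the scheme would try to read a local file.

```python
# Hypothetical "trap" server: any GET (e.g. /trap.html) gets a 301 whose
# Location header switches from http:// to the file:// scheme.
from http.server import BaseHTTPRequestHandler, HTTPServer


class TrapHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", "file:///etc/passwd")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def serve_trap(port=8000):
    # Illustrative runner; port 8000 is an assumption.
    HTTPServer(("", port), TrapHandler).serve_forever()
```

A well-behaved client should refuse to follow a redirect that changes scheme; the bug the comment describes is crawlers that don't.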

------
numair
... And this is why you want to discontinue products and services your
engineers can't be motivated to maintain. Amazing.

This should scare anyone who has ever left an old side project running; I
could see a lot of companies doing a product/service portfolio review based on
this as a case study.

~~~
spindritf
Or just move it to some cheap VPS where it cannot damage other services or
your infrastructure.

~~~
adaml_623
Or your reputation or your ethical and possibly legal duty to protect your
clients?

~~~
spindritf
Compartmentalization is part of that.

------
halflings
I hope it doesn't go unnoticed that the guys who discovered this
vulnerability created a really great product, Detectify:

[https://detectify.com/](https://detectify.com/)

They also discovered vulnerabilities in many big websites (Dropbox, Facebook,
Mega, ...). Their blog also has many great write-ups:
[http://blog.detectify.com/](http://blog.detectify.com/)

~~~
detectifyuser
While they are probably good at doing this manually, their automated tool
finds very little. And they were kind of assholes on support :(

~~~
ibu
Nice try, Tinfoil.

~~~
borski
Er...CTO of Tinfoil here. We respect the Detectify guys a lot. Not sure what
you were trying to get at, but there's no conspiracy here. We don't engage in
subversive competitive tactics.

------
raverbashing
This is another reason not to use XML, plain and simple

It puts too much hidden power in the hands of those who don't know what
they're doing (automatically loading external entities referenced in an XML
document? what kind of joke is that?)

~~~
jebblue
XML made it far more manageable to create machine-to-machine APIs. I can say
we surely would not want to go back to the '80s and '90s, when doing that
stuff was a nightmare.

~~~
ori_b
Yes, it was a drunken, stumbling step forward. Let's take another one, and
move to something simpler, which solves the problem better.

To quote Phil Wadler's paper about XML, where he established some of the
principles that influenced XQuery: "So the essence of XML is this: the problem
it solves is not hard, and it does not solve the problem well."[1]

I suggest reading the entire paper; it shows a number of shortcomings, but
it's also rather enlightening about how XML actually is structured and how
its semantics are defined. (I.e., in spite of that quote, it's not just XML
bashing.)

[1][http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.109...](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.109.2857&rep=rep1&type=pdf)

~~~
wpietri
Hm. In his introduction, he says, "XML is touted as an external format for
representing data." To me that mostly misses the value of XML. I think of it
as an interchange format, not a closely-mirror-my-datastructures format. I've
used it before when I want a long-lived data format that is mostly annotated
text, and I'd happily do it again.

That said, I'm very skeptical of the XML-for-everything school, and nearly
murdered a group of engineers who were using XML to transfer data from one
spot in an app to another, even though it all ran in the same JVM. So maybe
I'm more defending a small subset of XML rather than the XML-industrial
complex.

------
chmars
The guys behind this report have an interesting pricing model: Pay what you
want!

[https://detectify.com/pricing](https://detectify.com/pricing)

The pricing models has apparently worked so far. Are any active users of
Detectify here and can share their experience?

~~~
detectifyuser
I like the price but they found nothing, honestly, and we're not very nice
when I emailed them for support.

~~~
bronxbomber92
Just wanted to point out the hilarity of your typo: "we're not very nice
when..." vs "were not very nice when..." completely reversed what you meant to
say ;P.

------
raesene3
Interesting to see this hit big companies like Google. The problem, I think,
stems from the fact that most people treat XML parsers as a "black box" and
don't enquire too closely into all the functionality that they support.

Reading the spec which led to the implementations can often reveal
interesting things, like support for external entities.

~~~
vidarh
Also horrible defaults in XML parsers. That _any_ XML parser allows retrieval
of DTDs without explicit options specifying allowed sources etc. is beyond
me. It's not just local file access, which becomes a security hole when you
let users pass you XML files, though that is one of the worst ones.

But the number of times I've seen production apps that turn out, behind the
scenes, to request DTDs or schemas from remote servers regularly has made that
one of the first things I check if I am tasked to maintain or look into
anything that parses XML. Often these apps stop working or slow down for
seemingly no reason because the DTD or schema becomes unavailable, and nobody
understands why.
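As a hedged sketch of the kind of explicit default the comment is asking for (Python stdlib; the XXE payload is the textbook illustration, not code from the post): configure a SAX parser to never fetch external general entities, so a file:// entity is skipped rather than resolved. Recent Python versions already default to this, but being explicit documents the intent.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges

# Classic XXE payload: an external entity pointing at a local file.
XXE = (b'<!DOCTYPE r [<!ENTITY x SYSTEM "file:///etc/passwd">]>'
       b'<r>&x;</r>')


class Collector(ContentHandler):
    """Accumulates character data seen by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, data):
        self.chunks.append(data)


def parse_hardened(data: bytes) -> str:
    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, False)  # never fetch external entities
    collector = Collector()
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(data))
    return "".join(collector.chunks)
```

With the feature disabled, parsing the payload yields an empty string instead of the contents of /etc/passwd: the external entity reference is simply skipped.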

~~~
bambax
It's bad practice to fetch an external DTD from a server you don't control:
first for security reasons, second because your application then depends on
something that can go away at any time, third because it's rude to the third
party.

twic is right that one should always use entity resolvers that point to local
resources and that parsers should run in a sandbox without external access.

He's also right to say that by default parsers shouldn't go fetch external
resources. I think the reason is historical: entity resolvers appeared later
than the parsers themselves.

~~~
bhaak
It is bad practice, but you know it is uncannily common, right?

Just remember that the W3C had to impose download restrictions on the (X)HTML
DTDs
([http://www.w3.org/Help/Webmaster#block](http://www.w3.org/Help/Webmaster#block))

------
cheald
XML legitimately scares me. The number of scary, twisted things it can do
makes me shudder every time I write code to parse some XML from anywhere - it
just feels like a giant time bomb waiting to go off.

~~~
bambax
> _every time I write code to parse some XML_

Why would you write code to parse XML?

Use an existing parser to parse.

Use XSLT to modify/transform (including generate JSON/CSV/other).

~~~
jerf
Ironically, using an existing parser is what opens you to this vulnerability
in the first place. If you hack your own together based on a vague idea of
what XML really is, you're very unlikely to "correctly" handle entities,
you'll probably just put in enough to handle simple XHTML entities, and that
makes you immune to this problem! It's the _compliant_ parsers that are
vulnerable to this....

~~~
Peaker
Or, if you use existing parsers in a language like Haskell, you know parsing
is supposed to be a pure function. If parsing suddenly requires IO effects,
you can be suspicious and try to figure out what is going on.

~~~
gamegoblin
Even with Haskell, someone could sneak in an unsafePerformIO call if you
aren't careful. Of course, this is trivial to detect with compiler flags etc.

~~~
Peaker
We're not talking about a malicious XML library here, though. We're talking
about a misunderstanding regarding what happens during legitimate parsing of
XML.

~~~
gamegoblin
I was just responding to you about pure functions. You can make a Haskell
function with a pure type signature that includes a call to unsafePerformIO.

~~~
Peaker
You can, but:

A) Legitimate libraries don't (unless the IO action is in fact pure)

B) Rogue libraries that do this will not generally work: laziness,
optimizations, RTS races can all make the IO action run 0..N times,
arbitrarily.

C) It doesn't change the fact that in Haskell, the XML library exposes the
weird XML behavior of looking up external entities by being in IO (my original
point) -- because of A.

------
njharman
Takeaway: XML should not be used (at least as user input). It is too
powerful, too big. It is much too hard and expensive to test and validate.

Input from potentially malicious users should be in the simplest, least
powerful of formats. No logic, no programmability, strictly data.

I'm putting "using XML for user input" in the same bucket as "rolling your own
crypto/security system". That is, you're gonna do it wrong, so don't do it.
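That point in miniature (Python; the strings are made up for illustration): a plain-data format like JSON has no entity or include mechanism, so parsing untrusted input cannot trigger hidden file or network access.

```python
import json

# Untrusted input containing a file:// URL as a value.
untrusted = '{"user": "mallory", "path": "file:///etc/passwd"}'
data = json.loads(untrusted)
# The file:// value stays an inert string; nothing is fetched.
```

Contrast this with XML, where the equivalent payload can declare an entity that a compliant parser will resolve for you.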

------
NicoJuicy
Offtopic: the reply was generated with Google's internal meme generator. I
read about it here:
[https://plus.google.com/+ColinMcMillen/posts/D7gfxe4bU7o](https://plus.google.com/+ColinMcMillen/posts/D7gfxe4bU7o)

I actually dug it when I read about it a few years ago, and it's awesome
knowing that it was probably used for this reply :)

------
NicoJuicy
A job well done. This is actually impressive, and quite interesting to see
what you search for (afterwards it seems logical :))

------
enscr
Is there a startup that can help automate custom attacks on websites? Like
guide the webmaster to look for holes in their setup. I'm guessing some
security expert can do a good job educating new businesses on how to prepare
for the big bad world.

~~~
detectify
Hi! In fact, that's exactly what we do at Detectify. Just check out
[https://detectify.com](https://detectify.com)!

~~~
d0m
I think you just proved that writing an excellent blog post like you did is an
amazing way to get new customers!! Maybe make it a tad more explicit in the
post (or page) what Detectify does. I personally had no idea... but I just
checked the homepage because I liked the design and was curious, and it's only
then that I realized what you guys were doing.

------
plq
For those who'd like to know more about xml-related attack vectors, here's a
nice summary:
[https://pypi.python.org/pypi/defusedxml](https://pypi.python.org/pypi/defusedxml)

------
dantiberian
Very cool hack. Is $10,000 around the top end of what Google will pay out?
This seems like quite a serious bug as far as they go.

~~~
NicoJuicy
No.

You can see the general payout levels here:
[http://www.google.com/about/appsecurity/reward-
program/](http://www.google.com/about/appsecurity/reward-program/). Normally
the top payout is about $20,000, but for Chrome, two people have so far been
rewarded with $60,000. There is an overview of the top payouts here:
[http://www.chromium.org/Home/chromium-security/hall-of-
fame](http://www.chromium.org/Home/chromium-security/hall-of-fame).

Some payouts are $1337, $3133.7 or $31336 :P

Microsoft even rewards up to $100,000 for security issues in the latest OS
(currently Windows 8.1).

~~~
tptacek
These payouts are for product vulnerabilities; things that Microsoft and
Google ship to customers; vulnerabilities that those vendors are effectively
creating on hundreds of thousands of machines they don't own.

------
mwcampbell
I'm surprised nobody has mentioned containers, e.g. Docker, as a way of
limiting the damage from this kind of bug. In a container whose only purpose
is to run the application, /etc/passwd should be as uninteresting as:

    
    
        root:x:0:0:root:/:/bin/sh
        bin:x:1:1:bin:/dev/null:/sbin/nologin
        nobody:x:99:99:nobody:/dev/null:/sbin/nologin
        app:x:100:100:app:/app:/bin/sh

------
kirab
I think they couldn’t read /etc/shadow, so it’s not that bad at first. But
then they could surely access some configuration file of the application
itself, probably containing DB creds and of course more information that
helps to find more vulns.

~~~
thrownaway2424
It's shocking to me that baking "db creds" into a binary or configuration file
is still so common that anyone would expect it to be true on a randomly
selected server. Is this still the industry standard?

~~~
dreamdu5t
How else would you do it? If you use a configuration "service" the credentials
to access the service must be baked in.

~~~
thrownaway2424
Well, I can think of a couple of ways off the top of my head, that I'm sure
will be shouted down for being simplistic:

1) The ident protocol, or something similar. On the internet it's a disaster,
but for machines all owned by the same organization it makes sense.

2) SSL client certificates. These can be hardened in various ways, like having
the certs expire every ten minutes, etc.
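A hedged sketch of option 2 (Python stdlib; the file paths are placeholders): a server-side TLS context that refuses clients lacking a certificate signed by your internal CA, so no DB credentials need to live in a config file on the box.

```python
import ssl


def make_mtls_server_context(certfile=None, keyfile=None, cafile=None):
    """Server context that demands a client certificate (mutual TLS).

    certfile/keyfile/cafile are hypothetical paths, e.g. "server.pem",
    "server.key", and the internal CA bundle that signs client certs.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED  # reject clients without a cert
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)
    if cafile:
        ctx.load_verify_locations(cafile)
    return ctx
```

The "expire every ten minutes" hardening would be implemented out of band by having that CA issue short-lived client certificates.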

------
yummybear
You should be aware that pixelating or blurring screenshots is likely not
sufficient to ensure that the contents are unrecoverable.

------
peterkelly
I never understood why internal or external entities were included in XML. Can
anyone explain what useful purpose they serve?

~~~
bazzargh
Exactly the same as #includes and #defines in C - they let you organize your
code in multiple files, be more concise, and shoot yourself in the foot,
repeatedly.

They were useful for document-editing use cases - remember, this was before
SOAP and XML serialization, and SGML tooling that already supported this stuff
existed. You can see the record of the decision here:
[http://www.w3.org/XML/9712-reports.html#ID5](http://www.w3.org/XML/9712-reports.html#ID5)
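The #define analogy in action (Python stdlib; the entity name and text are made up): an *internal* entity declared in the document's own DTD is expanded by any compliant parser, with no external fetch involved. This is the benign use case the spec had in mind.

```python
import xml.etree.ElementTree as ET

# "&co;" acts like a #define: declared once, substituted at every reference.
doc = '<!DOCTYPE r [<!ENTITY co "Example Corp">]><r>&co; and &co;</r>'
root = ET.fromstring(doc)
# root.text is now "Example Corp and Example Corp"
```

The trouble starts when the same mechanism is allowed to reference *external* resources (SYSTEM entities), which is what the XXE in the article exploits.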

------
antocv
So, when you have read access to Google's prod servers, what else would be fun
to do besides reading /etc/passwd?

Getting the source?

~~~
skj
The source is not generally accessible from prod servers - only binaries and
supporting data, and only the ones running on that computer.

I guess it's possible you could find a computer that hosted both search and
the codebase. But since search is external-facing and the codebase is
internal, I'd bet that they don't share clusters.

~~~
ithkuil
What if that file is per-container and every piece of software runs isolated?
It's still a potential issue, because you could retrieve other sensitive
information (log files?).

~~~
skj
Sure. I was only addressing the concern of accessing source.

------
ajsharp
Cheers to Google for properly compensating these guys for their findings.

------
h1ccup
Well done. I had to deal with some similar issues in my own project, and they
weren't in legacy code either. This should push me to go through some of my
code again.

------
pearjuice
That must have been a nasty call from Sergey to NSA headquarters earlier
this week.

"Sir, I am sorry to inform you that another backdoor has been found. We will
introduce two more as agreed upon in our service level agreement."

------
sebban_
Awesome work! The bounty is a bit low though.

------
blueskin_
I wonder how many of the blurred entries were NSA.

------
4ad
Just $10k?

This sells for at least 10 times more on the black market. Why would one
rationally choose to "sell" this to Google instead of on the black market?

Some people don't break the law because they are afraid to get caught, but I
like to believe that most people don't break the law because of the moral
aspect. To me at least, selling this on the black market poses no moral
questions, so, leaving aside "I'm afraid to get caught", why would one not
sell this on the black market? Simple economic analysis.

Very serious question.

~~~
tptacek
That vulnerability _does not_ sell for 10x on "the black market".

* It fits into nobody's existing operational framework (no crime syndicate has a UI with a button labeled "read files off Google's prod servers")

* A single patch run by a single organization kills it entirely

* The odds of anyone, having gained extended access and pivoted into Google's data center, _keeping_ that access are zero.

I'm not an authority on how much the black market values dumb web
vulnerabilities but my guess on a black market price tag for this bug is
"significantly less than Google paid".

 _Later: I asked a friend. "An XXE in a single property? Worthless. And at
Google? Worth money to Google. Worth nothing to anybody else."_

~~~
flylib
"dumb web vulnerabilities" that have huge implications could fetch a pretty
penny for sure

~~~
tptacek
No, they can't. Read the inverse of my bulleted list to see what makes money:

* Bugs that fit readily into operational frameworks (ie: it would be reasonable to have a UI with a button invoking that bug and/or any of the 15 other bugs like it)

* Bugs that can't be killed with a single patch cycle by a single entity

* Bugs that provide long-term access, or access that is unlikely to get your entire syndicate caught

Example of a potentially lucrative web bug: bug in Wordpress.

Example of a bug unlikely to be lucrative: "read any Facebook server file".

I know that sounds crazy and backwards, but I don't think it is.

~~~
theboss
I think you two disagree on what a "dumb web vulnerability" is.

