Break Google

cfinke · on Sept 16, 2011

When you search for "${", the page is missing 26 lines of minified JavaScript (lines 9-35 of a non-broken page, at least for me), almost certainly because of a templating bug. These lines, among other things, are responsible for adding the top toolbar to the page. (The missing JS is here: http://pastebin.com/B9cy3T2c)

jcampbell1 · on Sept 16, 2011

I think google search uses this templating language:

http://code.google.com/p/google-ctemplate/

It makes sense that the ${ could cause problems.

simonw · on Sept 16, 2011

In my experience it's pretty rare for template language bugs to cause errors just because user-entered content includes one of the language's special characters. The template would have to be evaluated twice for any problems to occur: once to insert the user's template code into placeholders within the original template, and then once again to execute the resulting combination.
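To make the double-evaluation point concrete, here is a minimal sketch in Python using the standard library's string.Template. It is not Google's templating system; the function names and the page markup are made up for illustration. The first function treats the query as data, while the second splices it into the template source before evaluating.

```python
from string import Template


def render_once(query: str) -> str:
    # Single evaluation: the query is substituted as data and is
    # never re-parsed as template source.
    page = Template("<p>Results for: $query</p>")
    return page.substitute(query=query)


def render_twice(query: str, user: str) -> str:
    # Hypothetical bug pattern: pass one pastes the query into the
    # template text, pass two evaluates the combined text.
    first_pass = "<p>Results for: %s</p><p>Signed in as $user</p>" % query
    return Template(first_pass).substitute(user=user)
```

With a single pass, render_once("${") returns the text unchanged. With two passes, render_twice("$user", "alice") fills the attacker-supplied placeholder with "alice", and render_twice("${", "alice") raises ValueError because the second pass sees an unterminated placeholder.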

kragen · on Sept 16, 2011

Sad to say, Mustache.js had exactly this bug the last time I checked, but only under some circumstances.

Minimal reproduction:

        Mustache.to_html('{{b}}', {b: '{{c}x}' }) -> '{{c}x}'
        Mustache.to_html('{{#a}}{{b}}{{/a}}', {a: [{b: '{{c}x}' }]}) -> '{{c}x}'
        Mustache.to_html('{{b}}', {b: '{{c}}' }) -> '{{c}}'
        Mustache.to_html('{{#a}}{{b}}{{/a}}', {a: [{b: '{{c}}' }]}) -> '' (wrong)

tptacek · on Sept 16, 2011

Hrm, but is $ a metacharacter in that language?

brlewis · on Sept 16, 2011

No. It doesn't even appear on http://google-ctemplate.googlecode.com/svn/trunk/doc/referen...

RyanKearney · on Sept 16, 2011

Hrm, did you click the link?

irahul · on Sept 16, 2011

Most likely he did. Did you?

If you did click and skim the page at 1000000000 words/sec, those `$` over there are for USD, and not part of templating system.

tcarnell · on Sept 16, 2011

I disagree! It does not make sense that submitting any text via a form input should in any way interfere with a templating engine, in the same way we don't expect to be able to affect a database by entering SQL into a form field.

The fact that Google brings back an empty result set to me indicates the problem is a bit deeper...

vinhboy · on Sept 16, 2011

Can you tell us what tools you used to find all this info and to un-minify their js? thanks!

cfinke · on Sept 16, 2011

I did a diff of the source code for a SERP for "${" and a SERP for "$$", ignored any lines that were the same except for s/${/$$/, and then un-minified with http://jsbeautifier.org/
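The same comparison can be sketched in Python with difflib. This is an assumption-laden illustration rather than the tool actually used above: the function name and the sample page strings are invented, and the normalisation step simply rewrites one query into the other so that lines which only echo the query compare equal.

```python
import difflib


def differing_lines(page_a: str, page_b: str, query_a: str, query_b: str) -> list[str]:
    # Rewrite query_a as query_b in page A so lines that differ only
    # by the echoed search term are treated as identical.
    a = page_a.replace(query_a, query_b).splitlines()
    b = page_b.splitlines()
    # Keep only lines removed from A ("- ") or added in B ("+ ").
    return [line for line in difflib.ndiff(a, b)
            if line.startswith(("- ", "+ "))]


# Made-up stand-ins for the two result pages.
broken = "<title>${ - Search</title>\n<div id=results></div>"
working = "<title>$$ - Search</title>\n<script>toolbar()</script>\n<div id=results></div>"
missing = differing_lines(broken, working, "${", "$$")
```

Here missing holds the single line "+ <script>toolbar()</script>", i.e. the content present in the working page but absent from the broken one.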

aristus · on Sept 16, 2011

Tip to the poster, and to anyone: Google (and Facebook, and others) have bug bounty programs. You can get paid tens to thousands of dollars if you report vulns to the vendor first.

johnbatch · on Sept 16, 2011

Pretty low bounty.

The base reward for qualifying bugs is $500. If the rewards panel finds a particular bug to be severe or unusually clever, rewards of up to $3,133.7 may be issued.

http://googleonlinesecurity.blogspot.com/2010/11/rewarding-w...

tptacek · on Sept 16, 2011

For random XSS-style things on web pages, that's a high bounty.

The crazy lucrative bugs you may have heard of tend to be drive-by remote code execution in popular clientsides (like IE or Flash), and the stories about valuations tend to be apocryphal.

tshaddox · on Sept 16, 2011

Even those bounties are only for security-related bugs, one of which this doesn't appear to be.

0x12 · on Sept 16, 2011

So, that gives you your BATNA for negotiations with the dark side...

Either google is very confident that they don't have serious bugs or they are setting themselves up for a problem. Imagine the value of finding a serious bug in adsense or adwords.

snowboardbum1 · on Sept 16, 2011

For me just typing ${ breaks the layout. I agree it probably has something to do with a template engine. I know Java EL uses the syntax ${variable_name} and so does Velocity Templates.

The bug doesn't exist on https://encrypted.google.com/

karlzt · on Sept 16, 2011

and it doesn't exist if you search it from the url bar or the searchbar (with google as search engine) of firefox.

pavpanchekha · on Sept 16, 2011

Oh, cute. Yet another injection flaw in Google.

Guys, (and I don't mean Google, I mean all of us), don't fix injection by plugging injection bugs; put together some framework that actually avoids all of these problems (or at least doesn't let you add bugs).

nokcha · on Sept 16, 2011

This is actually a hard problem in the general case, and it is an active area of research. One promising approach is static taint analysis, wherein the source code of a web app is analyzed to detect whether "tainted" output is given to a sensitive "sink" without being properly sanitized. See, e.g., Omer Tripp et al., "TAJ: Effective Taint Analysis of Web Applications" (PLDI 2009) (http://www.cs.tau.ac.il/~omertrip/pldi09/paper.pdf).

As an example of a difficult case, consider the following pseudocode snippet:

  ...
  if request['raw']:
      print("Content-Type:text/plain; charset=utf-8\r\n\r\n")
      print(doc)
  else:
      print("Content-Type:text/html; charset=utf-8\r\n\r\n")
      print(html_sanitize(doc))
  ...

In one branch, doc must be HTML-sanitized; in the other branch, it must not.
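A runnable version of the snippet, as a sketch: html.escape from the Python standard library stands in for the hypothetical html_sanitize(), and the render function and its return shape are assumptions made for illustration.

```python
import html


def render(doc: str, raw: bool) -> tuple[str, str]:
    # Returns (content type, body). The same doc needs different
    # treatment depending on which branch is taken.
    if raw:
        # Plain text: escaping here would corrupt the output.
        return "text/plain; charset=utf-8", doc
    # HTML: special characters must be escaped before output.
    return "text/html; charset=utf-8", html.escape(doc)
```

An analysis tool has to track that doc is safe in the first branch only because of the content type, which is what makes the case awkward.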

colanderman · on Sept 16, 2011

That's a poor example. I would never send a document as HTML without tags. html_sanitize() should really be generate_html(), which adds structure to the document.

What the GP is saying (and I agree with) is that generate_html() should use a library which understands HTML structure and only allows content to be generated using a strict API (no doc+="<foo>bar</foo>" garbage).

Such a discipline greatly reduces the chance of injections, to the point where you have to actively write code to create injection points. And it's simple to follow: any time you write HTML, use the library.

Taint analysis sounds nice in theory, but you can get the same effect by writing code modularly (i.e. only one small module can actually access the raw output stream) and using libraries to create structured data.
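A rough sketch of that discipline in Python, using xml.etree.ElementTree from the standard library as the structure-aware library. The generate_html signature and the markup it builds are invented for illustration; the point is only that content is attached through an API and never concatenated into markup.

```python
import xml.etree.ElementTree as ET


def generate_html(title: str, comment: str) -> str:
    # Build the document as a tree; user text is attached as node
    # text and is never spliced into a markup string.
    root = ET.Element("div", {"class": "comment"})
    heading = ET.SubElement(root, "h2")
    heading.text = title
    body = ET.SubElement(root, "p")
    body.text = comment
    # The serializer escapes &, < and > in text nodes.
    return ET.tostring(root, encoding="unicode", method="html")


page = generate_html("Hi", "<script>alert(1)</script>")
```

The serialized page contains &lt;script&gt; rather than a live script tag. One caveat of this particular serializer: with method="html" it leaves the text of script and style elements unescaped, so user text should not be placed in those.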

jerf · on Sept 16, 2011

You can't tell where "doc" is coming from in a snippet like that. I use logic moderately similar to that for my blog software. If I'm writing the blog post, pretty much pass through what I wrote. If it's a comment (back when I had them), process the heck out of it. The blog post itself is a blob of HTML, basically.

I do not currently write my blog posts in a templating language (any more than anyone else does), though Hamlet [1] has me sort of tempted as it is so close to what I write anyhow.

[1]: http://www.yesodweb.com/book/templates

defen · on Sept 16, 2011

It's not a poor example, and it's not about sending the document without tags. It's about whether special characters should be escaped, and the answer depends on the Content-Type that the client requested.

Peaker · on Sept 17, 2011

A framework could use static types to tag whether it is escaped or not.

Then, the framework could map different kinds of requests (e.g: raw content vs. html content) to different types.

Then, the only way to convert between the types are functions that do proper escaping.
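A minimal sketch of the idea in Python. All names here (Raw, SafeHtml, escape, html_response) are hypothetical. Python is not statically typed, so a checker such as mypy would be what flags misuse ahead of time; the isinstance check below is a runtime stand-in for that, and the rule that only escape() constructs SafeHtml is a convention rather than something the language enforces.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class Raw:
    # Untrusted text exactly as it arrived.
    text: str


@dataclass(frozen=True)
class SafeHtml:
    # Markup that has already been escaped for an HTML context.
    markup: str


def escape(value: Raw) -> SafeHtml:
    # By convention, the only function that turns Raw into SafeHtml.
    return SafeHtml(html.escape(value.text))


def html_response(body: SafeHtml) -> str:
    # Runtime stand-in for a static type error.
    if not isinstance(body, SafeHtml):
        raise TypeError("html_response requires SafeHtml")
    return "Content-Type: text/html; charset=utf-8\r\n\r\n" + body.markup


def text_response(body: Raw) -> str:
    # Plain-text responses take the raw value unescaped.
    return "Content-Type: text/plain; charset=utf-8\r\n\r\n" + body.text
```

Passing a Raw value straight to html_response raises TypeError, so forgetting to escape becomes a visible error instead of a silent injection point.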

MostAwesomeDude · on Sept 17, 2011

twisted.web.template is a great example: http://twistedmatrix.com/documents/current/web/howto/twisted...

pavpanchekha · on Sept 16, 2011

You're making the same mistake made by people who confuse good static type systems with type inference. Yes, it would be nice to infer the correct escaping function; but forcing people to escape somehow would be sufficient.

AndyKelley · on Sept 16, 2011

Django does this.

pavpanchekha · on Sept 16, 2011

No. Django solves the 90% problem, which is usually a fine approach but will likely lead to security vulnerabilities down the line.

I'll refer to something I wrote last time I had this argument: http://pavpanchekha.com/programming/injection.html.

orclev · on Sept 16, 2011

Nice article. In response to the last part, I can think of a way to achieve the sort of smart escaping via template you talk about using Haskell and Hamlet (among other templating systems used by Yesod). I believe, although I can't absolutely confirm, that Hamlet already performs context-appropriate escaping, based mostly on the type signatures and the names of a few of the functions.

gregwebs · on Sept 16, 2011

Yes, Hamlet does context specific escaping. It will handle all the examples given, except you can't mix your javascript in with your html (which is generally good advice anyways).

I disagree with the article's premise that injection is always a display issue. In the Yesod web framework (http://www.yesodweb.com), which uses Hamlet, we sanitize (not strip) HTML by default before it is ever put in the database. The more you can make injection not a display issue, the better; you just have to know your options.

dmm · on Sept 16, 2011

You should replace your '<' with '&lt;' if you're going to claim your page is xhtml.

pavpanchekha · on Sept 16, 2011

Thanks. Pages are compiled with org-mode, I'll report a bug.

dmm · on Sept 17, 2011

I love org-mode.

AndyKelley · on Sept 18, 2011

I don't know what you mean by "90% problem", but unlike what your blog article suggests, Django's template engine escapes everything by default. You have to explicitly pass content through a filter to request that it not be escaped.

Based on the fact that the suggestions in your blog article could easily support someone forgetting the "|escape" on a variable, I would accuse your methodology of only solving the "90% problem".

abeld · on Sept 16, 2011

How exactly?

Due to Django's "we want the templating system to be general, to be usable for stuff other than html" stance, it can't provide support for such 'guarantee that the output is well formed / valid / has no injection attack entry points' features.

AndyKelley · on Sept 16, 2011

> How exactly?

Everything is escaped by default, and you have to explicitly request for your content to be unescaped.

wisty · on Sept 16, 2011

OK, you escape stuff. Great.

How do you handle user-submitted image tags? Do you allow rich formatting, and if so, how do you sanitize it? (see http://www.codinghorror.com/blog/2008/08/protecting-your-coo...)

I think that Django uses sha for password hashes. They should use bcrypt, right? Did you turn on XSRF protection (which I think is off by default)? Are cookies secure?

Web security is not as simple as 's.replace("<","&lt;")' (escaping by default).

How do you generate slugs? Could someone put something nasty in a pathname or URL?

Django is not secure. You can secure it, with minimal effort, if you keep things radically simple. But you do need to know what can go wrong, so you don't introduce any "features" that are actually "gaping security holes".

Even if you do everything right, it doesn't mean that Django is magically secure no matter what people use it for (obvious, yes, but this is HN and sometimes failing to point out the obvious can get you downvoted). That's why people are objecting.

Pewpewarrows · on Sept 16, 2011

Handling user-submitted image tags is (in my opinion) way outside the scope of the framework. Which tags and attributes to whitelist, or whether to use html markup at all compared to a different language like markdown, is very project dependent. If you have to, just install BeautifulSoup or any of the other great libraries that have cropped up in the last year or so to handle the sanitizing.

Django uses sha for password hashes because until recently there hasn't been a better library to ship with natively across all the platforms that Django supports. If you know you'll only be working on *nix, django-bcrypt can enhance the default password hashing behavior. As other commenters have noted, they're moving to PBKDF2 in the near future as a better included hashing library.

CSRF is on by default. If you need secure cookies and HSTS headers, there's a package that provides them called django-secure, which last I heard is being rolled into Django proper in the near future.

Django prevents path traversal and anything else you can imagine that might be nasty in a URL. The auto slug generation included.

So how exactly is Django not secure again? Where are the "gaping security holes"? Or do you have no idea what you're talking about?

jwpeddle · on Sept 16, 2011

CSRF is on by default. Cookies could be more secure, and it's being worked on. Django is moving to PBKDF2 (there's no pure Python bcrypt lib). There's not really opportunity to do anything interesting with slugs.

Like any framework, there will always be room to improve security, but it does do very well out of the box. At least it makes you work to expose anything obvious.

exogen · on Sept 16, 2011

Seems likely that it's due to a lack of escaping in a custom templating layer. I wonder if it could be used to perform an XSS attack?

ahlatimer · on Sept 16, 2011

Unlikely. You'd have to get someone else to run the same query.

exogen · on Sept 16, 2011

...which you can do simply by posting a link anywhere.

Edit: I guess it would be more helpful to explain why for those not familiar with XSS. If all it takes is a specially crafted URL to your site to exploit it, your site is toast. The security model of the web assumes that people can open even the shadiest of links without negative consequences. I could have obscured the URL with a shortener and named the link "Cutest cat pic ever!" I could have hosted a page on a totally separate domain and put the crafted URL in a hidden iframe. All I have to do is send document.cookie over to my server and now I control your account.

ahlatimer · on Sept 16, 2011

My mistake. I thought it didn't work if you linked to it directly. It turns out the bug just manifests itself differently if you do that.

inconditus · on Sept 16, 2011

Or iframe it.

gburt · on Sept 16, 2011

If this is a templating engine type thing, you should be able to do something like

${KEYWORD}${

If you can figure out what "KEYWORD" is for a given template tag as well. I tried links and a few others, but none that I can identify: it does still reproduce the bug though.

michaelpaul · on Sept 16, 2011

And I thought this was one of the most sanitized input fields on the internet. Let's see how long it takes Google to deploy a fix on their search.

alorres · on Sept 16, 2011

Also works if you use the html code: "${" It returns the symbol instead of the search query.

epaga · on Sept 16, 2011

Interestingly enough, the https version of Google doesn't have this bug.

https://encrypted.google.com/#q=${

looks fine, but

http://www.google.com/search?sourceid=chrome&ie=UTF-8...{

doesn't.

dchest · on Sept 16, 2011

The first one is broken for me too (in a different way, though).

tathagatadg · on Sept 16, 2011

They fixed it .. http://google.com/#q=${

redwood · on Sept 17, 2011

funny tho that at this point such a query would not return the discussion about this flaw :)

esrauch · on Sept 16, 2011

Anyone have an explanation for this? Looks like it's messing up the CSS for the top link nav.

raldi · on Sept 16, 2011

I'm not seeing it. Can you post a screenshot?

dangrossman · on Sept 16, 2011

http://i.imgur.com/3UuEZ.png

adriand · on Sept 16, 2011

Looking at those screenshots, I see that you all tried:

${

However, I tried the literal

'${'

Which also breaks it. In fact, it looks like anything that has ${ in it will break it, anywhere at all in the search string.

Also, if you close the braces, e.g. ${}, it fixes it. This works with any number of leading ${, e.g. ${${${${${}

0x12 · on Sept 16, 2011

you are leaking your gmail address with that screenshot.

not sure if that was intended or not.

vorbby · on Sept 16, 2011

Hmm, anybody have an idea as to why mine's different?

http://i.imgur.com/OrqtK.png

gburt · on Sept 16, 2011

You loaded the page directly. It only seems to happen if autosearch was involved.

Loading http://www.google.com/search?q=${ does what you see, but entering ${ into the search box and pressing enter does what everyone else sees.

edit: compare your clean human generated address bar to the other screenshot's messy software generated one.

hammock · on Sept 16, 2011

It's still broken, even if it looks different

vorbby · on Sept 16, 2011

Ahh, makes sense. Thanks for the explanation.

laCour · on Sept 16, 2011

It looks that way when you search using Chrome's omnibox. Go to google.com then try.

akkartik · on Sept 16, 2011

Maybe it's a specific theme? The old default theme?

jc4p · on Sept 16, 2011

Here you go http://i.imgur.com/9y5TK.png

vorbby · on Sept 16, 2011

http://i.imgur.com/OrqtK.png

bengl · on Sept 16, 2011

Note: this also works: http://www.google.com/#q=$%7B
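For context, %7B is the percent-encoded form of {, so that URL carries the same ${ query. A quick check with Python's urllib.parse (the variable names are just for illustration):

```python
from urllib.parse import quote, unquote

# Percent-encode "{" while leaving "$" as-is, matching the URL above.
encoded = quote("${", safe="$")

# Decode the query fragment from the URL back to the literal text.
decoded = unquote("$%7B")
```

Here encoded is "$%7B" and decoded is "${".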

jdandrea · on Sept 16, 2011

The Google Search Appliance, with roots in Google proper, generates XML results, which are then (normally) transformed via XSLT. (I'm looking at you, XPath.)

http://mikewest.org/2007/06/escaping-curly-braces-in-xslt-at...

I s'pose attributes don't enter into it, but still, I wonder if the XSLT pass (if any) has anything to do with this?

IanMikutel · on Sept 16, 2011

I bet Google knows about this, and when they do bug triage it's such a low priority that it just hasn't been fixed yet.

simplycomplex · on Sept 16, 2011

This looks like a javascript issue but I have seen a server error from google - http://www.rajeeshcv.com/2010/07/have-you-seen-google-search...

nhebb · on Sept 16, 2011

On a related note, searches for many symbols do not produce results. Searching for '&' will bring up results for ampersand, but most others that map well to words do not, e.g. $ => dollar, % => percent, etc.

deleo · on Sept 16, 2011

Looks like the syntax of Google's CTemplate someone posted today: http://code.google.com/p/google-ctemplate/

tshaddox · on Sept 16, 2011

> Shortest way to produce issue is here — http://google.com/#q=${

Making it an actual hyperlink would've been a bit shorter.

SwellJoe · on Sept 16, 2011

I would have been suspicious of a hyperlink in this context...since this is potentially discussing an XSS vulnerability on Google (not necessarily, but maybe).

sullof · on Sept 16, 2011

This is from yesterday: http://twitter.com/#!/passpack/status/114181283097214977

mikecane · on Sept 16, 2011

I tried it in my Search box at WordPress.com blog and it has no effect.

zeynalov · on Sept 16, 2011

try to search this and see real break 9999999..99999999999999999999999

kaerast · on Sept 16, 2011

No, that's designed to break - within that range is a large number of credit card numbers, and Google blocks that. The OP however is a genuine bug.

zeynalov · on Sept 16, 2011

thanks for info

henriquepss · on Sept 16, 2011

That's 'cause the money ($) is the key (}). Oh, you're welcome.

karlzt · on Sept 16, 2011

the bug exists only when searching on the google homepage; it doesn't happen when searching from the browser's searchbar.

pyre · on Sept 16, 2011

Breaks for me in either case, just in different ways (an auto-search link breaks the toolbar, while the other one doesn't even display the toolbar).

dadoner · on Sept 16, 2011

it doesn't break on google.co.ma

skeptical · on Sept 16, 2011

Everyone seems to avoid mentioning the obvious. This breaks the page layout only, not really a critical issue concerning google's integrity/security.

Still, this obviously doesn't look good. Above anything else google has excelled at being simple and reliable. All this javascript goodness added recently might be a step in the wrong direction. If stuff like this starts to happen every now and then, google's reputation might be at stake.

speleding · on Sept 16, 2011

If your templating language is going to use a magic character it would seem useful to pick something less common than $. There are several odd characters on my keyboard (§`~±|¤) and if you are willing to use the ALT key there are really obscure characters that you can safely filter from the input instead of going through the trouble of escaping them. Filtering is so much more efficient/easier/safer than escaping.

Imagine how much easier life would be if in HTML we only had to filter for § instead of escape every <, > and ".

speleding · on Sept 17, 2011

Why the down vote?

dramaticus3 · on Sept 16, 2011

~ is not unusual on the internet, it is used for home directory webspace

http://www.proweb.co.uk/~matt for instance

dramaticus3 · on Sept 17, 2011

That's my homepage. It's been a long time since I read it.

Reading it now as a treatise from my younger self, though I didn't write it, I realise that spirit is lost. For a while it was "our" place but now we have to return to the underground.

dramaticus3 · on Sept 17, 2011

I changed it

http://www.proweb.co.uk/~matt/oldworld.html