
Why does Google prepend while(1); to their JSON responses? (2010) - chupa-chups
https://stackoverflow.com/questions/2669690/why-does-google-prepend-while1-to-their-json-responses
======
rayiner
> Contrived example: say Google has a URL like
> mail.google.com/json?action=inbox which returns the first 50 messages of
> your inbox in JSON format. Evil websites on other domains can't make AJAX
> requests to get this data due to the same-origin policy, but they can
> include the URL via a <script> tag. The URL is visited with your cookies,
> and by overriding the global array constructor or accessor methods they can
> have a method called whenever an object (array or hash) attribute is set,
> allowing them to read the JSON content.

What an absolutely ridiculous language and platform we decided to base the
whole web on.
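
For the scenario in the quote, a same-origin client can still read the response as text and drop the guard before parsing, which a cross-origin `<script>` include cannot do. A minimal sketch, assuming the guard is exactly the string from the title; `GUARD_PREFIX` and `parseGuardedJson` are hypothetical names, not anything Google ships:

```javascript
// Assumed guard string, taken from the thread title.
const GUARD_PREFIX = 'while(1);';

// Strip the guard if it is present, then parse the rest as ordinary JSON.
function parseGuardedJson(text) {
  const body = text.startsWith(GUARD_PREFIX)
    ? text.slice(GUARD_PREFIX.length)
    : text;
  return JSON.parse(body);
}

// Made-up response body for illustration.
const response = 'while(1);[{"subject":"hello"}]';
const inbox = parseGuardedJson(response);
console.log(inbox[0].subject); // hello
```

A page that pulls the same URL in through a script tag never sees the text, so it has no chance to strip anything; the browser just runs the loop.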

~~~
archgoon
This isn't really a language issue. It comes down to browsers having a very
complicated and ad hoc permission model for deciding when a page may make
requests, and to which servers. We could have a richer permission API exposed
to whatever language we were using (so that only certain whitelisted script
tags could make requests, for example), but we don't. In any case, this lives
at the DOM level, which isn't really part of javascript.

Pages could have more control (via a header manifest or something) over
tracking and permissioning based on where particular scripts come from; but
that's not the model we went with. All scripts get put into a single executor
and namespace. Now, this isn't an irrevocable choice, but a browser still has
to support existing pages that would still be vulnerable to fun XSS attacks.

~~~
eridius
It is a language issue, because the security hole is the fact that you can
redefine the Array constructor or accessor methods.

On a similar note, it's also crazy that you can't just use `foo instanceof Array`
to figure out if something is an Array, because it could be an Array from a
different namespace; instead you need `Array.isArray(foo)`.

~~~
tptacek
That's not really a security hole except in the weird browser context of
having to execute content-controlled code --- and then only due to the same-
origin policy. Being able to redefine anything in the system is normally a
virtue for a language.

~~~
rsj_hn
Being able to redefine everything is _never_ a virtue as it introduces non-
locality. You can't reason about what a particular piece of code is doing
without understanding the full data flow that got you to this point and any
re-definitions that might have happened along the way (with one re-definition
overwriting the previous one).

For example, look at the following code:

    
    
    function test(a) {
      if (a > 1) {
        console.log("bigger");
        return;
      } else if (a <= 1) {
        console.log("smaller");
        return;
      } else {
        console.log("Why is this happening to me?!");
      }
    }
    
    var a = {
      x: 1,
      valueOf: function () {
        this.x -= 1;
        return this.x == 0 ? 1 : 2;
      }
    };
    
    test(a);

~~~
tptacek
I'm not looking to recapitulate the debate about whether monkey patching is
good or not. It's probably bad most of the time. That doesn't make the
possibility of doing it at all, ever, bad. If it helps you, just substitute
"mainstream feature" for "virtue".

~~~
rsj_hn
There is a difference between being able to monkey patch

    
    
      * objects created in the same scope
      * objects created outside the current scope but global
      * global objects
      * core language features
    

For anyone who has to read code someone else wrote, as you go down this list
life becomes much more painful. Even something as simple as forcing all the
patches to be in one place so there is an unambiguous place to look for them
would make a huge difference.

~~~
tptacek
Well, you can do those things in Ruby, you can do them in Javascript, you can
even do them in Python. Is there a high-level "interpretable" language that
takes a hard line on this, or is it just a standard we're retrofitting onto
Javascript?

~~~
rsj_hn
No, you cannot.

In python you can certainly define your own object which might be a list and
override the list accessor for _your object_ , but you cannot override the
general list accessor for _all lists_ or those lists that you don't have a
reference to.

This is a big difference between a language like python and javascript.

In javascript, buried deep in a function scope of some library might be code
that overrides Function.prototype.apply, and then everywhere else in your code
that a function is invoked, the new behavior will take effect. Nothing like
that type of interference is possible in Python. In Python, you can only
monkey patch those things that you have a reference for.
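
To make the `Function.prototype.apply` case concrete, here is a small sketch. `installLibrary` is a hypothetical stand-in for library code, and the patch is undone right away so the demo does not disturb anything else:

```javascript
// Names of every function that goes through the patched apply.
const seen = [];

// Hypothetical "library": the patch is made inside a function scope,
// yet it changes behaviour for every function in the program.
function installLibrary() {
  const originalApply = Function.prototype.apply;
  Function.prototype.apply = function (thisArg, args) {
    seen.push(this.name); // record which function was applied
    return originalApply.call(this, thisArg, args);
  };
  // Hand back an undo function so the built-in can be restored.
  return () => { Function.prototype.apply = originalApply; };
}

function add(a, b) { return a + b; }

const uninstall = installLibrary();
const sum = add.apply(null, [1, 2]); // unrelated code, silently intercepted
uninstall();

console.log(sum);            // 3
console.log(seen.join(',')); // add
```

Nothing in `add` or its call site mentions the library, which is the non-locality being described.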

~~~
JoachimSchipper
Eh, this works fine (on Python 2.7):

    
    
        class A(object):
            def foo(self):
                return 'foo'
    
        a = A()
        a.foo()    # returns 'foo'
        a.__class__.foo = lambda self: 'bar'
        A().foo()  # new instance (!) returns 'bar'
    

You're correct for list ("TypeError: can't set attributes of
built-in/extension type 'list'"), but that's more an implementation limitation
than a principled stance.

~~~
rsj_hn
You just looked up the class of a which is A and then you patched the foo
method in the class definition of A. This is no different than you saying
A.foo = lambda ...

But what you can't do in Python is monkey patch a method to a class for which
you don't have a reference. Try it.

This is not an "implementation limitation", it is because Python is not a
prototype language, so you can't monkey patch what it means to "call a
function" for every method in every class because there is no "Function"
object that is a prototype of all method calls in Python that plays a similar
role to Function in Javascript. Similarly you can't patch "Object" de-
reference and change the behavior of every single object de-reference. Those
prototype chains are not exposed to you.

In javascript you can override methods of classes which you can't reference.
That's huge. Imagine one person creates an object with a foo method, and
someone else changes what it means for all objects of any class to call any
method. Think about what you have to audit to determine what the behavior of
foo is -- in Python, you just need to audit anything that has a reference to
the class where foo is defined. In javascript, you need to know what all the
code is doing. That's a big difference.

~~~
likpok
Python did have some scheme to do something like this. If you were using
gevent, you needed to call gevent.monkey_patch() to override all the network
calls that you were making under the hood, so that they would happen on the
event loop.

Yes, gevent did have a reference to the classes it was patching, but it still
made pretty deep changes to a program.

It was janky and has been replaced several times by other async frameworks. It
doesn't go as far as javascript to be sure, but it is sort of what you're
suggesting.

~~~
rsj_hn
Yeah, the issue isn't monkey patching per se, but lack of isolation.
Javascript just doesn't support the concept of a "module" that cannot
interfere with globals via side effects or is somehow restricted to
interacting with the rest of the system via a clearly defined API that can be
checked without reading all the code in the module.

That, combined with the fact that people load thousands of modules just to do
simple short programs, makes it very hard to reason about javascript code
efficiently.

You can try to hide global references from a javascript module, for example by
using "with", but you will always have access to things like constructor()
which will give you the raw window, so you can't hide a reference to Window
even if you shadow Window via "with".

Even ignoring that, you still can overwrite builtins from anywhere in the
code, thus effectively changing the shared runtime. There is no notion of
scoped builtins -- e.g. overwrite builtins all you want, but just in your own
scope.

That lack of isolation is the problem, not monkey patching your own objects.
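
The "you can't hide the global" point can be sketched without `with`. This assumes plain parameter shadowing as the hiding mechanism, and `runWithShadowedGlobals` is a made-up helper; in a browser the leaked object would be the window:

```javascript
// Hypothetical sandbox attempt: call untrusted code with the global
// names shadowed by parameters that are left undefined.
function runWithShadowedGlobals(untrusted) {
  return untrusted(undefined, undefined);
}

const leaked = runWithShadowedGlobals(function (window, globalThis) {
  // Both names are undefined in here, but any function still exposes
  // the Function constructor, which compiles code in the global scope.
  const FunctionCtor = (function () {}).constructor;
  return FunctionCtor('return this')();
});

console.log(leaked === globalThis); // true
```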

------
Sephr
I created a UTF-7 JSON hijacking method back in 2010 that enabled full hijack
of entire JSON data streams in Firefox and Safari:
[https://www.reddit.com/r/programming/comments/b7ebd/json_sni...](https://www.reddit.com/r/programming/comments/b7ebd/json_sniffing_with_utf7_injections_will_only_work/)

Originally at [https://code.eligrey.com/sec/json-
hijacking](https://code.eligrey.com/sec/json-hijacking) (archive:
[https://web.archive.org/web/20100304213300/http://code.eligr...](https://web.archive.org/web/20100304213300/http://code.eligrey.com/poc/json-
hijacking/))

------
robocat
[http://www.tomanthony.co.uk/blog/facebook-bug-confirm-
user-i...](http://www.tomanthony.co.uk/blog/facebook-bug-confirm-user-
identities/)

Mentions how inconsistent usage of the XSSI prevention caused an information
leak that could detect which user was logged into Facebook - received a bug
bounty of $1000 from Facebook.

Previous discussion:
[https://news.ycombinator.com/item?id=19306309](https://news.ycombinator.com/item?id=19306309)

------
gtirloni
Discussed a few times here:

[https://hn.algolia.com/?query=google%20prepend%20while(1)](https://hn.algolia.com/?query=google%20prepend%20while\(1\))

~~~
chupa-chups
Sorry, didn't notice :( This time I relied on HN's mechanism to automatically
redirect one to old posts with the same URL.

Usually I check.

~~~
a-wu
Well I'm one of today's 10,000[0] so thanks for posting anyway!

[0] [https://www.xkcd.com/1053/](https://www.xkcd.com/1053/)

~~~
etxm
I’ve never seen that xkcd before.

Very meta.

~~~
domoritz
There is an XKCD for almost every situation.

------
amenghra
It protected against a weird javascript edge case in the distant past and
against flash injection in the past.

See also [https://stackoverflow.com/questions/15306636/why-do-
facebook...](https://stackoverflow.com/questions/15306636/why-do-facebooks-
jsonp-callbacks-start-with)

------
dang
2017:
[https://news.ycombinator.com/item?id=14280625](https://news.ycombinator.com/item?id=14280625)

2013:
[https://news.ycombinator.com/item?id=6982205](https://news.ycombinator.com/item?id=6982205)

------
AntonyGarand
I made a post a few months ago regarding this exact vulnerability:
[https://dev.to/antogarand/why-facebooks-api-starts-with-a-
fo...](https://dev.to/antogarand/why-facebooks-api-starts-with-a-for-
loop-1eob)

HN discussion:
[https://news.ycombinator.com/item?id=18443125](https://news.ycombinator.com/item?id=18443125)

---

While this vulnerability is getting old, it's very interesting to see its
prevention still in effect on major websites. Even if the original
vulnerability is patched, we never know when a modern variant might pop out,
such as using UTF16BE as charset to extract array data!

------
ramshorns
Something I just learned about JavaScript: the semicolon is an empty
statement, so this is an infinite loop that does nothing.
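
A terminating variant shows the same thing without hanging: the loop body below is just the empty statement, and all the work happens in the condition.

```javascript
let i = 0;
while (i++ < 3); // the body is the empty statement ";"
console.log(i);  // 4
```

With `while(1);` the condition never becomes false, so nothing after the loop is ever reached.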

~~~
ygra
JavaScript shares this trait with a lot of languages that borrow syntax from
C.

~~~
gravypod
A good example of this is a simple strcpy in C:

while ( _a++ =_ b++);

~~~
umanwizard
Correct version (not broken by formatting):

    
    
      while (*a++ = *b++);

~~~
amelius
How did you manage to enter it in HN? Is there an escape sequence for
asterisk? Or are you using special Unicode codepoints?

~~~
umanwizard
Double space at the beginning of a line makes that line verbatim (and
monospaced).

[https://news.ycombinator.com/formatdoc](https://news.ycombinator.com/formatdoc)

------
asimpletune
This reminds me of the arguments that the folks at n-gate make over why they
won’t use https. Basically, “it’s not me, your client is broken”.

There are too many conflicts of interest to fix the broken client.

~~~
gowld
The n-gate that is a web log of Hacker News comments they don't like?

~~~
jmiserez
Yes, but specifically this article that is unrelated to HN comments:
[http://n-gate.com/software/2017/07/12/0/](http://n-gate.com/software/2017/07/12/0/)

------
spockz
I thought this kind of protection was only necessary when using arrays on top-
level json. Can nested arrays still be obtained somehow when not using this
(or other) prefix?
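
The top-level part of that intuition can be checked by asking whether a response body even parses as script code. A rough sketch, using a function body as a stand-in for a script and a hypothetical helper `parsesAsScript`; it only covers the syntax side and says nothing about other variants, such as the charset tricks linked elsewhere in the thread:

```javascript
// Returns true if `source` is syntactically valid as code.
// new Function only compiles the body here; it is never called.
function parsesAsScript(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

console.log(parsesAsScript('[1, 2, 3]'));          // true
console.log(parsesAsScript('{"a": [1, 2, 3]}'));   // false
console.log(parsesAsScript('while(1);[1, 2, 3]')); // true
```

A top-level object is read as a block statement and fails on the colon, so it never runs as a script. The last case parses, but running it would loop forever before the array is evaluated.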

------
gravypod
Is this still needed with CORS?

~~~
AntonyGarand
It is for older browsers that do not support CORS, as well as for different
variants such as using a script with the UTF16BE charset:

[https://portswigger.net/blog/json-hijacking-for-the-
modern-w...](https://portswigger.net/blog/json-hijacking-for-the-modern-web)

------
ec109685
This isn’t needed for browsers newer than 2011:
[https://stackoverflow.com/questions/16289894/is-json-
hijacki...](https://stackoverflow.com/questions/16289894/is-json-hijacking-
still-an-issue-in-modern-browsers)

~~~
detaro
From a link on that page, various variations that worked in newer browsers:
[https://portswigger.net/blog/json-hijacking-for-the-
modern-w...](https://portswigger.net/blog/json-hijacking-for-the-modern-web)

