JSON users: Avoid CSRFs by not using top-level arrays (pocoo.org)
186 points by Jach on June 18, 2011 | 81 comments

I'm not familiar with CSRF so had to look this up:

[Cross Site Request Forgery] vulnerabilities occur when a website allows an authenticated user to perform a sensitive action but does not verify that the user herself is invoking that action. The key to understanding CSRF attacks is to recognize that websites typically don't verify that a request came from an authorized user. Instead they verify only that the request came from the browser of an authorized user. Because browsers run code sent by multiple sites, there is a danger that one site will (unbeknownst to the user) send a request to a second site, and the second site will mistakenly think that the user authorized the request.

From: http://freedom-to-tinker.com/blog/wzeller/popular-websites-v...

Via: http://www.codinghorror.com/blog/2008/10/preventing-csrf-and...

A few days ago I asked, "Ask HN: Best Approaches to Prevent Session Hijacking?" (http://news.ycombinator.com/item?id=2663293) and presented a technique I'm experimenting with that uses session timestamps as tokens to validate that each request is authentic and an attacker hasn't stolen the session key.

It works by updating the token (timestamp) on the server and in the cookie on each request and they both have to match on the next request. A short buffer (say 30 seconds) is permitted to avoid false positives from click-happy users on slow/flaky connections.
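For anyone trying to picture the rotation, here is a minimal sketch of how I read it. Everything here is an assumption for illustration: the in-memory session object, the names `createSession` and `validateAndRotate`, and the reading of the "buffer" as a window in which the previous token is still accepted. It leaves out the encryption and signing entirely.

```javascript
// Hypothetical sketch: per-request token rotation with a short grace window.
const GRACE_MS = 30 * 1000; // how long the previous token stays acceptable

function createSession(now) {
  return { current: String(now), previous: null, rotatedAt: now };
}

// Returns the token the server should set in the cookie next,
// or null if the request must be rejected.
function validateAndRotate(session, cookieToken, now) {
  if (typeof cookieToken !== "string") return null;

  const matchesCurrent = cookieToken === session.current;
  const matchesPrevious =
    cookieToken === session.previous && now - session.rotatedAt <= GRACE_MS;

  if (!matchesCurrent && !matchesPrevious) return null;

  if (matchesCurrent) {
    // Normal case: rotate to a fresh timestamp token.
    session.previous = session.current;
    session.current = String(now);
    session.rotatedAt = now;
  }
  // Grace case: the client raced a rotation, so hand back the current token.
  return session.current;
}
```

A request carrying the previous token inside the window is let through without rotating again; anything older, or anything that never matched, is rejected.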

Has anyone used this technique in production?

How do you mitigate problems with a user opening multiple tabs over a span of time, and attempting to submit the first one opened?

I believe the best practice is to provide the user's session id in their cookie and provide that with the requests via javascript or even at the time of page rendering. Since a third party (untrusted domain) cannot access information from your cookie, they also cannot form a valid request.

This method is called double submitting cookies.

See: https://secure.wikimedia.org/wikipedia/en/wiki/Cross-site_re...

A session is shared between multiple browser tabs so this shouldn't be a problem. If the user opens a new tab/browser in incognito mode, then the browser will ask them to log in again because it can't read their session to authenticate their identity, and then when they log in they will have two separate sessions going, each with a unique token/timestamp, so multiple sessions aren't a problem either.

My bad, I thought the timestamp was being inserted into a form à la Rails' authenticity_token method in a way that would allow it to go stale if a new request was submitted in a new tab.

Although this makes sense, I feel it is not as safe as double submitting cookies, which effectively creates pseudo-random requests, but should still be safe enough in practice.

If you double submit on each request, this will double the number of requests on your server and reduce your server's capacity. And isn't all JavaScript subject to being subverted in XSS attacks, thus potentially rendering it ineffective?

In double cookie submission, you are not issuing two requests to the server, but rather using javascript to append the session id to either the post body or the URI.

Since the browser automatically submits the cookie via HTTP Headers, single submission by itself is not safe. Since a third party cannot read the value of the cookie, they cannot recreate the proper request and will, consequently, fail.
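A rough server-side sketch of that check, in case it helps. The cookie name `csrf_token` and the helper names are made up for the example, and it uses a separate random token rather than the session id itself, which is an assumption on my part and not something stated above.

```javascript
// Hypothetical sketch of a double-submit cookie check (Node).
const nodeCrypto = require("crypto");

// Parse a raw Cookie header into a plain object.
function parseCookies(header) {
  const out = {};
  for (const part of (header || "").split(";")) {
    const idx = part.indexOf("=");
    if (idx === -1) continue;
    out[part.slice(0, idx).trim()] = decodeURIComponent(part.slice(idx + 1).trim());
  }
  return out;
}

// Random value the server sets as the `csrf_token` cookie.
function issueToken() {
  return nodeCrypto.randomBytes(32).toString("hex");
}

// The browser sends the cookie automatically; page script must also copy the
// value into the body, query string, or a header. Both copies have to match.
function isDoubleSubmitValid(cookieHeader, submittedToken) {
  const cookieToken = parseCookies(cookieHeader).csrf_token;
  if (!cookieToken || !submittedToken) return false;
  const a = Buffer.from(cookieToken);
  const b = Buffer.from(String(submittedToken));
  return a.length === b.length && nodeCrypto.timingSafeEqual(a, b);
}
```

A forged cross-site request still carries the cookie, but the attacker's page cannot read it, so it has nothing valid to put in the second copy.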

Of course, both our methods will fail under an XSS, but should still prevent CSRF. I still think a cryptographically generated secret stored in the cookie is less guessable than a timestamp.

The timestamp, session_id and user_id are tucked away in an AES encrypted bundle with a SHA-1 signature.

How much does running everything over HTTPS mitigate this cross-site stuff?

It doesn't. HTTPS mitigates passive sniffing and most MITM attacks but does not have any effect against cross-site script attacks such as CSRF and XSS.

Haacked has a step-by-step description of how this attack works: http://haacked.com/archive/2008/11/20/anatomy-of-a-subtle-js...

The Microsoft asp.net web stack avoids this by automatically wrapping json responses with an object "{d:...}".

The comments describe this as only affecting FF2.0 although testing was informal. (You should, of course, still protect your services.)

That wouldn't work, you can override the Object constructor in JavaScript as well. You have to prepend something like `while(1);` to the JSON responses you're returning.

I can see why you would think that but it does work because a script that just contains a JSON object is not a valid JavaScript file. The browser will give an error. Whereas a script that just contains a JSON array is valid.
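You can check this without a browser. The sketch below uses `new Function` only to compile the text as a script body (it never runs it); the helper name `parsesAsScript` and the sample payloads are made up for the example.

```javascript
// Returns true if `source` compiles as a standalone script body.
function parsesAsScript(source) {
  try {
    new Function(source); // compile only; the function is never called
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const topLevelArray = '[{"name": "alice"}, {"name": "bob"}]';
const topLevelObject = '{"d": [{"name": "alice"}, {"name": "bob"}]}';

console.log(parsesAsScript(topLevelArray));  // true: an expression statement
console.log(parsesAsScript(topLevelObject)); // false: "{" opens a block, then "d": is a syntax error

// Both are still perfectly good JSON for a legitimate client.
console.log(JSON.parse(topLevelObject).d.length); // 2
```

Note this depends on the keys being quoted, as JSON requires. An unquoted `{d: 1}` would compile as a block containing a label, though it would still not hand the data to anyone.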

I think it's just the ASP.Net AJAX Toolkit not plain ASP.Net or ASP.Net MVC.

Thanks, I should have clarified. Here is the breakdown according to Phil's post linked below:

ASP.NET ASMX Web Services, WCF Web Services, and the now-defunct ASP.NET AJAX automatically wrap with {'d':...}.

ASP.NET MVC does not.

Phil explains that the reason there is a difference is that with MVC there is no common client library that automatically strips the {'d':...} wrapper. So they felt it would be too confusing for users. http://haacked.com/archive/2009/06/25/json-hijacking.aspx

That was an old article so maybe they have changed the default behavior since then. In any case, you can manually wrap the response with {'d':...}.

ASP.Net also requires POST for web service calls, so this sort of include doesn't work.

I hope you're not a web programmer. I'm not a web programmer and I knew that.

And the award for most pompous, unnecessary, and conceited response goes to... cheez!

This attack was used successfully against gmail in Jan 2006.


Facebook is dealing with this by prefixing some of their JSON responses with "for(;;);".

Yes and Google has 'throw 1;' at the start of some of their files IIRC (may be some other value than '1').

OK, I see how that prevents an attack (it seems to create an endless loop), but how does this parse as regular JSON for the good guys?

I imagine they simply remove the first 8 characters from the response string. eval("(for(;;);{})"); doesn't work, but eval("(" + "for(;;);{}".substring(8, 10) + ")"); works fine where 10 is the length of the response string.

EDIT: This is the correct idea. Search for shieldlen or safeResponse in the JS to see how it is implemented. There is always something interesting to be learned by digging through FB's client side code.
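For comparison, the same strip-then-parse idea without eval. This is only a sketch: `guardJson` and `parseGuardedJson` are hypothetical names, not anything taken from Facebook's code.

```javascript
// Sketch: prefix JSON with an infinite loop on the server, strip it on the client.
const GUARD_PREFIX = "for(;;);"; // 8 characters

// Server side: serialize and prepend the guard.
function guardJson(value) {
  return GUARD_PREFIX + JSON.stringify(value);
}

// Client side: only same-origin XHR code sees the body as a string,
// so only it can remove the prefix before parsing.
function parseGuardedJson(responseText) {
  if (!responseText.startsWith(GUARD_PREFIX)) {
    throw new Error("response is missing the guard prefix");
  }
  return JSON.parse(responseText.slice(GUARD_PREFIX.length));
}

const wire = guardJson({ friends: ["alice", "bob"] });
console.log(parseGuardedJson(wire).friends);
```

A page that pulls the same URL in through a script tag just spins in the loop and never reaches the data.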

You should never be eval()ing json; that's the most dangerous thing you can do with it. All browsers have had native JSON parsers for years, which will choke on any code that's sent through them. These will be automatically employed by any js framework you're using, or I think it's generally JSON.parse().

By eval()ing your json you are doing most of the attacker's work for them. All that stuff in the article about mime types is redundant if you're eval()ing your json.

Basically, don't ever eval() anything, in any language.

I agree that eval() is evil and JSON.parse() is the way to go.

However, in this particular case this doesn't matter much, because the site itself controls where to load the JSON from. Manipulating the JSON response requires access to the webserver or the DNS, and in those cases the attacker could have manipulated the initial HTML response as well.

> By eval()ing your json you are doing most of the attacker's work for them. All that stuff in the article about mime types is redundant if you're eval()ing your json.

This totally misses the context of the article. The article is about CSRF. That is, the _attacker_ downloads and executes the JSON.

This is _not_ about downloading the attacker's JSON! It is about how to construct the JSON in a way that it is inaccessible through a <script> tag from the attacker's site.

This is good advice, but a non-sequitur in this case. This is not a code-injection attack, this is an information leak. It lets an attacker get around the normal constraints on fetching cross-site URLs. There is an EVAL happening, but it's being done by the attacker, not by you. And it's being done implicitly. The real problem here is a security hole in Javascript itself. It's too flexible for its own good.

While I generally agree with your sentiment, I was merely writing a brief proof of concept test in the firebug console.

From: http://www.json.org/js.html

"The use of eval is indicated when the source is trusted and competent. It is much safer to use a JSON parser."

FB is still using eval() if you look at their code. As the source of the JSON is their own service, and they can, therefore, trust it assuming proper sanitization; the same applies for my test case.


Patience. Writing a good response takes a bit longer than downvoting.

This will stop embedding it in <script> but why couldn't the attacking website do the same with eval and substring?

Because you cannot issue cross-domain AJAX calls, the attacker does not have access to the response body as a string that can be manipulated.


Oh duh, I'm dumb.

Sometimes the laziest solutions are the most elegant.

Tangential: One of my favorite pieces about simplicity, laziness, dogmatism and getting things done is from Mark Jason Dominus[1]. His context isn't connected to this at all (it's about when and whether to use shell commands inside Perl scripts), but the larger point is very relevant: taking "Do the simplest thing that could possibly work" seriously can have surprising outcomes.

tl;dr Sometimes ugly is elegant, too.

[1] http://perl.plover.com/yak/12views/samples/notes.html#sl-3

Interesting article, and I had a similar issue with #perl recently as well.

They started with trying to fix a performance problem I didn't have, and then after my refusal to give them more information to fix a problem I didn't have, started insulting me.

They are adding executable code at the beginning of what is supposed to be a data structure. This solution is not elegant at all. It's an ugly hack that uses a side-effect of code to plug a hole in a rather convoluted security model.


Come to think of it, this seems to be solvable by a much better method - the same one that is used to prevent standard CSRF. The problem is effectively the same.

Server A is a valid server. User logs into it and gains privileges. Then he visits server B, which is a bad server. B tricks the browser into sending a request to server A to do something (abusing the elevated privileges). The only addition with AJAX is that server B also manages to read the result of its attack. That wouldn't matter if it couldn't trick server A into honoring that page request in the first place.

You can easily solve this by signing your requests, effectively binding two pages on your server together. Normally, PAGE1 has a form (or JS code) that requests PAGE2. You simply need to enforce that only legal (yours) pages can do that. This is achieved by adding a token to PAGE1 that must be sent to PAGE2.

For example, token = hash(server_secret + PAGE2_identifier + user_id).

Upon receiving a request for PAGE2, the server will know all the arguments that went into creating that token, and can regenerate it. If the user-supplied token and the server-generated token don't match, the request is denied.

As long as client side scripts from server B cannot read the token from server A, this should work even with AJAX requests. Since attacker (server B) cannot know server_secret, they will not be able to guess your generated token, and their requests to PAGE2 on server A will fail.
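A sketch of that token scheme in Node. One deliberate swap: instead of a bare hash over the concatenated values it uses HMAC-SHA256 keyed with the server secret, which is the usual way to build this kind of keyed tag. The function names and sample values are made up.

```javascript
// Hypothetical sketch: per-page, per-user request token (Node).
const nodeCrypto = require("crypto");

// Token that PAGE1 embeds and the client must send along to PAGE2.
function makeToken(serverSecret, pageId, userId) {
  return nodeCrypto
    .createHmac("sha256", serverSecret)
    // JSON-encode the parts so ("ab", "c") and ("a", "bc") cannot collide.
    .update(JSON.stringify([pageId, userId]))
    .digest("hex");
}

// On a request for PAGE2: regenerate the token and compare in constant time.
function isTokenValid(serverSecret, pageId, userId, suppliedToken) {
  const expected = Buffer.from(makeToken(serverSecret, pageId, userId));
  const supplied = Buffer.from(String(suppliedToken));
  return (
    expected.length === supplied.length &&
    nodeCrypto.timingSafeEqual(expected, supplied)
  );
}

const token = makeToken("s3cret", "/friends.json", "42");
console.log(isTokenValid("s3cret", "/friends.json", "42", token));
```

The server keeps no per-token state; it only needs the secret and the same inputs it used when rendering PAGE1.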

As the author of said document: The title here is misleading. While this attack is a form of CSRF, doing that does not magically solve all your CSRF problems. It just counters one particular attack vector, which is JSON responses.

Generally speaking, all incoming requests should be verified - authorized or not. These days with all kinds of wonderful web frameworks, CSRF protection is pretty simple. Django handles CSRF with a token in a hidden form field: https://docs.djangoproject.com/en/dev/ref/contrib/csrf/

Wouldn't Oauth solve this, since all requests are signed?

Having a CSRF token on your forms does not prevent, and has nothing to do with, the attack in the article.

Hi, why not? If the attacker can't get any information without sending a token with the request, what's there to worry about? I'm not very good with js so I might have misunderstood something. Thanks!

The Django "{% csrf_token %}" that you put into forms (and similar things in other frameworks) is used when posting form responses. It turns into a hidden form field (<input type=hidden>). This helps protect against someone creating a form on their own malicious site that posts some data to yours.

The attack mentioned in the OP is completely different. It works off a GET request.

Imagine you were running a social network site, and you had an API (authenticated via HTTP sessions) that was a GET request to get the friends list. This method returned the (logged-in) user's list of friends in a JS array.

Note that Django-style CSRF tokens are not relevant here, as they are only for protecting POSTs.

The attack described in the post uses a script tag and a redefined array setter to make the browser of a user with a live logged-in session on your site fetch data from it.

So coming back to my example. I am a malicious hacker, and I can socially engineer an end-user to come to my site. I put a <script src=yoursocialnetwork.com/get_friend_list> on my page. This will fetch the data, and I will be able to extract that info in my javascript, and then post that back to my site so I can capture that info.

Thanks a lot for the excellent explanation. I tend to do my Ajax requests with POST just to get the token in. Is there a reason not to do so, like savings in bandwidth or something like that? Might that be a gain Facebook is trying to achieve?

  > Is there a reason not to do so, like savings in bandwidth or something
  > like that?
GETs may perform slightly better, see http://developer.yahoo.com/performance/rules.html#ajax_get

Yours is a good solution, and effectively blocks the attack mentioned in this article. From a REST purity standpoint, it's "unclean" to require all API calls to be POSTs, but, hey, life is short.

I believe that he's right that just putting it in a hidden form field would be useless for this sort of attack. However, I believe rails and django actually include the CSRF token in headers for ALL ajax requests, not just form submits.

This does not help at all in this case, proving that it is not that simple.

The title is incorrect: array usage is about JavaScript/JSON hijacking, not protection against CSRF.

Yeah, the article got it right (CSRF was the previous section) but the submitter titled it wrong.

Correct, the technical term for it is 'XSSI', or Cross Site Script Inclusion.

I haven't heard of that term before now, but I would argue that XSSI is a way of doing CSRF rather than CSRF, or something entirely different itself. (From Wikipedia: "Unlike cross-site scripting (XSS), which exploits the trust a user has for a particular site, CSRF exploits the trust that a site has in a user's browser.")

Apologies if the title made it seem like not using javascript arrays was a magic bullet for preventing all CSRF.

They are definitely similar, but I wouldn't say XSSI is a type of CSRF. CSRF refers specifically to a few methods of attack, distinct from what is used in XSSI.

Regardless, it seems CSRF is much more widely known than XSSI, so you could say worrying about the distinction is just pedantry. I was very surprised when I searched earlier and could not so much as find an OWASP mention of XSSI. Still very important to know though.

Based on what? Never heard of XSSI in this context, isn't XSSI used for extended server side includes?

Array() doesn't seem to be called when defining an array with [] in Chrome 12 and Firefox 3.6+

Because [] is not syntactic sugar for new Array. The code the article wrote about doesn't work in any implementation that I know of. I think the ESv3 spec wasn't very clear: "Create a new array as if by the expression new Array()." But the implementations (always?) did the right thing, and the ESv5 spec is more clear: it adds "where Array is the standard built-in constructor with that name."

This comment and the GP suggest that the article is wrong. Is it? Can anyone confirm a JS implementation that manifests the described behavior (the Array constructor being called when parsing top-level [] in JSON)?

It obviously worked this way at some point (http://news.ycombinator.com/item?id=2668888), so I'm guessing the older IEs, at least, still have this flaw.

> Because [] is not syntactic sugar for new Array

In Firefox 4 and other new browsers this is no longer the case. But right now it's still an issue as many people are using older browsers.

I just tested with Firefox portable 3.0.0 and it looks like it was already fixed then (June 2008).

The attack worked in an older Firefox, at least.

The article's information is very out of date, and relies on multiple issues to successfully work:

  * It needs an array literal (e.g. []) to be constructed using the function referenced by the Array property on the global object. (This was spec ambiguity and only affects older IE and Firefox -- very old firefox, maybe only up to netscape or phoenix?)
  * It needs assignments in object and array literals to call setters on the prototype chain.
Both of these issues were fixed by ES5 (the first may have been fixed in ES3.1) by saying that Array and Object literal notation both use the initial values of Array and Object (so you can't change the constructor used), and by saying that all assignments are "direct" so won't call setters on the prototype chain.
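Both points are easy to probe in a current engine. The sketch below is just that, a probe run under Node; the function names are invented, and it restores everything it touches before returning.

```javascript
// Probe 1: does an array literal go through the global `Array` property?
function literalCallsGlobalArray() {
  const original = globalThis.Array;
  let called = false;
  globalThis.Array = function () { called = true; };
  try {
    const data = [1, 2, 3];
    // Also confirm the literal is still a real array of the built-in kind.
    return called || !original.isArray(data);
  } finally {
    globalThis.Array = original; // always restore the built-in
  }
}

// Probe 2: does filling an array literal call a setter on the prototype chain?
function literalTriggersPrototypeSetter() {
  let hits = 0;
  Object.defineProperty(Array.prototype, "0", {
    configurable: true,
    set(value) { hits += 1; }, // would see "secret" if the literal used [[Set]]
  });
  try {
    const data = ["secret"];
    return hits > 0 || data[0] !== "secret";
  } finally {
    delete Array.prototype[0]; // remove the trap again
  }
}

console.log(literalCallsGlobalArray());        // false in an ES5+ engine
console.log(literalTriggersPrototypeSetter()); // false in an ES5+ engine
```

An engine with the old behaviour would return true from one or both probes, which is exactly what the script-tag attack relied on.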

This effectively makes JSON hijacking impossible, except of course for the large numbers of old browsers that are out there.

This is also a distinct issue from JSONP hijacking, for which there isn't a solution other than to not use JSONP.

Does anyone know if it's possible to override constructors for the global "Object" in Javascript? The author's assumption is that it is not possible and therefore the best solution to this is to wrap all your JSON with {} instead of []. My intuition is telling me that's not the fix. Can someone verify?

It seems the best solution is not to use a top level object but (as mentioned below) Facebook's solution to prepend for(;;) to all JSON and strip it before parsing or Google's to prepend 'throw' and strip it pre-parsing.

Update: This conversation says it's not possible, but I'm still not a believer: http://sla.ckers.org/forum/read.php?2,35337,35337

An idea: when using session-based authentication and returning JSON, check that the Referer [sic] matches your domain. I think I'm missing something major, though.

One elegant solution mentioned at a DNSSEC talk was to simply include your CSRF token in the header of all GET/POST requests and have the server reject anything without said token. Assuming you'd be using some js library to do your ajax, you could make that modification in there so that you'd transparently use it without needing to modify any existing code. Only the URLs that send HTML will not need the CSRF header token.

From http://www.codinghorror.com/blog/2008/09/cross-site-request-...

The HTTP referrer, or HTTP "referer" as it is now permanently misspelled, should always come from your own domain. You could reject any form posts from alien referrers. However, this is risky, as some corporate proxies strip the referrer from all HTTP requests as an anonymization feature. You would end up potentially blocking legitimate users. Furthermore, spoofing the referrer value is extremely easy. All in all, a waste of time. Don't even bother with referrer checks.

How is spoofing the referer extremely easy? I can think of no way for an attack site to do this, short of having control over the entire user machine (or a significant browser exploit anyways), at which point all of your webapp security is irrelevant.

You cannot assume a request is coming from a browser.

curl --referer http://www.example.com http://www.example.com

Yes, but if the user wishes to avoid their own security like this, they can already do it a thousand other ways (and to no adverse effect; there is no opportunity for an attacker to exploit this, so I don't know what you're getting at).

I was merely pointing out how easy it is to fake, although you are correct a third party site could not do this.

Atwood's point is that checking the "referer" will both be unreliable and, more importantly, lead to false positives; there are better alternatives, namely, double submitting cookies as I have pointed out elsewhere with regard to this article.

He's definitely correct about the false positives and that nonces and/or double submitted cookies are superior, I'm not arguing that, just the 'furthermore, spoofing is extremely easy,' which makes no sense there.

Isn't this XSSI, not CSRF?

Apache Extended Server Side Includes? Haven't seen XSSI before. What's the I?

I agree, this is not the attack I think of when somebody mentions CSRF. Well, the solution at least isn't. I would be very suspicious of anyone who claimed to solve their CSRF holes by not using arrays.

Cross Site Script Inclusion. The article touches on a lot of things (including CSRF), but the HN title refers specifically to JSON (As well as the link's '#' fragment identifier), and therefore the last section, where it shows an attack site including unprotected JSON through the <script> tag, and unless I'm seriously mistaken, this is the definition of XSSI.

Yes, this is XSSI and the exploit comes in because a JSON array is essentially treated as "executable" code in some JS implementations.

Here are a couple of resources that go a little more into XSSI:

Google tech talk: http://www.youtube.com/watch?v=jC6Q1uCnbMo&feature=playe...

Gruyere codelab: http://google-gruyere.appspot.com/part3#3__cross_site_script...

further reading: 'Would it be more secure if API endpoints for Collections would wrap the JSON serialized Arrays inside an Object literal?'


ultimately you should use tokens to verify the request was not forged.

Why would this not work:

a) Resources using something other than GET are automatically not affected.

b) For GET resources, require that the body of the request contains a value, perhaps from a cookie but could be anything, ensuring that the request was made using xhr, which is domain restricted.

Sounds like the simplest way to me, curious why it wouldn't work.

Point A sounds perfect to me. Why bother with GET at all?

For point B, I notice that I get an "x-requested-with: XMLHttpRequest" when doing ajax() from inside jQuery. I assume this is not there when someone SCRIPT SRCs something (why would it be?), so that may be useful to someone.
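On the server that check could look roughly like this. It is a sketch under assumptions: `looksLikeXhr` and `shouldServeJson` are invented names, the header object is shaped like Node's `req.headers` (lower-cased names), and it relies on a plain script tag having no way to attach custom request headers.

```javascript
// Hypothetical sketch: only serve session-authenticated JSON to XHR callers.
function looksLikeXhr(headers) {
  const value = headers["x-requested-with"];
  return typeof value === "string" && value.toLowerCase() === "xmlhttprequest";
}

// Policy from points (a) and (b): non-GET requests are not reachable via
// <script src>, and GET requests must carry the XHR marker header.
function shouldServeJson(method, headers) {
  if (method.toUpperCase() !== "GET") return true;
  return looksLikeXhr(headers);
}

console.log(shouldServeJson("GET", { "x-requested-with": "XMLHttpRequest" }));
console.log(shouldServeJson("GET", { accept: "*/*" }));
```

This only addresses the script-inclusion read. State-changing requests still need a real token check, since being non-GET does nothing against a forged form post.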

Good thought. The "double-submitted cookie" technique does exactly that.

Here is the catch:

> So now what happens if a clever hacker is embedding this to his website and social engineers a victim to visiting his site.

If that happens then CSRF or JSON is not the highest priority thing to worry about. The hacker controls everything. And no matter what I do, he can find a bypass.

Does anyone know how to rectify this if you're using the Rails 3 respond_with pattern?

This is a non-issue if you use JSON-RPC.

What's with the downvotes? Is my statement not true??
