JSON: The JavaScript subset that isn't (timelessrepo.com)
157 points by judofyr on May 15, 2011 | 35 comments

tl;dr: the JSON specification accepts two obscure Unicode codepoints in string literals that the JS specification doesn't. eval() may fail on valid JSON. Workaround: escape those codepoints.
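
For anyone who wants the workaround spelled out, here is a minimal sketch. The helper name `stringifyForEval` is made up for illustration, and it assumes you control the code that emits the JSON. It replaces the two code points (U+2028 and U+2029) with their `\u` escapes, which both JSON and JavaScript accept inside string literals.

```javascript
// Hypothetical helper (not from the article): serialize a value to JSON and
// escape the two code points that ES3/ES5 treat as line terminators.
function stringifyForEval(value) {
  return JSON.stringify(value)
    .replace(/\u2028/g, '\\u2028')  // LINE SEPARATOR
    .replace(/\u2029/g, '\\u2029'); // PARAGRAPH SEPARATOR
}

// The output is still valid JSON and decodes to the original string.
const escaped = stringifyForEval({ JSON: 'ro\u2028cks!' });
const roundTripped = JSON.parse(escaped);
```

The escaped text parses to the same value as the unescaped text, so consumers using a real JSON parser should not notice any difference.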

Plus sensationalistic title and pedantic introduction.

Hey, thanks for the tl;dr. That was useful.

> Plus sensationalistic title and pedantic introduction.

This, not so useful (and unkind)...

Had the author written merely the tl;dr line above, do you think his important observation would have gotten the attention it deserves?

The author outlined his thought process to show that his observation was actually based on non-trivial research.

While I like tl;dr's, I appreciate intros that put things in context, and I'm more likely to trust observations whose underpinnings are spelled out.

This post is based on ES3. Both U+2028 and U+2029 are valid in ES5 string literals as part of a LineContinuation. See http://es5.github.com/#x7.3 (ES5) instead of http://bclary.com/2004/11/07/#a-7.3 (ES3).

So really, the title should be: “JSON: The ES3 subset that isn’t.”

Issue reported: https://github.com/judofyr/timeless/issues/57

Valid point, but not a great example. You get a syntax error in Safari's Web Inspector even if you type that JSON yourself (i.e. no Unicode funny business):

  > {"JSON":"rocks!"}
  SyntaxError: Parse error

Right, you don't need the U+2028 character to reproduce the error. Why? Because the parser treats {} as a block, not as an object literal.

Actually, the "good example" should be: var obj = {"JSON":"ro

Though, to his credit, in the little REPL that he shows, he typed ({"JSON":"rocks!"}), not {"JSON":"rocks!"} - the former is valid.
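
The block-versus-object-literal point is easy to check for yourself. A rough sketch, with a made-up `tryEval` helper that only reports whether a source string evaluates:

```javascript
// Hypothetical helper: evaluate a source string and report what happened.
function tryEval(source) {
  try {
    return { ok: true, value: eval(source) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

// At statement level the braces open a block, so the ":" is a syntax error.
const bare = tryEval('{"JSON":"rocks!"}');

// Parentheses force expression context, so the braces are an object literal.
const wrapped = tryEval('({"JSON":"rocks!"})');
```

Here `bare.ok` is false with a `SyntaxError`, while `wrapped.value` is the expected object. No unusual Unicode is involved in either case.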

To tomstuart's credit, I fixed the post after he wrote that comment.

Meta: Right after I originally wrote this comment I saw that tomstuart answered too, so I decided to delete my comment. Well, it appears that so did Tom, and suddenly there were no replies here.

HN feature request: compare-and-swap comment semantics.

Indeed. You get the same error in Chrome and a slightly different one in Firefox. That isn't valid JavaScript on its own.

  > {"JSON":"rocks!"}
I don't think this is such a good point. While the example is not a valid expression, it is a perfectly good JS value/statement, like (almost) all other JSON. Wikipedia might go a bit too far in saying that all JSON formatted text is legal JS code, but (IMHO) it still is a perfectly good subset.

The Unicode part is a clear error and should be fixed.

Another caveat: it's unsafe to embed arbitrary JSON in an HTML <script> tag, because a string containing "</script>" will get picked up by the parser.

This particular case can be patched by replacing "</" with "<\/".
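
A minimal sketch of that patch, assuming the JSON is produced with `JSON.stringify` right before it is written into the page. The name `jsonForScriptTag` is hypothetical. It relies on `\/` being a legal escape for `/` in both JSON and JavaScript strings.

```javascript
// Hypothetical helper: make JSON text safe to place inside a <script> element
// by breaking up every "</" sequence, including "</script>".
function jsonForScriptTag(value) {
  return JSON.stringify(value).replace(/<\//g, '<\\/');
}

const payload = { html: '</script><b>hi</b>' };
const safe = jsonForScriptTag(payload);
const decoded = JSON.parse(safe); // still decodes to the original value
```

This only covers the `</` case mentioned above; it is not meant as a complete HTML-embedding defense.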

If you always escape the Unicode characters when you output JSON, you don't have this problem. At least Facebook/Twitter are using this method.

One trick is to base64 encode and decode the data
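
Roughly what that trick could look like, as a sketch only. It assumes Node's `Buffer`; in a browser you would need `btoa`/`atob` plus some UTF-8 handling. Both function names are invented for the example.

```javascript
// Hypothetical helpers: ship JSON as base64 so the payload contains only
// characters from the base64 alphabet (no "<", quotes, or line separators).
function toBase64Json(value) {
  return Buffer.from(JSON.stringify(value), 'utf8').toString('base64');
}

function fromBase64Json(encoded) {
  return JSON.parse(Buffer.from(encoded, 'base64').toString('utf8'));
}

const encoded = toBase64Json({ JSON: 'ro\u2028cks!', html: '</script>' });
const restored = fromBase64Json(encoded);
```

The cost is a payload about a third larger and no longer human-readable, which may or may not matter for your case.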

It sounds to me more like the JavaScript specification has a bug or design flaw, rather than that JSON, as a blanket statement, isn't really part of JS. Ideally, the JavaScript specification and JS implementations should fix that flaw, rather than parsers having to work around it.

ECMAScript was standardized over ten years ago. The specification has been publicly available and (as far as I know) every implementation has treated line terminators equally. We can't just change 5+ implementations of JS to make it compatible with some tiny data format that works fine 99% of the time anyway.

I'm not saying that JSON isn't a part of JS; of course it is! I'm just saying that it isn't a strict subset of JS and that you can't depend on eval() doing the right thing.

Does anyone know if the jQuery JSON parser has this fix by any chance?

jQuery JSON parser will delegate to JSON.parse where available, which doesn't have this problem.

Using JSONP with jQuery, you have to directly eval() the response from the server, and you don't control what the server sends: it could send invalid JavaScript with these code points in it. You can't fix that from the client side, and neither can jQuery.

The servers have to be patched.

Thanks for sharing this tricky bit of knowledge, Magnus. Glad you're helping raise awareness on this further :)

Oh, and thanks again for patching the JSONP middleware.

JSONP does not force you to use eval. All that JSONP is doing is telling the web service to write JavaScript in such a way to immediately invoke a function with the name of the callback. Usually it's just called with a string. What you do with that string is up to you.

JSONP does force you to use eval, because JSONP means "instead of returning JSON for me to parse with a JSON library, return executable JavaScript for me to point at with the src attribute of a <script> element". The string consisting of the argument to the callback is de facto evaled by the user agent (when the executable JavaScript is executed) in order for it to become a JavaScript value at all.

The user agent evals the JSONP response before anything else happens: the server might return the string 'yourCallback({"JSON":"rocks!"});', which gets evaled to a function invocation whose argument is whatever the string '{"JSON":"rocks!"}' evals to. If that string doesn't successfully eval (as per the OP), your JSONP is broken.
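
To make that concrete, here is a small simulation. `runJsonpResponse` is a made-up stand-in for the browser executing a `<script>` response, and the callback name is arbitrary. The second call shows that the argument goes through the JavaScript parser rather than a JSON parser, because it is not valid JSON at all.

```javascript
// Hypothetical stand-in for the browser running a JSONP response body.
// The body is executed as a program with the callback in scope.
function runJsonpResponse(body, callbackName, callback) {
  new Function(callbackName, body)(callback);
}

const received = [];
const collect = (value) => received.push(value);

// A typical JSONP response: the argument happens to be valid JSON.
runJsonpResponse('yourCallback({"JSON":"rocks!"});', 'yourCallback', collect);

// Not JSON (unquoted key, arithmetic), yet it runs, because it is just JavaScript.
runJsonpResponse('yourCallback({answer: 40 + 2});', 'yourCallback', collect);
```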

Perhaps we need a JSONP callback that instead does `yourFunction(JSON.parse('{"JSON":"rocks!"}'))` (though, slightly more sophisticated, I hope).

That still won't work unless you escape the line endings. And you'll also need to escape single quotes.
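
A sketch of what the escaping would have to cover if a server went the `JSON.parse('...')` route. The function name `jsonpWithParse` is invented, and the callback name is assumed to be already validated, which a real server would need to do itself.

```javascript
// Hypothetical server-side helper: emit callback(JSON.parse('...')) with the
// JSON text embedded in a single-quoted JavaScript string literal.
function jsonpWithParse(callbackName, value) {
  const literal = JSON.stringify(value)
    .replace(/\\/g, '\\\\')         // backslashes first, so later escapes survive
    .replace(/'/g, "\\'")           // single quotes would end the literal
    .replace(/\u2028/g, '\\u2028')  // line terminators are not allowed raw
    .replace(/\u2029/g, '\\u2029');
  return callbackName + "(JSON.parse('" + literal + "'));";
}

const sample = { text: 'it\'s a "test"\n\u2028 back\\slash' };
const body = jsonpWithParse('yourCallback', sample);

// Simulate the browser executing the response.
let result;
new Function('yourCallback', body)((value) => { result = value; });
```

Raw `\n` and `\r` never appear in `JSON.stringify` output, so only the two Unicode separators need separate handling here.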

You are mixing eval() with other stuff.

Or, conversely, everyone here is confusing eval() with evaluation in general. JSONP does not call eval() as such, but it does evaluate the server response as JavaScript, because it ends up as:

    <script src="/whatever?callback=foo"></script>
The source of which is something like

Therefore, the hash supplied to foo() must be valid JavaScript.

Moreover, JSON.parse will be able to parse the JSON text containing the "U+2028" character without any issue.

Sounds like something that a 3rd party JavaScript library could handle :). Point made though.

No. Using JSONP, you are forced to directly eval() code from another domain - you can't just use a JavaScript library to change that code.

JSON started out with JavaScript and slowly expanded beyond the ECMAScript specifications to become more versatile. I don't see what the big deal is.

Humans prefer easily processable and storable information, and for this reason they like to think JSON is JavaScript Object Notation.

This reminds me of how ECMA doesn't like to be called the _European_ Computer Manufacturers Association, since they have now grown beyond that, and prefer being called Ecma.

As the OP says, the big deal is that people frequently assume that JavaScript's eval is an acceptable JSON parser — for example, this is the fundamental assumption which makes JSONP work. If your JSON isn't guaranteed to be valid JavaScript, you might run into actual problems.

It does seem more like a bug in the JSON spec than anything else.

Wouldn't using eval to parse JSON be quite hazardous in the same way as Python's pickle, and thus be limited to pretty specific use cases (limited more so than simply to JS)?

It used to be quite common and is still done (and is required for the hack known as JSONP); lots of the time you're retrieving JSON from the same server you loaded the original page from so it's safe.

I see. Should have read the whole thing before commenting.
