JSON5 is a proposed extension to JSON (json5.org)
143 points by zekers on Mar 1, 2014 | 246 comments



As much as I would like to see comments in json: if we start throwing around json files that are not really json, but we call them json (at least in everyday talk), we will end up breaking more apps than we fix.

Maybe the question is instead: why the hell do we need comments (and loosening of the syntax, etc) in the first place?

Are we seriously going to keep insisting on json as a configuration format?

As Stormbrew already pointed out, we already have a format that is ideal for configurations (and sure, data exchange, why not), and it is called yaml.

yaml has comments

yaml makes it easy to enter multiline strings

and most of all: yaml is very, very easy to write!

tl;dr: Just use a format suited for your needs instead of trying to change one that isn't. Oh, and a couple of smiley faces thrown in there to ensure people don't read this in the wrong tone. People do that.. Like, all the time.. damn, now my tl;dr is too damn long! I have to add another.

tl;dr;tl;dr YAML BITCHES! (╯°□°)╯︵ ┻━┻ (but also, a puppy: http://i.imgur.com/kuDsS0i.jpg )


>Are we seriously going to keep insisting on json as a configuration format?

Yes. It has good universal support, often without needing any libraries, it's simple, succinct, and has good tooling.

>As Stormbrew already pointed out, we already have a format that is ideal for configurations (and sure, data exchange, why not), and it is called yaml.

Let's just not go there. YAML is a pain in the ass to parse, has different incompatible versions, the libraries are of widely varying quality, is not natively (without third party stuff) supported in most languages, and it's generally a mess.


There are lots of problems with YAML; it does too much. If I had time, I'd definitely want to do a 2.0 that gives it a small haircut, removing the most bothersome of problems. I've been unable to find time for the work required: about a year of discussing, writing, coding, testing, packaging, and forging consensus.

What I'd keep in YAML is the information model. When we started YAML ~12 years ago, it was obvious that configuration should be in XML, and that XML's information model was the correct way to organize data structures. Part of YAML's work was explaining a different way of doing things to those who'd otherwise use XML. This isn't a concern these days...

That said, the productions look painful because the specification doesn't separate the scanner from the parser. Once you do that, the syntax is quite a bit more sane to grok (see PyYAML source). It's not nearly as bad as what you may think... IF you see it this way.

Besides a few unfortunate syntax structures, YAML's complexity and sharp edges come from its venture into typed objects, type-spaces, and implicit typing. Much of this, for configuration files, is unnecessary.

When we wrote YAML, we only had a few years experience with it; and well, it wasn't done as a full time endeavour. It was a guess as to how things should work. It wasn't easy to bootstrap YAML. It's now ten years later... and, well, lots of people have experience with it. It's probably time for the haircut.


oh, please do this.

I love yaml -- except on the rare but very painful occasions where I get hugely bitten by incredibly weird problems arising from unexpected interplay of yaml features.

Some of the originators of yaml are probably the only people who have the social power to promulgate a revision with a haircut. That would be awesome.


So if there were people willing to take on the lion's share of the year's work of consensus forming, what would it take to get you on board with a YAML haircut? What forms of decision making would make it attractive to you?


That's a great question; those who only participate at the periphery can hardly expect significant influence.


Perhaps a different question then: if such a group were to form, whose consensus would they need to pull in, and what rough roadmap would you suggest they follow? What, in short, is going to hurt and when should they duck?


Thanks for the response.

I like the promise of YAML, but having tried it on a couple of projects, I found all these issues which prevent it from being the simple, turn-key solution that JSON can be (at least for simple needs).

How do you feel about TOML? I find it a sane compromise, at least for configuration file needs.


YAML uses dash (-) and colon (:) and white-space for a reason, it's not a random selection of structural markers.


How about TOML? There was an implementation for most languages under the sun within a week of its release.[1]

(Certainly I acknowledge that it is not as ubiquitous as JSON or YAML, but I do like TOML better than both of those for configuration.)

[1] - https://github.com/mojombo/toml#implementations


YAML looks nice, but it's very overcomplicated for what it's usually used for [1]. Nobody wants to figure out what %TAG directives or "|", "|-", and "|+" at the end of lines mean, the difference between the folded style and the literal style, etc. just to read a configuration file.

I don't like JSON either due to the comment issue; simple ad-hoc configuration formats like most C programs seem to have mostly work, but aren't as nice as a standard format. If anything, I like configuration files expressed as scripts in whatever language the program is written in, since they're very flexible (if I want 100 almost-identical entries for whatever reason, I can say so in the file rather than writing a separate generator), and while programming languages are complicated, people tend to already know them; but that does tie you to a specific language.

[1] http://www.yaml.org/spec/1.2/spec.html


I think this is an entirely valid criticism of yaml. I'd absolutely support there being a simplified form of yaml (YAML The Good Parts?) that covers what people actually want to do and doesn't try to be a swiss army knife object serialization format.


Problem is people don't agree on Good Parts. People seem to think keeping YAML a superset of JSON is the good part. I'd think otherwise.

Anyway, http://ogdl.org/ is one candidate for YTGP (YAML The Good Parts), but its comments can carry metadata, which is a huge turn-off. Others think TOML (https://github.com/mojombo/toml) is a good replacement, but it has no support for alternate number types. You have to turn something like

   mask = 0xDEADBEEF 
into something like

   mask = 3735928559


Ideally you would have a strict subset of YAML rather than a totally different language for compatibility reasons. You can call it the Friendly, Readable, Declarative YAML standard so we can have a standards war that is FRDY vs. JSON.


That movie was horrible :P Would not like to watch again.


Hah. I love it.


Json originally had comments, they were intentionally removed:

https://plus.google.com/app/basic/stream/z12ztpczbxrdglfgl04...


Crockford's explanation is pretty absurd, though.


It makes perfect sense -- if you provide a freeform content section of an otherwise strictly formed document (for interoperability reasons), then people will abuse it to store arbitrary, uninteroperable data (as was seen with binary blobs being dumped into XML).

The point of a standardized serialization format is well-defined parsing semantics and universal interoperability.


Except that in practice, this has just meant that people have defined their own ad-hoc extensions to JSON that add support for comments, since it's so useful for stuff like JSON-formatted config files.


It's also meant that JSON documents don't have data encoded in comments. Working as intended.


>It makes perfect sense -- if you provide a freeform content section of an otherwise strictly formed document (for interoperability reasons), then people will abuse it to store arbitrary, uninteroperable data (as was seen with binary blobs being dumped into XML).

That will then be their own bloody problem, not Crockford's.


Indeed. I see SQL comments and Postgres COMMENT ON statements used to send information to applications. Really funny, that....


And you can't base64 encode it and store it in a string?


I thought xml was developed to be able to include binary parts on purpose since it can be useful.


How so?

Also "Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser."

Seems like a perfectly fine way to have comments (if you absolutely need them) in a production environment.
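
As a rough illustration of the "strip the comments, then parse" idea quoted above, here is a minimal sketch. It is not JSMin: `stripJsonComments` is a hypothetical helper written for this example, and it only assumes `//` and `/* */` comments that appear outside of string literals.

```javascript
// Hypothetical helper (not JSMin): remove // and /* */ comments that occur
// outside of string literals, leaving everything else untouched.
function stripJsonComments(text) {
  let out = "";
  let inString = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    const next = text[i + 1];
    if (inString) {
      out += ch;
      if (ch === "\\" && i + 1 < text.length) {
        // Copy the escaped character verbatim so \" does not end the string.
        out += next;
        i++;
      } else if (ch === '"') {
        inString = false;
      }
    } else if (ch === '"') {
      inString = true;
      out += ch;
    } else if (ch === "/" && next === "/") {
      // Line comment: skip to the end of the line, keep a line break.
      while (i < text.length && text[i] !== "\n") i++;
      out += "\n";
    } else if (ch === "/" && next === "*") {
      // Block comment: skip forward to the closing "*/".
      i += 2;
      while (i < text.length && !(text[i] === "*" && text[i + 1] === "/")) i++;
      i++; // now on the "/" of "*/"; the loop increment steps past it
    } else {
      out += ch;
    }
  }
  return out;
}

// Example input: an annotated config, including a string that merely
// looks like it contains comment markers.
const annotated = `{
  // server port
  "port": 8080, /* default */
  "url": "http://example.com/*not a comment*/"
}`;
const config = JSON.parse(stripJsonComments(annotated));
```

The string-tracking is the part that matters: a naive regex would mangle the `url` value, which is one reason a dedicated tool tends to be suggested for this step.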


Presumably JSMin removes optional quotes in object literals, so it probably doesn't output valid JSON.


That would be a pretty big mistake for the author of both the JSON spec and JSMin to make? Maybe it is but it seems unlikely.


JSMin is not the right tool for this. I'm sure it conforms to the JavaScript (ECMAScript) spec but probably not the JSON spec. Here's a trivial JavaScript function to convert JSON5 to regular JSON with no comments and quoted identifiers and all that good stuff:

function JSON5_to_JSON(str) { return JSON.stringify(JSON5.parse(str)); }

This is exactly what is suggested in the Usage section of the linked article.


If the author of the JSON spec and JSMin says it's the right tool for the job, I am inclined to trust him on that barring further evidence that it is not...


Since when does Crockford say that JSMin is the right tool for this particular job: translating JSON5 into regular JSON?


Yeah but lots of people disagree with his decision. Lack of comments makes it worse for config files.


YAML is annoying, can't use tabs, you have to use spaces... and most yaml parsers if they see a tab don't warn you or anything.

Highly annoying for configuration files.


> YAML is annoying, can't use tabs, you have to use spaces

Are you serious? Would you put tabs in source code too?

> and most yaml parsers if they see a tab don't warn you or anything

If they warned you, you still couldn't use tabs; it's just that you'd be more aware that you can't use them.

It's not entirely clear to me what your point is. Perhaps it's that tabs are your preference. Unfortunately, spaces are the preferred whitespace marker for 99% of programmers.

Also, pulled directly from the YAML FAQ [1]:

    Why does YAML forbid tabs?

    Tabs have been outlawed since they are treated differently by different editors and tools.
    And since indentation is so critical to proper interpretation of YAML, this issue is just too
    tricky to even attempt.  Indeed Guido van Rossum of Python has acknowledged that allowing TABs
    in Python source is a headache for many people and that were he to design Python again, he
    would forbid them.
[1]: http://www.yaml.org/faq.html


>Are you serious? Would you put tabs in source code too?

Of course. Why the fuck wouldn't I put tabs in source code?

We INDENT the code pressing tab. We don't press 4 spaces. Why shouldn't the code reflect that? And tabs are symbolic (logical entities), so they are customizable.

You suggest we'd rather use the elaborate kludges to handle spaces as tabs in the editor? What year is this? 1978?


Because tabs are a pain in the arse when you open them in a different editor and neat code suddenly becomes unreadable as the indentation gets messed up.

Most editors have a "tab button prints spaces" option that isn't too difficult to find. (Plenty of them have a format code option as well, so it isn't such a big deal, but overall I find it easier just to switch everything to spaces.)


>Because tabs are a pain in the arse when you open them in a different editor and neat code suddenly becomes unreadable as the indentation gets messed up.

I never had that experience, and I've used Vim, Eclipse, ST3, TextMate and BBEdit. How does that ever happen?

A tab is a tab, no matter the editor. One might be set to show it as 8 chars wide or 4 chars wide etc (since it's a logical unit), but no indentation gets "messed up".

If by indentation you mean: "variables arranged to start at the same point because the programmer has OCD", maybe. But no declarations or indentation that matters, like braces etc ever changes.

>Most editors have a "tab button prints spaces" option

Yes. The back-to-the-seventies elaborate kludge I've already mentioned. It's 2014.


It happens when you open up code written by others. That's when I usually see it.

In that case you would have to set your editor to display 8 spaces per tab, or 4 or whatever the code editor where the code was written was using. I don't see that being any less of a kludge than changing it to do spaces when a tab is pressed.

I have given you a reason why I prefer spaces over tabs. Your reasoning seems to be "because I can". Then you start harking back to the 70s despite that part of your comment being irrelevant.

Is there an actual reason you prefer tabs over spaces, given that you can't visually tell the difference most of the time?


Also as I mainly use Python, it is the preferred way to indent.

http://legacy.python.org/dev/peps/pep-0008/#tabs-or-spaces


I think you might overstate the percentage here, but I think the overall sentiment is true. But I've always thought a better approach would be to ban combining leading tabs with leading spaces at the file level and leaving the choice up to the user beyond that. I believe python3 actually takes this approach.

But I am glad that YAML made any choice at all. Allowing mixing is absolutely the worst possible option.



It's a lovely theory, but put into practice it's a lot of work to maintain, especially with more than one editor. I was a proponent of this concept for a long time, but it's just not worth it [1]. If you want to be able to align the leading edge to something other than a tab stop you are better off from a practical perspective to just enforce the use of spaces.

[1] Think in particular of when code moves around. Sometimes it moves to a place where the indent level is different but the total number of expanded spaces is the same. This looks fine until you look in another editor.


Just use a capable editor. It's 2014, for heaven's sake!


The problem occurs when you are SSH'ed to a machine you are setting up, or one that is customer owned, you use whatever editor is around to make a quick edit, not realising that the file labeled .conf is actually a YAML file, you hit tab because it makes logical sense and then shit breaks.

No thanks, YAML is a terrible choice for configuration files.


If you're connecting to a machine over SSH and you're making enough changes that it's going to be difficult to maintain an indentation scheme, open it up on a local editor with SFTP. Emacs supports it out of the box, and Sublime has an excellent nagware SFTP package in the standard repos.


All YAML files are required to begin with "---". If you're aware of what YAML is, and you've seen YAML before, and yet you somehow "aren't aware" that something is a YAML file, you're incompetent.


There is no requirement for a YAML file to begin with "---".

Look at the database.yml files that come with Rails applications, or look at the +MANIFEST file that pkgng requires on FreeBSD.

Those are yaml files, no triple dashes.

Calling me incompetent isn't helping your case either.


>Sometimes it moves to a place where the indent level is different but the total number of expanded spaces is the same.

This is exactly why Smart Tabs is the only intelligent way to do it - if you change the width of a tab, everything resizes instantly to match the current user's aesthetics, but things that need to be aligned stay aligned. The only case in which it breaks is if you switch to a non-monospaced font, and in that case, God help you.


Which is precisely why tabs-only is a problem: tabs cannot express character-granularity alignment, so you HAVE to mix them with spaces in this way. (Or give up pretty alignment in favor of fixed indentation for continued expressions, which I'm not willing to do.)

Spaces just get out of my way.

P.S. I'm undoubtedly biased by Python. The community's agreement on spaces and specifically 4 spaces has been a pure blessing.


"spaces are the preferred whitespace marker for 99% of programmers."

99% of which programmers?

The Linux Kernel uses tabs for indenting


Well, I use spaces for marking whitespace in many cases. Just not for indenting, that's what tabs are for :).


Because not every editor I use will automatically turn the tab into spaces. This is especially a problem when I am editing a YAML file over SSH using vi or nano, or pico or thousands of other editors.

They stick the damn tab in, then the program that is parsing said YAML file does the wrong thing and I am confused, annoyed, and most times really pissed off because I will spend hours trying to figure out what went wrong.

YAML is stupid for configuration files.


Tabs vs. spaces... in some respects programming is so incredibly primitive.


So are "seeing" and "hearing".

I for one am looking forward to the time when neural implants will translate my high level programming thoughts into microcode sent directly to the CPU for execution.


Yeap, opposite of makefiles. I opt for TOML. YAML is great for fixtures.


Or you can do this:

config = { "version": "1.0", "comment": "JSON has comments too!" }


I hope you're joking.


Why? It's a valid solution.


Mixing metadata in with data isn't ideal.


Mixing metadata with a serialization format isn't ideal. Want metadata? Use a markup language.


Actually it is. Then you can treat them the same way.

And since it's a configuration format, you know what keys are accepted and what keys are for the metadata, and thus won't clash.


But metadata is also data.


This is what is recommended for package.json in node. Specifically "//" as the key.
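
A minimal sketch of the comment-as-key approach discussed in this subthread, assuming the convention is that any `"//"` key holds a comment. The package fields are made up for illustration; the only API used is the standard reviver argument of `JSON.parse`.

```javascript
// Illustrative input: "//" keys carry human-readable notes.
const raw = `{
  "//": "bump the version before publishing",
  "name": "example-package",
  "version": "1.0.0",
  "scripts": { "//": "run with npm test", "test": "node test.js" }
}`;

// The reviver is called for every key/value pair, innermost first.
// Returning undefined removes that property from the parsed result.
const pkg = JSON.parse(raw, (key, value) => (key === "//" ? undefined : value));
```

The file stays valid JSON for every other tool, and the consumer decides whether to keep or drop the notes. The trade-off raised above still applies: the comments live in the data, so a strict schema has to allow the extra key.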


That's what we did for some of our KDE build infrastructure metadata, since there was no way I was using YAML if I could avoid it for something this simple.


> As Stormbrew already pointed out, we already have a format that is ideal for configurations (and sure, data exchange, why not), and it is called yaml.

Unfortunately YAML for untrusted input and data exchange is unsafe by default, depending on the language and implementation. A flag might need to be set, or extra modules included like SafeYAML[1], to keep YAML from instantiating arbitrary objects.

[1] https://github.com/dtao/safe_yaml


I thought the problem wasn't with yaml itself but with deserializing arbitrary objects by default, which is unsafe for a format used for both 'trusted' and 'untrusted' input. You would have the same problem with a json library that allowed deserializing arbitrary objects by default (with a plain load method rather than one named unsafe_load). Python's pickle serialization is unsafe too, but it warns you that it's unsafe and it is not widely used, so it doesn't end up being used as a serialization format for untrusted input.


>The problem with YAML is it's not safe by default

Why would that be an issue when using it as a configuration format?


Because the parsers for that configuration format are unsafe too? Duh!


I have seen this comment for many years in almost every discussion about using JSON for configuration, and while YAML is certainly used by many projects, most people still continue to use JSON. I think that's because YAML sometimes feels like the Scala of serialization markups when people just want something like Python. I personally think that TOML[0] is not only simpler but also as easy to read as YAML, and we use it in our projects without any issues.

0. https://github.com/mojombo/toml


TOML is ok too, though I still like YAML better personally. I honestly didn't really like ini files much at any point, even when everything on windows used them, so TOML starts from an unhappy premise for me.


If web browsers had native YAML parsing, we probably wouldn't need JSON5. Web browsers aren't going to get native YAML parsing.


They probably aren't going to get native json5 parsing either (except in the sense that you can do something stupid like eval it). That said, I don't think there's any particular need for yaml in the browser. Browser code is usually dealing with machine-generated data, where even normal json is just fine.


Rich client-side apps need configuration files too.


If this was merely a configuration (say for node.js) which sits on the server, then you'd probably just import an npm package to read yaml (I don't write node.js so I don't know if using .yaml is feasible as a node.js config file or not). So the use case is limited.

I used to work on a project which users could edit a configuration file through a web editor and we chose YAML because writing JSON by hand is painful (I hate the comma error!). But we processed this YAML file for the user on the server side, so having a native YAML parser in browser and Javascript wouldn't really help me at all.


No, I'm talking about configuration files that get interpreted by code that is executing within the browser.

For example, a configuration-data file format for specifying a "brush" in a JS-client paint program. That'd obviously be a schema on top of JSON, right? Well, now you've got all of JSON's inherent limitations.


Then you don't need JSON, all you need is the plain old JS object.


You're thinking of the live representation of the "model" of a brush in the program. I'm talking about the "definition" of a brush, from which the program loads that model. Another example would be, say, a "level" in an HTML5 game. These things ship alongside the game as blobs of data. Those blobs need a format that the browser can parse. Currently, JSON is that format, and it's inadequate for that.


don't need native json5 parsing. just load the lib up. the problem with yaml is there isn't any safe+browser version that I know of.


> Web browsers aren't going to get native YAML parsing.

Because of what?


YAML failed the intelligence test of getting its name right in the first place: "Yet Another Markup Language". YAML is NOT a markup language, in any way shape or form, yet the people who designed and named it earnestly thought they were designing a markup language, and that there was a need for yet another one.

Only when somebody pointed out that obvious fact to them, did they come up with a recursive retronym to paper over their initial stupidity: "YAML Ain't Markup Language". How clever by half.

I prefer to use formats that were designed by people who actually knew what they were doing and what it was called and how it was meant to be used.


We tried to use YAML at first, but problems with data types (can't correctly remember what, but there was no way to force something to be something) made us rewrite our test fixtures in JSON. The only problem with JSON for us is that it doesn't support any comments, but JSON5 seems to fix it.

There's an interesting format for configuration called TOML[1], you should check it out!

[1] https://github.com/mojombo/toml


JSON Schema has a ton of implementations and tools hanging off of it. It is possible to load and validate a config, and then display a reasonably nice UI for editing it (in such a way that the resulting state is also valid), all by creating a single declarative JSON Schema. Nothing comparable exists for YAML as far as I can tell, and by virtue of its complexity, it is unlikely to exist for quite some time.


Whitespaces. These are the real killers.


s-expressions


This brings barely anything useful to JSON and adds processing overhead. Almost all json is produced and consumed by software, not people. Adding trailing commas or single quotes or optional quotes is a waste IMO and adds needless complexity for very little gain. Now if they were introducing some useful new types that would be a different matter.


I often use a JSON file containing dummy content when I'm working on a web app, and I manually edit the JSON a lot during development (then, when I've settled on a structure, use the static JSON file as the design for a real API endpoint). Comments and ES5 niceties would be useful for those situations, at least. You could even assign JSON5 to window.JSON in your dev environment only, and still get the performance of real JSON in production.


One other thing to consider is that JSON was meant as a replacement to XML to be more concise and easier to use as well as being generally smaller in payload size. Adding comments, while good for people, is not really what it was developed for. It's really for machine to machine communication and adding human readable stuff like comments and dangling commas muddies the protocol.


Your comment doesn't seem to address what I was saying. I just described a situation where you might use JSON5 in development (so you can put comments etc in your dummy data) and JSON in production (so you get native performance).


Here's how I've solved the use-case you talk about. I built my development files in my own format, and then transformed them to JSON for the wire. In my mind, the hassle this involves is a down-payment I gladly make in order to be able to work with sharp tools.

For your particular scenario, where your file format is similar to json already, it's not a lot of work to parse the file, and remove lines which - when stripped - begin with '#'. That's five minutes' work.
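
A rough sketch of the "five minutes' work" described above, assuming whole-line comments that start with `#` after trimming. `parseWithHashComments` and the fixture contents are hypothetical names and data made up for this example.

```javascript
// Hypothetical helper: drop every line whose first non-whitespace
// character is '#', then parse what is left as ordinary JSON.
function parseWithHashComments(text) {
  const kept = text
    .split("\n")
    .filter((line) => !line.trim().startsWith("#"));
  return JSON.parse(kept.join("\n"));
}

// Example development fixture with whole-line comments.
const fixture = `
# dummy data for the user list endpoint
{
  "users": [
    # first user is the admin
    {"name": "alice", "admin": true},
    {"name": "bob", "admin": false}
  ]
}`;
const fixtureData = parseWithHashComments(fixture);
```

Because JSON strings cannot contain raw line breaks, a line that begins with `#` can never be the continuation of a string value, so filtering whole lines like this does not risk eating real data. Trailing comments at the end of a data line are not handled by this sketch.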

Json is a wire protocol. We had XML before that, but XML tries to compromise between being a wire serialisation format, and lots of other features (e.g. human-editable, file serialisation, schemas).

Crockford's decisions are a picture of minimalism. He wanted to find a way to get a standard wire format, without making any changes to javascript to get it. Json isn't perfect as a wire format, but it remains effective as a balance of those priorities.

The original post says, "JSON's usage has expanded beyond machine-to-machine communication." This is the source of the problem - the author wants to compromise JSON's mission to serve non-core functions, things that are unrelated to being an available-everywhere wire protocol.

If we do json5, pretty soon someone else will be pointing out that we need to encode json schemas. When we point out that it's a compromise on the mission, they'll say, "that's OK - we've already made the decision to make JSON a general purpose format. Look at the way we added comments, which goes against the mission."

We'll get into hassles with libraries and versions. This is the road to XML. We already have XML, it's a horror to work with, and the reason for that is that it tries to be all things to all people instead of being a sharp tool.

Before XML we had SGML. When XML was young, I was involved with projects that chose it because it was simpler than SGML. XML is now far more complicated than SGML, in different ways of course. Now people overlook XML (it's too damn complicated!) and use json.

Fortunately Crockford seems to have foreseen the problems here, and made hard decisions early (no comments) in order to entrench against a repeat of the pattern.


can't you just do

    _dontcare = {es5: syntax, /*with comments*/}
    data = JSON.stringify(_dontcare)

?


Yeah, you can do that in your .js source, but not in a separate, non-executed file. It is useful IMHO to separate static data from code. Also, this is useful for configuration files.


No, because syntax is undefined. The trailing comma and comment are removed when it is stringified.
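
To make the two points in this reply concrete, here is a small sketch in which `syntax` is given a placeholder value (an assumption, since the snippet above never defines it). The comment and trailing comma exist only in the JavaScript source; neither survives serialization.

```javascript
// Placeholder so the object literal evaluates; the original snippet
// left `syntax` undefined, which would throw a ReferenceError.
const syntax = "ok";

// ES5 object literal: unquoted key, inline comment, trailing comma.
const _dontcare = { es5: syntax, /* with comments */ };

// The serializer only sees the resulting object, not the source text.
const wire = JSON.stringify(_dontcare);
// wire === '{"es5":"ok"}'
```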


It's interesting, the first thing I thought of when looking at this was CSON (https://github.com/bevry/cson) which essentially just provides some syntactic sugar on top of JSON like CoffeeScript to JavaScript. I haven't looked at this enough to really see the difference between JSON5 and CSON but at the surface it looks like it's just CSON rebranded as JSON5. I agree with you that in general most json is machine generated and in most cases should never be manually created by people. It's a language for machines to talk to machines not machines to people. Granted it's sort of against JSON's philosophy but I also would love to see some more types added to JSON rather than syntactic sugar. There will always be libraries like CSON or CoffeeScript to add sugar, it's better to make the base more useful and extendable.


It looks to me like a major difference between this and CSON is that this is safe and CSON is not.

Literally the only difference between coffeescript.eval and cson.parseSync is that the latter runs in a sandbox. You can't screw up so badly as to cause global side effects, and you can't require modules, but that's where the safety ends.

For example

    cson.parseSync("Math.sqrt")(4)
will return 2, and

    cson.parseSync("1 while true")
will cause an infinite loop.

That's very powerful for config files; I'd much rather have my config say "y: Math.sqrt 2" than "y: 1.4142135623730951". But it means that you absolutely must not parse CSON that comes from an untrusted source.

Now obviously JSON5 provides very little over JSON as a data-exchange format. In fact, the only practical advantage I see is that you can use Infinity as a value.

I can see two use cases for this:

1. You have a web service for developers, and you want human-editable config files that can be parsed safely on the server side. CSON won't work there, and JSON5 would be nicer to work with than JSON. It's a good thing that JSON5 is a strict superset of JSON, because this is one hell of a rare scenario.

2. You want your exchange format to be more human readable for debugging and testing. This use case isn't realistic, however, so long as JSON5.stringify is aliased to JSON.stringify.
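
On the Infinity point in the comment above, a short sketch using only the built-in `JSON` object shows what standard JSON does with non-finite numbers. The `reading` object is made-up sample data.

```javascript
// Sample data containing non-finite numbers.
const reading = { max: Infinity, min: -Infinity };

// JSON.stringify has no way to write Infinity, so it emits null instead.
const encoded = JSON.stringify(reading);

// Going the other way, a bare Infinity token is a syntax error in JSON.
let parseError = null;
try {
  JSON.parse('{"max": Infinity}');
} catch (err) {
  parseError = err;
}
```

So a round trip through plain JSON silently turns `Infinity` into `null`, which is the gap a format that accepts `Infinity` as a literal would close.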


The unsafe eval of objects in CSON seems to be an implementation quirk rather than a feature. They're working on a static parser.

Check out their WIP https://github.com/bevry/cson/issues/33


That's good. It would be nice if they'd document the spec before doing this. Right now it's pretty unclear what is and isn't valid CSON, since any CoffeeScript code will run.


Note that CSON is used in Atom.io, Github's new text editor: https://github.com/atom/language-ruby-on-rails/tree/master/g...

I've used CSON before, main reason being that I wanted a config file with comments plus easy integration with a JS/CS project. YAML also works well.


"most json is machine generated and in most cases should never be manually created by people"

Except when sketching out a static JSON file that you'll eventually implement as a real API connected to a database, once you've experimented and arrived at a good structure. I do this a lot. I wouldn't output JSON5 from a live API, but I would use it for sample data in development.


That's fair. My argument is that you shouldn't muddy the implementation of a protocol for debugging purposes. It creates unnecessary complications that then have to be dealt with on either end in software. I think the problem that you are bringing up is more of a development environment around creating good data structures in JSON rather than a pitfall in the protocol itself.

I'm not saying JSON is perfect, just providing a counter argument and stating a general belief I have about overcomplicating things when a simple companion tool or something like CSON works just as well. CSON or CoffeeScript are good examples because they provide a more pleasurable environment for the developer but at the end of the day compile down to their native JSON or JavaScript respectively. I think we are in agreement in that I don't think the JSON5 syntax should necessarily be what's going over the wire.


If anything, trailing commas would actually be nice. It won't complicate parsing, but it does make the automatic generation of JSON much nicer as you no longer have to specially treat the first or last element of an array or object.

The rest of the changes I'm not so keen on.


> it does make the automatic generation of JSON much nicer

Are you generating JSON as freeform text? Because a call to JSON.stringify (or whatever JSON-dumping method exists in your language of choice) would not be "made nicer" by handling trailing commas.


How so?


Because the tool handles generation so you don't give a damn whether the final format uses trailing commas or not? And you can use trailing commas in your source language if that's supported?


Ah, I see. You are assuming a JSON encoder exists for a given language. I'm not.


I have handwritten a fair amount of JSON in the last week, including some JSON schemas, and the trailing commas get me, coming from golang literals and back.

I don't mind the idea of alternate syntaxes that compile to json in the spirit of coffeescript or less/sass/haml etc.

Coffeescript makes it very clear that it is just an alternate syntax for javascript. If that is the aim of json5 that is cool. I just don't want to see public APIs producing and consuming json5 in the wild and breaking stuff.


If you're generating JSON with string concatenation instead of using a JSON serializer, you're almost certainly doing something wrong.
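
To make the contrast concrete, here is a rough sketch. The `rows` data and the `emitByHand` helper are made up for illustration and are not from any project discussed here. The hand-rolled emitter has to special-case the separator and remember to escape strings, while the serializer takes care of both.

```javascript
// Hypothetical sample data, only for illustration.
const rows = [{ id: 1, name: "a" }, { id: 2, name: 'say "hi"' }];

// Hand-rolled emitter: it must not put a comma after the last element,
// and it has to escape string values itself (delegated to JSON.stringify here).
function emitByHand(items) {
  let out = "[";
  items.forEach((item, i) => {
    if (i > 0) out += ","; // the "first element" special case
    out += '{"id":' + item.id + ',"name":' + JSON.stringify(item.name) + "}";
  });
  return out + "]";
}

const byHand = emitByHand(rows);
const bySerializer = JSON.stringify(rows);
console.log(byHand === bySerializer);
```

With trailing commas allowed, the `if (i > 0)` branch could go away, which is the convenience being argued for above; with a serializer available, the question never comes up.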


My thoughts exactly.


Disruption for basically zero benefit.

The proposed modifications are utterly trivial and don't really make anyone's life better.


As someone who writes a large amount of node/js, this would definitely make my life easier for when I have to write out a package.json, some other sort of json config, or test data (which happens a lot). All of the proposed changes would take away annoyances that I experience being a human that has to occasionally write JSON by hand.

As a side note, negativity for the sake of being negative does not help the conversation at all. Please take that kind of attitude somewhere else.


Expressing the opinion that these are not useful ideas isn't "negative for the sake of being negative".

These are poor modifications. Don't modify JSON to be a better configuration format, just use a better format.


"just use a better format."

Rather hard to do when the rest of the world you need to interact with won't necessarily move with you.


Hey now... When we started using JSON as our file format for scene metadata at Disney Animation these were exactly the features people wanted.

Hand-hackability and simplicity is why we use JSON. These changes definitely make things better for humans.


I like Crockford's suggestion: if you need these features, write JavaScript (or something akin to JSON5), and run it through a minifier, or eval it in a JavaScript sandbox to convert it to plain JSON.


So the suggestion for comments in JSON is "don't use JSON"? That's a dumb suggestion, and defeats the entire point of having JSON in the first place. No common formatting/tooling/parsing library can work with the config file if it has a comment. Every app wanting to allow comments needs to go get a full JavaScript EE and parse them as JS? That's absurd.

So this proposal is coming back and fixing that silly limitation.


The suggestion for comments was to use JSMin to process them out before handing it over to the json parser.

That is, if you really want comments, crockford says, go ahead and add them. Just remember to remove them before parsing as JSON. Which is actually not that hard. you can even do it with sed. Just for the love of crock don't put processing directives in the comments.
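
As a minimal sketch of that preprocessing step (this is not JSMin itself; `stripComments` and the sample config are hypothetical), a small scanner can drop `//` and `/* */` comments while leaving string contents alone, and then hand the result to the ordinary parser:

```javascript
// Strip // and /* */ comments that appear outside of string literals.
function stripComments(text) {
  let out = "";
  let i = 0;
  let inString = false;
  while (i < text.length) {
    const ch = text[i];
    const next = text[i + 1];
    if (inString) {
      out += ch;
      if (ch === "\\") {
        // Copy the escaped character verbatim so \" does not end the string.
        out += next === undefined ? "" : next;
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i += 1;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i += 1;
    } else if (ch === "/" && next === "/") {
      // Line comment: skip up to (but not including) the newline.
      while (i < text.length && text[i] !== "\n") i += 1;
    } else if (ch === "/" && next === "*") {
      // Block comment: skip past the closing marker, or to the end if unterminated.
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      out += ch;
      i += 1;
    }
  }
  return out;
}

const commented = `{
  // where to listen
  "port": 8080, /* default */
  "url": "http://example.com/a//b"
}`;
const config = JSON.parse(stripComments(commented));
console.log(config.port, config.url);
```

The string tracking is the part a naive one-line regex tends to get wrong: the `//` inside the URL value must survive.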


"Just remember to remove them"... in other words, don't use JSON. If you have to preprocess it with another program first, then you might as well use another format anyways.


> If you have to preprocess it with another program first, then you might as well use another format anyways.

What?? By that logic, you might as well hand-write x86 assembly, since C requires a "preprocessing" step (some might call that compilation).

Using "JSON5" as a human-writable format for storage and then compiling it to real JSON before going over the wire seems like the best of both worlds to me. The wire format benefits are preserved, the format is validated as a side effect of compilation, and you get to write in a well-defined language with comments and bareword-identifiers. Your objection is that you don't want to run a compiler, so you'd rather send JSON5 over the wire?


> dumb suggestion .. defeats the entire point of having JSON in the first place

No. JSON's mission is to be a wire protocol. You're suggesting using it for other purposes, and in a way that compromises its role here as a wire protocol. It's not intended to be a multi-purpose serialisation format.


We should add namespaces to JSON. And schemas. And schema validation. And stylesheet transformations. Yes, that would all be good...

Extensible types would be the next thing to add.

Now that'd be a good extension.

And we could call it Extensible JSON language. A good abbreviation would be XJL.

And eventually we'd have a standards-body defined way of querying JSON documents. We could call it JQuery.

(</sarcasm> just in case...)


Sarcasm yes, but mostly a real life exhibit of the slippery slope fallacy.

Sure, if we add comments to JSON the next thing is that it would be transformed into XML, and AJAX would be like SOAP.


Sarcasm aside, schemas have been implemented for JSON: http://json-schema.org/


we should also have JSON comments


I have no idea why you said this in reply to a sarcastic post, because

- we SHOULD have JSON comments. Have you never edited a JSON-based config file?

- No one says that HTML or XML shouldn't have had comments

- JSON5 already suggests adding comments, so your proposed extension to a sarcastic proposed extension to a serious proposed extension is idempotent


How could you forget processing instructions and stylesheets?


We should also add an extension to XML that lets you omit the opening tag. </sarcasm>


JSON really shouldn't change at all; that is why it is so great: basic types, very simple and clean. We should build a fortified defense around JSON and keep it the same.

Of course there is no problem making a new standard, mongo did that with BSON. It is different and should be called different, nothing wrong with abstractions.

The reason we all love JSON is that it has stayed the same. Develop a CoffeeScript-like JSON precompiler with comments and trailing commas and all sorts of things like no quoted keys if needed but call it something different. I think the last thing anyone wants to do is shake the JSON standard or have multiple parsers try all the new JSON formats/schemas/namespaces/etc. like an XML SOAP nightmare.

EDIT: Ok so it does have compiling to regular JSON but just not a fan of the name.

This file is written in JSON5 syntax, naturally, but npm needs a regular JSON file, so compile via `npm run build`. Be sure to keep both in sync!


> Develop a CoffeeScript-like JSON precompiler with comments and trailing commas and all sorts of things like no quoted keys if needed but call it something different.

Someone did, it's called yaml.


And I still don't understand why you'd pick yaml over anything else out there.


I pick it for config files humans have to write and read because writing and reading json as a human is tedious and error prone.


As someone who has to write yaml... writing yaml is significantly more error-prone than JSON. It has far more edge cases than JSON does.


I agree it has a lot of corner cases, but they are mostly in service of allowing for an uncluttered document. The fact that you can always write a subset of the document in json is helpful when things get hairy.

But this is a thing that often gets ignored when talking about what humans are good at reading and writing. We aren't computers (or at least, we aren't as linear of computers). It's obvious enough in the languages that we speak and write every day with each other that we derive from and plant a lot of meaning in the margins of our expression. We tend to understand things quite clearly when noise has been elided out. A lot of JSON becomes pure noise: "{{[[{[{{[".

Really, though, what I'm most in favour of is that now that we've agreed on a basic data model that is less insane than its obvious predecessors (ASN.1 and XML) it would be nice if applications with configuration opened themselves to allowing the user to decide the specific format for their own benefit. If it's TOML or JSON or JSON5 or JSON+comments (Sublime Text) or YAML or whatever, so long as it results in the same structure who cares?


> A lot of JSON becomes pure noise: "{{[[{[{{[".

I'd rather that be visible brackets than invisible whitespace.


Why?

Assuming 'invisible whitespace' in this context is actually a meaningful thing. I find it hard to miss indentation, if anything, to the point that given a conflict between whitespace and a sea of brackets/braces my first instinct is probably to trust the whitespace. I don't think I'm alone in that.


so do you use tabs or spaces. or tabs AND spaces. Is that 2 spaces or 4? or did I accidentally put 5 in somewhere? is a trailing space going to cause problems or not? Text editors certainly don't help keep this stuff straight. I just don't think white space should be part of the syntax. It makes editing HARDER. not easier. And it's not even visible, unless I turn on a special feature in my text editor.

I just don't know what exactly you YAML guys are smoking that you think it's easy or more humanistic.


...

So, in YAML you can only use leading spaces. Leading tabs are an error. This confusion of yours is not at all present in the actual thing you're talking about. And if you mix them in json god help you if your editor settings don't match someone else's because you might be looking at something nearly incomprehensible that won't error or warn at all.

I mean, have you ever actually looked at a code file of any sort with mixed tabs and spaces when the person who wrote the file had a different tab stop than you? This is a readability problem no matter what. Forcing you to have sane indentation is a feature of whitespace-sensitivity, not a bug.

What exactly do you think trailing whitespace does?

If I'm smoking something, at least I've used the things I'm talking about instead of spreading weird FUD.


> at least I've used the things I'm talking about instead of spreading weird FUD.

> A lot of JSON becomes pure noise: "{{[[{[{{[".

Pro-tip: if you are going to call someone out for not knowing a format and spreading FUD, don't post things where you do the same exact thing. It pretty much invalidates the criticisms you've been making as it says "I don't know what I'm talking about, here is proof."


That'd be cute if what I said there didn't come from experience, even if I did exaggerate the specifics.

Yes, the specific sequence of tokens is not valid because a list obviously can't be a key. The idea is still the same. I'll fix it so it is a valid sequence of tokens that's just as confusing: "}}}]}]}}"


Honestly, that's a code smell in any data serialisation notation. If you really did have a data structure like that, it would manifest in YAML as an indentation of 32 space characters. If you adhere to the 78 column rule, you've pushed most of your data almost halfway across the screen.

Anyways, invisible syntax errors still suck. At least with json you can (eventually) figure out your braces are unbalanced. Or use a syntax highlighter to spot the problem. No syntax to highlight if you have a problem with yaml.


Or how about this:

Crazy, huh?


> I find it hard to miss indentation,

I've been having to deal with large YAML config files lately, and find it's easy to miss indentation. Even with niceties that add lines to the different indentations, it's hard to keep track of where you are. Beyond a short, simple YAML file, I find it gets confusing really fast to the point of frustration.


Hierarchical, supports comments for explanation of intent, is more supported than HOCON. Data type specifiers a major plus.

I'll take all your suggestions.


Is the reason for "json5" understandable?


I love this concept; my only concern is that by adding comments, it becomes difficult to round-trip a JSON document. XML has the concept of comment nodes that can survive a round-trip, but there's no easy way to do this in JSON. Perhaps you could limit where comments could go (ie: tagging object properties or array items only) and add a "__comments__" property to an object and/or list that contains the matching comments for the items in the object and/or array.

Also, please add ISO-8601 dates (repeating my comment below)!

{ timestamp: new Date("2007-04-05T14:30Z") }


For the love of god yes! a standardised date format is the only thing that really needs changing in json.


We already have a standardised date format, as such. A specific subset of ISO 8601:

  > JSON.stringify(new Date())
  "2014-03-02T15:08:55.309Z"


What's the benefit of cluttering JSON with a constructor like new Date(...) when you can just standardize on ISO-8601 strings? JSON is not supposed to be eval'ed anymore.


JSON is JS. If I manually create a JSON document, it should also be valid in the context of code. And strings are not dates -- they are strings. That's why we have boolean and number literals.


So many other languages use JSON as a data format that keeping a weird constructor call in there would do more harm than good. Especially since you can keep the formatting code in your codebase and invoke that when you need to use that property.


We had a bug once when some logger output happened to get passed through JSON. The receiving code expected arrays of strings, but at some point one of the records happened to just contain a date. That ended up getting deserialised as a real Date object, not a string, which obviously broke. Using the same representation for different data types just seems to be asking for bugs.


The problem is that your code should have been expecting that field to be a string and nothing else, and other fields that contain dates to be a date and nothing else. In other words, if you're parsing JSON in this manner you need to apply some sort of schema to the data or have another field in the record denote the datatype. Anytime you're trying to auto-magically detect the datatype of the field, you're gonna have a bad time.
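
A rough sketch of what "apply some sort of schema" could look like with the standard `JSON.parse` reviver argument. The `dateFields` set and the field names are assumptions made up for this example: only fields the schema names are converted, and everything else stays a string even if it happens to look like a date.

```javascript
// Hypothetical schema: the keys that are known to hold ISO 8601 dates.
const dateFields = new Set(["createdAt"]);

function reviveKnownDates(key, value) {
  // Convert only the fields the schema names; never guess from the value's shape.
  if (dateFields.has(key) && typeof value === "string") {
    return new Date(value);
  }
  return value;
}

const record = JSON.parse(
  '{"createdAt":"2014-03-02T15:08:55.309Z","message":"2014-03-02T15:08:55.309Z"}',
  reviveKnownDates
);
console.log(record.createdAt instanceof Date, typeof record.message);
```

Here `message` carries the same text as `createdAt` but is left alone, which is exactly the case that broke the logger example above.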


What is wrong with standard unix timestamps?


There's not a bijection between UTC time and "standard unix timestamps" due to leap seconds

if you encode TAI in a unix timestamp, this would be a solved problem, but due to the fact that you cannot encode the time zone in the "unix timestamp datatype", you'd also have to handle cases in which people chose another timezone.

This makes it good enough for storing data (it'd be nice if the size of the data type would be explicit) but suboptimal for interoperability with other systems that could be using a different standard

either you encode the timezone in a different field in your JSON, or someone (the JSON5 people?) could define a literal for timestamps like:

t+1393752828Z (this would also eliminate the ambiguity between timestamps and number literals)


Numbers, booleans, and null should be removed. What's wrong with using strings for every value?


Yes please. I thought this would be listed first when I clicked on the link.


I love how this proposal "fixes" a bunch of non-issues, while making it much more difficult to write a JSON parser, and then doesn't address the only real problem with JSON:

That it doesn't have a date/time type.


More fundamentally: JSON doesn't have a well-defined integer type. A JSON parser that accepted just 0 and 1 as the only integers would be conforming.

And this causes real bugs in real programs:

https://lists.gnu.org/archive/html/qemu-devel/2011-05/thread...


What would you prefer JSON had done?

Defined numbers as int64? Or arbitrary precision bignums? Then JavaScript would not be able to support JSON without an external bignum library, which many/most applications do not actually need, and JSON would be less convenient to use.

Or would you prefer they had specifically defined IEEE double precision as the numeric representation? Then JSON numbers would be useless for qemu's offsets and other applications that need numbers not representable as IEEE double precision.

Leaving it unspecified means that implementations support what they can. If you end up needing actual int64s in JavaScript, you can drop in a BigNum library and get them. It's true that not all numbers can be represented by all implementations, but that was true already.


Provided a rich range of integer types, with specifications in the standard. Leaving it unspecified is really the worst choice if you're trying to interoperate with real applications and libraries. Supporting "what they can" is great for crappy implementations, and terrible if you're trying to consume this stuff and get work done.


A "rich range" of anything is contrary to the spirit of JSON. JSON became popular in part because it was a much simpler contrast to technologies like XML that provided a "rich range" of everything.

Leaving it unspecified means that JSON as a format is capable of arbitrary precision, without requiring that implementations carry around bignum libraries if their particular application doesn't need them.


JSON also has no floating-point type. It has a generic number type, and it is up to the parsing application to decide how to handle it. You have complete freedom; there is nothing to stop you treating it like an integer, float, decimal, or bignum.
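
For a concrete picture of what "up to the parsing application" means in JavaScript, here is a small sketch. The `offset` field is a made-up example. The built-in parser maps every number to an IEEE 754 double, so an integer just above 2^53 is silently rounded; carrying it as a string and converting explicitly is one possible workaround, not something the JSON spec prescribes.

```javascript
// 2^53 + 1 is a perfectly valid JSON number, but it has no exact double representation.
const parsed = JSON.parse('{"offset": 9007199254740993}');

console.log(Number.isSafeInteger(parsed.offset)); // false
console.log(parsed.offset === 9007199254740992);  // true: rounded to a neighbouring double

// Workaround sketch (an assumption): send large integers as strings
// and convert them with BigInt on the receiving side.
const asString = JSON.parse('{"offset": "9007199254740993"}');
const exact = BigInt(asString.offset);
console.log(exact.toString());
```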


I wanted to add precisely the same thing.

A lack of a standard date/time type means that some people use UNIX time, others use ISO strings (which you then need to detect, parse and handle errors for), and others are crazy and use their own format, localised just to make it fun (are people writing JSON using Django templates or PHP date formatting - yes, yes they are).

To have one standard way that everyone does dates would be nice.

PS: And the HN measure of "More comments than upvotes?" clearly works as controversial/bad ideas do generally follow that rule.


JSON in ECMAScript 5 does have a date/time type. A specific ISO 8601 subset:

  > JSON.stringify(new Date())
  "2014-03-02T15:08:55.309Z"


    > JSON.parse("{\"date\":\"2014-03-02T16:08:43.444Z\"}")
    Object {date: "2014-03-02T16:08:43.444Z"}


Yes, but it is trivial to then convert it to a date with the new Date() constructor.


But it's not trivial to know that it's a date.

It's OK if you are expecting a Date. But when you want to write generic code that "discovers" the properties, then you have to do dirty things, like trying to parse every single string as a date.


Close. That's a string that can be parsed as a date as opposed to a date literal.


The only thing JSON needs is a standard date type. All these JSON5 additions are a waste and in some cases harmful.

Example: Comments were taken out of JSON because some parsers started using them to store processing instructions.


This is a terrible idea.

JSON has been successful because it found a sweet spot between three factors:

1. It is useful. There is enough flexibility to represent most structured data easily.

2. It is portable. There are libraries to read and write JSON data in every major programming language.

3. It is simple and unambiguous.

Things like allowing keys to be unquoted if they're valid identifiers in JavaScript (as JavaScript happens to be defined by the latest standard) save two characters, at the expense of breaking portability and future-proofing, making the specification more complex, and introducing ambiguity. This is not a worthwhile trade-off.


This just seems to fix problems for the lazy or those that lack architectural rigidity in their development patterns.

JSON is a data encapsulation format that doesn't care about the data contained within. That is up for your programs to consume and decide.

If you think JSON needs "more" like comments, parsing, multi-line, etc. then perhaps you need to revisit your architecture.

If you disagree I am sure there are plenty of frameworks out there that will babysit your data and document things for you. But upsetting the core apple cart here would be a huge mistake.


The support for numeric Infinity is basically the one unabashedly-semantic modification; it doesn't go far enough, though, since it should be possible to serialize any float, valid or invalid, in a data-exchange format. I'd think proper support would involve a Rational representation--encoding Infinity as, say, 1/0.
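
For reference, this is what plain JSON does with non-finite floats in JavaScript today: a small sketch using only the built-in `JSON` object, with made-up field names. The values are replaced by `null`, so they do not survive a round trip.

```javascript
// Non-finite numbers have no JSON representation, so the serializer emits null.
const encoded = JSON.stringify({ max: Infinity, min: -Infinity, bad: NaN });
console.log(encoded);

// After parsing, the original values cannot be recovered.
const decoded = JSON.parse(encoded);
console.log(decoded.max === null, decoded.min === null, decoded.bad === null);
```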


So anyone that picked JSON as a configuration format needs to revisit their architecture? At the moment, yes. But if comments had not been forbidden then things would be just fine.


Where do I go to vote against this? Just write a script that just converts your invalid JSON into real JSON.


These are all good ideas, but will NOT be adopted.

Trailing commas are especially appealing: it simplifies outputting JSON from "(A , ) * A" to "(A , ) * ". In fact, in this respect, XML is easier to output than JSON.

But it will not be adopted. Crockford already removed comments from the JSON spec once. Guy knows what's up: it's a standard, keep it simple. If you want the kitchen sink, use XML. JSON5 will achieve about as much adoption as yaml: some. Having json in the name and syntax won't bring massive adoption.


This is not meant to be adopted, as in, replacing JSON everywhere.

It doesn't make sense for REST APIs. It doesn't make sense for communication between programs. In fact, it's basically impossible to use this implementation for that, because (the last time I checked) JSON5 just used the JSON reference serializer. If you create JSON programmatically, it will always be valid vanilla JSON!

The use case is hand-written JSON. Either for configuration files (like Sublime Text). Or if you have hand-written data that you would otherwise put in a object in a .js file. You might want to separate your data from your code, but you can't just dump the javascript object literal into a json file, because it's not valid JSON. And maybe you want to leave comments in there, or trailing commas in lists.

If you send JSON over the wire, it's trivial to sanitize it:

    JSON5.stringify(JSON5.parse(sloppy_json))
This will also remove comments of course, if you are concerned about them eating bandwidth.


It allows trailing commas, and claims IE6 compatibility. I don't understand how both can be true.

  In Internet Explorer 6 and 7, there is an easy to introduce
  javascript bug relating to commas. If, when defining an 
  array or an object, you leave a trailing comma after the 
  last item in your collection, IE will fail to parse your 
  javascript file:

    var x = [1,2,3,]; //ERROR
    var y = {'a': 1, 'b': 2, 'c': 3,}; //ERROR


They wrote a custom parser that works in all browsers -- it doesn't use eval. JSON5 is a subset of ES5, not a subset of JavaScript that runs in all browsers. Notice that it also allows reserved words as keys -- an ES5 feature that doesn't work even in IE8.


A successor, revision, or update to JSON must address Dates to have any hope of displacing JSON 1.0.


This is not meant to displace plain JSON!

JSON5.stringify(obj) produces plain compliant JSON. Using JSON5, programmatically produced JSON will always be JSON 1.0. This is for configuration files, and static (hand-edited) data.

And what is wrong with just using ISO dates in strings? I mean what's the difference between, say:

    "date": /2010-03-23T23:57Z/,
    "date": Date(2010-03-23T23:57Z),
or whatever the syntax would be and

    "date": "2010-03-23T23:57Z"
? The irony is that JSON5 without dates is more compatible with JSON than any JSON implementation with dates. (JSON5.stringify(JSON5.parse(...)) always produces pure JSON without losing information. I don't know how you would convert a date literal into regular JSON other than turning it into a string, on the other hand.)


JSON5 == kinda cson?

although i think coffeescript object notation was an unfortunate name choice. people averse to coffee will immediately associate bad thoughts with it.

https://github.com/bevry/cson


I honestly thought TOML (https://github.com/mojombo/toml - created by founder of Github) was a good configuration format. But unfortunately, I'm not sure that Tom has been a good steward of it, and most of the interest from the early days died off because Github issues would go unanswered or unresolved for months.


Same here. I have been keeping an eye on libucl[1] lately. The nginx-like config style seems very nice. I ran across it when looking at FreeBSD's pkg-ng (it uses it). I haven't seen any libs for it in any other languages yet though, so maybe it will end up being just another 'also ran'.

[1]: https://github.com/vstakhov/libucl


Most of this is already implemented in my C library called "liblaxjson" [1]

I did it because I had a use case where the JSON file is for user input, not for computer-to-computer communication. So I wanted it to not be so annoying to edit and more importantly, to allow comments.

[1]: https://github.com/andrewrk/liblaxjson


This is what json should have been from day one.

How many fields in all json streams ever transmitted aren't alphanum compliant? My bet is 99% of them are simple names, so forcing it to be surrounded by quotes for that 1% is just a waste of chars and shift keys, even if they're mostly automated.

I spend a lot of time designing, testing and validating apis for which I have to read tons of json, and believe me, without quotes my life would be so much easier.

Comments are also important for testing. Trailing commas would save plenty of extra code to remove them from lists.

These improvements would be mostly welcome.


the reason the quotes were added to keys is not alphanumeric compliance. It's avoiding javascript reserved words like "do" and "class".


This adds too little and too late (not that much should be added to JSON). And parsing some of these features would be a headache and a possible source of bugs (I maintain a JSON parser in C[0]). Keeping it simple is much more important than adding new features.

[0] https://github.com/kgabis/parson


Almost all of the issues solved by JSON5, are also solved by EDN:

https://github.com/edn-format/edn

I'd like to see a little bit more love for edn... and if you're gonna pick something incompatible with plain old JSON, why not edn?


I can only speak for myself, but I don't want something incompatible with JSON, or completely different.

I want a format that looks like Javascript or Python code, and that is a strict superset of JSON (e.g. can parse all valid JSON). I want to pluck my object or list literal from my Python or JS code, put it into a config file, and have it work, no matter whether I have final commas in lists or not. I want to be able to insert comments. And I want to be able to describe it as "JSON, but you can use comments", so people will be able to understand it immediately and edit the files. This is similar to what Sublime Text uses for its config files, for example. EDN, or more prominently YAML, are just not well-known enough (and the latter is hellishly complicated).

Also, a nicety of JSON5 is that it serializes into plain JSON, so interoperability is always ensured.


How about defining a format for dates? That would be more helpful than most the proposed changes here.


There are no date literals in Javascript or Python, so it would lose compatibility. The nice thing is that you can just take almost any object literal and put it into a json5 file, and it is valid json5.


What's wrong with ISO-8601?


I'd like to see explicit date types using ISO-8601 format, ie:

timestamp: new Date("2007-04-05T14:30Z")


Your application needs to know that "timestamp" is a timestamp anyway. Why would you want to add that to your data as well?


To make parsing easier? By the same logic why differentiate between numbers, strings, and booleans?


Because those are javascript primitive types, which json is built around in the first place. Date is not a javascript primitive, and is therefore not included.

If we need to follow that logic (to include useful non-primitives) we would end up with a clusterfuck of a spec that no one would want to implement.


But why stop at dates? What about all the other useful things we could put in, DOM elements and sets and synchronization clocks...


Except if your application is a generic client that doesn't know anything about the data model of the documents it's manipulating. For example, very limited type information is the primary reason JSON can't be used in truly RESTful APIs, as JSON has no hyperlinks. A myriad of standards were invented to propagate metadata (mostly type information) with the documents.


> Except if your application is a generic client that doesn't know anything about the data model of the documents it's manipulating

This hypothetical generic client always comes up in discussions of data formats and RESTful APIs. But I've never understood what use it could be. It seems to me that any meaningful software (i.e., anything beyond a debugging tool) that would consume data or an API would need a more specific UI than a tree view with links.


Well, I have a typical CRUD webapp that has a lot of places where the user just has to edit a table of properties. I don't really like that, unless the server has a non-standard way to provide information on the schema (luckily, Tastypie does that for me), I'll have to re-declare every model's fields on the client side.


XML, Schemas, XML-RPC, SOAP and Thrift were all invented to solve those things. It got too verbose, so JSON was created to make things simple again. If we add back all the staticness, we'll end up where we started.


I agree that JSON5 misses much of the spirit of JSON's simplicity, but I built a Perl 6 JSON5 parser for fun anyway, if anyone is interested: https://github.com/Mouq/json5

The actual grammar is here: https://github.com/Mouq/json5/blob/master/lib/JSON5/Tiny/Gra... (it would be nice if Github's syntax highlighting for Perl 6 regexes was more sophisticated than "turn it green," but I'm glad it syntax highlights in the first place)


Good Ol Hacker News, regurgitating something from 2 years ago:

https://news.ycombinator.com/item?id=4031699

That said, Aseem is very sharp. Nice to see this come up for discussion again.


Nice work, aseemk. I put together a quick jsperf [1]. JSON5 code was copy-pasted into setup, since it isn't on a CDN (that I saw).

1 http://jsperf.com/json-vs-json5


For what really happens when you add lots of convenience features to something like json -- it's not XML, it's YAML.

And it turns out all those extra convenience features in YAML are a mess -- if you haven't discovered that yet, it's because you haven't been bitten by a weird edge case related to the complex interplay of features yet.

Of course, this proposal is just a few convenience features, it's not nearly to YAML level. That probably also means it doesn't add enough benefit to any developer's pain points to justify anyone using it instead of the more universally recognized plain old json.


I'm eagerly awaiting the C, C++, and Python implementations. Has anyone started on these?

Extending rapidjson, jsoncpp, yajl, and simplejson seems like a natural place to start.


I have a python implementation, but haven't found the time to share it yet.

Funnily, I independently "invented" this before I found JSON5. I called it "yottson", because that's how JSON would be pronounced in German and what I sometimes call it in my head. Its a fitting name, because it is my ideosyncratic "almost JSON".

Formats like this are mostly meant to be used for configuration files. For example the sublime-text configuration files are JSON + comments. I don't know what all the hate is for, it's clear that you wouldn't send this over a wire to a client expecting pure JSON. In fact, the serializers in JSON5 and in Yottson only ever create standard-compliant JSON. Its a case of "be liberal in what you accept, strict in what you send".

One thing I'd still like to implement though is lossless parsing. The parser would keep track of all the comments and whitespace, so that when you change a value programmatically in the config file, it only modifies the value and preserves all the indentation and comments. Right now, as I mentioned, writing the file programmatically always results in pure compliant JSON, but kills all comments.


I would love to not have to quote identifiers!


Commas and colons are unnecessary clutter in JSON, that's what they should change, not adding more complexity in exchange for nothing (dangling commas at the end? please). Quotes inside keys and values should also be optional unless they contain spaces.

JSON should be simplified (if anything), not made more complicated.


"JSON isn't the friendliest to write and maintain by hand."

And this is a problem why, since it SHOULDN'T BE WRITTEN BY HAND?

This suggestion adds additional complexity to every parser of the "new" format, in exchange for... nothing, apart from a few less syntax errors for sloppy hand-editors. WTF.


Why shouldn't it be written by hand?

If you have an object or list literal in your JS, it's already "almost-JSON". People write a lot of "almost-JSON". I want to separate data from code and put it into a data file, but then I have to remove final commas, comments, and so on. This is great for static data that would otherwise be in your code.

Also it is great as a config file format. Sublime Text uses JSON+comments, for example. Everybody who knows Python, JS, or JSON knows how to use it immediately.

I don't know where the problem is, since with the current implementation it is not possible to write JSON5 not by hand! JSON5.stringify just calls the reference JSON serializer.

The added complexity is not much, and has already been implemented. You don't need to change every implementation. It's just a drop-in replacement for JSON so you can have comments in your static JSON data files. I think it's extremely convenient. Using JSON as a config file format in Sublime Text for example is only viable since they are heavily commented.


I have a silly question, only having used JSON a precious few times, but if people want comments in JSON, couldn't that be done just by a named field within an object that follows the existing specification, iow an optional property holding a string?


By doing that, you've now added more data to the object. It'd be like doing <tag comment='bar' /> in XML - the actual data has changed.

So you'd end up defining some well-known comment name. But then you can't serialize arbitrary data structures, because they might conflict. So that gets complicated.
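
A minimal Python sketch of that conflict, assuming a made-up reserved key `_comment` (neither that name nor the `strip_comment_fields` helper comes from any spec or library):

```python
import json

COMMENT_KEY = "_comment"  # hypothetical "well-known" comment name


def strip_comment_fields(value):
    """Recursively drop every field named COMMENT_KEY."""
    if isinstance(value, dict):
        return {
            key: strip_comment_fields(child)
            for key, child in value.items()
            if key != COMMENT_KEY
        }
    if isinstance(value, list):
        return [strip_comment_fields(child) for child in value]
    return value


# The "comment" is ordinary data: it survives parsing and changes the object.
config = json.loads('{"port": 8080, "_comment": "dev server port"}')

# Arbitrary data that legitimately uses the same key loses it when stripped.
user_record = {"title": "Bug 12", "_comment": "the customer's own note"}
cleaned = strip_comment_fields(user_record)
```

Here `config` ends up with two fields rather than one, and `cleaned` silently loses the customer's note, which is the serialization conflict described above.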

After everything, the JSON spec could have allowed comments to stay in. Instead, the author went off on some nonsense reasoning about incompatibility.


How hard would it be to just parse JSON5 into JSON? (Strip out the comments, add quotes around keys, eliminate trailing commas from objects and arrays...)

Once that's in place, then JSON5 would instantly become much more viable...
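
As a rough sense of the effort, here is a small Python sketch that does only the three things listed above (strip comments, quote bare keys, drop trailing commas) and then hands the result to the standard `json` module. The function name `json5ish_to_json` is hypothetical, and the sketch assumes the input uses only double-quoted strings; single-quoted strings, hex numbers, and multi-line strings from the JSON5 proposal are not handled.

```python
import json
import re

# A double-quoted string, including escaped characters.
_STRING = r'"(?:\\.|[^"\\])*"'

# Pass 1: match strings first so comment markers inside them are left alone.
_COMMENTS = re.compile("(" + _STRING + r")|//[^\n]*|/\*.*?\*/", re.DOTALL)

# Pass 2: again skip strings, then find bare keys and trailing commas.
_FIXUPS = re.compile(
    "(" + _STRING + r")"
    r"|([A-Za-z_$][A-Za-z0-9_$]*)(?=\s*:)"   # bare identifier used as a key
    r"|,(?=\s*[}\]])"                        # comma right before } or ]
)


def _fix(match):
    if match.group(1):            # a string: keep it verbatim
        return match.group(1)
    if match.group(2):            # a bare key: wrap it in quotes
        return '"' + match.group(2) + '"'
    return ""                     # a trailing comma: drop it


def json5ish_to_json(text):
    """Convert a limited JSON5-like subset to strict JSON text."""
    without_comments = _COMMENTS.sub(lambda m: m.group(1) or " ", text)
    return _FIXUPS.sub(_fix, without_comments)


source = """
{
    // which port to listen on
    port: 8080,
    url: "http://example.com/",   /* the // inside the string survives */
    tags: ["a", "b",],
}
"""
config = json.loads(json5ish_to_json(source))
```

Comments are removed in a separate first pass so that a trailing comma followed by a comment is still recognized as trailing in the second pass.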


I really hope this idea dies quickly.

One of the best things about json is its simplicity and rigid syntax.

If we open up this can of worms, we will incur thousands of man-years of future wasted time tracking down incompatibility issues.


a bit OT, but you guys might find this useful.

JSONH is json for homogeneous collections, read: csv files.

https://github.com/WebReflection/JSONH


Want comments and all kinds of other goodies in your config? Why not make a config.js that offers full configuration language flexibility?


That is what this is.


The one best thing about this - comments.


Obligatory xkcd - https://xkcd.com/927/


Hexadecimal numbers and object keys without quotes would be useful to me. The rest of it ... meh.


On the plus side, since it's all optional, we can all start claiming compliance immediately.


"[objects, arrays, ] can have trailing commas."

That's all I needed to hear, I'm on board.


Comments are not in json for a well understood reason: https://plus.google.com/app/basic/stream/z12ztpczbxrdglfgl04...

I want trailing commas.

But the delta confuses me - what is it and how is it used and can it be negative?


Crockford does opinionated wrong. It's fine to be opinionated, but it's not fine to give absurd (often stupid) justifications for actions and beliefs that are wrong:

"I removed pointers from C because I saw people were using them to manipulate implementation defined details of their environment, a practice which would have destroyed interoperability. I know that the lack of pointers makes some people sad, but it shouldn't."

Another of my favorites is the justification for part of why jslint is unusable in any real environment because Crockford thinks anonymous functions are unprofessional and only used by people who don't know what they're doing:

https://groups.yahoo.com/neo/groups/jslint_com/conversations...


>It's fine to be opinionated, but it's not fine to give absurd (often stupid) justifications for actions and beliefs that are wrong

That's asking Crockford to not be Crockford.


Dates... where are the dates!


OMFG comments are allowed! Was it so hard to allow those in the first spec?


I think comments are really a bad idea. If you want comments, either make them explicit properties in your data structures or use something else like YAML or XML. JSON is the most practical language out there for inter-machine and inter-program data representation.

JSON has two unique properties among data representation languages that make it useful for certain types of tasks: 1) it is very simple to write conformant parsers and serializers, and 2) the parsed representation contains all the original semantic information in the original document (excluding maybe whitespace). If you have a separate syntax for comments, then parsers must either drop the comments (making transformation tools less powerful) or provide a more complex representation (e.g. like the DOM) and drop the simple map/list/scalar data model.

Please don't make JSON the next XML! Source: I spent about 5 years suffering with XML in the Enterprise Application Integration world.


This is pretty unnecessary. If you need all these things, just use XML.


I couldn't tell, does the Numbers support include NaN?


Stopped reading at "can have trailing commas"...


They are really useful. The only two good options are allowing trailing commas or considering all commas to be whitespace.


They are ugly.


They also make JSON generation by machines much easier, and don't complicate the parser. [1]

[1] Source: I've written a JSON parser from scratch and think it would be valuable
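
A small Python sketch of the generation side, under the assumption that items are written one at a time (both function names are made up for illustration):

```python
import json


def emit_strict(items):
    """Strict JSON: the writer must know whether an item is the first one."""
    chunks = ["["]
    for index, item in enumerate(items):
        if index > 0:
            chunks.append(",")
        chunks.append(json.dumps(item))
    chunks.append("]")
    return "".join(chunks)


def emit_with_trailing_commas(items):
    """Trailing commas allowed: every item is written the same way."""
    chunks = ["["]
    for item in items:
        chunks.append(json.dumps(item) + ",")
    chunks.append("]")
    return "".join(chunks)
```

With a join helper the difference is small; it matters more when items are streamed out as they arrive and the writer would otherwise have to track its position. Note that the second form is not valid strict JSON, so Python's own `json.loads` rejects it.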


More importantly, they make it possible to have not-ugly documents (commas where they belong, after the clause they extend) work well with version control and merging.

    -   "blah"
    +   "blah",
    +   "blorp"
is far uglier than any trailing comma ever has been.
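
The same comparison can be reproduced with Python's standard `difflib`; this is just an illustrative sketch, and the `changed_lines` helper is a made-up name:

```python
import difflib


def changed_lines(before, after):
    """Return only the added/removed lines of a unified diff."""
    diff = difflib.unified_diff(before, after, lineterm="", n=0)
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Without trailing commas, appending an item also touches the previous line.
no_trailing = changed_lines(
    ["[", '    "blah"', "]"],
    ["[", '    "blah",', '    "blorp"', "]"],
)

# With trailing commas, only the new line shows up in the diff.
with_trailing = changed_lines(
    ["[", '    "blah",', "]"],
    ["[", '    "blah",', '    "blorp",', "]"],
)
```

`no_trailing` contains three changed lines (one removal, two additions), while `with_trailing` contains only the single added line.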


Is that a problem that should be solved on a data format level or a version control level?


Do you have a proposal for solving it at the version control level that doesn't require either a complete paradigm shift in programming and data formatting languages to everything being trivially an AST or embedding knowledge of every format ever invented in your version control?

Because it seems a lot easier to just design newer languages to be friendlier to version control than either of those. JSON has little excuse here, even when it was 'invented' it was a bad idea to simply run it through eval and pray it had nothing malicious or damaging in it, so compatibility with IE6's stupid parser wasn't really all that special. Never mind that it didn't take off as a really popular format until well after that was a concern.


if text editors can syntax highlight why can't version control systems syntax version?


If you store JSON in a version control system, then it is usually hand-edited JSON. (If it is generated programmatically at runtime, then the data is in a database.)

And if a file is hand-edited, its format should have niceties that make hand-editing more pleasant, like optional trailing commas, and comments.

By making the format more suited to hand-editing, you also make it better for diffing and VCS.


    require([
    -   "blah"
    +   "blah",
    +   "blorp"
    ], function () {
AngularJS templates often can involve JSON structures for <select> and other UI junk.

It'd be great if a backend could serve JSON structures directly to the Controller, which would inform the Directive's @attr template. So maybe the problem goes away in this regard, and the build could handle AMD/CommonJS wrappers which is outside of the version control layer to an extent — or at least wrappers do not necessarily have to be.

    require([
        "blah",
        "blorp",
    ], function () {
I added ``blorp`` — I never removed ``blah`` except for some limitation of the language I'm using. Why should that be part of the organic record?


Comma-separated lists are inherently ugly--you're basically using a binary operator to cons values together. I can never understand why more languages don't support either juxtaposition (i.e. Lisp-like list construction) or semantic whitespace with "bulleted" prefixing (i.e. YAML-like list construction.)


Because a comma-separated list is how humans write (and read) lists?


They're very useful to achieve clean diffs in version control. (Fwiw, I find the asymmetry of their absence ugly.)


I'll propose this: websites should be readable.


JSON is perfect as it is, leave it like it is.


What about an open binary format? I don't mean BSON.


Yes, please.



