
Douglas Crockford: Why I removed comments from JSON - mmastrac
https://plus.google.com/118095276221607585885/posts/RK8qyGVaGSr
======
DTrejo

        {
          "//": "I approve of the choice to omit comments from JSON."
        }

~~~
generateui
I was thinking exactly this when I was halfway through the article. "Why not
just add a property as a comment?"
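
For anyone who wants to see the difference, here is a minimal sketch using the standard `JSON.parse`. The `tryParse` helper and the sample strings are made up for illustration. It shows that a `"//"` key is just an ordinary property that every consumer will see, while a real comment is rejected outright.

```javascript
// A "//" key is an ordinary property; a real comment is a syntax error.
const withProperty = '{ "//": "not a real comment", "port": 8080 }';
const withComment = '{ /* a real comment */ "port": 8080 }';

// Hypothetical helper: report success or the error name instead of throwing.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

const a = tryParse(withProperty);
const b = tryParse(withComment);

console.log(a.ok, Object.keys(a.value)); // the "//" key shows up next to "port"
console.log(b.ok, b.error);              // parsing fails with a SyntaxError
```

So the property trick parses everywhere, but the "comment" is now data that consumers have to know to skip.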

~~~
tolmasky
There are many good reasons not to add documentation as properties. For
starters, it is a very US-centric approach to things. I would probably be
annoyed if I was using a foreign language library that had random strings
which contained text I did not understand in my objects.

Secondly, the second you start putting documentation in your JSON, be prepared
for services to break if you ever change that documentation. It may be silly
to do "if (json._doc_.indexOf...)", but stranger things have happened and if
you are making an API it is generally a bad tactic to require your clients not
to be silly.

"But I intend to have it in there solely because I can't use real comments --
I don't plan on _shipping_ these". OK then probably no problem -- but at this
point they're really no better than normal comments (with respect to the issue
that Crockford brings up) since there's no reason you can't put compiler
directives in the "comment" properties:

    
    
        {
        /* do something special mr. proprietary compiler */
        "compiled_property": "magic"
        }
    

vs.

    
    
        {
        "//" : "do something special mr. proprietary compiler",
        "compiled_property": "magic"
        }
    

So the possibility of diverging incompatible implementations remains --
however I think the real reason this didn't happen wasn't the lack of comments
but rather that JSON's biggest selling point is almost universal
compatibility.

~~~
malandrew
It's not US-centric. It's English-centric and English is the lingua franca of
software engineering, as it is in many other professions. TBH, programming is
and always will be English-centric, and I would never feel 100% confident
with any software engineer that isn't very proficient in English insofar as
reading and writing are concerned. Practically every tome, text, blog or
article of any real value in the field is published in English, and only once
they are really well known are they translated to other languages. The only
time the English-speaking programming world is behind that of another language
is when a new language is developed by a non-native English speaker (such as
Lua developed in Brazil and Ruby developed in Japan). And even then no
language would ever see popular adoption to the point of importance if it
doesn't adopt English as the language of its source and its documentation.

This isn't bigotry. This is just the way things are.

FWIW, English isn't my first language, Portuguese is, but I have native
fluency in English.

~~~
jballanc
> This isn't bigotry. This is just the way things are.

Indeed, I don't think this situation is all that different from Medicine or
Biology. Parts of anatomy, diseases, and taxonomic classifications will
forever be known by their Latin names. Likewise, programmers will probably
still be referring to "if" statements and "for" loops long after no English
speakers remain.

~~~
mseebach
> Likewise, programmers will probably still be referring to "if" statements
> and "for" loops long after no English speakers remain.

Or, in reverse; I knew what those words did in BASIC before I knew what they
meant in English (which isn't my native language).

~~~
kolinko
Same here :)

------
hp
It makes total sense to omit comments, because JSON is optimized for machine-
readability and interoperability (simple spec, simple implementation, no
extension mechanism, fast to parse). Putting comments in would compromise it
for that purpose.

JSON isn't a great config file format, if your config file is meant to be
human-edited. But the solution is simple; use a format designed for human
editing.

Config files need UI design too.

During Akka and Play 2.0 development, we approached this by starting from the
hand-rolled parsers found in Akka 1.2 and Play 1.2 (both had ad-hoc parsers to
support a "pretty" config file format). We took the aesthetic preferences of
the hand-rolled parsers seriously, and came up with HOCON:
<https://github.com/typesafehub/config> (scroll down to "JSON Superset"),
<https://github.com/typesafehub/config/blob/master/HOCON.md>

The HOCON format is a superset of JSON and also happens to be mostly
compatible with the Play and Akka 1.2 ad hoc parsers (which were independently
developed, so two data points on what people wanted).

HOCON is roughly similar to YAML in complexity. Like YAML the spec is pretty
long ( <http://www.yaml.org/spec/1.2/spec.html> ). And the Typesafe Config and
SnakeYAML jars have pretty comparable amounts of bytecode in them.

HOCON or YAML would be terrible formats for an API or something like that, and
they're painful to implement, but I strongly prefer them to JSON for human-
maintained configuration.

Anyway, I think the genius of JSON was its focus on machine interoperability
rather than trying to do everything; that's why it works so well for machine
interoperability. But it doesn't mean you have to use it for everything.

~~~
gnuvince
From json.org:

"JSON (JavaScript Object Notation) is a lightweight data-interchange format.
It is easy for humans to read and write. It is easy for machines to parse and
generate."

Humans are mentioned before machines. Comments improve human readability.
Furthermore, comments are very simple to implement, and don't lower the ease
of parsing.

~~~
cube13
There are 3 mutually conflicting goals of JSON:

1\. Machine parsing/generation

2\. Human parsing/generation

3\. Ability to transmit it in a parsable format through HTTP

#1 conflicts with both #2 and #3. It conflicts with #2 because binary data
formats are much easier to parse, and are generally smaller on the wire than
plain text formats. Also, it conflicts with #3 for the same reason, since HTTP
is a plain text format, and JavaScript generally deals with plain text better
than binary data.

#2 conflicts with #3, because comments are nice for human readability, but
should never be sent on the wire, because they are not necessary to parse the
sent data. If your comment is important, it should be a data element that's
sent on the wire.

The end result is a compromise. Since two of the concerns effectively make
comments useless, the compromise should be towards those two. In addition, the
reason given by Crockford makes quite a bit of sense. JSON, much like XML, is
a format for the transmission of data. Everything in a given JSON object
should be important. It's not a programming language, which means that control
directives are unnecessary.

------
lurker721
Consider these two paraphrased arguments against Crockford, each made
independently by many poster-programmers here:

(1) I don't like comments, but removing them won't solve anything: People who
like comments can just embed them as funky properties.

(2) I like comments, but embedding them as funky properties is annoying and
awkward, and I don't want to.

Now look at (1).

Now look at (2).

Now relax your eyes like you're looking at a stereoscopic poster until (1) and
(2) merge together and you achieve a Zen-like understanding -- or a view of a
schooner.

------
burgerbrain
> _"I saw people were using them to hold parsing directives"_

What could possibly make somebody want to do that? Are there any examples
around of people doing that?

~~~
tolmasky
You can imagine something like this:

    
    
        {
           /* if IE */
           browser: "IE"
           /* else */
           browser: "standard"
           /* endif */
        }
    

Pretty terrible and still possible (but admittedly harder) without comments.
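
To make the worry concrete, here is a minimal sketch of what a directive-aware preprocessor could look like. Everything in it is hypothetical: the `preprocess` function, the `/* if NAME */ ... /* else */ ... /* endif */` syntax and the target names are assumptions for illustration, not any real tool. The keys are quoted here so that the output is valid JSON, and nesting is not handled.

```javascript
// Hypothetical directive syntax, one directive per line:
//   /* if NAME */ ... /* else */ ... /* endif */
function preprocess(text, target) {
  const kept = [];
  let active = true; // are we inside a branch that applies to `target`?
  for (const line of text.split("\n")) {
    const directive = line.match(/^\s*\/\*\s*(if\s+(\w+)|else|endif)\s*\*\/\s*$/);
    if (!directive) {
      if (active) kept.push(line); // ordinary line: keep only in an active branch
    } else if (directive[2] !== undefined) {
      active = directive[2] === target; // "if NAME"
    } else if (directive[1] === "else") {
      active = !active;
    } else {
      active = true; // "endif"
    }
  }
  return kept.join("\n");
}

const source = `{
  /* if IE */
  "browser": "IE"
  /* else */
  "browser": "standard"
  /* endif */
}`;

console.log(JSON.parse(preprocess(source, "IE")).browser);     // IE
console.log(JSON.parse(preprocess(source, "Chrome")).browser); // standard
```

A file like `source` only means something to parsers that know the directive syntax, which is the kind of divergence being described.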

~~~
greghinch
If you are storing this kind of implementation logic in your data, I hope I
never have to work with you (not aimed at parent posting, but rather the
global "you").

~~~
davedx
Unfortunately it's all too common in mobile development - mobile is the new
"bad old days" of user agent sniffing hell.

~~~
pavel_lishin
You typically don't store this sort of thing in data, though.

Then again, we have certain types of logic stored in a database table, loaded
through fixtures... so my two cents may be worth much less than what they
appear.

------
warmfuzzykitten
Comments about removing comments thus far seem to miss the point. By removing
comments, he greatly simplified the parser. Since he was sure to be criticized
for this design choice in the direction of simplicity, it was a brave
decision! Had there been enough radical simplifiers on the XML committee, we
might not have needed JSON. Wait... Committee? Never mind.

~~~
gnuvince
Greatly simplified the parser? It's probably 10 lines of code.

~~~
doomslice
<http://coding.smashingmagazine.com/2012/04/27/yahoos-doug-crockford-on-javascript/>

"One interesting story about leaving things out: as we got closer to releasing
JSON I decided to take out the ability to do comments. When translating JSON
into other languages, often times the commenting piece was the most
complicated part. By taking the commenting out we reduced the complexity of
the parsers by half—everything else was just too simple."
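
For a sense of scale, here is a rough sketch of a comment-stripping pass in JavaScript. It is not taken from any real parser. `stripComments` is a made-up name, and the choice of `/* */` and `//` as the comment syntax is an assumption. Most of the bookkeeping is there so that comment-like text inside string literals is left alone.

```javascript
// Strip /* ... */ and // ... comments from JSON-like text, leaving string
// literals untouched. A sketch only: it does not validate the JSON itself.
function stripComments(text) {
  let out = "";
  let i = 0;
  let inString = false;
  while (i < text.length) {
    const ch = text[i];
    const next = text[i + 1];
    if (inString) {
      out += ch;
      if (ch === "\\" && i + 1 < text.length) {
        out += next; // copy the escaped character verbatim
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i += 1;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i += 1;
    } else if (ch === "/" && next === "*") {
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2; // unterminated: drop the rest
    } else if (ch === "/" && next === "/") {
      const end = text.indexOf("\n", i + 2);
      i = end === -1 ? text.length : end; // keep the newline itself
    } else {
      out += ch;
      i += 1;
    }
  }
  return out;
}

const source = `{
  /* block comment */
  "url": "http://example.com/a//b", // trailing comment
  "note": "quote \\" then /* not a comment */"
}`;

const parsed = JSON.parse(stripComments(source));
console.log(parsed.url);  // http://example.com/a//b
console.log(parsed.note); // quote " then /* not a comment */
```

Whether that counts as "10 lines" or "half the parser" depends on how small the rest of the parser is. The same function could also serve as the pre-parse filter some people suggest.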

------
rbanffy
Why not simply add "comments will always be ignored, any JSON parser that
assigns semantic value to them is non-compliant, every time it's used baby
Cthulhu cries and the sad soul who wrote it will burn in hell for eternity" to
the spec?

This more or less prevents anyone from using JSON for configuration. Unless, of
course, you parse it with eval.

~~~
sirclueless
What better way to accomplish that than to remove comments from the spec?
There's nothing stopping a non-compliant parser from allowing them anyways,
it's just that now it is abundantly clear that such a parser is non-compliant
because it is "impure".

~~~
burgerbrain
It would be an improvement because then I could count on comments in JSON
doing nothing (instead of possibly not parsing).

------
dap
As several people have pointed out, this doesn't make a lot of sense. People
could just as well put parsing directives in special properties. On the other
hand, having proper comments would make using JSON for configuration
_significantly_ nicer. It would also allow you to have JSON snippets that
themselves are documented, as when documenting a JSON web API.

Using properties as comments is awkward at best: it doesn't match what people
are used to, either the key or value ends up being a dummy, and consumers that
iterate over properties have to be smart enough to ignore special doc
properties (which can make automatic validation against a schema more
difficult). The solution of running it through JSMin is pretty unsatisfying
too. It's never wrong, so how about we always do that?
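
To illustrate the extra work on the consumer side, here is a minimal sketch that drops documentation properties while parsing. The convention (keys starting with `//` or `_doc`) and the names `isDocKey` and `parseWithoutDocs` are assumptions made up for this example. The reviver argument of `JSON.parse` is standard.

```javascript
// Hypothetical convention: any key starting with "//" or "_doc" is documentation.
function isDocKey(key) {
  return key.startsWith("//") || key.startsWith("_doc");
}

// JSON.parse calls the reviver for every key/value pair, innermost first;
// returning undefined deletes that property from the result.
function parseWithoutDocs(text) {
  return JSON.parse(text, (key, value) => (isDocKey(key) ? undefined : value));
}

const config = parseWithoutDocs(`{
  "//": "listen port for the HTTP server",
  "port": 8080,
  "db": { "_doc_": "connection settings", "host": "localhost" }
}`);

console.log(JSON.stringify(config)); // {"port":8080,"db":{"host":"localhost"}}
```

Every consumer, and every schema validator, has to agree on that key convention, which is exactly the awkward part.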

------
Yarnage
This makes sense to me. JSON is designed to simply hold data so there
shouldn't need to be any comments.

If it's related to configuration then there should be documentation regarding
what is and isn't supported. If it's simply data you're sending across the
wire then there should be documentation somewhere; you wouldn't want to waste
bandwidth transmitting comments.

~~~
njharman
> simply hold data so there shouldn't need to be any comments.

How does that follow? Usually "data", 5.23423, needs commenting more than
most. What the hell is that? Why 5 decimal places?

Also comments in config files are for many things; when file was created,
change history, by whom, who to contact with problems, warnings not to edit as
it's managed via chef/puppet.

~~~
Yarnage
>How does that follow? Usually "data", 5.23423, needs commenting more than
most. What the hell is that? Why 5 decimal places?

You would already know the answer before seeing the JSON file so I'm not sure
why you care.

For instance, you're not going to be receiving JSON data over email and then
putting it into a system manually. Instead, you'll have APIs that handle the
JSON formats for you and simply ingest the data.

If you're processing a large volume of data using JSON as the interchange
format, why on Earth would you want it to include comments? No service on
Earth does this that has any volume of users.

>Also comments in config files are for many things; when file was created,
change history, by whom, who to contact with problems, warnings not to edit as
it's managed via chef/puppet.

This is not the job of a comment. These go stale and all are available via
whatever version control mechanism. However, keep in mind you're talking about
a very specific edge case in a development environment. Typically these don't
matter, at all. If you need comments within the dev system for whatever
reason, you can just strip them out. Puppet would obviously have appropriate
permissions so no one can simply modify them anyway without knowing they're
messing with puppet.

Shipping items, however, should simply have documentation regarding what
configurations your product does or does not support.

~~~
roel_v
Sounds like you're making an awful lot of assumptions about other people's
workflows and how they 'should' do things. Have you ever actually worked in a
production environment? (no way to ask this without sounding snarky, it's not
meant to be mean-spirited)

~~~
Yarnage
I made no assumptions and there is no need to be rude with your incredibly
vague question. There are obviously better ways of asking and I'm sure you
knew that (otherwise you would have asked a specific question).

~~~
roel_v
What's vague about the question? I'm not sure how I can be more specific: have
you actually ever worked in a production environment? The point being that the
things you sum up sound very much like textbook-knowledge and theory, and
nothing like something somebody who has actually, you know, _worked_ with
software to _get things done_ (as opposed to just playing around) would say.

~~~
Yarnage
>What's vague about the question?

You asked nothing relevant to the conversation. Instead you asked about me
ever working in a production environment and nothing more.

That question accomplishes nothing but being insulting. I hate to sound crass
but you're acting like an ass.

If you wish to bring something topical and relevant to the conversation then
by all means do so but insulting someone gets nothing accomplished other than
trolling...in which case you already won. Congrats.

>The point being that the things you sum up sound very much like textbook-
knowledge and theory

No, they don't. Everything I stated is from real-world experience. I cannot
fathom what kind of environment you work in where service to service
communication includes comments going over the wire. But, as you can see above
in my other responses, I've already covered all of these aspects. Nothing
theoretical.

------
kevincennis
var crockford = function(){ while(1); }

~~~
Yarnage
Error:

Problem at line 1 character 25: Expected exactly one space between 'function'
and '('.

var crockford = function(){ while(1); }

Problem at line 1 character 27: Expected exactly one space between ')' and
'{'.

var crockford = function(){ while(1); }

Problem at line 1 character 27: Missing space between ')' and '{'.

var crockford = function(){ while(1); }

Problem at line 1 character 29: Missing 'use strict' statement.

var crockford = function(){ while(1); }

Problem at line 1 character 34: Expected exactly one space between 'while' and
'('.

var crockford = function(){ while(1); }

Problem at line 1 character 35: Unexpected '1'.

var crockford = function(){ while(1); }

Problem at line 1 character 37: Expected exactly one space between ')' and
';'.

var crockford = function(){ while(1); }

Problem at line 1 character 37: Missing space between ')' and ';'.

var crockford = function(){ while(1); }

Problem at line 1 character 37: Expected '{' and instead saw ';'.

var crockford = function(){ while(1); }

Problem at line 1 character 37: Unexpected ';'.

var crockford = function(){ while(1); }

Problem at line 1 character 39: Cannot convert 'array[0]' to object

~~~
user-id

      Problem at line 1 character 39: Cannot read property 'disrupt' of undefined
    

Huh?

~~~
kevincennis
'disrupt' is an internal thing. There's a try/catch somewhere that's failing,
but he catches the error and adds it to JSLINT.errors as if it were a problem
with YOUR code.

~~~
TazeTSchnitzel
It was probably an innocent mistake, he most likely put it there for
debugging.

------
drivebyacct2
Well at least there aren't semicolons in JSON.

~~~
anonymoushn
Otherwise we might run into a bug in JSMin!

------
andyl
Lack of comments in JSON is a bummer.

But it is easy enough to pipe the JSON thru a filter...

~~~
Someone
...and before you know it, your system that is built using various third party
libraries contains 'JSON' files with various comment conventions:

    
    
        - # in column 1
        - # in column 1, backslash at end of line escapes the newline
        - # as first non-blank
        - # as first non-blank, backslash at end of line escapes the newline
        - -- as first non-blank
        - C-style multi-line slash-asterisk comment asterisk-slash
        - C++-style //
        - Python style doc strings
        etc.
    

Then somebody will attempt to write the comment filter that will handle them
all, somebody else will make a JSON parser that round-trips all different
versions. Net result: a JSON 'standard' that is ugly and leads to parsers that
are larger than necessary, not 100% reliable (how do you guess whether a
backslash at the end of line escapes the newline?)

I would rather have a simple standard with one kind of comment.

