There are many good reasons not to add documentation as properties. For starters, it is a very US-centric approach to things. I would probably be annoyed if I were using a foreign-language library that had random strings which contained text I did not understand in my objects.
Secondly, the second you start putting documentation in your JSON, be prepared for services to break if you ever change that documentation. It may be silly to do "if (json._doc_.indexOf...)", but stranger things have happened and if you are making an API it is generally a bad tactic to require your clients not to be silly.
"But I intend to have it in there solely because I can't use real comments -- I don't plan on shipping these". OK then probably no problem -- but at this point they're really no better than normal comments (with respect to the issue that Crockford brings up) since there's no reason you can't put compiler directives in the "comment" properties:
{
/* do something special mr. proprietary compiler */
"compiled_property": "magic"
}
vs.
{
"//" : "do something special mr. proprietary compiler",
"compiled_property": "magic"
}
So the possibility of diverging incompatible implementations remains -- however I think the real reason this didn't happen wasn't the lack of comments but rather that JSON's biggest selling point is almost universal compatibility.
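To make the equivalence concrete, here is a minimal Python sketch. The choice of Python and the `is_directive` helper are assumptions for illustration only, not anything from Crockford or a real compiler. It shows that a standard parser hands the "//" property back as ordinary data, so a tool that wants to treat it as a directive still can:

```python
import json

# Hypothetical document: "//" is an ordinary property name, not a real comment.
doc = '{"//": "do something special mr. proprietary compiler", "compiled_property": "magic"}'

data = json.loads(doc)

# A standard parser keeps the pseudo-comment like any other value.
directive = data.get("//")


def is_directive(text):
    """Hypothetical check a nonstandard consumer might apply to the pseudo-comment."""
    return text is not None and text.startswith("do something special")
```

Nothing in the format stops a consumer from branching on `is_directive(directive)`, which is the same interoperability risk as directive-bearing comments.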
It's not US-centric. It's English-centric, and English is the lingua franca of software engineering, as it is of many other professions. TBH, programming is and always will be English-centric, and I would never feel 100% confident with any software engineer who isn't very proficient in English insofar as reading and writing are concerned. Practically every tome, text, blog or article of any real value in the field is published in English, and only once they are really well known are they translated to other languages. The only time the English-speaking programming world is behind that of another language is when a new language is developed by a non-native English speaker (such as Lua developed in Brazil and Ruby developed in Japan). And even then no language would ever see popular adoption to the point of importance if it doesn't adopt English as the language of its source and its documentation.
This isn't bigotry. This is just the way things are.
FWIW, English isn't my first language, Portuguese is, but I have native fluency in English.
> This isn't bigotry. This is just the way things are.
Indeed, I don't think this situation is all that different from Medicine or Biology. Parts of anatomy, diseases, and taxonomic classifications will forever be known by their Latin names. Likewise, programmers will probably still be referring to "if" statements and "for" loops long after no English speakers remain.
I don't think you can claim that. Maybe for the next few decades or centuries but beyond that who knows what will happen.
> Practically every tome, text, blog or article of any real value in the field is published in English ...
Is it possible that you just don't know they exist? You probably don't know enough to make such claims. As an English speaker myself, I know of a few books that I would want to read but that haven't been translated yet. There aren't many, but then again I wouldn't know about others, since I don't know the language.
To add to this, I suspect any non-English speaker would rather have English comments that they can at least throw at Google Translate, than no comments at all.
That does not match my experience in a Chinese Web company. All my colleagues are very good hackers but only some of them are confident at writing English. Most important literature is translated into Chinese, and they prefer reading translated versions. Comments in the code are in Chinese. It is sometimes a bit more difficult for us with variable and function names, but it works.
I would agree that a good coder must be able to express complex processes clearly and unambiguously in their mother tongue and their programming language of choice, but an excellent grasp of English is only a "nice to have" when your local community is developed enough.
From my work with Taiwanese engineers, I have found that reading and writing are as much two different skills as speaking and listening are.
Almost all of them can read English remarkably well, but some of the emails I get from them... Often grammar and syntax are nowhere to be found, and the vocabulary is occasionally amusing (a couple of the memorable gems have been "USB sticker", presumably thinking of "USB stick" (and still ambiguous in the context of the message, what they were talking about was a USB WiFi dongle), and "redeploy" referring to a reboot or power cycle).
I don't really see that these are valid concerns. It's not really a US-centric approach - you'd have the same issues with comments. If you were concerned about internationalizing documentation, then you could easily get around this by localizing the property name and storing the documentation in the appropriate language property.
As for breaking services - you could make these JSON parameters optional and very clear that the documentation won't affect the way the service runs.
If you have documented the API sufficiently, then it's not a bad tactic to require clients "not to be silly". Clients who misuse APIs can't complain if they don't use them properly.
Article, blogpost, same thing. The first sentence is
> I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability.
I do not read something on the lines of "you can also use a comment as a property". I wonder how you read that in the first sentence?
>> I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability.
>I do not read something on the lines of "you can also use a comment as a property". I wonder how you read that in the first sentence?
lol, please re-read my comment. I said the first COMMENT, not sentence, had the same suggestion.
What if you need to encode arbitrary dictionaries, which may have keys like "//", rather than objects with specific named attributes?
And what if you want multiple comments in the same scope (say, a comment before each attribute)? From the RFC: "The names within an object SHOULD be unique." In practice parsers will just override values, as JS does, but I think it would be legal to reject JSON with duplicate keys.
IMHO, the complexity that this would introduce just highlights why it's a good idea not to have them.
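For what it's worth, both behaviours are easy to see with Python's standard `json` module. This is only a sketch: `reject_duplicates` is a hypothetical helper name, and the input document is made up. By default the module keeps the last value for a repeated name, and the `object_pairs_hook` parameter lets a caller refuse such documents instead:

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises on a repeated name instead of overwriting."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


# Made-up document with one "//" pseudo-comment before each attribute.
text = '{"//": "first comment", "a": 1, "//": "second comment", "b": 2}'

# Default behaviour: the last value for a repeated name wins.
lenient = json.loads(text)

# Strict behaviour: refuse the document outright.
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```

Either way the first "comment" is lost or the document is rejected, so one pseudo-comment per attribute does not survive with a single reserved key.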
> "but I think it would be legal to reject JSON with duplicate keys."
No it would not. `SHOULD` has a very specific meaning:
SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.
It makes total sense to omit comments, because JSON is optimized for machine-readability and interoperability (simple spec, simple implementation, no extension mechanism, fast to parse). Putting comments in would compromise it for that purpose.
JSON isn't a great config file format, if your config file is meant to be human-edited. But the solution is simple; use a format designed for human editing.
Config files need UI design too.
During Akka and Play 2.0 development, we approached this by starting from the hand-rolled parsers found in Akka 1.2 and Play 1.2 (both had ad-hoc parsers to support a "pretty" config file format). We took the aesthetic preferences of the hand-rolled parsers seriously, and came up with HOCON: https://github.com/typesafehub/config (scroll down to "JSON Superset"), https://github.com/typesafehub/config/blob/master/HOCON.md
The HOCON format is a superset of JSON and also happens to be mostly compatible with the Play and Akka 1.2 ad hoc parsers (which were independently developed, so two data points on what people wanted).
HOCON is roughly similar to YAML in complexity. Like YAML the spec is pretty long ( http://www.yaml.org/spec/1.2/spec.html ). And the Typesafe Config and SnakeYAML jars have pretty comparable amounts of bytecode in them.
HOCON or YAML would be terrible formats for an API or something like that, and they're painful to implement, but I strongly prefer them to JSON for human-maintained configuration.
Anyway, I think the genius of JSON was its focus on machine interoperability rather than trying to do everything, that's why it works so well for machine interoperability.
But it doesn't mean you have to use it for everything.
"JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate."
Humans are mentioned before machines. Comments improve human readability. Furthermore, comments are very simple to implement, and don't lower the ease of parsing.
3. Ability to transmit it in a parsable format through HTTP
#1 conflicts with both #2 and #3. It conflicts with #2 because binary data formats are much easier to parse, and are generally smaller on the wire than plain text formats. It also conflicts with #3 for the same reason, since HTTP is a plain text format, and JavaScript generally deals with plain text better than binary data.
#2 conflicts with #3, because comments are nice for human readability, but should never be sent on the wire, because they are not necessary to parse the sent data. If your comment is important, it should be a data element that's sent on the wire.
The end result is a compromise. Since two of the concerns effectively make comments useless, the compromise should be towards those two. In addition, the reason given by Crockford makes quite a bit of sense. JSON, much like XML, is a format for the transmission of data. Everything in a given JSON object should be important. It's not a programming language, which means that control directives are unnecessary.
Consider these two paraphrased arguments against Crockford, each made independently by many poster-programmers here:
(1) I don't like comments, but removing them won't solve anything: People who like comments can just embed them as funky properties.
(2) I like comments, but embedding them as funky properties is annoying and awkward, and I don't want to.
Now look at (1).
Now look at (2).
Now relax your eyes like you're looking at a stereoscopic poster until (1) and (2) merge together and you achieve a Zen-like understanding -- or a view of a schooner.
If you are storing this kind of implementation logic in your data, I hope I never have to work with you (not aimed at parent posting, but rather the global "you").
You typically don't store this sort of thing in data, though.
Then again, we have certain types of logic stored in a database table, loaded through fixtures... so my two cents may be worth much less than what they appear.
It's not hard to imagine if you sorta hold your breath and let yourself get a little dizzy and think hard about XML you've had the pleasure of messing with.
I quickly get visions of version numbers, customized namespace declarations, typedefs, strftime date format strings...
Not JSON, but here's a truly horrible example of Internet Explorer using specially formed comments to take different actions: http://www.quirksmode.org/css/condcom.html
Yes, but that works. Taking in such JSON and then immediately spewing it back out doesn't change the underlying meaning. Transforms from such JSON to some other format (perhaps even another JSON format) must explicitly choose what to do with "comments", instead of accidentally just discarding them. Given that parsing directives are going to exist somewhere, this is the correct place for them.
(Better yet is to create an explicit place for metadata. I almost reflexively use {"metadata": null, "payload": ...} now instead of putting my payload right at the top level, because wouldn't you know it, sooner or later some metadata always seems to wander in. And if it doesn't... shrug. If you're in a place where you can afford JSON in the first place the extra cost is probably way below the noise threshold for you.)
But the point is that if you have comments in your JSON, the first time you do some sort of "for key in data" to transform the data and spit it back out, the comments are gone; you may never even realize they were there to start with.
If you do that with the metadata explicitly stored as a separate key-value pair in your blob, then this doesn't happen; the meta data is never silently discarded when you, say, take all the key value pairs in the JSON blob and send them out down the wire to a client. If you want to strip the meta-data, you have to do that.
I know this isn't Python, but I think the Zen of Python is on point here: "Explicit is better than implicit."
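A rough sketch of what I mean, in Python. The field names inside `metadata` and `payload` are made up for illustration. A generic "for key in data" pass carries the metadata along, and dropping it takes a deliberate step:

```python
import json

# Envelope with an explicit place for metadata (contents are hypothetical).
blob = json.loads('{"metadata": {"units": "seconds"}, "payload": {"timeout": 30}}')

# A generic "for key in data" pass: every pair is copied, metadata included.
copied = {}
for key in blob:
    copied[key] = blob[key]

# Serialising and parsing again loses nothing.
round_tripped = json.loads(json.dumps(copied))

# Dropping the metadata has to be written out explicitly.
stripped = {key: value for key, value in blob.items() if key != "metadata"}
```

With real comments there is no equivalent of `copied`: the parser has already thrown them away before your loop ever runs.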
> But the point is that if you have comments in your JSON, the first time you do some sort of "for key in data" to transform the data and spit it back out, the comments are gone; you may never even realize they were there to start with.
If you've stored comments as regular data, you haven't lost them but you've just transformed them in the output.
Your criticism appears to be based on a transform written with an incomplete understanding of the source data. I'd submit the problem lies in the incomplete understanding of the source data, not the fact the source data had comments. If your transform didn't "know" comments were possible, what else did it not "know"?
> I'd submit the problem lies in the incomplete understanding of the source data
I'd submit that an incomplete understanding of the source data is not necessarily a problem. It's often a design goal. Generic tools have a limited understanding of the source data by design. I don't want my JSON parser/formatter/minifier/etc. to know about some silly parsing rules you added as comments. I want my JSON parser to understand JSON as it's defined.
Your nonstandard comment directives are the problem, not the fact that I didn't write a custom tool.
Also, Emacs uses comments to set file-local options. There's a long tradition of overloading comments to achieve metalinguistic ends. JavaDoc and Doxygen are great examples.
Even when handed a decent macro language with whizzy namespaces and a great DOM, I imagine that some people will still stoop to gross and convenient hacks.
Comments about removing comments thus far seem to miss the point. By removing comments, he greatly simplified the parser. Since he was sure to be criticized for this design choice in the direction of simplicity, it was a brave decision! Had there been enough radical simplifiers on the XML committee, we might not have needed JSON. Wait... Committee? Never mind.
There would have been demands for tools that transform JSON to JSON, or JSON to XML, to preserve comments across the transformation.
Adding that capability would raise the complexity of the parser, because comments would have to be made part of the data structure that is built and transformed. For instance, it would be harder to embed the data structure for the JSON in JS objects.
But yes, for a parser that's ingesting data for immediate processing and has no need for comments, there's no discernable win regarding parsing simplicity.
"One interesting story about leaving things out: as we got closer to releasing JSON I decided to take out the ability to do comments. When translating JSON into other languages, often times the commenting piece was the most complicated part. By taking the commenting out we reduced the complexity of the parsers by half—everything else was just too simple."
Why not simply add "comments will always be ignored, any JSON parser that assigns semantic value to them is non-compliant, every time it's used baby Cthulhu cries and the sad soul who wrote it will burn in hell for eternity" to the spec?
This more or less prevents anyone from using JSON for configuration. Unless, of course, you parse it with eval.
What better way to accomplish that than to remove comments from the spec? There's nothing stopping a non-compliant parser from allowing them anyways, it's just that now it is abundantly clear that such a parser is non-compliant because it is "impure".
As several people have pointed out, this doesn't make a lot of sense. People could just as well put parsing directives in special properties. On the other hand, having proper comments would make using JSON for configuration significantly nicer. It would also allow you to have JSON snippets that themselves are documented, as when documenting a JSON web API.
Using properties as comments is awkward at best: it doesn't match what people are used to, either the key or value ends up being a dummy, and consumers that iterate over properties have to be smart enough to ignore special doc properties (which can make automatic validation against a schema more difficult). The solution of running it through JSMin is pretty unsatisfying too. It's never wrong, so how about we always do that?
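For reference, the preprocessing step being discussed amounts to something like the following. This is a minimal Python sketch and not JSMin itself: `strip_comments` is a hypothetical helper, the config values are made up, and it assumes only `//` and `/* */` comments appear outside string literals:

```python
import json


def strip_comments(text):
    """Remove // and /* */ comments that appear outside JSON strings."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])  # keep the escaped character untouched
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip to the end of the line, keeping the newline.
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            # Block comment: skip past the closing marker, or to the end if unterminated.
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


commented = '''{
  /* why this value: half of the RAM on this box */
  "cache_mb": 768,  // keep in sync with "workers"
  "url": "http://example.com/a//b"
}'''

config = json.loads(strip_comments(commented))
```

It works, but every consumer of the file now needs this extra pass, which is the unsatisfying part.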
This makes sense to me. JSON is designed to simply hold data so there shouldn't need to be any comments.
If it's related to configuration then there should be documentation regarding what is and isn't supported. If it's simply data you're sending across the wire then there should be documentation somewhere; you wouldn't want to waste bandwidth transmitting comments.
> simply hold data so there shouldn't need to be any comments.
How does that follow? Usually "data", 5.23423, needs commenting more than most. What the hell is that? Why 5 decimal places?
Also comments in config files are for many things; when the file was created, change history, by whom, who to contact with problems, warnings not to edit as it's managed via chef/puppet.
>How does that follow? Usually "data", 5.23423, needs commenting more than most. What the hell is that? Why 5 decimal places?
You would already know the answer before seeing the JSON file so I'm not sure why you care.
For instance, you're not going to be receiving JSON data over email and then putting it into a system manually. Instead, you'll have APIs that handle the JSON formats for you and simply ingest the data.
If you're processing a large volume of data using JSON as the interchange format, why on Earth would you want it to include comments? No service on Earth does this that has any volume of users.
>Also comments in config files are for many things; when the file was created, change history, by whom, who to contact with problems, warnings not to edit as it's managed via chef/puppet.
This is not the job of a comment. These go stale and all are available via whatever version control mechanism. However, keep in mind you're talking about a very specific edge case in a development environment. Typically these don't matter, at all. If you need comments within the dev system for whatever reason, you can just strip them out. Puppet would obviously have appropriate permissions so no one can simply modify them anyway without knowing they're messing with puppet.
Shipping items, however, should simply have documentation regarding what configurations your product does or does not support.
Sounds like you're making an awful lot of assumptions about other people's workflows and how they 'should' do things. Have you ever actually worked in a production environment? (no way to ask this without sounding snarky, it's not meant to be mean-spirited)
I made no assumptions and there is no need to be rude with your incredibly vague question. There are obviously better ways of asking and I'm sure you knew that (otherwise you would have asked a specific question).
What's vague about the question? I'm not sure how I can be more specific: have you actually ever worked in a production environment? The point being that the things you sum up sound very much like textbook-knowledge and theory, and nothing like something somebody who has actually, you know, worked with software to get things done (as opposed to just playing around) would say.
You asked nothing relevant to the conversation. Instead you asked about me ever working in a production environment and nothing more.
That question accomplishes nothing but being insulting. I hate to sound crass but you're acting like an ass.
If you wish to bring something topical and relevant to the conversation then by all means do so but insulting someone gets nothing accomplished other than trolling...in which case you already won. Congrats.
>The point being that the things you sum up sound very much like textbook-knowledge and theory
No, they don't. Everything I stated is from real-world experience. I cannot fathom what kind of environment you work in where service to service communication includes comments going over the wire. But, as you can see above in my other responses, I've already covered all of these aspects. Nothing theoretical.
I'm forced to wonder from your comments throughout this thread whether you have much real-world development experience, versus having just ingested a bunch of theoretical information on best practices. The things you're dismissing out of hand happen all the time, and while sometimes they're useless, more often they're all the documentation you'll get, and you'll feel exceedingly lucky to get even them.
Your comment about Puppet is especially troubling. Puppet doesn't "have appropriate permissions", it's a root-level tool for managing system state. The files it manages may have all kinds of ownership and permissions, some of which are not ones you can just mess with (e.g. the system will throw up its hands and bomb out if they're wrong), and all of which are still modifiable by anyone with root access. There's no way to differentiate short of comments in the files themselves.
>I'm forced to wonder from your comments throughout this thread whether you have much real-world development experience, versus having just ingested a bunch of theoretical information on best practices. The things you're dismissing out of hand happen all the time, and while sometimes they're useless, more often they're all the documentation you'll get, and you'll feel exceedingly lucky to get even them.
This is a highly arrogant comment. There was no need to be rude. I've been developing and working with actual data for over a decade now. Yes, in the real world sometimes comments are transmitted over the line but I'm sure you can agree that isn't a good idea. Yes, many times you don't get good documentation but you make it sound like that's acceptable and anyone who thinks otherwise doesn't have real-world experience.
In all honesty I would expect comments generated from some odd software packages but it's been an incredibly long time since I've seen data transmissions that contain comments. In environments dealing with petabytes of data you can't afford to send comments with every single file.
As for Puppet, I think you misunderstand my point. Yes, it is a root-level tool. That doesn't mean any user should have the same permissions as Puppet. Why wouldn't you simply place configuration files people shouldn't modify in places where they don't have permission but Puppet does? Honestly, I thought that was standard practice.
You can claim all the experience you want, but your statements belie the truth. JSON is in no way limited to raw data interchange, and the files puppet manages cannot be put off to one side in some magical place where root cannot access them. System administrators will always have access to the files no matter where they are or what permissions you have set, and they need to know that the files are being automatically managed. The system also needs to know where to find the files, you can't start randomly moving /etc/fstab around, for example.
>You can claim all the experience you want, but your statements belie the truth. JSON is in no way limited to raw data interchange,
I never stated such a thing but that is what it's designed for and the primary use-case for JSON. I would imagine most other cases are edge-cases.
>and the files puppet manages cannot be put off to one side in some magical place where root cannot access them.
No one said to push them off to the "side" or into some "magical place". It's really simple: puppet has permissions to the files, your users do not.
A system administrator worth his salt isn't going to be messing with configuration files any which way and if said administrator has ROOT access then they should already know that's the level where puppet works and they could screw something up.
It appears that he is referring to JSON within the limited context of web APIs. The scenarios dismissed as a "very specific edge case in a development environment" appear to crop up all the time when using JSON for something other than APIs - for example, CloudFormation stack templates are written in JSON, and would sorely benefit from some in-line comments.
>It appears that he is referring to JSON within the limited context of web APIs.
I was referring to JSON as an interchange format, which is what it is.
Arguably Amazon should have chosen something different for templates but I think that's an issue beyond simple comments and outside the scope of the argument. Regardless, it's hard to imagine anyone thinking it's a good idea to send comments over the wire, especially if the end point is some sort of service.
It's fairly common to annotate that data. While every entry in my postgresql.conf file is documented someplace, it'd be an absolute nightmare if defaults couldn't be commented inline or if I couldn't relay why a value is set to a particular value by adding a comment.
Note in this case, nothing would be transmitted over the wire. JSON is just being used as a convenient format. Even still, it can be handy to exchange annotated data over the wire.
>It's fairly common to annotate that data. While every entry in my postgresql.conf file is documented someplace, it'd be an absolute nightmare if defaults couldn't be commented inline or if I couldn't relay why a value is set to a particular value by adding a comment.
What happens when you accidentally delete a value you didn't mean to? You would refer to documentation. Honestly all configuration files _should_ be well documented, including postgres so I don't see the issue here. Why would you want to include every single possible option, probably commented out, when you could simply grab the ones you need from the documentation? I would prefer having a lean configuration file that shows exactly what I'm using rather than 400 commented options; that's a MESS to maintain.
>Even still, it can be handy to exchange annotated data over the wire.
I cannot disagree more. It is not useful AT ALL to exchange annotated data. Unless you're using an ad-hoc system (which obviously wouldn't scale), you're going to be handling this through APIs and services that already know and understand the JSON file format. So the only purpose of comments at this stage in the game is using up extra bandwidth.
Right, but what if you want to document why you personally are using a particular option, rather than just what an option does? I agree that a lot of the time config comments are unnecessary, but they do have a place.
That is something that MUST exist in documentation. If you're putting it within the configuration file itself it can go stale or simply be deleted one day and you would have no idea what occurred after the fact without looking at a revision history.
I'm a bit puzzled that documentation seems like a bad word around here; it has its place and I'm not suggesting writing up a huge document. It takes only a couple of minutes to drop some text in Word / Google Docs / Wiki / your flavor and put it someplace accessible for your entire team.
Documentation isn't a bad word - I regularly make documents to describe the architecture of my projects, along with an outline of how to get into the codebase, for example. Comments are also documentation, though, and in this case comments are the right place. If I'm looking at a configuration option and thinking 'That seems a bit odd. Why do we have that set?' I want to be able to see the reasoning right there, not have to hunt through a wiki in the hope that someone has put their reasoning somewhere.
In my experience, comments are less likely to go stale than wiki docs.
I'm not refuting the value of documentation, I'm just acknowledging the referential integrity issues with it. All too frequently I've encountered cases where a config file gets updated but the wiki doesn't. Since all configuration files are version-controlled, I never run into a situation where a file changes and I don't know why (well, discounting bad commit messages). But, a documented wiki wouldn't help in that case anyway. A comment is documentation too ...
I didn't say include every possible option. But it sure is handy when I modify one and there's a nice comment telling me "hey, if you change this, you absolutely must go change this other value too, otherwise all hell is going to break loose." I'm much more apt to screw up the config if I have to look that up every time. That aside, it's really helpful to communicate to other people on the team that "this value is 768 MB because that's roughly 1/2 the total memory on this machine."
Re: transmitting annotated data. All I can say is not every application of JSON is for APIs for mobile devices. When I have a connected gigabit network and gzip data, I may not be that concerned about an extra 50 bytes of annotated data. But, again, that even presupposes that its only application is for computer-to-computer communication. The value in comments is human-to-human.
>I'm much more apt to screw up the config if I have to look that up every time. That aside, it's really helpful to communicate to other people on the team that "this value is 768 MB because that's roughly 1/2 the total memory on this machine."
JSON comments would be a short-cut, sure, but that information should be within reach and should NOT only exist within the JSON anyway. So the helpfulness seems really limiting here.
>Re: transmitting annotated data. All I can say is not every application of JSON is for APIs for mobile devices. When I have a connected gigabit network and gzip data, I may not be that concerned about an extra 50 bytes of annotated data. But, again, that even presupposes that its only application is for computer-to-computer communication. The value in comments is human-to-human.
You're right in that the value of comments is human to human; I just cannot picture a scenario where you're actually transmitting data over the wire and including comments. I am not limiting this to mobile devices; any service end point should ignore any comments and they will never be seen anyway.
As I mentioned before, if you're ingesting files in a very ad-hoc manner then of course comments could be useful but that's not a typical use-case of JSON. JSON is typically used as an interchange format for end point to end point communication and comments in ANY type of file in that scenario are useless.
'disrupt' is an internal thing. There's a try/catch somewhere that's failing, but he catches the error and adds it to JSLINT.errors as if it were a problem with YOUR code.
I followed the little uproar about the JS semicolon. I do wonder though, did it become a running joke because of Twitter, or because of the developer rejecting them? (rejecting the not-use, I mean)
...and before you know it, your system that is built using various third party libraries contains 'JSON' files with various comment conventions:
- # in column 1
- # in column 1, backslash at end of line escapes the newline
- # as first non-blank
- # as first non-blank, backslash at end of line escapes the newline
- -- as first non-blank
- C-style multi-line slash-asterisk comment asterisk-slash
- C++-style //
- Python style doc strings
etc.
Then somebody will attempt to write the comment filter that will handle them all, somebody else will make a JSON parser that round-trips all different versions. Net result: a JSON 'standard' that is ugly and leads to parsers that are larger than necessary and not 100% reliable (how do you guess whether a backslash at the end of line escapes the newline?).
I would rather have a simple standard with one kind of comment.