JSON as configuration files: please don’t (2016) (arp242.net)
401 points by goranmoomin on April 13, 2019 | 426 comments



"Lack of programmability" is a feature. Declarative config lets tools statically analyze and transform the config. You can't even figure out what the dependencies are going to be for some Gradle projects without executing them. https://stackoverflow.com/questions/51153878/conditional-dep... -- The author's programmable example is a classic Bad Idea in my experience.

Also, check out JSON schema and its integration into tools like VSCode. VSCode gives you generalized autocomplete on its own config and known config formats like package.json by pulling in json schema definitions. It works well. Easily a better experience than the executable-program-as-config I learned to hate in the JVM world, and easily better than ad-hoc config like Caddy's.

I have a more practical pitch: make your tool consume json5 as well as json.

Also, it's telling that the author wasn't even willing to stick their neck out enough to pitch a single solution. I guess in the end they realized there are always trade-offs and any alternative would have its own downsides. :) But I'm getting a bit fatigued of these you're-doing-it-wrong posts. They are effortless to write and they're like crack to HN. I can complain about literally everything.


This totally misses the point of the article :(

JSON is a fine format for storing configuration data, and any reasonably structured data. So is XML, protocol buffers, what have you.

JSON is a poor format for humans. For a configuration file that an actual human would edit by hand, and read and try to understand later, JSON is a pain.

Comments are mostly useless in a machine-generated and machine-consumed file. They are indispensable for humans, because they are a link to what machines are not doing.

Nice formatting is useless for a machine-consumed file, because machines process a stream of bytes. It is indispensable for humans, who recognize text very differently.

This boils down to one thing: config files are source code, and need all the amenities that make source code comfortable for humans.


> This totally misses the point of the article :(

I didn't. There wasn't much to the blog post. In fact, the blog musings are the same criticism we've heard a thousand times. But please don't suggest that I missed some deep point in a trivial blog post when you really just mean "I disagree".

I even pointed out a real-world concrete example where JSON config is made pleasant to edit by hand for humans: an IDE like VS Code that pulls in JSON schema.

Though since you only doubled down on the blog post's opinion instead of responding to a single point I made, can I say that you missed the points of my comment? You didn't even address my criticism by pitching an alternative. How are we supposed to have a conversation?


> I even pointed out a real-world concrete example where JSON config is made pleasant to edit by hand for humans: an IDE like VS Code that pulls in JSON schema.

I think the debate here is what "editable by humans" means. If you're using a tool for editing JSON like VS Code, I don't believe you're really editing it by hand. You're using a program with JSON-specific features that give you a pleasant editing experience. I believe editing by hand means using a standard plain text editor, and there it's a lot worse.

Formats that use braces are relatively easy to parse by machines but can be harder to read by humans. Utilizing white space makes things easier for humans to read. A JSON file could have everything on one line. But something like YAML forces you to use white space and new lines, which helps humans read it.


I would 100% rather edit JSON (in vanilla vim) than ever touch YAML again in any editor I've used. It has weird rules for what whitespace is significant and what indent level I need for things, versus JSON where % will match quotes and parens and braces quite easily.

The only significant thing missing from JSON is comments. JSON5 or convention can solve that, so it's not as bad as you'd expect.
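To make the "convention" route concrete, here is a minimal sketch, assuming Node.js and assuming the convention is a reserved "//" key that holds the comment. Both the key name and the `stripComments` helper are my own illustration, not any standard.

```javascript
// Hypothetical convention: any key named "//" holds a comment
// and is dropped after parsing. Not part of JSON or JSON5.
function stripComments(value) {
  if (Array.isArray(value)) {
    return value.map(stripComments);
  }
  if (value !== null && typeof value === 'object') {
    const result = {};
    for (const [key, child] of Object.entries(value)) {
      if (key === '//') continue; // skip comment entries
      result[key] = stripComments(child);
    }
    return result;
  }
  return value; // strings, numbers, booleans, null pass through
}

const raw = `{
  "//": "port the dev server listens on",
  "port": 8080,
  "db": { "//": "local only", "host": "localhost" }
}`;

const config = stripComments(JSON.parse(raw));
console.log(config);
```

The obvious limitation is one comment per object, since a second "//" key would be a duplicate key.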


I'm sorry man, but it really doesn't. In over ten years of using yaml I haven't seen a single instance where the whitespace rules were in any way weird or counter intuitive.


Most programming languages don't have significant whitespace, but it doesn't prevent people from writing readable code in them. And I haven't ever seen anybody write a long JSON config any differently - it's indented etc, for the same reason why C code is indented.

Still, editing JSON directly in an IDE that still exposes its syntax is "editing by hand", just as editing C in Vim with ctags is still editing by hand.

I do have to note though that VSCode uses JSON with extensions. Most obviously, it allows comments. But there are other minor things that make it easier to write (and parse) - e.g. it allows trailing commas in literals.
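For contrast, a small sketch of what a strict parser does with those extensions, assuming Node.js and its built-in `JSON.parse` (the `tryParse` wrapper is just a helper I made up for the demo):

```javascript
// Wrap JSON.parse so we can compare inputs without crashing.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

const strict = tryParse('{"a": [1, 2]}');            // plain JSON
const trailingComma = tryParse('{"a": [1, 2,]}');    // trailing comma in a literal
const withComment = tryParse('{"a": 1 // note\n}');  // line comment

console.log(strict.ok, trailingComma.ok, withComment.ok);
```

Only the first input parses; the other two are rejected with a SyntaxError, which is why tools that want comments or trailing commas need their own more lenient parser.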


> And I haven't ever seen anybody write a long JSON config any differently

It's not people you have to worry about. It's the machine generated code. But now that I think about it more, YAML is a bad example since it's a superset of JSON. So a machine could generate really ugly YAML too.


Operating under the assumption that config files will be handled by a plain text editor and not an IDE seems unreasonable.


I've edited the VSCode config in JSON form by hand when adding snippets and overriding how the vim plugin handles some keys. Instead of the normal settings view, it was just the JSON in the text editing view.


> JSON is a poor format for humans

But do you know what sucks even harder? XML. If I can replace an XML configuration file with JSON then screw it.


How does it suck harder? Sure, the closing tags can be a bit annoying (but decent editors can insert those automatically), but attributes don't have that problem, and contrary to JSON it has comments.


Closing tags is not really an issue with XML.

In my experience, the biggest reason people hate it is because it has too many ways of structuring data, and it allows people who design the schemas to go overboard. Attributes or child nodes? Multiple namespaces or just extend the one you have? CDATA or embedding?

That said, simple XML is as good as simple JSON. XML is fine if you keep it simple when designing the schema. AND of course one can screw up with JSON too. But after almost 20 years of people over-engineering their XML schemas, I can't fault anyone for choosing a simpler data format.


The big difference is that mapping that XML into native data structures in your language is a mess in the general case because XML has so damn many ways of structuring the data. JSON maps more or less directly into JavaScript/Python/Perl or really any untyped language with arrays, hashes, and scalars.
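A quick sketch of that direct mapping, assuming Node.js; the document contents are made up for illustration:

```javascript
// Each JSON construct lands on an ordinary language value.
const doc = JSON.parse(
  '{"name": "app", "ports": [80, 443], "tls": {"enabled": true}, "proxy": null}'
);

console.log(typeof doc.name);          // JSON string -> string
console.log(Array.isArray(doc.ports)); // JSON array  -> array
console.log(typeof doc.tls);           // JSON object -> plain object
console.log(typeof doc.tls.enabled);   // JSON true   -> boolean
console.log(doc.proxy === null);       // JSON null   -> null
```

There is no decision to make about attributes versus child elements versus text nodes, which is the mess the XML side has in the general case.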


Personally, the main reason I don't like XML is that it's too verbose.


I think part of it is the context that JSON and XML are used in, particularly the era of software that uses these formats for configuration and the design implications that has.

Recently I've had to work with XML config files for a tool called Oozie, which is a data pipeline scheduling tool within the hadoop/spark ecosystem, and it has been soul crushingly tedious for me. Everything feels verbose and opaque, the documentation seems to prioritize enumerating every possible configuration option over providing minimum viable configs for common use cases.

JSON configs often just feel more simple and developer friendly. I'd say this has less to do with technical differences between JSON and XML and more to do with how "modern" software systems have been designed to be more ergonomic for developers/administrators and these modern systems happen to be more likely to use JSON.

It's also fun to hear non-technical people at work talk about "Jason files".


> It's also fun to hear non-technical people at work talk about "Jason files".

Ha! That one always makes me smile. One of the best exchanges I’ve heard was something like:

“... and we’ll have JSON handle all the serialization”

“Jason? Oh we hired a new engineer?”


This is a great point: correlation between tool and era/ergonomics. My current role is swimming in XML and abides in the XML era design philosophies.


> It's also fun to hear non-technical people at work talk about "Jason files".

I usually go the other way; Jasons I know have their names represented as "JSON" in my mind, and sometimes in written form.


There's a flip side to the common use case of documentation... How do you solve the uncommon use case when all the docs it seems are showing "how easy it is to do xyz." A month ago, I started on a new web front end. It was set up with React, Redux and webpack. Since it was primarily going to be me solo on the project, I wanted to integrate TypeScript. There are tutorials everywhere for starting with TypeScript but good luck finding something to say how to integrate it in.


You might be able to guess my name by my comment.

I really wish everyone would think of JSON as a French word. "Jay-sonne" (or jay-SAWN). Not "Jase-in"... it would help keep the confusion to a minimum.

Ha - https://www.youtube.com/watch?v=zhVdWQWKRqM He starts by saying "Jason" but then suggests the French version. Love it!


I find it interesting you're being downvoted. I think your question is reasonable: why does XML suck "harder" than JSON? This just seems a bit faddish to me. Things like namespaces and attribute/child distinction in XML are super annoying ... until you really need them in your ML (JSON, in this case). Then, suddenly, you're reinventing a wheel, hoping all of your downstream tools 'agree' on the behavior.


The votes seem to be going the other way now :-)

I've heard people prefer JSON over XML because XML is too complex... and now we have json schemas, jsonpath... the namespaces can't be far away.

I'll admit that XML probably would be better without entities and the crazy whitespace handling.


Why does it suck harder? I’ll give one example: arrays.

In JSON they’re plain and simple. You even get to use good ‘ol [ ] delimiters.

In XML? It’s a sea of open and close tags.

Now do an array of arrays. Then hand-edit it.

Like I said, it’s no better. But it’s way easier to read.


I can easily import/export the JSON and share it within my codebase. The XML is much worse in that regard (unless there's some clean way to do the same thing that I'm not aware of).


Millions of developers use XML all the time, it's called HTML.

People can easily read and edit it with deeply nested structures, attributes, and even malformed data. Frontend frameworks using components and variations like JSX also maintain the same style because it's very natural to use.

Now try taking a typical HTML document and expressing it in JSON. Even an empty page would be completely unmanageable without tooling. People have a strange reaction when they hear "XML" but it's much more structured, usable, and widespread than you think.


I think using the popularity of HTML (a rich text format) to justify using XML as a configuration format is a bit disingenuous. The two use cases (structured data and rich text) are genuinely very different from each other. In rich text, the primitives are text and markup. In structured data, the primitives are structures (lists, maps) and data (strings, bools, numbers). The impedance mismatch goes both ways - try taking a typical JSON document and expressing it in XML. Which parts end up as attributes? Elements? Text nodes? How do you express the semantic difference between a list and a map? A string and a boolean value? It’s arbitrary, ugly and verbose no matter how you slice it. And, just like its inverse, basically unmanageable without tooling.

Also HTML is not XML. HTML is non-strict (the spec is its parser, not its format). HTML doesn’t have schema files. HTML doesn’t understand self-closing tags. HTML does have void elements (img, br, input, etc). HTML is designed for humans to write. And it’s a tool fit for its purpose, unlike the abomination that is XML.


Another commenter linked to this: http://www.jsonml.org/

Scroll down and I think it shows it perfectly how (X|HT)ML tags are much simpler than JSON syntax once things get complex and nested. It's not that hard to define basic primitives like we do with HTML (which has dozens of tags) but XML also lets you define your own schema if necessary to make things more compact.

I understand the technicalities between HTML vs XML but I don't see how it makes any practical difference when you're editing a bunch of tags in a text file. It's the same thing. The structure looks identical. What is the actual issue that makes HTML easy but XML hard?


This is such a contrived example: they take html and show the equivalent conversion in json with the same schema. The point is that in a configuration file you can remove nearly all the cruft of xml. Instead of having <input key="key" value="value">, you don’t need any brackets or extra info. You just type { key: value } and you’re done. As many times as you want. I’ve had to hand edit complex msbuild configurations (xml based) in past projects and I can tell you lists and maps are hell. We’ve since converted them to yaml (another discussion), but the point is xml is terrible for human editing.


And yet you have literally done the same exact thing in reverse - in XML, the exact equivalent would be <key>value</key>; you don't need any extra attributes, either.

Well, until you do need some metadata for that key-value pair. Which is why even in many JSON schemas it's pretty common to get something like { "key": "key", "value": "value", ... } (where ... is usually empty in practice).

The problem with MSBuild, for the most part, isn't XML - it's its own data model (which is not XDM, by the way).


The nuance that differentiates my example is I’m not picking some random web page (where html might actually make sense to use) and trying to apply json to a domain it most certainly is not optimized for. I’m taking an extremely common case and showing why xml is way too verbose for very simple scenarios.

My main point is in xml the closing tags are unnecessary, and you need an opening tag for every value where it’s obvious from nested context what that value is supposed to be. Xml is very redundant, there’s just all this zero entropy text everywhere which conveys no information. I disagree with your metadata in json example, why not just add more data to the value? The data model of msbuild is also annoying, but I’ve worked with some dotnet core projects now and using json as the project format definitely saves typing and actually allows you to intuit what a project is doing rather than just bombarding you with useless text.


You're picking a particular way to handle that common case. A common way, I agree, but you have others if verbosity is what you're optimizing for.

And with respect to data and metadata, what you propose is not fundamentally different in terms of verbosity:

   <foo metadata="bar">baz</foo>

   {
     "foo": {"value": "baz", "metadata": "bar"}
   }
And it's even worse if you also need to preserve order of keys.


Ok, but like what if your metadata is even more complicated than just a single field, it’s ridiculous to have to include even more opening and closing tags. <map> <kvp> <key>k</key> <value>v</value> <metadata> <field1>m</field1> <field2>m2</field2> </metadata> </kvp> </map>

It’s just crazy to look at tbh, and it was annoying to type on my phone. I’m amazed people are still arguing in favor of it. In reality you just want the markup to transmit the most information in the least amount of bits. This is a measurable quantity and xml objectively sucks at it.


> In reality you just want the markup to transmit the most information in the least amount of bits.

If that were the case, we'd be using binary serialization everywhere. But you yourself are making an argument that ease of writing matters. So does ease of reading. Overly verbose markup is a tax on both, but so is extreme brevity.

Anyway, this particular thread was a discussion to address a very specific point made in the article that overstated XML verbosity over JSON. I'm not actually arguing that XML is perfect, or even "good enough". Its syntax is overly verbose, and its data model has pointless distinctions and arbitrary restrictions. But it also had many good ideas, and it's unfortunate that those get ignored in the quest of simplifying everything - and then later, when the issues that were the original motivation for those ideas are rediscovered, that wheel gets reinvented in dozens of flawed and mutually incompatible ways.


.NET Core uses .csproj project files which are XML.

Do you have difficulty in reading a large HTML document? That's as verbose and repetitive as XML but also usually filled with junk comments and malformed tags. If you don't find it hard, then why is XML different?


> Millions of developers use XML all the time, it's called HTML.

HTML is not XML and even then few developers today use HTML without some sort of processing involved. HTML is also not a language used for configuration files, so it is really off-topic.

The question is, do people prefer XML configuration over JSON configuration. The answer is that people generally prefer JSON. JSON just maps far better to data types we use in everyday programming. Have you seen some of the common ways of expressing key-value pairs in XML? It's horrible.

> Now try taking a typical HTML document and expressing it in JSON.

That's not the problem anybody is trying to solve here. We're talking about configuration files, not complex documents. But let's say I wanted to solve that problem, the representation would look like this:

http://www.jsonml.org/

Not that bad at all.


It's a technical distinction without a practical difference. They are both declarative languages to describe data.

Why is it so horrible? Have you ever typed a <ul><li> list? Or <input name="key" value="value">? What exactly is so difficult about this syntax in XML but completely fine in HTML?

The point isn't whether you can represent a document in JSON, it's about how easy it is for a human to manage it. The XML/HTML structure is far easier as things get more complex. The page you linked to shows this quite clearly even with the small "Bulleted List Example" at the bottom.


> It's a technical distinction without a practical difference.

There's a lot of practical differences when actually implementing configuration with either XML or JSON, due to technical distinctions.

> The point isn't whether you can represent a document in JSON, it's about how easy it is for a human to manage it.

If XML is easier to manage, why are people shifting from XML to JSON for configuration? You may have some experience with HTML, but have you actually used some of these XML abominations that are used for configuration?

> The XML/HTML structure is far easier as things get more complex.

Configuration files shouldn't be complex. They're mostly key-value pairs, perhaps a couple of lists here and there and a modest amount of nesting. XML is complex to begin with.

Perhaps at some stage, for some use-cases, XML starts becoming simpler to edit. Configuration files generally aren't one of them.

> The page you linked to shows this quite clearly even with the small "Bulleted List Example" at the bottom.

It's an example of representing XML-like structure with JSON, which is fairly easy. Try it the other way around, things get hairy. If you were to represent the data as JSON, you wouldn't write it like that.


If config is small, both formats are easy and it doesn't matter. When config is large and complex then both formats can be hard, however XML is much easier than JSON as complexity climbs. My evidence is how many people easily edit large complex document structures in HTML already without issue.

I'm just not sure what the argument against this is other than verbosity. Config files shouldn't be complex? Sure, but what if they are? Are people really moving to JSON everywhere or is it just tied to the rise of Javascript and web frameworks?


> If config is small, both formats are easy and it doesn't matter.

Even if I would agree with this (which I don't), if it doesn't matter then you should pick JSON because it's easier to work with as a developer and it doesn't require XML parsing as a dependency.

> My evidence is how many people easily edit large complex document structures in HTML already without issue.

This isn't evidence at all. An HTML document is not a configuration file. Even then, people rarely author large HTML documents by hand these days. It's more likely that they're using a simpler language like Markdown to author documents.

> Are people really moving to JSON everywhere or is it just tied to the rise of Javascript and web frameworks?

If Javascript and web frameworks are responsible for the rise of JSON, but XML is better, why didn't they stick with XML? Remember, SOAP was XML. AJAX stands for "asynchronous Javascript and XML". The way to do an HTTP request (before fetch) from Javascript was called (inappropriately) XMLHttpRequest. JSON on the other hand was just an informal spec with several flaws, yet it won out over XML.


1) What I mean by "it doesn't matter" is that at small scale, neither one is easier than the other.

2) JSON does need to be parsed and does add a dependency. Javascript is not the only language out there.

3) HTML not being config files is not the point. The document/tag structure is identical. If you can edit large complex HTML files (regardless of how they are generated) then you can edit large complex XML files. And it's rather easy to do so, against the claims that XML is so hard to work with.

4) There are plenty of XML configs out there, backing just about every piece of software you use. If you only focus on web/js projects then JSON "won out" because it is a simple dump of the in-memory representation rather than a more formal serialization like XML. The missing features like JsonPath and Json Schema are now being added back to turn it into a proper serialization format. When storing configs, these schema features are rather important.


1) Yes, if they're both easy to use, pick JSON - for simplicity's sake.

2) A JSON parser is a far smaller and simpler dependency than an XML parser, especially when we're adding XPath to the mix.

3) XML isn't necessarily hard, it's tedious.

4) There's plenty of legacy software out there using all kinds of stuff for config. You don't see a lot of people choosing XML these days. You see YAML or TOML or JSON, even though all of these have issues of their own. JSON happens to be the simplest and most commonly supported.


HTML is distinctly not XML. There is XHTML if you want to use a form of HTML that is processed like XML, but XML processing is a lot stricter than HTML processing in which you can take a lot more liberties in the markup and still end up with a readable result, maybe even what you intended, whereas an XML processor will usually refuse to render incorrect XML markup at all.


I'm comparing the data structure and editing ergonomics (which are identical) between XML and HTML, not how they are processed. It's good for config files to have precise parsing instead of tolerating errors.


>They are both declarative languages to describe data.

This still applies then


Well put. Thank you.

In my world, config files are machine-parsed but human-generated and modified. JSON is just easier to work with.


XML is a lot less pleasurable to parse from a programming point of view though. With a JSON file it's something like:

const config = require('./config.json'); // relative path; a bare 'config.json' would be looked up as a package name

console.log(config.apiKey);

With XML you often have to use things like XPath etc. to extract the data, which is a lot more effort.


Javascript isn't the only language though. There are lots of XML and JSON parsers and serializers available to convert to an in-memory object. Most relational databases have great XML support too.

I don't think people realize that JSON doesn't actually have a querying system at all, you have to deserialize it to an object to use. There is the coming JsonPath standard but that's not well supported yet and is pretty much the same as XPath.
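To illustrate what "deserialize it to an object" looks like when you want something query-ish, here is a minimal sketch assuming Node.js. The `getPath` helper and its dotted-path syntax are hypothetical, written for this comment; it is not JsonPath and not any library's API.

```javascript
// Hypothetical helper: walk a dotted path through already-parsed JSON.
// Returns undefined as soon as any step is missing.
function getPath(root, path) {
  return path.split('.').reduce(
    (node, key) => (node !== null && node !== undefined ? node[key] : undefined),
    root
  );
}

const config = JSON.parse(
  '{"server": {"hosts": [{"name": "a"}, {"name": "b"}]}}'
);

console.log(getPath(config, 'server.hosts.1.name'));  // second host's name
console.log(getPath(config, 'server.missing.name'));  // missing branch
```

Everything here happens on the in-memory object; the JSON text itself is never queried, which is the point being made above.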


I think JSON is just universally more easy to use. Every language I’ve used with it is mainly the same in the way it works. JavaScript, Ruby, Python, Perl, PHP...

But you’re not wrong. Yes you have to deserialise it. Hence it being an Object Notation system.

While you say JSON is trying to catch up with XML in regards to XPath, JSON is trying to catch up to XML with things like JSON Schema.


> While you say JSON is trying to catch up with XML in regards to XPath, JSON is trying to catch up to XML with things like JSON Schema.

Absolutely. I would say XSD is the strongest advantage of XML over JSON right now. XSD still has many things to improve but JSON schema is even further behind.


> I don't think people realize that JSON doesn't actually have a querying system at all, you have to deserialize it to an object to use.

You don't have to deserialize it to an object, you just do that because it's convenient in Javascript (and other dynamic languages). It's a feature.

Now try using XML like an object, what do you get? A DOM. Which is fine if you wanted a DOM, but I don't want a DOM. I don't need XPath or any of that stuff. These are tools to deal with the complexity of XML, which I don't have, because I am using JSON.


A DOM is just a tree (and deserialized XML is not always W3C DOM; indeed, in most modern languages, it usually isn't). Deserialized JSON is also just a tree. And even if you deserialize JSON by eval'ing it from JS, it is still deserialization - you're just happening to be reusing the JS parser for that purpose. But any parser is fundamentally a deserializer from the language syntax to an AST.

So you're really saying that JSON deserializes to something that is a more natural fit for the language that you're using. And it's true in many cases; but also not so much in others, like when you're dealing with 64-bit integers, or dates, or all those other things that JSON doesn't spec because "complexity". In practice, it just means a proliferation of incompatible ways to represent these things, and utterly insane deserialization behavior in corner cases when implementations try to be "smart" to transparently compensate for JSON lacking something (e.g. https://github.com/JamesNK/Newtonsoft.Json/issues/862).
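A small sketch of the 64-bit integer and date cases, assuming Node.js and its built-in `JSON.parse`; the field names are made up. Passing the id as a string and converting with `BigInt` is one common workaround, not something JSON itself specifies.

```javascript
// 2^53 + 1 does not fit in a double, so the parsed number is silently rounded.
const parsed = JSON.parse(
  '{"id": 9007199254740993, "created": "2019-04-13T00:00:00Z"}'
);

console.log(String(parsed.id));               // no longer ends in ...993
console.log(Number.isSafeInteger(parsed.id)); // outside the safe integer range
console.log(typeof parsed.created);           // dates arrive as plain strings

// Workaround by convention: ship the id as a string, convert explicitly.
const viaString = JSON.parse('{"id": "9007199254740993"}');
const exactId = BigInt(viaString.id);
console.log(exactId.toString());
```

Whether the consumer expects a number, a string, or something else for such fields is exactly the kind of thing each API ends up deciding for itself.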

Conversely, if you are writing in a language that has integral support for XML - say, XQuery, or even VB.NET (https://docs.microsoft.com/en-us/dotnet/visual-basic/program...), the complexity is mostly not there. At the very least, if you control the format - which you have to, if you're in a position to decide what to use - then you can certainly stick to the subset of XML that is not anymore complex than JSON.


Remember, we're comparing with XML. There's no dates or numeric types in XML at all. This kind of proliferation is far worse in XML.

Sure, there are corner cases and limitations with JSON. I've never experienced that as a significant issue.

> Conversely, if you are writing in a language that has integral support for XML - say, XQuery, or even VB.NET...

I am not using any of that stuff, nor is there any reason for me to start using it.

> At the very least, if you control the format - which you have to, if you're in a position to decide what to use - then you can certainly stick to the subset of XML that is not anymore complex than JSON.

...or I can just use JSON.


> Remember, we're comparing with XML. There's no dates or numeric types in XML at all. This kind of proliferation is far worse in XML.

XML (or rather, XDM, which is the appropriate level of abstraction to talk about this) has all those things:

https://www.w3.org/TR/xmlschema-2/#built-in-primitive-dataty...

Nor does it require an out-of-band schema - you can slap xsi:type on any element. And you can do that without breaking the data model, because namespaces keep data and metadata unambiguously separate, and code can easily deal with the former while being completely oblivious to the latter, unless it needs it.

JSON also has similar higher-level abstraction layers with more metadata. The problem is that nobody can agree on which one to use, or even whether to use one at all, and most code that's deserializing JSON in the wild is not going to be able to distinguish metadata from data.


XML has string attributes and text content.

Sure, you can add information until you arrive at a point where a string attribute or text content will be interpreted as a certain data type, but XML itself doesn't have it.

> JSON also has similar higher-level abstraction layers with more metadata.

JSON has all the basic data types built right in, there's no need for more metadata to do simple things. There's a reasonable mapping to basic data types and structures for almost any language.

> The problem is that nobody can agree on which one to use, or even whether to use one at all, and most code that's deserializing JSON in the wild is not going to be able to distinguish metadata from data.

...which is generally fine because of the aforementioned mapping. Your JSON library doesn't have to (and shouldn't) do any magic.


That's ridiculous - there are lots of pragmatic xml deserializers available. Not to mention, sometimes you want something like xpath; lack of xpath and lack of validation (schema, whatever) aren't features, they're bugs.

The only real advantage json has when it comes to deserialization is that it suggests to a human reader that keys are like object properties, leading to a very obvious deserialization strategy. But even that's a bit misleading, since json allows duplicate keys, just like xml, so a pragmatic deserialization library is going to make that feature impossible.
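A quick sketch of the duplicate-key case, assuming Node.js and its built-in `JSON.parse`; other parsers may reject the input or behave differently, so treat this as one implementation's behavior:

```javascript
// The text contains "retries" twice; the parsed object can only hold one.
const dup = JSON.parse('{"retries": 1, "retries": 5}');

console.log(dup.retries);             // the later value wins here
console.log(Object.keys(dup).length); // only one key survives
```

The earlier value is dropped without any error, so the information in the document and the information in the object are not the same.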

Make no mistake - the culture of straightforward deserialization is hugely valuable! But that's because of its history and other human factors more than the language itself.


Fair, but many developers are intimately aware of the shape of that kind of XML. Ever tried making sense of an XML document that tries to express something as complex as a web page, but without the requisite domain knowledge? It's painful.

Slightly less so as a JSON document, IMHO.


I never write html directly. With jsx you can have variables and reuse code.


I think the stigma comes from issues using XQuery etc. Not that they are unusable but they require adjusting your way of thinking.


But XQuery (and XPath, being its subset) is pretty much just pure sequence comprehensions for the XML Data Model. It might have been unusual from a mainstream PL perspective back when it was introduced, but today, when C# has LINQ, and JS developers preach the miracles of map/filter/fold over immutable data structures, I don't think XQuery is all that exotic.


JSON has no querying system, it can only be accessed by deserializing to an in-memory object or using whatever custom APIs are available (like postgres json functions).

JsonPath is the proposed standard, and it looks pretty much like XPath. And that's before getting into Json Schema.


At least XML has comments and you can simply duplicate the last element in a list without fear of forgetting a comma.

XML has other problems though.


And the problem would not exist if we just used the good ol' s-expressions. Gets all the structural benefits of XML and JSON with few of the drawbacks.


The problem people have with XML is not the syntax. Also, how would you represent (optional) attributes in s-expression? I could think of a couple ideas but none that is nice.


Basic S-expressions also don't distinguish between lists and maps, which is something that turns out to be very convenient in practice. Sure, a map is just a list of pairs - but the deserializer needs to be aware of its meaning to parse it into the appropriate data structure. So you either need a schema even for the most trivial cases, or you need a distinct syntax.

EDN is a better Lisp-flavored candidate, IMO.


Basic S-exp syntax can easily be extended to denote dictionaries. Just like #(...) gives us vectors and #S structures, some #H can provide hash tables.


Of course, but then we're not talking about raw S-exps, but something built on top of them. And then you get something like EDN anyway.


> Of course, but then we're not talking about raw S-exps, but something built on top of them.

I don't understand what this means.


If you really needed, you could just define that the second list element is an attribute list/map, turning <foo bar="baz"><quux /></foo> into (foo ((bar "baz")) (quux)).

The problem with XML itself is being unnecessarily verbose (and thus difficult for both human and machine to read) for what's just a way to encode trees. Attributes are arguably XML's self-inflicted gunshot wound in the foot; you mainly need them because of visual noise caused by regular nodes.
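
A minimal sketch of that mapping, using Python's standard `xml.etree.ElementTree`. The function name `to_sexp` is made up, text content is ignored, and the attribute list is emitted only when an element has attributes (an assumption chosen to reproduce the output shown above).

```python
import xml.etree.ElementTree as ET


def to_sexp(elem):
    """Render an element as (tag ((attr "value") ...) child ...)."""
    parts = [elem.tag]
    if elem.attrib:
        attrs = " ".join(f'({k} "{v}")' for k, v in elem.attrib.items())
        parts.append(f"({attrs})")
    parts.extend(to_sexp(child) for child in elem)
    return "(" + " ".join(parts) + ")"


root = ET.fromstring('<foo bar="baz"><quux /></foo>')
sexp = to_sexp(root)
print(sexp)
```

A reader of the result can still tell attributes from children, because the attribute list starts with a parenthesis while a child starts with its tag name.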


I believe that more than being verbose (closing tags for example) it is that the whole spec is enormous, with entities and namespaces making it even more complex. Still we see that some of that is actually needed as various json path/schema projects show.

Mapping a 1-1 s-expression translation onto HTML/XML/JSON/YAML etc. solves nothing. YAML has (had) code execution security problems and parser incompatibility issues. HTML/XML actually need namespaces in a few cases. JSON is abused as a "compile target".

There can be no one format that works for everything. This is why I like TOML, it is really good at what it tries to do and stops there.


There is no need for optional attributes in the first place, because you can always just do this:

  <thing>
     <optattr>42</optattr>
     ...
  </thing>
instead of

  <thing optattr="42">
    ...
  </thing>
Attributes exist in order to create a more compact syntax within XML/HTML tags, in ironic recognition of these notations being horribly verbose.


Many implementations of JSON - especially the ones that parse human input - allow trailing commas for this exact reason.


You may be right, some JSON parsers even support comments. But you never know which parser is strict and which not, because it's not part of the specification.
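
Python's standard `json` module is one example of a strict parser. A small sketch, where `parses` is a hypothetical helper written for this illustration:

```python
import json


def parses(text):
    """Return True if the standard json module accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


strict_ok = parses('{"a": [1, 2]}')
trailing_comma = parses('{"a": [1, 2,]}')
with_comment = parses('{"a": 1 // note\n}')
print(strict_ok, trailing_comma, with_comment)
```

A file that a lenient parser accepts can therefore fail as soon as a second tool reads it.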


Why would you limit yourself to those two options? Use TOML or something.


TOML kinda breaks down when you have nested data:

https://stackoverflow.com/questions/48998034/does-toml-suppo...


The answer to the question is "yes it is supported", it doesn't break down at all.

It's even in the example file https://github.com/toml-lang/toml/blob/master/examples/examp...


Looks quite ugly. Repeating keys... YAML looks better (though it has other problems).


Your link shows nested data structures working? But regardless TOML is a configuration file format (like INI), not a general purpose data structure format (like JSON is).


> general purpose data structure format

General purpose until you need to store dates/times, self-referential structures, enums, etc.

json and toml are both serializations of data. I don't think the line you're trying to draw between "configuration file format" and "data structure format" is a well-defined line.
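
The date/time point can be seen with Python's standard `json` module. A small sketch: the ISO 8601 string is a convention chosen for this example, not something JSON itself defines.

```python
import json
from datetime import datetime

stamp = datetime(2019, 4, 13, 12, 0, 0)

# JSON has no date type, so the encoder refuses a datetime object.
try:
    json.dumps({"when": stamp})
    raised = False
except TypeError:
    raised = True

# Both sides have to agree on a convention, e.g. an ISO 8601 string.
encoded = json.dumps({"when": stamp.isoformat()})
decoded = json.loads(encoded)
print(raised, decoded)
```

After the round trip the value is a plain string; turning it back into a date is up to the consumer.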


Sure it's a blurry line but some formats are better than others for some specific tasks. JSON isn't the right tool for every job but it's a good enough lowest common denominator for data sharing.

I wouldn't want to use it for config files or passing complex data but then it's not designed for that.


TOML also has more "normal" notations for dicts and lists that are almost the same as JSON.

There is a distinction between markup languages and object notation languages. XML, HTML and Markdown are markup; JSON, YAML, TOML and most others are object notations.

(Here my distinction, in general, is whether they allow unquoted text.)


Shameless plug for a notation I work on: http://treenotation.org

To me it’s an improvement over JSON which is an improvement over XML.


I mean, we're not going back to xml anyhow; and xml is considerably older anyhow, so this is a pretty hypothetical case.

But I'd posit that most - but not all - of the issues with xml config files have little to do with xml per se, and everything to do with the crazily detailed stuff people squeezed in it. You don't have to stick everything in a namespace, and nest even trivial things 3 deep with annotations for obvious types. And if you do those things in json it turns into a mess too.

SOAP is a perfect example of that. But: just because soap is messy doesn't imply xml everywhere has to be that.

But sure, xml has some downsides. Then again, so does json. Oh well!


At least XML has comments.


This is a false choice. These aren't the only two options.


XML does suck, but at least it supports comments.


It's easy to write multiline XML and add comments.

Can't do that with JSON.


Being able to statically analyze and transform configuration is a benefit for humans.


I don't understand why people think JSON is a bad format for humans. It is very hard to write invalid JSON without noticing, because doing so results in JSON parse errors. Take YAML on the other hand -- simply omitting a dash can result in unintended consequences that will parse just fine:

  some_collection:
    - first_entry
vs

  some_collection:
      first_entry
When you're staring at a non-trivial YAML file, it's very hard to spot these errors, and you'll potentially get some very weird behavior in whatever is depending on the config.


Not as bad as YAML isn’t much of an endorsement.


Usually when people complain about JSON they offer YAML as a better option. This article complains about JSON and offers no alternative at all -- what would you suggest?


I think for most cases, old school Unix style is perfectly adequate. Where it isn’t, XML is the least bad alternative (and YAML is probably the worst).


Ever edit a 200 line yaml file? There are worse options than JSON.


I did, and it was mostly painless, both from IntelliJ and from Emacs.

I do know about a number of gotchas, and stepped on some briefly. Yaml is not ideal, but its set of tradeoffs is more suitable for human-editable configs, from my point of view.


I’ve had to debug 600+ line Elasticsearch queries in JSON format, while also working with quite long YAML for Ansible playbooks.

It doesn’t matter the language or syntax, if the unit of structure doesn’t fit on the screen, it’s hard to work with.


I never did, what problems did you encounter?


Not the parent commenter, but it really gets tough to figure out _where_ you are in a bunch of nested lists, objects, etc deep down in the yml file. When the schema gets complicated enough you start having to refer back to the documentation to understand what syntax a particular item needs (I'm looking at you, docker compose!)

One example from a larger file:

  networks:
    macvlan51:
      driver: macvlan
      driver_opts:
        parent: eth0.51
      ipam:
        config:
          - gateway: 192.168.51.1
            subnet: 192.168.51.0/24
Maybe I'm just getting older and dumber, but that hyphen under config seems completely arbitrary to me (and the necessity of it is poorly documented in the manual). This gets magnified to the extreme as your config file grows. Maybe this is a list of objects under a key? Who knows.

Edit: OK, it looks like hyphens indicate arrays in the schema but for me it feels like a guess as to which one of these to use for each particular case.


Without the hyphen, config’s value would be an associative array. With it, it’s a list.

I’ve been editing long sublime-syntax files recently and I do not like it much. But I’m also not sure if JSON would be better for such a thing.


What's the point of a list of tuples? Order? JSON seems more immediately obvious and consistent in this case, i.e. [{a:b},...]


It's not a list of tuples, it's a list of dictionaries.
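
For reference, here is the docker compose fragment from above written out as the data structure it denotes. This is a hand transcription into a Python literal under the usual YAML reading, not the output of a YAML parser.

```python
import json

# Hand transcription of the YAML fragment above.
compose = {
    "networks": {
        "macvlan51": {
            "driver": "macvlan",
            "driver_opts": {"parent": "eth0.51"},
            "ipam": {
                # The hyphen in the YAML makes this a list holding one dictionary.
                "config": [
                    {"gateway": "192.168.51.1", "subnet": "192.168.51.0/24"}
                ]
            },
        }
    }
}

ipam_config = compose["networks"]["macvlan51"]["ipam"]["config"]
print(json.dumps(compose, indent=2))
```

In the JSON rendering the list shows up as explicit square brackets, which is the difference being argued about.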


Ah, but my confusion just deepens my point, haha.


It's mostly a question of what you're familiar with. JSON using [] and {} to denote list vs dict isn't fundamentally easier than "-" and a prefix-less list of key:value pairs. YAML has a bunch of extra complexity, but IMHO this difference is really not it.


From a syntax standpoint, parentheses are far clearer than prefixes+whitespace (though maybe less readable).


Semantic whitespace is a nightmare. It's also sometimes ambiguous whether things are ints, bools, or strings.


> JSON is a poor format for humans

I have never understood this argument. I'm not saying it's the best, but I find it much much more readable than yml or xml.


I don't feel pain with vscode's settings.json, or .eslintrc for example, because they are autocompletable, allow comments (json5), allow trailing commas (json5), ...


Unfortunately, it's not actually JSON5, it just supports a few features from the spec.

But yes, if every implementation of JSON moved on to JSON5, I don't think we'd be having this discussion.


It would be interesting to try Markdown files for configuration data. Documentation and settings in the one file: structured data like arrays can be stored in md tables.


Interesting idea. I did a quick search and found literate-json on github.


I think it's possible to skip the .md to JSON step?

An example is below, where an application could either extract the values at startup, or they could be extracted when the code is built and deployed.

  # My App

  ## Intro about My App

  Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.

  ## Prod Server Settings

  | Host                 | Port | Operating System | Type     |
  |----------------------|:----:|:----------------:|----------|
  | webserver1@myapp.com | 80   | Debian           | Web      |
  | webserver2@myapp.com | 80   | Debian           | Web      |
  | db@cloud.com         | 1433 | aws              | Database |
  | webservice.com/api   | 80   | aws              | Database |

  ## Support Information

  Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.

  ## Application Roles

  | Name                  | Active Directory Group | Description                              |
  |-----------------------|------------------------|------------------------------------------|
  | System Administrator  | MyApp.SysAdmins        | Details who should be assigned the role. |
  | Finance Administrator | MyApp.FinAdmin         | "         "                              |
  | Finance Delegate      | MyApp.FinDelegate      | "         "                              |
  | Finance Analyst       | MyApp.FinAnalyst       | "         "                              |


> JSON is a poor format for humans

Depends on which kind of humans.

For the regular non tech types I agree JSON is poor. A couple of years ago I created a mini spec for a config file so that very low tech paper designers could configure an interactive book.

For programmers, JSON works fine in my experience.


> For programmers, JSON works fine in my experience.

The author mentions this, and I very strongly agree. What JSON significantly misses is the ability to have comments in your configuration files. I'm not going to hold up the Apache config files as an example of amazing config formats, but... here's an example: https://svn.apache.org/repos/infra/websites/cms/webgui/conf/...

That's awesome. It describes options that are available and not enabled, and what they do. Right. In. The. File. Comparing that to, say, webpack config... there's a pile of research needed to figure out what options are even available, let alone what they do.


Yeah, that's the #1 problem with JSON as a config file format, IMO, and the Apache config files, among others, are a perfect example of why.

All the other arguments for or against it boil down to personal preference, but not having comments is a real miss.


Scale is the important thing here. Even if you're a magical superior human programmer variant, json sucks once you cross over 50 lines.

Cloud Formation is a great example of the failures of JSON. Specifying just a single dynamo table with indexes and scaling roles in JSON is a hard to read cluster of a mess. Even with a good IDE and JSON collapse/expand help, large JSON files are just insufferable to deal with as a human.


I'd say if you're crossing over 50 lines, it's going to suck no matter what format you're in. Other solutions are not really going to fare any better than JSON at that level.


XML + XInclude would allow you to break it all into multiple files, for example.


> For programmers, JSON works fine in my experience.

Programmers will maybe cope better but just being a programmer doesn't make you magically immune to poor readability and lack of comments.

Most of the points in that article apply to programmers as much as non-programmers.

(I agree there is a reasonable debate to be had about programmability - I'm on the fence about this one)


We've been using HOCON for some configuration files. JSON but with comments and a few other more human friendly bits.

I don't hate it, but I really dislike it. It's definitely better than JSON for this purpose, but it's still quirky and a bit of a pain.


I feel like JSON is a decent format for storing human-readable configuration data.

The use of brackets means if something is broken, you'll know explicitly because your IDE will throw a fit. Formats that are white-space dependent can have subtle errors that aren't easily recognizable on first glance.

The 2 problems with JSON are multiline string support and comment support, but JSON5 solves both of those problems. Sure JSON5 isn't JSON, but YAML isn't JSON either. No matter what you're going to need a library to parse a config file, so why not stick to something based on a widely used data-storage format?


While there are no explicit comments in JSON, you can just add a superfluous string:

{ "useFluxCapacitorComment": "Enables time travel. true or false", "useFluxCapacitor": true }

I don't mind the formatting, but I'm a professional developer, so my tolerance for complex UIs is abnormal.

Written on mobile, so that JSON may look like garbage.
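
If the consuming tool is your own, one way to keep such pseudo-comments from reaching the runtime is to drop them right after parsing. A minimal sketch, assuming a naming convention where every comment key ends in `Comment`; `strip_comment_keys` is a hypothetical helper, not part of any library.

```python
import json

RAW = (
    '{"useFluxCapacitorComment": "Enables time travel. true or false",'
    ' "useFluxCapacitor": true}'
)


def strip_comment_keys(value, suffix="Comment"):
    """Recursively drop object keys that end with the comment suffix."""
    if isinstance(value, dict):
        return {
            key: strip_comment_keys(item, suffix)
            for key, item in value.items()
            if not key.endswith(suffix)
        }
    if isinstance(value, list):
        return [strip_comment_keys(item, suffix) for item in value]
    return value


settings = strip_comment_keys(json.loads(RAW))
print(settings)
```

The obvious catch is that a real setting whose name happens to end in `Comment` would be dropped too.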


Often not an option:

E.g: npm/yarn will complain loudly if you add unexpected lines in package.json IIRC.


I have seen people using additional attributes for comments. That does influence the runtime, which comments should never do, but I thought of it as a practical solution. Not optimal, of course.


Though not all JSON parsers accept random attributes. The AWS CloudFormation one, for example, is extreme in its refusal to accept "malformed" JSON. It really frustrates me sometimes.


> Comments are mostly useless in a machine-generated and machine-consumed file. They are indispensable for humans, because they are a link to what machines are not doing.

But comments are often a code smell: a crutch for poor identifier naming and design. If your names and design are well thought out, you rarely need comments, and the consumer of your code should be able to understand it using the statically checked source alone, versus manually having to parse your unchecked human-language documentation.


Comments are a code smell when they describe things that should be described in code.

Comments are important when they express things that code cannot.

Examples:

// See ticket XYZ-123 for discussion and rationale.

// We are sidestepping a bug in Docker 3.x here.

// TODO: Allow queue size per service version. Currently we're using max size for all.


Great examples. I wonder though if all those could be better expressed using something like decorators in order to enforce some static checks. For example, instead of a free-form comment, have a structured todo notation that makes the todo parsable and checkable. Often I see a //todo comment that is still there by accident after the thing is done. Or a todo that has since been abandoned.


Edit: adding a link to the Clean Code bit about comments, which I think is great advice : https://gist.github.com/wojteklu/73c6914cc446146b8b533c0988c...
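
A rough sketch of what a checkable todo notation could look like. The `TODO(TICKET-123): text` convention and the `check_todos` helper are hypothetical, invented here to illustrate the idea; they are not an existing tool.

```python
import re

# Hypothetical convention: TODO(TICKET-123): description
STRUCTURED_TODO = re.compile(r"TODO\((?P<ticket>[A-Z]+-\d+)\):\s*(?P<text>.+)")
ANY_TODO = re.compile(r"\bTODO\b", re.IGNORECASE)


def check_todos(source):
    """Split todos into tracked (line, ticket) pairs and untracked line numbers."""
    tracked, untracked = [], []
    for number, line in enumerate(source.splitlines(), start=1):
        match = STRUCTURED_TODO.search(line)
        if match:
            tracked.append((number, match.group("ticket")))
        elif ANY_TODO.search(line):
            untracked.append(number)
    return tracked, untracked


sample = "\n".join([
    "// TODO(XYZ-123): Allow queue size per service version.",
    "// todo: clean this up",
    "queue_size = MAX",
])

tracked, untracked = check_todos(sample)
print(tracked, untracked)
```

A CI step could then fail on untracked todos, or look up each ticket to see whether it is already closed.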


From the link, about comments:

  Use as explanation of intent.
  Use as clarification of code.
  Use as warning of consequences.
Here's a link to a well-known recent book, half of which is pretty much about why those points are a real and frequent thing, and why code should be much more thoroughly commented than is recommended by the usual philosophies.

https://www.amazon.com/Philosophy-Software-Design-John-Ouste...


In a JSON config comments can be very useful to "hide" disabled options

["DoStuff", "EvenMore"/*, "ops go back :("*/]


> If your names and design are well thought out

Sure, but how often does that happen?


It usually happens in good code if the language supports it. But I agree, not enough and comments are great as a stopgap until the thing you need to explain can be explained in the programming language. But sometimes that can’t be and so you do need to resort to comments. It’s like types in doc comments being used as a stopgap until a language added types. I hypothesize (I plan at some point to crunch some data here) the more comments the worse the language, and so you’d see a pattern like Python 3 code having fewer comments on average than Python 2 code.


JSON is data, not code.


Anybody that wants turing complete programmability in their config file is asking for pain. You can always embed such a language in a string as an escape hatch, so there's no loss of generality in any case, but it's sufficiently painful to encourage simplicity and security. But there exist computation models weaker than turing completeness that are still useful, some of those might be reasonable in a configuration file format. For example, you might want to specify a bunch of things iteratively, or perhaps with some fixed number of temporary variables; that's merely an FSM. Or even simply plain non-iterative logic with parameter passing, which is even easier to reason about and possibly still useful; i.e. re-usable templates without iteration or recursion.

I can at least imagine such a configuration file format could be both simple enough for human and machine to reason about, and flexible enough to add value over (say) json. But the devil is in the details; and it's not always immediately obvious what's too complex.

In any case, json5 looks like what json should have been from day 1; several of the really critical improvements were common JS practice when json was invented.

I'm a little worried about the IdentifierName extension, since that's more than what javascript itself allows (no reserved keywords!).

But json5 is also an interoperability problem. The nice thing about a (good) standard is that if it works someplace, it'll work anywhere. Add json5 into the mix, and that's not the case anymore.

Then again, there aren't a huge number of plausible successors to json, so maybe this is the best we can hope for. And it's trivial to down-convert to plain json, which means it's not too bad, I guess?

I wish json5 simply worked everywhere ;-).


Dhall is a good example of a "programmable" configuration file format. It's based on System F, which means you get no-nonsense abstraction (in the LC sense) without any pain. It really helps to be built on something conceptually robust instead of ad-hoc whims. Oh and it has a static type system. Quite nice!


I find HOCON to be a nice minimal extension to JSON - allows config values to reference each other, with some basic string and array functions included, and allows successive files to append to or override existing config values. Fills 90% of what I see people asking for in terms of programmability.

https://github.com/lightbend/config/blob/master/HOCON.md#goa...
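
The "successive files override existing values" idea can be sketched in a few lines of plain Python. This is not HOCON's API or its exact merge rules, just an illustration of layered config with a hypothetical `merge` helper and invented values.

```python
def merge(base, override):
    """Return a new dict where values in override replace or extend those in base."""
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Nested objects are merged key by key instead of being replaced whole.
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result


defaults = {"db": {"host": "localhost", "port": 5432}, "debug": True}
production = {"db": {"host": "db.internal"}, "debug": False}

effective = merge(defaults, production)
print(effective)
```

The result stays plain data, so it can still be inspected without running anything beyond the merge itself.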


It's hard to believe JSON was standardized without comments and everything JSON5 fixes.

TOML is fine.


It's easy enough to get Turing complete config files with LISP

shivers


Author of the article here; I think this is a complex issue with no single "right" answer. It depends on the context and what you're doing with it.

I wrote this after getting annoyed with MediaWiki's new extension format which uses JSON files to set up the extension. What I wanted to do is not include some CSS if a certain plugin is loaded, and that's pretty much impossible to do in a declarative configuration file. Using a JSON file – or any other declarative config – to describe how code works is not a good idea IMHO (although it's probably fine to add basic metadata).

The worst case of declarative programming gone wrong that I've seen is probably k8s.

I can't really judge your Gradle example, as I've never worked with Gradle (or Java in general).

> I'm getting a bit fatigued of these you're-doing-it-wrong posts

I agree, and I would rename the article if I hadn't put the full title in the URL (I have since stopped doing that exactly because I wanted to rename this). In my defence, at the time I was annoyed with MediaWiki, PHP, this JSON stuff, and the reasons I even had to work on MediaWiki in the first place (which is a long story involving a lot of drama).

That being said, I think using JSON as a human-editable configuration format is, quite frankly, never a good application of the tool. I can't think of any cases where it's the best tool, chiefly due to the lack of comments, but also because it's kind of a pain to write. Variants which allow comments – such as JSON5 – are okay, but that's no longer JSON.


You can have declarative configs that are programmable; these are not mutually exclusive.

> You can't even figure out what the dependencies are going to be for some Gradle projects without executing them

This sounds like an issue of Gradle, not of programmable configs in general.


It's not an issue of Gradle, it's a feature of Gradle. Why would you want to figure out the dependencies without executing script? Programmatic configuration allows e.g. specify some library version once (e.g. springVersion) and use it for 10 dependencies as a variable. And that's the simplest use case. A full-featured programming language allows for a lot of flexibility.

Not every config file needs that, but a config file for a build system definitely needs that. Otherwise your build system becomes too restricted and you must either write your logic in plugins or use other scripts along with your main build tool. Both options are bad.

I'm not even sure that build.gradle should be called a config file. It's a build script.


> Why would you want to figure out the dependencies without executing script?

Originally, the whole point of using configuration files for builds was to eliminate the programmatic element where you used lots of little scripts to run builds. Config files are easy to read and reason about.

> Programmatic configuration allows e.g. specify some library version once (e.g. springVersion) and use it for 10 dependencies as a variable

Declarative syntax can also include variables which can't be reassigned to later.
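
A minimal sketch of that idea, using `string.Template` from Python's standard library. The build description format, the `resolve` helper, and the version number are all invented for the illustration; this is not how Gradle or any other build tool works.

```python
from string import Template

# Hypothetical declarative build description: variables are defined once
# and can only be substituted, never reassigned.
build = {
    "variables": {"springVersion": "5.1.6.RELEASE"},
    "dependencies": [
        "org.springframework:spring-core:${springVersion}",
        "org.springframework:spring-web:${springVersion}",
    ],
}


def resolve(description):
    """Expand ${name} references; an undefined name raises KeyError."""
    variables = description["variables"]
    return [
        Template(dep).substitute(variables)
        for dep in description["dependencies"]
    ]


resolved = resolve(build)
print(resolved)
```

The full dependency list falls out of pure substitution, so a tool can compute it without executing user code.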

Gradle's original marketing message was based around its syntax, i.e. Apache Groovy being easier to read than XML, which is true. But providing the programmatic ability of Groovy as well is a step backwards for configuration best practice, ultimately resulting in an increase in technical debt. Thankfully, almost all the Gradle build scripts out there are simple 20-liners which don't use programmatic logic.

Gradle ought to follow the example of Jenkins pipelines, which later provided a Declarative Syntax as an alternative to Groovy to mitigate this very issue.


> Why would you want to figure out the dependencies without executing script?

For the same reason that I would want to figure out if my script type-checks without executing it - making it easier to avoid bugs and reason about the program.


Why does every piece of advice need a counter pitch of a better idea? If I tell you “don’t stick your arm in a wood chipper, even if it’s not running”, I’m hoping you’re not going to be like “well then what should I stick my arm into?!” Presumably he didn’t give prescriptive advice because configuration isn’t one size fits all.


Worth mentioning that VS Code also lets you map any filename glob to a JSON schema. [1]

1: https://code.visualstudio.com/docs/languages/json


> "Lack of programmability" is a feature

If you ever have a non-programmable config file, then at some point someone will make a Turing-complete programming language which goes on top of your tool.


Or, if it's open source or an interpreted language, they could edit the code of the tool to implement whatever new features they want.


> executable-program-as-config I learned to hate in the JVM world

I don't understand how anyone who ever used Gradle or even Maven and went somewhere else can actually say that.

Gradle is simply amazing and I literally don't know what could be better about it. The link posted is so trivial compared to its power, it's like criticizing a bullet train for the pattern on the seats.


JVM world had XML before JSON became the rage.


- parsing xml may cause random network requests (firewall issues)

- xml contain comments, sooner or later someone will try to add data to comments, but you already have the thing for data, the format itself, adding data to comments is stupid

- unlike js, in json someone finally said "Enough! There will be only one type of quotes", it makes parser simpler


> "- xml contain comments, sooner or later someone will try to add data to comments, but you already have the thing for data, the format itself, adding data to comments is stupid"

That's easily the worst argument against comments I've ever heard! Would you argue against comments in code because some jackass could abuse it to insert custom compiler directives or something? Ridiculous. The good that comments do easily outweighs any harm some moron is going to do to himself by parsing comments for data.

I don't like XML any more than the rest of you, but praising json for not having comments just smacks of sour grapes.


The intentional lack of comments cements json's position as a (human readable) data interchange format. It's not designed to be anything else. Which is fine. There's nothing wrong with a file format being focused on one goal.


Indeed.

And the point of the article is that JSON is fine for data, but configs are not data. Configs are code.


Lack of comments just makes it a data interchange format. It’s comments which make things human readable.


But you don't _need_ to abuse comments for extra (or meta-) data if you're using XML, you just use a separate namespace.

It's definitely the case that XML has features that aren't needed -- or at least, we've survived without them thus far -- but there definitely feels like more than just a smidge of only realising later why some of that complexity was present. See also: npm vs pretty much any other dependency management system.


> parsing xml may cause random network requests (firewall issues)

inability to properly use a library should not count against it


Java is a domain specific language for converting XML files into stack dumps.


VSCode configs are a bad example for advocating JSON. They embed a custom DSL into JSON, presumably because using JSON structures to express the same things would be too verbose.

JSON5 makes things better, but it still requires too much syntactical noise and multi line string values are still not there.


Maybe it’s meant to be a feature, but instead what happens is you end up with declarative configs absolutely littered with custom metadata and huge and massively complex custom engines to decipher them, where it would have been simpler to include it in the config. They suck.


> "Lack of programmability" is a feature.

True... for end users, but absolutely not for other developers. I often see this mentality used as a crutch to baby developers from having to write original code.


Lack of comments was the reason I never liked JSON for configuration. JSON5 looks awesome.


So what's the author's proposed alternative?

IMHO JSON5 (JSON with comments basically) is fine for config files, VS Code is using this and they seem to be doing alright.

Addressing the article's points:

- Lack of comments: not an issue with JSON5.

- Readability: I think it's pretty readable. Are the quotes, semicolons and commas really an issue? Not for me.

- Strictness: just edit it with an editor that points at the errors quickly. Escaping quotes might be annoying, but I can't think of a possible alternative format where escaping isn't required at all.

- Lack of programmability: I've never had any problem with this, and again pretty complex and customizable applications like VS Code seem to be doing fine without this. This may even be considered a feature, as it forces you to keep the configuration well separated from the actual logic.


Personally I'm partial to the classic Windows-style INI file.

Because it was designed to be a human-editable configuration file format, it has a simple and intuitive format that is easy for both humans and machines:

  [Section]
  ; comment
  Key1=Value
  ; comment
  Key2=Value
  [Section2]
  Key1=Value
It is also easy for machines to non-destructively edit, maintaining existing comments and formatting by humans, and applications take advantage of that. That is a big plus in my opinion.
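
For what it's worth, the example above parses as-is with `configparser` from Python's standard library:

```python
import configparser

TEXT = """\
[Section]
; comment
Key1=Value
; comment
Key2=Value
[Section2]
Key1=Value
"""

parser = configparser.ConfigParser()
parser.read_string(TEXT)

sections = parser.sections()
keys = list(parser["Section"])
value = parser["Section"]["Key1"]
print(sections, keys, value)
```

Two caveats with this particular parser: it lowercases key names by default, and it does not keep comments when writing a file back out, so the non-destructive editing described above needs a different library or the Windows API.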


TOML is arguably an evolution of ini format.


The problem with ini is that it has no good story for arrays or associative arrays. TOML attempts to fix this, but it provides an ugly format for arrays and associative arrays.


I personally find the TOML solution quite nice. In part because you can always drop to JSON syntax to avoid excessive nesting of array tables.


It is pretty limited and the worst part is quoting. What if your value contains \n? What about UTF8?


Horrible for: UTF8, syntax checks, numbers (albeit JSON has this problem too for floats), deep nesting, arrays, strings as keys, consistency between different parsers, etc etc

Good for: comments, readability if keys limited, familiarity.


UTF-8 is only a problem if you're using the Windows implementation. I don't see what the problem with numbers is, either, pick a representation that suits you and use it.


I agree, but my second choice if INI doesn't suit someone is just the standard app.config in Windows (trivial to interface with in the .Net space), or storing the values in a table. I'd prefer any of these before JSON or the other alternatives.


No standard = deal breaker.

Comments might work in one parser but not in another. What's the point when you can do better with TOML?


> - Lack of comments: not an issue with JSON5.

Yeah, but now you have to find a tool that supports that. Easy on small projects, but try being at a company supporting many languages and frameworks.

> - Readability: I think it's pretty readable. Are the quotes, semicolons and commas really an issue? Not for me.

If you haven't tried TOML, you're missing out. Why on earth would you put up with syntax (hard to author, at that) that isn't necessary? Syntax distracts from the actual problem that this is trying to solve. JSON is fine for machine read/write, but horrible on humans.

> - Strictness: just edit it with an editor that points at the errors quickly. Escaping quotes might be annoying, but I can't think of a possible alternative format where escaping isn't required at all.

What if you have an issue in prod and need to fix something quick? You might only be able to open it up in a stripped-down vi, or if the machine is locked down you might only be able to materialize a plain text view. What if not everyone uses the same tool? This imposes a further constraint. Yet more concessions pointing to how horrible JSON is for humans.

> - Lack of programmability: I've never had any problem with this, and again pretty complex and customizable applications like VS Code seem to be doing fine without this.

Can you imagine projects where this might be an issue? These aren't the only codebases in the world. And just because VSCode appears to be doing fine doesn't mean they don't second guess that decision. They might just be locked in at this point.


Most languages have JSON5 libraries just as they have TOML libs.

TOML is decent but if you need nested data it gets super awkward.

> What if you have an issue in prod and need to fix something quick? You might only be able to open it up in a stripped-down vi, or if the machine is locked down you might only be able to materialize a plain text view. What if not everyone uses the same tool? This imposes a further constraint. Yet more concessions pointing to how horrible JSON is for humans.

JSON5 is pretty easy to edit correctly with any text editor. Not much harder than TOML.

> Can you imagine projects where [lack of programmability] might be an issue? These aren't the only codebases in the world. And just because VSCode appears to be doing fine doesn't mean they don't second guess that decision. They might just be locked in at this point.

Does TOML allow programmability that JSON5 doesn't?


This is the first time I've read about TOML. After checking out the syntax, I don't like the way you declare nested data, because if you need to move one block of data to a different parent you need to rewrite the full path of each table. If done wrong, this can lead to errors that are very difficult to find. OTOH, it can be easier to send partial information: just append this block to your config and you are done.


How do you tell if a file is json or json5? Feels better to forget about json for config and go with the obvious TOML, which has a single file extension and a single standard.


If you're trying to parse it, you don't need to tell. Just parse as JSON5, it's a strict superset.

And if you're editing a file, you can usually tell by the fact that there are comments in it. :)


Official json5 site does mention using .json5 extension to be obvious but I bet people will just use .json which will be confusing if the file is passed to another party and they get parse error until they realize it's json5. I think the only downside to json5 is that they inherited the name json instead of coming up with a different name.


While J4X would simply let you specify the version number in the DTD, of course! ;)

https://news.ycombinator.com/item?id=19656646


Generally, you swap out the config file at build time if that's a requirement. Just as a few examples - you can set up a dev/demo/prod(whatever you call these where you work) config file in .NET Core, Docker/K8S, Angular, etc. As far as I know, this is considered best practice.


There are many different needs.

  - static config
  - environment specification
  - runtime feature flagging 
  - semi-static control plane config 
  - rollout weights
Etc.

All of these have different characteristics.

Infrequently changed, a priori config that doesn't need to be changed in emergencies is suitable for baking in at compile time.


> What if you have an issue in prod and need to fix something quick?

Just fix it quick somewhere that’s not prod. Snowflaking has not been a good practice for a long time now, so arguing that a certain tool makes it easier is a bit of a moot point.


I'm not sure what the author would suggest, but I'm increasingly of the opinion that config file formats should be treated as a compile target, not a file that you edit.

There are formats that are easier to edit than others and formats that are easier to parse than others. But sooner or later all of them end up being created by an ill-suited template engine or embedding a programming language in the syntax (cloudformation, ansible, hcl2).

Why not treat the config file as a generated artifact that is readable but not written by a human? Tools like jsonnet (https://jsonnet.org/), dhall-lang (https://dhall-lang.org/), or my personal side project ucg (https://ucg.marzhillstudios.com/) allow you to reuse values and safely generate your configuration formats with comments and logic, all while bypassing the pitfalls of most formats.

They give you a purpose built language for creating configurations with comments and logic w/o putting that burden on your application later. And it allows you to ignore the parts of a particular format that cause you problems.


Create a good config file format and transpile that to a bad config file format for use in an application? I dunno I would rather just have the application use a decent config file format in the first place.

It's just a config file, no need to overthink this with translation layers IMO.


No good config file format will allow you to share common shared values across all of the formats. That database host everything uses? You'll have to copy paste it everywhere or use a template engine to generate all of the formats.

If you only have one application then it's overkill. When you have 10's of applications it's a life saver.
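To make the shared-value case concrete, here is a minimal sketch in Python using only the stdlib. It is not how jsonnet, Dhall or ucg work; it only illustrates the idea of rendering several formats from one source of truth. The names `SHARED`, `render_json` and `render_ini`, and the host value, are made up for the example.

```python
import configparser
import io
import json

# Single source of truth; the names and values here are hypothetical.
SHARED = {"db_host": "db.internal.example.com", "db_port": 5432}

def render_json(shared):
    """Render the shared values for an application that reads JSON."""
    doc = {"database": {"host": shared["db_host"], "port": shared["db_port"]}}
    return json.dumps(doc, indent=2)

def render_ini(shared):
    """Render the same values for an application that reads INI."""
    parser = configparser.ConfigParser()
    parser["database"] = {
        "host": shared["db_host"],
        "port": str(shared["db_port"]),  # INI values are always strings
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()

json_text = render_json(SHARED)
ini_text = render_ini(SHARED)
print(json_text)
print(ini_text)
```

Changing the database host then means editing one value and regenerating, rather than hunting through every file.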


Good point. However hopefully whatever brilliant file format the source of truth is in will eventually end up being able to be consumed natively by the other applications and we can forgo the translation layer. Until then yes you have a good use case there.


> I'm increasingly of the opinion that config files formats should be treated as a compile target not a file that you edit.

I think this makes sense as far as it goes.

The first choice, no matter what the config file format, should be editing within a tool. The tool can check values, be accessible, assist in coordination, etc.

However, I think it's essential that the intermediate format be human readable and have the ability to contain comments. Because if something in the toolchain goes wrong, determining what the settings are right now on a particular system will be very useful. Being able to modify those settings by hand, though not your first choice, may save time and money and (depending on the system) lives.

Also, if the tool breaks, you can still change configs (though obviously this is not what you want).

Essentially, you should have a tool that edits config values with full explanations of their purpose. That tool should output a machine-readable format (JSON5, YAML, TOML) that supports comments with each value preceded with the comments containing their explanations and, ideally, a list of comments made when changing config values with previous values listed.

You get the best of "both worlds."


> So what's the author's proposed alternative?

I am the author. It depends on your purpose; in general, I would say "the simplest solution that meets all needs", where "all needs" depends on what you're doing. In Python, this could be the configparser module, or parsing a Python file (again, depending on requirements).
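As a rough illustration of the configparser option, assuming a made-up `[server]` section with made-up keys, something like this is usually enough for simple needs:

```python
import configparser

# Hypothetical config text; the section and key names are made up.
INI_TEXT = """
# Comments are allowed, which is the main point here.
[server]
host = example.com
port = 8081
debug = no
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

host = parser.get("server", "host")
port = parser.getint("server", "port")          # values are strings unless converted
debug = parser.getboolean("server", "debug")    # understands yes/no, on/off, true/false, 1/0

print(host, port, debug)
```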

As a matter of taste, I prefer minimal syntax. I wrote a library with this syntax a while ago[1], although it's clearly not the best solution for every case.

I should probably write a separate article about this.

[1]: https://github.com/carpetsmoker/sconfig#what-does-it-look-li...

> Lack of comments: not an issue with JSON5.

JSON5 isn't JSON though. It's JSON5. None of the standard JSON parsers can deal with JSON5, you need a different JSON5 parser.
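A quick sketch with Python's stdlib json module shows what I mean. The sample document and the helper name `parses_as_json` are made up; the point is only that a strict JSON parser rejects the JSON5 extras.

```python
import json

# A made-up config that is valid JSON5 but not valid JSON:
# it has a comment, a single-quoted string and a trailing comma.
JSON5_TEXT = """
{
    // port the app listens on
    "port": 8081,
    "name": 'demo',
}
"""

def parses_as_json(text):
    """Return True if the strict stdlib JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(parses_as_json(JSON5_TEXT))        # the JSON5 text is rejected
print(parses_as_json('{"port": 8081}'))  # plain JSON is accepted
```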

> I think it's pretty readable

A matter of taste.

> just edit it with an editor that points at the errors quickly

It seems to me that having to rely on such tools for issues such as this is not a great idea, especially not if it's easy to do better.

> I can't think of a possible alternative format where escaping isn't required at all.

There is a difference between "escaping isn't required at all" vs. "having to constantly and awkwardly escape a commonly used character".

> Lack of programmability [..] VS Code seem to be doing fine without this.

I've not used VSCode, as I'm a Vim guy. I can't imagine configuring Vim with just declarative configuration, as so many useful parts come from being able to program Vim. This could be separated out to just an "extension" or "plugin" system, but I really like being able to just stick one or two lines of code in my vimrc and have it do something really useful, without writing a full plugin.


for my tastes, sconfig is the opposite of human readable. IMO, JSON is good enough for the average cases of configuration files.


I'm curious why you would consider it to be "the opposite of human readable"? It has a minimum of syntax to parse, so at least in my view that should make it easier, rather than harder?


There are cases when JSON is really unreadable. Multi line strings, strings with lots of escaping, etc. There are a lot of other formats that solve these problems (eg YAML, which certainly has other unrelated problems).

There are trade offs with the various formats. Think about your use cases and pick the one that fits them the best.


So many options out there, just look for config formats on github.

TOML is my personal favorite.


Agreed. I feel like everyone uses YAML, I hate it.

YAML feels like 10 steps back; no parser I've ever used has been able to accurately give me a line number for an error, white space issues make it impossible to work with without special editor features, and some of the types are just cryptic. I spend most of my time that's spent editing configs in vim over SSH and it can be challenging at times.

TOML resolves a lot of these issues in my experience but I'd still argue that by virtue of its capability, is still quite complex when trying to define configs for new applications.


And TOML does not have the security code execution vulnerability that is in YAML.


I once tried to use YAML as my format of choice. Then I read the spec. YAML: Not even once.


> TOML resolves a lot of these issues

The only complaint is that the name TOML isn't as cool as other formats.


Came across a pain point with TOML yesterday in use.

When trying out a CLI I've been writing recently, this time on Windows... the TOML config file (generated on first run) wasn't usable.

The problem turned out to be TOML not handling single slashes "\" in strings, instead treating them as escape characters.

Annoying as on Windows the paths are (eg): "C:\Program Files\stuff". Which then needs changing to "C:\\Program Files\\stuff" to work ok.

I can see both good and bad points of TOML's string handling there. But it's annoying it needs a workaround for one of the signature aspects (slash vs backslash) of windows. Especially when paths are very common in config files, regardless of OS. :/


Just use 'literal strings' instead of "basic strings".

TOML has two types of strings, each of which can be single line or multi line [0]. Basic strings allow escape characters, while literal strings do not (so there's no way to represent an apostrophe in a single line literal string). Multi line strings are like Markdown's code blocks:

    """
    This is a multi-
    line basic string.
    C:\\Program Files\\stuff
    """
    
    '''
    This is a multi-
    line literal string.
    C:\Program Files\stuff
    '''
[0] https://github.com/toml-lang/toml/blob/master/README.md#stri...


Thanks heaps, that looks like exactly the right solution. :)


Definitely


In cases where the file is written and read by the program, without the expectation that the user will typically edit or read the file by hand, sqlite is something that should be considered.

A sqlite file isn't an ascii text file that you can edit with vi, so that suggestion is likely to get a lot of jeering, but it's a serious suggestion. It's already being used for configuration on all mobile platforms. There exists a wide range of applications, particularly for windows users, where the common user is not a developer and is not meant to be editing configuration by hand.

Furthermore, access to the sqlite3 binary is very nearly as widespread as access to a vi binary. And self-styled developers who don't know SQL aside, it actually is a rather convenient format to edit by hand. Machine transformation of the config is obviously straightforward, and sqlite databases tend to be self-documenting anyway. Sane table and row names go a long way (obfuscated tables and rows would be roughly equivalent to obfuscated keys in an ini file...) and if that's not enough, comments for tables and rows can be stored in the schema.

The one downside is strictness; sqlite doesn't enforce types. It does however have data constraints, which help.
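
A minimal sketch of the constraint idea with Python's sqlite3 module. The `settings` table, its columns and the helper `try_set` are hypothetical, and this only handles integer settings:

```python
import sqlite3

# In-memory database for the sketch; a real app would open a file path.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per setting. The CHECK constraint stands in
# for the type enforcement that SQLite does not do by itself.
conn.execute("""
    CREATE TABLE settings (
        key   TEXT PRIMARY KEY,
        value INTEGER NOT NULL
              CHECK (typeof(value) = 'integer' AND value > 0)
    )
""")
conn.execute("INSERT INTO settings VALUES ('port', 8081)")

def try_set(key, value):
    """Store a setting; return False if it violates the constraints."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT OR REPLACE INTO settings VALUES (?, ?)", (key, value)
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(try_set("port", 9090))            # accepted
print(try_set("port", "not a number"))  # rejected by the CHECK constraint
port = conn.execute(
    "SELECT value FROM settings WHERE key = 'port'"
).fetchone()[0]
print(port)
```

The rejected write leaves the previous value in place, which is roughly the behaviour you would want from a config store.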


https://medium.com/@robmuh/yaml-has-won-ba5dae37e740

Seems like YAML won.

But TOML seems to be heavily used in Rust.

Also there is SANE.


I don't care if it has "won" in a few places, I'll be using TOML and mindshare can change who has "won" later.


In the Scala world, HOCON addresses a lot of these (except programmability, which I think is probably a bad idea).


The only config system I've worked on that ever felt even vaguely satisfying was built around the excellent Typesafe (now Lightbend) HOCON implementation. The fact that HOCON is a superset of both JSON and properties formats made adoption in a large org feasible since it could "just work" with the many already existing config files in those formats. While generalized programmability has obvious dangers, HOCON's ability to do variable substitution was super useful for the large amounts of config that was supposed to follow a convention, e.g. hostnames based on a template like

  ${service}.${env}.example.com
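
The same convention can be approximated outside HOCON. This is a sketch using Python's string.Template, which happens to share the ${name} placeholder syntax; it illustrates the idea only and is not HOCON's resolver. The function name `hostname` and the sample service and environment names are made up.

```python
from string import Template

# Same shape as the hostname convention above.
HOST_TEMPLATE = Template("${service}.${env}.example.com")

def hostname(service, env):
    """Fill in the convention for one service in one environment."""
    return HOST_TEMPLATE.substitute(service=service, env=env)

print(hostname("billing", "staging"))  # billing.staging.example.com
```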



Whole bunch of other ports into different languages too: https://github.com/lightbend/config#other-apis-wrappers-port...

Thanks for pointing me to that - I'm working in Scala on server-side and JS on browser-side for a couple of years now, and sharing config would be nice. Still need object merging and value concatenation, which hocon-js doesn't support, but I'll keep an eye on it!


It has a few features that feel like programmability but really aren't, and I think these address most of the use cases that people want a Turing complete language for.


I have a proposal that I'm sure will make EVERYBODY happy by combining the best of both worlds!

How about "J4X" (JSON for XML), that lets you embed XML in JSON the same way the ever-popular "E4X" (ECMAScript for XML) lets you embed XML in ECMAScript?

https://developer.mozilla.org/en-US/docs/Archive/Web/E4X

J4X Example:

    <!DOCTYPE j4x PUBLIC "-//W3C//DTD J4X 1.0 Transitional//EN" "http://www.w3.org/TR/j4x/DTD/j4x-1.0-transitional.dtd">
    {
        april: <fools late="13 days"/> <!-- sorry! -->
    }


>Escaping quotes might be annoying, but I can't think of a possible alternative format where escaping isn't required at all.

One nifty way of doing this is custom quote sequence.

In C++ it works like R"delimiter( raw_characters )delimiter"

As delimiter can be anything it can be chosen not to conflict with the payload.

Bash has had the same trick for here documents for a long while.
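
A small sketch of how a generator might pick such a delimiter, written in Python for brevity. The function name `cpp_raw_literal` is made up, and the sketch does not enforce the 16-character limit C++ puts on the delimiter.

```python
def cpp_raw_literal(payload, base="x"):
    """Wrap payload in a C++ raw string literal.

    The delimiter grows until the closing sequence )delim" no longer
    occurs inside the payload, so nothing needs escaping.
    """
    delim = ""
    while f'){delim}"' in payload:
        delim += base
    return f'R"{delim}({payload}){delim}"'

print(cpp_raw_literal('plain'))   # no delimiter needed
print(cpp_raw_literal('a)"b'))    # payload contains )" so a delimiter is added
```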


The alternative is javascript. Javascript gives you json with comments and programmability. And where possible .env.


A lightweight ini reader in js I wrote a few years back: https://github.com/cmroanirgo/jsinf


Yes - I have seen at some jobs what a horror "configuration files" can be if you let people add programmability into them.

Many years ago I worked at a job that made an XML tag-based language that was turing complete with the intent of it making their configuration "smarter". It ended up making the entire project much more difficult to debug, trace through, and even understand at all.


The lack of comments is a problem. The rest seems whiney to me, but sometimes I need to leave notes for other developers as reminders.


I formalize comments and keep them themselves as data. I don't use JSON for this, but it'd work out as a perfect solution here. There's also good reasons to do it this way. For instance the formalized comments can work as a tool tip for a user configuration UI. Another reason is to do something like automatically produce source code from the config file with the comments in a format such that they work with intellisense. The produced source code now also can trivially reproduce a config file with appropriate defaults. Lots of perks.
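
Roughly what I mean, sketched with JSON and Python's stdlib. The value/default/comment layout and the function names are just one hypothetical way to do it, not a standard:

```python
import json

# Hypothetical layout: each setting stores its value, default and comment.
CONFIG_TEXT = """
{
  "port": {
    "value": 8081,
    "default": 80,
    "comment": "8081 because the load balancer owns port 80"
  },
  "workers": {
    "value": 4,
    "default": 4,
    "comment": "number of worker processes"
  }
}
"""

settings = json.loads(CONFIG_TEXT)

def effective_values(settings):
    """Plain key -> value mapping for the application to consume."""
    return {name: entry["value"] for name, entry in settings.items()}

def tooltips(settings):
    """Key -> comment mapping, e.g. for a configuration UI."""
    return {name: entry["comment"] for name, entry in settings.items()}

def default_config(settings):
    """Regenerate a config file with every value reset to its default."""
    reset = {name: {**entry, "value": entry["default"]}
             for name, entry in settings.items()}
    return json.dumps(reset, indent=2)

print(effective_values(settings))
print(tooltips(settings)["port"])
```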


Wow, just looked at JSON5 and it's everything I could've ever hoped for! I remember distinctly looking for a JSON alternative that supports comments, trailing , and single quotes. This does all that and more. Now how long until it becomes standard (supported by python's default json library, browser's JSON.parse, etc).


Somehow not widely spread. I can't believe json was standardized with all those shortcomings.


Perhaps HCL[0] is worth a look. It has comments and no need to quote keys. I ported it to C++ and Lua for a project and it gets the job done for non-dynamic configs.

[0] https://github.com/hashicorp/hcl


Take a look at HJSON.


I agree that HJSON is better for config files than JSON5, because of its more relaxed syntax requirements that make adding and editing values quicker.

Link for HJSON: https://hjson.org/

Compare to JSON5: https://json5.org/


> a possible alternative format where escaping isn't required at all.

HEREDOC-like format:

#qweqwe123" string where everything is unquoted and is terminated by a " followed by qweqwe123. you can chose a short relevant sequence of alphanumeric chars as needed"qweqwe123#


EDIT: moved this comment to the root of the discussion https://news.ycombinator.com/item?id=19655249


Of course, JSON5 didn’t come out until 2017, and this was written in 2016.


I thought he would suggest yml but he didn't. yml is good but, personally, I don't like it. I find a json file easier to read than yml. It might be because of years of experience with json.


In my opinion, a statically typed config file would be preferable.



Yes I like Dhall, I also had a go at my own one a while ago: https://github.com/willtim/Expresso


It's like every other article I read these days takes a strong opinionated stance and then gives decent but not overwhelming support for it.

    Please don’t. Ever. Not even once. It’s a really bad idea.
Followed by a list of reasons of which lack of comments is the only one with any universal weight.


What's even worse is the lack of a proposed alternative. These days I'm getting sick and tired of "X is bad and you should not use it" articles with X being something popular that everybody uses/does, without proposing a decent alternative. Tell me what I should do, not what I shouldn't do. The first one is very valuable, the second is just giving yourself a tap on the shoulder.


As the author, the reason I removed some proposed alternatives is because people kept emailing me with stuff like "but what about X?" where "X" was something I never heard of, or was only superficially familiar with.

You make it sound like it's easy to just "propose an alternative", but it's not really. First off, some problems aren't necessarily obvious from just reading a few examples. YAML is a good example of this, with a lot of subtle behaviour that can really trip people up. This isn't really obvious from just a glance, and requires an in-depth review. So doing a full in-depth review of all alternatives is much harder than it might seem.

Secondly, there is no single right answer. It depends on what you're doing, the language and environment you're using, social factors (e.g. if all tools in your company already use XML, then it's probably a good idea to stick with that), and probably a bunch of other factors.

Alternatives that are better are also so plentiful that quite frankly, listing all possible alternatives strikes me as rather redundant. YAML, TOML, JSON5, or even eval(), and many more, all are decent alternatives for many different scenarios, although obviously not all.

That JSON is popular doesn't mean it's a good idea. I think there is a simple reason that it's so popular: for quite a few programming languages it's the only file format that's supported in the stdlib, and people are already kinda familiar with it. This is not an unreasonable choice at face value, but at the same time a lot of people don't always see the downsides at face value.

"Tell me what I should do, not what I shouldn't do" sounds like a half-wisdom. Sure, it's probably more useful. But at the same time, pointing out possible errors people haven't thought of isn't useless. I've actually received quite a few emails over the years with the gist of "I wanted to use JSON for my program, but then I read your article and decided it was a bad idea so I used YAML instead, thanks!"; so clearly, it's useful for some.


That’s a pretty silly stance. If I tell you not to touch a hot stove do you expect me to give you instructions on what to touch instead? And if I don’t are you going to burn your hand to spite me? “Well if you don’t have a better idea I’m just going to keep burning my hand”


Well if you need to do X, then yes, you need an alternative. The alternative of not doing anything is worse in that case (or at least that premise seems to be accepted by the author). So if you say do X but don't use tool Y, then you need the alternative.

Your stove metaphor breaks because the alternative of doing nothing has no consequences.


If you’re a professional you get paid to make these decisions. You don’t need a blog to do it for you.


Yes, because doing any research on anything is not what professionals do, they just do everything themselves...

Blogs are meant to offer insight and arguments, and all the top level comment is saying is that this blog did a bad job in that respect.


Advice on what not to do IS an insight. And there's an argument for why not. The idea that you have to give an alternative in order to not be "negative" or whatever when there's not a one-size-fits-all piece of prescriptive advice is just dumb.


You don't have to offer alternatives but it does improve the piece. No one said it had to be one-size-fits-all, there's plenty of room for nuance.


Shows how people just want to stand out from the average with elitism without decent back up facts.


It's almost as if different config languages have their pros and cons, and each can be useful in different use cases.

I do wish though that JSON5 would become more widespread, as that basically solves every downside of JSON.


Agreed - really folks, it's not going to be the end of the world if you use JSON for your configuration. Or XML. Or comma delimited text files. As long as you agree to it and you have it documented somewhere, I promise your codebase won't suddenly set on fire because of it.

Now, if you add programmability to your configuration files... that is asking for someone to abuse it and turn a simple configuration file into an abomination. I've definitely seen that happen.


Maybe you should be reading peer reviewed research papers instead of blogs.


Website is really wigging out on iOS.

Back on topic, first we’re not supposed to use text files. Then, no XML files. Then, don’t use YAML. Now, we’re supposed to stop using JSON.

At some point you realize the problem isn’t with the tech but with the programmer. We have this need to constantly reinvent the wheel all the time. It’s silly.


Exactly. Also, OP does not explain the need for comments in a config file. I mean, do people consider Apache config files -- which extensively use comments -- to be a shining example of clarity? Hundreds of lines long, impossible to parse manually. Good luck finding those 3 lines you need.

OP's argument is silly. JSON files are fine for config.


> I mean, do people consider Apache config files -- which extensively use comments -- to be a shining example of clarity? Hundreds of lines long, impossible to parse manually.

You're conflating two things, the readability of very large config files, and the readability of config files that allow comments.

> Good luck finding those 3 lines you need.

Personally, since I live in a world where searching text in a file is commonplace and trivial, additional text to match, possibly even comments that note where the config differs from the stock one that ships, are useful.


The comments of these config files are part of what make them unreadable, these are not two unrelated issues.

Yes, searching Apache config is pretty much the only reasonable way of finding what you need. But have we really gotten to the point where we passively accept gargantuan, unreadable config files because it's possible to search for the exact pattern you need?

What if there's a problem in your config and you need to find the issue? Prepare to set aside a few hours of your time to read every block.


I just looked at some random configuration file on my system. Right at the top, in the comments was the following:

    # Please don't edit this file directly (it is updated with every Red Hat
    # Linux update, overwriting your changes). Instead, edit /etc/...
Nice. I now know to make changes in a file that won't be overwritten by an update, and if there are issues, it's either with the site local file, or the one provided by Red Hat, and it's probably (although not exclusively) the site local file that is in error.


Have you never piped a config file through grep to exclude comments to get a streamlined view? It's pretty easy, and quickly gives you the best of both worlds. Configure your editor of choice to collapse comments and that's an even more convenient interface.

For stock configs that ship with a lot of comments, I would sometimes replace the file with one where I've stripped the comments. Since I always save the stock config as $FILE.dist, I get maximum usability. Never would I want the configs to not support comments though. Even after I've stripped configs, I might add my own comment to each change from the stock config, or for some test change I'm trying out. Not having comments means any documentation would need to be at least one degree removed from the item it's documenting. Sometimes that's acceptable, sometimes it's not.
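
For anyone who prefers a script to grep, the same streamlined view is a few lines of Python. This is a sketch that only handles whole-line comments, and the sample text and function name `strip_comments` are made up:

```python
# Hypothetical config text in the usual "#"-comment style.
SAMPLE = """\
# Stock setting, see the manual
Listen 80

    # changed locally: the load balancer talks to us on 8081
Listen 8081
"""

def strip_comments(text, prefix="#"):
    """Drop blank lines and whole-line comments, keep everything else."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(prefix):
            kept.append(line)
    return "\n".join(kept)

print(strip_comments(SAMPLE))
```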


OP does not explain the need for comments in a config file

The same reasons you'd want comments anywhere else. As one example, often in a package.json file I want to add a comment that a dependency is intentionally using an older version due to an incompatibility. The workaround is to use a key of "//", which is hideous: https://github.com/npm/npm/issues/4482
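
To show what the workaround costs, here is a sketch in Python of what any consumer has to do to ignore those pseudo-comments. The package contents, the pinned version and the helper name `strip_comment_keys` are made up for illustration.

```python
import json

# Hypothetical package.json using the "//" key as a comment.
PACKAGE_JSON = """
{
  "name": "demo",
  "//": "left-pad is pinned on purpose (hypothetical incompatibility)",
  "dependencies": {"left-pad": "1.1.3"}
}
"""

def strip_comment_keys(node):
    """Recursively drop "//" keys so they are not mistaken for data."""
    if isinstance(node, dict):
        return {k: strip_comment_keys(v) for k, v in node.items() if k != "//"}
    if isinstance(node, list):
        return [strip_comment_keys(v) for v in node]
    return node

cleaned = strip_comment_keys(json.loads(PACKAGE_JSON))
print(cleaned)
```

The comment is just another key as far as the parser is concerned, so every tool that walks the data has to know about the convention.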


Agreed that workarounds are hideous, but no, it's not immediately obvious to me we need configuration data (which should be self-explanatory) glossed with comments.

Username? Got it. Password? OK. Hostname? Sure. Why do these things need comments?

If there's subtleties to the config, put that in the README. Or give better names to variables.

Functions are not always self-explanatory and get complicated fast, so comments in programming make a lot of sense and are helpful. A config file in theory should just be a series of variable assignments; it's not clear to me why these require detailed explanations.


Comments in config are useful in the same way that comments in code are useful --- to point out things that are non-obvious to other people who work on the same config.

Things like listen on port 8081 instead of port 80, because the loadbalancer does something weird. Or this strange config is important because of feature X and blah. Or if you change this here, be sure to change it there.


To add to what 'toast0 wrote - code is about what; comments are about why. Even if your program has config so trivial that they're clearly data and not code, data alone doesn't tell you why it looks the way it looks. "username=admin" doesn't need explanation about what it was, but it might need an explanation if "admin" isn't what's typically expected as user name here, and is e.g. your specific workaround for something.


Apache conf files in practice suffer from people editing from the 'example' file to what they want and leaving all the embedded documentation.

When you build a file from scratch, or delete the embedded documentation as you go, you end up with reasonably compact configuration.

Additionally, apache has like thousands of options because it's software built to adapt to the world, and not software built for the world to adapt to it, so that adds to length.


> Also, OP does not explain the need for comments in a config file.

He does explain, there is even an example.


No, he complains that comments are not possible and the example he references is about the workaround some people use but he accepts a priori the need for comments in a config file. Why do I need explainers in my config file?

How about putting instructions in the instruction file (README) and configuration data in the config file?


I just checked, and I have over 100 config files in /etc. Does that mean I need to have an additional 100 README files?


You can die from vitamin poisoning. Doesn't mean that vitamins are bad. Similarly, too many comments are obviously not a good idea, but that doesn't mean you never need them.

There are many cases where you need an occasional comment: to explain why a possibly surprising value is set to what it is, for example. Or to warn future editors about a mistake you made in the past, or even just to describe what a potentially confusing setting does.


Comments in most /etc files on Linux are fantastic - you can skim the config file and often find what you need, and you often know what is irrelevant, and the comments often warn you against common failures.


Just mmap C structs to files and give people the headers!



mmapping and protocol buffers are in no way comparable... You have the downsides of a predefined schema and the downsides of wasting CPU time on serializing/deserializing. What you're looking for is flatbuffers or cap'n proto which can be mmapped directly while staying independent from the ABI of the compiler.


Thanks for the links, though I stand by my snark


I still have nightmares from back when this was standard practice.


What if as time goes on and memory and computation get cheaper, we find better ways to do things and it would be good if the industry thought about that at a technical level rather than a business level.


Huh? This “we” and the supposed “them” doesn’t seem well defined outside your mind. Nobody said don’t use JSON, he said don’t use it for config... for very specific reasons. Just use something designed for storing configuration data.


If you're looking for a configuration language that is powerful and useful AND can interface with existing YAML based systems, may I recommend Dhall?

https://github.com/dhall-lang/dhall-lang

Dhall gives you the power of a limited programming language, the power to version configs and perform generation and logic in your configs, gives you the ability to spread your configs and custom functions for config generation across multiple machines safely, and does so with good guarantees about the type system and does a good job of stopping potentially infinite loops.

Dhall is really, really good and very overlooked by the industry just because its syntax is functional.


What kind of things have you used Dhall for before? I think it's pretty interesting but we don't have enough problems with YAML to really switch.


You might enjoy the examples at https://github.com/dhall-lang/dhall-kubernetes


Here I am still using INI and/or XML configs like it's 2000. Programs can read/write if needed, comments, parsers galore. I still can't see how these are "broken". My biggest driver is the comments issue - nearly all non-json config files (ini, XML, rc, conf) for projects I use have loads of valuable instructions in them, right next to the thing that needs that documentation. Currently, all JSON ones I observe are missing that


TOML is the popular modern alternative to INI. It's a more rigorous spec and has a few useful extensions.


Yea, yeah...I see TOML and YAML a lot too - I also prefer those to JSON for config files.

I've been rocking the ini file config for 25+ years, it just works, zero bugs, and I can get on with building cool stuff.


YAML I think is too full featured. It does way too much for a config file format. INI and TOML have the advantage of being simple and focused. My only issue with INI is that different parsers aren't always entirely compatible with one another. But that's mostly only a problem for edge cases.


Lucky your parsers have been behaving as you expect when the format isn't even standardized. You can keep building "cool" stuff with a bit more solid format as well.


Almost 200 comments as of now, but no one mentions Lua tables as an alternative to JSON?

Yes, you have to include the runtime, but the size of it is all but guaranteed to be smaller than whatever library you need to parse YAML or TOML or XML (or JSON in many languages.) Smaller than some fonts.

The syntax is almost the same as JSON, but you can use comments and multiline strings. There are also APIs for just about every language imaginable.

You do sacrifice an enforced lack of programmability unless you're self disciplined, though, and Lua does index at 1 instead of 0, which is annoying and objectively wrong, but still less of a gotcha than some of the crazy stuff YAML does.

... that said, I do also feel like anything touching the web stack at all should probably be using JSON just because javascript engines already parse it.


I came here looking for other lovers of Lua.

Configuration is in Lua's roots. It is specifically designed for use as a self-documenting configuration language.

I think the issue is that people just don't know how to use Lua to load config files. ;) But the fact is, there are few other better options than to include the Lua VM and then put all configuration management in .lua files on the target system.


I use Lua for configuration (even at work---ops has no problem with it). It even allows trailing commas. You can even use semicolons if you want.

As for the index issue, you might find it annoying, but I would argue the "objectively wrong" bit (there are other languages, like Fortran, that use 1-indexing).


Lua's great, but if a thousand lines of C or 500 of python can parse TOML then I doubt the average parser gets anywhere near the size of the Lua runtime. Also you have to supply your own encode function.


After spending the last week fiddling with YAML files, please do. White-space dependent config is a complete nightmare, and XML is generally overkill. JSON caught on for a very good reason; it is the most succinct possible human and machine readable text based data format short of just writing S-expressions.


Toml is so much better. Editing json with the default vim configuration is really error prone. But I'd rather have json than yaml.


Really tired of cutting the last element of an array and getting an error because of that last little comma in JSON. TOML to the rescue.


I far prefer reading and editing YAML over JSON.


The problem with not using JSON is that the alternative almost always is "some proprietary format" or "an undefined dialect of something that looks like a well-known format".

I consider that significantly worse. With JSON, you at least can assume that the person reading it will understand what the file means. With a proprietary format, especially the popular "key=value" format, it gets super murky - what are the rules for quotes in this dialect (are quotes required, optional, or interpreted as part of the value, which escape characters), will white space be included, etc.

Lack of comments does seem like a very valid point.

I don't see the issue with readability. It is not great but still OK without syntax highlighting, and superb with highlighting (which works because it's a wide-spread standard format, not some weird custom dialect).

The strictness is mildly annoying, but violating it will make the config obviously invalid, which means it can be rejected instead of silently breaking production because you used a single quote and this dialect's parser only considers double-quotes, so your single quote was considered part of the string.
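To make the "rejected instead of silently breaking" point concrete, here is a minimal sketch using Python's standard `json` module. The helper name `load_strict` and the sample strings are made up for illustration; the only thing relied on is that `json.loads` raises `json.JSONDecodeError` on single quotes and trailing commas.

```python
import json

def load_strict(text):
    """Return (config, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        # lineno/colno point at the offending character in the input
        return None, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"

good = '{"name": "demo", "port": 8080}'
single_quoted = "{'name': 'demo'}"          # single quotes are not valid JSON
trailing_comma = '{"ports": [80, 443,]}'    # neither is a trailing comma

for sample in (good, single_quoted, trailing_comma):
    config, error = load_strict(sample)
    print("ok" if error is None else f"rejected ({error})")
```

The two malformed inputs never reach the application as a half-parsed config; they fail at load time with a position in the file.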

I don't know of many projects that use programmable configs, and many attempts to achieve this resulted in horrendous nightmares.

Seems like a superset of JSON that remains a subset of JS (like JSON5) can address most concerns, and can still be syntax-highlighted with a JS highlighter.


> The problem with not using JSON is that the alternative almost always is "some proprietary format" or "an undefined dialect of something that looks like a well-known format".

Sorry, that’s nonsense. JSON is young.

We lived happily with many easy to read & parse file types before JSON arrived. Everything has changed because CPUs are fast now (so parsing overhead isn’t noticed) and storage isn’t worried about (JSON is bloated).


Json is the worst except for all the other options. It seems too late now but json with comment blocks and a schema notation would be ideal.

I say too late because there's no migrating the world's json parsers at this point. We could decide to all agree on a schema and comment format that uses the existing syntax with "#comment" and "#schema" fields but there doesn't seem to be a defacto standard.


What we tend to forget in this debate is, human readability doesn't only mean "read and understand", but "read, understand and mentally validate". Now I'm not so sure that for example YAML is as easily validatable as JSON.

To me the simplest and the easiest to validate used to be the old good INI but unfortunately it never got its chance to be strictly standardized to support nested structures. JSON is the next readable/validatable but it is slightly worse in all respects.


> To me the simplest and the easiest to validate used to be the old good INI but unfortunately it never got its chance to be strictly standardized to support nested structures.

That is a feature! If you need to "nest" structures, you are not really writing a configuration file.


Since I started using Clojure, all my configs have become EDN (https://github.com/edn-format/edn), which is what JSON is to JavaScript. And it is awesome.

So I can't see why JSON would be a bad choice either.

There are some things in EDN missing in JSON, richer types, can be extended by the user and support for comments mostly. I'm not sure if not having those would be a deal breaker though.

I've used YAML for some things, and I haven't found it better. In fact, it's annoying to have a different syntax for your config than you do for your code. And the significant whitespace has definitely been a source of frustration.


EDN is fantastic, though I haven't used it in a project in a long time. If there's one thing JSON (and numerous programming languages in general) can learn from EDN, it's that whitespace is the ideal separator in lists. Commas are an anti-pattern that adds its own level of management (hence the trailing comma issue) that can be sidestepped entirely.


Yeah, I really like EDN for configuration. It is a nice sweet spot between JSON (too much syntactic noise to be human friendly) and YAML (human friendly, but much harder to parse).


Besides comments I see none of these as negatives.

JSON is plenty readable for most in my experience and there are plenty of great readers/prettifiers for it.

Having flexible configuration files sounds like a terrible idea. I don't want my mistakes to fail silently. Not to mention that the only problem the author mentions is trailing commas...

as mentioned better by others, lack of programmability is a feature, not a bug. Configuration files are presets and markers, not functionality.


Some valid points, but the only one that pains me are comments, the rest is fine.

I don’t want to program my config files, it goes against the whole idea of configuring stuff.

JSON being JSON makes it language agnostic and easily shared (via APIs and whatnot).

And as for the changes argument - try inline commit log in a file editor (eg GitLens in VS Code). Not perfect, but it does give you some history at a glance.

(The point above is weakened by JSON’s lack of dangling comma support, so one-line changes often affect two lines).


Switch your parsers to HJSON, and never look back :)


Can’t. It’s either a fully standard file or something that will fail at some point.


For those looking for JSON-like alternative, I recommend HOCON: https://github.com/lightbend/config/blob/master/HOCON.md


I just made some JSON config files. They're always/normally written by software but should be readable by humans. I have no regrets.

PS They do have "comment" fields that record when the file was created. I don't find it "just damn ugly" at all.


I made an 'extended' JSON format, and parsing library many, many moons ago. I actually want JSON as config file, it's nice, is strict, it's easy to edit, it's hierarchical, it's nowhere near as arcane as XML, it's not as stupid as flat [config].

I added comments, a way to have node names that don't require quotes, hex constants, base64 blobs etc etc and even have annotated nodes (for translation and that sort of things).

I know it's not purist, but I never tried to shove it down people throats, it just fits my use case(s)

https://github.com/buserror/libejson


This looks nice!

Is there a way to have comments, such that when the file is read and then later written (by the program), the comments are preserved (even if some of the fields changed)?


'saving' comments is difficult; and no the API doesn't handle that, but it's a pretty cool idea I'll look into it! I mean, all i can do is call the 'slaved' API with the comment content and let them re-generate them....


i will never stop shilling for EDN: https://github.com/edn-format/edn

it has no comma issues (commas are whitespace), it has comments, it's very structured, it has a better set of primitive types and literals for them, it has native support for tagged values for encoding more complex types, it's got first class support for encoding computation by virtue of being a subset of clojure - lists can represent function calls, the usual lisp shenanigans, it has the readability of yaml and none of its drawbacks.

it's a shame EDN is not more widely used.


It does have some questionable Clojure-isms (e.g. why have both vectors and lists), but it's certainly much better than most contenders out there. Unlike JSON, it didn't throw the baby out with the bathwater when it comes to XML features beyond syntax and data model idiosyncrasies - it still has namespaces, for example, making it trivial to design extensible data models.


Vectors and lists are very different data structures with different performance characteristics.


In-memory, yes, but not when it comes to serialization. As data, both are semantically just sequences of values for which order matters. Indeed, you generally don't see this distinction in most serialization formats.


> both are semantically just sequences of values for which order matters

you've just described strings, shall we get rid of them too? :)

i find it weird that you would complain about presence of some feature which in no way interferes with any other features and can be completely ignored without any consequences. was there something else you wanted to use parens for in EDN serialization format?


Complexity matters for serialization formats - the more complex the spec is, the harder it is to implement a parser for it, and the larger that parser is. Serialization frameworks can't just ignore features without consequences. Thus, redundant features always draw attention - this is no different than e.g. XML having both attributes and elements, which is also often brought up as one of its deficiencies.

But I don't consider it a major deficiency, just a nit. I mean, in the comment that you've responded to, I literally said that it's one of the best contenders for a human-readable serialization format today - it's certainly much better than JSON, or XML for that matter. It's just not perfect.


> Lack of comments

json5 to the rescue: https://json5.org/

> It’s just not that readable. Sure, it’s readable for a data-interchange format, but not readable for a configuration file.

I'm of the opposite opinion. I find json to be the easiest to read, as opposed to yml and xml.


It’s a bit lesser known but I’m a fan of HOCON (human optimised config object notation). https://github.com/lightbend/config/blob/master/HOCON.md

If that’s not an option I’d prefer xml (with some sadness). JSON is a bad idea for the reasons described in the article, and YAML is too clever for its own good and is definitely a loaded foot-gun.


HOCON is great. It can be incorporated into many programming languages. And it supports inheritance, includes, overriding of environment variables and so on.

I think that this library found a very good compromise between configuration and programmability.

In configuration files you want overrides, includes, comments and inheritance. May be very very simple pattern matching.

Yet, you do not want flow control structures such as if-statements/loops & gotos.


Hjson ( https://hjson.org/ ) addresses a lot of this. Just allowing trailing commas and adding comments reduces 90% of the friction involved.


And most importantly it can consume JSON just fine. No need to learn yet another language.

Also fully convertible to json (without comments) if there is ever a need to change format or to interface with another system that can only understand json.


People have tried every point on the JSON-to-full-programming-language spectrum, I put some examples in this post: https://blog.ometer.com/2015/09/07/json-like-config-a-spectr...

JSON5 is new since that was written I think... there are a bunch more of them too.

As part of the team at lightbend that came up with HOCON I still like it because it lets you use env variables (handy for Docker, Heroku, etc) and has some don’t-repeat-yourself capability without going to a full programming language.
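For readers who haven't seen the env-variable idea in practice, here is a rough sketch of the general pattern in Python. This is not HOCON and does not use its syntax; the function name `apply_env_overrides`, the `APP_` prefix and the sample keys are all assumptions made up for the example.

```python
import json

def apply_env_overrides(config, env, prefix="APP_"):
    """Return a copy of config where PREFIX+KEY in env replaces top-level keys."""
    merged = dict(config)
    for key, default in config.items():
        raw = env.get(prefix + key.upper())
        if raw is None:
            continue  # no override set, keep the value from the file
        # Parse the override as JSON when the default is not a string, so that
        # "9090" becomes 9090 and "true" becomes True.
        merged[key] = raw if isinstance(default, str) else json.loads(raw)
    return merged

defaults = json.loads('{"host": "localhost", "port": 8080, "debug": false}')
fake_env = {"APP_PORT": "9090", "APP_HOST": "db.internal"}  # stand-in for os.environ
print(apply_env_overrides(defaults, fake_env))
```

In a real program you would pass `os.environ` instead of `fake_env`. The file stays declarative; only the deployment-specific values come from the environment.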

(If you’re going full language, to me why not use a real one, maybe the one your app is written in... but there are definite tradeoffs to using a language, like difficulty writing automation that understands the config, and it assumes anyone changing config is a full-blown developer)


Slightly off topic but did anyone else notice this ligature? https://imgur.com/vtOdiCE

What in god's name is that??


I love that "ct" ligature so much! Also "st". Sometimes you find "sc".

I persuaded the author of Linux Libertine to add the first two as "historical" alternatives.

https://english.stackexchange.com/questions/25118/is-there-a...


And that's right after "readability is important". ;-)


XML, which I consider to be unreadable by both machines and humans.

Really?

Depends on how it’s structured, but I’d say XML is much more READABLE than JSON, although it requires much more typing to write by hand.

And it was a much more rigorous data format than JSON for many years before any of the more modern things like JSON schemas came about.


XML has comments.

XML can be programmable. You can make <If> and <Loop> constructs if you so desired while being trivially parse-able.

XML can be arbitrarily strongly typed.

JSON really was a step backwards in many ways. I think the main reason JSON took over was that it looks nicer and is easier to type by hand (as if everyone doesn't have editor macros for XML).


Imagine if HTML was in JSON instead of something related to XML.


> Lack of comments

Actually I consider that a feature. If some piece of software requires some complex configuration I have to refer to some external documentation anyway. Many projects include the documentation (or parts) as comments in their configuration files (e.g. Postgres' pg_hba.conf), but I think it makes it harder to find the interesting parts.

> Readability

I think JSON is pretty readable, even for non-programmers, but this is highly subjective.

> Strictness

Good! Parsers are too lax anyway.

> Lack of programmability

Again, that's a feature: just keep it simple. Now JSON is not perfect (I would like to have a distinction between integers and floats), but it gets a lot right and it supports the most important data types and structures

  - strings
  - numbers
  - dictionaries
  - lists
that easily map to data structures found in higher level programming languages.


> If some piece of software requires some complex configuration I have to refer to some external documentation anyway.

Except something seemingly simple which doesn't attract your attention might have a weird reason for existing - for instance you might have dependency which has to be pinned to a certain version due to compatibility issue or an existing bug. Would you rather spend 2 hours troubleshooting that or maybe put a comment there so that the next person can easily understand the core issue and evaluate it in 5 minutes?


> If some piece of software requires some complex configuration I have to refer to some external documentation anyway

Where do you document your use of that external document? In a configuration file that supports comments, you can do it in a comment:

  # See Necronomicon, page 751 to determine these parameters


Here's a very real issue we had at work. I wrote a program (and its configuration file) to process SIP messages. We installed it using the default port of 5060. It kept failing. After much investigation, it was revealed that some piece of router gear was "helpfully" processing any SIP message it saw on port 5060 (and it couldn't be turned off because of a bug in the router).

The fix was easy enough---change to a nonstandard port. But sans comment, there's no indication why we were using a nonstandard port. With a comment, the ops guys know why we're using a nonstandard port, and even ask "is that still an issue?"

Edit: clarify why we did the workaround we did.


Would be nice if you could add a few suggestions of languages that you consider more suitable for this task.



This is one of my favorites too.

HCL and HCL2 from hashicorp have a similar feel -- which makes sense, since they were inspired by ucl and nginx configs apparently (so says the readme).


But XML is horrendously overdesigned and verbose.

YAML has dangerous type coercions and tedious indenting.

TOML has bizarre formatting for arrays and associative arrays.

And other formats have too-small adoption.

We know that every config format is bad, but we appear unable to correct this bone-headed simple problem.


My largest hang up is I LOVE to comment my configuration. So JSON is out by definition. YAML seems to work just fine for all my use cases and is less verbose and more human readable FTW.


IMHO, YAML tries to be too clever. E.g. the fact that it happily guesses types and doesn't require strings to be quoted creates tons of traps.

    languages:
        - en
        - de
        - no
    portmappings:
        - 8080:80
        - 2222:22
parsing the above, "no" is suddenly a boolean, and YAML 1.1 parsers will turn "2222:22" into the number 133342 (because there was a helpful rule that thinks the latter means "2222 minutes 22 seconds" and turns it into the number of seconds... https://docs.docker.com/compose/compose-file/compose-file-v2... )
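The arithmetic behind that number can be reproduced without a YAML library. The sketch below is a simplified imitation of two YAML 1.1 resolution rules (booleans and base-60 integers), not a real parser; the function name and the reduced boolean table are my own simplifications.

```python
import re

# YAML 1.1 base-60 integers: a leading number, then one or more ":NN" groups
# where each group is at most 59.
SEXAGESIMAL_INT = re.compile(r"[-+]?[1-9][0-9_]*(:[0-5]?[0-9])+")
# Simplified subset of the YAML 1.1 boolean spellings.
BOOLS = {"yes": True, "no": False, "on": True, "off": False,
         "true": True, "false": False}

def resolve_yaml11_scalar(text):
    """Guess the type of an unquoted scalar roughly the way YAML 1.1 does."""
    lowered = text.lower()
    if lowered in BOOLS:
        return BOOLS[lowered]
    if SEXAGESIMAL_INT.fullmatch(text):
        sign = -1 if text.startswith("-") else 1
        value = 0
        for part in text.lstrip("+-").replace("_", "").split(":"):
            value = value * 60 + int(part)  # 2222:22 -> 2222 * 60 + 22
        return sign * value
    return text

for scalar in ("en", "de", "no", "8080:80", "2222:22"):
    print(scalar, "->", repr(resolve_yaml11_scalar(scalar)))
```

Note that under this rule `8080:80` stays a string, because 80 is not a valid base-60 digit group, while `2222:22` becomes 133342. Quoting the values avoids both surprises.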

I really liked it initially, but after using it a bunch I'd prefer a more restricted subset of it. (also hit a bunch of parser bugs/incompatibilities, where the complexity of supported cases probably also plays a role)


To be clear, both of those issues are just with YAML 1.1. The core schema of YAML 1.2 dropped parsing "no" as a boolean, as well as (as you mentioned) parsing colon-separated numbers as sexagesimal integers. But yeah, unfortunately a number of libraries that only support 1.1 are still being used.


I want completely different things from the configuration format, depending on the software I'm configuring.

JSON is great if I'm going to want to generate configs programmatically much of the time but still want to be able to view or edit them manually now and then. JSON (+comments) is a fine config format for vscode since it's (among other things) a specialized JSON editor, and it gets to leverage functionality it already has (JSON schema support) towards making its configuration management pleasant.

If the config is going to go in a user's home directory, like ssh or fontconfig, or for services like nginx and polkit where other software is going to want to add some configuration, then support for config glob directories (e.g. conf.d) is very handy, so that different system packages or gnu stow directories can provide their config via file tree merging.
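A conf.d-style loader is only a few lines. This is a minimal sketch assuming JSON fragments and a shallow, last-file-wins merge; the function name `load_conf_d` and the file names are hypothetical, and real tools each define their own ordering and merge rules.

```python
import json
import tempfile
from pathlib import Path

def load_conf_d(directory):
    """Merge every *.json file in directory; later file names win."""
    merged = {}
    for path in sorted(Path(directory).glob("*.json")):
        merged.update(json.loads(path.read_text(encoding="utf-8")))
    return merged

# Simulate two packages dropping their own fragment into the directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "10-defaults.json").write_text('{"port": 8080, "workers": 2}', encoding="utf-8")
    Path(tmp, "50-site.json").write_text('{"port": 9090}', encoding="utf-8")
    settings = load_conf_d(tmp)

print(settings)
```

The numeric prefixes are the usual trick for controlling precedence: each package owns one file and nobody has to edit a shared one.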

Some software gets a ton of value from using a scripting language as its config format, although in general it's something to avoid. Vim and emacs wouldn't be what they are with a less powerful config format.

Some programs need just a tiny sprinkle of config, just to specify the socket to bind on and maybe some database credentials. INI files or even just environment variables are ideal for this sort of thing.

JSON has its place, the problem is that it's too easy to use. Programmers automatically turn to it without thinking about what their software needs from its config format. It's the not thinking that's the problem.


.ini files are the only good option.

I did a thing that used JSON for config years ago when I was young and stupid (60). Terrible choice. Someone in this thread said that config files need the amenities of source code. I agree. However it also needs to be readily machine generated in case your system is successful enough to support a UI for configuration.

.ini hits the sweet spot for me. I can edit it reliably. It's extensible in every way I've ever wanted.

I looked at TOML, mentioned below. I would like the array syntax but when I saw the Wikipedia article's example including something with quotation marks I realized that complicating it with array syntax means that you hop down the slippery slope to having semantic cruft to distinguish the types.

For .ini, everything is a string, period. You always know what you have. No debugging.

In fact, I wrote myself a .ini interpreter that does some obvious stuff to make it useful in Javascript. Case-insensitive interpretation of true/false. Things that can be a number become one.
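Something along those lines, sketched in Python with the standard `configparser` rather than the JavaScript described above. The `coerce` and `load_ini` names and the sample file are made up for the example.

```python
import configparser

def coerce(value):
    """Turn 'true'/'false' (any case) into bools and numeric text into numbers."""
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for convert in (int, float):
        try:
            return convert(value)
        except ValueError:
            pass
    return value  # everything else stays a plain string

def load_ini(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: {key: coerce(val) for key, val in parser[section].items()}
            for section in parser.sections()}

sample = """
[server]
; comments sit right next to the setting they describe
host = example.org
port = 8080
verbose = TRUE
timeout = 2.5
"""
print(load_ini(sample))
```

Even this small amount of guessing has edges: in Python, values such as `inf` or `1_000` would also be turned into numbers, which may or may not be what the person editing the file meant.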

But then, slippery slopes galore. I added hinting to let me convert objects with number property names to arrays. Now I spend my entire life realizing that I didn't add the hint for this recently, quickly added case.

I'd be better off just always converting it where I need to use it.

It's my opinion that changing anything about the .ini is always premature optimization.


Depending on what you are making, there are different formats that you can use, such as: YAML (which is a superset of JSON; however, it does not use the same comment format as JavaScript), X resource manager format (I wrote two implementations, both of which are smaller than the one in Xlib, and can be used independently of Xlib), a C include file (but then you have to recompile the program to configure it, although it makes the program more efficient), INI (some programs use a simplified version without section headings), RDF, TOML, executable code (especially if the program is written in an interpreted programming language, such as JavaScript, or embeds one, such as perhaps SQL if it is a database program, although SQL generally doesn't seem to be a very good configuration format), etc. What is best depends on what the configuration file is for, I think.

I haven't heard about JSON5 before now, and I don't know what is JSON5 (I read in here apparently JSON5 is JSON with comments; what format of comments (is it same as JavaScript), and does it accept trailing commas?). Also, XML is OK for markup, but it is often used for other stuff where other formats are often better (although maybe you have some case where the other options would be worse than XML for the configuration format).


I've been attempting to use a new project for json as a dsl.

https://jsonnet.org

It's pretty amazing and I'd recommend it.


Why isn't json5 used more? It should be pushed until it becomes the defacto standard.

https://json5.org/


JSON5 is okay, but it has different goals than JSON. I'm not sure if it should replace it.


Serialization vs configuration?

Isn't json a subset of json5? If so, why not write out json when serializing, but always read json5 so you never run into any surprises? Any drawback with this approach?


I think the issue is that JSON is already harder to parse than you'd think; see: seriot.ch/parsing_json.php

While JSON5 doesn't introduce huge changes as such, it does make it more complex. There is also the matter of semantics/API; should json.parse() accept JSON, JSON5, or both? What if we expect JSON and someone sends JSON5; should we accept that? Postel's law says yes, but personally I think that would be a mistake.

It seems to me that it's easier to maintain a clear distinction between the two (json.parse(), json5.parse(), etc.) One of the values of JSON as a data interchange format is that it's fairly simple, and adding JSON5 features – none of which are needed for this usage – probably isn't a good idea.

Also note that there are many JSON supersets. YAML is a JSON superset: any valid JSON is also valid YAML. There's also HOCON, and probably a few other things as well.


Let's also exclude YAML, which has all sorts of security issues. TOML seems to be the best config file syntax. Easy to read, does not try to be Turing-complete :-)


I agree with this except for the last point. "Lack of programmability" is a good thing. Mixing configuration and code is a recipe for a tangled mess.


I believe every programmer, once they have implemented a config file format, then added control of flow, assignment, and/or statements should get a patch like in the boy scouts. It should have "never again" in latin on the bottom.

Randy's law 637: Every configuration file format (read by some language 'L1') eventually gets control of flow, assignment and statements. This format then forms language 'L2'.

Proof: See Unix.


Why? I really don't get this article. I have been using it for years without any issues for multiple businesses. Easy to read and easy to parse (any language)


This article is silly. I honestly think a minor tweaked version of JSON will be the future for config files. Probably JSON5 which adds trailing commas and comments.


A "tweaked version of JSON" is not JSON, it's a "tweaked version of JSON". JSON5 is also not JSON; it's JSON5.

This is like replying with "but a changed version of this beer tastes great" to "I don't like this beer". Okay, but that's not what I'm drinking.


You might like this: https://jsonnet.org


The problem with comments: suppose I build a config UI to edit the configuration file using fill-in-forms. How do I display the comments in this UI? I can't display classic line or block comments easily, because it isn't always clear which setting the comment is attached to.

The author says something like "__comment" is ugly for humans editing the file. Sure, it is. But it works much better if the file is also going to be edited by a UI. (Although "__comment" can comment the object but not an individual key; something like "name#comment" being comment on "name" key and "version#comment" being comment on the "version" key could permit separate comments on each key.)
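A config UI could consume that "key#comment" convention with very little code. A minimal sketch in Python; the `split_comments` helper and the sample keys are hypothetical, and it only handles one level of nesting.

```python
import json

def split_comments(obj, marker="#comment"):
    """Separate real settings from 'key#comment' annotations in one object."""
    settings, comments = {}, {}
    for key, value in obj.items():
        if key.endswith(marker):
            target = key[: -len(marker)]  # the key this comment describes
            comments[target] = value
        else:
            settings[key] = value
    return settings, comments

raw = json.loads('''{
  "name": "demo",
  "name#comment": "Shown in the title bar",
  "version": 3,
  "version#comment": "Bump on every schema change"
}''')

settings, comments = split_comments(raw)
for key, value in settings.items():
    print(f"{key} = {value!r}   ({comments.get(key, 'no comment')})")
```

Because each comment is tied to exactly one key, the UI can show it as help text next to the field and write it back unchanged.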

I actually am encountering a similar problem at work right now. We have a complex XML config file. Some people edit it via an XML editor, others edit it using a fill-in-form UI. With the fill-in-form UI, we'd like to display (and permit editing of) the XML comments, but it isn't always possible to work out programmatically which XML element(s) the comment is talking about.

Compare:

    <!-- The following setting is blah blah blah -->
    <foo-setting>32</foo-setting>
    <bar-setting>64</bar-setting>
with:

    <!-- Email Configuration Section -->
    <mail-host>mail.example.com</mail-host>
    <mail-port>25</mail-port>
In the first example, the comment belongs to the immediately following XML element only. In the second example, it belongs to both of the following elements. There is no reliable way to distinguish these cases, unless you try to interpret the natural language text of the comment itself (which can't be done with perfect reliability).
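You can see the ambiguity in what a parser hands back. A small sketch with Python's `xml.etree.ElementTree` (assuming Python 3.8 or later, where `TreeBuilder` accepts `insert_comments=True`); the `<config>` wrapper element and the `describe` helper are added for the example.

```python
import xml.etree.ElementTree as ET

XML = """<config>
    <!-- The following setting is blah blah blah -->
    <foo-setting>32</foo-setting>
    <bar-setting>64</bar-setting>
</config>"""

# By default ElementTree drops comments; ask the tree builder to keep them.
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = ET.fromstring(XML, parser=parser)

def describe(node):
    """Label a child as either a comment or a regular element."""
    if node.tag is ET.Comment:
        return ("comment", node.text.strip())
    return (node.tag, node.text)

children = [describe(child) for child in root]
for child in children:
    print(child)
```

The comment comes back as just another sibling in document order. Nothing in the tree says whether it applies to the next element, the next two, or the whole section.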


>comment support was explicitly removed from JSON for good reasons.

>https://plus.google.com/+DouglasCrockfordEsq/posts/RK8qyGVaG...

The link there goes to G+ and thus a dead end. Anyone have that link?

I would have liked to see some alternatives / options that the writer likes.


https://web.archive.org/web/20120506232618/https://plus.goog...

> I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability. I know that the lack of comments makes some people sad, but it shouldn't.

> Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser.

HN discussion is here: https://news.ycombinator.com/item?id=3912149


I am doing my config by reading JSON into Java classes with Jackson. Doing a minify step would mean that I would get the wrong line number in the error message when I forget a comma.
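The line-number problem is avoidable if the stripping step keeps every newline. Here is a rough sketch in Python (not JSMin, and not Jackson); `strip_json_comments` is a hypothetical helper that removes `//` and `/* */` comments outside of strings and leaves line breaks in place.

```python
import json

def strip_json_comments(text):
    """Remove // and /* */ comments outside strings, keeping all newlines."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])  # keep the escaped character verbatim
                i += 1
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            while i < n and text[i] != "\n":
                i += 1  # drop the comment but not the newline that ends it
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            end = n if end == -1 else end + 2
            # blank out the comment, keeping newlines so line numbers don't shift
            out.append("".join(c if c == "\n" else " " for c in text[i:end]))
            i = end
        else:
            out.append(ch)
            i += 1
    return "".join(out)

valid = '''{
  // listen address
  "url": "http://example.com//path",
  /* block
     comment */
  "port": 8080
}'''
print(json.loads(strip_json_comments(valid)))

broken = '''{
  // listen address
  "host": "localhost"
  "port": 8080
}'''
try:
    json.loads(strip_json_comments(broken))
except json.JSONDecodeError as exc:
    error_line = exc.lineno
    print("syntax error on line", error_line)
```

The `//` inside the URL string is left alone, and the missing comma is reported against the same line it occupies in the commented file.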


Thank you very much.


I 100% disagree.

I rave about JSON. I don't understand the rants with config files, but I could bet $20 that they _always_ begin with "but my comments!".

In the past I've promptly replied with (copied from my Reddit post):

Douglas Crockford said to use comments in json config files:

"Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser."

Which was at the following link before Google killed Google+

https://plus.google.com/+DouglasCrockfordEsq/posts/RK8qyGVaG...

Here's my flag solution with JSON config files: https://github.com/zamicol/jsonflag I've never read this "arp242.net" article, but my solution anticipates some of its complaints, such as environmental variables, and addresses them. What I mean to get at is that this isn't rocket science and the "deficiencies" of JSON are flexibilities left to be easily addressed by the programmer. JSON is a great level of complex without being too much.

Every time this comes up, I don't understand how https://json5.org isn't superior to YAML in every way. If there was some deficiency with JSON5, just simply use JSON with comments. It's that simple.

JSON is one of the best things to ever come out of the CS disciplines. Moreover, JSON is a javascript subset. If json isn't sufficient, why not first try a different javascript subset/json superset before trying something more alien?


Sublime Text uses JSON for settings. As a user, I've never been unhappy with that. Their JSON parser handles comments fine because each setting is accompanied by one.

    // When enabled, hovering over a word will show a popup listing all
    // possible locations for the definition symbol. Requires index_files.
    "show_definitions": true,
The author does highlight an important point though: if you're using JSON for your program's config and your program's parser doesn't support comments, you may want to find a parser that does.

The author also mentions readability, which I think is subjective. I find JSON plenty readable provided key names are descriptive and the JSON is well-formatted. Readability does suffer when you have multiple nested objects. I'll also agree to the annoyances of forgetting a trailing comma somewhere.


One thing the article misses is that configuration files are often not only supposed to be machine readable, but also machine writable.

Often, software offers commands to adjust config parameters (eg. PostgreSQL's ALTER SYSTEM command) or even a GUI interface for changing the configuration.

In these cases, if your config file format supports comments, or like the author suggests, even arbitrary code, then it's unlikely the software can automatically alter the config file. And even if it could, comments would quickly become outdated.

PostgreSQL addresses this issue by using multiple config files that can override each other, which can make debugging very hard, if you're not aware of the fact.

Issues like this can be avoided if you have a declarative config format like JSON, and the lack of comments actually becomes an advantage.

So in my opinion, if your config files can be changed by software, JSON is actually a very good option.
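
To make the "multiple config files that override each other" point concrete, here is a minimal sketch of layered configuration in Python. The layer names (`hand`, `auto`) and the keys are hypothetical and only loosely modelled on the PostgreSQL case described above; the idea is that recording where each value came from takes some of the pain out of debugging.

```python
import json

def effective_config(hand_written, machine_written):
    """Later layers win; also report which layer each value came from."""
    merged, origin = {}, {}
    for name, layer in (("hand", hand_written), ("auto", machine_written)):
        for key, value in layer.items():
            merged[key] = value
            origin[key] = name
    return merged, origin

# Hypothetical contents of a hand-edited file and a tool-written file.
hand = json.loads('{"max_connections": 100, "port": 5432}')
auto = json.loads('{"max_connections": 200}')

merged, origin = effective_config(hand, auto)
print(merged)
print(origin)
```

Because the tool-written layer is plain JSON with no comments, software can rewrite it freely without touching the hand-edited file.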


> configuration files are often not only supposed to be machine readable, but also machine writable.

Config files which are expected to be written by both humans and machines are a fairly rare use case. Most config files are either editable by hand (e.g. /etc/), or editable through a UI (e.g. Firefox), but not both.


The main point that makes sense for me is about lack of support for comments; but JSON5 should solve this problem.

I find JSON to be much more readable than alternatives like YAML and much more flexible than plain key value pairs. I've been using YAML for a while with Kubernetes but I find it difficult to follow the indentation (especially if my IDE does not support guide lines)...

Strictness is a good thing for config files and makes it easier to find things.

Programmability in config files is generally not a good idea. Config files should be simple enough so that users with minimal technical ability can change the behavior of the system without having to know how to code.

If/when JSON5 becomes mainstream and IDEs stop showing warnings about comments in JSON, I will stop using .js files for my Node.js applications and just use .json. I do think that the commenting ability is important.


I get the desire since JSON isn't seen as a configuration tool by its designers. However, it's proven itself to be a suitable tool for driving screws as well as hammers (to paraphrase).

For the software folks -- why not adapt the tool or spin off a similar tool to handle these very common use patterns?


I don’t think it’s as clear cut as the article describes. If all you want is to define a few static values, like DB connection details, then it makes sense to use something that is easier to parse.

However if you’re building another nginx then obviously you’d want something more composable.

The problem that the article doesn’t address is that there isn’t a single good common standard. YAML has its own problems, TOML is fine for simple configs but can quickly get less readable when you start having deep nested elements. None of those allow conditional elements. So then you have HCL, which is an abomination and a bunch of other vendor specific formats - most of which don’t have reusable APIs. So do you then create your own format? And in which case your users have to learn yet another config language.


You might want to fix your site: https://www.dropbox.com/s/yg58z7hztlez3ff/ScreenRecording_04...

It’s unreadable in Brave on my phone.


wtf?! That's super weird. It doesn't even have a lot of CSS, and no JS at all, so no idea what's going on here?!

I'll take a look later. Thanks for taking the effort of reporting it!


Yeah! No problem, this is the app version:

Version 1.6.6 (18.11.15.17)

I’m using an iPhone SE


I can reproduce in Safari on iPhone too (using BrowserStack), and I've reduced all CSS to this to reproduce:

  @media (max-width: 26rem) {
    html { font-size: 12px; line-height: 150%; }
  }

  a { transition: all .2s; }

I added that transition for a slightly prettier colour change on hover. What was going on is that this would also animate the size of the font, triggering some back-and-forth between the regular font size, and the smaller one in the media query.

I fixed it by changing to:

  a { transition: color .2s; }

I think this is how it's supposed to work. It's just that the iPhone SE happens to be at the exact size to trigger this? I'm not sure, I wasn't really able to trigger it in desktop Chrome or Firefox.

At any rate, thanks again :-)


I haven't seen mention of it in comments: There is a config file format used by some unix programs: the fortune cookie format. I think Eric Raymond compared it favourably with the windows ini file format, in his Art of Unix Programming.


As an alternative (even though I do like and use JSON), pure code files can, depending on the language, work well too.

Just a file with variable declarations that gets read into the runtime. It's definitely more readable for non-tech users than JSON.


The author is correct: JSON shouldn't be used for configuration files (or for anything else for that matter!), but not because it lacks comments, but because it's very difficult to construct a parser for it. (Just yesterday I had to compile and package jq because writing a parser of my own for it would be insanity).

Configuration should be done with operating system packages and shell scripting; then there is no need for comments, since the files for making the package are kept in a revision control system like Bitkeeper or Mercurial and configuration on systems is never modified by a human.


The ONLY thing I agree with is the comments. However comments in _anything_ would have to be thought of by the developers.

See a tag or section you don't use / parse? Copy that through to the output; or only over-write the parts of the structure that the parser understands... Something to help comments and other unsupported items in the config to the output file.

"XML" expressed configs have this problem too. Way too many projects that claim to process XML stuff will also just eat the config, remember what they understand as options, and throw away everything else, comments included.
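
A rough Python sketch of the "only over-write the parts you understand" idea follows. The function name and the keys (`_comment`, `plugin_x`, `port`) are made up for illustration. Note the limit: this keeps unknown keys and their order, but JSON has no syntactic comments to preserve, and the original whitespace is not retained.

```python
import json

def update_known_settings(text, changes):
    """Apply `changes` to a JSON config, leaving every other key as-is."""
    config = json.loads(text)   # dicts keep the key order of the file
    config.update(changes)      # touch only the settings we understand
    return json.dumps(config, indent=2)

# "_comment" and "plugin_x" stand in for things this tool knows nothing about.
original = '{"_comment": "tuned by ops", "plugin_x": {"a": 1}, "port": 80}'
rewritten = update_known_settings(original, {"port": 8080})
print(rewritten)
```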


And what's the least painful? YAML? I've worked with many configuration files, and the only way to deal with the big beasts is using GUIs. There are tons of them, free and paid, and most of the paid ones cost ~$50.


Honestly, even as a transport datatype it's always bothered me that JSON lacks a well-defined number. I mean, if you're going to allow infinitely large numbers, shouldn't you explicitly support them with Big types?

EDIT: and if you want fixed size numbers (for many good reasons) then what size?

JSON as a config is another story, and I completely agree with this link. I'm a personal fan of TOML[1] for many purposes.

[1]: https://github.com/toml-lang/toml
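
To illustrate the number problem, here is a small Python sketch. The document and its keys are invented. Python's own `json` module keeps integers exact, so the double-precision loss is simulated with `float()` to show what a parser that stores every number as a 64-bit float would see.

```python
import json
from decimal import Decimal

doc = '{"id": 9007199254740993, "price": 0.1}'  # id is 2**53 + 1

# Python's json module parses integers with arbitrary precision.
exact = json.loads(doc)["id"]

# A consumer that stores all numbers as 64-bit doubles cannot represent
# 2**53 + 1; simulate that by round-tripping through float.
as_double = int(float(exact))

# Fractions can be kept exact by asking for Decimal instead of float.
price = json.loads(doc, parse_float=Decimal)["price"]

print(exact, as_double, price)
```

The same text therefore yields different values depending on who parses it, which is exactly what the format leaves undefined.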


JSON is fine as a transport datatype between two bits of code you completely control (e.g., webapp front end and back end). It is a terrible choice as soon as you want to publish and document a public protocol.


JSON isn't my favorite format, but it's not surprising that it is so widely used in the JS world because you can trivially read and write it using only JS and Node built-ins.


JSON is the most straightforward way to add a config to a static webapp. Make it a JS file that sets window.config or something if you want comments maybe, or parse it as JS since it's trusted (if you don't trust your own site files then everything is out the window anyway).

ini, yml, etc are nice but you'd need to import a parsing library and that may be slow/overkill for a small app. xml works with xmlhttprequest though.


We replaced most of our JSON config files at my last job with either YAML or JSON5 [1]; personally I'm a pretty big fan of JSON5 from a readability and writability standpoint, although the caveat is that it still uses 64-bit floating point for all numbers. YAML is a bit too... enormous for my taste personally, but at least it's better than JSON.

1: https://json5.org


It's so exhausting being a developer nowadays. Do this, do that. Do this, don't do this. This is right, this is wrong. Everyone has an opinion. Who is right? Who is wrong?

I don't even like the term "best practice" nowadays. 9 times out of 10 it's an opinion on a best practice.

I prefer to just get shit done, do my best to do what's right, learn from great colleagues and peers on the job, and filter out the majority of this noise.


I bet it was far more exhausting to be a developer before 2000.

They only had "don't do this but since it's all we got, do it anyway".

Enjoy the luxury of choice instead of complaining about it.


YAML seems a reasonably expressive language and a good practical alternative. Using it with python is a breeze and allows a lot of flexibility.


It’s config. I’m willing to bet that no matter how awesome a developer you might be, I will find a dozen things to complain about before I complain about the config format. If it is number one, congratulations, your code is amazing. You’ll have so much time from not having to fix the millions of other things we mortal programmers have to, that you can make the world's best config system.


I think the people who really care most are doing ops not dev

(though of course many people do both)


Lack of comments and illegal trailing commas are an artificial problem. Implement or use a non-strict mode that allows those, and both readability issues are gone. Perl JSON modules have this mode, which makes JSON files just fine for simple config. If you're already using JSON in your program, there's no need to add more dependencies for another config file format.


So what is your favourite format for config files?


I like Toml for my config files. Clear and simple format.


I've had good experiences on the JVM using HOCON [1].

[1]: https://github.com/lightbend/config/blob/b782a2d701fc2b04579...


YAML.

All the benefits of JSON with none of the negatives.


YAML is whitespace sensitive, that alone is a nightmare.


Yes, as someone who helps with some software with a YAML config, the whitespace issue has caused much time in IRC spent trading pastebins fixing these.

I agree that overall YAML looks pretty nice, and consistent whitespace is a strong preference of mine (I have pre-commit hooks), but breaking users' config (if you have users) because they don't know about this is an issue.


Do note that in addition to block collections, YAML also supports the flow collections of JSON, which are not whitespace sensitive. In fact, YAML 1.2 is a superset of JSON, and you can parse JSON with # comments as entirely valid YAML.


As someone who uses YAML and doesn't know it well (Thanks for nothing Docker & K8s) - I see absolutely zero difference between the two as far as one being better than the other goes, except the whitespace sensitivity which the other user mentioned, which really is a pain.


protobuffers now! protobuffers forever!


You doing stuff with multiple languages/frameworks?


Since many are asking/complaining that the author didn't provide alternatives, I'd like to point out the obvious starting with what JSON is - "a lightweight data-interchange format".

Who is interchanging information? In 99% of the cases, which is the target of the blog post, it's a human/developer and some software writing and reading a JSON file ON THE SAME MACHINE, IN THE SAME ENVIRONMENT. That is coding! You can do a far better job controlling a piece of software by actually writing code, telling it how to run.

Reversely, if you need to configure two pieces of software via the same config file, you need something that both can read, so you might go with JSON, XML, YAML, your flavour of the day.

But think for a second: does that happen more than 1% of the time? It doesn't!

Software like eslint (js), babel (js), emacs, etc are using configuration written in their appropriate languages, not in a serialized format. This allows for huge benefits.

If you had the chance to work with AWS for instance, you would have noticed how their Cloudformation templates are insane. They have pseudo functions, control statements IN JSON (and YAML FWIW)! On the other side of the globe, GCP teaches its customers to use python or jinja (a python template engine) to GENERATE YAML, though you can obviously use whatever language you want, which is then consumed by GCP.


You can have comments if you add a preprocessor (such as cpp) to your build pipeline.

Also, I don't think JSON is that bad for quick software utilities. If you're already using it for data interchange, and there are parsers for every language, I think it's a superior option to designing a custom config language (up to a point of course).


I think the only real issue I have with json/yaml/toml/etc is the fact that the options available to me aren't obvious. I have to go digging through documentation to find out what keys are available and what values they accept. It can be really cumbersome if the authors don't keep their docs up-to-date.


This again? It's the same argument against using JSON for configuration that everyone else posts. Even worse it says "Please don’t. Ever. Not even once. It’s a really bad idea." but gives you zero solutions that fits the criteria.

It's an effortless post just to rage against JSON. Why is this even on the front page of HN?


Hi,

I've created the SANE configuration format [0] because I also found JSON (and YAML, and TOML...) faaaar from optimal.

It's kind of a JSON but designed for Human to Machine usecase (instead of machine to machine)

[0] https://opensource.bloom.sh/sane


HOCON is pretty much the same thing. Why not use that instead?


JSON as config files with node programs works pretty well though - especially when you want to have easy access to those values in the program. I really don't find the lack of comments to be that bad... if comments are needed relating to the config you can just stick it in a readme lol


I agree that JSON shouldn't be used as a configuration language.

However, one could use Python as a configuration language! It could fill in the configuration object, which will then be transferred to the main app using JSON, YAML, XML, or something better than that.
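
A minimal sketch of that split in Python: code computes the configuration, and the main app only ever receives static JSON. The function `build_config`, the `APP_ENV` variable, and the setting names are all hypothetical.

```python
import json
import os

def build_config(env):
    """Compute the configuration in plain Python and return a dict."""
    debug = env.get("APP_ENV", "production") != "production"
    return {
        "debug": debug,
        # Comments and expressions are available because this is code.
        "workers": 1 if debug else 4,
        "log_level": "DEBUG" if debug else "INFO",
    }

# The main application only ever sees this static JSON document.
rendered = json.dumps(build_config(os.environ), indent=2, sort_keys=True)
print(rendered)
```

The generated JSON can be committed or diffed like any other artifact, so the programmability stays on the authoring side.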


This is so far down my list of coding concerns I’m already done interacting with this thread.


Why the heck does JSON still not support comments? Why is it so hard to add support for them?


If it's within your own repository then you have comments via git. You can see when and why things were added if you link them to an issue tracker.

You even have inline comments if you use an editor plugin like gitlens (VSC).

I guess the problem here though is portability


I use protos as configs pretty frequently. They have comments, easier syntax, and there's a schema file that describes exactly what's expected. You can parse them from text, csv, or even awful binary, and use them in practically any language.


Speaking of slippery slopes. This JSON enhancement requires documentation. It turns JSON into a specification language, I guess, and it required documentation. Ick.

https://jsonnet.org


Avoiding lightweight markup languages and sticking to real programming languages for configuration has a lot of benefits.

On that: https://youtu.be/0pX7-AG52BU



This is a welcome improved spec! I'll be curious how long till wide adoption.


JSON as config is a bad idea for the same reason XML config files were. Configuration files will often be read and changed by humans and thus should be easily human readable.

Plain text equals configuration setting is still a good idea.


I would argue that JSON is pretty human readable, and easy to write by hand too.


This is not true, except for irrelevant trivial cases where the format does not really matter.

Try to open a jupyter notebook with a text editor. What you expected to be a cleanly laid out python program with a few nicely formatted comments is in fact a nightmarish JSON "text" file, that looks like an obfuscated perl program.


Related: this C library for parsing JSON that allows comments and trailing commas.

https://github.com/andrewrk/liblaxjson


When they call it a json parser, they should stick to json or call it json5 parser. There is no "relaxed" json.


I agree JSON is problematic; mostly because of the lack of comments and the unnecessarily unforgiving syntax. But this author doesn’t really provide an alternative. Just setting PHP vars directly?


I quite like the concept of SDLang for config like this, but I've not yet had a chance to actually trial it in any real projects (due to less than broad language support).


No, it doesn't allow comments. Fortunately, a project README has more than enough room for any remarks. Yes, you're better off putting numbers in strings. No big deal.

But also:

JSON files are easily readable by default, in a lot of languages. Knowledge about (the limitations of) JSON can easily be assumed. Alternatives like YAML, TOML or any of the others all have their own idiosyncrasies.

And non-programmability is a good thing. It might be more verbose, but you don't have to fire up a javascript environment or bash interpreter or some such just to read a configuration file. And separate code paths are very explicit since that will mean separate objects or even files.


I think yml is the future of config, but if you want to use JSON it should only take 5 minutes to write a pre-parser regex to remove lines starting with #.
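
For what it's worth, the pre-parser really is short. A sketch in Python, assuming comments only ever occupy a whole line (the helper name and the sample keys are made up):

```python
import json
import re

# Matches whole lines whose first non-blank character is '#'.
# JSON strings cannot contain raw newlines, so such a line is never
# part of a string value in otherwise valid JSON.
COMMENT_LINE = re.compile(r"^[ \t]*#.*$", re.MULTILINE)

def loads_with_hash_comments(text):
    """Drop full-line '#' comments, then parse the remainder as JSON."""
    return json.loads(COMMENT_LINE.sub("", text))

sample = """
# Port the server listens on.
{
    # A '#' inside a string value is left alone:
    "channel": "#general",
    "port": 8080
}
"""

config = loads_with_hash_comments(sample)
print(config["port"], config["channel"])
```

Trailing comments after a value on the same line are not handled; that needs a real tokenizer rather than a regex.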


What's the alternative? How about...JavaScript! JS has a little function `JSON.stringify` that accepts objects and returns JSON. Pretty nifty.


If the alternative is YAML, I'll keep JSON...


I'm partial to libconfig myself, although I'm not sure if it's well supported outside c/c++/python.


I always mess up when trying to edit JSON to the point where I will convert JSON to YML make changes and convert it back.


Can I also say: Please don't use YAML as config files. Any non-trivial YAML file can be messed up way too easy.


This is going to be another holy war, like clean, pure vi vs the bloated, satanic emacs, isn’t it?


I read this to be convinced, and I remain unconvinced.

• Lack of comments - Config files are usually pretty self explanatory

• Readability - Never been an issue

• Strictness (complaints about trailing commas) - Who cares

• Lack of programmability - Does YAML offer this? I don't know YAML.

What are the alternatives?

Edit: After 35 minutes of comments here - looks like we have zero consensus and the article is trying to prescribe a preference.


I've used edn [0] as a "better json", but in the end, json is good enough and ubiquitous. Oh, edn is not programmable either, and that's by design.

[0] https://github.com/edn-format/edn


Thanks for the link, reading.


”Config files are usually pretty self explanatory”

In the universe where I live, most settings in a config file are pretty explanatory, but also, almost every config file has at least one setting that isn’t.

Also, even if I clearly understand the configuration file I’m reading, figuring out how I can change it often isn’t clear at all:

- If a json attribute contains a file path, can it be a URL or S3 object reference, too?

- does that path to a .gz file in an attribute mean input must be gzipped, or does the program look at the extension? If the latter, what compression methods are supported?

- If I don’t want to write that second output file, do I leave out the ‘extraOutput: foo.txt’ path setting, set it to null, set it to an empty string, or add an optional attribute ‘produceExtraOutput: false’ that defaults to ‘true’?

- what is the range of acceptable values for that ‘foo’ attribute that has a value of 42 in the config file I’m reading?

- is that ‘progressInterval’ value measured in bytes read, input lines, records processed, lines written, or wall time seconds?

- what other options that aren’t in this config file might I want to set in mine?

Yes, all of that could be in the documentation, but it’s easier for me if the config file contains comments describing that, and, in my experience, it also is easier for programmers writing the code to add such comments to the sample config file than to keep a documentation page in sync with the code and the sample config file.

Finally, some things cannot be in the documentation of the tool because they are local changes. For example, logging config could deviate from company standard for a reason.


> • Lack of comments - Config files are usually pretty self explanatory

Really? I have never seen a JSON config file that doesn't require reading external documentations. Comments are also useful when you leave a note on a specific configuration for your system.


Config files are almost never self explanatory. Comments should be describing why things are configured like they are—how would you do this without comments?


Yeah nah, I'll keep doing it as it's worked extremely well so far with no issues.


There's a linked article discussing the shortcomings of YAML, as well: https://arp242.net/weblog/yaml_probably_not_so_great_after_a...

Reading the two screeds together, though, I think the author has ignored one key idea of both JSON and YAML as formats: their use-cases are complementary, and the "problems" with each of these two formats only arise when you're using one for the use-case best suited to the other.

YAML's use-case is for human-written configuration. That's why it has so many weird ways to write things: to allow humans to write some data down in a way that works best to communicate said data to other humans, in a way that machines can coincidentally read. (See also: AppleScript.)

And that's also why YAML is "so powerful that it's basically inherently insecure." You shouldn't be executing untrusted YAML any more than you should be executing an untrusted bash script. YAML is a language for the deploying sysadmins of a piece of software to configure said software with; it's not a language for the tenants of a system to communicate their needs to the system with.

JSON's use-case is for building machine data generators and parsers in a distributed system (like the web) where it is very important that the data be easily introspected by humans, and that humans will also be able to modify or otherwise "poke at" said data, in flight, using simple text-based tools. But the intention isn't that humans will ever author JSON "at scale." JSON is like HTTP/1.0: it's something that a human can write, in a netcat session to ensure things are working or to probe at a system's capabilities or responses; but where a human writing in said format is never going to be the "proper" way of doing things in production.

Now consider the subheadings of each screed:

"JSON as configuration files: please don’t"

- Lack of comments

- Readability

- [Too Much] Strictness

Obviously, if these are your problems, then you're a human (probably a sysadmin) hand-rolling configuration. Use YAML.

"YAML: probably not so great after all"

- Insecure by default

- Can be hard to edit, especially for large files

- It’s pretty complex

- Surprising behaviour

- It’s not portable

None of these matter if you're a human sysadmin who is interactively hand-rolling a configuration file for a piece of software. YAML "doesn't scale", but configuration files aren't a domain that requires thousands of lines or huge, deep hierarchies. And the complexity, surprises, and portability don't matter if you're not trying to "shout YAML into the void" without knowledge of what the target system is.

If those are your use-cases—if you need scalable, portable, secure, predictable data—then you're definitely not writing a configuration file, but rather are just interchanging data of some kind. Use JSON.

---

Now, the funny thing to me is, JSON is a subset of YAML.

(And that's why https://leebriggs.co.uk/blog/2019/02/07/why-are-we-templatin... was written: nobody should be templating YAML when machines are perfectly capable of emitting JSON that YAML parsers will happily accept.)

Really, rather than talking about which format to use, what this discussion should be about—in a better world than the one we're in—is the use of "trusted mode" flags in YAML parser configuration.

Ideally, you should be able to take a YAML parser, and tell it, per input stream, that "this is the hand-rolled sysadmin kind of input, so parse it using the full feature-set of YAML"; or "this is the J. Rando user kind of input, so parse it like a JSON parser with maybe one or two more features" (à la StrictYAML).

That would solve most of the complaints the author has, on both sides: if config systems used YAML parsers but only configured them to accept a strict subset of said YAML (either StrictYAML or plain JSON) for any input they expected to come from anywhere other than a sysadmin, then there wouldn't even be a discussion here. You'd use a plain JSON parser for (efficiently, securely) receiving plain data; but for configuration, a YAML parser would be the obvious choice, because it could always be locked down.

But in the world we live in, YAML parsers don't have security features like this.

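I'd make a suggestion of an alternative, which in Ruby looks roughly like this:

    require 'optparse'
    require 'json'
    require 'yaml'

    config = nil
    OptionParser.new do |opt|
      # Untrusted input: accept plain JSON only.
      opt.on('-c', '--config PATH') { |path| config = JSON.parse(File.read(path)) }
      # Trusted (sysadmin-written) input: accept YAML.
      opt.on('-C', '--config-trusted PATH') { |path| config = YAML.load_file(path) }
    end.parse!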


> But in the world we live in, YAML parsers don't have security features like this.

I'm the author of the "yaml" npm package, and would be interested to hear more precisely what security features you'd wish for a YAML parser to have. In other words, if there's one I've missed, I'd like to fix that.

https://eemeli.org/yaml/


Having the parser play "guess the type" like YAML can cause problems in almost any use case, though.


You know what's even worse? IIS config files.


i prefer using EDN for configurations, it has comments and you don't need to worry about commas and whitespace


That's why you should use YAML.


The correct answer here is YAML, folks


Thoughts on this:

* Config should never, ever be written by the application which depends on it. Ever. I don't care what you think your rationale is: no application should modify its own configuration.

- Git is a weird case which springs to mind, but technically `git config` is separate from eg. `git fetch`. So no real problem there.

- .NET's bidirectional config bindings are daft. If you stick to reading only then the framework's fine, but otherwise it rapidly heads into crazytown.

- If you think you need to violate this rule, probably your 'config' is actually user data. If so it should be treated as such, ie. it's subject to data migrations, the admin doesn't need to touch it during upgrades, etc.

* Config needs to be strict. It must be possible to statically verify the basics, and there should not be any confusion over meaning in eg. some random diff.

- Everything's a string in a text file. The way in which the string is interpreted must be absolutely unambiguous and sane.

- YAML need never apply.

* Config must not be Turing-complete. Ever. If your configuration can be every conceivable output of a program then you have built a system which is effectively impossible to properly test.

- I take this one step further and consider it idiotic to include code in a config file at all. It's fine to generate one from other files as part of a deployment, but the config read by the application fucking better be a single static text document which maps to a simple data structure, with as little processing as possible performed after the fact.

* Config needs to be possible to process in a totally reliable way by a deployment process, eg. templating.

- See above.

* Config should succinctly describe concepts which make sense for configuring the application, not the dozen or so libraries upon which it depends.

- In .NET terms, this means "don't give us ten megs of 'default' WCF crap to fiddle with". Give us an endpoint address, and sufficient documentation/logs/etc to figure out why 'http://something' doesn't work for an app written only for 'net.tcp://something'.

- Logging is a weird one. I'd generally handle this by providing app-level levers for expected logging requirements, then explicitly calling out the framework and version used and where to find the docs for the advanced stuff.

Given the above, I fail to see anything at all wrong with XML. It has strong support for schema validation and templating, and can handle any complex data structure it needs to in both contexts.

Saving keypresses is not a valid excuse. Terse crap is still crap.

While longwinded pointless junk has given XML a bad name, particularly in the Java and .NET world, being explicit about things retains stability in the face of supposedly-unrelated changes: if your configuration's meaning can change globally if a single rule or heuristic is modified then your software's reliability is in question.

All it costs is keypresses, and text diffs are very easy to audit in source control.

Write compact config in anything you must, generate the big stuff, commit it, record it, audit it. And have CI check that the compact matches the big.

Disclaimer: ten years ago I hated XML for wasting so much space. I was an idiot. These days I'll take abundant permanent precision over everything.



