Toml: Tom's Obvious, Minimal Language (github.com)
358 points by wheresvic1 7 months ago | 195 comments



Hey, Tom here (creator of TOML). Fun to see TOML on HN again! Since I first wrote a (mostly) joke proposal for TOML 5 years ago, TOML has been adopted by a number of prominent projects such as Cargo, Hugo, Pipenv, and others.

TOML is especially well suited for projects that need a simple configuration file that maps unambiguously to a hash table. There are still some weaknesses in TOML that make it non-optimal for large, complex config, but I'm hoping to address that in a later version of the spec (perhaps 2.0).

Happy to answer any questions you all have about TOML!


A killer feature of TOML compared to JSON is that it allows comments. A config file without comments and examples ain't great. I found the double square bracket syntax useful and understandable. Agreed it's not obviously .INI or perfectly elegant but it certainly works and has its use-cases. Anyways, thank you @mojombo!


JSON used to support comments but they started being used for unintended things like parsing directives.[0]

It is apparently possible to include comments and then strip them out before parsing, but I've never tried it.

[0]: https://plus.google.com/+DouglasCrockfordEsq/posts/RK8qyGVaG...


I'm personally doing something similar in a C++ project that I'm working on. I designate comments with a //. When I load the JSON file, I go line by line removing comments as I find them. Once the file is parsed you can feed it into any regular JSON parser (personally I'm a fan of https://github.com/nlohmann/json).

Using JSMin would have been a much better idea. At this point though the only downside is that I haven't yet done this for saving JSON. It's a harder problem but I don't see why it wouldn't also be doable.

edit: Apparently this library allows you to have comments, and will even preserve them when saving the JSON file.

https://github.com/open-source-parsers/jsoncpp
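For anyone who wants the same trick without C++, here is a rough Python sketch of the line-by-line stripping approach described above. It tolerates `//` inside double-quoted strings, but it is not a full JSON-aware scanner (a real project might prefer JSMin or a JSON5 parser):

```python
import json

def strip_line_comments(text: str) -> str:
    """Drop the tail of each line starting at '//', unless the '//'
    appears inside a double-quoted string.  A sketch only: escaped
    quotes are handled minimally, not for every corner case."""
    out_lines = []
    for line in text.splitlines():
        in_string = False
        i = 0
        while i < len(line):
            ch = line[i]
            if ch == '"' and (i == 0 or line[i - 1] != '\\'):
                in_string = not in_string
            elif not in_string and line.startswith('//', i):
                line = line[:i]  # truncate at the comment marker
                break
            i += 1
        out_lines.append(line)
    return '\n'.join(out_lines)

doc = '''{
    // which backend to talk to
    "url": "https://example.org", // the :// inside the string survives
    "retries": 3
}'''

# After stripping, any regular JSON parser can handle the result.
config = json.loads(strip_line_comments(doc))
print(config["retries"])  # 3
```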


You could use libjsonnet++ instead of a custom reader to gain comment support. Of course, Jsonnet is a gateway drug to a much more sophisticated superset of JSON.


There's the JSON5 format, which supports comments; libraries are available for parsing and encoding it.


You're welcome, really glad you like it, and good to know the double square brackets make sense to you! It does seem to be a bit polarizing, so we'll keep working on something even better.


> I found the double square bracket syntax useful and understandable

Really? I had kind of the opposite reaction to it. Indeed useful, but definitely not understandable.


It's kind of similar to the syntax of URI query strings and the PHP syntax for appending to arrays, in that an extra pair of square brackets indicates appending:

In PHP:

    $a[] = 1;
    $a[] = 2;
    # $a == array(1, 2)
In URIs:

    http://example.org/?a[]=1&b[]=2
Compare to TOML:

    [[a]]
    item = 1

    [[a]]
    item = 2


That URI syntax is just a convention used by some web frameworks. You can just as well write

  https://example.org?foo=bar&foo=baz
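Python's stdlib happens to take the keep-everything interpretation: `urllib.parse.parse_qs` collects repeated keys into lists rather than letting the last one win:

```python
from urllib.parse import parse_qs

# Repeated keys are just a convention; a generic parser keeps
# every value instead of discarding all but the last.
print(parse_qs("foo=bar&foo=baz"))  # {'foo': ['bar', 'baz']}
```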


Not in PHP at least (maybe in others); there the last query arg wins.


{ "_comment": "The comment goes here!", "name": "foo" }


So now your comments are flowing through infrastructure wasting bandwidth and possibly leaking internal configuration details?


Strip private properties from the config files in your build process, if you're paranoid. I don’t have use cases where I’m sending the full contents of a config across the network. Each property in a config would be explicitly referenced in the application.

And it’s not like there aren’t potential problems with some YAML or TOML library you import. Rails YAML comes to mind.
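A sketch of that build-step filtering in Python, assuming the `_comment` key convention from upthread (the recursion and the key prefix are illustrative choices, not any standard):

```python
import json

def strip_comment_keys(obj):
    """Recursively drop keys beginning with '_comment' so the
    annotations never leave the build machine.  A sketch of the
    build-time filtering suggested above."""
    if isinstance(obj, dict):
        return {k: strip_comment_keys(v) for k, v in obj.items()
                if not k.startswith("_comment")}
    if isinstance(obj, list):
        return [strip_comment_keys(v) for v in obj]
    return obj

raw = json.loads('{"_comment": "The comment goes here!", "name": "foo"}')
print(strip_comment_keys(raw))  # {'name': 'foo'}
```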


And maybe even breaking code that doesn't know what to do with these fields.


JSON allows comments, but they have to live in string fields of an object so they survive reserialization.


So then… not actually comments at all?

A nice config parser will error out or at least warn loudly when encountering an unknown parameter. This helps prevent typos.


I used to feel this way, and also used to be frustrated about multi-line strings in JSON. With years of experience now, though, I actually appreciate JSON omitting these features.

Config files should absolutely not have or need comments. If you need them directly in the config file, something is wrong. Applications should document their default settings in a different way, preferably in a README or generated documentation that also explains how to use environment variables to override the defaults. That sort of separate companion doc is the right place for notes about defaults or "why" certain config values exist in the file. The same is true for using JSON to store parameter files, etc. It's actually quite important to keep metadata about the config / params / etc. specifically out of those files, so that they are absolutely nothing but value files. Information about why a file contains those values belongs elsewhere, and it's an anti-pattern IMO to rely on comments in the config / param file.


I have never disagreed with someone more than I do now. =)

Config absolutely needs comments. Context is everything. Comments allow me to explain to other humans why the config is the way it is. Dumping that out to a separate file is begging for it to fall out of sync when there's no comment instructing anyone to go and update the other file. Plus that's just kind of silly.


I would even say config needs comments more than code does. Code can be self-documenting: by using good variable and function names, splitting or combining lines of code, or re-ordering blocks of code you can often make the intent of the code clearer without adding explicit comments. If you do something unexpected in a config file, it likely just shows as setting some name to a magic number or a magic string.


This ... so many things about configuration decisions that make sense at a point in time but can change with versions of a library, OS, server etc.

In past lives I have had to hack around so many different things to make something work the way I intended, knowing there would likely be a better solution at some point.

As we get closer to infrastructure as code and configuration as code being the norm, I would like to use this comment to remind people there is no such thing as self-commenting code!

Even if this configuration should be “obvious” given the constraints today, when someone comes back to it months or years from now it’s likely some of those constraints will have changed or been removed completely. Ignoring this is how you end up not changing things out of fear that something will break, without real understanding.

Comments are almost never a problem unless they’re not updated when significant changes are made.


I definitely add way more comments to my projects' default config files than I do to my actual codebase, if we're talking total number of comments.


I disagree. It’s an anti-pattern. For example, if you’re writing an application that loads a default config file to populate parameters at run time, then the software module that loads from the default file is the correct place to document it, because the meaning of defaults is relevant to that source code, not at all to someone reading the parameter file itself. A parameter file is just some blob of stuff.

I agree context is everything, and that’s why it’s a bad idea to embed usage info or instructions about the contents or meaning of a parameter file into that very file.

Somewhere else, something has to choose to load that file, and that is where the documentation belongs (in addition to readable, separate artifacts that are generated from the file).

For example, suppose you need to rewrite the parameter file from YAML to Toml, or you need to add a new layer of nesting and some post-processing logic at load time.

The meaning of these things has no context inside the parameter file itself. It only has meaning at the point some other system consumes it. Another system could consume the exact same file and choose to interpret all the parameters with different meanings in that program, regardless of what any comments in the param file say.


Maybe that makes sense for the app you are writing that is only consumed by others at your company, but if I'm installing your app from a package manager I absolutely do not want to have to read the config-loading source code to figure out what all the parameters do. And even if the docs are nicely described in a man page and not in config file comments, I absolutely want to be able to comment on particular parameter changes ("# had to up foo to 18 because bar was frobbing baz at too high a rate").

Even if you use configuration management it's still nice to be able to comment things in config files since they show up both in the source and in the output files, which is very helpful when debugging.


I specifically meant that comments aren’t a good idea in config files when the config files are distributed to any end users as part of some app or package installation.

> “And even if the docs are nicely described in a man page and not in config file comments, I absolutely want to be able to comment on particular parameter changes ("# had to up foo to 18 because bar was frobbing baz at too high a rate").”

This is the exact anti-pattern that happens as a result of relying on comments in the config file. Your goal of adding that comment about why you changed foo directly in the config file is dangerous and is a very bad practice.

Instead, whatever config file it is that you are modifying (whether for third parties to consume or just for your own local Postgres or video game or anything), that file should be turned into a proper package. Place it in a version control repo, add a README and usage documentation for the “why” of the parameter values, and make a tool so that if the end user wants that exact config file, they can “install” it.

For any customized overrides of single settings at run time (so specifically not changes to a config file), it should happen via the end user overriding ENV variables, not mucking around in config files and trusting ad hoc comments found inside them.


I very much disagreed with you two comments up, but this more fleshed out response actually makes a lot of sense. You just moved me from "I'm going to add a build rule that strips comments from all my JSON files so that I can have comments in my configs" to "The next time I want comments in JSON, I'm going to see if what mlthoughts2018 said makes sense in this case. Do I really need to comment this JSON file, or would it be better to put the comments somewhere else instead?"

We'll see how it goes!

Cheers


>I agree context is everything, and that’s why it’s a bad idea to embed usage info or instructions about the contents or meaning of a parameter file into that very file.

If you don't embed those comments you deprive that file of context, which, as you admitted, "is everything".

>Somewhere else, something has to choose to load that file, and that is where the documentation belongs (in addition to readable, separate artifacts that are generated from the file).

Well, that's not really relevant. People who change configuration are, 99% of the time, not the same people who write, read, or even have access to the code that reads it.

Imagine Apache or Postgresql configuration for example.

Admins change those all the time, but don't have anything to do with the Apache or Postgresql project, and don't ever care to read their code.

On top of that, my particular settings in some config file, are based on MY local server and needs and my context, and not on something Postgresql or Apache devs will know themselves.


System admins who might change e.g. Postgres config files absolutely should be proficient in looking at documentation or the source code to understand the meaning, and then create separate documentation about their own customized config files (not making others or their future self actually have to read that file to understand why a value was chosen).

But more generally, it’s bad that a lot of applications don’t offer documented ways to modify config through ENV variables.

A config file should always be code reviewed, versioned and checked into SCM. One side effect is that end users, sys admins, etc., should never be given the chance to inject their customizations through locally modifying the base config file. That should be straight disallowed, so that distributing config files (such as the default, or bundles of other settings commonly used in unison) is part of packaging and deployment, and end users use other mechanisms that allow overriding defaults in a case by case manner.

Then you are talking about perhaps a shell script that sets dozens of ENV vars to override what comes from a (never modifiable) config file, and sure you might document the why of your choices in that shell script with comments, and at no point would anyone need or want comments in the actual config file(s).


Suggesting that most developers who use postgres should be familiar with its source code is madness.

Expand this to everything a sysadmin has to manage and it just doesn't scale.


> “Suggesting that most developers who use postgres should be familiar with its source code is madness.”

This is a glib strawman that has nothing to do with what I said at all.

If you intend to modify config files directly, then you should be able to find the relevant documentation, source code comments, etc., on what the config entries mean or why a value is chosen as the default without needing comments in the config file. This is in no way related to your wild idea that “most developers who use postgres should be familiar with its source code..” (which is not at all what I said and is not at all a reasonable characterization. Reading a tiny bit of source code or docs to find one type of comments is utterly and completely different than your gross mischaracterization that it’s somehow a claim that people would need to be deeply familiar with a bunch of source code.)


You're talking about comments that document the semantics of fields in a config file. Most other people here are talking about comments that describe why the _specific values_ present in a _specific file_ were chosen.

Whether documenting semantics of fields belongs in a config file or code (I say both!), it simply doesn't make any sense to say that comments about why specific values in specific files are the way they are belong in the code that parses them. (How can the parsing code have knowledge of all config files out there?)


No, I am talking specifically about comments or documents that explain the "why" behind parameter choices. For canonical defaults that are shipped with an app, this should be documented in a user guide and in the appropriate parts of the source code, not inside the config file (which most users would never care to read, and which would be worse to read for most developers who might care or change it).

For local config files that are not shipped with the package, they should still be source controlled in a separate repo (yes, even if you personally are the only one using it, like your local Postgres configuration or something), and you should use a good practice to 'deploy' any changes to your config from the repo into the actual location where the application can recognize that config -- meaning that documentation about the "why" of the parameters once again should absolutely not be embedded inside the config file itself.


If they're not in the config file, where exactly are they then?

A text file sitting next to the config file? I can't see any benefit to that arrangement over just using comments in the file.

Comments that exist in the file in the repo but get stripped out by the deploy process? Again, there seems to be no point to doing this.

Commit log messages? I agree commit logs can sometimes be valuable to see the context of a change, but 1) they're fundamentally about documenting _changes_, not the contents themselves, and 2) the UI is really clunky: git blame, find the line you want context for, then git show on that commit.


>Config files should absolutely not have or need comments.

Well, that's like your opinion, man.

If I have something set to some value in my configuration, I want those reading the configuration file (not parsing it, reading it in a text editor, e.g. to make an edit) to know why it's so.

>Applications should document their default settings in a different way

Comments in configuration files are not there to explain default settings, but to document why a setting (default or not, but usually already edited to suit your specific environment) has the value you set it to.

>Information about why a file contains those values belongs elsewhere, and it's an anti-pattern IMO to rely on comments in the config / param file.

That's a statement, not an argument. Why does it "belong elsewhere"? So that you have an extra layer, that few will bother to check?


You make some interesting assertions, but 1) they contradict everything I have learned from painful experience and 2) they seem to make no sense.

> It's actually quite important to keep metadata about the config / params / etc. specifically out of those files, so that they are absolutely nothing but value files. Information about why a file contains those values belongs elsewhere

Okay, I'll bite: Why? In every area of programming, we learn that mental context changes are harmful. We learn to avoid gotos, not abuse exceptions, add meaningful comments, and separate concerns, all in large part so that when we look at a file, we can understand what's going on without referencing other files.

Now you say "oh, unless there are some config values in that file, then it's important for it to be as cryptic as possible!" Surely you see why that sounds a bit odd? Should we also base64 encode the file? Do you also hate descriptive variable names in config files?


> I actually appreciate JSON omitting these features.

I guess this is right if config files are only machine read and written. When I am prototyping, config auto-gen tends to come pretty late in the project (for me), and I like being able to comment things.

I also like types and date-time parameters (which JSON doesn't provide unambiguously), so by the time I'm done, I've reinvented YAML+, and that's a mess, so I actually appreciate TOML quite a bit... (At the very least, I get to be a retard[0] about config later in the process)

[0] To the inevitable person that's going to call me an able-ist for using the term, I'm using the term in a self-denigrating way, ...


> "I guess this is right if config files are only machine read and written."

No, I meant omitting these features is very useful for humans reading and writing config files. Comments utterly don't belong in config files. They belong in the sections of code that load specific config files and convert their contents into defaults or parameters.

A config file is just some file. Its contents have no conventional meaning. It only takes on a meaning in the context of the specific system that uses it.


Hugely disagree. Config files will be modified by non-experts of the application 1000x as often as by the actual developer of the application. Those people won't, and often can't, go look at the code.

Also, it can be really hard to find where exactly a configuration value is used. You may have to trace through a ton of code to find the place, and then you can't be sure that's the only place it's used.

Configuration comments are crucial, both for onboarding new users (explaining what the default is if you don't set a value, explaining what the configuration value actually controls, etc) and for experienced users to tell others why this esoteric configuration is set the way it is.

A configuration file is literally just a stack of magic numbers and strings. Why are we setting threadpool to 10 and not 1000? Why are we disabling X feature? What the heck does TPS_Report=true do?

You need comments.


I disagree strongly. Config files being modified by anyone should be going through code review. The risk of not understanding while at the same time modifying things is extremely low.

Plus, the documentation in the application code that loads and manipulates the config would function as the exact same reference documentation for any developer trying to understand how the config is used or why a choice was made.

This prevents the documentation about what the config is supposed to be used for from being coupled with the implementation detail of what particular config file looks like, what language it’s written in, etc.

Just as you say, a config file is a stack of magic constants. They have no meaning at all sitting in that file. The place to look for their meaning is the documentation of the code that loads the file, which should tell the user everything they need to know about modifying or providing their own file.


If you have a 500k LOC software project... how the heck is an SRE/devops person going to figure out where in that code a specific configuration item is used? They're not. This is why documentation is essential for projects. You could keep configuration documentation in a separate file, but that only helps explain what the config does. It can't help you figure out why Bill (who left the company a while back) set ThreadMax to 650 when he changed the code 6 months ago. There could be a commit message that references it, but that's more disconnected from the change than just slapping a comment on top that says why.

I agree that code review for configuration changes is necessary. That same code review process can ensure that the comments in the config file are also correct.


> It can't help you figure out why Bill (who left the company a while back) set ThreadMax to 650 when he changed the code 6 months ago. There could be a commit message that references it, but that's more disconnected from the change than just slapping a comment on top that says why.

Also worth noting is that not all config files are committed to version control as-is. If a deployment process bakes the config file from variables, it can be even more disconnected and difficult to find the change.


The size of the project is a red herring in your comment. People document command line options, ENV settings, etc., in huge projects all the time. It has absolutely no bearing on whether comments belong inside of config files.


> Config files being modified by anyone should be going through code review.

Yeah, I'll put in a PR for my local Transmission config file and see how far that gets me ;)

Config files should be understandable and readable by the end users so they can customize their local installs.


This is silly. If an app like Transmission expects end users to modify a config file locally as the means to add customized settings, that’s a seriously bad design. Why not provide documentation about command-line arguments or ENV variables it would look for, for end-user customizability? A settings file where you need to know the meaning as you read the file is among the worst ways to solve it.


>They belong in the sections of code that load specific config files and convert their contents into defaults or parameters.

And what are users who don't have access to the source, or who aren't programmers, supposed to do?


Somehow these people are reading TOML config files? That’s silly.


Do you expect a systems administrator to be familiar with the source of every piece of software they maintain?


No, I expect the system administrator to refer to a readme, user guide or API doc that explains how to inject custom options at the command line or through a web API, etc., and absolutely never by mutating a config file outside of version control with code review from the team that maintains that specific type of config.


Take Apache or Postfix, for example, which are configured via possibly complex configuration files and not a “web API”. It is definitely useful to have comments explaining configurations, for example, “here we deviated from the default for such and such reasons”. That’s true whether the file is maintained manually or via some Puppet or Chef template or something else.


But if the config file has a comment like “here we deviated from the default...”, it suffers the same risk of being out of sync as the documentation would suffer anywhere.

So then, instead of forcing someone else to dig around a file like that, why not put the docs about your modified config file somewhere else, in a format that’s much easier to read?

For example you could make a whole git repository solely to hold a certain config file, allowing it to be versioned and even including an “install” script so that some other tool like docker could be used to faithfully reproduce a config setup.

Basically the only examples I have seen anywhere in this thread where people think in-line config comments matter are large applications like Transmission or Postgres, which have the unfortunate bad practice of letting users modify complex config files (instead of forcing all overrides of defaults to be ENV-variable based).

And even in these cases, it seems more like lazy people who just personally prefer to sling a config file around rather than doing some minimal best practice like putting it in a repo to wrap it up like a mini-package and give much better documentation to someone who might “install” that config than what crappy in-line comments can give, and to enforce code review even for your own local config changes.


If you think there’s a risk of comments becoming outdated in a config file, then the risk of it becoming outdated in a separate place seems even higher.

Configuration via environment variables is as far from best practice as I can imagine. It prevents proper validation of settings (e.g. a typo can’t be detected because you don’t know which environment variables are used by other programs), not to mention the possible security implications. Besides, these environment variables would have to be set somewhere, like a startup shell script, which I bet would end up with explanatory comments anyway.

This trend of using environment variables is a result of trying to make applications stateless, therefore easier to run in containers. However with something like K8s ConfigMaps, it seems totally unnecessary.


> “then the risk of it becoming outdated in a separate place seems even higher.”

I’d say they are almost identical risks, and for all practical purposes there is no difference. Putting comments inside the config file only makes discovery much harder than putting it in a user guide, man page, etc. with no offsetting benefit of consistency.

> “Configuration via environment variables is as far from best practice as I can imagine”

You are arguing against perhaps the most dominant notion of config best practices in the modern era when you say that [0]. Quoting from the 12 Factor App best practices on config:

> “Another approach to config is the use of config files which are not checked into revision control, such as config/database.yml in Rails. This is a huge improvement over using constants which are checked into the code repo, but still has weaknesses: it’s easy to mistakenly check in a config file to the repo; there is a tendency for config files to be scattered about in different places and different formats, making it hard to see and manage all the config in one place. Further, these formats tend to be language- or framework-specific.

The twelve-factor app stores config in environment variables (often shortened to env vars or env). Env vars are easy to change between deploys without changing any code; unlike config files, there is little chance of them being checked into the code repo accidentally; and unlike custom config files, or other config mechanisms such as Java System Properties, they are a language- and OS-agnostic standard.”

You say,

> “This trend of using environment variables is a result of trying to make applications stateless, therefore easier to run in containers. However with something like K8s ConfigMaps, it seems totally unnecessary.”

I don’t think those are the reasons for putting config into the environment at all. I think a main reason is specifically that having end users modify config with ENV vars disallows scattered and non-reproducible collections of config (e.g. if you share the command or script that defined the environment, the settings are enforced; if you relied on a local config file you had modified but no one else knows about, they can’t get your settings).

[0]: < https://12factor.net/config >


> You are arguing against perhaps the most dominant notion of config best practices in the modern era

I don’t know that it’s “dominant”, nor that it’s good merely because it’s “from the modern era”. But even if all that is true, 12 factor apps constitute a small subset of configurable software that is deployed worldwide, and hardly a silver bullet for all kinds of software (it even defines itself as a methodology specific to SaaS applications).


Non-open source software.

Ops teams deploying a service.

Programmers who don't know the language a program is written in.

Heck, what about a config file for a game? When setting the resolution, it'd be nice if the valid values were shown in the comments.

The argument is about comments in JSON btw.


Why would these people be modifying config files outside of code review? That’s horrible if true, and would entirely go against most best practices (e.g. 12 Factor use of ENV vars).

If you want to provide config customizability as part of an API or interface to third-party users (and you should!) then doing it via comments in a file that users modify is insanely bad. Instead, document usage instructions for overriding defaults with ENV vars; customization should never involve mangling a config file outside of version control, and absolutely not by third-party devs, system administrators, etc.

In fact, I think your response highlights exactly why relying on comments in config files is such a bad anti-pattern.


> Why would these people be modifying config files outside of code review?

My VSCode config files are stored as JSON. They are hand modified all the time.

package.json is a user editable config file, used by millions of JS developers every day.

The .babelrc file to configure the JS transpiler is a user configurable json file.

.eslintrc, used to configure one of the world's most popular JS linters, is a user-editable JSON file.

Those are all the ones sitting in one folder, and they would all be vastly improved with comments.

Especially package.json: entire scripts live within package.json files, and you run them with the command

    npm run <scriptname>
Right now there is no way to comment what the heck each script does! Sorta a PITA to have to go through each script, which can link to other scripts, to figure out what is going on when a simple comment would save a lot of time.

> customization should never involve mangling a config file outside of version control, and absolutely not by third party devs, system administrators, etc.

Config files in general? Of course they need comments.

What about users wanting to set custom key bindings in computer games?

Back in the day, customizing Quake settings was huge. All user editable config files. Little Jimmy's DOS 6.0 games folder wasn't using version control.

Even more recently, users modify config files, often times to fix or get around bugs in a games UI.

How about setting up a mail server? Those are all user editable config files. People running a local mail server aren't going through code review or version control. A sendmail config without comments would be even more impossible to read!

Or just simply .bashrc files.

Config files are used all over the place. Being able to document what happened is incredibly valuable. Not every file in the world needs version control.

> Instead, document usage instructions for overriding defaults with ENV vars

Environment variables are set through configuration files! That is just kicking the ball down the road.


You begin your comment with a big list of extremely bad designs for config management. The least important flaw of those examples would be any limitation on comments in the config file — they have severe problems well beyond that. Papering over problems in e.g. badly misused package.json files won’t solve anything, and because it would run the same risk of those comments getting out of sync with the entries of the file as any other way of documenting it, it could even be harmful.

The rest of your examples just rehash the same fallacy mentioned in other comments (which I’ve given adequate responses to already). Large apps that use a local config file are no exception to what I’m saying. That’s still a terrible way to expose customization to an end user, and even if you are modifying some big local config file, like with Postgres, you should be putting your config file into a separate version control repo, versioning it, having comments as part of that repo, treat it like a mini package that gets deployed or installed when you make any changes (which should be code reviewed always, even if you’re just talking about your own config file for some app on your home PC).

Environment variables should not be set through config files, that’s terrible. Rather environment variables being set explicitly should be the way you override whatever default would have been chosen from the config file.


> Environment variables should not be set through config files, that’s terrible. Rather environment variables being set explicitly should be the way you override whatever default would have been chosen from the config file.

How else do you propose environment variables be set?

Some script someplace has to set those environment variables.

A script that configures environment variables is awfully similar to a fancy config file. Indeed, in the interest of keeping code clean and organized, if a lot of constants, in the form of environment variables, are being set by a script, it might be a good idea to put them all in one file and just import that.

At which point you now have an actual config file!

> That’s still a terrible way to expose customization to an end user,

How would you do bashrc?

There are a lot of small problems for which config files are perfectly fine. Especially configurations aimed at technical users.

Creating a UI around configuration means developer time and effort, and that UI has the potential to have bugs. That UI needs to document what each configuration setting does, and what all the possible valid values are.

If all the config is trivial, and stored in a plain text file, then a text editor is perfectly valid UI for advanced users, and comments in that file serve the same purpose as documentation.

Hell, you could make a transformer that takes a config file with documentation and pops out a UI. At which point you've just built an unnecessarily complicated version of a text editor.

Indeed, one of the niceties of VSCode versus Visual Studio is that VSCode stores all its configuration in an editable text file. In Visual Studio, settings were hidden behind a horrible UI that only recently had search added. Until that happened, the common way to find a setting in VS was to look up on Google where the setting was.

On a related note, Visual Studio uses what are called MSBuild files: fancy XML files that configure the build settings, instructions, dependencies, etc., for a project. There is a nice UI around it. The nice UI likes to corrupt the configuration file. Advanced users quickly learn that modifying the file by hand is far superior.

> (which should be code reviewed always, even if you’re just talking about your own config file for some app on your home PC).

Bullshit! I am not creating a repo for my config file for my linter. I do however want this comment:

    // imports React Native builtin functions
in my linter config, which I'll set, and not ever change until I start a new project, copy and paste my linter config file, open it up, see that comment, and know if I need to keep that line or not.
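For what it's worth, the strip-comments-before-parsing approach mentioned upthread can be sketched in a few lines of Python (naive regex; a real tool like JSMin also handles `//` appearing inside string values):

```python
import json
import re

def loads_with_comments(text):
    # Naive sketch: drop full-line // comments before handing the text
    # to a standard JSON parser. Trailing comments and "//" inside
    # strings (e.g. URLs) are deliberately not handled here.
    stripped = re.sub(r'^\s*//.*$', '', text, flags=re.M)
    return json.loads(stripped)

cfg = loads_with_comments("""
{
    // imports React Native builtin functions
    "extends": "react-native"
}
""")
print(cfg)  # {'extends': 'react-native'}
```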

Also comments in repos suck.

Repo commit comments do not apply to specific lines in a file; they apply to the entire check-in.

External tools are needed to determine what commit modified a given line in a file. Hopefully that last commit has a comment about not just what that change to that line did, but what that line is for in general. Heaven forbid if multiple lines were changed.

If I am in the middle of editing a config file, I now need to task switch to another tool, potentially play "guess the commit", and piece together what that line in the config file actually does.

If I'm running some multi-million dollar system off of that config, sure, it can be worth it. But there is a HUGE productivity loss there in comparison to

    // boolean, false leaves tabs alone, true converts spaces to tabs
in a config file for my text editor!

Your entire viewpoint seems to be around mission critical software, which is great for mission critical software. But requiring git be installed so users can setup tabs vs. spaces is insane.


> "A script that configures environment variables is awfully similar to a fancy config file. Indeed, in the interest of keeping code clean and organized, if a lot of constants, in the form of environment variables, are being set by a script, it might be a good idea to put them all in one file and just import that.

At which point you now have an actual config file!"

I think this is specifically not what is recommended, e.g. read through [0].

> "How would you do bashrc?"

.bashrc is source code that gets executed, not config. Similar with .emacs, etc. Personally, I keep all of these things in a single git repo and then the local files are symlinked to the version controlled files, so that I can document any parameter changes through a PR in GitHub, and it's easy to distribute config files to new places (just clone these types of configs from git). A nice bonus is that putting comments into the config directly is not needed. Anyone (whether my future self or a friend who wants to use my configs from GitHub, etc.) can look at the readme / user guide for the git repo, and not need to go on a fishing expedition for ad hoc comments in the actual files.

> "Especially configurations aimed at technical users."

You can't have it both ways. If users are technical, then they can find what they need in readmes, user guides, and source code comments. There would be no benefit for the comments to be directly in the file, while having comments scattered all over and possibly out of sync from the intended usage instructions would be a downside.

On the other hand, if the user is not technical enough to modify settings in that way, then a user guide and some type of additional API for customization is needed. Overall this suggests the comments belong at the user guide / readme level, since that would be a beneficial way of working for both groups.

> "Bullshit! I am not creating a repro for my config file for my linter. I do however want this comment:"

This just seems like you use bad practices. I have a pretty heavily customized pylint setup for linting my code, and I absolutely do check it in. I actually wrapped it in a simple setuptools package so that I can say, "pip install <my private github URL>" and actually install my custom linting tools (and their settings) as a full (versioned) Python package complete with an exported shell script to control it. Then I can, for instance, just have a conda environment like "my-linter" that has absolutely nothing except for my installed linting tools, and make a bash alias that will source that environment, run the linter, and then deactivate. I'm even considering putting it all inside a Docker container instead of a Python environment.

> "Repro comments do not comment on specific lines in a file, they are on the entire check-in."

Maybe you are using different tools, but this is not true in any source control tool I have ever used. Certainly not true in GitHub or Gitlab.

I honestly don't know what you're talking about with the whole "guess the commit" stuff. Just navigate to a file in GitHub and look at the history (or do this from the command line once you feel comfortable reading file history). If looking at a file in GitHub is too much work for you, well, I can't help you. That's just not reasonable and speaks to extremely bad coding practices if inline comments in config are being used to circumvent meaningful code review that leaves useful commit history.

> "Your entire viewpoint seems to be around mission critical software, which is great for mission critical software. But requiring git be installed so users can setup tabs vs. spaces is insane."

This baffles me. I am not even sure I can parse this as a coherent comment. Requiring git to be installed? What? If you change a file from tabs to spaces, hopefully some linter complains at you based on whatever your team's conventions are. But if you mean changing a setting in a settings file somewhere, then yes, it ought to be checked into something. Why would you think it's reasonable to just hand-edit config like this with zero version history? That's just not reasonable.

To be clear, I am not talking about mission critical software for any of this. I am talking about any type of config you manage. I do all this just on my local machine for my Jupyter config, linting config, emacs, bashrc, etc., all just for my local laptop that I use for recreational programming and random personal computing. Obviously, it matters even more for commercial software.

[0]: < https://12factor.net/config >


Commenting on individual lines is a github extension, not something that the majority of source control software can do. (I've actually never seen any other source control software that can do that, but I imagine it exists somewhere).

Heck, GitHub Desktop doesn't even support it; I'd have to check in the file, go to the website, then add a comment. My linter files get thrown in the repo as a matter of fact, so they are at least checked in, but the other two things are more work than I want to worry about for some throwaway file that I don't care about.

As to commit histories, I've spent hours of my life digging through commit comments (in multiple different source control systems) trying to find out what the hell some setting did. Oftentimes the config file ended up being grabbed from a branch in some other project, and the revision history is gone.

I've also spent probably dozens of hours of my life tracking down what some random environment variable does. Joy.

Realize you are an extreme outlier. The majority of people in the world do not commit every single config file for their personal projects, nor do they use commit messages as a way to track individual settings changed in the file.

> Why would you think it's reasonable to just hand-edit config like this with zero version history? That's just not reasonable.

Because that is what the majority of people on the earth do. Because it works just fine. Because I have a thousand things to worry about trying to run a startup and the commit history of some file that I really don't care about isn't that important. Being able to write a comment is a nice to have so if by some odd chance I come back to the file years later I know what the heck I did.

I think it is reasonable because life is full of trade offs, and limitations on time, and not everything can be perfect.

> .bashrc is source code that gets executed, not config. Similar with .emacs, etc. Personally, I keep all of these things in a single git repo and then the local files are symlinked to the version controlled files, so that I can document any parameter changes through a PR in GitHub, and it's easy to distribute config files to new places (just clone these types of configs from git). A nice bonus is that putting comments into the config directly is not needed. Anyone (whether my future self or a friend who wants to use my configs from GitHub, etc.) can look at the readme / user guide for the git repo, and not need to go on a fishing expedition for ad hoc comments in the actual files.

I agree GitHub is useful to distribute files, easy to check out from. But the bashrc on my server is one I copied and pasted by hand, because that is Good Enough(tm) for use on a machine that I'll use the shell on for maybe a grand total of 30 minutes a year. Setting up git and going through two-factor auth for a file that is literally Never Ever Going To Be Changed serves no purpose. The world would not be a better place because a bashrc on a VM is under source control.

Also I'd then have an external-facing server with a copy of a personal repo lying around, which could be an information disclosure problem if someone breaks into the server or I check something into GitHub that I shouldn't. (Even non-obvious stuff, like machine names or paths used on internal machines, is a type of information disclosure that increases risk.)

(really my VPS provider just had a crap default config that didn't show the current working directory at all, so the bashrc there is a single line)

Likewise, with most of my config files, they are throw away, or write once edit rarely.

FWIW it sounds like you are using github comments instead of inline comments. This only matters if you care about why a line change was made. If you just want to know what a line does, then using comments on github provides no more functionality than adding a comment to the file, except looking up comments on github takes longer.

Checking config files into repos to easily spread them among projects makes sense if you have enough overlap. But saying that GitHub line comments are the one true way to comment lines in a config file is making an absolute statement that you know what is best for everyone's situation. Such a scheme adds no extra value if someone just wants to know what a given line does right now. And I can't think of ever having wanted to look at the history of my linter config. If I am copying it over to a new project, I want to know what lines to keep and what lines to delete based on the type of project I'm doing. Comments in the config file would allow that. Except I don't have comments because JSON doesn't have comments.

In which case I'm now thinking of github line comments as an ugly hack around JSON not having comments.


That might be correct. I tend to want to localize explanation at the edge. I've got to think about this. Thank you.


Hey Tom,

I just wanted to say thank you real quick.

We use TOML in [Habitat](https://github.com/habitat-sh/habitat) to define default and overridable configuration. It's super easy and expressive.

The [Habitat Core Plans](https://github.com/habitat-sh/core-plans) have quite a few examples about how it gets used in the project. If you are looking for examples of larger configuration files with .toml, then that might be a good place to look for some real world use.

Thank you again!


Amazing, thanks for the heads up! I'll add it to the list of projects using TOML [1] and take a look at your example configs to get some more real-world use cases to look at when evaluating future changes to the spec.

[1] https://github.com/toml-lang/toml/wiki



I'm a pretty big fan of TOML, though I've mostly only used it for Cargo.

What are the limitations for large complex configs, and do you think there's currently a better configuration format for them? Feel free to link elsewhere if there's a canonical location for this discussion.


The limitations are mainly around nested arrays of tables. As has been mentioned here already, the [[array-of-tables]] syntax that currently allows this can be super confusing, and is definitely the worst part of TOML right now. Even so, you can often work around this weakness by reworking your config file to pick up on a specific naming scheme (perhaps every "server-*" table is interpreted as a member of an array) or maybe there's an array of strings that contains the names of the tables, each of which becomes an array member in that order. But these are hacks, and it would be nice to have a first-class way of dealing with nested large tables that was easy to scan and not too repetitive.

That said, the alternative right now in other formats is basically a { bunch { of { braces } } which is also not the most amazing thing for long nested tables.


> The limitations are mainly around nested arrays of tables. As has been mentioned here already, the [[array-of-tables]] syntax that currently allows this can be super confusing, and is definitely the worst part of TOML right now.

Good to know this is recognized. Looking forward to a better syntax for what is indeed a difficult problem.


Thanks, is there a strawman proposal out there for fixing this, or is it something for the distant future?


Is this e.g. due to the verbosity of re-declaring field names every time? Or something else?

e.g. CSVs handle 2D tables reasonably efficiently, since you just specify values:

    field_a, field_b, field_c
    1, 2, 3
    3, 4, 5
but I imagine TOML would benefit from something that lets you define nested tables without re-defining the table every time... which probably conflicts with the "obvious" goal. Hmm.


I think HCL (Hashicorp Configuration Language) reads much better than TOML when there are nested sections in the config file.


Hi Tom, huge fan of this project.

1. Would a file path type make any sense to add? TOML is ideal for configurations, and so many configuration options use file paths. I know there are questions about platform dependency, and I don't have all the right answers for how that would work, but I could see it being possible somehow?

2. Data versioning is a big deal to me. For now there's always an option of just adding a "version" key and updating it, but can you think of a better way to do that with TOML?

I'm always overly critical of JSON (despite overall being a fan) because it doesn't specify bit sizes of numbers. I notice that TOML has specified 64 bit numbers. It's nice to see that.


1. How would the file path type differ from a regular string? Would you want specific semantic validation for each platform?

2. This is still a big question, and we've been having a robust argument about it on GitHub [1]. It's a problem that so far has evaded an elegant solution, but usually when that's the case, it just means we haven't been creative enough about it yet! Or did you mean versioning of the data IN the config file? Can you elaborate?

[1] https://github.com/toml-lang/toml/issues/522


1. Yeah, the goal would be some kind of validation at parsing time. I'm not super familiar with TOML parsing, but there must be error cases in the implementations, so it seems like using `\` on UNIX systems, for example, could return an error. The only hard part about this I see is distinguishing paths from strings syntactically without making TOML overly complex.

2. So basically, the majority of people don't want to type version numbers (I see where they are coming from, but respectfully disagree). I'm not sure if there exists a solution in that case. As for my comment, I was really talking about both kinds of versioning. Mainly that, if my data is version 2.3.1, I'd expect that to also lock the version of TOML, requiring a new version of my data for bumps to the underlying data structure. Again syntax is the hard part, as I'm having trouble imagining a way to actually do this. Furthermore I've seen data formats in the past that let you version "parts" of a format, I personally like this, but don't think it's a good fit for TOML.


Backslashes are valid characters in a POSIX filesystem, but they don't indicate a directory.


Yeah...rather than validating paths, the better use case is probably normalizing them so that you don't need platform-specific configuration files. Being able to write:

static-files = path/to/static/files

And have it work on all platforms gives a distinct advantage over specifying paths as strings.


Every modern operating system accepts forward slash paths. It's been a long time since there was any need for things like Python's os.path.join() or special file path knowledge in config files to convert slashes to backslashes on Windows. You can just use forward slashes everywhere and not worry about it.

The one exception on Windows may be paths in a CMD.EXE command line, where / may be confused for a switch character, but I would think a config file should not be passing paths on a command line but instead passing them directly into a program - and then the forward slash will work fine.


That was my first thought, but imo such a functionality belongs in the language's file access library.


Tom, have you heard of EDN and Transit? Like JSON, but:

1. Extensible, custom types

2. Streamy, efficient, compressed and fast, leverages platform's existing native/optimized JSON parsers

3. Wide platform reach

4. Made by Rich Hickey

http://blog.cognitect.com/blog/2014/7/22/transit


I have not, and while it looks cool, it seems to be optimized for data transfer between applications, not a human readable config format.


Transit is more about data transfer, but EDN is mostly meant to be a better JSON (useful date types, doesn't care about commas, has comments, etc)


There is also a similar project, Hjson, a user interface for JSON: https://hjson.org/


For me, hjson takes on some of the warts of yaml (bare strings) which is why I prefer json5 if I'm going for a json variant.


> There are still some weaknesses in TOML that make it non-optimal for large, complex config

Maybe the config shouldn't be large and complex in the first place?

My biggest question is, why hasn't TOML been adopted much more widely or become the de facto standard? Am I missing something obvious?


I'm always in favor of small config files, but sometimes for larger projects that's just not possible (or desirable). TOML has always been about being as simple as possible while still solving a wide range of problems. But I think of it as the 80% solution, where 20% of projects might need something more powerful to solve their more complex needs, and that's ok. We still need a super simple config file format for the 80%.

As for adoption, it takes a long time for something like TOML to be adopted. Inertia is powerful in the development space. I also haven't had as much time as I would like to push TOML forward and evangelize it. My hope is that at some point there will be a tipping point and you'll see most projects start with a TOML config and only change to something else when TOML can't meet the project's needs.


Yes, I've got a project and config that I tried to stuff into TOML, and I even tried making my own DSL for it, but I still haven't gotten anything satisfactory. I've settled on some semi-ugly YAML for now until I try again.

https://github.com/perlbot/App-EvalServerAdvanced/blob/maste...

The rest of the sandbox is configured with TOML, however. It's just the seccomp rules that I haven't figured out the perfect way to express yet.


Tom here (not the creator of TOML). Every time I have a "what are we using for configuration" discussion at work for a new project, I get to say "let's use TOML, it's obviously superior, just look at its name!".

It's a horrible joke and I've yet to actually use it, but it makes me happy :)


That TOML started out tongue-in-cheek makes it all that much better!

Just a curious thought:

Sometimes it’s handy to have a text file for small bits of numerical data. Would it be possible to extend TOML to have a “csv” section (array of arrays)?

    [x,y,z]
    1,2,3
    4,5,6
    7,8,9
Or perhaps it’d have to be:

    [data: x,y,z]
    1,2,3
    4,5,6
    7,8,9


The one problem I've had, and perhaps I'm just doing something incorrectly, is going from a `[[something]]` back to a `[thing]` without TOML thinking that `thing` is part of `something` when it's not.


Curious. In order to be part of one of the tables in the [[something]] array, a table would need to be named [something.X]. What parser are you using? It's possible that it's behaving improperly in this case.

See the relevant part of the spec here: https://github.com/toml-lang/toml#user-content-array-of-tabl...


This project reminds me of the good old INI files.


As it should! It was inspired by INI, but with a desire for a proper spec that unambiguously maps the config file to a hash table.


This was the first thing I noticed as well. It was always somewhat alienating when encountering newer formats that were less friendly than what came before. I think TOML does a good job of taking what came before and bringing it forward.

Over the many years of the usage of INI I think occasionally we've seen developers add their own touches to how they support types, sections, groups, multi-dimensional arrays or other structures within config files that certainly indicated a need for some more standardized advancement here.

I really hope this gains a lot of traction, because there are so many scenarios where things like XML or JSON are forced into roles they really don't belong in.

One thing that I would say is that it might be worth indicating a standard header that can be used at the very least in cases when it is not stored in a file with a .toml extension. It would not surprise me at all to see TOML widely used with extensions that describe better the intent of the file rather than the format.


damn it.. of course you would update the spec right before I release my config parser. =)

I have been working on a TOML reader/writer for golang that supports read/change/write and format (like go fmt), which someday I will get enough time to actually finish and release. =)


Hah, sorry! You can always release an 0.4 compatible first, and then add the stuff from 0.5. Sounds like a great project!


Hi Tom. Dave here. I like a lot of the features in TOML, but I'm curious, can I have a pony?

(Folks downvoting, PLEASE JUST LET PEOPLE HAVE JOKES EVERY NOW AND THEN)


You aren't going to change what people around here like. And complaining about downvotes usually just attracts more.

If you think that your brand of funny is worth braving people's disapproval, do it knowing what the reaction is going to be. If you don't like people disapproving, then go elsewhere, or engage here in a way that people prefer.


I made the joke for Tom. It has special significance for him.


That's fine then. Make the joke for Tom. Ignore the inevitable downvotes. Nobody else has to understand or like it.


Dave and I worked together at Powerset many years ago (pre-GitHub) and wrote a bunch of Erlang together. We have a lot of inside jokes from those days. LEAVE HIM ALOOOOOOOOOOOONE!


Hi Dave! Ponies come standard in TOML, you just have to know where to look. =)


Toml is better than a pony. Proof: http://toml.versus.horse


No soup for you.


TOML needs anchors (like yaml)


So like YAML, TOML ain't markup language either, despite the recursive acronym and the "ML" suffix. If your name was Mark instead of Tom, would you have named it MARKML, and would it still not be markup language?


This does not claim to be a markup language.

According to the repo history, it did... 5 years ago, until someone noticed and corrected it.


The news here is (presumably) the announcement of version 0.5.0 a few days prior [1], which includes several clarifications, rollup changes, and enhancements that accumulated over the last 3 years.

In my opinion, one of the more useful features was the addition of Joda-style datetimes [2][3][4][5], which disambiguates between various kinds of datetime constructs that aren't interchangeable yet are commonly conflated. It's fair criticism that the addition of rich datetime types exceeds the language's original 'minimal' goal, but too many other languages and formats these days just punt to RFC3339 and leave no guidance or tooling on how to represent dateless times or timeless dates without introducing a ton of side effects. This is a place where language standard libs, with few exceptions, have repeatedly dropped the ball, and similarly, language- or library-agnostic, generic guidance is nowhere to be found.

TOML raises the bar here, by providing a concept and notation to specify these values at rest, and gives parser writers, as opposed to the users, the task of finding a way to represent these values in whatever way is idiomatic for the given language.

[1] https://github.com/toml-lang/toml/blob/master/CHANGELOG.md [2] https://github.com/toml-lang/toml/pull/414 [3] https://github.com/toml-lang/toml/pull/362 [4] https://github.com/toml-lang/toml/issues/412 [5] https://github.com/toml-lang/toml/issues/263


Glad you like them, we spent quite a bit of time getting them right! Datetimes are a horrible mess of complexity, but hopefully over time languages and tools around them can standardize on a set of common primitives to make all our lives a bit less horribly messy. =)


I just started a rust project and so had to learn TOML since that's what the package manager (and a lot of the ecosystem) uses.

For some reason, though, I just struggle with the syntax. It says it's "obvious" but it wasn't to me. I think it says something that the README is full of "this TOML would be represented like this JSON", to help you understand what's going on. Every time I saw that, I was like "oh, now I get it." I don't know if that means JSON is just inherently more understandable, or I'm just more used to it, though.

Are there obvious downsides to JSON for config that I'm missing? What are the advantages of TOML over JSON? Maybe eventually it'll "click", though.

I think the following mean the same thing:

     [[foo]]
     bar = {baz = 5}

     [[foo.bar]]
     baz = 5
But I don't think the following works:

     [[foo]]
     [[bar]]
     baz = 5
That is, I think the double bracket syntax always starts at the top level? While the `=` entries are relative to the double brackets above them? Something about all that is non-obvious to me, and I don't love that there are multiple ways to do the same thing.


> Are there obvious downsides to JSON for config that I'm missing?

for me, there are two. firstly, TOML supports comments.

secondly, TOML or INI is very readable without indentation. i have learned that removing indentation from configuration files makes non-technical people much more comfortable editing the file by hand. when many non-technical people see braces and indentation, they feel overwhelmed and associate it with complexity.

i don't really use TOML's more advanced features, though. i basically just use it like INI.


Yeah, most of TOML looks good, but the .INI-style [table] and, even weirder, the [[table-array]] thing seem like they were trying to hammer a square peg into a round hole for the sake of having config files that look like INIs.


Backwards compatibility with INI is, in my opinion, one of the more powerful aspects of TOML that helps adoption. Kinda like how UTF-8 is backward-compatible with ASCII.


Does anyone know why such a "joke proposal" (as stated by the author himself) was chosen for pretty significant projects like pip and cargo? (edit: I tried to say it began as something small and personal)

(Well, I guess there's not that much to win/lose in the area of config languages, but still) (Also, I think it works pretty well, so this is not to downplay TOML!)


Well, it was only a joke to begin with. I couldn't stand the complexity or ambiguity of YAML and one night I had a few drinks and banged out my thoughts on something better. When people started writing implementations, I realized that TOML might actually have legs and started tightening it up and removing the snarky bits. I guess these projects felt the same pain I did about config files and presto!

I think the moral of the story is: just put your whacky ideas out there and see what happens. You never know when you'll hit a chord.


Thanks for response. I edited my response to be less snarky. Good job and thanks for the moral of the story.


There's no one clear place to point to about how the decision was made for Cargo, but some nuggets have been left by authors [1]. The gist seems to be that JSON (and YAML, per that comment) was considered a poor format to author and maintain a config file in, and YAML had no lib in Rust and there was no appetite for anyone to write one.

Pip's choice was made for them in the form of PEP 518 [2]. The authors of PEP 518 explain their rationale [2][3], which largely boils down to XML being too wordy and awkward to hand-edit, JSON awkward to hand-edit, various formats used by existing python tooling were underspecified and implementation-dependent, and YAML's python parsers being fairly complex and hard to vendor.

[1] https://news.ycombinator.com/item?id=7938388 [2] https://www.python.org/dev/peps/pep-0518/ [3] https://www.python.org/dev/peps/pep-0518/#other-file-formats


I was one of the supporters of TOML back when this decision was made. I might have been the first one to suggest it, but I don't remember exactly.

Basically, I wanted something that was (1) simple, (2) terse, (3) supported comments, and (4) supported recursive data structures. Requirement (1) eliminated YAML, requirement (2) eliminated XML, requirement (3) eliminated JSON, and requirement (4) eliminated INI. Of the remaining formats, TOML seemed to be the most popular and had the most traction, so I went with that.


This seems reasonable, but I'd like to add that XML, while certainly not designed as a config language, could be made much more terse by using SGML, the markup metalanguage XML is derived from and of which XML is a subset. SGML has additional constructs for short forms, such as tag omission/inference and short references (custom Wiki syntax), targeted at authoring, whereas XML only admits canonical angle-bracket markup syntax, which is what makes XML verbose and cumbersome to edit by hand.


While that's all true, I'm also not aware of any fully compliant SGML parsers for most major languages. That makes for a rather high barrier.


The name for this is the "Genetic Fallacy"

https://en.wikipedia.org/wiki/Genetic_fallacy


For more complex config, I highly recommend HOCON (https://github.com/lightbend/config/blob/master/HOCON.md), and its parser in Java (https://github.com/lightbend/config).

It feels like Scala (Lightbend is the company behind Scala) in the sense that there are 100 ways to achieve the same thing and having different levels of "code elegance", but for me it's a plus, since I'm a Scala fan.


Since Tom is browsing, I have a question - and this is my one and only major issue with TOML.

     # THIS IS INVALID
     a.b = 1
     a.b.c = 2
Why? I'm sure there is very good reasoning - but it makes me have to reason about my data in a manner I consider backwards/confusing.

     name.first = "Bob"
     name.last = "Smith"
     # Can't do this because name.first has already been defined 
     # name.first.alternative = "Robert"
     # This is too ambiguous 
     name.alternative = "Robert"
     # And this is backwards to me
     name.alternative.first = "Robert"
Some people might suggest to instead do

     first.name = "Bob"
     last.name = "Smith"
     alternative.first.name = "Robert"
But now nothing is scoped to [name] and if I need to get the full name I can't just pull in [name] but need to pull in [first], [last], and [alternative]. That's really messy in my opinion. All of these are names and should be scoped to [name] and not their own structure.


Because the "a.b" is not a key, but a table "a" that contains an entry with key "b". If you write "a.b = 1" then that integer is not a table, and can't have an entry called "c".

Because it avoids these problems, your "name.alternative.first" example seems perfectly sensible to me. Alternatively, the name might be an array:

    [[name]]
    first = "Bob"
    last = "Smith"

    [[name]]
    first = "Robert"
It's probably best to think of TOML as convenient syntax for creating a JSON-like structure. What kind of JSON would you expect as the result of your config examples?
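(For reference, I'd expect the [[name]] version above to come out as JSON along these lines:)

```json
{
  "name": [
    { "first": "Bob", "last": "Smith" },
    { "first": "Robert" }
  ]
}
```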


>Because the "a.b" is not a key, but a table "a" that contains an entry with key "b". If you write "a.b = 1" then that integer is not a table, and can't have an entry called "c".

Your explanation is perfect and I understand the justification why it isn't possible now - so thanks for that! The array alternative provided by xinau is also an acceptable replacement - maybe even better to be honest, especially in this particular scenario.

>It's probably best to think of TOML as convenient syntax for creating a JSON-like structure. What kind of JSON would you expect as the result of your config examples?

I came back from lunch and realized what I was trying to do didn't make sense while trying to convert it to JSON. Having name.first be both a value and contain an object doesn't make sense - unless it were to contain an array value but then why have the object and not just have the value? All very silly. So I guess my "only issue with TOML" is that I never sat down and tried to express what I wanted to express in any other way. It made a lot more sense in my head than on paper. :)


I don't know if this is the intent of Tom, but I was also stumbling on this issue, and the only reasonable explanation I can come up with is that:

One key can't have multiple values with different types. For example, in JSON:

  { "name" : 
    { "first" : "Bob",
      "last" : "Smith"
    }
  }
How would you add an "alternative" key to the "first" value? The only way I can think of is something like this:

  { "name" : 
    { "first" : "Bob",
      "first.alternative": "Robert",
      "last" : "Smith"
    }
  }
or, as @latk suggested, let "first" be an array, with its first entry being the non-alternative:

  { "name" : 
    { "first" : ["Bob", "Robert"],
      "last" : "Smith"
    }
  }

I think this is by design. To quote @mojombo "simple configuration file that maps unambiguously to a hash table".

But this is only a guess.


Cool. TOML is used quite extensively in Rust projects. I think it's awesome for very simple configurations. And newcomers don't have to learn anything in order to change a TOML configuration, which is very powerful.


Yeah, Cargo and Rust have been really important in TOML gaining adoption. Love the Rust community!


TOML is a great format, but it can not universally replace JSON, because it's designed for small nested depth.

I've been trying to push RON as an alternative, and it works very well for WebRender, Amethyst, and other projects in Rust.


Just dropping a useless comment to say thanks! Between Tom's work on Github, Jekyll, and TOML, I think he has influenced a vast amount of developers!

For the projects I've used TOML on, it was a nice breath of fresh air and a terrific improvement over JSON (still mad about JSON's lack of comments). Simplicity wins!


It's never useless to say thanks! I really appreciate your kind words, it's the fuel that powers a lot of open source. Keep it up!


Not to mention inventing MySpace!


Originally, I was fully on board with the homogeneous array requirement, but it's recently started causing me pain.

Homogeneous arrays can come in the following forms

- Shallow, literal type (a list of lists, regardless of what the nested lists contain)

- Full, literal type (a list of dicts of strings)

- Logical type

There are many times where a list is the best type for my data but I want to take advantage of logical types for easier configuration by my users.

Below is an example of what I mean by "homogeneous logical types":

`Cargo.toml` has you specify dependencies using a dict

  [dependencies]
  foo = { "version" = "1.0" }
but allows a shortcut syntax where a string value is assumed to be the version value in the above dict:

  [dependencies]
  foo = "1.0"
Generally this is done in Rust using Enums

  enum Dependency {
     Version(String),
     Specification(HashMap<String, String>),
  }
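A consumer in a dynamic language does the same normalization by hand; here's a rough Python sketch of the string-or-table pattern (the function name is mine, not Cargo's):

```python
def normalize_dependency(dep):
    """Expand Cargo-style shorthand: a bare string is treated as a
    version requirement; a full table is passed through unchanged."""
    if isinstance(dep, str):
        return {"version": dep}
    if isinstance(dep, dict):
        return dep
    raise TypeError(f"unsupported dependency spec: {dep!r}")

# e.g. the two [dependencies] forms from above, after TOML parsing:
deps = {"foo": "1.0", "bar": {"version": "2.0", "path": "../bar"}}
normalized = {name: normalize_dependency(d) for name, d in deps.items()}
```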


Homogeneous arrays are partly to make implementations easier, and partly because if you really need that flexibility, you can always use an array of inline tables, which has the benefit of giving each sub-element a name, hopefully increasing the obviousness.

I'm not sure I understand your complaint, though, can you give me another example of how it's biting you in real life?


> you can always use an array of inline tables, which has the benefit of giving each sub-element a name, hopefully increasing the obviousness

Except the name would be duplicated with the content I'm storing.

As for an example, it's effectively Cargo. My use case is very similar (dependency reporting) but my content is slightly different (as I said, the value would effectively duplicate the key).


Oh, you mean you want something like:

  list = [
    "1.0",
    "2.7",
    { version = "1.4", path = "..." },
    "9.9",
  ]
Is that correct?


Something akin to that, yes.


I didn't know about TOML until I started using Hugo. I've been using YAML and TOML and I find both have their merits, especially for their simplicity. TOML looks like the classic INI file, and is fairly easy to use/learn. I've been trying to move toward TOML for everything. YAML is nice, but I had to be careful about white-space significance and also translation of "YES","NO", etc.

https://arp242.net/weblog/yaml_probably_not_so_great_after_a...


Absolutely. That's what I mean about TOML mapping unambiguously to a hash table. Strings in TOML are always quoted. There is no fuzzy interpretation of things like YES and NO. That way madness lies. I also am not a fan of meaningful whitespace, which is why TOML doesn't do that. Glad you're finding TOML useful, good luck on your projects!


> There is no fuzzy interpretation of things like YES and NO

I think the big problem is that YAML is dynamically typed. If I had schema-enforced config files, I'd be perfectly happy to say that for boolean typed data all the values of YES and NO and ON and OFF and T and F and 0 and 1 can all be reasonably interpreted as a Boolean True and False. The problem happens when a string-typed or integer-typed member can also have their ON value interpreted into Boolean TRUE.
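As a sketch of how a schema resolves this (the field names and spellings here are made up): coercion only fires for fields the schema declares as boolean, so a string field keeps its literal value.

```python
# Permissive boolean spellings, applied ONLY to schema-declared bool fields.
TRUTHY = {"yes", "on", "true", "t", "1"}
FALSY = {"no", "off", "false", "f", "0"}

def coerce(value, declared_type):
    if declared_type is bool and isinstance(value, str):
        v = value.strip().lower()
        if v in TRUTHY:
            return True
        if v in FALSY:
            return False
        raise ValueError(f"not a boolean: {value!r}")
    return value

schema = {"debug": bool, "country": str}
raw = {"debug": "ON", "country": "NO"}  # "NO" here means Norway, not False
parsed = {k: coerce(v, schema[k]) for k, v in raw.items()}
```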


Yeah, I'll agree with that, among many other issues with YAML. =) A schema would at least solve that problem, but I don't think most simple config users want to define a schema, so strong typing is a better solution.


I have to admit, the string and array syntax are the only parts that keep me from using it instead of YAML.

Probably because they are the only two features that need delimiters at start and end. The rest are only one-liners or simply stop if the next of the same type starts.


Funny enough, I learned how to/started to use TOML today. I was running some [redacted] experiments using a .pyc compiled script and wanted to experiment with parameters -- but then I had to compile the script again and again.

Enter TOML: I can paste a bunch of "var_x = 2"-type statements (there's like 50) directly from Python, read them as a dictionary and find-replace all appearances in like 5 minutes while I'm waiting for an Uber.

Thanks Tom!


Awesome, sounds like the perfect use case for TOML!


It's surprisingly easy to add it as an optional drop-in, ex: https://github.com/EamonnMR/OpenLockstep/blob/master/data.py...


I am using TOML in most of my projects and once the configuration files grew too large I just extended the syntax with “#@include filepath” declarations and split up the config files. Works great. https://github.com/nodemailer/wild-config/blob/master/README...


I'm not a fan of configuration files. Instead, I prefer the style of "calling" the program from within a script, and passing the configuration as parameters. This also allows to pass e.g. callback functions. Imho, this is much more flexible, and the advantages grow over time, whereas configuration files tend to accumulate awkward/convoluted constructs as the software matures.


If only package.json had been package.toml


Would that be hard to implement for npm/yarn? I'm sure that transforming TOML to JSON on the fly could be added as a step somewhere, resulting in changes to the package manager or just a wrapper command.

Stylistically, I agree with you: TOML is clean and well thought out. Moreover, TOML supports data types that JSON lacks, leading to ugly workarounds (floats as strings, anyone?). The main advantage of JSON is that its encoding/decoding is built into JS as-is, and it's generally deeply ingrained in the Node community.


It would not be trivial because npm/yarn don't just read package.json but also modify it. See numerous GitHub issues in either repo about supporting comments inside package.json.

Of course nothing is impossible, but I think that the ship has sailed. I was just dreaming of what could have been :-) Subtle details in NPM's behavior would've probably been designed differently if it was necessary to make updates to the package file without destroying layout and context.


Here's the TOML parser in Nim language: https://github.com/NimParsers/parsetoml.

It works great based on my small amount of testing, and TOML is awesome! I wish more people start using it.


Should it be called a "language" when it has grammar but no semantics? (just curious)


I never really questioned the name, TOML, but I guess "Tom's Obvious, Minimal Language" works


To me, there was nothing obvious about the double bracket syntax, e.g.:

    [[designers]]
    name = "Guido"
    lang = "Python"

    [[designers]]
    name = "Larry"
    lang = "Perl"


Yeah, it's not my favorite part either. We've made them mostly unnecessary by adding inline tables, but for heavy nesting of arrays of tables, you still need to use them, which is why I mentioned that there are still some weaknesses for large, complex config files. Hoping to make this better in 2.0.


In defense of double brackets, I first encountered TOML in pipenv[1], and just by experimenting found I could add another source. So they're reasonably obvious if you see them in an existing file.

[1]: https://github.com/pypa/pipfile#pipfile


Yeah, they're ok for very simple use cases, it's when you start having more than one level of nesting that things get a little crazy.


I agree about the double bracket syntax; however, I don't think that is part of the specification any more. Instead, nested structures use a dot notation inside single brackets. Both solutions do still feel like a hack, but the dot notation is definitely less ugly.
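For example (keys hypothetical), the dotted headers keep nesting explicit without double brackets:

```toml
[server.http]
port = 8080

[server.tls]
cert = "/etc/ssl/cert.pem"
```

which parses to a single `server` table containing `http` and `tls` sub-tables.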

That all said, for flatter schemas I do personally think TOML is more readable than its JSON, YAML and XML counterparts. If there is one thing I did like about Windows back when I used to run it, it was the syntax of INI files (which TOML was heavily influenced by).

Ultimately though, there is no perfect solution for all problems.


> for flatter schemas I do personally think TOML is more readable than its JSON, YAML and XML

Yes. But for nested data, I find TOML becomes the least readable and I have to fall back to YAML or JSON.

Maybe we need just one more ML...


I disagree. The way JSON and YAML nest data with indentation forces data to be disconnected from its property chain when there are multiple large items at each level. TOML seems to provide a solution to that by allowing deeply nested paths to be declared explicitly at the top level for each item. You get to choose how to split the paths into containers instead of being locked into a 1-to-1 tree structure.


Oh man, I can't disagree more.

I loathe YAML for readability. When I'm 2 pages down on a yaml tree I have no clue how many indents exist before what I'm working on. Likewise, when I scroll up, I quickly lose track of what the actual parent to the data I was working on is. I've found it to be an absolute mess.

JSON doesn't even fit for me, because it's not human intended (imo). Being able to document (comments) configuration is a requirement for me on any config language.



Yeah, that's weird.

For the most part the language seems like a smarter, slightly more flexible INI, which I like (although there's no real standard for INI, and most formats don't allow nested arrays or tables), but why embrace bracket notation for arrays and not a keyed syntax with the same brackets for tables?

Something like this would have been a lot cleaner to me:

    [designers]
    [name = Guido, lang = Python]
    [name = Larry, lang = Perl]


Yups. And making a whole datetime RFC part of the spec kind of broke with the "minimal" thing. But apart from that I totally prefer it over INI/YAML/JSON/XML for many purposes.


Datetime does need to be part of the spec if you want your markup files to be portable; otherwise every TOML parser might decode a date slightly differently. In that regard it is really no different from saying "01" (string) is different from 1 (integer) in terms of preserving your data's integrity.


I guess there's no way around that though, if you want dates to be first class members. Can't do "a bit of date".


We added a proper datetime type because the only thing worse than having one is not having one. If you had to supply a datetime as a string, every TOML file would have a different way of doing it, which is...not so obvious.


Excluding the datetime RFC in the name of minimalism would make the language much worse (cough, JSON). I feel like this is as simple as it could be, but no more.


indicates an 'open' array?


An array of dictionaries. In JSON, my example becomes:

    {
        "designers": [
            {"name": "Guido", "lang": "Python"},
            {"name": "Larry", "lang": "Perl"}
        ]
    }


This JSON is vastly more readable, frankly. It’s so clear what container type “designers” refers to, and I can read directly what data structures the elements are. I can’t do that in Toml. Unless I just happen to remember some rote memorized convention for what the syntax unpacks into, there’s no way to tell by looking at Toml code. IMO this ought to be priority number one for any utility language like this. No matter what brevity of other syntax there might be, this lack of direct expression of the data structures is too much of a problem.

This JSON is also much more readable than the “dot attribute” Toml syntax too, which I think is one of the least intuitive and hardest to read ways of creating nested data structures, certainly vastly less readable than the equivalents in YAML or JSON.

I’ve never understood why anyone would say Toml is easier to read than YAML or JSON. It’s drastically harder to read and more confusing.


I think you're focusing on TOML's weakest case here. I think for a lot of configuration files, TOML is going to be easier to read and write. Personally, I'd rather have a configuration file that looks like this:

    [site]
    name = "My Great Website"
    url = "https://example.com"
    author = "Watts Martin"
    email = "foo@bar.com"
    links = [
      { name = "tom", url = "http://tom.example.com" },
      { name = "bob", url = "http://bob.example.com" }
    ]

    [database]
    server = "localhost"
    username = "dbuser"
    password = "dbpassword"
Than one that looks like this:

    {
      "site": {
        "name": "My Great Website",
        "url": "https://example.com",
        "author": "Watts Martin",
        "email": "foo@bar.com",
        "links": [
          { "name": "tom", "url": "http://tom.example.com" },
          { "name": "bob", "url": "http://bob.example.com" }
        ]
      },
      "database": {
        "server": "localhost",
        "username": "dbuser",
        "password": "dbpassword"
      }
    }
Semantically, I just find the first one clearer and more intuitive. (The links are really the only dubious part.)

The equivalent YAML doesn't look bad:

    site:
      name: My Great Website
      url: 'https://example.com'
      author: Watts Martin
      email: foo@bar.com
      links:
        - name: tom
          url: 'http://tom.example.com'
        - name: bob
          url: 'http://bob.example.com'
    database:
      server: localhost
      username: dbuser
      password: dbpassword
...but the significant whitespace makes it somewhat more fragile, particularly in those pesky links.


Actually, even here in your extended example, the JSON is the easiest to read. For durable files that will be read much more than written, saving a few newlines or keystrokes is super irrelevant, but JSON gives such explicit understanding of the data structures and nesting, which IMO is really confusing and hard to understand in Toml.

I can agree Toml is sometimes better than YAML, but I cannot agree it is better than JSON. For me JSON is so simple, so easy to read, it has the nice feature of preventing comments and multi-line strings to keep things extremely simple and to enforce that documentation about the values in the file has to be located outside the file, as it absolutely should (documentation should be at the site that uses the config/param file, not inside it).

Plus, I find JSON indentation to be a real joy to assist with reading and seeing the nested structure. That indentation is optional in Toml, but I think pretty much always the use of that indentation ought to be enforced as a convention for people using Toml. Lacking the indentation is really not good for config / parameter files.


Nesting has locality value, which is often desirable. But as I said, the [[toml]] syntax yields open arrays; it seems you can define things in multiple passes (which can be harmful but may also be a need once in a while).


Sure, I only mean that the syntax for it in Toml is extremely hard to read. I personally just find JSON and YAML much, much easier to read.


I'm more a fan of YAML.

"YAML Ain't Markup Language"


TOML and YAML both have kinda weird names because both were initially incorrectly said to be a "markup language" and then subsequently renamed. The original names were "Yet Another Markup Language" and "Tom's Own Markup Language".


The syntax for tables is awkward compared to CSV (This is actually also true for XML, JSON, YAML).


Looks a lot like Rebol/Red.


Some questions I couldn't find answers for:

- Is interpolation supported? e.g. "key1" = "value" and then "key2" = "$key1". That would be very useful in avoiding repetition.

- What is the keyword for null? e.g. when I want to set the value to null


> interpolation?

No, there is no facility for variable templating, interpolation, references, or anything of the like in toml.

If you wanted that, you could implement it in your application by doing post-processing on strings, or you could not use TOML.
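As a sketch of that post-processing idea (nothing here is part of TOML; the `$key` convention and function name are made up), a single pass with `string.Template` over string values:

```python
from string import Template

def interpolate(table):
    """Expand $key references in string values against the table's
    own top-level values. Single-pass: no chained references."""
    out = dict(table)
    for key, value in out.items():
        if isinstance(value, str):
            out[key] = Template(value).safe_substitute(out)
    return out

cfg = {"key1": "value", "key2": "$key1/suffix"}
print(interpolate(cfg))  # {'key1': 'value', 'key2': 'value/suffix'}
```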

> What is the keyword for null?

If you have "x = 1", you can always comment it out with "# x = 1". There's no specific support for null.

It would be kinda silly to have it anyways, since null is a language construct more than anything else; e.g. some languages support mixed strings and nulls in one array, but many don't.
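In practice that usually means treating absence as null and applying defaults at read time, e.g. (keys hypothetical):

```python
# Imagine this dict came from a TOML parser; "retries" was commented out.
parsed = {"timeout": 30}

timeout = parsed.get("timeout", 30)  # present -> 30
retries = parsed.get("retries")      # absent  -> None, the app-level "null"
```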


How does toml compare to Google protobufs as a configuration system?


Protocol Buffers is a data serialization format and unsuitable for configuration, as the binary serialized data is not human readable.


Does TOML have a schema?


Not currently, but it's something I'd like to add as a separate spec sometime later.


In case you're curious, one of the Clojure libraries I found while researching TOML seems to have an ABNF grammar. It sits at the top of this file:

https://github.com/lantiga/clj-toml/blob/0.4.0-instaparse/sr...

What a lovely language! I may have to use it..


> Whitespace means tab (0x09) or space (0x20).

> Newline means LF (0x0A) or CRLF (0x0D0A).

Complicating things from the start. Not a good sign.


Well some (many, actually) people use Windows too, and some prefer tabs and some prefer spaces; I can’t see why this is a problem?

A language for configs is different from a data interchange language: the former is intended to be written and edited by humans.

See https://arp242.net/weblog/json_as_configuration_files-_pleas...


Indentation is not significant.

And you can't expect Windows users to always check that their editor is LF-compliant. TOML is not just for programmers.


(2013)



Yes, we just released it two days ago! I know v1.0.0 has been a long time coming, but it's important to me that we get it right, as specs have a very long-lasting impact (much more so than a specific version of a library). We are indeed working hard towards a proper 1.0 though!


Something to consider for 1.0.0: a hexadecimal floating point value type. I don't think it's in there yet, but if it isn't: https://www.effectiveperlprogramming.com/2015/06/perl-v5-22-...

They're incredibly useful if you need to specify an exact floating point value without worrying about rounding or precision issues.

Other languages than Perl support them, but that article does a good job of demoing them.
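To illustrate with Python, which exposes the same notation through `float.fromhex`/`float.hex`: the hex form pins down the exact binary value, so there's no ambiguity about which double you meant.

```python
# 0.1 is not exactly representable in binary; the hex literal names the
# exact double that "0.1" rounds to.
x = float.fromhex("0x1.999999999999ap-4")
print(x == 0.1)      # True
print((0.25).hex())  # 0x1.0000000000000p-2 (exactly representable)
```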


Congrats on the release! I opened an issue in LXIY (Learn X in Y Minutes). I will look into this new version and update the tutorial :)

https://github.com/adambard/learnxinyminutes-docs/issues/315...


Used to be called "INI file" in the past...

In any case I'm glad to see it in several open source projects (Mailtrain for example). It's so much easier to read for humans than JSON.


Yeah, TOML is definitely INI inspired, but there is no canonical INI spec. I wouldn't say that TOML is INI, though. INI files still exist in their variously poorly specified ways.


As mentioned in the source link, the INI format is not standardized. This is an attempt at a standardized format similar to INI.



