
Toml: Tom's Obvious, Minimal Language - wheresvic1
https://github.com/toml-lang/toml
======
mojombo
Hey, Tom here (creator of TOML). Fun to see TOML on HN again! Since I first
wrote a (mostly) joke proposal for TOML 5 years ago, TOML has been adopted by
a number of prominent projects such as Cargo, Hugo, Pipenv, and others.

TOML is especially well suited for projects that need a simple configuration
file that maps unambiguously to a hash table. There are still some weaknesses
in TOML that make it non-optimal for large, complex config, but I'm hoping to
address that in a later version of the spec (perhaps 2.0).

Happy to answer any questions you all have about TOML!

~~~
gregn610
A killer feature of TOML compared to JSON is that it allows comments. A config
file without comments and examples ain't great. I found the double square
bracket syntax useful and understandable. Agreed it's not obviously .INI or
perfectly elegant but it certainly works and has its use-cases. Anyways, thank
you @mojombo!

~~~
mlthoughts2018
I used to feel this way, and also used to be frustrated about multi-line
strings in JSON. With years of experience now, though, I actually appreciate
JSON omitting these features.

Config files should absolutely not have or need comments. If you need them
directly in the config file, something is wrong. Applications should document
their default settings in a different way, preferably in a README or generated
documentation that also explains how to use environment variables to override
the defaults. That sort of separate companion doc is the right place for notes
about defaults or "why" certain config values exist in the file. The same is
true for using JSON to store parameter files, etc. It's actually quite
important to keep metadata _about_ the config / params / etc. specifically
_out of_ those files, so that they are absolutely nothing but _value_ files.
Information about _why_ a file contains those values belongs elsewhere, and
it's an anti-pattern IMO to rely on comments _in_ the config / param file.

~~~
Waterluvian
I have never disagreed with someone more than I do now. =)

Config absolutely needs comments. Context is everything. Comments allow me to
explain to other humans why the config is the way it is. Dumping that out to a
separate file is begging for it to fall out of sync when there's no comment
instructing anyone to go and update the other file. Plus that's just kind of
silly.

~~~
mlthoughts2018
I disagree. It’s an anti-pattern. For example, if you’re writing an
application that loads a default config file to populate parameters at run
time, then the software module that loads from the default file is the correct
place to document it, because the meaning of defaults is relevant to that
source code, not at all to someone reading the parameter file itself. A
parameter file is just some blob of stuff.

I agree context is everything, and that’s _why_ it’s a bad idea to embed usage
info or instructions about the contents or meaning of a parameter file into
that very file.

Somewhere else, something has to choose to load _that_ file, and _that_ is
where the documentation belongs (in addition to readable, separate artifacts
that are generated from the file).

For example, suppose you need to rewrite the parameter file from YAML to Toml,
or you need to add a new layer of nesting and some post-processing logic at
load time.

The meaning of these things has no context inside the parameter file itself.
It only has meaning at the point some other system consumes it. Another system
could consume the exact same file and choose to interpret all the parameters
with different meanings in that program, regardless of what any comments in
the param file say.

~~~
__david__
Maybe that makes sense for the app you are writing that is only consumed by
others at your company, but if I'm installing your app from a package manager
I absolutely do _not_ want to have to read the config loading source code to
figure out what all the parameters do. And even if the docs are nicely
described in a man page and not in config file comments, I absolutely want to
be able to comment on particular parameter changes ("# had to up foo to 18
because bar was frobbing baz at too high a rate").

Even if you use configuration management it's still nice to be able to comment
things in config files since they show up both in the source and in the output
files, which is very helpful when debugging.

~~~
mlthoughts2018
I specifically meant that comments aren’t a good idea in config files when the
config files are distributed to _any_ end users as part of some app or package
installation.

> “And even if the docs are nicely described in a man page and not in config
> file comments, I absolutely want to be able to comment on particular
> parameter changes ("# had to up foo to 18 because bar was frobbing baz at to
> high a rate").”

This is the exact anti-pattern that happens as a result of relying on comments
in the config file. Your goal of adding that comment about why you changed foo
directly in the config file is dangerous and is a very bad practice.

Instead, whatever config file you are modifying (whether for third parties to
consume or just for your own local Postgres or video game or anything) should
be turned into a proper package. Place it in a version control repo, add a
README and usage documentation for the “why” of the parameter values, and make
a tool so that if the end user wants that exact config file, they can
“install” it.

For any customized overrides of single settings at run time (so specifically
not changes to a config file), it should happen via the end user overriding
ENV variables, not mucking around in config files and trusting ad hoc comments
found inside them.

------
niftich
The news here is (presumably) the announcement of version 0.5.0 a few days
prior [1], which includes several clarifications, rollup changes, and
enhancements that accumulated over the last 3 years.

In my opinion, one of the more useful features was the addition of Joda-style
datetimes [2][3][4][5], which disambiguate between various kinds of datetime
constructs that aren't interchangeable yet are commonly conflated. It's fair criticism
that the addition of rich datetime types exceeds the language's original
'minimal' goal, but too many other languages and formats these days just punt
to RFC3339 and leave no guidance or tooling on how to represent dateless times
or timeless dates without introducing a ton of side-effects. This is a place
where language standard libs, with few exceptions, have repeatedly dropped the
ball, and similarly, language or library-agnostic, generic guidance is nowhere
to be found.

TOML raises the bar here, by providing a concept and notation to specify these
values at rest, and gives parser writers, as opposed to the users, the task of
finding a way to represent these values in whatever way is idiomatic for the
given language.

[1] [https://github.com/toml-lang/toml/blob/master/CHANGELOG.md](https://github.com/toml-lang/toml/blob/master/CHANGELOG.md)

[2] [https://github.com/toml-lang/toml/pull/414](https://github.com/toml-lang/toml/pull/414)

[3] [https://github.com/toml-lang/toml/pull/362](https://github.com/toml-lang/toml/pull/362)

[4] [https://github.com/toml-lang/toml/issues/412](https://github.com/toml-lang/toml/issues/412)

[5] [https://github.com/toml-lang/toml/issues/263](https://github.com/toml-lang/toml/issues/263)

~~~
mojombo
Glad you like them, we spent quite a bit of time getting them right! Datetimes
are a horrible mess of complexity, but hopefully over time languages and tools
around them can standardize on a set of common primitives to make all our
lives a bit less horribly messy. =)

------
losvedir
I just started a rust project and so had to learn TOML since that's what the
package manager (and a lot of the ecosystem) uses.

For some reason, though, I just struggle with the syntax. It says it's
"obvious" but it wasn't to me. I think it says something that the README is
full of "this TOML would be represented like this JSON", to help you
understand what's going on. Every time I saw that, I was like "oh, now I get
it." I don't know if that means JSON is just inherently more understandable,
or I'm just more used to it, though.

Are there obvious downsides to JSON for config that I'm missing? What are the
advantages of TOML over JSON? Maybe eventually it'll "click", though.

I think the following mean the same thing:

    
    
         [[foo]]
         bar = {baz = 5}
    
         [[foo.bar]]
         baz = 5
    

But I don't think the following works:

    
    
         [[foo]]
         [[bar]]
         baz = 5
    

That is, I _think_ the double bracket syntax always starts at the top level?
While the `=` are relative to the double brackets above it? Something about
all that is non-obvious to me, and I don't love that there's multiple ways to
do the same thing.

~~~
Pxtl
Yeah, most of TOML looks good, but the .INI-style [table] and, even weirder,
the [[table-array]] thing seem like they were trying to hammer a square peg
into a round hole for the sake of having config files that look like INIs.

~~~
scrollaway
Backwards compatibility with INI is, in my opinion, one of the more powerful
aspects of TOML, and it helps adoption. Kinda like how UTF-8 is
backwards-compatible with ASCII.

------
afraca
Does anyone know why such a "joke proposal" (as stated by the author himself)
was chosen for pretty significant projects like pip and cargo? (edit: I tried
to say it began as something small and personal)

(Well, I guess there's not that much to win/lose in the area of config
languages, but still) (Also, I think it works pretty well, so this is not to
downplay TOML!)

~~~
mojombo
Well, it was only a joke to begin with. I couldn't stand the complexity or
ambiguity of YAML and one night I had a few drinks and banged out my thoughts
on something better. When people started writing implementations, I realized
that TOML might actually have legs and started tightening it up and removing
the snarky bits. I guess these projects felt the same pain I did about config
files and presto!

I think the moral of the story is: just put your whacky ideas out there and
see what happens. You never know when you'll hit a chord.

~~~
afraca
Thanks for the response. I edited my response to be less snarky. Good job and
thanks for the moral of the story.

------
wiradikusuma
For more complex config, I highly recommend HOCON
([https://github.com/lightbend/config/blob/master/HOCON.md](https://github.com/lightbend/config/blob/master/HOCON.md)),
and its parser in Java
([https://github.com/lightbend/config](https://github.com/lightbend/config)).

It feels like Scala (Lightbend is the company behind Scala) in the sense that
there are 100 ways to achieve the same thing and having different levels of
"code elegance", but for me it's a plus, since I'm a Scala fan.

------
Nadya
Since Tom is browsing, I have a question - and this is my one and only major
issue with TOML.

    
    
         # THIS IS INVALID
         a.b = 1
         a.b.c = 2
    

Why? I'm sure there is very good reasoning - but it makes me have to reason
about my data in a manner I consider backwards/confusing.

    
    
         name.first = "Bob"
         name.last = "Smith"
         # Can't do this because name.first has already been defined 
         # name.first.alternative = "Robert"
         # This is too ambiguous 
         name.alternative = "Robert"
         # And this is backwards to me
         name.alternative.first = "Robert"
    

Some people might suggest to instead do

    
    
         first.name = "Bob"
         last.name = "Smith"
         alternative.first.name = "Robert"
    

But now nothing is scoped to [name] and if I need to get the full name I can't
just pull in [name] but need to pull in [first], [last], and [alternative].
That's really messy in my opinion. All of these are names and should be scoped
to [name] and not their own structure.

~~~
latk
Because the "a.b" is not a key, but a table "a" that contains an entry with
key "b". If you write "a.b = 1" then that integer is not a table, and can't
have an entry called "c".

Because it avoids these problems, your "name.alternative.first" example seems
perfectly sensible to me. Alternatively, the name might be an array:

    
    
        [[name]]
        first = "Bob"
        last = "Smith"
    
        [[name]]
        first = "Robert"
    

It's probably best to think of TOML as convenient syntax for creating a JSON-
like structure. What kind of JSON would you expect as the result of your
config examples?

~~~
Nadya
_> Because the "a.b" is not a key, but a table "a" that contains an entry with
key "b". If you write "a.b = 1" then that integer is not a table, and can't
have an entry called "c"._

Your explanation is perfect and I understand the justification why it isn't
possible now - so thanks for that! The array alternative provided by xinau is
also an acceptable replacement - maybe even better to be honest, especially in
this particular scenario.

 _> It's probably best to think of TOML as convenient syntax for creating a
JSON-like structure. What kind of JSON would you expect as the result of your
config examples?_

I came back from lunch and realized what I was trying to do didn't make sense
while trying to convert it to JSON. Having name.first be both a value and
contain an object doesn't make sense - unless it were to contain an array
value but then why have the object and not just have the value? All very
silly. So I guess my "only issue with TOML" is that I never sat down and tried
to express what I wanted to express in any other way. It made a lot more sense
in my head than on paper. :)

------
timvisee
Cool. TOML is used quite extensively in Rust projects. I think it's awesome
for very simple configurations. And newcomers don't have to learn anything in
order to change a TOML configuration, which is very powerful.

~~~
mojombo
Yeah, Cargo and Rust have been really important in TOML gaining adoption. Love
the Rust community!

------
tpaschalis
Just dropping a useless comment to say thanks! Between Tom's work on GitHub,
Jekyll, and TOML, I think he has influenced a vast number of developers!

For the projects I've used TOML on, it was a nice breath of fresh air and a
terrific improvement over JSON (still mad about JSON's lack of comments).
Simplicity wins!

~~~
erikpukinskis
Not to mention inventing MySpace!

------
epage
Originally, I was fully on board with the homogeneous array requirement, but
it has recently started causing me pain.

Homogeneous arrays can come in the following forms

\- Shallow, literal type (a list of lists, regardless of what the nested lists
contain)

\- Full, literal type (a list of dicts of strings)

\- Logical type

There are many times where a list is the best type for my data but I want to
take advantage of logical types for easier configuration by my users.

Below is an example of what I mean by "homogeneous logical types":

`Cargo.toml` has you specify dependencies using a dict

    
    
      [dependencies]
      foo = { "version" = "1.0" }
    

but allows a short-cut syntax where a string value is assumed to be the
version value in the above dict.

    
    
      [dependencies]
      foo = "1.0"
    

Generally this is done in Rust using Enums

    
    
      enum Dependency {
         Version(String),
         Specification(HashMap<String, String>),
      }

~~~
mojombo
Homogeneous arrays are partly to make implementations easier, and because if
you really need that flexibility, you can always use an array of inline
tables, which has the benefit of giving each sub-element a name, hopefully
increasing the obviousness.

I'm not sure I understand your complaint, though, can you give me another
example of how it's biting you in real life?

~~~
epage
> you can always use an array of inline tables, which has the benefit of
> giving each sub-element a name, hopefully increasing the obviousness

Except the name would be duplicated with the content I'm storing.

As for an example, it's effectively Cargo. My use case is very similar
(dependency reporting) but my content is slightly different (as I said, the
value would effectively duplicate the key)

~~~
mojombo
Oh, you mean you want something like:

    
    
      list = [
        "1.0",
        "2.7",
        { version = "1.4", path = "..." },
        "9.9",
      ]
    

Is that correct?

~~~
epage
Something akin to that, yes.

------
otterpro
I didn't know about TOML until I started using Hugo. I've been using YAML and
TOML and I find both have their merits, especially for their simplicity. TOML
looks like the classic INI file, and is fairly easy to use/learn. I've been
trying to move toward TOML for everything. YAML is nice, but I had to be
careful about white-space significance and also translation of "YES","NO",
etc.

[https://arp242.net/weblog/yaml_probably_not_so_great_after_a...](https://arp242.net/weblog/yaml_probably_not_so_great_after_all.html)

~~~
mojombo
Absolutely. That's what I mean about TOML mapping unambiguously to a hash
table. Strings in TOML are always quoted. There is no fuzzy interpretation of
things like YES and NO. That way madness lies. I also am not a fan of
meaningful whitespace, which is why TOML doesn't do that. Glad you're finding
TOML useful, good luck on your projects!

~~~
Pxtl
> There is no fuzzy interpretation of things like YES and NO

I think the big problem is that YAML is dynamically typed. If I had schema-
enforced config files, I'd be perfectly happy to say that _for boolean typed
data_ all the values of YES and NO and ON and OFF and T and F and 0 and 1 can
all be reasonably interpreted as a Boolean True and False. The problem happens
when a string-typed or integer-typed member can _also_ have their ON value
interpreted into Boolean TRUE.

~~~
mojombo
Yeah, I'll agree with that, among many other issues with YAML. =) A schema
would at least solve that problem, but I don't think most simple config users
want to define a schema, so strong typing is a better solution.

------
thanatropism
Funny enough, I learned how to/started to use TOML today. I was running some
[redacted] experiments using a .pyc compiled script and wanted to experiment
with parameters -- but then I had to compile the script again and again.

Enter TOML: I can paste a bunch of "var_x = 2"-type statements (there's like
50) directly from Python, read them as a dictionary and find-replace all
appearances in like 5 minutes while I'm waiting for an Uber.

Thanks Tom!

~~~
mojombo
Awesome, sounds like the perfect use case for TOML!

------
EamonnMR
It's surprisingly easy to add it as an optional drop-in, ex:
[https://github.com/EamonnMR/OpenLockstep/blob/master/data.py...](https://github.com/EamonnMR/OpenLockstep/blob/master/data.py#L31)

------
andris9
I am using TOML in most of my projects and once the configuration files grew
too large I just extended the syntax with “#@include filepath” declarations
and split up the config files. Works great.
[https://github.com/nodemailer/wild-config/blob/master/README...](https://github.com/nodemailer/wild-config/blob/master/README.md#toml-extensions)

------
amelius
I'm not a fan of configuration files. Instead, I prefer the style of "calling"
the program from within a script, and passing the configuration as parameters.
This also allows passing e.g. callback functions. Imho, this is much more
flexible, and the advantages grow over time, whereas configuration files tend
to accumulate awkward/convoluted constructs as the software matures.

------
skrebbel
If only package.json had been package.toml

~~~
jxub
Would that be hard to implement for npm/yarn? I'm sure that transforming TOML
to JSON on the fly could be added as a step somewhere, resulting in changes to
the package manager or just a wrapper command.

Stylistically, I agree with you, as TOML is clean and well thought out.
Moreover, TOML supports data types that JSON lacks, and that gap leads to ugly
workarounds (floats as strings, anyone?). The main advantage of JSON is that
its encoding/decoding is included in JS as it is, and it's generally deeply
ingrained in the Node community.

~~~
skrebbel
It would not be trivial because npm/yarn don't just read package.json but also
modify it. See numerous GitHub issues in either repo about supporting comments
inside package.json.

Of course nothing is impossible, but I think that the ship has sailed. I was
just dreaming of what could have been :-) Subtle details in NPM's behavior
would've probably been designed differently if it was necessary to make
updates to the package file without destroying layout and context.

------
kaushalmodi
Here's the TOML parser for the Nim language:
[https://github.com/NimParsers/parsetoml](https://github.com/NimParsers/parsetoml).

It works great based on my small amount of testing, and TOML is awesome! I
wish more people would start using it.

------
somedudeatwork
I never really questioned the name, TOML, but I guess "Tom's Obvious, Minimal
Language" works

~~~
gnuvince
To me, there was nothing obvious about the double bracket syntax, e.g.:

    
    
        [[designers]]
        name = "Guido"
        lang = "Python"
    
        [[designers]]
        name = "Larry"
        lang = "Perl"

~~~
cies
Yups. And making a whole datetime RFC part of the spec kind of broke with the
"minimal" thing. But apart from that I totally prefer it over
INI/YAML/JSON/XML for many purposes.

~~~
wvh
I guess there's no way around that though, if you want dates to be first class
members. Can't do "a bit of date".

~~~
mojombo
We added a proper datetime type because the only thing worse than having one
is not having one. If you had to supply a datetime as a string, every TOML
file would have a different way of doing it, which is...not so obvious.

------
amai
The syntax for tables is awkward compared to CSV (This is actually also true
for XML, JSON, YAML).

------
mchahn
Should it be called a "language" when it has grammar but no semantics? (just
curious)

------
miguelrochefort
Looks a lot like Rebol/Red.

------
noisy_boy
Some questions I couldn't find answers for:

\- Is interpolation supported? e.g. "key1" = "value" and then "key2" =
"$key1". That would be very useful in avoiding repetition.

\- What is the keyword for null? e.g. when I want to set the value to null

~~~
TheDong
> interpolation?

No, there is no facility for variable templating, interpolation, references,
or anything of the like in TOML.

If you wanted that, you could implement it in your application by doing post-
processing on strings, or you could not use TOML.

> What is the keyword for null?

If you have "x = 1", you can always comment it out with "# x = 1". There's no
specific support for null.

It would be kinda silly to have it anyway, since null is a language construct
more than anything else, and e.g. some languages support mixed strings + nulls
in one array, but many don't.

------
anonu
How does toml compare to Google protobufs as a configuration system?

~~~
mojombo
Protocol Buffers is a data serialization format and unsuitable for
configuration, as the binary serialized data is not human readable.

------
exabrial
Does TOML have a schema?

~~~
mojombo
Not currently, but it's something I'd like to add as a separate spec sometime
later.

~~~
firepoet
In case you're curious, one of the Clojure libraries I found while researching
TOML seems to have an ABNF grammar. It sits at the top of this file:

[https://github.com/lantiga/clj-toml/blob/0.4.0-instaparse/sr...](https://github.com/lantiga/clj-toml/blob/0.4.0-instaparse/src/clj_toml/core.clj)

What a lovely language! I may have to use it..

------
TooBrokeToBeg
> Whitespace means tab (0x09) or space (0x20).

> Newline means LF (0x0A) or CRLF (0x0D0A).

Complicating things from the start. Not a good sign.

~~~
boramalper
Well some (many, actually) people use Windows too, and some prefer tabs and
some prefer spaces; I can’t see why this is a problem?

A language for configs is different from a data interchange language: the
former is intended to be written and edited by humans.

See [https://arp242.net/weblog/json_as_configuration_files-_pleas...](https://arp242.net/weblog/json_as_configuration_files-_please_dont)

------
laurent123456
(2013)

~~~
aloisdg
v0.5.0 is a new release!

Changelog: [https://github.com/toml-lang/toml/blob/master/CHANGELOG.md#0...](https://github.com/toml-lang/toml/blob/master/CHANGELOG.md#050--2018-07-11)

~~~
mojombo
Yes, we just released it two days ago! I know v1.0.0 has been a long time
coming, but it's important to me that we get it right, as specs have a very
long-lasting impact (much more so than a specific version of a library). We
are indeed working hard towards a proper 1.0 though!

~~~
simcop2387
Something to consider for 1.0.0, a hexadecimal floating point type value. I
don't think it's in there yet, but if it isn't
[https://www.effectiveperlprogramming.com/2015/06/perl-v5-22-...](https://www.effectiveperlprogramming.com/2015/06/perl-v5-22-adds-hexadecimal-floating-point-literals/)

They're incredibly useful if you need to specify an exact floating point
value and care about rounding or precision issues.

Languages other than Perl support them, but that article does a good job of
demoing them.

------
dvfjsdhgfv
Used to be called "INI file" in the past...

In any case I'm glad to see it in several open source projects (Mailtrain for
example). It's so much easier to read for humans than JSON.

~~~
mojombo
Yeah, TOML is definitely INI inspired, but there is no canonical INI spec. I
wouldn't say that TOML is INI, though. INI files still exist in their
variously poorly specified ways.

