
YAML: Probably not so great after all - kylequest
https://arp242.net/yaml-config.html
======
ivan4th
From my experience, while YAML itself is something one can learn to live with,
the true horror starts when people start using text template engines to
generate YAML. Like it's done in Helm charts, for example,
[https://github.com/helm/charts/blob/master/stable/grafana/te...](https://github.com/helm/charts/blob/master/stable/grafana/templates/deployment.yaml)
Aren't these "indent" filters beautiful?
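
To make the indentation problem concrete, here is a minimal sketch, not Helm's actual mechanism: `render_with_template` and `render_from_data` are hypothetical names, and the fragment is invented for illustration. It contrasts pasting text into a template with serializing a data structure. It assumes the consumer accepts JSON as YAML, which YAML 1.2 is designed to allow.

```python
import json
from string import Template

# Text templating: the template author has to know the indentation depth
# at which each substituted value lands. (Hypothetical fragment.)
TEMPLATE = Template("""\
spec:
  containers:
    - name: grafana
      env:
$env
""")


def render_with_template(env_block: str) -> str:
    # Nothing checks that env_block is indented to match its surroundings.
    return TEMPLATE.substitute(env=env_block)


def render_from_data(env: dict) -> str:
    # Build the structure first and serialize once; nesting comes from the data.
    doc = {
        "spec": {
            "containers": [
                {
                    "name": "grafana",
                    "env": [{"name": k, "value": v} for k, v in env.items()],
                }
            ]
        }
    }
    return json.dumps(doc, indent=2)


broken = render_with_template("- name: GF_PATHS_DATA\n  value: /var/lib/grafana")
print(broken)     # the env entries start at column 0, outside `env:`
generated = render_from_data({"GF_PATHS_DATA": "/var/lib/grafana"})
print(generated)  # nesting is correct by construction
```

The "indent" filters exist to patch the first approach; the second never needs them.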

~~~
PopeDotNinja
> the true horror starts when people start using text template engines to
> generate YAML

I just had a shiver recalling a Kubernetes wrapper wrapper wrapper wrapper at a
former job. I think there were at least two layers of mystical YAML generation
hell. I couldn't stop it, and it tanked much joy in my work. It was a factor
in me moving on.

~~~
msoad
Oh my god! I'm working on the same wrapper wrapper wrapper

~~~
PopeDotNinja
I called ours The Yamlith. Name yours!

~~~
DonHopkins
What was the straw that broke the YAML's back?

~~~
codeduck
get out.

------
jchw
The issue is, I think most people (myself included) enter YAML into their
lives as basically a JSON alternative with lighter syntax. Without really
realizing, or perhaps without internalizing, the rather ridiculous number of
different ways to represent the same thing, the painful subtle syntax
differences that lead to entirely different representations, the sometimes
difficult to believe number of features that the language has that are seldom
used..

It's not just an alternate skin for JSON, and yet that's what _most_ people use
it for. Some users also want things like map keys that aren't strings, which
is actually pretty useful.

I recall there being CoffeeScript Object Notation as well... perhaps that
would've been better for many use cases, all things said.
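
On the point about map keys that aren't strings, a small sketch of what JSON does with them, using Python's standard `json` module (the `ports` mapping is an invented example):

```python
import json

ports = {80: "http", 443: "https"}  # integer keys

encoded = json.dumps(ports)
decoded = json.loads(encoded)
print(encoded)  # the keys are written out as strings
print(decoded)  # and they are still strings after the round trip

# Composite keys cannot be represented at all.
try:
    json.dumps({("tcp", 80): "http"})
    tuple_key_ok = True
except TypeError:
    tuple_key_ok = False
print(tuple_key_ok)
```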

~~~
norcalli
I've never understood this. JSON is really _not_ that difficult to work with
manually. I tend to write my config files as JSON for utilities I write. What
is it with peoples' innate aversion to braces?

~~~
zzo38computer
I don't have an aversion to braces. Rather, my issues with JSON are that it
doesn't have comments and that you cannot use an optional trailing comma.

~~~
JoBrad
And the required double quotes around strings. YAML’s string handling is a lot
easier to deal with.

~~~
zzo38computer
I think it is good to require quotation marks for strings, at least for values
(although I could live with it if quotation marks for strings are allowed even
if not required, since then, if you do not like the feature of not having
quotation marks for strings, you can just not use that feature).

Maybe it would make sense if quotation marks were not required for keys with
only a restricted character set which are not an empty string, though.

~~~
JoBrad
No quotes around keys would be sufficient, honestly. I use YAML a lot for API
documentation, and there are still some cases where wrapping your values in
quotes is necessary. But requiring it for keys becomes very annoying.

------
airencracken
[https://noyaml.com/](https://noyaml.com/)

YAML is bad.

Every YAML parser is a custom YAML parser.

[https://matrix.yaml.io/valid.html](https://matrix.yaml.io/valid.html)

~~~
takeda
The problem is with parsers, how they are implemented or used. YAML actually
has a way to specify the type of the data; alternatively, the application is
supposed to suggest the desired type. What this table is showing is which
types are assumed when they are not specified.
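
A rough sketch of what "types assumed when not specified" means in practice. `resolve_plain_scalar` is a hypothetical helper that loosely follows the YAML 1.1 rules for unquoted scalars; it is not any real parser's code and it deliberately omits null, octal, sexagesimal and timestamp forms. Quoting the value, or tagging it explicitly (for example `!!str no`), bypasses this guessing.

```python
import re

# Hypothetical, simplified resolver in the spirit of YAML 1.1 implicit typing.
_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}
_INT = re.compile(r"[-+]?(?:0|[1-9][0-9]*)")
_FLOAT = re.compile(r"[-+]?[0-9]+\.[0-9]*")


def resolve_plain_scalar(text: str):
    """Guess a type for an unquoted, untagged scalar."""
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    if _INT.fullmatch(text):
        return int(text)
    if _FLOAT.fullmatch(text):
        return float(text)
    return text  # everything else stays a string


for raw in ["no", "1.10", "3.5.1", "42", "hello"]:
    print(repr(raw), "->", repr(resolve_plain_scalar(raw)))
```

A country code `no` becoming a boolean and a version `1.10` becoming the float `1.1` are the kind of surprises the matrix above documents.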

------
felixfbecker
I'll say it: I think YAML is great and a joy to use for configuration files. I
can write it even with the dumbest editor, I can write comments, multi-line
strings, I can get autocompletion and validation with JSON schema, I can share
and reference other values. It allows tools to have config schemas that read
like a natural domain specific language, but you already know the syntax. I
haven't had problems with it at all.

~~~
martinpw
This was me too - until yesterday, when I made a minor change to one of our
YAML config files and everything broke. On investigation it turned out that
all of our YAML files had longstanding errors but those errors happened to be
valid syntax and also did not cause any bad side effects, so we had been
getting away with it by pure luck until I made a change that happened to
expose the problem.

So now no longer a YAML fan...

~~~
dragonwriter
That would make me not a fan of the particular parsers/validators I've been
using, rather than not a fan of YAML.

The big strike against YAML I see there is that it needs a good conformance
test suite and implementations need to be tested against it. But that's not a
problem with the format but a fairly easy to fix ecosystem problem.

~~~
Izkata
> of the particular parsers/validators

But the syntax was valid, the parsers/validators would've been correct to
accept it.

------
ridiculous_fish
fish shell is looking for a new text serialization format for its history file
(currently it uses an ad-hoc broken pseudo-YAML).

Boxes to check:

1\. Self describing format

2\. SAX-style parser available to C++

3\. Easy for users to understand and ad-hoc parse using command-line tools

4\. No document closing necessary, so appending is trivial

YAML looks pretty good:

    
    
        - cmd: git checkout file.txt
          when: 1565133286
          pwd: /home/me/dir/
          paths:
          - file.txt
    

protobuf is also an option:

    
    
        entry {
          cmd: "git checkout file.txt"
          when: 1565133286
          paths: "file.txt"
        }
    

though I am unsure of how well its text serialization is supported.

Any suggestions?

~~~
theli0nheart
TOML?

~~~
ziotom78
TOML would be great, if not for an annoying obscure detail in the
specification that makes it hard to use for my typical use cases (scientific
computation) [1]. Moreover, I find quite unintuitive how you are supposed to
specify an array of tables [2]: this kind of thing is much easier in JSON (which is the
format I am currently using, although it is far from perfect).

[1] [https://github.com/toml-lang/toml/issues/356](https://github.com/toml-lang/toml/issues/356)

[2] [https://github.com/toml-lang/toml#user-content-array-of-tabl...](https://github.com/toml-lang/toml#user-content-array-of-tables)

~~~
pitaj
That is an annoyingly obscure detail, I think the wrong decision was made
there. Hopefully it gets reversed.

Personally I really like the array of tables syntax. It is a little
unintuitive but it's not difficult to remember. It's useful for fulfilling the
OP's "No document closing necessary, so appending is trivial" requirement.

And if you don't want to use it, you can always use inline tables in an array,
just like JSON.

------
fabian2k
I've used YAML as the format for a config file, and I certainly regret that
choice. Trying to explain to someone that doesn't know YAML how to edit it
without setting them up for failure is quite annoying. There are too many non-
obvious ways to screw up, like forgetting the space after the colon or of
course bad indentation.

~~~
booleandilemma
Giving meaning to whitespace causes so many headaches and yet people still
embrace Python, for some reason. I don’t understand it.

~~~
whatshisface
Your editor makes a world of difference here. Since you shouldn't be writing
brace-language code without indents anyways, the biggest issue remaining is
mixing tabs and spaces. Gedit makes this a big pain with its default config
(it doesn't even auto-indent) but Atom and IDLE handle it well.

~~~
userbinator
Code you write yourself is not usually the source of problems with significant
whitespace; it's situations like posting code on websites and discussing it
where code in a whitespace-significant language becomes next-to-useless when
leading whitespace is stripped, whereas code in any other language will still
survive and then easily be autoformatted without changing its meaning.

~~~
pshc
Can’t remember the last time this has actually happened to me. In what
websites are people posting code without code block formatting support? Like,
instant messengers?

~~~
contradictioned
Fun fact, even Facebook, WhatsApp, and Telegram support preformatted text in
triple backticks.

------
al_form2000
As an ansible user, I hate YAML and its broken parsers with a passion, but the
security objection does not make much sense. It does apply verbatim to any
parser of anything if the implementation decides that a given label means
"eval this content right away". I fail to see how this can be a fault of the
DDL rather than the parser's.

~~~
JelteF
The reason this is a fault of the DDL and not the parser is that the DDL spec
decides that it has a label that evaluates a command. The parser then has two
options: either implement it or not conform to the spec (essentially
implementing a different DDL). For programming languages it makes sense to
have an eval label/command. For configuration/serialization DDLs I think it's
a terrible choice.

~~~
al_form2000
And terrible it is indeed, but I cannot find it specified - the strings eval,
exec, command, statement do not even occur in the official specs (shallow doc
perusal, I know)

~~~
SignalsFromBob
That's because there's nothing in the spec stating anything about execution.
The parent is simply incorrect. That's why they haven't responded.

------
crazygringo
So what's the HN consensus on the best format for config files?

Is it TOML as the author seems to prefer at the end?

~~~
markmark
I say just use JSON. Everyone knows it already and it's good enough. Use a
parser in your app that allows comments and trailing commas like vscode does.
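
A sketch of what such a lenient parser could look like: `strip_jsonc` is a hypothetical pre-processing step (not what vscode actually does internally) that drops `//` and `/* */` comments and trailing commas outside of strings, then hands the result to the standard `json` module.

```python
import json


def strip_jsonc(text: str) -> str:
    """Remove comments and trailing commas that sit outside string literals."""
    out: list[str] = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])  # keep the escaped character verbatim
                i += 1
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            end = text.find("\n", i)
            i = n if end == -1 else end  # the newline itself is kept
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            if ch in "}]":
                # Drop a comma that directly precedes the closing bracket.
                j = len(out) - 1
                while j >= 0 and out[j].isspace():
                    j -= 1
                if j >= 0 and out[j] == ",":
                    del out[j]
            out.append(ch)
            i += 1
    return "".join(out)


source = """
{
  // where to listen
  "url": "http://localhost:8080/",   /* the // inside the string survives */
  "tags": ["a", "b",],
}
"""
settings = json.loads(strip_jsonc(source))
print(settings)
```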

~~~
dagenix
That's not JSON anymore, that's some custom format that's JSON inspired.

~~~
skybrian
Yep. Some kind of JSON++ is where we're headed. Hopefully we can agree on a
new standard someday?

(No, not YAML.)

~~~
tln
JSON5.org would be nice :)

------
mckinney
Regardless of the reasoning laid out in the OP, it's difficult to argue in
YAML's favor comparing it with JSON. I'm not an ardent fan of JSON either --
both YAML and JSON have issues wrt inconsistencies:

\- what draft of JSON Schema are you using 4? 7? Neither?

\- what version of Swagger or OpenAPI are you using?

\- etc.

Sure, it's great to see ongoing development of schemas, but with each new
development we have yet another dialect to consider/support.

In my view, perhaps an even greater problem with structured data formats in
general is the void that separates them from programming languages esp. static
languages such as Java where static type information is otherwise leveraged.
The industry standard solution, code generation, is awful in almost every
respect. The Manifold framework looks promising in this regard
([http://manifold.systems/](http://manifold.systems/)).

~~~
roryokane
JSON Schema is not affiliated with JSON and should not be confused with it.
JSON is a data format, like YAML, and there is only one version of it: the
spec at [http://json.org/](http://json.org/).

~~~
bmn__
> there is only one version of [JSON]

Sadly, this is utterly wrong.

[https://tools.ietf.org/html/rfc8259](https://tools.ietf.org/html/rfc8259)
[https://tools.ietf.org/html/rfc7159](https://tools.ietf.org/html/rfc7159)
[https://tools.ietf.org/html/rfc7158](https://tools.ietf.org/html/rfc7158)
[https://tools.ietf.org/html/rfc4627](https://tools.ietf.org/html/rfc4627)
[http://www.ecma-international.org/publications/files/ECMA-ST...](http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf)
[https://www.iso.org/standard/71616.html](https://www.iso.org/standard/71616.html)

------
beders
If you still aren't convinced YAML is terrible, try copying and pasting YAML
fragments with a regular text editor.

You might end up with valid YAML, but you won't know until the YAML consumer
barfs.

BTW, all of a sudden XML with DTDs is looking sane again :)

~~~
KaoruAoiShiho
Use the right tool for the job. I use yaml extensively but never in a
situation where someone would want to edit it with a regular text processor.

~~~
beders
do you deliver a YAML editor with your software? Because people will use
notepad or nano to edit that stuff.

~~~
KaoruAoiShiho
Sort of, I deliver a GUI that exports into YAML for pretty much only reading,
portability, and version control. People are expected to do the editing in the
GUI, only using YAML for editing when doing complex regex operations that my
GUI doesn't support.

~~~
twic
If people aren't editing it by hand, why does the format matter? Why not just
use JSON? Tools for too-complex-for-the-GUI manipulations are at least as good
for JSON as they are for YAML, and the editing is less error-prone.

~~~
KaoruAoiShiho
Internally it is JSON. It exports as YAML for readability when sharing on
discord.

------
deepsun
I think the author confuses YAML's problems with the problems of his favorite
languages. I bet those problems (at least most of them) are nonexistent in
Java, for example, only because Java programmers are usually more responsible.
The same goes for Haskell or Rust, I think.

But in other languages with notoriously irresponsible coders (JS, PHP) I'd bet
on seeing even more of these problems.

(I coded in all of them)

~~~
xenator
That's exactly the aftertaste the article left me with. Why not just shift the
focus to the point that the author doesn't like Ruby and Ruby frameworks anymore?

During my working life using Python I've had a few meh moments with YAML, and
that is all. I never lost the real joy of using it.

------
leoc
XML is as pleasant to look at or touch as a nettle rash, but it seems it can
join ALGOL 60 among the ranks of technologies which were a great improvement
on their successors.

~~~
gweinberg
Umm, no. You can find cases where JSON sucks, but you have to look for them.
You can find cases where XML doesn't suck, but you have to look for them.

~~~
michaelmrose
Other than looking ugly and being a pain to type does xml actually suck?

~~~
cygned
Is there an agreement on whether it’s

    
    
      <author name="pete" />
    

or

    
    
      <author>
        <name>pete</name>
      </author>
    

yet?

~~~
dbcurtis
Attributes are XML’s foot-gun.

~~~
papaf
_Attributes are XML’s foot-gun._

I disagree. Back in the day we used attributes for everything that was key
value and inner tags for anything with structure. We also formatted for
clarity:

    
    
        <lunch
          env="outside"
          food="sandwiches"
          drink="cola"
        />
    

Compared with what we used to do, I look at attribute-less maven pom.xml with
horror.

~~~
johnchristopher
What if it's to be used by French-speaking software/people?

    
    
        <lunch
          env="outside"
          envfr="dehors"
          food="sandwiches"
          foodfr="sandwichs"
          drink="cola"
        />
    

Or

    
    
        <lunch
          env="outside"
          food="sandwiches"
          drink="cola">
              <label type="env" language="fr">Dehors</label>
              <label type="env" language="de">Außenseite</label>
              <label type="env" language="en">Outside</label>
        </lunch>
    

Quite curious about it.

~~~
yosamino
I guess this way:

    
    
      <lunch
        env="outside"
        food="sandwiches"
        drink="cola">
        <label xml:lang="fr">Dehors</label>
        <label xml:lang="de">Draußen</label>
        <label xml:lang="en">Outside</label>
      </lunch>
    

says [0] the W3C.

[0] [https://www.w3.org/TR/REC-xml/#sec-lang-tag](https://www.w3.org/TR/REC-xml/#sec-lang-tag)
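
For what it's worth, a short sketch of reading both kinds of data back with Python's standard `xml.etree.ElementTree`. The `xml:` prefix is predefined, and ElementTree exposes `xml:lang` under the expanded namespace name held in `XML_LANG` below.

```python
import xml.etree.ElementTree as ET

# The reserved xml: prefix always maps to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

doc = """\
<lunch env="outside" food="sandwiches" drink="cola">
  <label xml:lang="fr">Dehors</label>
  <label xml:lang="de">Draußen</label>
  <label xml:lang="en">Outside</label>
</lunch>
"""

lunch = ET.fromstring(doc)

# Plain key/value data lives in attributes ...
print(lunch.get("food"))

# ... while the per-language labels are child elements keyed by xml:lang.
labels = {label.get(XML_LANG): label.text for label in lunch.findall("label")}
print(labels["fr"])
```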

~~~
johnchristopher
I see, thanks. (I also see you corrected my broken German ^^).

------
tptacek
The first argument, about YAML security, isn't valid; YAML is hardly the only
format whose parsers have admitted deserialization vulnerabilities (they're
endemic in Java; Rails had this problem with XML, and even before ROP-style
deserialization was a thing, XML was getting applications owned up through
external entity definitions).

Format aside, no matter which you choose, you have to pick library interfaces
that don't deserialize to arbitrary, constructed objects.

~~~
afiori
It is a valid criticism when comparing to Json or TOML

------
znpy
I've been working on software that's heavily based on XML, and on a number of
occasions I've been glad XML is strict and verbose.

You can quickly tell if an XML document is malformed (good parsers will
typically point you to the unclosed tag).

Yaml on the other hand would probably load anyway, with the application
receiving garbage data, potentially misdirecting the application behavior...

------
ozten
Another one: parsing a partial YAML file doesn’t raise an error the way
loading an incomplete JSON or XML file does. We’ve had a production outage
because large YAML files got cut off and not all settings were loaded into
our server. A truncated JSON or XML file typically will not parse.
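
A small sketch of the JSON/XML half of that claim, using only Python's standard library (the settings are invented). A block-style YAML mapping cut off at a line boundary, by contrast, is still a complete document, so a parser has nothing to reject.

```python
import json
import xml.etree.ElementTree as ET

full_json = '{"workers": 8, "timeout": 30, "debug": false}'
full_xml = "<settings><workers>8</workers><timeout>30</timeout></settings>"


def parses(loader, text: str) -> bool:
    """Return True if loader accepts text, False on a syntax error."""
    try:
        loader(text)
        return True
    except (json.JSONDecodeError, ET.ParseError):
        return False


# Simulate a file that was cut off halfway through.
cut_json = full_json[: len(full_json) // 2]
cut_xml = full_xml[: len(full_xml) // 2]

print(parses(json.loads, full_json), parses(json.loads, cut_json))
print(parses(ET.fromstring, full_xml), parses(ET.fromstring, cut_xml))
```

The closing brace or closing root tag acts as an end-of-document marker; block-style YAML has no equivalent unless you add one yourself.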

~~~
ailideex
Is this not an issue with a parser rather than with YAML?

~~~
afiori
Unreliable parsers are an issue of yaml.

~~~
ailideex
Why?

------
QuinnyPig
If the JSON and YAML folks can’t get along, I swear I’ll turn this car around
and make you all use XML.

~~~
jackfraser
Aren't we just reinventing the wheel, though? Got your structured data format,
now you need parsers (tons available for XML, incl SAX, DOM parsers,
SimpleXML, Nokogiri...) a schema and validation tools (XSD), a templating
mechanism (XSLT), a query language (XPath), ...

JSON was a reaction to the verbosity of XML, but a better reaction would have
been to work harder on our text editors so that working with XML would be just
as easy as working with JSON in terms of the numbers of keystrokes needed.
Better parser interfaces that help you treat the dataformat more like it's
part of the language would also help (i.e. SAX and DOMDocument suck to work
with, but SimpleXML is almost idiomatic).

~~~
saghm
> JSON was a reaction to the verbosity of XML, but a better reaction would
> have been to work harder on our text editors so that working with XML would
> be just as easy as working with JSON in terms of the numbers of keystrokes
> needed.

Isn't that only solving half the problem? XML is also pretty difficult to read

~~~
EGreg
Look. I am just a web guy.

But why is XML so freaking great? We can’t even tell if whitespace is
significant or not. If a schema says it’s insignificant then that’s that!

[https://www.oracle.com/technetwork/articles/wang-whitespace-...](https://www.oracle.com/technetwork/articles/wang-whitespace-092897.html)
That alone is TERRIBLE! (Same problem with YML.) Why
should I bother with that? JSON can encode strings, hashes, arrays etc. in a
way that’s instantly interoperable with JS and is far far more unambiguous.

What exactly is so great about XML that you can’t do with JSON in a better
way? Schemas can be stored in JSON. XPATH can be specified for JSON. Seriously I
never got the appeal of XML except that it was first.

~~~
mntmoss
Some of XML's biggest achievements lie in written documentation
formats(DocBook, DITA) where fine-grained markup control is needed and the
presentation of the content is secondary to semantic features like footnotes,
indexing, etc. These are formats that professional technical writers turn to
when Markdown, Word docs or PDF won't quite do the trick.

For a lot of data, XML isn't the right form and buries too much data in
hierarchy and tag soups - but it's flexible enough to make it into whatever
you want, and since XML was buzzworded and XML libs were some of the easiest
things to reach for in the 90's, it got pushed into every role imaginable.

------
devnulloverflow
As much as I am an old-school Unix zealot, I think it is time to move towards
a well standardised binary config format with non-trivial types (i.e a
schema). There still has to be a standard text format, but only for the source
from which the live configs have to be built. Done right, this has several
advantages:

1\. Built-time validation (or at least type checking).

2\. Built configs can be easy to parse but (potentially) rich enough to avoid
confusing templating.

3\. Separation of concerns between storing/maintaining configs and applying
them. E.g. scoop text configs off a source repo, but send out binary configs
over the network.

All this is a fantasy in my head. Right now the closest mainstream thing is
protobufs. But they make trade-offs for non-config use cases, and thus don't
really cut it in the "... rich enough to avoid confusing templating"
department.

~~~
zrail
SQLite databases might fit the bill. Fairly lightweight. Can talk to them in
basically any language. Instead of templates you copy the database file and
issue some UPDATEs.
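
A minimal sketch of that workflow with Python's standard `sqlite3` module. The `config` table and its keys are hypothetical, and in-memory databases plus the backup API stand in for copying a file on disk.

```python
import sqlite3

# Base configuration (hypothetical schema: one key/value table).
base = sqlite3.connect(":memory:")
base.execute("CREATE TABLE config (key TEXT PRIMARY KEY, value TEXT NOT NULL)")
base.executemany(
    "INSERT INTO config VALUES (?, ?)",
    [("listen_port", "8080"), ("log_level", "info")],
)
base.commit()

# "Copy the database": done here with the backup API rather than a file copy.
staging = sqlite3.connect(":memory:")
base.backup(staging)

# "Issue some UPDATEs" for the values that differ in this environment.
staging.execute("UPDATE config SET value = ? WHERE key = ?", ("debug", "log_level"))
staging.commit()


def load(conn: sqlite3.Connection) -> dict:
    """Read the whole table into a plain dict."""
    return dict(conn.execute("SELECT key, value FROM config"))


print(load(base))
print(load(staging))
```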

~~~
shandor
But that, and the parent's idea of binary formats in general, throws away the
absolute golden property of text format configuration files: you can put those
in git, and see with an accuracy of a single character what has changed. My
impression was always that this was a huge reason for plain text files in the
first place.

Someone mentioned protobufs, maybe with something like those one could have
both?

~~~
geocar
> you can put those in git, and see with an accuracy of a single character
> what has changed.

You can use sqldiff[1]. Try adding it to your .gitattributes[2]. If you need
TRIGGERs and VIEWs, consider dumping your database[3] instead.

[1]:
[https://www.sqlite.org/sqldiff.html](https://www.sqlite.org/sqldiff.html)

[2]: [https://git-scm.com/docs/gitattributes](https://git-scm.com/docs/gitattributes)

[3]:
[https://gist.github.com/peteristhegreat/a028bc3b588baaea09ff...](https://gist.github.com/peteristhegreat/a028bc3b588baaea09ff67f405af2909)
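
A sketch of the dump-and-diff idea using only Python's standard library. It uses `Connection.iterdump()` and `difflib` rather than the `sqldiff` tool mentioned above, and the `config` table is a made-up example.

```python
import difflib
import sqlite3


def dump(conn: sqlite3.Connection) -> list[str]:
    """Return the database as a list of SQL text lines."""
    return list(conn.iterdump())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE config (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO config VALUES ('log_level', 'info')")
conn.commit()
before = dump(conn)

conn.execute("UPDATE config SET value = 'debug' WHERE key = 'log_level'")
conn.commit()
after = dump(conn)

# A line-oriented text diff of the two dumps, like a reviewer would see.
diff = list(
    difflib.unified_diff(before, after, "before.sql", "after.sql", lineterm="")
)
print("\n".join(diff))
```

Committing the text dump next to (or instead of) the binary file is one way to keep reviewable diffs.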

------
jbaudanza
These are all valid points. But I still find YAML to be the best format for
storing my strings for localizations. I find it much easier on my eyes than
JSON. I’m open to other suggestions though.

~~~
rc_mob
Per the article consider Toml

------
daveisfera
There are two types of formats: 1) those people complain about, and 2) those no
one uses.

~~~
meowface
TOML seems widely used but I've never seen complaints about it. I'm sure there
are some, but the only time I see it mentioned is when someone is recommending
someone else switch to TOML.

Out of curiosity, is there anyone here who doesn't like TOML for
configuration?

~~~
jillesvangurp
I've encountered toml a couple of times but I wouldn't call it wide spread.
It's alright for small configuration files. However, if you keep things
simple, json and yaml are also not so bad and even properties files or good
old ini files will work. Doing e.g. cloudformation stuff in toml is not a
thing though and it supports both yaml and json. If your data is simple, use a
simple format. I've always liked properties files with simple name value pairs
separated by =. Still very common in the Java world though yaml has replaced a
lot of that.

BTW. I've handled all of those formats using jackson on Java & Kotlin. It has
a flexible parser framework originally intended for json. But it has lots of
plugins for different tree like configuration files. Look for jackson-
dataformat-yaml and jackson-dataformat-toml on github. There are loads more
formats that you can support with jackson. Nice if you need to translate from
one to the other or need to support multiple formats.

IMHO Json with some tweaks would be really nice. E.g. just supporting comments
and multi line strings would make it a lot nicer. A lot of json becomes
unreadable due to the need to escape strings. I've come across Hocon a couple
of times (jackson-dataformat-hocon) and it's a strict superset of json, which
means that if you accept hocon as input, you implicitly also accept json.

------
Legogris
Despite spending time writing and reading YAML on a daily basis for years, it
still trips me up once things get non-trivial. It's definitely my
least-favorite non-proprietary config file format. XML might be overly verbose,
but there are no surprises (unless you go bananas with schemas).

~~~
krapht
Yeah. I never got the hate for XML. I feel like it was always mismatched
expectations: some people wanted something like the Markdown of configuration
files, and other people wanted something extensible enough to encode any
possible data structures.

~~~
marcosdumay
It's verbose, illegible, redundant, and shares most of the problems YAML has.

Not to talk about the attribute/content duality and all the ill-defined
parsers it leads to.

------
vlozko
I think the author’s conclusion is in line with my own thought: If JSON is the
problem, YAML isn’t the solution.

I recall the first time I saw YAML and all I could think to myself was that I
have to learn yet another syntax. I find it far less readable than JSON or XML
and made me pine for the latter.

------
namelosw
YAML is bad. It's like Markdown: there are too many parsers that behave
differently. Unlike Markdown, which is just for reading, it is used in
configurations for critical systems. JSON is much better: it IS readable and
writable, and people use package.json all the time without problems.

Templating YAML is even worse. Templating is an ad-hoc abstraction, and very
easy to run into issues. A minimal JavaScript runtime with JSON would be much
better, JSON is JavaScript Object Notation after all.

~~~
ailideex
Markdown's problem is no single standard. This is not the case with YAML, so
no, it is not like markdown. And you can technically write assembler also.

~~~
namelosw
You are right, but if we shrink the scope to CommonMark the problems still
exists.

And what is assembler, may I ask? Is it for YAML or Markdown?

------
cuillevel3
YAML is really so much more than JSON.

    
    
      * YAML can have several 'documents' in the same file, separated by ---
      * there are anchors and references
      * easy to read multi line texts
      * it's also a superset of JSON
    

I can see, how choosing YAML when you just wanted readable JSON might give you
more headaches than expected.

And like someone else said, putting another template engine (or two) on top of
YAML is when the real problems start.

~~~
c3534l
It's really hard to maintain JSON without being allowed comments. JSON is fine for
transmitting data across the web, but if you need complex configuration data
in version control that many people work on, you need comments so new
teammates can be easily onboarded onto a project. There's definitely a need
for something that is programming-language-like, but entirely about
structuring and formatting data for configuration. YAML is the closest thing
to that. It's not perfect, but it works better for that purpose than JSON or
XML.

------
DannyBee
1\. General purpose serialization format is released.

2\. Format is declared to have x and y problem, new format is invented that is
"simpler and better"

Time passes

3\. People slowly discover format in #2 has the same issues that led to
creating #1.

Repeat.

(Just like attempting to super-generalize anything else)

------
nrvn
The points in the article are pretty solid.

But here is a big question.

Imagine you can influence the switch of the configuration formats for projects
like Ansible, Kubernetes, Docker Compose, AWS CloudFormation, Google Cloud
Deployment Manager, et al.

You can take any project with a huge user base and all of those projects will
have one thing in common: JSON-based configuration with an option to write
this configuration in YAML.

Since basically anyone talking YAML in the context of JSON is talking about a
JSON superset.

So here's the task: propose a JSON-compatible alternative to YAML.

Things to keep in mind:

\- backwards compatibility

\- easy migration from YAML to a new format

\- full JSON compatibility

\- relatively cheap to get supported by the project of interest.

~~~
theknarf
Just parse the YAML and spit out JSON5
([https://json5.org/](https://json5.org/)) and then keep that version.

------
TazeTSchnitzel
I wish the INI file format was standardised. It's easy for computers and
humans to read and write, and it has nice features like comments and _non-
destructive editing_!

~~~
timmytokyo
TOML is basically a standardized version of INI.

~~~
TazeTSchnitzel
TOML is similar to INI, but it's not the file format I know and love.

------
borntyping
The first example (i.e. `yaml.load()` in Python) doesn't work with the current
version of PyYAML.

Function application was disabled some time ago, and `yaml.load()` logs a
noisy deprecation warning telling users to use `yaml.safe_load()` instead [1].

[1]: [https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-...](https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load\(input\)-Deprecation)

------
methou
I have a question completely unrelated to the topic, prompted by the font
face used in the article.

[https://arp242.net/yaml-config.html#can-be-hard-to-edit-espe...](https://arp242.net/yaml-config.html#can-be-hard-to-edit-especially-for-large-files)

In the heading, how was the sp ligature in `especially` written? Is there a
name for this? How do you connect the beginning of an `s` to the beginning of
a `p`?

~~~
TheDong
These are discretionary ligatures [0].

They're turned on in html with the following two css lines (though on firefox,
either one is enough to have them happen):

    
    
        font-variant-ligatures: common-ligatures discretionary-ligatures;
        font-feature-settings: 'liga' on, 'dlig' on;
    

[0]:
[https://www.fonts.com/content/learning/fontology/level-3/sig...](https://www.fonts.com/content/learning/fontology/level-3/signs-and-symbols/ligatures-2)

~~~
Carpetsmoker
Note it won't work for every font, or some fonts may have a different flag to
enable it (I think there's an historical-ligatures, as well).

------
uponcoffee
I learned some of the intricacies of yaml the other day when refactoring a
docker-compose project. At first glance, it's brilliant... Until I started
running into limitations, edge cases, and issues.

I like the _idea_ of yaml, but:

\- it's overly complicated in the wrong ways

\- common/simple use cases aren't supported and require post processing (i.e.
Merging block maps/arrays, string interpolation, etc)

------
perlgeek
I've recently started using [https://jsonnet.org/](https://jsonnet.org/) to
generate more complex config.

It's easier to write than JSON (no need to quote keys, allows trailing
commas), has reusability through functions and objects, and can output JSON
which is much easier to parse than YAML.

Downside: you need another build step for the config.

------
jaten
S-expressions rule. I use them for configuration everywhere.

I wrote a lovely library for parsing them in Go (along with a full lisp
interpreter if you like)
[https://github.com/glycerine/zygomys](https://github.com/glycerine/zygomys)

Provides comments, multiline strings, and automatic translation into Go
structs using reflection.

------
patsplat
It's best to consider YAML in its appropriate context as a better XML
fragment. The ideas in YAML evolved into JSON which is preferable today.

However at the time, a tree data format that deserialized into native types
was quite useful. The alternative was writing event based SAX parsers, or
incredibly verbose XML object apis.

------
olliej
The security problem is common to many serialisation formats and similarly
terrible bugs have happened in a large number of formats.

For instance, the recent iMessage bugs that project zero announced were
because NSCodable serialization tells the deserializer what class should
be instantiated. Followed by remote code execution (woo!)

Similar problems have occurred with java serialization over the years, the
python serialisation thing (that silly name I can’t recall).

I was recently learning swift and was getting frustrated by the verbosity/work
for deserialisaing abstract classes when I realized the clunkiness was due to
a design that made the deserialise attacker specified objects basically
impossible. Obviously you could engineer a solution that would be exploitable
but there’s only so much a platform can do to stop developer mistakes.
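
As a rough illustration of the difference between "the payload names the class" and "the application decides which classes are allowed", here is a sketch using Python's `pickle` module, chosen only as a familiar example of a format where the serialized data can name arbitrary globals. The `AllowlistUnpickler` class, its `ALLOWED` set and `safe_loads` are hypothetical names; the approach of overriding `find_class` follows the pattern described in the Python documentation.

```python
import io
import pickle


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global not explicitly allowed."""

    # Assumption for this sketch: only these harmless builtins may be named.
    ALLOWED = {("builtins", "set"), ("builtins", "frozenset")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def safe_loads(data: bytes):
    """Deserialize `data`, rejecting payloads that name disallowed globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


# Plain containers and scalars round-trip without naming any class.
plain = safe_loads(pickle.dumps({"port": 8080, "hosts": ["a", "b"]}))

# A hand-written payload that asks the unpickler to look up os.system.
malicious = b"cos\nsystem\n(S'echo hello world'\ntR."
try:
    safe_loads(malicious)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The lookup is rejected before anything is called, so the command in the payload never runs. An allowlist like this is still a mitigation bolted onto a permissive format, not a substitute for a design where the receiver fixes the types up front.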

------
Vosporos
Just go with Dhall and be done with that

------
markpapadakis
I dislike whitespace sensitive languages or definition formats with a passion.
Especially when tabs and spaces are not treated equally. I don’t mind python
as much nowadays but YAML is borderline insulting to me. I hope we all move on
to something more sane soon.

------
dang
Discussed last year:
[https://news.ycombinator.com/item?id=17358103](https://news.ycombinator.com/item?id=17358103)

------
cmauniada
I find yaml to be the best for making cloudformation templates. On its own it
isn’t much good but if you use the right plugins it really is better than
json.

------
amelius
The problem with most configuration file formats is: you can't put functions
in them realistically.

The best configuration file is simply source code that initializes whatever
you want to run, and then runs it. That way, you can install hooks in the form
of closures and make the program behave exactly like you want without the
constraints that a simple "value-only" configuration file format has.
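
A minimal Python sketch of what this could look like, assuming a toy `App` class with a list of request hooks; every name here (`App`, `on_request`, `configure`) is made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class App:
    """Toy application whose behaviour is customised through hooks."""

    port: int = 8080
    on_request: List[Callable[[str], str]] = field(default_factory=list)

    def handle(self, path: str) -> str:
        # Each hook may rewrite the path before it is served.
        for hook in self.on_request:
            path = hook(path)
        return f"serving {path} on port {self.port}"


def configure() -> App:
    """The 'config file': ordinary code that builds the app and installs hooks."""
    app = App(port=9000)
    prefix = "/v2"
    app.on_request.append(lambda path: prefix + path)  # closure over `prefix`
    app.on_request.append(str.lower)
    return app


app = configure()
result = app.handle("/Users")
```

The hooks are something a value-only format cannot express directly; the cost is that the "config" can now do anything the program can.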

~~~
gjstein
This may be application-specific, but I might worry about security if my
configuration files support arbitrary closures.

~~~
rco8786
Any more so than the rest of your code?

~~~
dragonwriter
Config files are typically written and updated by non-developers and often go
through a less rigorous release process, so having a less complex and less
dangerous, even if less capable, language can be desirable.

~~~
rco8786
Ah, that's not been my experience - but I can understand that if non-
developers are the ones making the changes

------
rc_mob
Heh, yaml haters unite! I wish yaml would go away.

~~~
jeltz
I hate yaml but I have yet to find a better option for deeply nested config
files. Toml is the closest thing I have seen. Toml support is also not that
great.

Yaml despite its flaws works pretty well for Ansible playbooks and for storing
localizations.

~~~
zamadatix
Would you say actual YAML does a better job in Ansible over just writing JSON
and letting it be parsed as YAML?

~~~
crdoconnor
Have you tried writing JSON by hand or diffing it in a pull request?

~~~
zamadatix
Err, yes? Is quoting your keys, delimiting members with a comma instead of
whitespace, and putting brackets/braces around collections really that
confusing that people struggle to edit it by hand or read it in a diff?

I think the syntax is actually what makes it more human readable, it's still
95% text/numbers just annotated with information that makes it clear what
things actually are instead of hiding them behind confusing computer parsing
rules nobody is going to think about while reading human-friendly text.

------
acd
ThoughtWorks has templating in yaml on their hold list.
[https://www.thoughtworks.com/radar/techniques/templating-
in-...](https://www.thoughtworks.com/radar/techniques/templating-in-yaml)

How does one write Kubernetes specs and Ansible without yaml?

~~~
q3k
As YAML is a superset of JSON, use $favourite_language to generate it. I like
JSONnet for that.
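
A sketch of the same idea with plain Python and the standard `json` module instead of Jsonnet: build the spec from ordinary functions and emit JSON, which a YAML-consuming tool can then read. The helper names (`container`, `deployment`) and the example values are assumptions; the field names follow the usual shape of a Kubernetes `apps/v1` Deployment.

```python
import json


def container(name, image, port):
    """Build one container entry for a pod spec."""
    return {"name": name, "image": image, "ports": [{"containerPort": port}]}


def deployment(name, image, replicas=1, port=80):
    """Build a Deployment manifest as a plain dict, reusing the label map."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container(name, image, port)]},
            },
        },
    }


# Hypothetical service; the output is JSON, so no indent filters are needed.
manifest = json.dumps(deployment("web", "nginx:1.25", replicas=3), indent=2)
print(manifest)
```

Because the structure is built as data and serialized at the end, nesting and quoting are handled by the serializer rather than by text templating.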

------
jaten
Google uses a subset of python called Starlark for build configuration
(available in Go and I think Java). Nice if you want to be able to compute
things during config.

[https://github.com/google/starlark-go](https://github.com/google/starlark-go)

------
pjmlp
I still keep using XML as my favourite format.

Get to use parsers out of the box, validation tooling, support comments, IDE
code completion and they are super easy to transform.

In a couple of years some trendy SV unicorn will make XML the best format of
the world, as these cycles happen to be.

~~~
DonHopkins
The irony is that the best format that takes over the world will be an XML
representation of JSON, enabling comments and trailing commas.

------
reilly3000
Kubernetes supports JSON but overwhelmingly leans towards YAML. I've had to
spend some time really grokking it to do basic dev ops, and now have my IDE
pretty dialed to support it. That said, it's not my favorite by a long shot.
Can Jsonnet save us?

~~~
reilly3000
I should clarify that kubectl converts YAML to JSON, so most examples I see
are in YAML in a repo, then applied by a CI system via kubectl where it is
transformed into JSON.

------
jokoon
I liked the indented style of YAML, asked how to properly parse an indented
file, and wrote my own small "parser".

What's nice about YAML is the choice to use indentation; for the rest, I have
a hard time following the language's choices.

------
johnisgood
S-Expression or TOML! I personally use TOML for all my projects' configuration
files.

~~~
boobePhuu7iet7i
TOML is so much easier to read IMO

------
linkerzx
Not a fan of YAML, but it has its advantages over JSON, such as the ability to
comment specific portions easily.

Haven't used TOML yet, but it seems promising given that for most use cases
you would only use a portion of the YAML language.

------
jwilk
FYI, PyYAML 5.1 partially fixes the security issues:

[https://github.com/yaml/pyyaml/issues/265](https://github.com/yaml/pyyaml/issues/265)

------
vinay_ys
I strongly recommend doing away with config files completely for the sake of
ease of use, maintainability and security.

Instead just declare all config variables within code itself in a separate
config class/module file, along with initialization to default values, and
provide a dynamic getter/setter interface over a debug API (which can be
enabled/disabled via a command line flag).

If you want, you can also provide a friendly cli tool to interact with the
debug api. This tool could output help messages, show current config values -
differentiate between default vs overwritten etc.

Of course, this can be written once as a utility library and cli and used
consistently across all your programs.

~~~
twic
For the love of god, if anyone reads this, _please_ don't do this!

Config files are fantastic. Trivial to read, write, copy, track in version
control, diff, grep, generate with scripts, etc.

API-driven configuration has _none_ of these properties.

Some Java application servers take this approach of API-driven configuration.
It's an improvement over UI-driven configuration, which is what they had
before. But it's still significantly worse than simple file-driven
configuration.

If you want to provide a 'friendly' CLI tool, by all means do so - but provide
a tool to interpret and generate config files, not something which replaces
config files.

~~~
vinay_ys
Config variables in code get version controlled and release-managed alongside
code and take the same CI/CD route to production as the rest of your code does.
Your code asserts, compilers and test cases can help you catch errors in
config data. All of this happens without any special file format or parser
concerns.

If you need runtime reconfiguration in production, then it requires a runtime
config management system tailored for operations folks with proper
authn/authz, audit logs etc. This is a product by itself. The connection
between your running program and the runtime configurator has to be
intentional, secure etc.

IMO, config variables should have the following bindings:

1\. config variable in code.

2\. program start env variable.

3\. program start command argument flag.

4\. runtime configuration.

All available configuration options are declared in the code. But not all
configuration options should be accessible from #2 to #4. And the override
preference order may not be same for all types of configuration variables.

Btw, this isn't related to Java or any particular programming language. I've
seen this done in large C++ projects 20 years ago.
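
A rough Python sketch of bindings #1 to #3, assuming a simple fixed override order (flag over env variable over code default) and leaving #4, runtime reconfiguration, out entirely. The variable names, the `APP_*` environment names and the `resolve` function are all hypothetical.

```python
import argparse

# 1. Every option is declared in code with its default value.
DEFAULTS = {"port": 8080, "log_level": "info"}

# 2. Only the options listed here may be overridden from the environment.
ENV_OVERRIDABLE = {"port": "APP_PORT", "log_level": "APP_LOG_LEVEL"}

# 3. Only the options listed here are exposed as command line flags.
FLAG_OVERRIDABLE = {"port"}


def resolve(argv, environ):
    """Resolve the effective config: defaults, then env, then flags."""
    config = dict(DEFAULTS)

    for key, env_name in ENV_OVERRIDABLE.items():
        if env_name in environ:
            # Convert the string from the environment to the default's type.
            config[key] = type(DEFAULTS[key])(environ[env_name])

    parser = argparse.ArgumentParser()
    for key in sorted(FLAG_OVERRIDABLE):
        parser.add_argument(f"--{key}", type=type(DEFAULTS[key]), default=None)
    args = parser.parse_args(argv)
    for key in FLAG_OVERRIDABLE:
        value = getattr(args, key)
        if value is not None:
            config[key] = value

    return config


# Usage: pass sys.argv[1:] and os.environ in a real program.
effective = resolve(["--port", "7000"], {"APP_PORT": "9000", "APP_LOG_LEVEL": "debug"})
```

Per-variable override orders, as described above, would need a richer declaration than these three tables; this only shows the general layering.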

------
goatinaboat
The superior replacement for XML, JSON and YAML is the SQLite .db file. Easy
to “parse”, easy to manipulate programmatically, what more could you want?

~~~
ip26
\- Editable in vim/emacs

\- Meaningful version control

~~~
goatinaboat
\- editable in Emacs easily, there’s a mode for it

\- dump it to SQL and version control that, if you must use Git
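
For what it's worth, a small sketch of both halves using Python's standard `sqlite3` module: reading settings from a table, and dumping the database to SQL text that can be diffed and committed. The `config` table layout and the `get` helper are assumptions made up for this example.

```python
import sqlite3

# An in-memory database stands in for a config.db file on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE config (key TEXT PRIMARY KEY, value TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO config VALUES (?, ?)",
    [("port", "8080"), ("log_level", "info")],
)
conn.commit()


def get(key):
    """Look up one setting; returns None if the key is not present."""
    row = conn.execute("SELECT value FROM config WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


# iterdump() yields the SQL statements needed to recreate the database,
# which gives a plain-text form suitable for version control.
sql_dump = "\n".join(conn.iterdump())
```

The dump is the extra step being argued about in the replies: it works, but the text form is derived from the binary file rather than being the source of truth.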

~~~
h1d
\- Not everyone uses emacs.

\- Not everyone likes the extra step.

~~~
goatinaboat
\- that’s your own choice, if you don’t want the features you don’t have to
have them

\- not everyone likes the hoops you need to jump through with the alternatives
either! That’s why we’re discussing this :-)

------
shermozle
WTF is that weird ligature between s and p on the heading "Can be hard to
edit, especially for large files"? That is an odd font.

------
ngold
YAML (/ˈjæməl/, rhymes with camel[2]) was first proposed by Clark Evans in
2001,[10] who designed it together with Ingy döt Net[11] and Oren Ben-
Kiki.[11] Originally YAML was said to mean Yet Another Markup Language,[12]
referencing its purpose as a markup language with the yet another construct,
but it was then repurposed as YAML Ain't Markup Language, a recursive acronym,
to distinguish its purpose as data-oriented, rather than document markup.

------
cryptica
I never understood how YAML is more human-readable than JSON. I find JSON much
easier to read. What annoys me the most about YAML is that it's easy to
misinterpret the indentation. You need a special IDE to know whether a
property belongs to a specific object or to its parent.

~~~
thayne
> I never understood how YAML is more human-readable than JSON

Two things: comments and multi-line strings.

~~~
throwaway_391
My personal JSON pet hate is:

    x = [ "Foo", "Foo2", ]

is not valid, but the following is:

    x = [ "Foo", "Foo2" ]

Makes dealing with packer configs feel like punching yourself in the face.

I still prefer it over YAML's awkward initial learning curve.

~~~
zamadatix
At first I found it really annoying but then the more I thought about it the
more I came to value the "," semantics as proper validation for a "forgot to
put the last element in the list" error which would otherwise be silently
hidden via the parser.
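
The strict behaviour is easy to see with Python's standard `json` module, used here only as one example of a strict parser; the `parse` wrapper is a hypothetical helper that turns the exception into a string.

```python
import json


def parse(text):
    """Parse JSON text, returning a short message instead of raising."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        return f"rejected: {err.msg}"


ok = parse('["Foo", "Foo2"]')     # well-formed array
bad = parse('["Foo", "Foo2", ]')  # trailing comma, refused by the strict parser
```

A lenient parser would accept the second input silently, which is exactly the case where a forgotten last element would go unnoticed.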

~~~
throwaway_391
So your comment is valid. Having strict and not-strict validation would be a
nice compromise though (:

------
techntoke
Configuration files are usually meant to be single purpose. Docker, Kubernetes
and Helm all use YAML exceptionally well.

------
mixmastamyk
Yaml is great at the core. It just has too many features, the first things I
disable.

A simplified subset, .syml would be a good idea.

~~~
tjalfi
StrictYAML[0] is a YAML subset that removes some of the problematic features.

The implementation is in Python.

[0]
[https://github.com/crdoconnor/strictyaml](https://github.com/crdoconnor/strictyaml)

~~~
mixmastamyk
This should be the post then, a solution rather than griping.

------
schpaencoder
EDN

------
simonrepp
tl;dr: Another alternative to YAML (among many great others), this one
designed and developed by me:

[https://eno-lang.org/](https://eno-lang.org/)

I've been doing a lot of research and development on language design for file-
based content (e.g. for static site generators). I've found that YAML -
although established as the go-to format for statically generated blogs, etc.
- was never designed for these things as it by its nature does not support
simple, essential features for this usecase like for instance unindented
blocks of verbatim text (for which YAML frontmatter was invented as a very
limited hack).

The result of all this R&D is a language called "eno notation" which is
designed especially for file-based content usecases, and around which I've
also built an entire ecosystem for many languages and editors - if you're
working in that field, it might be worth taking a look!

~~~
roryokane
I find it surprising that your format doesn’t distinguish strings and numbers,
or other types of scalar values in general. For example, in your demo “eno's
javascript benchmark suite data” on [https://eno-
lang.org/eno/demos/](https://eno-lang.org/eno/demos/), both of these lines:

    
    
      iterations: 100000
      evaluated: Fri Jul 06 2018 09:46:48 GMT+0200 (Central European Summer Time)
    

are tagged below as just a “Field”. Do client programs that read an Eno file
need to run `int()`/`float()` or `.to_i`/`.to_f` on the field values they know
should be numbers? That seems unergonomic.

~~~
simonrepp
You are correct! The thinking behind this is that for the majority of file-
based configuration and content usecases the expected types are fixed and
known beforehand already - ergo it makes more sense that a developer has to
specify _once_ which type a field is (gaining in return 100% type safety,
validation, localized validation messages, ...) than all users later having to
e.g. explicitly write quotes _a million times_ when writing
configuration/content, just to tell the application something about the type
it already knows anyway (and wouldn't expect/accept any other way too). I
think this is really more ergonomic, even in the short run.
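
To illustrate the general idea (not eno's actual API, which is not shown in this thread), here is a Python sketch in which every field arrives as a string and the application declares the expected type once. `SCHEMA`, `load` and the sample field names are hypothetical.

```python
from datetime import date

# The developer states once which type each field must have.
SCHEMA = {
    "iterations": int,
    "ratio": float,
    "released": date.fromisoformat,
    "title": str,
}


def load(raw_fields):
    """Convert untyped string fields to the declared types, with clear errors."""
    typed = {}
    for name, convert in SCHEMA.items():
        if name not in raw_fields:
            raise KeyError(f"missing field: {name}")
        try:
            typed[name] = convert(raw_fields[name])
        except ValueError as err:
            raise ValueError(f"field {name!r}: {err}") from err
    return typed


# What a parser for an untyped format might hand over: strings only.
raw = {"iterations": "100000", "ratio": "0.75", "released": "2018-07-06", "title": "42"}
document = load(raw)
```

Note that `title` stays the string `"42"` without the author having to quote it, because the schema rather than the file decides the type.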

------
botto
What about HCL, would it make sense to use this as a config language?

------
Waterluvian
I just want json with comments. Is that too much to ask?

~~~
roryokane
Someone else already mentioned JSON5
([https://json5.org/](https://json5.org/)), which is JSON with a few ergonomic
improvements, including comments. Hjson
([https://hjson.org/](https://hjson.org/)) is a similar, slightly more complex
format with a few extra features such as unquoted strings for object values.

------
breck
Disclosure: I work on Tree Notation. It’s the future of file formats, IMO.

The idea is to have 2 levels: a simple, minimal syntax/notation (think binary)
called Tree Notation, and then have higher level grammars on top of that,
called tree languages.

It works for encoding data and also for programming languages, regardless of
paradigm.

[https://github.com/treenotation/jtree](https://github.com/treenotation/jtree)

~~~
svnpenn
wow that is awful

~~~
dang
" _Please don't post shallow dismissals, especially of other people's work. A
good critical comment teaches us something._"

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

Edit: we've had to ask you multiple times already not to be a jerk on HN.
Would you please review the guidelines and take the spirit of this site more
to heart?

------
abakus
YAML is horrible. Toml is much better. Even Json is not that messy.

