
Did something happen recently to get the YAML hate train going again?

I get it. YAML is not perfect. Neither is JSON, TOML, XML or even code as configuration (Xmonad anyone?). They each have pros and cons and projects/technologies take those into consideration and pick one.

Not sure I see the point in hating on one specific configuration language. If it was that bad, nobody would use it. And if you still think it's bad anyways, you can always improve on it. But very few actually want to put in the enormous amount of work needed to improve YAML or create a new language.

IMO if there was something that was substantially better, we would see projects switching to it in a heartbeat. But the fact is that most times the difference between them is not substantial, so the effort to make any kind of switch so you can shorten Norway is simply not worth it.




> [...] If it was that bad, nobody would use it.

Haha, if that logic held true ... we wouldn't be using lots of things.

Usually people use things, because "that's how we have done it before" or "that's all we know". Not many people look for better tools frequently. They try to get something done and the moment it is working, they are done with it. People are frequently punished for trying to improve the current state, by management that tells them: "But it already works! No business value in changing it now."


You forgot "that's the only viable option provided by the vendor." Some people are reliant on certain software as a requirement and forced into certain standards imposed on them. They don't exactly choose the standard, they have a functional requirement and the only way to achieve it in some budget/timeline may be with a third party solution that uses, say YAML.


Yeah I've been there personally too. But why aren't we complaining about the vendor then?

I find it weird that we love to complain about the YAML format instead of the projects that chose it. Given some of the emotions I saw in other responses in this thread, it looks more like a venting exercise.

Which is fine, I guess. I was simply curious about why it suddenly popped up again.


If ten thousand vendors use one subpar language, it makes sense to try to swing opinion away from the language, rather than play whackamole with eleven thousand vendors.


because the complaint is with yaml.


Just use JSON then?


Config needs comments


Use json5 ?


I use CUE, it is awesome and a joy to use


For some reason that line triggered a lot of folks. That was not my intention, haha.

I agree that it's not always under our control and that can be extremely frustrating. But that's not YAML's fault, is it?

When I wrote "that bad" I truly meant the extreme version. Something so bad that it has no upsides. Which IMO is not the case here. YAML has pros and cons, just like all others, and for one reason or another many folks in several different projects decided that YAML was a good enough choice.

I have a very hard time assuming everyone that ever chose YAML is so incompetent that they never thought about the pros/cons of it.


There's always a tool that will work better for a specific use-case, especially when you either don't understand or have forgotten about some of the requirements.


Well the question about business value is a good one, no? After all, the question is not "Would improving this thing be a good thing?" it's "Would improving this thing be better than every single other thing we could be doing with those same resources?".

Besides there's an equal and opposite danger with too much change or change delivered without clear benefits. I'd prefer a 7/10 UI which stayed the same for a few years vs an 8/10 UI which changes substantially every month.


Here's the thing. We've basically been using TOML since the INI file format from MS-DOS. It works, and it's useable, and we've all seen it. TOML is just an evolution of the INI format to fix some of its shortcomings. YAML blasts onto the scene like "hey, what if we just rewrite the JSON format to make it legible?" which brings with it a million edge cases of problems.

YAML is the new kid on the block. And despite having several great formats for different use cases already, some kind of mass hysteria caused big players to adopt YAML. I suspect everyone who adopted YAML is either a Python dev or they were drawn in by the legibility of the format. It's easy to read. While that's true, did anyone stop to think about what it's like to actually use the format?


> Usually people use things, because "that's how we have done it before" or "that's all we know".

You forgot "this is the sexy new shiny way to do it, let's hop on the bandwagon! The old things sux!"


Yeah, but you mention Rust one time and everyone gets mad ...


Not only do you get punished by management, but you get ridiculed on HN for trying to change the status quo!


I think the reason everyone jumped ship from XML to JSON was that JSON is comparatively very dumb - and dumb is quick to grok.

Then some of us realised JSON is actually too dumb for many things, and instead of going back to XML we made JSON Schema, OpenAPI, etc.

Others of us thought that the main problem with JSON is that it's not human readable and writeable enough. So we came up with new formats like YAML. [EDIT: My timing is wrong here, sorry.] Unfortunately being human we could not resist making it much more complicated again, thus increasing cognitive load.

There have been many times when in order to really understand a YAML file (full of anchors and aliases, etc) I've had to turn it into JSON. This is ridiculous.


The timing doesn't quite work out for this explanation, unfortunately. YAML is about the same age as JSON and started finding a niche as a configuration language in parallel with JSON finding use as a serialization format. Ruby on Rails 1.0 was using YAML for configuration in 2005, and it didn't even have JSON support out of the box at that point.


Serves me right for not checking! I certainly became aware of YAML long after I started using JSON. But I do think people are choosing it over JSON for its alleged improved read/write friendliness.


Indeed, back then it was "Yet another Markup Language" (https://yaml.org/spec/history/2001-12-10.html). I remember using it to write blog posts with static generators, like webgen around 2004.


Interesting, I'm surprised the opposite way as the others replying -- I thought YAML was much older than JSON. We all encounter things at different times I guess.


This is lovely, I didn't know. I guess this is what Kuhn was talking about: we write history in retrospect, sorting it out preferring narrative over fact.


> Then some of us realised JSON is actually too dumb for many things, and instead of going back to XML we made JSON Schema, OpenAPI, etc.

This take doesn't make any sense at all.

JSON Schema is a validation tool for languages based on JSON. You build your subset language based on JSON, and you want to validate input data to check whether it complies with your format. Instead of writing your own parser, you specify your language with a high-level language and run the validation as a post-parsing step. Nothing in this usecase involves anything resembling "too dumb".

OpenAPI is a language to specify your APIs. At most, it is orthogonal to JSON. Again, nothing in this usecase involves anything resembling "too dumb".

JSON is just one of those tools that is a major success, and thus people come out of the woodwork projecting their frustrations and misconceptions on a scapegoat. "JSON is too dumb because I developed a JSON-based format that I need to validate." "JSON is too dumb because I need to document how my RESTful API actually works". What does JSON have to do with anything?


You're making an interesting distinction between "JSON" and "languages based on JSON" there, which I don't. JSON and XML in isolation are just a bunch of punctuation and not useful. They're only useful when we know the structure of the data within them. XML already had schemas, and we were able to easily (YMMV!) validate the correctness of a document.

JSON was simpler because we would just say to each other "I'll send you a {"foo":{"bar":1234,"ok":true}}", "Cool thx" - and there wasn't a way to formalise that even if we wanted to. That doesn't scale well though. We needed a (machine-readable) way to define what the data should actually look like, thus OpenAPI etc.


> You're making an interesting distinction between "JSON" and "languages based on JSON" there, which I don't.

That's basically the root cause of your misconceptions. A document format encoded in JSON is a separate format, which happens to be a subset of JSON. A byte stream can be a valid JSON document but it's broken with regards to the document format you specified. Tools like JSONPath bridge the gap between your custom document format and JSON. This is not a JSON shortcoming or design flaw.
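
To make that distinction concrete, here is a rough Python sketch. It is a hand-rolled check, not JSON Schema, and `validate_foo_document` is a made-up name for illustration. Both inputs parse fine as JSON, but only one fits the `{"foo":{"bar":1234,"ok":true}}` shape from upthread:

```python
import json


def validate_foo_document(doc):
    """Return a list of problems; an empty list means the document fits the format.

    Hypothetical format: an object with an object under "foo", holding an
    integer "bar" and a boolean "ok".
    """
    if not isinstance(doc, dict) or not isinstance(doc.get("foo"), dict):
        return ["top level must be an object with an object under 'foo'"]
    problems = []
    foo = doc["foo"]
    bar = foo.get("bar")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(bar, int) or isinstance(bar, bool):
        problems.append("'foo.bar' must be an integer")
    if not isinstance(foo.get("ok"), bool):
        problems.append("'foo.ok' must be a boolean")
    return problems


# Parsing succeeds for both: each is a valid JSON document.
good = json.loads('{"foo":{"bar":1234,"ok":true}}')
bad = json.loads('{"foo":{"bar":"1234","ok":"yes"}}')

good_problems = validate_foo_document(good)
bad_problems = validate_foo_document(bad)
```

The parse step and the validation step are separate, which is the point being argued: the second document is broken with regards to the format, not with regards to JSON.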

> They're only useful when we know the structure of the data within them.

They are only useful to you because you're complaining that you still need to parse your custom format. This is not a problem with JSON or any other document format. This is only a misconception you have with regards to what you're doing.


It's "too dumb" in that it was too tied to machines; IMO JSONC (and later JSON5) strikes a perfect compromise.

JSON5 adds back quote-less identifiers, trailing commas in objects/arrays and, most importantly, comments (which JSONC already added).

With those additions, there is little extra pain of writing JSON as configuration without losing anything in terms of being stringent.
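
For anyone who hasn't hit this, a small Python sketch of what strict JSON rejects. It uses only the standard library `json` module, which parses strict JSON and not JSON5; the sample strings and the helper name are made up for illustration:

```python
import json

strict = '{"name": "demo", "ports": [80, 443]}'
with_comment = '{\n  // listen ports\n  "ports": [80, 443]\n}'
with_trailing_comma = '{"ports": [80, 443,]}'
with_bare_key = '{ports: [80, 443]}'


def parses_as_strict_json(text):
    """Return True if the standard library JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


results = {
    "strict": parses_as_strict_json(strict),
    "comment": parses_as_strict_json(with_comment),
    "trailing comma": parses_as_strict_json(with_trailing_comma),
    "bare key": parses_as_strict_json(with_bare_key),
}
```

Only the first string is accepted. The other three use exactly the features listed above, so they need a JSON5-aware parser.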


>There have been many times when in order to really understand a YAML file (full of anchors and aliases, etc) I've had to turn it into JSON. This is ridiculous.

There's no reason why an editor couldn't inline those things in YAML to help see what's going on locally. I can't code without code navigation and stuff like type hints for inferred types.

As YAML gets used for more complex stuff I think the tooling needs to catch up.


> There's no reason why editor couldn't inline those things in YAML to help see what's going on locally.

There's no reason why an editor couldn't present whichever YAML style (block style or the JSON-like flow style) the user prefers, and save in whichever one is preferred for storage.


The multitude of IDE plugins and editor modes cannot (and my God should not) solve for the fundamental weaknesses of a data specification format.


Referencing common data is not a weakness, but it does introduce tradeoffs - I'd take that over having to do a "have I updated every instance" whack a mole.


I recently wrote about the JSON/YAML limitations in the context of OpenAPI and JSON Schema:

https://ebastien.name/posts/api-design-language/

Also sharing a humble attempt at an alternative language:

https://www.oxlip-lang.org/


> There have been many times when in order to really understand a YAML file (full of anchors and aliases, etc) I've had to turn it into JSON. This is ridiculous.

Spitballing here. If underlying data is identical in JSON or YAML or whatever, why not introduce a view layer that is structure agnostic provided that the syntax can be translated without modifying the data?

I'm imagining a VSCode plugin or some view that parses the data into whatever format you'd like when you open it, then when you write it serializes it into the file format specified in the filename. You could do the same with your code review system.

Ultimately the specific syntax is for humans only, so as tooling improves why not add that next layer of abstraction? Is it because there are so many format-specific idiosyncrasies that can't translate well, due to the complex nature of a lot of these config files (gitlab-yaml, etc.)?

Just wondering, without having the time to think through the language specs properly, why we haven't seen this yet when it seems like such a huge quality of life improvement.


> Then some of us realised JSON is actually too dumb for many things, and instead of going back to XML we made JSON Schema

XML has numerous different schema languages made for it, outside of the XML standard, because XML itself is just as “dumb” as JSON in this regard, and apparently no one got schemas exactly right for it. The holy wars over XML schema languages only faded when XML’s dominance did.

> OpenAPI

OpenAPI uses JSON and JSON Schema much as SOAP uses XML, but it doesn't prove JSON is “too dumb” any more than SOAP proves that XML is.


I don't know if SOAP caused it, but since it became popular, every single tool just assumes your XML is specified by a DTD, even though that's the one schema language that nobody ever liked.

I never saw a war. AFAIK nobody ever wanted to use DTDs, but that's what everybody used because it was what everybody used.


One thing about yaml that I think conflicts with 'human readable' is the indents mattering. On long files, even with a small number of indents, it can be tough to tell whether something is in one section or another. For whatever reason, lining up braces/brackets/parentheses makes it easier for me to tell.


Encoding meaning in whitespace not only makes it difficult to verify correctness by eye (for nontrivial cases), but also is very fragile.

You're lucky if it survives a cut and paste across tools. Almost every tool ever written treats whitespace as formatting, not meaningful content, and many tools therefore assume that "fixing" whitespace will improve readability.


I'm still baffled at how semantic whitespace 'won' with YAML and Python.


Something that baffles me is that I find Python's semantic whitespace more comprehensible than YAML's. Haven't figured out why, though.


Python's use of whitespace is, dare I say, perfect.

It makes everything more readable. I've never seen cases where indentation is ambiguous.

I think this is because in a programming language indentation happens in a small set of specific situations like `for` loops and such. Either you indent right or you get an error. On the rare occasion the wrong indentation level can assign a variable to the wrong value, but that is a rookie mistake.

Whereas in YAML, everything starts out ambiguous. Everything is context sensitive. Indents change the meanings of things in slight, unclear ways. It's a constant source of confusion.
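
A quick sketch of the "rare occasion" on the Python side, with function names made up for illustration. Both versions are valid Python and run without any error; they differ only in the indentation of one line:

```python
def sum_until_negative(values):
    total = 0
    for v in values:
        if v < 0:
            break
        total += v  # inside the loop: accumulates every value seen so far
    return total


def sum_until_negative_misindented(values):
    total = 0
    for v in values:
        if v < 0:
            break
    total += v  # dedented: runs once, after the loop, with the last v
    return total


correct = sum_until_negative([1, 2, 3])
wrong = sum_until_negative_misindented([1, 2, 3])
```

Here `correct` is the sum of all three values, while `wrong` only picks up the final loop value.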


The legibility argument.

What if I told you that C-style syntax can be formatted to use indentation exactly as you prefer?


Because Python only has one semantic meaning for whitespace. YAML’s whitespace is also mixed with other symbols that change what the whitespace means, and it differs further depending on how much whitespace follows the previous indent and precedes the following symbol, if any. Or if the previous line ends with a pipe, then the only semantic meaning is “none of this whitespace matters, so much so that it’s trimmed… but the indentation is preserved after the first line”. I’m probably wrong about some or all of this! It’s been a whole day since I had to write YAML, so the YAML gnomes have rightly reclaimed the part of my brain that was sure I knew what a given bit of whitespace actually means.


In python you split your code into functions when it grows too many indentation levels to be comprehensible.

In yaml, you have the whole application logic in one ”function”. Fine for small things, problematic when it grows.


Amen. My current theory: data science. Data science has become much more prevalent in the last decade, and I suspect data scientists prefer English-like syntax because they're not engineers.

Engineers use language like mechanical components. It should be precise, neat, and functional. We're designing machines. I have a feeling that data scientists don't want any of that. They want to use language as a way to describe data transformations in a format that resembles a log rather than functional instructions. Much like SQL.


The problem with braces is that as a human, you'll still rely on whitespace, and when the whitespace isn't enforced, it can be misleading.

    if (launch_button_pressed)
      prepare_missile();
      launch_missile();


That should fail a lint check in your CI and your editor/IDE should autoformat that indentation away.

You also claim that is a "problem with braces" but you're not using braces and languages like rust no longer allow brace-less single statements like that.


Yeah, if I had my druthers, brace-less shorthand syntax like that would never be allowed. I never use it in my own code.

Beautify takes care of all questions about what an indent means. C-syntax devs never assume an indent has syntactic meaning. We use it as a helpful hint. We innately understand that it's the braces that matter, and it's really easy to Beautify a file if the original author was sloppy.

C-syntax devs read differently. We're not reading a book. We're reading code.

And we generally strongly prefer correctness. Braces avoid all unintended bugs related to where the instructions are located on the screen. Pretty is nice. Structure and correctness are better.


gcc will give a warning for this.

  <source>:6:17: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'
      6 |                 launch_missile();
        |                 ^~~~~~~~~~~~~~

https://godbolt.org/z/cYqhhqz47


Which is why you configure your editor to auto format, and then those errors can't exist once you press save.


It works much better for Python than for YAML. The details matter, and YAML has all of the worst details.

(And if you want proof that the details matter, just look at Haskell, where it works perfectly well.)



> Others of us thought that the main problem with JSON is that it's not human readable and writeable enough. So we came up with new formats like YAML. [EDIT: My timing is wrong here, sorry.] Unfortunately being human we could not resist making it much more complicated again, thus increasing cognitive load.

One of the odd things about the progression is how user-hostile it is. JSON lacks support for comments, YAML has systematically-meaningful indentation and frequently deep nesting, etc.


There was a combination of reasons.

XML has plenty of problems of its own which legitimately generated a lot of hate for the format. JSON, at least superficially, didn't have many of those because it lacked (and still lacks) a lot of features. So, for a reasonable person it wouldn't be a proper comparison, but... there's the reason number 2.

JSON's rise to prominence coincided with Flash dying and the JavaScript hype train gathering momentum. Flash made a bet on XML (think E4X in the latest AS3 spec, MXML, XML namespaces in the code etc.) Those who hated Flash for reasons unrelated to its technological merits hated everything associated with it. In particular, that hate would come from people doing JavaScript. HTML5, which was supposed to replace Flash but never did, was fueling this hype train even more.

At the time, JavaScript programmers felt inferior to every other kind of programmer. Well, nobody considered JavaScript programmers to be a thing. If you were doing something on the Web, you'd try hard to work with some other technology that compiled to JavaScript, but god forbid you actually write JavaScript. But people like Steve Yegge and Douglas Crockford worked on popularizing the technology, backed by big companies who wanted to take Adobe out of the standardization game. And, gradually, the Web migrated from Flash as a technology for Web applications to JavaScript. JSON was a side-effect of this change. JavaScript desperately needed tools to match different abilities of Flash, and XHR seemed like it wouldn't be part of JavaScript and was in general a (browser-dependent) mess, especially when it came to parsing, but also security. JSON had the potential to exploit a security hole in the Web security model by being immediately interpreted as JavaScript data, and this was yet another selling point.

To expand on the last claim: one of the common ways to work with JSON was through dynamically appending a `script` element to HTML document, then extracting the data from that element, which side-stepped XHR. There was also a variant of pJSON (I think this is what it was called, but don't quote me, it was a long time ago), where thus loaded JSON would be sent as this:

    $callback({ ... some JSON ... })

Where the `$callback` was supplied by the caller. I'm not entirely sure what this was actually trying to accomplish besides dealing with the asynchronous nature of JavaScript networking, but I vaguely remember hearing about some security benefits of doing this.

Anyways, larger point being: JSON came to life in a race to dislodge one of the dominant forces on the Web. Speed of designing the language and the speed of onboarding of new language users was of paramount importance in this race, where quality, robustness and other desirable engineering qualities played a secondary role, if at all.


As someone else said, JSONP, the callback thing you showed, is/was a same-origin workaround: script tags are legacy and can be loaded cross-origin and executed, but that doesn't expose the content of the script to you.

So you pass along the callback that you want it to execute in a query string param, the script comes back as a call to your callback, and you can then get at the data even though it's coming from a different origin. The remote side has to opt in to looking at the callback query param and giving you back JSONP, so it's kind of a poor man's CORS where the remote side is declaring that this data is safe to expose like this. Of course on the flipside, you're just getting and executing whatever Javascript the remote chooses to send, so you're trusting them more than you have to with a modern fetch/xhr using CORS.
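
A minimal sketch of the remote side of that opt-in, in Python for illustration. The function name, the default `callback` parameter name and the identifier-only rule are my assumptions, not anything standardized:

```python
import json
import re
from urllib.parse import parse_qs

# Assumption: only plain identifiers are accepted as callback names, so the
# caller cannot smuggle arbitrary script in through the query string.
_CALLBACK_NAME = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def jsonp_body(query_string, data, param="callback"):
    """Return a JSONP body if a valid callback was requested, else plain JSON."""
    payload = json.dumps(data)
    names = parse_qs(query_string).get(param, [])
    if names and _CALLBACK_NAME.fullmatch(names[0]):
        # The response is a script that calls the caller's function with the data.
        return f"{names[0]}({payload});"
    return payload


wrapped = jsonp_body("callback=handleData", {"ok": True})
plain = jsonp_body("", {"ok": True})
rejected = jsonp_body("callback=alert(1)//", {"ok": True})
```

The page on the other origin would load this URL through a script tag, and the browser would then run `handleData(...)` with the data as its argument.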


Oh, yeah, thanks for the refresher. It's been a while!


Before CORS, JSONP was one of the few ways to work around the same-origin policy.


> HTML5 that was supposed to replace Flash, but never did was fueling this hype train even more.

My impression of what happened was that the iPhone replaced Flash, and therefore HTML5 couldn't replace Flash because Flash was already gone.

There were some nasty things about Flash, but in retrospect mobile applications are so much worse. We used to have things better than we do now.


> There were some nasty things about Flash, but in retrospect mobile applications are so much worse.

Not really. Flash was terrible on phones that supported it. There were SDKs that turned Flash applications into native iOS ones, IIRC, but otherwise Flash was a dead end once mobile started to grow.


Not really... people made these claims without any testing, as per usual.

I was on Adobe's community advisory board at the time of the iPhone fight against Flash and I'd work with all sorts of things that were supposed to be for the phones.

Flex had problems with phones. Macromedia and later Adobe built this GUI framework on top of Flash with multiple problems... performance being one of them, but the other, and perhaps more important problem was that Flex was created by people who wanted to copy Swing. In no way was it any good for making typical smartphone UI.

So, Adobe tried, but with very little commitment, to produce some sort of Flex-based "extension" for smartphones... and that thing never went beyond prototype. Also, at that exact time, Flex was transitioning to using the new text rendering engine, which, while it offered more typographically pleasing options, was really, really slow to render.

People behind Flash player had some good ideas. Eg. Flash Alchemy: a GCC-based backend for ActionScript that made Flash very competitive performance-wise (but never really went beyond prototype). Around that same time a new UI framework appeared in Flash aiming to utilize GPU for rendering, which was a big step in the right direction, especially considering how "native 3D" in Flash failed (it was all on CPU, operating very heavy display objects).

None of these ideas saw much traction, in particular because Adobe's management responsible for the product lived under an illusion of invincibility. They did a little bit of something, just enough to keep the lights on, but they didn't realize they were side-tracked until it was late. And even at the time it was late, they made some really bad choices. Instead of open-sourcing the player, they started a feud with people who wanted to maintain Apache Flex (the Adobe-abandoned Flex) because of some irrelevant IP right on the Flash player core API. They never officially recognized Haxe. And, generally, undermined a bunch of other projects that targeted their platform (Degrafa comes to mind).

They didn't come clean with major users of their technology, repeating "Adobe is eternally committed to supporting Flex" until they left it in the ditch and forgot all about it. They made it very, very hard to support them in whatever they were doing.

----

Bottom line, Flash could've been made to perform well on smartphones. It ran OK on what would today be called "feature phones" before smartphones existed (eg. it was available on Symbian), if you knew what you were doing.

It died because of piss-poor management on one side and monopolistic desires of mega-corporations on the other side.


> Not really... people made these claims without any testing, as per usual.

Come on… I was around at the time and played with it on phones that supported it. From a user’s perspective it was very bad. Some people desperately wanted it to happen and they had their reasons, but saying that I am negative without having tested it is pointless speculation, and wrong.


I worked for a shop that made Flash games for Symbian phones (i.e. old Nokias). That's a much more resource-constrained environment than any iPhone or Android ever was. And it ran fine, if you knew what you were doing.

When Android just appeared on the market, I worked for a company that was making a video chat Facebook app. It was written in AS3 and one of the main features was to apply various effects to video. We tested it on Android, and it worked fine, even though that's a very memory and CPU intensive app.

Really, Flash player was not the problem. It couldn't go toe-to-toe with native code, but optimized AS3 code would beat unoptimized native code.

It was some form of code-golf to write Base64 encoding in AS3 and benchmark it. Usually comparing to the implementation in Flex. When Flash Alchemy came out, I wrote a version of Base64 encoding that beat it something like 100:1. A friend of mine who was known by his forum / Github user name "bloodhound" (here's some of his stuff: https://github.com/blooddy/blooddy_crypto/tree/master/src/by... ) wrote a bunch of encoding / decoding libraries for various formats (he also improved upon my Base64 code). And these were used all over the place for things like image uploads / image generation online. This stuff would beat similar Java libraries for example.

Not sure if you remember this, but at one point in the past Facebook had a Java applet that they used to manage image uploads to your "albums". Later they replaced it with a Flash applet. It didn't work any worse, that's for sure.

----

The performance problems were in Adobe AS3 code, not the player. Flex was a very inefficiently written framework. And so were AS components. But if you take AS3 3D engines, even those that were fully on CPU... you had plenty of proper 3D games. Eg. Tanki Online (a Russian game made with Alternativa3D Flash engine) was a huge hit. Even if the phone could handle a fraction of that, you'd still have plenty of room for less complex UI.


iPhone didn't replace Flash: the intent was for it to be a smartphone, not a data distribution format...

The iPhone browser, like the MacOS browser, dropped support for Flash (and most plugins, in fact, but Flash was the most noticeable). On the other hand, HTML5 was adopted quickly by Apple. So we can say that HTML5 replaced Flash (not the iPhone per se: first, it didn't come with a specific replacement, and second, an alternative was already there).

However, I wouldn't say that HTML5 is a drop-in replacement for Flash. It did help avoid Flash in some common use cases, with the video and audio tags and the standardisation of formats (which also killed the use of QuickTime/WindowsMedia/RealMedia plugins).


What do you think happened to flashgamelicense.com?


I dunno what it is. After a little search, I think you wanted to point out that games developers lost interest in it and migrated to the smartphone stores?


The reason is ease of use. XML means SAX (difficult to work with for most people), or DOM (insane API).

JSON was just: var settings = eval(json-string); Done, nothing else, just simple access.

It's all about API design. This is also why Stripe even exists
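
Roughly the same contrast in Python, as a sketch: `json.loads` stands in for the `eval` call above, and `xml.dom.minidom` stands in for a DOM API. The sample documents are made up for illustration:

```python
import json
from xml.dom import minidom

json_text = '{"server": {"port": 8080}}'
xml_text = "<config><server><port>8080</port></server></config>"

# JSON: one call, then ordinary dict access, and the number is already a number.
port_from_json = json.loads(json_text)["server"]["port"]

# DOM: find the element, reach into its child text node, convert the string.
dom = minidom.parseString(xml_text)
port_node = dom.getElementsByTagName("port")[0]
port_from_xml = int(port_node.firstChild.data)
```

Both end up with the same value, but the JSON path maps directly onto the language's own data types.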


There was an absolutely excellent essay by James Duncan Davidson, creator of the Apache Ant build tool for Java, about why XML was its configuration language, and why it was perfect at first but grew to be a huge problem. To summarize from memory:

- Ant’s config needs were modest: configs were static strings, numbers, booleans, and lists

- XML parsers existed, were well tested, and were supported by lots of tools, so XML was a quick and easy way to parse configs

- As Ant became successful, the configs grew more complex and started to need to be dynamic and have “control flow”

- Once control flow was needed XML was no longer the best choice for config parsing, but it was too late. (The correct answer is that a scripting language is what you want here.)

Edit: I posted this quote from his essay on HN back in 2020 [1]:

> Now, I never intended for the file format to become a scripting language—after all, my original view of Ant was that there was a declaration of some properties that described the project and that the tasks written in Java performed all the logic. The current maintainers of Ant generally share the same feelings. But when I fused XML and task reflection in Ant, I put together something that is 70-80% of a scripting environment. I just didn't recognize it at the time. To deny that people will use it as a scripting language is equivalent to asking them to pretend that sugar isn't sweet.

Edit 2: The reason you’re not seeing an “improved yaml” is because the improved version is just Python/Ruby/JS/PHP,…

[1] https://news.ycombinator.com/item?id=25385443


I was the first person to use Ant outside of Sun. I committed it to the Apache cvs repo after James donated it.

The initial use for it was to build the source code for what became Tomcat (also donated by Sun/James). At the time, Java didn't really have a build system. We were using Makefiles for Apache JServ, and it was horrid.

Ant was a breath of fresh air. At the time, XML was the hot "new" thing. We were instantly in love with it and trying to get it to do more. Nobody could have predicted what Ant was going to turn into. It was effectively an experiment and just a step forward from what we had previously. Iterative software development at its finest.

Similar to how Subversion was a better CVS. At the time Subversion was being developed (I was part of that as well), nobody could predict that a whole different model (git) would be so much better. We were all used to a centralized version control system.

It is entertaining watching everyone bork on about this topic, 24 years later.

By the way, Java still doesn't have a good build system. Maven and Gradle, lol.


I didn't always hold this opinion, but currently I believe general build (and deploy, integration, etc) engines should never be anything more than a logically controlled workflow engine, capable of transparently querying your system, but never directly changing it.

It should allow for any random code to be executed, but by themselves they shouldn't provide any further feature.

Ant is not like that. In fact, the Ant script was always a programming language. It lacked some flow control features, but if you have a sequence of declarations that should run one after the other, it's already an imperative programming language; you don't need loops and conditionals to qualify as one.


It's not about being perfect. It's about using the wrong tool for the problem.

Ultimately, code as configuration is the solution. The code generates the actual configuration-data which can be of an arbitrary format (e.g. json) since no one is gonna see it anyways except for debugging.

There are usually two arguments against that:

1.) code can be or become too complicated

It's a non-argument because while code can and will be complicated, the same configuration in non-code will be even worse. In the worst case, people will build things around the configuration because the configuration isn't sufficiently flexible.

What is true is that a single line of code is usually more complex than single line of yaml. The same can be said about assembly vs. any high level language. But it should be clear that what counts is the complexity of the overall solution, not a single line.

2.) code can be dangerous

This is a valid point. You will have to sandbox/restrict it and you need to have a timeout since it could run forever - and that creates some additional complexity.

I think this complexity is worth it in 90% of the cases. Even if you are not using code to generate the configuration, you probably still want to have a timeout, just in case of some malicious actor trying to use a config that takes forever to parse/execute etc.

But if you say that this is a nogo, then at least use a proper configuration language such as Dhall instead of YAML. It already exists. You don't need to invent a new configuration language to avoid YAML.
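
To make the "code generates the configuration data" idea concrete, here is a minimal sketch in Python. The service names, fields, and the `build_config` / `render_config` helpers are all hypothetical and only illustrate the pattern: ordinary code builds plain data, and the consuming program only ever sees the resulting JSON.

```python
import json

# Hypothetical service names, chosen only for illustration.
SERVICES = ["api", "worker", "scheduler"]


def service_config(name: str, replicas: int = 2) -> dict:
    """Build the configuration for one service from shared defaults."""
    return {
        "name": name,
        "replicas": replicas,
        "env": {"LOG_LEVEL": "info", "SERVICE_NAME": name},
    }


def build_config() -> dict:
    """Generate the full configuration as plain data."""
    services = [service_config(name) for name in SERVICES]
    # An override written as ordinary code instead of a templating trick.
    services[1]["replicas"] = 5
    return {"version": 1, "services": services}


def render_config() -> str:
    """Serialize the generated data to JSON for the consuming program."""
    return json.dumps(build_config(), indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render_config())
```

The sandboxing and timeout concerns from point 2 are not addressed here; this sketch assumes the generating code is trusted.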


I think it's more of a progression thing. The config file starts out pretty simple, enough where one of the text file formats is clearly the right solution and code is over-complex. But then it grows and grows, continuing to accumulate more depth and use more features of the format, and maybe some hacky extensions. At some point, it gets complex enough that, if you designed it from the start with that feature set, being code would obviously be the right solution. But now the transition over is hard to pull off, so it tends to get put off for longer than it should be.


I would like to believe it, but I have my doubts.

For instance looking at kubernetes, do you really think they didn't know in advance how complex things would get?

My explanation is different: often configuration files are used for things that are close to infrastructure/operations and those are done by folks who are already used to yaml and who are not used to high level coding as much. It's probably not a conscious decision by them, it's just what they believe is the best.


K8S maintains the internal state of the system in JSON which is easy to convert to YAML which is seen as more readable in comparison. I think their choice to support JSON and YAML as an interface for configuring K8S is because of this.

I'd also guess that the expectation was that abstractions would form around YAML to make it more powerful. Helm uses Go templating language which supports logic and loops (but is very unpleasant to write complex configs with) and the operator pattern is also popular for more advanced cases. Ultimately both end up exposing some JSON/YAML interface for configuring, though. It is up to you to decide how you create that JSON/YAML.


They could have chosen Dhall as well, which also converts to json (or YAML). So I have to wonder: why did they choose YAML in the end?

> I'd also guess that the expectation was that abstractions would form around YAML to make it more powerful.

Yeah, now you have 3 layers: you have an arbitrary abstraction which is likely different for every tool, then you have yaml and then you have the internal json. That seems like a loss to me in terms of complexity.


Dhall was first released in 2018. Helm was started in 2015. Helm did flirt with using Jsonnet which would have been better, but I think they already had charts using templates.


Well, okay, very fair point. They shall be forgiven then. :-)


Code also requires a runtime or needs to be compiled to something static, like YAML or JSON, in order to be consumed. That latter option is essentially how things are today given that templating tools that generate configurations are plentiful.

I think that this requirement makes broad adoption a lot more difficult for code-based configuration.


> That latter option is essentially how things are today given that templating tools that generate configurations are plentiful.

Yeah, that's what I mean partially by "building around the configuration". How is that better than writing the configuration directly in code?

Obviously, if the configuration is described with code then there is a runtime needed. But since the configuration is usually being run by a program... there should be a runtime already. It's not like I'm saying that just any code/language should be available.


Any configuration format that cannot be auto-formatted (i.e. has significant whitespace) goes straight in the trash. 90% of my time spent with YAML is usually trying to figure out if something is indented correctly.


> Any configuration format that cannot be auto-formatted (i.e. has significant whitespace) goes straight in the trash. 90% of my time spent with YAML is usually trying to figure out if something is indented correctly.

What are you talking about, exactly? If I hit alt+L in IntelliJ, my YAML file is auto-formatted.


Indentation changes the structure of the data. There is no deterministic way to auto-format YAML; you have to make sure, as a human, that everything is at the right indentation level:

   - something
     - a
     - b
Is completely different to:

   - something
     - a
       - b


It can be deterministically auto-formatted as long as you respect the format, like with everything. If you wrongly indent a line then it’s your fault and the auto-format can’t know if some indent is wrong or not. I don’t see what’s wrong here; it’s like adding a superfluous } in JSON or not closing a tag in XML.


It complicates moving stuff around the hierarchy.


What about stopping the little dwarf who writes into your files without authorization?


Problem is that it's pretty easy to have "wrong" whitespace that'll get formatted to something different than you expected.


Using IntelliJ to edit YAML files is a bit much


> Using IntelliJ to edit YAML files is a bit much

Not if those YAML files are part of a larger project, which they quite obviously are.


That's actually an advantage of indentation sensitivity: the indentation is always correct, so you don't need to auto-format it.


The indentation is always syntactically correct, but good luck ensuring it's semantically correct.


Why would that be any more difficult than in a brace-based language?


Because braces delimit entire blocks of code, whereas leading whitespace only delimits a single line of code.

It is very easy to stick your cursor between two braces and start typing or paste the clipboard. It is easy to see exactly which depth the added content will exist in. If you'd like a little breathing room, you can put some empty lines between the braces and insert there. Either way, once you (auto)format it'll be fine.

But if whitespace is significant, it's not enough to place your cursor in the spot where the content — the whole block — belongs because there is no one spot where the content belongs. Rather, each line of the content you wish to insert must be placed individually. When pasting a block, placing the cursor at the correct spot before the paste is no guarantee that all of the content will be indented correctly; you can only guarantee that the first line is correct (and even then, doing so might require taking into account any leading whitespace of the copied content). Subsequent lines might need to be adjusted to retain the same relative indentation relative to the first line — or not! It depends on your particular scenario.

But with braces, there really is no “it depends”. Place cursor, put text. It just works.
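
The "it depends" part of pasting into an indentation-sensitive file can be sketched with Python's standard `textwrap` module. The `paste_at_depth` helper and the two-space indent are assumptions for illustration; the point is that the tool has to be told the target depth, because nothing in the pasted text itself says where it belongs.

```python
import textwrap


def paste_at_depth(block: str, depth: int, indent: str = "  ") -> str:
    """Re-indent a copied block so its outermost lines sit at `depth` levels.

    The relative indentation between the block's lines is preserved.
    """
    # Drop the common leading whitespace the block had where it was copied.
    normalized = textwrap.dedent(block)
    # Prefix every non-blank line with the indentation of the target depth.
    return textwrap.indent(normalized, indent * depth)


# A block copied from four spaces deep somewhere else in the file.
copied = "    - name: a\n      value: 1\n    - name: b\n      value: 2\n"
print(paste_at_depth(copied, depth=1))
```

With braces the equivalent step is unnecessary for correctness: the formatter can derive the depth from the delimiters after the paste.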


What are your thoughts on Helm templates...


> And if you still think it's bad anyways, you can always improve on it. But very few actually want to put on the enormous amount of work needed to improve YAML or create a new language.

The issue is less about putting in the work to create a new language, it's about convincing a significant chunk of an ecosystem to use that new language.

> IMO if there was something that was substantially better, we would see projects switching to it in a heartbeat.

Switching costs are non-zero and can block a switch.

For example: Helm charts are built on YAML and gotmpl, even though JSON is also an option. I can make a fairly compelling argument that Helm would be better with JSON and Jsonnet, but that ignores a huge amount of investment in existing Helm charts. Whatever benefits are there would be swamped by the cost in terms of additional complexity and potential porting costs.


This is spot on. My brief exposure to Jsonnet made me realize how superior it can be, especially for k8s manifests. Unfortunately the learning curve is steep, to the point that I've held off on attempting to introduce it anywhere new.

Recently I discovered kpt (https://kpt.dev/), which attempts to improve the k8s manifest situation and seems to have a lot of good ideas. Considering how long it's been under development though, it may also never catch on among mainstream k8s users.

Grafana recently switched their dashboard DSL to one based on Cue. I haven't yet dug into learning it, but it seems potentially even more powerful than Jsonnet. They also have the advantages of a much smaller audience (relative to k8s manifests) and of putting in the work up front to define the new language and a sample implementation (useful out of the box), not to mention making it a requirement going forward.


I suppose where I was going with all that is: hopefully, as people are exposed to superior solutions within a smaller context, they might be open to considering similar alternatives for other purposes.


> But very few actually want to put on the enormous amount of work needed to improve YAML or create a new language.

No need to create a new language: S-expressions have existed for longer than I’ve been alive.

> IMO if there was something that was substantially better, we would see projects switching to it in a heartbeat.

I disagree. Path-dependence is a real issue, as are local maxima.

> the effort to make any kind of switch so you can shorten Norway

It’s not so that one can shorten Norway; it’s to avoid silent errors (like the Excel-gene issue). Fortunately, this particular bug no longer exists in the current YAML spec.
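
For readers who have not run into it, the silent error can be sketched as follows. This is a simplified stand-in for how unquoted scalars resolve to booleans, not a YAML parser; the word lists follow the YAML 1.1 boolean type and the YAML 1.2 core schema, and `resolve_scalar` is a hypothetical helper.

```python
# Simplified stand-in for plain-scalar boolean resolution; not a YAML parser.
YAML_1_1_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML_1_1_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}
YAML_1_2_TRUE = {"true", "True", "TRUE"}
YAML_1_2_FALSE = {"false", "False", "FALSE"}


def resolve_scalar(text: str, version: str = "1.1"):
    """Return a bool if the unquoted scalar resolves to one, else the string."""
    if version == "1.1":
        true_set, false_set = YAML_1_1_TRUE, YAML_1_1_FALSE
    else:
        true_set, false_set = YAML_1_2_TRUE, YAML_1_2_FALSE
    if text in true_set:
        return True
    if text in false_set:
        return False
    return text


country_codes = ["SE", "DK", "NO", "FI"]
print([resolve_scalar(code, "1.1") for code in country_codes])
print([resolve_scalar(code, "1.2") for code in country_codes])
```

Under the 1.1 rules the entry for Norway comes back as `False` with no warning; under the 1.2 rules it stays the string `"NO"`. Many parsers in the wild still default to 1.1-style behaviour, so quoting such values remains the safe habit.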


You say "S-expressions" like it's a standard, but it's really hundreds of different standards. Maybe someone needs to create nice websites for SON (scheme object notation), RON (racket object notation), ION (IMAP object notation), CLON, CLJON, ELON, R7RON (large and small), and the rest.


> If it was that bad, nobody would use it.

> IMO if there was something that was substantially better, we would see projects switching to it in a heartbeat.

This weird line seems to govern a very persistent & substantial minority of thinking in tech that I've never been able to grok. It's like a sort of "natural selection" of quality - not only have I never seen it apply in real life, there's a reasonable argument that the opposite applies (a topic on which there's been plenty written). I guess it's based on some imagined model of each individual being an intentional & fully informed (& objectively correct) decision-maker in their own tech usage; it's certainly not based in experience.

I use YAML in work. A lot. It may be the filetype I interact with most. Does that make me an advocate?

I use it because the tools I interact with default to it - I have not actively selected each of these tools; of the subset I have chosen myself, YAML was one of the cons in that selection process.

JSON is imperfect. XML is imperfect. TOML is imperfect.

YAML is not imperfect - that implies something approaching quality.


I don't think using it makes you an advocate. I totally understand the point that most of the time we're not choosing it. It has been chosen for us in one way or another and that sucks.

My whole point is that we should focus on the projects using YAML instead of the format itself. Complaining about YAML at this point is like kicking a dead horse. We know it's not ideal, but saying it's crap is not gonna change anything.


> saying it's crap is not gonna change anything.

That's true, & at the end of the day, people use YAML for a reason - one that other alternatives fall down on: it's the same reason people use Markdown (& despite it having similar drawbacks to YAML it's become even more ubiquitous). Ultimately, YAML & Markdown are good for one common reason: human write-ability. A lot of people think it's about readability, but it isn't: readability may be a pro in terms of quality, but it's write-ability that actually drives adoption. YAML has pretty bad readability in reality (people think it's good because they confuse readability with aesthetics), in no small part due to its non-standardised indentation (also in smaller part due to its ambiguous data-types).

JSON is probably the most readable of all of them (readability isn't about being terse or "clean"), but the requirement for terminators makes it much less write-able. XML is worse again in this regard.

TOML is a kind of middle-ground - adopting many of YAML's flaws but fixing some of its most egregious faults. Personally though, I think it seems like an odd attempt to resurrect INI, & I'd prefer something like StrictYAML[0].

[0] https://pypi.org/project/strictyaml/


The Efficient Markets hypothesis applied to software engineering :).

Maybe with zero friction, infinite developer resources and perfect knowledge it would be true.


I would prefer almost any of the alternatives you mentioned to yaml for most places it is used. Except for json, due to lack of comments.

Although for things where you have ifs, loops, etc., like CI pipelines and ansible, it really should be using a fully featured language instead of defining a new mini-language inside of yaml.

Yes, none of those are perfect, but that doesn't mean none of them would have been a better choice.


Yeah... this is like saying that murdering all people in line to buy ice-cream and waiting for your turn to buy ice-cream have pros and cons: either you get to buy ice-cream faster, but some people die, or you have to wait, but nobody dies.

YAML is trash. There's no pros to using it. Not in any non-preposterous way.

> If it was that bad, nobody would use it.

If fossil fuels were so bad for the planet, nobody would use them.

If super-processed fast food was so bad, nobody would eat it.

If wars were so bad, nobody would fight in them.

Do you sense where the problem with your argument is, or do I need more examples?


I'm the first person to dislike YAML, but I also like it better than the other popular alternatives. JSON sucks to write and read, but is great for machines. TOML sucks beyond simple top-level key value entries; Ini is even worse. XML reminds me of a thorn bush (but I can't explain this). Nobody on the team would want to learn to read Dhall

YAML has some "cute" features that are annoying (but anchors are nice).

Cue/JSONNET are a nice compromise between Dhall and JSON/YAML, but I don't see them reaching wide adoption (I'm hopeful it happens, but alas).

Aside: I wonder how people develop such an extremist view regarding something as mundane as config languages to equate them with casual murder


Config languages and build systems are both in a domain of maximum annoyance and minimum intellectual satisfaction.

Within the realm of software.. casual murder is wrong and all but it would be config languages driving a build system that drove me to it if anything could.


> config languages driving a build system

Yeah, this part alone is already wrong and can not work well.


Have you tried CUE?

It is very intellectually stimulating


> XML reminds me of a thorn bush (but I can't explain this). Nobody on the team would want to learn to read Dhall

I grant you the thorn bush, but "nobody would want to learn" is imo a weak argument against Dhall.


That's an overblown reaction: comparing yaml use to murdering people? Come on.

And your examples are flawed.

> If fossil fuels were so bad for the planet, nobody would use them.

It's not the planet that's using fossil fuels.

> If super-processed fast food was so bad, nobody would eat it.

It's probably not as you make it out to be, and another thing: food costs money. People choose with their wallets, whether you like it or not. Processed foods are usually cheaper. YAML, JSON, XML, they all cost the same; their indirect costs are really hard to measure.


> People choose with their wallets

You are very naive if you believe that. This can only work if people can meet the following requirements:

* People are rational and will always, or at least statistically often enough, make the best choice.

* People are egotistic and will only do what's best for them, they don't consider the benefits of others.

* People know everything there is to know about nutrition, they can deduce the short term and long term effects of consuming any quantity of any chemical in any combination with any other chemicals based on precise knowledge of their gut functionality, the bacteria that lives in it etc.

* People should be able to predict the future, at least enough for them to be able to make rational choices about the effects of their current actions. In particular, they should be able to predict famines, invention or hybridization of species of various agricultural plants, developments in pharmaceutics helping them to combat various eating-related health problems.

----

People don't choose. They happen to eat super-processed foods, drive energy-inefficient cars, take out mortgages they cannot pay etc.

Finally, eating from a dumpster is even cheaper than eating super-processed foods (whose price can very much depend on the scale of manufacturing rather than anything else). For some reason most people choose not to eat from a dumpster...


> It's not the planet that's using fossil fuels

When people say "it's bad for the planet", we need to assume that "it's bad for the humanity". The planet does not really care about CO2 emissions, humans suffering/dying from them probably do.

> If fossil fuels were so bad for the humanity, nobody would use them.


> comparing yaml use to murdering people?

Where do you see me comparing YAML to murdering people? Please read that place again. I compare murdering people to the inconvenience of having to wait in a queue.


What?

You're not bringing any arguments yourself. "There are no pros to using it" has got to be the most lazy thing I've read today.

Are we to assume that when the Google folks working on Kubernetes chose YAML they had a very brief moment of insanity? I can clearly see you hate it, but that's not an argument against it. Same for every other project that ever used YAML? Come on, that's a leap if I've ever seen one.

Those examples of yours are, again, very lazy. I'll take one of those just to prove my point: fossil fuels have bad consequences especially with overuse, but they of course had their pros, otherwise nobody would ever use it.

Take a deep breath mate. This is a discussion about YAML. Nobody is getting murdered.


> You're not bringing any arguments yourself.

You'd need to search through my post history. The arguments are long and numerous. I'm not sure I want to repeat myself again. I promise to do the search part though. So, if you have patience, you can wait and I'll hopefully find something, but cannot promise to work fast.

PS.

> Are we to assume that when the Google folks working on Kubernetes chose YAML they had a very brief moment of insanity?

I am "Google folks", so what? Just come from a department unrelated to Kubernetes (and I don't work for Google anymore).

Kubernetes is an awful project inside and out. Poorly designed, poorly implemented... It owes its popularity to the lack of alternatives early on. (Docker Swarm never really took off, by the time they tried, Kubernetes advertised itself as having way more features, way more integrations, even though the quality was low). Kubernetes fills a very important niche, that's why it's used so much. It would've been used just as much if you had to write charts in any other similar language, no matter the quality or the popularity of that language.


Now _that's_ an unpopular take. I love it!


Feels like the revenge of a guy frustrated by Kubernetes.


For the purpose of full disclosure: I worked at Elastifile, some time before and after acquisition. In terms of development of Kubernetes, I have nothing to do with it. We might have been among the early adopters, but that's about it. At my first encounter with Kubernetes I knew about it just as much as any other ops / infra programmer who'd be tasked with using it outside of Google (we actually started using it long before the acquisition).

I do, however, greatly regret that Kubernetes exists. I cherish the idea that instead of writing non-distributed applications and then trying to stitch them using the immense bloat that is Kubernetes the industry will turn to a framework that enables making applications distributed "from the inside" (like Erlang's OTP). I see Kubernetes as a "lazy" way into the future, where we waste a lot of infrastructure to cover for lack of talent and desire to learn how to do things right.

For me, Kubernetes sits in the same spot with planned obsolescence, unnecessary plastic bags and cigarette butts in the ocean. It's the comfort of today at the expense of the much greater struggle tomorrow.



Interesting comment. I agree with pretty much all of it.

Although in fairness, I never claimed YAML was good in any way, shape or form. My comment was always about the pointlessness of it all. Either we get projects to not use YAML or we're yelling at the void.

Given how passionate you are about this though, I am curious what would you suggest instead of YAML?


I agree. That post about all of the pitfalls of YAML was very well written. My two cents: The best config format that I have used is JSON with C & C++-style comments allowed. Sure, dates and numbers (int vs float) are still a bit wobbly, but it is good enough for 99% of my needs.
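
A rough sketch of what "JSON with C & C++-style comments" can look like on the loading side, using only Python's standard library. The `strip_json_comments` and `load_jsonc` names are hypothetical; the approach is simply to remove `//` and `/* */` comments that occur outside of string literals and hand the rest to the normal JSON parser.

```python
import json


def strip_json_comments(text: str) -> str:
    """Remove // line comments and /* block comments */ outside of strings."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character verbatim so \" does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            i = end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def load_jsonc(text: str):
    """Parse JSON that may contain C and C++ style comments."""
    return json.loads(strip_json_comments(text))


example = """
{
  // port the server listens on
  "port": 8080,
  /* a URL containing // must survive */
  "url": "http://example.com/a"
}
"""
print(load_jsonc(example))
```

Tracking whether the scanner is inside a string is the part that matters; a naive regex would mangle the `//` in the URL.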


Kubernetes is such a magnificent piece of software which must have been created by brilliant gifted engineers IMHO. But even they make mistakes.

..And even when mistakes were made, you can still use just JSON with k8s and forget about that stupid backward yaml. K8s is the best


> If fossil fuels were so bad for the planet, nobody would use them. If super-processed fast food was so bad, nobody would eat it. If wars were so bad, nobody would fight in them. Do you sense where the problem with your argument is, or do I need more examples?

You mention these things as if they had only downsides. They do not. Fossil fuels have lots of advantages: high energy density, good storability, easy transport. Otherwise we would not have any problem getting rid of them.

Same for ultra processed food: it is cheap to make, addictive, has long shelf life so can be easily stored and shipped around the world, and all of that makes them very profitable.

Same for wars: some people or entities (companies, countries, etc) do profit from them (or at least hope or plan to profit).

So yeah, even things you find personally disgusting have some purpose, at least to some people. It’s also the case for YAML, as for all things. Comparing YAML to fossil fuels or wars is unhelpful hyperbole. We should be able to take a deep breath and discuss these things rationally like adults.


The argument was much more simple: popular doesn't mean good (for whatever metric you are using to measure it).

No need to look for reasons why something bad is also good. That's not the point. Hitler was a vegetarian and had some other commendable character traits. None of which make him a good person.


What about Facebook? And PHP, private cars, spotify, AWS, just wanted to drop some more examples


YAML is bloody terrible, and I've hated it since the first time I was forced to use it.

JSON or TOML are always preferable, based on the use-case. If you want humans to be able to change the file, use TOML. If only machines and debug programmers need to see the file, use JSON.

YAML is in this horrible in-between where it's extremely unintuitive at first glance and it's really easy to break. The only positive aspect to YAML is that it's legible. But legible is not the same as understandable. TOML is just as legible, but it's also easy to understand and change. JSON is hard to read and hard to understand, but the structure is very durable.

I'm going to take a guess here and assume that there's a strong correlation between people who like Python and people who like YAML.


I don't know if direct code would solve any problems of YAML, the problem is you're configuring a system that isn't runnable locally, and by the nature of that, it doesn't matter if that system is configured in js, ruby, python, brainfuck, html, or yaml, without a validator you're screwed, and if you have a js validator, no reason you can't have a yaml one.

I personally think YAML has a lot of bad flaws, for one it doesn't have a schema definition similarly to JSON or XML, so you can't just say "write a yaml that conforms to this schema" and boom, autocomplete, self-documentation, etc.

That's about where I landed. "If only we had the ability to run a miniature, totally unscalable, but testable architecture locally, it'd solve all my challenges of administering a system."

This is just an indicator that what we have now isn't the best, it's just the one that functions. Kinda like humans, we're not the best biological organisms, we're just the ones that ended up winning.


Validation is pretty solvable though. Validate your configuration schema once it’s in memory. This is what we do for our yaml configs, it takes maybe an hour or two to write something from scratch. Sure it’d be nice to have it as a core feature, but it’s not a difficult add on.
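
As a minimal sketch of validating a configuration once it is in memory (so the on-disk format, YAML or otherwise, no longer matters): the schema layout and the `validate` helper below are assumptions for illustration, not taken from any particular library.

```python
# Hypothetical schema: field name -> (expected type, required?)
SCHEMA = {
    "name": (str, True),
    "replicas": (int, True),
    "debug": (bool, False),
}


def validate(config: dict, schema: dict = SCHEMA) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for key, (expected, required) in schema.items():
        if key not in config:
            if required:
                errors.append(f"missing required key: {key}")
            continue
        value = config[key]
        # bool is a subclass of int in Python, so reject it for int fields explicitly.
        if expected is int and isinstance(value, bool):
            errors.append(f"{key}: expected int, got bool")
        elif not isinstance(value, expected):
            errors.append(f"{key}: expected {expected.__name__}, got {type(value).__name__}")
    for key in config:
        if key not in schema:
            errors.append(f"unknown key: {key}")
    return errors


print(validate({"name": "api", "replicas": "3", "extra": 1}))
```

Collecting all problems instead of failing on the first one tends to be friendlier for whoever is editing the file.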


Probably because of things like Github Actions, which uses YAML. The correct thing to do is use the .yml (like 15 lines) to execute a script of some kind, that does the real work, and can more easily be tested, changed, and used on its own.

But many people instead start writing more and more in the yaml files to do what they want, and they're left with a mess that cannot easily be run outside of github actions, so then they test it less, or other teams build parallel processes to do the same things but locally, etc


> Not sure I see the point in hating on one specific configuration language. If it was that bad, nobody would use it. And if you still think it's bad anyways, you can always improve on it. But very few actually want to put on the enormous amount of work needed to improve YAML or create a new language.

The point is that this format is like a virus. It doesn't need to be improved. It needs to go away. We already have a lot of great serialization formats that have all the use-cases covered. Whenever someone chooses to serialize configuration into YAML, they are unleashing hell upon all of the humans that have to use it.

Let me put it this way. If I was designing some software to sell to you that requires a config file, why would I skip over TOML (INI), JSON, or XML to pick YAML -- a brand new format that has a lot of funny rules that take training to understand? I would be subjecting you to confusion for no good reason. If I'm designing the software for a developer, I'd use JSON or XML. If I'm designing for a non-dev, I'd use TOML. If I'm intentionally trying to cause suffering, I'd choose YAML.

Edit: I wrote this comment in a way that makes it sound objective. Programmers like to think our arguments are all objective. Truthfully, I think the hate train is really about the fact that YAML seems to be very divisive -- some people really like it, and some people feel very frustrated with the format (like me). When those of us who don't like the format encounter some system, tool, or package that relies on YAML serialization, we lose our sh--. We feel that we're being subjected to a difficult and silly new format for reasons that seem entirely arbitrary.


YAML was first released in 2001, same year JSON was formalized, so if we're going to reject a language for being "new", we probably shouldn't use JSON either. It supports comments, and doesn't use commas for lists, so you don't have the trailing comma problem for diffs. It's far from perfect, but let's not lie and pretend it kills babies or something.


I use it because I am forced to write in it as a means to configure products that don't give me any other option.

If I was free to use any format, I would always go for XML instead.


D:


Oh, the days and hours I've longed to be able to put a comment in a json config file. I would also choose XML.


You can put a comment in a yaml!


XML is perfect for everything in theory. Too bad the average programmer apparently has such an issue grokking a little complexity, adjusting their eyes to the sight of brackets, and configuring their parser correctly that we all had to throw our hands up and use dumbed down formats instead.


One of my big drivers away from XML is that processing used to be insanely expensive. Over the years this has improved some, but XML parsing is still significantly less performant than any other option, especially compared to JSON or YML. Used to be a few orders of magnitude more expensive on compute to read, and that's why many folks in the industry were happy to move on, esp for cases like message passing. I've pulled XML out of a few apps over the years in apps with a lot of message passing and the throughput and performance improvements were measurable and significant. For that use case Protobufs are much better than XML and have a schema though it's a binary format, so not usable for conf files.


> For that use case Protobufs are much better than XML and have a schema though it's a binary format, so not usable for conf files.

There's a canonical JSON representation that is widely supported: https://protobuf.dev/programming-guides/proto3/#json

Since JSON is a subset of YAML, you can also make YAML work.

There is also a text format which lies somewhere between YAML and JSON: https://protobuf.dev/reference/protobuf/textformat-spec/


Yes, that's very true. I actually have used json representation at previous gigs, I just wasn't up to adding the extra qualifiers. Whenever I've needed to use Protobufs, I've always found them very nice to work with, really. I certainly have a much warmer opinion of them than XML.


> adjusting their eyes to the sight of brackets

XML's “brackets” aren't actually brackets, but less-than and greater-than signs.


also known as angle brackets.


Actual angle brackets look like this: 〈〉 A bracket should visually surround the text it's containing, otherwise it looks confusing. The less-than and greater-than signs are only used due to ASCII limitations, though I'm wondering why they didn't go for square brackets, which are available in ASCII.


> in theory


> Not sure I see the point in hating on one specific configuration language.

Point is most devs do not control what their employers mandate to use. Ranting a bit helps them release some frustration. And how can one improve on it? It is not an individual app where I can change or upgrade a bit and be happy.

> But very few actually want to put on the enormous amount of work needed to improve YAML or create a new language.

So some people did an awful amount of work to come up with an awful config language, and everyone feels thankful about it. It doesn't seem right to me.


Not really. I personally have hated YAML with a passion since it arrived on my lap with Ansible.


I wonder if its usage in Puppet is different enough to be a factor... But the introduction of Hiera to Puppet was a godsend (to me) and since then I've liked it (or at least put up with it). As long as it's only used as a way to hold data, and not the code (like Kubernetes), then I'm ok with it. If I have to use a DSL like Puppet or Terraform, I much prefer to use Hiera to perform context-aware data lookups so I can write environment-agnostic infra code (with zero hardcoded values) that iterates over data. I'm sure Hiera could have been implemented in another markup, but I wouldn't like to force my other choice on anyone. YAML'll do.


> YAML is not perfect. Neither is JSON, TOML, XML or even code as configuration

S-expressions curiously omitted from the list of textual tree formats that are not perfect. Hmm... :-D


I am honestly curious to see if someone will create a configuration language that is just s-expressions as nested json arrays


The worst part about that is not having the symbol type to use in JSON, so you must quote every symbol (or make your language incompatible with standard JSON).
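To make the quoting problem concrete, here is a minimal Python sketch using only the standard `json` module. The config contents and the `to_sexpr` helper are made up for illustration; this is not any existing configuration language.

```python
import json

# Hypothetical config: an s-expression written as nested JSON arrays.
# JSON has no symbol type, so every "symbol" must be a quoted string,
# which makes it indistinguishable from an ordinary string value.
text = '["server", ["host", "example.org"], ["port", 8080], ["tags", "a", "b"]]'

def to_sexpr(node):
    """Render nested lists as a Lisp-style s-expression string."""
    if isinstance(node, list):
        return "(" + " ".join(to_sexpr(child) for child in node) + ")"
    if isinstance(node, str):
        return node  # symbol or string? The JSON form cannot tell us.
    return json.dumps(node)  # numbers, booleans, null

tree = json.loads(text)
rendered = to_sexpr(tree)
print(rendered)
```

Note that `"host"` (a key-like symbol) and `"example.org"` (a value) come back from the parser as the same type, so any distinction has to be by position or by some extra convention layered on top.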


S-expressions are actually quite hard to write and read when compared to TOML, and a bit more confusing than JSON (with comments).


We most certainly have a problem with our society, the internet has turned from being a place of information and sharing, to boldly whining about things that impact nobody. This is the 6th time over the course of 5 years someone has shared this link on hackernews. We get it, tools aren't perfect.

Do we have mods on HN that can get rid of these dupe posts?


First time I saw it. My day is just a little better for having seen it.


It got shoved down people's throats more. Probably because of kube.

The same thing happened with XML. First people liked it, then it became a monster. JSON and YAML came to rescue us. As YAML grew in use, its warts became more evident, so it became the new monster. Time for a new language to come to the rescue. The cycle repeats...


> If it was that bad, nobody would use it.

>if there was something that was substantially better, we would see projects switching to it in a heartbeat.

Both of these takes are extremely naive. A person’s choice of tools has almost nothing to do with the tools and everything to do with the person. Psychology governs human behavior, not logic.


> Did something happen recently to get the YAML hate train going again?

Not really; this website hasn’t been updated since 2018.


As it's used in CI/CD systems, it's almost as if people are making an ad hoc, buggy, non-portable version of Make.

I would actually love an Xmonad style configuration system for this use case. Use an actual programming language's much better grammar and resolve errors at compile time.


Ah yes, Make, the original "white space has meaning" disaster... A single tab in the wrong place and you are screwed.


So we've replaced one whitespace-sensitive language with another.

My editor tells me when I'm doing something stupid with Make whitespace. It's impossible to do the same for yaml.


JSON is not perfect either, but hey, you can write just the data with no whitespace and everything is fine! In YAML you need to first write the data, then find the whitespace error. The next logical thing is to use handwriting for configuration files, no technology at all.
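A quick Python sketch of the JSON half of that claim, using only the standard `json` module (the document contents are invented for illustration): whitespace between tokens does not change what gets parsed.

```python
import json

# The same hypothetical config, once with no whitespace at all and once
# with deliberately chaotic indentation.
compact = '{"name":"app","ports":[80,443],"debug":false}'
spaced = """
{
        "name": "app",
  "ports": [80,
      443],
                "debug": false
}
"""

parsed_compact = json.loads(compact)
parsed_spaced = json.loads(spaced)

# Both forms parse to the same data; indentation carries no meaning.
print(parsed_compact == parsed_spaced)

# Serializing without any whitespace gives back the compact form.
roundtrip = json.dumps(parsed_spaced, separators=(",", ":"))
print(roundtrip)
```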


> But the fact is that most times the difference between them is not substantial, so the effort to make any kind of switch so you can shorten Norway is simply not worth it.

Switching from YAML 1.1 (2005) to YAML 1.2 (2009) for that reason makes perfect sense, I think.
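For anyone who hasn't run into the Norway problem, here is a rough Python sketch of the difference. It only models which plain scalars count as booleans under the YAML 1.1 bool type versus the YAML 1.2 core schema; it is not a YAML parser, the `is_bool` helper is made up for illustration, and real parsers may deviate from either list.

```python
import re

# Plain scalars treated as booleans by the YAML 1.1 bool type.
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)
# Plain scalars treated as booleans by the YAML 1.2 core schema.
YAML12_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")

def is_bool(scalar, version):
    """Return True if an unquoted scalar would resolve to a boolean."""
    pattern = YAML11_BOOL if version == "1.1" else YAML12_BOOL
    return pattern.fullmatch(scalar) is not None

# An unquoted country code for Norway, as in `country: NO`.
for version in ("1.1", "1.2"):
    kind = "boolean" if is_bool("NO", version) else "string"
    print(f"YAML {version}: NO is a {kind}")
```

Under the 1.1 rules the country code silently becomes `false` unless you quote it; under the 1.2 core schema it stays a string.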


> But very few actually want to put on the enormous amount of work needed to improve YAML or create a new language.

Actually a lot of people do work on creating new languages, but I think language nerds create languages for other language nerds, not really for the bulk of people who need something simple. They always want to make them functional and immutable or self-hosting or build high performance webservers with them. Things that nobody wants to do with YAML.

What we need is a configuration language with the simplicity of Logo or the C that was in the old ANSI C Book.

So YAML is the most flexible and bastardizable markup language and it gets used for configuration as well, even though it is terrible.


>If it was that bad, nobody would use it

Well, maybe they think that it's worth "hating" on it because people are using it, and they want to make people aware of the problems? I know I do, though "hating" sounds like I have malicious intent, which I do not. I am sure the creator(s) are very bright people with nothing but good intentions. But I agree with the article that YAML is frustrating to work with. The "it's not perfect, but neither is anything" argument is a bit of a cop-out in my opinion, as that can be applied to anything and everything.

I feel (but don't know!) that YAML was inspired by markdown as an attempt to create a format that felt like the most natural and intuitive to read for humans, while still machine-consumable. A noble idea, but in my opinion, one that fails as soon as you have more than half a page of configuration. Then, it just becomes a pain to even figure out which parent a specific bullet belongs to. And that's not even getting into all the cleverness.

I don't want to create a new language because [XKCD standards comic]. I'd prefer people use JSON or TOML as I consider those better even if they have plenty of issues on their own.


YAML in very basic usage is OK. You need to watch the indent, and that's mostly it.

YAML using all the bells and whistles that you had no idea were even part of the spec (e.g. Anchors and Aliases) is terrible, hard to read and harder to edit.


Could just be one of those things where people have grumbled about something for a while, and when there's a sudden outcry that goes viral, brings the scale of the discontent into sharp relief.


Yeah, sure looks like it. With some of the emotions I'm seeing in this thread, it feels almost like catharsis for a lot of folks.


> Did something happen recently to get the YAML hate train going again?

I was all onboard the YAML hype train until I finally started using it more for the daily job. json > * in my stodgy brain.


Everything that is intensely loved will eventually be intensely hated. My theory is that it attracts a type of people with borderline personality, who intensely love something then after a period of disenchantment in which they fight, reject and ignore evidence that their passion is not flawless, they abruptly switch to hating it. They’re also typically the loudest voices in a community as they always feel intensely about everything, whether positive or negative.


As a dev at a company that almost exclusively uses HOCON for application configurations it's really sad that it doesn't have a bigger audience. I guess Lightbend is mostly to blame for that.

We used it in our python projects as well for a little bit until we hit a bug with how pyhocon handles certain references and we just switched to using the java implementation to serialize configs for python apps into JSON...


I think it is sort of like the joke about programming languages.

There are two types of programming languages, those that everybody hates, and those that nobody uses.


> And if you still think it's bad anyways, you can always improve on it.

JSON is so popular probably because you _can't_ improve it! It is too entrenched.


> IMO if there was something that was substantially better, we would see projects switching to it in a heartbeat

What problem do you see in adopting Dhall?


JSON and XML are not perfect but are simple formats with simple rules. YAML, on the other hand, is not simple. The specs are so baroque that JSON is part of it.

That's my problem with YAML. Other people may have other problems, I understand that.

It's nice that you believe that people tend to use good things and abandon bad things. I wish the world worked like that. But it doesn't.


The fact that YAML is a superset of JSON is the least of YAML's problems.


Write a JSON parser. Now try writing a YAML parser.


So JSON is better because it's easier to write a parser for it? By this logic, Brainfuck is the best programming language.


Yeah, true. Why not just use elisp, the one and only configuration language?

It's amazing what you can do with it, given the appropriate platform. Some people even managed to implement a decent editor using elisp.


It's just so baffling to see the industry replace things with worse things, over and over again.

The upside is that sometimes we manage to do the opposite. Perhaps it's worth it.


Unfortunately people don't automagically switch to better things, especially when they aren't the ones suffering from the choice


> If it was that bad, nobody would use it

Like I have a fu* choice with GitLab CI forced onto me by my organization.


> YAML is not perfect. Neither is JSON, TOML, XML or even code as configuration (Xmonad anyone?).

Whataboutism and false dichotomy in one.

My house isn't perfect, so it's no different than living in a cave.

My car isn't perfect, so it's no different than riding a bike.

France isn't perfect, so it's no different than North Korea.

Macs aren't perfect, so they're no different than a slide rule.


Care to elaborate?

You wrote a lot of words but said nothing. My argument is that picking only on YAML is useless because we can find faults in all of them. There's no perfect choice, just trade-offs.

What's your point?


Not the parent but the logic is fairly clear.

Something being suboptimal (JSON not allowing comments unless it's JSONC) is very different from the trash heap of poor YAML design decisions shown in this article.


Isn't this confirmation bias at work? These are poor decisions of different degrees, but they're still simply poor decisions.

I have no horse in this race. I suffered with the shortcomings of all of these formats. So I don't see a point in saying "this one is in a different category of bad". YAML was built with different ideas in mind. If we are so adamant on hating on something, we should hate the projects that chose YAML over something else.


> These are poor decisions of different degrees

We agree.

> but they're still simply poor decisions

I, the other poster, and the article writer feel a preponderance of poor decisions regarding YAML (mentioned in the story) make it much worse than a few poor decisions regarding JSON or similar.


No it's not confirmation bias. The linked site has TONS OF EXAMPLES of why. XML is verbose, that's pretty much the only problem. JSON is simple, that's its only problem. Those are TINY problems.


Not my comment, but I would agree that "If it was that bad, nobody would use it" is a weak argument: there are a lot of bad things around us which we have to use. That should not stop us from discussing that they are bad.



