GitHub Actions is the one that surprised me. It's a recent enough invention that it was built long after all these discussions had happened, the issues had been raised, and the alternatives had been proposed.
I've had far more issues getting these YAML files correctly formatted than I should admit, and every test means pushing a commit and waiting to see what happens.
Only if you use a yaml 1.1 parser. Yaml 1.2 changed the behavior of how booleans are parsed, but it's only been 13 years since it was released, so I get why people might not have migrated to it yet.
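To make the difference concrete, here is a minimal sketch of the two boolean rules. The literal sets are the ones listed in the YAML 1.1 type repository and the YAML 1.2 core schema; `resolve_scalar` is a hypothetical helper written for illustration, not the API of any real parser.

```python
# Boolean literals per the YAML 1.1 type repository vs. the YAML 1.2 core schema.
YAML_1_1_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML_1_1_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}
YAML_1_2_TRUE = {"true", "True", "TRUE"}
YAML_1_2_FALSE = {"false", "False", "FALSE"}


def resolve_scalar(text, version="1.2"):
    """Resolve an unquoted scalar to a bool, or leave it as a string.

    Hypothetical helper: real parsers also resolve ints, floats, nulls, etc.
    """
    if version == "1.1":
        truthy, falsy = YAML_1_1_TRUE, YAML_1_1_FALSE
    else:
        truthy, falsy = YAML_1_2_TRUE, YAML_1_2_FALSE
    if text in truthy:
        return True
    if text in falsy:
        return False
    return text


countries = ["GB", "IE", "FR", "DE", "NO"]
print([resolve_scalar(c, "1.1") for c in countries])  # ['GB', 'IE', 'FR', 'DE', False]
print([resolve_scalar(c, "1.2") for c in countries])  # ['GB', 'IE', 'FR', 'DE', 'NO']
```

Under the 1.1 rules the unquoted `NO` silently becomes `False`; under the 1.2 core schema it stays a string.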
This sounds like an example of the problem: it's not up to me to migrate. We're using a third-party service, and I have no idea what version of YAML GitHub Actions' parser uses. I read this guide and can't find a version referenced anywhere:
Depends who the intended audience is. I wouldn't inflict curly braces on anyone other than software developers and certainly not where you expect people to do a lot of manual data entry.
Even I tend to write "pseudo-YAML" and then add the JSON line noise programmatically if I'm entering a lot of data.
People who've learned a Lisp family language offer our hopes and prayers that people using these configuration languages don't miss a comma, space, newline, or optional quote. :)
((name "Ford Prefect")
 (age 42)
 (possessions "Towel"))
(:name "Ford Prefect"
 :age 42
 :possessions ("Towel"))
(countries "GB" "IE" "FR" "DE" "NO")
(countries gb ie fr de no) ; if country codes are identifiers in a DSL
((first-name "Christopher")
 (surname "Null"))
(:first-name "Christopher"
 :surname "Null")
To add to sibling's comment, dhall can only load a resource from a URL if the import is annotated with the hash of its content. The "language" itself is explicitly not Turing complete, and is fully deterministic.
Look a little closer, dhall is probably the best option I've seen that preserves the properties I want out of a configuration language (correctness, determinism, turing incompleteness, easy projection into other formats, etc).
> However, when you protect an import with a semantic integrity check the import is permanently locally cached after the first request, so subsequent imports will no longer make outbound HTTP requests.
> Many users have requested Dhall support for "offline" packages ... The goal of this change is to document what is the idiomatic way to implement "offline" Dhall builds ... The trick to implementing offline builds in Dhall is to take advantage of Dhall's support for semantic integrity checks. ... The offline nature of the builds are enforced by compiling the Haskell interpreter with the -f-with-http flag ...
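A rough sketch of the idea behind a hash-pinned, cached import, in Python rather than Dhall. Everything here is an assumption made for illustration: `load_pinned`, the `fetch` callable, the dict used as a cache, and the example URL are hypothetical. As I understand it, Dhall hashes the normalized expression rather than the raw bytes, so hashing raw bytes is a simplification.

```python
import hashlib


def load_pinned(url, expected_sha256, fetch, cache):
    """Return the content for `url`, accepting it only if its hash matches.

    `fetch` is any callable taking a URL and returning bytes, and `cache` is a
    dict keyed by hash. Both are hypothetical stand-ins for illustration.
    """
    if expected_sha256 in cache:
        return cache[expected_sha256]  # cache hit: no outbound request at all
    content = fetch(url)
    actual = hashlib.sha256(content).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"integrity check failed for {url}")
    cache[expected_sha256] = content
    return content


# A fake fetcher that records how often the "network" is used.
calls = []


def fake_fetch(url):
    calls.append(url)
    return b'{ name = "Ford Prefect" }'


pin = hashlib.sha256(b'{ name = "Ford Prefect" }').hexdigest()
cache = {}
load_pinned("https://example.com/person.dhall", pin, fake_fetch, cache)
load_pinned("https://example.com/person.dhall", pin, fake_fetch, cache)
print(len(calls))  # 1
```

The second call never reaches the fetcher, which is the property the quoted docs rely on for offline builds.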
Both XML and s-expressions have a great formal advantage over JSON or YAML and most of their derivatives: the ease of being explicit by means of using names and types allows applications to process their configuration freely rather than according to inappropriately high level conventions.
It may be because for many use cases -- probably most, statistically, although that's just an intuition -- what you're really looking for is a configuration data format, not a full-blown language. You need key-value pairs, probably nested, and... that may very well be it. It may not even matter if the data format can explicitly specify data type.
Dhall and Cue are no doubt excellent and amazing for complicated use cases, but there are a lot of use cases where using them is akin to taking the VTOL jet out to pick up a few things at Trader Joe's.
Yeah, you always start out with K/V pairs but it never takes long until you need the same configuration but just with a slight tweak here or there. For example consider the same configuration but for different running environments.
People always come up with custom solutions per tool, with some sort of metaprogramming, again in YAML. This is just bonkers IMO.
So why not just use a proper language from the beginning?
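For the "same configuration with a slight tweak per environment" case, a plain-Python sketch might look like this. The keys, hosts and environment names are invented for the example, and `deep_merge` is just one way to layer overrides.

```python
import copy

# Shared settings; every name and value here is made up for illustration.
BASE = {
    "database": {"host": "localhost", "port": 5432, "pool_size": 5},
    "debug": False,
}

# Only the differences per environment.
OVERRIDES = {
    "development": {"debug": True},
    "production": {"database": {"host": "db.internal", "pool_size": 20}},
}


def deep_merge(base, override):
    """Return a new dict: `base` with `override` applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def config_for(environment):
    """Build the full configuration for one environment."""
    return deep_merge(BASE, OVERRIDES.get(environment, {}))


print(config_for("production")["database"])
# {'host': 'db.internal', 'port': 5432, 'pool_size': 20}
```

No templating layer is needed: the "metaprogramming" is an ordinary function you can test and step through.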
If I were to use it in Python, I would have to code a parser, then implement the entire type logic myself.
And I bet this is why it's not more used: JSON or YAML are comparatively easy to implement because you just need the parser. You don't have a full featured language with a set based typing system on top.
I did, and they are very underrated. CUE in itself is brilliantly designed: just enough power to be useful (conditionals, loops), but not enough to be dangerous (not Turing complete). The idea that you can define data types and data values with the same language makes for very ergonomic configuration files, and it is compatible with YAML and JSON, being able to export to and validate them.
Any new tech is esoteric at first.
When I started Python 20 years ago, there were no job offers for it in my country.
I have a weak spot for S-exprs and tend to use them when nobody will be looking. There's just something very nice and right-looking about them, and the "correct" pretty-printing is algorithmically very simple, parsing is easy, when indented they're easy to read and edit even if the file gets a bit longer etc.
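To back up the "parsing is easy" part, here is a small reader sketch in Python. It is deliberately minimal and makes assumptions: no comments, no string escapes, integers only, and a hypothetical `Symbol` class to keep bare identifiers apart from quoted strings.

```python
import re

# One token: "(", ")", a double-quoted string (no escapes), or a bare atom.
TOKEN = re.compile(r'\s*(?:(\()|(\))|"([^"]*)"|([^\s()"]+))')


class Symbol(str):
    """A bare identifier, kept distinct from a quoted string."""


def atom_value(token):
    """Turn a bare atom into an int when possible, else a Symbol."""
    try:
        return int(token)
    except ValueError:
        return Symbol(token)


def parse(text):
    """Parse exactly one s-expression into nested Python lists."""
    text = text.rstrip()
    stack = [[]]  # stack[0] collects top-level expressions
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        pos = match.end()
        opening, closing, string, atom = match.groups()
        if opening:
            stack.append([])
        elif closing:
            if len(stack) == 1:
                raise SyntaxError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        elif string is not None:
            stack[-1].append(string)
        else:
            stack[-1].append(atom_value(atom))
    if len(stack) != 1:
        raise SyntaxError("missing ')'")
    if len(stack[0]) != 1:
        raise SyntaxError("expected exactly one top-level expression")
    return stack[0][0]


record = dict(parse('((name "Ford Prefect") (age 42) (possessions "Towel"))'))
print(record["age"])  # 42
```

A quoted `"Null"` comes back as a plain string and a bare `null` as a `Symbol`, so the two never get confused.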
Sometimes you want to let a user edit some fields in a map/dict/whatever, and that's it. I use TOML here, personally, but YAML has some advantages for more complex data, especially if string keys are long.
YAML is nice; it's a shame MS Word does not support it (with syntax checking, of course), or its adoption would be even greater. It just looks nice, and non-coders easily get the hang of it.
You're the 2nd person in the reply who worries about non-coders.
Honest question - what is the example use case here? Which configuration files in YAML are currently being handled by people who don't know any coding, and would be impeded by curly braces (and anything like JSON or XML)? I think this is only an imagined problem.
Also, I have met a non-programmer lady from HR who was able to download a VB script into Outlook and adapt it to her needs (which was some kind of automation). Richard Stallman made a similar observation about Emacs configuration. I think you quite underestimate what non-programmers can do.
I don’t know much about dhall, but Cue is one of the worst pieces of software I’ve ever used. It’s clearly been designed by someone who’s so proud of their own cleverness that they never considered if any of it is actually useable or not.
My YAML secret weapon is that JSON is more or less valid YAML. So, if you want YAML with a “safer syntax”, just write JSON instead. You can sprinkle in comments if needed.
StrictYAML got rid of all these, as did all the safer YAML variants. E.g. perl5 still uses 1.0 for its cpan metadata abstraction, and still has to restrict these. I maintain the https://metacpan.org/pod/YAML::Safe module, which allows whitelisting of certain objects.
This is an issue with Ruby's YAML implementation. YAML 1.2 processors should interpret documents without YAML directive as if they were YAML 1.2 documents - see https://yaml.org/spec/1.2.2/#681-yaml-directives.
See the last section -- a parser that interprets YAML as v1.2 by default will break for YAML v1.1 documents.
There's no way to determine whether a `.yaml` file is YAML v1.1 or v1.2 without a version directive, and most YAML documents are v1.1 because most YAML parsers default to v1.1 semantics.
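For illustration, this is roughly all a tool can do to find out which version a document claims: look for a `%YAML` directive before the first `---`. The function name is hypothetical, and the sketch only covers the directive syntax, not the rest of the stream grammar.

```python
import re

# "%YAML 1.2" style directive; it must start in the first column.
DIRECTIVE = re.compile(r"%YAML[ \t]+(\d+)\.(\d+)\b")


def declared_yaml_version(document):
    """Return (major, minor) from a %YAML directive, or None if absent."""
    for line in document.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments may precede directives
        match = DIRECTIVE.match(line)
        if match:
            return int(match.group(1)), int(match.group(2))
        if not line.startswith("%"):
            break  # reached '---' or content: no version directive
    return None


print(declared_yaml_version("%YAML 1.2\n---\ncountry: NO\n"))  # (1, 2)
print(declared_yaml_version("country: NO\n"))  # None
```

When this returns `None`, which is the case for most files in the wild, the parser has to pick a default, and that choice is exactly where the 1.1 and 1.2 behaviors diverge.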
I used Ruby as an example since it's easily available on most platforms, but you could also use Python or C++ or Swift or whatever language[1] you prefer. The underlying issue -- YAML not being a superset of JSON -- is universal.
[1] Note that some libraries, such as go-yaml, do their own thing and don't conform to either v1.1 or v1.2 semantics.
- a git explainer centered around git internals, serving as an indictment of git's ux
- a parser for a restricted subset of yaml, serving as an indictment of yaml's excesses
on the front page yesterday:
- an rsync explainer centered around rsync internals, where part 1 details how rsync is wrapped in a dockerfile and a perl script in order to be made useful
- a sad thinkpiece on how baroque the web has become
on the front page the day before:
- a go utility weighing in at tens of source files that implements what should be a built-in feature of AWS
- an article on encapsulation in rust, the buried lede of which is "you must audit your transitive dependency graph in order to retain the benefits of rust"
Possibly it's because it's hard to get things right the first time, but sometimes it's better than the old way so there's a big shift to it, and once we know a better way it's hard to shift people to the better thing because the old thing was just good enough.
Come to think of it, this probably applies to any product.
Exploration? Who knows, maybe we will all replace our yaml with perl at some point. Maybe not. But it is definitely good to see these somewhat crazy experimentations. That is how we move forward.
I meant it in the sense of Ivan Illich's "Tools for Conviviality" [0], in which he addresses even your misunderstanding[1].
> My purpose is to lay down criteria by which the manipulation of people for the sake of their tools can be immediately recognized, and thus to exclude those artifacts and institutions which inevitably extinguish a convivial life style. Paradoxically, a society of simple tools that allows men to achieve purposes with energy fully under their own control is now difficult to imagine.
> The hypothesis was that machines can replace slaves. The evidence shows that, used for this purpose, machines enslave men. Neither a dictatorial proletariat nor a leisure mass can escape the dominion of constantly expanding industrial tools
> The crisis can be solved only if we learn to invert the present deep structure of tools; if we give people tools that guarantee their right to work with high, independent efficiency, thus simultaneously eliminating the need for either slaves or masters and enhancing each person’s range of freedom. People need new tools to work with rather than tools that “work” for them. They need technology to make the most of the energy and imagination each has, rather than more well-programmed energy slaves.
1. "After many doubts, and against the advice of friends whom I respect, I have chosen “convivial” as a technical term to designate a modern society of responsibly limited tools. In part this choice was conditioned by the desire to continue a discourse which had started with its Spanish cognate. The French cognate has been given technical meaning (for the kitchen) by Brillat-Savarin in his Physiology of Taste: Meditations on Transcendental Gastronomy. This specialized use of the term in French might explain why it has already proven effective in the unmistakably different and equally specialized context in which it will appear in this essay. I am aware that in English “convivial” now seeks the company of tipsy jollyness, which is distinct from that indicated by the OED and opposite to the austere meaning of modern “eutrapelia”, which I intend. By applying the term “convivial” to tools rather than to people, I hope to forestall confusion." (Illich)
There are basically two types of YAML file I've seen:
1. Application configuration. That would be better off as environment variables and/or JSON(5) for structured data.
2. CI/CD configuration. Those should really be scripts, because basically every such configuration file I've seen invents its own way of making conditionals anyway.
> CI/CD configuration. Those should really be scripts, because basically every such configuration file I've seen invents its own way of making conditionals anyway.
Hear hear. Every ci/cd is “what if we half-assed a terrible programming language in yaml?” And apparently the world is ok with that.
There should be a law stating that if you write a new DSL, you are forced to write all the plugins for all the IDEs, plus a debugger for it, before you are allowed to publish it.
For 3 months I have been working with a team leader that had the project of making all entrypoints in the program accessible from a JSON based DSL. I tried to suggest we first expose a nice regular ergonomic API, and if we can't satisfy our requirement, we then add the DSL on top.
Nope.
Got the PR this week. Now he leaves next month, so we will have to maintain that.
Having used yaml for anything from configuration to Gitlab CI pipelines, I can't see how JSON could possibly improve the situation. It's an ugly subset of YAML more readable by computers at the expense of being readable by humans.
Yes, if you want, you can write some terrible YAML. It's your project though, you don't need to make your YAML terrible. The same is true for everything else about your configuration, code, and infrastructure. I've seen some terrible JSON configuration that used the fact that some JSON parsers will overwrite subsequent reuses of keys as a method of documentation, for example.
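The duplicate-key trick is easy to reproduce with Python's standard `json` module, which keeps the last value for a repeated key. The document and the `reject_duplicates` hook below are made up for the example; `object_pairs_hook` is the real parameter that makes strict checking possible.

```python
import json

# A "comment" smuggled in as an earlier copy of the same key.
DOC = '{"timeout": "seconds before giving up", "timeout": 30}'

# By default the later value silently wins.
print(json.loads(DOC))  # {'timeout': 30}


def reject_duplicates(pairs):
    """object_pairs_hook that refuses repeated keys instead of overwriting."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


try:
    json.loads(DOC, object_pairs_hook=reject_duplicates)
except ValueError as error:
    print(error)  # duplicate key: 'timeout'
```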
I know that in YAML 1.2 you can just enter JSON into a YAML file and it'll work if that's what you want. I haven't found a real life example of why that would be better, though.
If you're going configuration format purist, use XML. It's actually good at requiring you to follow the schema and there are XML parsers for absolutely any platform. You can write conditions in an accompanying XSLT file for "active" configuration.
Most docker compose files I've seen are not complex enough to warrant scripting, but look way better as YAML than JSON. In those cases I think StrictYaml would be a great fit.
Also scripts are very different from configs. Configs are declarative, (most) scripts are imperative. If your config is complex enough then maybe you should move some logic into imperative code, but you can still have a config on top of that.
I like XML or the no. Going off to do something, entirely non-computer based could be nice. further onwards from that you might have a nice chat in one of those hidden inter-racial dimensional pubs with neither futuristic or westworld themes.
It looks like you need to use an empty string and you can tell it to translate to None. That's better than nothing, but it's still basically an inability to use actual null here.
I think I'm misunderstanding you, but StrictYAML's biggest departure from YAML is that it doesn't try to guess types at all. Everything is decoded as a string by default until you specify otherwise with its schema system.
Here you can use `NullNone` to parse `null` into None.
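To show the "strings until the schema says otherwise" idea without depending on any library, here is a toy sketch. It is not StrictYAML's API: `parse_flat`, `apply_schema` and the schema dict are hypothetical, and the parser only handles flat `key: value` lines.

```python
def parse_flat(text):
    """Read flat 'key: value' lines; every value stays a string."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        result[key.strip()] = value.strip()
    return result


def apply_schema(raw, schema):
    """Convert each value with the callable the schema names for its key."""
    return {key: schema[key](value) for key, value in raw.items()}


DOC = """
first-name: Christopher
surname: Null
age: 42
"""

# The schema, not the data, decides which values stop being strings.
SCHEMA = {"first-name": str, "surname": str, "age": int}

person = apply_schema(parse_flat(DOC), SCHEMA)
print(person)  # {'first-name': 'Christopher', 'surname': 'Null', 'age': 42}
```

Because nothing is guessed from the text, the surname survives as the string 'Null' and only `age` is converted.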
It doesn't guess, but also the data can't tell it. It's not just that you get a string by default, it's that everything starts as a string and then goes through post-processing. If you want to distinguish between 'any string' and 'not a string', you can't.
> Here you can use `NullNone` to parse `null` into None.
That's the worst possible outcome for poor Christopher Null.
The best you can do for him is use "a:" to mean null but at that point you're not really dealing with nulls, you've just gone with strings and used an empty string.
I developed a configuration format which is similar to, and a superset of, the JSON format. It's not new - it dates from well before its first announcement in 2008 - and has the following aims:
* Allow a hierarchical configuration scheme with support for key-value mappings and lists.
* Support cross-references between one part of the configuration and another.
* Provide a string interpolation facility to easily build up configuration values from other configuration values.
* Provide the ability to compose configurations (using include and merge facilities).
* Provide the ability to access real application objects safely, where supported by the platform.
* Be completely declarative.
It's similar to newer formats such as JSON5, HJSON, HOCON and similar but offers a number of features [0] which they don't, as indicated by the above list. It's not intended to occupy the niche where you find things like Cue, Jsonnet, Dhall and similar.
It was just never especially publicised when first implemented for use in Python projects, but it now also has implementations for the JVM, .NET, Go, Rust, D, JavaScript [1], Ruby and Elixir (all BSD-3-Clause licensed) and it would be great to get feedback on the project from the HN community.
The need for StrictYAML makes me realize how unnecessarily complex YAML is. I think choosing to make YAML a superset of JSON was a smart idea to encourage adoption, but as YAML has grown in popularity it has outgrown the feature. It reminds me that there is a constant tradeoff between legacy compatibility and simplicity, one that applies beyond configuration formats.
Since JSON is already terrible on its own, making YAML a superset of JSON is a great example both of "putting lipstick on a pig" and of considering "it works in the common case, let's try something even harder" an acceptable design standard.
For the last few years I've been playing with configuration files, and my final thought was always: why not have a Python module imported for configuration, just like Django does?
Sometimes I need to substitute variables.
Sometimes I need to generate similar blocks from templates.
Sometimes I need to read part of configuration from environment variable.
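A settings module covering those three needs could be sketched like this. All the names (`APP_NAME`, `DATABASE_URL`, the `worker` template, the paths) are invented for illustration, and the layout is only loosely inspired by Django-style settings.

```python
import os

# 1. Substitute variables: derive values from other values.
APP_NAME = "billing"
LOG_DIR = f"/var/log/{APP_NAME}"

# 3. Read part of the configuration from the environment, with a default.
DATABASE_URL = os.environ.get("DATABASE_URL", "postgres://localhost/billing")


def worker(name, queue, concurrency=2):
    """Template for one worker block (hypothetical structure)."""
    return {
        "name": f"{APP_NAME}-{name}",
        "queue": queue,
        "concurrency": concurrency,
        "log_file": f"{LOG_DIR}/{name}.log",
    }


# 2. Generate similar blocks from a template.
WORKERS = [
    worker("invoices", "invoices"),
    worker("emails", "emails", concurrency=8),
]
```

The application would then just `import settings` (assuming that file name) and read `settings.WORKERS`; the trade-off is that loading the configuration now means executing code.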
Everyone. They might be able to handle foo = "bar", but some will not be able to handle the quotes, or understand the difference between 123 or "123".
If you use anything else from Python lang they drop like flies. def? loop? Out of the question. I've even seen so-called IT people not able to handle scripting.
.ini files are about the only thing you can count on, (almost) guaranteed that a non-developer can handle.
this is a good idea, but difficult to nail in scope and multi-lang support. i’ve made similar attempts[1,2]. tbh if this had go support i’d probably try it today.
json, yaml et al are ways to declare literal data. this is good. they are fine.
the issues always come from what the data is used for. nailing your schema, making your data structures as simple as they can be and no simpler, this is where the engineering happens. this is the hard part. literally all that matters.
not validating arbitrary data inputs is obviously a bad idea. whether you validate them via a high level library or tediously by hand[3] isn’t very important.
what is important is that the data structures are sane, simple, and stable. if they are easy to describe, they might be a good idea. if they approach the complexity of a general purpose pl, they probably aren’t.
most literal data schemas are too broadly scoped. too general. github actions, other ci, k8s, etc. they have too many knobs, too many permutations. this is not a feature, it is a failure of design.
good schema validation won’t fix broken design. it’s unrelated.
Probably the Map() class actually takes a series of nodes, each of which could be either a comment or a key/value pair (and validates that you don't have duplicate keys). tomlkit does something similar.