Hacker News
Cue – A language for defining, generating, and validating data (cuelang.org)
354 points by wikibob 10 months ago | 152 comments

It looks like an alternative to Jsonnet which has schema validation & strict types. IMO, Jsonnet's syntax is much simpler, it already integrates with IDEs such as VSCode and IntelliJ, and it already has enough traction.

Cue seems like an e2e solution, so it's not only an alternative to Jsonnet; it also removes the need for JSON Schema, OpenAPI, etc. Given that it's a 5-month-old project, though, it still needs a lot of time to evolve and mature.

We're heavily using Jsonnet for our data modeling (https://github.com/rakam-io/recipes) and are pretty happy with it. We also plan to add support for JSON Schema, which is supported by many IDEs, so that VSCode makes us feel like we're writing Java, not a Jsonnet file.

Cue is Google's 6th attempt and given that Jsonnet already has traction and works well out of the box, I would invest my time into Jsonnet at this time.

Okay, so I've spent some time trying to understand this, and I think it's actually really cool, but I found the "About" and "Concepts" documents tumid and murky.

Here is my understanding of the basic concepts:

- It allows a schema-like set of constraints to be declared for JSON (and therefore for YAML & TOML) in a syntax that is an extension of JSON.

- Cue deals with types (sets of values) where JSON deals with single values. Ordinary JSON syntax for a primitive value denotes a set containing that one value. For example, `a: 1` means that the set of possible values for `a` is {1}. Cue calls this a concrete definition.

- Cue provides operators for union (`|`) and intersection (`&`) of sets, and inequalities for ranges of numbers, and so on. `1 | 2` denotes `{1, 2}`.

- Built-in names provide the types `int`, `float`, `string`, etc.

- Cue "struct" types look like JSON objects, associating names with sets. Each name/value pair is a constraint, and all constraints must be met. For example, `{a: int}` denotes "the set of objects that have a property `a` with the value in `int`".

- Properties can be referenced by name; this allows a property defined in one place to be used as a type (set) definition in multiple places.

- When a name is bound more than once, the sets associated with each binding are intersected. This means that enforcing a schema reduces to simply combining the schema definition with the "concrete" bindings and throwing an error when an empty set is encountered.
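The list above can be sketched in a few lines of Python. This is only an illustration of the "types are sets" reading, not how Cue is implemented: finite `frozenset`s stand in for types (real types such as `int` are not finite), and the names `concrete`, `union`, `intersect`, and `unify` are made up for this sketch.

```python
from typing import Dict, FrozenSet

# Assumption: a "type" is modelled as a finite set of allowed values so it
# can be enumerated. Real types such as int would need another encoding.
Type = FrozenSet[object]


def concrete(value: object) -> Type:
    """A plain JSON-style value denotes the set containing only that value."""
    return frozenset([value])


def union(a: Type, b: Type) -> Type:
    """Counterpart of the `|` operator: either set is acceptable."""
    return a | b


def intersect(a: Type, b: Type) -> Type:
    """Counterpart of the `&` operator: both constraints must hold."""
    return a & b


def unify(schema: Dict[str, Type], data: Dict[str, Type]) -> Dict[str, Type]:
    """Intersect the sets bound to each name; an empty set is an error."""
    result = dict(schema)
    for name, allowed in data.items():
        merged = intersect(result[name], allowed) if name in result else allowed
        if not merged:
            raise ValueError(f"conflicting constraints for {name!r}")
        result[name] = merged
    return result


# Hypothetical schema: `a` is a small integer range, `b` is "x" or "y".
small_int = frozenset(range(0, 10))
schema = {"a": small_int, "b": union(concrete("x"), concrete("y"))}

# Binding the same names again with concrete values narrows each set.
validated = unify(schema, {"a": concrete(1), "b": concrete("x")})
```

Enforcing the schema is then the same operation as merging two configs: `unify(schema, {"a": concrete(42)})` raises because `{42}` and the range share no element.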

BTW, if I do have it all wrong and what I've described above is not an accurate description of Cue, then I think I'll have to go build what I've described.

Some reasons I'm afraid I'm off-base:

- What's with the lengthy discussion of lattices, and related terminology? Sure, we can construct a lattice from the set of possible types and a "subset of" operator, but that's another level of abstraction away from the necessary concepts, so I don't see what value it adds.

- The `|` operator is described as constructing a sum type, when it seems to me it must actually be a (non-discriminated) union. Elsewhere the `|` operator is described as "computing the join", which to me would mean finding an element in a lattice, but for this to make sense to me I have to think of it as adding an element to the lattice (again all the lattice or poset terminology serves only to obfuscate things).

> What's with the lengthy discussion of lattices, and related terminology?

It's how the author thinks about the values. All the operations move a value up (|) or down (&) the lattice and those operations are associative, commutative, etc. Moving down past the concrete values gets you bottom, the error value, so 1 & 2 is _|_. It fits into things like default values where (using # for * because HN formatting doesn't do escape sequences) a: int | #1 and a: int | #1 unify to 1 because #1 & #1 = #1 but a: int | #2 added would result in a: int because #1 & #2 = _|_ so there is no default anymore.
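A rough Python model of the default behaviour described above, under stated assumptions: a value is a pair of sets (everything allowed, plus the marked default), a value without a marked default uses its allowed set as the default, and unification intersects both parts. The small `INTS` range is a finite stand-in for `int`, and all names here are invented for the sketch rather than taken from CUE.

```python
from typing import FrozenSet, NamedTuple, Optional

INTS = frozenset(range(-5, 6))  # assumption: finite stand-in for `int`


class Value(NamedTuple):
    allowed: FrozenSet[int]  # every value the constraint admits
    default: FrozenSet[int]  # the marked default(s); equals `allowed` if none


def plain(allowed: FrozenSet[int]) -> Value:
    """A constraint with no marked default, e.g. `int`."""
    return Value(allowed, allowed)


def with_default(allowed: FrozenSet[int], default: int) -> Value:
    """A constraint with a marked default, e.g. `int | *1`."""
    return Value(allowed, frozenset([default]))


def unify(x: Value, y: Value) -> Value:
    """Intersect both parts; an empty allowed set is bottom (_|_)."""
    allowed = x.allowed & y.allowed
    if not allowed:
        raise ValueError("bottom (_|_): no value satisfies both constraints")
    return Value(allowed, x.default & y.default)


def resolve(v: Value) -> Optional[int]:
    """Pick the default if it is unambiguous, else a lone allowed value."""
    for candidates in (v.default, v.allowed):
        if len(candidates) == 1:
            return next(iter(candidates))
    return None  # still just a constraint, e.g. plain `int`


same = unify(with_default(INTS, 1), with_default(INTS, 1))
clash = unify(same, with_default(INTS, 2))
```

Here `resolve(same)` yields the shared default, while `resolve(clash)` yields nothing because the two defaults have an empty intersection and the allowed set is still the whole range. Since both parts are plain set intersections, the result does not depend on the order of unification.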

I don't think the extended discussion on lattices is particularly useful. A much better intro is the tutorial [1] plus the concepts page [2]

[1] https://github.com/cuelang/cue/tree/master/doc/tutorial

[2] https://cuelang.org/docs/concepts/logic/

The motivation is clearly to build a tool for configuring kubernetes but I see the combination of data, validation, and order independence as being valuable outside that use. I've definitely had projects where it'd fit. The main reason I'd think twice is because it does add a LOT of concepts for something that can be pretty simple on most projects.

Some of the reasons I like the tool can't be found in the language spec. CUE (the tool) provides import facilities that bring existing configurations (YAML, JSON, OpenAPI, Protobuf, or even Go code) into CUE. This helps with adoption and cuts the time spent porting existing configs. Another feature of the CUE tool is the ability to create small tools that operate on CUE definition files. https://github.com/cuelang/cue/blob/master/doc/tutorial/kube...

> tumid and murky

This should have its own word. Murid is too close to 'lurid'. Tumky, perhaps, though it lacks a certain heft and judginess.

how about "obtuse"

Here is an interesting example showing what it looks like: https://github.com/cuelang/cue/blob/e5d8d09b3ba2e4f48c84a3b5...

I used the following command on the cloned repository to find it[1]:

  find . -iname '*.cue' | xargs ls -l | tr -s ' ' | cut -d ' ' -f5,9 | sort -n
[1] Note that 'find' has a '-printf' option which could have been used to simplify this one-liner.
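For anyone who would rather not chain `ls`, `tr`, and `cut`, the same listing can be sketched in Python. This is an alternative, not what the commenter ran; `cue_files_by_size` is a made-up helper name, and the case-insensitive suffix check mirrors the `-iname` flag above.

```python
from pathlib import Path
from typing import List, Tuple


def cue_files_by_size(root: str) -> List[Tuple[int, str]]:
    """Return (size_in_bytes, path) pairs for *.cue files, smallest first."""
    files = [
        (path.stat().st_size, str(path))
        for path in Path(root).rglob("*")
        # Compare the suffix case-insensitively, like `find -iname '*.cue'`.
        if path.is_file() and path.suffix.lower() == ".cue"
    ]
    return sorted(files)
```

Calling `cue_files_by_size(".")` in the cloned repository and printing each pair gives size and path columns like the one-liner, and it also copes with file names that contain spaces.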

Or just search the files directly in github? Just type T and .cue

Is the language supposed to "validate"?

In this file, there is no constraint that rules out negative numbers, empty strings, ill-formed URLs, or invalid port numbers.

EDIT : found it in the doc : https://cuelang.org/docs/usecases/validation/

If configuration starts becoming more complex than looking up key value pairs, why not just write it in the programming language you are using? More languages / serialisation just adds another layer of complexity. Config as code is actually really neat.

> If configuration starts becoming more complex than looking up key value pairs, why not just write it in the programming language you are using?

Because you want to write a config file, not a program?

What next, config files, build systems and unit tests for your config files?

No thanks.

But this is a language, so it needs to be a program to run? Oh wait.. feel like I miss the joke.

What if you're using multiple programming languages?

Then just pick one of them. This may be simpler than introducing another language to the project.

What if you are interfacing between multiple projects? It’s not about introducing another language, it’s about integrating pre-existing systems.

Some web projects validate inputs in JavaScript, then in a possibly different backend language, then (unusually) in the database with CHECK statements. Even when it's all JavaScript the validations are different because the frontend and backend frameworks are different.

And some projects have multiple backend services written in multiple languages.

Then cue might be a good idea.

- can only be done in script languages(?)

- syntax errors

I get what you're saying and it can be neat, but I don't think it's universally applicable

how does this apply to compiled languages though?

Recompile when config changes.

If you're going to do that, you don't need configuration at all. Just hardcode everything.

Yeah, and it is not really new either. ioquake3 did that, it had a header file for some values. If you changed them, you had to recompile the QVMs. So then we added cvars, and some values such as HP, DMG, etc. stayed in the header file. I learnt programming C by fiddling around with ioquake3 and its forks, Tremulous especially. Good times.

I think a better title would be "Cue: A constraint programming language (from some people who work[ed] at Google)".

When you consider the use of this language within a distributed system it's pretty freaking brilliant. https://cuelang.org/docs/concepts/logic/#the-value-lattice

This might cause confusion with cuesheets, which already exist and have the same extension: https://en.m.wikipedia.org/wiki/Cue_sheet_(computing)

Or worse, it could be seen as a first attempt at a takeover. What will be the next targeted extension?

What a strange comment. Extensions aren't the property of anyone.

No one owns file extensions. Magic numbers and shebang lines declare a file's language.

I feel like that validation feature could theoretically save a lot of people that occasional 1 hour of their time that was wasted because of a typo in a config file leading to a cryptic error message.

This is literally a feature of XML everyone complained about when JSON was the hottest thing, though.

Yes, that's the interesting part to me.

Very interesting, I can think of a lot of places where this would be useful for managing infrastructure.

I highly recommend reading about some of the internals, it is making me rethink a lot of how configuration should be done.


Cue was discussed previously here


Looks like there is a nice new web site.

It seems a bit too feature laden to take off IMO.

Seems like an interesting language, but I'm very disappointed that my first attempt to view example code took 5 clicks.

The most important part of any programming language website is a short example snippet: put it above the fold on the front page!

Well at least you could find an example.

I couldn't find anything after like a minute of searching. Unless we're counting the gif that slowly shows you an example project letter by letter.

Someone please link to the code.

The website is under active development at the moment and there are incomplete parts. For code links, please take a look at the following tutorials on GitHub for now.

* https://github.com/cuelang/cue/blob/master/doc/tutorial/basi...

* https://github.com/cuelang/cue/blob/master/doc/tutorial/kube...

Ya, first rule of PL promotion is that you have an example snippet of code in the first screen seen on your web page.

Yup. Tell me what problem you solve for me, and then show me how you do that.

I didn't even have to scroll down to see an example of code and usage.

Where? I don't see any example on the page linked to.

Edit: Ah, the code example only appears for desktop browsers (or at least, browser windows wider than a phone screen).

And only if you have javascript enabled.

And it's one of those trendy shitty live-typing demonstrations instead of just letting you read some text.

Works fine on my phone, though unlike desktop, I do have to scroll a little.

Even worse, the examples are complex and incomplete.

Or make me go blacklist this project and everything to do with it (largeCapital pop >5M)

I found this comment: "In V3 the hobby field is explicitly disallowed. This is not backwards compatibly as it breaks previous field that did contain a hobby field" in https://cuelang.org/docs/usecases/datadef/

So if you add a field, you break existing code that doesn't know about the field.

This is wrong. CUE has optional closed schemas marked by a double colon. The V3 entity you’re talking about is explicitly declared to be a closed definition and therefore disallows unknown fields in entities that claim to accord to the V3 type. Not all definitions are closed.

Even for those that you choose to close, it’s a matter of having different code for different definitions. The claim that it just automatically breaks isn’t true even when closed definitions are used.

BTW, this feature speaks to CUE’s intended purpose as a configuration language. It is (or at least can be) nice to ignore unknown fields in transmitted payloads for forwards and backwards compatibility. But if I’m trying to configure some software and misspell a field, I probably want the configuration file to fail validation, not have the software run with an unintended configuration.
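The difference between open and closed definitions can be illustrated with a small Python validator. It is a sketch under simplifying assumptions (every schema field is optional, types are plain Python classes, and `validate` is a hypothetical helper), not CUE's actual semantics.

```python
from typing import Dict, List


def validate(config: Dict[str, object], schema: Dict[str, type],
             closed: bool) -> List[str]:
    """Return a list of problems; an empty list means the config passes."""
    errors = []
    for name, value in config.items():
        if name in schema:
            if not isinstance(value, schema[name]):
                errors.append(f"field {name!r} should be {schema[name].__name__}")
        elif closed:
            # Only a closed definition rejects fields it does not know about.
            errors.append(f"unknown field {name!r}")
    return errors


# Hypothetical schema and a config with a misspelled "port".
schema = {"host": str, "port": int}
config = {"host": "localhost", "prot": 8080}

open_result = validate(config, schema, closed=False)
closed_result = validate(config, schema, closed=True)
```

With the open check the typo goes unnoticed and the software would run with a default port; with the closed check `closed_result` reports the unknown field `prot`, which is what you want from a configuration file.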

I gave up looking after 1min

Can't see any examples in the github readme, or even in the codebase either.


Are you familiar with the concept of context? It's helpful because it allows easier communication by not requiring words to mean the same thing at all times.

In this case, it is helpful because we can scale our understanding of the poster's disappointment to allow us to not have to consider how it might be disappointing relative to, say, global thermonuclear war or a first kiss, but only need to think about it in relation to the other topics of discussion.

I highly recommend using context whenever communicating.

I can be disappointed when viewing a web page while simultaneously being disappointed in humanity's collective response to climate change. Context matters.

> A key thing that sets CUE apart from its peer languages is that it merges types and values into a single concept. Whereas in most languages types and values are strictly distinct, CUE orders them in a single hierarchy (a lattice, to be precise).

That's cool!

In TypeScript, an upper bound of “foo” and 5 is any. But I wonder what a lower bound of those two is supposed to be in Cue.

I can’t keep upper and lower straight, but `"foo" | 5` would allow either “foo” or 5, and `"foo" & 5` would be “bottom” (also spelled “_|_”, essentially an error)

I'm feeling both really pissed and validated right now. I thought this was going to just be a normal thoughtless config language that would only be successful as a Google project. Then I looked at the theoretical basis page. I have no formal proof about this, but I've been talking about this type of a type system with my parents and high school CS teacher for a while now! My idea sounds like this: a value is simply an actual binary string. The "type" that classifies any data is described as a formal grammar, where the binary string is a formal language. If the language can parse a given binary value with the formal grammar, that value is an element of the set (type). This naturally leads to a structural type system which can be described using existing set theory and implemented using existing parsers and formal language theory. Of course, this leads to type -> type functions which describe dependent and refinement types naturally. It's great to see this idea being broadcast on the front page here, as I can see it very clearly as a superior type system. I'm really regretting not writing a formal article about it sooner!

Why feeling pissed? It is unlikely this Cue project implements the very same semantics the way you think about them. I'd go ahead and just write that paper. Looking forward to seeing your reference implementation :-).

> Of course, this leads to type -> type functions which describe dependent and refinement types naturally.

could you expand on this? i don't understand how you get this from grammars.

(on a side note, i think that dependent types usually mean you can write functions from values to types, not type -> type?)

Yes, I mistyped that bit. I meant to say that because types (as a grammar) are values, they can be inputs and outputs of functions. The functions can fill in the hole that dependent/refinement types fill, by taking context into account (the grammars which describe simple types being context free). Type -> Type fill in for type constructors like this infinite list:

  InfiniteListOf = Type ->
    Type & InfiniteListOf Type
That function just morphs a simple grammar into another grammar. Imagine now if we could calculate something in between:

  IncrementingInfiniteListFrom = number ->
    number & IncrementingInfiniteListFrom number+1
That's where the dependent types come in, naturally.

Seems really interesting after a quick read-through. Specs that allow range-based validation look useful, and the structural declarations also feel like they'll help reduce a lot of boilerplate and repetition. I wonder how this compares with Dhall and Jsonnet, both of which I've been looking into as a safer alternative to templated YAML. With Google putting its weight behind this I'm curious if it'll start finding its way into K8s.

I am really curious why people are downvoting this.

I've always thought that a more restrictive and simpler version of yaml would be a good alternative.

Folks upset at the sibling comment weren't here at the time. The reaction was swiftly negative, and there are few charitable reasons to be found for that.

You'd think they'd make it easier to quickly see examples of the language.

How does this compare to jsonnet or hocon? Why would someone choose this over generating configuration files with a “scripting” language like python?

CUE improves on Jsonnet in primarily two areas, I think: Making composition better (it's order-independent and therefore consistent), and adding schemas.

Both Jsonnet and CUE have their origin in GCL internally at Google. Jsonnet is basically GCL, as I understand it. But CUE is a whole new thing.

jsonnet is basically GCL, but after like five failed attempts to rewrite, replace, or fix GCL, which included creating a formal semantics of GCL pointing out all the problems. I believe there was a final successful attempt to fix it.

The Cue docs talk about jsonnet explicitly.

TIL how to use cat with STDIN.

A lot of these systems ignore the querying side of a schema i.e. in graphql you can define a schema then query only certain parts of the schema so only parts of the schema are enforced at runtime

Would like to know more about how it's a good fit for database schemas. Does anybody here use any DSLs for database schemas? If yes, what and why?

Does it support working on binary data? It would be a cool thing to have in some applications, to validate conditions on binary data.

my time at Google has made me hate GCL so much.

I am pretty sure you placed the hate on the wrong target; you probably hate borgcfg, not the GCL/BCL language.

Disclaimer: was borgcfg owner 2016-2019.

I had my own qualms with the language itself too. My team had very complex Borg configs, so complex they took more than a minute to evaluate. Luckily the GCL team at the time was working on a new interpreter (gclx? IIRC), which was indeed much faster (I wrote a mandelbrot PNG generator in pure GCL to prove the big speedup and convince my team that switching was worth the effort).

Unfortunately there was no formal spec of the GCL language, so the new impl was based IIRC on reverse engineering the spec from the first implementation of the interpreter. It turned out our configs hit several cases where the original behaviour was either unsound (and thus the new impl sacrificed backwards compat) or we hit a bug in the new impl.

The main problem with the language (as opposed to issues with the implementations) was that finding the root of those behaviour differences was very hard. It was very hard to follow where the variables came from. The GCL scoping rules (and lazy evaluation) were indeed very unfriendly for debugging.

This was an extreme case of a pain that was felt on a daily basis by a lot of people I've been talking to.

Talk to the borgcfg team, and let your organization's tech leaders know as well.

A few executives are not friendly to borgcfg. Their agenda, it appeared to me, has been to deprive it of resources so it can die from rot. That's bad engineering and totally unnecessary. A healthy BCL/borgcfg would die more easily, because it would allow an easier migration path to something new.

I left the company half a decade ago. Just to be clear I actually loved borgcfg, just shared a war story. I'm happy to hear that the tooling improved. I'm unsurprised to hear that many still have mixed feelings towards GCL and ecosystem.

FWIW I'm the current maintainer of https://github.com/bitnami/kubecfg whose name is shameless xoogler bait.

you are probably right. Felt like the documentation for borgcfg was extremely hard to just find. Maybe I'm wrong about this. This is also not the place to discuss this probably.

But GCL made it so whatever variable was actually being used by borgcfg was obscured by layers upon layers of imports.

Well, if people tested their BCL like they do with C++, things would be better. But you know what, if they did that, Google would officially be a company built on BCL...

I was going to say, this looks a lot like GCL. Dynamic scope, recursive lookup in parent scopes, templates [1], everything. GCL is neat and all, but I'd almost rather write my job configs by building thin python or lisp scripts to emit json or protos.

1: https://github.com/cuelang/cue/blob/master/doc/tutorial/basi...

AFAIK, cue is written by the author (or one of them?) of GCL, and purports to fix a lot of what is wrong with it.

Well, why don't you? Job configs in borg are protobufs. GCL can produce the required protocol message but so can any other language or tool. You could use Guile or whatever your heart desires.

> I'd almost rather write my job configs by building thin python or lisp scripts to emit json or protos

This is what Facebook does: Python emitting JSON.

Weird flex, but ok.

Is it true that there are dozens of internal language and DSLs? Why does G have so many?

1. use plain .textproto as your config

2. find a use case where you want a config with ~700 lines, most of it just a repeated variant of something

3. add more fields to your config to reduce the verbosity

4. realize that your configs are now obscenely complex, and build a limited-purpose DSL

5. five teams now use your limited-purpose DSL and make more feature requests

6. cry because you now maintain a DSL

7. ???

8. promo for cross-team impact

The syntax looks a lot like Coffeescript, which I use for defining and generating data.

Cute. Is there something that takes in Cue and makes a GUI editor for the format?

Wouldn't the optimal one be a mode for your editor of choice?

I mean a form, with blanks and pulldowns. There's enough info in Cue to generate one. Not just a syntax checker that tells you that you got it wrong.

I don't think a GUI to work with Cue files is a good substitute for a GUI that lets regular users configure whatever you are configuring, because it's unlikely to be able to define valid data to the same degree as an actual app.

If it's for a technical individual to configure your software, I don't know that such a GUI would be superior to your favorite editor.

Welcome to kubernetes new configuration language!

They should just foist BCL and Borgmon upon the gullible masses. All cloud development will be set back years while people (try to) figure it all out.

We used to joke that open-sourcing Borgmon would be an industry-disabling move, but Prometheus is very popular which just proves there's a lot of people with poor taste in software.

FWIW Piccolo has also leaked into the industry in the form of Pystachio.

Gullible mass member here. What's better than Prometheus?

Eating glass.

I do not think Google has figured out anything better than BCL.

Disclaimer: Was owner of borgcfg 2016-2019.

We still haven't.


As I alluded to in the other comment, there needs to be a better tool for BCL, not a better language.

GCL, used outside Borg, is a much more pleasing experience, because there is a decent tool. And the team has done a good job of innovating continuously.

The post title is misleading. This is more of a 20% project, not an officially Google-supported one.

There's no difference.

There is a large difference between battle-tested tech maintained by a properly staffed team and a proof of concept built by someone in their spare time.

The title certainly made me think that it was the former, even though it's the latter.

> maintained by a properly staffed team

This seems to describe very few Google products. Almost daily, I have the following thought: "Has anyone at Google actually even used this product?"

(Most often with Assistant, but definitely with other products, too)

Just because you don't like it doesn't mean it's not built and supported by a large team following some VP or PM vision. Boondoggles cost a lot of resources.

Perhaps I have a different definition of "maintained" than Google does. For me, it means that it's not just online, but also has bugs regularly fixed and features added.

Most Google services (off the top of my head: Voice, Talk, Reminders) seem to reach v1 and then stop dead in their tracks. They're online, but that's it. Thousands of people request fixes or features, and they go completely unheard.

Google employees on HN have confirmed this, saying that the company rewards new products that drive ads, but not the work involved in improving and maintaining existing products. That explains why Google's released the following products for messaging, and none has been amazing: Talk, Hangouts, Voice, Wave, Allo, Hangouts Chat, and Messages/RCS.

Most of these overlapped at some point, and if you've used any of them, you wouldn't describe them as "maintained". They're more like "abandoned without publicly announcing anything".

If there's a VP or PM vision anywhere at Google that lasts for more than a year, I'd love to know what it is. It seems like a company with a thousand committees and no real creative leadership.

According to Googlers, much of the battle tested tech Google's critical systems run on is not maintained by a staffed team. Stuff that works well enough to not scare a VP and isn't a 10x moonshot doesn't get funding.

For 20% projects, Google typically doesn't make any commitments on the project's development. If you're just being sarcastic on Google's infamous product longevity, you should've used more than 3 words.

How is this better than yaml?

Less footguns, built-in validation, strong types. Me gusta.

So..why not use XML?

Because while XML has types and schemas, the actual implementation and ergonomics is absolute garbage. Those things matter to intelligent people.

So that humans eyes don't bleed when they try to read it.

One of the things they bring up in the docs is lessening boilerplate. And it's hard to get more boilerplatey than XML.


I realize this was meant to be sarcastic. However, I whole-heartedly and unironically agree. For writing small bits of configuration, just about any language will be fine. For large amounts of configuration, as is required for... oh, I dunno, let's say deploying software in the cloud, all the commonly used languages are a disaster.

Data serialization languages (like JSON, INI, XML, etc) lack the power to describe large configurations. There's no way to define an abstraction and then use it in multiple places. There's no way to constrain what is considered correct. You end up with tens of thousands of lines of very, very repetitive structures that are very fragile and hard to change.

General purpose languages are also bad for writing configurations. Yes, they're very powerful. That's not a feature when it comes to configuration. If you have a program that emits a configuration, the only thing you can do with it is run it, then inspect the resulting configuration. You can't inspect or transform the program at the level of the configuration semantics. You can't ensure that the configuration will have specific properties. You can't even ensure that the program will, in fact, emit a configuration.

So yeah, we need new configuration languages that lie in between data serialization languages and general purpose computing languages. We're starting to see them. HCL is awful, but it was an attempt at solving this problem and a move in the right direction. Jsonnet is maybe better? I dunno, I've never tried it. Dhall is interesting, though difficult for non-Haskellers to approach. CUE is also interesting. Bravo!

I wholeheartedly and unironically disagree!

We have all we need. If you want a full featured language, use that. Beyond that, we have JSON, YAML, INI and more. If you want something more complicated you can create your own DSL for your app.

If you want schemas and validation, use XML!

each of those options have major downsides:

- 'dumb data' (json, yaml, ...) is machine-, but not human-friendly. it's often tedious to write/read, and you can't abstract common parts out

- DSLs are something you have to write, debug and maintain yourself. is the bug in your config or is it in your DSL's implementation? who knows!

- full featured languages require a full blown interpreter and aren't tooling-friendly. as an example: with a Python package, it's not really possible to statically determine the dependencies, because its setup.py can declare anything it pleases depending on, say, the time of day. also you can't really run untrusted configs because they might launch some missiles


there's a decent middle ground – write a program that generates a 'dumb data' config. you write a small amount of code (friendly for humans) and run it to get a static, easy-to-process config (friendly for machines). however in practice (in most languages) the program won't be pretty – probably about as easy to read/write as an implementation of a macro that directly manipulates ASTs (i.e. not very). this can sometimes be ameliorated with some EDSL trickery, but that brings back all the problems DSLs have

and so generating configs is what projects like Dhall/Cue aim to improve. i'd say they're aiming to be something like the regex of config generation - do a limited amount of common&useful things, and make them easy to express.
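As a concrete picture of the "program that generates dumb data" middle ground, here is a minimal Python sketch. The service names, ports, and the `service` helper are all invented for illustration; the point is only that the repeated structure lives in one function and the output is static JSON.

```python
import json
from typing import Dict, List


def service(name: str, port: int, replicas: int = 1) -> Dict[str, object]:
    """One place to define the structure that every service entry repeats."""
    return {
        "name": name,
        "port": port,
        "replicas": replicas,
        # Derived field: written once here instead of once per service.
        "healthcheck": f"http://{name}:{port}/healthz",
    }


def build_config() -> Dict[str, List[Dict[str, object]]]:
    """Assemble the whole (hypothetical) deployment config."""
    return {
        "services": [
            service("frontend", 8080, replicas=3),
            service("backend", 9090),
        ]
    }


# The static artifact that machines consume; sort_keys keeps diffs stable.
rendered = json.dumps(build_config(), indent=2, sort_keys=True)
```

The downsides listed above still apply: nothing here constrains what `build_config` may do, and the only way to learn what it produces is to run it.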

We do indeed. XML is awful, YAML is a clusterfuck, JSON is often insufficient, and so is INI. TOML looks promising, but might not make it out of its niche.

If Cue is basically a better YAML with built-in schemas, that sounds pretty good.

I was hoping to like TOML, but my first experiments using a Python package for parsing it were disappointing. I found the syntax cranky, and the Python package I was using did a terrible job at providing diagnostics — very poor exception propagation.

And that seemed to be the most mature TOML parser that I could find for Python, so I decided TOML isn’t likely in my future.

TOML is extremely verbose for lists. It’s ridiculous. Sure it had less features than YAML but it also looks considerably worse in all the common use cases. Looks like INI

Yeah, I was fighting the list problem. And not being able to mix numeric and string data in a list just made the whole thing unwieldy and frustrated me.

I’ve used JSON for configs, which is not great, but at least readable. The biggest problem is lack of comments, but I added a quick hack to ignore any dictionary key starting with an octothorpe. Not ideal, but is actually handy because it makes it easy to comment out keys temporarily as well as add arbitrary commentary.

Also, the Python standard library JSON parser is very flexible and yields precise exceptions. I was able to turn those parsing exceptions into meaningful error messages with precise line and column numbers. Which came as a bit of a shock to some of my users, because a different tool written before my time but used by the same people had exactly one error message for misspelled YAML: segfault.
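A sketch of what such a loader might look like, assuming the approach described above rather than the commenter's actual code: keys starting with `#` are dropped after parsing, and `json.JSONDecodeError` (which carries `lineno`, `colno`, and `msg`) is turned into a readable message. `ConfigError`, `strip_comments`, and `load_config` are names chosen for this example.

```python
import json
from typing import Any


class ConfigError(Exception):
    """Raised with a human-readable location when the config is malformed."""


def strip_comments(node: Any) -> Any:
    """Recursively drop dictionary keys that start with an octothorpe."""
    if isinstance(node, dict):
        return {
            key: strip_comments(value)
            for key, value in node.items()
            if not key.startswith("#")
        }
    if isinstance(node, list):
        return [strip_comments(item) for item in node]
    return node


def load_config(text: str) -> Any:
    """Parse JSON config text, reporting syntax errors by line and column."""
    try:
        raw = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ConfigError(
            f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
        ) from exc
    return strip_comments(raw)
```

Renaming `"port"` to `"#port"` comments a key out temporarily, and a key such as `"#note"` can hold free-form commentary.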

> XML is awful

I never really understood this. This seems to be an oft repeated truism from mid 2000s with little backing it up.

The only awesome thing that JSON did, was lose type information as well. In fact, the only thing that I can see that JSON brought to the table was easier editing by those who didn't have IDEs, at the expense of losing type information.

XML and especially XML schema are hugely complex and almost laughably difficult to bind to any mainstream programming language. It took Java over a decade to produce a binding that could (perhaps) handle an arbitrary schema. Types boil down to sums and products, I'm not convinced the designers of XML/XSD understood this. XML has so many overlapping concepts: elements, attributes, enumerations, choices, unions, sequences, lists, element/attribute groups, substitution groups, facets, simple types, complex types etc etc IMHO it's ad-hoc and ugly; we deserve better.

My main complaint with XML is that it’s far too complex for most use cases (like config files), and thus requires an unhealthy amount of tooling to work with.

I can’t just load the file and run it through a parser and get an easily accessible object structure back, I’ll need to navigate the document with DOM or XPath.
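A quick sketch of the difference, using only the Python standard library and a made-up config: the JSON version comes back as plain dicts and lists, while the XML version has to be navigated element by element (ElementTree supports a limited subset of XPath).

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical config expressed in both formats.
JSON_TEXT = '{"server": {"host": "a.example", "ports": [80, 443]}}'
XML_TEXT = """
<config>
  <server host="a.example">
    <port>80</port>
    <port>443</port>
  </server>
</config>
""".strip()

# JSON: one call, then ordinary indexing; numbers are already ints.
json_cfg = json.loads(JSON_TEXT)
json_host = json_cfg["server"]["host"]
json_ports = json_cfg["server"]["ports"]

# XML: walk the tree, decide attribute vs. element, convert text by hand.
root = ET.fromstring(XML_TEXT)
server = root.find("server")
xml_host = server.get("host")
xml_ports = [int(port.text) for port in root.findall("./server/port")]

print(json_host, json_ports)
print(xml_host, xml_ports)
```

Both end up with the same values, but the XML path needs knowledge of the document layout and manual type conversion.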

JSON with comments (aka JSONC) is, in my opinion, the best format in wide use today. It has structure and types, but not too many, and not a lot of magic like YAML has. For the most complex cases, though, JSON(C) falls short with its lack of extensibility and inheritance.

Schemas, interesting. I wonder where we’ve seen that before..

We could have a URI that gives you a schema and some tools to generate a skeleton for that schema for you. Then you could put a header in your request for what you want to do and send that payload.

Like a simple configuration access and management (SCAM) protocol.

Yes, yes I think we’re on to something here.

can you back up what you are saying? i personally think we don't. most continuous integration pipelines need static configuration files, not languages. to be honest, i have no idea where this language fits in a software development process...

If it doesn't at least "compile" to JSON, how are we to load it into arbitrary environments?

It does

New standards: https://xkcd.com/927/

The problem is not that there are too many configuration languages. The problem is that none of them are any good. The way to solve that is to keep trying new ones until we find one that is good.

Or the premise is a false one, and a configuration language is not the right approach. And every time yet another such language falls flat, it only serves to reinforce this very point.

Configuration languages succeed and work just fine most of the time, despite their inadequacies and quirks, but no one ever complains about them when they work.

Go ahead and write your configuration as a fully Turing complete sub-application in whatever language you like. It will do everything you could possibly want, and in a few years it'll grow so complex and hairy that it will need its own fully Turing complete sub-application to configure it, lather, rinse, repeat. If you're really clever, every configuration layer will be written in a separate language with its own dependency tree, test framework, and toolchain.

Personally, I prefer not having to recompile just to change some variables and settings. I'm fine with INI or JSON (although I prefer Lua tables) when they're appropriate. The problem is not that configuration languages are a bad idea, the problem is interminable Turing creep and developers wanting every aspect of their applications to be as flexible and powerful as possible.

The premise of having a separate config language is fine - somewhere, somehow, inevitably, you're going to need a read-only data store for globals and references to system settings. You can hardcode all of those variables in your application or put them elsewhere.

> Configuration languages succeed and work just fine most of the time, despite their inadequacies and quirks, but no one ever complains about them when they work

I'm genuinely intrigued to see a non-trivial example of this, a configuration in use by an organisation with complex needs, describing its applications and its cloud infrastructure, while avoiding the pitfalls you describe that bedevil the use of Turing complete languages.

Note that this one is more distinctive, since it acts as both a schema and a data language. And it has common programming constructs.

The tl;dr is buried deep in the docs

> CUE is an extension of JSON. This improves familiarity and makes it easy to get going quickly.

> In its simplest use case, CUE can substitute for a more pleasant way to write JSON.

So it’s a google branded yaml?

Let’s raise a glass to our blue-eyed comrades who’ll adopt this fully, only for Google to decide in 2-3 years to completely and aggressively kill it again for no apparent reason.

Pardon me for slamming my ignorance down on the table here: I know Google has a tendency to kill products, but I’m not aware of much of their engineering tooling being sunset with comparable frequency.

Any named examples? I’m sure there are some and they just elude me at present; maybe seeing their names will jog my memory.

You don't hear as much about developer tools being sunset because they rarely impact as many people and that makes less interesting news.

Examples include: AngularJS (effectively replaced by a rewrite, Angular), GWT (donated as 'open source' with minimal continued Google involvement), basically 90% of all 20% projects by Googlers that weren't official Google stuff (too many examples to name, practically all of them), the XMPP API for Google Talk (and all associated library code, including some open source XMPP extensions, libjingle), tons of Chrome and Android libraries that were killed/deprecated as part of new versions not using them, ARC (https://en.wikipedia.org/wiki/Google_App_Runtime_for_Chrome), the CalDAV API for Google Calendar and any associated libraries...

I could go on, but the majority of the relevant examples are the 20% projects that never made it, and I'd rather not list any of those since they're largely single-person projects and it's kinda personal to comment on any of em.

jsonnet? :)

The jsonnet mailing list still has activity. Is the project dead?

It's on GitHub, if you fork it, they can't kill it. If you use it, contribute to it.

How much support does a configuration language that basically converts to JSON need? If your <insert language here> supports JSON, then the support required is a library to convert cue -> json or, at worst, if nobody cares about the format anymore, a program that converts one time from cue -> <insert new format here>.

Worst case scenario it has a short life and is forgotten for a better OSS alternative but remains the default for several Google projects and a number of developers are forced to learn and maintain it as it becomes brittle with age.

With that as the worst case scenario, it's more than worth the try.

It's not Google branded and who has blue eyes?

Feels like an equally crappy jsonnet

Well, it doesn't look very backwards compatible to me ;)


I'll prolly get downvoted for this:

What I want to see is a company that will provide, as a service, basically devops deployment, monitoring and provisioning regardless of the cloud provider.

Such that I can say “deploy this” and it will evaluate, track and monitor the cost of the deployment across AWS, Azure, GCP, etc., and I can click controls to see/kill/scale wherever...

Thus, I don't care about the cloud provider's deployment language, etc...

Disclosure: I work on Tree Notation.

If the authors would like to discuss how Tree Notation may be a better syntax for this language, please feel free to get in touch: breck7@gmail.com or yunits@hawaii.edu.

Here is a demonstration of what a Tree Language for config files could look like: https://treenotation.org/designer/#standard%20config

Stop commenting on every HN post, thanks

Fact: I've commented on fewer than 1% of HN posts today alone.

Fact: my post is very relevant to the OP and I'm offering to help them.

Fact: I've been a member of this site for over 12 years, and never comment on a post unless I think it adds value to the discussion or would be helpful to the parent.

Fact: What you wrote was essentially a private, directed comment that should have been a private email, and added literally nothing to the discussion.

> should have been a private email

That's a good suggestion. I didn't think of that. But thinking about it, I guess I don't want to bother the OP's inbox. If they are interested, they can get in touch.

> added literally nothing to the discussion.

I disagree. When I post a new language, and someone shares a link to a related language, those are often the most valuable comments.

Validating, defining, and using data sound like things Tree Notation syntax is perfect for. Cue's semantics are great, as are its presentation and execution; I just think a syntax switch is potentially worth exploring. I understand the strategy of being able to parse JSON as Cue, and that's probably the way to go for now, but in the future Tree Notation syntax might offer compelling advantages.

FWIW I think your post is very relevant.

I like the idea of a git based database. Please don't take a single person's opinion as that of the entire community.
