Hacker News
Ask HN: File format for declarative language?
5 points by mchahn on Feb 23, 2016 | 6 comments
I'm in the process of developing a pure declarative language. The source consists of simple JS-like data structures that would be at home in a JSON file. I have two questions ...

- I'm naturally considering using JSON, but I don't think it is great for writing original code. It has too much visual clutter, like quotes. My next thought was YAML, but it seems kind of obtuse and overly complex. Any thoughts on this would be appreciated. Am I overlooking any other formats?

- Assuming I have a format chosen, can I use a new custom file suffix matching my language or must it be `.yml`, `.json`, etc.? One could argue that the suffix should match the syntax (`.json`) or one could argue it should match the semantics (my language). There are languages that match C syntax but don't have the `.c` suffix (if I'm not mistaken).

If you opt for making your declarative language a subset of JSON or YAML, presumably it will be a proper subset, with some further expectations and constraints. It makes total sense to use a custom file extension: say, `.mdl`, "my declarative language", for the sake of discussion. Users will be able to easily find their files written in your language without having to wade through all possible files of the superset format. A `.mdl` file really is its own thing, not just any `.json` or `.yml` file. Every `.mdl` file may well be a valid `.json` or `.yml` file, but the converse will not be true.
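To make the "proper subset" point concrete, here is a minimal sketch in Python. It assumes the language is layered on JSON and invents one extra constraint purely for illustration (a top-level object with `name` and `rules` keys); the real constraints would be whatever your language defines.

```python
import json

# Hypothetical constraint, for illustration only: a .mdl document must be
# a JSON object that contains at least these two keys.
REQUIRED_KEYS = {"name", "rules"}


def is_valid_mdl(text: str) -> bool:
    """Return True if text is valid JSON AND satisfies the extra .mdl rules."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return False  # not even valid JSON, so not valid .mdl either
    return isinstance(doc, dict) and REQUIRED_KEYS.issubset(doc)


mdl_doc = '{"name": "demo", "rules": []}'
plain_json = "[1, 2, 3]"

print(is_valid_mdl(mdl_doc))     # valid JSON and valid .mdl
print(is_valid_mdl(plain_json))  # valid JSON, but not valid .mdl
```

Both strings parse as JSON, but only the first passes the extra check, which is the one-way relationship described above.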

The only advantage I can think of that accrues from using the more common extension is editor support: a user's favorite editor may well provide syntax highlighting and checking, auto-indenting, etc. for `.json` or `.yml` files, but not for `.mdl` files out of the box. You might want to develop `.mdl` profiles (sic) for popular editors.

Any decent parser for the 'true' format will accept either a string or a full filename (including extension) and shouldn't expect a fixed file extension.
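As a rough illustration, assuming the language reuses JSON syntax, Python's standard `json` module parses a string directly, and reading a file never consults its extension. The file name `example.mdl` and the helper names `parse_text` and `parse_file` are made up for this sketch.

```python
import json
import tempfile
from pathlib import Path


def parse_text(text: str):
    """Parse source given as a string; no file name is involved."""
    return json.loads(text)


def parse_file(path):
    """Parse source from a file; the extension is never inspected."""
    return parse_text(Path(path).read_text(encoding="utf-8"))


source = '{"window": {"width": 640, "title": "demo"}}'

# Write the same source to a file with a custom extension and parse it.
with tempfile.TemporaryDirectory() as tmp:
    mdl_path = Path(tmp) / "example.mdl"
    mdl_path.write_text(source, encoding="utf-8")
    from_file = parse_file(mdl_path)

from_string = parse_text(source)
print(from_file == from_string)
```

The parser only ever sees text, so the choice of suffix is purely a naming decision.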

Regrettably I can't think of a single example right now, but I know I've encountered many programs that use custom file extensions which have turned out to be just some familiar format after all.

That said, it's another question whether either JSON or YAML is a great choice. I agree with your reservations about both. In fact I've had the very same problem in a couple of development projects, one recently. JSON is friendlier than XML, true, but it's not really a language to think in, even declaratively — too much clutter. YAML is certainly versatile enough, but it provides probably more than you need, and it is... yes, obtuse. Of course, it depends on your intended users; I found it was too "programmerly" for mine. I ended up developing a custom parser for a language with more syntactic sugar than YAML, which has its advantages and disadvantages.

One way to look at it is as a domain-specific language, and the choice as between an embedded DSL and a standalone DSL. The advantage of an embedded DSL is that a lot of existing tooling can be leveraged and the entire host language can be leveraged. On the other hand, debugging in the new language may wind up requiring deep knowledge of the host. JVM languages often have that drawback.

But at a higher level, the new language should be designed around its use case. If it's always used in a JavaScript context, then JSON might make more sense. If it's for *nix hackers then YAML might be better. If it's for phlebotomists then perhaps something altogether new.

Good luck.

Thanks everyone. You've given me more to think about. Unfortunately I'm still on the fence. Usually I'm a lot more decisive. I guess it shouldn't matter in the long run. But you know how it is, I want my new language to be perfect.

https://en.wikipedia.org/wiki/S-expression and https://en.wikipedia.org/wiki/Xupl are available and have many language bindings.

1) There are a couple of alternatives: TOML and AXON.

2) It can be whatever you want. It's your language. If you are reusing an existing format, it will be easier for devs and sysadmins if you keep that format's syntax.

If you don't mind I'd like to ask what the language is about?
