
Show HN: SAN – a Safe and Nice TOML/YAML Alternative - z0mbie42
https://astrocorp.net/san
======
z0mbie42
Hi author here.

We've created a new file format designed specifically for configuration: SAN
(pronounce /seɪn/ like sane).

You can find a Go parser here:
[https://github.com/phasersec/san-go](https://github.com/phasersec/san-go)

Vim syntax here:
[https://github.com/z0mbie42/vim-san](https://github.com/z0mbie42/vim-san)

SAN was created because of a need to have a Simple And Neat configuration
format (with comments, unlike JSON; easy to parse, unlike YAML...).

The main killer features compared to YAML/TOML are the following:

* Comments as first class citizens which means programs can manipulate and modify files with comments without destroying them.

* Safe

* Human and parser friendly

* Easy to use, even without syntax coloration

It's an open format and any feedback is welcome.

~~~
phaer
Could you elaborate on the differences to HCL? It looks almost the same on
first glance. Maybe minus functions and nested blocks in HCL.

~~~
z0mbie42
Hi, HCL was one of the sources of inspiration. We wanted to push it one step
further by providing a clear and open specification which makes it possible to
create parsers for languages other than Go.

So the main differences are in the ecosystem: a CLI to auto-format (think go
fmt) and validate SAN files, and parsers for languages other than Go.

~~~
phaer
Are you aware of the HCL2 specification at
[https://github.com/hashicorp/hcl2/blob/master/hcl/hclsyntax/...](https://github.com/hashicorp/hcl2/blob/master/hcl/hclsyntax/spec.md)?
It also provides a validator, even for custom schema definitions.

~~~
z0mbie42
Yes. The problem is that we want a pure data configuration format. We think
that very few applications require a full Turing-complete language for their
configuration.

Bazel also has its own (Turing-complete?) configuration programming language.

~~~
laurentlb
Bazel's language
([https://github.com/bazelbuild/starlark](https://github.com/bazelbuild/starlark)),
also used in Buck and other tools, is technically not Turing-complete. In
Bazel's BUILD files, there are additional restrictions (e.g. no function
definition), making them relatively declarative and tool-friendly.

~~~
kjeetgill
I literally just started prototyping this exact idea last week: a python
interpreter without any imports/stdlib as a configuration language.

I'm a little bummed I was beat to it. I was going back and forth on allowing
defining functions. Definitely no classes.

------
geoah
+1 for non indentation based nesting

+1 for quoting all strings

+1 for comments

-1 for the pronunciation hehe :P

I don't like that map values don't need commas while arrays do, and also don't
know how I feel about the equal sign instead of colon.

Great attempt overall. Thank you.

~~~
z0mbie42
Ack

I've opened an issue
[https://github.com/astrocorp42/san/issues/12](https://github.com/astrocorp42/san/issues/12)

------
chme
Interesting, but the first thing I checked was if it allows list elements to
have different types. And it doesn't.

That was one issue I had with TOML, because the allowed data schema is not
JSON compatible. Switching from YAML or JSON to TOML or SAN is therefore often
difficult and as such easier to just stay with the one that is currently used.

For new projects, it might be nice. But having to work around this limitation
when you need it is also a bit annoying.
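
To get a feel for how much of an existing JSON file would run into this, here is a minimal Python sketch. It assumes the target format requires every element of a list to have the same type, as described above; the function name `mixed_type_lists` and the sample document are made up for illustration.

```python
import json


def mixed_type_lists(value, path="$"):
    """Yield the path of every list whose elements do not all share one type."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from mixed_type_lists(child, f"{path}.{key}")
    elif isinstance(value, list):
        # type() rather than isinstance(), so that True and 1 (or 1 and 1.0)
        # count as different types, as a strictly typed format would see them.
        if len({type(item) for item in value}) > 1:
            yield path
        for index, child in enumerate(value):
            yield from mixed_type_lists(child, f"{path}[{index}]")


# Hypothetical document, only here to exercise the function.
doc = json.loads(
    '{"ports": [80, 443], "mixed": [1, "two", 3.0],'
    ' "nested": {"ok": ["a", "b"], "bad": [true, 0]}}'
)
problems = list(mixed_type_lists(doc))
print(problems)
```

Every path it reports is a place where the data would have to be restructured before switching formats.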

~~~
vinceguidry
I think we need to start distinguishing between data formats and configuration
formats. JSON is a data format. I loathe working with NodeJS because of the
necessity of hand-editing JSON, and that just makes me want to bash my head
against the wall.

In my side projects, generally I'll hand-write a parser with regular
expressions because I haven't yet found one I actually like both looking at
and editing. I can fix every single other problem except that one. With SAN,
the issue is the bracket delimiters. I hate futzing with them, just let me use
tabs. I want this:

    
    
      Monthly Recurring
      	Patreon: 8 | 9/17/2018 | 2mo - Acc: Capital One - Tags: arts
      	Workflowy: 4.99 | 9/17/2018 | 6mo - Acc: Amex - Tags: tools
      	RubyTapas: 15 | 9/17/2018 | 6mo - Acc: Amex - Tags: education
      	Verizon: 70.17 | 9/17/2018 | 6mo - Acc: Amex - Tags: utilities,tools
    

This is easy to look at, easy to edit with a text editor, and, once the
parser's written, dead simple with Ruby POROs, easy to parse into queryable
data structures.

I have yet to find any format that's intended to be human-editable that
doesn't actually feel 'nice to the touch'. So I just make my own.

When hand-editing the format above, I generally don't type in a whole line at
a time. I'll just add in the basics, and fill it out later, perhaps with
machine processing. Tags in particular are easy to copy-paste in by hand and
are generally the last part of the line.
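
For readers who want to see what a hand-rolled regex parser for the sample above could look like, here is a rough sketch. It is in Python rather than Ruby, it only accepts fully filled-out lines, and the field names (`amount`, `date`, `period`, `account`, `tags`) are guesses at what each column means, since the comment does not name them.

```python
import re

# Assumed field names; the original format does not label its columns.
ENTRY = re.compile(
    r"^(?P<name>[^:]+):\s*(?P<amount>\d+(?:\.\d+)?)"
    r"\s*\|\s*(?P<date>\d{1,2}/\d{1,2}/\d{4})"
    r"\s*\|\s*(?P<period>\S+)"
    r"\s*-\s*Acc:\s*(?P<account>.+?)"
    r"\s*-\s*Tags:\s*(?P<tags>.+)$"
)


def parse(text):
    """Return {section: [entry dicts]}; an unindented line starts a section."""
    sections, current = {}, None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        if not raw[0].isspace():
            current = sections.setdefault(raw.strip(), [])
            continue
        match = ENTRY.match(raw.strip())
        if match is None or current is None:
            raise ValueError(f"cannot parse line: {raw!r}")
        entry = match.groupdict()
        entry["amount"] = float(entry["amount"])
        entry["tags"] = [tag.strip() for tag in entry["tags"].split(",")]
        current.append(entry)
    return sections


SAMPLE = (
    "Monthly Recurring\n"
    "\tPatreon: 8 | 9/17/2018 | 2mo - Acc: Capital One - Tags: arts\n"
    "\tVerizon: 70.17 | 9/17/2018 | 6mo - Acc: Amex - Tags: utilities,tools\n"
)
entries = parse(SAMPLE)["Monthly Recurring"]
print(entries[1]["account"], entries[1]["tags"])
```

Supporting half-typed lines, as described above, would mean making the later groups optional; that is left out to keep the sketch short.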

~~~
Entangled
Try Dixy :)

[https://github.com/kuyawa/dixy](https://github.com/kuyawa/dixy)

~~~
vinceguidry
Someone else's hand-rolled parser isn't going to be _my_ hand-rolled parser.

------
Keats
It does look very similar to a language I wrote:
[https://github.com/Keats/scl](https://github.com/Keats/scl) although SCL
supports files inclusions and environment variables out of the box.

I would definitely change the supposed pronunciation though, no one is going
to pronounce san as sane.

------
playpause
> SAN (pronounce /seɪn/, like sane)

No, absolutely not.

~~~
bovermyer
Why not?

~~~
h1d
Because no sane person would remember or bother to read it as "sane", and it's
a failed attempt to force people to read it in some weird way.

------
ainar-g
This is indeed very sane. One thing I personally would change is making a
trailing comma in multiline lists obligatory, like in Go.

I hope this gains traction, so that I could use it in my projects without
people being upset about "some obscure configuration language".

~~~
izietto
> One thing I personally would change is making a trailing comma in multiline
> lists obligatory

Can I ask you why?

~~~
ainar-g
Sure. In Go, this was one of the details that cemented my love for the
language. Imagine you have a multiline list, sorted alphabetically.

    
    
      strs = [
        "a",
        "b",
        "c"
      ]
    

If you don't have a trailing comma (looking at you, JSON), when you add a new
item, you have to make changes on _two_ lines:

    
    
      strs = [
        "a",
        "b",
        "c", // Change 1.
        "d"  // Change 2.
      ]
    

On the other hand, when you have obligatory trailing comma on multiline lists,
there is only one change:

    
    
      strs = [
        "a",
        "b",
        "c", // No change, since the comma was already there.
        "d", // Change 1.
      ]
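
The same point can be checked mechanically. The sketch below uses Python's standard `difflib.SequenceMatcher` to count how many lines of the new file a line-based diff would mark as added or changed; the helper name `touched_lines` is made up for this example.

```python
from difflib import SequenceMatcher


def touched_lines(before, after):
    """Count lines of `after` that a line-based diff marks as added or changed."""
    matcher = SequenceMatcher(None, before, after, autojunk=False)
    return sum(
        j2 - j1
        for tag, _i1, _i2, j1, j2 in matcher.get_opcodes()
        if tag in ("replace", "insert")
    )


# Without a trailing comma: "c" must gain a comma when "d" is appended.
no_comma_before = ['strs = [', '  "a",', '  "b",', '  "c"', ']']
no_comma_after = ['strs = [', '  "a",', '  "b",', '  "c",', '  "d"', ']']

# With an obligatory trailing comma: only the new line appears in the diff.
comma_before = ['strs = [', '  "a",', '  "b",', '  "c",', ']']
comma_after = ['strs = [', '  "a",', '  "b",', '  "c",', '  "d",', ']']

print(touched_lines(no_comma_before, no_comma_after))  # 2
print(touched_lines(comma_before, comma_after))        # 1
```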

~~~
z0mbie42
Very pragmatic, I love it!

I will explore this.

------
kjeetgill
I have a weird fondness for the special kind of bikeshedding configuration
languages invite.

To add my 2 cents. I really like what I saw in Skylark (Starlark?). It's fairly
close to my dream setup: a simple Python interpreter with no libraries.

Python's object notation looks 95% like JSON, it supports comments, solid
multiline support, well known escaping rules, basic format strings when you
need them.
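
A quick way to try the "Python literals as config" part of this, without any interpreter features, is the standard library's `ast.literal_eval`. This is only a sketch of the literal subset, not Starlark: no functions, no format strings, no variables. The `load_config` name and the sample keys are made up.

```python
import ast


def load_config(text):
    """Parse a Python literal (dict, list, str, number, bool, None) as config."""
    config = ast.literal_eval(text)
    if not isinstance(config, dict):
        raise ValueError("top-level config value must be a dict")
    return config


CONFIG_TEXT = """{
    # Comments are allowed anywhere inside the literal.
    "title": "example",
    "ports": [80, 443,],      # trailing commas are fine
    "debug": False,
    "motd": '''multi-line
string''',
}"""

config = load_config(CONFIG_TEXT)
print(config["ports"])

# Anything that is not a plain literal is rejected instead of executed.
try:
    load_config("__import__('os').system('echo hi')")
except ValueError as exc:
    print("rejected:", type(exc).__name__)
```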

------
thechao
I am now going to propose the key-value-inline-notation: KVIN. If JSON is that
kid in the early '90s who had a l33t 286, then KVIN is JSON's older, iRoc
driving brother. KVIN has a 'stache that all the girls love. (It's a little
thinner than Magnum P.I.s 'stache, right now.)

A KVIN file follows one simple pattern: it's a list of key-values or comments.
Comments are any line that starts with any amount of whitespace and then a
'#'. A key-value starts with a key, ends with a value, and has a '=' in the
middle:

    
    
        key = value
    

A key is made up of any valid C-identifier or C-numeric expression. You can
'chain' keys together with '.'s:

    
    
        key ::= key [. axis] | axis
        axis ::= c-identifier | c-number
    

A value is any valid c-identifier, c-number, or "byte string". A byte string
starts with a '"', ends with a '"'. If you want to include a '"' then you have
to escape with a '\'. If you want to include a '\', then you '\' the '\'. All
whitespace is included in a byte-string: even a '\n'!

Strangely, KVIN understands _relative_ paths. If a key starts with '.' then
that means it's relative to the path just above it. Each '.' means 'go up one
level from the last value':

    
    
        foo.bar = 0
           .baz = 2
    

Would be a JSON like this:

    
    
        { "foo" : { "bar" : 0, "baz" : 2 } }
    

KVIN has one final trick, besides the busted old turbo he's been trying to get
in his iRoc: he knows how to autonumber a list:

    
    
        foo.10 = a
           .# = b
           .# = c
           .# = d
    

Which would be the JSON:

    
    
        { "foo" : [ ..., a, b, c, d ] }
    

With 'a' starting at index '10'. (The base index of arrays is 0.)

I've got code laying around, somewhere, that implements KVIN.
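
In the meantime, here is a small Python sketch of the rules above, not the code mentioned. It makes several simplifying assumptions: it returns a flat mapping from key paths to values instead of nested JSON, byte strings must fit on one line, only integers are converted (other C numeric forms stay strings), and a `#` with no numeric predecessor starts at the base index 0. The names `parse_kvin` and `parse_value` are made up.

```python
import re

LINE = re.compile(r"^\s*(?P<key>[^=\s]+)\s*=\s*(?P<value>.*?)\s*$")


def parse_value(text):
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        # Undo the two escapes described above: \" and \\.
        return re.sub(r"\\(.)", r"\1", text[1:-1])
    try:
        return int(text, 0)
    except ValueError:
        return text  # identifiers and other numeric forms are kept as strings


def parse_kvin(source):
    """Return {path tuple: value}; e.g. ('foo', 'bar') for the key foo.bar."""
    result, previous = {}, ()
    for line in source.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # blank line or comment
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"not a key-value line: {line!r}")
        key = match["key"]
        ups = len(key) - len(key.lstrip("."))  # number of leading dots
        if ups > len(previous):
            raise ValueError(f"relative key goes above the root: {line!r}")
        # Each leading dot drops one axis from the end of the previous path.
        path = list(previous[: len(previous) - ups]) if ups else []
        for axis in key[ups:].split("."):
            if axis == "#":
                # Autonumber: one more than the previous path's index here.
                before = previous[len(path)] if len(path) < len(previous) else None
                path.append(before + 1 if isinstance(before, int) else 0)
            elif axis.isdigit():
                path.append(int(axis))
            else:
                path.append(axis)
        previous = tuple(path)
        result[previous] = parse_value(match["value"])
    return result


SAMPLE = r'''
# a comment
foo.bar = 0
   .baz = 2
list.10 = a
    .# = b
    .# = c
name = "say \"hi\""
'''
parsed = parse_kvin(SAMPLE)
print(parsed)
```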

------
wezm
This looks great. I love the strictness and improvements over JSON and also
the goal of preserving comments. The absence of a specified date type feels
like a step back compared to TOML though. Without this they end up as strings in
all manner of formats.

~~~
oftenwrong
An application should reject anything not in its expected format. I would not
want to be boxed into the limits of the configuration format's date and time
representations. For example, TOML accepts datetimes with fixed offset time
zones, but not time zone designators with multiple offsets, like
America/New_York.
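
As a rough illustration of doing that check in the application, here is a Python sketch in which the config stores the date as a plain string and the application accepts exactly one layout. The chosen layout (`%Y-%m-%d`) and the function name `parse_config_date` are assumptions made for the example.

```python
from datetime import date, datetime

EXPECTED_FORMAT = "%Y-%m-%d"  # hypothetical choice; the application decides this


def parse_config_date(raw):
    """Parse a date that the config file stores as a plain string."""
    if not isinstance(raw, str):
        raise TypeError(f"expected a string, got {type(raw).__name__}")
    try:
        return datetime.strptime(raw, EXPECTED_FORMAT).date()
    except ValueError as exc:
        raise ValueError(f"expected a date like 2018-09-17, got {raw!r}") from exc


print(parse_config_date("2018-09-17"))

# A date in some other layout is rejected rather than silently accepted.
try:
    parse_config_date("9/17/2018")
except ValueError as exc:
    print("rejected:", exc)
```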

------
hardwaresofton
Thanks for including a comparison page[0]!

I'm not sure I buy that it's a large improvement over TOML -- In practice I
haven't found the homogeneous list requirement to be an issue, and I really
appreciate toml's writing styles and the use of sections to clearly delineate.

I think TOML is still my favorite (especially in terms of readability) but SAN
looks awesome

[0]: [https://astrocorp.net/san/san-vs/](https://astrocorp.net/san/san-vs/)

------
henryluo
SAM is similar to Mark Notation
([http://marknotation.org/](http://marknotation.org/)) that I created in many
ways:

    
    
      - a clean syntax and data model
      - easy to parse and use
      - friendly to human
      - improving on JSON
    

SAM vs. Mark Notation

    
    
      - SAM is primarily for configuration data
      - Mark can be used for both configuration and mixed content (like HTML, XML)
      - SAM is pure data (no templating)
      - Mark is also designed for templating, e.g. Mark Template

~~~
henryluo
Sorry, should be SAN.

------
rayredd
This looks a lot like HCL
([https://github.com/hashicorp/hcl](https://github.com/hashicorp/hcl)).

~~~
z0mbie42
Hi, HCL was one of the sources of inspiration. We wanted to push it one step
further by providing a clear and open specification which makes it possible to
create parsers for languages other than Go.

------
oftenwrong
More sane than TOML, and far more sane than YAML.

Tabs would be more sane for indentation: one tab, one indentation level. Also,
it would match gofmt, if that matters to you.

~~~
IshKebab
Here's the github issue for that. Looks like they are starting to come around
to the idea:
[https://github.com/astrocorp42/san/issues/3](https://github.com/astrocorp42/san/issues/3)

------
iofiiiiiiiii
I have used YAML and JSON and I loathe them both for reasons that are probably
well known to many. I have, however, not used TOML - so I would very much like
to hear the viewpoint of someone who has. If you use TOML today, do you see
SAN as valuable? Why exactly?

------
wattengard
What are the benefits over JSON5?

~~~
arethuza
Personally, I'd be happy with just a standard way of adding comments to JSON

~~~
benjaminjackman
You may want to check out [json5](https://json5.org). It supports comments
(among other features like trailing commas).

~~~
arethuza
My comment was actually prompted by looking at Json5 features and thinking
that I'm not really bothered about any of those other features. I'd just like
JSON to have comments.

------
ortuman
Why another serialization format? What are the advantages over TOML/YAML?

------
andybak
Human friendly with curly braces? I don't buy it.

I would never unleash a curly braces syntax on a non-developer. And if it's a
format for developers then there are already plenty of options.

~~~
krapp
Curly braces are not that unfriendly - people use parentheses in common text
all the time, and curly braces are not that different. Perhaps the concept of
a "block" or scope is a bit unfriendly but that's independent of syntax... and
arguably, curly braces make that more explicit.

Meanwhile, with significant whitespace, non-developers have to learn to be
aware that a syntactic element intended to mean nothing now suddenly means
something, and that one type of whitespace means something different than
another and that they can't be used together, which _is_ counter-intuitive.

~~~
andybak
I think semantic indentation is very intuitive to non-developers. It's used in
print everywhere.

I think there's a touch of Stockholm syndrome in your belief otherwise. You
have been corrupted and are projecting "intuitive" on to the pure. ;-)

------
akvadrako
If you are looking for a YAML alternative,
[https://sdlang.org](https://sdlang.org) is also worth considering. It's a bit
richer.

~~~
IshKebab
Richer usually means "more stuff I don't want to have to learn" though.

------
dalbotex
So it's basically JSON without commas and quoted keys?

------
rambojazz
Maybe it's just my preference, but I think a JSON/YAML replacement should
focus on removing as much redundancy as possible. For example use

    
    
        title: SAN Example
    

instead of

    
    
        title = "SAN Example"
    

and use dictionaries by default, that is

    
    
        creator
        
          name:    Sylvain Kerkour
        
          website: https://kerkour.com
        

instead of

    
    
        creator = {
        
          name = "Sylvain Kerkour"
        
          website = "https://kerkour.com"
        
        }
    

or something like that more or less...

~~~
z0mbie42
Hi, the main concern with your solution is that it does not support auto
formatting (think go fmt), whereas using braces instead of indentation allows
auto formatting and makes large files easier to read.

~~~
rambojazz
Well you can still use braces, but also allow a default behavior with no
braces.

------
xellisx
1\. SAN is already an abbreviation for something that has been in the
tech world for a while, and well, would be hard to search for.

2\. No multi-line support?

What about JSON5?

~~~
z0mbie42
1) Can I ask what? Because I did my research but didn't find anything.

2)
[https://astrocorp.net/san/versions/v1.0.0/#string](https://astrocorp.net/san/versions/v1.0.0/#string)

~~~
xellisx
1) Storage Area Network.

2) Awesome!

------
krapp
In before someone asks "why not just use s-expressions?"

------
jcwayne
This feels a lot like HashiCorp configuration language[0]. But more
standards[1] are always welcome ;)

[0] [https://github.com/hashicorp/hcl](https://github.com/hashicorp/hcl)

[1] [https://xkcd.com/927/](https://xkcd.com/927/)

