
Show HN: Eno – A lightning fast, user-friendly YAML/TOML alternative - simonrepp
eno [1] - A modern plaintext language w/ libraries [2] for JavaScript, Python, Ruby & soon more!

We migrated a big relational research database to a file-based solution - requirements were:

- Super fast and easy editability for users

- Highest performance for parsing/validating >10K documents on every user change.

Our trials with YAML/TOML showed us that we wanted something both faster [3][4] and easier [4],
something _tailored_ for file-based content management ... and after months of
research & development it's now publicly available (under MIT license) for everyone!

Last but not least I also want to mention eno's document introspection capabilities -
with a few lines of code you can build intelligent relational suggestion UIs as shown in [4] below.

[1] [https://eno-lang.org/](https://eno-lang.org/)

[2] [https://eno-lang.org/libraries/](https://eno-lang.org/libraries/)

[3] [https://github.com/eno-lang/benchmarks/](https://github.com/eno-lang/benchmarks/)

[4] [https://eno-lang.org/resources/introspection.mp4](https://eno-lang.org/resources/introspection.mp4)

PS.
Your input for the Roadmap is highly welcome - what do you think should be in the next releases?
More languages? (If so, which? Currently in progress: Rust&#x2F;PHP, Currently planned: Go&#x2F;Java)
Additional IDE&#x2F;editor support? (Currently supported: Atom&#x2F;VSCode&#x2F;Sublime)
Or something else entirely? :) Looking forward to your feedback!
======
splitbrain
After a quick look I have a lot of questions

1) Unlike YAML or JSON this doesn't parse into a simple array structure, but a
library dependent object hierarchy?

2) Is the API of the libraries also part of the language spec?

3) Can I assume that document.lookup() will be available as document->lookup()
in the PHP library?

4) What about programming language specifics that may differ between
languages? Will the PHP objects implement the Iterable interface? Or the
ArrayAccess interface? (There are probably similar but slightly different
concepts in other languages).

5) There is eno.parse(), but is there some kind of reverse mechanism to create
a new Eno document? Like document.addList([...]).addSection('hey', 2) or
something.

~~~
simonrepp
1) Yes! (You can directly dump it to a language-native structure with the
raw() method too, this is not 1:1 YAML/TOML style generic deserialization
though as there are no fixed types in eno)

2) Some detail aspects of whitespace-parsing around the line continuation
syntax will need to be specified by the language, the shared official API I am
implementing for the different platforms is fully open to improvement and
future reinvention though, I'd love to see a completely new take for a library
API if it comes up in the future. :)

3) Definitely!

4) I try to keep things as consistent as possible across the platforms, but if
there are important language specific paradigms I think these should be taken
advantage of! I can't answer details regarding the PHP implementation yet but
keep in touch, I'm happy about a dialogue here! (Also I can't be good at
everything :)).

5) I want one! Obviously there can't be a stable generic "just dump it
already" implementation, but a smart builder-type API is definitely on the
list, I even started one for enojs but had to re-prioritize because there was
so much else to do for the whole ecosystem. ;)

------
lorenzleutgeb
What I was looking for on the website and think is more than implementing a
parser in another language is schema support. That is, you should provide
something like XSD for XML, JSONSchema for JSON, TOLS for TOML.

Why? There is a need (see above enumeration) to declaratively specify how an
Eno file should look like. I do not want validation to creep into my code,
like you do with `document.string('author', required: true)`. This just scares
the hell out of me. Say you want to parse some Eno file with different
languages, you also end up replicating validation, which means you end up
maintaining it, or rather not maintaining it... Apply leverage by moving
validation into your parser.
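
A minimal sketch of what a declarative, portable schema could look like, in Python. Everything here is hypothetical: the `SCHEMA` layout, the `validate` helper and the field names are assumptions for illustration and not part of any eno library. It works on an already parsed dict of strings.

```python
# Hypothetical schema: plain data that could be shared between languages.
SCHEMA = {
    "author": {"required": True},
    "email": {"required": False},
}


def validate(document, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    # Check that every required field is present.
    for key, rule in schema.items():
        if rule.get("required") and key not in document:
            problems.append(f"missing required field '{key}'")
    # Be pessimistic: reject anything the schema does not mention.
    for key in document:
        if key not in schema:
            problems.append(f"unexpected field '{key}'")
    return problems
```

Because the schema is data rather than code, the same definition could in principle be consumed by a validator in each language instead of being re-implemented per application.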

Another thing is that it appears you are implementing the parsers by hand
instead of using a parser generator that consumes a grammar for Eno. What is
your reasoning behind this? Is it performance? Did you benchmark using
generated parsers (maybe wrapped in a nice API)?

~~~
simonrepp
Yay thanks for the detailed input!

If there is demand or initiative for a portable schema solution I'll gladly
support it! The native architecture in the eno libraries is programmatic
because that does have its own powerful merits which are employed to the
fullest in the API design, like always there's not one best choice, and the
'validation creeping into code' can be turned around into 'external schema
definition creeping out of line with code' just as well. ;) Do you have some
concrete usecase in mind or planned where we could explore how a portable
schema solution could look like for eno? Drafting things from various real
life use cases has worked great for eno so far, so that's the route I would
love to go here too if we follow that track!

Custom parser implementation is easier to answer: By now I've iterated through
dozens of custom parser designs for eno in multiple languages, and I'm pretty
much confident that generated parsers will not stand a chance of being faster,
they after all do the same thing as I do, only I can't really hand-optimize
what they produce afterwards. :) You can study the benchmarks I linked to
under [3], there are some generated TOML parsers included with rather
disappointing performance to put it mildly, and as it stands there's not much
that's faster than the eno parsers in YAML/TOML land anyway, so I have low
incentive to experiment in that domain currently. :) Long term goal is to
(optionally) integrate (generated or custom) C (respectively Rust) parser
cores through native bindings as well, so that will bring up this question
again then for sure.

~~~
imoverclocked
I've come to really appreciate the difference between "syntactically correct"
(ie: is a file valid xml) vs "semantically correct" (ie: does it follow a
specific dtd if it's valid xml.) More than that, I've come to realize how many
other people don't have this appreciation even though they will identify
problems that directly relate to this distinction in every day usage.

To truly be able to have a portable file format, there needs to be a way to do
both validations reliably in different contexts (eg: different languages). If
you ignore this part of your design then it may become the slowest part of the
eno ecosystem because your grammar _will_ have quirks that you'll end up
needing to support long-term. I suggest toying with this functionality now and
providing something which is extremely pessimistic on what it will pass. Only
loosen things up as people show a need and keep your entire spec as tight as
possible.

I would imagine that you could even use eno syntax to describe document
structure, much like xml/dtd has such strong parallels with each other. Then
you get the fast parser in both places essentially for free!

Finally, on the format of eno itself, I'm curious on your thoughts relating to
unicode characters that visually masquerade as common characters. eg:

[http://www.fileformat.info/info/unicode/char/ff1a/index.htm](http://www.fileformat.info/info/unicode/char/ff1a/index.htm)

Sample usage:

\---

author： Jane Doe

email： jane@eno-lang.org

\---

Does this parse?

How about this:

\---

author： Jane Doe:

email： jane=doe@eno-lang.org

\---

What do I do if I want a "#" in a name?

\---

# #twitter

@hackernews = 0xC0FFEE

\---

or:

\---

# \\#twitter

@hackernews = 0xC0FFEE

\---

Are quotes optional somehow? Can I put arbitrary things into an identifier?

Cheers and keep up the great work!

~~~
simonrepp
Syntax vs Semantics is distinguished by _ParseError_ vs _ValidationError_ in
the eno libraries - I'll keep the importance of distinguishing them in mind
for the schema development too - thanks for pointing this out!
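
The syntax vs semantics split can be illustrated with a toy example. This sketch is not the eno libraries' implementation: the two classes only reuse the names mentioned above, and `parse` and `require` are hypothetical helpers for a made-up "key: value" syntax.

```python
class ParseError(Exception):
    """The text is not syntactically valid."""


class ValidationError(Exception):
    """The text parsed, but the application rejects its content."""


def parse(text):
    """Parse 'key: value' lines into a dict of strings (toy syntax only)."""
    fields = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        key, separator, value = line.partition(":")
        if not separator or not key.strip():
            raise ParseError(f"line {number}: expected 'key: value'")
        fields[key.strip()] = value.strip()
    return fields


def require(fields, key):
    """Semantic check: the application needs this field to be non-empty."""
    if not fields.get(key):
        raise ValidationError(f"'{key}' is required")
    return fields[key]
```

A line without a separator fails in `parse` with a `ParseError`, while a well-formed document that lacks a field the application needs fails later in `require` with a `ValidationError`.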

Right now only an ASCII colon is interpreted as an operator, but this looks
like a question to thoroughly consider for the next and final spec (which is
planned for 2019, currently we're in frozen RC) - work on this currently
happens at [https://github.com/eno-lang/eno](https://github.com/eno-lang/eno).
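
To make the "only an ASCII colon" point concrete, here is a small Python sketch. `split_field` is a hypothetical toy splitter, not eno's parser, so it only shows how the two code points differ and does not claim how eno treats the samples above.

```python
ASCII_COLON = ":"           # U+003A
FULLWIDTH_COLON = "\uff1a"  # U+FF1A, looks similar but is a different code point


def split_field(line):
    """Split on the first ASCII colon only; return None if there is none."""
    key, separator, value = line.partition(ASCII_COLON)
    if not separator:
        return None
    return key.strip(), value.strip()
```

With this splitter, a line that uses only `FULLWIDTH_COLON` has no operator at all, and a line such as the "Jane Doe:" sample is split at its trailing ASCII colon, leaving the fullwidth colon inside the key.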

There is escaping for arbitrary keys by using backticks - see the advanced
language feature documentation at
[https://eno-lang.org/advanced/](https://eno-lang.org/advanced/), in the case of # #twitter
you wouldn't need it though unless you omit the space.

Thanks for your input, appreciate it!

------
mamcx
What about types?

All parse to string? If not, hopefully please add date/datetime support
(better as ISO) and decimal.

> More languages?

How about make the core on Rust and the rest use it?

BTW: What's your use for the introspection? What editor is that? I like the
auto-complete stuff..

~~~
fernly
This! As the introduction doc coyly says,

> so as a user we usually just concern ourselves with editing the values and a
> friendly developer takes care of specifying the names for us. :)

The friendly developer gets the job of conveying -- somehow -- that values for
landline: and mobile: have to be valid tel#s (but must they have the country
prefix?) while "hire date:" must be in the form yyyy mm dd only... and also
gets the friendly job of writing one-off, locally unique validation code for
these fields and hacking them into the parser.

~~~
simonrepp
Conveying what types to enter is not an eno-specific problem, as a user
without schema or code access you don't know which types a blank YAML/TOML
file expects either!

Aside from the absolutely valid meta solutions (e.g. in-file comments, clear key
naming, documentation) there is an additional way this is approached in eno:
If you use the type loaders provided by the API (say 'my_var =
document.url('website')'), and properly expose errors to the user, the user
will get a localized (!) error message in his language, like "'website' must
be a valid url (e.g. [https://eno-lang.org/](https://eno-lang.org/))".

In the long run we can have community packages for any number of important
locally unique types (loaders are just simple functions, so they can be easily
authored), so at some point you likely don't have to write any one-off
validation code, and neither the error messages or their localizations, you
just pull it in as dependencies.
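
A rough Python sketch of the "loaders are just simple functions" idea. The name `url_loader` and its `(name, value)` signature are assumptions made for illustration; the thread does not show the real loader interface of the eno libraries, and the error text only mimics the example message above.

```python
from urllib.parse import urlparse


def url_loader(name, value):
    """Hypothetical loader: return the value if it looks like an http(s) URL."""
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        # The message names the field so it can be shown to the user as is.
        raise ValueError(
            f"'{name}' must be a valid url (e.g. https://eno-lang.org/)"
        )
    return value
```

A shared package of such functions could cover common types, so an application would import a loader instead of writing its own one-off check.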

~~~
mamcx
How about type inference? You can look at rebol/red/tcl for inspiration, that
already look like a config format but have a defined way to types:

[https://randomgeekery.org/2004/12/26/rebol-datatypes/](https://randomgeekery.org/2004/12/26/rebol-datatypes/)

[http://www.re-bol.com/rebol.html](http://www.re-bol.com/rebol.html)

~~~
simonrepp
I appreciate the input :) but the thing is that the typing concept in eno as
it is now is essentially what makes eno eno. Every application that uses eno
decides for itself what types it supports and requires, and that in turn is
how eno manages to be so simple and usable on the language level, even for
completely non-technical people who normally feel uncomfortable with the idea
of editing their content as raw text files.

If I would add types and type inference again, then I would essentially arrive
at YAML and TOML again, and I don't want to reinvent them. ;)

But if I actually misunderstood you there, please let me know and do clarify!

~~~
mamcx
I get that, but I wonder how to give a base set of types that avoids small
incoherences.

JSON is a good example:

[https://www.tutorialspoint.com/json/json_data_types.htm](https://www.tutorialspoint.com/json/json_data_types.htm)

It is so spartan that everyone needs to encode dates somehow, to give a simple
example. I think these are the base types (based on my experience with RDBMSs,
building a relational lang now, and always having trouble with CSV, JSON, and
other formats in ETL-type tasks):

\- String

\- Floats. Can be split Ints/Floats but stick to just Float is ok. However,
make it Float64.

\- Date(Time). And make it ISO. No ambiguity.

\- Boolean

\- Decimal64. This is a pet peeve of mine. A lot of data in the business space
is about money, and floats are not ok. What if, like in rebol, $3.2 is a decimal?

Then the composites.

ie: This is json + dates/decimal. And make a single encoding (utf8?).

It is insane that, for example, you save a CSV in Excel, open it again, and
Excel gets lost and can't parse it properly.

Apart from this, url, email, host, website, phone, cellphone, city, country,
state are so common that maybe with a import like "!schema:common-fields" or
something.
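
The point about floats and money can be checked with a short Python example that uses only the standard library `decimal` module. The variable names are illustrative and nothing here relates to an eno API.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly.
float_total = 0.1 + 0.2

# Decimals built from strings keep the digits as written.
decimal_total = Decimal("0.1") + Decimal("0.2")

# Compare each total against the value a person would expect.
float_matches = float_total == 0.3
decimal_matches = decimal_total == Decimal("0.3")
```

Here `float_matches` is `False` while `decimal_matches` is `True`, which is the kind of discrepancy that makes a dedicated decimal type attractive for monetary values.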

------
TheAceOfHearts
It seems bizarre that you wouldn't create a C version of this library. If you
create a C version, everyone can write bindings and consume it from their
favorite language. I think maybe you can do that with Rust code, but it's
typical to write the canonical version in C.

Your benchmarks are flawed, as they only compare between different
implementations using the same language. If you really care about performance,
it seems bizarre that you'd use PHP or Ruby.

~~~
simonrepp
That road (C or Rust parsing core through bindings) will likely be taken, but
for the initial development and jump-starting the ecosystem it was important
for me to start with implementations that can be quickly experimented with and
iterated and not spend a lot of extra time on dealing with segfaults, memory
leaks, the different binding mechanisms on different platforms, etc. As things
stand now, people are provided with multiple, fully functioning, pure
implementations that already are faster than the majority of YAML/TOML
parsers. In the coming months and years there will be plenty of time to make
things even faster. :)

For me caring about performance also means caring about performance on all
platforms, why not after all? You can take the tabular benchmark data I
provide and paste it together, or use the raw data that is also available as
eno files in the repository to compare language against language too (which I
initially also did but later dropped because same-language comparison for
libraries made more sense to me), if you want the quick run down as far as I
remember it: mostly javascript parsers lead the ranking, ruby parsers are a
bit behind and just slightly ahead of Python.

------
mschwanzer
The level of sophistication you got this to over the last few months is truly
impressive. Supporting several languages and IDE/editors from the start takes
serious dedication for one person. Kudos! Looking forward to using it for my
next database-less project.

~~~
simonrepp
Thanks michael! <3

------
tmcw
I'm left looking for a spec, like a real spec-flavored specification with all
the gritty details, written like you would want if you were writing a parser.
It doesn't seem like there is one, yet.

~~~
simonrepp
You're right, _not yet_! Jump-starting the whole ecosystem was a major time
investment for me but now that there is public exposure providing a formal
spec has a higher priority because someone might actually see it and do
something with it ;) Keep an eye on
[https://github.com/eno-lang/eno](https://github.com/eno-lang/eno), this is where I'm working on it,
I'll also announce it on the newsletter
([http://eepurl.com/dA9LcH](http://eepurl.com/dA9LcH)) when it's there!

------
gregwebs
Related as a better config language (but universal, not for a use case like
this) is dhall (non-turing complete, type-safe, remote imports).
[https://github.com/dhall-lang/dhall-lang](https://github.com/dhall-lang/dhall-lang)

~~~
NegativeLatency
At first glance dhall looks a bit like some of the CSS preprocessors out
there.

------
mitchtbaum
first off, I agree that file-based content management is the way to go. thank
you for working to make better tools for it.

> we wanted something 1) faster and 2) easier

1) a) why didn't you write a new library using the same spec? b) do you have
speed tests to show your libraries are better than existing ones?

2) how is this spec easier? (like a basic rundown)

~~~
RussianCow
> b) do you have speed tests to show your libraries are better than existing
> ones?

The OP linked to their benchmarks in [3]:
[https://github.com/eno-lang/benchmarks/](https://github.com/eno-lang/benchmarks/)

------
jxy
Honestly, putting a pair of parentheses in all of these plain text format, you
get back lisp.

Greenspun's tenth rule rules!

------
edoceo
Can you talk about why "We migrated a big relational research database to a
file-based solution"?

I'm curious how modern DBs failed you here, more details of the problem space
please.

~~~
simonrepp
Sure!

Cultural research = notoriously underfunded, so although they have and rely on
a relational database that holds their data (previously Postgres) the cost and
effort associated with maintaining and extending the system is pretty high.

With the new setup the thousands of eno files represent both the place of
storage and the interface to edit the data, so by that we eliminated the
development effort to provide and maintain a full web frontend to the
database, and the effort to just maintain the actively deployed technology
somewhere and keep it at least patched for security reasons.

All that remains now technology wise is an Atom plugin that is locally
installed on each client at the institute and takes care of validating,
provides relational autocomplete helpers as demonstrated in [4] and offers a
few hooks to kick off local builds for multiple deployment targets and deploy
them to live as well.

Hope this clarifies things! :)

------
krick
Kinda worthless rant, but I'm tired of reconsidering configuration file markup
language every couple of years, and please remind me: why TOML isn't perfect?

~~~
krapp
If you already believe TOML is perfect, why do you reconsider configuration
file markup languages every couple of years?

------
petershinners
It looks like there is no support for writing data to this format. My
inner hope was that there was a format I could parse, modify, and write again
preserving as much comments and formatting as is reasonable.

I looked at the Python implementation but did not see that type of
functionality. Am I wrong?

~~~
simonrepp
ruamel.yaml in python does that to a certain degree from what I've read, you
might want to check it out if yaml is ok for your usecase!
([https://yaml.readthedocs.io/en/latest/](https://yaml.readthedocs.io/en/latest/))

I've given this some thought as well, and given that the eno libraries hold
their own representation of data in memory this might actually be plausible to
implement in some way. Still I fear this will turn out to be a hard, hard
problem (as eno is not even generically serializable by design), so that's why
I haven't explored it further. So for the moment I can only say - Maybe in the
near future sometime, check back every once in a while! :)

------
jajag
I like this, but I'm wondering how strict the format is. It seems like a
potentially useful format for capturing data from a non- or semi-technical
user base, but then some degree of fault tolerance in the entered data would
probably be desirable. Was this (or should it be?) one of the design goals?

~~~
simonrepp
One of the design considerations was and is that the format is very strict (and
that way predictable), but at the same time as helpful as possible in
identifying, communicating and resolving issues.

To that end all error messages that can occur are handwritten, fully localized
and shared across all eno libraries (see
[https://github.com/eno-lang/eno-locales/blob/master/specific...](https://github.com/eno-lang/eno-locales/blob/master/specification.eno))
and the API implicitly handles them
for you when you write programs that consume eno.

So basically eno does no magic fallbacks of any sort when faults occur, but it
is candid and friendly about it when it happens. :)

------
dathinab
I just read the benchmarks and was surprised at how slow the toml parsers are,
I mean its syntax is simpler to parse than YAML (assuming spec v1.2) but they
were still slower...

So I guess there is simply no high performance toml parser implementations??
(in the tested languages, which don't include rust)

~~~
simonrepp
From what I saw, I think for at least a few parsers this might be the case
because they are built on generated parser code, and it's just easy to run
into unfavorable bits and pieces in the output that way, which can drag down
performance completely although 95% of the parser is just fine. Technically
there's no reason why toml parsers shouldn't be just as fast as or faster than
yaml or even eno. :) In any case I'd be happy if the benchmarks stir up some
movement and maybe kick off some high performance toml parser initiative, toml
is an awesome format and also 0.5.0 was just officially released so there's a
good reason to update the parsers now anyway. :)

------
topspin
I am working on a spectrum diagram rendering system and have been thinking
hard about what syntax I should select for source documents. I'm going to give
eno a hard look and the existence of a JavaScript parser is not the least of my
reasons. Thank you.

~~~
simonrepp
That's fantastic to hear! Do let me know via email or github etc. if you run
into any issues, I'm eager to gather more insights from other people's
usecases and work on improvements where needed!

------
fiatjaf
This is amazing. We really needed an alternative to YAML, and this seems
great.

~~~
pritambaral
Have you seen strictyaml?

~~~
fiatjaf
No, but I'll look it up. The name is not bad.

------
amann11
Is there support for dictionaries as list items / nested dictionaries?

~~~
simonrepp
Yes through _sections_! See
[https://eno-lang.org/introduction/](https://eno-lang.org/introduction/).

You can nest as deeply as you want and multiple sections on the same level
automatically turn into a list of sections. For just a list of flat
dictionaries you can also use fieldsets, see
[https://eno-lang.org/advanced/](https://eno-lang.org/advanced/). :)

------
maitredusoi
good name, good possible alternative to yaml (I just hate json), and not the
least a ruby gem! I will watch the project to see how it evolves ....

------
JaggerFoo
How does it handle Oblique Strategies?

~~~
simonrepp
On npm that's in fact already covered, scrutinize the list
[https://www.npmjs.com/search?q=eno](https://www.npmjs.com/search?q=eno) :)

------
Zardoz84
Thanks, but I would keep using SDLang :
[https://sdlang.org/](https://sdlang.org/)

------
igore
Do you remember when people were writing plain C parsers first? Pepperidge
farm remembers!

------
maccio92
Why would I use a field set instead of a section? They seem functionally
identical, is it meant to be a semantic decision?

~~~
simonrepp
Well spotted, good question!

It's also been asked in another thread on HN, I'm quoting myself here: "eno
has neither indentation nor closing tags of any sort, that means if you use a
section to group some values, you need to start another section to end the
previous one (no closing tags!), that's why there are fieldsets, which allow
short groupings that automatically end with the next field/list/fieldset."

Hope this explains it :)

------
someguydave
no python 2 support means no python support

~~~
danso
Python 2's end of life is in 1 year, 4 months.
[https://pythonclock.org/](https://pythonclock.org/)

~~~
someguydave
This is a meaningless deadline, given that nobody is actually paid to support
Python. Third party vendors will support it for years to come.

