

A simpler and shorter representation of XML data inspired by Tcl (2013) [pdf] - networked
http://www.tclcommunityassociation.org/wub/proceedings/Proceedings-2013/JeanFrancoisLarvoire/A%20simpler%20and%20shorter%20representation%20of%20XML%20data%20inspired%20by%20Tcl.pdf

======
chipsy
I spent ages looking at this question of human-readable serialization syntax
and whether a substantially better syntax is achievable.

I concluded these things:

* The structural forms that readily come to mind (linear, hierarchical, graph, relational) all have some kind of existing representation, or an easily achieved bootstrapping. They are all Good Enough at the basic problem of generically encoding the structure.

* What keeps us tied to existing formats like XML or JSON are the data types they can encode and the bidirectionality (i.e. a serialization and deserialization loop will leave the data unchanged). For moving around raw data in a textual form, these are difficult to beat, although efforts like SML may succeed in eking out a slightly more terse form.

* If we only make the syntax more forgiving, edge cases appear where syntax fails to encode some data. If we add special case syntax to encode that data clearly, bidirectionality becomes problematic and risks making everything unreadable. Correcting for these cases leads us back to the starting point of mediocrity.

* Substantial improvements for human editing come from moving away from a generic solution towards a source-format schema targeted around the domain problem. This allows the parsing to relax its requirements and simply fail with bad source data. Automated editing of these formats can also maintain the source as it was entered, because it's no longer bound by the requirement of being computer-centric - it's only manipulating the key symbols in a limited way, in the same way that we manipulate strings without fully knowing what the strings mean. The final meanings are obtained through a separate compilation process.
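
To make the bidirectionality point concrete, here is a minimal sketch in Python using the standard `json` module. The helper name `round_trips` and the sample values are made up for illustration. Data built only from JSON's own types survives the loop, while data using types JSON cannot encode (integer keys, tuples) comes back changed.

```python
import json

def round_trips(value):
    """Return True if value survives a JSON serialize/deserialize loop unchanged."""
    return json.loads(json.dumps(value)) == value

# Only types JSON encodes natively: strings, numbers, lists, string-keyed objects.
native = {"name": "drome", "visibility": 0, "tags": ["a", "b"]}

# An integer key and a tuple: JSON turns these into a string key and a list.
lossy = {1: "integer key", "point": (1, 2)}

print(round_trips(native))  # True
print(round_trips(lossy))   # False
```

A domain-specific source format sidesteps this by not promising the loop at all: the source text stays as entered and a separate compile step produces the final values.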

~~~
chenglou
Check out transit:
[http://blog.cognitect.com/blog/2014/7/22/transit](http://blog.cognitect.com/blog/2014/7/22/transit)

The js implementation: [https://github.com/cognitect/transit-js](https://github.com/cognitect/transit-js)

------
ArkyBeagle
Hint: "<Folder index="1">"

I'd really rather just see

Folder.1.name ="Take off zones..."

Folder.1.open =1 #// ???

Folder.1.Folder.1.name ="drome"

Folder.1.Folder.1.visibility =0

... assuming "open" is an attribute and not a directive to add a level of
nesting...

I know there's a good mapping between this representation and XML because I've
used it.

There are good conceptual cues from SNMP and ASN.1 without having to go the
full BER monte. Indeed, you may well recognize what I typed in above as being
a lot like what tools like snmpwalk produce once you get all the .mib files in
place...
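
For what it's worth, one direction of that mapping can be sketched in a few lines of Python with the standard `xml.etree.ElementTree` module. This is only a rough illustration under my own assumptions: the sample XML is invented around the `<Folder index="1">` hint, the `index` attribute becomes a path component, other attributes and leaf text become dotted keys, and the function name `flatten` is hypothetical.

```python
import xml.etree.ElementTree as ET

def flatten(elem, prefix=""):
    """Yield (dotted_path, value) pairs for an element and its descendants."""
    index = elem.get("index")
    # Assumption: an "index" attribute becomes part of the path, as in Folder.1
    path = prefix + elem.tag + ("." + index if index is not None else "")
    for name, value in elem.attrib.items():
        if name != "index":
            yield path + "." + name, value
    text = (elem.text or "").strip()
    if text:
        yield path, text
    for child in elem:
        yield from flatten(child, prefix=path + ".")

# Hypothetical sample document, not taken from the paper.
SAMPLE = """
<Folder index="1" open="1">
  <name>Take off zones...</name>
  <Folder index="1">
    <name>drome</name>
    <visibility>0</visibility>
  </Folder>
</Folder>
""".strip()

pairs = list(flatten(ET.fromstring(SAMPLE)))
for path, value in pairs:
    print(f'{path} = "{value}"')
```

Note that this flattening cannot tell an attribute from a child element afterwards, which is exactly the ambiguity around "open" mentioned above.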

EDIT: fixed line breaks

------
falcolas
I recall this being a precursor to XML, but it seems like it could be as
easily adapted as the Tcl format from the OP:

        /head(
            /body(
                /a(href="abcdef", a link somewhere (with parens!\))
            )
        )

------
indubitably
This notation, like XML and unlike JSON, doesn't handle arrays.

------
wodenokoto
How does it handle the following?

        <tag id="1"> data </tag>

~~~
draven
Section "The SML solution":

      XML elements: <tag attribute="value" ...>contents</tag>
      SML elements: tag attribute="value" ... {contents}

------
mlhaufe
So... what about significant whitespace?

~~~
draven
In content text? There's this passage: "The content text is between "quotes".
Escape '\' and '"' with a '\'."
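
Putting that quoting rule together with the element form quoted earlier in the thread, a rough Python sketch of the conversion might look like this. It covers only those two rules, not the paper's full grammar. The function names `quote` and `to_sml` are hypothetical, attribute values are copied without escaping, and dropping whitespace-only text between elements is my own assumption.

```python
import xml.etree.ElementTree as ET

def quote(text):
    """Quote content text, escaping backslash and double quote with a backslash."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

def to_sml(elem):
    """Render one element as: tag attribute="value" ... {contents}."""
    head = " ".join([elem.tag] + [f'{k}="{v}"' for k, v in elem.attrib.items()])
    parts = []
    # Assumption: whitespace-only text is indentation and is dropped;
    # any other text is kept verbatim, including surrounding spaces.
    if elem.text and elem.text.strip():
        parts.append(quote(elem.text))
    for child in elem:
        parts.append(to_sml(child))
        if child.tail and child.tail.strip():
            parts.append(quote(child.tail))
    return head + " {" + " ".join(parts) + "}"

print(to_sml(ET.fromstring('<tag id="1"> data </tag>')))
# tag id="1" {" data "}
```

Under these assumptions the spaces around `data` survive because the text ends up inside the quotes.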

In the same space (replacement syntax for XML) I like Erik Naggum's idea of
NML (Enamel): [http://www.schnada.de/grapt/eriknaggum-enamel.html](http://www.schnada.de/grapt/eriknaggum-enamel.html)

