
STON – Smalltalk Object Notation (2012) - mpweiher
https://github.com/svenvc/ston/blob/master/ston-paper.md
======
rurban
While many applaud this notation as superior to JSON, let's face the criticism:

It's slower than JSON, less readable than JSON and YAML, and more insecure
than JSON.

To get this feature set, use YAML 2.0. YAML supports classes and references
already, and is more readable than STON.

Slower: Having to support references, STON needs to store every object in a
hash. JSON doesn't need to. JSON is at least 10x faster.
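
To make the bookkeeping concrete, here is a minimal sketch of reference tracking during encoding. It is not STON's actual implementation; the `{"@ref": n}` marker and the `encode` helper are made up for illustration. It only shows that every container has to be looked up in (and added to) an identity table, which a plain tree encoder can skip. It does not measure the 10x figure.

```python
import json

def encode(value, seen=None):
    """Encode nested lists/dicts, replacing repeated containers with a reference."""
    if seen is None:
        seen = {}  # id(container) -> 1-based index, in order of first visit
    if isinstance(value, (list, dict)):
        key = id(value)
        if key in seen:
            # Already emitted once: write a back-reference instead of recursing.
            return {"@ref": seen[key]}  # hypothetical marker, not STON syntax
        seen[key] = len(seen) + 1
        if isinstance(value, list):
            return [encode(item, seen) for item in value]
        return {k: encode(v, seen) for k, v in value.items()}
    return value  # numbers, strings, booleans, None pass through untouched

shared = {"name": "shared"}
doc = {"a": shared, "b": shared}
doc["self"] = doc  # a cycle, which a plain tree encoder cannot represent

encoded = encode(doc)
text = json.dumps(encoded)  # the encoded form is an ordinary JSON tree
```

The second occurrence of `shared` and the cycle both collapse into small reference maps, at the price of one table lookup per container.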

Readable: JSON and YAML have much less syntax, and are more pleasant to the
eye.

Insecure: Changing classes in serialization protocols without any protection
is the most common exploit vector. Esp. supporting user-classes, not only
builtins. YAML at least supports builtin-classes only via tags.
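
One common mitigation, sketched here under assumptions, is to instantiate only classes from an explicit allowlist and reject every other class name found in the input. The `$class` key, the `ALLOWED` table and the `build` hook are hypothetical names for this example, not part of STON, JSON or YAML.

```python
import json
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

# Hypothetical allowlist: only these names may be instantiated from input.
ALLOWED = {"Point": Point}

def build(obj):
    """object_hook: build an allowlisted class, reject any other class name."""
    name = obj.pop("$class", None)  # "$class" is an assumed tag key
    if name is None:
        return obj  # untagged maps stay plain dicts
    if name not in ALLOWED:
        raise ValueError(f"class not allowed: {name}")
    return ALLOWED[name](**obj)

point = json.loads('{"$class": "Point", "x": 1, "y": 2}', object_hook=build)

try:
    json.loads('{"$class": "subprocess.Popen", "args": ["echo", "hi"]}',
               object_hook=build)
    rejected = False
except ValueError:
    rejected = True  # the unknown class name never reaches a constructor
```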

~~~
wtetzner
I would disagree that JSON is more readable than STON, if only for the fact
that all keys have to be quoted in JSON, but I agree with the rest of your
comment.

~~~
rurban
I was referring to the # pound syntax for fields. Fields could use a . or -
prefix instead, which is much more pleasing to the eye. An untrained eye would
take the # for comments.

    
    
        TestDomainObject {
          #created : DateAndTime [ '2012-02-14T16:40:15+01:00' ],
          #modified : DateAndTime [ '2012-02-14T16:40:18+01:00' ],
          #integer : 39581,
          #float : 73.84789359463944,
          #description : 'This is a test',
          #color : #green,
          #tags : [
            #two,
            #beta,
            #medium
          ],
          #bytes : ByteArray [ 'afabfdf61d030f43eb67960c0ae9f39f' ],
          #boolean : false
        }

~~~
s-phi-nl
_Your_ untrained eye took the # for comments, but I'm not sure that
generalizes to everyone's. Plenty of languages use # for other things: Smalltalk
uses them for symbols, Clojure uses them for reader macros, OCaml uses them
for method calls, Haskell can use them as an arbitrary operator, Lua uses them
for length, C and C++ use them for compiler directives. See
[https://en.wikipedia.org/wiki/Number_sign#In_computing](https://en.wikipedia.org/wiki/Number_sign#In_computing)
for a list.

Admittedly, Python, Bash, Perl, and Ruby use # for comments, so you do have a
point, especially since Python is such a common teaching language.

~~~
rurban
No, the point is the clutter. A pound # is the most intrusive ASCII character,
with about 85% black. Older languages were criticized for using $ with about 55%
black as prefix. It's like writing in ALL-CAPS.

~~~
yAak
This is a very valid critique in terms of graphic design.

(For anyone confused by the mention of graphic design: it's all about
communicating information through visuals.)

------
smitherfield
All these text-based serialization[1] formats are IMO relying on the same
fallacies used to sell XML 15 years ago.

 _Would you rather use XML for serialization, or an ad-hoc, undocumented
binary format?_

Er, XML I guess.

 _So it's proven: XML is better than binary formats! The golden age of XML
is upon us!_

[1] The specific task of serialization over a network; for configuration files
or other applications involving direct manipulation by a human or a shell
script, JSON or YAML (or XML) can be reasonable.

~~~
wtetzner
> [1] The specific task of serialization over a network; for configuration
> files JSON or YAML (or XML) can be reasonable.

I would argue that JSON is not reasonable for configuration files that are
meant to be edited by humans, because it doesn't support comments. It's also
annoying to write JSON by hand because all keys have to be quoted.

YAML on the other hand, is a very nice configuration format.

~~~
gmjosack
Also, the lack of trailing commas means often editing two lines to add new
elements to lists or objects. I've actually experienced multiple outages at
different companies where someone helpfully added a trailing comma to a JSON
config.
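
A small illustration of that failure mode, using Python's standard `json` module as the strict parser (the config contents are made up):

```python
import json

strict = '{"hosts": ["a", "b"]}'
with_trailing_comma = '{"hosts": ["a", "b",]}'

# The strict document loads fine.
config = json.loads(strict)

# The "helpful" trailing comma makes the whole document unparseable.
try:
    json.loads(with_trailing_comma)
    error = None
except json.JSONDecodeError as exc:
    error = exc
```

The whole file is rejected, not just the offending entry, which is how a one-character edit can take a service down at load time.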

------
dom0
This looks superficially similar to (a related subset of) QML code.

    
    
        Rectangle {
            id: photo                                  // id on the first line makes it easy to find an object
    
            property bool thumbnail: false             // property declarations
            property alias image: photoImage.source
    
            signal clicked                             // signal declarations
    
            function doSomething(x)                    // javascript functions
            {
                return x + photoImage.width
            }
    
            color: "gray"                              // object properties
            x: 20; y: 20; height: 150                  // try to group related properties together
            width: {                                   // large bindings
                if(photoImage.width > 200){
                    photoImage.width;
                }else{
                    200;
                }
            }
    
            ....
        }

------
dwheeler
Interesting. There are alternatives to consider, though.

If you want this feature set, YAML is a reasonable alternative.

If you just want to add support for comments, trailing commas, and a few other
things, JSON 5 is an alternative: [http://json5.org/](http://json5.org/)

If you're processing lots of S-expressions (e.g., Lisp code or data), the
readable Lisp notations (including sweet-expressions) might be of use:
[http://readable.sourceforge.net/](http://readable.sourceforge.net/)

------
donpdonp
EDN does all this, and loses the commas too (winning!).

------
amelius
I prefer LISP notation, because it naturally allows storing functions.

~~~
brudgers
While I love to sing love songs to Lisp, I see Lisp's notation as naturally
allowing the storage of lists and the storage of functions/macros/structs etc.
as a matter of interpretation at a higher level of abstraction where the
language semantics live instead of its notation.

~~~
amelius
Of course, but try to do it e.g. in JSON, and you'll see that the resulting
representation quickly becomes convoluted. In contrast, the LISP notation is
basically the same notation as you'd use in the language LISP itself. That's
what I mean by "natural".

~~~
brudgers
I'm not clear on its advantages for the serialization of Smalltalk objects.
Would the deserializer be written in a Lisp?

~~~
jimbokun
You can treat the serialized S-expressions literally as Lisp code. The first
token of the list could be the name of a macro, for example, that could expand
into any kind of code you want to execute.

(So yeah, you better be really, really sure you control the data you are
processing this way.)

------
smnplk
Transit tries to solve the same problem with JSON
[https://github.com/cognitect/transit-format](https://github.com/cognitect/transit-format)

~~~
dottedmag
STON is more similar to EDN than to Transit.

------
draegtun
I myself use Rebol/Red which I find much easier to use.

Here's a translation of the first example:

    
    
        test-domain-object: make object! [
            created:  2012-02-14/16:40:15+01:00
            modified: 2012-02-14/16:40:18+01:00
            integer:  39581
            float:    73.84789359463944
            description: "This is a test"
            color: green
            tags: [
                #two
                #beta
                #medium
            ]
            bytes:   #{afabfdf61d030f43eb67960c0ae9f39f}
            boolean: false
        ]
    

And there is REN (REadable Notation) which is an attempt to produce a (subset)
standard so it can be interchanged with other languages -
[http://pointillistic.com/ren/](http://pointillistic.com/ren/) |
[https://github.com/humanistic/REN](https://github.com/humanistic/REN)

------
rbanffy
It's like JSON, but cool. ;-)

------
rpastuszak
Interop between different languages still requires specific implementations
written in them, so replacing type annotations with, say, a property called
"$t" sounds like a good tradeoff regardless. (and a rather cosmetic change,
unless I've missed something obvious here)
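
A rough sketch of that approach with plain JSON, assuming a `"$t"` property carries the type name. The tag value `"DateAndTime"` is borrowed from the STON example; the helper names are invented for this example.

```python
import json
from datetime import datetime

def to_tagged(value):
    """default= hook: encode a datetime as a map tagged with "$t"."""
    if isinstance(value, datetime):
        return {"$t": "DateAndTime", "value": value.isoformat()}
    raise TypeError(f"cannot serialize {type(value).__name__}")

def from_tagged(obj):
    """object_hook: rebuild tagged maps, leave every other map as a dict."""
    if obj.get("$t") == "DateAndTime":
        return datetime.fromisoformat(obj["value"])
    return obj

record = {
    "created": datetime.fromisoformat("2012-02-14T16:40:15+01:00"),
    "integer": 39581,
}

text = json.dumps(record, default=to_tagged)            # still ordinary JSON
restored = json.loads(text, object_hook=from_tagged)    # typed value is back
```

A consumer that knows nothing about `"$t"` still gets a valid JSON map, which is the interop argument being made here.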

I've been happy with this approach when working with a stack built with JS and
C# - it might not suit everyone, ofc.

[edit] typo

~~~
rpastuszak
Why a downvote, sir?

------
ape4
Looks like an improvement over JSON.

~~~
rurban
No, of course not. It makes this kind of JSON insecure and slow.

You can use YAML for this kind of stuff already, and it's still more readable
than STON.

~~~
collyw
I am trying to set up a deploy script using Ansible just now - which uses YAML.
It's more readable in one sense, but indentation gets a bit confusing when you
are not especially familiar with it. JSON is more obvious in that respect.

I wish we could have Python dictionaries as an alternative to JSON, as large
JSON configuration files are horrible, when you add an extra trailing comma or
things get too nested.

~~~
rjeli
I really like TOML: [https://github.com/toml-lang/toml](https://github.com/toml-lang/toml)

------
pavledjo
why?

~~~
Pamar
From the _Rationale_ paragraph:

 _However, JSON knows only about lists and maps. There is no concept of object
types or classes. This means that it is not easy to encode arbitrary objects,
and some of the possible solutions are quite verbose (Encoding the type or
class as a property and/or adding an indirection to encode the object's
contents).

Adding a symbol (globally unique string) primitive type is a very useful
addition: because symbols help to represent constant values in a compact and
fast yet readable way, and because symbols allow simpler and more readable map
keys._

~~~
throwanem
This is very true! What a shame the no doubt very clever people who specified
ES6 failed to read it, because if they had, I can't imagine that ES6 "symbols"
would be so bizarrely broken.

~~~
thomasfoster96
> I can't imagine that ES6 "symbols" would be so bizarrely broken.

In what way are ES6 “symbols” broken? (genuinely interested, I haven’t really
used symbols before)

~~~
parenthephobia
They're not like what are called symbols in any other language.

An ES6 symbol is a unique object with an optional name. They can be used as
property names, like strings, which means that code can add new properties to
objects without worrying about name collisions. Two symbols are distinct
objects regardless of name: _Symbol("foo")_ is never the same symbol as
_Symbol("foo")_.

Symbols as used in other languages (sometimes also called atoms) are
effectively unique instances of strings. They chiefly serve the opposite
purpose of ES6 symbols, which is being able to refer to the same object by
name in different parts of code, even across executions of the program, or
across distributed systems. i.e. _:foo_ is always the same symbol as _:foo_.

They're (often) more efficient than simple strings because _:foo_ and _:foo_
will always reference the same object in a given instance of the program, so
they can be compared by address rather than character-by-character.

Some languages with symbols have a function, often called gensym, for
generating a symbol with a unique name, for when you need to support ES6's use
case. e.g. Lisp and Prolog.
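
The two notions can be contrasted in a short sketch. Python has no symbol type, so this is only an analogy: `sys.intern` stands in for classic interned symbols, and the `UniqueSymbol` class (a name made up here) stands in for the ES6/gensym style.

```python
import sys

# Build two equal strings at runtime; nothing guarantees they share an object.
a = "".join(["status", "_", "green"])
b = "".join(["status", "_", "green"])

# Classic symbols: equal names map to one canonical object,
# so an identity check is enough to compare them.
sym_a = sys.intern(a)
sym_b = sys.intern(b)
same_object = sym_a is sym_b

class UniqueSymbol:
    """ES6-style symbol: the name is only a label, identity is per instance."""
    def __init__(self, name=""):
        self.name = name

# Two symbols with the same label are still different values,
# because default equality on instances compares identity.
u1 = UniqueSymbol("foo")
u2 = UniqueSymbol("foo")
distinct = u1 != u2
```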

~~~
WorldMaker
Given ES2015's intended goal to remove a lot of the global interconnected
nature of the language (let/const over classic var; module scopes), it seems
clear why ES2015 "over-corrected" and only supports the gensym-style unique
symbols and exporting them at the module level if they need to be reused.
(From whence known symbols like Symbol.iterator exist.)

I've seen strawman proposals for ES to also support some form of a global
symbol namespace, but after debugging much of the legacy of JS global-happy
code and order-of-script-tags bugs I, for one, am happy that none of those
strawman proposals are currently favored by the committee.

~~~
v413
There are global symbols. You use Symbol.for() to either create or retrieve
them. E.g. Symbol.for('hello') is available globally through the global
registry. It should be noted that an already existing symbol e.g.
Symbol('hello') is not the same as Symbol.for('hello') while
Symbol.for('hello') === Symbol.for('hello')

~~~
WorldMaker
Thanks, I had forgotten that had made it in after all.

------
mjpuser
I'd just stick with JSON. I think the reason why JSON is so popular is that it
works with most languages. This would only be compatible with object oriented
languages. Adding in code blocks would only make it easily compatible with
one.

~~~
throwanem
> This would only be compatible with object oriented languages.

Even then, it's only compatible with an implementation that either implements
or stubs all the classes that a given STON body references. I can see where
this makes sense for interop among Smalltalk environments with enormously
complete standard libraries, where everything likely to be referenced in an
arbitrary STON blob _is_ available, but for interop among implementations in
multiple languages, it's a no-go unless either:

- STON use is limited to a well-defined subset implemented by all parties in
the interaction, _or_

- all languages use STON parsers which support automatically stubbing
(ignoring, etc.) classes specified in content but which aren't available in
the parsing context, _or_

- all parties in the interchange provide custom-implemented and probably
dangerously incomplete translation layers between STON and the rest of the
implementation.

That said, there are a couple of things here that I really like. I was about
to say that, since es6 has a native symbol type, it might make a lot of sense
to include symbol literals in a new version of JSON - but I've just taken the
time to actually examine es6's symbol implementation for the first time in
detail, and...well, I'm not sure precisely what it _is_, but I'm quite
certain it's not what it claims to be, and its justification for even existing
is gravely in doubt. That's pretty special, and it also means that STON
symbols wouldn't even make sense in the context of es6, so never mind. The
internal references in STON are pretty neat, too, but it'd be a hairball for
any parser to implement and pretty useless unless _every_ parser implements
it, so never mind that too.

Oh well. If I weren't accustomed to compromise, I wouldn't spend so much time
writing Javascript, would I? I will say, though, this "symbol" business really
depresses whatever enthusiasm I had for deep-diving the post-ES5 variations of
JS. If they're so far off the mark with something as trivially simple as
_symbols_, God alone knows how badly they'll have handled the parts that are
at all complex.

~~~
mpweiher
> only compatible with an implementation that either implements or stubs all
> the classes that a given STON body references.

Don't see why that would be the case. You can easily ignore the class info and
then interact based on arrays/dictionaries. You just don't get the benefits.

~~~
throwanem
I suppose I'd call that an extreme case of stubbing; if I could still edit,
I'd replace that with "implements, stubs, or ignores".

~~~
mpweiher
"Extreme" in that it ignores everything, but very simple to implement: parse
the name and toss.
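
As a rough sketch of "parse the name and toss", here is a tiny reader for a STON-like subset that drops class names and returns plain lists and dicts. It is an assumption-laden toy, not the real STON grammar: no references, no string escape decoding, no error recovery, and symbols come back as ordinary strings.

```python
import re

TOKEN = re.compile(r"""
    \s*(?:
        (?P<punct>[{}\[\]:,])                # structural characters
      | '(?P<string>(?:[^'\\]|\\.)*)'        # single-quoted string (escapes kept raw)
      | \#(?P<symbol>[A-Za-z0-9_./-]+)       # symbol, returned without the pound sign
      | (?P<number>-?\d+(?:\.\d+)?)          # integer or decimal
      | (?P<word>[A-Za-z_][A-Za-z0-9_.]*)    # true/false/nil or a class name
    )""", re.VERBOSE)

WORDS = {"true": True, "false": False, "nil": None}

def tokenize(text):
    """Split the input into (kind, text) pairs."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens

def parse_value(tokens, i):
    """Return (value, next_index) for the value starting at tokens[i]."""
    kind, text = tokens[i]
    if kind == "word" and text not in WORDS:
        # A class name in front of a map or list: parse the name and toss it.
        return parse_value(tokens, i + 1)
    if kind == "word":
        return WORDS[text], i + 1
    if kind == "number":
        return (float(text) if "." in text else int(text)), i + 1
    if kind in ("string", "symbol"):
        return text, i + 1
    if text == "[":
        return parse_items(tokens, i + 1, "]", as_map=False)
    if text == "{":
        return parse_items(tokens, i + 1, "}", as_map=True)
    raise ValueError(f"unexpected token {text!r}")

def parse_items(tokens, i, closer, as_map):
    """Parse comma-separated items (or key : value pairs) up to the closer."""
    items = []
    while tokens[i] != ("punct", closer):
        item, i = parse_value(tokens, i)
        if as_map:
            if tokens[i] != ("punct", ":"):
                raise ValueError("expected ':' after map key")
            value, i = parse_value(tokens, i + 1)
            item = (item, value)
        items.append(item)
        if tokens[i] == ("punct", ","):
            i += 1
    return (dict(items) if as_map else items), i + 1

source = """
TestDomainObject {
  #created : DateAndTime [ '2012-02-14T16:40:15+01:00' ],
  #integer : 39581,
  #color : #green,
  #tags : [ #two, #beta, #medium ],
  #boolean : false
}
"""
data, _ = parse_value(tokenize(source), 0)
```

With the class names tossed, `DateAndTime [ ... ]` comes back as a one-element list holding the timestamp string: you keep the data but lose the typed value.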

