
Open Sourcing the Stupid-Simple Messaging Protocol - vonsnowman
https://www.aerofs.com/blog/open-sourcing-the-stupid-simple-messaging-protocol/
======
todd8
The stupid-simple messaging protocol is interesting. However, I would prefer
the use of something like D. J. Bernstein's netstrings rather than
NL-terminated strings. Netstrings have the advantage of an explicit length
field, which lets the receiver allocate buffers of the appropriate size
without the complications of receiving NL-terminated data. Examples of
protocols using netstrings are SCGI and QMQP. Netstrings are easier to handle,
stay easy to debug if verbs are expressed in ASCII, and handle arbitrary
binary data equally well.

Netstrings are brilliantly simple; see the Wikipedia page:
[https://en.wikipedia.org/wiki/Netstring](https://en.wikipedia.org/wiki/Netstring)
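
To make the format concrete, here is a minimal Python sketch of netstring framing (decimal length, a colon, the raw bytes, a trailing comma). The function names `encode_netstring` and `decode_netstring` are made up for this illustration and come from no particular library; the sketch also skips details such as rejecting leading zeros in the length.

```python
from typing import Tuple


def encode_netstring(payload: bytes) -> bytes:
    """Frame payload as <decimal length>:<payload>,"""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(data: bytes) -> Tuple[bytes, bytes]:
    """Decode one netstring from the front of data.

    Returns (payload, remaining bytes); raises ValueError on malformed input.
    """
    colon = data.find(b":")
    if colon == -1:
        raise ValueError("missing ':' after length")
    length_field = data[:colon]
    if not length_field.isdigit():
        raise ValueError("length is not a decimal number")
    length = int(length_field)
    end = colon + 1 + length  # index where the trailing comma must sit
    if len(data) < end + 1:
        raise ValueError("truncated netstring")
    if data[end:end + 1] != b",":
        raise ValueError("missing trailing ','")
    return data[colon + 1:end], data[end + 1:]


frame = encode_netstring(b"hello world!")
print(frame)                    # b'12:hello world!,'
print(decode_netstring(frame))  # (b'hello world!', b'')
```

Because the receiver knows the length before it reads the payload, a payload that contains a newline (or any other byte) needs no escaping.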

This is what DJB says about netstrings [1]:

> The famous Finger security hole may be blamed on Finger's use of the CRLF
> encoding. In that encoding, each string is simply terminated by CRLF. This
> encoding has several problems. Most importantly, it does not declare the
> string size in advance. This means that a correct CRLF parser must be
> prepared to ask for more and more memory as it is reading the string. In the
> case of Finger, a lazy implementor found this to be too much trouble;
> instead he simply declared a fixed-size buffer and used C's gets() function.
> The rest is history.

> In contrast, as the above sample code shows, it is very easy to handle
> netstrings without risking buffer overflow. Thus widespread use of
> netstrings may improve network security.

[1]
[http://cr.yp.to/proto/netstrings.txt](http://cr.yp.to/proto/netstrings.txt)

BTW, see Aaron Swartz's blog post on DJB,
[http://www.aaronsw.com/weblog/djb](http://www.aaronsw.com/weblog/djb)

~~~
vonsnowman
Netstrings look neat although I find the use of a redundant comma delimiter
somewhat confusing.

Buffer overflow concerns are not applicable to SSMP however, as the spec
explicitly restricts message size to a maximum of 1024 bytes.
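
As a rough illustration of how a size cap keeps LF-delimited reading bounded, here is a Python sketch that refuses any message longer than the limit. The name `read_message` is hypothetical, and the sketch assumes the 1024-byte limit includes the terminating LF, which the comment above does not state.

```python
import io

MAX_MESSAGE_SIZE = 1024  # limit quoted above; assumed here to include the LF


def read_message(stream, limit=MAX_MESSAGE_SIZE):
    """Read one LF-terminated message from a binary stream.

    Returns the message without its LF, or None at end of stream.
    Raises ValueError if no LF shows up within `limit` bytes.
    """
    line = stream.readline(limit)  # never reads more than `limit` bytes
    if not line:
        return None
    if not line.endswith(b"\n"):
        raise ValueError("message exceeds limit or stream ended mid-message")
    return line[:-1]


stream = io.BytesIO(b"hello\n" + b"x" * 2000 + b"\n")
print(read_message(stream))  # b'hello'
try:
    read_message(stream)     # 2000 bytes without an LF in the first 1024
except ValueError as exc:
    print("rejected:", exc)
```

The reader never buffers more than `limit` bytes per message, so the fixed-size-buffer problem from the quoted Finger example does not arise as long as implementations actually enforce the cap.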

~~~
dsr_
Which for some reason is not in the ABNF spec, and thus will be ignored up
until somebody tries to implement this and fails a compatibility test.

Please put it in your ABNF spec. People use them, you know.

Yes, it's hard to specify. The problem is that you have both an uncapped ID
and an uncapped PAYLOAD in the same message. I recommend giving ID a max
length of, say, 32, and PAYLOAD then has a max length of 951 if I'm counting
right.

Or you could consider that an IPv6 path MTU is at least 1280, and use that (or
1232) as your per message bound instead of 1024. You're sending a packet,
might as well get full value.
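
For reference, the two numbers relate as follows. The comment does not say how 1232 was derived; the arithmetic below assumes it is the 1280-byte minimum IPv6 MTU minus a 40-byte IPv6 header and an 8-byte UDP header, and shows the TCP figure (20-byte header, no options) for comparison.

```python
IPV6_MIN_MTU = 1280   # minimum MTU every IPv6 link must support
IPV6_HEADER = 40      # fixed IPv6 header, no extension headers
UDP_HEADER = 8
TCP_MIN_HEADER = 20   # TCP header without options

udp_payload = IPV6_MIN_MTU - IPV6_HEADER - UDP_HEADER
tcp_payload = IPV6_MIN_MTU - IPV6_HEADER - TCP_MIN_HEADER

print(udp_payload)  # 1232
print(tcp_payload)  # 1220
```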

~~~
vonsnowman
A very valid concern. We will work on this.

------
nickpsecurity
I like what I see in the grammar. Very compact and clean. It's actually
smaller than the table of contents of the XMPP Core spec lol. That's the kind
of difference that might make an ultra-robust, efficient implementation a bit
easier. ;)

Note: Reminds me of when I was illustrating Oberon-2's complexity for C and
C++ programmers by comparing the Oberon-2 BNF to their specs the same way.
It's a good way to do a prelim assessment of protocol/language complexity and
whether it's worth the trouble.

------
thom_nic
Also, since SSMP is completely text-based, does that mean the only way to send
binary data is to base64-encode it? Or does it support a length header or
chunking similar to HTTP?

~~~
vonsnowman
base64 is the preferred way but any 8-bit encoding that doesn't use LF would
also work.
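
A minimal Python sketch of the base64 route, using only the standard library. The helper names `to_text_payload` and `from_text_payload` are made up for illustration and are not part of SSMP.

```python
import base64


def to_text_payload(data: bytes) -> bytes:
    """Base64-encode arbitrary bytes; b64encode emits no line breaks."""
    return base64.b64encode(data)


def from_text_payload(payload: bytes) -> bytes:
    """Reverse to_text_payload; validate=True rejects stray characters."""
    return base64.b64decode(payload, validate=True)


raw = bytes(range(256))          # every byte value, including LF (0x0a)
encoded = to_text_payload(raw)
print(b"\n" in encoded)          # False
print(len(raw), len(encoded))    # 256 344
print(from_text_payload(encoded) == raw)  # True
```

The cost is the usual base64 overhead: every 3 bytes of input become 4 bytes of payload, which matters when messages are size-capped.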

There is no support for a length header or chunking as SSMP is designed with
small messages in mind.

~~~
teacup50
Why wouldn't you just include a length field on strings/bytes, allowing the
protocol to be "binary clean" and avoid the base64 problem entirely?

This is one of the most annoying things about XMPP (even sending contact
photos hits this!), so if replacing XMPP ...

~~~
vonsnowman
A big advantage of LF-delimited over length-prefixed messages is
netcat/telnet-friendliness. That was more valuable to us than being binary-
clean as our use cases do not involve sending large binary messages.

~~~
mjevans
I think you might want to make a distinction between a stream packet and a
completed message.

If you're going for telnet compatibility then you'll want to terminate packets
in CR+LF, but possibly expect to see only CR or LF from the client (ASCII
mode).

Your stream could either be stateful (a message is always sent complete and in
order, even if it takes multiple stream packets) or stateless* (different
messages might have stream packets consecutively).

It would be more future proof if you started with a message grammar and then
defined your protocol on top of that.

------
thom_nic
Neat but it's not clear how client authentication is handled in SSMP. I have
used XMPP in the past and built-in identity was one of the biggest plusses of
the protocol.

~~~
vonsnowman
Client auth is explained in more detail in the spec:
[https://github.com/aerofs/ssmp](https://github.com/aerofs/ssmp)

Our most common use case is client certificates but there are provisions for
alternate auth schemes.

------
pbreit
Sorry for being dense but what are the prime use cases for this? Is it for
Jabber-style text messaging? Or inter-app MQ-style messaging?

------
ilaksh
Reminds me of an idea I had:
[https://github.com/runvnc/pubsub](https://github.com/runvnc/pubsub)

------
mraison
Cool work. Any plans for an IETF submission?

~~~
vonsnowman
Not currently but we'd love to do it if there's enough interest to justify a
more formal standardization.

------
aaggarwal
What do you think about telehash ([http://telehash.org](http://telehash.org))?
It might have been useful for your situation. Was it among your options?

