How not to design a wire protocol (ibiblio.org)
68 points by alexeiz 20 days ago | 70 comments

Having written trading applications with binary, JSON, and FIX protocols: this article is fucking terrible.

1) scanf on a floating point number works differently on different platforms. If you explore the space of numbers that are expressible, you will find very different results in terms of floating point error depending on which implementation you use.

Therefore transmitting floating point in ASCII is just asking for trouble. But IEEE-754 is fairly universal and binary fixed point conversion will look the same everywhere.

2) The complaint about not being able to locate fields based on a text dump is okayish. But it's easily solvable by adding a 4 char code to specify the message type. For example, if you open a video file in a hex editor, you'll see headers like 'MPEG' all over.

3) In high performance or low memory applications, translating text to binary and back is expensive. But writing binary as text for debugging is just a printf away.

4) Auxiliary to 3, allocating memory for streams becomes an O(log n) problem in terms of allocations since you don't know where the current message of arbitrary length will end. Meanwhile, binary messages over either UDP or TCP are fixed length and atomic, vastly simplifying your streaming and event code.
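A minimal sketch of points 2 to 4, assuming a hypothetical 16-byte quote message (the `QUOT` type code, the field layout, and the function names are made up for illustration, not taken from any real trading protocol). The double travels as big-endian IEEE-754 bits, the 4-char code is visible in a hex dump, and the debug text is one format string away:

```python
import struct

# Hypothetical layout: 4-byte ASCII type code, 8-byte big-endian
# IEEE-754 double (price), 4-byte unsigned int (quantity).
# ">" disables alignment padding, so every message is exactly 16 bytes.
QUOTE = struct.Struct(">4sdI")


def encode_quote(price: float, qty: int) -> bytes:
    """Pack one quote into its fixed-length wire form."""
    return QUOTE.pack(b"QUOT", price, qty)


def decode_quote(buf: bytes):
    """Unpack a quote, rejecting buffers that carry another type code."""
    code, price, qty = QUOTE.unpack(buf)
    if code != b"QUOT":
        raise ValueError(f"unexpected message type {code!r}")
    return price, qty


def debug_dump(buf: bytes) -> str:
    """Render the binary message as text for a log line."""
    code, price, qty = QUOTE.unpack(buf)
    return f"{code.decode('ascii')} price={price!r} qty={qty}"


msg = encode_quote(0.1, 500)
print(len(msg), msg.hex())
print(debug_dump(msg))
```

Because the size is known up front, a reader can allocate one 16-byte buffer per message instead of growing a buffer until a delimiter shows up.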

The worst case of this I ever saw was a text RPC protocol for video streaming, where a single message could be anywhere between 20 bytes and several GB. The guy that wrote that one made the same argument ESR makes, "It's so easy to read!".

He also repeats that binary protocols have the downside that you need to "have the spec in front of you" to interpret them... but network traffic is virtually never inspected by humans without the assistance of a protocol dissector that will add the descriptions, parse numbers, and so on for you.

There is clearly an efficiency downside to packing this description into the messages themselves when the recipient software will necessarily have the protocol information necessary to decode it.

Not terrible.

I have learned some things about how the NTP protocol was designed. And it is a good summary of trade-offs involved. You may or may not agree with them, and they may or may not be important to you.

However, I have learned even more things about how absolutely mind-bogglingly huge the egos of some posters here are.

Funny that you should mention mind-bogglingly huge egos in a discussion about ESR:

> “Are you” she asked “the most famous programmer in the world?”

> This was a question which I had, believe it or not, never thought about before. But it’s a reasonable one to ask...


What's the relevance of this? It's certainly not the case that the presence of one person with an embarrassingly obvious craving for adulation in the conversation means that other people with distended egos of their own will stay away. The opposite is nearer true.

If you're so sure that the grandparent comment is relevant, then please tell us exactly which messages you agree are from people with absolutely mind-bogglingly huge egos even larger than Eric S Raymond's? I haven't noticed anyone making any appeals to their own authority as arrogant as claiming to be the most famous programmer in the world. Are you saying that it takes a huge ego to point out ESR's long track record of making flawed arguments, and quote the many patently false, self-aggrandizing, racist and sexist things he's said in the past, or that there's something wrong with pointing out racism instead of silently condoning it?

Yeah I'm really surprised this article is getting as much shit as it is. There's details it misses, but it seems like a good enough intro to some of the trade-offs between fixed-size messages and self-describing ones, and designing for extensibility.

> The worst case of this I ever saw was a text RPC protocol for video streaming, where a single message could be anywhere between 20 bytes and several GB. The guy that wrote that one made the same argument ESR makes, "It's so easy to read!".

That reminds me of the Mork file format. Apparently, the edict from management was that the database file format should be space-efficient and human readable. This is to store essentially an entity-attribute-value relationship. The resulting file looks like this: (^a0^23fca)(^a1^23fcb)(^a2^23fcc)... totally unreadable. And it's not particularly efficient as a file format either as a result.

Unfortunately the binary representation of IEEE-754 may not look the same everywhere.
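One way this can bite, as a small sketch: the same IEEE-754 double has a different byte sequence depending on byte order, so a protocol still has to pin the order down. (This only illustrates endianness; the comment above may have had other differences in mind.)

```python
import struct

x = 1.0
big = struct.pack(">d", x)     # network / big-endian byte order
little = struct.pack("<d", x)  # little-endian byte order

print(big.hex())     # 3ff0000000000000
print(little.hex())  # 000000000000f03f

# Reading little-endian bytes as if they were big-endian silently
# produces a different (here: tiny subnormal) number, not an error.
misread = struct.unpack(">d", little)[0]
print(misread == x)  # False
```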



When the very first sentence is incorrect, the outlook for the rest looks grim.

> A wire protocol is a way to pass data structures or aggregates over a serial channel between different computing environments.

A real wire protocol involves not just the flow of data but the flow of control. Without knowing which data structures are replies to which others, issued under what circumstances and affecting which other parts of the state space, you don't have a protocol. All you have is a format. It's the difference between the floor plan of a courtroom vs. the rules for what happens within one, and that difference is not a minor one.

Then, predictably, ESR fails to distinguish between three separate concepts: binary vs. text, fixed vs. variable length fields, self-describing vs. not. While he sets up a false dichotomy between two, all eight combinations actually exist. There are even further variations. "Self-describing" can apply to any combination of length/delimiters, field names, internal format/encoding, applicable versions, mandatory vs. optional, and many more. If you want to have a serious discussion about designing protocols and formats, the design space is much larger than "NTP's wire format sucks for one set of constraints and purposes" which is all you'll get from this article.

How dare you insult the author of one quasi used json format!!

I don't like the article for the same reasons. It felt long-winded and didn't ultimately offer anything of value. What's worse is that it's written from a position of authority, common for this particular author, so some people will be duped into treating it as gospel.

So glad to see this comment as I felt exactly the same way.

I would look forward to reading articles from any point of discussion however strongly people felt and he seems like he would make for a great debate but at the end it was...."You think you know but you don't."

Seems like a classic ivory architect mentality that grinds potentially good conversations to a halt.

Not mentioned in the article, but bit packed data structures are easier to interpret in hardware and offer more guarantees. If I'm designing an FPGA/ASIC I can predict how many clock cycles a statically sized data structure will need to move through my system, and how much RAM I need while it's being processed.

I have spent a lot of time working with FPGAs that process network packets and many of the performance guarantees relied on the rigid structure of L2-4 protocol headers.

I was a bit surprised myself that the author prefers JSON to bit-packed protocols, but given that the author is none other than Eric Raymond, I have to take him seriously.

Incidentally, in my work lately I used JSON for data exchange over the network where performance is not important, and MsgPack (which is essentially a packed JSON) where it is.

I feel like the fact that it's ESR causes me to take him less seriously. I'm basing that on the other contents of his blog, such as http://esr.ibiblio.org/?p=7239 , http://esr.ibiblio.org/?p=26 , and http://esr.ibiblio.org/?p=6907 .

We don't even need to get into his (execrable) politics; he's substantively bad on technical issues as well.


> “Are you” she asked “the most famous programmer in the world?”

> This was a question which I had, believe it or not, never thought about before. But it’s a reasonable one to ask...

That is precious.

Don't worry, if you look into the comments, the politics really comes out in force.

Did he manage to work white nationalism into binary versus ascii wire formats? I've seen him make some pretty impressive leaps before but that would be outdoing himself.

Appeal to authority, huh? Do you also take all of his many racist statements seriously, "just because he's none other than Eric Raymond"?

"The average IQ of the Haitian population is 67... Haiti is, quite literally, a country full of violent idiots." -Eric S Raymond

"... The minimum level of training required to make someone effective as a self defense shooter is not very high... unfortunately, this doesn't cover the BLM crowd, which would have an average IQ of 85 if it's statistically representative of American blacks as a whole. I've never tried to train anyone that dim and wouldn't want to." -Eric S Raymond


(Note: this is just the tip of the shitberg. There are SO MANY MORE examples on so many other topics (like "Is the casting couch fair trade?") from so many other times over the decades.)

There's no need to appeal to authority; the article gives specific reasons when to use JSON and when to use binary. What do you think about those arguments?

There’s a nice rebuttal from one of the ntp people in the article’s comments.

JSON is a disaster for many reasons. Hardware incompatible floating point is one; inconsistency in parser implementations (and ambiguities in the spec) also don’t help.

Also, why use a tree structured data representation when the underlying data structure is fundamentally just a N-tuple with a fixed schema?

Similarly, why use a text protocol to send around fixed length blobs or encrypted data?

Just to clarify the phrase "one of the NTP people": I'm the lead designer of NTS, which adds modern cryptographic security to NTP. I did not design NTP; Dave Mills did that, in large part before I was born.

If I got the chance to redesign NTP from scratch, there are a lot of things I'd change, but use of fixed binary fields is not one of them.

I use CBOR, which is a really nice binary JSON. I use it for a custom protocol for embedded systems and it's just brilliant.

Yep, see also MessagePack. Kind of telling that esr didn't notice the existence of either.

If one of my engineers brought me this "protocol" (um, protocols require state machines, where is it?), we would be having a series of long talks to figure out whether he needs to be educated or fired.

What happens if I want to run NTP on an ARM M4 microcontroller with a lithium coin cell battery? Because, you know, perhaps I actually might want my time to be accurate on devices that even outship cell phones?

Sending that message would be difficult without drift because of the huge number of bytes involved. Transmission time is far too long. I could go on and on...

If you want to see a relatively well designed protocol, go chew through the BLE (Bluetooth Low Energy) spec. It's not perfect, but it shows you how to balance functionality vs. engineering (note the number of times you have a "length" parameter so that you can chew through your binary blobs even if you can't parse all of it).
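The "length parameter" idea can be sketched roughly like this. The record layout (one type byte, one length byte, then the value) and the type codes are hypothetical and are not the actual BLE format; the point is only that a parser can step over records it does not understand:

```python
import struct

# Hypothetical type codes known to this receiver.
KNOWN = {0x01: "name", 0x02: "tx_power"}


def parse_records(buf: bytes) -> dict:
    """Walk type-length-value records, skipping unknown types."""
    out = {}
    pos = 0
    while pos < len(buf):
        if pos + 2 > len(buf):
            raise ValueError("truncated record header")
        rtype, length = struct.unpack_from("BB", buf, pos)
        pos += 2
        if pos + length > len(buf):
            raise ValueError("truncated record value")
        value = buf[pos:pos + length]
        pos += length
        if rtype in KNOWN:
            out[KNOWN[rtype]] = value
        # Unknown types need no special handling: the length field
        # alone told us how far to advance.
    return out


data = (bytes([0x01, 3]) + b"abc"            # known: name
        + bytes([0x7F, 2, 0xDE, 0xAD])       # unknown type 0x7F
        + bytes([0x02, 1, 0x04]))            # known: tx_power
print(parse_records(data))
```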

Please quit giving ESR a platform when it's quite clear he really sucks as a programmer.

> Please quit giving ESR a platform when it's quite clear he really sucks as a programmer.

I am not sure this post complies with Hacker News guidelines.

A better reason not to give ESR a platform is that his racism doesn't comply with Hacker News guidelines.

Why? An assessment of technical competence/relevance seems to be a cornerstone of the upvote/downvote system on HN.

I find ESR to be ferociously overrated from a technical standpoint and resent the fact that he absorbs oxygen from people far more talented but far less "adept" at self-promotion. In addition, the "technical" ideas that he promulgates occasionally have to be actively undone by those with stronger technical chops.

Why does verbalizing this run afoul?

Why would someone write a protocol in 2019?

Don't we already have formats? (EDIT: Like CAN or I2C?)

Don't know why you got downvoted, it's a good question.

A "protocol" sits on top of things like I2C, SPI, and CAN.

Protocols answer things like: "How do I send more bytes than the underlying transport can take in a single transaction?" "How do I exchange data when hardware has different characteristics?" "How do I minimize the power or time needed to exchange data?"

Different protocols have different strengths and weaknesses.

Remember: part of my complaint about this "protocol" is its "verbosity". If you are on a battery or are bandwidth constrained (Narrowband-IoT, LoRa, ANT, etc.), you want a protocol that exchanges short messages. Time is certainly something you don't want to need lengthy messages to set up.

Too many people think "embedded" means "runs a Linux installation larger than the average computer in 1996".

The article seems heavily influenced by author's personal preferences.

He starts by presenting a false dichotomy (bit stream vs self-documenting text) and proceeds to apply his personal experience with proprietary GPS trackers to the well-documented NTP protocol. He describes his favorite approach without mentioning its downsides, and that approach is JSON! JSON!

By design, the JSON format lacks any capacity for extensions. Its creators figured that backwards and forwards compatibility is more important than anything else, so they froze the specification at version 1 and refused to introduce new features or extension support. And thus JSON can't...

1) contain comments;

2) properly encode non-Latin text (no, — hexadecimal encoding is even worse than no encoding);

3) have more than one top-level element;

4) have any data types, except ones in JSON spec.

Each of those limitations has led to the creation of at least one incompatible JSON-like format that can't be processed by spec-compliant JSON parsers. Pick a random piece of JSON from the wild, and you may find that it isn't actually "JSON", but one of those quasi-JSON formats. To make matters worse, the JSON spec didn't mention a maximum supported number size/precision, so JSON payloads from one implementation may not properly decode on another implementation.

If he wants to design a JSON-based NTP protocol, he is welcome to do so. But widely adopting such a thing would be unwise; we already suffer from traffic amplification attacks via NTP, and bigger packet lengths would make those worse.
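The number-precision point is easy to reproduce with Python's standard `json` module. This sketch assumes the second decoder behaves like implementations that map every JSON number onto an IEEE-754 double (simulated here with the `parse_int=float` hook); the `id` field is just an illustrative name:

```python
import json

text = '{"id": 9007199254740993}'  # 2**53 + 1

# Python's default decoder keeps integers exact.
exact = json.loads(text)

# Simulate a decoder that stores every number as a double.
as_double = json.loads(text, parse_int=float)

print(exact["id"])      # 9007199254740993
print(as_double["id"])  # 9007199254740992.0
```

Same bytes on the wire, two different values after decoding, and neither parser is violating the spec.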

> And thus JSON can't... […] properly encode non-Latin text (no, — hexadecimal encoding is even worse than no encoding);

This is (RFC) valid JSON:

Sure, JSON has some corner cases. Binary protocols can, and do, as well. While I'm sure that non-compliant JSON examples exist in the wild, I would think that overall they're exceedingly rare compared to compliant ones.

And if you don't like the limitations of JSON, extending the format for your particular use-case is a valid solution. (Though I would argue that going w/ a well-known format that already supports your needs is a more pragmatic one.)

> This is (RFC) valid JSON:

You mean, "valid, according to the latest 2017 RFC". Such a young RFC is still too raw, too immature to adopt, especially if it concerns data interchange formats. IPv6 was created in 1995, and it is apparently still too young!

I fear, that a proper full-featured JSON spec, with comment support, mandatory UTF-8 and strict prohibition of hex-encoding won't be created and implemented by most JSON parsers till at least 2090. At that point the JSON format itself will likely become insufficiently hip for general use (just like XML suddenly stopped being hip enough in early 2000's).

> You mean, "valid, according to the latest 2017 RFC". Such a young RFC is still too raw, too immature to adopt, especially if it concerns data interchange formats. IPv6 was created in 1995, and it is apparently still too young!

No, I mean valid, according to the oldest, 2013 RFC and all later standards. Non-ASCII characters, encoded directly w/o escaping, have always been supported by JSON. (JSON comes from JavaScript's syntax, and it's legal there, too.)
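A quick check of that claim with Python's standard `json` module (the `city` key and its value are only an illustration): the unescaped and the `\u`-escaped spellings decode to the same string, and the encoder can emit either form.

```python
import json

raw = '{"city": "東京"}'                    # non-ASCII written directly
escaped = '{"city": "\\u6771\\u4eac"}'     # same text, \u-escaped

# Both spellings are accepted and decode to the same value.
print(json.loads(raw) == json.loads(escaped))  # True

# The encoder escapes by default, but that is a choice, not a requirement.
print(json.dumps(json.loads(raw)))
print(json.dumps(json.loads(raw), ensure_ascii=False))
```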

> I fear, that a proper full-featured JSON spec, with comment support

Many of us use JSON as a language to exchange data, service to service. Comments do no good in that regard. JSON, even w/ comments, is not terribly friendly. I'd recommend TOML or YAML, depending on the situation.

> mandatory UTF-8

JSON is required to be encoded in one of the Unicode UTF encodings. So, it's not required to be UTF-8, but it's pretty close, and I don't think I've yet run across a JSON document that wasn't UTF-8.

> strict prohibition of hex-encoding

I don't think you'd really want this. (Particularly if you want human-friendly features, like comments…) In debug situations, certain non-printing characters are just easier to deal w/ if they're not printed, for example.

Doesn't it also have serious security issues, partly for related reasons?

ESR may be a horrible person, and this article may get too caught up in flawed technical examples, but there is an underlying point here that’s important.

When designing any system, be it a wire protocol or anything else, it’s tempting to optimize for metrics that are easy to measure and forget about metrics that are hard to measure. Humans are expensive. Time is expensive. It may very well be worth using a little extra bandwidth to minimize development and debugging costs. That won’t always be the case, but it’s an important question to ask while you’re still in the designing stages.

Any decent programmer can see that the technical arguments here are flawed: NTP, by nature, needs to be very predictable and use as few bytes as possible, or embedded systems are going to run into issues. That’s a hard technical requirement, not a matter of optimization. Unfortunately, that oversight, combined with the author’s poor track record, detract from an important point. Sadly, the author has chosen to present his argument as a misguided rant about a particular protocol rather than a strong theoretical debate over the pros and cons of different optimization goals.

I'm the lead author of the protocol that ESR is critiquing. I've just posted a rebuttal here: http://esr.ibiblio.org/?p=8254#comment-2202914

I much prefer bit packed protocols. It's easy to process them on either side and if you really need them to be human readable you can dump the readable to a log or write a tool to let you see it. But to be honest, after a little while, it's like the Matrix, you can just see what you're looking for.

Offhand, I agree but also want to suggest.

Error replies should also include at least a short-text response (often along side a numeric one).

Initial connection strings might also have a text-string that says something useful to humans, like what the protocol is; just like the various multimedia container formats that were developed on the Internet rather than by 'media companies'.

Push complexity as far up the protocol stack as possible; but don't sacrifice useful extensibility at lower levels if it makes sense. At the same time don't depend on that data staying the same (if it exists in another layer, it should be modifiable without breaking your actual protocol). FTP is a great example of a protocol that (because of connection multiplexing limitations) embeds data which should be low level into a higher level.

If an RFC exists that describes the protocol then packed is probably OK. If no RFC described protocol or method exists, try to get as close as possible with off the shelf stuff and prototype with human readable things where possible on top of that until solid requirements for a new RFC are refined.

> Error replies should also include at least a short-text response (often along side a numeric one).

I always try and design in a debug mode. Turn it on and the destination will try to tell you exactly what you did wrong instead of stonewalling you.

Painful experience has taught me that you really want unique start and stop tokens, message type, message length, rev field and checksum/mac always. That at least allows you to mechanically validate and dispatch packets/messages.
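A minimal sketch of such a frame, under made-up assumptions: the marker bytes, the field widths, and CRC-32 as the checksum are all illustrative choices, and a real design would also need escaping (or a MAC) if the markers can appear inside the payload.

```python
import struct
import zlib

START, STOP = b"\xAA\x55", b"\x55\xAA"  # hypothetical marker bytes
HEADER = struct.Struct(">2sBBH")         # start, rev, type, payload length
TRAILER = struct.Struct(">I2s")          # CRC-32 of header+payload, stop


def build_frame(rev: int, msg_type: int, payload: bytes) -> bytes:
    header = HEADER.pack(START, rev, msg_type, len(payload))
    crc = zlib.crc32(header + payload)
    return header + payload + TRAILER.pack(crc, STOP)


def parse_frame(frame: bytes):
    """Mechanically validate a frame; return (rev, type, payload)."""
    if len(frame) < HEADER.size + TRAILER.size:
        raise ValueError("frame too short")
    start, rev, msg_type, length = HEADER.unpack_from(frame, 0)
    if start != START:
        raise ValueError("bad start token")
    if len(frame) != HEADER.size + length + TRAILER.size:
        raise ValueError("length mismatch")
    body_end = HEADER.size + length
    crc, stop = TRAILER.unpack_from(frame, body_end)
    if stop != STOP:
        raise ValueError("bad stop token")
    if crc != zlib.crc32(frame[:body_end]):
        raise ValueError("checksum mismatch")
    return rev, msg_type, frame[HEADER.size:body_end]


frame = build_frame(rev=1, msg_type=7, payload=b"hello")
print(parse_frame(frame))
```

Every check here runs before anything looks inside the payload, which is what makes dispatch on `msg_type` safe to do mechanically.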

Cannot upvote enough building it //in//; and shipping it. Conformant implementations must include it and allow admins to enable it.

You reminded me of the horror of talking to closed source things where there isn't even useful server-side debugging data. Stuff just fails or gets dropped without informing anyone why it went bad.

This also applies to things like my credit card - I would love to have the last week or two of even /failed/ attempts at using my card in my online statement. That would really, really help with figuring out if someone was trying to use my card, or if a given service that rejected use of my card even tried to hit the CC company. (This happened with a major travel site which probably did its own processing; having a firm direction to push and solid data might have helped.)

IDK if you would say that ASN.1 (DER) is 'bit packed', but I still kind of like it.

I had some young feller tell me some time back why ASN.1 was the anti-christ, but I can't remember (for the life of me) why. Do you happen to know/remember why ASN.1 is 'bad'?

I can write 8-bit assembler to it - I can write 32- or 64- bit compiled C to it, and it's pretty easy to create an FPGA pre-processor (and router) for it, and it certainly is more constrained than random JSON strings. What's wrong with ASN.1? Too old?

ASN.1 has multiple ways to serialize the same message which opens it up to bugs in rarely-used code paths. And every ASN.1 implementation was rife with security holes.
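As a concrete case of "multiple ways to serialize the same message": BER lets a sender encode a short length in long form, while DER requires the shortest form. The sketch below is a deliberately tiny reader (single-byte tags and definite lengths only; `read_tlv` is a name invented here, not a library API):

```python
def read_tlv(buf: bytes):
    """Decode one BER element; return (tag, value, minimal_length).

    minimal_length is True when the length octets use the shortest
    possible encoding, as DER requires.
    """
    if len(buf) < 2:
        raise ValueError("need at least tag and length octets")
    tag, first = buf[0], buf[1]
    if first < 0x80:
        # Short form: the octet itself is the length.
        length, offset, minimal = first, 2, True
    else:
        n = first & 0x7F  # number of length octets that follow
        if n == 0:
            raise ValueError("indefinite length not handled in this sketch")
        if len(buf) < 2 + n:
            raise ValueError("truncated length")
        length = int.from_bytes(buf[2:2 + n], "big")
        offset = 2 + n
        # Long form is minimal only if short form could not hold the
        # length and there are no leading zero octets.
        minimal = length >= 0x80 and buf[2] != 0
    value = buf[offset:offset + length]
    if len(value) != length:
        raise ValueError("truncated value")
    return tag, value, minimal


der = bytes.fromhex("020105")    # INTEGER 5, short-form length
ber = bytes.fromhex("02810105")  # INTEGER 5, long-form length (valid BER only)
print(read_tlv(der))
print(read_tlv(ber))
```

Both byte strings decode to the same INTEGER, which is exactly the kind of rarely exercised alternate path the comment is worried about.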

But mostly ASN.1 comes from the same "bad neighborhood" as CORBA, X.400, X.509, OSI protocols, etc.

Fair enough - there were bugs in specs (like the X.509 letting you attach arbitrarily large image or blobs), but mostly I found that the compilers were buggy.

Whenever (way back in the day) we (our/my company) did SET (secure 3P "secure" credit card protocol) with competitors (MS/HP/RSA/IBM/Netscape/etc.), because we compiled an interpreter from the spec, we were able to put in code-path switches depending on the counter party and adapt. Since they had a buggy compiler from a 3rd party - they could not.

Was that an issue with ASN.1? Or crappy tooling that people used?

One of my bosses could read the raw x.400 OSI networking packets and tell which ADMD it was from their botched implementation cough sprint cough

I had to resort to a 409 dump

Well, I’m not that good but when I see our raw data I can pick out the fields and read the values easily enough. You really do just get used to a system and how to read it. There’re patterns to it.

There is a beauty to simple bit packed protocols. That said, they seem to always end up not a great idea. Wholesale changing protocols is hard, so people tend to try to tack features onto what exists. Look at the clusterfk that DNS has become, with various degrees of support, depending on the server.

On the other hand, you could look at SIP for an example of what happens when you've an extensible ASCII-based protocol, and everyone and their dog decides to extend it. It's not pretty, that's for sure. I've come to regard it as my success disaster.

> Wholesale changing protocols is hard...

In my dozen plus years of working with them, I've not seen a large change like this. And TBH, there are ways to structure them to make sure they handle changes.

I mostly deal with hardware that has to be stable in the field for years, so maybe that's the difference.

Clearly binary formats are dumb and TCP/IP should be reimplemented with JSON payloads to be more sociable.

I wonder whether we can have the benefits of both.

Sounds like the bit-packed protocols save bits (that's clearly good), while the other protocols are self-documenting. Documenting is good.

Why can't we have both? Something like protocol buffers (yes, Raymond mentions them in the article) is a binary protocol that makes pretty efficient use of the bits on the wire. But they are also very well documented. And it's "documented" that is useful, not "documentation is included in every message that gets sent".

Is it possible to look at the on-wire protocol and tell that this message uses protocol buffers and which protocol specification it is using? (I'm not sure. I hope the answer is yes.) Is it possible to find the documentation for a particular protocol buffers specification once you know which one it is? (I think it is, if by no other means than a google search, although a more automated repository might be nice.) If both these things are true, then we can have bit-level efficiency AND have well-documented and extensible messages.

A lot of the older protocols are like that due to bandwidth or processing limitations. IP also has all these type fields that do the same thing, in addition to fragmentation, which is similar to said complaint. Nowadays you wouldn't design it like this anymore.

IIRC, Silicon Graphics (the company) used to maintain a registry for "magic numbers" designating each file's type. Customers could contact SGI to be issued a magic number for whatever file formats they were cooking up.

I wonder if there's merit to recreating such a thing under ICANN, where the issued serial numbers are useful for file types, wire protocols, etc.

Then anyone needing to reliably interpret a packet could (1) look for the format serial number at some well-known location, and then (2) consult the well-publicized registry for whatever information has been provided regarding the format.

Magic numbers are better than nothing, but "next protocol" fields accomplish the same thing without magic. Usually the steward of each protocol maintains a registry of protocol IDs. For example, IEEE maintains a list of Ethernet protocols: http://standards-oui.ieee.org/ethertype/eth.txt and IANA tracks IP protocols: https://www.iana.org/assignments/protocol-numbers/protocol-n... and TCP/UDP port numbers: https://www.iana.org/assignments/service-names-port-numbers/... and MIME types are used for HTTP and email: https://www.iana.org/assignments/media-types/media-types.xht...
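Roughly how "next protocol" fields chain together, as a sketch: it assumes an untagged Ethernet II frame (no VLAN tag) and only carries a handful of well-known registry values; `describe` is a hypothetical helper name.

```python
import struct

# A few well-known entries from the IEEE and IANA registries.
ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6"}
IP_PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}


def describe(frame: bytes) -> list:
    """Follow the next-protocol fields of an untagged Ethernet frame."""
    layers = ["Ethernet"]
    # EtherType sits after the two 6-byte MAC addresses.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    l3 = ETHERTYPES.get(ethertype)
    layers.append(l3 or f"ethertype 0x{ethertype:04x}")
    if l3 == "IPv4":
        # The protocol field is byte 9 of the IPv4 header.
        proto = frame[14 + 9]
        layers.append(IP_PROTOCOLS.get(proto, f"ip proto {proto}"))
    return layers


# Synthetic frame: zeroed MACs, EtherType IPv4, IPv4 header with proto 17.
sample = bytes(12) + b"\x08\x00" + bytes(9) + bytes([17]) + bytes(10)
print(describe(sample))
```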

In classic Mac OS and BeOS, file inodes carried file types so you didn't have to guess about file types either.

IIRC, it was magic numbers and offsets. And you could run the charmingly-named magic program on a file to find out if the system knew what it was...

If you want something like that, using an IEEE OUI/CID would be a good start.


So, when can we expect his JSON serialisation of TLS?

When someone else writes it, Russell Nelson takes it over, and then gets bored enough of it to let Eric Raymond claim it for his resume.

When Russell's not too busy nominating his own wikipedia page for deletion because he wasn't allowed to whitewash the "Blacks Are Lazy" incident.

I'm ignorant on the subject, but is JSON really much better than SOAP?

I thought everyone hated SOAP.

Those are different layers. JSON is a simpler alternative to XML. SOAP has its own complexity on top of the complexity of XML, so a "full" SOAP implementation is probably 100x more complex than, say, JSON-RPC.

Consider if IPv4 source and destination addresses in IP datagrams were textual rather than 32-bit fields. IPv6 might not be necessary, NAT might never have had to be hacked on to the Internet, and most of the centralized cloud model that is pervasive today may just not have had an opportunity to take hold.

... and routers would have to parse text fields in order to route packets
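To make the cost difference concrete, here is a rough sketch (function names are invented for illustration): with 32-bit fields a prefix match is a mask and a compare, while a textual address has to be parsed before the same test can even start.

```python
import ipaddress


def matches_binary(addr: int, prefix: int, prefix_len: int) -> bool:
    """Prefix match on a 32-bit address: one mask, one compare."""
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return (addr & mask) == prefix


def matches_text(addr: str, network: str) -> bool:
    """Same test on textual input: both sides are parsed first."""
    return ipaddress.ip_address(addr) in ipaddress.ip_network(network)


addr = int(ipaddress.ip_address("192.0.2.57"))
prefix = int(ipaddress.ip_address("192.0.2.0"))
print(matches_binary(addr, prefix, 24))
print(matches_text("192.0.2.57", "192.0.2.0/24"))
```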

Ouch! Object lesson: the Internets lets smart people write dumb things and sometimes they end up on the front page of HN.

I shouldn't pile on (Sorry Eric Raymond), but there's this one:

> A decimal digit string is a decimal digit string; there’s no real ambiguity about how to interpret it

The context is as contrasted to a 64-bit big endian value.

Of course, a decimal digit string is subject to its binary encoding (no different than anything else sent as bits).

I know the author is talking about JSON, but I've seen a lot of different ways the length of decimal digit strings is determined: null terminator, double-quote terminator, single-quote terminator, length as a two's-complement 16-bit or 32-bit value before the first character of the string. (I think maybe the 16-bit one is known as a pstring or Pascal string? My memory is not 100% here.) I'll bet someone's done a 64-bit value before the string, though I haven't seen it myself. Oh, and I've seen where the length is determined by knowledge of the data structure (that is, something like bytes 10-25 are a name, padded with spaces or null terminators, usually leaving readers to infer the encoding based on the dominant platform). And once you start terminating a string with a certain sequence of bits, there's an escaping mechanism you need to deal with. Let's look at the source code for a quality JSON parser before we call it unambiguous?

I mean, on my first paying gig of my life I made $50 writing some sample code for a BASIC tutorial. The first version of it was rejected because it didn't work right on their EBCDIC system. I was 16 and I was thinking, "what the heck (I actually swore back then) is EBCDIC?!?" I know we all use ASCII and Unicode now, and IIRC, the actual digits 0-9 were the same, but not the decimal point, so maybe parsing integers is OK but not floating point values. Speaking of which, using . for the decimal point is not exactly universal (even forgetting about EBCDIC, and let's please do)...

JSON has its rules (which is good!) but my point is: a decimal digit string is not necessarily a simple thing. I realize the author doesn't know this, and I don't begrudge him -- I am certainly not happier for knowing otherwise -- I'm just trying to point out that the authors of NTPv4 were not exactly working in an era where a good programmer could possibly think a decimal digit string was anything but a hornet's nest sitting on a land mine guarded by MCP (Tron reference, sorry).
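Three of those conventions side by side, as a sketch with invented helper names. Each reader returns the digits plus the offset where the next field starts; the same "123" needs a different rule every time. (For what it's worth, classic Pascal strings used a single length byte; the 16-bit prefix below is just one of the variants mentioned.)

```python
import struct


def read_cstring(buf: bytes, pos: int = 0):
    """Digits run until a NUL terminator."""
    end = buf.index(b"\x00", pos)  # raises ValueError if no terminator
    return buf[pos:end], end + 1


def read_prefixed16(buf: bytes, pos: int = 0):
    """A big-endian 16-bit length precedes the digits."""
    (length,) = struct.unpack_from(">H", buf, pos)
    start = pos + 2
    if start + length > len(buf):
        raise ValueError("truncated string")
    return buf[start:start + length], start + length


def read_fixed(buf: bytes, pos: int, width: int):
    """Fixed-width field, padded with spaces or NULs."""
    field = buf[pos:pos + width]
    return field.rstrip(b" \x00"), pos + width


print(read_cstring(b"123\x00rest"))
print(read_prefixed16(b"\x00\x03123rest"))
print(read_fixed(b"123   rest", 0, 6))
```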

So... the author complains that parsing an NTPv4 packet requires prior knowledge of things like big endianness, but parsing JSON requires plenty of prior knowledge.

I get it: big- vs. little-endian is not something people are used to dealing with these days, so it jumps up and bites you when you do. But it's just another encoding and is actually much, much simpler than ones you deal with every day.
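For scale, this is roughly all it takes to decode a 64-bit big-endian NTP timestamp (32 bits of seconds since 1900, 32 bits of binary fraction). A sketch only: `decode_ntp_timestamp` is a name chosen here, and era rollover is ignored.

```python
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2_208_988_800


def decode_ntp_timestamp(raw: bytes) -> float:
    """Convert an 8-byte NTP timestamp to Unix time in seconds."""
    # "!" means network (big-endian) byte order.
    seconds, fraction = struct.unpack("!II", raw)
    return seconds - NTP_UNIX_OFFSET + fraction / 2**32


raw = struct.pack("!II", NTP_UNIX_OFFSET + 1_000_000_000, 1 << 31)
print(decode_ntp_timestamp(raw))  # 1000000000.5

# The same four bytes read with the wrong byte order give another number.
print(int.from_bytes(raw[:4], "big"), int.from_bytes(raw[:4], "little"))
```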

(Back then, all the cool CPUs were big endian so I think it was pretty understandable how it ended up on the wire.)


Does it matter? YES (well, not my personal anecdotes! but the other bits).

The dead truth is: you're going to have to understand and parse the messages you receive and they may or may not use conventions and idioms you already understand.

How not to embarrass yourself.

I see snark and politics rules are selectively enforced.

Can you name any other overrated factually incorrect self-aggrandizing racists whose articles don't get criticized and downvoted as much as Eric S Raymond's?
