1. A lot of ASN.1 software is pretty buggy and undermaintained. How do I submit a bug to pyasn1? Because it's entrenched in libraries three or four layers below what most developers will see, it's also difficult to replace.
2. Faster and more compact depends on the encoding rules. Let's talk about the dozen or so different ways you have of encoding ASN.1. BER? DER? Maybe CER? XER? Why not both: CXER! I'm missing a bunch, I know.
3. The performance and network arguments ignore what you can do with compressed verbose formats like JSON, XML or EDN. lz4 is magic. Even totally ubiquitous gz compression works extraordinarily well.
4. While ASN.1 messages do contain descriptions of what types they contain (e.g., you can see that there's a bit string coming), they aren't self-describing in the same way that JSON or XML are, which is quite annoying for debugging. You can make self-describing messages using XML, but at that point you're literally just doing XML. Good luck finding software that lets you easily use whichever you like. And if you subscribe to the notion that easy debuggability doesn't matter for production messages, I strongly disagree.
5. I'd harp on its extensibility, but it's really no better than JSON's, so I won't.
All-in-all, I'm compressing some EDN, and I'm pretty happy.
In addition to what you mentioned, I wanted to mention EXI (and FAST, the FIX encoding) as interesting on-the-wire encodings that maybe more people should consider over just compressed JSON or XML. Generic LZ-based compression doesn't necessarily win very much with lots of short messages.
When those vulnerabilities hit, I never found out how that code fared -- I wish they'd open-sourced it as they'd planned.
So, that particular part (DER? again I forget) seemed tolerable to me. The newer stuff like Cap'n Proto is probably still better.
The features in the author's F# ASN.1 compiler are pretty swank. ASN.1 probably gets a bad rap because of BER/DER.
But starting to use FAST today seems like a bad idea, because the largest publicly-known production users of FAST have already moved away from it (toward plain, uncompressed binary structs on the wire).
Then there were things like the Meru devices that added or deleted a field and accidentally renumbered all the following entities in the (clearly auto-generated) MIB, on a minor firmware version update.
Sure; but in reality, you just write your from-scratch systems to use DER, and then throw in a "protocol-upgrade to PER" message/bitflag if things aren't going fast enough. And then if some crazy legacy SOAP+WSDL enterprise wants to integrate their codebase with yours, you add the ability to negotiate XER just for when they're speaking to you. It's not like all the different encodings are equally valid choices when both nodes are under your control; they fall pretty clearly on a spectrum from "whenever you can get away with it" down to "only if you have to."
> The performance and network arguments ignore what you can do with compressed verbose formats...
Wire compression doesn't help if you're sending short intermittent messages across millions of connections, rather than batch-streaming messages across a link. Specifically in the GSM context mentioned in the article, the average control-channel ASN.1 message won't be helped at all by wire compression.
Also, this isn't an argument in favor of ASN.1, but sometimes you need your wire message format to be efficient not because of throughput constraints, but because you need to directly manipulate the resulting data in memory (à la Cap'n Proto), and so efficiency on the wire directly translates to efficiency of packing in memory. The protocol can be compressed on the wire on top of this, of course, but it implies that "just use wire compression" won't solve every efficiency problem.
> they aren't self-describing in the same way that JSON or XML are
ASN.1 comes from an era of record-oriented storage, where a schema is transmitted once, in advance—or possibly baked into the client—and then rows defined by the schema are transferred many times. XML does this well for individual self-describing messages by allowing for a reference to a DTD+XML schema, but ASN.1, being for streams of records, expects you to create a message-container format to reference your schema, so as to remove the overhead of repeatedly mentioning it on the wire.
This last one makes a more general point for me: some protocols reward good engineering done in the code that uses them, and punish bad engineering. If you're trying to take the naive-and-suboptimal approach to something and a protocol is fighting with you, it might be because the protocol "wants" to be used in an optimal fashion.
And yes, some of the X- encodings for ASN.1 should die in flames (alongside XML in general; flames are welcome ;>). It's still a pretty neat binary format if one uses BER though. But when it comes to binary formats, I really like everything IFF-like. It's similar to ASN.1 in terms of how data is structured, but uses far simpler rules for headers and payload. It's much harder to botch the implementation.
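To illustrate why it's hard to botch: an IFF-style chunk is just a 4-byte tag, a 32-bit length, and a payload, so the whole reader fits in one bounds-checked function. A minimal sketch in C (the names and struct are mine, not from any particular IFF library):

    #include <stdint.h>
    #include <string.h>

    /* An IFF-style chunk: 4-byte ASCII tag, big-endian 32-bit length, payload. */
    typedef struct {
        char           tag[4];    /* e.g. "FORM", "DATA" */
        uint32_t       length;    /* payload size in bytes */
        const uint8_t *payload;
    } chunk_t;

    /* Read one chunk from buf; returns total bytes consumed, or 0 on error. */
    static size_t read_chunk(const uint8_t *buf, size_t buf_len, chunk_t *out)
    {
        if (buf_len < 8)
            return 0;                              /* not even a full header */
        memcpy(out->tag, buf, 4);
        out->length = (uint32_t)buf[4] << 24 | (uint32_t)buf[5] << 16
                    | (uint32_t)buf[6] << 8  | (uint32_t)buf[7];
        if (out->length > buf_len - 8)
            return 0;                              /* payload would overrun the buffer */
        out->payload = buf + 8;
        return 8 + out->length;                    /* caller advances by this much */
    }

Real IFF additionally pads odd-length chunks to an even boundary, but the point stands: one header rule, one bounds check.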
I'd love to hear from people who have used them, and about the experiences they've had. I found an interesting thread discussing them, and the claimed advantages for Protobuf are as follows:
- faster (real existing software, not some hypothetical ASN.1 compiler that could do x)
- easier to maintain backward compatibility
- much simpler, and thus easier to understand and more robust
Maintaining backward compatibility especially seems crucial to me in large distributed systems.
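For what it's worth, the backward-compatibility claim mostly falls out of the wire format: every field is prefixed with a varint key, (field_number << 3) | wire_type, so a decoder can skip any field number it doesn't recognize purely by wire type. A rough C sketch of that skipping logic (my own simplification, not protobuf's actual code; deprecated groups are omitted):

    #include <stdint.h>
    #include <stddef.h>

    /* Decode one varint; returns bytes consumed, or 0 on truncation/overflow. */
    static size_t read_varint(const uint8_t *p, size_t len, uint64_t *out)
    {
        uint64_t v = 0;
        for (size_t i = 0; i < len && i < 10; i++) {
            v |= (uint64_t)(p[i] & 0x7f) << (7 * i);
            if (!(p[i] & 0x80)) { *out = v; return i + 1; }
        }
        return 0;
    }

    /* Skip one field we don't recognize; returns bytes consumed, or 0 on error.
       This is why old decoders survive new fields: the wire type alone tells
       us how much to skip. */
    static size_t skip_field(const uint8_t *p, size_t len)
    {
        uint64_t key, tmp;
        size_t n = read_varint(p, len, &key);
        if (!n) return 0;
        switch (key & 7) {                        /* wire type */
        case 0: {                                 /* varint */
            size_t m = read_varint(p + n, len - n, &tmp);
            return m ? n + m : 0;
        }
        case 1: return (len - n >= 8) ? n + 8 : 0;    /* fixed64 */
        case 2: {                                 /* length-delimited */
            size_t m = read_varint(p + n, len - n, &tmp);
            if (!m || tmp > len - n - m) return 0;
            return n + m + (size_t)tmp;
        }
        case 5: return (len - n >= 4) ? n + 4 : 0;    /* fixed32 */
        default: return 0;
        }
    }

That's the whole trick: old readers skip new fields, and new readers treat missing old fields as defaults.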
* Many different ways of encoding a simple string, with some very obscure encodings.
* SET OFs are sorted in the DER encoding, to ensure consistent bytestreams. This sucks for embedded systems.
* OIDs (unique identifiers for things) are unbounded.
Furthermore, ASN.1 has the usual lameness you get when people build generic description languages: for example, it's quite common to encode a particular ASN.1 structure, and then put the resulting structure into an OCTET STRING for inclusion in a parent structure (take a look at Extension in RFC 5280's ASN.1, for example). This is presumably because ASN.1 didn't (doesn't?) support an ANY type to allow inclusion of arbitrary structures that the decoder didn't know how to parse, so there's no extensibility without such tricks.
In the end, I punted and just used BER/DER directly without ever using ASN.1. This made a lot of things much simpler and produced much smaller and more efficient code (e.g. my cert parser for our SSL library for the Palm III ran with no additional allocation space, and compiled to a few K of code).
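For the curious, "using BER/DER directly" is less exotic than it sounds; the heart of a DER parser is one TLV header routine, roughly like this sketch (simplified to low tag numbers and definite lengths, the latter being all DER permits anyway):

    #include <stdint.h>
    #include <stddef.h>

    /* Parse one DER TLV header: tag byte, then short- or long-form length.
       Returns header size, or 0 on error; *tag and *len are filled in. */
    static size_t der_header(const uint8_t *p, size_t avail,
                             uint8_t *tag, size_t *len)
    {
        if (avail < 2) return 0;
        *tag = p[0];
        if ((p[1] & 0x80) == 0) {            /* short form: length in 7 bits */
            *len = p[1];
            return (*len <= avail - 2) ? 2 : 0;
        }
        size_t n = p[1] & 0x7f;              /* long form: n length bytes follow */
        if (n == 0 || n > sizeof(size_t) || avail < 2 + n)
            return 0;                        /* n == 0 is indefinite: not DER */
        size_t l = 0;
        for (size_t i = 0; i < n; i++)
            l = (l << 8) | p[2 + i];
        *len = l;
        return (l <= avail - 2 - n) ? 2 + n : 0;
    }

Everything else is recursing over that plus comparing tags, which is how a cert parser can end up at a few K of code with no extra allocation.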
JSON is taking over because it's a good match to languages where you can define lists and dictionaries easily. Most languages now have that. The overhead is high, but the simplicity is helpful. As a practical matter, it's usually easier to get things to talk to each other with JSON than with the more rigid forms. Someone is usually out of compliance with the data definition.
There's now "Universal binary JSON" (http://ubjson.org/). That's just a binary representation of JSON. Then there's JSON Schema, which adds a data definition language to JSON. Put both of those together, and you have something very close to ASN.1.
And the wheel goes round and round.
These encoding formats (or rather their implementations) are built around minimizing copying of data. Deep down they are based on mmap-ing memory areas. It's not unlike the casting of blobs of memory to packed structures that you sometimes see, but with more safety.
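A toy version of the idea, assuming a fixed little-endian layout (real zero-copy formats like Cap'n Proto add alignment rules, offsets, and versioning on top; __attribute__((packed)) is a GCC/Clang extension):

    #include <stdint.h>
    #include <string.h>

    /* A fixed-layout wire header we can use in place without a decode step. */
    typedef struct {
        uint32_t id;
        uint16_t flags;
        uint16_t payload_len;
        /* payload bytes follow immediately */
    } __attribute__((packed)) msg_header;

    /* "Casting blobs of memory, but with more safety": memcpy instead of a
       pointer cast keeps this legal under strict aliasing and safe on
       alignment-picky CPUs, and compilers turn it into plain loads anyway.
       The payload itself is never copied -- it stays in the receive buffer. */
    static int view_message(const uint8_t *buf, size_t len, msg_header *hdr)
    {
        if (len < sizeof *hdr)
            return -1;
        memcpy(hdr, buf, sizeof *hdr);
        if (hdr->payload_len > len - sizeof *hdr)
            return -1;                     /* truncated message */
        return 0;                          /* payload lives at buf + sizeof *hdr */
    }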
XML and JSON are both ridiculously inefficient, both for static storage and especially for communication protocols. Can't wait for them to die.
It sure was fun implementing back then.
Contrast this with text-based protocols that rely on scanning forward to find a delimiter -- the lengths are implicit. I think this is what really causes bugs: not having explicit lengths makes it easier to forget to bound your scan by the buffer's size.
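A small illustration of the failure mode (hypothetical helpers, just to show the shape of the bug):

    #include <string.h>

    /* Classic delimiter-scanning bug: assumes a terminator exists somewhere
       inside the buffer, and walks off the end if it doesn't. */
    char *find_eol_bad(char *buf)
    {
        return strchr(buf, '\n');          /* unbounded scan */
    }

    /* Safer: the scan itself is bounded by the buffer's size -- the same
       check an explicit length field would have forced up front. */
    void *find_eol_ok(const void *buf, size_t buf_len)
    {
        return memchr(buf, '\n', buf_len);
    }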
> Length fields automatically make the grammar context sensitive which is much harder to secure according to langsec.
Is this accurate given a finite length field? I can imagine a DFA that recognizes the language of a single-byte length prefix followed by strings of 1 to 255 characters -- just that the node that consumes the length field will have 255 branches to sub-DFAs that recognize 1, 2, ..., 255-character strings.
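And you don't even need 255 literal branches: a counter bounded by 255 is itself only finitely many states, so the recognizer collapses to a tiny machine. An illustrative C sketch (my own, just to make the state count concrete):

    #include <stdint.h>

    /* Recognizes: one length byte N (1..255) followed by exactly N bytes.
       State is (phase, remaining) with remaining <= 255, i.e. at most a few
       hundred distinct states -- the bounded counter is just a compressed
       encoding of the explicit sub-DFAs. */
    enum phase { WANT_LEN, WANT_BYTES, ACCEPT, REJECT };

    typedef struct {
        enum phase phase;
        unsigned   remaining;
    } lp_dfa;

    static void lp_step(lp_dfa *d, uint8_t byte)
    {
        switch (d->phase) {
        case WANT_LEN:
            if (byte == 0) { d->phase = REJECT; break; }   /* N must be 1..255 */
            d->remaining = byte;
            d->phase = WANT_BYTES;
            break;
        case WANT_BYTES:
            if (--d->remaining == 0)
                d->phase = ACCEPT;
            break;
        case ACCEPT:                       /* trailing input past the string */
            d->phase = REJECT;
            break;
        case REJECT:
            break;
        }
    }
    /* Start in { WANT_LEN, 0 }, feed each input byte to lp_step, and accept
       iff phase == ACCEPT when the input ends. */

Presumably the langsec concern is really aimed at unbounded or nested length fields, where no finite bound on the states exists.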
Also, here is a video that works a bit better as an introduction to LANGSEC: https://www.youtube.com/watch?v=3kEfedtQVOY (around 19:00 is especially entertaining)