
Yeah, I skipped all the drama, read the spec and implemented an encoder/decoder. CBOR is just how a MessagePack-like format should have been done from the beginning: it's technically superior in the sense that it's neat and simple, replacing many specialized rules with one generalization.
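To make the "one generalization" point concrete, here is a rough sketch of how every CBOR data item starts: the initial byte carries a 3-bit major type and 5 bits of additional information, and the same length rule applies to every major type. The function name `decode_head` and its return shape are my own choices for illustration, not anything the spec prescribes.

```python
def decode_head(buf, pos=0):
    """Return (major_type, argument, next_pos) for the item head at buf[pos].

    Major types: 0 uint, 1 negative int, 2 byte string, 3 text string,
    4 array, 5 map, 6 tag, 7 simple values and floats.
    """
    initial = buf[pos]
    major = initial >> 5       # top 3 bits
    info = initial & 0x1F      # low 5 bits
    pos += 1

    if info < 24:
        # The argument is stored directly in the initial byte.
        return major, info, pos

    if info <= 27:
        # 24, 25, 26, 27 mean the argument follows in 1, 2, 4, 8 bytes.
        # (For major type 7 these bytes are a simple value or float bits.)
        size = 1 << (info - 24)
        if pos + size > len(buf):
            raise ValueError("truncated head")
        return major, int.from_bytes(buf[pos:pos + size], "big"), pos + size

    if info == 31:
        # Indefinite length (or the "break" stop code when major == 7).
        return major, None, pos

    raise ValueError("reserved additional information value")


# 0x18 0x64 is the unsigned integer 100; 0x63 starts a 3-byte text string.
print(decode_head(b"\x18\x64"))   # (0, 100, 2)
print(decode_head(b"\x63abc"))    # (3, 3, 1)
```

Whether the argument is an integer value, a byte count, an element count or a tag number depends only on the major type; the bytes are read the same way in each case.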

I agree that IETF standardization isn't a good argument (well, for some it is), that's why I replied with a better argument :) But, seriously, I won't say that everyone should replace MessagePack everywhere with CBOR; both work fine (as long as you use the latest version of MessagePack, with the binary/string distinction).




Looking at the CBOR spec, I see that it is just more complex. Two ways to encode lists/strings -- indefinite-length and definite-length. 16-bit floats in the main spec. Separation of "null" and "undefined" values. And tags, which define things like decimal fractions, bigints, and regexps.
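For anyone who has not looked at the two list encodings, here is a minimal sketch of the difference, restricted to arrays of small unsigned integers so it stays short. The helper names are made up for this example, and a real encoder would handle every length and item type.

```python
def encode_small_uint(n):
    # Unsigned integers 0..23 fit in the initial byte (major type 0).
    if not 0 <= n < 24:
        raise ValueError("this sketch only handles 0..23")
    return bytes([n])


def encode_array_definite(items):
    # Major type 4 with the element count stored in the initial byte.
    if len(items) >= 24:
        raise ValueError("this sketch only handles fewer than 24 items")
    return bytes([0x80 | len(items)]) + b"".join(map(encode_small_uint, items))


def encode_array_indefinite(items):
    # 0x9f opens an array of unknown length, 0xff ("break") closes it.
    return b"\x9f" + b"".join(map(encode_small_uint, items)) + b"\xff"


print(encode_array_definite([1, 2, 3]).hex())    # 83010203
print(encode_array_indefinite([1, 2, 3]).hex())  # 9f010203ff
```

Both byte sequences decode to the same list, so a decoder has to accept either form.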

The tags are the worst, actually. Sure, the spec says "decoders do not need to understand tags", but this is not really the case. For example, if someone has floating point numbers and worries about precision loss, they can store the value as a _decimal fraction_ (per section 2.4.3). This means that your decoders have to support both tagging and your favorite bignum library just to make sense of the data. In comparison, in msgpack (or json or xml or anything else) you would just have to store a string representation -- trivial to convert to regular floats if you do not care, or to pass to your favorite bignum library (and this will be simple, as all of them support construction from ASCII strings).
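A rough sketch of the two paths being compared, assuming a generic CBOR decoder has already handed back the content of a tag 4 item as an `[exponent, mantissa]` pair. The function name `decimal_from_tag4` is hypothetical, and bignum mantissas (tags 2 and 3) are left out.

```python
from decimal import Decimal


def decimal_from_tag4(pair):
    """Build an exact Decimal from a CBOR decimal fraction [exponent, mantissa]."""
    exponent, mantissa = pair
    sign = 1 if mantissa < 0 else 0
    digits = tuple(int(c) for c in str(abs(mantissa)))
    # The tuple constructor is exact; it does not round to the context precision.
    return Decimal((sign, digits, exponent))


# CBOR path: 4([-2, 27315]) means 27315 * 10**-2.
from_tag = decimal_from_tag4([-2, 27315])

# String path: the value travels as plain text.
from_string = Decimal("273.15")

print(from_tag, from_string, from_tag == from_string)  # 273.15 273.15 True
print(float("273.15"))  # the "do not care about precision" route
```

Either way the receiver ends up calling into a decimal library; the difference is whether the decoder or the application does the conversion.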

In general, I think optional tags in data-interchange protocols are a very bad idea. For example, there is a tag for "Standard date/time string" and for "Epoch-based date/time". Which means that either:

- Your schema says "date/time", and your decoder now must support both of them (and probably untagged strings, and integers too). So this is extra complexity in your decoder.

- Your schema says "date/time in 'Standard date/time string' format", and now every encoder user must make sure the emitted value is encoded and tagged appropriately. This means you cannot do `x = cbor_encode({"now": date})`, you have to read your encoder documentation to make sure the CBOR encoder you are using will generate the required encoding.

So extra complexity in either case, and no real benefits. Better stick to msgpack, at least it has no extensions defined currently.
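The first option in the list above looks roughly like this on the decoding side. This is a sketch that assumes the decoder exposes a tagged item as a `(tag, value)` pair, with `tag` set to `None` for untagged data; that representation and the name `to_datetime` are assumptions made for the example.

```python
from datetime import datetime, timezone


def to_datetime(tag, value):
    """Normalize the date/time shapes a permissive schema might receive."""
    if tag == 0 or (tag is None and isinstance(value, str)):
        # Tag 0: standard date/time string. Older Python versions of
        # fromisoformat() reject a trailing "Z", so rewrite it first.
        if value.endswith("Z"):
            value = value[:-1] + "+00:00"
        return datetime.fromisoformat(value)
    if tag == 1 or (tag is None and isinstance(value, (int, float))):
        # Tag 1: seconds since the Unix epoch, integer or float.
        return datetime.fromtimestamp(value, tz=timezone.utc)
    raise ValueError("not a recognized date/time encoding")


a = to_datetime(0, "2013-03-21T20:04:00Z")
b = to_datetime(1, 1363896240)
print(a == b)  # True: both encodings name the same instant
```

The second option moves the same branching into every encoder instead.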


Seriously, I just don't understand how you can describe the benefits of CBOR and claim that they are its drawbacks.

Yes, you have to read the documentation of your encoder/decoder to understand what tag values it maps to your programming language's objects, but if you need to encode or decode those same values with MessagePack you'll have to define your own format for them and document it. You just moved this problem up the stack, but with an ad-hoc format.

Separation of "null" and "undefined" is for full JavaScript support. Before C99, C didn't have a boolean type, but you wouldn't complain if serialization formats had one, would you? Same thing with "undefined": while it's useless in most other languages, it's useful to have it for JavaScript.

I don't like float16 either (it took 19 lines of decoder code to support; there's no need to support it in the encoder), but it's the same situation as with "undefined": some people need it.
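For a sense of what the float16 support involves, here is a minimal sketch of decoding an IEEE 754 half-precision value by hand. It is not the 19 lines mentioned above, just an illustration under the name `decode_half`, which is made up for this example.

```python
import math


def decode_half(two_bytes):
    """Decode a big-endian IEEE 754 binary16 value into a Python float."""
    half = int.from_bytes(two_bytes, "big")
    sign = -1.0 if half & 0x8000 else 1.0
    exponent = (half >> 10) & 0x1F   # 5 exponent bits
    mantissa = half & 0x3FF          # 10 fraction bits

    if exponent == 0:
        # Zero and subnormals: no implicit leading 1.
        value = math.ldexp(mantissa, -24)
    elif exponent != 31:
        # Normal numbers: add the implicit leading 1 (1024 == 1 << 10).
        value = math.ldexp(mantissa + 1024, exponent - 25)
    else:
        # All-ones exponent: infinity or NaN.
        value = math.inf if mantissa == 0 else math.nan
    return sign * value


print(decode_half(bytes.fromhex("3c00")))  # 1.0
print(decode_half(bytes.fromhex("c400")))  # -4.0
print(decode_half(bytes.fromhex("7bff")))  # 65504.0
```

An encoder can skip all of this by always emitting 32-bit or 64-bit floats, which is why only the decoder side costs anything.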


MessagePack has extension types:

https://github.com/msgpack/msgpack/blob/master/spec.md#forma...

I experimented using them to implement a limited atom/token spec for use on Arduino.
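For reference, a sketch of what emitting an extension value looks like on the wire, covering only the fixext and ext 8 forms. The type code `1` for an atom and the function name `encode_ext` are hypothetical choices for this example, not what the Arduino experiment actually used.

```python
# fixext 1/2/4/8/16 have dedicated format bytes; the payload length is implied.
FIXEXT = {1: 0xD4, 2: 0xD5, 4: 0xD6, 8: 0xD7, 16: 0xD8}


def encode_ext(type_code, payload):
    """Encode a MessagePack extension value (fixext and ext 8 only)."""
    if not 0 <= type_code <= 127:
        # Negative type codes are reserved for predefined extensions.
        raise ValueError("application type codes are 0..127")
    n = len(payload)
    if n in FIXEXT:
        return bytes([FIXEXT[n], type_code]) + payload
    if n < 256:
        # ext 8: format byte, 1-byte length, type code, then the payload.
        return bytes([0xC7, n, type_code]) + payload
    raise ValueError("ext 16 / ext 32 are left out of this sketch")


ATOM = 1  # hypothetical application-defined type code

print(encode_ext(ATOM, b"\x2a").hex())  # d4012a
print(encode_ext(ATOM, b"abc").hex())   # c7030161 6263 without the space
```

The meaning of the payload is entirely up to the application, which is the part that has to be documented separately.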



