
The Order of the JSON - tosh
https://blog.almaer.com/the-order-of-the-json/
======
Someone
Next installment: how flipping an innocuous switch introduced a subtle bug
that was silent for years and cost us millions.

You can’t just flip that switch, run _”a client to post the JSON to that
instance”_ , see that test _“worked just fine”_ , and call it a day.

The POST might just store it, for later processing to wreak havoc (say by
ignoring a value that isn’t in the expected place in the JSON), or only rarely
seen JSONs might cause problems, or it might ‘only’ break the yearly run, etc.

~~~
dtech
How do you justify ever changing anything with that attitude?

This change made parsing more lenient which is generally considered fine [1]
and made the software actually follow the spec, since it specifies object
entries are unordered.

[1]
[https://en.wikipedia.org/wiki/Robustness_principle](https://en.wikipedia.org/wiki/Robustness_principle)

~~~
Someone
_”This change made parsing more lenient”_

How do you know? It makes the outermost layer of the system more liberal in
what it accepts, but that doesn’t guarantee that the inner layers can handle
that more liberal content.

For example, the code may assume that “<user-ID>” always is the first node in
the _json_. If you start sending “<car-ID>” instead, things may go fine until
you get two customers who share a car.
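
A minimal sketch of that failure mode, assuming a hypothetical inner-layer function that takes whatever member comes first as the user ID. The member names `user_id` and `car_id` and both functions are invented for illustration; nothing here is taken from the system in the article.

```python
import json


def owner_key_fragile(raw: str) -> str:
    """Hypothetical inner-layer code: assumes the first member is the user ID."""
    obj = json.loads(raw)          # dicts keep document order in Python 3.7+
    first_name = next(iter(obj))   # whichever member happens to come first
    return obj[first_name]


def owner_key_robust(raw: str) -> str:
    """Looks the member up by name, so member order is irrelevant."""
    return json.loads(raw)["user_id"]


# Hypothetical payloads: same members, different order.
expected = '{"user_id": "u-1", "car_id": "c-9"}'
reordered = '{"car_id": "c-9", "user_id": "u-1"}'

print(owner_key_fragile(expected))   # the user ID
print(owner_key_fragile(reordered))  # silently returns the car ID instead
print(owner_key_robust(reordered))   # still the user ID
```

The fragile version raises no error on the reordered payload, which is why a single "worked just fine" POST would not reveal the problem.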

 _”and made the software actually follow the spec”_

How do you know? The spec of _this_ piece of software may state it has a JSON-
like API that requires the “<user-ID>” to be the first node in every request.

And yes, most of its code may handle that change fine, but it only takes one
piece of code to break things.

 _”How do you justify ever changing anything with that attitude?”_

In the case of _”a service running IBM DataPower Gateway which sat on top of
WebSphere which sat on top of the COBOL.”_ where _”Much of the system was so
old that it was hard to find anyone who knew how it actually worked, and it’s
maintenance had been outsourced ”_ : very carefully.

This change may have been fine, but you have to check that.

------
Ididntdothis
This sounds very familiar. A lot of companies are full of people who have no
curiosity and no ability to think for themselves. I have seen it multiple
times where someone claimed a change is impossible or takes insane effort.
Then you have someone competent look at it and you have a solution in an hour.
I think stuff like this is the real price of not hiring really good people.

~~~
bjoli
I had the same thing happen to me lately, but I was at the stupid end. I am
not a programmer by profession, so it matters very little to me, but I had
been struggling to produce a correct solution to a seemingly simple problem
(writing a macro to allow definitions in expression context in r6rs scheme).
My solution worked but had the side effect of rewriting obviously bad syntax
into correct syntax, and not in a good way.

I was struggling with how to do it correctly, and then one of the people in
the r7rs working group simplified the problem by a very large factor by
pointing out to me that I was trying to solve a non-existent problem, because
I tried to add definitions to expression contexts where it simply made no
sense. In fact, instead of supporting all forms, I could get a better result
by just focusing on one of them and add simple wrappers for another 5.

It was all quite humbling. Had I taken a step back and actually analyzed the
problem I would have come to the same solution, but I immediately tried the,
to me, most obvious and also hard-to-get-right solution.

~~~
quickthrower2
You had trouble writing a scheme macro? I think most pro software developers
would struggle to! I don’t think this is dumb.

~~~
bjoli
I do not have any problems writing macros (in fact, I recently finished my
guile scheme version of Racket's for loops). It is just that the macro I wrote
did a lot more work than it had to, and led to weird syntax behaviour. I
didn't delimit what I was trying to do. John over in #guile put me on the
right track.

~~~
quickthrower2
I didn’t mean to imply you had trouble writing all macros. But as someone who
is trying to learn some c-lisp on the side I can see why they can get a lot
more tricky than “regular” coding that one might do say in Java. So the point
is you are probably doing something more advanced there.

------
jwilliams
(I've used both DataPower and COBOL - I've got a healthy respect for robust,
long-lived legacy systems).

I must admit I was scratching my head on this.

The JSON spec might not specify order, but the serialized JSON is ordered by
nature. JWT needs it for example (and I'd assume many signature models). Or
you might have a caching layer that needs it. Maybe unchecking this causes the
legacy backend to get hammered? There are valid non-spec concerns with order.

Replies here seem to assume stupidity. It's a valid reason, but it's not
the only one. Equally, the author doesn't ask "why" - why would it take so
long, why was that option enabled?

~~~
blattimwind
> JWT needs it for example (and I'd assume many signature models)

You almost never need canonical representations for signing things. I would
even say that if you need a canonical representation to sign your things, then
that is a design smell of your cryptographic protocol.

~~~
jwilliams
I didn't mention canonicalization. My point is that serialized JSON is ordered
- which I think is exactly the same property you're referring to.

~~~
coldtea
That's an implementation detail. Serialized JSON can also be printed, but your
systems shouldn't depend on JSON always being ink on paper.

------
carlmr
>I got to learn that we were talking to a service running IBM DataPower
Gateway which sat on top of WebSphere which sat on top of the COBOL.

As soon as you read IBM you know you have bigger problems than COBOL in your
system.

------
perfunctory
> How does world not break due to technology more often

My own theory is that most (not all) of technology is there to support BS jobs
and they are irrelevant for the normal functioning of society. It simply
doesn't matter when (most) technology breaks.

------
lacker
I agree the system described is crazy. However, I find it frequently useful to
use ordered JSON as a data format and I think it would be handy if more
languages supported it. For one, it makes it a lot easier to write integration
tests using a “golden file” of ideal output, because your program that outputs
JSON now usually deterministically has one correct output. For two, it lets
you hash a json-encoded object deterministically.
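
A rough sketch of both uses with Python's standard `json` module, assuming sorted keys and fixed separators are enough for the data at hand. The helper names `canonical_dumps` and `stable_digest` are made up here, and this is not a full canonicalization scheme (number formatting, for instance, is left to the library).

```python
import hashlib
import json


def canonical_dumps(obj) -> str:
    # sort_keys orders object members by name at every nesting level;
    # fixed separators remove whitespace variation.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def stable_digest(obj) -> str:
    # Hash the deterministic text form rather than the in-memory object.
    return hashlib.sha256(canonical_dumps(obj).encode("utf-8")).hexdigest()


# Two hypothetical objects with equal content but different member order.
a = {"name": "x", "id": 1, "tags": {"b": 2, "a": 1}}
b = {"tags": {"a": 1, "b": 2}, "id": 1, "name": "x"}

print(canonical_dumps(a))                   # {"id":1,"name":"x","tags":{"a":1,"b":2}}
print(stable_digest(a) == stable_digest(b)) # True
```

The same `canonical_dumps` output can be written to a golden file and compared with a plain text diff.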

~~~
hyperpallium
OK. I _totally_ prefer ordered JSON, because it is so much easier to eyeball -
to visually compare JSON with different ordered keys is quite a lot more
difficult (O() complexity?) than if they are in the same order. It also
enables _diff_ to help see _where_ the differences are (diagnosis, not just
binary identical or not).

And, in fact, I do use ordered JSON for comparison in testing, as you
describe.

 _However_... comparison of JSON as objects (i.e. in memory) is order
independent. Hashcode is also order independent (the trick is to sum the
elements' hashcodes, e.g.
[https://docs.oracle.com/javase/7/docs/api/java/util/Set.html...](https://docs.oracle.com/javase/7/docs/api/java/util/Set.html#hashCode\(\))).
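
The summing trick can be sketched in Python as well. This is an illustration under assumptions, not the Java implementation linked above: `order_independent_hash` is a made-up helper, and because Python randomizes string hashes per process, the values are only comparable within one run.

```python
import json


def order_independent_hash(value) -> int:
    """Hash parsed JSON so that object member order does not matter."""
    if isinstance(value, dict):
        # Addition is commutative, so summing the per-member hashes gives
        # the same total for any member order.
        return sum(hash((k, order_independent_hash(v))) for k, v in value.items())
    if isinstance(value, list):
        # Array order is significant, so hash the elements as a tuple.
        return hash(tuple(order_independent_hash(v) for v in value))
    return hash(value)  # str, int, float, bool, None


x = json.loads('{"a": 1, "b": [1, 2], "c": {"d": null, "e": true}}')
y = json.loads('{"c": {"e": true, "d": null}, "b": [1, 2], "a": 1}')

print(x == y)                                                # True: dict equality ignores order
print(order_independent_hash(x) == order_independent_hash(y))  # True
```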

Diff for ordered JSON is possible using longest common subsequence for trees,
but has terrible complexity, and lacks diff's clever optimizations both
general and specific to typical input.

~~~
crazygringo
+1 for all reasons above for ordered JSON, highly convenient in practice.

And if it doesn't impact performance significantly, these are all pretty good
reasons for JSON outputters to default to sorting objects deterministically by
keys, or at least to provide a flag to do so. (Even if there's no canonical
sort order between JSON libraries, all that matters is it's deterministic for
each library.)

BUT... I can't imagine any scenario where you'd want to _validate_ that JSON
content was ordered on the _input_ , which is what was enabled in this
article. Why does IBM even have that as an option?!

Be strict in what you emit and liberal in what you accept, and all that...

~~~
hyperpallium
I guess because OrderedJSONObject is ordered, not sorted. Like java's
LinkedHashSet, it maintains the order keys are added.

If you're going to rely on a specific order for comparisons, it makes sense to
alert the user to any JSON in a different order (instead of silently,
liberally accepting it), or you'll get false negatives elsewhere. Easier to
check for a sorted order, but also possible to define a specific order. IDK
what IBM did here.
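
For illustration only, here is how such an input check could look with Python's `json.loads` and its `object_pairs_hook` parameter. It says nothing about how IBM implemented the option; `require_sorted` and `require_order` are hypothetical helpers.

```python
import json


def require_sorted(pairs):
    """Reject any object whose member names are not in sorted order."""
    names = [name for name, _ in pairs]
    if names != sorted(names):
        raise ValueError(f"members not sorted: {names}")
    return dict(pairs)


def require_order(expected):
    """Build a hook that enforces one specific member order.

    Members may be omitted, but names outside `expected` are rejected too.
    """
    def hook(pairs):
        names = [name for name, _ in pairs]
        if names != [n for n in expected if n in names]:
            raise ValueError(f"unexpected member order: {names}")
        return dict(pairs)
    return hook


# The hook runs for every object in the document, nested ones included.
print(json.loads('{"a": 1, "b": 2}', object_pairs_hook=require_sorted))

try:
    json.loads('{"b": 2, "a": 1}', object_pairs_hook=require_sorted)
except ValueError as err:
    print("rejected:", err)
```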

funfact: jq used to sort keys; now it retains ordering.

~~~
diroussel
Indeed, and remember this JSON message is going to a mainframe. Mainframes
don't have much memory and typically process record-by-record, or
event-by-event. So the implementation probably streams the JSON in and
constructs a COPYBOOK from the payload before continuing to invoke the COBOL.

So the rework time might be to write a general purpose re-order layer that can
re-order any incoming message.

------
mirimir
> Much of the system was so old that it was hard to find anyone who knew how
> it actually worked, and it’s maintenance had been outsourced to some of the
> typical IT outsourcing companies of the time.

Some years ago, I worked on an antitrust case involving a firm that had
outsourced relevant systems. Multiple times, to different IT firms. And there
was literally nobody left who knew how they worked.

After considerable negotiation, they agreed to provide documentation. And what
that ended up being was a report by IT company 2 about their understanding of
what IT company 1 had done with the firm's systems. Because, I gather, IT
company 1 had evaporated.

And yes, the core of it was COBOL.

------
cryptonector
Ha! Stephen Dolan had to add this to jq quite some time back, making it
preserve object key input order on output. And yes, it's infuriating, but it's
also somewhat convenient, and yes, there really is software out there that
cares about object key order (sigh).

~~~
aflag
Doesn't anything using jwt depend on a specific order?

~~~
scandinavian
It shouldn't, unless you base64 decode the header, then parse it with a
library that causes the order to change, encode it again, and then use your
own re-encoding to calculate the signature:

    
    
      HMAC-SHA256(
        b64(reencoded_header) + '.' + b64(payload),
        secret
      )
    

You should really just verify the signature for the provided header + payload
in their base64 encoded form.
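
A minimal HS256 sketch of that advice using only the Python standard library. The function names are invented for this example, and it deliberately skips everything a real JWT library does besides the signature (checking `alg`, expiry, and so on), so treat it as an illustration rather than a verifier to deploy.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    # URL-safe base64 without padding, as used for JWT segments.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(header: dict, payload: dict, secret: bytes) -> str:
    signing_input = (
        b64url(json.dumps(header).encode("utf-8"))
        + "."
        + b64url(json.dumps(payload).encode("utf-8"))
    )
    sig = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)


def verify_hs256(token: str, secret: bytes) -> bool:
    # Use the header and payload segments exactly as received; never
    # decode and re-encode them before computing the MAC.
    signing_input, _, sig_b64 = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected).encode("ascii"), sig_b64.encode("utf-8"))


secret = b"demo-secret"  # hypothetical key
# Header members are deliberately not in alphabetical order.
token = sign_hs256({"typ": "JWT", "alg": "HS256"}, {"sub": "42"}, secret)
print(verify_hs256(token, secret))  # True
```

Because verification works on the received bytes, the order of members inside the header or payload never enters into it.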

------
firefoxd
Ha. Creating a layer on top of broken APIs is a thriving business. Every
single carrier company has a broken API. That's why companies like aftership
exist.

One example I struggled with for days recently was USPS. Not only do they use XML
in the url parameter, the order of the elements also matters. Unfortunately,
the order in the documentation is incorrect.

~~~
dnautics
Hell even Amazon's ec2 spec is confusing in some places (the boto python call
parameter doesn't always match the restful call, which is less documented,
leading to errors in some libraries).

------
beached_whale
I wrote a json serialization/deserialization library for C++ and if you
provide the members in the order they are specified in the type you get better
performance. It can construct the class without having to bounce the parser
back to that location and it is much much more cache friendly.

~~~
couchand
I also would much rather write code that doesn't quite do what it's supposed
to if it's easier to do and I can take an extra day off. However, that's not
my job.

~~~
beached_whale
It's an interesting problem. So if all the parser does is put them into a
variant like structure, sure whatever. But when you go to put them into the
reified data types you would be wasting a lot of resources to parse JSON to an
intermediary data structure and then request the members. I parse them
directly to their final classes. So I was left with a choice and both have a
cost. Parse the json in-order of the file and store the concrete type to grab
it later, or store the position/size of that part and move to the next
constructing them in the order needed. This was cheaper. It will parse the
JSON in whatever order, just the performance can be impacted by the data
ordering as many parsers are.

~~~
couchand
Is your parser general-purpose? Is it template-based? Is the performance
variance due only to the impact on the processor's cache or are there other
factors? Is the code open-source, maybe I could just look myself?

~~~
beached_whale
It is template based so that each type's JSON parser is statically known and to
give the compiler more opportunity to optimize.
[https://github.com/beached/daw_json_link](https://github.com/beached/daw_json_link)

I haven't had a chance to look at why yet.

------
jmiserez
I just recently dealt with a JS library that was expecting object properties
to be in a certain order ("works" in ES2015 [1]), and that object was loaded
via JSON. Was a huge pain getting the object in the required order.

[1] [https://stackoverflow.com/questions/5525795/does-
javascript-...](https://stackoverflow.com/questions/5525795/does-javascript-
guarantee-object-property-order)

------
jrochkind1
How were they going to charge him for 9 FTEs for 6 months to uncheck a
checkbox?

~~~
duxup
Write some software that reordered it probably.

~~~
jeremy_wiebe
Yep. Probably IBM Data Transform Services™️ installed in the cloud plus a
custom plugin written to do the ordering. That’s a few folks to deploy and
configure the server. A few more for the engineering. Oh and there’s probably
a support contract to support and maintain this new “solution”.

------
pavel_lishin
For what it's worth, I do wish it was trivially easy to customize things like
error log output, so that I could say that I _always_ want the timestamp
first, followed by severity, followed by whatever else. Our logs are annoying
to parse when debugging things locally.
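
One way this could be done with Python's standard `logging` module, as a sketch only: `OrderedJsonFormatter` and the chosen field names are assumptions for the example, not part of any particular logging setup.

```python
import json
import logging


class OrderedJsonFormatter(logging.Formatter):
    """Emit one JSON object per record with a fixed member order."""

    def format(self, record: logging.LogRecord) -> str:
        # Python dicts keep insertion order and json.dumps writes members
        # in that order, so timestamp and severity always come first.
        entry = {
            "timestamp": self.formatTime(record),
            "severity": record.levelname,
            "message": record.getMessage(),
            "logger": record.name,
        }
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(OrderedJsonFormatter())
log = logging.getLogger("demo")  # hypothetical logger name
log.addHandler(handler)
log.error("payment %s failed", "p-17")
```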

~~~
tuananh
all the modern logging libraries do this now i think.

and with an app like fluentd, we can decorate it with whatever metadata we want
(instance name, machine type, container name, etc..)

------
rurban
Writers should produce sorted maps, readers need to accept unsorted maps.

It's a security issue, not just convenience. With unsorted maps the internal
hash seed can be exposed, together with timing information.

Another famous omission from the specs.

~~~
AgentOrange1234
I’m no security expert. How is “exposing the hash seed” a problem for the vast
majority of applications? What timing information would be leaked and why
would that be problem?

On the other hand, accepting unsorted maps seems like it could introduce
covert channels?

~~~
j88439h84
[https://bugzilla.redhat.com/show_bug.cgi?id=750555](https://bugzilla.redhat.com/show_bug.cgi?id=750555)

This led to hash randomization on by default since Python 3.3.

~~~
AgentOrange1234
Cool. Thanks.

------
m463
This makes my head hurt. I wonder if you end up with sorting differences
depending on the locale?

~~~
wallyowen
It's not sorted, but ordered: The order emitted is the order entered.

------
peteforde
I built the first popular sports "app" for the original iPhone, back when Jobs
claimed that there was no need to write native code.

The backend polled an API that served XML updates on game scores etc. Note
that I didn't previously know anything about baseball (and I still don't, not
really).

So let's say that a baseball team scores a Double. We'd see some XML like
<doubles><double/><double/></doubles>.

Now, suppose a team scores a triple... You know that there's a
<triples></triples> entity. What would you expect to see inside the node?

If you're me, you'd expect to parse <triples><triple/><triple/></triples>.

I got the call during dinner. There was an important game happening, and the
app suddenly broke. People were uptight. They loved our app and they were
complaining.

In the end, it turns out that we would need to process
<triples><double/><double/></triples>. Why? "Oh, it's always been that way."
(Says the brusque developer at the service charging $50k/month for access to
this feed.)

------
tosh
recent twitter thread referenced in the article:
[https://twitter.com/therealfitz/status/1161349619659530242?s...](https://twitter.com/therealfitz/status/1161349619659530242?s=20)

~~~
wallyowen
It's not sorted, but ordered: The order emitted is the order entered.

~~~
wallyowen
Sorry, that was intended to be a reply to the prev. comment.

------
eridius
The tweet at the top of the article says

> _This so far out of the spec it makes my ankles hurt._

This is not in fact out of spec. The JSON spec does not define semantics here
but instead quite explicitly leaves it up to the JSON processor and data
interchange spec for what to do about ordering of objects.

~~~
dtech
> This is not in fact out of spec. The JSON spec does not define semantics
> [...] for what to do about ordering of objects.

What? It is literally on the first page of json.org and section 1 of RFC 7159
[1]

> An object is an _unordered_ set of name/value pairs.

> An object is an _unordered_ collection of zero or more name/value pairs

[1]
[https://tools.ietf.org/html/rfc7159#section-1](https://tools.ietf.org/html/rfc7159#section-1)

~~~
patrickthebold
There are multiple standards. [https://www.ecma-
international.org/publications/standards/Ec...](https://www.ecma-
international.org/publications/standards/Ecma-404.htm) Is more relaxed.

>The JSON syntax does not impose any restrictions on the strings used as
names, does not require that name strings be unique, and does not assign any
significance to the ordering of name/value pairs. These are all semantic
considerations that may be defined by JSON processors or in specifications
defining specific uses of JSON for data interchange.

Also it's pretty clear that the keys must have an order going over the wire.
And JavaScript objects (which aren't unrelated to JSON) have an order now
[https://www.stefanjudis.com/today-i-learned/property-
order-i...](https://www.stefanjudis.com/today-i-learned/property-order-is-
predictable-in-javascript-objects-since-es2015/)

------
lunias
I've been asked by architects to order JSON properties. It's fine until you
depend on it. i.e. accessing a map as you would an array

[https://fasterxml.github.io/jackson-
annotations/javadoc/2.3....](https://fasterxml.github.io/jackson-
annotations/javadoc/2.3.0/com/fasterxml/jackson/annotation/JsonPropertyOrder.html)

------
cypressious
I have written an app for the menu of my university's canteen. The JSON API
that I used returned the meals as fields of a document where the order of the
fields was the actual display order. It took me a while to even find an
implementation of a JSON parser that keeps the order of the fields.

~~~
hobofan
I would assume that any decent JSON parser has at least the option to keep the
order of the fields. Especially for CLI tools that automatically add fields to
a JSON file that's usually edited by a user e.g. package.json, I feel that it's
absolutely crucial.
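
As one data point, Python's standard `json` module behaves this way: since Python 3.7 the `dict` it returns keeps members in document order, and `json.dumps` writes them back in that order. A small sketch with a made-up package.json-like document:

```python
import json

# Hypothetical file content; the member names are only for illustration.
original = '{"name": "demo", "version": "1.0.0", "scripts": {"test": "pytest"}}'

doc = json.loads(original)   # dict preserves document order (Python 3.7+)
doc["license"] = "MIT"       # a new member is appended at the end
updated = json.dumps(doc)    # members are written in dict order

print(list(doc))  # ['name', 'version', 'scripts', 'license']
print(updated)
```

On older Python versions the same effect needs `object_pairs_hook=collections.OrderedDict` passed to `json.loads`.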

------
bullen
This is an omission from not only the JSON standard but also the JavaScript
language, JavaScript has no map order and the rest is history.

The result is that all implementations are broken, f.ex. this is how a tree
structure has to be implemented with JSON:

[http://root.rupy.se/meta/user/task/eelzter/44953781393584543...](http://root.rupy.se/meta/user/task/eelzter/4495378139358454353?sort=2)

It's inefficient, ugly and just wrong.

------
flyinfungi
Sounds like normal life to me. Please give me $$$$

------
al_form2000
"Why would you care what order we were sending the name value pairs for this?"
The article makes it sound like it's somehow IBM's fault. However, questions
about preserving order in hashes/associative arrays/maps/whatchamacallit span
at least thirty years. Some applications of the format even demand it (e.g.
ansible uses YAML - a superset of JSON - or even straight JSON for its
playbooks; emitting them out of order is not an option).

Every time the issue surfaces it elicits a large amount of eye rolling among
the cognoscenti. However, I'm a firm believer in the idea that recurring
demands from the user base need to be addressed, rather than mocked. The
coding community appears to me notably tone deaf on this.

'Course anybody can use <insert suitable technique> to send a JSON map in the
desired order, except the parser on the other end will blissfully disregard
it. Or you can devise any ordered solution on both ends, which will leave you
outside of the accepted standard and open you up to the situation described in
the article (where the requirement may well have been frivolous).

I can remember very few - if any - instances of somebody implying that the
standard should somehow make room for this kind of scenario. The standard
reply was "use a different format" or "change the requirement". (Somehow
reminds me of people asking what one can do to have whitespace-preserving
XML, and at least one amusing story about that)

~~~
viraptor
Json has a perfectly fine order preserving key-value map:

    
    
        [["Key1", "value1"], ["key2", ...
    

You just need to deserialise it into a specific collection on the receiver.
It's within standards and there are no weird parser issues.
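
A short sketch of that round trip in Python. The snippet above is elided, so the values here are invented placeholders, and `OrderedDict` is just one possible "specific collection" on the receiving side.

```python
import json
from collections import OrderedDict

# Hypothetical wire format: an array of [name, value] pairs.
wire = '[["Key1", "value1"], ["key2", "value2"], ["a", "value3"]]'

pairs = json.loads(wire)                           # JSON arrays are ordered
ordered = OrderedDict((k, v) for k, v in pairs)    # order-preserving map

# Serializing back as pairs reproduces the original order.
back = json.dumps([[k, v] for k, v in ordered.items()])

print(list(ordered))  # ['Key1', 'key2', 'a']
print(back == wire)   # True
```

Since arrays are ordered in every JSON specification, no parser option or non-standard behaviour is involved.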

~~~
al_form2000
That's a way. A way that, IMHO, vastly diminishes its expressive power on the
reader's side, which I do not consider a very good thing. Most solutions to
the issue are of this nature.

Consider serialized JSON already has a natural ordering, which is thrown away
for some reason (probably parsing convenience).

~~~
viraptor
> which is thrown away for some reason (probably parsing convenience).

It's got nothing to do with parsing. The json comes from JavaScript syntax,
where objects represent unordered mapping. Json naturally does the same.

If you want expressive power for reading, json is very poor in comparison to
pretty much everything else. Just use it for simple serialisation.

~~~
al_form2000
Granted, it came from there, but that was back in the days of map=eval(json)
and they're gone. There is nothing in json the format (as opposed to json the
language construct) to impose the unordered behavior. (Or, for that matter,
the 'no comments' bit).

~~~
adrianmsmith
> There is nothing in json the format [..] to impose the unordered behavior

That's not true. The spec at [http://www.json.org/](http://www.json.org/) says
"An object is an unordered set of name/value pairs".

~~~
al_form2000
Yes, that's in the specs. But what I meant is that there is nothing intrinsic
to the format demanding unordered behavior.

