As I worked on it, though, I realized it might be of general interest to people. Hence the example in the README of piping JSON through multiple tools without generating trivial changes that mess up diffs.
Most of the decisions I made were clear: no insignificant whitespace, object keys must be ordered, etc.
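To make those two rules concrete, here is a rough sketch using Python's standard `json` module. It is not Son's serializer, and the helper name `canonical` is made up for illustration. Python's encoder will still emit things Son forbids, such as `1.0` or exponent notation, so this only approximates the whitespace and key-ordering rules.

```python
import json


def canonical(text: str) -> str:
    """Re-serialize JSON with sorted keys and no insignificant whitespace.

    Hypothetical helper: approximates two of Son's rules, nothing more.
    """
    return json.dumps(
        json.loads(text),
        sort_keys=True,          # object keys in a fixed order
        separators=(",", ":"),   # no spaces after ',' or ':'
        ensure_ascii=False,      # leave non-ASCII characters unescaped
    )


# Two documents that differ only in whitespace and key order...
a = '{"b": 1, "a": [1, 2, 3]}'
b = '{\n  "a": [1,2,3],\n  "b": 1\n}'

# ...collapse to the same bytes, so a diff between them is empty.
assert canonical(a) == canonical(b) == '{"a":[1,2,3],"b":1}'
```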
There are two things I'm still not sure about:
+ Son doesn't provide escape sequences for any Unicode character that JSON allows to be written unescaped. This includes U+007f (ASCII "delete"). Will that cause a problem for many programs? JSON requires all the other ASCII control characters to be escaped; U+007f is the only one left out.
+ Son doesn't allow trailing zeros in fractions. This means you can't serialize `1.0`; you have to serialize it as `1`.
I was confident in the decision to take out scientific notation (it would be cool if JSON parsers actually treated numbers as being in scientific notation and tracked significant digits, but they don't, so I feel like that ship has sailed). Trailing zeros are different, though, because some JSON generators do use them to distinguish integers from fractions. The problem is that many parsers don't care about them, so those parsers end up tossing out information about documents. That means they can't serialize the documents faithfully again, and faithful re-serialization is the whole point of Son.
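The information loss is easy to reproduce. The sketch below uses Python's `json` module, which happens to keep the int/float distinction, and then simulates a single-number-type parser by passing `parse_int=float`. The function names are made up for illustration; the second one is only a stand-in for parsers that read every number as a double.

```python
import json


def roundtrip_two_number_types(text: str) -> str:
    # Python's json keeps `1` as int and `1.0` as float.
    return json.dumps(json.loads(text), separators=(",", ":"))


def roundtrip_single_number_type(text: str) -> str:
    # Assumption for illustration: model a parser with one number type
    # by reading every integer literal as a float.
    return json.dumps(json.loads(text, parse_int=float), separators=(",", ":"))


doc = "[1,1.0]"

# The distinction survives when the parser tracks it...
assert roundtrip_two_number_types(doc) == "[1,1.0]"

# ...and is gone when it doesn't: both elements come back identical.
assert roundtrip_single_number_type(doc) == "[1.0,1.0]"
```

Either way, once a tool in the pipeline collapses the two spellings, nothing downstream can recover which one the original document used.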
1. Keep 1.0 as a special case to maintain the int/float distinction (it's a float; calling it a fraction is kind-of-a-lie).
2. Refuse to handle floats at all, at which point people can pass [ <mantissa>, <exponent> ] for reals or [ <numerator>, <denominator> ] for rationals.
There is of course (3), "build a compliance suite and claim the parsers that toss out information are Incorrect", but that doesn't seem compatible with your Postel-ish goals.
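Option 2 could look something like the sketch below, using Python's `fractions` and `decimal` modules. The function names are hypothetical, the exponent is assumed to be base 10 (the suggestion doesn't say), and only finite values are handled.

```python
import json
from decimal import Decimal
from fractions import Fraction


def encode_rational(value: Fraction) -> str:
    # Fraction is always stored in lowest terms, so the pair is unique.
    return json.dumps([value.numerator, value.denominator], separators=(",", ":"))


def decode_rational(text: str) -> Fraction:
    numerator, denominator = json.loads(text)
    return Fraction(numerator, denominator)


def encode_decimal(value: Decimal) -> str:
    # normalize() strips trailing zeros, so 1.0 and 1 get the same pair.
    sign, digits, exponent = value.normalize().as_tuple()
    mantissa = int("".join(str(d) for d in digits))
    if sign:
        mantissa = -mantissa
    return json.dumps([mantissa, exponent], separators=(",", ":"))


def decode_decimal(text: str) -> Decimal:
    mantissa, exponent = json.loads(text)
    return Decimal(mantissa).scaleb(exponent)  # mantissa * 10**exponent


assert encode_rational(Fraction(2, 4)) == "[1,2]"
assert decode_rational("[1,2]") == Fraction(1, 2)
assert encode_decimal(Decimal("12.5")) == "[125,-1]"
assert decode_decimal("[125,-1]") == Decimal("12.5")
```

Only integers ever appear in the serialized form, so the trailing-zero question never comes up.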
EDIT: Not meaning anything is a plus! But I see your point about being un-googleable.
There are some other edge cases that you aren't considering (or that I missed):
+ Min and max integer values: JavaScript has some pretty tight limits here, since its numbers are IEEE 754 doubles and integers are only exact up to 2^53 - 1.
+ Keys should be in lexicographic order: you might want to be more specific. Is there a permitted subset of Unicode? Which normalization rule must be used?
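Both edge cases can be shown in a few lines of Python. This is only an illustration of the ambiguity, not a claim about what Son specifies; the sample keys are arbitrary.

```python
import unicodedata

# --- Integer limits -------------------------------------------------
# A double has a 53-bit significand, so 2**53 + 1 is not representable.
MAX_SAFE_INTEGER = 2**53 - 1
big = 2**53 + 1
assert float(big) == float(2**53)   # silently rounded
assert int(float(big)) != big       # the original value is lost

# --- "Lexicographic" depends on the code unit -----------------------
# U+FF5E is in the BMP; U+1F600 needs a surrogate pair in UTF-16.
keys = ["\uff5e", "\U0001F600"]
by_code_point = sorted(keys)  # Python compares strings by code point
by_utf16_unit = sorted(keys, key=lambda s: s.encode("utf-16-be"))
assert by_code_point == ["\uff5e", "\U0001F600"]
assert by_utf16_unit == ["\U0001F600", "\uff5e"]  # opposite order

# --- Normalization changes the order too ----------------------------
composed = "\u00e9"      # 'é' as one code point
decomposed = "e\u0301"   # 'e' followed by a combining acute accent
assert composed != decomposed
assert unicodedata.normalize("NFC", decomposed) == composed
assert sorted(["f", composed]) == ["f", composed]        # U+00E9 sorts after 'f'
assert sorted(["f", decomposed]) == [decomposed, "f"]    # 'e' sorts before 'f'
```

So "ordered keys" only pins down one serialization once the comparison unit and the normalization form (or the absence of normalization) are both spelled out.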