Clojure is probably one of the best languages for application programming out there, and it pairs beautifully with Java when you need to go a few levels lower.
A small set of primitives
  eg writing an inspector GUI
  eg searching for references to some id
But still able to represent types and invariants
Able to reify changes as data
  eg for undo log
  eg for real-time collaboration
All data has some name/path/location by which it can be referred to
  eg no hidden state in closures
  eg no hidden closures in the event loop queues
Avoid depending on pointers for identity
Data notation:
A textual representation which is easy to read/write
Used consistently everywhere - one standard way of picturing data
Self-describing - doesn't require out-of-band type/schema
Uses layering to add capabilities while mimicking familiar notation
Uses shorthands and exploits context to reduce redundant information
  eg Clojure namespace aliases
  eg Unison names
Code:
The notation for code is a superset of the notation for data
  eg can print data and copy-paste into code / REPL
Can choose the mapping between tags in data notation and types in code
Code can be represented as data with low mental distance
The codebase is also data - can trivially analyze whole thing including dependencies without having to execute side effects
Maybe, if possible, reify the execution of code as data
Crucially, the data model and the data notation need to be co-designed, because it's so easy to make choices in the data model that prevent creating a good data notation later."
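To make the "reify changes as data" and "everything has a path" points above concrete, here is a minimal Python sketch of an undo log. The change format (`path`, `old`, `new`) and the helper names `get_in`, `apply_change` and `invert` are made up for illustration; they are not from the quoted text or any particular library.

```python
from copy import deepcopy

def get_in(data, path):
    """Follow a list of keys into nested dicts/lists and return what is there."""
    for key in path:
        data = data[key]
    return data

def apply_change(state, change):
    """Return a new state with the change applied; the change is plain data."""
    new_state = deepcopy(state)
    *parents, last = change["path"]
    get_in(new_state, parents)[last] = change["new"]
    return new_state

def invert(change):
    """Because a change is data, its inverse is just another piece of data."""
    return {"path": change["path"], "old": change["new"], "new": change["old"]}

state = {"users": {"u1": {"name": "bob"}}}
change = {"path": ["users", "u1", "name"], "old": "bob", "new": "alice"}

undo_log = []
state = apply_change(state, change)
undo_log.append(invert(change))

# Undo by applying the inverse change popped from the log.
state_after_undo = apply_change(state, undo_log.pop())
```

The same change records could in principle be sent to other clients for collaboration, since they refer to data by path rather than by pointer.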
Honestly, I don't see the issue with JSON. It is capturing user-generated content. It's not that '43' is logged as a string instead of an int - it is that '43' is the raw data in quotes. To me, that is in the same spirit as using "read" instead of "eval", as mentioned elsewhere. Yes, the read-print loop fails for JSON - but JSON only has this failing when you are working with code-generated values. At the end of the day, a user typed the 4 and 3 keys on their keyboard and that was captured. To say it is an int or a str or whatever brings back the need to understand memory representations.
For example, when parsing JSON with Python, you can apply the same principles you would to Python objects. That is, assume the item is in the format you know it should be (or test it first to be safe).
So even though the JSON is {"43": ["bob", "alice"]}, you can do an int() cast if you need to do something with that data that requires it to have a type. Otherwise it is represented as it was typed.
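A rough sketch of what I mean, using only the standard `json` module. The helper `as_int` and the variable names are just for illustration.

```python
import json

raw = '{"43": ["bob", "alice"]}'
parsed = json.loads(raw)

# JSON object keys are always strings, so the key arrives exactly as typed.
key = next(iter(parsed))

def as_int(text):
    """Return int(text), or None if the text is not an integer literal."""
    try:
        return int(text)
    except ValueError:
        return None

# Only cast at the point where a number is actually needed.
number = as_int(key)
next_number = number + 1 if number is not None else None
```

Until the cast, the value stays in the form it was entered in, and the "test it first" step is the `try`/`except` inside `as_int`.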
So, JSON has non-string values in other positions (as elements of arrays, or values in an object). Wouldn't your argument also lead to the conclusion that we don't need numbers at all, since we could get by with:
{
  "foo": "42",
  "bar": ["1", "2", "3"]
}
There's also the issue of values with multiple equivalent string representations. I want 42.1 to equal 42.10 and 42.100. I also want {"foo":1,"bar":2} to equal {"bar":2,"foo":1}, but with just strings you don't get that.
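A small Python illustration of the difference, using the standard `json` and `decimal` modules (the variable names are only for this sketch):

```python
import json
from decimal import Decimal

a = '{"foo":1,"bar":2}'
b = '{"bar":2,"foo":1}'

# Compared as raw text, key order matters.
strings_equal = (a == b)

# Compared as parsed objects, key order does not matter.
objects_equal = (json.loads(a) == json.loads(b))

# The same holds for numbers: different spellings, one value.
text_equal = ("42.1" == "42.10")
numbers_equal = (Decimal("42.1") == Decimal("42.10") == Decimal("42.100"))
```

Here `strings_equal` and `text_equal` come out false, while `objects_equal` and `numbers_equal` come out true.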
Good point, we could also expect {"foo": "42", "bar": "[1,2,3]"}. JSON does assume that the values have types (like a list) and that is inconsistent.
As for equivalent representations, I do not think what you want is universally applicable. 42.1 does not equal 42.10 until you use logic to rule that what you are working with are Numbers.
In the text of government regulations, for example, 42.10 could be 9 items after 42.1, and you might expect to see 42.1(a) and 42.10(a) as other items in the same set or in related value sets.
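To show how the same text gives different answers depending on which rule you apply, here is a minimal sketch. `as_number` and `as_section` are hypothetical helpers, and `as_section` assumes plain dotted identifiers without suffixes like "(a)".

```python
def as_number(text):
    """Interpret the text as a decimal number."""
    return float(text)

def as_section(text):
    """Interpret the text as a dotted section identifier, e.g. "42.10" -> (42, 10)."""
    return tuple(int(part) for part in text.split("."))

# As numbers, the two spellings are the same value.
same_number = as_number("42.1") == as_number("42.10")

# As section identifiers, they are different items, 9 apart.
same_section = as_section("42.1") == as_section("42.10")
distance = as_section("42.10")[1] - as_section("42.1")[1]
```

Nothing in the raw string tells you which of the two interpretations is the right one; that comes from context.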
Any way you cut it, the real problem seems to be that data entry assumes a certain amount of context, and those assumptions vary enough that they need to be handled differently when the data is consumed. Which makes sense.