Jid – Drill down JSON data incrementally (github.com)
196 points by jamslater on Dec 3, 2016 | 54 comments

Ok, I've just hacked a version of this that uses jq[0] underneath, so you can do all the sorts of fancy queries jid originally doesn't support: https://github.com/fiatjaf/jiq

[0]: https://stedolan.github.io/jq/manual/

In less than 24 hours since the post, you've managed to shoehorn jq in (about 6 hours of work, judging by the commits). Nice work! Really shows the power of OSS.

Bravo & Thanks! :D

The JSON crowd is re-inventing LISP. Originally, JSON was a subset of what you could pass to "eval()". Yesterday, state machine programming in JSON. Today, an inspector.

Any data serialization format which could be represented by S-Expressions could be said to be emulating Lisp.

Doesn't mean they are.

But yes, I agree with the core of your comment, though I usually harken to XML for a more visceral reaction. First there was a self-formatted schema and tooling to verify it. Then an RPC specification. Followed by parsers built to fit special needs (such as comments). Inspectors, transformers, pretty printers, incremental parsers, bash integrations, namespaces...

Programming history is definitely cyclical.

I expect that will eventually happen whenever you have a popular enough data format and stick with it long enough. All the same tools get reinvented. Start over with a new format and the tools all have to be rebuilt from scratch.

But this isn't to say that new languages should never be invented. Just that there's a lot more work involved than anyone would expect.

That's an interesting way of viewing it; I guess it applies to JavaScript as well. There is a lot of discussion about how the JavaScript community is re-inventing the wheel, but I guess once a language gains enough momentum, re-tooling is inevitable.

Except JSON has a huge advantage Lisp doesn't: it's close to, or in some cases identical to, the data-structure syntax of several popular programming languages. Lisp's particular s-expression syntax is not close to or identical to any programming language other than Lisp.

In other words: once you decide you don't want to program in Lisp, Lisp's way of representing data stops being a useful syntax for representing your data. Undervaluing this realization (and worse, reacting smugly when interacting with people who don't share your undervaluing of this) is a big part of the gulf between Lispers and the rest of the world.

> not close to or identical to any programming language other than Lisp.

It's close to the Lisps, Scheme, Racket, Clojure, and many others.

You are correct that the popularity of Ruby, Python, and JavaScript has eclipsed the popularity of these languages, and so the popularity of JSON has eclipsed that of s-expressions.

Scheme, Racket and Clojure are all Lisp-family languages if not arguably Lisps in and of themselves.

By that token, JavaScript and PHP are C-family languages


I don't see where checkbox ticking will get us.

They are C-family languages.

It should be unsurprising that languages which have adopted Lisp's syntax/programming model to a larger degree will also naturally find Lisp's syntax/programming model to work for them, while languages that haven't... won't.


> In other words: once you decide you don't want to program in Lisp, Lisp's way of representing data stops being a useful syntax for representing your data.

Even if you code in C++, you can find Lisp's surface syntax simple enough to read and write. Now, if your language already defines a "read" function, sure, go with it.

As I said: if I wanted to write Lisp I'd write Lisp. I've already chosen not to write Lisp. I have no incentive to make my language parse Lisp when something much closer to my preferred language's data structure syntax is available.

The mere fact that a solution exists for a problem does not mean it is the best possible solution for all possible people in all possible cases, nor that there can never be a reason to develop or prefer another solution.

I didn't say it is the best solution in all possible cases. Just that even if you choose not to write Lisp, using a data format that follows Lisp's approach is not a stupid thing to do.

> Undervaluing this realization (and worse, reacting smugly when interacting with people who don't share your undervaluing of this) is a big part of the gulf between Lispers and the rest of the world.

You are the one setting the tone with your exclusionary rhetoric.

> You are the one setting the tone with your exclusionary rhetoric.

Except this is basically how Lisp people react to things. "Oh, why didn't they use Lisp's version of this, it's existed for decades" is a common -- and, let's face it, "Smug Lisp Weenie" exists as a phrase for a reason -- reaction. Lisp people believe Lisp is the greatest thing since sliced bread. It works for them. And that's not a bad thing!

What is a bad thing is the endless haranguing of everyone else about why we won't just do everything the Lisp way. Data serialization formats tend to bring this tendency out pretty badly, as the Lispers show up to essentially ask why anyone bothered to invent anything else after s-expressions.

A new formulation of Greenspun's tenth rule, perhaps. [0]

[0]: https://en.wikipedia.org/wiki/Greenspun's_tenth_rule

If LISP had a widely-accepted standard, non-turing-complete subset to represent the equivalent persisted data structures JSON does, it would still be less readable than JSON.

There are reasons JSON is the lingua franca of modern API interaction and LISP is not.

Whatever example I give you're just going to complain that it's not widely accepted because it's not JSON/YAML/XML. Since just about every scripting language uses similar syntax to JSON there's an argument to be made that familiarity is the most important characteristic. However, that doesn't mean that LISP is unfit for the job from a technical perspective. If we're going to start evaluating JSON as code then there's a strong argument to be made that a LISP is better suited since that's one of its core principles and is already well developed.

      (first-name john)
      (last-name smith)
      (age 23)
      (parents jane jim)

      "first name": "john",
      "last name": "smith",
      "age": 23,
      "parents": ["jim", "jane"]

Might change that a bit to more accurately represent the same type of objects described in your JSON.

      (first-name "john")
      (last-name "smith")
      (age 23)
      (parents ("jane" "jim"))

Of course, for a pure set of such simply formatted data (hashes, lists, strings, numbers), you could use almost any data serialization format. S-Exs don't really offer anything special here that can't also be done in XML, MessagePack, SQL, protobuf, bencode, thrift, or any of a dozen other serialization formats.

Then again, JSON doesn't really bring much special to the table in this case either (other than having an encoder built into most JavaScript interpreters).
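To make the "almost any format would do" point concrete, here is a minimal Python sketch that reads the S-expression record above and the equivalent JSON into the same structure. The `parse_sexp` helper is hypothetical and deliberately tiny: it only handles parentheses, double-quoted strings, integers and bare symbols, it keeps symbols as plain strings, and it assumes well-formed input. The JSON keys are spelled with hyphens here so that the two records line up.

```python
import json
import re

# Tokens: "(", ")", a double-quoted string, or a bare symbol/number.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()"]+')

def parse_sexp(text):
    """Parse zero or more S-expressions into nested Python lists (sketch only)."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # consume the closing ")"
            return items
        if tok.startswith('"'):
            return json.loads(tok)  # reuse JSON's string-literal rules
        try:
            return int(tok)
        except ValueError:
            return tok  # a symbol, kept as a plain str in this sketch

    forms = []
    while pos < len(tokens):
        forms.append(read())
    return forms

sexp_text = '''
(first-name "john")
(last-name "smith")
(age 23)
(parents ("jane" "jim"))
'''

json_text = '''
{"first-name": "john", "last-name": "smith", "age": 23, "parents": ["jane", "jim"]}
'''

# Each top-level form is a (key value) pair, so it unpacks into a dict entry.
record = {key: value for key, value in parse_sexp(sexp_text)}
from_json = json.loads(json_text)

print(record == from_json)  # True
```

Note that the sketch loses the symbol/string distinction that a real Lisp reader would preserve, which is part of what the rest of this subthread argues about.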

How do I know that ("jane" "jim") is a list (like ["jane", "jim"] in JSON) rather than an associative array (like {"jane": "jim"} in JSON)? S-exs can represent the difference certainly, but not in a way that is as clearly readable by humans.

The difference between lists and associative arrays is quite unambiguous and readable in JSON. This helps make the language editable by non-expert users, which is a critical benefit.
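As a small illustration of that claim (a sketch using Python's standard `json` module, not anything from jid or jq): the bracket style alone tells the parser which container to build, so no schema or convention is needed.

```python
import json

as_list = json.loads('["jane", "jim"]')   # square brackets: a list
as_map = json.loads('{"jane": "jim"}')    # curly braces: an associative array

print(type(as_list).__name__)  # list
print(type(as_map).__name__)   # dict
```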

In general if you're making an alist in Lisp then you're going to use keys for which comparisons are cheap; you would use atoms instead of strings as the keys: '((name "Jane") ...) instead of '(("name" "Jane") ...).

Strings must be compared byte-by-byte, atoms are interned and compared by pointer identity.

In fact, in some Lisps (such as Common Lisp) you would probably use a special type of atom called a keyword. Keywords are just atoms with a colon at the front to indicate that they belong to the special KEYWORD package; code in two different packages can then compare them without any namespacing fuss. It would look like '((:name "Jane") ...).

Lisp just takes the logic out of the syntax and puts it into the semantics of the language.

What you call 'atom' is actually called 'symbol' in Lisp. Keywords in Common Lisp are a subset of symbols: those which are in the package 'KEYWORD'. Keywords also have themselves as their value, so they evaluate to themselves.

'atom' means something else: 'not cons'. Anything that is not a cons cell (the two-pointer building block of linked lists) is an atom: numbers, characters, strings, symbols, arrays, ... Thus a string is an atom, too.

    CL-USER 1 > (typep "foobar" 'atom)
    T

    CL-USER 2 > (typep 4 'atom)
    T

    CL-USER 3 > (typep 'foobar 'atom)
    T

    CL-USER 4 > (typep '(foo . bar) 'atom)
    NIL

True though, symbols (not atoms) are often used as keys for key/value data structures like assoc lists, property lists, hash tables, CLOS instances and others.

Common Lisp also allows symbols to have arbitrary names. It uses | and \ as escape characters in symbols.

    CL-USER 11 > '((person |Marvin Minsky|)
                   (|KNOWN FOR| |Artificial Intelligence Research|)
                   (lab |MIT AI Lab|)) 
    ((PERSON |Marvin Minsky|)
     (KNOWN\ FOR |Artificial Intelligence Research|)
     (LAB |MIT AI Lab|))

    CL-USER 12 > (setf mm *)
    ((PERSON |Marvin Minsky|)
     (KNOWN\ FOR |Artificial Intelligence Research|)
     (LAB |MIT AI Lab|))

    CL-USER 13 > (assoc '|KNOWN FOR| mm)
    (KNOWN\ FOR |Artificial Intelligence Research|)

Symbols are by default interned in a special data structure and are looked up at read-time. Thus they are compared by pointer identity. In Common Lisp this data structure is called a package and there can be more than one package.
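For readers who don't have a Lisp at hand, a rough analogy in Python is `sys.intern`. This is only a sketch of the idea of interning: Python has a single global intern table rather than packages, and Python strings are not Lisp symbols.

```python
import sys

# Build one key at run time so the two string objects start out distinct.
runtime_key = "".join(["known", "-", "for"])
literal_key = "known-for"

# Plain equality has to compare the characters.
print(runtime_key == literal_key)  # True

# Interning maps equal strings to one shared object,
# so an identity (pointer) comparison is enough afterwards.
a = sys.intern(runtime_key)
b = sys.intern(literal_key)
print(a is b)  # True
```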

Yes, "symbol" is the correct term. Must have been a brain fart.

You simply know that:

  ("jane" "jim")
is a list, because that's the overwhelming convention in Lisp dialects. A common notation for vectors, used in Scheme and Common Lisp, is the hash-left-paren:

   #(1 2 3) ;; unambiguously a vector
   #()      ;; empty vector
There is no standard for notating a hash table; it is dialect specific.

For instance, in the Racket language, they use #hash, #hasheq and #hasheqv prefixes: http://docs.racket-lang.org/reference/reader.html#%28part._p... The hash contents are using the same syntax as an association list (list of dotted pairs).

In TXR Lisp, I invented a different notation: a #H prefix which is followed by a compound S-expression. The first element of the S-expression is a list of optional attributes of the hash, and the remaining elements are the key-value pairs. For example:

  #H((:equal-based) ("jane" "jim") ("alice" "bob"))

  #H(() (:jane 3) (:jim 5) (:alice 7))
equal-based is needed if the keys are aggregate objects like strings, lists and vectors.

In Racket these would be:

  #hash(("jane" . "jim") ("alice" . "bob"))

  #hasheq((#:jane . 3) (#:jim . 5) (#:alice . 7)) ;; keywords are #: in Racket
If you want interoperable S-exps that include literals for associative arrays, you have to pick the way some dialect does it, or roll your own, and customize the reader accordingly.

Or else, a decent, portable solution for general data exchange is just to use association lists, like this, which use a notation that is common to many Lisp dialects:

  (("jane" . "jim") ("alice" . "bob"))
Though it does not denote an optimized associative array structure, the syntax (at least when it is not empty) clearly conveys that it is associative. The software which processes this can convert it to a hash table, knowing that an association list is expected for that datum in that position of the syntax. That is to say, we use the representation of an inefficient association list (which, if used "as is" the way it comes out of the parser, implies linear searching for keys) and let the applications optimize that as they see fit.
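The "parse as an association list, optimize later" idea can be sketched in Python, with a list of tuples standing in for the parsed alist. The `assoc` helper below is a hypothetical stand-in that mimics the linear search of Lisp's `assoc`; `dict` plays the role of the hash table.

```python
# What the parser hands back: an ordered list of (key, value) pairs.
pairs = [("jane", "jim"), ("alice", "bob")]

def assoc(key, alist):
    """Linear search: return the first pair whose key matches, or None."""
    for pair in alist:
        if pair[0] == key:
            return pair
    return None

# The application knows an association list is expected at this position,
# so it can convert it to a hash table for constant-time lookups.
table = dict(pairs)

print(assoc("alice", pairs))  # ('alice', 'bob')
print(table["alice"])         # bob
```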

Just a reminder: https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-....

Don't rely on first and last names. Also, the age keeps changing, so store a date of birth instead (unless the age is what the user inputs, and is later converted to a date).

    (#1=(person ...)
     #2=(person ...)
     (person
       :names ("John Smith")
       :date-of-birth @1993-10-04T05:03:00.000000+01:00
       :parents (#1# #2#)))
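The reason for storing a date rather than an age is that the age can always be derived when needed. A minimal Python sketch, where `age_on` is a hypothetical helper and the dates are taken from the example above and from this thread's posting date:

```python
from datetime import date

def age_on(date_of_birth, today):
    """Whole years elapsed between date_of_birth and today."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

dob = date(1993, 10, 4)
print(age_on(dob, date(2016, 12, 3)))  # 23
print(age_on(dob, date(2016, 10, 3)))  # 22
```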

This is a fairly common occurrence for any declarative data serialization format. Just look at all the crappy imperative XML programs you can write in ant. And plenty of other imperative XML "languages."

Do you think that is bad?

Many people like LISP.

I think Lisp folks complain because so much effort has been spent to converge on the same stuff we already had. When it comes to data serialization, over many decades we've invented many formats, and they usually go off on tangents and then converge on essentially the same stuff. Lispers respond: "We told you so."

Can you point me to an example of a serialization that was reinvented?

JSON vs S-expressions

Ah I always figured S-expressions were "representations" as opposed to "serializations", but that makes sense. Any others?

I personally like digging in a ruby console, e.g.:

    require 'json'
    require 'open-uri'

    # URI.open comes from open-uri; bare open() stopped accepting URLs in Ruby 3.0
    data = JSON.load(URI.open("https://api.github.com/users/Dorian"))


You personally like to suffer.

I used to do that in Python, but it is a pain. jq is great for these things, and jid seems to be also.

How would you do a `data.keys.sort` with jq? Here I don't have to learn a new programming language or hack around with UNIX pipes ;)

jq has sort and sort_by(exp). Its | is a jq operator, not a unix pipe.

It's true it's a new lang, and although simple, I often have to look things up.

It might be nice to have a jq REPL, so you needn't quote expressions and it would be easier to keep state around.

I find jq syntax tricky, and it's difficult to work out what I need to get to specific elements. I guess I'm more used to things like XPath. Seconding the request for a REPL.

`keys|sort` in jq.
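For comparison with the Ruby `data.keys.sort` above, here is the same operation as a Python sketch. The sample object is made up for illustration. As far as I read the jq manual, `keys` already returns the keys in sorted order, so the extra `|sort` should be harmless but not required.

```python
import json

# Made-up sample object; any JSON object would do.
data = json.loads('{"name": "s1", "id": 1, "admin": false}')

# Equivalent of Ruby's data.keys.sort and jq's keys|sort.
sorted_keys = sorted(data.keys())
print(sorted_keys)  # ['admin', 'id', 'name']
```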

Nice! Thanks

Would be nice to have this in Chromium/Firefox developer tools.

I built a similar thing recently with python. Not complete, but a bunch of stuff does work. It's modeled on interacting on-device with a JunOS or Cisco config. Probably the most interesting feature is text completion.


Shameless, but related plug [0] - an interactive JSON log viewer.

[0] https://github.com/pkamenarsky/sherlock

How can I use wildcards in queries?

For example consider this.

    echo '{"users":[{"name":"s1","id":1},{"name":"s2","id":2}]}'|jid 
I can query users[0].id or users[1].id. How can I get all the ids? I tried users[*].id which didn’t work.

In Clojure I use specter [1] for this which is able to handle wildcards using ALL.

[1] https://github.com/nathanmarz/specter

jq uses empty square brackets as a way of accessing all of an array's elements. Not sure if this tool uses the same syntax?

E.g. users[].id
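In case the intent of that query is unclear, this Python sketch shows what "all the ids" means for the sample document from the question. It is not how jid or jq work internally. In jq itself the full filter would be written `.users[].id`, which emits the values one by one; wrapping it as `[.users[].id]` collects them into an array like the list below.

```python
import json

doc = json.loads('{"users":[{"name":"s1","id":1},{"name":"s2","id":2}]}')

# Visit every element of "users" and pick out its "id".
ids = [user["id"] for user in doc["users"]]
print(ids)  # [1, 2]
```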

echo '{"users":[{"name":"s1","id":1},{"name":"s2","id":2}]}' |exec tr '{' '\12'|exec sed '/name/!d;s/\"//g;s/}.*//'



When should/shouldn't I use this over jq? Is the main difference that this is interactive while jq is not? How similar/different is the syntax?

If only Jid & jq could get married <3 :)

Does it use jq internally?

The answer is NO for jid, but jiq uses jq internally: https://github.com/fiatjaf/jiq

The answer is no.

Why not?

