
Show HN: Catj – A new way to display JSON files - soheilpro
https://github.com/soheilpro/catj
======
jolmg
I was curious if this could be doable with jq, and apparently it is:

    
    
      jq -j '
        [
          [
            paths(scalars)
            | map(
              if type == "number"
              then "[" + tostring + "]"
              else "." + .
              end
            ) | join("")
          ],
          [
            .. | select(scalars) | @json
          ]
        ]
        | transpose
        | map(join(" = ") + "\n")
        | join("") 
      '
    

EDIT: Got the string quoting and escaping.

EDIT 2: For those who want to save this script, you can put just the jq code
in an executable file with the shebang:

    
    
      #!/usr/bin/jq -jf

~~~
kfrzcode

      jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Unix shell quoting issues?) at <top-level>, line 3:
      jq -j '      
      jq: 1 compile error
    
    
    

This, uh, doesn't work for me on jq-1.5.1.

~~~
waveforms
    
    
      #!/usr/local/bin/jq -rf
      
      tostream
      | select(length > 1)
      | ( .[0]
          | map(if type == "number" then "[" + tostring + "]" else "." + . end)
          | join("")
        ) + " = " + (.[1] | @json)

~~~
alinspired
I need to understand how #! works, i.e. `#!/usr/bin/jq --stream -rf` errors
with `/usr/bin/jq: Unknown option --stream -rf`

`#!/usr/bin/jq -rf` with the tostream wrapper in the code works fine

~~~
jolmg
Like another thread mentioned, shebang (#!) parsing is non-standard. In macOS,
I think what you tried would work like you'd expect, but it'd work differently
on linux. The reason is that in linux, after parsing the path to the
executable and a space, everything else is taken as a single argument. So if
you were in bash, what you did would be the equivalent of doing:

    
    
      jq "--stream -rf" path/to/script
    

and jq doesn't know of any _one_ option called "--stream -rf".

I haven't seen the discussions around these design decisions in the different
OSes, but I imagine the crux of the matter is that you have to pick somewhere
to stop, and where you chose to stop is largely arbitrary.

I mean, you can have the OS interpret shebangs with multiple arguments, but
then you'll want to be able to put spaces in these arguments, so you'll want
quoting, and then you'll want to put special characters like newlines inside,
so you'll want escaping, etc.

The OS can implement all these things in execve()'s logic, but it might also
be preferable to keep the logic simple in the interest of avoiding security-
harming bugs. You know, less code, less bugs, less vulnerabilities.

If --stream had a single letter option equivalent, you could stick it together
with the other ones. However, since it doesn't, your only option to make a
portable script is to use a shell shebang like #!/bin/bash, and then do:

    
    
      exec jq --stream -rf ...
    

You might feel that this single argument restriction sucks and is definitely
inferior to any implementation of multiple argument shebangs. I don't know if
macOS shebangs support quoting, but if they don't and simply split on spaces,
then I can tell you they can't do hacky stuff like writing code in a shebang
like this:

> [https://unix.stackexchange.com/questions/365436/choose-inter...](https://unix.stackexchange.com/questions/365436/choose-interpreter-after-script-start-e-g-if-else-inside-hashbang/365751#365751)

Granted, it's bad practice, but a little cool nevertheless.

------
avidal
Have you seen gron[0]? It's similar: it flattens JSON documents to make them
easily greppable. But it can also revert (i.e., ungron), so you can pipe json
to gron, grep -v to remove some data, then ungron to get it back to json.

[0] [https://github.com/tomnomnom/gron](https://github.com/tomnomnom/gron)
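
For a rough idea of what gron's forward direction does, here is a minimal
Python sketch (illustrative only; it ignores keys that need quoting, and the
real gron also implements the reverse `ungron` step):

```python
import json

def flatten(value, path="json"):
    """Yield gron-style assignment lines for a parsed JSON value."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key, val in value.items():
            yield from flatten(val, f"{path}.{key}")
    elif isinstance(value, list):
        yield f"{path} = [];"
        for i, val in enumerate(value):
            yield from flatten(val, f"{path}[{i}]")
    else:
        yield f"{path} = {json.dumps(value)};"

doc = {"menu": {"id": "file", "items": ["Open", "Close"]}}
for line in flatten(doc):
    print(line)
```

Each output line is then independently greppable, which is the whole point.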

~~~
inferiorhuman
What does grep + gron give you over jq?

~~~
webo
As far as I can tell, jq doesn't do flattening.

~~~
pstuart
Not built in, but @jolmg posted a script here which does the needful.

------
twp
I wrote a similar tool:

[https://github.com/twpayne/flatjson](https://github.com/twpayne/flatjson)

The flat format is great for diffs:

    
    
      --- testdata/a.json
      +++ testdata/b.json
      @@ -1,5 +1,6 @@
       root = {};
       root.menu = {};
      +root.menu.disabled = true;
       root.menu.id = "file";
       root.menu.popup = {};
       root.menu.popup.menuitem = [];
      @@ -9,8 +10,5 @@
       root.menu.popup.menuitem[1] = {};
       root.menu.popup.menuitem[1].onclick = "OpenDoc()";
       root.menu.popup.menuitem[1].value = "Open";
      -root.menu.popup.menuitem[2] = {};
      -root.menu.popup.menuitem[2].onclick = "CloseDoc()";
      -root.menu.popup.menuitem[2].value = "Close";
      -root.menu.value = "File";
      +root.menu.value = "File menu";

------
chrismorgan
There is actually a standard around writing paths into JSON objects: JSON
Pointer,
[https://tools.ietf.org/html/rfc6901](https://tools.ietf.org/html/rfc6901).
It’s straightforward, and avoids ambiguity between separator and key name by
simple replacement, e.g. `/foo/bar~0baz~1quux` looks up a key named "foo",
then a key named "bar~baz/quux" inside it. It’s not particularly widely used,
but I’ve come across it in a few places over the years (it’s not a common
thing to need to _do_ ), and probably most recently JMAP uses it for
backreferences.
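
The ~0/~1 replacement is small enough to sketch; a minimal Python resolver
(illustrative only, skipping the RFC's "-" array token and error handling)
might look like:

```python
def resolve_pointer(doc, pointer):
    """Resolve an RFC 6901 JSON Pointer against a parsed JSON document."""
    if pointer == "":
        return doc  # the empty pointer refers to the whole document
    node = doc
    for token in pointer.split("/")[1:]:
        # Order matters: "~1" -> "/" first, then "~0" -> "~" (RFC 6901, sec. 4)
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node

doc = {"foo": {"bar~baz/quux": 42}, "arr": [10, 20]}
print(resolve_pointer(doc, "/foo/bar~0baz~1quux"))  # 42
print(resolve_pointer(doc, "/arr/1"))               # 20
```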

(I haven’t run it, but a skim of the code suggests that this tool will turn
`{"foo.bar": "baz", "foo": {"bar": "baz"}}` into `["foo.bar"] = "baz"` and
`.foo.bar = "baz"`, resolving the separator ambiguity in a pretty JavaScripty
way.)

------
yason
When XML rose to popularity, I think the first thing I wrote was a small
Python program to flatten/unflatten XML into per-line entries, quite similar
to the example output in the article.

The text streams that are processed line-by-line by dozens or hundreds of
line-based tools are immensely powerful and universal. It's all Unix heritage
and often overlooked by fancy modern designs that more often follow a fashion
rather than root themselves in substance.

Surely text streams have their share of limitations like everything else, but
in practice you can retrofit nearly anything into line-based text streams and
get an immediate productivity multiplier by being able to apply a whole array
of established tools to process that data. Proof of that power is that it has
been worthwhile to write converters to and from text and other formats. Not
only can you find translators to turn various hierarchical or object-oriented
formats into text, but you can even convert a PNG into text and back (with
SNG).

Text streams are like roads with lanes. They're ages old, they're pretty good
at separating and guiding traffic, and they're somehow suboptimal in several
senses yet rarely can anyone point out a single, clear practical improvement
on laned roads, not to mention a system for containing traffic flows that is
superior to them.

------
emmelaich
Augeas can do something similar too. But not only JSON but XML and 200+ other
config file formats.[0]

    
    
      $ augtool -r . -L --transform 'JSON.lns incl /catj-eg.json'  <<< 'print /files/catj-eg.json'
      /files/catj-eg.json
      /files/catj-eg.json/dict
      /files/catj-eg.json/dict/entry = "movie"
      /files/catj-eg.json/dict/entry/dict
      /files/catj-eg.json/dict/entry/dict/entry[1] = "name"
      /files/catj-eg.json/dict/entry/dict/entry[1]/string = "Interstellar"
      /files/catj-eg.json/dict/entry/dict/entry[2] = "year"
      /files/catj-eg.json/dict/entry/dict/entry[2]/number = "2014"
      /files/catj-eg.json/dict/entry/dict/entry[3] = "is_released"
      /files/catj-eg.json/dict/entry/dict/entry[3]/const = "true"
      /files/catj-eg.json/dict/entry/dict/entry[4] = "director"
      /files/catj-eg.json/dict/entry/dict/entry[4]/string = "Christopher Nolan"
      /files/catj-eg.json/dict/entry/dict/entry[5] = "cast"
      /files/catj-eg.json/dict/entry/dict/entry[5]/array
      /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[1] = "Matthew McConaughey"
      /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[2] = "Anne Hathaway"
      /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[3] = "Jessica Chastain"
      /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[4] = "Bill Irwin"
      /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[5] = "Ellen Burstyn"
      /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[6] = "Michael Caine"
    
    

[0]

    
    
      $ ls  .../share/augeas/lenses/dist/|wc
            221     221    2867

------
tedivm
For exploring large files I released a program called "JSONSmash" that runs a
shell which lets you browse the data as if it was a filesystem.

[https://blog.tedivm.com/open-source/2017/05/introducing-json...](https://blog.tedivm.com/open-source/2017/05/introducing-jsonsmash-work-with-large-json-files-easily/)

~~~
ficklepickle
What an interesting idea! On my last contract, I had to deal with 100MB+ JSON
responses from a barely-documented API. This would have come in handy when I
was figuring it out.

I've used JSONExplorer for this purpose, but it is web based and doesn't
handle files this large.

Extending the filesystem metaphor to JSON data and re-using the same commands
strikes me as a great idea.

Did another project inspire you, or did you come up with the concept yourself?

Have you done a Show HN yet?

~~~
tedivm
I came up with this idea on my own when dealing with the AWS Bulk API files. I
did make a "Show HN" but it got a total of four upvotes.

~~~
mkesper
HN is very sensitive to timing; AFAIK the moderators allow resubmissions when
bad timing seems to have been the problem.

------
hokus
This is something between a joke and a thought experiment.

    
    
      function cason(x) {
        switch (x[0]) {
          case "movie": switch (x[1]) {
            case "name"       : return "Interstellar";
            case "year"       : return 2014;
            case "is_released": return true;
            case "director"   : return "Christopher Nolan";
            case "cast": switch (x[2]) {
              case 0: return "Matthew McConaughey";
              case 1: return "Anne Hathaway";
              case 2: return "Jessica Chastain";
              case 3: return "Bill Irwin";
              case 4: return "Ellen Burstyn";
              case 5: return "Michael Caine";
            }
          }
        }
      }

~~~
geofft
Well, that's kind of the inverse of writing switch statements like

    
    
        def license(kernel):
            return {"Linux": "GPL",
                    "FreeBSD": "BSD",
                    "NT": "Proprietary"}[kernel]

------
kara_jade
You can do this easily with the json_tree function from SQLite's JSON1
extension. It's given as an example in the documentation:

[https://sqlite.org/json1.html#jtree](https://sqlite.org/json1.html#jtree)

    
    
      SELECT big.rowid, fullkey, value
        FROM big, json_tree(big.json)
       WHERE json_tree.type NOT IN ('object','array');
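
For what it's worth, the same query can be run from Python's stdlib `sqlite3`
module, assuming the underlying SQLite library was compiled with JSON1 (it is
enabled by default in SQLite 3.38+):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
doc = '{"movie": {"name": "Interstellar", "year": 2014}}'
# json_tree walks every node; filtering out containers keeps only leaf paths.
rows = conn.execute(
    "SELECT fullkey, value FROM json_tree(?) "
    "WHERE type NOT IN ('object', 'array')",
    (doc,),
).fetchall()
for fullkey, value in rows:
    print(fullkey, "=", value)  # e.g. $.movie.name = Interstellar
```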

~~~
ComputerGuru
And with the fileio sqlite extension, you can even directly query files (and
their contents, and directories recursively too, no less) from SQL.

------
nfoz
That's nice, I like that.

Related, if you want more of a csv-style, see JSONLines. aka "newline-
delimited JSON"

[http://jsonlines.org/](http://jsonlines.org/)

~~~
emmelaich
I don't see what jsonlines has over yaml; in fact the jsonlines examples
presented there are almost trivially converted to yaml, e.g. the first example
is valid yaml if you add a '- ' to each line.

And JSON is (almost) a perfect subset of yaml.

I've been using csv lately. Its bad reputation is overstated.

What I like is that it's far more compact than yaml or json and trivially
pulled into sqlite for ad-hoc queries.

~~~
geofft
One thing I like about JSON Lines is robustness to bad data - if an individual
line is corrupted, you can discard it / print a warning and move on, and the
parser can start again at the next newline. This makes it useful for log
messages / metrics, because if something crashes while emitting a log line,
you can recover. If something crashes while emitting an item in a YAML list,
you might corrupt the entire rest of the document.

Another is that it makes streaming processing a little easier. Once you have a
line, you know you can attempt to process it, and you can shard processing on
newlines without a full YAML processor. Tools that work on newlines or tools
that can just split on lines can handle the first level of JSON Lines output.
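
That recovery loop is trivial to write; a sketch in Python (`parse_jsonl` is a
hypothetical helper, not a library function):

```python
import json

def parse_jsonl(text):
    """Parse JSON Lines input, skipping lines that fail to parse."""
    records, bad = [], 0
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines are harmless
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            bad += 1  # a corrupt line costs one record, not the whole stream
    return records, bad

stream = '{"msg": "start"}\n{"msg": "oops", "level"\n{"msg": "done"}\n'
records, bad = parse_jsonl(stream)
print(records, bad)  # [{'msg': 'start'}, {'msg': 'done'}] 1
```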

------
limsup
Try it with deno:

    
    
      deno install catj https://deno.land/std/examples/catjson.ts --allow-read

------
ravinizme
Similar to python-jsonpipe (8 years ago).

[https://github.com/zacharyvoase/jsonpipe](https://github.com/zacharyvoase/jsonpipe)

It includes 'jsonunpipe'.

So you could grep part of the JSON and still get a JSON back.

    
    
      echo '{"a": 1, "b": 2}' | jsonpipe | grep b | jsonunpipe
      # {"b": 2}

------
bborud
Oh no. This looks like the horrible config format a colleague of mine invented
at Yahoo when making an absolutely horrific config system. This brings back
bad memories.

This may look cute but it is horrific when dealing with large configs and you
have to reconstruct all the structure in your head.

Also, when you have a format that nests using brackets, braces and
parenthesis, you can get help from the editor. This format does not give you
that.

I'm not a huge fan of JSON (and the above mentioned format was invented
because none of us were fans of XML at the time), but it turns out that both
XML and JSON are actually easier to work with in practice than this format.
Not least because there is ample tooling for JSON (and XML).

The lesson I learnt: I may hate XML (or in this case JSON), but finding an
alternative that is better is not easy.

~~~
geofft
I think the idea is not that this is for storage or editing, but just for
querying - you should keep your data in JSON, but if you want to find where
something is, do `catj foo.json | grep something` and it'll tell you all the
paths in the document where you can find the string "something". The intent is
not to open catj format in a text editor, or to use the output of catj for
anything other than ad-hoc purposes.

------
flying_sheep
[https://github.com/stedolan/jq/issues/243](https://github.com/stedolan/jq/issues/243)

jq can also do the same thing, with more flexibility. And it is possible to
combine with bash alias to make it indistinguishable from catj

~~~
jolmg
> combine with bash alias

... or you know, you could put the jq script in an executable file and add a
shebang like

    
    
      #!/usr/bin/jq -jf
    

or

    
    
      #!/usr/bin/jq -rf
    

In my opinion, aliases should mostly be used to add default options only. Not
really to insert whole scripts into them.

~~~
Splognosticus
I've always been of the mind that aliases should be "whatever the user finds
convenient." I don't think I've ever seen code that depends on them.

------
sdegutis
This is really similar to the format that AWS uses to represent recursive
structures (arrays, maps) as a single array of key-value pairs for their APIs.
Your catj could potentially be used to create that format when working with
the raw API directly instead of an SDK.

------
QuadrupleA
Please don't write JSON like the example - it's like putting 4 files in a
hierarchy of 20 folders. Way over-structured.

That said, if you're stuck dealing with bad JSON like this with low signal to
noise this is a decent way to redisplay it.

------
danschumann
Okay I made one with javascript ( go to
[https://underscorejs.org/](https://underscorejs.org/) and open console )

    
    
      var json = JSON.parse('{"my": "json"}');
      (function printRecursively(ob, _keys = []) {
        _.map(ob, (val, key) => {
          var k = _.isNumber(key) ? '[' + key + ']' : '.' + key;
          var keys = _keys.concat(k);
          if (_.isObject(val)) printRecursively(val, keys);
          else console.log(keys.join('') + ' = ' + typeof val + ' ' + val);
        });
      })(json);

EDIT: how do I markup code on HN?

~~~
ComputerGuru
> how do I markup code

Indent each line with four spaces. Please _please_ keep line length very short
(under forty?) as HN's pre tags are absolutely not mobile friendly and can
trash the entire page.

Edit: actually it seems to at least scroll within the comment div on overflow
now, that's a huge improvement!

------
not_kurt_godel
Hm, essentially JSON->Properties file. Cool. I wrote a little script the other
day and decided Properties format was a pleasant way to define the config,
maybe this could dovetail with similar future endeavors.

------
blablabla123
That's really smart; in fact this was, maybe until now, the only reason for me
to resort to csv. (With pandas it's by the way really easy to flatten JSON
into csv.) JSON is such a nice format but the tools are really not there yet.
I guess this should then make it possible to combine it with line-based tools
like head, tail, sort, uniq, etc.

------
venthur
Looks a lot like ye olde gron:
[https://github.com/tomnomnom/gron](https://github.com/tomnomnom/gron)

Here's a Python implementation of gron [https://github.com/venthur/python-
gron](https://github.com/venthur/python-gron)

------
krapp
The output format is like a slightly better INI (in that it would support
deeper hierarchies), although arrays would be a pain to write one index at a
time like that.

And for purely aesthetic, nitpicky reasons I think the leading period in each
line is redundant.

~~~
ucarion
I don't think that leading period is redundant. It indicates that the top-
level value is an object, as opposed to an array or some basic value.

------
deostroll
Sweet! I find this tool extremely useful for parsing dmn files in javascript
(via xml2js). But right now I am more interested in a similar feature for
xml/html documents. Hoping someone can point me to one. Thanks.

------
etaioinshrdlu
It looks almost executable! If it were, that might be handy now and then.

~~~
rolltiide
add in a few nil and undefined checks automatically and it really is
executable

------
konsumer
I made this to do similar, it goes both ways:
[https://github.com/konsumer/jsflat](https://github.com/konsumer/jsflat)

------
BaconJuice
Just curious, what would be the use case for something like this?

~~~
soheilpro
Author here. I use it all the time when working with jq [1] to find the path
of the nodes that I want to select or filter. It also makes it much easier to
understand the structure of deeply nested JSON files.

[1] [https://stedolan.github.io/jq](https://stedolan.github.io/jq)

------
fareesh
I use the "JSON viewer awesome" extension on Chrome. It displays JSON in a
collapsible vertical tree and has a copy-path button.

This is a nice CLI-based alternative.

------
matmann2001
I would recommend an optional flag to disable the output coloring, in the case
where someone wants to pipe this to a file or other program.

------
jay-anderson
Reminds me of how spring configuration works between yaml and property files.
It does a similar flattening to translate between them.

------
nailer
pwsh is an open source shell that runs on Unix and supports JSON natively, so
you don't need to flatten or use jq.

Here's an example from a couple of days ago:

[https://twitter.com/mikemaccana/status/1141706132823695362?s...](https://twitter.com/mikemaccana/status/1141706132823695362?s=21)

------
karxxm
All those redundancies make me sad

------
htk
What a simple and clever idea. I would love something like that for XML in C#.

------
stdcall83
Looks like a good candidate for a lesspipe plugin.

------
laurent123456
New from (2014)

------
jijji
not much different than php print var_export(json_decode($json,TRUE), TRUE);

------
usamaejaz
this looks nice

------
hasahmed
cat foo.json | jq .

------
dymk
Might want to put [2015] in the title? Certainly not a _new_ way

------
enriquto
this is not merely a display, it is really a _sanitization_ of the
brain-damaged json syntax.

~~~
Sohcahtoa82
What method would you use to serialize data structures into a format that is
both easily read and written by both a computer and a human?

~~~
enriquto
In the very rare cases that you _really_ need to serialize complicated data
structures, something like s-expressions or (gasp) json, or even a memory dump
is perfectly appropriate. However, most of the time your data structures
should be lists of numbers or lists of strings. For those, you do not really
need a "format"; you can simply print and scan them from a text file.

