
Gron: A command line tool that makes JSON greppable - oftenwrong
https://github.com/tomnomnom/gron/
======
kbenson
This is cool... but it appears to also work as an HTTP client? I'm not sure
why that additional complexity is needed.

I maintain that any HTTP client functionality that supports enough options to
be useful is complex, and if it supports so few options that it's a toy, why
include it?

There is non-negligible overhead in keeping track of what shell tools can make
network requests, and not being correct and up to date in that area has been
the cause of numerous bugs and security issues in tool sets and programs that
include and utilize a component like this without realizing it.

To many, this might sound like an overly nit-picky complaint, but I maintain
that if you ask just about anyone who's been in the trenches as a sysadmin
for more than a couple of years whether saving the few characters it takes to
use curl and a pipe is a good trade-off for the possible unintended
consequences of some developer shelling out to this from a webapp without
proper validation, they'll tell you no.

This tool seems awesome, but keep it simple. It's not like it's a GUI tool and
it's hard to pass data between programs. A pipe is perfect here.

~~~
TomNomNom
Author of the tool here.

I think you raise a good point, and I don't think it's overly nit-picky.

Personally I don't find myself using the built-in HTTP client at all (and pipe
the output of curl instead); but I know people who do. I ummed and ahhed about
keeping this functionality for a while, and surveyed the (at the time fairly
small) user base to figure out what I should do. What I found was a subset of
users whose use-case was very different to my own; they were often running
Windows, with no install of curl (which I believe is actually a default on
Windows now?), and only wanted very basic functionality.

I'm definitely a proponent of the Unix philosophy, but I try my best to be
pragmatic, especially where it can lower the barrier to entry for users.

~~~
erric
>with no install of curl (which I believe is actually a default on Windows
now?)

Alas, no. It's just an alias for Invoke-WebRequest, or whatever the
equivalent PowerShell incantation is.

~~~
coldacid
As of Spring Creators Update (coming any time this month) both curl and bsdtar
are present in the base Windows install. On my Insiders build 17133.1, curl
7.55.1 and bsdtar 3.3.2 are both present in C:\Windows\System32.

The problem is that PowerShell 5.x still maintains the "curl" Invoke-
WebRequest alias, which captures `curl` on the PS command line before
curl.exe is found in the path. However, this (and a bunch of other
*nix-conflicting aliases) is removed in PowerShell Core, and annoying aliases
can be deleted from the Alias:\ PS-drive on PS 5.x and older.

~~~
erric
Good to know, thanks. I'm sure I will be getting more questions from my
co-workers, but adding these utilities is a good thing in the long run.

------
krat0sprakhar
If you use jq[0] and are wondering why Gron, the answer is at the very bottom
of the readme:

 _jq is awesome, and a lot more powerful than gron, but with that power comes
complexity. gron aims to make it easier to use the tools you already know,
like grep and sed. gron's primary purpose is to make it easy to find the path
to a value in a deeply nested JSON blob when you don't already know the
structure; much of jq's power is unlocked only once you know that structure._

[0] - [https://stedolan.github.io/jq/](https://stedolan.github.io/jq/)

~~~
cryptonector
jq can already emit a path-based stream of its own:

        $ jq -c tostream <<<'{"a":[{"b":2}]}'
        [["a",0,"b"],2]
        [["a",0,"b"]]
        [["a",0]]
        [["a"]]

However, filtering that and then reconstructing JSON from that is... not
possible at this time:

        $ jq -c tostream <<<'{"a":[{"b":2}]}' | jq -crn 'fromstream(inputs)'
        {"a":[{"b":2}]}
        $ jq -c tostream <<<'{"a":[{"b":2}]}' | grep b | jq -crn 'fromstream(inputs)'
        $

:(

The reason is that tostream and fromstream can handle multiple top-level JSON
texts, since jq normally does too, but that creates an ambiguity, which is
resolved by emitting a sort of object terminator. Filtering tostream's output
with grep loses those terminators, so fromstream cannot operate normally.

But it should be possible to define a function that does allow this, e.g. by
requiring just one top-level JSON text.

The other thing is that a path-based encoding that does not require quotes and
commas would be handier -- tostream's output is itself JSON, so it's not
shell-friendly. This is gron's brilliant innovation: it's got a path-based
encoding of JSON that is easy to deal with in a shell script. (Mind you, I'm
not sure that using brackets to denote array indices is all that easy to use,
but the need to disambiguate object keys that look like numbers is critical.
Also, there's an ambiguity as to keys that have embedded periods ('.') in
them. And lastly, even gron can't shake off the string quotes for _values_.)
That jq has the builtin functionality needed to do the same is not good enough
if it doesn't actually do it out of the box.
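
That self-contained-lines property is the crux: because every gron line
repeats the full path, a grep-filtered subset is still a set of valid
statements with their path context intact. A quick illustration (the
gron-style output is reproduced by hand with printf, so it runs without gron
installed):

```shell
# Each gron-style line is a complete statement, so grepping keeps
# both the value and the full path to it.
printf '%s\n' \
  'json = {};' \
  'json.a = [];' \
  'json.a[0] = {};' \
  'json.a[0].b = 2;' |
  grep '\.b '
# -> json.a[0].b = 2;
```

Contrast that with the tostream events, where the terminator events carry no
value and are easily lost by a line-oriented filter.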

~~~
cryptonector
It occurs to me that the way to get rid of quotes in string values is to not
include the quotes but print the actual string with newlines (and maybe other
characters, like double-quotes) escaped.

And the way to get rid of the ambiguity for object keys that contain periods
or " = " (and also square brackets) is to escape them: ".." and " == ", or
similar.

Example:

        .foo.bar[0].baz == ..blah = this is a\ntwo-line string

where the last key in the path is "baz = .blah".

Also, " = " is a bit annoying. I'd prefer ": ":

        .foo.bar[0].baz: this is a\ntwo-line string

The quoting rule for the special chars in keys can then be generic: double
them.

        .foo.bar[0].baz[[5]]:: ..blah: this is a\ntwo-line string

Here the last key in the path is "baz[5]: .blah". Mind you, this is still not
trivial to deal with in a shell script, so perhaps we need some other escaping
mechanism -- one that doesn't reuse the escaped characters, such as \u
escaping.
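
For what it's worth, the doubling rule is a one-liner in sed. This is just a
sketch of the proposal above, not an implemented gron feature; `escape_key`
is a made-up name, and it assumes the key contains no newlines:

```shell
# Escape a single key per the proposed doubling rule:
# '.' -> '..', ':' -> '::', '[' -> '[[', ']' -> ']]'
escape_key() {
  printf '%s' "$1" | sed 's/\./../g; s/:/::/g; s/\[/[[/g; s/\]/]]/g'
}

escape_key 'baz[5]: .blah'   # the example key from above -> baz[[5]]:: ..blah
```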

~~~
kbenson
What exactly is the benefit of removing the quote characters? It does allow
for easier grepping in _some_ instances, but as you already covered, the
brackets from array notation generally need to be quoted as well. I think
you're better off just assuming the need for single quotes when grepping this
output, since the fact that the output is valid JavaScript is very useful,
IMO (especially with autovivification in JS), and losing that to make it
slightly easier to search is a regression in my eyes.

But maybe I'm missing the benefit you're seeing, and it's not about searching?

~~~
cryptonector
First off, you'd still have to escape newlines, and probably keep all the
other escapes required by JSON. But then the quotes would be unnecessary, and
thus wasteful. In particular, if I wanted to print a raw string (with
escapes) at a particular path, I could first use grep(1) to extract that
path, but then I'd have to write a fairly complex sed(1) command to remove
the path and then the quotes, whereas without the quotes I could use grep(1)
and cut(1) only.

Mind you, I'm sticking to jq, as I know it really well. But I'm thinking of
other users here. I think the value of a path-based transformation of JSON is
ease of use, which motivates me to think about making it even easier to use,
such as by removing those quotes.

~~~
kbenson
> then I'd have to write a fairly complex sed(1) command to first remove the
> path

No need for sed for the path; use cut for that as well: cut -d'=' -f2- will
remove the path (but leave a leading space).

In the end, you can accomplish it with the following, which I think is fairly
easy:

      echo 'json[0].foo.bar.baz = "some string";' | cut -d'=' -f2- | sed -e 's/^\s*"//' -e 's/";$//'

For me, it's a toss-up whether I would use that or Perl, since chances are
I'm doing it as a first step in some other process, and I can just continue
on in Perl for the rest of the process anyway.

      echo 'json[0].foo.bar.baz = "some string";' | perl -pE 's/^.*?"//; s/";$//;'

I find keeping the output as valid JS _extremely useful_ though, since I can
just paste a grepped entry into a developer console to get a valid object to
play with on a page. That cuts out a pipe to a JS prettifier, a pipe to less,
a search for identifying text, and a careful cut-and-paste to get the
enclosing block of text, for what ends up being a semi-common action for me.
On the other hand, I _can_ get raw strings, but I barely ever need them, and
could fairly easily make an alias for that if it became common.

~~~
cryptonector
A single grep and cut would be simpler. You've proved my point :)
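
Concretely, when the value contains no embedded double quotes, the quote
characters themselves can serve as the cut delimiter (a hypothetical example
of the grep-plus-cut approach; it breaks on values with escaped quotes):

```shell
# grep for the path, then let the double quotes delimit the value field
printf 'json[0].foo.bar.baz = "some string";\n' | grep baz | cut -d'"' -f2
```

which prints just `some string`, with no sed needed.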

------
tedivm
I wrote a tool called jsonsmash[0][1] that's meant for a more exploratory
view of JSON files. It basically exposes the data in a mini-shell, complete
with `ls` (with a ton of the standard flags), `cd`, `pwd`, `cat` (which
outputs YAML), and some others. My main use case was reading JSON files that
were far too big to load in standard editors (210 MB+), and for that it has
worked great.

[0]
[https://www.npmjs.com/package/jsonsmash](https://www.npmjs.com/package/jsonsmash)

[1] [https://blog.tedivm.com/open-source/2017/05/introducing-
json...](https://blog.tedivm.com/open-source/2017/05/introducing-jsonsmash-
work-with-large-json-files-easily/)

~~~
pronik
This sounds awfully similar to Augeas CLI. Not that I endorse Augeas in any
capacity...

------
yason
Tools like this sit right at the interesting threshold where writing your own
can be less mental effort than discovering, learning, and keeping track of
these small utilities separately.

I've written a similar tool in Python, both for JSON and XML. The JSON
version in particular was dead simple, probably fits on a single screen, and
took 15 minutes to test _and_ write. Sure, it didn't have any "features", but
it does the job of letting me grep JSON.

Gron is probably 10x more versatile and actually comes with useful features
but I'd really have to have pressing needs to do transformations of JSON on a
regular basis to switch over.

The same applies to libraries in programming languages. There is a very vague
threshold, depending on the expressiveness of the language and operating
environment as well as the hardness of the problem itself, where it either
makes sense to write your own library or reuse an existing one.

------
derimagia
If you'd like to see more tools for dealing with structured text, take a look
at [https://github.com/dbohdan/structured-text-
tools](https://github.com/dbohdan/structured-text-tools), a pretty nice list.

------
beaugunderson
There's some prior art; namely jsonpipe/jsonunpipe:

[https://github.com/zacharyvoase/jsonpipe](https://github.com/zacharyvoase/jsonpipe)

------
kriomant
Alternative: use ogrep ([https://github.com/kriomant/ogrep-
rs](https://github.com/kriomant/ogrep-rs) /
[https://github.com/kriomant/ogrep](https://github.com/kriomant/ogrep)) on
pretty-printed JSON.

------
adrianN
I find jq very useful for tasks like this. But it's more awk for JSON than
grep for JSON.

~~~
jedisct1
And rq has even easier syntax.

~~~
cryptonector
Link?

~~~
bklaasen
[https://github.com/dflemstr/rq/blob/master/README.md](https://github.com/dflemstr/rq/blob/master/README.md)

~~~
cryptonector
Thanks!

~~~
erric
I'd never heard of rq, so thanks to the person above for that. I also use jp
from JMESPath. I like it because both the Azure and AWS CLIs use the JMESPath
way of dealing with JSON, so filtering between two different cloud providers
is at least a _little_ easier when hunting for needles in haystacks.

For some compare and contrast between jsonpath, jq, and jp, these were for
getting AWS EC2 instance IDs:

      cat foo.json | jsonpath -p $.InstanceProfiles.[*].RoleId
      cat foo.json | jq .InstanceProfiles[].Roles[].RoleId
      cat foo.json | jp InstanceProfiles[].InstanceProfileId

Apologies to mobile users!

------
soheilpro
I have written a similar tool called catj [1]. I mostly use it when I need to
construct JSON expressions when working with jq.

[1] [https://github.com/soheilpro/catj](https://github.com/soheilpro/catj)

------
xenomachina
A few years ago I made a Python script that does the same thing, minus the
reverse mode of converting the assignments back into JSON:
[https://github.com/xenomachina/jsflat](https://github.com/xenomachina/jsflat)

I've found flattening JSON in this way not only useful for line based tools
like grep, but also for understanding unfamiliar JSON. Sometimes it's nice to
be able to see the whole path down to the value you're looking at.

------
acobster
Thanks for this! "Grep for absolute path" is the use-case that the otherwise
awesome `jq` doesn't address well. I've found myself having to iteratively
drill down to find the field that has the data I need. One of those "eh, I'll
automate this [poorly] someday..." things. :D

------
bm1362
If you’re stuck on a foreign box or don’t want to install a new tool:

    cat xyz | python -m json.tool | grep foo

~~~
jwilk
Python's json.tool is a pretty-printer. It's not anything like gron.

(json_pp is another pretty-printer that is likely to be already on your
system.)
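
The difference is easy to see: a pretty-printer preserves the nesting, so a
grep hit shows the matching line but not the path down to it (this example
assumes python3 is on the path):

```shell
# grep finds the value, but the path to it (.a.b) is lost
printf '{"a":{"b":{"foo":1}}}' | python3 -m json.tool | grep foo
```

You get an indented `"foo": 1` line, but nothing tells you it lives under
`a.b`; gron would print `json.a.b.foo = 1;` instead.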

------
billsmithaustin
I might have chosen a name that sounds less like Cron, but regardless, I can
see how Gron would be useful.

------
aiCeivi9

        json.Host = "headers.jsontest.com";
        json["User-Agent"] = "curl/7.43.0";

Why not use the simple notation for every key that doesn't contain a '.'?

~~~
TomNomNom
Author of the tool here.

The output is designed to be valid JavaScript, which doesn't allow certain
characters in unquoted object keys, like the dash in User-Agent.

Using JavaScript's rules for quoting keys makes it a lot easier to specify the
grammar (and therefore write the parser); and makes it trivial to 'parse' the
output using JavaScript should you want to.

There must be _some_ rules in place for when to quote the key (e.g. when
there is a dot, equals sign, square bracket, etc. in the key name), so I see
no reason to adopt something custom and potentially error-prone when a
known-good set of rules already exists.

Hope that answers your question well enough!

------
coldacid
Looks like I have a new Chocolatey package to create and push up when I get
home tonight.

