
Q: A faster re-implementation of jq written in Reason Native/OCaml - davesnx
https://github.com/davesnx/query-json
======
aasasd
For everyone pining for a Jq with a different syntax: I have a bunch of links
to alternatives collected, you might want to try some of them (some may be for
different things than JSON):

[https://github.com/fiatjaf/awesome-jq](https://github.com/fiatjaf/awesome-jq)

[https://github.com/TomConlin/json2xpath](https://github.com/TomConlin/json2xpath)

[https://github.com/antonmedv/fx](https://github.com/antonmedv/fx)

[https://github.com/fiatjaf/jiq](https://github.com/fiatjaf/jiq)

[https://github.com/simeji/jid](https://github.com/simeji/jid)

[https://github.com/jmespath/jp](https://github.com/jmespath/jp)

[https://github.com/cube2222/jql](https://github.com/cube2222/jql)

[https://jsonnet.org](https://jsonnet.org)

[https://github.com/borkdude/jet](https://github.com/borkdude/jet)

[https://github.com/jzelinskie/faq](https://github.com/jzelinskie/faq)

[https://github.com/dflemstr/rq](https://github.com/dflemstr/rq)

Personally I think that next time I might just fire up Hy and use its
functional capabilities.

~~~
vips7L
Don't forget PowerShell's ConvertFrom-Json :)

[https://docs.microsoft.com/en-us/powershell/module/microsoft...](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/convertfrom-json?view=powershell-7)

~~~
melbourne_mat
That is so not jq! I've really been pining for jq on my current Windows
project :-(

~~~
vips7L
What do you feel is missing?

------
jeffbee
1) refuses to operate on stdin; requires a filename argument, which is so
irritating.

2) doesn't accept values that jq accepts

    
    
      % time jq -r '[expression]' < parcels | wc       
          365    1454    7978
      jq -r  < parcels  1.39s user 0.00s system 99% cpu 1.390 total
      wc  0.00s user 0.00s system 0% cpu 1.390 total
    
      % time ~/.yarn/bin/q  '[expression]' parcels | wc
      q: internal error, uncaught exception:
         Yojson.Json_error("Line 56, bytes -1-32:\nJunk after end 
      of JSON value: '{\n  \"OBJECTID\": 155303,\n  \"BOOK\"'")
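
The "Junk after end of JSON value" error suggests the input holds several JSON values back to back rather than one document, which jq reads as a stream. As a minimal sketch of how such input can be parsed, here is a Python version using the standard library's `json.JSONDecoder.raw_decode`. The helper name `iter_json_values` and the sample records are made up for illustration and have nothing to do with how jq or query-json are implemented.

```python
import json

def iter_json_values(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the parsed value and the index where it stopped.
        value, pos = decoder.raw_decode(text, pos)
        yield value

# Hypothetical sample: two objects one after another, not a single JSON array.
sample = '{"OBJECTID": 1}\n{"OBJECTID": 2}\n'
values = list(iter_json_values(sample))
print(values)
```

A parser that calls a plain "parse one document" function on this input would stop after the first object and complain about the rest, which matches the error shown above.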

~~~
diggan
1 is easy to work around (handy tip incoming for any tools that _seem_ to not
support stdin but actually do, as stdin is also available as a file in unix):

    
    
        echo '{"foo": "bar"}' | query-json ".foo" /dev/stdin

~~~
saagarjha
Tools that accept filenames often expect you to give them a real file, as they’ll
do things on it that may not be supported by the various “it’s a file
descriptor pretending to be something on disk” solutions.

~~~
diggan
Hm, do you have any examples handy? It's not that I don't believe you, it's
just that in all the years I've been using this, it has always worked.
Granted, I'm only using it for reading data, not for saving stuff to
/dev/stdin, which would obviously fail.

~~~
ebg13
Anything that seeks, which you can't do on a pipe.
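
A small sketch of that failure mode, assuming a Unix-like system: an anonymous pipe stands in for what a tool sees when it opens /dev/stdin behind a `|`, and a temporary file stands in for a real file. The helper name `can_seek` is invented for this example.

```python
import os
import tempfile

def can_seek(fileobj):
    """Return True if seeking works on fileobj, False if the OS refuses."""
    try:
        fileobj.seek(0)
        return True
    except OSError:
        return False

# An anonymous pipe behaves like the stdin of `cmd | tool /dev/stdin`.
read_fd, write_fd = os.pipe()
os.write(write_fd, b'{"foo": "bar"}')
os.close(write_fd)

with os.fdopen(read_fd, "rb", buffering=0) as pipe_reader:
    pipe_seekable = can_seek(pipe_reader)
    data = pipe_reader.read()  # plain sequential reads still work

# A regular file on disk supports seeking.
with tempfile.TemporaryFile() as real_file:
    file_seekable = can_seek(real_file)

print(pipe_seekable, file_seekable, data)
```

So a tool that only reads front to back is fine with /dev/stdin, while one that rewinds or jumps around in the file is not.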

------
toastal
Are we sure it should get a single-letter 'q' binary name though? The docs seem
to indicate that it's short for 'query-json'. Why not call it 'query-json' and
let the user shorten it with a shell alias or whatever? Even the ubiquitous
'ls' and 'cd' are two characters.

~~~
nondave
Also clashes with this existing q:
[https://en.m.wikipedia.org/wiki/Q_(programming_language_from...](https://en.m.wikipedia.org/wiki/Q_\(programming_language_from_Kx_Systems\))

~~~
stingraycharles
Which is relatively established and widely used (although mainly in finance).
It was the first thing I thought about.

------
andylynch
This looks interesting, but could be confusing given the programming language
of the same name ([https://code.kx.com/q/](https://code.kx.com/q/))

~~~
ulucs
Ah yes, the old ".j.k raze read0`" as a separate app

~~~
andylynch
I should definitely check how that compares on some big files here.

------
mkesper
I long for such a tool with a more comprehensible query language.

~~~
cube2222
If so, and for anybody else having this wish, check out jql[0], I've created
it exactly for this reason, to have the most common jq operations available in
a more uniform and easier to use interface.

[0]: [https://github.com/cube2222/jql](https://github.com/cube2222/jql)

~~~
davesnx
Nice!

I will try to add it to the benchmark, thanks for sharing.

------
as-j
> Aside from that, q isn't feature parity with jq which is ok at this point,
> but jq contains a ton of functionality that query-json misses and some of
> the jq operations aren't native, are builtin with the runtime. In order to
> do a proper comparision all of this above would need to take into
> consideration.

> The report shows that q is between 2x and 5x faster than jq in all
> operations tested and same speed (~1.1x) with huge files (> 100M).

While it's faster for some things... that's a pretty large set of caveats!

~~~
davesnx
Adding most of the jq operations shouldn't affect performance at all; in fact,
if I end up implementing streaming, it could be even faster.

I have an issue open for improving performance, where I can push this forward:
[https://github.com/davesnx/query-json/issues/7](https://github.com/davesnx/query-json/issues/7)

But sure, there are caveats!

------
riston
It would be good if someone added an explanation of why this new approach is
better: is it that OCaml is faster, that more efficient algorithms were used, etc.?

~~~
davesnx
I tried to explain it in the Performance section and in the report:

[https://github.com/davesnx/query-json#performance](https://github.com/davesnx/query-json#performance)
[https://github.com/davesnx/query-json/blob/master/benchmarks...](https://github.com/davesnx/query-json/blob/master/benchmarks/report.md)

But none of the explanations are backed by evidence; they're just assumptions.

------
jakuboboza
Do we need to make jq faster? Does anyone have issues with its current speed?
Is there any specific reason other than "because we can"?

~~~
phonebucket
I can't answer for the OP, but "because we can" is a valid enough reason (pun
unintended) for me.

IMO, an individual dev making a fast useful tool should always be welcomed as
a feat of worthy hacking.

------
ksmg
Hm, I thought q was synonymous with querying CSV files:
[https://harelba.github.io/q/](https://harelba.github.io/q/)

~~~
redsaz
Same. When I saw the name "q" I thought of this same tool.

~~~
davesnx
Right, I found q cute... but I'm thinking of releasing a new version under the
name query-json, or just changing the name altogether. Any suggestions? ^^

------
jamil7
As an outsider I get very confused by the Reason / Reason Native / OCaml /
Bucklescript / Rescript?! ecosystem. What does it mean for it to be written in
Reason Native/OCaml?

~~~
rashkov
That means it produces a native binary (for example, a .exe file on Windows
platforms), so ultimately you're aiming to run the program in a terminal. This
is the normal way for OCaml to operate.

In this case the author is using Reason as an alternative syntax to OCaml.
Reason resembles JavaScript a little more, and some people find that nicer to
work with. So the idea is that you write Reason code, then translate it into
OCaml code using the Reason tools, and then ultimately you compile it down to
a native binary.

If instead you want to write a web app which runs in a web browser or Node.js,
then you'd need to compile it to JavaScript, which is what BuckleScript helps
you do.

Where does ReScript come in? As explained above, Reason can be used for
writing either native apps or JavaScript apps. However, it's hard to evolve
the syntax of Reason in a way which satisfies both aims. So they've now split
the work: going forward, Reason will specialize in native apps, and ReScript
will specialize in JavaScript apps. Their syntaxes are expected to diverge
from each other, in order to support those aims as best as they can.

~~~
jamil7
Thank you for the detailed answer! I check in on the status of the related
projects from time to time and was often confused by the relationship between
the components.

------
StavrosK
This looks nice, but I was a bit dismayed at "friends don't let friends curl |
bash, to install this run curl | bash".

~~~
konjin
I remember one of the first times I tried installing Linux software in the
wild. The bash script asked for your password, sent it to their server using
curl, then returned you the script with the password hard-coded into it, and ran
itself with sudo, all over unencrypted HTTP. I was 17 but even then I stopped
to think if this was a good idea.

It wasn't.

------
jonemi
I used to be a regular user of jq, but I was never parsing very large JSON. I
now do what I used to do with jq in my browser's developer tools console. Map
and filter are far more familiar than jq's syntax where I found myself
referring to the documentation most of the time.

I'm sure other people have use cases where the browser wouldn't meet their
needs, but for me, I find jq unnecessary.

~~~
choward
Writing a script? I'm not going to have my script open a web browser so I can
attempt to interact with a web console.

~~~
jonemi
When it got to the point when I needed a script, I just preferred Python. I
can understand how some might prefer jq and a shell script, I just realized it
wasn't worth it for my particular needs.
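
For a rough idea of what that trade looks like, here is a sketch of a simple filter-and-project query done with Python's built-in `filter` and `map` instead of jq. The sample data and field names are made up for illustration.

```python
import json

# Hypothetical input; in a real script this would come from a file or stdin.
raw = '[{"name": "ada", "age": 36}, {"name": "bob", "age": 25}]'
people = json.loads(raw)

# Roughly what `jq '.[] | select(.age > 30) | .name'` would express.
over_30 = filter(lambda person: person["age"] > 30, people)
names = list(map(lambda person: person["name"], over_30))
print(names)
```

It is more verbose than the jq one-liner, but it uses constructs most people already know from everyday scripting.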

------
gkfasdfasdf
Curious, any description as to why it's faster? Something intrinsic to Reason
Native/OCaml? Architectural changes? Reduced feature set?

~~~
tyingq
jq appears to have its own hand-written JSON parser and requires flex/bison. I
suspect something about the hand written parser is slow for large data sets.

I was somewhat surprised it didn't use an existing json parser library.

~~~
brundolf
I'm doubly surprised that such a popular utility uses bison; generated parsers
tend to be slower than handwritten parsers, and JSON isn't exactly the world's
hardest language to parse.

------
dkdk8283
Is jq slow? I have only worked with datasets up to 1 MB, but I’ve never had a
performance issue that wasn’t attributable to my own error.

~~~
nicoburns
jq is pretty fast in my experience. But there have been cases where I've
wanted it to be faster (dealing with a 90GB JSON file).

The main weakness seems to be streaming use cases (not having the whole file
in memory at once). These are supported, but the syntax is quite awkward.

~~~
arethuza
Out of interest, what created a 90GB JSON file?

~~~
hobofan
I don't think it's quite 90GB, but I've processed Wikidata dumps in the same
order of magnitude before (which are one JSON object per line) with jq, and it
could've certainly been faster.
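
When the data really is one JSON value per line, it can be processed without ever holding the whole file in memory. A minimal Python sketch of that pattern follows; the helper name `iter_json_lines` and the records (including the `sitelinks` field) are invented for illustration and are not the real Wikidata schema.

```python
import io
import json

def iter_json_lines(lines):
    """Lazily decode one JSON value per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

# Stand-in for a huge file; with real data you would pass an open file object,
# which Python also iterates line by line.
dump = io.StringIO('{"id": "Q1", "sitelinks": 3}\n{"id": "Q2", "sitelinks": 0}\n')

# Only one decoded record is alive at a time while the filter runs.
ids = [entity["id"] for entity in iter_json_lines(dump) if entity["sitelinks"] > 0]
print(ids)
```

This only works for line-delimited input; a single 90GB array or object would need an incremental parser instead.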

~~~
arethuza
I was wondering about that - whether they are one single JSON array/object or
one per line.

------
Ericson2314
This is funny because Stephen Dolan, the original jq author, works on OCaml
itself.

~~~
davesnx
Exactly! I wanted to contact him

------
vasergen
Speed is not a concern for me. I am wondering if there is something better than
`jq` in terms of syntax. Whenever I want to do something more than just
prettify JSON output in the console or get a value by a specific field name,
I have a problem: it is just difficult for me to remember jq syntax without
looking into my history. I also keep links to examples in my notes, like this one:

[https://mosermichael.github.io/jq-illustrated/dir/content.ht...](https://mosermichael.github.io/jq-illustrated/dir/content.html)

~~~
bradly
Check out jql
[[https://github.com/cube2222/jql](https://github.com/cube2222/jql)] and oj
[[https://github.com/ohler55/ojg](https://github.com/ohler55/ojg)]

~~~
vasergen
I'll definitely take a look; I've never heard of `jql` before, thanks.

------
layoutIfNeeded
Umm... There's already a language called Q for array processing.

~~~
davesnx
Will rename it to query-json. Thaaanks!

------
heycosmo
In case anyone is interested in yet another alternative, I have this old,
unpolished project:
[https://github.com/bauerca/jv](https://github.com/bauerca/jv)

It is a JSON parser in C without heap allocations. The query language is
piddly, but the tool can be useful for grabbing a single value from a very
large JSON file. I don't have time for it, but someone could fork and make it
a real deal.

------
Borkdude
If you're into Clojure, check out
[https://github.com/borkdude/jet](https://github.com/borkdude/jet)

~~~
iLemming
I use jet all the time when I need to quickly examine a json snippet in Emacs.
I would use <C-u M-|> (shell-command-on-region with a prefix) and execute jet
to convert selected json part to EDN. That cuts out all the visual noise. EDN
is much more concise, cleaner and easier to read. I'd use it even if I don't
write Clojure.

------
RMPR
Upcoming: q-rs, a rewrite of q in Rust :p

~~~
davesnx
I hope so!

------
muktabh
Slightly off topic here: I find the entire stack of bsb, bsb-native,
OCaml and esy pretty cool. However, I just don't find enough resources, good
tutorials, etc. via Google search. Is there a good set of beginner tutorials
anyone can point to? Thanks in advance.

~~~
davesnx
Documentation is a problem in the OCaml world, and a problem with Reason
Native as well. I found myself pretty lost sometimes. esy.sh should be the
initial point of contact for most Reason-related stuff.

Menhir, sedlex and others present a pretty high barrier to entry for
newcomers.

One of the nice things about all of it is the Discord; it's friendly and
always helpful.

Hope it helps, just let me know if there's anything specific!

~~~
smabie
Just ditch Reason and use OCaml. There's a lot more documentation and the
syntax is better.

------
skywhopper
This is cool, but I’m not sure it’s fair to claim it’s “faster” yet when it
doesn’t do 95% of what jq does, particularly the command-line options. If it’s
still faster when you can match 80% of the functionality, then it might be a
claim worth making.

~~~
davesnx
Exactly, I didn't claim to be faster in all cases, since there's no feature
parity and I won't make it that way.

For the set of operations that I've implemented, it's faster; that much is true.

------
YesThatTom2
Great! Now improve the syntax!

~~~
dividedbyzero
How, though? I agree that jq's syntax isn't exactly the most straightforward,
and it gets raised as a point of criticism anytime jq is mentioned, but its
scripting language seems like a pretty good compromise between compactness and
rich features.

Replacing that with, say, traditional command line flags would make it a lot
less useful for me, I'd probably have to build much longer pipe-chains to do
things that are relatively simple and readable jq snippets (if one knows the
syntax.)

Using an established scripting language in its place would make it pretty much
just python -c/ruby -e or whatever with some pre-loaded functions, but what's
the point? You can always just write a quick python/ruby/whatever script, jq
to me is an alternative for cases where a script feels unnecessary. It would
also mean everything gets more verbose, so less of my jq transformations can
be inlined without loss of readability.

Aligning it to more established languages would probably cause confusion as
well in those cases where it doesn't match the reference language 1:1. Looks
like javascript, writes like javascript, but only for a tiny subset of the
language, etc.

Doing this only for a few function names or syntax constructs still results in
a pretty unique and unusual language that will require people to reference the
docs a lot, just now lots of existing scripts break.

~~~
davesnx
Just because jq is very well established doesn't mean its APIs are well
designed, or that we shouldn't improve them because it would break existing
scripts.

There are a lot of quirks in its usage, and people struggle to learn such a
great tool, so query-json will try to offer a better interface for users.

------
brundolf
I'd love to hear some speculation - from the author or otherwise - as to why a
fresh OCaml implementation would so dramatically outperform a mature C
implementation

~~~
davesnx
There are a few good assumptions about why it's faster, but they're just
speculation since I haven't profiled jq or query-json.

The feature that I think penalizes jq a lot is "def functions", the ability
to define any function and have it available at run-time.

This creates a few layers; one of the differences is the interpreter and the
linker, which are responsible for gathering all the builtin functions and
compiling them so they're ready to use at runtime.

The other pain point is the architecture of the operations in jq, since
it's stack-based. In query-json, it's piped recursive operations.
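
To make "piped recursive operations" a bit more concrete, here is a toy sketch of that general style in Python: a query is a small tree, and evaluation is a recursive function in which a pipe feeds every output of its left side into its right side. This is purely illustrative. The tuple-based query tree and the names `evaluate`, `identity`, `key`, `each` and `pipe` are invented here and are not taken from query-json's or jq's source.

```python
def evaluate(query, value):
    """Recursively evaluate a tiny query tree, yielding zero or more results."""
    kind = query[0]
    if kind == "identity":      # like jq's `.`
        yield value
    elif kind == "key":         # like `.name`
        yield value[query[1]]
    elif kind == "each":        # like `.[]`
        yield from value
    elif kind == "pipe":        # like `left | right`
        _, left, right = query
        for intermediate in evaluate(left, value):
            yield from evaluate(right, intermediate)
    else:
        raise ValueError(f"unknown query node: {kind}")

# Roughly `.items | .[] | .name` in jq notation.
query = ("pipe", ("key", "items"), ("pipe", ("each",), ("key", "name")))
doc = {"items": [{"name": "a"}, {"name": "b"}]}
names = list(evaluate(query, doc))
print(names)
```

There is no bytecode, explicit stack, or link step in a design like this; whether that accounts for the measured difference is, as said above, unprofiled speculation.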

Aside from the code, there's the OCaml stack: Menhir has proven to be really
fast for creating these kinds of compilers.

I will dig more into performance and try to profile both tools in order to
improve mine.

Thanks

------
nikolay
JMESPath is the only viable alternative, which probably has a wider footprint
than even jq as it's part of AWS CLI.

~~~
acdha
It's definitely popular but “only viable alternative” is a bit strong: that's
only if you need compatibility with particular tools which support only one of
the two formats. There's no reason why anyone who doesn't like those tools
couldn't create a different syntax to scratch whatever particular itch they
have.

~~~
nikolay
It's embeddable and available as a library for all languages [0]. Everything
else is pretty much nothing but a CLI tool, which further limits its
adoption.

[0]: [https://github.com/jmespath](https://github.com/jmespath)

~~~
acdha
Well, there is XPath 3.1 if you want standards[1] but my point was simply that
it depends on whether your question is “I need compatibility with existing jq
scripts”, “I need an embeddable library I can integrate in other programs”, or
“I want to process JSON for my own usage”.

For example, someone who works with a lot of Python might prefer something
like
[https://github.com/kellyjonbrazil/jello](https://github.com/kellyjonbrazil/jello)
to write comprehensions using the full capabilities of Python, especially
since that would provide a direct path to using the final expressions in a
Python program or even embedded in one of the environments where Python is
used as a scripting language. Is that a viable alternative? The answer depends
entirely on who's asking.

[1]: [https://www.w3.org/TR/xpath-31/#id-introduction](https://www.w3.org/TR/xpath-31/#id-introduction)

------
tus88
Isn't JQ written in C? I doubt LISP is going to be faster.

~~~
davesnx
Yes, jq is written in C. Where does LISP come from?

------
yahyaheee
I’m with Q!

~~~
davesnx
+1

