
JMESPath – A query language for JSON - selrond
http://jmespath.org/
======
justin_oaks
I like JMESPath, but it has some serious limitations which prevent it from
being as general purpose as jq.

JMESPath limitations:

\- No simple if/else, but it is possible using a hack, documented below.

\- The to_number function doesn't support boolean values, but it is possible
using a hack, documented below.

\- Can't reference parents when doing iteration. Why? All options for
iteration, [*] and map, use the iterated item as the context for any
expression. There's no opportunity to get any other values in. It may be
possible for a fixed set of lengths, with something akin to the following
(except there is no syntax for switch or if statements):

    
    
      switch (length):
       case 1: [expression[0]]
       case 2: [expression[0], expression[1]]
       case 3: [expression[0], expression[1], expression[2]]
       ...
    

\- Key name can't come from an expression. Why? The ABNF for constructing
key-value pairs is given as: keyval-expr = identifier ":" expression. The key
is an identifier, which gives no possibility for making it an expression. No
functions modify keys in such a way as to allow using an expression as a key.

\- No basic math operations, add, multiply, divide, mod, etc. Why? Nobody
added those operators/functions.

\- There's a join, but no split.

\- No array indexing based on expression. Why? Indexing is done based on a
number or a slice expression, which also doesn't support expressions. Here's
the ABNF:

    
    
      bracket-specifier = "[" (number / "*" / slice-expression) "]" / "[]"
    

\- No ability to group_by an expression.

\- No ability to get the index of an element in a list
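
For contrast, here is a rough Python sketch of what several of the limitations above look like in a general-purpose language. The `order` document and its field names are made up for illustration; nothing here is JMESPath syntax.

```python
# Hypothetical sample document; field names are invented for this sketch.
order = {
    "currency": "USD",
    "items": [
        {"sku": "a", "price": 5, "qty": 2},
        {"sku": "b", "price": 7, "qty": 1},
    ],
}

# Referencing the parent (order) while iterating over its children (items).
with_currency = [
    {"sku": item["sku"], "currency": order["currency"]} for item in order["items"]
]

# Key names that come from an expression.
price_by_sku = {item["sku"]: item["price"] for item in order["items"]}

# Basic arithmetic on values.
total = sum(item["price"] * item["qty"] for item in order["items"])

# Index of an element in a list, then indexing by that computed value.
index_of_b = next(i for i, item in enumerate(order["items"]) if item["sku"] == "b")
second_item = order["items"][index_of_b]

# Splitting a string (the inverse of join).
parts = "a,b,c".split(",")
```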

Hacks:

Convert true/false to number:

    
    
      boolean_expression && `1` || `0`
    

If/else:

Option 1)

    
    
      [{q:CONDITIONAL_EXPRESSION, v:IF_RESULT_EXPRESSION},{q:!CONDITIONAL_EXPRESSION,v:ELSE_RESULT_EXPRESSION}][?q]|[0].v
    

Option 2)

    
    
      {"if":CONDITIONAL_EXPRESSION, "ctx":@} | [{q:if ,v:ctx.IF_EXPRESSION},{q:!if,v:ctx.ELSE_EXPRESSION}][?q]|[0].v
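
The boolean hack works because, per the JMESPath specification, only null, false, empty strings, empty arrays and empty objects count as false-like, so the literal `1` is always truth-like. The same `&&`/`||` shortcut is not a safe general if/else: it falls through to the else branch whenever the if result is itself false-like, which is presumably why the filter-based forms are needed. A small Python model of those rules follows (a sketch of the semantics, not the jmespath library; all function names are made up here):

```python
def is_falsy(value):
    """JMESPath false-like values: null, false, "", [] and {}. The number 0 is truth-like."""
    return value is None or value is False or value == "" or value == [] or value == {}

def jp_and(left, right):
    # `left && right`: the right side if left is truth-like, otherwise left.
    return left if is_falsy(left) else right

def jp_or(left, right):
    # `left || right`: left if it is truth-like, otherwise the right side.
    return right if is_falsy(left) else left

def bool_to_number(flag):
    # Models: flag && `1` || `0`
    return jp_or(jp_and(flag, 1), 0)

def naive_if_else(cond, if_value, else_value):
    # Models: cond && if_value || else_value (breaks when if_value is false-like)
    return jp_or(jp_and(cond, if_value), else_value)

def filter_if_else(cond, if_value, else_value):
    # Models: [{q: cond, v: if_value}, {q: !cond, v: else_value}][?q] | [0].v
    branches = [{"q": cond, "v": if_value}, {"q": is_falsy(cond), "v": else_value}]
    selected = [branch for branch in branches if not is_falsy(branch["q"])]
    return selected[0]["v"] if selected else None
```

With a true condition and an empty string as the if result, `naive_if_else` returns the else value, while `filter_if_else` returns the empty string as intended.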

------
gingerlime
I love JMESPath. I first discovered it when using the AWS CLI `--query`
option[0].

I then realized that using it in my code would make things much more
declarative and easy to grok than a bunch of maps, filters etc. Here's a real
example which I think illustrates it[1]. It has libraries for lots of
languages with a clear specification/compliance test[2].

The cherry on the top is the interactive query on the website. You can tweak
any of the examples (both queries and data) and get results instantly.
Extremely useful for playing around, building queries to work with JSON data
(webhooks, API responses etc)

[0]
[https://docs.aws.amazon.com/cli/latest/userguide/controlling...](https://docs.aws.amazon.com/cli/latest/userguide/controlling-
output.html)

[1]
[https://gist.github.com/gingerlime/757c7b4778c1ab68605dfce66...](https://gist.github.com/gingerlime/757c7b4778c1ab68605dfce66ceb8378)

[2] [http://jmespath.org/libraries.html](http://jmespath.org/libraries.html)

------
johnhenry
Besides this project's CLI, jp
([https://github.com/jmespath/jp](https://github.com/jmespath/jp)), I see jl
([https://github.com/chrisdone/jl](https://github.com/chrisdone/jl)) and jq
([https://github.com/stedolan/jq/](https://github.com/stedolan/jq/)) in the
comments. I wonder if anyone has had experience with all three (or even just
one) and can comment on their experiences?

~~~
ak217
jq is by far the best developed and has the most intuitive syntax, but it
doesn't have a formal spec for its language.

I have been maintaining
[https://github.com/kislyuk/yq](https://github.com/kislyuk/yq), which wraps jq
with a transcoder for YAML and XML.

~~~
cryptonector
Is there demand for a formal spec for jq? Would that lead to additional
implementations? Serious question.

~~~
llimllib
I would love to have a jq lib in every lang, which would probably require a
spec

~~~
cryptonector
That's fair.

------
xchaotic
The reinvention of XML in JSON is almost complete - JMESPath vs XPath, JSON
Schema vs XML Schema etc. If you need semi-structured data to that level,
consider using XML instead - you can validate it, there are plenty of tools,
it's very stable and mature etc.

~~~
bufferoverflow
We still need XSLT for JSON to complete the circle.

Though it seems like XSLT should support JSON:

[https://www.w3.org/TR/xslt-30/#json](https://www.w3.org/TR/xslt-30/#json)

~~~
avmich
XSLT (Extensible Stylesheet Language Transformations) is a language for
describing transformations of XML documents into other XML documents. Isn't
jq or JMESPath an example of such a language for transformations?

~~~
bshacklett
No, those are query languages akin to XPath.

~~~
avmich
What's the difference between query and transformation then? Isn't
transformation a (vector) function ^f of input ^x, with n dimensions of x and
m dimensions of f where f_i = f_i(x_1, ... x_n)?

~~~
Thiez
XPath can be used in Xslt, but not the other way round. In Xslt you can create
new elements, while in XPath you can't. Seems like a pretty big difference to
me. XPath isn't Turing complete either, while Xslt is.

------
crooked-v
The example on the front page seems equivalent to (and only marginally less
verbose than):

    
    
        locations
          .filter(l => l.state === 'WA')
          .map(l => l.name)
          .sort()
          .join(', ')

~~~
chatmasta
The benefit of a query language is that it can be described declaratively
(i.e. in a non-executable text file, perhaps within JSON itself), and then
programs written in any language can execute its query logic using a standard
interpreter written in that specific programming language.

So you get reusability of queries across the stack, in all languages that
implement a parser against the spec. Your example only provides re-usability
in JavaScript, and requires evaluating code at run-time so may not be suitable
for queries based on user-submitted data in multi-tenant environments.
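
A minimal sketch of this idea, assuming a made-up mini query format (not JMESPath): the query is plain JSON data, and each host language only needs a small interpreter for it, with no eval of user-supplied code. The sample data follows the shape implied by the JavaScript snippet above; the city names are invented sample values.

```python
import json

# Hypothetical query format for this sketch: filter, pick a field, sort, join.
QUERY_JSON = '{"from": "locations", "where": {"state": "WA"}, "select": "name", "join": ", "}'

def run_query(query, document):
    """Interpret the query description; nothing in it is ever executed as code."""
    rows = document[query["from"]]
    rows = [
        row for row in rows
        if all(row.get(key) == value for key, value in query["where"].items())
    ]
    values = sorted(row[query["select"]] for row in rows)
    return query["join"].join(values)

DATA = {
    "locations": [
        {"name": "Seattle", "state": "WA"},
        {"name": "New York", "state": "NY"},
        {"name": "Bellevue", "state": "WA"},
        {"name": "Olympia", "state": "WA"},
    ]
}

result = run_query(json.loads(QUERY_JSON), DATA)
```

Because the query is just a string of JSON, it can live in a config file or database row and be handed to an equivalent interpreter written in another language.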

~~~
steve_adams_86
I really appreciate this comment. I was trying to figure out why I wouldn't
use native data types and functions, but this makes it clear.

In your opinion, where would someone be storing the json such that they'd
benefit from a tool like this? The only time I use json outside of pulling it
from an API (where I can convert it to a native object) is probably storing it
in postgres, where I've already got json querying tools.

~~~
chatmasta
Some cases off the top of my head:

\- Infrastructure configuration stored in JSON. Query could reference other
JSON files, or the JSON file itself (loops would need to be considered).

\- Declarative reactive programming, e.g. platforms like IFTTT. You might want
to take certain actions based on data in a JSON post. The IFTTT GUI would
create JSON config files that its server side parsers can safely use without
eval'ing code to decide which action to take.

\- Adding conditional logic to jsonschema form generation. Recently I've built
a questionnaire renderer in react that renders forms based on jsonschema. The
user creates forms with a GUI, which compiles them to JSON, and then the
renderer knows how to render. Conditional logic (e.g. question B is required
if question A === true) can be quite limited when constrained to pure JSON.
Something like this could help with that.

The nice thing about declarative syntax is you can build a GUI to generate it,
so users never use the JSON itself, but you can store it in a database, safely
execute rules based on it, etc. without requiring programming from the user.

That said, there are usually better ways to accomplish this, like in pure JSON
for example. Mongo syntax achieved this, with declarative operators like
$or, $sum, etc., but it can be quite cumbersome.
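
To make the "pure JSON" option concrete, here is a rough sketch of a conditional rule written with Mongo-style operators and a tiny evaluator for it. It handles only a small, hypothetical subset (`$or`, `$and`, `$gt` and plain equality) and is not how MongoDB itself evaluates queries; the question names are invented.

```python
def matches(condition, answers):
    """Evaluate a small subset of Mongo-style operators against a dict of answers."""
    for key, expected in condition.items():
        if key == "$or":
            if not any(matches(sub, answers) for sub in expected):
                return False
        elif key == "$and":
            if not all(matches(sub, answers) for sub in expected):
                return False
        elif isinstance(expected, dict) and "$gt" in expected:
            if not (key in answers and answers[key] > expected["$gt"]):
                return False
        elif answers.get(key) != expected:
            # A plain {field: value} pair means equality.
            return False
    return True

# "Question B is required if question A is true or the given age is over 65."
RULE = {"$or": [{"question_a": True}, {"age": {"$gt": 65}}]}

question_b_required = matches(RULE, {"question_a": False, "age": 70})
```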

------
zero_intp
Thanks for taking the time to write this tool. Can you explain how it is
distinguished from jq?

------
maltalex
I want to add one more into the mix - Couchbase has something called "N1QL"
('nickel'), which is actual SQL adapted for JSON:

[https://www.couchbase.com/products/n1ql](https://www.couchbase.com/products/n1ql)

It's not standalone though, you need Couchbase to use it.

------
danbruc
Does this have any mathematical foundation like the relational algebra for
SQL? Or more generally, does a mathematical framework exist to treat this or
similar constructs and that goes beyond what relational algebra provides and
that, for example, also handles aggregate functions?

The reason I am asking is that I am currently trying to build a tool to
analyze a kind of time series data, think log file entries, in order to look
for anomalies and visualize them. I could of course just build all the
transformations I am interested in in an ad hoc fashion but it would be nice
to have a mathematical framework in order to start out with a small set of
basic operations and then compose those while having some guarantees about the
expressiveness of the basic operations and ideally also a rigorous
foundation for transforming them, for example for performance optimizations.

But so far I was unable to find something that seems fitting, everything I am
aware of is either too limited like relational algebra or way too general like
general functions. It feels like what I am looking for should exist but I am
unable to find it.

~~~
spraak
What reasons did you not go with SQL itself? I may not fully understand what
you're trying to do, but in any case it sounds really interesting.

~~~
danbruc
I never tried it but I am expecting the performance to be not good enough, it
takes already several minutes with code specifically written to perform the
calculations I am interested in. And because I don't know what exactly I am
looking for I need more or less interactive speed so that I can try out many
different ways to look at the data. But maybe I could use [materialized] views
to convey enough information to the query planner how to efficiently carry out
the calculations or maybe I am even underestimating how good query planners
are. I just have the gut feeling that performing a lot of aggregation will
make a database perform a lot of unnecessary work. But maybe I should and will
try loading the data into SQL Server and see what happens.

The other thing is that SQL seems not the best fit to me. Say you just want to
know how many events occurred in the last three months in any hour, that is
straightforward grouping and counting at first, but already rounding the
timestamps to an hour is not as obvious as it should be. But if there was no
event in a specific hour, your result will just have no row for that hour
instead of a row saying there were zero events in that hour. This in turn will
cause more trouble if you want to build a histogram showing in how many hours
there were say 0 to 9, 10 to 19, 20 to 29, and so on events. Certainly still
doable with SQL but we are already entering the territory where writing a
single query will take most people several hours to get the desired result.
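
The zero-filling and histogram steps described here are short in a general-purpose language. A sketch in Python, assuming the events are available as a list of `datetime` objects (the function names and the sample window are illustrative):

```python
from collections import Counter
from datetime import datetime, timedelta

def hourly_counts(timestamps, start, end):
    """Count events per hour in [start, end), including hours with zero events."""
    # Round every timestamp down to the start of its hour.
    per_hour = Counter(ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps)
    counts = {}
    hour = start.replace(minute=0, second=0, microsecond=0)
    while hour < end:
        counts[hour] = per_hour.get(hour, 0)  # a missing hour becomes an explicit 0
        hour += timedelta(hours=1)
    return counts

def histogram(counts, bucket_size=10):
    """Map each bucket's lower bound (0, 10, 20, ...) to the number of hours in it."""
    return Counter((n // bucket_size) * bucket_size for n in counts.values())

events = [datetime(2024, 1, 1, 0, 5), datetime(2024, 1, 1, 0, 40), datetime(2024, 1, 1, 2, 15)]
counts = hourly_counts(events, datetime(2024, 1, 1, 0), datetime(2024, 1, 1, 4))
buckets = histogram(counts)
```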

I also couldn't easily tell how to express calculating the 99th percentile of
the event size for every day of the week and hour of the day. I am pretty sure
it is possible but I guess it would also be pretty unreadable unless you put
in quite a bit of effort to create utility functions instead of hacking
together one huge SQL statement. Then again I don't really know much about the
more recent SQL features for partitioning and aggregating, maybe I should have
a closer look at that first.
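
One way to express the percentile question outside SQL, as a rough sketch: group event sizes by (weekday, hour) and take a nearest-rank 99th percentile per group. The `(timestamp, size)` input shape and the nearest-rank definition are assumptions; other percentile definitions interpolate and give slightly different values.

```python
import math
from collections import defaultdict
from datetime import datetime

def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def p99_by_weekday_hour(events):
    """events: iterable of (datetime, size). Keys are (weekday, hour), with Monday == 0."""
    groups = defaultdict(list)
    for timestamp, size in events:
        groups[(timestamp.weekday(), timestamp.hour)].append(size)
    return {key: percentile(sizes, 99) for key, sizes in groups.items()}

sample = [
    (datetime(2024, 1, 1, 9, 0), 10),   # a Monday
    (datetime(2024, 1, 1, 9, 30), 50),
    (datetime(2024, 1, 8, 9, 10), 20),  # the following Monday
    (datetime(2024, 1, 2, 14, 0), 5),   # a Tuesday
]
p99 = p99_by_weekday_hour(sample)
```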

~~~
spraak
Is this for an open source project, or anything you'll be publishing? I'd be
interested to follow on with the results!

~~~
danbruc
Right now it is just an effort to develop a tool to diagnose and hopefully
thereafter fix random performance problems we are experiencing with one of our
applications in production. Despite having a small team dedicated to
investigating the problems, monitoring every click and function call with
Dynatrace, having had a Microsoft SQL server expert look into it, and getting
the system audited by one of the big consulting companies, the problem
has persisted for years and nobody really has any clue about what is going wrong.

The performance is never really great, it is [one of] the central applications
of the company and depends on the interaction with a sizable chunk of the
system landscape developed over decades and therefore it is prone to be
affected by incidents in a lot of systems but most of the time it is good
enough. But once every couple of weeks or months something goes badly wrong
and requests (it's a web application) start taking several seconds or even
minutes to complete. Minutes later everything is back to normal.

But I digress. If I managed to come up with a reusable and somewhat
general tool to analyze data similar to what I am looking at, I would consider
releasing it. It could either be a somewhat general data analysis and
visualization tool, think R, or it could be more specifically tailored towards
looking for anomalies in data sets like the one I am investigating. But as of
now I am struggling to come up with a general framework to express the
analyses I am performing and therefore all I have is a rather ad hoc
collection of transformations that extract and visualize aspects of the data
that could lead to new insights into what is going on.

But right now it is really driven by our specific issue, I notice something in
one view of the data and then come up with a new transformation to look at it
in more detail or from a different angle. It is nothing that could easily be
reused by anyone else and so for the moment it seems most likely that this
will never become public or maybe only in the form of a blog article
explaining what kind of information might be useful to look at and how to
derive it from logs that look rather uninteresting at first glance.

~~~
spraak
Aha, well thanks for sharing this far :)

------
jjuhl
See also [https://stedolan.github.io/jq/](https://stedolan.github.io/jq/)

------
wooby
A worthwhile alternative to this approach (a JSON-specific query language) is
a language for converting JSON structures to newline-delimited records. Then,
standard shell tools can be used to query and join:
[https://github.com/micha/json-table](https://github.com/micha/json-table)
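
The general idea can be sketched in a few lines of Python. This is not json-table's own syntax or behavior, just an illustration of flattening a JSON array of objects into one delimited line per record, which tools such as grep, sort, join or awk can then consume; the function name and sample data are made up.

```python
import json

def to_records(rows, columns, sep="\t"):
    """Render a list of JSON objects as one delimited line per object."""
    return [sep.join(str(row.get(col, "")) for col in columns) for row in rows]

doc = json.loads('[{"name": "Seattle", "state": "WA"}, {"name": "New York", "state": "NY"}]')
lines = to_records(doc, ["name", "state"])
```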

------
rozzie
For those interested in arbitrarily transforming JSON objects (for example, in
a communications pipeline) I’d recommend JSONata. It’s quite useful and we’re
well along in a Golang port with $function extensibility.
[http://jsonata.org](http://jsonata.org)

~~~
cookiecaper
Agreed. I like JSONata a lot, even though it's the dark horse among JSON
traversal languages. I've had a good experience parsing semi-unstable JSON
with it.

------
dmoreno
There is also JSON Pointer
([https://tools.ietf.org/html/rfc6901](https://tools.ietf.org/html/rfc6901))
from IETF.

Very simple standard and easy to implement, but not as powerful as jmespath
nor jq.
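
To give a sense of how little there is to implement, here is a sketch of RFC 6901 pointer resolution in Python. It covers lookup only (no "-" past-the-end handling and no URI fragment form), and the function name is made up; the sample document follows the example in the RFC.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON (dicts and lists)."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw in pointer[1:].split("/"):
        # Unescape "~1" before "~0" so that "~01" decodes to "~1", not "/".
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            # Array indexes are "0" or digits without a leading zero.
            if not (token.isascii() and token.isdigit()) or (len(token) > 1 and token[0] == "0"):
                raise KeyError(f"invalid array index: {token!r}")
            current = current[int(token)]
        elif isinstance(current, dict):
            current = current[token]
        else:
            raise KeyError(f"cannot descend into a scalar with {token!r}")
    return current

SAMPLE = {"foo": ["bar", "baz"], "": 0, "a/b": 1, "m~n": 8}
first_foo = resolve_pointer(SAMPLE, "/foo/0")
```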

~~~
bringtheaction
To say that it’s “from IETF” is sort of a misnomer I think. RFCs are submitted
_to_ IETF by others, not by IETF itself.

------
nimish
jq exists, is fast, and works well. Is this compatible?

~~~
fiddlerwoaroof
No, and I generally like jq’s syntactic choices better.

------
maxsavin
This looks interesting - but doesn't MongoDB basically achieve the same
effect? I kind of prefer MongoDB because you query JSON with JSON - but I'm
open to changing my mind :)

~~~
skywhopper
If you're using MongoDB already, then sure use MongoDB's query tools. But if
you are just working with raw JSON from a potential variety of sources, or in
a streaming context, then you need something more in-place and general-
purpose, which this appears to be.

------
avoidwork
why choose this over jsonpath (like xmlpath) or jq?

------
nurettin
I already have a query language for json. I insert json into mssql 2017
community edition and query it there. [https://docs.microsoft.com/en-
us/sql/relational-databases/js...](https://docs.microsoft.com/en-
us/sql/relational-databases/json/json-data-sql-server)

------
syats
One more alternative to many listed here is SPARQL-Generate. A single query
language that works for XML and JSON, and has syntax borrowed from SPARQL.

[https://ci.mines-stetienne.fr/sparql-
generate/playground.htm...](https://ci.mines-stetienne.fr/sparql-
generate/playground.html)

------
xonix
My tiny lib with very similar functionality: [1]. The query syntax is slightly
different though. Also I decided to re-use JS for evaluation of sub-
expressions instead of implementing own full-fledged parser.

[1] [https://github.com/xonixx/jsqry](https://github.com/xonixx/jsqry)

~~~
lioeters
I love libraries like this, which are small enough to be read in one sitting. I
can scan through and get a general understanding of everything that it does.

The "evaluation of sub-expressions" made me curious. This line:

    
    
       token.func = Function('_,i,args', 'return ' + token.val);
    

..could be a potential security issue with user-submitted expressions?

~~~
xonix
Thanks. I doubt this could be a security issue. Typical usage like so

    
    
        var name = one(users, '[_.id==?].name', 123)
    

uses parameterized queries, same idea as with SQL to eliminate injections.
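
The SQL side of that analogy can be sketched with Python's built-in sqlite3 module. This only illustrates why bound parameters stay data instead of becoming query text; it says nothing about how jsqry itself is implemented, and the table and function names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(123, "alice"), (456, "bob")])

def find_names_unsafe(user_input):
    # String concatenation: the input becomes part of the query text itself.
    query = "SELECT name FROM users WHERE id = " + user_input
    return sorted(row[0] for row in conn.execute(query))

def find_names_safe(user_input):
    # Bound parameter: the input is only ever treated as a value.
    query = "SELECT name FROM users WHERE id = ?"
    return sorted(row[0] for row in conn.execute(query, (user_input,)))

malicious = "123 OR 1=1"
leaked = find_names_unsafe(malicious)   # the OR clause is executed as SQL
blocked = find_names_safe(malicious)    # compared as a plain string, matches nothing
```

Parameters protect the values only; the expression text itself would still need to come from a trusted source.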

~~~
lioeters
I see, I should have dug deeper before commenting. Wow, parameterized queries,
there's been a lot of thought put into this compact library!

------
octref
I built this VS Code plugin to convert JSON interactively using JMESPath:
[https://marketplace.visualstudio.com/items?itemName=octref.v...](https://marketplace.visualstudio.com/items?itemName=octref.vscode-
json-transform)

Might be useful if you are testing API or playing with JSON data.

~~~
selrond
Well actually - your plugin has led me to find JMESPath and post the link to
HN :D

------
danvk
I'm sad that JSONSelect
([https://github.com/lloyd/JSONSelect](https://github.com/lloyd/JSONSelect))
never caught on. It uses CSS selectors to query JSON, which has the nice side
effect that learning to use it improves your CSS as well!

------
jnordwick
I remember when xml started down this road too. "We aren't going to be sgml,
but just a lightweight markup." Xpath, xslt, etc. And now we have xml today,
the modern day sgml.

But this time it's different?

------
hoppelhase
There is also JsonPath.

[http://goessner.net/articles/JsonPath/](http://goessner.net/articles/JsonPath/)

------
ex3ndr
Does anyone know of a nice human-friendly search language like the one in
Jira, Slack, etc?

------
thrownaway954
the second I looked at the example on the homepage and saw this:

sort(@)

I'm like nope! What is this "@" symbol? Why can't that be "name"? I'm already
passing judgment that this library will be a nightmare to use which isn't
good.

Now I know I can read the docs and eventually figure out what I can pass to
the sort expression and what it all means; however, this is an issue I come
across more and more with new libraries in programming... show simple
examples, not "smart" or complicated ones. I shouldn't have to read through
docs to try to decipher an introductory example. There is a reason every
programming language starts with "Hello World".

~~~
Terretta
This is not that new. And it happens to be the JSON query language _built in_
to the AWS CLI tools.

For the last 3-4 years I've seen lots of AWS CLI examples piping through jq.
That’s an extra dependency that’s not necessary when this is built in.

Here’s the author’s idea of intro materials (posted Jan 2015):

[http://jamesls.com/how-to-easily-explore-jmespath-on-the-
com...](http://jamesls.com/how-to-easily-explore-jmespath-on-the-command-
line.html)

I’ve found him very responsive on bugs and (provably useful) feature ideas.

------
blattimwind
I'm rather disappointed it's not called JPath.

~~~
Ardren
Unsurprisingly there are already multiple projects called JPath

------
medleybron
this, combined with GraphQL would be awesome

------
adamretter
So no mention of JSONiq which came before JMESPath?

------
eli
May also be interested in the `jq` CLI, which on first glance appears to use a
similar but not identical query language.
[https://stedolan.github.io/jq/](https://stedolan.github.io/jq/)

~~~
donpdonp
seconded. the jq language is surprisingly powerful and at first glance at
jmespath, the syntax is similar.

~~~
shabble
I think the equivalent jq query would be something like:

    
    
        [.locations[] | select(.state == "WA").name] | sort | join(", ") | { WashingtonCities: . }

