
Fat JSON - cedricr
https://www.tbray.org/ongoing/When/201x/2014/05/05/Fat-JSON
======
malgorithms
An update on Keybase, since it was chosen as the example. The API now supports
field declarations. For example, these all work:

    
    
       https://keybase.io/_/api/1.0/user/lookup.json?username=chris&fields=basics,pictures
       https://keybase.io/_/api/1.0/user/lookup.json?username=chris&fields=basics,pictures,profile
       https://keybase.io/_/api/1.0/user/lookup.json?username=chris&fields=basics,public_keys
    

This change was intended from the beginning, but we are still in alpha. It was
a quick addition, so the HN post made it timely. The nice thing is that this
isn't just a client-side convenience: loading a user's info is modular on the
server, so the examples above should be faster and less work for our server
than getting everything about someone.

~~~
malgorithms
I should add that we're not fundamentally opposed to some kind of query
language in the API requests, but most of our API objects lend themselves
pretty well to just passing a list of fields you want. The above technique is
very simple and seems to work well.

Aside, it also has cross-origin support (as does search), just added for
anyone who wants to interface with Keybase on another web page, in the front
end.

------
drachel
I'm a little surprised not to hear any mention of JSON Pointer (RFC 6901). It
deals with exactly this:

[http://tools.ietf.org/html/rfc6901](http://tools.ietf.org/html/rfc6901)

With JSON Pointer syntax, it would be something like:

    
    
        JWalk.getStringPtr(user, "/them/public_keys/primary/bundle");
    

It's concise, complete/unambiguous, and has implementations in a growing
number of environments, so I think it could be worth mentioning as an
approach. It also defines a useful URL fragment syntax for referencing nodes
within documents, which would be a good thing for the JSON world.
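
For illustration, a minimal RFC 6901 resolver is only a few lines. This is a
sketch, not a production implementation; real libraries also handle the `-`
array token and report errors more precisely.

```python
def json_pointer(doc, pointer):
    """Resolve an RFC 6901 JSON Pointer against a parsed JSON document."""
    if pointer == "":
        return doc  # "" points at the whole document
    if not pointer.startswith("/"):
        raise ValueError("pointer must start with '/'")
    for token in pointer[1:].split("/"):
        # RFC 6901 escapes: ~1 -> "/" then ~0 -> "~", in that order
        token = token.replace("~1", "/").replace("~0", "~")
        doc = doc[int(token)] if isinstance(doc, list) else doc[token]
    return doc

user = {"them": {"public_keys": {"primary": {"bundle": "PGP..."}}}}
json_pointer(user, "/them/public_keys/primary/bundle")  # -> "PGP..."
```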

~~~
timbray
Oh dear, I’ve probably hurt some feelings for having missed 6901. Interesting
that when I googled around for JSON analogues of XPath, I didn’t turn that up.
Having said that, it’s not obvious that _any_ syntax is a win; a list of
strings is an excellent selector that hits an 80/20 point & meets all my
needs.

------
chriswarbo
I make heavy use of a similar thing in PHP. I have a function "lookup" which
lets me say:

    
    
        lookup($foo, array('bar', 'baz', 15, 'quux'))
    

This is equivalent to any of the following:

    
    
        $foo->bar->baz[15]->quux
        $foo->bar->baz[15]['quux']
        $foo->bar['baz'][15]->quux
        $foo->bar['baz'][15]['quux']
        $foo['bar']->baz[15]->quux
        ... and so on
    

It's useful in Drupal when the required data is often at the end of a long
chain of objects-containing-arrays-containing-objects-....

I've been a bit naughty with its types for the sake of convenience: if a non-
array is given as the second argument, it's wrapped as a singleton array, ie.

    
    
        lookup(array('foo' => 'bar'), 'foo') === 'bar'
    

Also if any of the components aren't found, it returns NULL, ie.

    
    
        lookup(array('foo' => NULL), 'foo') === lookup(array(), 'foo')
    

It would be theoretically better to return an empty array on error and a
singleton array on success, since they can be distinguished, but in practice
that's so far down the list of problems with PHP that it's not worth the
effort of adding such wrappers at the moment.
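
A rough Python equivalent of the semantics described above (singleton wrapping
of a bare key, NULL on any missing component); the function and data are
illustrative, not a port of the actual PHP code.

```python
def lookup(data, path, default=None):
    """Walk a mixed nest of dicts/lists by a path of keys and indices.

    A bare (non-list) path is wrapped as a singleton, and any missing
    component short-circuits to `default` (None), mirroring the PHP
    lookup() described in the comment above.
    """
    if not isinstance(path, (list, tuple)):
        path = [path]  # singleton-wrap convenience
    for step in path:
        try:
            data = data[step]
        except (KeyError, IndexError, TypeError):
            return default
    return data

lookup({"foo": "bar"}, "foo")                                   # "bar"
lookup({"bar": {"baz": [None, {"quux": 1}]}},
       ["bar", "baz", 1, "quux"])                               # 1
lookup({}, "foo")                                               # None
```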

A really nice thing about this function is that it curries nicely:

    
    
        $find_in_foo = partially_apply('lookup', $foo)
    
        $get_blah = partially_apply(flip('lookup'), 'blah')
    
        $find_in_foo('x') === lookup($foo, 'x')
    
        $get_blah($foo) === lookup($foo, 'blah')
    

Actually, my argument-flipping function is already curried, so I can just say:

    
    
        $get_blah = flip('lookup', $foo)
    

This currying is great for filtering, mapping, etc.

~~~
Jgrubb
Hi, Drupal dev here. Got to the point yesterday afternoon where I'm going to
have to implement exactly what you've already done in order to put out some
reasonable JSON from a node_load_multiple call. Don't suppose you have this
function in a gist anywhere, do you? Please?

~~~
chriswarbo
Here's the definition and a few tests.

[https://gist.github.com/Warbo/9d8425fcdd7c026c795a](https://gist.github.com/Warbo/9d8425fcdd7c026c795a)

The SimpleTest test class probably doesn't work as-is, since it is originally
derived from our own class hierarchy, but shouldn't be too hard to fix.

------
jerf
Honestly, what's happening here is that you've hit the limit of JSON. JSON
trades fairly significant verbosity for ease-of-use... when it stops being
easy-to-use, well, you've stopped needing JSON. While you may be forced to
hack around it, I'm not sure I'd spend _too_ much time trying to figure out
how to be principled in that hacking, because hacking it shall ever be.

JSON's nifty and convenient, but it's _huge_... with JSON from "the wild" I
often find it gzips by a factor of 16. And that's just gzip, which isn't even
the best at this sort of thing. If the API provides only a vague question that
you can ask it, and it hands you back a huge chunk of very fluffily-serialized
data, well... in a lot of ways you've already lost, twice (once for fluffy
serialization and once for a presumably-foreign API giving you too much data).

~~~
danielweber
Not too long ago on a personal Android app I was dealing with huge JSON
strings that would take 2 seconds to decode on my test phone. I started
cheating by plucking substrings out of the response and parsing them, cutting
the parse time down by a factor of 10. Thankfully the server never changed the
order of their response fields.
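
A sketch of the kind of substring plucking described, shown only to illustrate
how the trick works and why it is fragile: it handles only top-level string
fields and assumes the server never reformats its output. The field names are
invented for the demo.

```python
import json
import re

def pluck(raw, field):
    """Extract one top-level string field from raw JSON text without
    parsing the whole document. Deliberately naive: breaks on nesting,
    non-string values, and any change in the server's formatting."""
    m = re.search(r'"%s"\s*:\s*("(?:[^"\\]|\\.)*")' % re.escape(field), raw)
    return json.loads(m.group(1)) if m else None

raw = '{"name": "ada", "huge_blob": "..."}'
pluck(raw, "name")  # -> "ada", without decoding huge_blob
```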

~~~
capisce
Sounds robust... Doesn't that defeat the purpose of JSON?

~~~
danielweber
Hence why I called it "cheating" and I was "thankful" that the server never
re-ordered fields (since proper JSON fields have absolutely no guarantee on
order).

------
skywhopper
I feel like Tim (like a lot of people connected to the web standards community
and systems-oriented folks in general) rushes too quickly to thinking about
how to standardize and generalize an approach to breaking down large JSON
documents like this. But this isn't a standards failure, it's just an API
failure. APIs will always need to be refined in various ways after they're
released. A "standard" method of asking the server to chop up a JSON document
for you would solve this particular problem, at the expense of creating a lot
more work on the server side (likely a new layer of abstraction), and there's
a limit to how useful that is. Versus just tweaking the API to make it more
focused and flexible is a process that's _always_ going to be necessary, no
matter what JPath/JWalk/etc standards are developed.

~~~
dyoder
> ...it's just an API failure

This. As far as the bandwidth issue goes, effective use of caching and
compression can go a long way. Schemes varying the responses via the URL
compromise schema validation based on media type (see JSON Schema, for
example). Specific cases where fat or thin requests truly make a difference
can be addressed with additional media types.

------
ig1
I had this exact problem in Python and wanted to be able to use dot-notation
(i.e "book.metadata.title") to query the structure (like MongoDB, etc do) so
built my own library for doing it:

[https://github.com/imranghory/pson](https://github.com/imranghory/pson)

(Also available from pypi via "pip install pson")

~~~
icebraining

      class MyDict(object):
        def __init__(self, d):
          self.__dict__.update(d)  # dicts have update(), not extend()
        def __repr__(self):
          return repr(self.__dict__)
    
      import json
      from operator import attrgetter
      r = json.loads('{"body": {"translations": [{"tr...', object_hook=MyDict)
    
      >>> r.header.title
      u'Hello World'
    
      >>> r.body.translations[1]
      {u'translation': u'guten tag', u'language': u'german'}
    
      >>> map(attrgetter('language'), r.body.translations)
      [u'french', u'german']

~~~
ig1
The code in the library is actually pretty simple, but deals with many
practical edge-cases (stepping into an array, handling missing values
gracefully, etc.) that the above code doesn't.

My longer term intention is adding other useful json manipulation and search
functions to the library.
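
A sketch of what such a dot-notation lookup has to handle; this is
illustrative code, not pson's actual implementation.

```python
def get_path(data, path, default=None):
    """Dot-notation lookup in the spirit of "book.metadata.title":
    walks nested dicts, steps into lists when the component is numeric,
    and returns `default` gracefully on any missing value."""
    for part in path.split("."):
        try:
            if isinstance(data, list):
                data = data[int(part)]
            else:
                data = data[part]
        except (KeyError, IndexError, ValueError, TypeError):
            return default
    return data

doc = {"book": {"authors": [{"name": "ada"}]}}
get_path(doc, "book.authors.0.name")  # -> "ada"
get_path(doc, "book.missing.title")   # -> None
```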

~~~
icebraining
Yeah. I just don't like putting syntax in a string; I'd rather use Python
itself.

By the way, you should remove the pprint import from the library, since you
don't use it :)

~~~
ig1
I considered that approach but found the JSON returned by some third-party
apis contained characters in keys (dashes, spaces, etc) which have special
meanings in python so can't be used natively.

------
CyberShadow
Somewhat related, jq is a command-line utility which allows filtering JSON
data, and uses its own path/filter syntax:

[http://stedolan.github.io/jq/](http://stedolan.github.io/jq/)

~~~
twic
Quite related to this somewhat related thing, jgrep also does filtering and
selection on JSON data:

[http://jgrep.org/](http://jgrep.org/)

~~~
sprobertson
Somewhat in relation to these things that are related, pipeline is a DSL for
manipulating JSON in a Unixy manner.

[http://github.com/qnectar/pipeline](http://github.com/qnectar/pipeline)

------
tel
This is immediately solved by Lenses

    
    
        _Object . key "them" . key "public_keys" . key "primary" . key "bundle"
    

which you might complain saying that this is yet another idiosyncratic method
of traversing data types. In fact this whole thread is full of examples of
other idiosyncratic methods of traversing (JSON only) data types.

Except lenses aren't. Lenses are _highly principled_ and compose and combine
in ridiculous ways while maintaining their exact behavior. Half the features
proposed in this thread exist naturally and "emergently" by the nature of
lenses. Finally, they follow mathematical laws describing how everything
should work together all to the T.

At the highest level a Lens is a getter plus a setter bound together. You can
extract either component to get or set a subpart of a value.

At the next level, you can take note that lenses "compose" (as a category) by
letting you reach deeper and deeper into a structure. This is what I took
advantage of with the JSON example above: composing them with `(.)`.
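
To make "a getter plus a setter bound together" and composition concrete
outside Haskell, here is a toy lens in Python. The `Lens` class, the `/`
composition operator, and the `key` helper are inventions for this demo; real
lens libraries are far richer and law-checked.

```python
class Lens:
    """A getter and a setter bundled together."""
    def __init__(self, get, put):
        self.get = get  # whole -> part
        self.put = put  # (whole, new part) -> new whole

    def __truediv__(self, other):
        """Compose: (self / other) focuses deeper into the structure."""
        return Lens(
            lambda w: other.get(self.get(w)),
            lambda w, p: self.put(w, other.put(self.get(w), p)),
        )

def key(k):
    """Lens focusing one dict key, rebuilding the dict immutably on set."""
    return Lens(lambda d: d[k], lambda d, v: {**d, k: v})

bundle = key("them") / key("public_keys") / key("primary") / key("bundle")
user = {"them": {"public_keys": {"primary": {"bundle": "old"}}}}
bundle.get(user)          # read deep inside the structure
bundle.put(user, "new")   # rebuilt copy with only the bundle replaced
```

Note that composing lenses composes both halves at once: the setter of the
composite rebuilds every layer along the path without mutating the original.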

At the next level, lenses generalize naturally to "traversals" which target
multiple subparts all at once and "folds" which build exotic "getters" over
multiple targeted subparts.

At the next level, lenses "dualize" to prisms which deal with branching types.
This is hard to explain if you haven't used a language with a true sum type
(Scala, Haskell, ML, and lets not get into lazy/strict sums) but I used one
above to traverse into the "Object" indicating failure if my assumption of the
structure of the JSON blob were wrong.

At the next level you generalize these into isomorphisms which have nice
mathematical properties determining when two types are identical by giving you
invertible mappings between the two. This is like a lens which focuses on the
entire type as its "subpart".

([http://hackage.haskell.org/package/lens](http://hackage.haskell.org/package/lens))

\---

At the end of the day there are even more steps in that hierarchy. It sounds
ridiculously complex, and it is a little bit to learn. The advantage is,
however, that the intuition of "focusing on a subpart" applies over any data
type and with pretty near any weird combination of "lens-like" operators you
can dream up.

\---

Usually when someone first hears about lenses they say "Oh, getters and
setters. I have those already, no big deal". That's really far from the case,
however... Lenses end up being the XPath of _everything_.

~~~
jarrett
Haskell's Lens package (linked to above) is indeed nice. For a practical
quickstart guide, I'd recommend the Github page over the Hackage page:

[https://github.com/ekmett/lens#lens-lenses-folds-and-traversals](https://github.com/ekmett/lens#lens-lenses-folds-and-traversals)

~~~
tel
Apologies for the self-plug, but I tried to write up a "friendly" intro to the
Haskell lens library as well

[https://www.fpcomplete.com/school/to-infinity-and-beyond/pick-of-the-week/a-little-lens-starter-tutorial](https://www.fpcomplete.com/school/to-infinity-and-beyond/pick-of-the-week/a-little-lens-starter-tutorial)

------
Zelphyr
I just don't see how

    
    
      JWalk.getString(user, "them", "public_keys", "primary", "bundle");
    

is better than

    
    
      user.them.public_keys.primary.bundle
    

What am I missing?

~~~
vinkelhake
That JWalk library is in Java so there won't be something you can use
dynamically like that.

~~~
jayd16
You can use GSON and just parse your objects into POJOs and access them with
dot notation.

In fact, now that I think about it, I wonder if you can rig up your POJOs as
Java 8 Option<> and remove some of the null checking. I'll have to look into
that.

~~~
tel
That's a good place to start. Basing it on the list monad instead of the
option monad gives you traversals with monoidal summaries.

------
malgorithms
Happy to see one of our Keybase API responses used as the example here. I can
share our intentions with that API call in the long run, and how it'll end up
smaller, which is especially important for mobile devices.

The dictionary describing the user you requested comes back with some high
level sub-dictionaries in it: "profile", "basics", "public keys", "pictures",
etc., and the API replies, currently, with _all_ the data to which you're
entitled. This is quite huge as Tim mentioned in his post.

Server-side we have a module for loading user information, which allows you to
request which of these fields you want when loading a "user object". For
example, on a certain page of the site we might load a dozen users but only
need [user.BASICS, user.PICTURES], so that's the only data that will be
loaded.

The API simply doesn't expose this filtering mechanism yet, as calls to the
API hit [user.ALL], which is a combo of everything available about a user. So,
in essence, all we need to do to trim these down is allow you to pass a
filtering parameter.

Btw, going in the other direction, there'll be a way to query for multiple
users at once, fattening things back up :-). So for example you could request
just basics + pictures for an array of usernames or userids.

------
heterogenic
Help me out here HN... I remember _somewhere_ seeing an API format where you
passed up the empty JSON object which you wanted filled and returned.

Something like:

    
    
      (request)
      {username:"",address:"",credits:""}
    
      (response)  
      {username:"Manilow, Barry", address:"Hollywood Bowl", credits: 99}  
      

Clearly not optimal, but it worked pretty well and was very intuitive.
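
A sketch of how a server might honor that fill-in-the-template contract; the
record and field names here are invented for the demo.

```python
def fill_template(template, record):
    """Copy into the template only the fields it asks for; anything the
    client didn't request (e.g. sensitive fields) is never sent."""
    return {k: record.get(k, template[k]) for k in template}

record = {"username": "Manilow, Barry", "address": "Hollywood Bowl",
          "credits": 99, "ssn": "do-not-send"}

fill_template({"username": "", "address": "", "credits": ""}, record)
# only username, address, and credits come back; ssn is omitted
```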

~~~
junto
I think you might be looking for this:
[https://news.ycombinator.com/item?id=7681973](https://news.ycombinator.com/item?id=7681973)

~~~
heterogenic
Not it, but it does about the same thing. Thanks!

------
iSnow
If you generate your JSON with Java using Jackson, it offers Jackson-Views, a
very nifty way to define sets of JSON properties according to use case.

* [http://wiki.fasterxml.com/JacksonJsonViews](http://wiki.fasterxml.com/JacksonJsonViews)

* [http://techtraits.com/programming/2011/08/12/implementing-ja...](http://techtraits.com/programming/2011/08/12/implementing-jackson-views/)

~~~
ejain
Neat! Jackson also has a tree data model that can be used like so:

    
    
        user.path("them").path("public_keys").path("primary").path("bundle").textValue(); // null if any node is missing

------
bananas
So we're ending up with JSON with schema and query support.

In a couple of years: XML is the new best thing (as people finally realise
that they've needed it all along).

~~~
mantrax5
It's easy to argue with strawmen like you do, but query and schema support was
never the problem of XML.

The problem was that XML is a markup language, and JSON is an object notation.
It's right in the name. And this results in different tradeoffs for both.

The XML schema was designed to create schemas for documents. This is why
different types of schemas evolved specifically for services (like SOAP) but
having to be slapped on top of a markup language base, it couldn't be good
even though it tried. JSON doesn't have that initial complexity and the goal
matches the use.

Just like it'd be silly to write a manual in JSON, it'll forever remain silly
to serialize generic object structures in XML.

Now I could also argue XML is a mediocre markup language, and show
alternatives, but as they say, that's a whole 'nother story.

~~~
WorldWideWayne
You're right. It's so much sillier to do this:

    
    
        <fruit id="10" name="orange"/>
    

Than this:

    
    
        {"fruit": {"id":10, "name":"orange"}}
    

because the first one is called a "markup language" and the second one is
called an "object notation". I'm not buying it.

~~~
slig
It can be done like this too:

    
    
        <fruit> <id>10</id> <name>orange</name> </fruit>
    

Which one is better? I find it hard to decide, and I believe most people do.
And that's why we see it mixed, often in the same XML document.

~~~
bananas
The one you use is better (correct).

~~~
spopejoy
Utterly disagree. Attributes can actually have validated contents, such as
enumerated lists, etc, and are attractively terse.

Over-reliance on elements are why Maven pom files are such a verbose disaster,
and probably the main reason why web developers puke when trying to stream
data. Restating element names make for illegible, bloated data. Attribute-
heavy XML is attractively terse and benefits from validation (unlike JSON).

~~~
bananas
But you can't change an attribute to a composite type in the future easily.

As for maven POMs, I use Netbeans "add dependency" and that's about it so it's
a non issue for me.

------
justizin
THANK YOU.

I've been raising cain about this in the chef community for some time - node
objects can easily be as large as 128kb+ of json, which can consume over 1-2MB
once parsed into a ruby json object. An empty search of a system with over a
thousand nodes can consume over a gigabyte of ram!

The worst case I experienced this was writing out an /etc/hosts file, for
which you only need two fields : name and ip, of each host, but you still get
a list of every cpu core, dimm, etc..

Excited to see the potential examples; I might try to work one into Chef if I
get time. The Chef solution has been to have 'whitelisted attributes', which
is a whole mess unto itself.

------
jhh
Once you have used a parsing library to create generic data structures (in
your programming language) from your JSON all of this no longer has anything
to do with JSON, right? That's something that confuses me about this blog
post. To me it seems that it talks about a very generic issue in very specific
terms.

~~~
sk5t
Agreed. In something like JS (or dynamic C#) one could simply access
obj.results[0].address.street, which casts the object nesting as useful/self-
documenting rather than needlessly complex.

------
drostie
Given that someone uses fat JSON, it seems plausible that you'll have to face
those sorts of problems (either simple selectors with logic for potentially
dealing with multiple responses, or complex selectors into the tree). What
you're really saying in JSON-land is something like, "this whole object should
be destructured; I just want a flat object with short keys."

That's the right design approach for the "structs" of JSON; it's wrong to
apply unilaterally (JSON also has "hashes" with the same syntax, and they should be
separated from that context. Similarly you don't want to destructure an array
from {data: [1, 2, 3]} into {data_0: 1, data_1: 2, data_2: 3} unless you
absolutely have to.)

Once you flatten it, then partial responses for things which return a struct
do exactly what you want; you say e.g.:

    
    
        ["myQuery", {on: "stuff", _fields: ["a", "b", "c"]}]
    

and you just get {"a":1,"b":2,"c":3} as your JSON response.

So what I'm saying in summary is that if you write your own APIs you can get
this sort of functionality without building a magic tool; the reason that the
magic tool is not mainstream is because it's only right for dealing with
structs and not hashes (because if there's a hash elsewhere in the object a
user might register their own key in the hash called "ctime"); and given that
some API gives you a complex structure, flattening it the way you're doing is
potentially a little risky because later updates might say that there's
another ctime to some other part of the Users object.

------
filipncs
I must be missing something obvious. How is Tim's JWalk example different than
doing:

    
    
      try { 
         var key = json_object.them.public_keys.primary.bundle;
      } ...
    

Is it to better allow dynamic keys? More consistent error handling?

~~~
aaronem
Well, it's in Java, for one thing.

~~~
jameshart
So this "fat JSON" problem is Java's problem, not JSON's, then? The above code
could be perfectly valid C#, as well as valid JavaScript.

On the other hand, if you're hardcoding assumptions about the JSON structure
into your Java (which you're doing even if you pass a string literal to some
API to look up a value for you), you could probably employ some lightweight
code generation to spin you up some classes with strongly typed public fields
named things like 'them' and 'public_keys' to help you write more idiomatic
JSON access code.

~~~
aaronem
> So this "fat JSON" problem is Java's problem, not JSON's, then?

Pretty much.

I would expect that Java could offer some sort of solution to this problem,
ideally with an accessor syntax similar to what you'd find in Javascript or,
as you say, C#. On the other hand, I've never delved deeply enough into Java
to know whether it's feasible; I suppose it's possible that the language
simply isn't flexible enough to make such a solution workable.

------
lttlrck
If path features became common it would likely lead to even more bloated and
less thoughtful APIs.

------
crazy_geek
My security senses are tingling. A server evaluating potentially hostile
client provided expressions? Proceed with extreme caution.

------
habosa
I know everyone is saying this is an oversimplification and just another
person rushing to create a library etc, etc. But the fact of the matter is if
you ask any Java developer who has dealt with JSONObject they'd probably want
to use this. And that says something.

Why can't I make a contract with my JSON parser. Saying: look, last time there
was a 200 OK response all of these fields were there. I promise, they'll be
there next time too. I don't need to try { get field } catch {} every single
time.

The best way around this currently is to hope that the API has a client
library, but that just means that every API maintainer now has to write my
Java for me too.

I'm not sure what the solution looks like, but I want a modern Java JSON
parser that understands how the API landscape looks today. This extends to
other languages that are static typed and have exceptions as well.

Edit: for the record OP is Tim Bray who invented the XML spec, so he has some
experience with traversing documents.

------
dukedougal
First world problem, man. Important only to the performance obsessed. Most
developers concerned about this have too much time and not enough commercial
imperative.

~~~
heterogenic
As we move intelligence to the client, we're starting to use a lot of data
queries which were never meant for the wire. (Particularly in IT
applications.)

Having a way to filter in those cases would save having to write & maintain a
whole new CRUD layer. In the long run, it could even lead to more efficient
queries when objects are composed of many records.

------
HarrietJones
If only there were some kind of structured query language we could invent that
allowed us to choose the fields and records we needed to look at. We could get
some kind of standards institute to ratify it so everyone used the same
interface.

A pipedream, I know. A crazy, wild pipedream.

~~~
spopejoy
While we're pipedreaming ....

how about some way to validate streamed data against a schema? Offering such
advanced features as ... an array with a single element? Or even crazier,
enumerated values.

Maybe even a way to transform streams using a selector-based query syntax.
What a world ...

------
uptown
I rolled my own solution to this for a mobile turn-based game I have in
development. Most of the time, the device is just polling to see whether
there's anything that needs updating. As part of my polling query I pass a
signature of the current game state (game-round, with a few other bits). On
the server, if that checks out, then the reply is tiny. If there's the need
for an update, I send back what's changed and update my views on the client-
side. It's definitely not the right solution for every scenario, but I've
found it works well for my specific situation.

------
billpg
RFC 6901 specifies a simple XPath-esque notation.

------
lhnz
I had a similar idea[0] that I haven't actually had time to finish. The
README.md kind of explains where I was thinking of going philosophically. It's
a bit out-of-reach with my current workload but I'd love to contribute with
others that could tackle the areas I find difficult.

[0] [https://github.com/sebinsua/jstruct](https://github.com/sebinsua/jstruct)

------
facorreia
The Open Data Protocol (OData) specifies ways to declare server-side filters
and to restrict the fields sent in the response.

------
zupa-hu
This is the typical nice-to-have feature. It takes time to implement, adds
server-side overhead, adds complexity but adds very little value. OP is
arguing that it costs him resources to traverse the entire JSON. There are way
more clients than servers, thus increasing server-load to make it cheaper for
the clients sounds weird to me.

~~~
Pxtl
> There are way more clients than servers, thus increasing server-load to make
> it cheaper for the clients sounds weird to me.

It matters if your client is a 1ghz phone with a poor cellular connection. The
user _can_ do the filtering on their end, but you're paying for that
offloading in terms of a worse experience for the user because of the larger
download.

------
jedp
JSONSelect lets you do queries on JSON objects using CSS-style selectors.
There's an interactive demo here:

[http://jsonselect.org/](http://jsonselect.org/)

The code is on github:

[https://github.com/lloyd/JSONSelect](https://github.com/lloyd/JSONSelect)

------
nailer
agave.js includes (prefix)getPath by default on any object.

    
    
        var mockObject = {
          foo: 'bar',
          baz: {
            bam:'boo',
            zar:{
              zog:'something useful'
            }
          }
        }
    

So:

    
    
       mockObject.getPath('/baz/zar/zog')
    

or, alternatively:

    
    
        mockObject.getPath(['baz','zar','zog'])
    

will return:

    
    
        'something useful'
    

It's also got a bunch of other useful stuff like 'kind' (closest prototype of
an object, that works consistently everywhere), number methods like
(2).weeks().ago(), and reads more cleanly than underscore as it uses actual
methods.

~~~
randallsquared
The main problem with string-based paths is that the keys are just a string,
so an object with slashes embedded in the keys is perfectly valid. So,
ambiguity.

~~~
nailer
Yep, that's one reason for the array version.

------
mjs
Letting clients choose the fields they want to receive leads to the awkward
realization that any generic, flexible and future-proof mechanism for doing
this leads to system where a client can suck down an entire website with a
single request.

(If clients can choose _not_ to receive some information, then by symmetry,
they should also be able to choose to receive some additional information
that's not included by default. And since everything is linked to everything
else (orders are linked to users, etc.), you end up with a single resource
that potentially embeds everything else.)

These systems also break caching, of course, and also to some extent the
principle that within-server links are indistinguishable from cross-server
links. The web is not optimized for performance or file size.

~~~
jdbernard
That's really a non-issue though. Just because they can decide to receive less
information, it doesn't follow that they _should_ be allowed to receive more.
I don't see why this particular type of symmetry would be important or
desirable.

The way I've implemented this in the past is to start with a standard API
endpoint with a defined data-set that it returns. Then I allow the client to
select to receive only a subset of those fields, or make no selection and
receive all of the defined fields. The client cannot request fields that are
not part of the data-set defined by that endpoint. This is at least as easy to
work with going forward as an inflexible return object. The API can be updated
to include more data without affecting clients who don't care about that. If
there is a breaking change, then that requires a new version, just as it would
normally.

------
carsongross
One of my pet theories when I developed intercooler.js
([http://intercoolerjs.org/](http://intercoolerjs.org/)) was that, by
targeting specific UI elements with only the data necessary, you might
actually cut _down_ on the amount of data transfer between the client and
server when compared with some general JSON APIs, despite the fact that HTML
is a less efficient data format.

I'd expect this to hold, in particular, in areas where JSON isn't a
particularly efficient encoding mechanism (e.g. tables).

It's an interesting thing to consider, at least.

------
colinramsay
This is definitely a problem. I work with an existing API and am building a
mobile client based around it, and while there's limited support for selecting
which fields you bring down, it doesn't work for nested objects.

This results in a bloated response, which on mobile is a real problem for
responsiveness and data costs.

We considered building a proxy which would form responses tailored for the
mobile client, but that felt like a hack. Mind you, so does Google's partial
response solution.

Maybe some services could allow you to build your own response, creating a
custom version of an API just for yourself?

------
jb55
Weird I also made a small dot-lens javascript library to do the same thing
just the other day...

[https://github.com/jb55/dot-lens](https://github.com/jb55/dot-lens)

It even works for zooming into arrays

------
andrewstuart2
As I understand it, there's a lot of places overhead creeps in that tends to
make this sort of thing vastly more efficient than making multiple calls. Sure
you'll maybe send data people don't want this time, but it will probably save
processing and networking overhead that would be spent building and sending a
long list of the fields they want. When designing an API I tend to lean to
being a bit more verbose than I need to be, if only to save the HTTP overhead
of another request. At least until HTTP2 helps us out there.

------
datashaman
Use JSONPath, it's a JSON version of XPath and it's awesome.

[http://goessner.net/articles/JsonPath/](http://goessner.net/articles/JsonPath/)

------
yeukhon
It is important to note that when you accept selected fields to output you
must validate those field names as well.

Sometimes people have a giant object from the database, and on return they
send back a subset of it. But someone may make the mistake of iterating over
that object to return selected fields.

    
    
        if options:
            return {key: object[key] for key in options}
        else:
            return safe_output_for_this_api(object)
    

So collapse that into safe_output_for_this_api instead :D
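
A sketch of the validation step being urged here: intersect the requested
fields with an explicit whitelist before echoing anything back. The whitelist
and record are assumptions for the demo.

```python
SAFE_FIELDS = {"username", "bio"}  # assumed whitelist for this endpoint

def safe_output(record, options=None):
    """Return only whitelisted fields; requested fields outside the
    whitelist are silently dropped rather than leaked."""
    wanted = SAFE_FIELDS if not options else SAFE_FIELDS & set(options)
    return {k: record[k] for k in wanted if k in record}

record = {"username": "ada", "bio": "...", "password_hash": "x"}
safe_output(record, ["username", "password_hash"])  # password_hash dropped
```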

------
efsavage
I do this on a current project. Certain fields are rarely needed and turned
off by default; others are usually needed and turned on by default. It's not done
via a special syntax though, just query parameters, so something like
/person/123?bio=false&salary=true. A standard path syntax might be nice but
for handcrafted APIs this works well.

The front-end models are reusable and don't really need to care so long as the
properties they need are available.

------
lukejduncan
The Play framework has JsPath, which I like quite a bit. It allows for
traversals of the kind I assume Tim would want to do with XPath.

[http://www.playframework.com/documentation/2.1.1/api/scala/i...](http://www.playframework.com/documentation/2.1.1/api/scala/index.html#play.api.libs.json.JsPath)

------
philo23
That sort of syntax reminds me of Objective C's key paths.

    
    
      [object valueForKeyPath:@"them.public_keys.primary.bundle"];
    

which is part of the larger, more in depth, Key-Value coding interface. It was
definitely something I missed going back to PHP, Javascript and other
languages after using Objective C.

------
jaredmiwilliams
We built a jackson extension to offer this sort of friendly filtering
automatically with JAX-RS endpoints: [https://github.com/HubSpot/jackson-
jaxrs-propertyfiltering](https://github.com/HubSpot/jackson-jaxrs-
propertyfiltering)

------
alexose
I'm a little late to the party, but I recently wrote something that can help
with this problem. It has the added benefit of acting as a kind of reverse
proxy:

[http://github.com/alexose/sieve](http://github.com/alexose/sieve)

------
AshleysBrain
Couldn't a "possibly null" member access operator help here? E.g.:

    
    
       var key = user?.them?.public_keys?.primary?.bundle;
    

where

    
    
        object?.property
    

is equivalent to

    
    
        object ? object.property : null
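
Python has no `?.` operator either; a small helper (illustrative, not a
standard API) gives the same null-propagating behavior over parsed JSON:

```python
def maybe(obj, *keys):
    """Return obj[k1][k2]... or None as soon as anything is missing,
    like chaining the ?. operator proposed above."""
    for k in keys:
        if not isinstance(obj, dict) or k not in obj:
            return None
        obj = obj[k]
    return obj

user = {"them": {"public_keys": {"primary": {"bundle": "PGP..."}}}}
maybe(user, "them", "public_keys", "primary", "bundle")  # -> "PGP..."
maybe(user, "them", "missing", "bundle")                 # -> None
```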

~~~
emidln
I'd rather just have functions similar to clojure/clojurescript's _get-in_
[0], _update-in_ [1], _assoc-in_ [2], _dissoc-in_ [3] for working with nested
objects/arrays. These are pretty easy to write in Python, Ruby, and
Javascript, although I'm not aware of any public library that implements them.
It really makes working with JSON-returning apis a breeze.

[0] - [http://clojuredocs.org/clojure_core/clojure.core/get-in](http://clojuredocs.org/clojure_core/clojure.core/get-in)

[1] - [http://clojuredocs.org/clojure_core/clojure.core/update-in](http://clojuredocs.org/clojure_core/clojure.core/update-in)

[2] - [http://clojuredocs.org/clojure_core/clojure.core/assoc-in](http://clojuredocs.org/clojure_core/clojure.core/assoc-in)

[3] - [http://clojuredocs.org/clojure_core/clojure.core/dissoc-in](http://clojuredocs.org/clojure_core/clojure.core/dissoc-in)
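
For plain dicts at least, get-in and assoc-in are indeed short to write in
Python. This sketch follows their documented Clojure behavior; it is not a
published library.

```python
def get_in(m, ks, default=None):
    """Look up a value in nested dicts by a path of keys, like
    Clojure's get-in; missing paths yield `default`."""
    for k in ks:
        if not isinstance(m, dict) or k not in m:
            return default
        m = m[k]
    return m

def assoc_in(m, ks, v):
    """Return a copy of m with the value at the nested path set,
    creating intermediate dicts as needed, like Clojure's assoc-in."""
    k, rest = ks[0], ks[1:]
    return {**m, k: assoc_in(m.get(k, {}), rest, v) if rest else v}

m = {"a": {"b": 1}}
get_in(m, ["a", "b"])        # -> 1
assoc_in(m, ["a", "c"], 2)   # -> {"a": {"b": 1, "c": 2}}, m unchanged
```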

~~~
ajanuary
The author's JWalk is basically get-in. Lenses, as popularised by Haskell,
solve a similar problem.

------
tonetheman
underscore.js does this with a pick function

    _.pick({a: 1, b: 2}, "a");

underscore is cool.

~~~
vkjv
If you think underscore is cool, you'll go bananas over lodash.

~~~
mjs7231
Yet another single letter namespaced project to collide with other single
letter namespace projects.

~~~
mattwad
lodash is a replacement for underscore, so it's definitely not as bad as all
those '$' libraries. Just make sure lodash gets loaded after underscore, since
it's a superset (and faster)

