

Oboe.js: reacting to Ajax/Rest quicker by not waiting for it to finish - joombar
https://github.com/jimhigson/oboe.js

======
k3n
I was wondering when some of these types of libs might make their way into the
world, I know I saw in Nicholas Zakas' book "High Performance JavaScript"[1]
that he demonstrated how to read back (process) a large AJAX result in chunks,
allowing you to begin working with the response before it was finished
downloading. It was called "multipart XHR", and he shows a neat code example
and then links to a site that has sample code[2].

The main difference between multipart XHR and Oboe, as far as I can see, is
that MXHR requires you to format your data in a specific manner (using a
magic delimiting character), though I'm curious whether the base method is
similar or not.

1\. [http://shop.oreilly.com/product/9780596802806.do](http://shop.oreilly.com/product/9780596802806.do)

2\. [http://techfoolery.com/mxhr/](http://techfoolery.com/mxhr/)

~~~
joombar
I haven't looked at MXHR but here's roughly what Oboe does:

1. Create an XHR and listen to the XHR2 progress event.
2. Use the Clarinet.js SAX parser; scoop up all its events.
3. From the SAX events, build up the actual JSON and maintain the path from the root to the current node.
4. Match that path (plus some other stuff) against the registered JSONPath specs.
5. Fire the callbacks if they pass.
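Sketched in hypothetical code, steps 3–5 might look roughly like this. This is a simplified illustration, not Oboe's real internals: the event shapes and the matcher are inventions, and real JSONPath matching is much richer.

```javascript
// Maintain the path from the root as SAX-style events arrive, and test
// it against a registered pattern. '*' matches any single key.
function makeMatcher(pattern) {
  return function (path) {
    return path.length === pattern.length &&
           pattern.every(function (p, i) { return p === '*' || p === path[i]; });
  };
}

// Fire the callback whenever a value arrives at a matching path.
function processEvents(events, pattern, callback) {
  var path = [];
  var matches = makeMatcher(pattern);
  events.forEach(function (ev) {
    if (ev.type === 'openKey') {
      path.push(ev.key);                    // descend into a property
    } else if (ev.type === 'value') {
      if (matches(path)) callback(ev.value);
      path.pop();                           // a value closes its key
    } else if (ev.type === 'closeObject') {
      path.pop();                           // leaving a nested object
    }
  });
}
```
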

~~~
k3n
Interesting. After looking at the MXHR more, it appears as though it was
adapted from Digg.com[1] (aka DUI.Stream[2]), and then later adapted by
Facebook[3].

Yours sounds more elegant in that it can handle JSON naturally, but I wonder
if the other might be better suited for binary content (not sure in which
context that would make sense, if any).

Either way, I find them all fascinating, and I've starred your project :) Will
keep an eye on it.

1\. [http://techfoolery.com/mxhr/mxhr.js](http://techfoolery.com/mxhr/mxhr.js)

2\. [https://github.com/digg/stream](https://github.com/digg/stream)

3\. [https://www.facebook.com/note.php?note_id=183263890703](https://www.facebook.com/note.php?note_id=183263890703)

~~~
joombar
I suppose you could make a binary equivalent if you needed to. You'd need to
make some kind of binary matching language, maybe like Erlang's binary
matching.

Adding XML/XPATH support would be a natural extension.

------
timmaah
Perfect..

Took me about 10 minutes to add to an existing rails app that adds hundreds of
markers onto a google map via json. Live updating as you scroll around.

Previously I was trying to find the sweet spot between how many markers slow
down the initial rendering for the viewer vs. showing all of them.

Changed about 3 lines of javascript to use Oboe and changed my rails
controller to:

    
    
      per_page = 100
      1.upto(10) do |page|
        json = ActiveModel::ArraySerializer.new(
          resources.paginate(page: page, per_page: per_page),
          root: 'cg'
        ).to_json
        response.stream.write json
      end
      response.stream.close

~~~
rurounijones
What is the resulting speedup to the first marker?

------
dustingetz
At the very bottom he describes the use case. Mobile apps optimize for battery
life by preferring one big long request up front rather than lots of little
ones as needed. But you only need the first 10% of the data to render the
first screen of your app.

~~~
camus2
You might want to render the first screen of your app without any Ajax call.
That's the best thing to do. There is no reason why your app can't render
with the data inlined directly.

~~~
encoderer
No reason? You, sir, are lacking imagination.

Here's one: how about the fact that, in some cases, waiting on the latency
of a data lookup before rendering the page makes it feel much slower and
more sluggish?

In the case mentioned above of map markers, if there's real latency involved
in the lookup and you can render the page sans markers and then populate them
a couple seconds later, isn't that a superior UI?

Yes, if you are building a simple, moderate or low traffic website/app, it's
probably a better practice to render the page with the initial JSON needed.
But a lot of the people here are working on products with millions of users --
or just tons of data -- and that changes the equation a bit.

What do you think?

~~~
currymesurprise
I think you misunderstood your parent's comment.

While the phrasing of the final sentence was awkward, the post was arguing for
exactly the same thing as you.

------
sequoia
This looks really cool! I'm confused about the use case, however. In your
foods/nonFoods example[0], it allows you to request some JSON with two keys,
`foods` and `nonFoods`, each with an array value, and use only `foods`,
discarding `nonFoods`. You request this from `oboe('/myapp/things.json')`.

My question: why not modify the backend to accept a request like
`/myapp/foods.json` and let the backend compose the JSON you need and send
only that? Building your frontend to accommodate getting the wrong data, or
too much data, seems like fixing it "in the wrong place". Is this a
contrived example that isn't the core use case? That's my assumption.

Is this primarily for 3rd-party APIs and legacy codebases where changing the
response type or updating to e.g. sockets is impractical? Thanks for the
cool project; I apologize for my ignorance wrt whatever points I'm missing!

[0] [https://github.com/jimhigson/oboe.js#using-objects-from-the-json-stream](https://github.com/jimhigson/oboe.js#using-objects-from-the-json-stream)

~~~
joombar
I suppose the example is a little artificial. It isn't really for using some
of the JSON response while ignoring the rest (well, you can use it for that
but it isn't the main use).

I got the idea for this project working on data vis. Not all of the data was
visible and we wanted to display the first bit of data quicker without waiting
for all of it to arrive. We could have just sent the visible bit but it was
good to have some data ready in the off-screen section for when the user
scrolled.

Before that I worked on a service where we were aggregating 6 or 7 services
into a single JSON. Some of the services were quicker than others but because
the AJAX lib we were using waited for the whole response they all had to go at
the speed of the slowest component.

We could have done multiple requests but it was more elegant to serve a whole
page's json in one call. Also, we cached the slow services so they were only
sometimes slow.
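The aggregation pattern described above can be sketched like so. This is a hypothetical helper, not code from the project: `out` stands in for anything with a `write()` method (such as an HTTP response), and the idea is to write each part the moment its backend answers rather than at the speed of the slowest service.

```javascript
// Returns a function to call from each backend's callback as it
// completes; the combined object is written out piecewise and closed
// after the last part arrives.
function jsonAggregator(out, partCount) {
  var wrote = 0;
  out.write('{');
  return function part(name, value) {
    out.write((wrote++ ? ',' : '') +
              JSON.stringify(name) + ':' + JSON.stringify(value));
    if (wrote === partCount) out.write('}');   // close after the last one
  };
}
```

A streaming client like Oboe can then pick up the fast parts as soon as they are written.
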

~~~
estebank
> well, you can use it for that but it isn't the main use

Actually, you could use it for that, as long as the library actually finishes
downloading the file on abort. If you need contents from the file again, you
just download it again and as it will already be cached, that operation should
be extremely fast.

------
joshfraser
This makes a lot of sense. Latency is the #1 enemy on mobile, but bandwidth
tends to be relatively okay. That's why streaming a video to your phone feels
surprisingly fast, while everyday browsing feels sluggish. The obvious
conclusion is to use fewer but larger requests, which is why Oboe is so
attractive.

------
freework
This is great for that 1 time out of 1,000,000 when you have an ajax call
that would benefit from a tool like this. In the overwhelming majority of
use cases, this oboe.js thing is not going to be a "plug it in,
automatically webscale" type of optimization. I'm not trying to rag on the
authors of this project, but the wording of this submission is going to lead
noobs to misunderstand the benefit. The authors should instead emphasize the
use cases where an actual benefit comes from using this library, instead of
just saying "it makes your ajax faster!!"

~~~
joombar
It should make most calls faster. The exceptions are small JSON files, or
networks fast enough that there's no streaming effect (the whole file
arrives almost at once).

For most sites there'll be some users where it will make it faster (mobile,
slow internet) and others that it'll be about the same. If the network is
unreliable it should help as well because when the connection drops you don't
lose what you already downloaded.

"makers" = just me :-)

------
joombar
Just merged in support for reading any stream in Node:

[https://github.com/jimhigson/oboe.js#reading-from-any-stream-nodejs-only](https://github.com/jimhigson/oboe.js#reading-from-any-stream-nodejs-only)

    
    
      oboe( fs.createReadStream( '/home/me/secretPlans.json' ) )
       .node('!.schemes.*', function(scheme){
          console.log('Aha! ' + scheme);
       })
       .node('!.plottings.*', function(deviousPlot){
          console.log('Hmmm! ' + deviousPlot);
       })
       .done(function(){
          console.log("*twiddles mustache*");
       });

------
jbrooksuk
Does this require any changes server-side?

I'm using Node.js — with an Express.js router — to power an internal site. We
have a few API endpoints which would benefit from this. Does anything need
changing when sending the data?

~~~
joombar
No changes required. It should accept any JSON resource.

Having said that, there's a much bigger improvement if you're progressively
writing out the JSON rather than doing it in one big lump.
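As a minimal sketch of what "progressively writing out the JSON" might look like (a hypothetical helper, not the library's API; `out` is anything with a `write()` method, such as an Express response, and the item source is a plain array for illustration):

```javascript
// Open the array immediately, write each item as soon as it is ready,
// then close; a streaming client sees the early items before the
// response is complete.
function streamJsonArray(out, items) {
  out.write('[');
  items.forEach(function (item, i) {
    if (i > 0) out.write(',');
    out.write(JSON.stringify(item));   // one piece per item, not one lump
  });
  out.write(']');
}
```
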

~~~
jbrooksuk
Awesome! I'll investigate how easily we can stream our JSON out.

~~~
justincormack
It's probably not worth changing if the JSON is, eg, in a file already, but
if the process that generates it is slow it might be.

~~~
joombar
I agree. If you're doing something slow/asynchronous like aggregating several
http resources it is worth it to write out as early as you can but keep
server-side the same if you can generate the whole JSON quickly.

~~~
reddit_clone
One issue I found with writing early is handling error condition(s).

On the server side, you may run into an error after you started writing out
your reply. That may cause an incomplete reply. It may require more involved
error handling between the server and client. 'Gather first write later'
approach gives you a simpler error propagation between server and client.

------
buro9
Is it intentional that it accepts invalid JSON?

I ask only because most of the examples on that page would not be valid as the
name part of the key:value pairs needs to be quoted.

PS: Don't upvote this, it's a minor detail and the big point about the use-
case is far more important than this comment.

~~~
joombar
Parsing is built on top of Clarinet (see where the name comes from?)

[https://github.com/dscape/clarinet](https://github.com/dscape/clarinet)

Unquoted JSON in docs is a mistake. I'll take a look now.

~~~
Omni5cience
One could be forgiven for thinking it comes from SAX, which is probably where
Clarinet comes from.

------
lnanek2
Java has had good stream parsing of JSON for a while now too. Last time I had
to do it, I was surprised to find GSON, the library we were already using, had
support for it. XML stream parsing vs. model parsing was a much bigger change.

------
lttlrck
There must be a break-even response size below which this is pointless, the
server doesn't send responses byte by byte but in chunks the whole response
could well be on the wire already so it would how zero impact on download
footprint. The impact of TCP nagle makes this very hard to predict too so the
chunk size will vary between server config and server workload at the time of
the request. Anyone that has developed an HTTP parser has experienced this. I
feel like this is a problem to be solved in the server using JSON path or
similar.

------
alessioalex
This is basically node-trumpet in the browser. Really great stuff!

[https://github.com/substack/node-trumpet](https://github.com/substack/node-trumpet)

~~~
joombar
Hadn't seen that before. Interesting link, thanks.

Very similar except JSON/JSONPath instead of HTML/CSS. Oboe runs fine in Node
but I want to make the code a bit more standards-y. Ie, using Node's
EventEmitters instead of the little pubsub I made for the browser.

~~~
alessioalex
It would be really nice to have the same streaming interface (with
EventEmitters) as Node, but I can just shim that on top of Oboe I guess. It
would be great to have the same pattern as in Node.

Btw what are you using for the Node side, JSONStream?

[https://github.com/dominictarr/JSONStream](https://github.com/dominictarr/JSONStream)

~~~
joombar
There's only a Node side so far because I needed one for some component
tests. It'd work with anything that writes out valid JSON.

The client side works in Node right now as well as in the browser but it is a
bit browser-y. It is on my home office Agile board to make it a bit more
node-y.

------
gagege
Very cool, but there's one thing I don't understand.

Do you need to stream JSON objects from a server to make this work? You have
to get a response from the server in some kind of streaming protocol right?

Or...

Is this just reading the part of the JSON string that it currently has?

I'm inexperienced with streaming, so this might be obvious to some.

EDIT:

Ah, I get it. I was wrong to say it was "streaming". Looks like my second
suggestion was the correct one.

~~~
tootie
If you've spent too much time in jQuery you might not realize that streaming
is supported by the underlying XMLHttpRequest object. The readyState property
can be set to LOADING or DONE. Most applications will wait for DONE so they
have a complete document to work with. A "normal" payload will be small enough
that the lag between LOADING and DONE is tiny, but a big download can
definitely be parsed as it loads. Imagine an HTML doc with inline
JavaScript: that JavaScript is evaluated as soon as it's encountered and
doesn't wait for a page load that may never happen.
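A small sketch of the technique described above, with the prefix-diffing pulled into a pure helper (the URL and the wiring are illustrative, not from any particular library):

```javascript
// On each XHR 'progress' event, responseText holds everything received
// so far, so the new part is whatever lies beyond the previously seen
// length.
function freshPart(responseText, seenSoFar) {
  return responseText.substring(seenSoFar);
}

// Browser-only wiring (illustrative; '/big.json' is a placeholder):
//
//   var xhr = new XMLHttpRequest(), seen = 0;
//   xhr.open('GET', '/big.json');
//   xhr.onprogress = function () {
//     var chunk = freshPart(xhr.responseText, seen);
//     seen = xhr.responseText.length;
//     // hand `chunk` to an incremental parser here
//   };
//   xhr.send();
```
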

------
tlarkworthy
You could serialise a state history as it happens, allowing your users to
modify the future state in real time and push to the always-unfolding
history stream. Could be great for games.

~~~
joombar
If you don't care about old browsers you could use the same connection to keep
the state updated as you used to download the original state.

I built this for more-or-less standard downloading, only quicker. But, yeah,
if you set a server up to feed it you could use it for lots of creative
things.

------
foxbarrington
This is really important for perceived load time[0]. Here's an article that
illustrates how this can work in practice: [http://dry.ly/full-streams-ahead](http://dry.ly/full-streams-ahead)

[0] [https://sites.google.com/a/webpagetest.org/docs/using-webpagetest/metrics/speed-index](https://sites.google.com/a/webpagetest.org/docs/using-webpagetest/metrics/speed-index)

------
tambourine_man
This probably won't work if you have gzip on, right?

~~~
joombar
I'll have to test to be sure you still get streaming; it depends how the
browsers handle XHR2 progress events for gzip'd HTTP. I /think/ it'll be
fine. Eg, with gzip on you still get progressive HTML rendering.

~~~
tambourine_man
From what I remember, Apache will send you a single chunk if you have it on,
for example:

    
    
      <?php
      echo 'Hey<br>';
      ob_flush();
      flush();
      sleep(20);
      echo 'Bye';
      ?>
    
      

Will send you "Hey<br>Bye" after 20sec instead of what you would expect (which
does work with gzip off).

Besides, even if you manage to stream it, I imagine inflating partially
received content is not trivial for the browser.

~~~
mbrock
Gzip is a streaming format -- it's designed as a compressed format for
communication streams. Browsers have no trouble with this. The Apache behavior
you describe is probably related to buffering settings, which I think can be
configured.

------
jastanton
So if you're streaming in the JSON, this program must have a custom JSON
parser, because there would be no way to ensure valid JSON on an incomplete
payload. Am I understanding how this works correctly?

edit: Also, how is this faster? If you're parsing an incomplete response
over and over again, isn't each parse blocking, and wouldn't this approach
kill your FPS?

~~~
joombar
Well, you could have a JSON which was valid at the start and invalid at the
end. It'd parse the first bit ok and only throw an error when it got to the
invalid bit.

Nothing gets parsed more than once. SAX parsers already parse streams, they're
just not used very much because they're a pain to program with.
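A toy way to see the valid-start/invalid-end point: a whole-document parser like `JSON.parse` gives you nothing until the text is complete, whereas the leading portion may already be valid on its own. This naive sketch rescans with `JSON.parse`; a real SAX parser works event by event instead.

```javascript
// Find the longest prefix of `text` that is complete JSON by itself.
// Returns the parsed value and how many characters it consumed, or
// null if no prefix parses.
function parseablePrefix(text) {
  for (var end = text.length; end > 0; end--) {
    try {
      return { value: JSON.parse(text.slice(0, end)), consumed: end };
    } catch (e) {
      // not complete yet; try a shorter prefix
    }
  }
  return null;
}
```
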

------
fro
This could be big for making dynamic web maps faster. Often we request a large
array of geometries to display on a map and can only display them all at once
after ajax is done. If we could display each geometry as they are loaded it
would be a big improvement to perceived performance. I imagine this is the
case for other kinds of data vis as well. Off to test!

------
zamalek
This seems like it could be used to create a protocol similar to XMPP - which
I always enjoyed from the perspective of elegance.

I might have a pet project this weekend, thanks for sharing.

~~~
joombar
No worries. This is my masters dissertation btw.

~~~
alessioalex
What exactly is the title of your dissertation? (just curious)

~~~
joombar
Liable to change but "An approach to i/o for rest clients which is neither
batch nor stream; nor SAX nor DOM." I'm writing it now.

------
Kiro
Is this sax? [http://www.saxproject.org/](http://www.saxproject.org/) Not sure
I understand what it is and what role it has in oboe.js.

~~~
joombar
It is built on top of a SAX parser.

------
lucidrains
Excellent! Plopped it onto my site where I was loading a big json package and
it works beautifully. Thanks so much :)

------
jastanton
Are you using a webworker to process your JSON stream?

~~~
joombar
No, although that's a nice idea and something I'd like to look into.

