In my implementation, I used a 64-bit integer to represent the non-terminal stack (falling back to recursion if it overflowed). So you'd need a JSON structure that has "a hash of a hash of a hash ... pointing to an array" with a depth in the 60s before you'd trigger a recursive call. And the real program stack shouldn't cost that much per call (I didn't optimize this part, so I'm not sure what the minimum is).
I only wrote a JSON validator, but my method should apply to cases where JSON elements are being placed into a data structure.
I'm pretty sure a JSON parser could be written in C/C++ that requires less than 1 KB of memory overhead, where most of that is the string buffer.
I make no claims about speed though, as my concern was proving out the bit-stack part (I suspect speed with my method would be worse than the naive per-character jump table).
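Roughly, the bit-stack idea looks like this (my own reconstruction in Ruby rather than C, so the 64-bit cap is only illustrative since Ruby integers are arbitrary-precision; the `balanced?` walker is a toy that doesn't skip braces inside string literals, it just shows the stack mechanics):

```ruby
# Each nesting level costs one bit: 0 for an object, 1 for an array.
# In a fixed 64-bit word this tracks roughly 60+ levels before you
# would have to spill to recursion or a real stack.
class BitStack
  def initialize
    @bits  = 0
    @depth = 0
  end

  def push(container)               # :object or :array
    @bits = (@bits << 1) | (container == :array ? 1 : 0)
    @depth += 1
  end

  def pop
    raise "stack underflow" if @depth.zero?
    popped = top
    @bits >>= 1
    @depth -= 1
    popped
  end

  def top
    (@bits & 1).zero? ? :object : :array
  end

  attr_reader :depth
end

# Toy walk over a JSON string: check that {} and [] nest correctly.
# (Ignores braces inside string literals, so it's a sketch only.)
def balanced?(json)
  stack = BitStack.new
  json.each_char do |c|
    case c
    when "{" then stack.push(:object)
    when "[" then stack.push(:array)
    when "}" then return false if stack.depth.zero? || stack.pop != :object
    when "]" then return false if stack.depth.zero? || stack.pop != :array
    end
  end
  stack.depth.zero?
end
```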
I have been wanting to generalize this approach with some open source software for some time, but I haven't gotten around to it. I think an attractive solution is for a user to specify a JSONPath upfront describing the subsets they are interested in. In the case of Ruby these chunks could then be yielded by an iterator and cleared, but I wonder if a more general-purpose CLI program which prints the subsets to stdout would also work if the subsets are small enough. That does add another serialization/deserialization step, though.
I wrote a simple SAX wrapper that applies a (so far, extremely simplified) subset of XPath to the stream, then parses each match into a DOM using Nokogiri. Works exceedingly well, though it's not faster than reading everything into RAM.
Using it reminded me a lot of my days writing XML SAX parsers. I understand why some people don't like that pattern, but if you've ever written a Finite State Machine it will feel very familiar. Personally I like how the event-based model lets me break things up into little chunks.
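For anyone who hasn't seen the pattern, here's the same event-based shape using Ruby's bundled REXML stream parser instead of Nokogiri (so it runs without extra gems); the `<title>`-collecting listener is a made-up example, but it's the same small-state-machine structure a `Nokogiri::XML::SAX::Document` subclass would have:

```ruby
require "rexml/document"
require "rexml/streamlistener"

# A tiny finite state machine driven by SAX-style events: it records
# character data only while inside a <title> element.
class TitleCollector
  include REXML::StreamListener
  attr_reader :titles

  def initialize
    @titles = []
    @in_title = false
  end

  def tag_start(name, _attrs)
    @in_title = true if name == "title"
  end

  def text(data)
    @titles << data if @in_title
  end

  def tag_end(name)
    @in_title = false if name == "title"
  end
end

def collect_titles(xml)
  listener = TitleCollector.new
  REXML::Document.parse_stream(xml, listener)
  listener.titles
end
```

The document is never materialized as a tree; each event handler is one of the "little chunks" mentioned above.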
Heck, if you want to get fancy (and have access to mmap) you can just mmap() your FlatBuffer data and let the kernel handle paging it in and out for you, for very low memory overhead.