
Jsmn, a minimalistic JSON parser in C - jasonmoo
http://zserge.bitbucket.org/jsmn.html
======
udp
The code is certainly very short, but what about \u escape sequences in
strings, parsing different representations of numbers, etc.? Since those
things are part of the JSON standard, you're not a JSON parser if you just
leave them to the application to handle.

Since this skimps on half of the work, it won't even be able to tell you
with certainty what's valid JSON and what isn't.

(disclaimer: I also wrote a popular ANSI C JSON parser)
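
To illustrate the kind of work a bare tokenizer leaves to the application, here is a minimal sketch of decoding one `\uXXXX` escape into UTF-8. The names `hex4` and `utf8_encode_bmp` are hypothetical and not part of Jsmn or any other library mentioned in this thread. Surrogate pairs are deliberately left out, so this covers only the Basic Multilingual Plane.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: parse exactly four hex digits (the XXXX in \uXXXX).
 * Returns the code unit (0..0xFFFF), or -1 if any character is not hex. */
long hex4(const char *s)
{
    long value = 0;
    for (int i = 0; i < 4; i++) {
        char c = s[i];
        int digit;
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return -1;  /* also stops safely at an early '\0' */
        value = value * 16 + digit;
    }
    return value;
}

/* Hypothetical helper: encode one BMP code point as UTF-8 into out, which
 * must have room for 3 bytes. Returns the number of bytes written, or 0
 * for out-of-range input and for surrogates (0xD800..0xDFFF), which would
 * need pairing that this sketch omits. */
size_t utf8_encode_bmp(long cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0 || cp > 0xFFFF) return 0;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}
```

A caller would run `hex4` on the four characters after `\u` and pass the result to `utf8_encode_bmp`, treating a return of -1 or 0 as invalid input.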

~~~
stevelaz
I saw json-parser. Looks good. I'm currently using YAJL for a project, but I'm
open to switching to something faster/easier.

Have you done any performance tests against other C json parsers?

Thanks,

------
Xion
This is not a JSON parser, it's a tokenizer. From a parser I'd expect at least
acknowledgement of the basic key-value association.

That's not to say it isn't useful. It's just that for anything non-trivial,
you'd need to supplement this library with quite a bit of your own code, e.g.
a stack for tracking nesting levels.
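
As a rough sketch of the supplementary code this implies, the following looks up a key in an object given only a flat token array. The `tok_t` layout is an assumption made for illustration (loosely modelled on what a tokenizer hands back, not Jsmn's actual definition), and `tok_skip` and `tok_find_key` are hypothetical names. Because each token records how many direct children it has, a running counter can stand in for an explicit nesting stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token layout for a flat, pre-order token array. */
typedef enum { TOK_PRIMITIVE, TOK_STRING, TOK_ARRAY, TOK_OBJECT } tok_type;

typedef struct {
    tok_type type;
    int start;  /* offset of the first character in the JSON text */
    int end;    /* offset one past the last character */
    int size;   /* direct children; in an object, keys and values both count */
} tok_t;

/* Returns the index of the first token after the subtree rooted at i.
 * 'pending' counts tokens still to be consumed, so nested containers are
 * skipped without recursion or an explicit stack. */
int tok_skip(const tok_t *t, int i)
{
    int pending = 1;
    while (pending > 0) {
        pending += t[i].size - 1;  /* consume this token, queue its children */
        i++;
    }
    return i;
}

/* Returns the index of the value token paired with 'key' inside the object
 * at index 'obj', or -1 if the key is absent. Escapes in keys are compared
 * literally; decoding them is left out of this sketch. */
int tok_find_key(const char *js, const tok_t *t, int obj, const char *key)
{
    size_t key_len = strlen(key);
    int i = obj + 1;
    int pairs;

    assert(t[obj].type == TOK_OBJECT);
    pairs = t[obj].size / 2;

    for (int n = 0; n < pairs; n++) {
        const tok_t *k = &t[i];
        if (k->type == TOK_STRING &&
            (size_t)(k->end - k->start) == key_len &&
            strncmp(js + k->start, key, key_len) == 0) {
            return i + 1;          /* the value follows its key */
        }
        i = tok_skip(t, i + 1);    /* jump over the whole value subtree */
    }
    return -1;
}
```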

------
alisdair
I wrote some simple jsmn examples, since it doesn't ship with any:
<https://github.com/alisdair/jsmn-example/>

And then I wrote about writing the examples:
<http://alisdair.mcdiarmid.org/2012/08/14/jsmn-example.html>

~~~
hendi_
This is really nice and I'm sure it will help new users a lot. I myself have
only used the provided test.c file to see how to use the API. While that was
certainly doable, your example (and its explanation) has improved the
situation a lot. Thank you!

------
duaneb
It fails to parse basic unicode escapes - not a JSON parser.

------
alexgoldstone
For those interested in other lightweight implementations, I have not yet had
a chance to compare this but have been very happy with cJSON by @dave_gamble
for extremely resource-constrained embedded microcontrollers.

<http://sourceforge.net/projects/cjson/>

~~~
mturmon
Ditto. Found it easy to use and integrate. I needed something lightweight for
a highly nested numerical model containing numbers and descriptive strings. I
didn't want the Json bit to have a big footprint because the numerical
computations make the software complex enough as it is.

------
gilgad13
If we're golfing:

<https://github.com/quartzjer/js0n/blob/master/js0n.c>

Appears to work the same way, though it doesn't bubble back type information.

(Also, the `goto * go[ * cur];` trick was pretty crazy the first time I saw
it)

~~~
judofyr
This is called a computed (or assigned) goto and is a GCC extension (that is,
not in the C standard):
<http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html>
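
For readers who have not seen the construct, here is a small sketch of a jump table built from label addresses. It is not taken from js0n; `byte_class` and `count_digits` are hypothetical names chosen for illustration. It relies on the GNU C "labels as values" extension (`&&label` and `goto *ptr`), so it should build with GCC or Clang but is not standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Classify a byte: 0 = end of string, 1 = ASCII digit, 2 = anything else. */
static int byte_class(unsigned char c)
{
    if (c == 0) return 0;
    return (c >= '0' && c <= '9') ? 1 : 2;
}

/* Counts ASCII digits in s by jumping through a table of label addresses
 * instead of using a switch statement. */
size_t count_digits(const char *s)
{
    /* Table order must match the values returned by byte_class(). */
    static void *const go[3] = { &&done, &&digit, &&other };
    size_t n = 0;
    const unsigned char *cur;

    assert(s != NULL);
    cur = (const unsigned char *)s;

dispatch:
    goto *go[byte_class(*cur)];  /* computed goto: target chosen at run time */

digit:
    n++;
    /* fall through to advance the cursor */
other:
    cur++;
    goto dispatch;

done:
    return n;
}
```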

~~~
binarycrusader
Note that other compilers (such as Oracle Solaris Studio) support computed
gotos as well:

<http://docs.oracle.com/cd/E19205-01/820-7598/bjabt/index.html>

And LLVM:

<http://blog.llvm.org/2010/01/address-of-label-and-indirect-branches.html>

------
nwjsmith
Looks good. For those of us shopping around, what advantages does Jsmn have
over yajl?

~~~
pjscott
It does no runtime memory allocation, and since it doesn't depend on libc and
has a very small size, it can be used on highly resource-constrained embedded
processors.

~~~
revelation
Sorry, but no. Shuffling around strings is all fine when you're writing
Javascript and have GHz and GB of RAM at your disposal, but that stuff just
doesn't fly when you need to conserve RAM and especially need to have easily
determined time and space constraints.

JSON just doesn't fit that profile. Neat implementation, but don't give the
impression that any of this would help with embedded.

~~~
sillysaurus
Why wouldn't this help with embedded? Are you saying it's never a good idea to
use JSON for data transport within any embedded system? That would be a bold
claim.

~~~
revelation
_highly resource-constrained embedded processors_

For me, that means code space and RAM measured in kilobytes, cycles in MHz.
Most people don't realize that serializing data (most importantly, floating
point numbers) to strings is a complicated and time-intensive matter. Even a
limited printf implementation can easily cost you many kilobytes. Not to
mention it introduces you to C's most special hell: variable length memory
blocks containing strings.

The one thing that embedded gives you is a lot of control over your computing
environment. Just sending around binary data is a very viable thing to do
under these conditions. Not so for JSON; its single biggest selling point is
that you can use it on every platform out there. It's the oldest tradeoff:
giving up flexibility allows you to use more constrained processors (and save
money).
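
As a sketch of the binary alternative, the following packs a reading into a fixed 6-byte frame with an explicit byte order. The `reading_t` layout, the field names, and the choice of little-endian are all assumptions invented for illustration. The point is only that the frame size is known at compile time and no string formatting or floating-point conversion is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { FRAME_SIZE = 6 };  /* 2 bytes id + 4 bytes value, hypothetical layout */

typedef struct {
    uint16_t sensor_id;
    int32_t  millidegrees;  /* fixed-point temperature, avoids floats */
} reading_t;

/* Serialize r into out as little-endian bytes. Returns bytes written. */
size_t reading_pack(const reading_t *r, uint8_t out[FRAME_SIZE])
{
    uint32_t v;

    assert(r != NULL && out != NULL);
    v = (uint32_t)r->millidegrees;  /* well-defined modular conversion */

    out[0] = (uint8_t)(r->sensor_id & 0xFF);
    out[1] = (uint8_t)(r->sensor_id >> 8);
    out[2] = (uint8_t)(v & 0xFF);
    out[3] = (uint8_t)((v >> 8) & 0xFF);
    out[4] = (uint8_t)((v >> 16) & 0xFF);
    out[5] = (uint8_t)((v >> 24) & 0xFF);
    return FRAME_SIZE;
}

/* Inverse of reading_pack(). */
void reading_unpack(const uint8_t in[FRAME_SIZE], reading_t *r)
{
    uint32_t v;

    assert(in != NULL && r != NULL);
    r->sensor_id = (uint16_t)(in[0] | ((uint16_t)in[1] << 8));
    v = (uint32_t)in[2]
      | ((uint32_t)in[3] << 8)
      | ((uint32_t)in[4] << 16)
      | ((uint32_t)in[5] << 24);
    /* Conversion back to signed assumes a two's complement target. */
    r->millidegrees = (int32_t)v;
}
```

The cost is exactly the inflexibility described above: both ends have to agree on this layout in advance.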

~~~
sillysaurus
Excellent points. Thank you for transmitting your wisdom.

------
ijobs-ly
What is the purpose of JSON?

Does it have to do with efficiency?

Because if so, now we find ourselves discussing the resource requirements just
to scan/tokenise and parse it to get it back into a human readable form. Why
did we translate it to a non-readable form in the first place? What were we
trying to achieve?

Maybe we should let JSON be something the receiver translates text to (if they
want that sort of format), not the sender. The receiver knows what resources
she has to work with, the sender has to guess. The same principle applies to
XML. By all means, play around with these machine-readable formats to your
heart's content. But do it on the receiver side. No need to impose some
particular format on everyone.

The "universal format" is plain text. The UNIX people realised this long ago.
People read data as plain text, not JSON and not XML, not even HTML. No matter
how many times you translate it into something else, using a machine to help
you, it will, if humans are to read it, be translated back to plain text.

As for the "plain text haters", let us be reminded that UNIX can do
typesetting. Professional quality typesetting. But that's the receiver's
job.[1] There's a learning curve, sure, but what the receiver can produce
using typesetting utilities on her own machine is worlds better than what a
silly web browser can produce from markup.

[1] I am so tired of dumping PDFs to text and images. PDF makes it seemingly
impossible to scan through a large number of documents quickly. Ever been
tasked with reading through 100 documents all in PDF format (i.e., scanned
images from a photocopier)? What could be accomplished in minutes with BRE
takes hours or even days to accomplish. This is a problem that persists year
after year. OCR is a hack. In most cases, the text should never have been
scanned to an image in the first place. The documents are being created on
computers, not typewriters!

So, as I see it, if you were a plain text hater, and you were really sincere
about making things look nice, then you would be a proponent of educating
people how to do typesetting and sending them plain text, the universal
format, that they can easily work with.

My solution to JSON and XML is sed. It works in all resource conditions and
most times is just as fast as any RAM hungry parser. If I need to do complex
things, that's what lex and yacc are there for. Pipes and filters; small
buffers. 'Nuf said.

------
lpgauth
Out of curiosity, what's the fastest JSON parser written in C out there?

Is it still YAJL?

~~~
mpd
Oj[1] benchmarks significantly faster, but I'm not clear if it's usable
outside of a Ruby environment, as I am not aware of any non-gem distributions.

[1] <http://www.ohler.com/oj/>

------
benatkin
Direct link to the source:
<https://bitbucket.org/zserge/jsmn/src/1caee52d37e3/jsmn.c>

This looks good to me. It isn't going to be the fastest or the shortest (no,
we aren't golfing), but it's simple and easy to understand.

------
rmk
How is this different from jansson? I have used jansson in the past, and it
has served me very well.

~~~
hendi_
jsmn is only one .h and one .c file; jansson is more.

jsmn only parses the JSON into tokens; you handle _all_ the rest.

jansson provides its own hashmap etc.; jsmn doesn't.

I've just used jsmn for a current project of mine where I had already
implemented, e.g., a custom hashmap, and I didn't want to link two into my
code. So the choice of the lean (not to say "minimal") jsmn came naturally.
And I don't regret it :)

~~~
popee
Hey. What do you think about this?

<https://github.com/popee/libjason/blob/master/jason.rl>

It is implemented with the Ragel state machine compiler, with no other
dependencies. It is a modified version of libejson, simplified to use only
standard JSON. Also small. Btw, Ragel is a great utility ;-)

