The simdjson library (simdjson.org)
60 points by fanf2 10 months ago | 28 comments



Dan Lemire does a lot of awesome work. See also roaring bitmaps.


Hmm, I wonder if jq would be faster using this.


Hey, it was discussed a bit some years ago https://github.com/jqlang/jq/issues/1892


We have silicon dedicated for AES [0], why not have silicon dedicated to JSON...? Only half joking.

[0] https://en.wikipedia.org/wiki/AES_instruction_set


I'd rather see less JSON and more binary serialization in the world, rather than bending over backwards to make JSON faster...


Get a binary format supported by all standard browsers with the same simplicity as JSON and you will see a rapid adoption.


No one has come up with a binary format that has the same properties and has been as successful.


Maybe because most binary formats aren't human-readable, which is one of the important properties JSON does have.



It's not self-documenting, you can't decode it unless you have the .proto. JSON just works.
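To make the point concrete, here is a rough Python sketch of what a schema-less reader can recover from protobuf wire bytes: field numbers and raw values, but no field names or intended types. The helper names (`read_varint`, `dump_varint_fields`) are made up for illustration, and the sketch only handles varint fields (wire type 0).

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry payload
        if not byte & 0x80:              # high bit clear means last byte
            return value, pos
        shift += 7


def dump_varint_fields(buf):
    """Return (field_number, value) pairs for a message made only of varint fields."""
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type != 0:
            raise ValueError("this sketch only handles wire type 0 (varint)")
        value, pos = read_varint(buf, pos)
        fields.append((field_number, value))
    return fields


# Field 1 set to 150, encoded as three bytes.
example = bytes([0x08, 0x96, 0x01])
print(dump_varint_fields(example))  # [(1, 150)]
```

Without the .proto you can't tell whether field 1 is a user ID, a count, or a zigzag-encoded signed value that needs further decoding.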


Yeah, that's a great example of a non-human-readable binary format


Last time I read, simdjson still beat Protocol Buffers by a bit.


Due to the inherent constraints of JSON, the exact use case matters a lot for such comparisons. simdjson is generally faster when you only want a well-formedness check or a very small proportion of a large JSON input, but "well-formedness" for JSON is a small subset of what "well-formedness" means for binary formats, and partial-parsing performance is often dominated by language bindings rather than the underlying parser (it is a wise move that simdjson also supports JSON Pointer for this reason, because that greatly reduces FFI overhead). Binary formats, in comparison, tend to have a generally flat performance profile.
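For anyone unfamiliar with JSON Pointer (RFC 6901), here is a minimal Python sketch of the lookup itself. It uses the standard `json` module, not simdjson, and `resolve_pointer` is a name invented for this example. The point is only the shape of the interface: one call with one path string, instead of one binding call per nesting level.

```python
import json


def resolve_pointer(doc, pointer):
    """Resolve an RFC 6901 JSON Pointer against an already-parsed document."""
    if pointer == "":
        return doc  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    node = doc
    for token in pointer[1:].split("/"):
        # Unescape in the order the RFC requires: "~1" first, then "~0".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(token)]
        else:
            node = node[token]
    return node


doc = json.loads('{"users": [{"name": "a/b", "tags": {"x~y": 1}}]}')
print(resolve_pointer(doc, "/users/0/name"))       # a/b
print(resolve_pointer(doc, "/users/0/tags/x~0y"))  # 1
```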


You can easily write a tool to dump a binary format.


Now embed that tool in all my other tools so I can human-read the binary format without thinking twice and then we can chat.


IFF was basically that on the Amiga:

https://en.wikipedia.org/wiki/Interchange_File_Format

Unfortunately with the Amiga itself being a niche platform, the idea didn't catch on.

It also sort-of requires a central registry for the top level FourCC codes (basically reserved magic number for each file format).
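For reference, the chunk layout is simple enough to walk in a few lines. A rough Python sketch, assuming the basic IFF rules (4-byte ASCII chunk ID, 32-bit big-endian length, body padded to an even size); `iter_chunks` and the sample bytes are made up for illustration, and nested FORM/LIST/CAT groups are not handled:

```python
import struct


def iter_chunks(data):
    """Yield (chunk_id, body) for a flat sequence of IFF-style chunks."""
    pos = 0
    while pos + 8 <= len(data):
        # Header: 4-byte ID followed by a big-endian 32-bit body length.
        chunk_id, size = struct.unpack_from(">4sI", data, pos)
        body = data[pos + 8 : pos + 8 + size]
        yield chunk_id.decode("ascii"), body
        # Odd-sized bodies are followed by one pad byte.
        pos += 8 + size + (size & 1)


sample = (
    b"NAME" + struct.pack(">I", 3) + b"abc" + b"\x00"  # 3-byte body + pad
    + b"BODY" + struct.pack(">I", 2) + b"hi"
)
print(list(iter_chunks(sample)))  # [('NAME', b'abc'), ('BODY', b'hi')]
```

A generic tool can skip chunks it doesn't recognise because the length is always in the header, which is what makes the registry of IDs the only coordination needed.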



Arrow is columnar storage, nothing relevant here.

And capnproto is basically another version of pb... (see discussion above)


Look again at Arrow. Part of the concept is the wire format is the same as the in memory format—no de/serialization. Arrow isn’t intended for storage; they point to parquet for that.

Apache Arrow defines an inter-process communication (IPC) mechanism to transfer a collection of Arrow columnar arrays (called a “record batch”). It can be used synchronously between processes using the Arrow “stream format”, or asynchronously by first persisting data on storage using the Arrow “file format”.

The Arrow IPC mechanism is based on the Arrow in-memory format, such that there is no translation necessary between the on-disk representation and the in-memory representation. [0]

The Arrow spec aligns columnar data in memory to minimize cache misses and take advantage of the latest SIMD (single instruction, multiple data) and GPU operations on modern processors. [1]

0. https://arrow.apache.org/faq/

1. https://arrow.apache.org/docs/js/


msgpack and "I can't believe it's not msgpack with the name changed to my own name" CBOR are basically that.



During my PhD studies, we developed a JSON parser for FPGAs [0] that could in theory be turned into silicon if somebody really wanted to. In our evaluation we showed that, on a single JSON document, it is an order of magnitude faster than at least the AVX-256 version of simdjson that was available at the time.

[0] https://dl.acm.org/doi/abs/10.1145/3533737.3535094


Not quite, but close, is the relatively new ARM instruction FJCVTZS which implements the rounding required by JavaScript.


FJCVTZS has JavaScript in the name, and is useful for implementing JavaScript, yes, but as you say, it's just a float -> int conversion with a particular rounding mode. "Convert float to int, rounding towards zero" is a typical scalar instruction, very different from the kind of instruction you'd want for optimizing parsing.

I swear the only reason people pay FJCVTZS any attention at all is that it has "JavaScript" in its name. If it was just a "convert float to int, round towards zero" instruction, everyone would see it as a normal, boring part of the ISA.


FJCVTZS is really just "float to int with x86 semantics", which also happens to be the semantics that were baked into JavaScript. The name is probably just ARM dancing around mentioning a competing ISA.


> I swear the only reason people pay FJCVTZS any attention at all is that it has "JavaScript" in its name.

You’re not wrong, and yet—

> convert float to int, round towards zero

... and take the result modulo 2^32 as two’s complement. That’s still rather specific and likely useless for numeric computing, which is the canonical application area for this kind of thing. So the “JavaScript” in the name is fair, if only to dispel confusion around why one would ever want this.

(The normal, saturating conversion is FCVTZS.)
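To illustrate the difference, here is a small Python model of the two conversions. It only models the numeric result (not flags or exceptions), and the function names are mine, not anything from ARM or a JS engine. The wrapping version follows the ECMAScript ToInt32 rule, where NaN and infinities become 0; the saturating version assumes NaN also converts to 0.

```python
import math


def js_to_int32(x):
    """Truncate toward zero, wrap modulo 2**32, reinterpret as signed 32-bit."""
    if math.isnan(x) or math.isinf(x):
        return 0
    wrapped = math.trunc(x) % 2**32  # result is always in [0, 2**32)
    return wrapped - 2**32 if wrapped >= 2**31 else wrapped


def saturating_to_int32(x):
    """Truncate toward zero, then clamp to the signed 32-bit range."""
    if math.isnan(x):
        return 0
    if math.isinf(x):
        return 2**31 - 1 if x > 0 else -(2**31)
    return max(-(2**31), min(2**31 - 1, math.trunc(x)))


for value in (-1.5, 2.0**31, 2.0**32 + 5.7):
    print(value, js_to_int32(value), saturating_to_int32(value))
# -1.5 -1 -1
# 2147483648.0 -2147483648 2147483647
# 4294967301.7 5 2147483647
```

The two agree for anything that fits in an int32 and only diverge out of range, which is where JavaScript's `x | 0` semantics need the wrapping behaviour.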


I mean these things seem very different? AES is "just" a bunch of mathematical operations done on 16-byte chunks, you can make instructions which optimize that fairly easily. JSON is mainly a parsing problem, which ... is much harder to do in hardware. Hardware acceleration shines when you can do operations on large chunks of data at once.

SSE/AVX/NEON/RVV are all general sets of vector operations, which can be used to optimize certain parts of parsing, as simdjson does. If there could be instructions added to further optimize JSON parsing, I have a feeling that they would be new general purpose vector instructions rather than anything JSON-specific.

Or do you have any JSON-specific instructions in mind which would help beyond what the existing vector instructions already do?


When writing parsers, NEON suffers a bit from a lack of movemask and from tbl requiring that you mask off more high bits before using it (compared to pshufb). I mostly don't have complaints about AVX.
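For readers who haven't used it: movemask collapses a vector comparison result into an ordinary integer bitmask, which is what lets a parser find "all the quote characters in this block" with scalar bit tricks. A scalar Python emulation of the idea (byte-wise compare, then gather the top bit of each byte); this is just a model of the semantics, and the helper names are invented:

```python
def movemask(lanes):
    """Gather the top bit of each byte into an integer: byte i -> bit i."""
    mask = 0
    for i, b in enumerate(lanes):
        mask |= (b >> 7) << i
    return mask


def eq_mask(block, ch):
    """Bitmask of positions in block equal to ch (compare, then movemask)."""
    compared = bytes(0xFF if b == ch else 0x00 for b in block)
    return movemask(compared)


def set_bits(mask):
    """List the indices of set bits, lowest first."""
    out = []
    while mask:
        low = mask & -mask              # isolate the lowest set bit
        out.append(low.bit_length() - 1)
        mask ^= low
    return out


block = b'{"a":1,"b":"xy"}'             # exactly 16 bytes, one SSE register's worth
quotes = eq_mask(block, ord('"'))
print(bin(quotes), set_bits(quotes))    # 0b100101010001010 [1, 3, 7, 9, 11, 14]
```

On x86 the compare and the gather are one instruction each; without a direct movemask equivalent you need a short sequence of instructions to get the same integer out.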



