Quickly Loading Things from Disk

rdtsc · on Dec 21, 2015

Depending on your platform (OS, hardware, compiler) you can even do something silly like define packed structs, write those straight to disk (or send them over sockets), and cast the bytes back on the other side.

Mostly it works if you know the hardware and compiler will stay the same (as in, the packed struct representation will stay the same). It is basically a poor man's Cap'n Proto or FlatBuffers.
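
A minimal sketch of that packed-struct trick, for illustration only. The `Record` layout, the function names, and the use of `#pragma pack` (accepted by GCC, Clang, and MSVC) are assumptions, not anything from the article. It only round-trips correctly when the writer and reader share endianness, float format, and struct layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <type_traits>
#include <vector>

// Hypothetical record: fixed-width fields, packed so the compiler adds no padding.
#pragma pack(push, 1)
struct Record {
    std::uint32_t id;
    float         value;
    std::uint8_t  flags;
};
#pragma pack(pop)

static_assert(std::is_trivially_copyable<Record>::value, "raw byte copies must be valid");
static_assert(sizeof(Record) == 9, "layout differs from what the file format assumes");

// Dumps the in-memory bytes of every record straight to the file.
bool save_records(const char* path, const std::vector<Record>& recs) {
    assert(path != nullptr);
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::size_t written = std::fwrite(recs.data(), sizeof(Record), recs.size(), f);
    const bool closed = std::fclose(f) == 0;
    return closed && written == recs.size();
}

// Reads whole records until end of file; a trailing partial record is ignored.
bool load_records(const char* path, std::vector<Record>& out) {
    assert(path != nullptr);
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    out.clear();
    Record r;
    while (std::fread(&r, sizeof r, 1, f) == 1) out.push_back(r);
    const bool hit_eof = std::feof(f) != 0;  // stopped at EOF, not on an I/O error
    std::fclose(f);
    return hit_eof;
}
```

The two `static_assert`s are the cheap part of the insurance: if a different compiler or platform lays the struct out differently, the build fails instead of silently reading garbage.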

As for speed, buffering strategy (buffer size and how often fsync is called) will often dominate everything else. I remember benchmarking C++ against Python for loading and saving data. Python won hands down, because it had better buffer size defaults. There is also trickery and configuration involved in how your page cache params are set up and in the schedule on which dirty pages are flushed (if you don't fsync).
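
To make the buffer-size point concrete, here is a rough sketch of choosing the stdio buffer size explicitly with `std::setvbuf` instead of relying on the library default. The function name and the idea of writing many 4-byte values are made up for illustration. It does not reproduce the benchmark mentioned above, and it does not call fsync.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Writes `count` 4-byte values one at a time through a stream whose
// buffer size is chosen by the caller (hypothetical helper).
bool write_small_chunks(const char* path, std::uint32_t count, std::size_t buffer_bytes) {
    assert(path != nullptr && buffer_bytes > 0);
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;

    // The buffer must outlive the stream, so it is declared before any I/O
    // and stays in scope until after fclose.
    std::vector<char> buffer(buffer_bytes);

    // setvbuf has to be called after opening and before any other operation.
    if (std::setvbuf(f, buffer.data(), _IOFBF, buffer.size()) != 0) {
        std::fclose(f);
        return false;
    }

    bool ok = true;
    for (std::uint32_t i = 0; i < count && ok; ++i)
        ok = std::fwrite(&i, sizeof i, 1, f) == 1;  // tiny writes, absorbed by the buffer

    // fclose flushes whatever is still sitting in the buffer.
    return (std::fclose(f) == 0) && ok;
}
```

Timing this with a small buffer versus something like 1 MiB is a quick way to see how much of a "language" difference is really a default-buffer difference on a given platform.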

> If you’re interested in how fast using read(2) is, it’s essentially the same speed: 5437341 ns if I’m reading directly into the std::vector.

Yep, mmap is not always better. It is more complicated, and depending on the access pattern it is not always worth it.
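
For reference, reading directly into a std::vector with read(2) can look roughly like the sketch below. This is not the article's code: the function name is invented, it assumes a POSIX system and a regular file, and it is unrelated to the timing quoted above.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <vector>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Reads an entire regular file into `out` using plain read(2) calls.
bool read_whole_file(const char* path, std::vector<unsigned char>& out) {
    assert(path != nullptr);
    const int fd = ::open(path, O_RDONLY);
    if (fd < 0) return false;

    struct stat st;
    if (::fstat(fd, &st) != 0) { ::close(fd); return false; }

    // Size the vector once, then let the kernel copy straight into it.
    out.resize(static_cast<std::size_t>(st.st_size));

    std::size_t done = 0;
    while (done < out.size()) {
        const ssize_t n = ::read(fd, out.data() + done, out.size() - done);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal, retry
            break;                         // real I/O error
        }
        if (n == 0) break;                 // file is shorter than fstat reported
        done += static_cast<std::size_t>(n);
    }
    ::close(fd);
    return done == out.size();
}
```

The loop matters because read(2) is allowed to return fewer bytes than requested, so a single call is not guaranteed to fill the vector.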

kabdib · on Dec 21, 2015

Also, the quality of the library providing stdio or streams can vary quite a bit. I know a few platforms where you're much better off never calling this stuff (unless it's for quick bringup or other throwaway code) because the folks writing that layer obviously hated their day jobs and wanted to write an OS instead . . . so they did.

xemdetia · on Dec 21, 2015

This is a confusing article. Does the author know about byte alignment? Or that reading and writing binary values like floats is a quick way to run into platform/portability issues, if that is relevant? What about investigating how any RDBMS serializes and reads data as fast as possible? Or really any research related to the field? What about comparing spinning disk vs. SSD? Or even how libpng works? Or the complexity of ASN.1 notation (used for TLS certificates and many other things)?
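
As one illustration of the alignment and portability concerns raised above, here is a sketch of decoding values from a byte buffer without casting pointers. The helper names are hypothetical. It assumes the file format stores values little-endian and that `float` is a 32-bit IEEE 754 value on the host, which is true on mainstream platforms but not guaranteed by the C++ standard.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Decodes a 32-bit little-endian unsigned integer from any byte offset.
// Works regardless of host endianness and never performs an unaligned load.
std::uint32_t load_u32_le(const unsigned char* p) {
    assert(p != nullptr);
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decodes a float stored as its little-endian bit pattern.
float load_f32_le(const unsigned char* p) {
    static_assert(sizeof(float) == 4, "assumes a 32-bit float");
    const std::uint32_t bits = load_u32_le(p);
    float f;
    std::memcpy(&f, &bits, sizeof f);  // copy the bits instead of casting the pointer
    return f;
}
```

Casting `buf + 1` to `const float*` and dereferencing it would be undefined behavior and can fault on some hardware, while the byte-wise version is safe at any offset.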

Protobuf solves a completely different problem than 'load a file from disk,' which at least FlatBuffers does.

It seems like the author's only complaint is that it doesn't work at the particular level of abstraction they prefer. The FlatBuffers timing results also seem dubious because there is no profiling to say what is killing the time. Are the comparisons ruining pipelining? Why not write a transpiler to glue the protobuf output to something you actually want?

It's hard to get past the first half with any confidence in the testing methodology, especially when the author leads with 'Loading things from disk is a surprisingly unsolved problem in C++.'