

Efficient data structure representation on disk - datashovel

I'm trying to locate resources to help me understand how to efficiently represent data structures on disk. So far I feel the following have helped me the most.

http://stackoverflow.com/a/18285486/3719366

http://en.wikipedia.org/wiki/Extent_(file_systems)

From what I've read so far the only mention of a specific data structure that is meant for storage on disk is B-tree. What would some of the other (commonly used or interesting) data structures be that were designed for efficient representation on disk?

Also, does anyone have suggestions (books, papers, code) that might help me further my understanding in this area?
======
dalke
It will depend very much on what type of data you want to save.

There are any number of descriptions of database formats and file system
formats. You can read more about the sqlite B-tree-based format at
[https://www.sqlite.org/fileformat.html](https://www.sqlite.org/fileformat.html)
.
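
To get a feel for what that document describes, here is a minimal sketch that creates a small database with Python's sqlite3 module and reads two header fields back as raw bytes. It assumes only what the file format page states: a 16-byte magic string at offset 0 and a 2-byte big-endian page size at offset 16, where a stored value of 1 means 65536. The names `read_sqlite_header` and `make_demo_db` are made up for this example.

```python
import os
import sqlite3
import struct
import tempfile


def read_sqlite_header(path):
    """Return (magic, page_size) read from the start of an SQLite file."""
    with open(path, "rb") as f:
        header = f.read(100)  # the database header occupies the first 100 bytes
    magic = header[:16]
    (raw_page_size,) = struct.unpack(">H", header[16:18])
    # A stored value of 1 is the documented encoding for a 65536-byte page.
    page_size = 65536 if raw_page_size == 1 else raw_page_size
    return magic, page_size


def make_demo_db():
    """Create a throwaway database file and return its path (hypothetical helper)."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
    con.execute("INSERT INTO t (name) VALUES ('example')")
    con.commit()
    con.close()
    return path


if __name__ == "__main__":
    magic, page_size = read_sqlite_header(make_demo_db())
    print(magic, page_size)
```

Everything after the header is organized in pages of that size, which is why the page size is one of the first things a reader of the file needs.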

djb's "cdb" ("constant database") is quite nice if your data never changes
after it is written and you just want a hash table. See
[https://en.wikipedia.org/wiki/Cdb_%28software%29](https://en.wikipedia.org/wiki/Cdb_%28software%29)
.
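
The general idea of a write-once hash table on disk can be sketched as follows. This is not the actual cdb layout; it is a simplified illustration that assumes a single slot table with linear probing, CRC-32 as the hash, and 32-bit little-endian offsets. The names `write_table` and `lookup` are made up for this example.

```python
import struct
import zlib

SLOT = struct.Struct("<II")  # (hash, record offset); offset 0 marks an empty slot
LENS = struct.Struct("<II")  # (key length, value length)


def write_table(path, mapping):
    """Write a dict of bytes -> bytes as: slot count, slot table, records."""
    items = list(mapping.items())
    nslots = max(1, 2 * len(items))  # keep the table at most half full
    data_start = 4 + nslots * SLOT.size
    slots = [(0, 0)] * nslots
    records = bytearray()
    for key, value in items:
        h = zlib.crc32(key)
        offset = data_start + len(records)
        records += LENS.pack(len(key), len(value)) + key + value
        i = h % nslots
        while slots[i][1] != 0:  # linear probing on collision
            i = (i + 1) % nslots
        slots[i] = (h, offset)
    with open(path, "wb") as f:
        f.write(struct.pack("<I", nslots))
        for slot in slots:
            f.write(SLOT.pack(*slot))
        f.write(records)


def lookup(path, key):
    """Return the value stored for key, or None if it is absent."""
    h = zlib.crc32(key)
    with open(path, "rb") as f:
        (nslots,) = struct.unpack("<I", f.read(4))
        i = h % nslots
        for _ in range(nslots):
            f.seek(4 + i * SLOT.size)
            slot_hash, offset = SLOT.unpack(f.read(SLOT.size))
            if offset == 0:
                return None  # empty slot: the key was never stored
            if slot_hash == h:
                f.seek(offset)
                klen, vlen = LENS.unpack(f.read(LENS.size))
                if f.read(klen) == key:
                    return f.read(vlen)
            i = (i + 1) % nslots
    return None
```

Because the data is constant, the writer knows the final number of keys up front and can size the table once. A lookup then costs a few seeks and never needs to rewrite or rebalance anything.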

Many stream-based formats use the FourCC approach of tagging each chunk of
data with a four-character code; PNG is probably the most common example. See
[https://en.wikipedia.org/wiki/FourCC](https://en.wikipedia.org/wiki/FourCC) .
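
As an illustration of chunk-based framing, here is a small sketch that follows the PNG layout: an 8-byte signature, then chunks made of a 4-byte big-endian length, a 4-byte type code, the data, and a CRC-32 over type plus data. It builds a 1x1 grayscale image in memory and walks its chunks. The helper names `write_chunk`, `iter_chunks`, and `tiny_png` are made up for this example.

```python
import io
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def write_chunk(out, chunk_type, data):
    """Append one chunk: length, 4-byte type, data, CRC-32 of type + data."""
    out.write(struct.pack(">I", len(data)))
    out.write(chunk_type)
    out.write(data)
    out.write(struct.pack(">I", zlib.crc32(chunk_type + data)))


def iter_chunks(stream):
    """Yield (type, data) for each chunk, verifying the CRC as we go."""
    if stream.read(8) != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    while True:
        head = stream.read(8)
        if len(head) < 8:
            return
        length, chunk_type = struct.unpack(">I4s", head)
        data = stream.read(length)
        (crc,) = struct.unpack(">I", stream.read(4))
        if crc != zlib.crc32(chunk_type + data):
            raise ValueError("CRC mismatch in chunk %r" % chunk_type)
        yield chunk_type, data
        if chunk_type == b"IEND":
            return


def tiny_png():
    """Build a 1x1 8-bit grayscale PNG in memory."""
    out = io.BytesIO()
    out.write(PNG_SIGNATURE)
    # width, height, bit depth, color type, compression, filter, interlace
    write_chunk(out, b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    # one scanline: filter byte 0 followed by a single black pixel
    write_chunk(out, b"IDAT", zlib.compress(b"\x00\x00"))
    write_chunk(out, b"IEND", b"")
    return out.getvalue()


if __name__ == "__main__":
    for chunk_type, data in iter_chunks(io.BytesIO(tiny_png())):
        print(chunk_type, len(data))
```

The length prefix is what makes this style convenient for streams: a reader that does not recognize a type code can skip the chunk without understanding its contents.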

Then there are more special-purpose data structures, such as suffix trees
stored on disk for genomic data. See
[http://www.cs.rpi.edu/~zaki/PaperDir/PS/SIGMOD07.pdf](http://www.cs.rpi.edu/~zaki/PaperDir/PS/SIGMOD07.pdf)
,
[http://webhome.cs.uvic.ca/~thomo/papers/cikm08suffixtrees.pd...](http://webhome.cs.uvic.ca/~thomo/papers/cikm08suffixtrees.pdf)
, and
[http://www.cidrdb.org/cidr2013/Papers/CIDR13_Paper3.pdf](http://www.cidrdb.org/cidr2013/Papers/CIDR13_Paper3.pdf)
for three arbitrarily selected papers.

