
I was also curious about choosing or creating a file format when writing a low-level program that outputs a lot of data.

Any recommendations or popular reading?

I realize some choices are made for proprietary reasons, others for compression, but that aside, there seem to be thousands of choices available.

Compare that to the web, where people use JSON etc. and release a schema, but don't necessarily create a new file extension.

My current go-to for this is SQLite. It's basically made for this purpose. If that doesn't serve, I like the idea of Apache Avro, but some of its C++ bindings are a little lacking in my opinion.
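To make the SQLite suggestion concrete, here is a minimal sketch using Python's standard sqlite3 module. The table layout, the column names, and the helpers write_samples and read_channel are all made up for illustration; the only point is that the output file is an ordinary SQLite database that any SQLite client can open.

```python
import sqlite3

def write_samples(path, samples):
    """Append (timestamp, channel, value) rows to a SQLite file (hypothetical schema)."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits the inserts together on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS samples ("
                "ts REAL NOT NULL, channel TEXT NOT NULL, value REAL NOT NULL)"
            )
            # user_version is a free integer slot in the file header,
            # handy for tagging your own format revision.
            conn.execute("PRAGMA user_version = 1")
            conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", samples)
    finally:
        conn.close()

def read_channel(path, channel):
    """Return [(ts, value), ...] for one channel, ordered by timestamp."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute(
            "SELECT ts, value FROM samples WHERE channel = ? ORDER BY ts",
            (channel,),
        ).fetchall()
    finally:
        conn.close()
```

Batching the inserts through executemany inside one transaction is what keeps this reasonably fast for large outputs; committing per row is usually the thing that makes SQLite feel slow.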

This is a fantastic first choice, particularly as it sets you up for using a more "real" database for sharing data/scaling in the future.

OTOH, you have to know when not to use it and step up (down?) to something that is text-editor hackable (XML!?) or has barn-burner I/O abilities (yeah, actually just dumping raw buffers of regularized binary data to disk). Or, for that matter, something used to exchange data with other apps and services (JSON, and the long list of other data-dependent formats, although for at-rest exchange I have to point at XML again).
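For the "just dump raw buffers" end of the spectrum, a rough sketch might look like the following. The RAWF magic tag, the header layout, and the dump/load helpers are invented for this example and not any established format; it assumes the payload is a flat run of 64-bit floats stored little-endian.

```python
import struct
import sys
from array import array

MAGIC = b"RAWF"                  # hypothetical 4-byte tag, not a real format
HEADER = struct.Struct("<4sHQ")  # magic, version, item count; little-endian, unpadded

def dump(path, values):
    """Write a small fixed header followed by the raw float64 buffer."""
    data = array("d", values)
    if sys.byteorder != "little":
        data.byteswap()          # keep the on-disk layout little-endian
    with open(path, "wb") as f:
        f.write(HEADER.pack(MAGIC, 1, len(data)))
        data.tofile(f)

def load(path):
    """Read the header, check it, then read the buffer back in one call."""
    with open(path, "rb") as f:
        magic, version, count = HEADER.unpack(f.read(HEADER.size))
        if magic != MAGIC or version != 1:
            raise ValueError("not a version-1 RAWF file")
        data = array("d")
        data.fromfile(f, count)  # raises EOFError if the file is truncated
    if sys.byteorder != "little":
        data.byteswap()
    return data.tolist()
```

Even a throwaway format like this benefits from a magic tag, a version number, and a fixed byte order up front, since those are the parts that are painful to retrofit once files exist in the wild.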

I agree, thanks for the reminder. As an example, I was working with mass spectrometry data recently and found a list of about 20-30 possible formats for that topic alone (mostly proprietary) [0].

[0] https://en.wikipedia.org/wiki/Mass_spectrometry_data_format
