
XML and ZIP don't really do incremental updates, so the whole application file has to be rewritten on every save, and a hiccup mid-write can corrupt it. With SQLite as the disk format and the right application implementation, you can't end up in a corrupted state.
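To make the "can't end up in a corrupted state" point concrete, here is a minimal sketch using Python's standard `sqlite3` module. The `paragraph` table and the `save_document`, `load_document`, and `fail_at` names are made up for illustration, and a Python exception stands in for the mid-write hiccup; it is not how any particular application does it.

```python
import sqlite3


def save_document(path, paragraphs, fail_at=None):
    """Replace all paragraphs in one transaction (hypothetical schema)."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS paragraph "
            "(pos INTEGER PRIMARY KEY, body TEXT)"
        )
        # The connection as a context manager commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute("DELETE FROM paragraph")
            for pos, body in enumerate(paragraphs):
                if pos == fail_at:
                    # Stand-in for a failure halfway through the save.
                    raise RuntimeError("simulated failure mid-save")
                conn.execute(
                    "INSERT INTO paragraph (pos, body) VALUES (?, ?)",
                    (pos, body),
                )
    finally:
        conn.close()


def load_document(path):
    """Return the stored paragraphs in order."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute("SELECT body FROM paragraph ORDER BY pos")
        return [row[0] for row in rows]
    finally:
        conn.close()
```

If the second save fails partway, the file still holds the previous version, because the DELETE and the partial INSERTs are rolled back together.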

I think you can achieve the same thing with xml/zip and some rename shenanigans, but sqlite lets you get that in a single file on disk.
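For reference, the "rename shenanigans" usually look something like the sketch below: write the new contents to a temporary file in the same directory, flush it to disk, then rename it over the old file. `atomic_write` is a hypothetical helper name, and the sketch assumes a filesystem where replacing a file by rename is atomic.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to path so readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        # Clean up the temp file if anything went wrong before the swap.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

The cost is the one mentioned above: the temporary file briefly exists next to the real one, so the save is not a single file on disk while it is in progress.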

Also if you are using sqlite as the memory model, why not use it as the disk/transport format? It's basically free at that point.

The file size issue can be dealt with using VACUUM (I believe; I haven't personally dealt with SQLite-as-file-format).




Incremental updates don't matter in a transport format.

The claim "it's basically free" isn't right: for transport you need to VACUUM, and possibly compress the result too. And if you do that... might as well use the existing format. VACUUM completely rewrites the file from scratch. Any incremental update to a VACUUMed file means it stops being VACUUMed, so you need to VACUUM it again to ensure minimal file size. Nothing is free.
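A rough way to see the VACUUM effect for yourself, sketched with Python's standard `sqlite3` module. The `chunk` table and the `measure_vacuum` name are invented for the demo, and it assumes the default `auto_vacuum` setting (off), where deleted pages stay in the file as free space.

```python
import os
import sqlite3


def measure_vacuum(path, rows=200, blob_size=4096):
    """Return file sizes after insert, after delete, and after VACUUM."""
    conn = sqlite3.connect(path)
    try:
        conn.execute("CREATE TABLE chunk (id INTEGER PRIMARY KEY, data BLOB)")
        with conn:  # one transaction for all inserts
            conn.executemany(
                "INSERT INTO chunk (data) VALUES (?)",
                ((os.urandom(blob_size),) for _ in range(rows)),
            )
        after_insert = os.path.getsize(path)

        with conn:  # deleting rows frees pages but keeps them in the file
            conn.execute("DELETE FROM chunk")
        after_delete = os.path.getsize(path)

        # VACUUM rebuilds the whole database; it must run outside a transaction.
        conn.execute("VACUUM")
        after_vacuum = os.path.getsize(path)
    finally:
        conn.close()
    return after_insert, after_delete, after_vacuum
```

On a fresh path, the size after the delete should stay close to the size after the insert, and only the VACUUM brings it down, which is the full rewrite described above.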

ZIP can also be incrementally updated (file by file), by the way; I think MS Word uses this feature in some saves. But that's beside the point. You simply do not need incremental updates in a transport format.
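As a small illustration of the file-by-file update, Python's standard `zipfile` module can open an existing archive in append mode and add a member without rewriting the ones already there. `add_member` is a hypothetical helper, and this says nothing about what MS Word actually does on save.

```python
import zipfile


def add_member(archive_path, name, data):
    """Append one member to a ZIP archive, creating the archive if needed."""
    # Mode "a" keeps the existing members in place and adds the new one;
    # the archive's directory of entries is written again on close.
    with zipfile.ZipFile(archive_path, "a", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, data)


def list_members(archive_path):
    """Return member names in archive order."""
    with zipfile.ZipFile(archive_path, "r") as zf:
        return zf.namelist()
```

Note that this only adds members. Writing a member under a name that already exists leaves the old entry in the archive as well, so replacing a part still means rebuilding the archive.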

I'm not sure what "hiccups mid-write" you're referring to. Any such hiccup that would damage an XML or ZIP file would also damage an SQLite file.

The distinction between a working disk file and a transport format is important. The working disk file is large, binary, messy, complex, optimized for quick look-ups and quick partial updates. If your word processor crashes, it can restore state from the working disk format in no time.

But the transport format needs to be small, readable, debuggable, simple, stable. And SQLite simply doesn't offer anything significantly superior in that department compared to the existing format. Especially nothing to justify the additional effort of changing an already working solution.

There's a reason "serialization" is called that: it's just serial data. No random access structures, no indices, a single representation, often text-based. Over the decades, we've learned this is the best way to transport data of any kind. The messy/partial/polymorphic/cryptic/hyperoptimized/indexed formats are not for transport. They're intended for doing work in, locally.



