
I remember reading once upon a time about the sqlite file format being a good candidate for data transfer between systems/machines, as it's standardized, stable, and infinitely queryable. Has anyone here had any experience actually using it as such? What are your takeaways?



Yes, it works very well.

If you have only one thread/process, use it as is; if you need multiple readers and one writer (who do not block each other), use WAL mode; if you need simultaneous multiple readers and writers, use Postgres.
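Enabling WAL is just a pragma; roughly, with Python's built-in sqlite3 module (the file name is only a placeholder):

    import sqlite3

    # Open (or create) the database; "app.db" is an example name.
    conn = sqlite3.connect("app.db")

    # Switch the journal to WAL so readers don't block the single writer.
    # The setting is persistent - it sticks with the database file.
    conn.execute("PRAGMA journal_mode=WAL")

    conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
    conn.execute("INSERT OR REPLACE INTO kv VALUES ('hello', 'world')")
    conn.commit()
    conn.close()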

If you are using SQLite, be aware that the “copy the file to move it to a different system” approach comes with a caveat: a regular Unix “cp” may catch the file in an inconsistent state if it is modified while copying, or in the middle of a transaction.

IIRC “rsync”, if used to copy, will verify that the file was not modified while copying and will restart the copy if it was (a number of times, then error out). But that’s still not enough - you may also need to copy the shm file (in WAL mode) and the journal file - and you need to do that atomically. Which is harder still.

The right way to copy an SQLite file from place to place is:

Any way you like, if it was properly committed and closed the last time it was used, and is not currently open by any other process;

Or by creating a backup (e.g. with the sqlite3 command-line tool), which is a non-blocking operation in WAL mode - and then you can ship the backup whichever way you want.
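With the Python sqlite3 module, for example, the online backup API looks roughly like this (file names are placeholders):

    import sqlite3

    # Source database, possibly in WAL mode and in active use;
    # "live.db" and "snapshot.db" are example names.
    src = sqlite3.connect("live.db")
    dst = sqlite3.connect("snapshot.db")

    # The online backup API (Python 3.7+) copies a consistent snapshot
    # of the source, equivalent to ".backup" in the sqlite3 shell; the
    # snapshot file can then be shipped around like any other file.
    src.backup(dst)

    dst.close()
    src.close()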


We have been using SQLite as a distributed object container for the last few years as part of an industry spec: <http://www1.semi.org/en/ritdb-interplanetary-database-manufa.... It has been working quite well on Windows, OS X, and Linux, with Java, R, C, JS, and Python clients. We are using MQTT as the transport.


Your link gives 404.


Remove the >


We use a common SQLite schema for exporting/importing very complex application traces between environments (several tables and lots of relationships). If we didn't have SQLite managing this for us, it would probably be some horrific JSON blob on the wire with serialization+LINQ on both ends blowing up RAM and crashing things.

With SQLite, it's as simple as dumping the .db file from one machine to another and running your analysis tools over it - aka executing simple SQL queries from various reporting views. We also see really good compression on this type of structured data (we just compress the entire .db file on disk).
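The consuming side really is just SQL; a rough sketch with Python's sqlite3 module (the file name and the "slow_requests" view are made-up examples, not our actual schema):

    import sqlite3

    # Open the shipped trace file read-only; nothing else to set up.
    conn = sqlite3.connect("file:traces.db?mode=ro", uri=True)

    # Run a reporting query against one of the views baked into the schema.
    for row in conn.execute(
        "SELECT trace_id, duration_ms FROM slow_requests "
        "ORDER BY duration_ms DESC LIMIT 10"
    ):
        print(row)

    conn.close()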

Our analysis tools are typically just DB Browser for SQLite, a tool I would say is 10x better and faster than SSMS: https://sqlitebrowser.org


I use it as exactly this. I use a bespoke schema for a case where I need to move millions of small files around and be able to access them efficiently. It actually works out to be faster than filesystem access in my case, and so much easier than trying to manage tons of tiny files.

Recent versions of SQLite's CLI also have an "sqlar" mode [1] that lets you treat SQLite database files as archive files. Unlike with some other archive formats, adding files to and removing files from an existing sqlar file is a first-class operation, since it's just a database file. It's basically what I'm doing for my "ton of tiny files" case, just with a documented schema.

[1] https://www.sqlite.org/sqlar.html
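If I'm reading the format docs right, pulling a member back out with plain Python looks roughly like this (archive and member names are made up):

    import sqlite3
    import zlib

    # An sqlar archive is just a database with one "sqlar" table.
    conn = sqlite3.connect("backup.sqlar")

    # "notes/readme.txt" is an example member name.
    sz, data = conn.execute(
        "SELECT sz, data FROM sqlar WHERE name = ?", ("notes/readme.txt",)
    ).fetchone()

    # Per the sqlar format, content is zlib-compressed unless that would
    # not save space, in which case it is stored as-is (len(data) == sz).
    content = data if len(data) == sz else zlib.decompress(data)
    print(content.decode())

    conn.close()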



