Excellent question! I'll jump in - I am a part of the DuckDB team though, so if ...

OskarS · on May 12, 2022

One of SQLite's most appealing aspects to me is using it as an application file format, as described in this article: https://www.sqlite.org/appfileformat.html

How does DuckDB compare in that aspect? Does it have the same kind of guarantees of robustness, incorruptibility and performance (especially reading/writing binary blobs) that SQLite does?

In any case: DuckDB looks great, nice work! Good to have more players in this space!

1egg0myegg0 · on May 12, 2022

It is a goal of ours to become a standard multi-table storage format! However, today we are still in beta and have made some breaking changes in the last few releases. (Exporting from the old version, then reimporting your DB in the new allows you to upgrade!) Those should happen less often as we move forward (the storage format was genericized a bit and is more resilient to future enhancements now), and locking in our format amd guaranteeing backwards compatibility will occur when we go to 1.0!

tosh · on May 12, 2022

Thanks, it does help! I understand SQLite might be better/ideal for OLTP (?) but would DuckDB also work for use cases where I query for specific records (e.g. based on primary key) or would I rather use SQLite for OLTP stuff and then read SQLite from DuckDB for analytical workloads?

Basically I'm wondering: if I go all in on DuckDB instead of SQLite would I notice? Do I have to keep anything in mind?

I know, probably difficult to answer without a concrete example of data, schema, queries and so on.

The SQL query features in the article seem really neat. Kudos @ shipping.

1egg0myegg0 · on May 12, 2022

Good questions! You are correct that it depends. We do have indexes to help with point queries, but they are not going to be quite as fast as SQLite because DuckDB stores data in a columnar format. (Soon they will be persistent - see comments above!) That columnar format is really great for scanning many items, but not optimal for grabbing all of a single row.

With DuckDB, bulk inserts are your friend and are actually super fast.

Definitely let us know what you find! Just open up a discussion on Github if you'd like to share what you find out: https://github.com/duckdb/duckdb/discussions

1egg0myegg0 · on May 12, 2022

I should add that we can read/write Apache Arrow as well!