Question: I'm building a 3D procedural geometry generator, and I need to store and index the output by tiles spanning hundreds of kilometers.
I'm curious whether it can be efficient to store polygonal data in a database instead of a dedicated file format.
I'm never sure how to properly copy raw struct binary data from SQL directly into RAM. I think avoiding parsing the text output results in better performance, but I'm not entirely sure...
It sounds like PostGIS might be a good fit. It gives you a lot of tools for accessing and indexing geometry data, and it can scale to huge sizes.
Parsing text formats is unlikely to be the bottleneck; accessing the data is. The right data structure (schema) and indexes are going to make the biggest difference.
If you do complex computations on the data and it fits in RAM, you'll probably see the best performance if you first load it into data structures optimized for the computation; how you store the data on disk matters less.
We're using SQLite archives of many GB successfully without issue. As long as the primary keyspace is well-designed (see also https://sqlite.org/withoutrowid.html), ranged queries are extremely fast.
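For illustration, here's a minimal sketch of what such a keyspace might look like, assuming tiles are addressed by integer grid coordinates (the table and column names are hypothetical):

```c
#include <sqlite3.h>
#include <stdio.h>

/* Hypothetical schema: tiles keyed by integer grid coordinates.
   WITHOUT ROWID stores the rows in the primary-key B-tree itself,
   so a ranged scan over the composite key reads contiguous pages. */
static const char *schema =
    "CREATE TABLE IF NOT EXISTS tiles ("
    "  tx   INTEGER NOT NULL,"
    "  ty   INTEGER NOT NULL,"
    "  mesh BLOB    NOT NULL,"
    "  PRIMARY KEY (tx, ty)"
    ") WITHOUT ROWID;";

/* Equality on the leading key column plus a range on the second
   lets SQLite resolve the query entirely from the primary key. */
static const char *range_query =
    "SELECT tx, ty, mesh FROM tiles "
    "WHERE tx = ?1 AND ty BETWEEN ?2 AND ?3;";

int main(void) {
    sqlite3 *db;
    if (sqlite3_open("world.db", &db) != SQLITE_OK) return 1;
    sqlite3_exec(db, schema, NULL, NULL, NULL);

    sqlite3_stmt *stmt;
    sqlite3_prepare_v2(db, range_query, -1, &stmt, NULL);
    sqlite3_bind_int(stmt, 1, 42);    /* tile column        */
    sqlite3_bind_int(stmt, 2, 100);   /* ty range: from ... */
    sqlite3_bind_int(stmt, 3, 200);   /* ... to             */

    while (sqlite3_step(stmt) == SQLITE_ROW) {
        const void *blob = sqlite3_column_blob(stmt, 2);
        int len = sqlite3_column_bytes(stmt, 2);
        printf("tile (%d,%d): %d bytes\n",
               sqlite3_column_int(stmt, 0),
               sqlite3_column_int(stmt, 1), len);
        (void)blob; /* deserialize the mesh here */
    }
    sqlite3_finalize(stmt);
    sqlite3_close(db);
    return 0;
}
```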
> I'm never sure how to properly copy raw struct binary data from sql, directly in ram.
BLOB columns and an application-specific serialization/deserialization step work well. memcpy to a struct works if you are absolutely certain of the layout. All of the standard perils apply - alignment, internal padding, platform-specific layout, endianness, etc.
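To make those perils concrete, here is a minimal sketch of the memcpy route, assuming a hypothetical fixed-layout vertex record:

```c
#include <sqlite3.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical vertex record. Fixed-width types plus explicit
   packing pin down the layout; without that, padding and
   alignment can differ between the writer and the reader. */
#pragma pack(push, 1)
typedef struct {
    float    x, y, z;
    uint32_t flags;
} Vertex;
#pragma pack(pop)

/* Copy a BLOB column straight into structs. Safe only if the
   writer used the exact same layout and byte order. */
int read_vertices(sqlite3_stmt *stmt, int col,
                  Vertex *out, size_t max_count) {
    const void *blob = sqlite3_column_blob(stmt, col);
    int nbytes = sqlite3_column_bytes(stmt, col);
    if (nbytes < 0 || (size_t)nbytes % sizeof(Vertex) != 0)
        return -1;                       /* size mismatch: reject */
    size_t count = (size_t)nbytes / sizeof(Vertex);
    if (count > max_count) return -1;
    memcpy(out, blob, (size_t)nbytes);   /* no parsing, one copy */
    return (int)count;
}
```

Note that the one peril this doesn't address is endianness: if the data has to travel between platforms, an explicit serialization step (or a format like the ones mentioned below) is the safer bet.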
We're using Protobuf with success. I imagine FlatBuffers would also work well. I'd put Protobuf/FlatBuffers and their competitors at the front of the tool shelf.
I am working on an app that needs some GIS features. SpatiaLite and GDAL cover my use case. You can even eliminate GDAL if you don't retrieve data as binary, but it is useful in other places too.
Storing small blobs, by which I mean roughly <400 kB, is a use case where SQLite often excels - provided they are mostly write-once, read-only. I can't remember the paper right now, but one of its results was that repeatedly overwriting rows degrades performance gradually (until you VACUUM, presumably).
I've used SQLite for this quite often, since it's convenient to store additional queryable metadata alongside the blobs. I know of a person who uses an SQLite database for storing emojis in a chat client, as well.
You can open blobs directly as binary data as well, almost like files. See sqlite3_blob_open and friends.
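A minimal sketch of that incremental blob API, assuming a hypothetical rowid table (the blob API addresses rows by rowid, so it does not work on WITHOUT ROWID tables like the one sketched earlier):

```c
#include <sqlite3.h>

/* Read a blob incrementally, like a file, without loading it
   all into memory at once. Assumes a hypothetical table:
     CREATE TABLE meshes(id INTEGER PRIMARY KEY, data BLOB); */
int stream_mesh(sqlite3 *db, sqlite3_int64 rowid) {
    sqlite3_blob *blob;
    int rc = sqlite3_blob_open(db, "main", "meshes", "data",
                               rowid, 0 /* read-only */, &blob);
    if (rc != SQLITE_OK) return rc;

    int total = sqlite3_blob_bytes(blob);
    char buf[4096];
    for (int off = 0; off < total; off += (int)sizeof(buf)) {
        int n = total - off;
        if (n > (int)sizeof(buf)) n = (int)sizeof(buf);
        rc = sqlite3_blob_read(blob, buf, n, off);
        if (rc != SQLITE_OK) break;
        /* ...feed buf to the mesh loader here... */
    }
    sqlite3_blob_close(blob);
    return rc;
}
```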
The question you should ask, though, is: what are your access/write patterns? Once you know where SQLite excels, you can accurately assess whether it fits your use case. Without knowing more details, I can't say one way or the other.
I was thinking of implementing almost exactly this for mesh decimation. Certainly a decade or so ago you had enormous point clouds being generated that were too big to fit in RAM, and none of the common mesh decimation tools worked off disk. I imagine it's less of a constraint now.