A "database" in OrbitDB is a single document or single log, with write permissions fixed at creation. So if you want ten users each to be able to write their own log of data, but not touch the others', you would have ten separate OrbitDB "databases". Anyone can read.
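A toy plain-JS sketch of that access model (not OrbitDB's actual API; all names here are illustrative): each "database" carries an immutable writer set fixed at creation, so ten independently-writable logs means ten databases.

```javascript
// Illustrative model only: OrbitDB's real access control is enforced via
// signed entries, but the shape is the same -- the writer set is fixed
// when the database is created and can't be changed afterwards.
class Database {
  constructor(name, writerKeys) {
    this.name = name;
    this.writers = Object.freeze([...writerKeys]); // fixed at creation
    this.entries = [];
  }
  add(authorKey, payload) {
    if (!this.writers.includes(authorKey)) {
      throw new Error(`${authorKey} may not write to ${this.name}`);
    }
    this.entries.push({ author: authorKey, payload });
  }
}

// Ten users -> ten databases, each writable by exactly one key.
const users = Array.from({ length: 10 }, (_, i) => `key-${i}`);
const dbs = users.map((key) => new Database(`log-of-${key}`, [key]));

dbs[0].add('key-0', 'hello');        // ok: key-0 owns log-of-key-0
// dbs[0].add('key-1', 'nope');      // would throw: not in the writer set
console.log(dbs[0].entries.length);  // reads are open to anyone
```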
Each new piece of data is an "Entry", written to its own IPFS address. Each entry contains a bunch of IPFS address pointers: to the last entry on that "database", to any extra entries you've found out about, and to a number of earlier entries (just to speed up reading). So given one entry, you can keep recursively crawling all the previous IPFS addresses contained in it, discovering every previously known entry that makes up that database to that point. The entries work a lot like git commits. It's a sweet DAG.
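A minimal sketch of that traversal, with a plain `Map` standing in for IPFS content-addressed storage and `next` playing the role of the hash pointers (all invented names): from one head entry you can recursively reach the whole log.

```javascript
// A Map of address -> entry stands in for IPFS; addresses are fake
// sequential "hashes" (real IPFS derives them from the content).
const store = new Map();
let counter = 0;
function put(entry) {
  const address = `Qm${counter++}`;
  store.set(address, entry);
  return address;
}

// Build a tiny log: each entry points at previous entries by address.
const a = put({ payload: 'first',  next: [] });
const b = put({ payload: 'second', next: [a] });
const c = put({ payload: 'third',  next: [b, a] }); // extra pointer, like the read-speedup refs

// Given one head address, crawl every reachable earlier entry.
function crawl(head) {
  const seen = new Set();
  const stack = [head];
  while (stack.length) {
    const addr = stack.pop();
    if (seen.has(addr)) continue;
    seen.add(addr);
    for (const prev of store.get(addr).next) stack.push(prev);
  }
  return seen;
}

console.log(crawl(c).size); // 3 -- the whole log is reachable from the head
```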
To get started knowing the state of a database, you have to have its latest "head" entry. For that, you need some way of talking live to other OrbitDB "peers" hosting that database and asking them for their latest heads. OrbitDB does this through IPFS PubSub: it looks for other peers subscribed to the same database and exchanges latest head entries with them. Importantly, this means that if no one is currently online with your database, you can't get its current state.
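A rough sketch of that head exchange (purely illustrative; the real thing rides on IPFS PubSub messages, and these function names are made up): each peer announces its known heads on the database's topic, and receivers merge any heads they haven't seen.

```javascript
// Toy in-process pubsub: topic -> list of subscriber callbacks.
const topics = new Map();
function subscribe(topic, handler) {
  if (!topics.has(topic)) topics.set(topic, []);
  topics.get(topic).push(handler);
}
function publish(topic, msg) {
  for (const handler of topics.get(topic) || []) handler(msg);
}

// Each peer tracks the head entry addresses it knows for one database.
function makePeer(name) {
  const peer = { name, heads: new Set() };
  subscribe('/orbitdb/my-db', ({ from, heads }) => {
    if (from === name) return;                // ignore our own messages
    for (const h of heads) peer.heads.add(h); // merge newly learned heads
  });
  peer.announce = () =>
    publish('/orbitdb/my-db', { from: name, heads: [...peer.heads] });
  return peer;
}

const alice = makePeer('alice');
const bob = makePeer('bob');
alice.heads.add('QmHeadA');
bob.heads.add('QmHeadB');

alice.announce(); // bob learns QmHeadA
bob.announce();   // alice learns QmHeadB
console.log([...alice.heads].sort()); // [ 'QmHeadA', 'QmHeadB' ]
```

If no other peer is subscribed when you announce, nobody answers, which is exactly the offline problem described above.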
OrbitDB (and IPFS PubSub) are definitely, absolutely not production ready. But that's another topic.
How bad is the latency to read the entire database state? Even git can become pretty slow fetching and processing objects on large repos, and all of those refs are local after the initial fetch.
Seems like with the additional latency on IPFS, resolving the entire DB state (for snapshotting it or otherwise backing it up, for instance) would be unusably slow for anything even somewhat large, no?
In my testing, I saw OrbitDB load 10-40 entries per second (with 1-byte entry data payloads). Each entry points to up to 64 previous entries, which allows a lot of parallelism when loading earlier entries.
Data size is also an issue, and one that isn't helped by parallelism.
And your database only ever grows, both in size and in number of entries.
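A sketch of why those 64 pointers help (toy numbers, simulated fetch latency, invented names): a breadth-first crawl with `Promise.all` pays roughly one round-trip per "wave" of pointers rather than one per entry.

```javascript
// Simulated content store where each of 128 entries points at up to
// 64 earlier entries, and every fetch has a small fixed latency.
const store = new Map();
for (let i = 0; i < 128; i++) {
  const next = [];
  for (let j = Math.max(0, i - 64); j < i; j++) next.push(`e${j}`);
  store.set(`e${i}`, { payload: i, next });
}
const fetchEntry = (addr) =>
  new Promise((resolve) => setTimeout(() => resolve(store.get(addr)), 5));

// Breadth-first crawl: fetch each wave of unseen addresses in parallel.
async function load(head) {
  const seen = new Set();
  let wave = [head];
  let rounds = 0;
  while (wave.length) {
    rounds++;
    wave.forEach((a) => seen.add(a));
    const entries = await Promise.all(wave.map(fetchEntry));
    wave = [...new Set(entries.flatMap((e) => e.next))]
      .filter((a) => !seen.has(a));
  }
  return { entries: seen.size, rounds };
}

// All 128 entries load in a handful of latency round-trips, instead of
// 128 sequential fetches -- but total bytes transferred stays the same.
load('e127').then((r) => console.log(r));
```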
That seems to be the story of the whole blockchain space, unfortunately. It's incomplete at every level of the stack; even the IPFS pub/sub layer still needs a lot of work to scale.
A problem with the blockchain community is an "ICO first, do the work later" mentality. It's a bit like the startup mentality of Silicon Valley.
I think that what works for centralization won't work for decentralization. You need the people who own the coins to be skilled and to contribute directly to the projects; you don't want your network to be made up of random shareholders who don't understand what they're buying and aren't capable of adding real value to it.
- Patchwork has grown & scaled with spanning trees.
- Notabug.io is handling a large influx of users with daisy-chain ad-hoc mesh networks (DAM).