
> Do boring stuff. Use files.

I have the opposite experience. Files are far from 'boring'; they have too many gotchas to count. Putting files in the db is for many use cases actually easier than putting them on the filesystem. By putting all data in the db you have fewer moving parts, plus transactions, referential integrity, well defined behaviours, etc. The speed increase is just a nice side effect.
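
Roughly what I mean, as a minimal sketch with SQLite (the table and column names are made-up placeholders, not from any real project):

    # Blob plus metadata land in one transaction: no partial writes,
    # no orphaned temp files, no fsync/rename dance.
    import hashlib, sqlite3

    conn = sqlite3.connect("store.db")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS documents (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE IF NOT EXISTS blobs (
            doc_id INTEGER NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
            sha256 TEXT NOT NULL,
            data   BLOB NOT NULL
        );
    """)

    def put_file(name, data):
        with conn:  # commits on success, rolls back on any exception
            cur = conn.execute("INSERT INTO documents (name) VALUES (?)", (name,))
            conn.execute("INSERT INTO blobs (doc_id, sha256, data) VALUES (?, ?, ?)",
                         (cur.lastrowid, hashlib.sha256(data).hexdigest(), data))

    put_file("report.pdf", b"%PDF-1.7 ...")

Doing the same with loose files means a temp file, an fsync, an atomic rename and a separate metadata store that can silently drift out of sync with the directory tree.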



> By putting all data in the db you have fewer moving parts, plus transactions, referential integrity, well defined behaviours, etc

If you are doing it for 100k files or fewer, you are using a Ferrari to pick up eggs from a corner store. If you are doing it for more than 100k files, you are using a Ferrari to move a pile of paving stones one by one.


Show me a filesystem that can efficiently hold onto all the inode information for a blockchain represented as files and directories. (Hint: not even LevelDB can hold onto all the trie information efficiently; solutions are being sought that pack things tighter than LevelDB.)


You don't need to do it efficiently. You are optimizing for a non-existent problem.

Here's what you actually need to optimize for: you have a hundred million files. They are all somehow reachable via http://origin/someuniqueurl. You have multiple copies because you aren't an idiot and you know that users hate losing files or getting corrupted files back. You have fingerprints associated with every copy. Something happens and you can't reach part of the tree, or you are getting incorrect hashes ( which you know because your origin computes the hashes every time someone requests a file, and if those hashes don't match it triggers a re-request to a backup copy; there is a rough sketch of that read path after the list below ). And now you need to

(a) ensure you still have the needed protection factor ( say you went from 3x to 2x because one copy is dead ) for all affected files

(b) minimize the time needed to remove the "thing" ( probably a disk or a node ) that caused the failure.

(c) minimize the cost of ensuring (a) and (b)
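
To make that read path concrete, here is a rough sketch of the verify-on-read, fall-back-to-a-replica behaviour described above ( the replica map, locations and helper names are placeholders, not any real system's API ):

    import hashlib

    # Fake replica contents keyed by (location, object_id); in reality these
    # would be reads from separate disks/nodes.
    FAKE_REPLICAS = {
        ("nodeA/volumeB", "obj1"): b"corrupted bytes",
        ("nodeC/volumeD", "obj1"): b"original bytes",
        ("nodeE/volumeF", "obj1"): b"original bytes",
    }
    LOCATIONS = ["nodeA/volumeB", "nodeC/volumeD", "nodeE/volumeF"]  # 3x protection

    def report_bad_copy(location, object_id):
        # Queue re-replication so the protection factor goes back from 2x to 3x,
        # and count strikes against the disk/node so it can be pulled quickly.
        print("flag", location, "for", object_id)

    def read_object(object_id, expected_sha256):
        for location in LOCATIONS:
            data = FAKE_REPLICAS.get((location, object_id))
            if data is None:                          # copy unreachable
                report_bad_copy(location, object_id)
                continue
            if hashlib.sha256(data).hexdigest() == expected_sha256:
                return data
            report_bad_copy(location, object_id)      # hash mismatch: corrupt copy
        raise IOError("all copies of %s are unreadable or corrupt" % object_id)

    expected = hashlib.sha256(b"original bytes").hexdigest()
    read_object("obj1", expected)  # flags nodeA/volumeB, serves nodeC/volumeD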

People spent lots of time figuring out very clever ways of doing it via a DB. It neither scaled nor worked well when real life (i.e. disk crashes, nodes going away, bad data being returned) happened.

The only things that you really need to know are

(a) where the data and its copies are stored.

(b) what is the hash of the data that is stored there

(c) how fast you can recover the information about where that data is stored in the event of an issue with whatever system is used to keep track of (a) and (b).

(d) preferably, you also want to know what other data objects ( files ) could be affected when some of the other objects are misbehaving: if you know a file at nodeA/volumeB had a fingerprint Ykntr8H8pL9PyAtCwdw/CB5tToXTPf55+hKSZb0uhV0 when it was written, and now for some reason it says its fingerprint is sw0SI4jkU5VVk6CH4oPYtwx+bK2hIlrw8hVM7i9zmNk while the rest of the copies are saying their fingerprint is still Ykntr8H8pL9PyAtCwdw/CB5tToXTPf55+hKSZb0uhV0, you could decide that you want to trash nodeA/volumeB because you no longer trust it ( a small sketch of this bookkeeping is below ).
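
A sketch of how little bookkeeping (a)-(d) actually require ( the dict layout and names are illustrative only ):

    from collections import defaultdict

    # (a)+(b): object -> [(location, fingerprint recorded at write time), ...]
    copies = {
        "obj1": [("nodeA/volumeB", "Ykntr8H..."), ("nodeC/volumeD", "Ykntr8H...")],
        "obj2": [("nodeA/volumeB", "Qm3x9s..."), ("nodeE/volumeF", "Qm3x9s...")],
    }

    # (d): a reverse index answers "what else lives on the thing I no longer trust?"
    by_location = defaultdict(set)
    for obj, locs in copies.items():
        for location, _fp in locs:
            by_location[location].add(obj)

    def distrust(location):
        # Every object listed here just lost one trusted copy and needs
        # re-replication to restore its protection factor.
        return sorted(by_location[location])

    print(distrust("nodeA/volumeB"))  # -> ['obj1', 'obj2']

(c) is then just a question of how fast you can rebuild or reload these two maps when the system that tracks them has its own bad day.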


Er, yes, you do need to do it efficiently. A blockchain needs to read about 1000 of these “files” per transaction in order to verify the next block (and there are ~10000 transactions per block, and the accesses have no special locality because they’re a mixture of requests submitted by unaffiliated parties). It’s an at-scale OLTP system (ingesting thousands of TPS) that is required to execute on consumer hardware.

This is a real problem. The current implementation of Ethereum is IOPS-bound precisely because of this overhead: ~80% of the time spent verifying a block goes to filesystem and index-access overhead in getting the relevant data into memory, rather than to the raw bandwidth of the data required. This is an on-disk data-structures problem.
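
A back-of-envelope with those numbers (plus one assumption of mine: a consumer SSD doing on the order of 100k random reads per second) shows why it's the access pattern, not the byte count, that hurts:

    reads_per_tx  = 1_000      # trie "files" read to verify one transaction
    txs_per_block = 10_000     # transactions per block (figures from above)
    ssd_iops      = 100_000    # rough consumer-SSD random-read rate (assumption)

    random_reads = reads_per_tx * txs_per_block   # 10,000,000 point reads per block
    print(random_reads / ssd_iops, "seconds of random-read time per block")  # ~100 s

If every one of those reads actually hit the disk you would spend on the order of 100 seconds per block just seeking, which is why caching and packing the trie tighter on disk matter so much.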


Oh, this is the blockchain discussion, i.e. another solution in search of a problem. Never mind.



