
FlureeDB, a Practical Decentralized Database (2017) [pdf] - tosh
https://flur.ee/assets/pdf/flureedb_whitepaper_v1.pdf
======
marknadal
The paper says there aren't practical decentralized solutions, but doesn't
mention any prior research/comparisons to existing systems?

How does it compare against:

\- IPFS (yes, plenty of people are doing DB stuff on IPFS)

\- GUN (full disclosure, I'm the author,
[https://github.com/amark/gun](https://github.com/amark/gun) )

\- Scuttlebutt ( [http://scuttlebot.io/](http://scuttlebot.io/) )

\- Beaker Browser

They go on to talk about "append-only" data structures (this alone does not
make a system a blockchain or decentralized, but I don't see any further
reasoning on it), because technically, you could do things very similarly
with:

\- Kafka

\- Cassandra

\- Couch

And "append-only" is not practical: there are lots of caveats you have to
handle (you don't get eventual consistency for free, performance is terrible
on reads, etc.), and it oftentimes just makes scaling up harder, which is not
"practical".

The next section begins to discuss "blockchain" technology, but I don't see
code on GitHub:

\- [https://github.com/fluree/flureedb](https://github.com/fluree/flureedb)
(empty repo) ?

But their main website is already advertising $1000/month "Enterprise" plans.
The whole point of Blockchain and Decentralization is not to be vendor-locked
into a proprietary centralized DBaaS. Why isn't the code Open Source?

~~~
diminator
Agreed, they seem to ignore literally everything that's going on in the open
source decentralized storage space.

Adding to the list:

\- BigchainDB ([https://github.com/bigchaindb](https://github.com/bigchaindb))
a practical BFT blockchain DB (disclosure, I'm the co-author)

\- SWARM

\- OrbitDB

And also, it's a very, very bad idea to have public storage: one (encrypted)
PII entry or illegal datum and it's over.

~~~
jonnydubowsky
I really appreciate the approach that BigchainDB and Ocean Protocol take in
addressing the extremely serious challenges around removing illegal content
from a decentralized database. I don't remember exactly where I read/heard the
details, but there is a talk Trent McConaghy gave where he describes the
process of forking around a node where illegal content has been discovered and
flagged via a takedown notice. I'll see if I can dig up the specific
reference. I work on integrating decentralized and distributed database
architecture into a platform for cancer drug discovery, and on enriching user
data through the incentive mechanisms of decentralized databases that have
cryptographic tokens and use cryptographic primitives a la curation markets.
The important takeaway for me is that hybrid solutions are starting to show
promise in combining traditional centralized databases with decentralized ones
that have novel token economics built in. I think BigchainDB and Ocean
Protocol do a great job with this, as
does OpenMined, and I'm interested to see where Cardstack goes with their
approach. For me, it's not a binary set of options, rather about finding
projects that are interoperable, and also take seriously their
responsibilities to maintain compliance with the ever changing legal landscape
surrounding data and rights management.

~~~
pdimitar
I am _very_ interested in the area and would greatly appreciate any additional
links you could dig up.

I really would like to work on something like that but so far it's only
enthusiasm and zero education. I am a pretty senior programmer in my eyes (16
years of official career, 25 years in total tinkering, started as a 12-13 year
old teenager) but sadly all that experience does not translate directly to
areas like these.

------
thanatos_dem
"Fault tolerance and censorship resistance - Due to its decentralized
nature, FlureeDB guarantees maximum uptime and no possibility for data
regulation".

Censorship is rarely a technological issue, but one of laws.

No possibility for data regulation? Well, data regulation is already there,
and they're effectively advertising that they are trying to bypass those
regulations. If they actually can do that, it won't be long before someone
rules "Use of FlureeDB violates GDPR compliance" and all your nice tech isn't
worth squat. The database may prevent data regulation, but you can still put
the people using it in jail.

~~~
nwatson
The GDPR question and "right to be forgotten", consent, etc., gets asked
around this time in the Epicenter podcast with the founders:
[https://youtu.be/fdne9EvFNaw?t=3147](https://youtu.be/fdne9EvFNaw?t=3147)

* Brian Platz addresses the question

* first goes through benefits of immutability in Fluree in context of auditing

* suggests one approach is storing PII (personally identifiable info) in encrypted manner in public fields, and storing the decryption keys separately -- the act of "forgetting" is the act of destroying the decryption keys

* another approach: on the public Fluree blockchain DB store a UID that has no public ties to PII ... and store the PII in a private corporate-side Fluree blockchain DB ... the Fluree system allows JOINs across Fluree systems, so internal corporate-side queries can do JOINs to consolidate public info with PII stored internally ... now to the "forgetting" part ... in spite of immutability guarantees, Fluree does have one escape, which is the possibility in the private Fluree (for which the corporation completely controls consensus) to "retract" that entity containing the PII, along with a tombstone and a full audit trail of the retraction

Both these approaches seem to address the right-to-be-forgotten pretty well.
The Fluree guy also suggests that for the PII part perhaps Fluree isn't the
system you want, and perhaps that should be stored in some other system.
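
To make the second approach concrete, here is a minimal in-memory sketch,
assuming nothing about Fluree's actual API. `PrivateStore`, `join`, and the
record layout are hypothetical names invented for illustration. Public rows
carry only an opaque UID, the operator-controlled store holds the PII, and a
retraction drops the PII while leaving a tombstone and an audit entry.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass
class PrivateStore:
    """Hypothetical operator-controlled store holding PII keyed by an opaque UID."""
    records: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def put(self, uid: str, pii: dict) -> None:
        self.records[uid] = {"pii": pii, "retracted": False}
        self.audit_log.append(("assert", uid, _now()))

    def retract(self, uid: str, reason: str) -> None:
        # Drop the PII itself, but keep a tombstone and an audit entry.
        self.records[uid] = {"pii": None, "retracted": True}
        self.audit_log.append(("retract", uid, _now(), reason))


def join(public_rows: list, private: PrivateStore) -> list:
    """In-memory stand-in for a cross-database JOIN on the UID."""
    joined = []
    for row in public_rows:
        entry = private.records.get(row["uid"])
        pii = entry["pii"] if entry and not entry["retracted"] else None
        joined.append({**row, "pii": pii})
    return joined


# Public side: no PII, only UIDs and non-identifying facts.
public_rows = [{"uid": "u-1", "order_total": 40}, {"uid": "u-2", "order_total": 15}]

private = PrivateStore()
private.put("u-1", {"name": "Alice"})
private.put("u-2", {"name": "Bob"})
private.retract("u-1", "erasure request")

joined = join(public_rows, private)
```

After the retraction, the public row for `u-1` still exists and still joins,
but it no longer resolves to a person; the audit log records that a retraction
happened without retaining what was retracted.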

~~~
thecupisblue
> the act of "forgetting" is the act of destroying the decryption keys

I'm not sure crypto-shredding is a legit method to "delete consumer data".

~~~
crankylinuxuser
Why wouldn't it be? An encrypted file with the appropriate cipher set is
synonymous with noise. It is only the encrypted content plus the unlock key
that makes the file contents accessible.

If it is irretrievable, and the encrypted data looks like noise, I would argue
that the content is gone.
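
For readers unfamiliar with the mechanism being argued about, here is a rough
sketch of crypto-shredding, not a statement about how Fluree implements it.
The cipher is a toy SHA-256 counter-mode keystream used only so the example
runs on the standard library; a real system would use a vetted authenticated
cipher. `ledger`, `key_store`, `store_pii`, `read_pii`, and `forget` are
hypothetical names.

```python
import hashlib
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256(key || nonce || counter). Illustration only."""
    blocks = []
    for counter in range((length + 31) // 32):
        blocks.append(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
    return b"".join(blocks)[:length]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


ledger = []     # append-only: entries are never modified or removed
key_store = {}  # hypothetical mutable key vault, one key per user


def store_pii(user_id: str, pii: bytes) -> None:
    key = key_store.setdefault(user_id, secrets.token_bytes(32))
    ledger.append((user_id, encrypt(key, pii)))


def read_pii(user_id: str):
    key = key_store.get(user_id)
    if key is None:
        return None  # key destroyed: the ciphertext can no longer be read
    return [decrypt(key, blob) for uid, blob in ledger if uid == user_id]


def forget(user_id: str) -> None:
    # "Forgetting" leaves the ledger untouched and destroys only the key.
    key_store.pop(user_id, None)
```

Calling `forget("alice")` leaves every ledger entry in place, which is the
point of contention in this subthread: the ciphertext remains, and the
argument rests on every copy of the key really being gone.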

------
macawfish
People are sleeping on hyperdb:
[https://github.com/mafintosh/hyperdb](https://github.com/mafintosh/hyperdb)

It's built from hypercore, the same library powering the DAT protocol.

Also, someone recently wrote a graph database on top of hyperdb:
[https://github.com/e-e-e/hyper-graph-db](https://github.com/e-e-e/hyper-graph-db)

------
fizx
The more I read white papers, the more I appreciate academia's requirement of
a summary of previous related work.

~~~
nickpsecurity
Those sections are the source of a nice chunk of obscure references I post
here that people like. They're a gold mine. The other is serendipitous search
where you find what you're not looking for directly. Main way is to learn the
buzz words of _serious researchers_ in each sub-field plus verbs they use to
describe results. Then, you just do permutations of those with quotes,
minuses, and places like Citeseerx. You get the diamond mine once you master
that. ;)

------
gregwebs
This looks a lot like Datomic if you add hashing to the Datomic transaction
(this would have to be done as a Datomic stored procedure). Given that this is
hosted, I am wondering whether this is using Datomic underneath or if it is
written from scratch?

Two critiques:

* Given that this is a hosted offering this seems like an odd definition of de-centralized, from my reading a more accurate description would be: allows cross-partition queries. Edit: response below indicates this aspect is missing from the linked whitepaper but available elsewhere.

* The graph database section is hand-wavy. This seems like it could be similar to Datomic where with a small enough dataset you may have amazing performance, but at a large enough scale you will be lacking index-free adjacency.
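
The "add hashing to the transaction" idea at the top of this comment can be
sketched as a hash-chained, append-only log. This is a generic illustration
and not Datomic's or Fluree's API; `GENESIS`, `tx_hash`, `append`, and
`verify` are hypothetical names. Each entry's hash covers the previous
entry's hash, so altering any past transaction invalidates everything after it.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def tx_hash(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, payload: dict) -> dict:
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev, "payload": payload, "hash": tx_hash(prev, payload)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute every hash in order and check each link to its predecessor."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != tx_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"entity": 1, "attr": "name", "value": "Alice"})
append(log, {"entity": 1, "attr": "email", "value": "alice@example.com"})
```

A chain like this makes a single operator's log tamper-evident, but on its own
it says nothing about who runs the nodes, which is the gap the first critique
points at.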

~~~
nwatson
If I understand properly, Fluree will partner with various "federated"
enterprises to all run a Fluree ecosystem where everyone's data can be
replicated, and queries can be served via multiple gateways ... in that sense
Fluree will be decentralized.

Note that I think I heard this in the Epicenter podcast
([https://www.youtube.com/watch?v=fdne9EvFNaw](https://www.youtube.com/watch?v=fdne9EvFNaw))
featuring the Fluree founders. ... aahh, at this point in the podcast they
talk about the federated/distributed Fluree:
[https://youtu.be/fdne9EvFNaw?t=3770](https://youtu.be/fdne9EvFNaw?t=3770)

EDIT: pointer to "federated" / distributed database in podcast

------
matthewaveryusa
Sybil attacks are real and are the biggest roadblock towards resilient
decentralized networks. In FlureeDB, blockchain is an attempt to solve Sybil
attacks by attaching proof of work as reputation to counteract malicious nodes.
While it does raise the bar significantly, it's not a panacea -- still more
work to be done!

------
mchahn
> a practical decentralized database did not exist before FlureeDB

It was hard to keep reading after this.

~~~
craigching
I’m not sure what you misunderstood; describing your skepticism would have
added to the conversation. Having read beyond that statement, I’m actually
intrigued. What “practical decentralized database” are you thinking of such
that you didn’t need to read further?

~~~
mchahn
It was the claim that no other practical decentralized databases exist that
turned me off. If you want I'll name a few.

~~~
craigching
> If you want I'll name a few.

Yep, that’s what I’ve asked above. Might spur a conversation if your claim is
legitimate.

~~~
mchahn
Well, the one I use is CouchDB. It is very practical. Others I can think of
quickly are CockroachDB, Cassandra, and Riak. And there are many others.

Are you possibly limiting the term decentralized to blockchain DBs? That would
be a stretch. Decentralized DBs existed for a long time before blockchains.

~~~
craigching
_I’m_ not limiting the term, that’s what the paper is about. And why I wanted
to know more about your claim. Might be a good idea to finish the paper now
;). The paper is actually very interesting.

~~~
mchahn
> Might be a good idea to finish the paper now ;).

I did read the entire paper and it _was_ interesting.

Your more specific meaning of the word should have been spelled out before
the claim. The meaning that I read was perfectly valid and shared by many
others.

~~~
craigching
Again, it wasn’t _my_ use of the term “decentralized”, it was the use of the
term in the _paper_. I read past the phrase for which you claimed to have
dismissed the paper and understood their use of “decentralized” pretty
clearly, as you should have if you understand distributed computing. Having
taken advanced distributed computing courses, I, honestly, never interpreted
the use of “decentralized” as you did. But, yeah, in practice, I can see your
confusion. Nevertheless, dismissing a paper on such pedantic terms as you
originally did is probably not worth commenting unless you _really_ have a
good reason for doing so. And, then, it’s probably best to comment only after
you have read the paper and then formed an informed response. Anyway, you keep
escalating this, so I keep responding in kind to help you. At some point I
have to be done with it. I honestly thought that you might have something
useful to contribute due to your terse response, but I was clearly wrong ;).

~~~
mchahn
> it wasn’t my use of the term “decentralized”

I apologize. I have a serious problem in forum discussions of not paying
attention to who writes each post. I absorb all replies and then reply to the
group. My bad. (Edit: And I thought you were the author).

>dismissing a paper on such pedantic terms as you originally did

I never dismissed the paper. Read my first post carefully. I enjoyed it and
did read it before commenting. I was just pointing out my shock at the claim,
which seemed ridiculous given my understanding.

> you keep escalating this

I thought I was participating in an argument about the meaning of the term
"decentralized". I was defending my understanding to multiple people telling
me I didn't know what I was talking about.

> Having taken advanced distributed computing courses

I have been working with databases for over 45 years. Most of that time
decentralized meant lack of a centralized (single point of failure) computer,
not a person or organization. It is only recently (10 to 15 yrs?) that the
user meaning has appeared. Please forgive my classic interpretation.

------
michaelmior
Fun flashback to my Master's research paper where I wrote a system called
FlurryDB (unrelated).

[http://sysweb.cs.toronto.edu/publications/254](http://sysweb.cs.toronto.edu/publications/254)

