
Filecoin: A Decentralized Storage Network [pdf] - aboodman
https://filecoin.io/filecoin.pdf
======
JohnJamesRambo
I'm still confused why I would want my storage to be decentralized. No I don't
want my files spread on thousands or millions of computers no matter how
encrypted etc.

~~~
Ajedi32
Trustless, distributed, scalable, redundant, anonymous cloud storage with a
market-driven pricing scheme? Why _wouldn't_ you want that? For me that's
pretty much the ideal cloud storage system.

~~~
hudon
"distributed, scalable, redundant" describes AWS pretty well, and if I can pay
much less for a faster more reliable service but sacrifice "trustless and
anonymous", I'll stick to it.

~~~
Ajedi32
What makes you think AWS would necessarily be faster or more reliable? If
done correctly, I think a decentralized system could be significantly faster and
more reliable than even the best centralized storage solution.

For example, with a decentralized system files could be broken up into shards
and distributed across multiple nodes, so unless a significant percentage of
computers on the planet all melt down at once your data would remain
accessible (i.e. zero downtime). That would also allow downloads to be
parallelized across multiple connections like with BitTorrent, meaning
bandwidth wouldn't be an issue either.
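
As a rough illustration of the sharding argument, here is a minimal sketch that assumes a k-of-n erasure-coded layout (any k of n shards are enough to rebuild the file) and that nodes fail independently. The 90% per-node uptime and the 10-of-30 layout are made-up numbers for illustration, not figures from any real network.

```python
from math import comb


def availability(n: int, k: int, p_up: float) -> float:
    """Probability that at least k of n independent nodes are online.

    Assumes every node is online with the same probability p_up and that
    failures are independent, which real networks only approximate.
    """
    return sum(
        comb(n, i) * p_up**i * (1 - p_up) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical numbers: a single 90%-uptime node versus a file split so
# that any 10 of 30 shards, each on a different 90%-uptime node, suffice.
single_node = availability(1, 1, 0.90)
sharded = availability(30, 10, 0.90)

print(f"single node:     {single_node:.6f}")
print(f"10-of-30 shards: {sharded:.12f}")
```

The independence assumption is doing a lot of work here: correlated outages (one ISP, one power grid) would lower the sharded figure.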

And that's not even considering the effect that a commoditized pricing system
would have on the costs of a distributed storage solution. I imagine it'd be
significantly cheaper than AWS too, as storage would be sold at a price very
close to costs.

~~~
guy_c
I had been recently looking at Siacoin, but still a lot of work is needed
before it can displace something like AWS. Took nearly a day with lots of
stalls just to sync the blockchain. Also I am not sure about how price
competitive it will ultimately be. Amazon has huge buying power which I am
sure means they pay less per TB than consumers. Also other economies of scale.
I bet the cost of labour to bring online each 1TB on S3 is very low.

~~~
guy_c
Siacoin's calculator at [http://sia.tech/](http://sia.tech/) is claiming a
price of $2/TB/month vs $23/TB/month for Amazon S3.

However it neglects to include Backblaze's B2 storage which is only
$5/TB/month -
[https://www.backblaze.com/b2/cloud-storage-pricing.html](https://www.backblaze.com/b2/cloud-storage-pricing.html)

That is maybe not a big enough price saving to convince conservative
corporations to switch. Imagine going to your accounts department to request
they purchase a cryptocurrency so you can use it to pay for data storage.
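
To put the per-TB figures quoted above in absolute terms, here is a small sketch of the yearly storage bill. The prices are the ones quoted in this comment; the 100 TB workload is a made-up example, and bandwidth and request fees are ignored.

```python
# Storage-only prices in USD per TB per month, as quoted in this comment.
PRICE_PER_TB_MONTH = {
    "Sia (claimed)": 2.0,
    "Backblaze B2": 5.0,
    "Amazon S3": 23.0,
}


def yearly_cost(terabytes: float, price_per_tb_month: float) -> float:
    """Storage-only cost for one year; ignores bandwidth and request fees."""
    return terabytes * price_per_tb_month * 12


workload_tb = 100  # hypothetical workload size
costs = {name: yearly_cost(workload_tb, p) for name, p in PRICE_PER_TB_MONTH.items()}

for name, cost in costs.items():
    print(f"{name:15s} ${cost:>9,.0f} per year")
```

Against S3 the gap is large, but against B2 the saving on this workload is a few thousand dollars a year, which is the amount that has to justify the switch.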

~~~
Zahlmeister
I don't know how using Sia works in practice, but comparing it with regular
S3, which is connected to the entire AWS ecosystem (including EC2), is absurd.

Amazon S3 also offers "glacier" for longterm storage with few accesses, which
is $4 per TB.

~~~
dx034
But that's the problem. S3 is cheap if you use it with other AWS services and
it's designed to be cheap only that way. Bandwidth is expensive to discourage
people from picking services from a combination of Azure/AWS/Google.

If you're just looking for storage, S3 is certainly not the cheapest provider.
Glacier is a bit different but clearly just for archives.

~~~
Zahlmeister
I'm comparing just storage prices, because I have no idea how Sia works in
terms of latency/bandwidth/access. Being distributed, I suppose it will fare
worse in most (if not all) these respects compared to Azure/AWS/Google and
maybe even glacier.

As for bandwidth cost: I don't believe Sia can be successful _and_ stay that
cheap. Why would providers of bandwidth for Sia be able to offer it orders of
magnitude cheaper than the biggest tech companies in the world? Answer: It's offered by
a bunch of individuals with no caps on their data plans. If Sia takes off and
lots of people start using terabytes of bandwidth, the ISPs will put an end to
it.

------
sharemywin
I still go back to this paper for questions about decentralized storage:

[http://blog.dshr.org/2017/07/is-decentralized-storage-
sustai...](http://blog.dshr.org/2017/07/is-decentralized-storage-
sustainable.html)

~~~
calafrax
There is another macro-economic issue here:

With a decentralized trustless system you have to assume a higher failure rate
for nodes than with a central trust based system.

If you assume a higher failure rate you have to replicate data more
extensively to achieve reliability.

If you have to replicate data more extensively then decentralized will always
be more expensive than centralized.
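
The chain of reasoning above can be made concrete with a small sketch. It assumes plain full-copy replication and independent node failures; the loss probabilities (1% for a managed node, 30% for a volunteer node) and the durability target are made-up numbers for illustration.

```python
def replicas_needed(p_loss: float, target_loss: float) -> int:
    """Smallest number of full copies r such that p_loss**r <= target_loss.

    Assumes each copy is lost independently with probability p_loss
    (0 < p_loss < 1) over the period of interest.
    """
    if not 0 < p_loss < 1:
        raise ValueError("p_loss must be strictly between 0 and 1")
    replicas = 1
    prob_all_lost = p_loss
    while prob_all_lost > target_loss:
        replicas += 1
        prob_all_lost *= p_loss
    return replicas


target = 1e-9  # hypothetical acceptable probability of losing the data
managed = replicas_needed(0.01, target)    # hypothetical managed node
volunteer = replicas_needed(0.30, target)  # hypothetical volunteer node

print(f"managed nodes:   {managed} copies")
print(f"volunteer nodes: {volunteer} copies")
```

Erasure coding needs far less overhead than full copies, so this overstates the gap, but the direction of the effect is the same.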

~~~
wmf
Not necessarily if the inputs are free (e.g. unused disk space). However
Spotify, Skype, and Joost all found that CDNs are cheaper than borrowing free
bandwidth from people so the same may apply to storage.

~~~
MichaelGG
Did they find that out? I thought they realised it's unprofessional and
unacceptable to turn your users into hosting nodes.

------
Kubuxu
This whitepaper is the first overview of the updated Filecoin protocol. More
details about specific components of Filecoin (like Proof-of-Replication,
Proof-of-Spacetime) will be released in the future in their own publications.

It is also accompanied by an announcement of the Filecoin Token Sale:
[https://protocol.ai/blog/ann-filecoin-token-sale-and-new-
pap...](https://protocol.ai/blog/ann-filecoin-token-sale-and-new-paper/) which
will begin on July 27th.

------
strictnein
Opening up my hard drive to store unknown materials? Imagine trying to explain
any of this to a cop, judge, or jury.

~~~
stebalien
How is this any different from what Amazon does? They don't manually inspect
all uploaded files. Instead, they rely on legal protections given to service
providers.

Filecoin miners can do the same by (a) registering as service providers and
(b) complying with blacklists and takedown notices as mandated by their legal
jurisdiction. Note: Filecoin miners don't just store arbitrary files assigned
by the network; they sign contracts with specific clients to store specific
files. The only difference from Amazon is that the network itself enforces
these contracts.

~~~
strictnein
Amazon has an army of lawyers, a well written TOS, an identifiable customer,
and a corporate shield. They also have the ability to take down offending
material.

------
agl
Unfortunately, one of the important citations ([5], "Proof of Replication")
does not appear to be available. (Or, at least, Google cannot find it.)

~~~
Kubuxu
Yes, citations 5, 14, 15 will be published later. There is a possibility that
some of them will be published next week.

~~~
ardivekar
At least upload a preprint to arXiv or something. What's the point of citing a
paper that doesn't exist yet?

------
gremlinsinc
How is this different from Siacoin?

~~~
AlexCoventry
FileCoin is using proof of storage of client data for blockchain ledger
security. That's a very tricky thing to get right (I didn't think it was
possible.) Search the white paper for "Sybil attack, outsourcing attacks,
generation attacks", and see section 7.4 of the SiaCoin whitepaper for
contrast:
[https://www.sia.tech/whitepaper.pdf](https://www.sia.tech/whitepaper.pdf)

~~~
Taek
Considering that they have three important citations of the paper unpublished,
I think it's okay to continue to be skeptical that they have achieved their
claims. This paper is not complete, and on its own does not demonstrate that
they have solved the problem.

It is always trivial to simulate storing client data that you don't have. So
you'd have to invent a system where simulating client data is equally
expensive to storing real client data. But, simulating client data can always
be done deterministically using a seed, which means you don't actually have to
store the data.
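
A minimal sketch of that generation attack, assuming a naive challenge scheme in which the verifier sends a nonce and the miner answers with a hash of the nonce plus the requested block. This is not Filecoin's or Sia's actual protocol; the function names, the seed, and the block size are all hypothetical.

```python
import hashlib


def generate_block(seed: bytes, index: int, size: int = 4096) -> bytes:
    """Deterministically expand a seed into block `index` (SHA-256, counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(
            seed + index.to_bytes(8, "big") + counter.to_bytes(8, "big")
        ).digest()
        counter += 1
    return bytes(out[:size])


def respond(block: bytes, nonce: bytes) -> str:
    """Answer a naive storage challenge: hash of the nonce plus the block."""
    return hashlib.sha256(nonce + block).hexdigest()


seed = b"attacker-seed"  # hypothetical seed chosen by the attacker

# An honest miner keeps all eight 4 KiB blocks on disk.
honest_disk = [generate_block(seed, i) for i in range(8)]

# The verifier challenges block 5 with a fresh nonce.
nonce = b"challenge-1"
honest_answer = respond(honest_disk[5], nonce)

# The cheater stores only the seed and regenerates the block on demand.
cheating_answer = respond(generate_block(seed, 5), nonce)

print(honest_answer == cheating_answer)
```

The two answers are indistinguishable to the verifier, so under this naive scheme the cheater gets paid for space that is never actually used.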

So you have to prove that generating client data is equally expensive to
storing it, if not more expensive. Given that legitimate client data can
literally be all zeros, I don't see how you can achieve this without some
overhead.

That overhead will make you less competitive than Sia, which doesn't have the
overhead.

I haven't yet gotten to the point where I fully understand proof-of-spacetime,
but that is hard when the paper is literally referencing unpublished work. But
I'm skeptical they have solved the generation / simulation attack.

~~~
AlexCoventry
I'm not skeptical, yet, just surprised -- haven't had a chance to dig in.

My main concern at this stage is the replication setup time, which needs to be
computationally expensive for the strategy to prevent generation attacks. That
can possibly be amortized over long-term storage, but it sounds like it might
be a long amortization period.

~~~
Taek
Replication setup also likely could be optimized by ASICs, so attackers
willing to spend money on specialized hardware may have a 1000x
advantage/asymmetry when it comes to generation attacks.

~~~
Kubuxu
The sealing part of the Proof of Replication will in large part require disk
I/O and can't be done in parallel, so an ASIC or FPGA won't help you much.

------
richardknop
I don't understand how decentralizing file storage and storing my files on
people's laptops / desktops can come even close to efficiency and performance
of a centralized storage service with powerful servers on fast network with
99.99% uptime.

Can somebody explain to me how in theory could this decentralized storage beat
something like AWS in efficiency, performance and uptime? I just think big
beefy servers in dedicated data centers with fast and robust uplinks/downlinks
should be better than consumer hardware on unreliable slow networks with shaky
uptime?

~~~
3pt14159
It's cheaper. Why? Because people have spare disk and spare bandwidth.

It's also more likely to survive a major war if it's redundant enough.

Is it better for serving hot assets, no. Is it better for storing a massive
amount of raw data that you need occasional random, indexed access to? Maybe.

Edit:

Also, fewer regulatory concerns since in theory nobody knows who is paying for
it. This is probably its biggest weakness, to be honest, since it could be
used to spread child porn.

~~~
jaredklewis
> It's also more likely to survive a major war if it's redundant enough.

If there is a major war, international Internet will be one of the first
things to go. In that scenario, having your files spread out across the world
is a major negative. If you want the data to survive a major war, bury an
external hard drive.

~~~
richardknop
Yes I don't really understand the war argument. If there were a massive war
(like a world war scale not some conflict in Middle East), the first thing
Russia/China would do is cut underwater internet cables.

Most nation states would probably block outside internet access to avoid
propaganda during time of war. I think I would have bigger concerns than
whether I could recover my files stored online.

For cases like this I'd much rather trust just a simple USB stick or external
drive which I could carry with me in my backpack and backup important files
there so I don't have to rely on internet.

~~~
zzzcpan
No, it's not like that, blocking internet access would cripple nation states'
economies too much to even be able to fight a war. But censorship even without
a major war requires centralization of control over the internet, which
introduces a country-wide weakness that makes the internet unreliable to the
point of complete blackouts, like during anti-government protests. Furthermore
even a small conflict has an impact on the supply of electricity in the
country which also could make all storage nodes there unavailable for
prolonged periods of time. There are a lot of other weaknesses too, like not
that many countries even have good global internet connectivity and cyber
attacks could shut it down entirely.

------
svara
This sounds interesting academically, but what I don't understand (I have the
same problem with Siacoin, StorJ and Swarm) is:

You could build a Dropbox clone where users have the option of contributing
storage to get rewarded with real currency. Why would anyone want to use the
decentralized version instead, both as a user and as a contributor of storage?

The decentralized version is going to be much harder to get right for the
developers, and why should the users believe that the technology is
trustworthy? I just don't see the advantage, but I would be truly interested
in hearing what I may be missing.

~~~
colordrops
Because once they do get it right, it's open source and no one can take it
away. A centralized system can and most likely will eventually degrade and
disappear. Also, privacy.

~~~
Zahlmeister
I believe the market right now tells you that people value neither redundancy
(creating business for file recovery services) nor privacy all that much.

They _do_ value simplicity of use above all and they do trust large companies
(for better or worse).

What makes this system better than the other systems that fulfill essentially
the same purpose? What does "getting it right" mean? _Are_ they getting it
right and how so?

~~~
colordrops
Is there not room in the market for multiple solutions? Both Dropbox and a
decentralized solution can co-exist.

~~~
Zahlmeister
What I'm saying is that these are not actually big selling points for a
general audience.

I'm not saying it's not viable _at all_ , but the distributed aspect alone
requires a certain amount of popularity to sustain it.

------
solatic
Anybody with a better understanding of filecoin - can you explain how the
protocol can tell the difference between a legitimate hardware failure and a
storage miner who stores the uploaded file, signs proof of replication to the
blockchain, then immediately deletes the file so that he can "store" more
files, collecting more payment, and then refuse delivery upon download
request, claiming hardware failure?

Does the blockchain have some way to prevent miners from double-spending their
storage space? Is payment only collected upon retrieval, allowing Filecoin to
be abused for storing backups that are rarely retrieved?

~~~
stebalien
> can you explain how the protocol can tell the difference between a
> legitimate hardware

From the perspective of the network, there is no difference. When you agree to
dedicate some storage to the network, you post a collateral in Filecoin. If
you lose something you've agreed to store (fail to prove that you're storing
it when asked to do so by the network), you pay a penalty out of your
collateral.

> Is payment only collected upon retrieval

No. Storage miners continuously prove (probabilistically) to the network that
they're storing the files they've agreed to store.
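
To give a feel for why probabilistic checking is enough, here is a simplified sketch. It assumes each challenge picks one block uniformly at random and independently of the others, which is a simplification and not the actual Proof-of-Spacetime construction; the function name is hypothetical.

```python
def detection_probability(deleted_fraction: float, challenges: int) -> float:
    """Chance that at least one of `challenges` random probes hits a deleted block.

    Assumes each probe independently samples a block uniformly at random.
    """
    return 1 - (1 - deleted_fraction) ** challenges


# A miner who silently dropped 10% of the data, probed 1, 10 and 50 times.
for probes in (1, 10, 50):
    print(probes, round(detection_probability(0.10, probes), 4))
```

Under these assumptions, even a miner who drops a small fraction of the data is caught with high probability after a modest number of probes, at which point the collateral penalty applies.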

------
kleebeesh
How does it compare to maidsafe?
[https://maidsafe.net/](https://maidsafe.net/)

Maidsafe made some noise in early 2014 but I haven't heard much from them
since.

------
mtgx
Have they announced the ICO yet? Or has it already passed?

~~~
lgierth
Token sale is gonna start on the 27th, here's more info:
[https://protocol.ai/blog/ann-filecoin-token-sale-and-new-
pap...](https://protocol.ai/blog/ann-filecoin-token-sale-and-new-paper/)

~~~
eco
One disappointing but also reassuring thing about the Filecoin ICO is that
it's only open to accredited investors[1]. It shows they are doing it right
and legal but that unfortunately locks out non-millionaires.

1.
[http://www.investopedia.com/terms/a/accreditedinvestor.asp](http://www.investopedia.com/terms/a/accreditedinvestor.asp)

------
tscs37
Quite interesting.

I do recommend anyone to also check out Swarm, similar but built entirely on
top of Ethereum.

[http://swarm-gateways.net/bzz:/theswarm.eth/swarm](http://swarm-gateways.net/bzz:/theswarm.eth/swarm)

------
kadavero
So, child porn that nobody can delete?

~~~
Eerie
Encrypted files that nobody but the owners can decrypt.

~~~
kadavero
1 Submit child porn 2 Publish the key 3 ????? 4 Profit

------
mcnesium
The very first sentence of the abstract seems a bit too optimistic to me.

------
binocarlos
so perhaps Ethereum miners will sell their graphics cards and buy disks?

------
LAMike
What is the current ICO valuation for Filecoin?

------
sktrdie
Coming from academia, these sorts of papers would not pass any sort of
double-blind peer review process. Where's the actual science? Where are the
experiments?

It seems to be just a document describing the protocol of Filecoin, which is
fine. The problem is that all these "whitepapers" coming from these
cryptocurrency communities seem to be promoted as scientific papers, but in
reality do not stand a chance in any science/academic-related setting.

I'd wish they'd give more attention towards the science, if there's any.
Otherwise, hey, sure it's just a PDF on some site - but don't call them
"papers".

~~~
repomies6999
Satoshi Nakamoto's original Bitcoin whitepaper also wouldn't pass as an
academic paper. However, it is still very useful at explaining the basic idea
behind Bitcoin. In this context, "whitepaper" means something a little
different than it does in a scientific context.

~~~
sktrdie
Actually there were experiments in the original Bitcoin paper. And it would've
probably passed a workshop-related setting for academia.

More than the academic presentation though, I'm just baffled by the lack of
science that these projects with _huge_ followings have.

~~~
product50
Please be honest. When the original Bitcoin or Ethereum whitepapers came out,
wouldn't your original comment, of these wp not passing any science/academia
setting, still ring true then?

Also, even if they don't pass an academia setting, what is your point? So many
papers actually pass an academia setting, but do they do anything worthwhile?
Filecoin is a project in existence for the past 3yrs working on top of IPFS, a
popular file sharing protocol, and being built by a venture funded startup.
Its founder has a vision of a decentralized future which makes intuitive
sense.

~~~
zdkl
Care to elaborate on the "intuitive sense" part?

------
vedanta
Decentralization bubble anyone?

------
Goodlooking
Suggestion: implement a mandatory uploading fee (like a mandatory transaction
fee), which is distributed among the miners who host the files. It should
encourage people to dedicate more space to hosting files.

------
niahmiah
storj.io

~~~
Veratyr
StorJ doesn't allow users to exchange coins for storage. Last I checked you
had to buy storage from some kind of broker and conveniently, the only people
running one were the StorJ devs themselves.

That level of centralization is unacceptable.

