
We will no longer use the phrase “zero knowledge” to describe our software - remx
https://spideroak.com/articles/why-we-will-no-longer-use-the-phrase-zero-knowledge-to-describe-our-software
======
_cxqp
I still won't trust SpiderOak with my data. Their service is unreliable and
slow, their client is horrible to work with and their support is disgraceful.

I'll post my usual story:

In February 2016, SpiderOak dropped its pricing to $12/month for 1TB of data.
Having several hundred gigabytes of photos to back up, I took advantage and
bought a year long subscription ($129). I had access to a symmetric gigabit
fibre connection so I connected, set up the SpiderOak client and started
uploading.

However I noticed something odd. According to my Mac's activity monitor,
SpiderOak was only uploading in short bursts [0] of ~2MB/s. I did some test
uploads to other services (Google Drive, Amazon) to verify that things were
fine with my connection (they were) and then contacted support (Feb 10).

What followed was nearly _6 months_ of "support", first claiming that it might
be a server side issue and moving me "to a new host" (Feb 17) then when that
didn't resolve my issue, they ignored me for a couple of months then handed me
over to an engineer (Apr 28) who told me: "we may have your uploads running at
the maximum speed we can offer you at the moment. Additional changes to
storage network configuration will not improve the situation much. There is an
overhead limitation when the client encrypts, deduplicates, and compresses the
files you are uploading"

At this point I ran a basic test (cat /dev/urandom | gzip -c | openssl enc
-aes-256-cbc -pass pass:spideroak | pv | shasum -a 256 > /dev/zero) that
showed my laptop was easily capable of hashing and encrypting the data much
faster than SpiderOak was handling it (Apr 30) after which I was simply
ignored for a full month until I opened another ticket asking for a refund
(Jul 9).
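
For anyone who wants to repeat that kind of sanity check without the shell pipeline, a rough Python sketch follows. It is an approximation, not the command above: it only compresses and hashes random data (the standard library has no AES, so the encryption step is left out), and the function name and default sizes are illustrative assumptions.

```python
import hashlib
import os
import time
import zlib


def measure_throughput(total_bytes=8 * 1024 * 1024, chunk_size=1024 * 1024):
    """Return MB/s for compressing and hashing `total_bytes` of random data."""
    compressor = zlib.compressobj()
    digest = hashlib.sha256()
    # Generate the random chunk once so the cost of the RNG is not measured.
    chunk = os.urandom(chunk_size)
    processed = 0
    start = time.perf_counter()
    while processed < total_bytes:
        # Compress the chunk, then hash whatever the compressor emits.
        digest.update(compressor.compress(chunk))
        processed += len(chunk)
    digest.update(compressor.flush())
    elapsed = time.perf_counter() - start
    return processed / elapsed / 1e6


if __name__ == "__main__":
    print(f"{measure_throughput():.1f} MB/s")
```

If the number printed is far above the upload rate the client achieves, local CPU overhead is unlikely to be the whole explanation.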

I really love the idea of secure, private storage but SpiderOak's client is
barely functional and their customer support is rather bad.

If you want a service like theirs, I'd suggest rolling your own. Rclone [1]
and Syncany [2] are both open source and support end to end encryption and a
variety of storage backends.

[0]: [http://i.imgur.com/XEvhIop.png](http://i.imgur.com/XEvhIop.png)

[1]: [http://rclone.org/](http://rclone.org/)

[2]: [https://www.syncany.org/](https://www.syncany.org/)

~~~
rarrrrrr
After looking through our records, I think perhaps you might mean 2015 instead
of 2016. I took over the company that year and some things have changed. At
the time customer satisfaction ratings were around 84%, and we're in the high
90s now. If I've found the correct case, we did at least suspend billing when
the issue started and eventually issued a full refund.

I'm sorry we weren't able to determine the cause of the slowness for you.
Troubleshooting an end-to-end encrypted product is hard because you can't just
see everything that's happening by looking at the server. We have seen ISPs
deny or aggressively throttle connections to our destination networks. Palo
Alto firewalls classify traffic to SpiderOak as an online backup service and
often block it outright or put it at the lowest priority. I'm not saying that these
were necessarily the causes in your situation.

SpiderOak keeps improving and the 2017 road map is action packed. If for some
reason you would ever like to try SpiderOak again on me, you're welcome to
contact me directly, or write to support@spideroak.com where these days we do
a pretty good job of taking care of everyone. Otherwise I'm glad you've found
backup solutions you're happy with. Cheers!

~~~
hluska
Let me see if I have this straight. In response to a story from a customer,
you went through all of your records and then shared information about the
case in public.

I'm glad your customer satisfaction scores are higher, but I'd rather not
share any of my information with you. I'm not the op, but if I was, I would
feel rather violated.

~~~
jacquesm
> I'm not the op, but if I was, I would feel rather violated.

Well then don't choose a public forum to vent a complaint, and since you're
not the OP it doesn't matter anyway.

Turnabout is fair play: if you go into a thread about a product and make a
strong play that their customer service sucks and then an officer of that
company steps in and does what they can to see if there is a way to re-engage
you on their dime that's about as good as you could possibly expect. On top of
that he did not volunteer any info that wasn't already in the OP's post except
for a possible correction of the date.

So you're wrong, _twice_.

FWIW absolutely no affiliation whatsoever with Spideroak.

------
ianmiers
I think this is missing the larger problem with the claim. Yes, using the term
clashed with academic cryptography to some extent. But the larger issue was
even as they intended to use it, it's not a completely accurate description of
their product. They learn quite a bit of information compared to you storing
the data locally.

Why the collision with academic cryptography doesn't matter: Anyone who had
even a basic understanding of their product + some academic crypto background
would get what they were going for: they have no knowledge of what's going on.
Also, strictly speaking, the academic term is zero-knowledge proof or
zero-knowledge proof of knowledge. I.e., zero-knowledge is an adjective used to
describe a proof (and indeed, if you look at the history of how these evolved,
that is exactly what happened). You could reasonably use zero-knowledge as a
modifier for something else and it could be acceptable. Indeed, it's a fairly
good shorthand for a particular class of definitions of
privacy/confidentiality that require any transcript of the protocol can be
produced by a simulator who has no knowledge of what transpired.
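
For reference, the usual textbook shape of that simulator-based definition can be written roughly as below. The notation is an assumption for illustration: P is the prover holding a witness w, V* is any efficient verifier, S is the simulator, and the approximation sign denotes computational indistinguishability.

```latex
\forall\, V^{*}\ \text{(PPT)}\ \ \exists\, S\ \text{(PPT)}\ \ \forall\, x \in L:\quad
\mathrm{View}_{V^{*}}\bigl[\,P(x, w) \leftrightarrow V^{*}(x)\,\bigr]
\;\approx_{c}\; S(x)
```

The simulator sees only the public statement x, never w, which is the sense in which the transcript carries "zero knowledge".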

The problem is Spider Oak's cloud backup cannot be zero-knowledge or no
knowledge. It almost certainly leaks when you update files and when you delete
them. Perhaps they don't log this or delete the logs, but they could. And this
meta data could matter to businesses.

~~~
rarrrrrr
Yes, of course there's some traffic analysis that would be possible, as there
would be with any such service. But for the record: we keep logs for a limited
time, and we don't just encrypt each file individually.

Instead there's an encrypted journal and encrypted data blocks. (Having the
additional layer of data blocks allows for better deduplicating one version of
a file to the next.) So for each transaction that's uploaded to the servers,
we know that the journal gets longer, and that data blocks are added or
removed (or both.)

All the database work for keeping track of the data blocks (reference
accounting, garbage collection) is done client side. More details in this post
from 2009:
[https://spideroak.com/articles/why--how-spideroak-architecture-is-different-than-other-online-storage-services](https://spideroak.com/articles/why--how-spideroak-architecture-is-different-than-other-online-storage-services)

~~~
ianmiers
Right, some leakage is inherent and what you provide may be good enough or
even the best you can reasonably do. However, there's a long history (even in
academic crypto) of what seems like insignificant leakage being important. So
it's good to be overt about it.

So it looks like you are doing blockwise encryption? Which means at least
conceptually, not only do you leak when a file is updated, you leak what
chunk? At least I'm assuming the journal isn't append only.

~~~
rarrrrrr
For what it's worth, I think of a journal as append only by definition and
that's what SpiderOak does. Unless you have millions of very small files, the
journal is going to be tiny relative to the backup content so this is fine.

So the server doesn't have a concept of "an existing file was updated" vs "a
new file was uploaded" etc. The server only knows "new blocks have arrived."
All the "smarts" are on the client.
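
A minimal sketch of that division of labour, for illustration only. This is not SpiderOak's actual code or format: the class names, the tiny block size, and the use of a keyed HMAC as the block identifier are all assumptions, and the encryption step is deliberately omitted (marked in comments) because the Python standard library has no cipher to use.

```python
import hashlib
import hmac

BLOCK_SIZE = 4  # tiny on purpose so the example is easy to follow


class OpaqueServer:
    """Stores blocks and journal entries without interpreting them."""

    def __init__(self):
        self.blocks = {}
        self.journal = []  # append-only: entries are never rewritten

    def put_block(self, block_id, payload):
        self.blocks[block_id] = payload

    def append_journal(self, entry):
        self.journal.append(entry)


class Client:
    """Does all chunking, deduplication and bookkeeping locally."""

    def __init__(self, key, server):
        self.key = key
        self.server = server
        self.known_blocks = set()  # client-side record of uploaded blocks

    def _block_id(self, chunk):
        # Keyed hash, so the server cannot test guesses about block contents.
        return hmac.new(self.key, chunk, hashlib.sha256).hexdigest()

    def upload(self, name, data):
        """Upload one version of a file; return how many new blocks were sent."""
        ids = []
        new_blocks = 0
        for i in range(0, len(data), BLOCK_SIZE):
            chunk = data[i:i + BLOCK_SIZE]
            block_id = self._block_id(chunk)
            if block_id not in self.known_blocks:
                # A real client would encrypt `chunk` before sending it.
                self.server.put_block(block_id, chunk)
                self.known_blocks.add(block_id)
                new_blocks += 1
            ids.append(block_id)
        # A real client would encrypt this entry; the server just appends bytes.
        self.server.append_journal(repr((name, ids)).encode())
        return new_blocks
```

Uploading a second version of a file that shares most of its content sends only the changed blocks, and all the server observes is "some blocks arrived and the journal grew by one entry".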

In general operation, only new journal entries and new blocks are added. The
only time blocks are removed is when the user intentionally chooses to remove
data (we call that operation "purge"). Intentional purges can also reduce the
total size of the journal, and this is the only operation that does so.

Most backup software removes previous versions and deleted files after 30
days, but SpiderOak keeps these indefinitely by default, to allow for
point-in-time recovery, restores from ransomware infections, mistakes you
don't catch right away, etc. You can set a different retention policy if you
prefer.

------
aboodman
I'm always a little tweaked when somebody pitching a client product says "we
_can't_ access your data".

Am I running your software in a process that has network access? Then you
_can_ access my data.

I understand the point you're trying to make, and I totally get that
architecting a system so that unencrypted data doesn't leave my device is
superior to an architecture where it does.

But I still must, ultimately, trust you, your competence, and your
motivations. If I trust that you _don't want to access my data_, and have
tried to architect your systems so that it is hard to do, and are competent to do
so, then I can trust my data is probably safe.

But that's not the same as it being physically impossible for you to access my
data.

~~~
BuuQu9hu
In Tahoe-LAFS, it is actually a proven truth that, up to the level of
cryptographic unguessability, server operators _cannot_ read stored files
without knowing the client-side key. If you can prove otherwise, then you can
earn a spot in their hall of fame:
[https://tahoe-lafs.org/hacktahoelafs/](https://tahoe-lafs.org/hacktahoelafs/)

Tahoe-LAFS providers like Least Authority [1] and Matador Cloud [2] pride
themselves on not being able to access your data.

[1] [https://leastauthority.com/](https://leastauthority.com/)

[2] [https://matador.cloud/](https://matador.cloud/)

~~~
cakoose
Like OP said, having that design is valuable, but you're still running
someone's software on your computer.

With Tahoe LAFS, you're either downloading the pre-built binary or compiling
from source. Either way, you're trusting the person who signed the
binary/source or the person who hosts them.

This is similar to when Apple says they can't read your messages. Sure, they
may not be able to decrypt the data on their server, but they're in total control
of the client software. They can get access to your messages.

I do think it's in Tahoe/Apple/etc's interest to NOT be able to see their
data. For one, they said they can't and reputations are important. It may also
be beneficial when dealing with law enforcement requests. So trusting them
isn't totally unwarranted, but there's a lot there that isn't mathematics.

(You're never going to get to 100% math, but there are ways to get further.
For example, someone might eventually write a Tahoe LAFS client that comes
with a machine-checkable proof that it doesn't leak plaintext. You now need to
trust the proof checker, but it's progress.)

~~~
powera
What do you want then? Only use software you've written yourself, on a non-
networked computer?

~~~
cakoose
Like I said, I don't think it's unreasonable to trust Apple or the Tahoe LAFS
download server. I think they're improving the state of security and I would
use those products. I just want people to be clear about the actual security
properties of the whole system.

For example, people should know that blanket statements like "Apple cannot
read your messages" are false. And your client/server protocol can be state of
the art, but you still rely on the much-maligned HTTPS certificate
infrastructure (among other things) to get the client bits.

------
CiPHPerCoder
In case anyone is curious about why this was problematic:

[https://paragonie.com/blog/2016/08/crypto-misnomers-zero-knowledge-considered-self-descriptive](https://paragonie.com/blog/2016/08/crypto-misnomers-zero-knowledge-considered-self-descriptive)

Zero Knowledge is something you're most likely going to find in an
authentication protocol, not an encryption protocol.
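
To make the authentication sense concrete, here is a toy round of Schnorr identification, a classic protocol that is zero-knowledge against an honest verifier. It is a sketch under loud assumptions: the group parameters (p = 23, q = 11, g = 2) are far too small for real use, and the function names are made up for this example.

```python
import secrets

# Toy group: g = 2 has prime order q = 11 modulo p = 23 (insecure, demo only).
P, Q, G = 23, 11, 2


def keygen():
    """Return (secret x, public y = g^x mod p)."""
    x = secrets.randbelow(Q - 1) + 1  # secret in [1, Q-1]
    return x, pow(G, x, P)


def commit():
    """Prover's first message: random nonce r and commitment t = g^r mod p."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def respond(x, r, c):
    """Prover's answer to the verifier's challenge c."""
    return (r + c * x) % Q


def verify(y, t, c, s):
    """Verifier's check: g^s must equal t * y^c mod p."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P
```

The verifier ends up convinced that the prover knows x, yet the transcript (t, c, s) is something the verifier could have generated alone by picking c and s first and computing t to match, so it learns nothing about x from it.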

While I have mixed feelings about "No Knowledge", it's at least not a
collision with a different concept.

Good on SpiderOak for the effort here. It shows they do listen, at the very
least.

~~~
rarrrrrr
Thank you. It would be easier to create friendly terms that describe end-to-
end encryption if non-encrypted cloud providers weren't actively trying to
mislead people into a false sense of security. My previous rant about this is
here:
[https://news.ycombinator.com/item?id=13303599](https://news.ycombinator.com/item?id=13303599)

------
raphinou
I was surprised to read their mobile solution delegates the decryption to a
software running on their server:
[https://spideroak.com/manual/spideroak-on-mobile](https://spideroak.com/manual/spideroak-on-mobile)

This is clearly not "no Knowledge"....

Looking forward to standing corrected if I am wrong.

~~~
MertsA
AFAIK, they make this very clear if you ever try to use the mobile content
that you're giving them access to all of your data. I think there's a
sufficient amount of big scary warnings and from the way SpiderOak is designed
it makes sense to me why a true mobile client wouldn't work out so well.

------
abronan
This applies to the entire industry, we are often inflating our products with
terms that do not describe the reality. Technical accuracy is important
because it can drive a purchase decision when comparing features with a
competitor. Companies and individuals invest time and money on these
software/services, misleading them with inaccuracies can harm them directly or
their business.

At least it's an honest statement from SpiderOak, it's better to fix a misuse
of a term and admit an error than throwing a misleading term describing a
product used by thousands of people and then delete it as if nothing happened.

When I see this post, I cannot help but think of how Docker described "Swarm
mode" orchestration features during DockerCon 2016 using the terms "self-
healing" and "self-organizing". Obviously, "Swarm mode" was neither "self-
healing" nor "self-organizing" and a possibility is that they had no idea what
those terms meant, but it looked good on paper and from a marketing point of
view. While they have fixed it in the documentation after pointing this out
internally, these terms have leaked in many blog posts and are still in plenty
of talk recordings on YouTube. It became hopeless to stop the spread of
misinformation.

Despite this change, a lot of SpiderOak customers are still going to use the
term Zero-Knowledge to describe the software to their friends, co-workers or
business partners. The term will stick to them for a while.

------
trendia
Previously [0], HN and experts criticized SpiderOak for using the term
improperly in their marketing. (E2E is not the same as zero-knowledge
storage). And now they admit that they knew _at the time_ that it was used
improperly.

So, why did they use it?

[0]
[https://news.ycombinator.com/item?id=13301936](https://news.ycombinator.com/item?id=13301936)

~~~
passive
This is misleading.

They admitted at the time they knew it was being used improperly, they were
just attached to the usage.

They are now following up to say they have detached themselves from this
usage. This seems like very responsible behaviour that should be applauded.

~~~
trendia
I'm pointing out that they knowingly used a word improperly... and for that I
am called misleading.

~~~
passive
You suggested that they were only copping to the improper use now, rather than
being upfront about it at the time. That's misleading.

------
TheSpiceIsLife
"Your data is completely safe from ... any threat."

How is that possible?

~~~
mavhc
Because they don't have your data, they have your encrypted data and no key.

Last week I emailed the Information Commissioner's Office in the United
Kingdom [https://ico.org.uk/](https://ico.org.uk/) about whether, in their
view, storing encrypted backups in another country, when the key never leaves
the UK, counts as moving personal data outside the country. Sadly they said
yes; I don't really have faith they understood the maths, though.

~~~
msh
Maybe they understand the theory but have no means of verifying the
implementation, while it's easier to verify the location of data.

~~~
mavhc
They don't really verify things anyway, they wait for people to inform them of
issues.

Encryption can be broken, but, one assumes: a) that your backups aren't likely
to be stolen, b) that no one cares enough and c) by the time they are, the
people whose information you stored are dead

If you logged in you provided the key: NOTE: Logging in via the SpiderOak
website does temporarily allow SpiderOak employees access to your password.
Due to this exposure, we discourage users from entering your password online
if they wish to fully retain our Zero-Knowledge privacy.

------
temprature
_> For our secure group chat, file sharing and collaboration tool Semaphor, it
means you can even review the source code._

Has anyone ever tried to "review the source code"?

"review the source code" links to
[https://spideroak.com/solutions/semaphor/source](https://spideroak.com/solutions/semaphor/source)
which leads to
[https://spideroak.com/releases/semaphor/source](https://spideroak.com/releases/semaphor/source)
which is a 404.

~~~
Operyl
It's worked for me in the past, and still continues to do so.

------
ape4
"As we launch a new website today, we changed every mention of Zero Knowledge
to No Knowledge."

------
cmrx64
Dear god FINALLY. This has annoyed me constantly about spideroak.

~~~
remx
You can see the full HN furore here:
[https://news.ycombinator.com/item?id=13303436](https://news.ycombinator.com/item?id=13303436)

------
s-macke
For me the problem with Zero Knowledge or No Knowledge is that the websites
usually don't say whether they encrypt the content, directory structure,
filename or file size. Often it is just the content. It would be great if the
services would explain this in more detail. What does SpiderOak encrypt
actually?

I guess a real "No knowledge" storage would be just a container and you read
and write blocks, i. e. the filesystem format is implemented on the client
side. Of course this makes features such as versioning difficult to implement
and probably everything would be a little bit slower.

Edit: The post from rarrrrrr explains the technique of Spider Oak and he links
to a Blog entry. This is pretty impressive.

~~~
rarrrrrr
Your second paragraph describes SpiderOak quite accurately: it is exactly a
logical file system implemented client side, with all the database work to
support that done locally. It works because Sqlite is awesome.

------
barking
I have been using sync.com for about a year now and am very happy with it. It
says it's zero knowledge (I'm not qualified to judge the veracity of that) and
costs under €60 pa for 500GB. I have also found them exceptionally friendly to
deal with. It's ridiculous but I was chuffed to receive a postcard signed by
about a dozen people after signing up. I used to use crashplan but after doing
a successful restore from a supposedly good archive I found that thousands of
files were missing. The other thing I use as was mentioned elsewhere in this
thread is duplicati 2. Has worked perfectly for me so far.

~~~
DylanFuery
Just got my postcard as well! Sadly no signatures but lots of stickers. I love
me some stickers. Really enjoying their service so far, and with them I can
actually visually see what's happening (encryption and uploading) whereas
SpiderOak makes it less clear / obvious.

~~~
barking
Maybe they were a little quieter back then or maybe they just liked me more?
What can I say? :)

------
placebo
I think in terms of catchy marketing phrases, it's a good switch, considering
they wanted one that sounds as good as the previous one yet does not clash
with existing professional terminology. I remember thinking for a few seconds
about what I would change it to back when reading criticism about them using
the term "Zero Knowledge" and couldn't off the top of my head think of
something different but still catchy. Seems obvious once it's thought of...

------
unknownsavage
Great news.

I tried spideroak and liked the product, but the inability for me to pay using
anonymous payment methods has led me to use sync.com instead.

I had trouble with a prepaid debit card, and they unfortunately don't take
bitcoin either.

------
sauronlord
Odd, but they used "zero knowledge" in their title to describe themselves.

Anyone else pick up on this hilarious irony?

Hacker News is becoming a comedy site similar to The Onion.

