

Playing chicken with cat.jpg - cperciva
http://www.daemonology.net/blog/2012-01-19-playing-chicken-with-cat-jpg.html

======
Udo

      I won't post to say "I haven't looked at the contents of the 
      file, but it's named 'cat.jpg'" either. I won't even post to 
      announce that the one hundred millionth file has been stored. 
      [...] This is because I have no way to obtain that information. 
      The contents of files [...] is all hidden from me by Tarsnap's 
      strong client-side encryption.
    

I agree that cat.jpg was a privacy violation and I do believe they did more
than simply look at the filename. However, I'll take the unpopular position
that this is within the limits of what one can reasonably expect from a
website providing a service as 37signals does. Admins will, and are completely
expected to, look at the data - if only to make sure everything is working.
Looking at a file called cat.jpg because it's the gazillionth file is pushing
the boundary a bit, but I still think this is OK. The moment I opt to use a
hosted project management software, I implicitly accept that things like this
(and potentially much worse) might happen.

Forgive me cperciva, but to me your post looks just like a giant plug for your
own service. Client-side encryption is not warranted for everything, nor is it
a reasonable goal for every app that shares data on the web. It's fine that
Tarsnap does this, and frankly I would expect the same from a service like,
say, DropBox - but it's not a reasonable expectation when it comes to the type
of apps 37signals provides.

~~~
cperciva
_I do believe they did more than simply look at the filename._

We'll have to disagree there. I'd be very surprised if they did any more than
looking at their log files -- most likely using tail -f -- as the 100 million
mark approached.

 _Admins will, and are completely expected to, look at the data - if only to
make sure everything is working._

How does looking at individual files help to confirm that things are working?
Once you're operating at scale, looking at individual files doesn't tell you
anything useful; if there's a big problem users will notice it before you do,
and if there's a small problem the files you look at probably won't be in the
affected set.

 _Forgive me cperciva, but to me your post looks just like a giant plug for
your own service._

Was I plugging Tarsnap? Sure; I mention it every chance I get on my blog. But
I didn't write that post because I wanted to plug Tarsnap; I wrote it because
I saw the trust-is-fragile post on HN Daily and felt that revising their
privacy policy wasn't the right response. (If I had noticed that post when it
was first discussed here, that blog post would probably have been just a
comment -- but since I was about 24 hours late to the party I figured that
nobody would read a comment I made here.)

~~~
freddyuggs
What if the file was "kiddie porn.zip" or "TOP SECRET: assassination details
for operation 'kill obama'"

Aren't there some instances where they'd be justified looking at data...

~~~
Mavrik
No. They're not the police or the courts; they have no business playing either.

Attitudes like yours give us stupid privacy-violating terrorism and
"protect-the-children" laws.

~~~
freddyuggs
So you're saying if they notice 'illegal content' on their servers, they'd
just leave it there?

~~~
malandrew
The fact that they can notice private "illegal content" at all is a violation
of trust. If I mark something private, I should expect that it will be private
from everyone including employees of that company.

While it is reasonable to expect that they would contact the FBI in such
instances, I would also hope that noticing such details elicits an "I shouldn't
have been able to see that, so we're not doing enough to protect the privacy
of our clients" response and corrective actions.

~~~
freddyuggs
What if you put a dead body in a bank vault? Would the bank respect your
privacy?

Surely there's some line somewhere...

edit: apparently HN thinks there is no line anywhere.

~~~
mike-cardwell
The problem is, you're missing the point. What they would do if they came
across illegal content is irrelevant. They shouldn't be looking at the content
in the first place, so this is a non-issue.

~~~
freddyuggs
In a new theoretical world where browsers can encrypt and decrypt data
securely without the server having any idea of content, and where you can
solve all the issues around allowing other people you want to be able to
access that data, then sure it's a non-issue. I was talking about reality ;)

~~~
nitrogen
We already have encryption that can't be brute forced within the lifetime of
the universe.

~~~
rjbond3rd
Yet.

~~~
nitrogen
_Yet._

Ever. I don't have the original source in front of me, but with enough bits,
assuming there isn't some fundamental flaw in the encryption algorithm, you
couldn't brute force a key before the heat death of the universe even if you
recruited every particle in the visible universe for your computation.
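
A rough sketch of how many bits "enough" might be, in Python. Every figure in it is an assumption picked for illustration and none comes from the source mentioned above: about 10^80 particles in the visible universe, each testing 10^9 keys per second, for 10^100 years. The helper name `bits_to_outlast` is hypothetical.

```python
import math


def bits_to_outlast(workers: float, keys_per_second: float, seconds: float) -> int:
    """Smallest key length (bits) whose expected brute-force time exceeds `seconds`."""
    keys_tested = workers * keys_per_second * seconds
    # On average the key turns up after searching half the keyspace, so the
    # keyspace must be more than twice the number of keys that can be tested.
    return math.floor(math.log2(2 * keys_tested)) + 1


if __name__ == "__main__":
    SECONDS_PER_YEAR = 365.25 * 24 * 3600
    particles = 1e80            # assumed particle count
    rate = 1e9                  # assumed keys per second per particle
    lifetime = 1e100 * SECONDS_PER_YEAR  # assumed time until heat death
    print(bits_to_outlast(particles, rate, lifetime))
```

With these assumed figures the result comes out at roughly 650 bits. Change the assumptions and the number moves, but only logarithmically: a thousandfold faster attacker costs about ten more bits.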

~~~
peterhunt
> assuming there isn't some fundamental flaw in the encryption algorithm

Big "if".

~~~
rjbond3rd
Exactly.

------
wisty
Dear users:

Unless you are using a service like tarsnap, your admins can and will peek at
your data. If you use a service like tarsnap, and you lose your password, your
data is deader than disco. Pick one - security, or an admin who can save your
account.

And while it's theoretically possible to develop a rich web app without seeing
user data, it just doesn't happen. You need realistic data to do testing. The
most realistic data you can possibly get is your users' data. Guess what
99.999% of websites use for testing?

If you have sensitive information, use good encryption. Better still, do what
the professionals (i.e. the government) do, and leave it on an
internal-network-only computer, in a steel-reinforced room. If you're paranoid, lock
the hard drives in a safe when you leave the room. And use encryption.

But don't make a fuss when the admin peeks at your data, in a semi-random way.
If they are stalking you specifically, or leak any damaging information,
that's another matter. But if you just don't trust them, don't give them your
data.

~~~
derefr
> Pick one - security, or an admin who can save your account.

There's a simple way to eat your cake and have it too, though: put a copy of
your passwords in a safe-deposit box. Passwords don't strictly have to be
_private_ to protect you from would-be attackers—they just have to only be
accessible to people who have absolutely no incentive to help any would-be
attacker.

~~~
itsameta4
But people, especially in a service industry, NEVER have no incentive not to
help anyone. People are helpful by nature, and easily conned.

~~~
pavel_lishin
> But people NEVER have no incentive not to help anyone.

The rare triple-negative.

~~~
grammati
That's nonsense - people don't NEVER have no incentive not to help anyone.

Bam! Quad-negative! Top that.

------
alberth

      Humans can't be — and aren't — trusted to follow their stated intentions.
    

This is why you implement systems that prevent humans from doing wrong (either
intentionally or unintentionally).

A commenter named Trevor even pointed out to 37signals, in the comments on
their blog post, how to do this:

    
    
      Did you know that Oracle provides Database Vault. 
      What it all allows you to do is set it up to prevent 
      event DBAs from viewing or modifying data.
    
      Idea being, DBAs should be able to “administator” the 
      database, but should not be allow to either VIEW or even
      MODIFY customer/employee data (e.g. credit card #, SSN ,
      salary data, etc..)
    
      There is another product Oracle provides which is called
      Transparent Database Encryption . What it does is encrypt
      your customer data on disk, but then when a database 
      select is issued – it unencrypts the data on the fly 
      without needing to modify your application code.
    
      Unfortunately, no such products like this exists for MySQL.
    
      Given the size of your company now and how much 
      sensitive customer data you are now storing, might be 
      worthwhile for you guys to seriously consider using 
      Oracle now.

~~~
cperciva
My point exactly: Implement technical measures, not just policies.

~~~
alberth
Off topic: You're my hero for the FreeBSD/EC2 work you've done. (Just couldn't
resist letting you know)

------
ryanwaggoner
This seems in slightly bad taste, as it feels like a slightly disingenuous jab
at 37signals so you can plug your own service. 37signals is serving a
completely different market, one that isn't going to peruse their source code.
So that market is going to have to trust them to some extent.

Additionally, every service requires some level of trust. How am I to know
that the source code you show me is what you're actually using? (obviously
client-side encryption services are better in this area). How do I know you
won't sell my personal information, or abuse my billing information?

~~~
jarito
This seems to be the MO for this particular site. There is a blog post calling
out someone's faults (privacy, security, etc), some basic misinformation and a
plug for his service as being better.

I don't mind him wanting to do PR, but it does seem a bit distasteful. This was
basically an ad couched in something that was supposed to look like content.

As one of the previous posters said, there are tradeoffs made when using a
SaaS service and it is not possible to run a system like theirs while using
strong client side, opaque encryption. Besides, comparing a backup system to an
online file management system is apples to oranges.

~~~
cperciva
_There is a blog post calling out someone's faults (privacy, security, etc),
some basic misinformation and a plug for his service as being better._

I make lots of posts about security and cryptography. I happen to think that
Tarsnap does things right; if I didn't, I would have Tarsnap do things
differently.

I usually decide to blog about something based on (a) whether I think it's
interesting, and (b) whether I think people will learn from it. (There are
exceptions like calling out jungledisk for not fixing weaknesses in their
cryptography, but those are rare.) The question "will this give me a chance to
advertise Tarsnap" doesn't come into it -- for one thing, the vast majority of
my readers are already aware of Tarsnap.

------
daleharvey
When 37signals starts encrypting all their data, their search tool is really
gonna suck.

A backup service that just needs to move around opaque blobs can and should
encrypt its data; an application that needs to be able to react to the type
and contents of the data that is stored, not so much. It seems like cperciva
would know this better than anyone, so the post seems pretty disingenuous.

~~~
alberth
No, this is no longer the case.

Encryption these days only adds 1-2% extra load.

Regardless, even if the load were higher, as it used to be before modern
hardware, you are still essentially informing your customers that
"speed is more important than securing their data" - which is a terrible
approach to take.

TL;DR: If you are given the privilege of maintaining a customer's data, it's your
obligation and responsibility to do so with the most care possible.

~~~
tlb
The problem isn't the cycles of encryption, it's that the data architecture
has to be designed differently, and optimizations like caching can't be used
as much. Document search is especially difficult on encrypted documents.

------
ragesh
This brings up an interesting point about the benefit of client-side
encryption. That's fine if you have a locally running app, but how do you do
it with a web app? With some kind of browser plugin, perhaps? Does something
like that exist today?

~~~
weavejester
It's possible to do in principle at least, assuming all your users have modern
browsers. You could use the Javascript file API to intercept file uploads and
then to encrypt the data before it is sent to the server. You could then use
XHRs to collect the encrypted binary data and decrypt it before presenting it
to the user. If it was an image, you could use canvas to display the decrypted
content.

You'd have to contend with what is probably a large performance hit, and I
don't know of any libraries that do this so you'd need to spend a considerable
amount of time writing one. I suspect that this approach would only be
practical for very simple web applications. For instance, an encrypted image
or file hosting web application might be a possibility.

------
mcculley
I'm not convinced that such a strict approach to securing client data is
always the best policy. The clients of 37signals are not the same as the
clients of tarsnap. I would think that a client of 37signals is the sort that
sometimes needs the help of an admin and that often that help would require
looking at the client data.

My own company will never store sensitive data with an outside firm like
37signals but that is only because we have a great IT staff. For companies
that don't have an IT staff, outsourcing to 37signals makes sense and is
probably worth the tradeoff to trust them with data.

------
janus
I think that the key point in this issue is "trust".

Just as you trust the bank to guard your money, and many of their employees
have access to your current account balance, the convenience of using these
kinds of services requires you to trust the organization.

~~~
wladimir
There is a significant difference though: If the bank takes money from your
account, you notice it. If the file storage provider makes a copy of your
file, you don't notice it. You'll never know how the file leaked.

(sure, the bank could perform other tricks behind your back, like doing bad
investments with the money you put in, but hey they'll get bailed out
anyway...)

~~~
janus
I agree, but in a paranoid alternative world, the bank employees could share
your bank account balance information with criminal organizations that would
investigate you and your family, and one day kidnap you and take you or any of
your family members, and ask for an amount of money they know you possess.
It's still a trust issue.

~~~
wladimir
Right, hadn't thought about that, they could also leak information.

Luckily in the case of files you can easily do something about it, by
encrypting them client-side or using a storage provider client that handles
that for you.

------
tlb
It's hard to design systems that don't keep sensitive data in readable
formats. Protecting filenames could be done by encrypting with a salted hash
of the user password. Doing this correctly while allowing password changes is
really tricky. Can you recommend a good set of guidelines for getting it
right?
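
One common pattern for the password-change part is to encrypt filenames with a random data key and to wrap only that key under a password-derived key, so that changing the password re-wraps 32 bytes instead of re-encrypting everything. The sketch below is an illustration under stated assumptions, not anything Tarsnap or 37signals is known to do: the function names, the record layout, and the iteration count are hypothetical, and the XOR-plus-HMAC wrap is a toy stand-in for a vetted key-wrap or authenticated-encryption primitive.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed PBKDF2 work factor; tune for real hardware


def _derive(password: str, salt: bytes) -> tuple[bytes, bytes]:
    # Derive 64 bytes: the first half masks the data key, the second half
    # keys the HMAC that detects a wrong password or a corrupted record.
    material = hashlib.pbkdf2_hmac("sha256", password.encode(), salt,
                                   ITERATIONS, dklen=64)
    return material[:32], material[32:]


def wrap_key(data_key: bytes, password: str) -> dict:
    """Wrap a 32-byte data key under a password, using a fresh salt each time."""
    if len(data_key) != 32:
        raise ValueError("data key must be 32 bytes")
    salt = secrets.token_bytes(16)
    mask, mac_key = _derive(password, salt)
    wrapped = bytes(a ^ b for a, b in zip(data_key, mask))  # toy wrap only
    tag = hmac.new(mac_key, wrapped, hashlib.sha256).digest()
    return {"salt": salt, "wrapped": wrapped, "tag": tag}


def unwrap_key(record: dict, password: str) -> bytes:
    """Recover the data key, or raise ValueError if the password is wrong."""
    mask, mac_key = _derive(password, record["salt"])
    expected = hmac.new(mac_key, record["wrapped"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, record["tag"]):
        raise ValueError("wrong password or corrupted record")
    return bytes(a ^ b for a, b in zip(record["wrapped"], mask))


def change_password(record: dict, old: str, new: str) -> dict:
    # Only the wrapped key changes; data encrypted under the data key stays put.
    return wrap_key(unwrap_key(record, old), new)
```

The server would store only the record (salt, wrapped key, tag), and the client would do the unwrapping. The catch raised elsewhere in this thread still applies: lose the password and the data key is gone.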

------
bch
There's no mention of what the filename was. Neither the basename, nor the
extension. They say confidently "(It was a picture of a cat!)"

~~~
mhartl
They clarified this in a comment to the original post
([http://37signals.com/svn/posts/3076-i-heard-you-like-numbers...](http://37signals.com/svn/posts/3076-i-heard-you-like-numbers?58#comments)):

    
    
        Razvan Tirboaca 12 Jan 12
    
            And a Basecamp user uploaded the 100,000,000th file
            (It was a picture of a cat!)
    
                Are you looking at your users photos?
    

And the response:

    
    
        Taylor 12 Jan 12
    
        Razvan, absolutely not. The file was named cat.jpg and
        that was logged, which was what we saw. We do not look 
        at user’s files.

~~~
biot
And it couldn't have been a picture of a catalog, CAT construction machinery,
someone named Catherine, etc.? I guess we have to take them at their word on
this one.

~~~
evan_
What if it was a picture of a catalog? The file was called cat.jpg so they
said it was a picture of a cat. Maybe it _was_ a catalog. They still said it
was a picture of a cat.

------
codesuela
If you are looking for a service like tarsnap (client side encrypted file
storage service) you should check out Wuala (not affiliated, just a personal
recommendation). It has a nice (cross platform, Java) GUI client and all the
features dropbox has plus encryption. It is operated by a Swiss company and
therefore is subject to strong privacy laws. You can also access it through
your browser (Java applet) and they have native iOS and Android apps.

~~~
teaspoon
Why should I care that the company is Swiss? The point of client-side
encryption is that I don't need to worry about who's hosting my data or what
jurisdiction they're under.

~~~
DanBC
See, for example, Hushmail, which has no option but to cooperate with
correctly formed legal documents. This means that they either capture the
passphrase (which they can do within a short time) from users of the
non-Java-applet version of their software, or they serve a modified applet.

Being in a different jurisdiction provides a small amount of protection.

------
agentultra
If rules are made to be broken then by admission I don't think there will ever
be a rule that states you should throw out the steering wheel if you're
playing chicken.

Even though I completely agree that systems we build should have the least
possible level of permissions required to do their job.

But the temptation to leave a backdoor open to peek once in a while, "just in
case," is strong and has its own benefits...

------
mikeash
This whole thing seems to be blown completely out of proportion, based
entirely on hypothetical and unfair "what-if" scenarios based on the imaginary
case where the 100 millionth file was something sensitive.

I'm imagining a group of friends and one of them mentions an interesting book
he saw in X's house. The friends are immediately scandalized: what if instead
of a book, you saw naked pictures of X's wife? Apparently you'll just blab
anything you see, so you can't be trusted in people's houses anymore.

It's a completely innocent disclosure. That it would not have been innocent if
the file had been different seems completely irrelevant. Either they would
have been discreet in that case, or they would not have, but we can't tell
which from this one instance.

~~~
pyre
This whole SOPA/PIPA thing seems to be blown completely out of proportion,
based entirely on hypothetical and unfair "what-if" scenarios based on the
imaginary case where the laws are used in ways that they weren't intended.

~~~
mikeash
SOPA/PIPA apply to organizations with a long history of abuse. Applying a
history of abuse to hypotheticals is reasonable. Taking an entity with no
history of abuse and assuming that they would abuse in the future is not.

~~~
andrewflnr
Human beings have a long history of abuse. The discussion here seems to be
talking about the general case of admins looking at user data, a discussion in
which that long history is very relevant. Maybe we're getting out of the
bounds of usefulness by having that general discussion, but it's perfectly
natural and unsurprising.

------
nazar
Wouldn't monitoring all the user-generated content be a requirement once SOPA
takes effect?

------
nirvana
Before posting this comment, I went and checked the Tarsnap site, including
the Security section, the design section and the FAQ and didn't find an answer
to this question. My memory, from a past reading of your site, was that you
kept keys on your side of the service, so that you could turn them over to Law
Enforcement if they showed up. Is this still the case? (because if it is, then
you can look at cat.jpg, even if you wouldn't post publicly about it.)

~~~
spicyj
I believe that this has never been the case with Tarsnap -- the keys are
stored only locally. (cperciva can delete your encrypted data but not decrypt
it.)

------
evan_
37signals could've completely avoided the controversy by just saying "we
contacted the user who uploaded the 100 millionth file to tell them that the
file they uploaded at 12:34 was the 100 millionth file and they wrote back and
said it was a picture of their cat".

~~~
dvdhsu
37signals has morals.

