Hacker News
Playing chicken with cat.jpg (daemonology.net)
288 points by cperciva on Jan 19, 2012 | 104 comments

  I won't post to say "I haven't looked at the contents of the 
  file, but it's named 'cat.jpg'" either. I won't even post to 
  announce that the one hundred millionth file has been stored. 
  [...] This is because I have no way to obtain that information. 
  The contents of files [...] is all hidden from me by Tarsnap's 
  strong client-side encryption.
I agree that cat.jpg was a privacy violation and I do believe they did more than simply look at the filename. However, I'll take the unpopular position that this is within the limits of what one can reasonably expect from a website providing a service as 37signals does. Admins will, and are completely expected to, look at the data - if only to make sure everything is working. Looking at a file called cat.jpg because it's the gazillionth file is pushing the boundary a bit, but I still think this is OK. The moment I opt to use a hosted project management software, I implicitly accept that things like this (and potentially much worse) might happen.

Forgive me cperciva, but to me your post looks just like a giant plug for your own service. Client-side encryption is not warranted for everything, nor is it a reasonable goal for every app that shares data on the web. It's fine that Tarsnap does this, and frankly I would expect the same from a service like, say, Dropbox - but it's not a reasonable expectation when it comes to the type of apps 37signals provides.

> I do believe they did more than simply look at the filename.

We'll have to disagree there. I'd be very surprised if they did any more than looking at their log files -- most likely using tail -f -- as the 100 million mark approached.

> Admins will, and are completely expected to, look at the data - if only to make sure everything is working.

How does looking at individual files help to confirm that things are working? Once you're operating at scale, looking at individual files doesn't tell you anything useful; if there's a big problem users will notice it before you do, and if there's a small problem the files you look at probably won't be in the affected set.

> Forgive me cperciva, but to me your post looks just like a giant plug for your own service.

Was I plugging Tarsnap? Sure; I mention it every chance I get on my blog. But I didn't write that post because I wanted to plug Tarsnap; I wrote it because I saw the trust-is-fragile post on HN Daily and felt that revising their privacy policy wasn't the right response. (If I had noticed that post when it was first discussed here, that blog post would probably have been just a comment -- but since I was about 24 hours late to the party I figured that nobody would read a comment I made here.)

  We'll have to disagree there. I'd be very surprised if 
  they did any more than looking at their log files
I'm probably a little less trusting on this. An admin seeing the filename in the logfile and just calling its URL out of curiosity seems like a very likely scenario to me. They said something like "...and it was a picture of a cat", not "...and it was named cat.jpg".

  How does looking at individual files help to confirm that things are working?
Not in this case, but having access to the file storage system per se is common and useful.

  But I didn't write that post because I wanted to plug Tarsnap; 
I understand. The combination of pointing the finger at someone for wrongdoing and then asserting your own superiority seemed inappropriate to me though. I understand where you're coming from, but I also believe that, to you, the world is now full of places that should have client-side encryption, when in fact I don't think this is a good fit for what 37signals does at all.

  I wrote it because I saw the trust-is-fragile post on HN 
  Daily and felt that revising their privacy policy wasn't 
  the right response.
I'm not a 37signals user, and I haven't read their policy. I agree that changing the policy following this incident is very bad timing, but I think this maneuver does correct an unreasonable expectation users might have.

> I don't think [client-side encryption] is a good fit for what 37signals does at all.

I'm inclined to agree with you. That's what I was getting at with my "even if 37signals doesn't want to offer cryptographically secure storage, they could at least remove the temptation to look at file names in log files by not writing sensitive information to log files in the first place" line.
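One way to read "not writing sensitive information to log files" is to log a keyed digest of the filename instead of the filename itself. The following is only a minimal sketch under that assumption; `LOG_KEY`, `filename_token` and `log_upload` are hypothetical names, not anything 37signals or Tarsnap is known to use.

```python
import hashlib
import hmac
import logging
import secrets

# Hypothetical per-deployment secret; in practice it would be stored
# somewhere that people reading the logs cannot reach.
LOG_KEY = secrets.token_bytes(32)


def filename_token(filename: str, key: bytes = LOG_KEY) -> str:
    """Return a short keyed digest that stands in for the filename in logs."""
    digest = hmac.new(key, filename.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def log_upload(logger: logging.Logger, account_id: int, filename: str, size: int) -> str:
    """Log an upload event without the filename itself; return the logged line."""
    line = f"upload account={account_id} file_token={filename_token(filename)} bytes={size}"
    logger.info(line)
    return line
```

An admin tailing the log would see the 100 millionth upload arrive, but not that it was called cat.jpg. Events for the same filename still share a token, so they can be correlated without exposing the name.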

I think most computer-savvy people, looking at a file called "cat.jpg," would make the leap to "it's a picture of a cat." The name is practically just a compressed version of that.

"I think most computer-savvy people..."

http://cl.ly/0Y1M1D0z1g123I0S0u1R - cat.jpg ;)

That was actually exactly the alternative I had in mind, an icon for a catalog feature in some sort of application (though a PNG would have been more likely.)

> How does looking at individual files help to confirm that things are working?

"Hello, thanks for calling tech support. How can I help you?"

"I uploaded a file but it's not showing up in my account."

"What was the name of the file?"


"Ok, give me a moment to look at the logs..."

These are the kinds of questions that come up all the time in supporting a SaaS product with non-technical and semi-technical users. Debugging is not something only programmers do. Oftentimes bugs are found only after a client interacts with support.

Oh, another thing: deletes. At my last company I can't tell you how many times customers wanted us to restore deleted data. After many frustrating support experiences we implemented soft deletes for most objects. Hard deletes required written confirmation from the user and 48 hours to purge it from all backups.
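The soft-delete arrangement described above can be sketched roughly as follows. This is an illustration only: `ObjectStore` and its methods are hypothetical, the store is in memory rather than a database, and the only detail taken from the comment is the 48-hour purge window after a confirmed hard delete.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional

PURGE_DELAY = timedelta(hours=48)  # the 48-hour purge window mentioned above


@dataclass
class StoredObject:
    name: str
    deleted_at: Optional[datetime] = None   # set by a soft delete
    purge_after: Optional[datetime] = None  # set once a hard delete is confirmed


class ObjectStore:
    """Hypothetical in-memory store; a real one would be database rows."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def put(self, name: str) -> None:
        self._objects[name] = StoredObject(name)

    def visible(self) -> List[str]:
        """Names the user still sees (anything not soft-deleted)."""
        return sorted(n for n, o in self._objects.items() if o.deleted_at is None)

    def soft_delete(self, name: str, now: datetime) -> None:
        """Hide the object but keep its data so support can restore it."""
        self._objects[name].deleted_at = now

    def restore(self, name: str) -> None:
        """Undo a soft delete and cancel any pending purge."""
        obj = self._objects[name]
        obj.deleted_at = None
        obj.purge_after = None

    def confirm_hard_delete(self, name: str, now: datetime) -> None:
        """Schedule a permanent purge, e.g. once written confirmation arrives."""
        obj = self._objects[name]
        if obj.deleted_at is None:
            raise ValueError("object must be soft-deleted first")
        obj.purge_after = now + PURGE_DELAY

    def purge_due(self, now: datetime) -> List[str]:
        """Permanently remove objects whose purge window has elapsed."""
        due = sorted(n for n, o in self._objects.items()
                     if o.purge_after is not None and o.purge_after <= now)
        for name in due:
            del self._objects[name]
        return due
```

The point of the design is that a restore request becomes a one-field update for support, while permanent removal stays a deliberate, delayed step.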

What if the file was "kiddie porn.zip" or "TOP SECRET: assassination details for operation 'kill obama'"

Aren't there some instances where they'd be justified looking at data...

No. They're not the police or the courts; they have no business playing either.

Attitudes like yours give us stupid privacy-violating terrorism and "protect-the-children" laws.

So you're saying if they notice 'illegal content' on their servers, they'd just leave it there?

The fact that they can notice private "illegal content" at all is a violation of trust. If I mark something private, I should expect that it will be private from everyone including employees of that company.

While it is reasonable to expect that they would contact the FBI in such instances, I would also hope that noticing such details elicits an "I shouldn't have been able to see that, so we're not doing enough to protect the privacy of our clients" response and corrective action.

What if you put a dead body in a bank vault? Would the bank respect your privacy?

Surely there's some line somewhere...

edit: apparently HN thinks there is no line anywhere.

If the bank sees you carrying in a body to put in the vault, I would expect them to call the authorities. However, if the bank guarantees privacy of what they store in the bank, I don't expect them to be looking inside anybody's safe deposit box to see what is inside. It doesn't matter if it is money, personal documents, blackmail material, jewelry (stolen or purchased) or even body parts.

Investigating the contents of each safe deposit box, or even having the ability to do so, is outside the scope of what bank vault services are sold to do.

A bank vault, like encryption, sells protection. It is for all intents and purposes neutral. It can be used for good and can be used for bad. 95% of the time a bank vault or encryption is either being used for an ethically neutral or at worst ethically ambiguous use.

When any technology or product is used for bad it is a social failure. Crime will always exist. The quantity of crime committed can be mitigated by sound long-term policies that treat those causes that are statistically most likely to contribute to crime occurring in the first place.

The problem is, you're missing the point. What they would do if they came across illegal content is irrelevant. They shouldn't be looking at the content in the first place, so this is a non-issue.

In a new theoretical world where browsers can encrypt and decrypt data securely without the server having any idea of content, and where you can solve all the issues around allowing other people you want to be able to access that data, then sure it's a non-issue. I was talking about reality ;)

We already have encryption that can't be brute forced within the lifetime of the universe.



Ever. I don't have the original source in front of me, but with enough bits, assuming there isn't some fundamental flaw in the encryption algorithm, you couldn't brute force a key before the heat death of the universe even if you recruited every particle in the visible universe for your computation.
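The "with enough bits" part can be turned into rough arithmetic. The sketch below is back-of-the-envelope only, and every constant in it is an assumption chosen for illustration (a commonly quoted particle count, an arbitrary guessing rate per particle, and an order-of-magnitude time to heat death); none of the figures come from the comment above.

```python
PARTICLES = 10**80            # assumed: rough particle count of the observable universe
GUESSES_PER_SECOND = 10**9    # assumed: keys tested per particle per second
YEARS = 10**100               # assumed: order-of-magnitude time until heat death
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def total_guesses(particles: int = PARTICLES,
                  rate: int = GUESSES_PER_SECOND,
                  years: int = YEARS) -> int:
    """Upper bound on keys tried if every particle guesses for the whole period."""
    return particles * rate * years * SECONDS_PER_YEAR


def min_key_bits(guesses: int) -> int:
    """Smallest key length whose keyspace (2**bits) exceeds the number of guesses."""
    return guesses.bit_length()


if __name__ == "__main__":
    g = total_guesses()
    print(f"guesses available: about 10^{len(str(g)) - 1}")
    print(f"key bits needed to outlast them: {min_key_bits(g)}")
```

Under these deliberately extreme assumptions the threshold comes out at several hundred bits, well above common 128- or 256-bit key sizes. That is why "with enough bits" is the operative phrase. Against any realistic attacker the available guesses are smaller by many orders of magnitude, and the "no fundamental flaw in the algorithm" caveat still applies.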

> assuming there isn't some fundamental flaw in the encryption algorithm

Big "if".


And I repeat, "they shouldn't be looking at the content in the first place." I don't see how anything in your reply addresses this point.

I think you're being downvoted because you're making a very similar argument to "if you have nothing to hide you have nothing to fear". The only circumstance where the bank would find a dead body in a vault is one where they open every vault just in case.

... or if the body starts to rot and smell

First: a file being named "Child Porn, no really I'm not joking this picture has naked children in it.PNG" doesn't make it illegal content.

I suspect he also means that they shouldn't be looking through files in the first place on the grounds that "there might be something illegal in them"

That's exactly what I meant, I might have just phrased it in a wrong way :/

The whole concept of "contraband information" is anathema to a free society.

I'd say in those circumstances they should be phoning the FBI or the secret service rather than looking at the data themselves.

Yeah I think I'd agree there.

What if one of your examples was the original filename and the account holder renamed it to something less obvious such as cat.jpg?

    Was I plugging Tarsnap?
Yes. EDIT: I don't have a problem with it, your service looks nice, but let's be honest. :)

I totally disagree that this came off sounding like a plug for Tarsnap. This is one of those perspective-changing points that can only be made by using concrete examples.

That he happens to be an expert in the field of digital privacy and has a way to prove that he is such an expert shouldn't be held against him.

Reading his post was like an "oh shit, he's right" moment for me and using Tarsnap as an example was key in helping me understand it.

> Forgive me cperciva, but to me your post looks just like a giant plug for your own service.

Considering that in the discussion around cat.jpg, many people here were talking about a secure back-up service which encrypts all data at the client side with auditable source-code as if it was an unrealistic, unobtainable goal, I have zero problems with that.

I just wonder why they didn't say it was made up, a joke with no basis in reality. Just some poetic license.

Would that be worse than admitting/pretending they actually saw a file called cat.jpg? If there was such a file, it could have been a JPG for catalog of some kind, etc.

I think they are responding to people's first expectations and that there was actually a file with the image of a cat. I doubt it and think it was just an attempt at being funny which backfired and they felt they had to take responsibility for the perceived breach of trust and that any other explanation, even if truthful would have been seen as a weak excuse.

You may be right. When you think about it, what were the odds that the 100,000,000th file would be something funny worth mentioning in a post?

There are considerable technical hurdles in writing a web application that doesn't store unencrypted data. However, in principle, I don't see any compelling reason an admin should have access to user data like uploaded files. Ensuring an encrypted file is backed up and available for use is no different to doing the same for an unencrypted file.

  I don't see any compelling reason an admin should
  have access to user data like uploaded files. 
Mostly, uploaded files are stored unencrypted on a webserver though. The reason for this is that those are mostly "public" files in the sense that they can be accessed by a URL. Encrypting these does nothing really, except place additional load on the webserver when it has to decrypt them on-the-fly and admins would still be able to retrieve the key used for this from the software that is running on the server. This scenario is the most common one when it comes to user storage, and for good reason.

Encryption protects against a compromise of the backend. This lets 37signals, for example, store user data on S3 without leaking the user's information outside of the organization.

The overhead of decrypting an image is minimal compared to the latency introduced by a network fetch and by handling the rest of the request cycle in Ruby.

(And FWIW, people don't often have access to production encryption keys like this. Privacy is a big deal.)

Sure; that's what I meant by "considerable technical hurdles". But if these technical obstacles did not exist, if browsers had APIs to handle encrypted data for instance, would there still be a reason to give admins access to those files?

Browsers having APIs to encrypt and decrypt data in these scenarios is worth looking into - however, I do believe there might be a lot of apps that can't really profit from such an API. For example, what happens when you want to share an encrypted file with other users?

That's what public key encryption is for, no? The server just needs to store the encrypted files and the public key of each user, and files can be shared without the server being aware of their content.

To make this work, we'd need a pretty complicated system to distribute and manage those keys (and in a way that keeps them encrypted during transfer and storage). Furthermore, your client key pair would have to be stored on your machine, which means a service where you can't log in with just a password. Don't get me wrong, I think it's an interesting idea, I just don't see how this is not going to be a huge complexity and usability nightmare in applications that are designed for sharing data.

It's not complicated to distribute public keys. They're public keys, so put them in a public S3 bucket.

Private keys are marginally more complex, but not much. If a password is sufficient security, then the private key can be stored remotely (in S3 or whatever) but encrypted (symmetrically) with a password.

So say Alice wants to share a file with Bob. Alice's client downloads her encrypted private key, and prompts Alice for a password. The private key is decrypted with the password and stored in memory. Alice then downloads the file she wants to share with Bob, and decrypts it with her private key. Then she downloads Bob's public key, and re-encrypts the file with Bob's public key. She can now send the file to Bob securely without the server being aware of the content.

So now that we've got Alice and Bob covered, what happens when Alice wants to share a file with a group of people? And what if that group is dynamic? Is there something that addresses this scenario?

One of the members of the group can generate a new key pair, and share it with all the other members (in the same way one would share a file with each individual). If Alice sends a file to the group's key pair, anyone in the group can read the file.

Adding members to the group is trivial; just send them the group's key pair. Removing a member would be more difficult. Perhaps the most convenient way would be to add an additional layer of security on top (so members would need server access permissions, plus the private key). The only other option would be to create a new group and to re-encrypt all the existing files with a new key.

> It's not complicated to distribute public keys. They're public keys, so put them in a public S3 bucket.

AViD's answer on this security stack exchange is useful.


I will accept looking at the filename in some cases but never at the file contents.

It may be normal now, but it won't be for long. Sunir said it better than I could: http://news.ycombinator.com/item?id=3471464

Dear users:

Unless you are using a service like tarsnap, your admins can and will peek at your data. If you use a service like tarsnap, and you lose your password, your data is deader than disco. Pick one - security, or an admin who can save your account.

And while it's theoretically possible to develop a rich web app without seeing user data, it just doesn't happen. You need realistic data to do testing. The most realistic data you can possibly get is your users' data. Guess what 99.999% of websites use for testing?

If you have sensitive information, use good encryption. Better still, do what the professionals (i.e. the government) do, and leave it on an internal-network-only computer in a steel-reinforced room. If you're paranoid, lock the hard drives in a safe when you leave the room. And use encryption.

But don't make a fuss when the admin peeks at your data, in a semi-random way. If they are stalking you specifically, or leak any damaging information, that's another matter. But if you just don't trust them, don't give them your data.

> Pick one - security, or an admin who can save your account.

There's a simple way to eat your cake and have it too, though: put a copy of your passwords in a safe-deposit box. Passwords don't strictly have to be private to protect you from would-be attackers—they just have to only be accessible to people who have absolutely no incentive to help any would-be attacker.

But people, especially in a service industry, NEVER have no incentive not to help anyone. People are helpful by nature, and easily conned.

Hmm; what you're saying is true, so I think I phrased my statement a bit wrong. In general, yes, people do want to help. But your bank just isn't in the business of knowing what's in its safe-deposit boxes, just like Tarsnap isn't in the business of knowing what's on its servers.

The whole business model of a safe-deposit box relies on other people not being able to get into them without the owner's consent—so if anyone, including the bank itself, took a peek in there, that would instantly lose them all the trust they had ever accrued as a safe-deposit-box provider—and thus a lot of money. They have much more of an incentive to keep your data private than they have an incentive to help those who want it, because keeping your data private is what keeps them in business. That's the meaning I was going for.

> But people NEVER have no incentive not to help anyone.

The rare triple-negative.

That's nonsense - people don't NEVER have no incentive not to help anyone.

Bam! Quad-negative! Top that.

I agree completely. I run a service where thousands of files are uploaded a day, containing a person's location information (GPX/TCX logs from GPS devices). I have to use that data on a regular basis to further improve our ability to process these log files, which are generated by hundreds of separate pieces of software. The ability of my service to process these files requires my intervention semi-regularly. That wouldn't happen if, like some people are suggesting, I had to go to a safe deposit box to decrypt those files.

There are multikey encryption schemes out there; I cannot help but think there's some spiffy protocol to solve this problem.

  Humans can't be — and aren't — trusted to follow their stated intentions.
This is why you implement systems that prevent humans from doing wrong (either intentionally or unintentionally).

A commenter named Trevor even pointed out to 37signals, in the comments on their blog post, how to do this:

  Did you know that Oracle provides Database Vault? 
  What it allows you to do is set it up to prevent 
  even DBAs from viewing or modifying data.

  The idea being, DBAs should be able to “administer” the 
  database, but should not be allowed to either VIEW or even
  MODIFY customer/employee data (e.g. credit card #, SSN,
  salary data, etc.)

  There is another product Oracle provides which is called
  Transparent Database Encryption. What it does is encrypt
  your customer data on disk, but then when a database 
  select is issued, it decrypts the data on the fly 
  without needing to modify your application code.

  Unfortunately, no products like this exist for MySQL.

  Given the size of your company now and how much 
  sensitive customer data you are now storing, might be 
  worthwhile for you guys to seriously consider using 
  Oracle now.

My point exactly: Implement technical measures, not just policies.

Off topic: You're my hero for the FreeBSD/EC2 work you've done. (Just couldn't resist letting you know)

This seems in slightly bad taste, as it feels like a slightly disingenuous jab at 37signals so you can plug your own service. 37signals is serving a completely different market, one that isn't going to peruse their source code. So that market is going to have to trust them to some extent.

Additionally, every service requires some level of trust. How am I to know that the source code you show me is what you're actually using? (obviously client-side encryption services are better in this area). How do I know you won't sell my personal information, or abuse my billing information?

> This seems in slightly bad taste, as it feels like a slightly disingenuous jab at 37signals so you can plug your own service.

I plead guilty to taking advantage of the opportunity to mention my service (although most of my readers are already very much aware of tarsnap), but I would have written the blog post anyway.

> that market is going to have to trust them to some extent.

Sure, but I still think there's a huge gap between "we don't log sensitive information" and "we have a policy which says that we shouldn't look at the data we've logged".

Given that the inability to look at user data was an explicit design decision, calling it out when you talk about this very issue is quite fair, I think.

I think it might be lame if cperciva reacted to the 37s thing by changing his product, but he called it long before it happened.

This response is easily within bounds. Tarsnap is opinionated software. So are 37signals' offerings. Let them have a debate. It is for the better education of us all.

Tarsnap's position here is assailable, and we will all benefit from the discussion.

> as it feels like a slightly disingenuous jab at 37signals so you can plug your own service.

How do you feel Colin's point is in any way disingenuous? Do you think he doesn't believe what he says? Because that's the only way I could see it as being "disingenuous."

Personally, I don't think it's disingenuous to opportunistically state what you believe to benefit yourself, assuming you truly do believe it.

The point is not that his product is better, it's that his method of ensuring customer privacy is better and should be industry standard.

This seems to be the MO for this particular site. There is a blog post calling out someone's faults (privacy, security, etc), some basic misinformation and a plug for his service as being better.

I don't mind him wanting to do PR, but it does seem a bit distasteful. This was basically an ad couched in something that was supposed to look like content.

As one of the previous posters said, there are tradeoffs made when using a SaaS service and it is not possible to run a system like theirs while using strong client-side, opaque encryption. Besides, comparing a backup system to an online file management system is apples to oranges.

> There is a blog post calling out someone's faults (privacy, security, etc), some basic misinformation and a plug for his service as being better.

I make lots of posts about security and cryptography. I happen to think that Tarsnap does things right; if I didn't, I would have Tarsnap do things differently.

I usually decide to blog about something based on (a) whether I think it's interesting, and (b) whether I think people will learn from it. (There are exceptions like calling out jungledisk for not fixing weaknesses in their cryptography, but those are rare.) The question "will this give me a chance to advertise Tarsnap" doesn't come into it -- for one thing, the vast majority of my readers are already aware of Tarsnap.

This whole thing seems to be blown completely out of proportion, based entirely on hypothetical and unfair "what-if" scenarios based on the imaginary case where the 100 millionth file was something sensitive.

I'm imagining a group of friends and one of them mentions an interesting book he saw in X's house. The friends are immediately scandalized: what if instead of a book, you saw naked pictures of X's wife? Apparently you'll just blab anything you see, so you can't be trusted in people's houses anymore.

It's a completely innocent disclosure. That it would not have been innocent if the file had been different seems completely irrelevant. Either they would have been discreet in that case, or they would not have, but we can't tell which from this one instance.

This whole SOPA/PIPA thing seems to be blown completely out of proportion, based entirely on hypothetical and unfair "what-if" scenarios based on the imaginary case where the laws are used in ways that they weren't intended.

SOPA/PIPA apply to organizations with a long history of abuse. Applying a history of abuse to hypotheticals is reasonable. Taking an entity with no history of abuse and assuming that they would abuse in the future is not.

Human beings have a long history of abuse. The discussion here seems to be talking about the general case of admins looking at user data, a discussion in which that long history is very relevant. Maybe we're getting out of the bounds of usefulness by having that general discussion, but it's perfectly natural and unsurprising.

When 37signals starts encrypting all their data, their search tool is really gonna suck.

A backup service that just needs to move around opaque blobs can and should encrypt its data. An application that needs to be able to react to the type and contents of the data it stores, not so much. It seems like cperciva would know this better than anyone, so the post seems pretty disingenuous.

I don't disagree with this at all. But having the perspective that "The answer isn't to prove that they can be trusted; the answer is to ensure that their customers don't need to trust them" is worth keeping in the back of your mind...because I'm sure there are cases when that approach can be taken without breaking features.

No, this is no longer the case.

Encryption these days only adds 1-2% extra load.

Regardless, even if the load were higher, as it used to be before modern hardware, you are still essentially informing your customers that "speed is more important than securing their data", which is a terrible approach to take.

TL;DR: If you are given the privilege of maintaining a customer's data, it's your obligation and responsibility to do so with the most care possible.

The problem isn't the cycles of encryption, it's that the data architecture has to be designed differently, and optimizations like caching can't be used as much. Document search is especially difficult on encrypted documents.

Did you reply to the wrong comment? My comment mentioned nothing about speed.

Tarsnap can treat data opaquely and have the client encrypt and decrypt it; most web applications that aren't just moving data around need to be able to access its contents to work.

This brings up an interesting point about the benefit of client-side encryption. That's fine if you have a locally running app, but how do you do it with a web app? With some kind of browser plugin, perhaps? Does something like that exist today?

It's possible to do in principle at least, assuming all your users have modern browsers. You could use the Javascript file API to intercept file uploads and then to encrypt the data before it is sent to the server. You could then use XHRs to collect the encrypted binary data and decrypt it before presenting it to the user. If it was an image, you could use canvas to display the decrypted content.

You'd have to contend with what is probably a large performance hit, and I don't know of any libraries that do this so you'd need to spend a considerable amount of time writing one. I suspect that this approach would only be practical for very simple web applications. For instance, an encrypted image or file hosting web application might be a possibility.

I'm not convinced that such a strict approach to securing client data is always the best policy. The clients of 37signals are not the same as the clients of tarsnap. I would think that a client of 37signals is the sort that sometimes needs the help of an admin and that often that help would require looking at the client data.

My own company will never store sensitive data with an outside firm like 37signals but that is only because we have a great IT staff. For companies that don't have an IT staff, outsourcing to 37signals makes sense and is probably worth the tradeoff to trust them with data.

I think that the key point in this issue is "trust".

Just as you trust the bank to guard your money, even though many of its employees have access to your current account balance, the convenience of using these kinds of services requires you to trust the organization.

There is a significant difference though: If the bank takes money from your account, you notice it. If the file storage provider makes a copy of your file, you don't notice it. You'll never know how the file leaked.

(sure, the bank could perform other tricks behind your back, like doing bad investments with the money you put in, but hey they'll get bailed out anyway...)

I agree, but in a paranoid alternative world, the bank employees could share your bank account balance information with criminal organizations that would investigate you and your family, one day kidnap you or a family member, and ask for an amount of money they know you possess. It's still a trust issue.

Right, hadn't thought about that, they could also leak information.

Luckily in the case of files you can easily do something about it, by encrypting them client-side or using a storage provider client that handles that for you.

The key idea is the very general principle that you increase security by reducing the scope of resources that you must trust.

cperciva is giving 2 examples: (1) use a service provider that doesn't require your trust. (2) limit the exposure of customer sensitive information to your employees that you must trust to keep it private.

I agree this is a better strategy than simply updating a privacy policy, as far as actual security is concerned.

It's worth noting that banks and similar organizations put safeguards, controls and extensive auditing on the data to limit the data tourism that any employee can engage in. You trust the organization because the organization knows that humans are fallible and essentially doesn't trust its own workers.

It's hard to design systems that don't keep sensitive data in readable formats. Protecting filenames could be done by encrypting with a salted hash of the user password. Doing this correctly while allowing password changes is really tricky. Can you recommend a good set of guidelines for getting it right?
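One commonly used pattern for the password-change problem is to encrypt the filenames under a random data key and let the password protect only that key, so a password change re-wraps a few dozen bytes instead of re-encrypting everything. The sketch below illustrates only that wrapping step, under loud assumptions: the names are hypothetical, the work factor is arbitrary, and the XOR mask plus HMAC tag is a stand-in (the Python standard library has no block cipher) for a vetted authenticated key-wrap or AEAD from a real crypto library. Encrypting the filenames themselves under the data key is not shown.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass

PBKDF2_ROUNDS = 100_000  # assumed work factor, chosen only for illustration


@dataclass
class WrappedKey:
    salt: bytes     # fresh for every wrap, so the mask is never reused
    wrapped: bytes  # data key XOR password-derived mask
    check: bytes    # HMAC tag used to detect a wrong password or tampering


def _derive(password: str, salt: bytes) -> bytes:
    """Stretch the password into 64 bytes: 32 for the mask, 32 for the MAC key."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, PBKDF2_ROUNDS, dklen=64)


def wrap(data_key: bytes, password: str) -> WrappedKey:
    """Protect a 32-byte random data key with a password."""
    if len(data_key) != 32:
        raise ValueError("data key must be 32 bytes")
    salt = secrets.token_bytes(16)
    material = _derive(password, salt)
    mask, mac_key = material[:32], material[32:]
    wrapped = bytes(a ^ b for a, b in zip(data_key, mask))
    check = hmac.new(mac_key, wrapped, hashlib.sha256).digest()
    return WrappedKey(salt, wrapped, check)


def unwrap(blob: WrappedKey, password: str) -> bytes:
    """Recover the data key, or raise ValueError if the password is wrong."""
    material = _derive(password, blob.salt)
    mask, mac_key = material[:32], material[32:]
    expected = hmac.new(mac_key, blob.wrapped, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, blob.check):
        raise ValueError("wrong password or corrupted key blob")
    return bytes(a ^ b for a, b in zip(blob.wrapped, mask))


def change_password(blob: WrappedKey, old: str, new: str) -> WrappedKey:
    """Re-wrap the same data key; data encrypted under it is left untouched."""
    return wrap(unwrap(blob, old), new)
```

The trade-off is the one raised elsewhere in this thread: if the user forgets the password, nothing on the server can unwrap the data key.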

There's no mention of what the filename was. Neither the basename, nor the extension. They say confidently "(It was a picture of a cat!)"

They clarified this in a comment to the original post (http://37signals.com/svn/posts/3076-i-heard-you-like-numbers...):

    Razvan Tirboaca 12 Jan 12

        And a Basecamp user uploaded the 100,000,000th file
        (It was a picture of a cat!)

            Are you looking at your users photos?

And the response:

    Taylor 12 Jan 12

    Razvan, absolutely not. The file was named cat.jpg and
    that was logged, which was what we saw. We do not look 
    at user’s files.

What does "cat" mean? Was the next file "bnc.jpg", "grep.jpg" -- even if it had something to do w/ felines -- was it in a dir of gfx of DNA visualizations, where the image would be a GATACA sequence or something? I don't know, but these are questions that immediately come to mind for me.

Your response is a non sequitur. You stated

There's no mention of what the filename was. Neither the basename, nor the extension.

My comment merely quoted a 37signals employee giving the filename, both basename and extension. You can believe them or not, but they did explain the situation, and they explicitly denied looking at the image.

And it couldn't have been a picture of a catalog, CAT construction machinery, someone named Catherine, etc.? I guess we have to take them at their word on this one.

What if it was a picture of a catalog? The file was called cat.jpg, so they said it was a picture of a cat. Maybe it was a catalog. They still said it was a picture of a cat.

This is the internez. It's perfectly fine to expect a random jpg to be a picture of a cat (accompanied by the obligatory caption).

Further down in the comments they try to save themselves by saying "Of course we didn't look at it, it's just that the pic was called cat.jpg".

So, no, I don't believe them at all when they say they never looked at the file. They say--with confidence--it was a picture of a cat. Sorry, but going from "cat.jpg" to such a conclusion is IMO quite a leap. It's just three letters; it could be a CAT scan, a screenshot of the Linux `cat` command, three DNA nucleotides, a picture of a tiger, something else named "cat" or something related to but not involving cats.

If I saw a filename like that, I'm not sure I'd say "It was called 'cat.jpg', so probably a picture of someone's cat," because it could be anything that somebody named "cat.jpg" for any number of reasons, and I wouldn't know for sure until I looked at it.

And even then, just looking at the filenames is not right. Of course I understand that if it had been `company-passwords.xls` or something more sensitive, they wouldn't have said anything. But even before they could judge whether the filename was sensitive, there really was no reason for them to be looking at filenames in the first place!

Sure some admin can always go in as root and look at everything, but you don't need to tie the proverbial cat to the bacon, by putting the filenames right up someone's face who really has no business looking at them since they're just collecting statistics.
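To make that concrete: counting uploads and spotting a milestone doesn't require the filename to appear in the log at all. Here is a minimal sketch, assuming a hypothetical `UploadCounter` hooked into the upload path. The class, the milestone set, and the log format are illustrative and not anything 37signals is known to use.

```python
import logging

logger = logging.getLogger("upload_stats")

# Hypothetical milestones worth announcing.
MILESTONES = {100, 1_000_000, 100_000_000}


class UploadCounter:
    """Counts uploads for statistics without ever logging the filename."""

    def __init__(self, start=0):
        self.count = start

    def record(self, filename, size_bytes):
        self.count += 1
        # The filename is accepted (the caller has it anyway) but is
        # deliberately left out of the log line.
        line = f"upload #{self.count} size={size_bytes}"
        if self.count in MILESTONES:
            line += " milestone"
        logger.info(line)
        return line
```

Anyone tailing this log as the counter approaches a milestone sees that the milestone happened, but nothing about what was uploaded.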

If you are looking for a service like tarsnap (client side encrypted file storage service) you should check out Wuala (not affiliated, just a personal recommendation). It has a nice (cross platform, Java) GUI client and all the features dropbox has plus encryption. It is operated by a Swiss company and therefore is subject to strong privacy laws. You can also access it through your browser (Java applet) and they have native iOS and Android apps.

Why should I care that the company is Swiss? The point of client-side encryption is that I don't need to worry about who's hosting my data or what jurisdiction they're under.

See, for example, Hushmail, which has no option but to cooperate with correctly formed legal documents. This means that they use the passphrase (which they can capture within a short time) of the non-java-applet version of their software, or they serve a modified applet.

Being in a different jurisdiction provides a small amount of protection.

If rules are made to be broken then by admission I don't think there will ever be a rule that states you should throw out the steering wheel if you're playing chicken.

Even though I completely agree that systems we build should have the least possible level of permissions required to do their job.

But the temptation to leave a backdoor open to peek once in a while, "just in case," is strong, and it has its own benefits...

Wouldn't monitoring all user-generated content be a requirement once SOPA gets to work?

Before posting this comment, I went and checked the Tarsnap site, including the Security section, the design section and the FAQ and didn't find an answer to this question. My memory, from a past reading of your site, was that you kept keys on your side of the service, so that you could turn them over to Law Enforcement if they showed up. Is this still the case? (because if it is, then you can look at cat.jpg, even if you wouldn't post publicly about it.)

I believe that this has never been the case with Tarsnap -- the keys are stored only locally. (cperciva can delete your encrypted data but not decrypt it.)

37Signals could've completely avoided the controversy by just saying "we contacted the user who uploaded the 100 millionth file to tell them that the file they uploaded at 12:34 was the 100 millionth file and they wrote back and said it was a picture of their cat".

37signals has morals.
