So people got the impression that they actually saw the picture. Hence the backlash.
Users of cloud services may be aware that engineers, sysadmins, or DBAs may occasionally see their data, but they certainly prefer not to think about it.
By giving them your files you are trusting them not to screw you over.
Why? (Or at least, why should they see anything private in raw form?)
> And it is certainly going to require staff to have the ability, even if they never have to use it.
> By giving them your files you are trusting them not to screw you over.
By giving themselves the technical ability to examine private user data, they are making a strong (or indeed legally compelling, in some cases) argument for not using their service to store anything private at all. That's a death sentence for most cloud services.
We don't accept companies storing passwords in plain text. We don't accept companies transmitting credit card data in the clear, and PCI DSS requires quite strict controls on access to such data even when it's stored internally on the company network. Businesses dealing with sensitive data such as health records are subject to all kinds of regulations on the privacy of that data. Professionals dealing with privileged communications such as between lawyers and clients don't get a pass. Off-site backup services give all kinds of strong guarantees about the security and privacy of the data entrusted to them.
Why should we give a pass to anyone else just because they can't figure out how to set up a security system where only the end user can access the unencrypted version of their own data?
This is in case somebody gains unauthorised access to the data, not in case staff can't be trusted. For example, when paying by credit card over the phone you hand over your card number to whoever is taking your order, but if they enter it into a system, that system then has to comply with regulations.
As to plain text passwords, again this is in case the data is stolen. It's all very well saying "Google shouldn't store plain text passwords", but if Google as a company wanted to read my email, they could a) just replace the hashed password in my database entry with one they can use, b) place code in their login system that secretly logs the plain text password, or c) go straight to where my emails are stored and access them there.
How do you get around this and make it impossible for them to access your data?
Actually, no. This is very much for both reasons. Part of PCI compliance is ensuring CC data is encrypted with a key that is split among a few people. So, let's say three people each know a part of the key. The goal is that in production, no one person can access the whole key, but data can still be encrypted and decrypted.
To put it plainly, it's not just a matter of encrypting your CC data.
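The split-key custody described above can be sketched with a trivial XOR scheme, where all shares are required to reconstruct the key. This is a toy stand-in for real constructions such as Shamir's secret sharing, not a description of any actual PCI setup:

```python
import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_key(key: bytes, n: int) -> list:
    # n-1 random shares, plus one final share that XORs back to the key.
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    shares.append(reduce(xor_bytes, shares, key))
    return shares

def combine_shares(shares: list) -> bytes:
    # XOR of all shares recovers the original key.
    return reduce(xor_bytes, shares)

key = secrets.token_bytes(32)
parts = split_key(key, 3)              # hand one part to each custodian
assert combine_shares(parts) == key    # all three together rebuild the key
assert combine_shares(parts[:2]) != key  # any subset alone is useless
```

With this shape, the whole key only ever exists in memory at the moment of encryption/decryption, never in any one custodian's hands.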
As for your "Google can just" remarks: yes. This can happen in many places. However, you mitigate the risk of this happening with procedures and security. I guarantee you that just working for Google doesn't give you access to the emails. I'd be surprised if the number of people that have direct access to emails at any time is in the double digits. Getting your code into production, I imagine, isn't just a cherry-pick.
You can't prevent people from having access to data you give them. However, a company can mitigate the chances of it happening.
The same way everyone else does: encrypt the data using a secret known only to the customer, isolate internal systems that have access to the decrypted data so that no one person can ever access that data on their own authority, and ensure that whatever procedure does permit access with the requisite authority creates a robust audit trail. If security really matters, the whole system and its logs should be regularly audited by an independent party, too.
You have to have some sort of trust, because obviously if everyone in the company is crooked then nothing but encrypting everything client-side using auditable code is bulletproof. But you can certainly engineer systems so that access requires multiple people's consent and gets securely logged, which would eliminate casual snooping and provide robust evidence for legal action in the event of collective abuse.
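The "securely logged" part above can be made tamper-evident by chaining each audit entry's MAC into the next. This is a minimal sketch; the key name and entry fields are invented, and in practice the MAC key would be held by the auditor, not the operators:

```python
import hashlib
import hmac
import json

# Hypothetical: this key lives with an independent auditor.
AUDIT_HMAC_KEY = b"held-by-the-auditor-not-the-operators"

def append_entry(log: list, actor: str, action: str) -> None:
    # Each entry commits to the previous entry's MAC, forming a chain.
    prev = log[-1]["mac"] if log else "genesis"
    entry = {"actor": actor, "action": action, "prev": prev}
    payload = json.dumps(entry, sort_keys=True).encode()
    entry["mac"] = hmac.new(AUDIT_HMAC_KEY, payload, hashlib.sha256).hexdigest()
    log.append(entry)

def verify_log(log: list) -> bool:
    prev = "genesis"
    for e in log:
        body = {k: v for k, v in e.items() if k != "mac"}
        if body["prev"] != prev:
            return False
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(AUDIT_HMAC_KEY, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, e["mac"]):
            return False
        prev = e["mac"]
    return True

log = []
append_entry(log, "alice@example.com", "viewed account 42 (support ticket)")
append_entry(log, "bob@example.com", "exported aggregate usage report")
assert verify_log(log)
log[0]["action"] = "nothing to see here"  # tampering breaks the chain
assert not verify_log(log)
```

An operator can still snoop, but they can't quietly erase the evidence without the auditor's key.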
That sounds like Colin Percival's (cperciva) startup: http://www.tarsnap.com/
Who does that?
If you give your data to someone, chances are they might look at it. If you store files, use email or surf the web at work, chances are the IT guys can look at it. Of course, they should not, and they probably have better things to do etc etc, but believing this will never happen just seems naive.
Client-side encryption with end-user key management is not yet practical for the average end user. Until it is (I'm hopeful we'll get there), the average service will require some sort of administrative back door that is controlled by process and people.
Not exactly. The purpose is to limit the number of people who have access to your credit card number so that if one of them uses it fraudulently, it's easy to isolate and verify the source of the fraudulent transactions. Yes, the guy taking the order over the phone will have access to your credit card number. Yes the waiter running your card at the restaurant will have access to your credit card number. If the system is well designed, though, no one else will, and it'll be easy to find the person to blame if fraudulent transactions are made.
I've never worked on PCI compliant systems myself, but I know many developers who have, and they say that the sysadmins take solid measures to ensure that no one, not even the developers, gets any data from a database that handles credit card information. Any data pulled from those servers is first sanitized to ensure that credit card numbers and other personally identifying information are removed. Credit card numbers are replaced with a "sample" number that can be used for validation purposes. Names and other information are replaced with sanitized data that has the same "shape" (e.g. number of characters and identical punctuation) as the original.
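The "same shape" sanitization described above might look something like this. It's a hypothetical sketch: the regex is naive (real systems also run Luhn checks), and the replacement number is a well-known Visa test number:

```python
import re

TEST_PAN = "4111111111111111"  # widely used Visa test card number

def sanitize_pan(text: str) -> str:
    # Naive matcher for 13-16 digit runs; illustration only.
    return re.sub(r"\b\d{13,16}\b", TEST_PAN, text)

def same_shape(value: str) -> str:
    # Replace letters/digits with filler while keeping case,
    # length, and punctuation, so the "shape" survives.
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("x" if ch.islower() else "X")
        else:
            out.append(ch)
    return "".join(out)

row = {"name": "Jane O'Leary", "pan": "4929123456789012"}
clean = {"name": same_shape(row["name"]), "pan": sanitize_pan(row["pan"])}
assert clean == {"name": "Xxxx X'Xxxxx", "pan": "4111111111111111"}
```

Developers can then debug formatting and validation bugs against the sanitized copy without ever touching a real card number.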
The purpose of these regulations is to ensure that there's always a clear chain of custody over your credit card numbers. Preventing unauthorized access is only one part of maintaining that chain of custody.
The problem is that incidents and attitudes like this make the market lose trust with the cloud services industry, which is poison to everyone.
Ethically there may be different obligations, but to say that there is some implicit "contract with the customer" is simply not the case.
As always, ask a lawyer if you want professional advice.
You are screwed. Simple as that.
Temptation can be resisted, but in this case you are in trouble just for glancing.
Obviously they filter passwords and other sensitive data, but I think they should rightly have access to whatever they judge necessary to do their job.
There will always be people who have the ability to access data they are not supposed to, but in the end it comes down to who you will trust with your data.
To me, the transparency and contributions of 37signals qualify them for that trust. With that, I trust them to make good decisions about who they hire and what they store in their log files.
Of course, encryption per se is overkill for that. Something like ROT13 would do the trick.
I can trust someone and still not be comfortable if he accidentally sees that I uploaded "how to file a divorce.pdf", for example.
If, in the process of debugging the "," issue, a set of files is uncovered (including "how to file a divorce.pdf"), tough.
Ideally, the usernames could be unscrambled separately, so at least there's no immediate connection to a single user.
I'd think the number of apps doing this is much smaller than those that don't, and even then only in cases where file names are replaced with hashes or GUIDs for directory reasons, not for the sake of information security.
I've seen tens of thousands of pieces of private data across all the companies I've contracted for. Data guys need to explore, they need to learn what types of customers use what type of features and why.
Heck, I talked to a guy online (didn't know his real identity, or I would call him out personally) who wrote a script that automatically checked his employer's database against outstanding warrants in the US (fuzzy matching first name, last name, city, age) and pulled in 2 to 3 times his salary just from the rewards. That is how bad some people are.
What you can trust is that a company almost certainly won't intentionally leak your data to the public, but rest assured that they do flip through it. Some awesome companies will obfuscate the email addresses or company names so that it is much harder to back calculate who owns what, but honestly unless a company is promising full encryption on their side I would just assume they can see everything.
If you want real privacy, use encryption (or some other zero-trust protocol); it really isn't that hard to use.
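Encrypting locally before uploading can be sketched in a few lines. The cipher below is a toy SHA-256 counter-mode keystream, strictly for illustration; in practice you'd use a vetted library (e.g. NaCl/libsodium or GPG), never a homemade construction like this:

```python
import hashlib
import secrets

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy CTR-style keystream from SHA-256 -- illustration only,
    # NOT real cryptography.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    pad = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ k for p, k in zip(plaintext, pad))

def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    pad = keystream(key, nonce, len(body))
    return bytes(c ^ k for c, k in zip(body, pad))

# Derive the key from a passphrase the provider never sees.
key = hashlib.pbkdf2_hmac("sha256", b"correct horse battery", b"per-user-salt", 200_000)
blob = encrypt(key, b"how to file a divorce.pdf contents")
assert decrypt(key, blob) == b"how to file a divorce.pdf contents"
```

The point is the shape of the workflow: the provider only ever stores `blob`, so no amount of log browsing on their side reveals the contents.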
I would certainly terminate any account with a company that willfully was reading my private data and opening files for the mere sport of it.
* Should it happen everywhere?
* What are the mechanisms available to the market and the industry to prevent the deterioration of customers' trust with us?
Any person, employee or not, professional or not, has obligations to uphold just laws; certainly including measures of privacy.
"We may also use personal information about you to improve our marketing and promotional efforts, to analyze Site usage, to improve our content and product offerings, and to customize the Site's content, layout, and services. These uses improve the Site and better tailor it to meet your needs, so as to provide you with a smooth, efficient, safe and customized experience while using the Site."
That bottom paragraph fully communicates the nature of the relationship. Outside any law that would render the above unlawful, it is well within the law for an employee to run "SELECT * FROM users WHERE created_at > '2010-02-01'" or "SELECT * FROM todos JOIN users ON todos.user_id = users.id WHERE users.profession = 'developer'". There are perfectly valid reasons to do these types of things: anti-fraud measures, site optimization, etc.
> What are the mechanisms available to the market and the industry to prevent the deterioration of customers' trust with us?
This is a problem of mismatched expectations and priorities. It's a lot like politics. In an ideal world a politician would be able to say something like 'I think the American people acted irresponsibly financing homes and that is a good part of the reason for the financial crisis' because it is the truth and it would help people in the long run to hear it, as well as help any policy formation in response to it. But practically they can blame others and get away with it.
dhh hides behind: "I don't think it has to be this way. We often run internal reports on usage of certain features, but it's always aggregated, and never looks at the individual data. I feel bad enough looking at a customer's account when they've specifically asked me to do so from a support request.
I would certainly terminate any account with a company that willfully was reading my private data and opening files for the mere sport of it."
That's carefully worded bullshit. Internal reports are not exploring. Internal reports are what you show at the monthly marketing or board meeting. CxOs get internal reports. Data guys test recommendation models. Data guys find out the interesting patterns to include in custom reports.
Also, his last paragraph is ridiculous. Obviously we don't read data for sport. In fact, it's boring. You go through data for trends.
Helping to serve justice is now a bad thing?
Obviously the police do not have such a reason and therefore cannot get a warrant.
I guess I'm biased because my business makes me deal with fraud on a daily basis.
However, I'm disgusted by the number of people in this thread that justify the violation of customer privacy because it's what's normal.
As an industry, we all face in our sales cycle the fear from customers that we will violate their privacy. Self-regulation by holding each other to account is the cheapest and best way to address the issue.
While I would be stupid to believe software vendors don't look at my data because I know better, that isn't my expectation.
It's not my expectation that my lawyer, my accountant, my doctor, my therapist, my social worker, or my librarian trade on or reveal or delve through my private information. That's why they as professionals are licensed and self-regulated by their professional colleges.
As information professionals, we should act professionally with information as well. This is not crazy talk. We also see credit card numbers and personal information stolen every month. Last year over 100 million credit cards had to be reissued due to data theft. That's why the card industry created PCI compliance to self-regulate the industry, as imperfect as it may be.
The idea isn't that cat.jpg is bad. It's that over at 37Signals, someone was browsing the logs, reviewing the file uploads, and did see "2011 Financing Report for X Public Company - Unreleased" or something akin to that.
I understand your point of view. But the people offended by this are in the right. It's not what happened, but that it happened, and what it shows.
Rather, they did something like "SELECT filename FROM uploads WHERE row_num = 100000000".
Honestly, if you're concerned about something like this then you should not be using a third party solution to store your files. Of course 37 Signals can look at the names of the files you are storing- they could probably hide that information from themselves, but then they'll get a support request saying "we can't open file-x.jpg" and they won't be able to do anything about it.
They're the ones who have repeatedly described it as "looking at the logs". That struck me as weird -- keeping a log that numbers every upload in order -- however that's how they describe it, and hence why others describe it so.
Honestly, if you're concerned about something like this then you should not be using a third party solution to store your files.
I engaged in the prior argument, and there too this was the common last line of defense.
It misses the point.
Everyone knows that SaaS vendors can access your data and files, so it is bizarre that this keeps getting mentioned like it was unknown. Yet critical businesses engage vendors to hold their most confidential files -- the sorts that auditors grill them over and various bureaucratic organizations monitor them on.
Because they know, or at least believe and hope, that the organizations they entrust with their data use discretion, and have standard policies and standards -- if not actual data security and auditing controls -- to ensure that data is only used on a need basis. For instance for support purposes.
Writing a blog post that flippantly mentions a customer's data sends the wrong message. While we all know it is possible, it gives the entirely wrong impression to customers. Data security is the #1 impediment to the adoption of SaaS.
SaaS depends upon the trust of customers, and DHH is approaching this in the right way. It is quite a contrast from the many laissez faire responses on here.
They wouldn't have released it, and you would be a moron for storing that data on _their_ servers.
> Or "Next iPad Specs - Official.pages"
You'd be a moron for storing that data on their servers.
The existence of cloud solutions doesn't preclude the use of self-controlled servers for truly critical data.
The reality is that the vast majority of business data is not interesting to anyone but the business itself and possibly competitors. And the vast majority of those businesses and competitors are well outside the scope of 37signals. Thus your data is relatively safe. If you're that worried about the hosting provider being able to view your data, host it yourself. Simple...
Now if you upload "HowToBeat37Signals.docx" to Basecamp you should probably assume two things. There's the possibility, however remote, that someone not authorized will see it (that possibility exists on every farmed out service, no server is hacker proof despite GoDaddy's little badges) and if someone does see it and it gets leaked or used against you, you'll have a damn good chance of suing the bejesus out of them.
The word trust is the key word here. Whenever you use a service to store sensitive material there has to be some level of trust. I think it's a mistake to trust that no one within the company or as a result of a security breach will absolutely never ever see what you've stored. What you do trust is that the odds of that happening are supremely low and if someone were to see your data (at least within the company) that they won't use it against you or share it. History has shown us that no web service is 100% secure and reliable so if you aren't comfortable with your odds then you shouldn't use the service. I for one assume everything I've ever put online is not secure. I'm comfortable with my odds though and bank on the fact that no one will take something written or created by a nobody like me very seriously or care at all.
User A uploads file_a.txt and you want to encrypt it. What key do you use for that? It can't be tied only to User A (e.g. their password or password hash), otherwise User B won't be able to decrypt it. How would you set that up in a way that's still reasonable considering Basecamp's use case? (meaning: one of their goals is to make project collaboration simple)
There's probably some huge issues there, but it's a start to answering the question.
That's a good question. Hopefully my answer does it justice.
First, having access to something and accessing something are two completely different things. I'm not suggesting that they should not have access to something they need to do their job. However, that doesn't mean we can't expect them to minimize the risk.
Next, you argue that someone not authorized will see the document, however remote. You mention suing, and while it sounds great, it's a long, painful struggle that I imagine isn't a quick fix. More importantly, would you knowingly hand over private data to someone who has proven incapable of keeping your trust? This is the reason 37Signals is jumping on this so quickly and doing damage control (and I say that in a complimentary way). It's not just their paying customers they have to concern themselves with, but also all the people that use their various services in one form or another.
Finally, you mention trust. In this case, you suggest trusting the odds. I'd prefer to trust that the company isn't banking on odds, and is instead actively working to mitigate that risk. Odds are a funny thing. I'm not under the belief that they can provide 100% security and privacy, but that doesn't mean I need to blindly accept failure.
> I for one assume everything I've ever put online is not secure.
But I bet you still actively work to ensure everything is as secure as possible. You don't share your bank password. You don't hand over your credit card data to just anyone; you make sure you're using SSL before making a purchase, you use SSH, different passwords: a variety of things to mitigate the risk.
Honestly, I think part of the reason people are defending 37Signals is that for many of us (myself included), we never really think of these things, and we see how easy it would be for us to make the same mistake. Instead, we should be focused on the fact that even for a company like 37Signals, they can make mistakes.
They can also admit to them, apologize, and work to correct the problem. We should learn from this, and try to improve.
When I talk about odds we're on the same page in a way. We absolutely should expect them to minimize the risk, but let's not fool ourselves into believing that no one will ever take the opportunity to access one of our files. The best we can do is mitigate the risks and hope for the best. I don't feel that their pulling up the file name in this situation is a meaningful breach of trust. As programmers we like nice, neat, black-or-white answers, absolutes, but in this case you have to take the circumstances and the company's track record into account. 37signals has never shown itself to be untrustworthy, and I really think this is much ado about nothing. I'm having a hard time arguing your point because I agree with you for the most part. I just think that this one instance is very obviously a special circumstance, and any casual observer would certainly let it slide without a single red flag being raised.
I usually don't go down this road, but I've yet to figure out what person or group made this an issue. Did 37signals bring this up on their own? I know there were a few comments questioning them when the original post came out, but it didn't seem like anyone was that upset over it to the point that a blog post was necessary. There are a lot of individuals who are just haters and take any opportunity to come out of the woodwork, point out any itty bitty flaw they see, and make it into the end of the world. I hope that's not what started this. I also wonder if some competitor or "enemy", for lack of a better word, decided to make this an issue. Or maybe it was really just some of their users, in which case all I can say is, fair enough. I don't agree, but a company does serve at the pleasure of its customers to a large degree.
> any casual observer would certainly let it slide without a single red flag being raised.
Casual observer, sure. But, I imagine it was more than just a casual observer making a fuss, as I suggest below.
> I usually don't go down this road but I've yet to figure out what person or group made this an issue?
From what I know of 37Signals, they aren't the type to bow to the pressures of haters. I imagine there was some real concern here brought forth by people not in the public eye.
Anyways, thanks for the good discussion.
It's a pretty painless process (we have snippets to ask permission from the user and shortcuts to request access from the sys admins) and it helps prevent both willful and accidental leakage or modification of our users' data.
I actually have a bug open in FogBugz called "Build a better FogBugz" where we discuss in some length its shortcomings for our workflow and how to fix them. It's not exactly an active project, but it will probably be when I get sufficiently fed up, and I would have no problem organizing its development in FogBugz.
The point is, if you can't trust the people who are writing your tools, why are you using their tools in the first place?
So ultimately, it may be possible for one to snoop on our users, but it's much easier to trust (and keep tabs on) a small, well screened team than the entire company.
Speaking personally, I can say that I would (and do) absolutely trust my own data to our sys admins.
37signals' target demo is smart, well-to-do, logical. They shouldn't have to apologize. As their logical demo, we should know better. We know that if the filename was MyBossIsAnAsshole.docx or even MyWeddingPhoto.jpg that 37signals wouldn't have had to think for a second on the appropriate thing to do. As logical thinkers, we know why cat.jpg is funny as it pertains to our demographic. We know that MyWeddingPhoto.jpg wouldn't be funny.
The whole burn 'em at the stake routine is asinine.
These days people care much more about the privacy of the data that they actively share through uploads etc and much less about the tracking. At least on apps like ours.
Even in the best-case scenario, at least some employees can access data as part of their jobs. This has been true of every job I've ever worked at.
"Trust" is an emotion-laden, rhetorical word used by someone who wants you to do something. "Just trust us."
Trust is not fragile, trust is an illusion.
Replace the verb "trust" with the word "assume" or "take a calculated risk" and you're closer to reality.
Instead of "trust us," how about, "Look at our record. Note that we have had not a single incident of data disclosure in 6 years. Decide for yourself if it is likely that we'll have one now, with your data."
Instead of "trust us," how about, "Think about our business: imagine the consequences if we were found to have looked at our customers' data, and see if that disincentive allays your concerns sufficiently."
Instead of "trust us," how about, "Here are the ways we are protecting your data. Consider whether they meet your requirements or not."
I'm voting "trust" off the island.
Many companies, especially large ones with lots of lawyers, have developed policies and procedures relating to what's acceptable and what's not. But most smaller companies and startups don't seem to have time to formulate these policies.
A professional code of ethics, specially with regard to privacy and user data, would be very useful.
Right now, most developers operate on a "do unto others" philosophy. While this may be good intentioned and work well a lot of the time, it's highly subjective -- as evidenced by the comments on this thread.
But should everyone in the company have that level of access, or should access be restricted to the minimum necessary? What I don't see in others' comments here (except tghw's) is any recognition of that. It's all very well saying you want to give your devs access, and that you can be trusted, but over time and as your company grows you're exposing yourself to the risk of a rogue operator. And it only takes one person doing something bad to severely damage the trust your customers hold in you.
It's a balance, to be sure, but I'm inclined to think a blanket "we trust our devs, so they have the access they need" could be exposing yourself to a large risk you don't need.
What if Mint.com celebrated their 1 "billionth" processed transaction by posting what it was? Wouldn't that cause outrage?
I even automated the process of looking into customers' data. The main goal is to catch spam and scams and delete such accounts.
Maybe it's specific to my business (a job board), but aren't scams and spam a risk in any business, at least to a certain extent?
This is absolutely the best response conceivable. Bravo!
And a Basecamp user uploaded the 100,000,000th file (It was a picture of a cat!)
In the comments, they clarify that it was called cat.jpg, and that's how they knew it was a picture of a cat.
They only mentioned in the comments that it was from the filename. So initially, it wasn't clear that's how they did it.