Graf 1, sentence 1: "a few board threads" -> Internet's current most important programming forum.
Graf 1, sentence 1: "contributed to by our competitors" -> Smoke screen, unsupported, irrelevant.
Graf 2, sentence 2: "basically admitted they really didn't know the facts" -> Because the facts weren't provided, the contributors set about reversing them from published material, the point of the thread.
Graf 3, sentence 4: "does use publicly available, well researched, and NIST validated cryptographic algorithms" -> Virtually all cryptography anywhere can make a similar claim, and most of that code is broken. NIST validates primitives and a few basic constructions, but tying those primitives into a functional cryptosystem is outside their purview.
Graf 4, sentence 1: "for any customer deployments" -> Leaves open the question of whether they implement semantically insecure constructions in any setting.
Graf 5, sentence 2: "fundamental security features (full field encryption, randomization through IVs) were disabled" -> Randomized encryption isn't a feature, it's a fundamental property of a cryptographic construction.
Graf 6, sentence 1: "currently in the process of obtaining our FIPS 140-2 certification" -> FIPS 140-2 doesn't involve a rigorous analysis of cryptographic primitives; the crypto-specific components focus on use of NIST-approved ciphers and block modes, but do not assure that those primitives are used securely. To illustrate that point: every vulnerable version of SSL3 and TLS1.0 and TLS1.1 has had a FIPS-compliant implementation somewhere.
They should just be honest about their desire to suppress the use of their copyrighted IP in critiques of their product. They're in a competitive space, they're a small company, hard to manage their online reputation and build product, &c. The Reddit/HN/Stack Overflow scene wouldn't like that response, but it's better than this one, which actually creates more questions about their product capabilities.
This is a good example of bad legal/PR turning a company from a fairly well respected new security company to a joke.
Tokenization, which CipherCloud does, could actually be done fairly securely if you had a decent amount of local storage. They IIRC use a FIPS HSM for local key storage in their local appliance (I talked to one of their founders at a security event a year or two ago and was initially suspicious of their claims, but it seemed adequate for certain use cases based on how they were using it -- maybe things have changed). It's fundamentally not too different from when Stripe gives you a user key vs. PCI information.
Basically, if you can correctly identify certain fields as sensitive and others as not, and force all your traffic through a proxy, you could put totally unrelated random tokens in the sensitive fields, and then do search locally on the appliance, rather than on the untrusted service. E.g. if you wanted to use Salesforce, but keep customer addresses secret (because they were super-confidential government sites or meth labs or something), you could still put names in Salesforce and do everything else, but just put a random string in for addresses; do address searches on the proxy, either going from single record to address or maybe even "give me all the records in Missouri". There is no magic here. Someone could do an open source implementation for any specific site (via scraping or a public API) easily. The difficulty is doing it for many sites, and keeping it updated, supporting it, and selling it to the Fortune 500.
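To make the "random token in the field, search on the proxy" idea concrete, here is a minimal sketch in Python. Everything in it (the `TokenVault` class, the `tok_` prefix, the sample records) is made up for illustration and says nothing about how CipherCloud actually works; a real appliance would also need durable, access-controlled storage for the mapping.

```python
import secrets


class TokenVault:
    """Hypothetical local store mapping random tokens to sensitive values."""

    def __init__(self):
        # token -> plaintext; this mapping never leaves the appliance.
        self._by_token = {}

    def tokenize(self, plaintext):
        # A fresh random token per record, so equal plaintexts get unrelated tokens.
        token = "tok_" + secrets.token_hex(16)
        self._by_token[token] = plaintext
        return token

    def detokenize(self, token):
        return self._by_token[token]

    def search(self, needle):
        # Substring search runs locally over plaintext; the SaaS side only
        # ever receives the matching tokens.
        needle = needle.lower()
        return [t for t, p in self._by_token.items() if needle in p.lower()]


vault = TokenVault()
# Records as they would be sent to the untrusted service: names in the clear,
# addresses replaced by random strings.
alice = {"name": "Alice", "address": vault.tokenize("12 Elm St, Joplin, Missouri")}
bob = {"name": "Bob", "address": vault.tokenize("12 Elm St, Joplin, Missouri")}
carol = {"name": "Carol", "address": vault.tokenize("9 Oak Ave, Reno, Nevada")}

# "Give me all the records in Missouri", answered on the proxy.
missouri_tokens = vault.search("missouri")
```

Alice and Bob share an address but get different tokens, so the service cannot even tell that the two fields are equal. The price is that every search and join on that field has to go through the proxy.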
I don't know if they've been pushed to do stupid stuff, or if they just have horrible marketing/PR now (which is weird since they raised a fuckton of VC), or what.
Agreed, no magic here. I rolled a quick version using Squid and greasy spoon. Got it to work on SFDC and Gmail inside of a day. Using tags around the encrypted content and regex, you can then feed the content into the decryption engine. Search works, etc. You could even use a unique IV per user to add a level of security, but it is by no means rock solid. It would however address some of the frequency analysis concerns, since if the encryption (tokenization??) was cracked it would only reveal the contents for a single user. That would work for the Gmail side, but doing it in SFDC is a whole other issue, and unless they have some Harry Potter stuff going on, is likely huff and puff.
I would be happy to post my code, but honestly the process is so embarrassingly simple, I'm sure others could do it better. Setting up the Squid proxy with SSL bump was more difficult than the code, as there are some great libraries out there. Using a reverse proxy and ICAP server, you need to parse all content using something like jsoup (regex if you really wanna hack). Jsoup grabs the element, you run it through a great encryption library like Bouncy Castle, and you then add some unique identifiers around it (!!) so that you can find the encrypted content again with simple parsing and decrypt it. Plop it back into the content using your trusty greasy spoon. And voilà, magic! All persisted data is encrypted. When data is pulled out you simply parse for the unique tag, and then run it through the decryption side. There are a number of things that you can do to increase the security of this implementation; with a little tweaking it works for searching and such, so Gmail is no problem. An app like SFDC with joins between records would be significantly more difficult to do properly. Doing it improperly is trivial, as you could just use the same keys and IVs per org (the unit of work in SFDC).
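The tag-and-parse step described above might look roughly like this in Python (not the Java/jsoup/Bouncy Castle stack the comment mentions). The `!!ENC:` markers and all function names are hypothetical, and the "cipher" is a byte reversal that is there only so the sketch runs; it is not encryption and would be replaced by a real library.

```python
import base64
import re

# Hypothetical markers; any delimiters unlikely to occur in real content would do.
OPEN, CLOSE = "!!ENC:", ":ENC!!"
TAGGED = re.compile(re.escape(OPEN) + r"([A-Za-z0-9_\-=]+)" + re.escape(CLOSE))


def wrap(plaintext, encrypt):
    """Encrypt a field value on the way out and surround it with the markers."""
    blob = base64.urlsafe_b64encode(encrypt(plaintext.encode("utf-8")))
    return OPEN + blob.decode("ascii") + CLOSE


def unwrap_all(content, decrypt):
    """Replace every tagged blob in a response body with its plaintext."""
    def _sub(match):
        raw = base64.urlsafe_b64decode(match.group(1))
        return decrypt(raw).decode("utf-8")
    return TAGGED.sub(_sub, content)


# Stand-in transform so the example is self-contained: NOT real encryption.
def fake_encrypt(data):
    return data[::-1]


def fake_decrypt(data):
    return data[::-1]


# What the proxy would persist upstream, and what the user sees on the way back.
stored = "<td>" + wrap("Jane Doe", fake_encrypt) + "</td>"
page = unwrap_all(stored, fake_decrypt)
```

Passing the cipher in as a pair of callables keeps the parsing separate from the crypto, which is where all the hard decisions (keys, IVs, per-user vs. per-org scope) actually live.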
It sounds like they don't have anyone with actual PR experience. The standpoint they are taking is very old school and stems from an angry reaction.
If instead they had entered the discussion with a sliver of respect and honesty, it would have been great. Instead, many people have been introduced to them via a negative and untrustworthy atmosphere, and this has certainly tarnished their reputation. Despite what they say, not all exposure is good exposure.
I'm not sure if it's that they have no PR experience in the company, or just don't consider StackExchange/HN/Reddit to be worthy of a serious effort.
IMO, this is the kind of thing founders should handle personally once it happens. Maybe guided by a PR person or an investor, but a founder giving an adequate response gets graded on a curve, and is thus a lot more effective than a completely polished PR/marketing person.
It's nowhere like Stripe's tokenization because all tokens are not equal. Their tokens have inherent patterns which aid frequency analysis. That is exactly the point of the SE discussion which got the DMCA notice.
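A toy illustration of why patterned tokens help frequency analysis, written in Python. The data, the key, and the truncated-HMAC "token" are all invented for the example and are not a description of CipherCloud's scheme; the only property that matters is that the same word always maps to the same token.

```python
import hashlib
import hmac
from collections import Counter

KEY = b"demo key, not a real secret"


def deterministic_token(word):
    # Same key + same word always gives the same token.
    return hmac.new(KEY, word.encode("utf-8"), hashlib.sha256).hexdigest()[:12]


# Made-up column of values as the untrusted service would store it.
plaintext_column = ["Missouri"] * 5 + ["Nevada"] * 3 + ["Ohio"] * 1
stored = [deterministic_token(w) for w in plaintext_column]

# The attacker sees only the tokens, plus a public guess at which values are
# most common. Ranking tokens by frequency lines them up with that guess.
observed_ranking = [tok for tok, _ in Counter(stored).most_common()]
guessed_ranking = ["Missouri", "Nevada", "Ohio"]
recovered = dict(zip(observed_ranking, guessed_ranking))
```

No key is needed for the attack: the attacker never inverts the token function, only counts repeats. Random per-record tokens, as in Stripe-style tokenization, give every value a count of one and leave nothing to rank.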
Yeah, I've never looked at CipherCloud's security in depth, but you could do tokenization in a fairly secure way. There's essentially a triangle of security, functionality-of-SaaS-app, and complexity of the proxy.
One issue is access patterns might leak information, so if you wanted maximum security you'd end up doing crazy things like heavily caching or accessing extra "chaff records" periodically. Well before that point you'd probably just give up on the SaaS app entirely.
This is total BS. How is posting a few screen-grabs from their publicly available video a violation of their copyright? Isn't it considered fair use? I was expecting something more along the lines of "we are sorry for the whole fiasco, our legal team acts independently whenever it feels like there is a violation", instead of him defending the DMCA. They used the DMCA to try and censor a debate about their lies. Talking about the DMCA, is it just me, or does anybody else think that government is always eager to pass copyright protection (aka censorship) laws, rather than passing laws to protect citizens from corporate greed?
Maybe honest, it's just very convenient. "That demo we have on our site to show off the technology? It's really crippled and doesn't actually show off the technology... We promise the real thing actually works! Oh, you want to hear about the DMCA takedown? That was just our legal team, you know how they can be!"
I'm a little confused about not wanting to disclose IP during a patent process. Isn't that what the patent process is designed to do? Disclose a novel invention, and have it (among other things) vetted for prior art. They say these patents are pending, in which case, shouldn't they be searchable? Has anyone found them? I did a search, albeit a cursory one, and couldn't find anything. I'd like to give these guys the benefit of the doubt as they are funded by a16z, but the lack of information is troubling.
Favorite line in the whole thing "Some of the fundamental security features made available (e.g. full field encryption, randomization through IVs, etc.) were disabled because we were not comfortable sharing such IP on the internet while our patents are still pending"
So, apparently they are going to be patenting padding/randomization in encryption and "full field encryption". Our patent system at work for obvious things.
You might want to recalibrate it; the mercury should have burst the tube at this point. Searching encrypted data is impossible without fully homomorphic encryption and fully homomorphic encryption is wildly impractical for use at present.
"Contributed to by our competitors" -- if that's the case, the competitors are giving informative SO answers about crypto. Whereas they are engaging in censorious shenanigans. I, for one, prefer the "competitors'" contributions.
Don't those require that the client actually do the searching? (I couldn't devote enough time to read them now, so I only read the abstracts. Thank you, by the way, for sending the links; this kind of stuff is really interesting.)
To be specific I mean a second party being able to search the data for arbitrary strings would mean the security of it was broken completely, and I thought this service was storing and searching without client input.
I am not really sure what it is that CipherCloud provides or even claims to provide; it looks like a big pile of buzzwords but few details. I am not sure what sort of a service would be searching ciphertexts without some input from the client -- at the very least, the service will need to know what to search for.
You are correct that the PIR and ORAM protocols involve the client performing some of the work of the search. The point is that the client does not need to store or scan the entire database (for ORAMs there is usually a one-time setup that involves scanning the database, but this can be viewed as "uploading" the data to the server; this may not be acceptable for all use-cases). With FHE, the client will perform less work, but still has to at least encrypt its query and decrypt the result. However, FHE is still many years from practicality, whereas PIR is practical now (but maybe not for database search) and ORAMs are nearly practical.
10 links to their own website in that post. Not even links to other content, just root links. I feel bad not having much more to add because I don't really understand their technology, but that really stood out.
"A couple of recent discussions in a few board threads contributed to by our competitors have questioned CipherCloud’s small online payday loans. same day payday loans. easy online payday loan. direct lender payday loans online. approach to delivering cloud information protection."
I know the difference between probabilistic and deterministic encryption but I did not think of it at the time I read the PDF and posted the comment. »[Our] product is NOT deterministic.« instead of »Our encryption algorithm is not deterministic.« tricked my brain into visualizing their software as a random number generator. So it is good that you point that out, I misinterpreted their statement.
I see no way this could ever work the way they want and still be secure. They have two conflicting requirements - strongly encrypting the data and not breaking the functionality of third party applications operating on this data, that is making the encryption transparent to a (sub)set of operations (not under their control).
It is feasible to strongly encrypt all data but you have to make sure that you do not accidentally implement ECB mode or something similar when using a common block cipher like AES. So you definitely want a unique IV for every piece of data you encrypt. But now you have also broken all server-side functionality because (almost) no useful operation will produce the expected result when operating on encrypted data. Client-side functionality is no problem because it only sees decrypted data.
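The trade-off in the paragraph above can be shown with a few lines of Python. This is a toy stream cipher built from HMAC-SHA256 purely so the example runs on the standard library; it is unauthenticated, it is not AES, and the key and names are made up. The point is only the behaviour of the IV.

```python
import hashlib
import hmac
import secrets

KEY = b"demo key, not a real secret"


def _keystream(nonce, length):
    # Toy keystream: HMAC-SHA256(key, nonce || counter) blocks. Illustration only.
    out = b""
    counter = 0
    while len(out) < length:
        block = hmac.new(KEY, nonce + counter.to_bytes(4, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return out[:length]


def encrypt(plaintext, nonce=None):
    # A fresh random nonce per message plays the role of the unique IV.
    nonce = secrets.token_bytes(16) if nonce is None else nonce
    stream = _keystream(nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def decrypt(blob):
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


fixed = b"\x00" * 16
# Reused IV: equal plaintexts give equal ciphertexts, so the server can match
# them, and so can anyone else looking at the stored data.
same_iv_matches = encrypt(b"Missouri", fixed) == encrypt(b"Missouri", fixed)
# Fresh IV: the ciphertexts differ, so a server-side equality search finds nothing.
fresh_iv_matches = encrypt(b"Missouri") == encrypt(b"Missouri")
```

With the fixed IV the comparison is true and with fresh IVs it is false, which is the whole conflict in miniature: the property that lets the server search is the same property that leaks.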
Therefore they (have to) make compromises. Actually the user has to make the compromise - keep some data unencrypted or lose the server-side functionality. This is most prominent in the demos with numeric data that needs to be aggregated, averaged and what not. Actually it would not be too easy to encrypt this numeric data because you have to preserve the format, including limits and disallowed values, or otherwise the server would reject some values.
What about the infamous text fields? They are probably the easiest to encrypt but you still have to be careful not to break validation rules, for example by making the encrypted text much longer or upsetting an e-mail regular expression (but I bet most applications perform only client-side validation). But this again makes the third-party application a lot less useful because you lose the ability to search in your textual data. The problem to solve is the following one (with some minor details ignored).
I - not being a cryptography expert - can not think of a way to get this working without leaking information and CipherCloud's solution as discussed on Stack Exchange definitely leaks a lot of information. This is really a very tough problem. (Probably) not even homomorphic encryption would help because you have no control over the comparison method - it is plain old substring search, maybe case insensitive, and that's it. It is solvable using private information retrieval in the relaxed case when you have control over the comparison operation but with substring search it is probably too hard (if you want to keep the cipher text length similar to the plain text length).
Due to a malicious denial of service attack, likely by our competitors, in which they leverage forum-promulgated references, which through human-mediated multiple hyper-text transfer protocol requests to our blog hosting infrastructure, attempt to access proprietary information about our strategic positioning vis-à-vis current nonpositive media attention,
we are unable to provide this patent pending document at this time.
Our legal department will be shortly dispatching a DMCA infringement notice to all parties "mirroring" our content as a sign of our ongoing commitment to protecting our valuable intellectual properties against these thieves and scoundrels.
If you should encounter any further difficulties with our information dissemination services, please sign the following non-disclosure agreement and affix the supplied Fedex label to your firstborn. We aim to respond to all communications within 6 working months, as part of our Quality Commitment Assurance.
 Whilst blood is preferred, red ink will suffice.
"All of our customers, that I know of, have selected our solution as the recognized standard for cloud information protection after a thorough evaluation, testing, and scrutiny of our product’s design and implementation by their cryptographers and key management experts."
If you had to guess, how many of CipherCloud's customers do you think keep cryptographers on staff?
Basically, someone called out CipherCloud on apparently bogus claims about what they provide (homomorphic encryption). CipherCloud responded with DMCA takedown notices. Now they are trying to explain their actions, with a lot of "trust us, we are only hiding the crypto details because we need to maintain a competitive advantage!"
One correction: I don't think CipherCloud have actually claimed to do homomorphic encryption - at least I can't find any such statement on their web site. That was an assumption on the part of the StackExchange user.
It would be hard for them to make a security guarantee that isn't bogus, if the screenshots from their demo are an accurate representation of their technology, but they don't appear to have made this particular fraudulent statement.