Hi, Ben from DigitalOcean here - just to give you guys an update. This method will no longer work on a newly created droplet.
We've now defaulted scrub_data to ON for both the web interface and the API as we look at making this process permanent. Additionally, we've re-engineered the way we provision disks, and access to previously written data is no longer possible.
For now, we've taken every step in favor of security, and we will build a permanent solution that favors security and caution moving forward.
I am a bit surprised that I have heard absolutely nothing about this via email (I'm a customer). From what I remember of the last security incident, I did not get an email until well after it had become publicly known.
You guys really need to get better at communicating with your customers. I shouldn't be able to look at the front page of HN one day, see some issue with your services and DO people commenting on it, and still have no mail in my inbox.
The priority should be to alert customers there is a problem, and most importantly to fix the problem.
And sending a mail a few days or a week later is really not okay; a rapid notification on your end is needed if we are going to be able to take the necessary precautions quickly.
So, this is going to overwrite lots of data on the block device you're trying to recover data from, resulting in a lot of repeated information and erasure of recoverable stuff. The correct approach is to redirect the output: have find.sh emit gzipped data so you can pipe it to your local disk, never touching the remote end.
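A minimal sketch of that idea, assuming the droplet is reachable over SSH as root and the block device is /dev/vda (both are assumptions, adjust to your setup):

    # Run dd and gzip on the droplet but write nothing there: the compressed
    # stream goes straight over SSH and is decompressed and searched locally.
    ssh root@droplet.example.com 'dd if=/dev/vda bs=1M | gzip -c' \
      | gunzip -c | strings -n 100 > out.txt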
Hey, congrats on putting something out there, and for learning something new. Keep it up!
At one point in my career, I did electronic discovery and electronic forensics for a bankruptcy trustee as part of their fraud investigation process. One of the central principles is that you want to take every step you can to prevent changes to the examination target. The best way to do this is to mount the volume read-only from a boot CD or separate partition from the examination target.
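A rough sketch of that principle (the device name and mount point are assumptions): attach the volume under examination read-only so the examination itself can't alter it.

    # Mount the target volume read-only; 'noload' also skips ext3/ext4 journal
    # replay, which would otherwise write to the device at mount time.
    mkdir -p /mnt/evidence
    mount -o ro,noload /dev/vdb1 /mnt/evidence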
The Digital Ocean documentation says that you can request that your Droplet be switched to a recovery ISO, but you have to make the request through support. I don't know if this is a technical limitation or a policy decision, but it ends up being a good mitigation against mass examination of data using these techniques. You can see more details about how to request this under the section "Attempt Recovery with a Recovery ISO":
This is a cute PoC of how easy it is, but with freely available forensic tools like, say, PhotoRec, it is possible to extract much more meaningful and diverse data (entire files, images, database files...) than by simply running strings.
So, don't take this as the maximum damage one could do.
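For example (a hedged sketch; the image path and output directory are assumptions), PhotoRec can carve whole files out of a raw image taken with dd:

    # Take a raw image first, then point PhotoRec at it; recovered files land
    # under the 'recovered' directory (PhotoRec walks you through its menus).
    dd if=/dev/vda bs=1M of=vda.img
    photorec /d recovered/ vda.img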
How long does dd take? The script could use an estimate. I ran dd for around 10 minutes this morning, got 500,000 lines, and it was still running.
Update: it finished in around 12 minutes. out.txt is around 10 GB.
Update: out.txt is around 54 million lines according to wc -l out.txt. I'm using less with the [line number]G command to poke around. I have an NYC1 droplet, and there's a lot of junk that isn't mine: text in other languages, and Python, which I don't use.
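A couple of quick ways to sift through a file that size without opening the whole thing (the search patterns here are just examples):

    wc -l out.txt                             # count recovered lines
    grep -c "BEGIN RSA PRIVATE KEY" out.txt   # any leaked private keys?
    grep -n -i "password" out.txt | head      # first few password-looking hits
    less out.txt                              # then type e.g. 1000000g to jump to that line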
I am planning to add an estimate. It really depends on the size of the image, but it takes roughly 10 minutes. I believe that to show progress you need a second process running alongside dd, because of how dd works.
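For what it's worth, GNU dd can report progress on its own, and a second shell can also ask a running dd for stats, so a rough estimate is possible without restructuring find.sh (a sketch, assuming GNU coreutils on Linux):

    # Option 1: have dd print transfer stats as it runs (coreutils >= 8.24)
    dd if=/dev/vda bs=1M status=progress | strings -n 100 > out.txt

    # Option 2: from a second shell, send SIGUSR1 to the running dd so it
    # prints how many bytes it has copied so far
    kill -USR1 "$(pgrep -x dd)"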
No problem. The command is "dd if=/dev/vda bs=1M | strings -n 100 > out.txt" in find.sh, which is the same as the one first mentioned this morning: https://github.com/fog/fog/issues/2525. This was on a $20/month droplet.
Thanks! I wonder how long it would take for someone to scrub their VM manually using dd before terminating it? Maybe the same length of time? 12 minutes seems a pretty reasonable amount of time, but since it's an SSD the writes could take >3x longer than the reads.
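A hedged sketch of what that manual scrub might look like (this zeroes the droplet's own virtual disk, so the VM is unusable afterwards and should be destroyed immediately):

    # Overwrite the entire virtual disk with zeros before destroying the droplet
    dd if=/dev/zero of=/dev/vda bs=1M oflag=direct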
That's true, but if you've already rm'd a bunch of sensitive files, their data unfortunately can't be shredded. So you'll have to make sure you've always used shred for everything since the beginning, which is good practice but probably rare.
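For example, with GNU coreutils, shred can overwrite a file in place before unlinking it (the filename here is just a placeholder):

    # Overwrite with 3 passes, add a final zeroing pass, then remove the file
    shred -u -z -n 3 secrets.db

Keep in mind that on SSDs and on journaling or copy-on-write filesystems shred's guarantees are weaker, since the overwrites may not land on the same physical blocks as the original data.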
As a user of Digital Ocean (amongst others) I find it hard to get too excited about this. When I destroy a droplet (VM) I already have the option to scrub the discs before deletion.
If I choose not to use that (and I never have on any of the hundreds of machines I've created and later torn down) it's because there is nothing of any sensitivity on them. If someone wants to resurrect gigabytes of entirely boring and transient log data from what I was last doing, they're welcome to!
I can only really see this being a concern for people who were storing sensitive information on a cloud instance which they then removed and chose NOT to scrub. In which case, they already have larger issues than this one. "Problem with user, not with cloud."
If you're in the business of designing systems for users, I hope you'll reconsider your viewpoint. I'm a Rails developer, and the attitude you've adopted is very familiar to me. It's a kind of "trust your users" attitude that was prominent in the Rails community for a long time. Unfortunately, that led to several very, very ugly security issues.
What you're saying is that insecure defaults are OK, so long as they're obvious. The problem is that things are rarely obvious to 100% of the people, 100% of the time. A company the size of Digital Ocean has enough customers that if even 5% of their customers misinterpreted this option, a "significant" number of people would be affected.
Consider, for example, a user who sees the destroy page and assumes that "scrub" data is some extra, optional precaution, because "destroy" can't possibly mean "erased but recoverable". I mean, it says destroy, right?
Just a few days ago, someone posted a link to a blog post titled Toyota Manufacturing Principles. One of the principles mentioned in that post was poka-yoke[1], otherwise known as mistake-proofing. A tenet so important that Toyota -- one of the most successful industrial companies in the world -- has made it a core principle.
One of my business partners has a favorite catch phrase: it's never a problem until it becomes a problem. It's his way of pointing out that just because something hasn't happened to you yet doesn't mean it won't be a problem if it does. Speaking as someone who has made their fair share of mistakes, I'd caution you to consider that advice carefully.
Very fair point, I do see what you're saying. I had a conversation just recently about some software that a third party had developed for us, which in one use case would basically present the users with a question saying, in effect, "Do you want to reflect the change you've just made in all places where it matters, or leave it wrong in some of them?" - complete with option buttons Yes, No (!), or "Always No" (!!!)
As I said at the time, "that's a stupid option and a user should never be given that choice". So, I see where you're coming from, your point is well made. :)
This is a case where only "aggressive full disclosure" got a company to respond. Which is why I'm generally only willing to go through "responsible disclosure" for companies that have shown themselves to be reasonable in the past, or in exceptional cases where the vulnerability is impossible for end users to mitigate and/or causes exceptionally grave harm.
In what circumstances will this work? Are you recovering data from other customers? If so, will this work even if the other customer has deleted their VM using the recommended procedure?
>>Our first and immediate update is to ensure that a clean system is provided during creates, regardless of what method was taken for initiating a destroy. Engineers are updating the code base right now to ensure that will be the default behavior, and we will provide another notice when that code is live.
The scrub feature will remain, allowing customers to take an extra level of precaution if they choose to scrub the data after the delete.
I don't think everyone will be happy until you make scrubbing the default. Or, I guess, until those direct reads with the "dd" command stop yielding data from previous VM instances, which it does kinda sound like DO is preparing to do. If scrubbing isn't going to be the default, I'm kinda curious what DO will be doing to ensure clean VMs.
Well that just made my brain explode. This is going to make it that much harder to argue that public clouds take security seriously. I'd love to see an example of what data they think is fine to leak, since that seems to be their performance strategy.
Indeed. The non-scrub option simply should not exist. Are there use cases for non-scrub? Yes. Are the risks worth it? No, at least in my opinion.
Forget to check that box? Oh well, better hope the next droplet doesn't go and read your data.
Moderately competent developer doesn't realize the implications of not checking that box? Oh well, better hope that developer didn't have too much sensitive data on the droplet.
Etc etc. Security is the big area where the default should be to err on the side of caution - often removing choices that are simply too dangerous (when, for example, the tradeoff is a tiny amount of performance gain).
I say this all as someone who likes and is a customer of DO. I am disappointed.
So is it that they just didn't want to take the time to do it, or they were afraid of shortening the life-span of their SSDs by cleaning them after use?
Typical combination of: ease of implementation, desire to minimize overhead of security, and the perceived difficulty and low utility of the attack vector.
There are fundamental tradeoffs that happen when you take the VPS / cloud hosting route, and security is definitely one of them. There are reasons why Amazon, for example, doesn't just casually mix in their own services into AWS instances. Security is still a hard problem.