Hi, Ben from DigitalOcean here - just to give you guys an update. This method will no longer work on a newly created droplet.
We've now defaulted scrub_data to ON for both the web interface and the API as we look at making this process permanent. Additionally, we've re-engineered the way we provision disks, and access to previously written data is no longer possible.
For now, we've taken every step in favor of security, and we will build a permanent solution that favors security and caution moving forward.
I am a bit surprised that I have heard absolutely nothing about this via email (I'm a customer). From what I remember of the last security incident, I did not get an email until well after it had become publicly known.
You guys really need to get better at communicating with your customers. I shouldn't be able to look at the front page of HN one day, see some issue with your services and DO people commenting on it, and still have no mail in my inbox.
The priority should be to alert customers there is a problem, and most importantly to fix the problem.
And sending a mail a few days or a week later is really not okay; a rapid notification on your end is needed if we are going to be able to take the necessary precautions quickly.
So, this is going to overwrite lots of data on the block device you're trying to recover data from, resulting in a lot of repeated information and erasure of recoverable stuff. The correct approach is to redirect the output: have find.sh emit gzipped data so you can pipe it to your local disk, never touching the remote end.
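A minimal sketch of that idea, assuming the droplet is reachable over SSH as root and the block device is /dev/vda (both are assumptions, adjust to your setup):

    # Run dd and gzip on the droplet but write nothing there: the compressed
    # stream goes straight over SSH and is decompressed and searched locally.
    ssh root@droplet.example.com 'dd if=/dev/vda bs=1M | gzip -c' \
      | gunzip -c | strings -n 100 > out.txt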
Hey, congrats on putting something out there, and for learning something new. Keep it up!
At one point in my career, I did electronic discovery and electronic forensics for a bankruptcy trustee as part of their fraud investigation process. One of the central principles is that you want to take every step you can to prevent changes to the examination target. The best way to do this is to mount the volume read-only from a boot CD or separate partition from the examination target.
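A rough sketch of that principle (the device name and mount point are assumptions): attach the volume under examination read-only so the examination itself can't alter it.

    # Mount the target volume read-only; 'noload' also skips ext3/ext4 journal
    # replay, which would otherwise write to the device at mount time.
    mkdir -p /mnt/evidence
    mount -o ro,noload /dev/vdb1 /mnt/evidence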
The Digital Ocean documentation says that you can request that your Droplet be switched to a recovery ISO, but you have to make the request through support. I don't know if this is a technical limitation or a policy decision, but it ends up being a good mitigation against mass examination of data using these techniques. You can see more details about how to request this under the section "Attempt Recovery with a Recovery ISO":
This is a cute PoC of how easy it is, but with freely available forensic tools like, say, PhotoRec, it is possible to extract much more meaningful and diverse data (entire files, images, database files...) than by simply running strings.
So, don't take this as the maximum damage one could do.
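For example (a hedged sketch; the image path and output directory are assumptions), PhotoRec can carve whole files out of a raw image taken with dd:

    # Take a raw image first, then point PhotoRec at it; recovered files land
    # under the 'recovered' directory (PhotoRec walks you through its menus).
    dd if=/dev/vda bs=1M of=vda.img
    photorec /d recovered/ vda.img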
How long does dd take? The script could use an estimate. I ran dd for around 10 minutes this morning, got 500,000 lines, and it was still running.
Update: it finished in around 12 minutes. out.txt is around 10 GB.
Update: out.txt is around 54 million lines according to wc -l out.txt. I'm using less with the [line number]G command to poke around. I have an NYC1 droplet, and there's a lot of junk that isn't mine: text in other languages, and Python, which I don't use.
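A couple of quick ways to sift through a file that size without opening the whole thing (the search patterns here are just examples):

    wc -l out.txt                             # count recovered lines
    grep -c "BEGIN RSA PRIVATE KEY" out.txt   # any leaked private keys?
    grep -n -i "password" out.txt | head      # first few password-looking hits
    less out.txt                              # then type e.g. 1000000g to jump to that line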
I am planning to add an estimate. It really depends on the size of the image, but it takes roughly 10 minutes. I believe that to show progress you need a second process running alongside dd, because of how dd works.
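For what it's worth, GNU dd can report progress on its own, and a second shell can also ask a running dd for stats, so a rough estimate is possible without restructuring find.sh (a sketch, assuming GNU coreutils on Linux):

    # Option 1: have dd print transfer stats as it runs (coreutils >= 8.24)
    dd if=/dev/vda bs=1M status=progress | strings -n 100 > out.txt

    # Option 2: from a second shell, send SIGUSR1 to the running dd so it
    # prints how many bytes it has copied so far
    kill -USR1 "$(pgrep -x dd)"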
No problem. The command is "dd if=/dev/vda bs=1M | strings -n 100 > out.txt" in find.sh, which is the same as the one first mentioned this morning: https://github.com/fog/fog/issues/2525. This was on a $20/month droplet.
Thanks! I wonder how long it would take for someone to scrub their VM manually using dd before terminating it? Maybe the same length of time? 12 minutes seems a pretty reasonable amount of time, but since it's an SSD the writes could take >3x longer than the reads.
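A hedged sketch of what that manual scrub might look like (this zeroes the droplet's own virtual disk, so the VM is unusable afterwards and should be destroyed immediately):

    # Overwrite the entire virtual disk with zeros before destroying the droplet
    dd if=/dev/zero of=/dev/vda bs=1M oflag=direct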
That's true, but if you've already rm'd a bunch of sensitive files, their data unfortunately can't be shredded. So you'll have to make sure you've always used shred for everything since the beginning, which is good practice but probably rare.
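For example, with GNU coreutils, shred can overwrite a file in place before unlinking it (the filename here is just a placeholder):

    # Overwrite with 3 passes, add a final zeroing pass, then remove the file
    shred -u -z -n 3 secrets.db

Keep in mind that on SSDs and on journaling or copy-on-write filesystems shred's guarantees are weaker, since the overwrites may not land on the same physical blocks as the original data.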
As a user of Digital Ocean (amongst others) I find it hard to get too excited about this. When I destroy a droplet (VM) I already have the option to scrub the discs before deletion.
If I choose not to use that (and I never have on any of the hundreds of machines I've created and later torn down) it's because there is nothing of any sensitivity on them. If someone wants to resurrect gigabytes of entirely boring and transient log data from what I was last doing, they're welcome to!
I can only really see this being a concern for people who were storing sensitive information on a cloud instance which they then removed and chose NOT to scrub. In which case, they already have larger issues than this one. "Problem with user, not with cloud."
If you're in the business of designing systems for users, I hope you'll reconsider your viewpoint. I'm a Rails developer, and the attitude you've adopted is very familiar to me. It's a kind of "trust your users" attitude that was prominent in the Rails community for a long time. Unfortunately, that led to several very, very ugly security issues.
What you're saying is that insecure defaults are OK, so long as they're obvious. The problem is that things are rarely obvious to 100% of the people, 100% of the time. A company the size of Digital Ocean has enough customers that if even 5% of their customers misinterpreted this option, a "significant" number of people would be affected.
Consider, for example, a user who sees the destroy page and assumes that "scrub" data is some extra, optional precaution, because "destroy" can't possibly mean "erased but recoverable". I mean, it says destroy, right?
Just a few days ago, someone posted a link to a blog post titled Toyota Manufacturing Principles. One of the principles mentioned in that post was poka-yoke[1], otherwise known as mistake-proofing. A tenet so important that Toyota -- one of the most successful industrial companies in the world -- has made it a core principle.
One of my business partners has a favorite catch phrase: it's never a problem until it becomes a problem. It's his way of pointing out that just because something hasn't happened to you yet doesn't mean it won't be a problem if it does. Speaking as someone who has made their fair share of mistakes, I'd caution you to consider that advice carefully.
Very fair point, I do see what you're saying. I had a conversation just recently about some software that a third party had developed for us, which in one use case would basically present the users with a question saying, in effect, "Do you want to reflect the change you've just made in all places where it matters, or leave it wrong in some of them?" - complete with option buttons Yes, No (!), or "Always No" (!!!)
As I said at the time, "that's a stupid option and a user should never be given that choice". So, I see where you're coming from, your point is well made. :)
This is a case where only "aggressive full disclosure" got a company to respond. Which is why I'm generally only willing to go through "responsible disclosure" for companies that have shown themselves to be reasonable in the past, or in exceptional cases where the vulnerability is impossible for end users to mitigate and/or causes exceptionally grave harm.
In what circumstances will this work? Are you recovering data from other customers? If so, will this work even if the other customer has deleted their VM using the recommended procedure?
>>Our first and immediate update is to ensure that a clean system is provided during creates, regardless of what method was taken for initiating a destroy. Engineers are updating the code base right now to ensure that will be the default behavior, and we will provide another notice when that code is live.
The scrub feature will remain, allowing customers to take an extra level of precaution if they choose to scrub the data after the delete.
I don't think everyone will be happy until you make scrubbing the default. Or, I guess, until those direct reads with the "dd" command stop yielding data from previous VM instances, which it does kinda sound like DO is preparing to do. If scrubbing isn't going to be the default, I'm kinda curious what DO will be doing to ensure clean VMs.
Well that just made my brain explode. This is going to make it that much harder to argue that public clouds take security seriously. I'd love to see an example of what data they think is fine to leak, since that seems to be their performance strategy.
Indeed. The non-scrub option simply should not exist. Are there use cases for non-scrub? Yes. Are the risks worth it? No, at least in my opinion.
Forget to check that box? Oh well, better hope the next droplet doesn't go and read your data.
Moderately competent developer doesn't realize the implications of not checking that box? Oh well, better hope that developer didn't have too much sensitive data on the droplet.
Etc etc. Security is the big area where the default should be to err on the side of caution - often removing choices that are simply too dangerous (when, for example, the tradeoff is a tiny amount of performance gain).
I say this all as someone who likes and is a customer of DO. I am disappointed.
So is it that they just didn't want to take the time to do it, or they were afraid of shortening the life-span of their SSDs by cleaning them after use?
Typical combination of: ease of implementation, desire to minimize overhead of security, and the perceived difficulty and low utility of the attack vector.
There are fundamental tradeoffs that happen when you take the VPS / cloud hosting route, and security is definitely one of them. There are reasons why Amazon, for example, doesn't just casually mix in their own services into AWS instances. Security is still a hard problem.