
If you think that you can recover overwritten data, feel free to accept 'The Great Zero Challenge' over at http://16systems.com/zero/

"Q. What is this?

A. A challenge to confirm whether or not a professional, established data recovery firm can recover data from a hard drive that has been overwritten with zeros once. We used the 32 year-old Unix dd command using /dev/zero as input to overwrite the drive. [...]"

It's been over a year, and nobody has accepted the challenge yet. Even if there isn't any prize money to win, I'd think that the PR opportunity would be quite enough for any data recovery firm to do it.

So my conclusion is: overwriting once is plenty good enough. You want to overwrite the whole disk though, otherwise the filesystem might leave metadata clues even after the file has been overwritten and unlinked.
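The single-pass zero wipe the challenge describes can be sketched as below. This is a minimal sketch: `TARGET` is a placeholder, and a scratch file stands in for the real device here so the commands are safe to run (on an actual drive you would point `of=` at the device node, e.g. `/dev/sdX`).

```shell
# One-pass zero wipe, as in the challenge. TARGET is a placeholder;
# a scratch file stands in for a real device such as /dev/sdX.
TARGET=$(mktemp)

# Give the scratch file some "data" to destroy.
printf 'secret filesystem contents' > "$TARGET"

# Single pass of zeros over the whole target -- the same idea as
# running dd if=/dev/zero of=/dev/sdX bs=1M against a real drive.
dd if=/dev/zero of="$TARGET" bs=512 count=1 conv=notrunc 2>/dev/null

# Verify: the target should now contain no nonzero bytes at all.
if tr -d '\0' < "$TARGET" | LC_ALL=C grep -q .; then
  echo "nonzero bytes remain"
else
  echo "fully zeroed"
fi
```

Wiping the whole device this way also destroys the filesystem metadata (directory entries, journal, etc.) that a per-file wipe would leave behind.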




I'd love to take them on. However, they're right that it's highly unlikely the data can be recovered from that drive. They had only a few folders/files on there, it seems, which leaves little to go on to rebuild the image.

It's not worth it, though: who needs the PR? It's worthless when we recover zeroed disks frequently anyway :) If they're willing to meet the cost of recovering it, I'd do it cheap, for £600. We are UK based, though, so I guess they won't.

I don't think you can draw the conclusion that overwriting once is plenty good enough based on their challenge. At least not until:

- Someone has tried their disk (people not wanting to is different from giving it a shot :))

- Someone tries a more real-life example (install an OS, copy some files in, then dd it)

Considering the second one has been done... :)

They are quite welcome to clone any old OS drive onto a disk and wipe it the same way with dd. Then fly it over to us and pay the £1000 (ish) cost to recover it. (Yes, I know that is a bit outrageous, but so is their "challenge".)


"Considering the second one has been done... :)"

It has? Do you have any source/article you can link to? I'd like to read about it and how it's done; it seems to me like you'd need a fair amount of black magic to pull it off!


Well, umm, no article; rather, practical experience :).

As I explained elsewhere, we get wiped disks sent to us weekly. Some will have been blanked with zeros (perhaps one a month). I know of only a few that specifically had dd used on them, but usually we don't know the story behind the disks :) so it could be higher.

We have a SEM that produces an image for our analysts to rebuild with a variety of software packages (EnCase Enterprise is one example, and we have several pieces of kit from AccessData, plus scripts/programs written in house).

With a zeroed disk you're looking at a minimum of £1000 and at least a month's work (most of that time spent on the SEM and on one of our clusters processing the data).


Excuse the ignorance, but what is SEM short for?

Am I near the truth if I say that you analyze lots of residual bits to see how the drive usually manages to overwrite a one, and then use that to get a fuzzy logic version of the contents of the drive?


SEM = Scanning Electron Microscope. Actually, when I say "ours", it is jointly owned by a local university, who house it and use it when we don't need it. It is specialised (or rather adapted) for HDD scans, though.

Yes, that kind of explains the process. It's rather complex and not something I am fully versed in (it not being my field; I process the data), but I will have a shot at explaining. It is possible to analyse the individual bits and predict what a byte was before by seeing what has "moved" (i.e. when you zero a byte or a cluster, it simply moves all the 1 bits to zero).

The reason 3 passes defeats it (mostly) is that it deliberately makes sure every bit is moved at least once (for example, by writing first FF and then 00), followed by a final zeroing pass. Because you write the inverse of the first pass on the second run, every bit is guaranteed to be flipped. Then, when you write the final zeros, anything that can be reconstructed is just the garbage from the second pass.
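The pass sequence described above (a pattern pass, its inverse, then a final zeroing pass) can be sketched like this. Again `TARGET` is a placeholder, demonstrated on a scratch file rather than a real device, and this is an illustration of the scheme as the commenter describes it, not their actual tooling:

```shell
# Three-pass overwrite as described: pattern, inverse, final zeros.
# TARGET is a placeholder; a scratch file stands in for the real device.
TARGET=$(mktemp)
printf 'data to be destroyed' > "$TARGET"
SIZE=$(wc -c < "$TARGET" | tr -d ' ')

# Pass 1: all ones (0xFF) -- forces every bit high.
tr '\0' '\377' < /dev/zero |
  dd of="$TARGET" bs="$SIZE" count=1 conv=notrunc 2>/dev/null

# Pass 2: the inverse (0x00) -- forces every bit low, so by now
# every bit on the target has been flipped at least once.
dd if=/dev/zero of="$TARGET" bs="$SIZE" count=1 conv=notrunc 2>/dev/null

# Pass 3: final zeroing pass, leaving the target blank.
dd if=/dev/zero of="$TARGET" bs="$SIZE" count=1 conv=notrunc 2>/dev/null
```

Because pass 2 is the bitwise inverse of pass 1, any residual signal after the final zeros reflects the intermediate pattern rather than the original data.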

Anyway, a 120GB disk will produce about 1TB of statistical data from the SEM process, which we can analyse. Once you get a handle on a few "known" files (like the OS ones), you can begin to rebuild unknown portions based on that data. Keyword recognition and file signatures help identify when we successfully recover something.
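The file-signature check mentioned above (confirming that a recovered region is a real file by its leading "magic bytes") can be sketched as follows. This is a minimal illustration, not the firm's actual software; the two signatures shown (gzip and PNG) are just examples:

```shell
# Minimal sketch of file-signature ("magic byte") recognition -- the kind
# of check used to confirm that a recovered region is a real file.
# The signatures below (gzip, PNG) are illustrative examples only.
identify_signature() {
  # Render the first four bytes of the file as lowercase hex.
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  case "$magic" in
    1f8b*)    echo "gzip" ;;     # gzip streams start with 0x1f 0x8b
    89504e47) echo "png"  ;;     # PNG files start with 0x89 'P' 'N' 'G'
    *)        echo "unknown" ;;
  esac
}

# Demonstration on a synthetic file carrying the PNG signature.
printf '\211PNG\r\n\032\n' > sample.bin
identify_signature sample.bin   # prints "png"
rm -f sample.bin
```

Keyword searches work the same way at a higher level: a hit on a known string or signature tells the analyst that a reconstructed region decoded correctly.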

You are talking about a week's processing on a 25-node cluster (100 cores).


Interesting. Are the known files required to be able to find the data, and how big do they have to be?

I'm more or less wondering if a RAIDed system (with, say, 64 KB chunks of data) would make it impossible to recover the data.


No, not needed, but they shortcut the process because the software can get a handle on the data it is given a little better. It just cuts the processing down a bit (we've never tested it without that kind of searching, so I couldn't say by how much).

I suspect that a RAID would foil it. For a start, we would need to program in the facility to rebuild the RAID (and analyse based on chunks). I doubt it would work out.

We do quote a price for SEM RAID recovery, but it is in the tens of thousands, a.k.a. no thanks :D


I was going to say that my opinion had shifted - except that ErrantX's reply seems reasonable. A $500 prize with relatively little publicity seems unreasonable for an operation that putatively takes a specialized SEM and a week of clustered computer time. A $50,000 prize (possibly paid for by insurance) would be more to the point, if they're really confident; and if the recoverers need a standard OS or other recognizable standard files in order to decode the mechanical operation of the HD, why not make that part of the challenge too?



