
SATA controller corrupts data, writes secondary GPT in the center of the drive - mrb
http://translate.google.com/translate?hl=ru&sl=ru&tl=en&prev=_dd&u=http://avryabov.livejournal.com/5056.html
======
andor
I recently discovered that my Sharkoon USB to SATA dock (old version, USB2)
corrupts data. It happens under both Linux and Windows and seems to only
depend on the disk used. All of the newer HDDs I tried are affected: for every
test file larger than 256 MiB (possibly less, I don't remember exactly), the
checksums were wrong.

Disk corruption can be hard to detect under Linux because of the disk cache.
Remember to drop it before testing backups.
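
On Linux, one way is to sync and then write to /proc/sys/vm/drop_caches (as
root) so your verification reads actually hit the disk. A minimal sketch of
that, in C:

    /* Drop the Linux page cache so verification reads hit the disk
     * instead of being served from memory. Run as root. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        sync();  /* flush dirty pages first; drop_caches only frees clean ones */
        FILE *f = fopen("/proc/sys/vm/drop_caches", "w");
        if (!f) { perror("drop_caches"); return 1; }
        fputs("3\n", f);  /* 3 = drop page cache plus dentries and inodes */
        fclose(f);
        return 0;
    }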

~~~
voltagex_
You wouldn't happen to know which USB to SATA controller it is, would you?
(device ID under Linux might help)

Sharkoon isn't the only one that uses those controllers (Astone might also)
and I keep some of them around for testing purposes.

~~~
rwg
Before I figured out it was garbage, I bought one of the USB<->SATA+IDE dongle
things we used at my old job for use at home:

    USB to ATA/ATAPI Bridge:

      Product ID: 0x2338
      Vendor ID: 0x152d  (JMicron Technology Corp.)
      Version: 1.00
      Serial Number: 152D203380B6
      Speed: Up to 480 Mb/sec
      Manufacturer: JMicron
      Location ID: 0x24112000 / 6
      Current Available (mA): 500
      Current Required (mA): 2

~~~
wtallis
JMicron. Figures. You ought to put one of their SSDs in one of those adapters
and have a completely broken storage system.

------
kogir
Things like this are why I like filesystems that can reliably detect
corruption, like ZFS.
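
Roughly, the mechanism: ZFS stores a checksum for every block in the block
pointer that references it, away from the data it covers, so a block that
comes back wrong fails verification on read. A toy sketch of the idea (not
ZFS's actual code; a trivial checksum stands in for its fletcher4/sha256):

    #include <stdint.h>
    #include <stdio.h>

    /* trivial stand-in for ZFS's real block checksums */
    static uint32_t checksum(const uint8_t *buf, size_t len) {
        uint32_t sum = 0;
        for (size_t i = 0; i < len; i++)
            sum = sum * 31 + buf[i];
        return sum;
    }

    int main(void) {
        uint8_t block[16] = "important data";
        uint32_t stored = checksum(block, sizeof block);  /* kept elsewhere */

        block[3] ^= 0xFF;  /* simulate silent corruption by bad hardware */

        if (checksum(block, sizeof block) != stored)
            puts("read fails checksum: corruption detected");
        return 0;
    }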

~~~
wayne_h
I don't see how zfs can detect the problem any sooner than any other
filesystem.

If the problem is caused by the controller wrapping at 2tb then I wouldn't
expect zfs to figure it out until later when it tries to read files back and
finds damage.

Let's say zfs writes a file at 2tb, but due to wrapping it's actually written
at 0tb. Then zfs reads the file back at 2tb to verify that it's good. The file
appears to be fine because the read wraps the same way and really comes from
0tb. At some point zfs will detect a problem, but I don't see how it can catch
it instantly.
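
A toy model of why the immediate verify passes (a hypothetical bridge that
truncates sector addresses to 32 bits, not any specific controller's
firmware):

    #include <stdint.h>
    #include <stdio.h>

    /* the bug: upper 32 bits of the sector address silently dropped */
    static uint32_t bridge_lba(uint64_t lba) {
        return (uint32_t)lba;
    }

    int main(void) {
        uint64_t lba = 0x100000000ULL;  /* first sector past the 32-bit limit */

        /* the write lands at sector 0, clobbering the start of the disk... */
        printf("write to sector %#llx lands at sector %u\n",
               (unsigned long long)lba, bridge_lba(lba));

        /* ...but a verifying read wraps the same way, reads back the
         * freshly written data, and the checksum matches */
        printf("verify read of sector %#llx also comes from sector %u\n",
               (unsigned long long)lba, bridge_lba(lba));
        return 0;
    }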

~~~
DiabloD3
The difference is, zfs can detect it. The only other fs in production use that
can is Oracle's zfs clone named btrfs.

~~~
wayne_h
Oh, I see what you're saying ... that zfs can detect corrupted files, true.

What's likely to happen in this case is this: the first blocks of the disk
contain the superblock, labels, descriptors, and other filesystem metadata,
most of which will be sitting in cache. The damaged, overwritten area at 0mb
won't be noticed for some time - like the next time the volume is mounted. The
filesystem will eventually notice the damage and go into some recovery mode or
halt to protect itself. Zfs has lots of redundancy, so the volume labels at
the beginning could be rebuilt.

One benefit of zfs will be that you can tell which recovered files are good
and which are bad.

------
jwatte
If only firmware development legally required developers to pass a national
basic skills exam, all our problems would be solved!

(Actually, the real problem is the race to the bottom because most buyers buy
on price alone.)

------
wayne_h
I am currently working on a data recovery caused by a 2tb-limited bridge
controller.

The customer had an external usb/firewire box with a 3tb drive. At 2tb the
writes 'wrapped' back to block 0. It's a mac filesystem. Everything worked
great until he wrote past 2tb, at which point it overwrote the beginning of
the mac volume. The next time he connected the drive, the mac thought it was a
new raw drive and told him that he needed to initialize it.

This is the second time he's done this - the first time he didn't
reinitialize it and we got it all back.

Unfortunately, this time they initialized the drive. That zeroed out all the
metadata - catalogs, maps, etc. - a much bigger mess.

So some bridge controllers cannot handle drives larger than 2tb. With a
4-byte (32-bit) sector address, the last sector you can address is 0xffffffff,
which at 512 bytes per sector puts the limit at 2tb. Sectors 0-0xffffffff work
fine. The next sector is 0x100000000 - that takes 5 bytes, so the controller
only sees the lower 4 bytes, 0x00000000, and starts overwriting the beginning
of the drive.
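
The arithmetic, as a minimal sketch (assuming 512-byte sectors):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint64_t max_lba  = 0xFFFFFFFFULL;        /* last 4-byte sector address */
        uint64_t capacity = (max_lba + 1) * 512;  /* bytes below the ceiling */
        printf("32-bit LBA ceiling: %llu bytes (2 TiB)\n",
               (unsigned long long)capacity);

        uint64_t next = max_lba + 1;              /* 0x100000000 needs 5 bytes */
        printf("sector %#llx truncated to 4 bytes: sector %u\n",
               (unsigned long long)next, (uint32_t)next);  /* sector 0 */
        return 0;
    }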

NTFS filesystems are more recoverable in this situation because their master
file table starts 6 million sectors out on the drive, so you would have to
write a lot more data before you start losing all your filenames, folder
structures, and pointers to the data.

------
voltagex_
Is it just me, or is firmware quality getting worse these days?

~~~
sliverstorm
Firmware is getting more sophisticated and expansive, so it wouldn't surprise
me. Bug-rate-per-LoC seems to stay pretty stable.

~~~
makomk
Yeah, in the old days this was all done in hardware and you got hardware bugs
like the infamous ones in the CMD640 IDE controller. (I think Apple had their
fair share of data corruption bugs in that era too, actually.)

~~~
yuhong
On the CMD640/RZ1000 issue: if OS/2 2.x had actually replaced DOS/Windows
instead of turning into a fiasco, that hardware would likely not have shipped
with those problems.

------
mehrdada
Sounds like an integer overflow bug in the drive controller firmware.

------
bifrost
huh, interesting. Is this a HW or SW problem? 3tb drives are a bit of a
sticky issue with older controllers, so I wouldn't be surprised if that's the
cause.

