That's one of the first things I did, actually. After dumping the contents of the flash, I went on Amazon and hit 'reorder' on the same SD card I'd bought before. Unfortunately, it was not the same: in the picture, the one on the left is the card I'd purchased this time, and the one on the right is the card I'd destroyed. The deals that low-cost SD card makers get on NAND flash vary greatly from day to day, so they just manufacture with whatever controller and flash combination they can get cheapest on any given day: even the same SKU is unlikely to stay the same internally for very long.
I did also try soldering to the BGA pads on the damaged one, but no joy: I imagine some traces ran backwards across the board before heading to the controller (for instance, to meet the TSOP leads), and when I inserted the SD card into my laptop, there were still no signs of life.
recovery tools for SM2683EN flash controller:
xor formulas and block structure for Transcend card:
The "Update size" and "Update enable" did give me the idea to do what I called 'sector updates'. Do you have any more information on how those work?
I didn't have that 'usbdev.ru' site at the time. That page seems specific to the USB versions, not the SD card (SM2683) parts; unfortunately, I speak very little Russian. Do you have any particular parts I should look at?
Thanks so much for any help you might be able to provide! I'd like to fill in the blanks in my knowledge of these things; in particular, I'd feel a lot more comfortable if I knew how the sector updates worked...
The one you destroyed has a single Samsung 128Gbit TLC flash; the one you bought has a pair of Micron 64Gbit MLC. I'd say the latter is almost certainly better from a reliability perspective, and probably even cost more to manufacture.
(edit: Googling for NAND flash spot pricing turns up http://www.dramexchange.com, which seems to confirm those sorts of suspicions. I think the market is probably pretty volatile...)
In that vein, the 64Gbit Micron devices may in fact be 128Gbit die with half-dead arrays -- so they may have a similar process node and reliability to the Samsung device.
The MLC is undoubtedly superior to the TLC, however.
Crazy to think what archaeologists may have to deal with in a thousand years. Or (a little more sci-fi) with finds on other planets.
On a side note, I would suggest posting this over at Hackaday; the marketers-pretending-to-be-hackers crowd here won't appreciate it.
I'll consider sending this to Hackaday, too -- thanks for the reminder. That said, I've found that the HackerNews audience is pretty diverse in interest; you might be surprised by what surfaces between the startup fever...
There's also a very interesting article about reverse-engineering the microcontroller used inside: http://www.bunniestudios.com/blog/?p=3554
Data randomization seeks to mitigate this issue by normalizing the distribution of states across the page. Having a single XOR key wouldn't do a very good job for the reasons you noted. When I worked on flash, we used elements of the address to seed a PRNG for data randomizing. So the XOR key varies across the entire device.
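To make that concrete, here's a minimal sketch of address-seeded scrambling in C. The PRNG (xorshift32) and the seed mix are hypothetical stand-ins chosen for illustration; real controllers use vendor-specific LFSR polynomials, but the shape is the same: derive a keystream from the physical address and XOR it over the page.

    #include <stdint.h>
    #include <stddef.h>

    /* xorshift32: a tiny PRNG standing in for the real scrambler logic */
    static uint32_t xorshift32(uint32_t *state) {
        uint32_t x = *state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        return *state = x;
    }

    /* Scramble (or descramble -- XOR is its own inverse) one page in place.
     * Seeding from the physical block/page address means the keystream
     * varies across the device, so no single host data pattern lines up
     * with the worst case everywhere. */
    void scramble_page(uint8_t *buf, size_t len, uint32_t block, uint32_t page) {
        uint32_t state = (block * 2654435761u) ^ (page + 1); /* hypothetical mix */
        if (state == 0) state = 1;        /* xorshift state must be nonzero */
        for (size_t i = 0; i < len; i++)
            buf[i] ^= (uint8_t)xorshift32(&state);
    }

Since the transform is a pure XOR, the write path and the read path call the exact same routine.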
There are other systems in place in flash to further mitigate these issues. All programming is adaptive, using feedback between programming pulses to hit the target. The pages within a block are intelligently ordered so that a programmed cell cannot possibly have all of its neighbors programmed from lowest to highest potential.
But yes, in general, if you had the right data stream, you would be able to slightly degrade the BER, possibly past what the ECC can repair. There are a lot of systems in place though, as NAND is inherently lossy to begin with. These issues are compounded by MLC designs which have tighter margins per cell.
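As a rough illustration of that last point, here's a toy Monte Carlo sketch. All the parameters are hypothetical (a 16 KiB page, an ECC that corrects up to 72 bits per page) and the error model is a naive independent-bit-flip one, but it shows how a modest bump in the raw bit error rate pushes pages past the ECC's limit:

    #include <stdio.h>
    #include <stdlib.h>

    #define PAGE_BITS (16 * 1024 * 8) /* hypothetical 16 KiB page */
    #define ECC_T     72              /* hypothetical correctable bits/page */
    #define TRIALS    1000

    /* Fraction of pages with more raw bit errors than the ECC can fix */
    static double uncorrectable_rate(double ber) {
        int fails = 0;
        for (int t = 0; t < TRIALS; t++) {
            int errs = 0;
            for (int b = 0; b < PAGE_BITS; b++)
                if ((double)rand() / RAND_MAX < ber)
                    errs++;
            if (errs > ECC_T)
                fails++;
        }
        return (double)fails / TRIALS;
    }

    int main(void) {
        /* With these numbers, nudging the BER from 4e-4 to 6e-4 moves the
         * mean error count from ~52 to ~79 against a 72-bit limit -- i.e.
         * from almost always correctable to usually not. */
        for (int i = 4; i <= 7; i++) {
            double ber = i * 1e-4;
            printf("BER %.0e -> uncorrectable page rate %.3f\n",
                   ber, uncorrectable_rate(ber));
        }
        return 0;
    }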
SSDs have yet another layer of system mitigation. I know of at least one manufacturer that disables NAND level randomizing in favor of encrypting every bit of data that is programmed. Some drives have enough redundancy that they can lose an entire flash die without losing data -- as if losing a disk in a raid setup.
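For the encrypt-everything variant, here's a sketch using OpenSSL's EVP API (the key handling and the address-derived IV are illustrative assumptions, not any vendor's actual scheme). AES-CTR keystream is effectively uniform, so the ciphertext is well whitened no matter what the host writes, and deriving the counter from the physical address keeps every page independently addressable:

    #include <openssl/evp.h>
    #include <stdint.h>
    #include <string.h>

    /* Encrypt one page in place with AES-128-CTR; the IV/counter is built
     * from the physical block and page numbers (hypothetical layout). */
    int encrypt_page(uint8_t *buf, int len, const uint8_t key[16],
                     uint64_t block, uint32_t page) {
        uint8_t iv[16] = {0};
        memcpy(iv, &block, sizeof block);
        memcpy(iv + 8, &page, sizeof page);

        EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
        int outl = 0;
        int ok = ctx
              && EVP_EncryptInit_ex(ctx, EVP_aes_128_ctr(), NULL, key, iv)
              && EVP_EncryptUpdate(ctx, buf, &outl, buf, len);
        EVP_CIPHER_CTX_free(ctx);
        return ok ? 0 : -1;
    }

Decryption is the same operation with the same key and IV, since CTR mode just XORs a keystream -- which is also exactly why it doubles as a randomizer.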
You probably shouldn't be storing anything important long-term on a device that programs NAND raw, i.e. flash drives and SD cards. They aren't designed or spec'd for high reliability.
The XOR scheme is extremely cheap (compact) and does not need to operate serially on the data stream (good for performance). The only applications that use the NAND-provided randomizer are the cheapest of controllers. In fact, even the SD controller in the linked article used its own XOR scheme. A system designer can always turn off the built-in randomizer and replace it with whatever method they choose -- they all do, for various reasons. At the controller level it can typically be implemented in higher-performance, more compact logic processes. And it does not need to be duplicated for multichannel devices, as it would if it lived in the NAND.
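That non-serial property is worth spelling out: because the keystream for a page is a pure function of the page's address (no chaining from earlier data), any page on any channel can be (de)scrambled independently and in any order. A self-contained sketch along the same lines as the one above, again with a hypothetical PRNG and seed mix:

    #include <stdint.h>
    #include <stddef.h>

    static uint32_t prng_next(uint32_t *s) {   /* xorshift32 again */
        *s ^= *s << 13; *s ^= *s >> 17; *s ^= *s << 5;
        return *s;
    }

    /* One copy of this logic serves any number of channels: the seed
     * depends only on (chan, block, page), never on other pages' data,
     * so channels can run it concurrently with no shared state. */
    void xor_page(uint8_t *buf, size_t len,
                  uint32_t chan, uint32_t block, uint32_t page) {
        uint32_t s = 0x9E3779B9u ^ (chan << 28) ^ (block << 8) ^ page;
        if (s == 0) s = 1;                 /* keep xorshift state nonzero */
        for (size_t i = 0; i < len; i++)
            buf[i] ^= (uint8_t)prng_next(&s);
    }

Contrast that with a chained scheme (say, CBC-style), where descrambling page N would require processing everything before it.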
...until someone finds a way to exploit it, as has happened with CD's "weak sector" copy protection schemes. It's only a matter of when it will happen, not if.
Only the most primitive SD/flash drive controllers actually use this scheme anyway -- encryption is much better at randomizing.
I agree; it also scares me that certain patterns of data are essentially harder to store than others -- and that the 'solution' is just to make the bad case statistically unlikely, as opposed to using a more robust encoding like RLL that guarantees worst-case behaviour (although it requires more overhead).
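For contrast with the statistical approach, here's a minimal sketch of MFM encoding, a classic RLL(1,3) code from the magnetic-recording world (NAND doesn't use it; this is purely for illustration). It guarantees between one and three 0s between any two 1s in the output stream, so the worst case is bounded by construction, at the cost of two channel bits per data bit:

    #include <stdint.h>

    /* Encode one data byte as 16 MFM channel bits (MSB first). A clock
     * bit is inserted before each data bit, and is 1 only when both the
     * previous and current data bits are 0 -- this is what bounds the
     * run lengths. *prev tracks the last data bit across calls. */
    uint16_t mfm_encode_byte(uint8_t data, int *prev) {
        uint16_t out = 0;
        for (int i = 7; i >= 0; i--) {
            int bit = (data >> i) & 1;
            int clock = (!*prev && !bit);
            out = (out << 2) | (clock << 1) | bit;
            *prev = bit;
        }
        return out;
    }

Even the pathological inputs come out evenly spaced: all-0x00 encodes to 101010... and all-0xFF to 010101..., so there is no host pattern that degrades the channel.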
This problem of certain bit sequences being troublesome has actually been around for a long time -- see http://en.wikipedia.org/wiki/Lace_card for example -- and is one of the reasons for the odd character layout of EBCDIC.
I wrote a lot of the flash object store for the Apple Newton, back in 1992. I've often wondered how many of the things we came up with were later patented by other companies.