After running `badblocks`, run `smartctl -t long $DEVICE`, which is a "long" self-test. This will take upwards of a day to complete, depending on the drive.
After that completes, check the status by running `smartctl --all $DEVICE`, and verify that there are only zeroes in the RAW_VALUE column for Reallocated_Sector_Ct, Used_Rsvd_Blk_Cnt_Tot, and the other reallocation and pending-sector counters.
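For reference, a minimal sketch of that sequence (the device path is a placeholder, and the exact attribute names vary by vendor):

DEVICE=/dev/sdX                      # placeholder -- the drive under test
smartctl -t long "$DEVICE"           # kick off the long self-test
smartctl -c "$DEVICE"                # shows the recommended polling time for the test
# ...once the test has finished:
smartctl --all "$DEVICE" | grep -E 'Reallocated|Rsvd_Blk|Pending'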
Pro tip: Many, many drives sold on Amazon by large, "reputable" suppliers are not new in the way you think they are - they are "new pulls" from systems integrators that received them as part of the complete PC and then pulled them, en masse, to sell in bulk.
Those "new pulls" may have hundreds and hundreds of hours on them ...
We (rsync.net) have started trending towards B&H Photo for larger batches of drives ... they pack them cluefully and there is no "new pull" monkey-business ... unfortunately they don't always have the quantity on-hand that we require, but that's not a knock against them ...
Yeah, that's why amazon is a hard no for hard drive buys for me. I don't mind when a lot of things are bouncing around in an oversized shipping box with insufficient padding, or in a box with cat litter, but a hard drive should be shipped responsibly, and I can't reasonably expect that from amazon.
I could imagine a firm that ordered 1000 units and did (or contracted out for) the same upgrades, now has 1000 unneeded 500-gig hard drives they can unload cheap.
Rip out the HDDs, RAM DIMMs, etc. and resell as "new" for extra $$$.
At moderate quantities, Intel and the SSD makers do backend deals that give rebate credits to a PC OEM.
When I bought lots of computers, we would usually get -1 to 3% margins.
I have taken to using a 'burnin' script to wrap smartctl tests before and after in order to have relatively high confidence that the drive will not fail soon.
The script I used can be found on Github here
Agree this could be discussed in a bit more detail, but it is actually there.
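Not that script, but a rough sketch of the same idea, with placeholder paths (badblocks -w is destructive, so only run it on drives with nothing on them):

#!/bin/sh
# burn-in sketch: SMART snapshot, destructive write test, long self-test, then diff
DEV=${1:?usage: burnin.sh /dev/sdX}

smartctl --all "$DEV" > smart-before.txt
badblocks -wsv "$DEV"                      # four-pattern destructive write test
smartctl -t long "$DEV"                    # long self-test runs in the background
while smartctl -a "$DEV" | grep -q 'in progress'; do sleep 600; done   # crude wait
smartctl --all "$DEV" > smart-after.txt
diff smart-before.txt smart-after.txt      # watch the reallocated/pending counters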
I started doing this after reading comments online (probably here) about fraudulent cards and drives circulating on Amazon. Some comments mentioned that there are even devices whose firmware is set up so that writes don't fail even when you've written more than the device can actually hold. They supposedly manage this by reusing previously used blocks for new files. In other words, each physical block has multiple addresses, and the addresses presumably loop around the same blocks until the fake size is reached. That gives the most realistic appearance of being the reported size while silently corrupting previously written data.
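If you ever want to see that wrap-around behaviour directly rather than through a tool, a crude (and destructive) sketch along these lines would show it -- the device path and claimed size here are placeholders:

DEV=/dev/sdX            # placeholder -- the suspect card/stick; its data will be destroyed
CLAIMED_MB=61440        # placeholder -- whatever capacity the device advertises
# plant a recognizable marker in the very first sector
printf 'CANARY' | dd of="$DEV" bs=512 count=1 conv=sync,notrunc
# write a chunk of data near the *claimed* end of the device
dd if=/dev/urandom of="$DEV" bs=1M count=64 seek=$((CLAIMED_MB - 64)) oflag=direct
# re-read the first sector; on a capacity-faking device the marker may now be gone
dd if="$DEV" bs=512 count=1 iflag=direct 2>/dev/null | strings | head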
In any case, so far I've only gotten 4 cards/drives since then, and I haven't found them to be frauds.
I've since wondered how much I'm subtracting from their product lifespan with my testing. I mean, I imagine that microSD cards and USB flash drives do wear leveling on unused blocks, but since they lack TRIM support, after testing them I've probably crippled the device's ability to do wear leveling: every block is now occupied, and I have no way to tell the device that they're free again.
As far as I understand, the problem with lacking TRIM support on USB flash drives is not that there's no protocol for it through USB, since there are USB-SATA adapters that specify support for UASP. Admittedly, though, I'm assuming that UASP is enough to get TRIM support. I don't remember testing that.
In any case, I wonder if this new ritual I'm doing is really worth it, and if it's not, when would it be?
EDIT: I wonder if it's realistic to expect that one day we'll get USB flash drives and microSD cards with TRIM support. I don't expect the need to check for fraudulent storage devices to go away.
The reason to write random data is that otherwise an adversary inspecting the raw device can determine how much you really stored in the encrypted volume: the chance of a block being full of zeroes _after_ encryption is negligible, so such blocks are invariably just untouched.
The encryption will ensure that random bits and encrypted data are indistinguishable in practice (in theory this isn't a requirement of notions like IND-CPA; it just happens to be what you get anyway).
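For what it's worth, the usual trick for getting that "everything already looks like ciphertext" baseline quickly is to wipe the device through a throwaway dm-crypt mapping rather than pulling every byte from /dev/urandom -- a sketch, assuming cryptsetup and a placeholder device path:

# map the device with a random, throwaway key
cryptsetup open --type plain -d /dev/urandom /dev/sdX wipe_me
# write zeros through the mapping; what lands on disk is indistinguishable from ciphertext
dd if=/dev/zero of=/dev/mapper/wipe_me bs=1M status=progress
cryptsetup close wipe_me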
TL;DR (for Debian variants):
$ sudo apt install f3
$ cd /MOUNTPOINT-FOR-SD
$ f3write .
$ f3read .
Depends on the chances and impact of getting a fake/faulty device, yeah? I've heard claims that most online marketplaces have rampant fakes, although you're 4 for 4 with good ones. On the other hand, the failure mode is data loss, which seems pretty bad. So personally I think your approach is a good idea.
My performance benchmarks were fantastic, but I started seeing some corruption issues. I wrote a C# program to fill the entire disk up with pseudorandom bytes (starting with a known seed), then read it back and compare.
Nearly all of them gave back different bytes at some point in the test. These were silent errors, and didn't trigger any read warnings or CRC/ECC failures reported by the drive's controller.
I suspect they achieved the performance by simply ignoring any errors and steamrolling right along.
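The same seeded fill-and-compare test can be improvised without writing a program, e.g. by using an AES-CTR keystream as the deterministic "random" data -- a sketch with a placeholder device and seed:

DEV=/dev/sdX                          # placeholder -- device under test, contents will be destroyed
SEED=my-test-seed-1                   # any fixed string; reused for the verify pass
SIZE=$(blockdev --getsize64 "$DEV")

# fill the device with a reproducible pseudorandom stream
openssl enc -aes-256-ctr -pass "pass:$SEED" -nosalt < /dev/zero | head -c "$SIZE" > "$DEV"
sync
# re-plug the device (or drop caches) so the read-back comes from the media, then:
openssl enc -aes-256-ctr -pass "pass:$SEED" -nosalt < /dev/zero | head -c "$SIZE" \
  | cmp - "$DEV" && echo "contents match"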
Hard drives fail hard or gracefully; SSDs only fail hard! On a hard drive, we can picture how the blocks are stored (sequentially). On an SSD, there is a remapping of blocks in silicon+firmware, and this added layer adds its own bugs.
(edit: added one more reference)
yes | openssl enc -aes-256-cbc -out /dev/test-drive
and then decrypt and grep. It's almost as fast as badblocks.
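Roughly, the verification half of that would be the following (same passphrase as when writing; expect a padding complaint and maybe one garbage line at the very end, where the final block was cut off as the drive filled up):

openssl enc -d -aes-256-cbc -in /dev/test-drive | grep -v '^y$' | head

Anything beyond that tail noise means the drive handed back data it was never given.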
You assume incorrectly. There have been attempts, like SandForce drives in the early days of SSDs but there are a bunch of reasons why you don't want to do opportunistic general-purpose compression at the lowest possible level of the storage chain.
The problem with compression* is that instead of having one drive capacity number (say, 500 GB), you have two: the drive's physical capacity and its "logical" capacity. Which is fine and dandy, except that the logical capacity varies wildly with what kind of data is stored on the drive. Highly patterned data (text and most executables) compresses very well. Data which is already compressed (most images, videos, archives, etc.) does not. So how do you report to the user how much space they have left on the drive? Even if you know the existing contents and current compression ratio, you don't know what they're going to put on the drive in the _future_. Your best guess would be, "250 GB or maybe about a terabyte, lol i dunno".
There's also the fact that application-specific compression algorithms tend to do FAR better than general-purpose compression algorithms, which is why almost all of our media storage formats default to using them. JPEG, HEVC, and so on. Plus you get the benefit of having the thing compressed all the time (even over the I/O channel or network) instead of just on disk.
Compressing data which is _already_ compressed often results in additional overhead. So unless the drive is testing the data to see how well it compresses before writing it to disk (which would murder performance), your 500 GB drive could actually end up being a 450 GB drive.
Further, always-on compression would result in a substantial performance penalty unless special silicon is crafted to handle it. Storage is already an industry with razor-thin margins, companies are not going to add to the BOM cost for a feature that could ultimately make people buy _fewer_ drives.
In the case of deduplication, there's no OS-level standard for it which makes current implementations far less efficient than they could be.
That said, data compression is very popular in the enterprise storage space, but it is typically done at the pool or volume level (large groups of disks) rather than per-disk. These arrays usually combine compression and deduplication with other strategies like thin provisioning to optimize storage to an almost absurd level. It typically requires trained storage engineers to manage them.
* _I lump deduplication into compression because deduplication is actually just one kind of compression strategy, even though lots of things treat it like a separate feature._
SMR maybe, I don't know if it's really worth it.
For a hard drive dealing with 4KB sectors, that's a lot of trouble for almost no benefit.
If you write random bytes then read them back, you have nothing to compare them to.
head -c "$DISK_BYTES" /dev/urandom | tee /dev/the/disk | sha256sum
should also work fine, with $DISK_BYTES set to the device's capacity (bounding the stream matters: an unbounded tee from /dev/urandom never exits once the disk is full). Re-reading the first $DISK_BYTES of the disk through sha256sum afterwards gives you the hash to compare against.
1. Test N blocks of the disk, where N is small enough that you can keep a 128-bit hash of each block of random test data in memory for the duration of the test. You can use the hashes to verify that read data is correct.
2. Then test another Nb/8 blocks using the N blocks from #1 to store 128-bit hashes of the Nb/8 test blocks.
3. Then test N(1+b/8)/8 blocks, using the blocks from #1 and #2 to hold the hashes.
4. Then test N(1+b/8 + b/8^2)/8 blocks, using the blocks from #1 and #2 and #3 to hold the hashes.
(Maybe replace the 8's in that with 7's, and in each block of hashes include a hash of the 7 hashes, so you can check that hash blocks are reading back fine?)
SSD blocks can be written 10k, 100k, maybe a million times. If one block is rewritten every time the most frequently changed directory is changed, that block might be exhaustible. But starting out with writing each block once won't make a difference.
Challenge accepted, I tried to wear a device out (my setup had direct bus access to the flash part in question, no remapping layer). Went for months without an error and I eventually repurposed that corner of my desk.
30 years later, they've pushed the technology a little :-)
You won't get 10K out of an SLC cell (ten years ago those would suffer around 3,000 erase cycles, and it has to be much worse now). I don't know what QLC endurance is (16 voltage levels in a cell, run away), but I wouldn't be surprised if it was a few hundred cycles.
10k P/E cycles is the right ballpark for SLC 3D NAND these days, and it's never been anywhere near as low as 3k - that's more like what good TLC 3D NAND gets. QLC is in the 500-1000 cycle range.
I'm surprised QLC gets as high as 1000, frankly. It's like the cells are on a first-name basis with the electrons they hold.
Indistinguishable from magic, man.
[Cue future archeologists: "Remember when our grandparents could only get a few gigabytes on a quark? How primitive, how sooo last-femtocentury."]
You nerd-sniped me. I couldn't resist calculating it. A femtocentury is about 3.16 microseconds.
Kioxia (formerly Toshiba) is sampling their version of the same basic idea, and SSDs using that memory should be coming out during the next few months.
Intel has of course replaced SLC NAND with their 3D XPoint memory, which is about twice the price but can sustain better write speeds because it doesn't have flash memory's slow block erase operations.
It's not 2007 anymore.
SSDs also have wear leveling; you cannot wear out a particular block of flash by writing to the same LBA a bunch of times.
Another idea for a generic approach, whether HDD or SSD, is to use f3. It'll check for fake flash (firmware loops to report a bigger device size than real size), corruptions, and read/write errors.
Because it times how long it takes to read/write each sector, it has the advantage of being able to detect "weak" sectors before they even become bad ones --- if an error occurs, HDDs will retry the access multiple times before giving up, and to a program that simply reads and writes data, weak sectors that successfully read after a small number of retries aren't noticeable. They will show up in MHDD as taking longer than usual to read.
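A very crude shell approximation of that timing idea (nowhere near MHDD, and the per-chunk dd overhead makes it slow; device path and threshold are placeholders):

DEV=/dev/sdX
CHUNKS=$(( $(blockdev --getsize64 "$DEV") / 1048576 ))
for i in $(seq 0 $((CHUNKS - 1))); do
    start=$(date +%s%N)
    dd if="$DEV" of=/dev/null bs=1M count=1 skip="$i" iflag=direct status=none
    ms=$(( ($(date +%s%N) - start) / 1000000 ))
    [ "$ms" -gt 500 ] && echo "slow read at ${i} MiB: ${ms} ms"
done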
if I understood correctly, in the case of single/few block failures ZFS would transparently remap the bad block(s) present on the bad HDD to a good block on the same HDD (therefore without having to immediately replace the partially faulty HDD and rebuild the whole raid) => is this correct?
I admit that continuing to use the partially faulty HDD might be a bad idea, but at least having the raid 100% healthy before replacing the partially faulty HDD gives me a good feeling...
These are simply called “bad blocks” in S.M.A.R.T - I guess it’s a little similar to overprovisioning and wear in SSDs.
SMART doesn't work anymore, if it ever did. Manufacturer firmware blobs have a huge incentive to report a successful test no matter what. Also, as someone else mentioned, there are plenty of used drives being resold as new in the market.
This industry has skated by for way too long, in my opinion. They benefit from a big reluctance on the part of people & companies to return faulty drives, to protect sensitive company data.
I have seriously considered starting an ecommerce shop that sells only hard drives, but one that also does Backblaze-style tracking/reporting of individual drives, has a good return policy, and degausses returned drives.
Speaking as a consumer I have zero interest in the current system, I want a system that lets me take the TCO of a drive/drive class into account before I make a purchasing decision.
Is not running badblocks before shipping a way to exploit users who won't RMA or won't realize their drive is busted? Or are drives getting damaged in transit? Or something else?
It will give you both performance statistics and catch checksum verification errors.
I run a short SMART test and check the results, as well as attributes, in case it's faulty already. Then do similar to the OP, afterwards running a short, conveyance, and long test.
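In smartctl terms that sequence is roughly the following (placeholder device; conveyance self-tests are only implemented on some ATA drives):

smartctl -t short /dev/sdX         # a couple of minutes
smartctl -l selftest /dev/sdX      # check the self-test log once it finishes
smartctl -A /dev/sdX               # attributes: reallocated / pending sectors, etc.
# ...then the main burn-in, followed by:
smartctl -t conveyance /dev/sdX    # where supported
smartctl -t long /dev/sdX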
Regarding temperatures: it used to be somewhat common practice to freeze drives that were clicking.
Funny to hear someone waiting for them to thaw, when I have intentionally frozen a bunch of drives before.