
WD Red DM-SMR Update: 3 Vendors Bail and WD Knew of ZFS Issues - arm
https://www.servethehome.com/wd-red-dm-smr-update-3-vendors-bail-and-wd-knew-of-zfs-issues/
======
stingraycharles
I’m very happy that this tactic has backfired so spectacularly for WD. I don’t
know what the total impact on their bottom line is, but what I’m seeing is
huge reputational damage for WD. People are avoiding putting WD drives in
their NAS in general, even when they have plenty of drives that are fine for it!

~~~
tjoff
I have such a hard time understanding how the risk vs. reward of pulling this
stunt made any sense.

SMR does not have a great reputation; putting it in drives specifically
targeted at NASes and enthusiasts is a very bold move.

~~~
rwmj
> _SMR does not have a great reputation_

SMR is fine _if you know what SMR is and what you're doing_. It might even be
fine given a sufficiently advanced translation layer (which apparently doesn't
exist yet). The problem isn't SMR, it's selling a product which uses SMR and
hiding that fact from the consumer, and I can't forgive WD for that.

~~~
yjftsjthsd-h
> It might even be fine given a sufficiently advanced translation layer (which
> apparently doesn't exist yet).

Apparently f2fs does well with it?

~~~
simcop2387
Not terribly surprising: f2fs and similar filesystems are log-structured.
Rather than overwrite old data, they almost always write to the next fresh
area. This works really well for flash because writing to existing areas is
expensive: you have to read in a 1-4 MB region (depending on the erase-block
size), apply your changes, erase that region entirely, and then write the
changed version back. [Those sizes might be a little off but it's what I
remember.] Some of that may be hidden by the disk's controller, but it has to
happen on every write to an area with existing data. SMR is similar in this
respect: you have to read, modify, and rewrite large blocks together because
of the way the tech works. So I'd expect anything that's directly optimized
for low-level flash to perform similarly OK on SMR.

The real problem comes when you can't work like that: in particular, when
you're low on free space, or when relocating data isn't possible or the drive
doesn't know it needs to happen. RAID setups are typically the worst case,
since they effectively treat the entire disk as used and can't relocate data
on it. ZFS is a bit more complicated, but it effectively ends up the same.
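
To make this concrete, here's a toy Python sketch of why an in-place
overwrite is so much more expensive than a log-structured append on zoned
media (the sizes are made-up but plausible; real firmware is far more
sophisticated):

```python
# Toy model: an in-place update on an SMR zone (or flash erase block)
# forces a read-modify-write of the whole zone, while a log-structured
# append only touches the new blocks. Illustrative only.

ZONE_SIZE = 256 * 1024 * 1024   # assume a 256 MiB SMR zone
BLOCK_SIZE = 4096               # assume a 4 KiB logical block

def rmw_bytes_moved(blocks_updated: int) -> int:
    """In-place overwrite: read the whole zone, change it, rewrite it."""
    return 2 * ZONE_SIZE  # full read + full rewrite, regardless of change size

def append_bytes_moved(blocks_updated: int) -> int:
    """Log-structured write: just append new blocks at the write pointer."""
    return blocks_updated * BLOCK_SIZE

print(f"in-place RMW moves {rmw_bytes_moved(1) / 2**20:.0f} MiB")    # 512 MiB
print(f"log append moves   {append_bytes_moved(1) / 2**20:.4f} MiB") # ~0.004 MiB
```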

------
aidenn0
While WD's mismanagement of this is incredible, saying "WD knew about ZFS
Issues" because one employee of WD said DM SMR was inappropriate for ZFS at a
conference a few years ago overstates things.

Even at relatively small companies, not all knowledge disseminates equally
through the company, and WD isn't exactly a small company. It's almost certain
that the team that was responsible for the WD Red SMR release was not aware of
this issue or this wouldn't have happened. What probably happened was
something more along the lines of:

1. Traditional wisdom says SMR can't be used in RAID.

2. DM-SMR firmware has gotten much better at hiding the performance issues of
SMR since that wisdom formed.

3. Let's actually measure this in some RAID setups and see what happens.

4. Hmm, those numbers look reasonable; we can ship this!

At no point did anyone think very hard about ZFS being an edge case for
DM-SMR.

All that being said, the drives _absolutely_ should have been represented as
DM-SMR, along with some nice benchmark numbers to show that the performance
hit isn't as bad as one might think. In the right RAID setup, these drives
_are_ at a reasonable performance/value point in the curve.

~~~
syshum
It is not just ZFS edge cases seeing issues; ZFS was just the bellwether.

Synology does not use ZFS at all, and they are seeing issues

Also, ZFS being an "edge case" is not going to be true long term, as a lot of
SMB and home NAS vendors (the very customers the WD Red line targets) are
starting to use ZFS on Linux more and more.

~~~
boomboomsubban
I haven't seen the data for a few years, but I believe Synology and other
premade NASes dominated the market, to the point where ZFS would always be an
edge case.

~~~
syshum
Synology and QNAP are probably the two biggest in this market segment;
Synology already has issues, and QNAP is bringing ZFS to market in its latest
line of NAS devices.

Synology has also used btrfs in the past, which I suspect may suffer the same
kind of issue as ZFS, but I don't know for sure.

~~~
boomboomsubban
My point here was that ZFS will continue being an edge case, as the premades
dominate the market and do not provide ZFS. QNAP may offer it on home devices
in the future, but may also limit it to premium devices. I am not making a
statement about any compatibility here.

------
StillBored
Ok, so everyone has decided that SMR+ZFS is bad.

Given that all traditional RAID arrays do online rebuilds, what keeps a random
application, let's say a database, from creating a write pattern during the
rebuild which forces timeouts?

AFAIK, nothing. And logically, it might be possible with just a single drive,
without the RAID, given that WD hasn't published proper command timeout
specifications for these drives either.

So people are giving them a pass for finally admitting that the drives are
SMR. That means nothing if they won't publish topology information or buffer
capacity so that the host can optimize its write patterns.

So, until the open-source community can adjust the worst-case command
timeouts, and preferably optimize write patterns based on drive topology, the
hardware doesn't appear fit for any purpose.
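
On Linux, the blunt end of that knob already exists: the SCSI layer exposes a
per-device command timeout in sysfs. A minimal sketch, assuming a Linux host
and an illustrative device name (this papers over stalls rather than fixing
them):

```python
# Raise the kernel's per-device SCSI command timeout (in seconds) so a
# DM-SMR drive that stalls during garbage collection isn't kicked out of
# the array as dead. Requires root; "sda" and 180 s are illustrative.
from pathlib import Path

def set_command_timeout(dev: str, seconds: int) -> None:
    Path(f"/sys/block/{dev}/device/timeout").write_text(str(seconds))

set_command_timeout("sda", 180)  # kernel default is typically 30 s
```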

~~~
himinlomax
Putting a database on a spinning disk in 2020 is just completely idiotic.

~~~
zepearl
I don't think so - the decision of which medium to use is linked to a lot of
factors (type of database, amount and size of data written/read, type of
filesystem, costs, durability, expectations, etc.).

~~~
himinlomax
You don't think so, I know so.

There are a few edge cases where spinning disks could be acceptable, but none
where they are superior anymore.

Cost? Maybe if you're not billed for electricity. And the cost of hardware is
beer money compared to admin/dev/architect time.

Durability? Drives of both technologies vary in durability based on market
class. Enterprise disks are superior to cheap consumer SSDs in that respect,
but within the same class SSDs are superior.

Type of database? Well if your database is used mostly for sequential reads,
in other words when it's for most intents and purposes used like a flat file
would be, you're in that edge case. Few individual databases are like that,
though admittedly those that are can be very large.

Data written/read? Early SSDs were admittedly prone to wearing out under heavy
writes. We're long past that, except if you're talking about dodgy SD cards,
but you're not putting a database on those, are you?

Types of filesystems? Just no.

Hard drives are cheaper per byte. Significantly cheaper, but not even an order
of magnitude cheaper anymore. They use more power, their random access time is
orders of magnitude slower, and their sequential transfer rate is slower.
Hell, even sequential write is not better anymore.

But I'll concede this, if you look long and hard enough, I'm sure you can find
that one odd case where a database is marginally better served by spinning
disks.
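
For a rough sanity check of the "not an order of magnitude" claim, here's the
arithmetic with illustrative 2020-era street prices (my assumptions, not
sourced figures):

```python
# Rough $/TB comparison; both prices are assumed round numbers for 2020.
HDD_USD_PER_TB = 25.0   # assumption: large-capacity 3.5" hard drive
SSD_USD_PER_TB = 100.0  # assumption: consumer SATA SSD

print(f"SSD costs {SSD_USD_PER_TB / HDD_USD_PER_TB:.0f}x more per TB")  # ~4x
```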

~~~
rasz
>We're long past that

It's the other way around: we're currently somewhere between 150 and 1,000
write cycles of endurance, and going down with every new node upgrade / cell
density bump.

[https://searchstorage.techtarget.com/definition/write-cycle](https://searchstorage.techtarget.com/definition/write-cycle)
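
Those cycle counts translate directly into total-bytes-written budgets; a
quick worked example (illustrative; it ignores write amplification and
over-provisioning, which make real endurance worse):

```python
# Endurance budget ~= capacity x P/E cycles, using the 150-1000 cycle
# range quoted above on an assumed 1 TB drive.
CAPACITY_TB = 1.0

for cycles in (150, 1000):
    print(f"{cycles:4d} cycles -> ~{CAPACITY_TB * cycles:.0f} TB written over the drive's life")
```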

------
AdmiralAsshat
I imagine WD has said _nothing_ so far because they're probably waiting for
their lawyers to tell them exactly how much they can claim to have known/not
known without opening themselves up to litigation.

~~~
tssva
WD responded via a blog post when the use of DM-SMR in WD Red drives was
first exposed. It basically consisted of an attempt to gaslight those who
bought Red drives into believing that it was their fault for choosing drives
inappropriate for the purpose they were using them for.

Who could think using Red NAS drives in a NAS would be an appropriate
technology choice? /s

~~~
TheSpiceIsLife
I saw the headline at the time, but didn’t read the blog post because reasons.

Does anyone have a link or archive link?

~~~
bonestamp2
I believe the April 20th (bottom) section of this blog post is the post in
question:

[https://blog.westerndigital.com/wd-red-nas-drives/](https://blog.westerndigital.com/wd-red-nas-drives/)

TLDR: They suggest different Red drives should be considered for different
workloads. While that may have been their intention all along, it was not
communicated well at all in marketing or retail, where the drive colors (red,
blue, purple, etc.) used to be the appropriate method for selecting the right
drive for your workload/purpose.

The way they have worded it, the blog is not wrong, but they left out an
important fact: consumers didn't have the necessary info to make the decision
they're suggesting people should have made.

Even if you read the data sheets that were available at the time, one would
expect the listed transfer rate to be a ceiling the drive was capable of, not
a rate that, if exceeded, would cause the drive to perform very poorly. Worse,
there was no way to know that was the case, because the use of SMR technology
was previously undisclosed and the drives themselves actively hid that
information from the host computer (since they were "device managed" SMR).

The data sheets still aren't clear on how the performance is impacted with
workload, but at least they're clear about the technology they use now.

~~~
simias
>TLDR: They suggest different Red drives should be considered for different
workloads. While that may have been their intention all along, it was not
communicated well at all in marketing or retail, [...]

I would argue that it's worse than that, because they sold drives with
significantly different characteristics without changing the branding.

It's not even that the users weren't given enough information to make an
informed decision; it's that users of the older WD Red drives would reasonably
expect newer models to basically fit the same performance profile. It's as if
Nvidia suddenly started selling RTX 2080 Ti cards that are effectively
rebranded 1060s under the hood.

~~~
bonestamp2
Absolutely. I just mean that even if consumers somehow guessed that something
was different (and, as you pointed out, they would have no reason to suspect
that), the information wasn't even available to decipher what that difference
was. So their cover story is completely asinine any way you look at it.

------
duxup
I'd really like to one day find out how this decision was made at Western
Digital.

My understanding is there wasn't much for WD to gain here, and serious risks...

Did they really feel this was worth it? How did this decision make its way
through the management chain?

------
stock_toaster
WD lost all their goodwill with me as a customer because of this. Shocking how
fast a brand can go from trusted to distrusted once they are shown to be
untrustworthy. ;)

~~~
thijsvandien
Well, what brand does one buy today? It's not like there was that much choice
anymore to begin with. Others had massive reliability issues (too). I don't
consider Seagate much of an alternative, for example.

~~~
yellowapple
Apparently Toshiba's the way to go. I can't vouch for 'em (I haven't bought a
Toshiba hard drive... ever; all of the ones I own are the ones that came in
the various old laptops and desktops I've got in my garage), and I usually
stick to SSDs nowadays, but next time I'm in the market for an HDD I guess
that's the way I'll end up going (with WD and Seagate both relegated to
secondaries, e.g. for RAID diversification).

------
danShumway
Open question:

Speaking as someone who accidentally bought SMR drives from Seagate that
weren't advertised as such, what's the primary danger of using these drives in
a _new_ ZFS setup?

From what I've seen, the write speed is bad enough that rebuilds might be
problematic in ZFS. But since I'm not doing a rebuild, and since I can swap
out the drive for a non-SMR alternative whenever one fails and I do need to do
a rebuild, are they still relatively safe to use?

~~~
seabrookmx
My anecdotal evidence, based on a fairly small (16TB) pool for personal use
such as backups and media, is that the _Seagate_ SMR drives are fine.

I've done multiple resilvers and they're definitely slower than conventional
drives at that task, but the main issue people are seeing with WD drives is
that in certain cases the drives take multiple seconds to respond to an ATA
command, making the system think the drive is defective. The Seagate drives
don't seem to suffer from this problem. There's lots of documentation out
there on people using Seagate "Archive" drives in NASes as well... these were
some of the earliest SMR drives and were on the market before the tech became
more mainstream. Their results seem to align with what I'm seeing, even though
mine are branded "Barracuda Compute."

So while I wouldn't be using _any_ SMR drives for a production workload, I'm
perfectly happy with the 8TB Seagate SMR drives I picked up for $185 CAD each
for my home use.

~~~
65a
I ran a large array of 5TB Seagate SMR drives. Everything was fine until a
disk failed, at which point the resilver took almost 24h. This was an
uncomfortable window, but otherwise the array was fine, although write speeds
were about 15-40MB/s in many cases.
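
That timing is roughly what the raw numbers predict; a back-of-envelope check
(illustrative; a resilver only rewrites allocated data, so a ~24h result
implies the pool wasn't full and/or parts of it ran faster than the quoted
worst-case rates):

```python
# How long a full 5 TB rewrite takes at the write speeds quoted above.
CAPACITY_BYTES = 5e12

for mb_per_s in (15, 40, 60):
    hours = CAPACITY_BYTES / (mb_per_s * 1e6) / 3600
    print(f"at {mb_per_s:3d} MB/s: {hours:5.1f} h")  # ~93 h, ~35 h, ~23 h
```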

~~~
seabrookmx
Yup. That window is exactly why I wouldn't want to run a setup like that in a
production setting.

But based on the reports of the WD drives, they are likely to never
successfully resilver. Which is obviously much worse.

------
boomboomsubban
Let me start by saying I also find WD's actions ridiculous and misleading.

Has WD ever said Reds would work with ZFS? The Red spec sheet says performance
isn't guaranteed on setups not listed as compatible. The compatibility list is
a nightmare [1] to use, but I can't find ZFS, and only Gold drives are listed
for FreeBSD. None of the roughly ten random products I checked listed Red.

[1] [https://www.westerndigital.com/support/partner-product-compatibility](https://www.westerndigital.com/support/partner-product-compatibility)

------
pubutil
I just ordered some WD Reds and a Synology box a couple of days ago, not
knowing about this issue. Should I cancel the HDDs and go for another brand,
or will I be okay if I don’t use ZFS?

~~~
thefz
IIRC EFAX are affected while EFRX are not.

[https://www.hwupgrade.it/immagini/wd-hdd-smr-25-04-2020.jpg](https://www.hwupgrade.it/immagini/wd-hdd-smr-25-04-2020.jpg)

------
S_A_P
Interestingly these drives are on sale on newegg.com right now.

~~~
kabdib
Yup. I narrowly escaped buying one of these drives as a spare for a Synology.

Anecdotal evidence is that the SMR drives will mostly work. It's not worth the
time or trouble. WD is dead to me until the next cycle of vendor rage forces
me to choose them again. :-/

~~~
ubercow13
>It's not worth the time or trouble

Especially as they are not even cheaper.

------
Bud
How can one tell for sure if a given WD Red drive is affected by this issue? I
have some WD Red drives that are a couple years old, in my NAS.

~~~
berkut
They have different model numbers.

i.e. the original 4 TB CMR drives were: WD40EFRX

the new 4 TB SMR drives are: WD40EFAX

They also have larger caches in most cases (e.g. 256 MB vs 64 MB).
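
If you have a pile of drives to check, the model string is easy to pull from
SMART data. A hedged sketch using the smartmontools CLI (the device path is
illustrative, and the EFRX/EFAX distinction above only covers the Red line):

```python
# Flag WD Red SMR models by reading the model string via `smartctl -i`.
# Requires smartmontools and root; /dev/sda is an illustrative device.
import re
import subprocess

def drive_model(dev: str) -> str:
    out = subprocess.run(["smartctl", "-i", dev],
                         capture_output=True, text=True, check=True).stdout
    m = re.search(r"Device Model:\s*(.+)", out)
    return m.group(1).strip() if m else "unknown"

model = drive_model("/dev/sda")
print(model, "-> SMR (affected)" if "EFAX" in model else "-> likely CMR")
```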

~~~
at_a_remove
I checked and lucked out, I had WD4000FYYZs. Thank you!

------
Teknoman117
Maybe I need to try ZFS again now that I have different drives. I used to run
it on 6x 3TB WD Red drives (raid-z2) and could never get ZFS to perform
anywhere near what I expected. Ended up going with LVM on mdraid (raid 6) on
dm-integrity and got tremendously better performance for my "home nas / vm
image host" machine.

~~~
jhoechtl
Use Btrfs. Seriously. It's a tremendous fs. Read the caveats concerning raid56
and you'll have success.

~~~
Teknoman117
My stack at the moment is 6x [10 TB WD Red Pros -> dm-integrity] -> mdadm
raid6 -> dm-crypt -> LVM. One of my LVM volumes is btrfs. I only use LVM
because I've had terrible performance with VM images stored in btrfs volumes.
You can always use fallocate and disable CoW for the image, but that prevents
taking snapshots.

I'd switch if there were a zvol equivalent in btrfs.

~~~
cmurf
Neither fallocate nor nodatacow will prevent taking snapshots. The result of
writing to nodatacow extents that are shared (by a snapshot) is that the write
must be COW rather than an overwrite, same as for a reflink copy of a file on
XFS. The first "overwrite" after the snapshot is COW; subsequent overwrites
are in fact overwrites.
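
For reference, the usual recipe the parent comment alludes to looks something
like this (a hedged sketch; the paths and size are illustrative, and
`chattr +C` only affects files created after the flag is set):

```python
# Create a nodatacow directory on a btrfs volume, then preallocate a VM
# image inside it. Snapshots of the subvolume still work; per the comment
# above, the first post-snapshot write to a shared extent is COW.
import os
import subprocess

img_dir = "/srv/vm-images"  # assumption: lives on a btrfs volume
os.makedirs(img_dir, exist_ok=True)
subprocess.run(["chattr", "+C", img_dir], check=True)  # new files inherit nodatacow
subprocess.run(["fallocate", "-l", "40G",
                os.path.join(img_dir, "vm0.img")], check=True)
```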

------
metalliqaz
Why haven't the prices come down on these drives?

~~~
executesorder66
Greed.

------
williesleg
So tired of seeing this. The problem is you don't want to do ANY striping on
ANY SMR drives. It's not a ZFS problem. It's not a SMR problem. It's STRIPING
on SMR. PERIOD.

Once upon a time people knew what they were talking about. Now everybody's an
EXPERT NPC.

~~~
effie
Could you explain why? Isn't the problem rather that the filesystem does not
know the drive is SMR and has no way to work with such a device optimally?

------
jhoechtl
ZFS is like Bitcoin: skyrocketing in popularity for no apparent reason.

Although I acknowledge that in this case it is WD's fault.

~~~
fierarul
The reason is the growing amount of data people save, and experience with bit
rot. Why not use ZFS?

~~~
Dylan16807
Because it's absolutely awful at deduplicating.

Or you're on windows.

But sure yeah it's pretty good overall, I don't disagree with your main point.

~~~
modderation
ZFS on Windows is slowly getting better.

I wouldn't actually trust it with data yet, but I suspect the developers would
welcome any contributions or effort sent in their direction:
[https://github.com/openzfsonwindows/ZFSin](https://github.com/openzfsonwindows/ZFSin)

(I'm not affiliated, but I think it's a cool project, and I really want it to
succeed)

