
Intel has screwed up their DC S3500 SSDs - luu
http://utcc.utoronto.ca/~cks/space/blog/tech/IntelDCSSDSectorSizeMistake
======
fivedogit
> "and of course it completely ruins the day of people who are trying to have
> and maintain a spares pool"

I used to maintain a distributed system of TV recording servers with hundreds
of analog TV tuner cards inside, and I understand this pain all too well. After
years of frustration trying to get these cards and all their different
revisions to work together on whatever version of Linux I'd adopted for the
system (kernel upgrades were a huge risk), I swore off hardware altogether for
future projects. Even though all the devices had the same chipset, I couldn't
keep everything working at the same time, and it sucked up all the time and
energy I should have been spending on my actual product.

God bless the rise of cloud computing. Seriously.

I can't even imagine what it must be like to maintain the amount of hardware
they have at AWS or Google. Speaking of which, how the fk does a startup like
Digital Ocean do it?

~~~
toomuchtodo
> Speaking of which, how the fk does a startup like Digital Ocean do it?

Carefully with DevOps/Networking/Infrastructure folks.

I used to manage several thousand physical Linux servers. It can be done
fairly painlessly when you control the environment.

I noticed you mentioned recording servers with analog tuner cards; today it'd
be much easier with SDR hardware pushing RTMP streams to cloud servers, which
write the data out to network-attached storage or even S3.

~~~
akira2501
I'm not trying to take the piss here, but why would you use a real-time
protocol to stream to an archive? And why SDR instead of a dedicated
tuner/decoder?

~~~
toomuchtodo
RTMP is of course not required; you could write locally on your workers and
then push to your storage repo over HTTP, rsync, whatever. My last gig was in
video broadcast, so most of my work was with RTMP, Akamai, etc. Different ways
to skin the same cat. Almost all hardware video IP encoders support RTMP,
though, and you can colocate with wherever your storage is.

As /u/akiselev mentioned, SDR is easier to extend with the hobbyist community
that exists around it, especially if you have a custom application and need
more control.

------
rodgerd
I read this at the time and, frankly, my takeaway was more that:

> There are applications where 512b drives and 4K drives are not compatible;
> for example, in some ZFS pools you can't replace a 512b SSD with a 4K SSD

...the ZFS design was fundamentally fucked up. Intel have merely exposed a
core design problem, because sooner or later you aren't going to be able to
find 512-byte drives at all.

~~~
barrkel
Zfs sector size (ashift parameter) is set at vdev creation time. That is, when
you get a bunch of drives together, and you want to have some redundancy or
striping, you create a vdev that is composed of several drives, for your
desired mode of redundancy (and / or striping). A pool is composed of multiple
vdevs; zfs file systems all allocate from a common pool. So it's generally
only a problem if you are replacing an existing drive in a vdev after it
failed.
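
To make that concrete, here's a rough sketch (device names, pool name, and the
exact error text are illustrative; assumes an OpenZFS system):

```shell
# ashift is fixed per vdev at creation time: 9 = 512-byte, 12 = 4K sectors
zpool create -o ashift=9 tank mirror da0 da1

# Later, trying to swap a 4K-native drive into the 512-byte vdev fails,
# because a vdev's ashift can't change after creation -- something like:
zpool replace tank da0 da2
# cannot replace da0 with da2: devices have different sector alignment
```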

Zfs doesn't support a bunch of things. It has no defragmentation. Filling a
zfs pool much north of 90% tends to kill its performance even after you delete
stuff to bring it back down again. The usual answer to these things is "wipe
the pool and restore from a backup", or "zfs send <snapshot> | zfs receive
<filesystem>". The answer to changing the sector size of a vdev is similar,
just like it is for removing a disk from a vdev, or reconfiguring your
redundancy in most cases.
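
That "rebuild" path looks roughly like this (pool and device names are
illustrative; assumes you have enough spare disks for a second pool):

```shell
# Snapshot everything recursively, stream it to a freshly created pool
# with the desired ashift, then retire the old pool
zfs snapshot -r tank@migrate
zpool create -o ashift=12 newtank mirror da4 da5
zfs send -R tank@migrate | zfs receive -F newtank
zpool destroy tank
```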

This is just how zfs is currently implemented. It was designed for Sun's
customers, for whom having backup for the whole pool, or having a whole second
pool to stream to, is not a big deal. Using it in a home or small business
context consequently requires more care and forethought.

~~~
rodgerd
> Zfs doesn't support a bunch of things.

I am well aware of this, having been running production systems with it since
2008, shortly after it stopped silently and irretrievably corrupting data.

> It was designed for Sun's customers, for whom having backup for the whole
> pool, or having a whole second pool to stream to, is not a big deal.

The idea that I have to destroy and re-create pools for so many
not-especially-uncommon events is one that runs pretty counter to the way ZFS
generally does a good job of being an enterprise filesystem. "Throw it away
and restore from backup" is not a good answer.

~~~
cbsmith
> "Throw it away and restore from backup" is not a good answer.

Honestly, when you think about the life cycle of many storage systems, it is
pretty reasonable. Once the drives get to a certain age, you tend to have to
replace them anyway, and after the array is beyond a certain age, you want to
replace the whole thing.

It makes a certain sick sense to expect a lot of enterprise customers to have
a strategy for fail over to a new storage pool.

------
EwanToo
For those who wonder why, this seems to be a decent explanation of the issue:

http://lists.freebsd.org/pipermail/freebsd-stable/2014-September/080009.html

So you can have ZFS pools with 4K blocks; it's just that if you chose 512-byte
sectors at the start, you're going to struggle.
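
If you're not sure which way an existing pool was created, zdb will show the
ashift per vdev (pool name is illustrative, output format from memory):

```shell
# Dump the cached pool config and pull out the per-vdev ashift values
zdb -C tank | grep ashift
# e.g.  ashift: 9   (i.e. 512-byte sectors; 12 would mean 4K)
```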

~~~
protomyth
Doesn't the current install of FreeBSD default to 4k blocks even on 512-byte
drives?

~~~
tw04
Yes. As do all of the illumos derived builds.

~~~
kdavyd
No they don't. They create the pool based on the ashift that the drives
report, unless you override it at pool creation.
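
For what it's worth, the override looks like this on OpenZFS, and FreeBSD
additionally has a sysctl that sets a floor for auto-detected values (pool and
device names are illustrative):

```shell
# Force 4K alignment at creation, regardless of what the drives report
zpool create -o ashift=12 tank mirror da0 da1

# FreeBSD: set a minimum ashift for all newly created vdevs
sysctl vfs.zfs.min_auto_ashift=12
```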

------
IvyMike
I think anyone who was on the supplier side of Intel's "Copy Exact"
requirements would find this particularly ironic.

http://www.intel.com/content/www/us/en/quality/exact-copy.html

------
nisa
Totally off-topic, but I really enjoy this blog. I discovered it while
struggling with ZFS and btrfs, and it's a concise, opinionated, no-bullshit,
honest sysadmin blog from someone far more knowledgeable than me. You may not
agree, but the presentation and style are really great.

~~~
rodgerd
Chris is one of my must-read sysadmin blogs. Smart guy, good writer, lots of
interesting things to say.

------
kevin_thibedeau
All that changed is the default setting. It's unfortunate that Intel didn't
document the change, but the fix is to add a new step to the drive-replacement
procedure to reconfigure the firmware. That's a scriptable action. Not that
big a deal.

~~~
tw04
It's a minor inconvenience at best. It's not like ZFS lets you add the drive
and then renders the pool unusable: you get an error telling you it's the
wrong sector size, at which point you fix the error and move on with life.

~~~
rincebrain
You...can't, that's kind of the point.

The statement is that you can't roll back the firmware revision, and the
sector size isn't end-user configurable.

------
ericd
Hm, does anyone know if this would mess up hardware RAID (specifically LSI
MegaRAID running RAID10)?

We have some of these drives in RAID, along with some spares sitting around.
All bought around the same time, but who knows if they were manufactured
during the transition.

