> "and of course it completely ruins the day of people who are trying to have and maintain a spares pool"
I used to maintain a distributed system of TV recording servers with hundreds of analog TV tuner cards inside and understand this pain all too well. After years of frustration trying to get these cards and all the different revisions to work together on whatever version of Linux I'd adopted for the system (kernel upgrades were a huge risk), I swore off hardware altogether for future projects. Even though all the devices had the same chipset, I couldn't keep it all working at the same time and it sucked all the time and energy I should have been spending on my actual product.
God bless the rise of cloud computing. Seriously.
I can't even imagine what it must be like to maintain the amount of hardware they have at AWS or Google. Speaking of which, how the fk does a startup like Digital Ocean do it?
Honestly it's probably a lot easier at that scale. When you're a single dude maintaining 10 boxes you're fucked if one dies on you. When you're 50 people maintaining tens of thousands of boxes it's not such a big deal.
Also, when you're buying that quantity you qualify the hell out of your hardware. The big boys have groups dedicated to making sure that they get what they want from their suppliers.
DO orders from Dell, which has contracts in place with its hardware suppliers to supply the same part for the life of the product. Someone like Amazon or Google orders enough parts that they have the same setup, just directly with the ODM.
> Speaking of which, how the fk does a startup like Digital Ocean do it?
Carefully, with DevOps/Networking/Infrastructure folks.
I used to manage several thousand physical Linux servers. It can be done fairly painlessly when you control the environment.
I noticed you mentioned recording servers with analog tuner cards; today it'd be much easier with SDR hardware sending RTMP streams to cloud servers that write the data out to network-attached storage or even S3.
I'm not trying to take the piss here, but why would you use a real-time protocol to stream to an archive? And why SDR instead of a dedicated tuner/decoder?
RTMP is of course not required; you could write locally to your workers, and then push to your storage repo over http, rsync, whatever. My last gig was with video broadcast, so most of my work was with rtmp, Akamai, etc. Different ways to skin the same cat. Almost all hardware video IP encoders support rtmp though, and you can colocate with where your storage is.
As /u/akiselev mentioned, SDR is easier to extend with the hobbyist community that exists around it, especially if you have a custom application and need more control.
Why wouldn't you? RTMP is one of the most common ways to stream video, and if you're already receiving live audio/video data from a tuner, you can easily package it into an RTMP stream going out to any number of archive servers running ffmpeg (which does the hard work of any transcoding or remuxing you need to do). All you have to do is package the RTMP stream, build ffmpeg yourself with the nonfree and gpl options enabled, and run/monitor off-the-shelf software with an obnoxious number of command-line arguments. Chances are you can package the RTMP stream straight from the source using ffmpeg too.
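For the sake of concreteness, here's roughly what that looks like as a script (my own sketch, not from the thread): the tuner input URL, archive hostname, and stream name are all placeholders, and it assumes ffmpeg is already on the box.

```python
# Rough sketch: remux a tuner's MPEG-TS output into an RTMP stream with ffmpeg.
# The input URL, RTMP endpoint, and stream name are placeholders.
import subprocess

TUNER_INPUT = "udp://239.0.0.1:1234"                       # hypothetical multicast TS from the tuner box
RTMP_TARGET = "rtmp://archive.example.com/live/channel7"   # hypothetical archive server

cmd = [
    "ffmpeg",
    "-i", TUNER_INPUT,   # read the live transport stream
    "-c", "copy",        # no transcoding, just repackage the existing audio/video
    "-f", "flv",         # RTMP carries FLV
    RTMP_TARGET,
]

# Run and babysit the process; a real deployment would restart on failure
# and ship ffmpeg's stderr somewhere useful.
subprocess.run(cmd, check=True)
```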
I don't know about the SDR vs. tuner question, but many SDRs are made with TV tuner chips, and I'd bet you'd have more luck (especially long term) with the software written by the amateur radio community than with whatever comes from the company that makes your brand of TV tuner.
DTV decoding is really processor intensive, and is really only now possible on the fastest processors. You can do it in an FPGA, but those are expensive. ASICs are still the cheapest way to go. It's like hardware vs. software H.264 decoding: a night-and-day difference in speed and efficiency.
Google is believed to assemble its own SSDs, supposedly from Intel/Micron flash memory and Marvell controllers. So even though they're the same components you'd find in Intel/Crucial SSDs, Google would obviously not run into an issue due to some firmware upgrade.
They might do that, but they have also bought this SSD. Everyone large has it and/or the S3700 hanging around somewhere. It's a really solid drive and the price / performance / reliability is hard to beat (as with its predecessor, the 320).
When a giant company talks about building hardware themselves, it means that they thought they could get a better deal than their suppliers could get. At some point you're the source of your supplier's discount, rather than a beneficiary of it.
So they call up an ODM, ask for the same motherboard as always, oh but please hold the serial port to cut the BOM by $0.50. It is a custom design made for them, but it's all still built with the same commercially available stuff that everyone uses.
No one is going to design a motherboard from scratch when they can use the Intel reference design. Likewise, that Marvell controller that Google might use on their Google-brand SSDs is going to run Marvell's standard firmware with maybe a few changes per Google's request.
I doubt Google would run into this issue in production though, because they should have an agreement with their vendor that pins an exact firmware revision. Any change to the firmware revision would require some sort of re-qualification, which should easily catch something this big.
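That qualification gate doesn't have to be fancy. A rough sketch of the idea (mine, with a made-up approved-firmware table and device list), using smartctl to read what each drive reports:

```python
# Refuse to put a drive into service unless its firmware revision matches
# what was qualified. The model name, revision, and device list are placeholders.
import subprocess

APPROVED_FIRMWARE = {"ExampleVendor SSD-123": {"FW-1.2"}}  # hypothetical model -> allowed revisions

def drive_info(dev: str) -> dict:
    out = subprocess.run(["smartctl", "-i", dev], capture_output=True, text=True).stdout
    info = {}
    for line in out.splitlines():
        if line.startswith("Device Model:"):
            info["model"] = line.split(":", 1)[1].strip()
        elif line.startswith("Firmware Version:"):
            info["firmware"] = line.split(":", 1)[1].strip()
    return info

for dev in ("/dev/sda", "/dev/sdb"):   # placeholder device list
    info = drive_info(dev)
    allowed = APPROVED_FIRMWARE.get(info.get("model", ""), set())
    status = "OK" if info.get("firmware") in allowed else "NOT QUALIFIED"
    print(f"{dev}: {info.get('model')} fw={info.get('firmware')} -> {status}")
```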
It's still lame that Intel did this. The drive should have just been 4K when it launched, and every other drive since 2011 should be as well.
Giant companies should buy most of their hardware to reduce cost and focus on their main activity. But they should also build a small percentage of their hardware themselves to have the freedom to ditch a supplier. This is also important for maintaining the technical knowledge to negotiate more wisely with suppliers.
I read this at the time and, frankly, my takeaway was more that:
> There are applications where 512b drives and 4K drives are not compatible; for example, in some ZFS pools you can't replace a 512b SSD with a 4K SSD
...the ZFS design was fundamentally fucked up. Intel have merely exposed a core design problem, because sooner or later you aren't going to be able to find 512 byte drives at all.
ZFS sector size (the ashift parameter) is set at vdev creation time. That is, when you get a bunch of drives together and want some redundancy or striping, you create a vdev composed of several drives in your desired mode of redundancy and/or striping. A pool is composed of multiple vdevs; ZFS filesystems all allocate from a common pool. So the sector size is generally only a problem when you're replacing a drive in an existing vdev after it has failed.
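If you're in that replacement situation, it's cheap to check before you start. Something like this (my own sketch, pool and device names made up) compares the ashift recorded in the pool config against the sector sizes the new drive reports:

```python
# Pre-flight check before `zpool replace`: surface a sector-size mismatch
# before you're mid-rebuild. Pool and device names are placeholders.
import subprocess

POOL = "tank"              # placeholder pool name
NEW_DRIVE = "/dev/sdf"     # placeholder replacement device

def run(cmd):
    return subprocess.run(cmd, capture_output=True, text=True).stdout

# ashift values recorded in the pool config
ashifts = {line.strip() for line in run(["zdb", "-C", POOL]).splitlines() if "ashift" in line}
print("pool ashift entries:", ashifts)   # "ashift: 9" means 512-byte sectors, 12 means 4K

# sector sizes the replacement drive reports
logical = run(["blockdev", "--getss", NEW_DRIVE]).strip()
physical = run(["blockdev", "--getpbsz", NEW_DRIVE]).strip()
print(f"{NEW_DRIVE}: logical={logical} physical={physical}")
```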
ZFS doesn't support a bunch of things. It has no defragmentation. Filling a pool much north of 90% tends to kill its performance, even after you delete stuff to bring it back down again. The usual answer to these things is "wipe the pool and restore from a backup", or "zfs send <snapshot> | zfs receive <filesystem>". The answer to changing the sector size of a vdev is similar, just as it is for removing a disk from a vdev or reconfiguring your redundancy in most cases.
This is just how zfs is currently implemented. It was designed for Sun's customers, for whom having backup for the whole pool, or having a whole second pool to stream to, is not a big deal. Using it in a home or small business context consequently requires more care and forethought.
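If you do have somewhere to stream to, the mechanics of the send/receive answer are straightforward. A minimal sketch, with invented pool and snapshot names:

```python
# Snapshot a filesystem on the old pool and stream it into a pool created
# with the new ashift, then cut over once you're happy. Names are made up.
import subprocess

SRC = "oldpool/data@migrate-2014-02"   # placeholder snapshot
DST = "newpool/data"                   # placeholder destination filesystem

subprocess.run(["zfs", "snapshot", SRC], check=True)

# Equivalent of `zfs send SRC | zfs receive DST`
send = subprocess.Popen(["zfs", "send", SRC], stdout=subprocess.PIPE)
subprocess.run(["zfs", "receive", DST], stdin=send.stdout, check=True)
send.stdout.close()
send.wait()
```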
I am well aware of this, having been running production systems with it since 2008, shortly after it stopped silently and irretrievably corrupting data.
> It was designed for Sun's customers, for whom having backup for the whole pool, or having a whole second pool to stream to, is not a big deal.
The idea that I have to destroy and re-create pools for so many not-especially-uncommon events runs pretty counter to the way ZFS otherwise does a good job of being an enterprise filesystem. "Throw it away and restore from backup" is not a good answer.
> "Throw it away and restore from backup" is not a good answer.
Honestly, when you think about the life cycle of many storage systems, it is pretty reasonable. Once the drives get to a certain age, you tend to have to replace them anyway, and after the array is beyond a certain age, you want to replace the whole thing.
It makes a certain sick sense to expect a lot of enterprise customers to have a strategy for fail over to a new storage pool.
If you know you can't get suitable replacements, it's an easy problem - copy the data to a new pool and retire the old one. In this case, since the part number didn't change, there was no way to know.
Sector size is a pretty fundamental property of a disk drive.
Totally off-topic, but I really enjoy this blog. I discovered it while struggling with ZFS and btrfs, and it's a concise, opinionated, no-bullshit, honest sysadmin blog from someone far more knowledgeable than me. You may not agree, but the presentation and style are really great.
All that changed are the default settings. It's unfortunate that Intel didn't document the change, but the fix is to add a new step to the drive replacement procedure to reconfigure the firmware. This is a scriptable action. Not that big a deal.
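Something along these lines, say (the vendor tool name and flags are placeholders; I don't have the exact invocation handy, so check Intel's own utility for the real one):

```python
# Sketch of the extra replacement-procedure step: if the new drive reports 4K
# logical sectors and the pool expects 512b, flip it back before handing the
# device to ZFS. The vendor tool and its arguments are hypothetical.
import subprocess

NEW_DRIVE = "/dev/sdf"       # placeholder device
EXPECTED_LOGICAL = "512"     # what the existing vdev was built with

logical = subprocess.run(["blockdev", "--getss", NEW_DRIVE],
                         capture_output=True, text=True).stdout.strip()

if logical != EXPECTED_LOGICAL:
    # Hypothetical vendor call -- substitute the real tool and arguments here.
    subprocess.run(["vendor-ssd-tool", "--set-sector-size", EXPECTED_LOGICAL, NEW_DRIVE],
                   check=True)
```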
It's a minor inconvenience at best. It's not like ZFS lets you add the drive and render the pool unusable. You get an error telling you it's the wrong sector size, at which point you fix the error and move on with life.
Hm, does anyone know if this would mess up hardware RAID (specifically LSI MegaRAID running RAID10)?
We have some of these drives in RAID, along with some spares sitting around. All bought around the same time, but who knows if they were manufactured during the transition.