1-2 year SSD wear on build boxes has been minimal (dragonflybsd.org)
139 points by jontro on Feb 15, 2015 | 74 comments

Look. :)

Yeah, the build machine churns a lot. But that work should be primarily done by the FS cache, by buffers. Yes, it's going to write out those small files, but if DragonFly BSD has any kind of respectable kernel, it should be a solid curve, not lots of bursts.

I would love if my old colleague Przemek from Wikia would talk about the SSD wear on our MySQL servers which had about 100k-200k databases per shard.

We wore the _fuck_ out of some SSDs.

You should replace your HDDs with SSDs, though, for a number of reasons, and take the long view, as kryptistk noted the OP is doing. Really compare the cost of SAS 15k drives and Intel 320s or 530s.

But in his place, I think you can take the words of the inimitable Mr. Bergman:

Stop wasting your life. But don't expect a machine that does lots of random IO, like a database, to have 1-2% SSD wear after two years. It might not last two years. If it does, use it more. Aren't you making money with these drives? ;)

Yeah, many people judge SSDs by a dollars-for-longevity number, which is a terrible metric for them. In $ per IOPS, SSDs kill hard drives. The number of IOPS you can fit in a 4U rack of SSDs is insane, and it destroys hard drives in performance per watt once you count all the extra controllers, cases, and power supplies needed to get anywhere close to the same performance.
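A rough 2015-era $/IOPS comparison (every number below is an illustrative assumption, not a quote):

```python
# Ballpark price ($) and sustained 4K random IOPS for each device class.
hdd_price, hdd_iops = 250.0, 180       # assumed 15k SAS spindle
ssd_price, ssd_iops = 250.0, 35_000    # assumed Intel 320/530-class SATA SSD

hdd_dollars_per_iops = hdd_price / hdd_iops   # dollars per IOPS, spinning disk
ssd_dollars_per_iops = ssd_price / ssd_iops   # dollars per IOPS, flash
```

Even with the exact figures fudged by a factor of a few, the SSD comes out orders of magnitude cheaper per IOPS.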

And don't forget the expensive labor: if you need 20x the number of spindles to meet latency targets, you will get roughly 20x the number of failures over time, 20x the labor to set it all up, and so on.

There are several sites doing SSD stress tests. This one claims to have written 2 Petabytes to a drive without failing:


Not all written bytes are equally expensive to the flash. Single-stream sequential writes are going to give you a lot longer lifetime than tons of random small IO with zero TRIM.

2 petabytes while validating each write RIGHT AFTER it happens, and never powering that drive down. It's great when all you need is a /dev/null, not so much if you need to power down from time to time and retrieve useful data later.

They did do a 5 day unpowered retention test http://techreport.com/review/25681/the-ssd-endurance-experim...

I believe they have powered them down. It's not solely a drive write cache test.

exactly :) memory is cheap - this is an exercise in that, imo.

> But don't expect a machine that does lots of random IO, like a database, to have 1-2% SSD wear after two years.

It rather depends upon the database, no? A 512GB Samsung 850 Pro would last 29 years with 100GB of writes a day in a horrendous, worst case 3x write amplification scenario. Very, very few databases write more than 100GB of data a day, and most are magnitudes less.
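Back-of-envelope check of that ~29-year figure (a sketch; the 6,000 P/E cycle rating here is an assumption chosen to make the arithmetic concrete, not a datasheet number):

```python
GIB = 2**30
capacity_bytes = 512 * GIB          # 512 GiB drive
pe_cycles = 6000                    # assumed rated program/erase cycles
host_writes_per_day = 100 * GIB     # 100 GiB of database writes per day
waf = 3.0                           # worst-case write amplification

total_endurance = capacity_bytes * pe_cycles
flash_writes_per_day = host_writes_per_day * waf
years = total_endurance / flash_writes_per_day / 365   # roughly 28 years
```

The point survives large errors in any single input: at 100 GiB/day of host writes, even pessimistic amplification leaves decades of life.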

To be clear here, what the poster means is that piecemeal database writes of, say, 128-byte records can cause a huge amount of write amplification, so 100GB/day worth of database writes can end up being 1000GB/day worth of flash rewrites. This issue largely goes away if the database backend appends replacement records rather than rewriting them in place, and uses the index to point to the new copy of the record. At that point the battery-backed RAM in the RAID system combined with the appends results in much lower write amplification, and an SSD could probably handle it.
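A minimal in-memory sketch (all names hypothetical) of that append-plus-reindex idea: replacement records go to the end of a log and only the index is repointed, so storage sees sequential appends instead of small in-place rewrites:

```python
class AppendOnlyStore:
    """Toy append-only backend: updates never rewrite in place."""

    def __init__(self):
        self.log = bytearray()   # stands in for the append-only data file
        self.index = {}          # key -> (offset, length) of newest record

    def put(self, key, value: bytes):
        offset = len(self.log)
        self.log += value        # sequential append (SSD-friendly)
        self.index[key] = (offset, len(value))  # repoint, don't rewrite

    def get(self, key):
        offset, length = self.index[key]
        return bytes(self.log[offset:offset + length])
```

Old record versions become garbage to compact later, which is essentially what log-structured and copy-on-write backends already do.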


I know of no database products that write 128-byte blocks. Most are at least 2KB, if not larger (SQL Server and pgsql are 8KB blocks). Yes, you could conceivably imagine up some hypothetical situation that would be bad for an SSD, but it is incredibly unlikely. And when I say 100GB of writes, I mean, literally, 100GB of writes, which in actual source data is generally much, much, much, much smaller.

You do understand that flash erase blocks are around 128KB now, right? Random writes to a SSD can only be write-combined to a point. Once the SSD has to actually start erasing blocks and doing collections of those randomly combined sectors things can get dicey real fast in the face of further random writes. It doesn't have a magic wand. The point is that the SSD has no way to determine ahead of time what the rewrite order is going to be as random blocks continue to get written. You can rationalize it all you want, the write amplification is going to happen everywhere in the chain to one degree or another. For that matter, filesystems are ganging blocks together now too (and have been for a long time). It is not uncommon to see 16KB-64KB filesystem block sizes.

Nobody gives a shit about a mere 100GB in physical writes to a storage subsystem in a day. A single consumer 512GB crucial can handle a rate like that for almost 30 years.

Write amplification effects are not to be underestimated, but fortunately there are only a very few situations where it's an issue. And as I said, the situations can be largely mitigated by database writers getting off their butts and fixing their backends to push the complexity to the indexes and away from trying to optimize the linear layout of the database by seek-writing into it.


4KB random writes should have a WAF of 2-7 depending on over-provisioning, assuming purely random writes. But real workloads are not purely random, so it will be better than that.

> You do understand that flash erase blocks are around 128KB now, right?

How is that relevant to your wrong claim about database 128-byte writes? You word that as if you're correcting me.

I don't even know what point you're trying to make anymore, but you're trying to buttress your argument with what can best be described as diversions.

Most people write far less than they think. Databases aren't particularly magical, any more than the web server log file that is written to a line at a time. Yes, people "give a shit" about a "mere" 100GB of writes, because the vast majority of real projects, including at major corporations, write far less than that per day. So are we just talking about dick measuring now?

A block might be 8KB, but the actual update you're making to the block might be much smaller. I imagine by '128 byte write' he's talking about a lot of random row updates, where each row is 128 bytes. Now, if you're not too unlucky, many updates will be combined on the same page per checkpoint, but that's not a given. On the other hand, it's reasonably likely that several updates will be combined per erase block per checkpoint. A heavily indexed table can exhibit some pretty random write patterns, though.

Additionally, the WAL has to be synced to disk every commit (unlike a web server log file), and WAL records can be very small. WAL is of course append-only, so you'd hope that a good SSD with a battery/cap backup would cache the writes and flush on the SSD erase block filling up.
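The per-commit durability barrier can be sketched like this (hypothetical helper; real engines amortize it with group commit):

```python
import os
import tempfile

# An unbuffered append-only file standing in for the WAL.
fd, path = tempfile.mkstemp()
os.close(fd)
wal = open(path, "ab", buffering=0)

def commit(record: bytes):
    wal.write(record)          # the record may be only tens of bytes
    os.fsync(wal.fileno())     # force it to stable storage before returning

commit(b"BEGIN;UPDATE ...;COMMIT\n")
```

Each `fsync` is a small synchronous write, which is exactly the pattern a capacitor-backed SSD cache absorbs well.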

> A block might be 8KB, but the actual update you're making to the block might be much smaller.

All major databases will only deal in the 8KB increments (or whatever their block size is, whether larger or smaller, but never as small as originally claimed) though. They don't write less. Indeed, it's worth noting that most (every single major one) database systems actually write to a sequential transaction log (which they do not have to checkpoint every n-bytes), and only on commit do they make a strategy for changing those source pages and extents, unrolling it from the log and checkpointing it, which by default includes coalescing and combining strategies. The idea that databases are randomly writing small bits of data all over the place is simply wrong, but is the entire foundation of almost every comment in this thread.


As one example. Oracle, pgsql, mysql, and others do the exact same thing.

They aren't randomly changing an int here and a bool there.

I worked on a financial system where we wrote just absurd amounts of data a day. We ran it on a FusionIO SLC device (with a mirror backup), and churned the data essentially around the clock. After three years the FusionIO's little lifespan meter hadn't even moved.

tldr; people grossly overestimate the "magical" nature of databases.

I'm really not sure you read my post very thoroughly. I'm fairly intimately acquainted with the internals of database systems, and you don't seem to be replying to what I actually wrote - I wasn't attacking your point of view (I generally agree that a decently designed DB is unlikely to trash an SSD all that quickly), I was just hoping to shed light on the other poster's wording.

If you have an 8kb block and you change 128 bytes of it, the 'actual update' is much smaller than 8kb. Sure, you're reading/writing 8kb to disk, but everything outside of that 128 bytes is basically fat for the purposes of that change. As I said in my previous post, that can absolutely be mitigated by writes being combined through the checkpointing process, and one would hope that a decent SSD could cope easily with combining writes to an appending log.

A database can still be writing data all over the place. A heavily indexed table can cause quite varied write patterns, which can result in a lot of different pages getting touched. Fortunately, the reality is that well-designed DBs and SSDs are fairly capable of dealing with this.

My original comment on this whole discussion was that few databases write more than 100GB a day. I am not talking about whether you inserted n integers or updated so many varchar columns -- when you actually monitor its IO, it is extremely unlikely that your database exceeds 100GB a day of writes, and in all likelihood is a magnitude or two below this. Whenever anyone waves their hands and talks about databases as if they somehow imply massive use, they're just fearmongering -- actual empirical stats are your friend, and actual empirical stats show that most real-world databases barely register on the lifespan of most SSDs.

So now that we're in an understanding that we're talking about database writes to IO, the other matter is how it writes it. I've built a lot of systems on a lot of databases, and the write amplification has generally been very low. I've been building and running significant databases on SSDs for about 7 years now, and while everyone else is finally starting to realize that they're wasting their time if they aren't, we still see the sort of extreme niche fearmongering that makes other people clutch onto their magnetic storage (and I heard it the entire time. "OMG but don't databases kill flash???!?!?". No, certain volumes of writes and types of writes do. Only metering will tell you if that applies). Yes, some people do very odd things that can kill storage, but that is extremely rare. It almost certainly doesn't apply to the overwhelming percentage of HN readers.

The point is that even writing in 8kB chunks will lead to a lot of write amplification when the erase block size is at least 128kB and the flash page size is 16kB. 8kB writes are definitely less bad than 128B writes, but it's still not enough write combining to pass either of the relevant thresholds.
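The size mismatch in rough numbers (a worst-case sketch that deliberately ignores the drive's own write combining):

```python
erase_block = 128 * 1024   # bytes, typical NAND erase block
flash_page = 16 * 1024     # bytes, typical NAND program page
db_block = 8 * 1024        # bytes, typical database page

# An 8 KiB host write is smaller than one flash page, so each write
# programs at least a full page:
page_waf = flash_page / db_block        # 2.0

# If garbage collection must relocate a whole erase block to service
# one scattered 8 KiB rewrite, amplification is far worse:
gc_worst_waf = erase_block / db_block   # 16.0
```

Real drives sit between these bounds because they coalesce writes and pick near-empty blocks to collect, but the bounds show why 8 KiB alone doesn't clear either threshold.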

It's a 64x difference. And even that is grossly understating it because most database systems feature write coalescing.

It's actually "ironic" in that database systems were built to avoid random IO because it was glacially slow on spinning rust storage. They do everything possible to avoid it, and in actual, empirical metrics running databases on flash the write amplification has been very low, and hardly mirrors the claims throughout this thread.


> are you an idiot? Do you even understand

This comment breaks the HN guidelines. We'd appreciate it if you'd read the site rules and follow them:



Reading also wears out NAND flash. SSDs have read counters next to erase counters; too many reads and the drive forces a whole-sector rewrite to mitigate data degradation.

There is something called read-disturb, where reading a row of NAND over and over again can disturb adjacent rows. The effect is many orders of magnitude lower than the erase wear effect and data caching within the drive (as well as data caching in the OS) mitigates it significantly. So for all intents and purposes you don't have to worry about it.

The SSD's normal internal scans will detect the bits when they start to get flaky and rewrite the block then. It might do some rough tracking to trigger a scan sooner than it would otherwise, but it is more a firmware issue and not so much a wearout issue.


Does this mean that SSDs will lose data if you reach the write limit and continue reading from it?

"At some point in the next few years we are going to start getting HDD failures on our blade server. It has ~30 hard drives plugged into it after all (and ~12 SSDs as well). When that begins to happen I will probably do a wholesale replacement of all HDDs with SSDs. Once that is done I really expect failure rates to drop to virtually nil for the next ~20-30 years. And the blade server is so ridiculously fast that we probably won't have any good reason to replace it for at least a decade, or ever (though perhaps we will add a second awesome blade server at some point, who knows?)."

That's taking the long view. :-)

No, that's a naive view. SSDs do fail, and they tend to do it without any warning. The author thinks he discovered magic.

Paradoxically SSDs from two years ago have better endurance than brand new drives now, and a lot better than SSDs in 5 years (unless something big happens).

No, nobody is magicking up any fairy dust here. The author (which is me) is looking at things realistically. My maintenance costs just from a time perspective are already radically lower.

In any case, as long as the SSD is powered and thus able to rewrite blocks as bits start to go bad (creating some wear in the process, but not a whole lot compared to normal operation), the actual data will be retained for significantly longer than 10 years.

Unpowered data retention (where there is nothing there checking for and rewriting blocks whose bits start to go bad) depends on cell wear. I didn't pull up the chip specs, but my recollection is that it is 10 years for new cells and 1-2 years for a relatively worn cell. Depends on temperature of course. That means I can pull a SSD and shelve it without too much worry for a relatively long time, something that cannot be said for any hard drive.

CDs corrode from the inside out, primarily due to air trapped in the holes, I believe. Even though it is laser burned, there is no real care taken in constructing the plastic sides or metal film to keep out air, so when the metal gets vaporized the oxygen sticks around and starts corroding it. Or the plastic gets fuzzy from UV exposure or age. Totally different ballgame.

Hard drives have even worse problems if left unpowered. I used to pull backup drives and shelve them, but their life spans are radically lowered if left unpowered on a shelf for 6 months and then replugged into a machine at a later time. I'm sure some HDD expert can say why. So I stopped doing that. Now the backup drives stay powered on 24x7.

Hard drives seem to have a limited life span whether you use them or not. I have tons of old HDDs sitting around, some well over 20 years old in fact. If I plug an old HDD in I can sometimes read the data off before the whole thing turns into a brick. I mostly use them to test disk driver error handling.

SSDs left sitting around can always be reformatted (i.e. just using TRIM to clean out the whole thing then start writing to it fresh). Their wear limit is based on rewrite cycles rather than how long they've been sitting around. HDDs are physically destroyed, you don't get a fresh start by reformatting an old HDD.


> I used to pull backup drives and shelve them, but their life spans are radically lowered if left unpowered on a shelf for 6 months and then replugged into a machine at a later time.

That's kind of mysterious. All things being equal, they should simply experience less wear (no wear, really, other than oxidation and corrosion) than the same drive which has been left plugged in.

We had the same problem with WD Greens. We'd write a backup to the drive in a dock each week, then store the drives as monthly backups. About 6% of the drives failed when we plugged them back in.

I'd be more inclined to blame damage during handling, but only because I cannot think of an alternate explanation. There is no constant refreshing of data that happens with a disk drive (to my knowledge) such that a drive being left plugged in should be more reliable than one on the shelf.

I would be happy to be instructed here, though.

It reminds me of the "burnable CDs will keep your data for 30 years!" phase we went through. Turns out that's not so reliable after all.

Is hardware (motherboard, CPU, memory) so good nowadays that one can expect it to last 30 years? I don't think it's designed with that kind of lifetime in mind.

we don't exactly run the biggest operation, but in our experience the most common failure items across thousands of years of cumulative uptime are network interface cards (or on-motherboard network interfaces) and hdds.

RAID controllers fail left and right. we keep tons of spares around.

ssd's fail few and far between, cpus basically do not fail, and memory can go bad but it's exceedingly rare and easy to fix. psu's fail but are easy to fix in modern computers as well (slide-out, redundant, etc.)

having said all that, heat is the primary killer of hardware. if you run a lot of equipment in a dense environment, get a laser thermometer and find your hot spots and fix them with some industrial fans or move your gear around. once your stuff gets hot anything can fail in weird and mysterious ways.

How do network cards fail? Simply as if you cut the link? Or can you see error counters increasing and sporadic frame loss that gets worse over time?

Depends on which bit fails, but increases in packet loss are a common early symptom of small components no longer acting within their specs.

Network cards are subject to lots of signal phenomena that are rare inside the chassis. Long cables are pretty good antennas for certain types of RF signals, so there are all kinds of electrical noises, induced power spikes and other miscellaneous garbage that the network card has to tolerate. Well-shielded cables can help protect the card, but it's definitely one interface that's subject to a bit more electrical abuse than the rest.

Components that have been stressed beyond their tolerances a few times can result in things like signal filters having a lower noise threshold, which makes it harder for the card to pick out the signal from the noise, which leads to more packet loss. After enough abuse, the threshold drops below the usable level and communications halt.

There are lots of factors involved, such as shielding, proximity to nearby radiators, bend radius in cables, cable length, temperature, etc, etc. Whenever I delve into this world, I'm often amazed that anything works at all.

> How do network cards fail?

All ways they can. I've found them with blown transistors, dead rats attached, no physical imperfections, etc.

Usually for me it's been some kind of hard failure, eg completely dead.

failure modes are all over the map. sometimes they just start dropping more and more packets, sometimes it "looks like it's working" but there's no layer 1 link light, sometimes it's incredibly high latency, sometimes the entire card just disappears from view.

this mostly happens with the on-board controllers. nics don't fail as often, but we do use high end nics (intel 10g and 4x 1g)

High-end consumer motherboards often include 2 integrated NICs. Over the last decade I've owned four and had one of the NICs fail after 2-3 years on every single motherboard. Glad to know it's endemic, and Danpat's explanation is fascinating.

> RAID controllers fail left and right. we keep tons of spares around.

Kind of scary. I would guess the replacement should be perfectly identical, to the last firmware bit (... and giving thanks that no subtle circuit timing factors are involved).

Network cards? HDDs and power supplies are my most common deaths in the server room.

Same. RAM chasing them up too. Almost never had a network card failure.

I mean, this is server hardware. One of the major differences between server hardware and desktop hardware is build quality. I've got a ten-year-old 1/2U rack server sitting in a closet that I bought for pennies at a surplus auction that still runs great.

Server hardware is usually obsolete long before it stops working.

I've seen DL380 G5 giving in a few years ago already. (I still love those machines except they seem to boot slower each generation.)

But would it really last 30 years with such things as lead-free solder?

We will probably have to wait 30 years to really know.

FWIW, lots of hardware from ~30 years ago still works. I have a 27 year old Amiga500 that still boots fine (many of the floppy disks have become unreadable, though).

You can buy fully working vintage computers much older than that on eBay.

30 years of constant/daily use is different from a few years of use, then pulling out of mothballs as a curio every now and then.

A server that's 10 years old now was likely made with leaded solder.

To my knowledge, the most common guarantee target is 5 years. Equipment can last much longer or much shorter, both as a function of chance and of workload.

30 years would be exceptional if the machine spent its life at the 5-year-target load.

Love this quote: "This is the first time I've actually contemplated NOT replacing production hardware any time soon."

There are two things that benefit from the turnover of machines: one is that Intel stays profitable, the other is that hardware standards have a chance to evolve. The move from ISA->VESA->PCI->AGP->PCIe on video cards would not have been possible had people been holding on to their machines for 3 - 5 years before buying new ones.

Running a small datacenter for the last few years, I can totally identify with that quote.

On the storage side, I can't help but wonder if it's partially due to incumbent vendors pushing flash as a high markup premium product. I suspect that I am not the only one who has thought, "Well, why replace this array with more already-obsolete magnetic disk that won't be any faster? Wait a year or three, flash markups will come down. Maintenance renewal is cheap."

Now, I realize that cheap flash is dangerous to incumbent pricing structures, so $100K for an all flash tray makes sense when it can replace 10 trays of $30K disk. So maybe they've seen no other choice, business model wise. Maybe it even makes sense to milk a once-per-generation technology disruption for all it's worth for a couple of years before commoditizing it.

But it has kept me happy with what I've already got, since the pricing of the alternative was, for my use cases, hilarious. As opposed to, say, this is somewhat more expensive but we can justify it.


That's one way to look at it. Another is that we might have gone ISA->PCI->PCIe in a shorter overall timeframe b/c there were no distractions to get short-term stuff to market.

True but the relative market size growth has an impact. Had we skipped VESA and AGP for example, there would thousands more ISA slots (longer time in market) and so the next generation, PCI in your example, has to burn more cash getting "into" the space.

The obvious principle here is that innovation happens more rapidly in a market where there is a large demand for improvement and low friction for upgrades. Pull back on either of those and it slows down the rate of innovation.

And now we have NVMe (which is totally awesome by the way)

Wear failure on SSDs is often a silly thing to worry about. The number of write-out cycles is roughly 10,000, and a drive of modern capacity takes something like 4 hours to write out in full, so it would take about 5 years of continuous writing before a traditional HD shows an advantage. In practice, you only have to worry about it if the workload could never be considered on a traditional HD.
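That ~5-year figure falls straight out of the two numbers given (both of which are the comment's own rough assumptions):

```python
cycles = 10_000            # full-drive rewrite cycles the flash can take
hours_per_full_write = 4   # time to stream the whole drive once

years = cycles * hours_per_full_write / 24 / 365   # roughly 4.6 years
```

In other words, you'd have to write the entire drive end to end, nonstop, for about five years to exhaust the rated cycles.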

When faced with the choice between a Fiesta and a Ferrari, there are many reasons to choose one over the other, but it is ridiculous to say "I picked the Fiesta because the Ferrari has a known tire problem when going 200+ MPH".

Durability has steadily decreased as flash density has increased: 10000, then 5000, then 2000 for standard MLC parts over the last few years as densities have increased. I'm not sure what Samsung's new 3D process is. At the same time, the voltage regulation and comparators used on-chip have gotten a lot better, making it easier to detect leaky cells, so reliability has significantly improved for the erase cycles the flash does have.

The original Intel 40G SSDs could handle an average of 10000 erase cycles (for each block of course), giving the 40G SSD around a 200TB minimum life if you divide it out and then divide by 2 for safety. (Intel spec'd 20TB or 40TB or something like that, 400TB @ 10000 erase cycles, divide by 2 gives you ~200TB or so).

A modern 512G Crucial SSD sits somewhere around a 2000 erase cycle durability, or around 512TB of relatively reliable wear (1PB / 2 for safety).
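Those figures are just capacity times rated cycles, halved for safety margin (numbers taken from the two comments above; the helper name is hypothetical):

```python
def safe_endurance_tb(capacity_gb, pe_cycles, safety=2):
    # total writable data = capacity * rated erase cycles, halved for margin
    return capacity_gb * pe_cycles / 1000 / safety

intel_40g = safe_endurance_tb(40, 10_000)     # -> 200.0 TB
crucial_512g = safe_endurance_tb(512, 2_000)  # -> 512.0 TB
```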

I would not necessarily trust an SSD all the way to the point where it says it is completely worn out, I would likely replace it well before that point or when the Hardware_ECC_Recovered counter starts to look scary. But I would certainly trust it at least through the 50% mark on the wear status. Remember that shelf life degrades as wear increases. I don't know what that curve looks like but we can at least assume somewhere in the ballpark of ~10 years new, and ~5 years at 50% wear. Below ~5 years and I would start to worry (but that's still better than a shelved HDD which can go bad in 6 months and is unlikely to last more than a year or two shelved).
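Interpolating the comment's own ballpark guesses linearly (pure assumption, not a datasheet curve):

```python
def shelf_life_years(wear_pct):
    # ~10 years unpowered retention when new, ~5 years at 50% wear,
    # extended linearly between and beyond those two guesses
    return 10.0 - (wear_pct / 50.0) * 5.0
```

On this naive line, a drive retired at 50% wear still shelves better than the 6-month HDD experience described below.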


Since it wasn't immediately clear to me how the percentages were pulled from the smartctl output, I'll note that "Perc_Rated_Life_Used" is the relevant readout.

(Some online sources mention "Wear_Leveling_Count", and even misreport that as a percentage – but in fact that seems to be an absolute count of the number of times each single block has been rewritten. The percentage is presumably this Wear_Leveling_Count divided by the rated number of cycles.)
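For example, deriving the percentage from a Wear_Leveling_Count line as described above (a sketch: the sample output, field layout, and 2,000-cycle rating are all assumptions; real attribute formats vary by vendor):

```python
# One abbreviated line of `smartctl -A` output (hypothetical values):
sample = "177 Wear_Leveling_Count 0x0013 099 099 000 Pre-fail Always - 40"

avg_erase_cycles = int(sample.split()[-1])  # raw value: mean P/E per block
rated_cycles = 2000                          # assumed MLC cycle rating
pct_life_used = 100 * avg_erase_cycles / rated_cycles   # -> 2.0 (%)
```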

There are certainly workloads that will wear out a SSD, random database writes being the most common. But that is only a very small portion of the storage ecosystem, not to mention that there are plenty of ways to retool SQL backends to not do random writes any more, particularly since it doesn't gain you anything on a modern copy-on-write style filesystem versus indexing the new record yourself as an append. So I expect this particular issue will take care of itself in the future. It's a matter of not blindly using someone's database backend and expecting it to be nice.

The vast majority of information stored these days is write-once-read-never, followed by write-once-read-occasionally. I expected our developer box which is chock full of uncompressed crash dumps and many, many copies of the source tree in various states to have more wear on it than it did, but after thinking about it a bit I realized that most of that data was write-once-read-hardly-at-all.

In terms of hardware life, for servers there are only a few things which might cut short a computer's life, otherwise it would easily last 50 years. (1) Electrolytic capacitors. (2) Any spinning media or low frequency transformers. (3) On-mobo flash or e2 that cannot be updated.

(1) Electrolytic capacitors have largely disappeared from motherboards in favor of solid caps which, if not over-volted, should last 30 years. Electrolytic caps are not sealed well and evaporate over time, as well as slowly burn holes in the insulator. They generally go bad 10-30 years later depending on how much you undervolt them (the more you undervolt, the longer they last). Even so I still have boards with 30+ year old electrolytics in them that work.

(2) Spinning media obviously has a limited life. That's what we are getting rid of now :-). Low frequency transformers have mostly gone away. Transformers in general... anything with windings, that is, have a limited life due to the wire insulation breaking down over time, but most modern use cases in a computer (if there are any at all) likely have huge margins of error.

(3) Firmware stored in E2 and flash, or OTP eprom, rather than fuse-based proms, will become corrupt over time. 10 years is a minimum life, 20-30 years is more common. It depends on a number of factors.

Other than that there isn't much left that can go bad. All motherboards these days have micro coatings which effectively seal even the chip leads, so corrosion isn't as big a factor as it was 20 years ago. The actual chip logic basically will not fail, and since the high-speed clocks on the whole mobo can be controlled, aging effects which degrade junction performance for most of the chip can be mitigated. I suppose an ethernet port might go bad if it gets hit by lightning, but I've never had a mobo ethernet go bad in my life. Switch ethernet ports going bad is usually just due to poor parts selection or overvolting, which would not be present in a colocation facility or machine room.

In any case, there is no reason under the sun that a modern computer with a SSD wouldn't last 30 years with only fan, real-time clock battery, and PSU replacements.


There are still a few other things that will limit the life of modern computers.

1. That waxy thermal pad material certainly dries out and becomes ineffective in less than 10 years.

2. Power supplies still have large electrolytic capacitors right next to heatsinks.

3. BGA parts with lead-free solder are still going to have a limited amount of thermal cycles they can withstand before the solder balls become brittle.

4. I notice an increasing use of conductive glue to attach flat flex cables in computer parts, especially ultrabooks, and this also has a limited number of thermal cycles before it starts to degrade. (Anyone who bought a Samsung TV between 2005 and 2008 might already be aware of this.) This is probably the worst issue since it isn't going to be fixable for a hobbyist.

Also, I wouldn't even guarantee that firmware in Flash that isn't rewritten often will last 10 years. There are a lot of dead Nintendo Wiis already due to bad blocks in the flash.

I currently own a 30 year old PC-XT and a 40 year old stereo receiver, but they didn't make it to that age without maintenance and replaced components. I'm afraid that in 30 years many of today's devices will have failed in ways that are impossible to get at or repair.

> I suppose an ethernet port might go bad if it gets hit by lightning but I've never had a mobo ethernet go bad in my life.

Been there, record thunderstorm centered directly above a client's building. Several switch and mobo ports died. The fact that said client "saved" on cabling by running UTP between buildings probably had something to do with it.

Well, I probably don't have to tell people to never run copper between buildings. Always run fiber. The longer the runs, the higher the common mode voltage between grounds and the higher the stress on the isolation circuitry. Plus lightning strikes don't have to hit the cable directly, they can pull up the ground for the whole building and suddenly instead of having 200V of common mode you have 4000V for a few milliseconds.

Anyone who has ever wired up a T1 in the basement of a highrise knows what I mean.


Not sure why you'd want to buy 10-year capable storage for a server.

I buy for reliability. If I can plug something in and just forget about it (except for patches) for 3-4 years, we're done and whatever new thing I can buy will pay for itself in power savings.

I'm happy that an SSD will last that long, but it's not something I worry about.

I have had to rebuild servers because of abject SSD failure, and will no longer buy those brands. Failure seems to be quite highly correlated with trying to save a buck by getting non-top-tier drives (whereupon it's me, or someone like me, picking up the pieces when I could be writing code. Screw that).

If a drive was happily at 90% wear after two years, I'd just provision more of the same drive. Yay! :-)

Basically, only buy SSD brands that either are chip fabs or have a relationship with a single chip fab. So: Intel, Crucial, Samsung, maybe one or two others. And that's it. And frankly, I have to say that only Intel and Crucial have been really proactive about engineered fixes for failure cases. Never buy SSDs from second-tier vendors such as Kingston, who always use the cheapest third-party flash chips they can find. There are literally dozens of those sorts of vendors. Hundreds, even.


Not so sure about that; two of the first three SSDs I used were Intel's second generation... both went bad about a year in... I've used Corsair, Crucial and Samsung since... other than the two Intels, out of about 12 drives since, I have only had one go bad (just about DOA, died 2 hrs in).

It's pretty much conjecture... I will say that I was pretty disappointed in a lot of the Seagate 7200 3TB drives I bought 3 years ago... out of 12, 3 are now dead under moderate use... (I bought in different batches from different vendors for the same model).

Micron is the name behind Crucial.

Kingston has gotten very good (and they used to suck) at least in the eMMC space (not quite a SSD, but close).

Crucial are the drives I've been ripping out. Enterprise class. They just died after 2 years of service and not really all that much wear; they won't talk to the outside world anymore.

I nailed one to the wall of our IT lab as a warning not to buy any more of them.

>Kingston has gotten very good

...at ninja-swapping components under the same model name; google V300

I'm currently working on a project to replace our older SANs with SSD-only storage servers - I've performed a few POCs with great results and am now documenting the build as I go.

Not only is it going to give us some much needed IOP/s it's also going to save us hundreds of thousands of dollars on storage over the next 3 years.

If you're interested take a look at: http://smcleod.net/building-a-high-performance-ssd-san

None of those drives show evidence of regularly-scheduled self tests. They must not really care about them.

What sort of self-testing do you think is needed?

Daily short tests and weekly long tests seem reasonable. These drives show that they've gone thousands of hours since the last test.
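For reference, that kind of schedule is straightforward with smartmontools. A sketch, assuming smartmontools is installed and the drive is /dev/sda (device name is an assumption):

```shell
#!/bin/sh
# Kick off a short self-test now; it runs in the background on the drive:
smartctl -t short /dev/sda
# Inspect the self-test log, including power-on hours at each test:
smartctl -l selftest /dev/sda
# To automate, smartd.conf accepts a schedule regex (T/MM/DD/d/HH):
# short test daily at 02:00, long test Saturdays at 03:00:
#   /dev/sda -a -s (S/../.././02|L/../../6/03)
```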

It's really unclear whether explicitly initiated tests actually help for a SSD. The SSD has its own internal mechanisms to scan the flash chips which (I presume) is unrelated to SMART, since they are required for normal operation... primarily detecting weak cells before the bits actually go bad. Whole chips can go bad out of the box but after an SSD has been running for a few months there isn't much left that can go south other than normal wear or a firmware failure.

Firmware issues dominated SSD problems in the early years. Those issues are far less prevalent today, though Samsung got its ass bitten by not adding enough redundancy and having data go bad on some of its newer offerings. Strictly a firmware issue. Which is another way of saying that one should always buy the not-so-bleeding-edge technologies that have had the kinks worked out of them rather than the bleeding-edge technologies that haven't.

If it starts to bite me I may change my tune. But until that happens I put it in the wasted-cycles category.


Tests of what? The ability to read and write data without corruption? Ordinary usage with wear leveling takes care of that, and these SSDs are definitely being used full-time.
