
I'm going to over-simplify a bit (and try to remember; it's been years since I've actually paid attention to flash). Flash works effectively by erasing a block to all 1s, and then AND'ing new data into the existing bits: programming can flip a 1 to a 0, but only an erase can turn a 0 back into a 1. If you have a freshly erased block, it's already all 1s, and you don't have to clear it off before you can use it. This is fast.
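
To make that concrete, here's a rough toy model in Python. It assumes programming can only clear bits (1 to 0), which works out to a bitwise AND against whatever is already stored. The names (ERASED, program, erase, can_program_in_place) are made up for illustration and have nothing to do with any real controller firmware.

```python
ERASED = 0xFF  # assumption: an erased byte reads back as all 1s


def program(cell: int, data: int) -> int:
    """Toy model of a program operation: bits can only go from 1 to 0,
    so the stored value ends up as (old contents AND new data)."""
    return cell & data


def erase(block: list[int]) -> list[int]:
    """Toy model of an erase: the whole block goes back to all 1s."""
    return [ERASED] * len(block)


def can_program_in_place(cell: int, data: int) -> bool:
    """True if writing `data` over `cell` needs no 0 -> 1 transition,
    i.e. the write would land correctly without an erase first."""
    return (cell & data) == data


if __name__ == "__main__":
    fresh = ERASED
    stored = program(fresh, 0xA5)             # erased cell takes the data as-is
    print(hex(stored))
    print(can_program_in_place(stored, 0x5A))  # would need some 0 -> 1 flips
    print(hex(program(stored, 0x5A)))          # overwriting without erase corrupts
```

The point of can_program_in_place is just that an overwrite almost always needs at least one 0 to 1 flip, and that is what forces the erase.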

If there are no free blocks, or if for whatever reason the flash device elects to reuse a block you've already written to, it has to erase the entire block first, even if you're only writing one bit. This is relatively slow.

Most high-end flash devices, as I understand it, try to keep a pool of pre-erased blocks so that they can stay on the fast path, on average.

As far as I can tell, what they're saying in this article is that sometimes you can hit the slow path, and that causes high latency. On average, it's not a big deal, but for some random small number of writes, it can be an issue.

Note: This is my own opinion, and not that of my employer (Google) or based on trade secrets or other IP from my employer.

Keeping pre-erased blocks is useful, but it can only reduce the write latency to what the chip gives you. And the chip gives you a longer write latency than read latency, especially for MLC with smaller process sizes.

True, there is a difference between read and write latency, but at least that is consistent and therefore easy to plan for. Large variance makes things far more difficult, in my opinion.

It's actually not consistent. On MLC flash, individual write operations can vary by a factor of 6.

see: http://cmrr-star.ucsd.edu/starpapers/309-Grupp-1.pdf (PDF)

Having a pool of pre-erased blocks helps absorb spikes in write activity, but it doesn't let you escape the fact that, on average, an SSD will need to perform an erase operation every 128th page write (assuming 128 pages per block).
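
A back-of-the-envelope sketch of that arithmetic, in Python. The timings (900 µs per page program, 3000 µs per block erase) are placeholder numbers I picked for illustration, not measurements from any real part, and the model ignores garbage-collection copies and any pre-erased pool. It only shows that one write in every 128 pays the erase, so the average barely moves while the worst case jumps.

```python
def write_latencies(n_writes: int,
                    pages_per_block: int = 128,
                    t_prog_us: float = 900.0,    # assumed page program time
                    t_erase_us: float = 3000.0   # assumed block erase time
                    ) -> list[float]:
    """Per-write latency when every `pages_per_block`-th page write
    also has to pay for one block erase (no pre-erased pool)."""
    latencies = []
    for i in range(1, n_writes + 1):
        cost = t_prog_us
        if i % pages_per_block == 0:
            cost += t_erase_us  # this write is stuck behind an erase
        latencies.append(cost)
    return latencies


if __name__ == "__main__":
    lat = write_latencies(256)
    mean = sum(lat) / len(lat)
    print(f"mean  = {mean:.1f} us")
    print(f"worst = {max(lat):.1f} us")
    print(f"slow writes = {sum(1 for x in lat if x > 900.0)} of {len(lat)}")
```

A pool of pre-erased blocks moves the erase off the write's critical path, but the erase work itself still has to happen somewhere at the same average rate.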

If the device is kept at or near full utilization, eventually a fast operation (read) will get delayed behind a slower operation and your tail latencies will suffer.

I thought that's what I said. Guess I just suck at explaining things.
