
The creation of the io.latency block I/O controller
https://lwn.net/Articles/782876/
======
abhinai
So many algorithms, data structures, and patterns in computer science were
based on the idea that hard disks are slow and that their seek times are
especially slow. However, the introduction of solid-state drives has changed
this. These days I have almost no server with an actual spinning disk in it.

I wonder (1) how the software has evolved to face this new reality, (2) how
much of the old code is still being used, and (3) what performance penalties
we pay for code that assumes disk-drive performance characteristics but
actually runs on solid-state drives.

~~~
thatsaguy
Incidentally, SSDs also benefit from read and write locality. Although there's
no seek penalty, there's still a large benefit from dispatching multiple reads
and writes from/to the same cell. You get those for free by trying to minimize
"seeks", although the underlying logic behind this optimization is simpler.

------
zepearl
Sorry, I don't understand what this article is talking about. Is it about the
classical CFQ/Deadline I/O schedulers and/or the new multiqueue schedulers? Or
something completely different? Thx :)

~~~
ignoramous
(Per my understanding) This isn't about a new I/O scheduler. The io.latency
I/O controller sits a layer above the I/O scheduler and reduces the I/O queue
depth as a means to throttle write requests. The key insight in how to do this
per control group is to use a hierarchical non-blocking data structure to
represent related workloads, which can be classified into 'fast' and 'slow',
with 'fast' penalising the 'slow' group (by way of reducing the I/O queue
depth available to it) iff 'fast' sees higher read latencies. An 'unrelated'
group doesn't get penalised, as it isn't part of the same hierarchy, as it
were.
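As a rough sketch of the interface (assuming cgroup v2 with the io controller
enabled; the cgroup names, device number, and latency target below are
illustrative, not from the article):

```shell
# Hypothetical layout: a latency-sensitive 'fast' group and a batch
# 'slow' group as siblings under the cgroup v2 root.

# Enable the io controller for child groups.
echo "+io" > /sys/fs/cgroup/cgroup.subtree_control

mkdir -p /sys/fs/cgroup/fast /sys/fs/cgroup/slow

# Set a completion-latency target (in microseconds, so 10ms here) for
# 'fast' on block device 8:0. If 'fast' misses this target, the kernel
# shrinks the queue depth available to its sibling 'slow', throttling
# slow's I/O until fast's latencies recover.
echo "8:0 target=10000" > /sys/fs/cgroup/fast/io.latency
```

A group with no io.latency target set (or one living in a different subtree)
is the 'unrelated' case above: it never gets its queue depth cut on fast's
behalf.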

There are some gotchas, though. For instance, priority inversion can occur
when 'fast' needs more memory but 'slow' pages need to be written out first,
resulting in 'fast' being penalised indirectly by the write-throttling on
'slow'.

The approach here is reminiscent of HFSC qdisc for TCP/IP (which only a
handful of people understand?):
[https://www.cs.cmu.edu/~hzhang/HFSC/main.html](https://www.cs.cmu.edu/~hzhang/HFSC/main.html)

~~~
zepearl
Thx

------
shereadsthenews
It kinda boggles the mind that an org as successful and as well-staffed as FB
still does something as primitive as chef on a cron.

~~~
laggyluke
What would you do instead?

