
What's So Bad About Posix I/O? - rbanffy
https://www.nextplatform.com/2017/09/11/whats-bad-posix-io/
======
pmontra
Tldr

"POSIX I/O is simply not what HPC needs." but it's ok for almost any other use
case and "There are many ways to move beyond the constraints imposed by POSIX
I/O without rewriting applications, and this is an area of active research in
both industry and academia." The problem is that there's no standard solution
yet.

The post does a good job at explaining why HPC doesn't need the strong
guarantees provided by POSIX I/O and why they slow down those kinds of
applications. They are perfectly fine and desirable for most workloads on
single user workstations, such as web development (save a file in the editor
and the compiler gets its bytes exactly as you wrote them.)

~~~
gnufx
I'd have thought MPI-IO is the relevant standard for HPC i/o (perhaps under
something like HDF5). I'd expect HPC applications with serious i/o
requirements to be using that anyway.

~~~
glennklockwood
It is and people do use it, but it ultimately still exists as software that
sits on top of POSIX in the vast majority of cases. It can clean up really bad
POSIX I/O by buffering and reordering, but doing so induces latency and memory
overheads, and consumes other resources that don't always pay dividends at all
scales.

~~~
gnufx
ROMIO unfortunately requires global locking on Lustre, but I don't see how
that's relevant to the use of standard or other high-level (HDF, NetCDF, ...)
non-POSIX interfaces. You can swap in PVFS, or something else with a different
architecture. (So what if object stores are on _local_ POSIX-y filesystems?)

I'd have thought the salient feature is collective i/o, not cleaning up silly
POSIX i/o of some sort. Nothing's perfect, but what should replace the non-
blocking collective MPI-IO under HDF5, for instance? I wonder what the major
NERSC applications are that don't do sane i/o and should. The published workload
characterization doesn't help.

------
notacoward
The author brings up a lot of good points, but they're mostly good within the
context of HPC. Out in the broader universe where most data lives, many people
do require the full suite of POSIX namespace, metadata, and permission
semantics. Yes, even locking, no matter how many times we tell them that
relying on that in a distributed system is Doing It Wrong. I know because I
support their crazy antics at one of the biggest internet companies.

The author's on much firmer ground when he talks about POSIX consistency
semantics. While we can't control misguided applications' (ab)use of locks, we
can certainly offer models that don't require serializing things that the user
never wanted or expected to be serialized, or making them synchronous
likewise. As I've also written before[1], we need more ways to describe the
user's _desired_ level of ordering and durability, separately, giving the
implementation maximum flexibility to optimize I/O while preserving the
guarantees that users really care about. The setstream primitive in CCFS[2] is
a good example of the sorts of things we need here. I'm not at all convinced
that object-store-centric approaches are the way to go here, but some
evolution is clearly needed.

[1] [http://pl.atyp.us/2016-05-updating-
posix.html](http://pl.atyp.us/2016-05-updating-posix.html) section under
"fsync"

[2]
[https://www.usenix.org/system/files/conference/fast17/fast17...](https://www.usenix.org/system/files/conference/fast17/fast17_pillai.pdf)

~~~
glennklockwood
The context was definitely HPC, but that's mostly because it's the last place
where this problem persists in a very big way. The WORM workloads that serve
up most of the data we consume globally just rejected POSIX entirely and
created the scalable object interfaces we know and love. It was very obvious
that having every Roku in the world mount a Netflix file system over the WAN
was insane, so nobody had to write an article calling for the end of that bad
behavior.

Updating POSIX, whether it be through more expressive consistency options or
just giving middleware developers the knobs they need to implement such
mechanisms themselves, has come up before[1], as you say. My worry is that
many of the serious efforts from years past were ahead of their time, and
hyperscale has gone its own way, so the ship has sailed on POSIX I/O.

[1] [http://www.pdl.cmu.edu/posix/](http://www.pdl.cmu.edu/posix/)

------
jhallenworld
I'm surprised there is no mention of the lack of non-blocking filesystem reads
and opens. In other words, I should be able to submit a read request, and then
select()/poll()/epoll() for when the data is available like network I/O.

~~~
geocar
Most unixes (including Linux) always report files to be readable under
select()/poll()/epoll() so this doesn't actually help read from disks (even
when they're backed by network/shared storage). That's why you often see this
kind of seeming "nonsense" in HPC startup scripts:

    for x in *; do cat "$x" & done | pipeline ...

Furthermore, in a networked-environment, select()/poll()/epoll() actually
decrease concurrency compared to threads because _now_ you need another system
call to go fetch the data from the kernel. If I've got the CPUs I'd rather use
threads.
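
geocar's first point is easy to check (a minimal sketch using Python's wrappers over the POSIX calls; `tempfile` is just for a scratch file): select() reports a regular file as readable immediately, regardless of what is in the page cache.

```python
import os
import select
import tempfile

# Create an ordinary file and poll it for readability with a zero timeout.
fd, path = tempfile.mkstemp()
os.write(fd, b"some bytes")

readable, _, _ = select.select([fd], [], [], 0)
print(fd in readable)  # True: regular files always poll as "ready"

os.close(fd)
os.remove(path)
```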

In the Windows model (in contrast), the programmer says "when this data is
available (from this socket, file, etc), put it here (into this memory
buffer)" and then waits for the kernel to do _something_. As a result, you
skip that extra syscall -- but you also skip any other read you might have
needed to do for data that's arrived at (roughly) the same time. AIO[1] is
potentially similar, but it never seems to benchmark well in my experience.

[1]: [http://man7.org/linux/man-
pages/man7/aio.7.html](http://man7.org/linux/man-pages/man7/aio.7.html)

~~~
gpderetta
> Most unixes (including Linux) always report files to be readable under
> select()/poll()/epoll()

There is no other choice really. Let's say you want to read a specific page
from a file. No page is currently in the kernel buffers. If poll were to
return non-ready (and remember that poll has no idea which pages you are
interested in), what action, within POSIX, would you take to change it?

poll and friends just do not make sense for plain file descriptors. In Linux,
splice adds an interesting twist, but it is still awkward to use.

~~~
geocar
> Let's say you want to read a specific page from a file.

pread() is an unusual case. I'd be happy if poll() simply indicated read()
would succeed, and I'm happy doing an lseek() before every read().

However a pread() with O_NONBLOCK could return EAGAIN and poll() would _then_
know what pages I'm interested in.

> what action, within posix, would you take to change it?

The general (POSIX-compatible) strategy (as I indicated) has been to fork a
thread off to handle reading so that my application is notified when the data
is in the fifo. This works for me when I'm reading (say) five files from a
slow (or remote) storage, but it's not ideal since it causes a copy.

~~~
gpderetta
poll can't guarantee that the next read will succeed because, first of all, it
doesn't know how many pages you are interested in reading, and any relevant
page in the page cache might be evicted between the poll and the read.

Also, having pread, which is designed for concurrency and scalability, carry
state around for poll (an unbounded amount of state, btw: you can have multiple
threads issuing concurrent preads on the same fd) makes it completely
pointless.

There are Linux patches floating around to add flags to preadv2 so that, when
a page is not present, it returns EAGAIN; that way you can optimistically do a
sync read and fall back to a thread pool in the slow case. It still doesn't
involve poll and friends.

~~~
geocar
poll() can't guarantee that the next read will succeed anyway, because
_something else_ might drain the socket or fifo (e.g. another thread).
Programmers simply use O_NONBLOCK anyway and use poll() to wait until the
process should try again.

------
gh02t
Note that the title is a little click-baity. Specifically, this article is
about what is wrong with POSIX I/O _in high-performance, exascale computing_.
This article is about how the POSIX I/O model breaks down in a pretty extreme
use case, not that it's fundamentally broken.

~~~
convolvatron
the atomic write business is pretty painful in any distributed setting, and
only comes into play with concurrent writers with overlapping byte offset
regions.

I've always kind of struggled to imagine a case where an application would be
correct with this guarantee and faulty otherwise. i guess some sort of
cleverly designed structure with fixed length records where a compound
mutation could be expressed in a contiguous byte range?

~~~
Derbasti
Imagine two processes appending rows to a csv file. If writes were non-atomic,
they might insert partial rows, or overwrite each other.
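
This is the behavior O_APPEND is for. A small experiment (Python; it assumes a local filesystem, where appends of this size are atomic) shows two concurrent writers never tearing each other's rows:

```python
import os
import tempfile
from multiprocessing import Process

fd, path = tempfile.mkstemp()
os.close(fd)

def append_rows(tag, n=200):
    # With O_APPEND, each write() lands atomically at the current
    # end-of-file, so rows from concurrent writers never interleave.
    wfd = os.open(path, os.O_WRONLY | os.O_APPEND)
    for i in range(n):
        os.write(wfd, f"{tag},{i},payload\n".encode())
    os.close(wfd)

procs = [Process(target=append_rows, args=(t,)) for t in ("a", "b")]
for p in procs:
    p.start()
for p in procs:
    p.join()

rows = open(path).read().splitlines()
print(len(rows))                             # 400
print(all(r.count(",") == 2 for r in rows))  # True: no torn rows
os.remove(path)
```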

~~~
convolvatron
the append case is actually different, it requires serializing the end of file
pointer.

in the write atomicity case we're asking the file system to order all the
writes, and make sure the resulting file shows each of them being completed in
their entirety before the next is applied, without any interleaving, even if
they refer to the same byte offsets in the file.

------
amluto
> A typical application might open() a file, then read() the data from it,
> then seek() to a new position, write() some data, then close() the file.
> File descriptors are central to this process; one cannot read or write a
> file without first open()ing it to get a file descriptor, and the position
> where the next read or write will place its data is generated by where the
> last read, write, or seek call ended.

pwrite() doesn't have this problem at all, and my copy of the "POSIX.1-2008
with the 2013 Technical Corrigendum 1 applied" man pages (type "man 3p pwrite"
on your local up-to-date Linux box) mentions a function called pwrite(). I'm
pretty sure that this makes pwrite() an example of POSIX I/O.
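
Concretely (Python's os.pread is a thin wrapper over the POSIX call): pread() takes an absolute offset and leaves the descriptor's offset untouched, so no seek/read critical section is needed.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"hello world")

# pread() reads at an explicit offset; the fd's own offset is unchanged,
# so concurrent readers of one descriptor need no external locking.
data = os.pread(fd, 5, 6)
pos = os.lseek(fd, 0, os.SEEK_CUR)

print(data)  # b'world'
print(pos)   # 11 -- still at the end of the earlier write()

os.close(fd)
os.remove(path)
```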

> Because the operating system must keep track of the state of every file
> descriptor–that is, every process that wants to read or write–this stateful
> model of I/O provided by POSIX becomes a major scalability bottleneck as
> millions or billions of processes try to perform I/O on the same file
> system.

This makes no sense. If you have millions or billions of processes on one
machine trying to perform I/O on the same file system, you probably have
scalability problems, and those problems have nothing whatsoever to do with
POSIX I/O.

If, on the other hand, you have millions or billions of processes performing
I/O to the same _networked_ filesystem, then you certainly need to think
carefully about scaling, and POSIX semantics may well get in the way, but the
problem you face here has nothing to do with the fact that POSIX has file
descriptors. This is because file descriptors are local to each machine, and
they very much don't synchronize their offsets with each other.

(Unless you use O_APPEND. Having millions of processes O_APPENDing to the same
file is _nuts_ , unless your filesystem is designed for this, in which case
it's probably just fine.)

~~~
unwind
_type "man 3p pwrite" on your local up-to-date Linux box_

Or, of course, type "man 3p pwrite" into Google; there are many man-page sites.

I typically use die.net [1], which is among my top search hits. It could do
with some cleaning of the web versions, but it's fine.

[1] [https://linux.die.net/man/3/pwrite](https://linux.die.net/man/3/pwrite)

------
jstimpfle
> the guarantee that data has been committed to somewhere durable when a
> write() call returns without an error is a semantic aspect of the POSIX
> write() API

No, that's just flat out wrong.

> POSIX I/O is stateful

This is fundamental to the authorization model. Authorization happens at file
open time. It's also what enables the stream abstraction.

The title of the article is really bold. POSIX I/O solves a common problem
just fine (it's not perfect, but not for the reasons given in the article, and
we don't know how to do it much better). I don't know anything about the
domain the author talks about (HPC), but it seems what he needs is basically
direct access to the block device. Or writing away through network sockets /
using a database.

~~~
aidenn0
>> the guarantee that data has been committed to somewhere durable when a
write() call returns without an error is a semantic aspect of the POSIX
write() API

> No, that's just flat out wrong.

from "man 3p write":

    After a write() to a regular file has successfully returned:

     *  Any successful read() from each byte position in the file that was
        modified by that write shall return the data specified by the
        write() for that position until such byte positions are again
        modified.

     *  Any subsequent successful write() to the same byte position in the
        file shall overwrite that file data.

>> POSIX I/O is stateful

> This is fundamental to the authorization model. Authorization happens at
> file open time. It's also what enables the stream abstraction.

This is very true, but in the workloads the author is talking about, there are
often times that a stateless API would enable a more efficient implementation.
Think about what is going on in your file server when you have 100k clients
all accessing the same open file.

> I don't know anything about the domain the author talks about (HPC), but
> it seems what he needs is basically direct access to the block device. Or
> writing away through network sockets / using a database.

The author is talking about (possibly distributed) networked filesystems
backing clusters with extreme levels of parallelism (minimum 100s of nodes
with 10s of processors on each node, and it gets much bigger). As far as
"using a database" that falls under the category of a user-space I/O stack,
where the (userspace) database is proxying the I/O to reduce state.

The title of the article isn't at all bold in context, because it is well
accepted in HPC that POSIX I/O is the bottleneck for certain types of loads,
and the author is clarifying to those not familiar with the details why this
is true.

~~~
nwatson
For this ...

    Any successful read() from each byte position in the file that was
    modified by that write shall return the data specified by the write()

... perhaps I can re-open() and re-read() the same byte value written by
another process for the same file, but the file contents may not have been
fully flushed all the way to disk. The file contents may be "durable" across
processes on the same running OS that mount the same filesystem ... but if the
OS happens to die before the data is flushed, then perhaps after reboot the
open()/read() will return an older value previously written.

The semantics of "durability" are a squishy concept.

~~~
aidenn0
Yes, the term "durable" was perhaps a poor choice of words, but the paragraph
that followed made it clear that they were aware of the specific requirements
(specifically mentioning making dirty caches available to all processes).

~~~
dom0
_Durable_ has a very specific meaning for I/O and is just not correct here.

POSIX intentionally does not specify any durability at all (e.g. a no-op
fsync is explicitly permitted).
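
In code terms (a Python sketch of the distinction being drawn): write() only promises visibility to other readers; durability is requested separately via fsync(), which POSIX still allows to be a no-op.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()

os.write(fd, b"state")  # immediately visible to any other reader of the file
os.fsync(fd)            # only now is durability *requested* -- and POSIX
                        # still permits fsync to be a no-op on some systems

with open(path, "rb") as f:
    contents = f.read()
print(contents)  # b'state'

os.close(fd)
os.remove(path)
```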

~~~
dullgiulio
And in practice it is close to that, due to hardware cheating with a cache.

------
davidmr
I’ve spent most of my career specifically dealing with HPC I/O from a systems
and applications perspective. I worked in academia for a while, then in the
DOE (almost indistinguishable from academia except worse bureaucracy;
shocking, I know), and finally in private industry.

I don’t think I’ve ever read a more spot-on description of the problem. Once I
left the DOE, I realized that the problem was far more acute than I had
thought. At least on the real big systems, most of our work was with a small
number of research groups basically doing the same workflow: start your job,
read in some data, crunch, every 15 minutes or something slam the entire
contents of system memory (100s of TB) out to spinning disk, crunch, slam,
your job gets killed when your time slice expires, get scheduled again, load
up the last checkpoint, rinse, repeat. Because of the sheer amount of data,
it’s an interesting problem, but you could generally work with the researchers
to impose good I/O behavior that gets around the POSIX constraints and the
peculiarities of the particular filesystems. You want 100,000,000 CPU hours on
a $200M computer? You can do the work to make the filesystem writes easier on
the system.

Coming into private industry was a real eye-opener. You’re in-house staff and
you don’t get to say who can use the computer. People use the filesystems for
IPC, store 100M files of 200B each, read() and write() terabytes of data 1B at
a time, you name it. If I had $100 for every job in which I saw 10,000 cores
running a stat() in a while loop waiting for some data to get written to it by
one process that had long since died, I’d be retired on a beach somewhere.

The problem with POSIX I/O is that it’s so, so easy and it almost always works
when you expect it to. GPFS (what I’m most familiar with) is amazing at
enforcing the consistency. I’ve seen parallel filesystems and disk break in
every imaginable way and in a lot of ways that aren’t, but I’ve _never_ seen
GPFS present data inconsistently across time, where some write call was
finished and its data didn’t show up to a read() started after the write got
its lock, or a situation where some process opened a file after the unlink was
acknowledged. For a developer who hasn’t ever worked with parallel computing
and whose boss just wants them to make it work, the filesystem is an amazing
tool. I honestly can’t blame a developer who makes it work for 1000 cores and
then gets upset with me when it blows up at 1500. I get grouchy with them, but
I don’t blame them. (There’s a difference!)

But as the filesystems get bigger, the amount of work the filesystems have to
do to maintain that consistency isn’t scaling. The amount of lock traffic
flying back and forth between all the nodes is a lot of complexity to keep up
with, and if you have the tiniest issue with your network even on some edge
somewhere, you’re going to have a really unpleasant day.

One of the things that GCE and AWS have done so well is to just abandon the
concept of the shared POSIX filesystem, _and_ produce good tooling to help
people deal with the IPC and workflow data processing without it. It’s a hell
of a lot of work to go from an on-site HPC environment to GCE though. There’s
a ton of money to be made for someone who can make that transition easier and
cheaper (if you’ve got it figured out, you know, call me. I want in on it!),
but people have sunk so much money into their parallel filesystems and disk
that it’s a tough ask for the C-suite. Hypothetically speaking, someone I know
really well who’s a lot like me was recently leading a project to do exactly
this that got shut down basically because they couldn’t prove it would be
cheaper in 3 years.

------
Derbasti
It seems like the OP really wants to use a database instead of a file system.

~~~
davidmr
I don't think so; HPC is a different beast. There are very few, if any,
organizations on earth that have a database that can write a sustained
1TByte/second+ from as few as 300-400 nodes to 100s of PB of stateful media for
years on end.

Ken Batcher (maybe of OSU?) has a quote that the rest of us have been using
for years: "Supercomputers are a tool for converting a CPU-bound problem into
a HPC-bound problem."

The filesystems start to look more like databases over time, but it's not like
they can throw down a nice Cassandra cluster and have it pick up the slack.
I'm not saying it will never happen, but I don't think it's an option at the
moment.

~~~
davidmr
Rather "into an I/O-bound problem". Apologies for ruining Ken's great joke.

------
fh973
Now working on my second parallel file system, I am still amazed at how the
(POSIX) file system interface has stood the test of time and has been
reimplemented in such diverse forms, from local file systems to file systems
that are essentially modern large-scale fault-tolerant distributed systems.
Completely different languages, platforms, technologies.

Sure, it is not easy to implement a file system and the details are subtle,
but the abstraction gives you the necessary freedom to build very scalable and
high performance systems and at the same time provides applications with a
well-defined set of mechanisms to solve persistence.

It might be the most stable and versatile interface that we have in computing?

------
benlorenzetti
Basic message: big parallel I/O systems need to shift responsibility for
maintaining file state, moving away from the operating system and towards the
application.

The POSIX API has a pretty simple view of file state with open(), read(),
write(), and close(), but this interface does not scale well concurrency-wise.
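
That stateful sequence in miniature (Python wrappers over the POSIX calls): the kernel carries a per-descriptor offset from one call to the next, which is exactly the state a parallel file system must mirror for every open.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"abcdef")
os.lseek(fd, 0, os.SEEK_SET)

# Each read() consumes the per-descriptor offset the kernel tracks for us.
first = os.read(fd, 3)
second = os.read(fd, 3)

print(first)   # b'abc'
print(second)  # b'def' -- the offset carried over between calls

os.close(fd)
os.remove(path)
```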

~~~
notacoward
That part scales fine. There are other problems with the POSIX open/read/write
model, such as being too strict wrt consistency and too loose wrt ordering and
durability, but it _scales_ OK. It's the metadata/namespace operations that
are hard to scale. I've been working on distributed filesystems for a long
time. If all I had to deal with was reads and writes I'd be a happy (OK, less
angry) man.

~~~
benlorenzetti
I trust your experience over my own in this area; just reading this piece with
curiosity. That makes sense, metadata and namespace operations would be
difficult when distributed. But isn't the graph in the article specifically
about file handle state maintained by the OS? I guess the graph is weird
though it wants to be logarithmic but the power of the series keeps
decreasing...what type of graph is that?

------
ape4
The POSIX aio functions. [http://man7.org/linux/man-
pages/man7/aio.7.html](http://man7.org/linux/man-pages/man7/aio.7.html)

------
AstralStorm
What's so bad about POSIX IO? People not knowing that async write APIs exist.
Additionally, that mmap exists to bypass all the atomicity guarantees. Both
are in POSIX.

------
comex
Some of the claims in this article are total nonsense.

The article points out that file descriptors are stateful - they have a
current offset associated with them, and I/O syscalls update that offset -
then claims that this "stateful model" could be a "scalability bottleneck"
with "billions of processes" in parallel filesystems. Except that each of
those billions of processes will have its own file descriptor with its own
offset! The only case where this could cause contention is if multiple threads
in a single process are trying to read from the _same_ file descriptor, which
would be a really bad idea in the first place, precisely because file
descriptors are stateful. Just open multiple descriptors - or use
pread/pwrite, which have existed for a long time. Perhaps the process-wide
statefulness of many POSIX APIs is a bad design in a world with threads, but
it has nothing to do with the concurrent-file-open benchmark in the article,
or really any other performance problems with parallel filesystems.

Anyway, file descriptors are just per-process handles used for communicating
with the kernel. At least in principle, there's no reason that remote
filesystems should know or care about file descriptors on the client end,
unless clients are using file locks (well, except those aren't file-
descriptor-based anyway, although they should be).

Later, the article claims:

> While the POSIX style of metadata certainly works, it is very prescriptive
> and inflexible; for example, the ownership and access permissions for files
> are often identical within directories containing scientific data (for
> example, file-per-process checkpoints), but POSIX file systems must track
> each of these files independently.

[..]

> Supporting the prescriptive POSIX metadata schema at extreme scales is a
> difficult endeavor; anyone who has tried to ls -l on a directory containing
> a million files can attest to this.

POSIX access bits are literally 15 bits per file. uid and gid are a few bytes.
The overhead of storing these for each file definitely isn't what's making ls
-l slow. Perhaps there's some system where time spent _checking_ them all, at
access time, is a bottleneck, but I'd be very surprised if modern Linux was
such a system; that kind of problem sounds easy to solve with some basic
caching.

The article calls out created and modified times as another part of the
metadata, thereby amusingly missing the only one of the three POSIX file
timestamps - access time - that actually can cause big scalability issues (if
left enabled).

Also, apparently the author has not heard of either ACLs, which are in POSIX,
or xattrs, which are pseudo-POSIX (multiple systems have roughly compatible
implementations based on an old POSIX draft) - both of which try to improve
flexibility compared to classic POSIX metadata. There are problems with both
of them, but you'd think they'd at least deserve a mention in the list of
alternatives.

~~~
notacoward
> Except that each of those billions of processes will have its own file
> descriptor with its own offset!

The offset is only a tiny part of how POSIX is stateful. The very fact that
each read or write is associated with a particular fd, therefore with a
particular authorization and lock context, is more of an issue at the servers.
Even more of an issue is the possibility of still-buffered writes, which POSIX
does require be visible to reads on other fds.

> At least in principle, there's no reason that remote filesystems should know
> or care about file descriptors

Untrue, and please don't try to "correct" others with your own inaccurate
information. As I just said, each file descriptor (or file handle in NFS) has
its own authorization and lock context, which must be enforced at the
server(s) so knowledge of them can't be limited to the client.

> POSIX access bits are literally 15 bits per file. uid and gid are a few
> bytes.

Also mtime and atime, and xattrs which can add up to kilobytes, but more
importantly what the author was really talking about was _namespace_
information rather than per-file metadata. It's a common mistake. Even as
someone who writes code to handle both of these separate concerns, I'm not
enough of a pedant to whine every time an application programmer gets my
domain's terminology wrong.

> the only one of the three POSIX file timestamps - access time - that
> actually can cause big scalability issues (if left enabled).

Untrue yet again. Mtime can be a problem too, as can st_size and st_blocks. In
an architecture where clients issue individual possibly-extending writes
directly to one of several data servers for a file but other clients can then
query these values through a separate metadata server, that creates a serious
aggregation problem. That's why I think the separate ODS/MDS model (as in
Lustre) sucks. People resort to it because it makes the namespace issue
easier, but it makes metadata issues harder. In the particular use cases where
people have to stick with a filesystem instead of switching to an object
store, it's a net loss.

------
contingencies
Relevant quotes from
[http://github.com/globalcitizen/taoup](http://github.com/globalcitizen/taoup)
...

 _Optimization considered harmful: In particular, optimization introduces
complexity, as well as introducing tighter coupling between components and
layers._ \- RFC3439

 _Design up front for reuse is, in essence, premature optimization._ \-
@AnimalMuppet

 _To speed up an I/O-bound program, begin by accounting for all I/O.
Eliminate that which is unnecessary or redundant, and make the remaining as
fast as possible._ \- David Martin

 _The fastest I/O is no I/O._ \- Nils-Peter Nelson

 _The cheapest, fastest and most reliable components of a system are those
that aren't there._ \- Gordon Bell

 _Safety first. In allocating resources, strive to avoid disaster rather than
to attain an optimum. Many years of experience with virtual memory, networks,
disk allocation, database layout, and other resource allocation problems has
made it clear that a general-purpose system cannot optimize the use of
resources._ \- Butler W. Lampson (1983)

 _Crowley's 4th rule of distributed systems design: Failure is expected. The
only guaranteed way to detect failure in a distributed system is to simply
decide you have waited 'too long'. This naturally means that cancellation is
first-class. Some layer of the system (perhaps plumbed through to the user)
will need to decide it has waited too long and cancel the interaction.
Cancelling is only about reestablishing local state and reclaiming local
resources - there is no way to reliably propagate that cancellation through
the system. It can sometimes be useful to have a low-cost, unreliable way to
attempt to propagate cancellation as a performance optimization._

 _Optimization: Prototype before polishing. Get it working before you optimize
it._ \- Eric S. Raymond, The Art of Unix Programming (2003)

 _Before optimizing, use a profiler._ \- Mike Morton

 _Spell create with an 'e'._ \- Ken Thompson (referring to design regrets on
the UNIX creat(2) system call and the fallacy of premature optimization)

 _The No Free Lunch theorem: Any two optimization algorithms are equivalent
when their performance is averaged across all possible problems (if an
algorithm performs well on a certain class of problems then it necessarily
pays for that with degraded performance on the set of all remaining
problems)._

 _An efficient program is an exercise in logical brinkmanship._ \- Edsger
Dijkstra

 _Choose portability [high level] over efficiency [low-level]._ \- Mike
Gancarz: The Unix Philosophy

 _Laziness is the mother of efficiency._ \- Marian Propp

 _Jevons Paradox: As technology progresses, the increase in efficiency with
which a resource is used tends to increase (rather than decrease) the rate of
consumption of that resource._

 _It brings everything to a certainty, which before floated in the mind
indefinitely._ \- Samuel Johnson, on counting

... and the kicker...

 _Those who don't understand Unix are condemned to reinvent it, poorly._ \-
Henry Spencer

~~~
userpass
And this is how we get incredibly inefficient programs like Atom.

