
The Greatest Bug of All - naish
http://wilshipley.com/blog/2008/07/pimp-my-code-part-15-greatest-bug-of.html
======
nadim
Wil Shipley's blog is recommended reading, all the archives, all of it...
great stuff. Well written, relevant and insightful.

------
icey
That interlude was great.

~~~
biohacker42
Interludes like that should be part of everything. I want Steve Yegge to have
a Scarlett Johansson mini rant in the middle of his blog posts.

~~~
icey
I'm not gonna lie, it would make me read the whole thing.

------
tptacek
Memory mapping is not faster than read(2) for typical access patterns.

~~~
ajross
More to the point: if the data is really on the disk, an API change isn't
going to make the hardware run faster.

But a mapped region certainly can be faster for random access reads from a
file. Doing a seek() followed by a read() takes two system calls (and thus
four context switches) in addition to any I/O overhead. Faulting the page in
from an existing mapping instead takes only one interrupt handler at worst,
and potentially completes without kernel intervention at all if the page is
already mapped.
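
To make the two access paths concrete, here is a minimal sketch, assuming a Unix-like system and Python's standard `os` and `mmap` modules. The helper names (`read_with_seek`, `read_with_mmap`, `demo`) and the 16-page scratch file are made up for illustration. It only shows where the explicit system calls sit; it does not measure anything.

```python
import mmap
import os
import tempfile

PAGE = 4096  # assumed page size, used only to lay out the scratch file


def read_with_seek(fd, offset, length):
    # Two explicit system calls per random access: lseek(2), then read(2).
    os.lseek(fd, offset, os.SEEK_SET)
    return os.read(fd, length)


def read_with_mmap(mapping, offset, length):
    # No explicit system call here: this is a memory access into the mapping.
    # The kernel gets involved only if the page has to be faulted in.
    return mapping[offset:offset + length]


def demo():
    # Hypothetical scratch file: 16 pages, page i filled with byte value i.
    with tempfile.TemporaryFile() as f:
        for i in range(16):
            f.write(bytes([i]) * PAGE)
        f.flush()
        fd = f.fileno()
        with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as m:
            offsets = [5 * PAGE, 2 * PAGE + 100, 15 * PAGE]
            return [
                (read_with_seek(fd, o, 4), read_with_mmap(m, o, 4))
                for o in offsets
            ]


if __name__ == "__main__":
    for via_read, via_map in demo():
        print(via_read, via_map)
```

Both paths return the same bytes; the difference being argued about is only how many times each access crosses into the kernel.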

~~~
dhess
Typically a system call only leads to a context switch if the call will block,
or if the caller has used up its timeslice; otherwise, it's just a couple of
relatively cheap protection domain crossings.

~~~
ajross
A system call _is_ a pair of context switches: swap the registers, swap the
stack, change the TLB mappings, on some architectures flush the cache. Even
the fastest system calls (Linux on amd64) take a thousand clock cycles or
more.

~~~
tptacek
Technically, a system call (on i386) is only changing CS, EIP, and ESP. You
don't have to change CR3 or flush the TLB. "Swapping stacks" isn't expensive.

~~~
ajross
It's a thousand clock cycles more than a function call (seriously, I'm not
making this up: time it sometime). I guess whether that is "expensive" or not
is context dependent. For some applications (e.g. database servers) that make
lots of random access reads from large files, using a mapping can be
overwhelmingly faster.

I guess I'm stunned that this turns out to have been such a controversial
notion...

~~~
tptacek
Can you be specific about which part of the system call you're talking about?
Because you said "swapping the stack" (which doesn't take a thousand clock
cycles) and "flushing the TLB" (which doesn't happen in OSs where the kernel
occupies a permanent part of the VM address space).

I'm sorry, this isn't controversial. All I said was, "mmap isn't faster than
read in typical use cases". But then we all got really specific talking about
ESP and CS and CR3 and now we have something to go back and forth on, which is
kind of fun, and I might learn something new. Didn't mean to snipe at you.

~~~
ajross
To be clear: I didn't say "flush the TLB". Nonetheless unless you happen to be
lucky and have those kernel mappings in there, those TLB entries need to be
faulted in. That's expensive. Likewise the new stack isn't in L1 cache and
needs to be read in from main memory. Likewise the kernel code to execute the
handler needs to be read into the instruction cache, etc...

Given that main memory reads are pushing 100 cycles on a modern box, all those
things add up. A context switch (my usage: meaning a bounce to a non-local,
non-current execution environment) is really expensive.

~~~
tptacek
Come on. Userland code constantly blows L1 cache. Almost any library call will
churn the icache. For that matter, any memory access can collide the TLB. None
of those things are syscall-specific. If you were concerned about L1 and
icache, you needed to write your code meticulously to accommodate them anyways.

Yes, cache is important. But there's a world of difference between the "flush
the caches and the TLB" behavior that system calls used to incur and the
"expensive relative to local variable access" behavior we're talking about
here.

"A bounce to non-local, non-current execution environment" literally doesn't
mean anything. You could be talking about the CR3 change that swaps page
hierarchies, the CPL3->0 change that allows privileged instructions, the CS
change that got you there, or even the ESP change to "swap stacks". Which of
these are expensive?

I'm chasing you down because it's fun, not because I think you don't know what
you're talking about.

But, let's make this relevant: if system call overhead is drowned by disk I/O
overhead, then one argument for mmap goes away.

~~~
ajross
Why do you insist on making this a flame war? Sit down and write a program
that makes 10M trivial system calls (getpid() is a good choice) and one that
calls a function to return a constant integer 10M times.

I'm sorry, but you seem to have a _wildly_ inflated idea of the speed of
system calls on modern OSs. The real world just doesn't work like that. This
is my last post on this thread.

~~~
spc476
I did that a few months ago (<http://boston.conman.org/2007/11/30.1>) and
found some surprising results (<http://boston.conman.org/2007/12/02.1>) which
showed that using getpid() wasn't a good test of system call overhead.
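
For anyone who wants to try the experiment suggested above, here is a rough sketch, assuming a Unix-like system and Python's standard `timeit` module. It times `os.getppid()` rather than getpid(), on the assumption that it always enters the kernel, whereas some C library versions cached the getpid() result in user space. The names `constant` and `measure` are hypothetical. Interpreter overhead inflates both numbers, so treat the output as a relative hint, not a cycle count; a C version would be needed for anything precise.

```python
import os
import timeit


def constant():
    # Stand-in for "a function that returns a constant integer".
    return 42


def measure(n=100_000):
    # Average seconds per call for a plain function and for a trivial syscall.
    user = timeit.timeit(constant, number=n)
    kernel = timeit.timeit(os.getppid, number=n)
    return user / n, kernel / n


if __name__ == "__main__":
    per_call, per_syscall = measure()
    print(f"function call: {per_call * 1e9:.0f} ns")
    print(f"getppid():     {per_syscall * 1e9:.0f} ns")
```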

~~~
tptacek
That is a really nice catch.

------
ConradHex
>The old adage (not mine) is that 99% of the time operating system bugs are
actually bugs in your program, and the other 1% of the time they are still
bugs in your program, so look harder, dammit.

I agree in general, _but_... I have run into a few platform bugs in my time.
For example, at one point realloc in the standard C library that came with
Visual Studio would break if you (IIRC) allocated a bunch of blocks that were
over 16k, and realloced them to be smaller than 16k. And I had to figure that
out the hard way, because the bug was making our code crash.

Also, on most game consoles, I'd say about a quarter of the time you _think_
something is a platform bug, it really is. Seriously. Compiler bugs and SDK
bugs just aren't all that uncommon there.

------
culley
"Now, since you're not a PhD student, you like money, so this is bad." -
great!

------
DenisM

      Remember that the very first thing you do, when looking
      at any bug, before you even start thinking about it, and
      long before you look at your code, is replicate it. You
      can't debug what you can't replicate, and user reports
      are usually lacking in some details that your trained eye
      will catch.
    

So if you don't have a repro, you don't investigate the bug? Not a good way to
treat your customers.

~~~
nertzy
I think you're reading it a bit too extremely.

His point is that you always try to reproduce the bug before you assume you
understand it.

His comments don't say anything about what to do if you're unable to
reproduce. He's just suggesting in which order he thinks you should take
certain actions.

It's altogether too easy to jump straight to a patch, submit it, and miss the
forest for the trees because you never actually went through the use flow and
realized you were patching a symptom, not the underlying problem.

~~~
DenisM
It's still a dangerous path to go down, at least in systems software. If your
system collapses after 48 hours of stress load, what do you do then? How about
48 days? Still looking for repro? What if the customer can't or won't hand over
the data?

One will do well to learn debugging things from the least amount of information -
log files, crash dumps etc. Doing this consistently even for the easy bugs
with repro will teach developers to put more information into log files and
make data structures easier to discover within dumps. Then once hard problems
come you will be ready.

~~~
nertzy
I understand your sentiment but frankly I disagree.

Your first paragraph presents a straw man. Sure, in extreme cases, reproducing
a bug is not mandatory. Again, the author does not claim that reproducing a
bug is mandatory, but rather that it is a useful practice.

It is not a virtue to try to base your work on the least amount of
information.

You say that "when the hard problems come you will be ready". But really when
the hard problems come you will look at them assuming that the logged
information is enough to solve them. No programmer can know ahead of time what
information to log, so by purposefully blindfolding yourself from experiencing
the bug directly, you might miss the bigger picture.

Indeed, I can't think of a reasonable logging mechanism that the author could
have thought of ahead of time that would have helped with this particular bug.
Emphasis on "reasonable".

~~~
DenisM
Rather, when the hard problems come the logged information will be the only
thing you will have. Eventually you will solve all the easy problems with
repro, and you will only have hard problems left. It's not a hypothetical,
that is actually one step in the evolution of systems software.

