Did you do each of these after a clean reboot, or are we looking at possible caching effects from the kernel? If any part was in cache, then we might be just comparing shared memory against IPC, and that's an obvious performance win, but not really what's intended to be examined here.
The first numbers seem to imply that it takes `pread` as long to copy bytes from memory as it does to fetch them from disk. For a quick back-of-the-napkin check, let's assume that disk I/O accounts for 100% of this workload and that local memory is one order of magnitude faster. In that case, I would expect the difference for an optimized implementation to be at most about 10%.
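To make that napkin math concrete, here's a tiny sketch of the model I have in mind (the 10x disk-vs-memory ratio is an assumption, not a measurement):

```python
# Model: pread pays for the disk fetch plus a memory copy into the
# caller's buffer; mmap pays for the fetch only. If the copy is ~10x
# faster than the fetch, eliminating it saves at most ~1/11 of the total.
disk_time = 1.0                # normalized disk fetch cost (assumed)
copy_time = disk_time / 10.0   # assumed: memory is one order of magnitude faster

pread_total = disk_time + copy_time   # fetch, then copy
mmap_total = disk_time                # fetch only; pages mapped directly

max_speedup = (pread_total - mmap_total) / pread_total
print(f"expected upper bound on improvement: {max_speedup:.0%}")  # roughly 9%
```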
I do think it is true that there are scenarios where the file mmap is faster, or that certain operations on each kernel might fall off a cliff. I just find it hard to believe that `mmap` would be as much faster as shown here in a typical situation (e.g. after a clean reboot, doing about the same amount of work, issuing optimal syscalls, with the OS/kernel not doing anything foolish).
Yes, the file is in cache, and that was my intent. That's why my `time` output says `faults 0` for both runs. That is, no page faults occurred. Everything is in RAM.
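For what it's worth, here's a rough way to sanity-check the "everything is in RAM" claim from inside a script (Unix only; `ru_majflt` counts page faults that actually had to hit disk, which is what the `faults 0` output reflects):

```python
import mmap
import os
import resource
import tempfile

# Write a small file, then read it back two ways. Since we just wrote
# it, it should sit in the page cache, so neither read should major-fault.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"corpus" * 100_000)
    path = f.name

fd = os.open(path, os.O_RDONLY)
size = os.fstat(fd).st_size

before = resource.getrusage(resource.RUSAGE_SELF).ru_majflt
via_pread = os.pread(fd, size, 0)            # kernel copies out of the page cache
with mmap.mmap(fd, size, prot=mmap.PROT_READ) as m:
    via_mmap = bytes(m)                      # pages mapped in directly, then copied
after = resource.getrusage(resource.RUSAGE_SELF).ru_majflt

assert via_pread == via_mmap                 # both paths see the same bytes
print(f"major page faults during reads: {after - before}")  # typically 0 here
os.close(fd)
os.unlink(path)
```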
That is indeed a very common case for ripgrep, where you might execute many searches against the same corpus repeatedly. Optimizing that use case is important.
For cases where the file isn't cached, then it's much less interesting, because you're going to just be bottlenecked on disk I/O for the most part.
> then we might be just comparing shared memory against IPC, and that's an obvious performance win, but not really what's intended to be examined here.
Please don't take my comment out of context. I was specifically responding to this rather sweeping claim with actual data:
> This old myth that mmap is the fast and efficient way to do IO just won't die.
You might think the fact that this isn't a myth is "obvious," but clearly, not everyone does. The right remedy for that is data, not back-of-the-napkin theorizing. :-)
On Linux at least, you do not need to do a clean reboot to measure something without cache. You can drop the file cache with `sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'`.