Removing a bottleneck in pv to get it to 123 GB/s?! I thought: cool! This kind of work is right up my alley! I'm curious, let's see how this person did it. I read that reddit post and I see the guy is mrb_... My own work, from 4 years ago. I completely did not remember :)
I don't know this mrb_ guy you're referring to, but is he the same person as _mrb or mrb? Whoever he is, he's clearly a snake_case kind of guy... next I expect to see m_rb because that's the only other form of private member variables I've seen in C++.
I’m a little surprised there are no references to the GNU coreutils yes source. I love speculation as much as the next guy, but a lot of it could be settled by reading the source code. It’s fascinating that the BSD code is linked but not the primary subject!
They recently-ish lifted the restriction on only replying/voting in threads 6 months and younger - I'm not sure if that also applies to editing, though, nor how far back it goes
Thanks for the link. That was what I was thinking about when writing my response about "complex code to do trivial things is kind of a meme", but I couldn't find it.
Legal reasons; it's in a section titled "Legal Issues". The idea is if you're cloning an existing program, it needs to be clear that you're not copying code from the original. This may be less of an issue with the other free implementations we have now, but a lot of these were clones of proprietary UNIX programs, and then it's really important that you can tell it's not derived from the proprietary version.
The throughput of GNU yes may be higher, but the simpler unbuffered implementations probably have a shorter latency to first line output, and since in most cases the program on the other end of the pipe only reads a handful of lines, or more likely just one, this means they will actually be faster.
> but the simpler unbuffered implementations probably have a shorter latency to first line output
I tested this by taking the FreeBSD (buffered) implementation, stripping all the iterations, and comparing it against a version which also strips all the buffering (so the latter would just `write(STDOUT_FILENO, "y\n", 2)`, and the former would first fill an 8k buffer then write that).
The unbuffered implementation has an edge of about 1%, with a variance above 10%. The "shorter latency" is essentially nonexistent.
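For anyone who wants to repeat that comparison, here is a rough sketch of what the two stripped-down variants might look like. It is not the FreeBSD source: the function names and `BUF_SIZE` are made up for illustration, and only the `write` call and the 8k buffer size come from the description above.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define BUF_SIZE 8192 /* the "8k buffer" from the description above */

/* Unbuffered variant: a single write of one "y\n" line. */
static ssize_t emit_unbuffered(int fd)
{
    return write(fd, "y\n", 2);
}

/* Buffered variant: fill the whole buffer with "y\n", then write it once. */
static ssize_t emit_buffered(int fd)
{
    char buf[BUF_SIZE];

    assert(BUF_SIZE % 2 == 0); /* buffer must hold whole lines only */
    for (size_t i = 0; i + 2 <= sizeof buf; i += 2)
        memcpy(buf + i, "y\n", 2);
    return write(fd, buf, sizeof buf);
}
```

Calling one of these with `STDOUT_FILENO` from a `main` that exits immediately, and timing many runs of each binary, should be enough to see whether the gap is distinguishable from noise on a given machine.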
On the other hand, the program you are piping it to likely needs to do some setup as well. Add the fact that shells probably spawn the pipeline in order and that setting up the buffer is incredibly cheap, and it would be incredibly rare for the first write not to be done before the program did its first read. ...of course this can be more complicated if you are CPU starved and these programs are competing.
Probably the more useful metric is total efficiency. How many interactions need to occur before the setup cost is made back up? And how many lines are read on average from yes?
Using puts, fputs or fwrite (which is what the other implementations do) also buffers, so all the GNU implementation does is special case the buffer to remove the overhead of stdio.
I have never read the source of yes, only used it, so for all I know it could be 20 megabytes of source. As long as it is efficient, why would I care?
Tangentially related, but yes is one of my favorite apps. I use it quite often as it’s easier to just spam “yes” than dig through the man page to figure out how to make the command not ask questions.
I always feel like it is the wrong way of doing it, but at the same time I just love the idea of a core utility that does nothing but spam yes.
> It looks like we can't outdo C nor GNU in this case. Buffering is the secret, and all the overhead incurred by the kernel throttles our memory access, pipes, pv, and redirection is enough to negate 1.5 GiB/s.
On my computer reading from /dev/zero (16.5GiB/s) is still significantly faster than piping yes (7.11GiB/s), so I'd be interested in an exploration of why that is. BPF would probably be useful here. Maybe when I find a moment, I'll write a follow-up, doing just that... unless someone else does it faster. ;)
I always thought the device minor number of /dev/zero should control which number you get, so you can also make useful devices like /dev/seven with minor number 7 for an infinite source of beeps, or /dev/newline with a minor number 10 for an infinite source of newlines, and /dev/rubout with minor number 127 for an infinite source of rubouts.
I have not heard of yes. It sounds like it just repeats a string infinitely. What would be a common use case? Or is it intended to be a lego brick for bigger composed commands?
As to why they don't use similar techniques, probably because they historically didn't really care enough about the performance of yes(1) to bother with it.
FreeBSD has since added output buffering to yes(1). You can see the difference between OpenBSD yes(1)[0], which remains utterly naïve, and FreeBSD yes(1)[1], which uses an 8k internal buffer.
It's not naive, it's correct. GNU's focus is on features, kitchen sink style, performance tuning, etc. If that's what you want, it's for you. OpenBSD is focused on correctness. It's not correct to use hacks and buffering in tools to increase performance. That should be handled at the syscall interface or in the kernel or whatever.
Can you expand on why you think performance hacking degrades correctness? As I understand it, “yes” (with no arguments) is supposed to output “y\n” continuously until terminated. Throughput/latency are unspecified, admitting various implementations with various concerns, like GNU’s focus on throughput vs BSD’s focus on latency/simplicity. Even an implementation that took 10 seconds between each output could be desirable for some use cases! Is that version of “yes” less correct?
The case can be made that, given that kernel code tends to be more difficult to write and debug and is inherently security critical, there ought to be less of it. If the buffering can be generalized then it ought to be in a library to be shared, otherwise the application is the right place.
BSD devs tend to prioritise clean code over speed. I'm not certain of the implementation details here, but their approach does make OpenBSD coreutils a more pleasant read than the GNU coreutils.
Also clean code usually results in compact code, which used to be effectively the same as fast code for the machines BSD was initially developed for.
I wouldn't be surprised if the BSD coreutils were more performant on older machines with small memory, buffers and prefetch queues (sometimes just a few bytes) than their GNU counterpart.
Do the various BSD equivalents also have to work on other, esoteric OSen, or do they generally tend to fork a new version to run just on that flavour of BSD?
Idk about the stuff that's normally ‘included’ in the distribution (base install of the OS), but NetBSD's package manager supports a ton of platforms. I've used it on Linux and macOS.
I didn't try it but I think it would be relatively straightforward to compile GNU tools on BSD.
But generally GNU tends to focus on performance and features while BSD focuses on simplicity. GNU having ridiculously complex code to do trivial things is kind of a meme.
And also, most people don't need a ridiculously fast "yes". Usually, when you want a fast stream of bytes, for example to fill up some space, /dev/zero is a better option.
>GNU having ridiculously complex code to do trivial things is kind of a meme.
One of the reasons is that GNU had to reimplement UNIX tools as GPL code. How do you reimplement something trivial without it looking like the original, to avoid copyright claims? One of the solutions is to make the implementation as complicated as possible.
It doesn't, 99.9999% of the time. But there's probably someone, somewhere, who uses it in some crazy huge obscure script, who appreciates the work people put into optimizing it.
But then again, maybe the purpose is to swamp the input, maybe for testing, so the more the better, so, you never know and it's wrong to decide what all users don't need.
Someone else managed to double yes performance [2] via vmsplice.
[1] https://www.reddit.com/r/unix/comments/6gxduc/how_is_gnu_yes...
[2] https://www.reddit.com/r/unix/comments/6gxduc/how_is_gnu_yes...
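For readers who haven't met vmsplice: it is a Linux-only syscall that maps user pages into a pipe instead of copying them. Below is a minimal sketch of the idea, assuming stdout (or whatever fd is used) is a pipe. It is not the code from the linked post, and the helper names `fill_lines` and `splice_all` are made up for illustration.

```c
#define _GNU_SOURCE      /* needed for the vmsplice declaration on glibc */
#include <assert.h>
#include <fcntl.h>       /* vmsplice */
#include <string.h>
#include <sys/uio.h>     /* struct iovec */
#include <unistd.h>

/* Fill buf with as many whole copies of line as fit; return bytes used. */
static size_t fill_lines(char *buf, size_t cap, const char *line, size_t len)
{
    size_t used = 0;

    assert(len > 0 && len <= cap);
    while (used + len <= cap) {
        memcpy(buf + used, line, len);
        used += len;
    }
    return used;
}

/* Hand buf[0..len) to the pipe behind pipe_fd. Returns 0 on success and -1
 * on error (including when pipe_fd is not a pipe). With flags == 0 this
 * blocks while the pipe is full. */
static int splice_all(int pipe_fd, const char *buf, size_t len)
{
    while (len > 0) {
        struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
        ssize_t n = vmsplice(pipe_fd, &iov, 1, 0);

        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A yes-like loop would call `fill_lines` once and then `splice_all(STDOUT_FILENO, buf, used)` forever, falling back to plain `write` when stdout is a file or terminal. Since the pipe references the pages rather than copying them, the buffer should not be modified while the reader may still be consuming it.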