I do a lot of processing of lists and such on the shell. I don't think I'm unique in writing union, disjoint, columns, sequence, etc. commands. I wonder how many people out there have scripts at work that are basically long pipes doing a lot of processing to get the data munged into something acceptable for a second system.
One of these days I'll rewrite mine from the 1990s-era Perl and put them out there, but I've got to believe others have done better.
I'm moving toward Lua-only scripting, backed by Rust-only system commands (at least in model space for now). From that approach, I intend to solve this shell use case with https://github.com/BurntSushi/fst and any needed Lua syntax extensions, like http://lua-users.org/wiki/StringInterpolation , http://lua-users.org/wiki/VarExpand , etc. So far, I have some stacks of notes on Rust replacements and some links to Lua discussions of similar (failed) attempts at a new shell. If you have any more points of interest or a drive to implement it, cool.
What's the failing part? Someone declares 'lines in a text file' to be a set and then proceeds to write 'set operations' for them out of your basic unix text processing utilities. Unsurprisingly, it both works and doesn't look like the first thing that would come to mind when thinking of set operations. But then, does 'grep [linenoise]' look like 'finding a string' to those unfamiliar with it?
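For anyone who hasn't seen the trick, here is a minimal sketch of what "lines in a text file as a set" tends to look like. The function names are made up for illustration, and it assumes plain `sort` and `uniq` with newline-terminated input files:

```shell
#!/bin/sh
# Hypothetical helper names; each one treats the lines of a file as a set.

# Union: every distinct line from either file.
set_union() {
    sort -u -- "$1" "$2"
}

# Intersection: lines present in both files.
# Each file is de-duplicated first, so any line that "uniq -d" reports
# as repeated must have come from both inputs.
set_intersection() {
    { sort -u -- "$1"; sort -u -- "$2"; } | sort | uniq -d
}

# Difference: lines in the first file that are not in the second.
# The second file is fed in twice, so none of its lines can occur
# exactly once, which is the only case "uniq -u" prints.
set_difference() {
    { sort -u -- "$1"; sort -u -- "$2"; sort -u -- "$2"; } | sort | uniq -u
}
```

It works, but nothing about `uniq -d` says "intersection" unless you already know the idiom.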
Yeah, in a way, this sort of thing is indicative of a Turing tarpit.
It's not obvious which, among all the moving parts, is responsible for any of the outcomes, just by looking at the commands themselves in isolation.
If someone did a similar thing in Brainfuck, it wouldn't be entirely surprising that such a thing is possible, but the practicality of getting it right every time, extemporaneously, is low.
It's interesting to know that this is possible, in case of emergency, but just because it's possible doesn't make it desirable.
If someone wrote an article about how to do the same in MS DOS batch files, circa 1996, it'd be an interesting proof of robustness, but an undesirable standard operating procedure.
Yeah, honestly POSIX shell already has C arithmetic, so it would be natural for it to have set operations, a la Python (x | y, x & y, etc.). Here's Fizz Buzz in shell, without executing external processes like "expr":
for x in $(seq 100); do
  { (( x % 5 == 0 )) && (( x % 3 == 0 )) && echo "$x fizzbuzz"; } ||
  { (( x % 5 == 0 )) && echo "$x buzz"; } ||
  { (( x % 3 == 0 )) && echo "$x fizz"; } ||
  echo "$x"
done
I'm a big fan of the Unix philosophy, but after a few decades, it makes sense to fold commonly used operations into the shell, especially when they cost almost nothing to implement with zero bugs. There's no benefit to "streaming" for these operations; on the contrary, it's probably a de-optimization.
* "seq" is usually an external process. You may want to use bash's arithmetic for loop, a while loop with `$(( ... ))`, or a `seq` function you write yourself (you still need to spawn a subshell for command substitution, though).
* "(( ... ))" is not in POSIX shell, but "$(( ... ))" is. You should be able to get around it with some [ "$(( blah ))" -eq 0 ] tests.
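Putting both points together, a POSIX-only version might look something like the sketch below. The `fizzbuzz` function name and its upper-bound argument are made up for illustration; it assumes `[` and `echo` are builtins, which is true of the common shells but not something POSIX guarantees:

```shell
#!/bin/sh
# POSIX-only sketch: no seq, no (( )), no brace expansion.
# fizzbuzz N: count from 1 to N using only $(( )) arithmetic.
fizzbuzz() {
    x=1
    while [ "$x" -le "$1" ]; do
        if [ "$(( x % 15 ))" -eq 0 ]; then
            echo "$x fizzbuzz"
        elif [ "$(( x % 5 ))" -eq 0 ]; then
            echo "$x buzz"
        elif [ "$(( x % 3 ))" -eq 0 ]; then
            echo "$x fizz"
        else
            echo "$x"
        fi
        x=$(( x + 1 ))
    done
}
```

Calling `fizzbuzz 100` covers the same range as the original loop.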
Yes you're right that (( )) isn't POSIX but $(( )) is -- I was lazy and borrowed it from the example. The one change I made was to change brace expansion {1..100} to $(seq 100) because the former isn't in POSIX, but yeah the whole thing wasn't POSIX anyway.
But it doesn't really detract from the point -- if shell has arithmetic and bitwise operators built in, it's not a stretch for it to have set operations built in.
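To give a rough idea of how little is involved, here is a sketch of in-process set operations in plain POSIX shell, with no external commands. Everything here is hypothetical: the helper names are invented, a "set" is just a space-separated string, and elements are assumed to contain no whitespace or glob characters. A real built-in would presumably be less fragile:

```shell
#!/bin/sh
# Hypothetical helpers: a "set" is a space-separated string of words.
# Assumes elements contain no whitespace or glob characters.

# set_has SET ELEMENT: succeed if ELEMENT is a member of SET.
set_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
    esac
    return 1
}

# set_and A B: print the elements of A that are also in B.
set_and() {
    out=
    for e in $1; do
        if set_has "$2" "$e"; then out="$out $e"; fi
    done
    echo "${out# }"
}

# set_or A B: print A followed by the elements of B not already in A.
set_or() {
    out=$1
    for e in $2; do
        if ! set_has "$out" "$e"; then out="$out $e"; fi
    done
    echo "${out# }"
}
```

With these, `set_and "a b c" "b c d"` plays the role of Python's `x & y` and `set_or` the role of `x | y`.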