
Set Operations in the Unix Shell (2008) - polygot
http://www.catonmat.net/blog/set-operations-in-unix-shell/
======
pepve
An alternative for the Cartesian product (instead of the nested loops):

    join <(sed 's/^/1 /' set1) <(sed 's/^/1 /' set2) | cut -d ' ' -f 2-

Edit: someone in the comments there has a far better trick:

    join -j 2 set1 set2
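
For anyone who wants to try the constant-key idea, here is a minimal sketch that checks it against the nested loops it replaces. The sample elements and temporary file names are made up for illustration, elements are assumed to be single words without whitespace, and temporary files stand in for the `<(...)` process substitution so that it should also run under a plain POSIX sh.

```shell
#!/bin/sh
# Sample data is made up for illustration; each line is one set element.
tmp=$(mktemp -d)
printf '%s\n' a b > "$tmp/set1"
printf '%s\n' x y > "$tmp/set2"

# Give every line the same key ("1") so that join pairs each line of
# set1 with each line of set2, then cut the key off again.
sed 's/^/1 /' "$tmp/set1" > "$tmp/keyed1"
sed 's/^/1 /' "$tmp/set2" > "$tmp/keyed2"
product_join=$(join "$tmp/keyed1" "$tmp/keyed2" | cut -d ' ' -f 2-)

# The nested-loop version, for comparison.
product_loop=$(
    while IFS= read -r left; do
        while IFS= read -r right; do
            printf '%s %s\n' "$left" "$right"
        done < "$tmp/set2"
    done < "$tmp/set1"
)

printf '%s\n' "$product_join"
[ "$product_join" = "$product_loop" ] && echo "both approaches agree"
rm -rf "$tmp"
```

Because every line carries the same key, the usual requirement that join input be sorted on the join field is met without sorting either set.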

------
protomyth
I do a lot of processing of lists and such on the shell. I don't think I'm
unique in writing union, disjoint, columns, sequence, etc. commands. I wonder
how many people out there have scripts at work that are basically long pipes
doing a lot of processing to get the data munged into something acceptable for
a second system.

One of these days I'll rewrite mine from the 1990s-era Perl and put them out
there, but I've got to believe others have done better.
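
As a rough idea of what two such commands might look like, here is a sketch of `union` and `disjoint` as shell functions. The function names, sample files, and behavior are assumptions for illustration, not the Perl scripts mentioned above; each input file is treated as a set with one element per line.

```shell
#!/bin/sh
# Hypothetical helpers; names and semantics are illustrative only.

# union FILE1 FILE2: print every distinct line found in either file.
union() {
    sort -u "$1" "$2"
}

# disjoint FILE1 FILE2: exit 0 when the two files share no lines.
disjoint() {
    # Deduplicate each file first, so a line repeated within one file
    # is not mistaken for a line shared by both.
    common=$( { sort -u "$1"; sort -u "$2"; } | sort | uniq -d )
    [ -z "$common" ]
}

# Usage with made-up sample data.
tmp=$(mktemp -d)
printf '%s\n' b a c > "$tmp/s1"
printf '%s\n' c d > "$tmp/s2"
printf '%s\n' x y > "$tmp/s3"

union_result=$(union "$tmp/s1" "$tmp/s2")
printf '%s\n' "$union_result"

if disjoint "$tmp/s1" "$tmp/s3"; then disjoint_13=yes; else disjoint_13=no; fi
if disjoint "$tmp/s1" "$tmp/s2"; then disjoint_12=yes; else disjoint_12=no; fi
echo "s1/s3 disjoint: $disjoint_13, s1/s2 disjoint: $disjoint_12"
rm -rf "$tmp"
```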

~~~
mitchtbaum
I'm moving toward Lua-only scripting, backed by Rust-only system commands (at
least in model space for now). With that approach, I intend to solve this
shell use case with https://github.com/BurntSushi/fst and any needed Lua
syntax extensions, like http://lua-users.org/wiki/StringInterpolation ,
http://lua-users.org/wiki/VarExpand , etc. So far, I have some stacks of notes
on Rust replacements and some links to Lua dialogues on similar (failed)
attempts at a new shell. If you have any more points of interest or a drive to
implement it, cool.

------
amelius
Those operations don't look like set operations at all, except on close
inspection. Is the Unix methodology failing us here?

~~~
pvg
What's the failing part? Someone declares 'lines in a text file' to be a set
and then proceeds to write 'set operations' for them out of your basic unix
text processing utilities. Unsurprisingly, it both works and doesn't look like
the first thing that would come to mind when thinking of set operations. But
then, does 'grep [linenoise]' look like 'finding a string' to those unfamiliar
with it?

~~~
intransigent
Yeah, in a way, this sort of thing is indicative of a Turing tarpit.

It's not obvious which, among all the moving parts, is responsible for any of
the outcomes, just by looking at the commands themselves in isolation.

If someone did a similar thing in Brainfuck it wouldn't be entirely surprising
that such a thing is possible, but the practicality of being able to get it
right every time, extemporaneously, is low.

It's interesting to know that this is possible, in case of emergency, but just
because it's possible doesn't make it desirable.

If someone wrote an article about how to do the same in MS-DOS batch files,
circa 1996, it'd be an interesting proof of robustness, but an undesirable
standard operating procedure.

