
Functional programming and the death of the Unix Way - jemeshsu
http://newcome.wordpress.com/2012/03/06/functional-programming-and-the-death-of-the-unix-way/
======
andolanra
The gist—as I understand it—is that Unix programs are essentially all
equivalent to functions of type ([String], TextStream) -> (TextStream, TextStream),
which limits how one can tie them together and causes a proliferation of
text-processing tools (awk, sed, grep, cut, &c.) that is not as elegant as
functional programming. There are already some Unix command-line tools that
work on structured data as mentioned, e.g. jsawk[1] is a tool inspired by awk
whose scripting language is JavaScript and which works on JSON.
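To make that type concrete (this sketch is mine, not the article's): a filter's two "inputs" are its argument vector and stdin, and its two "outputs" are stdout and stderr.

```shell
# Inputs: argv (here ["-c", "foo"]) and a text stream on stdin.
# Output: a text stream on stdout (the count of matching lines).
printf 'foo\nbar\nfoo\n' | grep -c foo

# The second output stream is separate: diagnostics go to stderr,
# and the exit status signals failure out-of-band.
grep -c foo /no/such/file 2>/dev/null || echo "grep failed"
```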

As an aside, I'd like to see people experiment with these concepts on an OS
level while consciously targeting Xen or other virtualization systems.
Already, Haskell can run barebones on Xen using HaLVM[2] and the successor to
Plan 9, Inferno[3], _was_ a virtual machine that could run either on bare
metal or inside another OS. I can imagine an entirely new OS would meet some
resistance—like Plan 9 did—but supplying an OS _intended_ to be virtualized
would let people experiment freely within their existing OS.

[1]: <https://github.com/micha/jsawk> [2]: <http://halvm.org/wiki/> [3]:
<http://code.google.com/p/inferno-os/>

~~~
chubot
Not really sure how you jumped from structured data in Unix to a bare-metal
OS, but this project came up a few weeks ago - OCaml programs on Xen with no
OS:

<http://www.openmirage.org/wiki/papers> (down now, doh)

Seems to have some similarities to halvm.

~~~
avsm2
It's back up now; the ML TCP stack was running a pcap dumper for debugging,
which didn't cope well with a Hacker News link.

Hacking is going pretty well on Mirage. The longer-term plan is to generalise
the support libraries to work with other languages (particularly HalVM and
GuestVM for Java), but it's far simpler to work with just OCaml for getting
the first cut out. The Xen Cloud toolstack (also written in OCaml) is
currently being adapted to support low-latency microkernel establishment,
which will remove much of the hassle of coordinating multiple Mirage
'processes'.

Another interesting performance-related aspect has been the heavy IPC
workloads that result from using many VM-VM interconnects. Some early
benchmarks in <http://anil.recoil.org/papers/2012-resolve-fable.pdf> .

------
judgej2
Often times I’ll want to grab just one part of a command’s output to use for
the input of another command. Sometimes I can use grep to do this, and
sometimes grep isn’t quite flexible enough and sed is required. The regular
expression required to get sed to do the right thing is often complex on its
own, and of course the flags need to be set appropriately. If the data format
is in columns, sometimes cut can be simpler.

Master Foo nodded and replied: "When you are hungry, eat; when you are
thirsty, drink; when you are tired, sleep."

Upon hearing this, the novice was enlightened.

\-- <http://catb.org/~esr/writings/unix-koans/shell-tools.html>
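For a concrete version of the trade-off the novice describes, here are the three tools pulling the first field out of some made-up colon-separated records:

```shell
data='root:x:0:0
alice:x:1000:1000'

# cut is simplest when the data is already columnar:
echo "$data" | cut -d: -f1

# sed does the same job with a regular expression (more flexible, more fragile):
echo "$data" | sed 's/:.*//'

# grep alone can only select whole lines, not reshape them:
echo "$data" | grep '^alice'
```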

~~~
mindslight
Don't koans generally _cast off_ preexisting notions? It seems the philosophy
espoused by that story is "don't think too much about the design of your
tools".

~~~
msbarnett
I read it as "there are many ways you could solve a problem; rather than
obsess over which is The One True Way in All Situations, choose whichever is
the simplest path to scratching your itch".

------
icebraining
TermKit[1] was a proposal that tried to fix some of these problems by adding
two new channels (separating terminal in/out from stdin/stdout data pipes) and
applying MIME types (particularly JSON) to the latter.

It got a lot of flak, and I don't agree with everything the author proposes,
but I think that part could definitely be improved.

The problem, of course, is backward compatibility: even if you can reimplement
and/or wrap the core utils, what about the thousands of CLI programs in each
distro's repository? You'll end up with a hybrid beast that doesn't really do
anything well.

[1]: <http://acko.net/blog/on-termkit/>

------
dsrguru
So basically the author is saying:

The UNIX shell pipeline is really just a functional programming language whose
functions can only operate on strings. Think how much more powerful, concise,
readable, etc. it would be if other data types were supported.

I agree. I don't even think this requires changes at the OS level. Newlisp and
Racket's shell attempt might be clunky, and Clojure certainly isn't ready for
quick-and-dirty scripts, but it shouldn't be too hard to implement such a
language if shell replacement is its main purpose.
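A small illustration of the strings-only constraint: even summing numbers in a pipeline means serialising to decimal text and re-parsing it at the next stage.

```shell
# seq emits the numbers 1..5 as newline-separated text;
# awk re-parses that text into numbers just to add them up.
seq 1 5 | awk '{ s += $1 } END { print s }'
# A typed shell could instead pass a list of integers and fold over it
# directly, with no serialise/re-parse step in between.
```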

~~~
adobriyan
Unix pipes are about octet streams, not strings.

~~~
dsrguru
In most hackers' idiolect, the term "string" refers to an implementation-
dependent representation of characters. An octet stream or byte stream is the
abstraction that UNIX files and the standard I/O streams happen to be
instances of. So it wouldn't be wrong per se to use "octet stream" here, but
I'm treating each phase of the pipeline as a hypothetical function that
operates on text. With less indirection, you could certainly view them as
functions operating on streams, but I abstracted that away in my mental
picture.

------
jes5199
I had an idea to write Clojure parsers for the standard Unix command-line
tools that translate their output into s-expressions.

I got too frustrated: on the one hand, Clojure's slow startup time meant it
was no fun to use in a simple pipe,

and on the other hand, even the simplest Unix util has surprisingly
complicated behavior. Take `wc` - it outputs three columns of numbers, right?
Well, unless you give it filenames, which go in a fourth column. Or unless you
pass it flags, which can turn any set of the number columns on and off. And
then I thought, wait, what happens if you make a file with leading whitespace
in the name? It turns out wc only outputs a single space in front of the
filenames, so any whitespace after that is part of the name.
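The wc quirks described are easy to reproduce (column widths and padding vary between implementations, so only the word count is pinned down below):

```shell
# Reading stdin: three counts, no filename column.
printf 'one two\n' | wc

# Given a filename, a fourth column appears.
f=$(mktemp)
printf 'one two\n' > "$f"
wc "$f"
rm -f "$f"

# Flags switch individual count columns on and off.
printf 'one two\n' | wc -w
```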

Which led me to the conclusion: Unix tools aren't actually simple - they are
really quite complicated, but you can construct a happy path of simplicity
for most use cases,

and functional languages are still monolithic when it comes to interacting
with the outside world. Maybe someone will make a service I can run in the
background that will run my command-line clojure scripts in a pre-warmed JVM,
but that's not a piece of technology that _I_ want to try to write.

~~~
Ralith
Not all functional languages run on the JVM or anything similar. In fact, most
don't. See: Haskell, *ML, most Lisps and Schemes, ...

~~~
jes5199
yeah, it's true. But I still think it's much harder to compose two functional
programs running in separate processes than to write one monolithic program
composing functional libraries.

Compare to perl, where piping text is so simple that chaining perl scripts
with pipes is no harder than writing functions.
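The comparison can be sketched with shell functions standing in for separate scripts; piping between processes is syntactically as cheap as composing functions:

```shell
# Two tiny 'programs', each a plain text filter:
upper() { tr 'a-z' 'A-Z'; }
exclaim() { sed 's/$/!/'; }

# Process composition via a pipe reads just like function composition:
echo hello | upper | exclaim   # prints HELLO!
```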

------
Ralith
> I’m not advocating a return to Lisp machines here. We tried that and it
> didn’t work. Symbolics is dead, and no one even gave a eulogy at that
> funeral.

The hell? Lisp machines were awesome. I thought it was well known that they
failed partly for political reasons, and partly because Symbolics was
absolutely horrendous at business.

The only failing I'm aware of was the lack of multi-user support, which wasn't
particularly unusual for the era, and even now we don't have anything that can
compare to their high points.

------
msutherl
Dataflow environments like PureData, Max/MSP, Quartz Composer and vvvv are
very much like 2D GUI-driven shells that handle various data types. Max/MSP
was, at least, explicitly conceived as a sort of UNIX for multi-media. After
working with such tools for years, I must say that the tools professional
programmers use seem comparatively quite primitive in many ways (though much
more advanced in others). A graphical environment that utilized the same sort
of dataflow model, but with a more general intended audience, more extensible
architecture and some concepts from functional programming would have the
power to really change how programming is done.

------
geophile
The Unix Way, but piping python objects instead of text:
<http://geophile.com/osh>.

~~~
wnoise
It's not the Unix way if it's restricted to one programming language.

------
why-el
Steve Yegge's "The Emacs Problem" goes over some of the issues discussed
here, although it focuses on text processing.

<https://sites.google.com/site/steveyegge2/the-emacs-problem>

------
fpgeek
It sounds like the OP wants scsh (The Scheme Shell): <http://www.scsh.net>

------
Drbble
Try Powershell / Monad for Windows. It even solved Haskell's problem of Monad
needing a name change :-) <http://en.m.wikipedia.org/wiki/Windows_PowerShell>

------
ggchappell
... and the evolution can continue along the same lines as Unix. I think I'll
write a Haskell function that takes two strings and returns a string. The
first string is a program in a new language I'll invent. The second is the
input to the program, and the return value is the output of the program.

Maybe I could call it "herl".

    herl :: String -> String -> String

------
draven
This article reminded me of something I read about the Unix model vs. the
Lisp model. I managed to track it down: it's a comp.lang.lisp posting by Erik
Naggum.

[http://www.xach.com/naggum/articles/3245983402026014@naggum....](http://www.xach.com/naggum/articles/3245983402026014@naggum.no.html)

The interesting part is the 3rd paragraph of the answer.

------
andrewflnr
I wonder, does this mean we actually want types as metadata on our shell
commands? Type of args, type of stdin, type of stdout, or whatever we decide
to call them?

All apps would have to speak a single, more complicated language, and we
would have to do more explicit translation, but it might work out better.

~~~
luriel
Aren't Go channels similar to typed pipes?

~~~
andrewflnr
I guess, but they're not exactly cross-language inter-process communication.
Maybe you could work something like them into the OS, though.

------
szany
Conal Elliott did something like this with Tangible Functional Programming.

------
peedy
True. However, about the issue that triggered this chain of thought, did you
consider using the `cut` utility along with `grep` ?

------
vdm
Further reading: <http://pinboard.in/t:functional/t:unix/>

