
Use Haskell for shell scripting - stefans
http://www.haskellforall.com/2015/01/use-haskell-for-shell-scripting.html
======
cies
Thanks Gabriel Gonzalez! There is a comment on the blog post (by Chris Done)
asking how it deals with piping. I really wonder about that too.

Some related projects:

\- Joey Hess recently released a nice Haskell-to-sh compiler. I like this
approach as the resulting sh scripts are runnable on pretty much every *nix.
[https://joeyh.name/blog/entry/shell_monad/](https://joeyh.name/blog/entry/shell_monad/)

\- Chris Done also released a lib to do shell stuff from Haskell, which builds
on the conduit library
[http://chrisdone.com/posts/shell-conduit](http://chrisdone.com/posts/shell-conduit)

\- Chris also wrote a shell in Haskell
[https://github.com/chrisdone/hell](https://github.com/chrisdone/hell)

\- Then there is Shelly by Greg Weber
[https://github.com/yesodweb/Shelly.hs](https://github.com/yesodweb/Shelly.hs)

There are probably more...

~~~
Gabriel439
You use `inproc` and `inshell` for piping. For example, here's the type of
`inshell`:

    
    
        inshell
            :: Text        -- Shell command
            -> Shell Text  -- Standard input to feed command
            -> Shell Text  -- Standard output produced by command
    

I made one intentional simplification in the API, which was to not provide a
way to capture standard error. It's definitely possible to provide such a
utility, but I wanted to simplify things as much as possible in the first
release before the slow onslaught of feature cruft begins. If there were such
a utility, it would have this type:

    
    
        both
            :: Text        -- Shell command
            -> Shell Text  -- Standard input to feed command
            -> Shell (Either Text Text)
    

... and you could selectively listen to just stderr or stdout by taking
advantage of the fact that pattern match failures short-circuit downstream
commands:

    
    
        Left txt <- both -- only read stderr
    

There is one more shell library that I know of: `process-streaming`. I
actually didn't know about `shell_monad` (that's the one most similar in
spirit to what I wrote).
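
To make the stderr idea above concrete without relying on `turtle`: the `process` package that ships with GHC can already capture both streams. This is only a sketch and not `turtle`'s API. `runBoth` and `tagged` are hypothetical names, the `Either` tagging merely mimics the shape of the `both` type above, and it assumes a POSIX `sh` is on the PATH.

```haskell
import System.Exit (ExitCode)
import System.Process (readProcessWithExitCode)

-- Hypothetical helper: run a command through `sh -c`, feed it `input` on
-- stdin, and return its exit code, stdout and stderr.
runBoth :: String -> String -> IO (ExitCode, String, String)
runBoth cmd input = readProcessWithExitCode "sh" ["-c", cmd] input

-- Tag each line the way the hypothetical `both` would:
-- Left = stderr, Right = stdout.
-- Unlike a real stream, this loses the interleaving order of the two outputs.
tagged :: String -> String -> [Either String String]
tagged out err = map Left (lines err) ++ map Right (lines out)

main :: IO ()
main = do
    (_, out, err) <- runBoth "echo out; echo err 1>&2" ""
    mapM_ print (tagged out err)
```

Because the output is collected only after the command exits, this is closer to command substitution than to a streaming pipe.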

The main reason I rolled my own library is that this was written with the
specific audience of people who didn't know any Haskell, but were comfortable
with Python or Bash. My actual goal is to convince people internally at
Twitter to use Haskell instead of Python for large scripts. I reviewed all
those libraries (with the exception of shell_monad) to see if I felt
comfortable marketing them to non-Haskell programmers and none of them felt
like the right level of abstraction to me. I almost ended up going with
Shelly, but in the process of polishing shelly for internal usage I found
myself continually wrapping things with better names, different types, and
providing missing features to get a single import umbrella, so I just stopped
and asked: "why not just do this as a cohesive single library instead?". Also,
`shelly` does not provide any `IO`-only commands: everything has to be wrapped
in the `Sh` monad.

As for the other libraries, `shell-conduit` was too complex for new users in
my opinion and `hell` is not embedded within Haskell (it's a separate
language), and I wanted to keep the features of Haskell. I still need some
more time to review `shell_monad` to see if I made a mistake by ignoring it.

~~~
kenko
Why `Either Text Text`? What if you're interested in both stdout and stderr?

~~~
Gabriel439
Then you can do this:

    
    
        fmap (either id id) (both ...)
    

... which is equivalent to:

    
    
        x <- both ...
        return (case x of
            Left  txt -> txt
            Right txt -> txt)
    

That removes the `Either` tag and fuses them into a single stream.
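
Both tricks (dropping one side via a failed pattern match, and merging via `either id id`) can be tried with nothing but `base`. In this sketch a plain list stands in for the `Shell` stream, which is an assumption made for illustration; `tagged`, `stderrOnly` and `merged` are hypothetical names.

```haskell
-- Hypothetical stand-in: a plain list plays the role of the `Shell` stream.
tagged :: [Either String String]
tagged = [Right "out 1", Left "err 1", Right "out 2"]

-- Keep only stderr: in the list monad a failed pattern match
-- simply drops that element instead of raising an error.
stderrOnly :: [String]
stderrOnly = do
    Left txt <- tagged
    return txt

-- Merge both streams by discarding the Left/Right tag.
merged :: [String]
merged = fmap (either id id) tagged

main :: IO ()
main = do
    mapM_ putStrLn stderrOnly
    mapM_ putStrLn merged
```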

------
Doji
The tutorial does a great job of explaining why this is interesting:
[http://hackage.haskell.org/package/turtle-1.0.0/docs/Turtle-...](http://hackage.haskell.org/package/turtle-1.0.0/docs/Turtle-
Tutorial.html)

For example, the pwd function returns a FilePath type rather than a String:

    
    
      Prelude Turtle> :type pwd
      pwd :: IO Turtle.FilePath
    

The datefile function is also typed:

    
    
      Prelude Turtle> :type datefile
      datefile :: Turtle.FilePath -> IO UTCTime
    

So this really does seem to structure the data passed between commands,
instead of the "stringly typing" unix shells have historically been known for.

~~~
TazeTSchnitzel
Are those types just aliases of String?

~~~
throwaway283719
No. For example, a FilePath is (after resolving a few other type aliases)

    
    
      data Root
          = RootPosix
          | RootWindowsVolume Char
          | RootWindowsCurrentVolume
    
      data FilePath = FilePath
          { pathRoot        :: Maybe Root
          , pathDirectories :: [String]
          , pathBasename    :: Maybe String
          , pathExtensions  :: [String]
          }

------
barrkel
OK. How do you easily fork to run a command in the background? How does
setting up pipes work? What's the idiom for chdir'ing to a subdirectory such
that you pop back out again when you're done (I'd use a subshell with (cd xxx;
...) in bash)?

Getting into more tricky stuff, what's the equivalent of <() in bash?

This doesn't really demonstrate anything that shell scripts are actually
written for: orchestrating and composing other processes, and job control.

If you wanted to leverage type checking for safety, it would be more
interesting to typecheck the streams input and output by pipes.

~~~
Gabriel439
> How do you easily fork to run a command in the background?

`turtle` provides `fork` for running a command in the background. Example
usage:

    
    
        example = do
            using (fork commandToForkInAnotherThread)
            theseCommandsStillRunInTheOriginalThread
    

> How does setting up pipes work?

See the `inproc` and `inshell` commands, which let you convert any shell
command into a stream transformation embedded within Haskell.

> What's the idiom for chdir'ing to a subdirectory such that you pop back out
> again when you're done (I'd use a subshell with (cd xxx; ...) in bash)?

You can write a combinator for this using `turtle` pretty easily:

    
    
        pushd newDir = do
            oldDir <- pwd
            cd newDir
            return (cd oldDir)
    

... and you use it like this:

    
    
        example = do
            popDir <- pushd "/tmp"
            ... do stuff ...
            popDir
    

> what's the equivalent of <() in bash?

`inproc`/`inshell` which let you read in a command's standard output as a
stream
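
On the `pushd` question above: the hand-written combinator does not restore the directory if the body throws. A variant that does can be sketched with `bracket` from `base` and the `directory` package that ships with GHC. `withDir` is a hypothetical name, and this is not part of `turtle`.

```haskell
import Control.Exception (bracket)
import System.Directory (getCurrentDirectory, setCurrentDirectory)

-- Hypothetical helper: run an action inside `dir`, then restore the
-- previous working directory, even if the action throws.
withDir :: FilePath -> IO a -> IO a
withDir dir action =
    bracket
        getCurrentDirectory                        -- remember where we were
        setCurrentDirectory                        -- always go back
        (\_ -> setCurrentDirectory dir >> action)  -- do the work inside `dir`

main :: IO ()
main = do
    before <- getCurrentDirectory
    inside <- withDir "/" getCurrentDirectory
    after  <- getCurrentDirectory
    putStrLn inside
    print (before == after)
```

Recent versions of `directory` also export `withCurrentDirectory`, which has the same shape. Note that the working directory is process-wide, so neither version is safe to use from several threads at once.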

~~~
barrkel
FWIW:

<(foo) in bash creates a fifo, and pipes the output of foo to the fifo. It
then replaces the whole <(foo) argument with the path to the fifo. This means
that commands that normally expect to read from a file on the command line can
instead be wired to read their input from a process. And, of course, both
processes run concurrently.

>(foo) does the same thing, except the other way around, for process output.

~~~
danidiaz
There is a "createPipe" function in the "unix" package
[http://hackage.haskell.org/package/unix-2.7.1.0/docs/System-...](http://hackage.haskell.org/package/unix-2.7.1.0/docs/System-
Posix-IO.html#v:createPipe) that gets us half-way towards process
substitution.

Unfortunately, I don't know how to get the name of the device file associated
to the pipe, and I need it in order to pass it as an argument to the reading
process :(

------
S4M
I don't know much about Haskell, but I thought it had some properties to
isolate side effects, but the code he gives:

    
    
        main = do
            cd "/tmp"
            mkdir "test"
            output "test/foo" "Hello, world!"  -- Write "Hello, world!" to "test/foo"
            stdout (input "test/foo")          -- Stream "test/foo" to stdout
            rm "test/foo"
            rmdir "test"
            sleep 1
            die "Urk!"
    

clearly doesn't (it creates a directory, writes to a file, and removes that file
and that directory all in one go, with nothing in the function _main_
indicating it). Is it because it's the main function of the program, or am I
missing something?

~~~
zoomerang
Haskell functions return side-effects using the IO type, with the boilerplate
plumbing being hidden with monads and do-notation. "main" in Haskell by
default has a return type of "IO ()", and any "IO" values returned by that
function are executed by the runtime.

The end result in this case is something that just looks and feels completely
imperative.

But if you were to try and call, say, the "rmdir" function inside another
function that didn't have an IO return type, you'd get a compile error. (More
specifically, you could technically call the function, you just couldn't
return the "IO" value as a result, so it couldn't perform any actions).
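
A small sketch of that distinction, using only `base` (the names `greet` and `countActions` are made up for illustration): a pure function may hold `IO` values, but only something that ends up in `main` actually runs them.

```haskell
-- An IO value is a description of an effect; building it performs nothing.
greet :: String -> IO ()
greet name = putStrLn ("Hello, " ++ name)

-- A pure function can receive and inspect IO values, but it has no way
-- to execute them: its result type is Int, not IO Int.
countActions :: [IO ()] -> Int
countActions = length

main :: IO ()
main = do
    let actions = [greet "a", greet "b"]
    print (countActions actions)  -- nothing has been greeted yet
    sequence_ actions             -- now the runtime executes both greetings
```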

~~~
tome
> IO type

:)

------
dkarapetyan
Who's the target audience of this exactly? I already see a language pragma, do
notation, liftIO, parser combinators.

Hamming has this great set of lectures on how he became a world renowned
scientist and in one of the lectures he explains why Ada failed and other
languages succeeded. The difference was that Ada was designed logically and
most successful languages were designed psychologically. Even when government
contracts mandated Ada, people still wrote in Fortran and hand-translated to
Ada. You can watch the videos and take from it what you will.

A minimal bash file is `#!/bin/bash`. A minimal turtle file is already way too
long and logical.

The set of videos:
[https://www.youtube.com/playlist?list=PL2FF649D0C4407B30](https://www.youtube.com/playlist?list=PL2FF649D0C4407B30).

~~~
Gabriel439
The target audience is non-Haskell programmers, and if you don't think the
tutorial is good enough to onboard such a programmer then I consider that a
bug against the library. I would actually appreciate if people submitted
Github issues highlighting any pedagogical problem with the tutorial.

I think the use of `liftIO` is a reasonable objection. When I wrote the
library I had the choice of automatically pre-wrapping all `IO` commands with
`liftIO` for the user (making them all `Shell` commands) by default. However,
I decided not to do that for two reasons:

* If you do that you can't use them outside of a `Shell` any longer
* The user has to learn `liftIO` anyway if they want to use `IO` actions not provided by the `turtle` library. I didn't want to teach the user a leaky abstraction
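
For readers who have not met `liftIO`, here is a minimal sketch of what it does. `StateT Int IO` from the `transformers` package (bundled with GHC) is used as a stand-in for `Shell`; that substitution is an assumption for illustration, and `counter` is a hypothetical name.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.State (StateT, evalStateT, get, modify)

-- Stand-in for `Shell`: any monad layered over IO behaves the same way here.
counter :: StateT Int IO Int
counter = do
    modify (+ 1)                              -- an action native to StateT
    n <- get
    liftIO (putStrLn ("count = " ++ show n))  -- a plain IO action, lifted
    return n

main :: IO ()
main = do
    n <- evalStateT counter 0
    print n
```

The same `putStrLn` stays usable in plain `IO` code, which is the first reason given above.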

I don't see why `do` notation is bad. Same thing with parser
combinators, which are just strings in the simple case, and the "Patterns"
section of the tutorial has a table showing you how to convert regular expression
idioms to `Pattern`s:

[http://hackage.haskell.org/package/turtle-1.0.0/docs/Turtle-...](http://hackage.haskell.org/package/turtle-1.0.0/docs/Turtle-
Tutorial.html#g:13)

The language pragma is sort of a grey area. I decided to keep it because it
doesn't take a long time to explain and it significantly increases the
usability of the library.
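
As a rough illustration of what the pragma buys, here is a sketch using only `base`. `Path` is a hypothetical newtype, not `turtle`'s `FilePath`; the point is just that a string literal can become any type with an `IsString` instance.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString (..))

-- Hypothetical path type; a real library would do more validation.
newtype Path = Path String
    deriving (Show, Eq)

-- With OverloadedStrings, string literals are converted through fromString.
instance IsString Path where
    fromString = Path

tmp :: Path
tmp = "/tmp"  -- no explicit conversion needed

main :: IO ()
main = print tmp
```

Without the pragma, `tmp = "/tmp"` would be a type error and every literal would need an explicit wrapper.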

~~~
dkarapetyan
I think you mentioned at some point the goal is to get folks using large
python scripts to use this instead. I would be very curious to hear how that
progresses.

------
boothead
For all the considerable awesomeness that Gabriel produces, I always think the
best part is the *.Tutorial module he includes. I always learn a lot and it's
always a great overview that puts the work in context.

Everyone should do this!

------
chrisBob
After learning Perl I started using it where some more educated people might
recommend a proper shell script. My thinking is that using what you know is a
whole lot more efficient than learning a new tool for a small job, even if
some people think it is the _right_ tool. I am sure it is no different for
people familiar with Haskell.

~~~
loudmax
I do a lot of shell scripting, and I'm not sure there is such a thing as a
"proper" shell script. The shell just isn't a great programming language. Just
about any modern scripting language is better, starting with Perl. But the
shell has been the lingua franca of the Unix world for decades now. It's the
one language that you can pretty much guarantee is on any Unix or Linux
server, even pretty ancient ones.

I don't doubt that Haskell is a better scripting language than the
shell, but you can't assume /usr/bin/env runhaskell is going to return
anything on random Linux servers. Perl and Python, maybe, but Haskell isn't
there yet.

~~~
klibertp
> you can pretty much guarantee is on any Unix or Linux server, even pretty
> ancient ones

Well, yes and no. You can get reasonable compatibility with different Unix
flavours if you stick to sh. Your script is not going to work on BSDs once you
start using bash specific features, though.

Fun fact: on FreeBSD bash does _not_ live in /bin/bash, it's in
/usr/local/bin/bash. Every time you write a shebang with /bin/bash hardcoded
you're making your script harder to use there.

Perl is everywhere almost by default and it's more compatible as it has just
one implementation, without sh/bash/csh/ksh/tcsh/zsh madness. I'd say it's a
good idea to use Perl instead of shell script for anything more complicated
than a few lines of code if it's meant to be portable. (And I'm not a Perl
programmer at all.)

~~~
loudmax
Oh that's interesting. I'd assumed that FreeBSD had made bash the default
shell around the same time that Mac OS did. I guess my point still stands for
/bin/sh. Not a fun programming language though.

~~~
floatboth
No way. One of FreeBSD's goals is to get rid of everything that's GPL-
licensed. bash is not only that, but it's also horrible code.

And it's a user-friendly shell with all the tab completions and history
searches, which DOES NOT BELONG IN /bin/sh!

~~~
klibertp
Exactly, which is the reason why BSDs were not affected by shellshock. And
moreover, modern tcsh is quite a powerful and full-featured shell, too.

I unfortunately had to switch to Linux a few years ago (after using FreeBSD
for almost a decade) and I still miss how consistent and well laid out BSDs
seem in comparison.

------
mercurial
I like the Pattern thing. However, it seems to me that you're going to quickly
run into trouble if you need to even vaguely emulate shell scripting. Shell
utilities live and die by their options. It's unfortunate Haskell supports
neither named arguments nor default values. Which means that in order to
emulate options, you would need to pass records to your "shell" utility,
which, on top of being cumbersome, forces you to prefix every option in a way
unique to your utility, since you cannot have two records with the same fields
in the same namespace...

~~~
gamegoblin
Quite a lot of libraries here:
[https://wiki.haskell.org/Command_line_option_parsers](https://wiki.haskell.org/Command_line_option_parsers)

~~~
mercurial
That's not the issue.

The issue is that, if you want to simulate both "grep" and "grep -r", you need
two different functions, or you need to have your "grep" function accept a
record of parameters.
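
For reference, the record-of-parameters approach usually looks something like the sketch below: a record with a default value, overridden with record-update syntax. All names here (`GrepOpts`, `defaultGrep`, `grepWith`) are hypothetical, and the matching is plain substring search rather than real `grep` semantics.

```haskell
import Data.Char (toLower)
import Data.List (isInfixOf)

-- Hypothetical options record. Field names carry a `grep` prefix so they
-- do not clash with another utility's record in the same module.
data GrepOpts = GrepOpts
    { grepInvert     :: Bool  -- like `grep -v`
    , grepIgnoreCase :: Bool  -- like `grep -i`
    }

-- Defaults play the role of "no flags given".
defaultGrep :: GrepOpts
defaultGrep = GrepOpts { grepInvert = False, grepIgnoreCase = False }

-- Keep the lines that contain the pattern, subject to the options.
grepWith :: GrepOpts -> String -> [String] -> [String]
grepWith opts pat = filter keep
  where
    norm = if grepIgnoreCase opts then map toLower else id
    matches line = norm pat `isInfixOf` norm line
    keep line = matches line /= grepInvert opts

main :: IO ()
main = do
    let ls = ["Foo", "bar", "foobar"]
    print (grepWith defaultGrep "foo" ls)
    print (grepWith (defaultGrep { grepIgnoreCase = True }) "foo" ls)
    print (grepWith (defaultGrep { grepInvert = True }) "foo" ls)
```

The prefixes on the field names are exactly the cumbersome part being complained about.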

~~~
Gabriel439
Actually, you can do `grep -r` by just combining `grep` and `lstree`. Here's
an example:

    
    
        example = do
            file <- lstree "some/dir"
            True <- liftIO (testfile file)
            grep "Some pattern" (input file)
    

This is an example of how most of Bash's option-heavy ecosystem is an
outgrowth of Bash's limitation as a language (individual commands accumulate
flags to work around functionality difficult to implement within the host
language). I think having a decent host language decreases the need for so
many configuration knobs for every command.

~~~
mercurial
You're right to some extent, but I think many of these 'knobs' have a good
reason to exist (--dry-run, -a, -z for rsync for instance) and cannot be
usefully, or at all, replaced by more composability. And attempting to
implement support for them will run against the limitations of Haskell's
syntax.

Something like OCaml would be better suited, since polymorphic variants, named
and default arguments give a lot more flexibility, though the fact that shell
commands happily return different outputs depending on their options would
still be an issue.

------
falcolas
Please forgive my lack of familiarity with the concurrent workings of Haskell,
but since the Shell streams are based off []/IO, and not Concurrent.Chan, does
this mean one turtle function has to complete (and write its results to
memory) before the next turtle function can run?

To me, magic bits of shell scripts which turtle would need to improve upon
were it to replace said scripts are not the loop constructs, conditionals, or
even the type system (even though it's completely lacking in bash), it is the
ability to use pipes to link processes concurrently.

~~~
joeyh
The streaming section shows some examples of combining turtle functions; this
works the same way as shell pipes.

There's also nothing stopping you from using forkIO to spark off a separate
thread, and doing IO in multiple threads concurrently.

Haskell's IO manager allows multiple threads doing concurrent IO in what looks
like an imperative, one-instruction-after-the-other manner, instead of the
async callbacks you might expect from other languages.
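
A minimal sketch of the `forkIO` approach, using only `base`. The names `sumInBackground` and `result` are made up for illustration; an `MVar` hands the worker's result back to the main thread.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Compute a sum on a separate lightweight thread and wait for the answer.
sumInBackground :: Int -> IO Int
sumInBackground n = do
    result <- newEmptyMVar
    _ <- forkIO (putMVar result $! sum [1 .. n])  -- runs concurrently
    -- The calling thread could do other work here before blocking.
    takeMVar result                               -- blocks until the worker is done

main :: IO ()
main = do
    total <- sumInBackground 100
    print total
```

Both threads read like ordinary sequential code; the blocking `takeMVar` is where a callback would appear in other languages.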

------
Klasiaster
For me combining the best parts of bash and ipython is the way to go. Up to now
this seems more comfortable to me than using subprocess in Python or this
Haskell approach, which needs to be aware of every program's output to give what
it promises. You can easily copy big parts of existing bash scripts and e.g.
add error handling the Python way :) I even think for loops/list
comprehensions are better than the strange bash syntax.

And here is a short example:

    
    
      #!/usr/bin/env ipython3
      #
      # 1. echo "#!/usr/bin/env ipython3" > scriptname.ipy    # creates new ipy-file
      #
      # 2. chmod +x scriptname.ipy                            # make it executable
      #
      # 3. starting with line 2, write normal python or do some of
      #    the ! magic of ipython, so that you can use shell commands
      #    within python and even assign their output to a variable via
      #    var = !cmd1 | cmd2 | cmd3                          # enjoy ;)
      #
      # 4. run via ./scriptname.ipy - if it fails with recognizing % and !
      #    but parses raw python fine, please check again for the .ipy suffix which must be there!
      #
      # ugly example, please go and find more in the wild
      files = !ls *.* | grep "y"
      for file in files:
        !echo $file | grep "p"
      # sorry for this nonsense example ;)
      # it's even possible to access the output of a command by outputvariable.s, .p or .n
      # see file:///usr/share/doc/ipython-doc/html/interactive/reference.html#system-shell-access
    

Better take a look here, it's more complete:
[https://blog.safaribooksonline.com/2014/02/12/using-shell-
co...](https://blog.safaribooksonline.com/2014/02/12/using-shell-commands-
effectively-ipython/)

~~~
codygman
Oh, I'm going to have to see if I can use Turtle with IHaskell tomorrow!

0:
[http://gibiansky.github.io/IHaskell/](http://gibiansky.github.io/IHaskell/)
1:
[https://registry.hub.docker.com/u/gregweber/ihaskell/](https://registry.hub.docker.com/u/gregweber/ihaskell/)

------
tel
But why "turtle"?

~~~
strager
Turtle shell.

~~~
tel
And in retrospect that's quite obvious, hah!

I spent the whole time trying to think how this was connected to LOGO.

------
npsimons
Nice! Now I can add Haskell to my list of languages I can script with.

I'm always on the lookout for new languages I can script with (or at least get
closer to rapid prototyping) for easier learning, testing, problem solving,
etc. I've got templates that I run against linters, style checkers, etc for
many languages and it will be helpful to have even more options.

------
agumonkey
It's not closely related but still, it reminded me of the wonderful
[https://pypi.python.org/pypi/sh](https://pypi.python.org/pypi/sh) to write
'shell' script in python with very low boilerplate.

------
akurilin
This is awesome, I was actually looking for something like that out of sheer
curiosity, but perhaps it'll make it into production at some point.

------
fallat
I've been pushing for alternative shell scripts for awhile now. I mostly stick
to Python and Haskell now. It is great. Highly recommended.

------
amelius
I wonder why it uses the convention:

stdout (input "test/foo")

instead of:

output stdout (input "test/foo")

which would be expected considering the previous line.

------
qznc
Haskell is low on boilerplate? Yes, in general I would agree. Those scripts
however, all have to be prefixed with "{-# LANGUAGE OverloadedStrings #-}
import Turtle main = do". This is tedious boilerplate.

~~~
sukilot
You could trivially have a wrapper program that added that to every script
file before calling runhaskell.
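
The text-rewriting half of such a wrapper could be sketched as below. `scriptHeader` and `wrapScript` are hypothetical names; writing the result to a temporary file and handing it to `runhaskell` is left out.

```haskell
-- The boilerplate every script would otherwise have to repeat.
scriptHeader :: String
scriptHeader = unlines
    [ "{-# LANGUAGE OverloadedStrings #-}"
    , "import Turtle"
    , "main = do"
    ]

-- Prepend the header and indent the body so it sits inside the `do` block.
wrapScript :: String -> String
wrapScript body = scriptHeader ++ unlines (map ("    " ++) (lines body))

main :: IO ()
main = putStr (wrapScript "echo \"hi\"\nsleep 1")
```

This only works for scripts that are a flat list of statements; extra imports or top-level definitions would need a smarter wrapper.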

------
joelthelion
I want to see how you implement the pipe :)

------
meekins
no

------
q3k
I don't really see the point of this, apart from academic research values.

POSIX shell is everywhere - your current Linux and OS X machines, old UNIX
workstations, home routers, servers... Just drop in a file and it will
probably run just fine, unless the author screwed something up completely.
POSIX shell scripts are the perfect bootstrap mechanisms that will run almost
anywhere regardless of architecture.

Haskell, on the other hand, is rarely present in an operating system - if you
absolutely, positively need a higher-level language for „shell scripting”,
then you have a much higher chance of finding a Perl interpreter, or even
Python. Heck, even getting ghc and its basic ecosystem running has always
proved to be a huge burden to me. Try sticking a `cabal install` in your CI
flow, you'll see your job times increase by hours.

Third, there's just the KISS aspect of it - if you're writing something that
has logic so simple it can be stuck in a shell file, why not just write it in
a shell file? You don't need category theory to get a few files installed...

~~~
comex
Because shell is _so_ deficient that even for "simple" things it is really
easy to screw up - when whitespace or special characters in filenames cause
some case you overlooked to screw up due to terrible quoting rules, when
missing arguments cause [1], when you accidentally put bashisms in scripts
labeled /bin/sh, when you suddenly have to do some basic text parsing (e.g.
extracting capture groups from a regex) and have to either switch to perl or
use some ugly bash extension that's incompatible between the version of bash
OS X uses and the newer ones.

So you might want to use a different language - even for purely/mostly
personal use, in which case Haskell would be fine.

[1] [https://github.com/ValveSoftware/steam-for-
linux/issues/3671](https://github.com/ValveSoftware/steam-for-
linux/issues/3671)

~~~
regularfry
That's before you've even addressed the stultifying features of shell as a
language: booleans and tests are odd, arrays are odder, they have things
called "functions" which don't have return values, the list goes on. Basically
if you're writing shell, you _probably_ also have at least Perl available, and
probably Python...

