Bashing the Bash – Replacing Shell Scripts with Python (medium.com)
43 points by tdurden 103 days ago | 42 comments



The key is knowing what to use where. Writing certain scripts in Bash is a far more efficient use of time than using Python just because it's "better".

While it's certainly possible to write bad scripts in Bash like the example given, it's just as possible to write bad scripts in Python. All the time spent learning to do bash-like stuff in Python could just as well be spent learning to write better Bash.

I think the readability of the program is key as well. It's interesting that the author never included the full Python program. Based on the snippets, I suspect it was longer and definitely far more verbose. One of Bash's advantages in working with files and shell commands is that most people already know how to work with files and shell commands at the command prompt, so the clarity is already there. If you replace all that with Python library calls, all of a sudden you're speaking a different language, and comprehension will be slower unless your reader works with those Python libs as often as they use the command line.

In other words, this may all sound good, but be careful with this sort of advice. It's often counterproductive to go down this particular path.


I work with a couple of people who insistently write shell scripts in Python. I know it's anecdotal, but to me it's like the metaphor 'When all you have is a hammer, everything looks like a nail.' The scripts use functions liberally, but the code isn't DRY or self-documenting. Written in shell, it would take a fifth of the LOC, because there's no need to use subprocess or Popen and parse the output. They're not the best developers, but I certainly trust their Python code more than I would trust their shell code. And there's plenty of work to go around, so it's better than nothing.


As my last project developed, I moved it from simple bash scripts directly to (mostly) Python utilities that have their own DSLs: Ansible, Supervisord. I expected to need to look into their source, but between the docs and web searches for workarounds, that never seems to be necessary.

The resulting configuration is all configuration, not hardcoded paths mixed with Popen and random custom parsing.

In cases where you could just exit with a failure, I would consider custom Python worse than bash with too much logic.

I think the question for your coworkers is what are they reinventing and why do they think a custom remake of it isn't a waste of everyone's time? Further, why do they think they will be more employable after making embarrassing custom internal tools? Compare that to knowing how popular tools work and perhaps contributing something small to one of them or at least being able to answer a question about them.


There are times writing a bash script it feels like "omg - a programming language would be much easier for this task"...

So I write a tiny program in the lang of my choice and invoke it from that place in the bash script.

The balance between the two tools is great - your advice is spot on for me.


I completely agree. It really depends on how mission critical the script is. One huge advantage with writing it in another language is the ability to write tests for the script.


I love both bash and python. I almost always write the first version of a script in bash, especially if it involves piping between commands, managing processes, or moving files around the file system. Only when there is significant parsing involved do I reach for python -- and even then, it's often easier to just use python for the bit of parsing that's necessary, within a heredoc using python -c.
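A minimal sketch of that last pattern: the shell does the plumbing, and an inline Python snippet handles just the parsing. The log line and the `bytes=` field here are invented for illustration.

```shell
# Hypothetical log line; only the parsing is handed off to Python.
log_line="2017-08-01 12:00:00 status=200 bytes=5120"

bytes=$(printf '%s\n' "$log_line" | python3 -c '
import re, sys
m = re.search(r"bytes=(\d+)", sys.stdin.read())
print(m.group(1) if m else 0)
')

echo "payload size: $bytes"
```

The same snippet could live in a `python - <<'EOF'` heredoc if it grows past one expression.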

The example script in this article hardly made use of pipes. I think if you tried to convert a shell script that used a lot of piping, the python might not come out so elegantly.

As others have mentioned, bash, or at the very least shell, also has more cross platform support (certainly more than python3). Even on resource constrained or embedded systems, you're likely to have access to a shell, whereas python is not guaranteed to be installed.

Both are good tools, and both excel in different ways; but I'm not prepared to throw out the baby with the bathwater and say bash is useless, always use python.


> I’ve idealized the four steps as separate functions.

Does the author know he can write functions in bash?

I mean, if he wants to compare programming languages, he should at least compare similar programming styles.

Bash is great for OS scripts because it has built-in commands and syntax for dealing with processes and their inter-communication (pipes and such).

Python has NOTHING like that built in. If you want to do this, you have to use a module, read its documentation, and be prepared to write ugly, verbose method calls all around.
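To make the contrast concrete: in shell, wiring processes together is a one-liner, while the Python equivalent (shown only as a comment, for comparison) needs the subprocess module and explicit plumbing.

```shell
# Count unique lines by composing three processes in one pipeline.
unique_count=$(printf 'a\nb\na\n' | sort -u | wc -l | tr -d ' ')
echo "unique lines: $unique_count"

# Rough Python equivalent, for comparison only:
#   import subprocess
#   p = subprocess.Popen(["sort", "-u"], stdin=..., stdout=subprocess.PIPE)
#   out = subprocess.check_output(["wc", "-l"], stdin=p.stdout)
```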


The problem with Bash functions is that they aren't really functions -- more like subroutines. You can't even return a value from a Bash function (just a numeric status code).

Most of the functions the author wrote wouldn't even be possible in Bash, if only because they utilize return values. So your suggestion that he "compare similar programming styles" is practically impossible without using implicit state-passing methods in Bash, which are much more awkward and error-prone than real functions.

Having said that, I agree that writing scripts in Python adds a lot of overhead over Bash, and should only be done if you are implementing significant complicated logic more than just I/O. (For example, retries, exponential backoff, business logic, exception handling, etc.)
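For what it's worth, simple retries with exponential backoff are expressible in plain shell too; a hedged sketch, where `flaky_command` is an invented stub that fails twice before succeeding:

```shell
# Stub standing in for a real command: fails twice, then succeeds.
count_file=$(mktemp)
echo 0 > "$count_file"
flaky_command() {
    n=$(($(cat "$count_file") + 1))
    echo "$n" > "$count_file"
    [ "$n" -ge 3 ]
}

# Retry loop with exponential backoff (1s, 2s, 4s, ...).
attempt=1
delay=1
until flaky_command; do
    if [ "$attempt" -ge 5 ]; then
        echo "giving up after $attempt attempts" >&2
        exit 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
done
echo "succeeded on attempt $attempt"
rm -f "$count_file"
```

Beyond this, though (structured errors, per-failure-type handling), shell does get awkward fast, which supports the point above.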


> You can't even return a value from a Bash function (just a numeric status code).

You can print a textual result and store it in a variable, which is a pretty natural pattern in bash, e.g.,

    set -e
   
    download_command () {
        if type wget >/dev/null 2>&1; then
            echo "wget -q -O-"
        elif type curl >/dev/null 2>&1; then
            echo "curl -sL"
        else
            echo "Error: curl or wget is required" >&2
            exit 1
        fi
    }

    download=$(download_command)
    public_v4=$($download http://whatismyip.akamai.com/)
    public_v6=$($download http://ipv6.whatismyip.akamai.com/)
The numeric status code is best used to indicate errors (as above, with set -e), not overloaded to carry data.

I do agree that I'd prefer being able to return actual structured data and not strings -- but honestly you can pass e.g. JSON back-and-forth between commands that take JSON, and into `python -c 'import sys, json; something(json.load(sys.stdin))'` if you need it for a small part of a script that's generally better written in sh.
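A concrete version of that JSON hand-off: a shell function emits JSON, and an inline Python snippet extracts one field. The JSON shape here is invented for the example.

```shell
# Hypothetical function that "returns" structured data as JSON text.
emit_config() {
    printf '{"host": "example.com", "port": 8080}\n'
}

# Pull one field out with an inline Python call.
port=$(emit_config | python3 -c 'import sys, json; print(json.load(sys.stdin)["port"])')
echo "port: $port"
```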


I roughed out a LISP interpreter in Bash. Since I just stored s-expressions in bash variables and/or returned s-expressions with echo and/or parsed the parens to compute cdr or car, it was not a big success.


Return-values in (ba)sh is text output. You can pipe to and from a function in bash, and you can use replacement [eg: host=$(hostname)].

It's different, but it's a powerful paradigm.
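Both mechanisms in a few lines: a shell function sits in a pipeline like any program, and command substitution captures its text output.

```shell
# A function that reads stdin and writes stdout, like any filter program.
to_upper() {
    tr '[:lower:]' '[:upper:]'
}

# Pipe into it, and capture the result with command substitution.
greeting=$(echo "hello" | to_upper)
echo "$greeting"
```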


You can output data from a function and pipe it just like any other program, right? So that's how you tend to return values from bash functions.


I don't understand your first statement. I thought that if you wrote your own function/routine/whatever, you could have it return anything you might like to handle -- in one side and out the other.

    {
        print_this_garbage_out=$(do_a_little_dance)  # maybe even with awk|sed|perl|python|sql|mom

        retval_for_something_hopefully_cool=$?  # or some other requisite test/set, limit sky

        echo "${print_this_garbage_out}"

        return ${retval_for_something_hopefully_cool}
    }  # don't exit, because that might be silly

Bash is sort of like your machine code: like your machine, it can use anything it has access to, if told properly. So I don't understand "can't even return a value". I don't know much about bash for bash's sake, so I'm wondering if I may have missed something along the way.


This is ignorant and unsubstantiated nonsense. I love Python and I don't enjoy shell scripting, but as a professional programmer I believe it's part of my job to understand bash and sh well enough. Especially if I'm going to throw around buzzy buzzwords like "devops".

"Rewrite all your shell scripts in Python! Now you can ignore shell!"

No.

(I also have a hard time seeing the upside of e.g. Fabric. It's a Python wrapper around the shell, so now you have two things to learn: the actual commands that you need to execute, and how to express those commands in this bespoke Python-to-shell mapping.)

Just learn shell. It's not going anywhere. You'll always have occasion to use it. People will respect you.

(Edit: One trick I like: if I'm writing a shell script and there's some fiddly bit of processing that is a PITA to express in sh, I'll write a tiny Python script that does it and call that from the shell script. Not great for performance, but usually it's a one-off or a small batch, so you don't care. Sometimes it's a small enough Python snippet that you can just use `python -c "..."`.)


This reminds me of the Google guidelines[1] for when to use Bash.

Between Python's extensive standard library and the ability to call shell commands[2] if necessary, I rarely reach for Bash.

[1]: https://google.github.io/styleguide/shell.xml?showone=When_t...

[2]: https://stackoverflow.com/questions/89228/calling-an-externa...


I like the idea, but that Python code doesn't look like "without much real work." The bash script is so much shorter.


I think the author forgets to mention that the primary reason bash is used is its cross-distribution support.

I know that if I write my script in bash, it will work on any small Linux distro.

I think Python is a close second, but I still use bash to bootstrap my environment for my real work.


I've been using POSIX shell exclusively, and spent a week or so converting all of my existing scripts to POSIX a few years ago (when Ubuntu switched its initscripts to run under dash instead of bash). Bash isn't available on tiny Linux distros by default, because it's quite large. Many use dash, including most busybox based distributions.

With checkbashisms and shellcheck, it's pretty easy to spot compatibility issues and fix them.
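An example of the kind of issue those tools flag: `[[ ]]` with `==` is a bashism, while `[ ]` with `=` also runs under dash and busybox sh.

```shell
name="world"

# bash-only form:  if [[ $name == world ]]; then ...
# POSIX form, portable to dash/busybox sh:
if [ "$name" = "world" ]; then
    result="matched"
else
    result="no match"
fi
echo "$result"
```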


Shouldn't busybox-based distros use busybox sh?

(busybox sh is apparently a close cousin to dash, since they both derive from the Almquist shell.)


I dunno, I just read it when researching the state of shells on various distros. Yep, according to notes in the source for the busybox shell:

"This shell is actually a derivative of the Debian "dash" shell by Herbert Xu, which was created by porting the "ash" shell, written by Kenneth Almquist, from NetBSD."

https://git.busybox.net/busybox/tree/shell/Config.in?id=9d70...

So, yes, it is the "busybox shell", but the busybox shell is a version of dash. So, if you target dash, you're good for busybox distros.


And if you're POSIX compliant, you get to target every POSIX platform (GNU/Linux, BSDs, MacOS, GNU/NT) for free :)


It's not quite free, in my experience, but yes, it is more cross-platform than bash.

The price you still have to pay for cross-platform is the different paths, different output formats for system commands, etc. This is particularly pronounced for the stuff most of my lines of shell scripts are used for, where it's interacting almost entirely with system utilities to figure out where it's running and to do things to make sure the system is able to run the software my scripts are installing (which also interacts with the system in intimate and intrusive ways). But, you'd pay this with just about any programming language.

Then again, I also use shell scripts for building/deploying things where a makefile would be overkill (and of course makefiles interact with the shell, too, so building your makefiles for POSIX shell rather than bash is also a good practice), and I make those with POSIX, as well. And, in those cases, it would definitely be OK on damned near any platform. I even worked on Windows for a few months when I got my current laptop because HiDPI support and the graphics drivers on Linux just weren't working well enough, and most of my stuff worked fine there...both in the WSL and from a git bash prompt.


Perl is probably a better choice than Python for cross-distro support.


The thing other languages lack for me is a clean way to execute other programs. With bash it's just "grep whatever", in other languages it's "exec('grep whatever');". Then piping that output to another program is a lot of work.

Bash makes for really good program glue.

I've considered creating a library that scanned your PATH for binaries then created dynamic function signatures to allow calling the function by name. Then that dynamic module can be included, and you'll be able to grep by calling "grep". You could make it chainable and get something like: grep('something').cut('args'); or whatever.

Of course this would only be used for personal utility scripts.


Yes, it's entirely possible to create a shell-like "program glue" experience in Python, and somebody has already done it [1][2]. It's only the Python standard library that is missing a convenient way to shell out.

[1] http://plumbum.readthedocs.io/en/latest/

[2] http://amoffat.github.io/sh/


> The shell language’s only data structure is the string.

Bash has arrays.

> The expr program can convert strings to numbers to do arithmetic.

You don't need to execute expr to do arithmetic:

  $ echo $((37 * 42))
  1554
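Backing up the array point with a small example (note: arrays are a bash feature, not POSIX sh, so this assumes bash):

```shell
# Declare, measure, and index a bash array.
fruits=("apple" "banana" "cherry")
echo "count: ${#fruits[@]}"
echo "second: ${fruits[1]}"
```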


I think Bash is much more than that. For example, I mostly compose chains of commands through pipes, and that extension is not pipe-centric.


I think the scope for Bash has shrunk a lot since Perl, and more so Python, became widely adopted. My usual rule of thumb is: "if it's more than 50 lines, reevaluate whether shell scripting is the best language."

I was working at a place that had been around since the early 80s. It looks like the "official" scripting language was tcsh until around the early 2000s, then Perl was preferred, and around the late 2000s they switched to Python. I ported a bunch of tcsh scripts that missed their Perl transition, and I was impressed (I didn't often see "structured" shell scripts with actual design patterns) by the fact they were still used daily. I feel like my Python version had less boilerplate, better error checking, and was more legible.

If you look at the OS itself, a lot of what used to be done with shell scripts migrated to Python 2 years ago (which made the move to Python 3 tougher)--but the scripts are more legible and maintainable.

I do still love the challenge of running a command and trying to get my result in one line.


This so much. Using grep, cat, cut, and sed to parse log files (in my case, output from quantum chemical calculations) is really nice.


Here's what my take on the script would be in bash (albeit untested):

https://gist.github.com/binaryphile/3cf01870516d5fffafbc84a0...

Whoever wrote the initial bash script was clearly in a rush. I don't hold that against anyone, but it makes it poor fodder for an example of why bash is "bad". In particular, for the kind of task illustrated, it's a far better choice than python and I say that as a (formerly) unabashed pythonista.

It takes some time to get chops in bash, especially since there's no culture of developing shareable libraries, but it's no harder to learn bash via stackoverflow/google searches than it is to learn python. Use python when you need data structures and objects. Use bash to automate what you would normally do on the command line. Is that so hard?


A while ago I wrote a tiny language (called bish) that was an attempt at finding a middle ground for this problem. Essentially, you write your shell scripts in bish, with sane syntax and semantics (example: function return values, which are not really present in bash). The bish compiler then compiles your script to bash.

I haven't touched it in a few years now, but maybe it's of interest to someone: https://github.com/tdenniston/bish


There's nuance and opportunity for shell, Python, Ruby, C, etc. It's good practice to standardize on a couple of languages and treat executables as opaque processes with a limited, well-defined API/ABI. The advantage of scripting languages is reuse across scripts. There is some ability to reuse across shell scripts, but it's often awkward.


I wonder why the author split up the Python program into functions but left the Bash program as just a single blob of code.


The point of my shell project Oil (http://www.oilshell.org/blog/) is to get rid of the shell vs. Python debate.

My claim is that, in general, X isn’t a good bash replacement, where X is Python/Ruby/Perl/JavaScript. I think that’s obvious to some people but not to others (typically X programmers who don’t know shell, which was me in the not-too-distant past).

This article is good because it contains a realistic port from shell to Python. But to me it only succeeds in proving the opposite point -- that porting to Python is a lot of effort, and the result has a lot of detail that doesn't match the problem domain. Error handling is possible, which is good, but awkward. And now you need external libraries like psutil, then a package manager and special deployment tools for Python. (And you probably need some shell scripts to build/deploy your Python...)

One thing I don't see is the complete result, which from the looks of it is pretty clunky compared to the 20 or so lines of bash.

-----

About Oil: I realized that not everyone was getting the point of it, so I resolved to write an “elevator pitch” [2].

I haven’t published that post yet, but the pitch is: Oil is the language you can convert bash scripts to automatically, once they become a maintenance problem.

That might happen at 50 lines for some people or 1000 lines for other people.

Oil will have a more uniform syntax [6], not 4 different sublanguages [3]. It will get rid of word splitting, and have proper arrays [4] and hash tables. It should be easy to write a test framework in it, as opposed to bash test frameworks like bats [5] which actually modify/extend the language because it's not powerful enough.

There are some cases where porting from bash to Python is a good idea. But I've also had the revelation of porting Python to shell and being relieved (e.g. more than one deployment script). I was a Python person and came at it from the opposite side. Shell is really expressive, but it's hampered by horrible syntax and a bad/old implementation.

In summary, the conundrum of porting things back and forth between bash <=> Python/Ruby/Perl/JavaScript is why Oil exists. It seems like this debate will never end, and I hope that the “automatic conversion” property [6] is a unique argument in favor of Oil vs. X.

If anyone is skeptical of my elevator pitch I’m interested in the feedback :) You can also try the first release [7].

[1] http://www.oilshell.org/

[2] http://www.oilshell.org/blog/2017/07/31.html

[3] http://www.oilshell.org/blog/2016/10/26.html

[4] http://www.oilshell.org/blog/2016/11/06.html

[5] https://github.com/sstephenson/bats

[6] http://www.oilshell.org/blog/2017/02/05.html

[7] http://www.oilshell.org/blog/2017/07/23.html


There is no [2] in your list of links. Does it refer to the elevator pitch you included in your post, or did you intend to link somewhere?


Thanks, I fixed it -- there is no real link #1, but I got lazy and didn't renumber all the links.

In that post I promised an elevator pitch, but didn't write it yet. I just wrote it in that comment:

Oil is the language you can automatically convert bash scripts to.

It's exactly what everybody in this thread is debating. I think the debate won't end because both bash and Python have deficiencies for a very common set of tasks. I guess this set of tasks got more important with cloud and containers and whatnot.

Feedback appreciated :)


Unit testing just means running your programs or subprograms with different inputs and checking the result. Nothing about this is hard in bash. Why would it be?


This is usually referred to as functional or acceptance testing. The goal of these types of tests is to confirm that the overall application works as expected from a user's perspective. That is easy with bash. Testing individual units of code inside a bash script is more difficult. Unit tests can be useful for pinpointing which function introduced a bug.

Edit: I want to add that difficult does not mean impossible. Bats exists, and you could structure a bash script to be more easily testable.
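A minimal sketch of framework-free testing of a shell function: define (or source) the function, feed it inputs, and compare outputs. The function and test names are invented for the example.

```shell
# Function under test: appends a .log extension to a name.
add_ext() {
    printf '%s.log' "$1"
}

# Tiny assertion helper: prints PASS/FAIL and exits nonzero on failure.
assert_eq() {
    if [ "$1" = "$2" ]; then
        echo "PASS: $3"
    else
        echo "FAIL: $3 (got '$1', want '$2')" >&2
        exit 1
    fi
}

assert_eq "$(add_ext server)" "server.log" "add_ext appends .log"
```

Stubbing an external command is the same trick in reverse: define a shell function with the command's name so it shadows the real binary during the test.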


Yeah, I just don't understand what actually makes it difficult.


A big part of it might be lack of testing utilities. Things like mocking HTTP calls or stubbing methods will be difficult without modifying the code you're testing.


Strange that no one has mentioned xonsh here:

http://xon.sh


Or Ammonite, for the strongly-typed crowd.

http://ammonite.io/



