Every programmer worth their salt must learn Perl, Bash, and how to use these command-line utils. Chances are you are writing lots of code that just doesn't have to be written.
vi is something that you must absolutely learn. Learning vi (especially with macros) will give you new perspectives on thinking about and working with plain text in ways you might not have imagined before. Most of it might be one-time work, but it generally saves you lots of spreadsheet and scripting work. Many times you even realise you were doing lots of pointless work over the years.
Perl is the next progression on this path, and it gives you a full programming language for anything to do with Unix and text. As long as your application runs on a Unixy operating system, Perl can do it. You can often ship just one file to the server and be confident it will run, with backwards compatibility, and run fast. You can also develop fast and change at will.
Learning bash gives you the means to dump a series of commands into a file and repeat them. But many times what saves you time and effort is learning to use a for loop on the command line. This helps you do a lot of repetitive bulk work.
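As a minimal sketch of the kind of command-line for loop meant here (the scratch directory and file names are made up for the demo), this backs up every .txt file in one go:

```shell
# Scratch directory with two hypothetical files, created only for this demo.
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/one.txt"
printf 'c\n' > "$workdir/two.txt"

# The bulk step: copy every .txt file to a .bak backup.
# Quoting "$f" keeps file names that contain spaces intact.
for f in "$workdir"/*.txt; do
  cp -- "$f" "$f.bak"
done

# Count the backups that were created.
backup_count=$(find "$workdir" -name '*.bak' | wc -l | tr -d ' ')
echo "backed up $backup_count files"
```

At an interactive prompt the same loop is usually typed on one line, with `;` in place of the line breaks.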
Perl is really really bad. Bash is also really really bad. Perl is unreadable and bash is massively insecure unless you chant the magic incantations which even the pros forget.
At most you should learn Perl to learn how not to design a language, and then you should learn bash to learn why such a large percentage of systems are vulnerable to various injections.
Even Apple shipped an OS upgrade that deleted your hard drive if you had a space in the volume name. I'm assuming that was a bash script, because spaces in filenames don't generally matter in any sane language.
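A small sketch of the general hazard being alluded to, not a reconstruction of any particular vendor's script: in sh and bash, an unquoted variable is split on whitespace. The volume name and the `count_args` helper are hypothetical.

```shell
# Helper that reports how many arguments it received.
count_args() { echo "$#"; }

volume="Macintosh HD"   # hypothetical volume name containing a space

# Unquoted: the shell splits the value on whitespace (default IFS),
# so one name arrives as two separate arguments.
unquoted=$(count_args $volume)

# Quoted: the value is passed through as a single argument.
quoted=$(count_args "$volume")

echo "unquoted=$unquoted quoted=$quoted"
```

Swap `count_args` for something like `rm -rf` and the two-argument case is how a stray space turns into deleting the wrong path.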
Hating Perl is somewhat like hating vim, or emacs, or lisp macros.
You can live without it. But you will do a lot of pointless work over long periods of time.
There are lots of people who are ok with that. Me personally, I attempt to live a life where I can do O(1) or O(as few steps as it takes) in anything I touch.
I stubbornly refuse to do work that can be done using a computer.
I wrote an entire blogging system in Perl. Knowing Perl doesn't make me want to use it. Rather, it makes me want to run away screaming. It doesn't compare to vim, emacs, or lisp. Perl is objectively bad.
Like most programming languages, Perl is fine, but there are things in it that could be better. Yes, you can make a mess in it, but you can do that in any language.
P.S. Why did you write a blogging system in a language you obviously hate? :D
I didn't know better when I started (~97) and enjoyed perl for ~5 yrs before I learned better.
Perl is designed to make obfuscated code. Sure, you can try to write non-obfuscated Perl, but that's fighting the language. Clearly, all its magic variables were meant to be used, else they wouldn't exist. It's also got its crazy hacked variables with "local" etc....
Sure, you can get things done in any language. Brainfuck FTW? But some languages are just poorly designed. They're not designed to make maintainable, readable code and avoid bugs; they're designed so it's easy to do the wrong things, make mistakes, make things hard to read, and have errors etc. Both Perl and Bash have these traits. No language is perfect, but many other languages have far fewer of these issues.
I have a better understanding of where you're coming from now, with your main experience of Perl being later version 4 and early version 5 of the language - things have significantly improved since then and it's a lot easier to write Perl code that's both easy to read and maintain these days. You might want to have a look through the Modern Perl book (http://modernperlbooks.com/books/modern_perl_2016/index.html) to get an idea of the direction the language has moved.
And for the more junior programmers out there: Just because a language gives you a feature doesn't mean you have to use it - part of being a good programmer is understanding when it makes sense to use a language's feature and when it doesn't. After all, C lets us directly inline machine code as an array of bytes, but that doesn't mean that we should be using that feature every time we write C code :)
> Every programmer worth their salt must learn Perl, Bash, and how to use these command-line utils. Chances are you are writing lots of code that just doesn't have to be written.
I have written a rather lot of sh (I target POSIX sh, not bash) and I'm decently familiar with common cli tools (coreutils, moreutils, etc.). IME awk+sed cover most of the "fancy" text manipulation just fine; what am I missing out on if I skip perl?
Perl was made so that you don't have to learn dozens of separate things and slap them together with all sorts of ad hoc hacks. Everything comes in one package. The full language can't be captured in this comment, of course. But compared to Bash, you get a C-like syntax (Bash's comes from Algol, so you have to learn a new syntax), error handling, a full superset of regex features, heredocs, qw, associative arrays, arrays, built-in string manipulation, Unix interfaces, DB drivers, first-class integration with command-line tooling, etc.
Think of it like a unix native language. Like the immediate extension to the operating system itself.
Apart from that Perl is very stable, fast, and very deeply committed to backwards compatibility. You can be sure the scripts you wrote years back will run on new Perls.
awk+sed are simple match-do engines: you can't do counting (at least not easily), and you can't do myriad other things like working with more than 2-3 files at one time (iterating simultaneously, in sub-loops, etc.).
If you are dealing with large files, in Perl it's fairly simple to build an index in one pass over the whole file, then use the key to map to the position you want in the file using seek/tell (with pack/unpack). This makes frequent lookups computationally very cheap. In awk/sed you can't do error handling, and you can't read and write multiple files at the same time, at least not easily. In Perl, you get dbm drivers and functions installed, so it's relatively straightforward to build lookup applications of all kinds.
Perl also comes with OO, so if your code gets lengthier you can fit it into neat design patterns.
Last but not least, Perl has CPAN. It's just one of those things. The only real competition to CPAN to date is NPM. Not even Java/Python have a library ecosystem as large as Perl's.
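The index-then-seek idea can be sketched in a few lines. This is a shell approximation rather than the Perl the comment has in mind: awk records byte offsets in one pass, and `tail -c` stands in for seek. The file contents and the `lookup` helper are hypothetical.

```shell
# Hypothetical tab-separated data file: key<TAB>value.
data=$(mktemp)
printf 'apple\tred\nbanana\tyellow\ncherry\tdark red\n' > "$data"

# One pass over the data: record the byte offset at which each line starts,
# keyed by the first field. LC_ALL=C makes length() count bytes.
index=$(mktemp)
LC_ALL=C awk -F '\t' '
  BEGIN { offset = 0 }
  { print $1 "\t" offset; offset += length($0) + 1 }   # +1 for the newline
' "$data" > "$index"

# Look a key up in the small index, then jump straight to that byte
# in the big file instead of scanning it again.
lookup() {
  start=$(awk -F '\t' -v k="$1" '$1 == k { print $2; exit }' "$index")
  [ -n "$start" ] || return 1
  tail -c "+$((start + 1))" "$data" | head -n 1
}

lookup banana
```

For a file this small the index buys nothing; the payoff comes when the data file is large and the same keys are looked up many times.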
If you need something more robust or sophisticated than sh(1) & co., it's best to head up to a statically typed language instead -- I tend to favor go, in part because of the cross-compilable fat binary thing. The lack of static typing is especially annoying when things are getting beefy enough to consider moving away from sh(1).
For prototyping, if you know Perl already, why not; but even there, I personally reach for sh(1) & friends instead.
Python unfortunately still doesn't provide a lot of things like heredocs, qw, multiline lambdas, and many other unixy things. Its regex capabilities are also not up to Perl's standard.
Scripting is a lot about text, but it's not only about text. You need to neatly integrate with many other OS things. More importantly, the language needs to be designed so that such tasks can be done with ease.
Python is more like a good small-application programming language. Not that great if you do heavy OS, text or command line interaction work. It feels like a less flexible Java in that regard.
My experience (switching from heavy use of shell) is that Python is really good at text processing and building data structures very quickly.
Python is a bit wordy for small things but it becomes extremely concise and reusable once things get a tiny bit larger.
Generally I find doing things in Python is straightforward whereas using shell is more of a low level puzzle game of conversions and reformats and fragility.
The puzzles in Python are higher level using better data types and containers. But also it's much easier to wrap things up into modules and reuse code in Python. And yeah Python is slow... but not compared to bash (unless you're using things like parallel, but Python isn't necessarily horrible for that either)
When I was a student, I once interned for a company of Windows developers, a Microsoft shop whose work was all in .NET. I remember that there were some tasks I was able to do, having to do with processing many, many little text files, that everyone was surprised I was able to do, or surprised I was able to do quickly. Every time, it was something that I put together in a few minutes with the usual Unix tools thanks to Cygwin.
It's probably true that game development is really different from, say, web development, when it comes to the relevance of Unix. But I think it's also true that often, when you don't know common Unix utils, you don't know what you're missing. I don't think any programmer could master them and regret it.
I don't regret knowing Unix tools. But I very frequently regret that other people chose to write non-trivial operations in a bash script instead of a real programming language! If Unix utils didn't exist and everyone just used Python instead I'd honestly probably prefer that.
What I do know is that if I encounter the need to debug a bash script I get extremely irritated.
Is it immediately obvious what a script is doing top to bottom? It's probably ok. Does it have a bunch of abbreviated and potentially obscure flags? Does it have conditional logic? I'd probably prefer it to be written in Python.
If 100% of bash scripts were written in Python I would not be upset. Although I have similar thoughts on Python! Once it gets over a couple hundred lines I'd probably prefer it to be written in Rust!!
Incidentally a lot of my recent work is basically glorified Bash scripts (Nix takes care of portability issues for us, but the language is still Bash), and then we've recently created a tool in Python that we invoke as a command-line application.
I try to write my Bash in a tasteful style: I embrace bash features (since my scripts ship with a precise version of bash anyway) where they make things clearer or more concise (for instance, associative arrays). I do use bash parameter expansion, mostly to avoid nesting conditionals or long lines that read like
test ... && declare ...
but I usually include a link to the bash manual page on parameter expansion in a comment next to them as a courtesy. I favor long flags and split most commands with multiple args out across multiple lines.
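To make the parameter-expansion point concrete, here is a small sketch of a few common forms; the variable names and paths are hypothetical, not taken from the scripts described.

```shell
# Long form: a test followed by an assignment.
unset out_dir
test -z "${out_dir:-}" && out_dir="/tmp/build"

# Same effect with parameter expansion: use a default if unset or empty.
unset report_dir
report_dir=${report_dir:-/tmp/reports}

# Trimming with expansion instead of calling basename or sed.
path="/var/log/app/server.log"
file=${path##*/}   # strip the longest prefix matching */
stem=${file%.*}    # strip the shortest suffix matching .*

echo "$out_dir $report_dir $file $stem"
```

These forms are terse, which is presumably why a link to the manual next to them is a kindness to the next reader.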
I do use one bash option via `shopt`, namely `lastpipe`. It lets you assign variables from pipelines, for example by ending a pipeline with `read`, instead of using a subshell on the right-hand side of an assignment. Almost all of my assignments take that form, and there are no unnecessary intermediate assignments, virtually no mutation of variables, and very few ifs or case statements. I think and hope that it's possible to read any chunk of these scripts, then, with very little context: you can focus on a single pipeline at a time without regard for the rest of the program or any variable assignments.
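A minimal sketch of the `lastpipe` pattern described above, assuming bash 4.2 or later run as a script (the option only takes effect when job control is off). The input words and variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Run the last command of each pipeline in the current shell,
# so a trailing `read` can set variables that survive the pipeline.
shopt -s lastpipe

# Hypothetical pipeline: find the longest word in some input.
printf 'pear\nbanana\nfig\n' |
  awk '{ print length($0), $0 }' |
  sort -rn |
  head -n 1 |
  read -r longest_len longest_word

echo "$longest_word ($longest_len)"
```

Without `lastpipe`, the `read` would run in a subshell and both variables would be empty after the pipeline finished.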
I also just don't do any parsing or validation of arguments in bash. Once I feel I need those, I use something else. Those scripts aren't things I present to outside teams as interfaces.
I find it easier to read my bash scripts than our Python code, but I don't have much history with Python, and I'm also partial to those kinds of pipelines— they're basically the same as method chaining with `.` to do basic FP stuff in OOP languages like Scala and Java and Ruby.
But I also wonder where the line is and what business my team even has writing Python when none of us has much Python experience at all. We've succumbed to the standard argument that we should use Python because it's easy to hire for, which I think is probably a bad one. As the program gets larger and more complex, does anything come after Python for us? How do we know when it's time or if a given program should (from the start) go there?
I don't know yet, but as I try to level up my Python skills, I'm also starting to learn Rust with writing CLI tools in mind, with the book Command-Line Rust. I'm hoping that after that, the upcoming Refactoring to Rust will help me devise a way to decide when switching to Rust is actually a good vs. bad idea, as well as how to do it.
I love PowerShell, but I've yet to meet someone who 'lives in' PowerShell the way so many 'live in' Bash and the various GNU CLI utilities, which I think inhibits familiarity of the kind most convenient for whipping up quick scripts like that.
That said, PowerShell is much more a 'real programming language' than the shell languages that came before it. PowerShell scripts are a lot more maintainable than scripts written for those other shells. And there are some really powerful things built on top of it (like DSC, for example) that don't to my knowledge have analogues written directly in other shell languages. Plus PowerShell has an actual package management story, code signing capabilities, and other modern niceties.
But for me and I suspect most others, it falls down as an interactive environment for reasons of performance and verbosity, and Windows environments themselves are unfortunately still harder to automate or inspect programmatically because most applications don't provide or document PowerShell-friendly interfaces. So I don't think it's surprising that despite all of PowerShell's incredible advancements, you rarely meet Windows users who have invested in it as much as the average longtime Unix user invests in their respective CLI utilities.
But for, say, game developers who say 'Unix is not my environment; I'm not sure I wanna bother with all of that CLI and scripting stuff', they do nowadays have an alternative in that arena which is outstanding in a lot of positive ways. I don't think anyone would regret deeply learning PowerShell, either.
Hell, my team at work is all Unix guys. I probably have the most PowerShell experience of any of us, which is way less than my GNU command line environment experience. But if someone with deep PowerShell experience joined our team, I'd absolutely encourage them to use PowerShell on Linux wherever we write scripts, and be happy to learn whatever I needed to to keep up. PowerShell is a nice enough language that I think it's very reasonable to bring it with you to other operating systems as well.
Eh; Perl just feels to me like a cancerously overgrown AWK. For simple tasks, AWK is tremendously more elegant. For anything too complex for AWK (which is a rather high ceiling) there are a whole host of scripting languages I'd select before Perl. AWK is available in some form on many more POSIX-ish environments than Perl, and the most modern implementations are lightning-fast.
To make awk work for non-trivial manipulations you have to use cat, awk, | and > in several sequences.
This is not by any means elegant. What happens when there is a line that needs some error checking and has to be handled a little differently? Or just a bad line? Once you realise that error tracing and fixing is just not possible for non-trivial stuff, you are better off skipping the entire ceremony and simply using Perl.
As a matter of fact, Larry Wall invented Perl while trying to do the very same things you mention with C and awk. Then you realise you are dealing with so many ad hoc things, without dependency checks or error handling, and that you actually need a full-blown programming language to do this sort of work.
I've written AWK programs that perform decidedly non-trivial manipulations on data, and none of the problems you describe are real. It is absolutely not the case that one must shell out to a pipeline of other commands to get work done; you can write structured programs in AWK just as you would in any other scripting language, and operate in multiple passes over data.
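For illustration only, not the commenter's code: a sketch of a structured AWK program with a user-defined function, a check for bad lines, and two passes over the same file. The file contents and the `valid` function are hypothetical.

```shell
# Hypothetical input: name and score per line, plus one malformed line.
scores=$(mktemp)
printf 'ann 70\nbob 90\ncat 80\nbad-line\n' > "$scores"

# The same file is named twice, so awk reads it in two passes.
above_average=$(awk '
  # A line is valid when it has exactly two fields and the second is a whole number.
  function valid() { return NF == 2 && $2 ~ /^[0-9]+$/ }

  # Pass 1 (first time through the file): accumulate the total, report bad lines.
  NR == FNR {
    if (valid()) { sum += $2; n++ }
    else { printf "skipping line %d: %s\n", FNR, $0 > "/dev/stderr" }
    next
  }

  # Pass 2 (second time through): print names scoring above the average.
  valid() && $2 > (sum / n) { print $1 }
' "$scores" "$scores")

echo "$above_average"
```

The `NR == FNR` test is the usual idiom for telling the first pass from the second; it assumes the first file is not empty.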
It's a very amusing irony if a Perl enthusiast genuinely believes that AWK can only be used in the form of cryptic one-liners, especially in the context of POSIX tools one "must" learn.
The only reason I care about vi is that back in the day it would be the only sane thing installed by default on many UNIX boxes, unless I wanted to have some fun with ed.
And until IDEs became common in UNIX land, I would rather be in the Emacs camp.
vi macros are those magical things that one must just know. Once you know how to use them, it's really one of those things in text manipulation that help you convert manual O(n^k) tasks into O(1) tasks. And a lot of things in our industry can be reduced to text manipulation.
Some of these examples are genuinely hilarious -- I honestly laughed aloud at the example file given for csplit (https://blog.robertelder.org/intro-to-csplit-command/), one command I don't routinely use but now shall certainly do so more...
Brilliant to begin every video with "blah is my favorite command." I watched the video for 'paste' first and I thought it wonderful that someone else also thinks paste is their favorite.
Piping tail to grep is another example I was hoping to see. A simple use case would be to show a filtered live stream of incoming error messages printed to a log.
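A sketch of that use case, with a made-up log file and a finite demo so it terminates; the live form is shown in the comments.

```shell
# Hypothetical log file with a few entries.
log=$(mktemp)
printf 'INFO start\nERROR disk full\nINFO retry\nERROR timeout\n' > "$log"

# Live use follows the file as it grows and runs until interrupted:
#   tail -f "$log" | grep 'ERROR'
# If the output is piped further, GNU and BSD grep accept --line-buffered
# so that matches are not held back in a buffer.

# Finite demo: filter only the most recent lines.
recent_errors=$(tail -n 100 "$log" | grep 'ERROR')
printf '%s\n' "$recent_errors"
```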
Actually, I'm very glad that Thompson, Ritchie, and friends used these strange names for the commands they implemented (supposedly because they didn't like to type long words). This leaves longer, more meaningful names open for me to use in naming commands I write.
In any case, (a) UNIX (as it was spelled then) antedates C, so it's more that the UNIX spelling conventions influenced C's development; and (b) lots of earlier and later programming languages have had mystical abbreviations. For example, ALGOL68 replaced “integer” with “int”, for reasons I have never understood; Wirth's original CDC 6600 Pascal had a data type “alfa”, which was a 10-character string; William Wulf's Bliss language had library routines with names like “getnum” rather than “get_number”. In PL/I, you can write “proc” for “procedure”, and “dec” for “decimal”.
Me, I prefer to program in a language that has primitive procedure names like “call-with-current-continuation”.
>> For example, ALGOL68 replaced “integer” with “int”, for reasons I have never understood;
It's helpful to understand the context which evolved this desire for terseness.
A) In the 60s through to the early 90s, to communicate with the "computer" you used a terminal. Terminals used a serial connection, which started out life very slow. Sending 7 chars (integer) took measurably longer than sending 3 (int). Commands used a lot (ls) benefited from being short.
B) CPUs were slow. It's hard to fathom now but not only were the CPUs slow they were shared (in many cases) by multiple terminals. Simply processing extra letters, matching strings to known commands, goes up the longer the command.
C) Storage costs would make your eyes water. Source code was terse partly for compiling performance, but also to lower storage cost. "Huge" storage meant "a few megabytes". Most storage was measured in kilobytes.
D) Same for RAM. Storing the word "integer" in RAM (in your text editor) took up RAM that could be used elsewhere.
Of course none of this applies now. But unfortunately habits learned early are hard to break.
> B) CPUs were slow. It's hard to fathom now but not only were the CPUs slow they were shared (in many cases) by multiple terminals. Simply processing extra letters, matching strings to known commands, goes up the longer the command.
DEC systems often had (at least) two options for serial communications, a cheaper PIO-based one and cards that could do DMA (yes, DMA for 300-9600 baud serial lines). The former would cause more CPU load. Yes, minicomputers were so slow that handling a 9600 baud serial line (~900 characters per second at most) wasn't negligible.
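The serial-line numbers in this subthread are easy to check with a little arithmetic. This sketch assumes 10 bits on the wire per character (start bit, 8 data bits, stop bit), which is an assumption of mine and not something the comments state; the helper names are hypothetical.

```shell
# Milliseconds needed to send a string at a given baud rate,
# assuming 10 bits per character (integer division).
tx_ms() { echo $(( ${#1} * 10 * 1000 / $2 )); }

# Characters per second at a given baud rate, under the same assumption.
chars_per_sec() { echo $(( $1 / 10 )); }

long_ms=$(tx_ms integer 300)    # 7 characters at 300 baud
short_ms=$(tx_ms int 300)       # 3 characters at 300 baud
fast_cps=$(chars_per_sec 9600)

echo "integer: ${long_ms} ms, int: ${short_ms} ms, 9600 baud: ${fast_cps} chars/s"
```

Under that assumption, "integer" takes 233 ms against 100 ms for "int" at 300 baud, and a 9600 baud line tops out at 960 characters per second, in the same ballpark as the figure quoted above.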
WRT actual code editing and terminal usage the early years were dominated by text interfaces with typically at most 25 lines by 80 characters. (Greater height and width came with a steep increase in price).
This fostered shorter meaningful "names" for everything from types to variables to system commands; things were far easier to grasp when short operations fitted on a single line, at worst spilled onto a few others, and functional groupings fitted on a screen.
With 64 kilobytes of core memory to run in (and no swap), linkers were usually limited to 6 or 8 character symbol names. Compilers could let you name your variables anything but "my_value" and "my_valence_count" would still represent the same thing in memory. Being purposefully parsimonious with your names was good defensive programming practice.
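A tiny sketch of that collision, using the two names from the comment. The `truncate_symbol` helper is hypothetical and only mimics the effect of a length limit; it is not how any particular linker worked.

```shell
# Keep only the first N characters of a symbol name,
# mimicking a linker that ignores everything past its length limit.
truncate_symbol() { printf '%s\n' "$1" | cut -c "1-$2"; }

first=$(truncate_symbol my_value 6)
second=$(truncate_symbol my_valence_count 6)

if [ "$first" = "$second" ]; then
  echo "collision: both names become $first"
fi
```

With a 6-character limit both names collapse to the same symbol; with an 8-character limit these two particular names would still differ.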
I think powershell is meant to be tab-completed, more than typed at 300wpm? I still prefer terse, though; pretty sure I can type `tar xvf` faster than I can tab-complete `tar --extract --verbose --file`.
Not the parent commenter. I prefer PowerShell to any Unix shell. It can be tab-completed, including every cmdlet's parameter names. Writing PowerShell scripts is a delight. In fact I can write the core business logic in C#, VB, F#, or any other CLR language, and expose the shell logic with a thin PowerShell wrapper.
Yup. IMO PowerShell's decisions were right for the modern age, just like Unix's decisions were right for its age.
Today terminals are fast and highly functional, memory is cheap, CPUs are blazing fast too. Our modern bottleneck is not the computer struggling to intake commands fast enough, but people struggling to understand what the heck Bob was trying to accomplish 3 years ago.
And for that, a verbose scripting style that saves you from looking up every single parameter in the manual is very helpful.
> (supposedly because they didn't like to type long words)
It’s possibly more to do with the imposition of having to send commands with a teleprinter running at 75 baud than a preference for terse commands. Terminals in the very early 70s weren’t much better: limited to 72x20 characters, and not hitting the heady speed of 9600 baud until much later in the 70s.
That seems backwards to me, short commands should be where users place their daily workflow of aliases. Longer names should be where the kind of static, shell scripting "API" sits. Imagine a world where we could set 'mv' and 'ls' to something other than the `mv` and `ls` binaries without breaking our systems...
PowerShell fixed that issue and faces an incredible amount of snobbery, with developers dismissing it because they mock the "verbosity" of commands like:
Get-Content -Tail 20 -Wait
Never mind that it has tab-completion of not just the commands but the parameter names too. Never mind that it deals with piping actual objects around rather than forcing everything to be a string. What really matters to developers' hearts is that everything they type is 2 characters or less and that they can demonstrate their extensive memory for whether it's du or df for checking space on a drive.
You learn it like a language and it soon becomes second-nature to refer to chowning, chmodding, catting, and fscking. Much better than that stupid lowest-common-denominator horridly verbose tripe that gets peddled today.
You can practice some of the exercises using this interactive TUI app: https://github.com/learnbyexample/TUI-apps/tree/main/CLI-Exe...
See also: https://ratfactor.com/slackware/pkgblog/coreutils, https://maizure.org/projects/decoded-gnu-coreutils/ and https://www.pixelbeat.org/docs/coreutils-testing.html