You can read and write binary, including null bytes, in pure bash, without even subshells, let alone external processes like dd and xxd.
This is a driver/client for a vintage portable disk drive that has an rs232 interface. Any disk drive obviously has to handle arbitrary data including binary including 0x00.
It's entirely in native bash with no external tools and not even any subshells. It does have to call stty and mkfifo once at startup, but that's just setup, like opening the serial port, not part of the work.
The gimmick for reading nulls is not exactly efficient. Basically you read one character at a time and look at the exit status from read to detect the difference between "got nothing" and "got 0x00".
It's fine for this case because the drive is so slow and small that even the entire disk is only 200k max. Making bash read one byte at a time for 200k is still nothing today but only because hardware is insane today.
But it's possible. And the beginning of the post does say "as a thought experiment".
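For anyone curious, the core of that read loop can be sketched like this (a minimal reconstruction, not taken from pdd.sh; the function name is mine). The key quirk: bash can't store a 0x00 byte in a variable, but read still reports success, so a successful read that stored nothing means the byte was NUL.

```shell
#!/usr/bin/env bash
# exit 0 + empty var  -> the byte was 0x00
# nonzero exit        -> nothing left to read (EOF)
read_one_byte () {
  IFS= LC_ALL=C read -r -d '' -n 1 byte
}

count=0 nuls=0
while read_one_byte; do
  count=$((count+1))
  [ -z "$byte" ] && nuls=$((nuls+1))
done < <(printf 'a\0b\0\0')
echo "$count bytes, $nuls NUL"
```

With the sample input above this reports 5 bytes read, 3 of them NUL, even though bash never stored a NUL anywhere.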
Similarly, you don't need xxd to convert back & forth between raw binary and text encoding.
My own similar thought experiment included the idea that, if I'm not going to use any external process I can possibly avoid, and only use internal bash features, then on the flip side of that, I will squeeze bash for all it's worth and allow every bashism possible.
printf % codes and brace expansions, and printf -v to write directly to variables without needing a subshell or a temp file.
See file_to_fhex() for an example of reading binary into hex pairs.
That example is simpler to read than tpdd_read() because reading from a file is simpler than reading from a stream wrt detecting eof.
Read a byte, use printf to convert that byte to a hex pair. The funny-looking syntax with a single ' in front of the variable is important.
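A hedged reconstruction of what that might look like (file_to_fhex is the real function; this simplified stand-in is mine). The leading single quote in "'$c" makes printf use the character's ordinal value, per POSIX printf rules, and the empty-but-successful read case is 0x00:

```shell
#!/usr/bin/env bash
# Read a file into an array of hex pairs, pure bash.
file_to_hex_pairs () {
  local c h
  local -a hex=()
  while IFS= LC_ALL=C read -r -d '' -n 1 c; do
    if [ -z "$c" ]; then
      h=00                        # read succeeded but stored nothing: NUL
    else
      printf -v h '%02x' "'$c"    # 'X -> ordinal of X, written into h, no subshell
    fi
    hex+=("$h")
  done < "$1"
  echo "${hex[*]}"
}
```

For example, a file containing the bytes A, B, 0x00 comes back as "41 42 00". Note printf -v writes straight into the variable, so there's no subshell anywhere in the loop.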
See tpdd_write() for writing hex pairs back out to binary.
brace-expand "aa bb cc" to "\xaa\xbb\xcc", feed that to printf to output binary.
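Roughly like this (the function name is mine, and I'm showing the expansion step with a parameter-expansion substitution standing in for the technique the comment describes):

```shell
#!/usr/bin/env bash
# "aa bb cc" -> "\xaa\xbb\xcc" -> raw bytes on stdout
hex_to_binary () {
  local pairs="$*"
  printf '%b' "\\x${pairs// /\\x}"   # %b expands \xHH escapes in bash
}

hex_to_binary 41 42 43   # emits the bytes "ABC"
```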
All through this script I'm using arrays instead of simple variables to hold the hex pairs. That's not necessary, I'm just doing that because this app needs to do a lot of byte/position counting on the data, both for parsing the output from the drive and for sending commands to the drive. It's all packets of fixed length fields.
Meanwhile I at least 50% want to apologize for it.
I know full well that it is crammed full of inscrutabilities, relies on side effects, implicit behaviours, an ocean of globals & other state, and isn't even fully consistent with internal conventions.
This would be doing a bad job at work.
Some of that is just working with what you have, like the globals are how you avoid needing subshells so that's justified.
But there are other things, like a lot of "business logic" implemented in the form of arcane conditional brace expansions and the particular return values of some commands, etc., that could have been done more scrutably with ordinary, readable, explicit logic, just somewhat more verbosely.
I'm the first to say it ain't readable. It's really a great example of "just because you CAN do something..."
But I wanted no dependencies (that aren't likely already there), cross platform, doesn't break in 6 months simply because 6 months have passed, yet interpreted rather than a C program. Awk would tick the same boxes. I have both awk and ksh scripts I wrote in the 90s on Xenix that still work the same today on Linux, Mac & BSD, while a Python script breaks in a year.
Windows 98 has me so scarred that, decades later, I always assume my computer is going to crash at any moment, and still periodically hit Control+S as an idle twitch.
Two crashes a couple months apart led me to switch to Linux. I then read up on rsync and using hard links to make incremental snapshot backups. Coupled with doing this over ssh, I now had multiple automated backups.
As my laptop got older, it kept running fine b/c of Linux but I wanted a new laptop. My girlfriend at the time was using said laptop and drinking and said "Oh, do you want me to not drink over your laptop?"
I replied "that laptop has every bit of important data I need backed up in two different locations. Please spill whatever you want on it as that would give me a valid reason to go buy a new one."
Old hacker culture, no; current hacker culture, yes. Anime-themed ricing of Linux distros is extremely common, and there are probably more anime profile pictures than not in some communities I hang out in.
I guess I don't consider "ricing" your Linux distro as hacker culture. The people I knew from college CS courses who did that almost entirely spent more time tweaking themes than hacking (based on their success in the CS program). I see far more anime avatars associated with anti-social Twitter trolls than with actual hackers.
I was expecting this to just be "the netcode was written in Bash. For the actual game state management, we just use existing X." But no, it looks like they've covered ALL of that.
It's such a nebulous thing, eh? I'd say `Awk` (like curl, sed, grep, etc.) is part of the Bash toolkit. But then why isn't Python?
I guess the idea is that Bash is about piping together these small programs that typically read stdin and write stdout. I think these have a name... like GNU-style programs or something.
I haven't worked extensively in Bash alone but I did work extensively in Make for a while and I can say with confidence: the specialized old tooling languages are extremely good at what they do, Make goes out of its way to make building things extremely slick. These language tools were designed to tackle every problem in their realm because, for a lot of people, there wasn't an alternative - their structure shows clear and careful design intent.
Can you give some examples? I've personally found bash to be among the most difficult to work with. If I need to do something in bash I'd rather keep it as a lightweight wrapper around a ruby script or something, though I know that isn't always an option
Bash is about pipelines which is a rather functional concept. If you can exploit these you can write succinct, readable and effective bash IMHO. If you're just stringing together a lot of conditionals it gets ugly fast.
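An example of the pipeline style the parent means: each stage is a pure transformation over its input stream, and the whole thing stays one readable line (sample data is made up):

```shell
#!/usr/bin/env bash
# most frequent word: count, sort by count, take the top line
top=$(printf '%s\n' apple banana apple cherry banana apple |
  sort | uniq -c | sort -rn | head -1)
echo "$top"
```

Compare that with the nested loops and conditionals the same job would take in "imperative" bash; that is where it gets ugly fast.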
There isn't such a thing as readable bash, as a significant amount of its syntax is symbolic (and cryptic, by the way), and it's full of warts and unintuitive quirks.
Writing very simple scripts is certainly simple, but anything even a bit more functionally advanced (arrays, but even iterating over or splitting strings is quirky) is a pain. Heck, even something as simple as process substitution is horribly unsafe (I wonder how many devs know how/why).
It's practically impossible to write correct Bash without shellcheck.
I disagree that bash needs to be cryptic, but I've also written perl code, so maybe any resemblance of a sane perspective was beaten out of me.
> It's practically impossible to write correct Bash without shellcheck.
I agree with this, since I've never seen a bash script in the wild that passed the important bits of shellcheck without already having been run through shellcheck. I challenge anyone who thinks they've written good bash, with any significant level of complexity, to run that script through shellcheck. Each time I've done this when starting at a new company, the response has usually been "Will not fix. That's why you don't use spaces in filenames/paths".
But it's not a developer's choice - Bash's syntax is cryptic itself. Here's a random sampling of common patterns:
Arrays:
- `${#myvar[@]}`: unreadable
- `mapfile myvar < <(grep pattern file)`: "mapfile" is not exactly a clear term (there's the synonym "readarray", but "mapfile" is the reference; worse, you may find both used inconsistently); the command as I wrote it also has two problems

Strings:
- `${myvar##*.}`: unreadable
- `echo $(IFS=,; echo "${myvar[*]}")`: unreadable; most also don't know the difference between `[@]` and `[*]`; this also doesn't work for two-char join operations
- `[[ $myvar =~ $(echo '\bpattern\b') ]]`: unreadable, and most don't know why command substitution is needed

Redirections:
- `3>&2 2>&1 1>&3`: most don't know what this means
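For what it's worth, that last one swaps stdout and stderr by parking one of them on a spare descriptor. A small demo (the function name is mine):

```shell
#!/usr/bin/env bash
# 3>&2 : fd 3 = old stderr
# 2>&1 : stderr = old stdout
# 1>&3 : stdout = old stderr (via fd 3)
# 3>&- : close the temporary fd
swap_demo () {
  { echo "to stdout"; echo "to stderr" >&2; } 3>&2 2>&1 1>&3 3>&-
}

swap_demo 2>/dev/null   # only the line originally aimed at stderr survives
```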
One can certainly get an idea of what's going on; but one gets an idea of what's going on also when reading assembly with labels.
If you are manipulating files and processes, bash is a very natural and succinct language to do so. Programs are much more elegant than in a generic programming language.
I notice that the author is going into awk for floating point processing. The 1993 language standard for the Korn shell brings floating point into the language as a native data type. This advanced version of the Korn language does not appear to be under active maintenance, but it will likely be much, much faster than handling floating point in awk. Unfortunately, none of the BSD-licensed clones of Korn support the 1993 standards (including floating point).
I also see that the author is having some trouble reading individual bytes. Perhaps these shell tricks might be useful.
At one point I had an NBT parser almost fully implemented, but I decided it was not worth the hassle to finish it. The code is now lost, due to my extensive use of tmpfs as a project directory, and a system crash.
Hah, are you me? I also lost some Minecraft server management stuffs written in bash in a tmpfs dir. It was years ago, but I 100% feel your pain.
Also, well done! This looks like it was a lot of fun.
Where there's a will, there's a way. I once needed to do network calculations where I only had some very limited tools, so I used awk to convert 32-bit ints to a 32-character ascii bit string of "1" and "0" and the converse function, then wrote and, or, and not functions that worked on the ascii strings.
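A hedged reconstruction of that awk trick (all function names here are mine): integers convert to 32-character "0"/"1" strings, the converse walks the string back, and bitwise AND is built character by character on the string form:

```shell
#!/usr/bin/env bash
result=$(awk '
function to_bits(n,   i, s) {
  for (i = 31; i >= 0; i--)
    if (n >= 2 ^ i) { s = s "1"; n -= 2 ^ i } else s = s "0"
  return s
}
function from_bits(s,   i, n) {
  for (i = 1; i <= 32; i++) n = n * 2 + substr(s, i, 1)
  return n
}
function bits_and(a, b,   i, s) {
  for (i = 1; i <= 32; i++)
    s = s ((substr(a, i, 1) == "1" && substr(b, i, 1) == "1") ? "1" : "0")
  return s
}
BEGIN { print from_bits(bits_and(to_bits(202), to_bits(240))) }
')
echo "$result"
```

With 202 and 240 (0xCA & 0xF0) this yields 192, the kind of netmask arithmetic the parent describes, with no bitwise operators needed at all.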
set -e is not global. It only applies in some cases; in other contexts (inside a function when it is called as a condition, e.g. `if myfunc`) errors will not be caught, and once you realise that, you realise there's no hope of writing error-proof logic in bash.
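A minimal demonstration of the pitfall (the script is mine): inside the body of a function invoked as an if-condition, errexit is suspended, so the failing command does not stop anything.

```shell
#!/usr/bin/env bash
set -e

boom () {
  false                 # under plain set -e, this line would abort the script...
  echo "still running"  # ...but not when the caller is part of an "if" test
}

if boom; then
  echo "boom reported success"
fi
```

Both messages print: the `false` is silently ignored and the function's status comes from the final `echo`.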
Ah, I'm sorry. I didn't realize the problem with functions and indeed misunderstood.
I mean, while it's amazing what you can do with bash, I'd be wary to use it for "production" stuff. So far, I mostly used it for "toolbox" type of stuff that is only supposed to be used by the devs. For that purpose, it worked well though.
Reading stuff like this makes me think about how much more... "Fun" game programming is than the stuff I do in web dev.
These problems (yes, self-constrained) look like so much fun to solve, in a way that "I did xyz thing in pure CSS" is just less so to me.
Maybe it's that I miss the rapid cycle of Completely Lost -> Earth-shattering Realization of How To Make This Work that the first ~1-2 years of my programming journey were full of.
The web has some pretty cool stuff to tinker with these days. WebGL in particular would give you that sense of learning you’re craving, is practical to know and IMHO has a much more pleasant feedback loop than C++.
I am in awe. Also, I think you are a huge masochist! It's like wood working with cutlery. But it's actually really informative about some advanced bash knowledge. Kudos!
The results you post don't seem to differ much, do you mean the difference is bigger as the inputs get larger?
Spawning an extra subshell inside 'time' seems strange, as does using echo instead of </dev/null.
Also: people writing things in bash usually try not to rely on utilities that are not present by default, and wcalc is not.
I searched for "font" in this thread and certainly did not expect this comment. I found it so hard to read that I added custom CSS rules to replace it with 'sans-serif' for the body and 'monospace' for the code.
I just love this. This is why I love programming, and writing a forking Minecraft server in C was one of the most fun things I did in computing. I still have the source for that experiment. Good times!
https://github.com/bkw777/pdd.sh