
The Bourne Shell Source Code - kick
https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh
======
kick
This is the source to the original Bourne Shell, shipped in Research UNIX v7.
You've probably used GNU Bash, which stands for "Bourne-Again SHell."

The Bourne sh is significant for a few reasons. Primarily, its GNU descendant
is now installed on billions of devices.

Perhaps of more interest: Bourne's sh source code heavily abuses C macros to
look and feel like ALGOL-68. This is made more significant because it came
before C was standardized: it took real knowledge to abuse _that_ much.

While this is C that compiled just fine for the day, and might compile
_mostly_ without errors for a compiler with a K&R compatibility mode, it's
absolutely wild, and is written compensating for some of K&R's faults (see how
it handles true v. false).

I recommend mac.h as a file of particular interest:

[https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh/mac.h](https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh/mac.h)

Bonus points to anyone who understands what these three lines are doing within
it:

    
    
         #define LOBYTE 0377
         #define STRIP 0177
         #define QUOTE 0200
    
    

Also of interest:

This was specifically the reason that the International Obfuscated C Code
Contest was created. It was started just minutes after its founders compared
notes on Bourne's sh (I'm sorry for formatting this as a code block, but the
formatting breaks otherwise):

    
    
         Q: How did the IOCCC get started?
         A: One day (23 March 1984 to be exact), back Larry 
        Bassel and I (Landon Curt Noll) were working for National 
        Semiconductor's Genix porting group, we were both in our 
        offices trying to fix some very broken code. Larry had been
        trying to fix a bug in the classic Bourne shell (C code 
        #defined to death to sort of look like Algol) and I had been
        working on the finger program from early BSD (a bug ridden 
        finger implementation to be sure). We happened to both 
        wander (at the same time) out to the hallway in Building 7C 
        to clear our heads.
    
         We began to compare notes: ''You won't believe the code
        I am trying to fix''. And: ''Well you cannot imagine the 
        brain damage level of the code I'm trying to fix''. As well 
        as: ''It more than bad code, the author really had to try to
        make it this bad!''.
    
        After a few minutes we wandered back into my office 
        where I posted a flame to net.lang.c inviting people to try 
        and out obfuscate the UN*X source code we had just been 
        working on.

~~~
eqvinox
> Bonus points to anyone who understands what these three lines are doing
> within it:
    
    
         #define LOBYTE 0377
         #define STRIP 0177
         #define QUOTE 0200
    

That's just 0xff, 0x7f and 0x80 in octal, and the high bit used to be a flag
for all kinds of "magic" behaviour back when 7-bit ASCII was the norm...
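To make the high-bit trick concrete, here is a minimal sketch in C. It assumes, as the macro names suggest, that QUOTE tags a character as quoted and STRIP recovers the plain 7-bit character. The helper names are hypothetical and are not taken from the shell source.

```c
/* The three masks quoted above, with their hex equivalents. */
#define LOBYTE 0377  /* 0xff: keeps a full 8-bit byte */
#define STRIP  0177  /* 0x7f: keeps only the 7 ASCII bits */
#define QUOTE  0200  /* 0x80: the high bit, used as a flag */

/* Hypothetical helpers for illustration only. */

/* Set the high bit to mark a 7-bit character as quoted. */
int mark_quoted(int c)
{
    return (c | QUOTE) & LOBYTE;
}

/* Report whether the high-bit flag is set. */
int is_quoted(int c)
{
    return (c & QUOTE) != 0;
}

/* Drop the flag and return the original 7-bit character. */
int strip_quote(int c)
{
    return c & STRIP;
}
```

The obvious cost of this scheme is that any genuine 8-bit input collides with the flag, so it only works while text is guaranteed to be 7-bit ASCII.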

~~~
teknopaul
I find myself validating a lot of input to be ASCII still. I think it's time to
write a lib to make use of all those wasted bits.

~~~
teknopaul
I signed the above post with an emoticon that does not render. Could it be
that HN is not 8-bit safe?

~~~
detaro
HN filters out emoji (well, most of them, the blacklist appears to be
incomplete)

~~~
codetrotter
> the blacklist appears to be incomplete

I was under the impression that the emojis that are not blacklisted are
intentionally whitelisted.

For example, all of the flag emojis are possible to use. That seems to be no
coincidence.

There are a few others that are possible to use as well, most of which appear
to be logically grouped together.

For example, multiple emojis relating to time are possible to use.

The flags make sense I think. And I could see the ones relating to time being
sort of relevant in post titles about time management for example.

For some reason, some of the symbols resembling media playback controls are
possible to use too.

I don't see any reason for the media playback controls symbols to be available
(while so many others are blacklisted I mean), but it does make the following
possible though:

"This is so sad. Alexa, play Despacito 2."

Now playing: Despacito 2 (feat. Eminem)

⏯ ⏮ ⏭ —————————○—————— (2:01 / 3:14)

Not that there is any immediate practical use of that of course.

I think it would be interesting if the mods would comment on which emojis and
other symbols outside of human writing systems are available and if indeed
that is intended or not, and if intended what they are expecting people to use
them for and why they decided to whitelist exactly those that they did while
still blacklisting a few of the other ones that might be useful.

It might just be like you said also though, maybe they explicitly blacklisted
some symbols only and then more symbols were introduced later and the
blacklist was never updated to include those.

~~~
tyingq
Flags work also: 🇯🇵 🇰🇷 🇩🇪 🇨🇳 🇺🇸

~~~
teknopaul
Techies at theregister.co.uk told me MySQL can't persist some types of UTF-8
chars in their forums. It's not deliberate, but common emojis don't work there
either. Anyone know if HN uses MySQL?

~~~
kick
News, the software behind HN, doesn't, no. The filter for emojis is
deliberate.

------
drallison
Back when I was writing C compilers the Bourne Shell was my nemesis. The
Bourne shell did exercise nearly every "feature" of the C language. Compiling
and then running the shell was a great test case for an optimizing compiler
and turned up many bugs. But, when the compiled code failed, winding back to
the underlying C through all of the macros and optimizations was
exceptionally difficult. I still remember many a late night trying to figure
out what happened. (Thanks Steve, for many fascinating hours of struggle.)

------
gyrator
As someone who used the C preprocessor to generate CPP, Java, and C# from a
common source in order to have a common library for native apps, I always
appreciate a good bit of preprocessor abuse - it's one of the things that
makes C so much fun!

~~~
Koshkin
My favorite idea for extreme preprocessing is using C itself as the
preprocessor language, with, optionally, the only syntactic sugar being
provided by ASP-like brackets.

------
m0d0nne11
I'm another who recalls this mess being used (1990 or so) as the acid test for
C compilers, source code analyzers and debuggers. I can still picture the look
of pride on a certain salesman's face when he demoed a valgrind-like tool for
us that didn't just crumble to pieces when asked to chew on this tangle.

------
silasdb
The whole program relies on definitions in mac.h [1] like:

    
    
        #define IF if(
        #define THEN ){
        #define ELSE } else {
        #define ELIF } else if (
        #define FI ;}
    
        #define BEGIN {
        #define END }
        #define SWITCH switch(
        #define IN ){
        #define ENDSW }
        #define FOR for(
        #define WHILE while(
        ...
    

Isn't this considered bad practice nowadays? After glancing at the code,
I see there might be some advantages, like not forgetting to add missing {}. Is
there any other explanation for why they created a dialect on top of C
using the preprocessor?

[1] [https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh/mac.h](https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh/mac.h)

EDIT: fix English

~~~
kick
Bourne liked ALGOL. A lot. So much that he was one of the few people who wrote
their own ALGOL-68 compiler. Using the preprocessor to feel more at home is a
pretty good idea in this case.
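For a feel of how code written against these macros reads, here is a small sketch in C that uses only macros from the mac.h excerpt quoted above. The function `sign_of` is a made-up example, not something from the shell source.

```c
/* Macros copied from the mac.h excerpt quoted in this thread. */
#define IF if(
#define THEN ){
#define ELSE } else {
#define ELIF } else if (
#define FI ;}
#define BEGIN {
#define END }

/* Hypothetical function written in the ALGOL-flavoured dialect.
 * After preprocessing it becomes an ordinary if / else if / else chain. */
int sign_of(int n)
BEGIN
    IF n < 0
    THEN return -1;
    ELIF n == 0
    THEN return 0;
    ELSE return 1;
    FI
END
```

Note that FI expands to `;}`, so every IF block is closed with braces whether or not the author remembered them, at the cost of a stray empty statement.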

This wasn't particularly popular with anyone who wasn't, well, Bourne, even at
the time. I posted an example here:
[https://news.ycombinator.com/item?id=22199664](https://news.ycombinator.com/item?id=22199664)

------
JdeBP
Interestingly,
[https://news.ycombinator.com/item?id=22188704](https://news.ycombinator.com/item?id=22188704)
, with similar techniques reinvented 4 decades later, was on Hacker News only
yesterday.

~~~
kick
The technique of macroing C to death to look like another language never died!
It's still a very popular thing to do, but it's gotten a bit less wild now
that C has been standardized to death.

~~~
amyjess
I remember in college some friends of mine decided to macro C into really bad
fake German. Think 'inten mainen' and 'printenoutenf'.

