
Things Every Hacker Once Knew - ingve
http://www.catb.org/esr/faqs/things-every-hacker-once-knew/
======
soneil
I always thought it was a shame the ascii table is rarely shown in columns (or
rows) of 32, as it makes a lot of this quite obvious. eg,
[http://pastebin.com/cdaga5i1](http://pastebin.com/cdaga5i1)

It becomes immediately obvious why, eg, ^[ becomes escape. Or that the
alphabet is just 40h + the ordinal position of the letter (or 60h for lower-
case). Or that we shift between upper & lower-case with a single bit.

esr's rendering of the table - forcing it to fit hexadecimal as eight groups
of 4 bits, rather than four groups of 5 bits, makes the relationship between
^I and tab, or ^[ and escape, nearly invisible.

It's like making the periodic table 16 elements wide because we're partial to
hex, and then wondering why no-one can spot the relationships anymore.

~~~
bogomipz
>"It becomes immediately obvious why, eg, ^[ becomes escape. Or that the
alphabet is just 40h + the ordinal position of the letter (or 60h for lower-
case). Or that we shift between upper & lower-case with a single bit."

I am not following, can you explain why ^[ becomes escape. Or that the
alphabet is just 40h + the ordinal position? Can you elaborate? I feel like I
am missing the elegance you are pointing out.

~~~
soneil
If you look at each byte as being 2 bits of 'group' and 5 bits of 'character';

    
    
        00 11011 is Escape
        10 11011 is [
    

So when we do ctrl+[ for escape (eg, in old ansi 'escape sequences', or in
more recent discussions about the vim escape key on the 'touchbar' macbooks) -
you're asking for the character 11011 ([) out of the control (00) set.

Any time you see \r represented as ^M, it's the same thing - 01101 (M) in the
control (00) set is Carriage Return.

Likewise, when you realise that the relationship between upper-case and lower-
case is just the same character from sets 10 & 11, it becomes obvious that you
can, eg, translate upper case to lower case by just doing a bitwise or against
32 (0100000).

And 40h & 60h .. having a nice round number for the offset mostly just means
you can 'read' ascii from binary by only paying attention to the last 5 bits.
A is 1 (00001), Z is 26 (11010), leaving us something we can more comfortably
manipulate in our heads.

I won't claim any of this is useful. But in the context of understanding why
the ascii table looks the way it does, I do find four sets of 32 makes it much
simpler in my head. I find it much easier to remember that A=65 (41h) and a=97
(61h) when I'm simply visualizing that A is the 1st character of the
uppercase(40h) or lowercase(60h) set.
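
A minimal sketch of the bit tricks described above, in Python. The helper names are made up for illustration, and the case helper only makes sense for the letters A-Z; the single bit that separates the upper and lower-case sets has the value 32 (binary 0100000).

```python
# Illustrative helpers; the names are invented for this sketch.

def ctrl(ch: str) -> str:
    """Return the control character for Ctrl+ch by keeping only the low 5 bits."""
    return chr(ord(ch) & 0b0011111)

def to_lower(ch: str) -> str:
    """Set the one bit (value 32) that separates the upper and lower-case sets."""
    return chr(ord(ch) | 0b0100000)

def ordinal(ch: str) -> int:
    """Position of a letter in the alphabet: A/a -> 1, Z/z -> 26."""
    return ord(ch) & 0b0011111

print(repr(ctrl("[")))   # '\x1b' (Escape)
print(repr(ctrl("I")))   # '\t'   (Tab)
print(repr(ctrl("M")))   # '\r'   (Carriage Return)
print(to_lower("A"))     # a
print(ordinal("Z"))      # 26
```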

~~~
amackera
This single comment has cleared up so much magic voodoo. I feel like
everything fell into place a little more cleanly, and that the world makes a
little bit more sense.

Thank you!

------
TorKlingberg
A lot of hardware still uses serial, and not just industrial stuff. Everything
from sewing machines to remote controlled cameras.

If you work on embedded devices you will still encounter serial/RS-232 all the
time. Often through USB-to-serial chips, which only adds to the challenge
because they are mostly unreliable crap. Then there are about 30 parameters to
configure on a TTY. About half do absolutely nothing, a quarter completely
breaks the signal giving you silence or line noise, the final quarter only
subtly breaks the signal, occasionally corrupting your data.

Still, there is nothing like injecting a bootloader directly to RAM over JTAG,
running that to get serial and upload a better bootloader, writing it to flash
and finally getting ethernet and TCP/IP up.

~~~
gcp
Arduino got so popular because it addressed those exact problems: reliable
USB to serial built in, foolproof bootloader.

 _Still, there is nothing like injecting a bootloader directly to RAM over
JTAG, running that to get serial and upload a better bootloader, writing it to
flash and finally getting ethernet and TCP/IP up._

I'm happy to have gotten mostly rid of this. Gone are the days of choosing
motherboards based on the LPT support, and praying that the JTAG drivers would
work on an OS upgrade.

~~~
kosma
> I'm happy to have gotten mostly rid of this. Gone are the days of choosing
> motherboards based on the LPT support, and praying that the JTAG drivers
> would work on an OS upgrade.

It's still there, and not going anywhere - the only thing that's past is LPT.
Last time I used a Linux-grade Atmel SoC, it had a USB-CDC interface but the
chain was still the same: boot from mask ROM, get minimal USB bootloader, load
a bootstrap binary to SRAM, use that to initialize external DRAM, then load a
flashing applet to DRAM, run it, use that to burn u-boot to flash, and then
fire up u-boot's Ethernet & TFTP client to start a kernel from an external
server and mount rootfs over NFS. Considering the amount of magic, it worked
amazingly well. The whole shebang was packaged into a zip file with a single
BAT to double-click and let it do the magic.

As for COM and LPT - FTDI and J-Link changed the embedded landscape forever,
and thanks for that.

------
erikb
Funny, had to learn all this stuff for my Master's thesis as it was a crucial
part of my project to provide reliable shell command exchange via serial
connection. It was really really hard to find anybody who knows anything about
this lower network level and terminals.

What I can add for everybody who feels the same disappointment as ESR: It's
very common for a growing community that three things happen.

A) The number of people with just a little knowledge over the holy grail of
your community increases.

B) The popular communication is taken over by great communicators who care
more about their publicity than your holy grail.

C) This gives the impression that the number of really cool people decreases.
And that is depressing to old timers. But it's in fact often not true.
Actually most often the number of cool people increases too! It's just that
their voices are drowned in all the spam of what I like to call the "Party
People" (see B).

So yes, you can actually cheer. It's harder to find the other dudes, but there
are more of them! Trust me, I'm not the oldest guy here but I've seen some
communities grow and die till now, and it's nearly always like that.

~~~
digi_owl
And these days

D) the B)s use various "social" tactics to tar and feather A)s that get in
their way...

~~~
forgottenpass
See also: [https://meaningness.com/metablog/geeks-mops-
sociopaths](https://meaningness.com/metablog/geeks-mops-sociopaths)

~~~
rasjani
Having seen and been part of the start, rise and fall of a certain music
scene, I found this article a great piece for reflecting on how and what
really did happen :)

------
Waterluvian
When you find swaths of knowledge that younger people don't know, you've found
success in the overall human goal of abstracting concepts and building on the
shoulders of those who came before us.

I'm not suggesting the article is a, "Gosh, Millennials!" conversation. I just
get a warm tingle when reminded that I have absolutely no clue how to do
something people did just a generation ago, and I don't need to. It's success!

~~~
chias
This exactly! I like to make the car analogy:

I have no idea how my car works. I mean, I more or less understand the
principles underlying the internal combustion engine, but I wouldn't be able
to service one, much less assemble one. But _I don't need to_. Typically, the
only indication made available to me that something is wrong is a single bit
of information ("check engine light"), but _that is enough_. You don't have to
be a "car person" to make effective use of a car. I get in, I go, and well
over 99% of the time that's the end of the story.

Compare this with computers. When something goes wrong, it's usually vital
that you (or your users) relay the precise error message (and God help you if
there isn't one). You generally have to be a "computer person" to some degree
to make effective use of a computer. If you are unconvinced by this
comparison, contrast how often your family asks you to perform [computer task]
versus how often you would approach a mechanic family member to perform [car
task]; Contrast how often you hear "I can't do this, I'm not a car person"
versus "I can't do this, I'm not a computer person".

I consider swaths of modern hackers who simply don't know about much of ASCII
as evidence of babysteps towards computers maturing as a technology.

~~~
joshuata
I understand your argument, but I would disagree with a few points:

First, we use computers for so many more things than cars. The average user
does really well with basic tasks like checking their email and simple word
processing. This would be daily driving in your car analogy. Occasionally
things blow up, but that isn't too different from a major problem with a car.
However users are constantly trying new things with computers, new programs,
websites, and tasks. Car-owners who are constantly trying new things with
their cars have as many problems, if not more, than the average computer user.
The difference is that the people who use the full range of their car's
capabilities are deeply interested in their vehicles.

Second, abstractions like the check engine light are far from perfect. How do
you know whether the light signals imminent failure or a minor inconvenience?
What additional information is needed for the mechanic to diagnose the
problem? I recently chased down a problem in my car that caused the check
engine light to come on with a code that was physically impossible. It took a
few weeks of careful experimentation and instrumentation before I was able to
figure out what it thought was going on. This was a case where I absolutely
needed more than a cursory knowledge of how my car works.

I also think that a hacker should be similar to an amateur mechanic: although
their car might be fuel-injected, they have a cursory knowledge of how a
carburetor works. They may have an automatic transmission, but they understand
what a clutch is. Compare that to many developers who have never set foot
outside their niche; They have never used a radically different programming
language or a different OS. They've never taken the time to dig into the
layers beneath the one they use. I would argue that is a weakness. How will
you ever debug a problem when it inevitably occurs in the layers beneath you?

------
coderjames
Many of the control codes are still in active use today in the air-ground
communications protocol spoken between airplanes and Air Traffic Control.

The ACARS[0] protocol I work with every day starts each transmission with an
SOH, then some header data, then an STX to start the payload, then ends with
either an ETX or an ETB depending on whether the original payload had to be
fragmented into multiple transmissions or fits entirely into one.

These codes aren't archaic and obsolete in the embedded avionics world.

[0] ACARS: "Aircraft Communications Addressing and Reporting System" \- see
ARINC specification 618[1]

[1][http://standards.globalspec.com/std/10037836/arinc-618](http://standards.globalspec.com/std/10037836/arinc-618)
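
To make the SOH/STX/ETX/ETB framing described above concrete, here is a rough Python sketch. It is not the ARINC 618 format: the header is an opaque byte string, the block size is arbitrary, and `frame_blocks` is a hypothetical name. It only illustrates how ETB marks "more blocks follow" while ETX marks the final block.

```python
SOH, STX, ETX, ETB = b"\x01", b"\x02", b"\x03", b"\x17"

def frame_blocks(header: bytes, payload: bytes, max_len: int) -> list[bytes]:
    """Split payload into blocks of at most max_len bytes and frame each one.

    Hypothetical layout: SOH + header + STX + chunk + (ETB | ETX).
    ETB ends a block that has more to follow; ETX ends the final block.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    # An empty payload still produces one (empty) block.
    chunks = [payload[i:i + max_len] for i in range(0, len(payload), max_len)] or [b""]
    frames = []
    for index, chunk in enumerate(chunks):
        terminator = ETX if index == len(chunks) - 1 else ETB
        frames.append(SOH + header + STX + chunk + terminator)
    return frames

for frame in frame_blocks(b"HDR", b"ABCDEFG", 3):
    print(frame)
```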

~~~
Animats
Here's the origin of that - The Teletype Model 28 "stunt box".[1] This was a
mechanical state machine, programmable by adding and removing metal levers
with tines that could be broken off to encode bit patterns. These were used in
early systems where a central computer polled mechanical Teletype machines in
the field, and started and stopped their paper tape readers and other devices.
Remote stations would punch a query on paper tape and put it in the reader,
then wait until the central computer would poll them and read their tape. This
was addressable, so many machines could be on the same circuit. Used in 1950s
to 1970s, when mainframes were available but small computers were not.

[1]
[https://www.smecc.org/teleprinters/28stuntbox001.pdf](https://www.smecc.org/teleprinters/28stuntbox001.pdf)

~~~
i336_
Thanks so much for sharing this. This is exactly the kind of TTY history I've
always been looking for.

------
OliverJones
Those DB9 and DB25 connectors are still kicking around the bottom of my
toolbox.

Why is DEL's bit value 0x7f (or 0177)? Because there was a gadget out there
for editing paper tape. Yes. You could delete a character by punching out the
rest of the holes in the tape frame. I used it once. It was ridiculous.

~~~
jobead
LOL, I swear it feels like the answer to every "why is this weird computer
thing this way?" question I see is "because we used to do it this way on punch
cards."

~~~
kosma
Wait until you find out that "coder" originally meant "someone who encodes
messages into Morse". And if you start digging into the word "code", you'll
find it comes from Latin "codex" which is a mutation of "caudex" meaning,
literally, "tree trunk".

Because back then, people would write on wooden tablets covered with wax.

So.. next time you see a kludge and think of "historical reasons", consider
that "historical" goes back much farther than 20th century. :)

~~~
yrro
Huh. I remembered from school that "caudex" was a reasonable translation for
the insult "blockhead"... now I know why!

------
TeMPOraL
I'm sad that "[FGRU]S ({Field|Group|Record|Unit} Separator)" didn't get much
use, and instead we have to rely on tabs or commas (TSV / CSV), and suffer
from the problem of quoting / escaping.

BTW, I use Form Feed (CTRL+L) character in my code to divide sections, and
have configured Emacs to display them as a buffer-wide horizontal line.

~~~
leephillips
It seems that so many programming headaches have the same root cause: the set
of characters that compose "text" is the same set that we use to talk _about_
text. Hence the nightmares with levels of quoting and escaping. The use of
out-of-band characters like NULLs to separate pieces of text does help, but I
don't think there is a complete solution. Because, eventually, we want to
explain how to use these special characters, which means we must talk _about_
them, by including them in text....

~~~
sbuttgereit
> Hence the nightmares with levels of quoting and escaping.

PostgreSQL has an interesting approach to this problem that I've found really
straightforward and allows me to express text as text without getting into
strange characters. What they've done is allowed using a character sequence
for quoting rather than relying on a single character. They start with a
character sequence that is unlikely to appear in actual text: $$, it's called
dollar quoting. Beyond just $$, you can insert a word between the $$ to allow
for nesting. Better explained in the docs:

[https://www.postgresql.org/docs/current/static/sql-syntax-
le...](https://www.postgresql.org/docs/current/static/sql-syntax-
lexical.html#SQL-SYNTAX-DOLLAR-QUOTING)

The key here is that I am able to express string literals in PostgreSQL
code (SQL & PL/pgSQL) using all of the normal text characters without escaping
and the $$ quoting hasn't come with any additional cognitive load like complex
escaping can (and before dollar quoting, PostgreSQL had nightmarish escaping
issues). I wish other languages had this basic approach.

~~~
DougWebb
Perl's had something like that for a long time: quote operators. You can quote
a string using " or ' (which mean different things), and you can quote a regex
using /. But for each of these you can change the quote character by using a
quote operator: qq for the double-quote behavior, q for the single-quote
behavior, and qr for the regex behavior. (There are a few others too, but I
used these most often.)

    
    
        my $str1 = qq!This is "my" string.!;
        my $str2 = qq(Auto-use of matching pairs);
        $str2 =~ qr{/url/match/made/easy};
    

The work I did with Perl included a LOT of url manipulation, so that qr{}
syntax was really helpful in avoiding ugly /\/url\/match\/made\/hard/ style
escaping.

~~~
brennen
Perl is still, I think, the gold standard for quoting and string manipulation
syntax. I am to this day routinely perplexed by the verbosity and ugliness of
simple operations on strings in other languages.

(Of course, this may also be one of the reasons that programmers in its broad
language family have a pronounced tendency to shoehorn too many problems into
complex string manipulation, but I suppose no capability comes without its
psychological costs.)

~~~
i336_
Yup, the 8085 CPU emulator in VT102.pl[1] uses a JIT which is essentially a
string-replacement engine.

[1]: [http://cvs.schmorp.de/vt102/vt102](http://cvs.schmorp.de/vt102/vt102)
(note - contains VT100 ROM as binary data, but opens in browser as text)

------
cyberferret
Things that I still find hard to forget these days:

* ASCII codes for those single and double box characters, so I could draw a fancy GUI on those old IBM text monitors

* Escape codes for HP Laserjets and Epson printers for bold, compressed character sizes etc.

* Batch file commands

* Essential commands for CONFIG.SYS

* Hayes modem control codes

* Wordstar dot format commands

* WordPerfect and DisplayWriter function keys

* dBaseII commands for creating, updating and manipulating records

I wish they would all move out of my head and leave room for me to learn some
new stuff quicker!

~~~
tonyedgecombe
_Escape codes for HP Laserjets_

<esc>&l1O to switch to landscape :)

I'm just about to write some code to parse HP PJL so you never know when ...

~~~
mark-r
PJL is easier to parse than it looks at first glance. I worked on a laser
printer controller back in the late '80s that for convenience we made mostly
HP compatible. Have fun!

------
davidwihl
Fun fact about octal: every commercial and most non-commercial aircraft have a
transponder to identify with Air Traffic Control. The squawk code is in octal.

[https://en.m.wikipedia.org/wiki/Transponder_(aeronautics)](https://en.m.wikipedia.org/wiki/Transponder_\(aeronautics\))

~~~
dingaling
And the civilian SSR system was built on the released frequencies and
protocols of Mode 3 of the military Mark X IFF system, which was adopted by
civilian agencies in the mid-1950s after a particularly nasty airborne
collision.

Sixty years later and we're still bounded by legacy. As a result of shortage
the general-use squawk codes are namespaced into each national ATC region, so
aircraft have to constantly change squawks even in supposedly contiguous
regions such as Eurocontrol. Squawk 4463 means a different thing in UK
airspace than in French.

Ironically, military aircraft still support Mode 3 in order to integrate with
civilian ATC, who call it Mode-A, but all their special don't-shoot-
me-I'm-friendly stuff is handled by more modern encrypted protocols.

------
krylon
The fact that Windows uses CR-LF as a line separator baffles me to this day
(and I am not old enough to have ever used or even seen a teletype terminal!)
- for a system that was developed on actual teletype terminals, it would have
made perfect sense: To start a new line, you move the carriage back to the
beginning of the line and advance the paper by one line.

But DOS was developed on/for systems with CRT displays.

It doesn't really bother me, but every now and then this strikes me as
peculiar.

~~~
torrent-of-ions
When you think about how a typewriter works it's actually correct. The
"newline" handle does both a carriage return and a line feed. But you could
conceivably do a carriage return without a line feed (and type over what you
already have), or a line feed without a carriage return (which might have some
actual use).

~~~
Turing_Machine
One use for CR without LF was to overstrike passwords on printing terminals.
You'd enter your password (which would appear on the paper) then the system
would go through several rounds of issuing a CR followed by overstriking the
password with random characters.

~~~
radarsat1
I still use CR without LF all the time. If I'm writing something that features
a long loop, I might want status updates, so printf("%d \r", percentage);
helps tremendously. You'll need to fflush(stdout) too, since usually flush is
triggered by "\n".

On the other hand, there are far fewer use cases for LF without CR, certainly
nothing that isn't better done using ANSI codes.
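
The same trick, sketched in Python rather than C. `show_progress` is just an illustrative name; the point is that only CR is written, so the line is redrawn in place, and the stream has to be flushed by hand because no "\n" is sent.

```python
import sys
import time

def show_progress(percentage: int, stream=sys.stdout) -> None:
    """Redraw a one-line status in place: CR returns to column 0, no LF is sent."""
    stream.write(f"\r{percentage:3d}%")
    stream.flush()  # no "\n" is written, so flush explicitly

if __name__ == "__main__":
    for pct in range(0, 101, 10):
        show_progress(pct)
        time.sleep(0.05)
    sys.stdout.write("\n")  # finish with a real newline
```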

~~~
torrent-of-ions
This is the use of CR without LF on a virtual TTY. On a real TTY or typewriter
there is less use, but as one user pointed out one could remove information
like a password which was printed.

LF without CR is something that one would do on a typewriter for typing
tabulated data or mathematical formulae. It's just a way to go "down" but stay
at the horizontal position you were previously at.

------
Freak_NL
> It is now possible that the user has never seen a typewriter, so this needs
> explanation […]

Aw man… I'm only 36, but now I feel old for growing up in a time where a
typewriter was still common enough to run into (even if they were rapidly
being displaced by personal computers).

They still exist in the wild though as a hipster accessory — they probably do
well on Instagram too I suppose.

~~~
SixSigma
I work at a global logistics company. Manual typewriters are still somehow
part of our workflow for filling in forms, not even carbon backed.

It boggles my mind why that hasn't been replaced by a PDF form. Perhaps IT
being siloed in another building keeps such legacy going.

~~~
digi_owl
With many of these things i suspect it comes down to laws and regulations more
than anything else.

Meaning that your typewritten document will be accepted as evidence during a
lawsuit or similar, while a PDF of same may not.

------
alblue
I gave a talk on the origins of Unicode a while ago (now published on InfoQ at
[https://www.infoq.com/presentations/unicode-
history](https://www.infoq.com/presentations/unicode-history) if you're
interested) where I talked about ASCII, and where that came from in the past
(including baudot code and teletype).

The slide pertaining to ASCII is here:

[https://speakerdeck.com/alblue/a-brief-history-of-
unicode?sl...](https://speakerdeck.com/alblue/a-brief-history-of-
unicode?slide=6)

------
falsedan
Ah the good old days, when hackers were hackers and quiches were quiches.

Oh wait, this article is 'man ascii' & 'man kermit'.

~~~
falsedan
Although interestingly enough from 'man ascii', it's clear why ^C is ETX:

    
    
      >         Oct   Dec   Hex   Char                        Oct   Dec   Hex   Char
      >         ────────────────────────────────────────────────────────────────────────
      >         000   0     00    NUL '\0'                    100   64    40    @
      >         001   1     01    SOH (start of heading)      101   65    41    A
      >         002   2     02    STX (start of text)         102   66    42    B
      >         003   3     03    ETX (end of text)           103   67    43    C
      >         004   4     04    EOT (end of transmission)   104   68    44    D
      >         005   5     05    ENQ (enquiry)               105   69    45    E
    

Holding Ctrl sets bit 6 to '0', bit 7 to '0', and bit 8 to '0'. 'C' and 'c'
differ by bit 6 only ('1' for 'c').

~~~
jccc
I remember CLU said, "Acknowledge" but I can't remember if the MCP said, "End
of transmission" or "End transmission."

From now on every time I Ctrl-d I want to think of the voice of the Master
Control Program.

~~~
nostoc
I'm pretty sure the MCP says "end of line", not "End of transmission".

It's been a while though, maybe he says both.

~~~
jccc
Ah, right. I thought there was at least one place where he said, "End
transmission," but I'm probably wrong.

~~~
prodigal_erik
Then there's "End Of User" which makes their terminal explode.

[http://www.catb.org/jargon/html/E/EOU.html](http://www.catb.org/jargon/html/E/EOU.html)

------
pjmorris
Does anyone remember EBCDIC? IBM defined EBCDIC for the same purposes as
ASCII, but ASCII took off with newer generations of machines. The last time I
wrote an ASCII-EBCDIC conversion routine was the late 90's, part of generating
a file for upload to a vendor's mainframe.

~~~
tyingq
Some popular projects still have to support it:
[https://github.com/apache/httpd/blob/2.4.x/include/util_ebcd...](https://github.com/apache/httpd/blob/2.4.x/include/util_ebcdic.h)

------
cpr
ESR forgot to give the reason for XON/XOFF: physical terminals often couldn't
keep up with an output stream even at a low 9600 baud, so they'd have to back-
pressure the sender (usually a dial-up or direct-connect host) to let them
know when to stop and when to start.

Plus, people used them manually (control-S/control-Q) on systems to stop
output scrolling by, and restart it when they've read what's on the screen,
before built-in pagination filters (e.g., more(1) or less(1)) became common.
(Especially back in the DECsystem-10/-20 days.)
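
A toy model of that back-pressure, sketched in Python. XON is DC1 (0x11, control-Q) and XOFF is DC3 (0x13, control-S); everything else here (the class name, the buffer and the high/low water marks) is an assumption made up for illustration, not how any particular terminal worked.

```python
from collections import deque

XON = 0x11   # DC1, Ctrl-Q: "resume sending"
XOFF = 0x13  # DC3, Ctrl-S: "stop sending"

class SlowTerminal:
    """Toy receiver that asks the sender to pause when its buffer fills up.

    The high/low water marks are arbitrary values chosen for illustration.
    """

    def __init__(self, high_water: int = 8, low_water: int = 2):
        self.buffer = deque()
        self.high_water = high_water
        self.low_water = low_water
        self.paused = False

    def receive(self, byte: int):
        """Accept one byte; return XOFF if the sender should stop, else None."""
        self.buffer.append(byte)
        if not self.paused and len(self.buffer) >= self.high_water:
            self.paused = True
            return XOFF
        return None

    def print_one(self):
        """Consume one buffered byte; return XON once there is room again."""
        if self.buffer:
            self.buffer.popleft()
        if self.paused and len(self.buffer) <= self.low_water:
            self.paused = False
            return XON
        return None
```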

------
charles-salvia
Not to mention how the truly ubiquitous HTTP protocol uses CRLF in the
headers. The mechanical origins of the CR/LF combo were so strongly ingrained
in developer culture that line oriented protocols like HTTP inexplicably
continued to use it - either that or Tim Berners Lee just copied it from
earlier line oriented protocols like SMTP because it just seemed like the
"right way to do things".

It also doesn't help that most modern web servers also include logic to handle
a single LF character to terminate lines anyway.

~~~
digi_owl
Well it meant you could interact with a web server via telnet in a pinch.
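
For anyone who has not typed a request by hand, this is roughly what goes over the wire, sketched in Python. The function names are invented for this example, and the lenient splitter is only a guess at the kind of "accept a bare LF too" logic mentioned above, not any real server's code.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request; every line ends with CR LF."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
    ]
    # A blank line (a bare CR LF) marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

def split_header_lines(raw: bytes) -> list[bytes]:
    """Split a header block leniently: accept CR LF, and also a bare LF."""
    head = raw.replace(b"\r\n", b"\n").split(b"\n\n", 1)[0]
    return head.split(b"\n")

print(build_request("example.com"))
```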

------
diebir
RS-232 is invaluable. Give me a new piece of hardware and as long as there are
2 wires for the serial port I can port Linux kernel to it.

I have ported Linux to a custom ARM board many years ago. Started with a boot
loader written in assembly and writing a single char into serial port for a
debug console. It takes a single line of assembly or C. Infinitely easier than
USB. From there on, I was able to unwind the whole system, develop USB
drivers, TCP tunnels, etc.

------
inlineint
The article recalls teletype terminals. Here's a video showing how one
worked up close:
[https://www.youtube.com/watch?v=MikoF6KZjm0](https://www.youtube.com/watch?v=MikoF6KZjm0)
. It hypnotizes IMO :)

I'd like to see what editing and running BCPL/C programs using ed looks like
on such a terminal.

~~~
i336_
I went from amazed ("it's like a mechanical watch...") to bored/irritated
("oh, so THAT's what 110 baud was like... ouch") to amused ("...but I could
put up with this, I guess.") as I watched this. Thanks so much.

My conclusion is that I could probably put up with using one of these, but
that I'd feel quite cramped if it was my main workstation.

The mechanics of `ed' suddenly make a lot more sense now.

The dialing bit at the end was awesome. Reminds me of relay-based lift motor
rooms! (There are videos of those on YouTube.)

------
jecel
_The following table describes ASCII-1965, the version in use today. It used
to be common knowledge that the original 1963 ASCII had been slightly different
(lacking tilde and vertical bar) but that version went completely extinct
after ASCII-1965 was promulgated._

Not quite true - early adopters like DEC kept using the 1963 version for a
very long time, which prompted others to follow them. When the Smalltalk group
decided to replace their own characters for ASCII in Smalltalk-80 to be
compatible with the rest of the world, it was the 1963 version that they used.

Due to this, since I use the Celeste program in Squeak Smalltalk to read my
email, I see a left arrow whenever someone wrote an underscore. The other
difference is that I have an up arrow instead of ^. But it did adopt the
vertical bar and tilde from 1967 ASCII, so it was a mix.

~~~
i336_
For anyone similarly curious what Celeste is, there are some screenshots at
[http://wiki.squeak.org/squeak/1467](http://wiki.squeak.org/squeak/1467), and
it's included in [http://files.squeak.org/3.7/unix-
linux/](http://files.squeak.org/3.7/unix-linux/). No, it doesn't work in
Squeak 5.1 :)

To the OP, I'm very curious why you're using Celeste. I take it you've been
using it for years?

~~~
jecel
When switching from a KDE based Linux to a Gnome based one many years ago I
had to find a replacement for KMail. Since I design Smalltalk computers, it
seemed silly not to use Celeste even if I had to put up with some limitations
and add a bunch of bug fixes (all dealing with tolerating broken emails).

It is not something I would give to a "normal" person to use, but it is more
than good enough to me. The main problem is that it does a reasonable job of
showing incoming emails (though attachments show up as links at the end of the
text) but when editing an email to send it shows the raw headers and MIME for
any attachment.

~~~
i336_
Welp. I didn't see your comment until now. I wonder if you'll see/get this
reply.

What do you mean by "design Smalltalk computers"? That sounds really
interesting. Do you mean you configure them to autoboot Linux into Squeak or
similar? What are they used for? (...I'm guessing education-type
environments...?)

I can understand what you mean. I tried to get it working, as I said, and
while I got the main window open I had no idea how to configure it (and I must
admit I don't have much incentive to.)

------
aap_
And yet he doesn't know the PDP-11 is an octal machine even though it's 16
bits.

~~~
baking
This deserves a little more explanation. The PDP-11 had 8 registers and 8
addressing modes so in the days before debuggers you could dump the binary and
pretty much debug straight from the octal without so much as referring to your
handy-dandy pocket reference card.
[http://www.montagar.com/~patj/dec/pocket/pdp11_programmingca...](http://www.montagar.com/~patj/dec/pocket/pdp11_programmingcard_1975.pdf)

EDIT: Also, the most significant bit was used for 3/4 of the instruction space
as a flag for a byte vs. word operation (or add vs subtract) so having it
alone in its own octal digit (0/1) made perfect sense. For example 01ssdd was
MOV (word) and 11ssdd was MOVB. 06ssdd was ADD and 16ssdd was SUB.
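
A small Python sketch of why octal reads so naturally here: each octal digit of a double-operand instruction lines up with one field. The function and field names are made up for illustration, and only the double-operand layout from the examples above is covered.

```python
def decode_double_operand(word: int) -> dict:
    """Split a 16-bit PDP-11 double-operand instruction into its octal fields.

    The top bit is called byte_flag here; for opcode 6 it selects ADD vs SUB.
    """
    return {
        "byte_flag": (word >> 15) & 0o1,   # leading octal digit: 0 or 1
        "opcode":    (word >> 12) & 0o7,   # second octal digit
        "src_mode":  (word >> 9) & 0o7,
        "src_reg":   (word >> 6) & 0o7,
        "dst_mode":  (word >> 3) & 0o7,
        "dst_reg":   word & 0o7,
    }

# 010203 octal: MOV R2, R3 (register mode 0 for both operands)
print(decode_double_operand(0o010203))
# 160001 octal: opcode 6 with the top bit set, i.e. SUB R0, R1
print(decode_double_operand(0o160001))
```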

------
kps
> ENQ (Enquiry), ACK (Acknowledge) In the days of hardware serial terminals,
> there was a convention that if a computer sent ENQ to a terminal, it should
> reply with terminal type identification followed by ACK. While this was not
> universal, it at least gave computers a fighting chance of autoconfiguring
> what capabilities it could assume the character to have.

That's not quite right (the ACK part isn't right at all). See
[https://en.wikipedia.org/wiki/Enquiry_character](https://en.wikipedia.org/wiki/Enquiry_character).

------
nerdponx
Why did the end-of-line indicator settle on LF as opposed to CRLF? Naively,
the latter makes more sense to me by virtue of it being more explicit. Do
Unix-alikes always inject a CR after a LF?

~~~
jcrawfordor
This is mostly coincidental history. A lot of teleprinters used to require
that CR+LF was sent for largely mechanical reasons, and Windows was made
similar to MS-DOS which was made similar to CP/M which had been built for
these kinds of devices.

Unix, on the other hand, was made similar to Multics which had the clever idea
of on-the-fly replacing a line feed with whatever the printer required. So
text files needed only the LF, and a CR was added automatically if the printer
required it. This had its upsides and downsides, but the major upside was that
you could print a text file on two different systems and reliably have it come
out the same way!

This is one of the reasons that in e.g. C, the '\n' character is somewhat
awkwardly defined as "a one-byte number that will move the cursor to the start
of the next line", when on some operating systems (Windows) this will end up
actually being the two-byte CR+LF sequence (well, when output is in text
mode...). Even around the time of C's development this was already an issue
and newline translation magic was already required, the ancestors of Linux and
the ancestors of Windows just happened to put that magic in a different place.
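
The "magic in a different place" can be seen with Python's own text layer, which accepts a `newline` argument controlling what "\n" becomes on output. This is only an illustration of the idea; `write_text` is a made-up helper, and the in-memory buffer stands in for a file or device.

```python
import io

def write_text(text: str, newline: str) -> bytes:
    """Write text through a text-mode layer and return the bytes that come out."""
    raw = io.BytesIO()
    writer = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    writer.write(text)
    writer.flush()
    return raw.getvalue()

print(write_text("hello\n", "\n"))    # b'hello\n'   (stored as-is)
print(write_text("hello\n", "\r\n"))  # b'hello\r\n' (translated on the way out)
```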

------
alphonsegaston
I always thought that instead of these lamentations about lost knowledge,
people should just put together resource guides to maintain these skills.
Hackers used to know these things? If they're still useful to know, how can I
learn about them today? Otherwise, it just sounds like the worst combination
of geek posturing and "kids these days."

~~~
dualogy
> If they're still useful to know, how can I learn about them today?

By reading the article instead of lamenting lamentations ;)

~~~
alphonsegaston
Yeah, no, I read it :)

Some of that stuff piques my interest, but then it can be difficult to get
started understanding it relative to other stuff that's more widely covered on
the web. I guess I am just a little pampered when it's comparatively much
easier to learn about a web framework. But that's probably part of the old
school hacker ethos as well.

------
billpg
Wasn't "{Field|Group|Record|Unit} Separator" meant to allow an alternative to
using commas in CSV data?

~~~
masklinn
It has two more levels than CSV (which only has units and records) so no. And
FS is _file_ separator not field.

These were used for serial data sources, not just network but punch cards or
drums or magnetic tapes. GS, RS and US were intended for databases on serial
data sources, "group" is a modern-day table. Lammert Bies has more:
[https://www.lammertbies.nl/comm/info/ascii-characters.html](https://www.lammertbies.nl/comm/info/ascii-characters.html)

You can repurpose the final two (record and unit) for CSV, but that's not
their original role, and you'll have to make sure they're never opened via
user-controlled anything as these control codes are non-printable.
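
A minimal sketch of that repurposing, assuming US (0x1F) goes between fields and RS (0x1E) between rows. The `dump` and `load` names are invented here, and the format is not any standard. Since there is no quoting or escaping, `dump` simply refuses data that already contains a separator.

```python
US = "\x1f"  # unit separator: goes between fields
RS = "\x1e"  # record separator: goes between rows


def dump(rows):
    """Join rows of string fields into a single string, with no quoting or escaping."""
    for row in rows:
        for field in row:
            if US in field or RS in field:
                raise ValueError("field contains a separator character")
    return RS.join(US.join(row) for row in rows)


def load(blob):
    """Inverse of dump(): split the string back into rows of fields."""
    if not blob:
        return []
    return [record.split(US) for record in blob.split(RS)]


if __name__ == "__main__":
    rows = [["name", "note"], ["Ann", "likes, commas"], ["Bob", 'says "hi"\nthen leaves']]
    assert load(dump(rows)) == rows
```

Commas, quotes and embedded newlines all survive untouched, which is the appeal. The catch is the one mentioned above: most editors won't display or let you type the separators.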

------
rocky1138
Seeing this reminds me of how things were and to be thankful for how far
things have come.

------
quietriot
Read this. Or at least take a look for some historical insight. It is both dry
and interesting.

An annotated history of some character codes or ASCII: American Standard Code
for Information Infiltration

[http://worldpowersystems.com/J/codes/](http://worldpowersystems.com/J/codes/)

It describes how we got here in excruciating detail, starting with Morse.
Military communication systems had great influence. Before the ASCII, as we
know it today, there was an "ASCII-1963" that was a bit different.

The long and winding path: Morse Baudot Murray ITA2 FIELDATA ASCII-1963
ASCII-1967

------
cesarb
> SO (Shift Out), SI (Shift In): Escapes to and from an alternate character
> set. These are never interpreted in this fashion by Unix or any other
> software I know of [...]

Aren't SO and SI used by the ISO-2022-JP character encoding?

~~~
tyingq
Linux does do something with SI and SO, at least on the actual framebuffer
console. It enables the VT100 alternate character set, for drawing things.

The k,l,m,n,o,p,q,r,s,t,u,v,w keys then become useful for drawing lines.

If you have a Linux desktop, switch to an actual console (Ctrl-Alt-F1), then:

    echo -e '\x0E'

And type a bit of lowercase characters between k and w. To get the terminal
back to normal:

    echo -e '\x0F'
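
The same trick as a small Python sketch, wrapping each line in SO ... SI so the terminal is never left in the alternate set. `line_drawing` and `BOX` are names made up here. The `x` and `j` in the box also come from the VT100 graphics set, just outside the k-w range mentioned above. Some terminal emulators may need the alternate set selected first, which this does not attempt.

```python
import sys

SO = "\x0e"  # Shift Out: switch to the alternate (line-drawing) character set
SI = "\x0f"  # Shift In: switch back to the normal character set


def line_drawing(text):
    """Wrap text so it is rendered in the alternate set and the terminal is restored after."""
    return SO + text + SI


# In the VT100 graphics set: l, k, m, j are corners, q is horizontal, x is vertical.
BOX = ["lqqqk", "x   x", "mqqqj"]

if __name__ == "__main__":
    for row in BOX:
        sys.stdout.write(line_drawing(row) + "\n")
```

On a terminal that ignores SO/SI this just prints the letters, which is a harmless way to find out.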

~~~
kbob
That's what the VT100 did with SI and SO. It's been 30? 35? years since I
touched a VT100, but I think I recall there being a configuration bit to
enable alternate character set on SI/SO.

~~~
i336_
Here you go:
[http://www.pcjs.org/devices/pcx86/machine/5170/ega/2048kb/re...](http://www.pcjs.org/devices/pcx86/machine/5170/ega/2048kb/rev3/vt100/)

Probably not what you used to use, but "ctty com2" (and a couple of Returns
directed at the VT100) will make it "go." I'm not 100% sure what the keymapping
is when you're in the setup screen (virtual Setup key at bottom-left).

------
Derbasti
Isn't BS used in combination with other characters to encode non-character
terminal information, like text color changes? In some programs, `]\b` for
example will change the text color.

The `]` character itself will not be printed, since `\b` will delete it from
the visual line, and this effectively creates a side-channel for communicating
"invisible" information within the regular character stream.

It's a pain to work with, though, since it makes things like `strlen` behave
in very non-intuitive ways. Just imagine a string becoming _longer_ when you
delete the BS character. That's no fun.

~~~
ksherlock
^[ (control-[) is how escape is rendered. VT100/ANSI uses escape sequences for
terminal control sequences (like positioning the cursor or changing the
color).
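
A minimal sketch of both points: a colour change is ESC, `[`, a number, `m`, and those bytes count towards the string length even though nothing is drawn for them. The `colorize` and `visible_len` helpers are invented for illustration and only handle the colour ("SGR") sequences, not cursor movement or anything else.

```python
import re

ESC = "\x1b"  # the escape character, typed as Ctrl-[ and often rendered as ^[

# Matches only colour/attribute sequences: ESC [ digits-and-semicolons m
SGR = re.compile(r"\x1b\[[0-9;]*m")


def colorize(text, code=31):
    """Wrap text in a colour sequence (31 is red) followed by a reset (0)."""
    return f"{ESC}[{code}m{text}{ESC}[0m"


def visible_len(s):
    """Length of s as it appears on screen, ignoring colour sequences."""
    return len(SGR.sub("", s))


red = colorize("error")

if __name__ == "__main__":
    print(red, len(red), visible_len(red))
```

This is the `strlen` surprise from the parent comment: the stored string is longer than what the reader sees.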

------
cafard
Ha! The first technical book I ever bought was _The RS-232 Solution_. It is
not impossible that I bought it at an airport bookstore--this was in the
mid-1980s.

------
akavel
Hm, maybe _that's_ how we can try fixing the "programmers don't know
their ancestors' discoveries" issue? By "older" programmers blogging about
things that are obvious to them, but apparently no longer to other people? In a
somewhat loose "old folk stories" style, but slightly more dense than the pure
"funny folklore tales" like what Jobs said to Woz, or esr quipped to ken?

~~~
pjmlp
Blogging and story telling isn't enough, if the audience doesn't care.

For example, there are lots of old papers from Xerox PARC, Burroughs, AT&T and
many others freely accessible on the Internet.

Some of them we owe to the laborious work of people who bothered
to digitize their original form, produced by plain typewriters.

Yet, I doubt many youngsters bother to read them.

~~~
azeirah
I'm 21 years old, and the way Alan Kay speaks about Xerox PARC work, Douglas
Engelbart, Burroughs etc. intrigues me. Then I try reading, for example,
Engelbart's paper, and it's fucking huge! I just get bored so quickly with the
way they're written, especially because they're so out of context. Many things
were very different back then, so a lot of what I know right now either
didn't exist yet, or will only confuse me due to my assumptions about how
things were done back then.

Then again, I have read a few papers, it's just highly inconvenient. And in
what kind of setting would you take the time to really read these anyway? Work
setting? No, just get your work done. At home? I don't mind reading a bit
during my own time, but I don't want to spend hours upon hours trying to
understand something from a very different context. Academic? Yeah sure why
not. But I'm not in academia!

~~~
i336_
If you have the resources, find out what museums are in your area, or plan
trips to the museums that contain things you'd be interested in.

For example
[https://www.youtube.com/watch?v=MikoF6KZjm0](https://www.youtube.com/watch?v=MikoF6KZjm0)
is in a museum, as is
[https://www.youtube.com/watch?v=NEbMksxQAgs](https://www.youtube.com/watch?v=NEbMksxQAgs).
You can go and check them out and poke them (to some extent).

While watching videos is really awesome, actually being able to watch these
things in action provides a level of context that is impossible to convey
digitally.

------
eldavido
I'm working on a greenfield project at a hotel where we've had to interface
with a Mitel PBX over RS-232.

I was shocked how literally the ASCII codes were followed by the PBX. It sent
an "ENQ" before each command, we had to send an ACK back, and then it sent us
STX/ETX-delimited records.
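
A toy sketch of that handshake: answer every ENQ with an ACK and collect whatever sits between STX and ETX. The `handle` function and the record contents are invented here, and this is not Mitel's actual protocol. A real link would also have to cope with records split across reads, timeouts and so on.

```python
ENQ, ACK, STX, ETX = 0x05, 0x06, 0x02, 0x03


def handle(chunk):
    """Process one chunk of received bytes; return (bytes to send back, complete records)."""
    reply = bytearray()
    records = []
    current = None  # None while outside a record, a bytearray while inside one
    for byte in chunk:
        if current is not None:
            if byte == ETX:
                records.append(bytes(current))
                current = None
            else:
                current.append(byte)
        elif byte == ENQ:
            reply.append(ACK)  # acknowledge the enquiry
        elif byte == STX:
            current = bytearray()  # start collecting a new record
    return bytes(reply), records


if __name__ == "__main__":
    incoming = b"\x05\x02ROOM 101\x03\x05\x02ROOM 102\x03"
    print(handle(incoming))
```

A record still open at the end of the chunk is silently dropped in this sketch.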

I'm 32 and working today, in 2017. I hope I make stuff that lasts this long.

~~~
i336_
It's harder now, because any developer can make a fancy framework, draw some
graphics, make a website, find a domain (on the .io TLD!) no one else has
thought of yet, and look as good as (or better than _) multimillion dollar
companies. Anybody can compete now. And so, for the sake of landing a job, or
networking/personal publicity, or even just for the experience, people do.

The problem is, frameworks and libraries a) are both easy and fun to write, b)
are really hard to comprehensively test with full architectural coverage, and
c) suggest some form of standardized behavior. While (b) creates technical
debt, the main issue is (c) vs (a): we're flooded with "do it this way!" from
a thousand groups, even in situations where the developer(s) didn't really
intend for that to be their predominant statement.

With all this noise and chaos, it almost feels awkward to stick to old, icky,
widely-hated legacy standards in the face of all this innovation. Or at least
that's what it's felt like to me. Objectively thinking about it and comparing
everything, though, nothing's perfect, but what's been around for a while has
the combined benefit of a) having a fairly widespread mindshare, and b) having
known solution patterns for a wide range of issues.

I guess what I'm trying to say is that building software on top of tried-and-
tested methodologies is likely to produce long-lived results (which is fairly
logical).

I'm also reminded of [http://qdb.us/53151](http://qdb.us/53151) :D

_ - I say "or better than" because large corporations often have standardized
internal Web style guidelines and rendering toolkits/templating engines, and
making sweeping changes in those is harder than for websites that are little
more than a landing page and some documentation - so the bigger an enterprise
is, the likelier it is that its website might look mildly dated.

------
EvanAnderson
If you want to geek-out further on ASCII have a look at Bob Bemer's site:
[https://www.bobbemer.com/](https://www.bobbemer.com/) Bob is colloquially
known as the "father of ASCII" (among other things) and his writing is fun to
read and interesting.

------
rwallace
Good article, brought back memories! A small correction:

> 56 kilobits per second just before the technology was effectively wiped out
> by wide-area Internet around the end of the 1990s, which brought in speeds
> of a megabit per second and more (200 times faster)

That should read 20 times faster?

~~~
cnvogel
> and more...

probably refers to the first widely installed (with coaxial cables) 10 MBit
Ethernet networks...
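
The arithmetic behind both readings, as a quick sketch (the `speedup` helper is just for illustration): one megabit is a bit under 20 times a 56k modem, and 10 megabit is a bit under 200 times.

```python
MODEM_BPS = 56_000  # 56 kilobits per second


def speedup(bps):
    """How many times faster a link of the given bit rate is than a 56k modem."""
    return bps / MODEM_BPS


one_megabit = speedup(1_000_000)    # roughly 17.9
ten_megabit = speedup(10_000_000)   # roughly 178.6

if __name__ == "__main__":
    print(f"1 Mbit/s:  {one_megabit:.1f}x")
    print(f"10 Mbit/s: {ten_megabit:.1f}x")
```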

------
markbnj
I wrote code on a teletype connected to an HP3000 in the mid-70s, so it's
always a kick to see that technology mentioned here. Especially when it's
described as "really old" which obviously it, and I, are :).

------
bfrog
RS-232 is alive and well in the microcontroller space, thank you very much.

------
earthly10x
Pretty soon we'll have to start telling them what a command line interface is,
commonly known today as a foreign and abstract concept defined by an acronym,
"CLI".

~~~
wtbob
I don't think so: just as reading, writing & speaking are still our primary
forms of communication (rather than pointing & grunting), so too will CLIs
last.

~~~
Yizahi
It could be deprecated fast, though, if either big manufacturers decide to ship
GUI-only systems, or if GUI systems become far more reliable. Or it could live
forever.

For example, we ship a certain device that can be fully configured from the
CLI, from a web GUI, or from a Windows application. Customers internally have
teams that are 100% polarized: some teams use GUI stuff only and some teams
use the CLI only. Both refuse to switch on principle. And in our case I think
the CLI will die in 5-10 years. It is just a mess: you can do fast configuration
with user-friendly extras in the CLI, but only on a small scale. It just doesn't
allow user-friendly editing of kilometer-long configs, especially on 100s and
1000s of devices. So if we continue to improve GUI configuration, the CLI will
eventually die, I think. The same could happen in general IT systems, provided
there is a better alternative (or a forced alternative).

~~~
i336_
Right, that makes a lot of sense. If you developed the CLI version more and
made it easier to (for example) batch-apply patterns against many datasets, and
added sufficient facility to fix the other configuration gripes that are
currently harder than in the GUI, well, the CLI would win some more.

I do have to agree, GUIs can be easier to use if they're well-designed. It's
also arguably easier to build GUIs than CLIs in some situations, particularly
where you don't need something to be fully Turing-complete.

------
alexeiz
The FIX protocol uses the SOH control character to separate message fields.
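
A minimal sketch of what that looks like on the wire: `tag=value` pairs, each terminated by SOH (0x01). The `parse_fix` helper and the sample message are illustrative only. The sample leaves out the body-length and checksum fields, so it is not a valid FIX message.

```python
SOH = "\x01"  # Start of Heading, used as the field delimiter


def parse_fix(message):
    """Split a tag=value message on SOH into a list of (tag, value) pairs."""
    fields = []
    for part in message.split(SOH):
        if part:  # skip the empty piece after the trailing SOH
            tag, _, value = part.partition("=")
            fields.append((int(tag), value))
    return fields


# Illustrative message: version, message type, sender and target IDs.
sample = SOH.join(["8=FIX.4.2", "35=0", "49=SENDER", "56=TARGET"]) + SOH

if __name__ == "__main__":
    for tag, value in parse_fix(sample):
        print(tag, value)
```

Logs usually show the SOH as `|` or `^A` so the message is readable.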

------
gunnarde
copy con com2 atz atdt ath0

------
HeyLaughingBoy
NO CARRIER

