What is perhaps more interesting is the evolution of thinking about how this is done so easily on modern systems.
I think I've rewritten the basic 'cooked' mode a few times, the latest being a simulation of it on embedded systems[1]. The basic code is easy, and it fails on the interesting edge cases (the most common one being that if you auto-CR/LF at the end of the screen, few readline functions back up to the previous line).
These days, however, you have "lots" of memory, and rendering a screen takes less time than rendering a single character did back in the day, so you can just delete the character out of the buffer and re-render the entire screen. No need to count; your screen can do funky things like setting your prompt in different colors, putting tabs at an arbitrary number of spaces, etc. Basically you can start at the top and render the entire 40-50 lines of text, poof, in less than a few ms. That was unthinkable when, even at 9600 baud, it took a couple of seconds to re-render the entire screen.

[1] https://github.com/ChuckM/stm32f469i/blob/master/demos/util/...
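To make that concrete, here is a minimal C sketch of the "don't count, just repaint" approach, assuming an ANSI/VT100-capable display; the buffer layout and names are illustrative, not taken from the demo linked above:

    #include <stdio.h>
    #include <string.h>

    #define ROWS 50                  /* the whole visible buffer */

    static char screen[ROWS][81];

    /* Repaint everything: home the cursor, rewrite every line. */
    static void redraw_all(void)
    {
        fputs("\x1b[H", stdout);     /* ANSI: cursor to top-left */
        for (int r = 0; r < ROWS; r++) {
            fputs("\x1b[2K", stdout); /* clear this line */
            puts(screen[r]);
        }
        fflush(stdout);
    }

    /* Delete a character: edit the buffer, then repaint the lot.
     * No cursor arithmetic, no counting columns. */
    static void delete_char(int row, int col)
    {
        char *line = screen[row];
        memmove(line + col, line + col + 1, strlen(line + col));
        redraw_all();
    }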
There was a Byte article around 1982 presenting a text editor in Z80 assembler that worked just that way: rerender the view on every keystroke. I was impressed -- "wow, it's that simple?" Of course, this was with memory-mapped textmode display, not a serial terminal, but it felt audacious to me to 'waste' that much computing for simplicity.
The article included the source code of the whole editor. That much assembly code in a magazine was unusual even then.
A lot of early text editors, particularly the tiny single-segment DOS ones, worked on the same principle. They didn't exhibit any flicker (most of the time --- anyone remember "CGA snow"?) because writing the same values to VRAM doesn't affect the screen display.
This one might've used a vertical-blank interrupt to time the refreshes -- it's what I think I remember, but I could easily be mixing it up with some other article around then, like Chris Crawford's series on Atari 800 graphics.
Jeez, after reading the articles today and yesterday, it was only after your comment that it dawned on me that you'd have to write a program that reads keystrokes and, instead of moving a sprite in a game or something, shows text.
It never clicked before this that terminals weren't just somehow "automatic". I now understand what terminal emulators do, thanks!
It took me a long time to realize that a VT100 was essentially a low-end microcomputer that just happened to have a "VT100 emulator" in ROM instead of having BASIC or a bootloader.
We once wrote a text editor as an exercise in FP. The key handler function had to return the area of the screen that needed repainting, along with a proof that this area was sufficient to capture all the changes. I think we were given a library to do the dirty work of actually drawing it to the screen, so I don't know how nasty that was, but our key-handling code was quite simplistic!
> You can just delete the character out of the buffer and re-render the entire screen.
I've used apps which I suspect do this (not necessarily terminal emulators), and I'd strongly recommend against it; it may not be noticeable if your input rate isn't high, but if you lean on a key and let the keyboard auto-repeat, the rest of the screen/window, which hasn't changed, starts to flicker annoyingly.
I could see that being especially troublesome in a system where you didn't have visibility into the repaint algorithm. Old skool UNIX used to start 'saving' damage events when they came in quickly so that it could dispatch them all at once. When your system did that, you could lean on the delete (or backspace) key and it would start deleting characters and then just stop; when you lifted the key, suddenly 15-20 (or however many) characters would go away at once.
The other place it got annoying was when the system did a 'screen flash' when the input buffer was full. If your auto-repeat rate was faster than the buffer drain rate, the screen would start flashing intentionally to tell you it was dropping input keys.
If you have one of the ST Micro 469i disco boards, you can play with this using my 'term' demo in that repo. It implements a 25 x 80 "terminal" on the attached 800 x 480 display. I've not yet connected it to MicroPython, but that is the intention.
This is interesting, because it reveals why Windows Subsystem for Linux gets some terminal things wrong - it's just emulating a Linux kernel, and they haven't done this bit the same way. Running
cat > /dev/null
then pressing tab then backspace works properly on Linux (including SSH from a terminal on Windows), but doesn't work on WSL (or Cygwin).
> When you hit backspace, the kernel tty line discipline rubs out your previous character by printing (in the simple case) Ctrl-H, a space, and then another Ctrl-H.
And you think 'Yup, I knew that.' ...
(In my case, I just about recall re-implementing it when mucking about with serial consoles in my youth - on low-baud connections, you could actually see it happen: backspace - space - backspace.)
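For reference, that rub-out is literally three bytes; a tiny sketch of what the line discipline emits when ECHOE-style erasing is on:

    #include <unistd.h>

    /* Back up, blank the cell with a space, back up again. */
    static void rubout(int fd)
    {
        write(fd, "\b \b", 3);
    }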
The whole TTY layer actually contains a surprising amount of functionality, with line editing being only one part of it; this is one of the best articles I've seen about the rest: http://www.linusakesson.net/programming/tty/ (It's also been mentioned multiple times on HN.)
When I first started programming a PDP-11 using BASIC-PLUS-2, I wrote a program with a built-in password that included backspaces and *'s, which obliterated the first half of the printed password, so when it was printed out you could not read it.
Primitive but effective.
Even when viewing it on the screen, it went by quickly enough that you could not read the password.
Way back on Solaris I once managed to accidentally name a file backspace. Somehow I managed to finger-flub and hit "mv filename <ctrl-h> <enter>" or something like that. Had to use a GUI interface to delete that file.
Not at a Unix machine right now, but this might work:
rm \BKSP where BKSP represents the backspace key. If it works, it's because the backslash escapes the usual meaning of the backspace key, which is to erase the preceding character.
Also, related (though it may not work for the above case): to delete other files with funny characters in the name (like a leading dash):
rm -- -filename
where the -- means "end of options", so even a leading dash after -- is not treated as an option.
Best to use the -i option too in such cases, for interactive confirmation of the removal.
This seems complicated and error prone. Although I can see the advantages of this approach, it really makes me wonder whether the designers chose the right abstractions.
Also, the fact that I can type "cat binary-file" and mess up my terminal settings tells me that they didn't clearly separate concerns.
This is legacy built on legacy, built to work with minimal CPU cycles on existing hardware designed to be compatible with hardware from the 40s.
Yes, concerns aren't separated ;)
A much more recent and far superior design can be found in Windows NT, where things like coloring are done entirely through an ioctl-equivalent (whereas Unix does a lot of in-band signaling, though some later extensions use tty_ioctl).
Nah, what they did was a masterstroke of contract work that allowed them to become the de-facto compatibility layer for every two-bit IBM PC clone out there. The rest is history.
If the shell didn't allow cat to send commands to it (which may mess it up), it also wouldn't allow ncurses and such to use colours and other advanced features.
It'd be a failure to separate concerns if the shell detected and treated cat output differently.
Terminal control could (and should) have been out of band, e.g. via ioctls on the fds of the tty between shell and subprocess, instead of in-band via "special characters", which usually (always?) hint at bad design (compare e.g. with SQL injection).
As an example on how to do it right, consider how commands like "git diff" determine whether to paginate/colorize their output: they call the isatty(fd) stdlib function on stdout, which uses ioctl TIOCGETA to get terminal properties. Iff that fails, there's no TTY on the other end of the fd and "git diff" just writes the raw output.
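A minimal sketch of that check, using the standard POSIX isatty() (the messages here are just for illustration):

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        if (isatty(STDOUT_FILENO)) {
            /* A terminal is on the other end: colour is safe. */
            printf("\x1b[31mcoloured, because stdout is a tty\x1b[0m\n");
        } else {
            /* Piped or redirected: emit plain output. */
            printf("plain, because stdout is not a tty\n");
        }
        return 0;
    }

Run it directly and then pipe it through cat to see both branches.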
If only there even were a clean, consistently implemented, well-documented protocol... but there is not - few tools properly escape their output, you can't get much right without a terminfo database, even with one you can't get consistent results, and the protocol is a mess.
You seem to be assuming terminals designed to support Unix, when in fact it's the other way around. None of this stuff is Unix-specific; it's just ISO 6429 / ISO 2022 / ASCII (or ECMA-48, ECMA-35, ECMA-6 if you want free copies).
Out of band? Do you suddenly double everyone's wiring costs? That won't fly. How about multiplexing control and data streams onto the same channel? Yeah, that's exactly what ISO 6429 / ISO 2022 / ASCII does.
If you want to express terminal control using function calls rather than inserting raw multiplexed control sequences into text — Yeah, that's a good idea… which Unix has had since 1978.
OOB is almost always a terrible idea. Urgent data like interrupts, as used in TCP (basically Unix signals over the network), might be an exception. ioctls are a pain.
Those who don't understand Unix are doomed to reinvent it, poorly.
You should try to implement it the way you think it should be.
The problem you will notice: "content" and "presentation" can't be easily separated because it's important when changes in presentation take effect. Also one man's presentation is the next man's content. They are just not independent streams, period.
It's actually great that you can capture streams including control characters and replay them later. It works pretty well, especially these days where there aren't many competing instruction sets anymore.
Even in HTML you need to tag elements with class names inline. (And the HTML approach could never work for general purpose terminals -- as opposed to crippled, specialized ones, like Mathematica / Jupyter notebooks).
--
> If only there even were a clean, consistently implemented, well-documented protocol... but there is not - few tools properly escape their output, you can't get much right without a terminfo database, even with one you can't get consistent results, and the protocol is a mess.
You can go GUI. But many of the programs you want to use aren't available for GUI, guess why...
isatty() is a hack, but it's convenient and has a very low false-positive rate. If you filter output through another program before sending it to the terminal, the hack falls flat. You then need to provide a means for the user to be explicit, like `git diff --color=always`.
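The override usually ends up looking something like this sketch (the flag names mirror git's; the parsing is a made-up minimal example):

    #include <string.h>
    #include <unistd.h>

    /* Explicit flags win; otherwise fall back to the isatty() guess. */
    static int want_color(int argc, char **argv)
    {
        for (int i = 1; i < argc; i++) {
            if (strcmp(argv[i], "--color=always") == 0) return 1;
            if (strcmp(argv[i], "--color=never") == 0)  return 0;
        }
        return isatty(STDOUT_FILENO);
    }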
> Those who don't understand Unix are doomed to reinvent it, poorly.
Let's avoid hollow arguments to authority that preclude any interesting criticism or discussion. Unix has its fair share of flaws.
> it's important when changes in presentation take effect. [...] They are just not independent streams, period.
Not independent, but separate. Your argument would also apply to stdout and stderr - these are separate even though not independent.
ioctls could very well apply to a specific position in the data stream instead of being immediate. And if you don't like ioctls, there could well be a third output fd instead. In that case, the synchronization problem could be solved with a "barrier" feature.
Can we at least agree that the actual terminal protocol situation as it is now quite frankly sucks? The poor documentation and completely inconsistent implementations have caused me quite some stress.
> Let's avoid hollow arguments to authority that preclude any interesting criticism or discussion.
It's not an argument. It's just a category for the rest of the comment.
> Not independent, but separate. Your argument would also apply to stdout and stderr - these are separate even though not independent.
No - technically they are completely independent. There is no ordering or interleaving or association defined.
Actually stdin/out/err are nothing but conventions introduced by the shell.
> ioctls could very well apply to a specific position in the data stream instead of being immediate.
The general idea of a stream does not include positions - they don't have a clearly defined beginning or end. Streams are not like files on a filesystem, which do indeed have a size and can be indexed.
Stream positions are another specialist concept that eases development for some apps or environments, but gets in the way of development of general purpose environments.
> Can we at least agree that the actual terminal protocol situation as it is now quite frankly sucks? The poor documentation and completely inconsistent implementations have caused me quite some stress.
Frankly, there are not many problems with Unix terminals. They are quite simple. As with Unix in general, they just won't keep you from shooting yourself in the foot.
Job control is hairy, though, but mostly because it's an ill-defined problem. In particular, there was never a good agreed-upon mechanism for starting processes in independent environments. Systemd might have solved that to some degree (not a systemd fan).
The situation with incompatible instruction sets (escape sequences) has also become dramatically better, partly because there are libraries like ncurses, partly because pretty much everything is VT100-compatible these days, and partly because we have GUIs for complex graphical specialist applications.
If you have a concrete problem - get in touch with me and I am happy to help.
> If the shell didn't allow cat to send commands to it (which may mess it up), it also wouldn't allow ncurses and such to use colours and other advanced features.
But control commands shouldn't be commingled with general input/output. That's how you get security bugs like the recent bash CGI vulnerability. There should be an interface that allows ncurses etc. to use colours, but it should be via a distinct control channel, not "magic" sequences embedded in normal data.
The binary file problem is nothing to do with abstractions and concerns, because it goes away if, as the designers of mosh did, one decides that one's terminal will not implement ISO 2022 character set switching.
> This seems complicated and error prone. Although I can see the advantages of this approach, it really makes me wonder whether the designers chose the right abstractions.
Keep in mind that many of the original tty devices were teleprinters, and the physical actions upon being commanded to backspace would involve moving the actual printer head back a character - potentially using an eraser head - and then potentially moving the printer head back once more if erasing advanced it. It's perhaps complicated and error-prone, but it's also the optimal movement of the printer head.
You do not want to bake "complicated and error prone" into the peripherals of the time. You had no firmware. I'm not sure if you even had a proper ROM, necessarily. So you do it on the software side.
There's no abstraction here per se - just a 1:1 mapping of commands to common sets of physical hardware actions. That's not even a bad thing - what exactly would an abstraction have gained you back then? You might abstract the escape codes for a given terminal via a terminfo or termcap database, to support multiple devices, but the core actions were always the same.
Hopefully you're not still using teleprinters in production - but someone is. And the VDUs that replaced teleprinters reused the same control mechanisms - it keeps the complexity and thus the unit price down, and makes them easier to retrofit / keep backwards-compatible with software expecting teleprinters, etc... I know at least one place still using fairly dumb VDUs in production, and these sequences are the simplest, fastest commands you can send them to effectively implement "backspace".
If you ape the same logic in your latest and greatest single-page javascript webpage "app", and you're not writing a terminal emulator, it might be the wrong abstraction. It's sane for ttys though.
> Also, the fact that I can type "cat binary-file" and mess up my terminal settings tells me that they didn't clearly separate concerns.
It's a gun with the potential to be a footgun. Benefits, drawbacks.
Writing a shell script that wants to read passwords? cat a file which disables input echoing. After invoking readline, cat a different file to re-enable it. "Feature".
Want to cobble together a quick script to use your laser printer to print barcodes? Create a binary file that creates the barcode you want, cat it, pipe that through sed a few times to replace the number, and then pipe straight to the appropriate /dev/ttyXY. I've actually seen this, fixed this, and updated this. Figuring out the correct keyboard inputs to give vi was 'fun'. "Feature".
If you don't want to mess up your terminal, "view" or "vi" is likely the correct command - not cat.
> Writing a shell script that wants to read passwords? cat a file which disables input echoing. After invoking readline, cat a different file to re-enable it. "Feature".
Sending control characters to the tty won't do that. Input echo is done in the kernel and you need an ioctl to disable it.
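A minimal sketch of how echo actually gets toggled - a termios change on the tty (tcsetattr() wraps the underlying ioctl), not something escape sequences can reach:

    #include <termios.h>
    #include <unistd.h>

    /* Turn kernel input echo on or off for the given tty fd. */
    static void set_echo(int fd, int on)
    {
        struct termios t;
        tcgetattr(fd, &t);
        if (on)
            t.c_lflag |= ECHO;
        else
            t.c_lflag &= ~ECHO;
        tcsetattr(fd, TCSANOW, &t);
    }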
If the kernel is operating under the assumption that the device is operating under local input echo mode (e.g. the ioctl has already been invoked to disable kernel input echo, as the terminal is assumed to be handling it), sending escape codes to the tty will do that.
EDIT: In case my air quotes around "Feature" weren't enough of a hint, I'm not actually recommending anyone do this.
I didn't read the article and all the comments, but this is the sort of thing that once again reminds me that there's really nothing new in most new technologies :) Let's take DOM updates, for example. Do you update the whole tree and re-render the whole screen, or only the changed parts? The problem is essentially the "same", albeit the tech and corner cases have their own sets of differences. Next, consider that maybe your browser is just a fancier terminal and the webserver is just an evolution of old mainframes.
Well, that's just my way of thinking, but because of that way of thinking, and having started my career writing Unix applications, I can still maintain my interest in the latest 'trends' and at least feel that I have something to contribute when the trend seems to be toward hiring younger and younger devs.
I'm confused as to why the article implies the kernel is serving the shell.

Isn't the shell usually a program running in userspace? One that interprets commands, prints characters on the screen, etc., in userspace? I didn't think the kernel was involved at all, except to receive exec syscalls.
You're conflating the shell and the terminal. The shell is a running process, the terminal (or tty) is essentially an input-output device connected to the kernel. If you want to see what I mean, open a terminal and run the `tty` utility. It will print the device file for your current terminal (Probably a `/dev/pts/` entry). If you write any data to that 'file', it will appear in your terminal, and if you read from that file you'll get completely unfiltered input from the terminal (Unfiltered as in you simply see the keys as they are pressed. If you backspace, you'll see a backspace character - no line editing).
The kernel acts as the gatekeeper between the input/output of the processes running on a tty, and the input/output of the tty device itself. In 'raw' mode, the kernel simply takes the input from the tty and sends it as input to the running process, and then takes the output from the running processes and sends it as output to the tty device. The kernel still isn't completely pointless in this instance though, as it does provide some other facilities.
But it doesn't *have* to do that. In 'cooked' mode (The default mode), the kernel will allow line-editing. It does this by buffering the inputs from the tty device (And writing them back to the tty device, so they appear on the screen) without actually sending them to the running processes until you press return.
The result is that programs like `cat`[1] which do nothing more than call `read()` and send the buffer to `write()` still allow line-editing features like backspace - even though the program itself has no concept of line editing (or even lines at all).
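For the avoidance of doubt, the whole of such a program is roughly this sketch - every line-editing feature you see while typing into it is the kernel's doing:

    #include <unistd.h>

    /* A bare-bones cat: shuttle bytes from stdin to stdout. The
     * backspace handling happens in the kernel's cooked mode
     * before read() ever returns a line. */
    int main(void)
    {
        char buf[4096];
        ssize_t n;
        while ((n = read(STDIN_FILENO, buf, sizeof buf)) > 0)
            write(STDOUT_FILENO, buf, n);
        return 0;
    }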
It should be clarified that modern `bash` does operate in raw mode, and handles all the line editing itself (The `readline` library also does this - I can't say for certain if `bash` uses `readline` internally or not, I haven't checked). Unix utilities like `cat` though don't change tty settings and thus are generally operating in `cooked` mode (You can change these settings via `stty`, but it's generally not necessary).
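The switch such programs make looks roughly like this sketch (a real program saves the old termios and restores it on exit):

    #include <termios.h>
    #include <unistd.h>

    /* Leave cooked mode: no kernel line buffering, no kernel echo.
     * After this, read() returns each keystroke immediately and the
     * program must do all the line editing itself. */
    static void enter_raw_mode(int fd)
    {
        struct termios t;
        tcgetattr(fd, &t);
        t.c_lflag &= ~(ICANON | ECHO);
        t.c_cc[VMIN]  = 1;    /* read() returns after one byte */
        t.c_cc[VTIME] = 0;    /* no inter-byte timeout */
        tcsetattr(fd, TCSANOW, &t);
    }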
But try remapping C-w in bash. The tty remaps it on the very next line. You'll have to `stty werase undef` to unmap it before you can change it in your readline .inputrc.
So the tty cooking still has charge of some things.
I think that's actually just bash messing with you. Bash implements its own erase-word and binds it to whatever is set as werase in the terminal, so if you don't clear the werase entry bash will just keep rebinding it every new prompt. It's definitely not the kernel's doing - for example vim, which also uses raw mode, can use C-w just fine, and the functionality bash gives to C-w is different from what the kernel provides.
The shell needs to print a prompt and read input. This is I/O, which involves the kernel. The shell is (when running interactively) attached to some terminal. These days it's usually a pseudo-terminal with a terminal emulator like xterm on the other end. (With some exceptions) the read syscall on a terminal does not return until there's a newline. The kernel only implements simple line editing. You can kill the whole line (^U) or erase the last character (^H, i.e. backspace). Traditionally kill and erase were @ and #, which you may see in some old documentation. On a hardcopy terminal you could type daat##te and it would run date. If the right flags are set, instead of outputting the erase character, Unix will back up, print a space, and back up again as described.
However, the shell supports more line editing. I bet you can press the left arrow and edit the middle of the line. I bet you can press the up arrow and get the last line typed. I bet you can press ^A and get to the beginning of the line. The kernel doesn't do any of this, and you can't do it in cat. What happens is the shell turns off line editing. Then it gets all input unprocessed, and can decide what to do on its own. This is why Chris Siebenmann put the PS in there.
It's true that modern shells like "bash" are receiving a stream of bytes in what is called the "raw" mode of the tty subsystem. For programs that don't switch to raw mode, though, the tty is generally in the "cooked" mode that has existed for most of the life of UNIX and similar platforms.
In cooked mode, the kernel is interpreting your input and provides basic line editing facilities -- only once you "commit" a line, generally by pressing something like carriage return or enter, does the kernel make the entire line available on the tty file descriptor.
That's why the author mentions "cat" as a way to witness the kernel's line editing behaviour: it generally leaves the terminal in "cooked" mode.
For historical reasons, there is way too much kernel-level character-input processing in UNIX/Linux serial-port code. That stuff belongs in user space today.
Yes. Though I guess it'll be with us forever more, just like the old complicated x86 instructions that nobody ever uses any more, but that no one is in a hurry to remove, because they only take up a very small fraction of the chip.
You can also choose to ignore the reasoning-by-analogy I gave, and just take my prediction as a bet. I offer even odds at 100 bucks that this tty stuff will still be in the Linux kernel in 20 years.
I didn't ignore your reasoning, I acknowledged it and said it was not insightful.
Unused CPU instructions are difficult (but not impossible!) to deprecate for completely different reasons than the critical functionality of line discipline that is used all day every day and for which nobody has proposed a viable alternative.