
All About EOF (2012) - joubert
https://latedev.wordpress.com/2012/12/04/all-about-eof/
======
majewsky
Unix applications assume end-of-file when read() returns 0 bytes. Ctrl-D can
generate such a condition in a quite contrived way. The article is very wrong
about this:

> In fact, the Control-D you type in at the shell to end input is simply a
> signal to the shell to close the standard input stream.

The shell is not involved in this at all, and no file descriptor gets closed.
[0] The magic happens in the kernel's VT (virtual terminal) subsystem. When
the user presses Ctrl-D, the terminal emulator (e.g. xterm) writes the
corresponding byte sequence [1] into the master side of the pty (pseudo-
terminal) device. Upon observing this input, the kernel will immediately
answer all outstanding read() syscalls on the slave side of the pty device.

Now usually, when the terminal device is in its default mode (aka "canonical"
or "cooked" mode), the kernel will buffer user input until a full line is
observed (i.e. until the user hits Return), so read()s on the slave side will
block until Return is hit. Due to the behavior above, Ctrl-D can be used to
send partial lines to an application. Try this, for example:

1\. Run `cat`.

2\. Type something and press Enter.

3\. Type something and press Ctrl-D.

4\. Type nothing and press Ctrl-D.

The last step will cause `cat` to exit because the read() returns 0 bytes
(since there is nothing in the kernel buffer), which `cat` interprets as
encountering the end of the input file.
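
Steps 3 and 4 can be reproduced without a keyboard. This is a minimal sketch, assuming a Linux-like system where Python's `pty.openpty()` works; the helper names `open_canonical_pty` and `ctrl_d_demo` are made up for illustration. It plays the terminal emulator by writing bytes to the master side, then reads from the slave side the way `cat` would.

```python
import os
import pty
import termios


def open_canonical_pty():
    """Open a pty pair and make sure the slave side is in canonical mode."""
    master_fd, slave_fd = pty.openpty()
    attrs = termios.tcgetattr(slave_fd)
    attrs[3] |= termios.ICANON        # lflag: line-buffered ("cooked") input
    attrs[3] &= ~termios.ECHO         # no echo, so nothing queues up on the master side
    attrs[6][termios.VEOF] = b"\x04"  # assumption: Ctrl-D (0x04) is the EOF character
    termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)
    return master_fd, slave_fd


def ctrl_d_demo():
    """Return what the slave side reads for step 3 and step 4."""
    master_fd, slave_fd = open_canonical_pty()
    try:
        # Step 3: a partial line followed by Ctrl-D is delivered without a newline.
        os.write(master_fd, b"partial\x04")
        partial = os.read(slave_fd, 1024)
        # Step 4: Ctrl-D with nothing buffered makes read() return 0 bytes.
        os.write(master_fd, b"\x04")
        empty = os.read(slave_fd, 1024)
        return partial, empty
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    print(ctrl_d_demo())
```

Note that neither file descriptor is closed before the zero-length read comes back, which is the point: nothing had to be closed for the reader to see "end of file".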

[0] Which wouldn't make sense anyway, since the shell can only close its own
file descriptors, but the reading happens in an entirely different process.

[1] I'm not incredibly familiar with the master side of ptys, but I would
guess that the concrete byte written is 0x04, since the letter D is 0x44 and
the Ctrl modifier (in ye olden days) unset the 0x40 bit, leaving only 0x04.
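
The arithmetic in [1] is easy to check. This is only a sketch of the bit-clearing model described there, with a hypothetical helper named `ctrl`; it says nothing about what a particular terminal emulator actually writes.

```python
def ctrl(letter):
    """Clear the 0x40 bit of an uppercase ASCII letter, as the footnote describes."""
    return ord(letter.upper()) & ~0x40


if __name__ == "__main__":
    print(hex(ctrl("D")))  # Ctrl-D
    print(ctrl("Z"))       # Ctrl-Z, the DOS "character 26"
```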

~~~
JdeBP
> _The magic happens in the kernel's VT (virtual terminal) subsystem._

No, it is in the line discipline, which all terminals have, real, pseudo, and
kernel virtual.

------
JdeBP
> _So what is the difference between the type command used above and the
> Notepad application? It’s actually hard to say. Possibly the type command
> has some special code that checks for the Control-Z character in its input._

Actually, it's very easy to say, and I've been pointing to the code for about
a decade.

* [http://jdebp.eu./FGA/dos-character-26-is-not-special.html](http://jdebp.eu./FGA/dos-character-26-is-not-special.html)

------
jokh
Why doesn't Windows drop the purported compatibility with CP/M and get rid of
control-z being a special file marker at this point? Lots of legacy code
relying on it?

~~~
JdeBP
It's not _in_ Windows in the first place. The headlined article does
explicitly say this.

It's library code in a large number of applications programs.

* [http://jdebp.eu./FGA/dos-character-26-is-not-special.html](http://jdebp.eu./FGA/dos-character-26-is-not-special.html)

------
Thorrez
One weird thing I've found about EOF is that EOF is sometimes not actually the
end. You can read an EOF from stdin and then continue reading more data.
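
For a terminal this is simple to demonstrate. This is a rough sketch, assuming a Linux-like pty in canonical mode with 0x04 as the EOF character; `read_past_eof` is a made-up name. The first read returns 0 bytes, and the next one returns data as if nothing had happened.

```python
import os
import pty
import termios


def read_past_eof():
    """Read a zero-length 'EOF' from a pty slave, then keep reading."""
    master_fd, slave_fd = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave_fd)
        attrs[3] |= termios.ICANON        # canonical (line-buffered) input
        attrs[3] &= ~termios.ECHO         # keep the master side quiet
        attrs[6][termios.VEOF] = b"\x04"  # assumption: Ctrl-D is the EOF character
        termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)

        os.write(master_fd, b"\x04")          # Ctrl-D on an empty line
        first = os.read(slave_fd, 1024)       # zero-length read, looks like EOF
        os.write(master_fd, b"still here\n")  # the "file" keeps going
        second = os.read(slave_fd, 1024)
        return first, second
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    print(read_past_eof())
```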

~~~
majewsky
See my sibling comment:
[https://news.ycombinator.com/item?id=19366521](https://news.ycombinator.com/item?id=19366521)
In short, EOF is probably not what you think it is.

