
A 35-year-old bug in Patch found in efforts to restore 29-year-old BSD - fanf2
http://bsdimp.blogspot.com/2020/08/a-35-year-old-bug-in-patch-found-in.html
======
mncharity
> 35-year-old bug

Long ago, someone was archiving magnetic tapes at MIT, containing Lisp Machine
backups, from longer ago. Or maybe they were TOPS-20 backups... my memory has
faded.

Archiving here means using an old 9-track tape machine with a custom driver,
to copy data off now read-once 9 and 7 track tapes. Read-once tapes, because
after the tape goes by the read head, the tape's plastic backing goes one way,
and the tape's rust goes another. The backing is rewound, and the rust makes a
scattered little pile. The original driver would back up and retry on error.
Scrubbing back and forth, back and forth. Which here, would be bad. But back
to our story.

On one such longer-ago backup, was a core dump file. A core dump, with a
snapshot of the frame buffer. A frame buffer showing someone's screen, at that
moment of the core dump, so long ago. And at that moment, that someone was
being pranked. Pranked by a program which would draw, crawling across the
screen, a little spider. A little bitmap sprite bug. A bug, trapped by chance
in a core dump, and preserved in rust.

~~~
anonymousisme
Never saw the spider, but I do remember "xroach", in which roaches would
scatter from underneath a window when it was closed.

Great fun from about 25-30 years ago.

EDIT: It looks like xroach still lives!
[https://www.freshports.org/games/xroach/](https://www.freshports.org/games/xroach/)

~~~
bitwize
A bitmap spider figures into a display-hack demo in Open Genera that I could
never get working. I think it may be more related to that.

~~~
mncharity
That sounds right - tnx!

------
asveikau
This is an interesting choice to make. There is a risk of incorrectly
applying some corner case of well-formed patches, and a risk of incorrectly
applying malformed ones that are working silently today.

I can see wanting to avoid the former. The latter also seems bad though.
Rarely is it ever made so clear that fixing bugs in a popular tool also leads
to compatibility issues. People often cite that concern to avoid fixing bugs,
but this is a legit case of that dichotomy.

------
jtchang
On a related note I've found finding information on the patch command to be a
bit harder than normal. Without digging into the code I was trying to figure
out why some people name things in the first two lines with an ending of .old.
Some common search queries really have a hard time bringing this up. One link
I found is here, but it didn't really go into much depth:

[https://stackoverflow.com/questions/987372/what-is-the-forma...](https://stackoverflow.com/questions/987372/what-is-the-format-of-a-patch-file)

~~~
pwg
If I'm understanding you correctly, you are wondering why some filenames are
named *.old in the header of patch files.

If so, then that is likely an easy answer. No one creates patch files by hand;
they are created by diff (or similar tools).

And diff, at least the Unix variant, is used to compare two files and produce
an output that will change the first of the two into the second of the two
(i.e., it outputs the 'differences' between the two files, hence the utility's
name 'diff').

Well, to have two files, one of the files has to be given a different name. So
if one is editing a single file, and is not using some kind of source control
system that tracks changes, it is common to first do:

    
    
        cp file_to_be_edited file_to_be_edited.old
    

Then, edit "file_to_be_edited".

Then produce a patch by doing:

    
    
        diff -u file_to_be_edited.old file_to_be_edited > file_to_be_edited.diff
    

And since diff puts the filenames of the two files it compares into the header
lines of the output unified diff format, you get a file named *.old showing up
in the output patch file.

~~~
inopinatus
Well I just want to stop you right there and say that I have sinned and
written diffs by hand, and what’s more they were for a sendmail.cf, and if
this conjures visions of a cantankerous sandal-wearing Unix admin then so be
it.

~~~
rascul
I just got flashbacks. My night is now ruined.

------
cptnapalm
As 2.11BSD code is, I think, at least partially prior to the settlement with
AT&T, does anyone know what, if any, files in it are still encumbered?

~~~
LukeShu
The Version 6 (7?) UNIX code that BSD is based on was re-licensed to a
4-clause BSD license in 2002.

[http://www.lemis.com/grog/UNIX/](http://www.lemis.com/grog/UNIX/)

~~~
scruffyherder
It’s too bad Caldera didn’t have ownership to give it away; rather, they had
rights to sublicense it.

Should have bought a $100 ancient Unix license instead.

------
kazinator
I cannot reproduce this with an installation of GNU patch 2.7.6.

I could be doing something wrong or misunderstanding the bug.

Input files:

    
    
      $ cat patch-bug-test
      How
      now
      brown
      cow?
      Now
      is
      the
      time
      for
      all
      good
      men.
    
      $ cat patch-bug-test-2
      How
      now
      brown
      cow?
      Now
      is
      the
      time
      for
    

Context diff:

    
    
      $ diff -c patch-bug-test patch-bug-test-2
      *** patch-bug-test 2020-08-17 11:39:03.056723058 -0700
      --- patch-bug-test-2 2020-08-17 11:41:46.683095324 -0700
      ***************
      *** 7,12 ****
        the
        time
        for
      - all
      - good
      - men.
      --- 7,9 ----
    

Apply in reverse to copy of patch-bug-test-2:

    
    
      $ cp patch-bug-test-2 patch-bug-test-3
      $ diff -c patch-bug-test patch-bug-test-2 | patch -R patch-bug-test-3
      patching file patch-bug-test-3
    

The operation is successful and the reverse-patched file is now identical to
the original:

    
    
      $ diff patch-bug-test patch-bug-test-3 
      # no output
    

I did try it with different numbers of lines removed.

Looking at the code in the GNU version, the function is quite different. That
block of code is found, with the comment intact, but there are differences. It
looks like this in the GNU patch repository, as of this commit:
[http://git.savannah.gnu.org/cgit/patch.git/tree/src/pch.c?id...](http://git.savannah.gnu.org/cgit/patch.git/tree/src/pch.c?id=099394003477b83c2eb4be07fd0173d6e696cf4e)

    
    
         if (!chars_read) {
           if (repl_beginning && repl_could_be_missing) {
              repl_missing = true;
              goto hunk_done;
           }
           if (p_max - p_end < 4) {
             strcpy (buf, "  \n");  /* assume blank lines got chopped */
             chars_read = 3;
           } else {
             fatal ("unexpected end of file in patch");
           }
         }
    

The unpatched FreeBSD one referenced in the article:

    
    
       if (len == 0) {
         if (p_max - p_end < 4) {
           /* assume blank lines got chopped */
           strlcpy(buf, "  \n", buf_size);
         } else {
           if (repl_beginning && repl_could_be_missing) {
             repl_missing = true;
             goto hunk_done;
           }
           fatal("unexpected end of file in patch\n");
         }
       }
    

The order of the tests is reversed, which could make a difference (though
obviously only in the case when repl_beginning and repl_could_be_missing are
true and the goto is taken). In the oldest baseline that is in the GNU patch
repo (2009-dated), it is already this way, so to find the commit which
affected this code we would have to look to earlier GNU patch sources.

~~~
Someone
I don’t think GNU patch shares history with Larry Wall’s original.
[https://directory.fsf.org/wiki/Patch](https://directory.fsf.org/wiki/Patch)
seems to say so: _“GNU version of Larry Wall's program that takes "diff's"
output and applies it to an original file to generate a modified version of
that file”_

I tried to verify by looking at their git repo, but that stops at “Import of
patch-2.1.tar.gz”
([https://git.savannah.gnu.org/cgit/patch.git/log/?ofs=450](https://git.savannah.gnu.org/cgit/patch.git/log/?ofs=450))

~~~
rurban
Nope. GNU Patch definitely is based on Larry's patch. But this bug has been
known ever since shar needed to work around it. It's just that GNU patch had
it fixed very early on; BSD and probably others did not. So shar still had to
use the workaround.

~~~
bsdimp
Shar's workaround is for leading white space. This change is for not
assuming trailing newlines.

