
Integer overflow in glibc strncat - anon1385
https://sourceware.org/bugzilla/show_bug.cgi?id=19390
======
mjevans
It was easier to find the source page via a search engine than from the bug
reporting details page...

[https://sourceware.org/glibc/wiki/GlibcGit](https://sourceware.org/glibc/wiki/GlibcGit)

[https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86...](https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/strncat-
sse2-unaligned.S)

[https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86...](https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/strcat-
sse2-unaligned.S)

It's been too long since I've done any assembly programming, but this really
doesn't appear to have good documentation for maintainability; or anything
resembling sufficient documentation at all.

~~~
Kristine1975

      /* Inline corresponding strlen file, temporary until new strcpy
      implementation gets merged.  */
    

git blame reveals this is from March 2013. "Temporary" indeed...

~~~
gnarbarian
Nothing is more permanent than a temporary solution.

~~~
munificent
I'm guessing you mean this in jest, but I believe it is literally true.

Temporary code is usually poorly documented, poorly designed, deeply coupled,
ugly code. Those are all the exact qualities that make it _harder_ to change
than high quality code.

Unless you work very hard, the natural tendency is for all good code (which is
correspondingly easy to change) to eventually be replaced with worse code that
is harder to replace.

Natural selection applies to code too and, unfortunately, the qualities that
lead to longevity are often exactly opposite of the ones that lead to good
code.

~~~
btrask
Those are all factors, but I think there's a deeper one: the same pressure to
create a temporary solution also tends to keep it in place. On the other hand,
when you are at your leisure to search for an ideal permanent solution, you
probably have more time to keep changing it.

------
nomadlogic
Seems to give more credence to:
[https://www.sudo.ws/todd/papers/strlcpy.html](https://www.sudo.ws/todd/papers/strlcpy.html)

~~~
dietrichepp
Not really. The strlcat function is safer because it always NUL terminates the
destination buffer, and therefore makes bugs in user code less likely. The
flaw here is just an implementation defect.

Also note that SIZE_MAX is an unreasonable input, one would even say it
violates preconditions.

~~~
jmgao
> Also note that SIZE_MAX is an unreasonable input, one would even say it
> violates preconditions.

Annex K adds rsize_t, RSIZE_MAX, and friends, which restrict inputs to
reasonable values. Unfortunately, it doesn't retcon the old functions like
strncpy, and worse, no one actually implements it.

~~~
klodolph
No one? I thought Microsoft implemented it, or at least an early version
thereof.

~~~
cremno
A very early version. Among other differences the functions don't care about
RSIZE_MAX (even though newer VS versions define and use rsize_t). However the
Windows SDK contains <strsafe.h> which does limit the length of strings
(STRSAFE_MAX_CCH).

[https://msdn.microsoft.com/en-
us/library/windows/desktop/ms6...](https://msdn.microsoft.com/en-
us/library/windows/desktop/ms647466%28v=vs.85%29.aspx)

------
martin_
What'd be a use case of strncat(s1, s2, SIZE_MAX)? It seems like this would be
hard to find in the wild

~~~
stephencanon
Commonly, a policy declared that all use of `strcat` must be replaced with
`strncat` and at one or more locations in a codebase an actual size bound
could not be ascertained without more engineering effort than management was
willing to spend.

~~~
dietrichepp
Ah yes, we are absolutely certain that this new code is incorrect, but it
doesn't go against corporate policy.

~~~
stephencanon
The new code is precisely equivalent to the old code (except for the bug
report linked here).

------
xroche
Yup, I reported the same issue a year ago
([http://blog.httrack.com/blog/2014/08/31/i-found-a-bug-in-
str...](http://blog.httrack.com/blog/2014/08/31/i-found-a-bug-in-strncat/))

------
__michaelg
I would argue that copying _more_ bytes (as demonstrated) is correct
behaviour:

(1.) The function is supposed to append a string to another, not paste a byte
range into another one, leaving extra bytes at the end untouched. Thus the
functional result in the test code is correct (i.e. contains the concatenated
string with for the usual definition of NUL-terminated strings.)

(2.) The strncat man page (and similarly the C99 standard) says: "[...] the
size of dest must be at least strlen(dest)+n+1." It does not specify what
happens to unused bytes at the end. It would be incorrect if the function
would ever touch more than `strlen(dest)+n+1` bytes in dest, but it doesn't.
Supplying a proper value for n in the code (e.g. `strlen(s2)`) shows that it
wouldn't overrun the buffer.

Thus supplying SIZE_MAX for `n` would be a bad idea (because the function
could overwrite random memory locations) and would probably be incorrect in
most cases (e.g. on amd64 with SIZE_MAX=2^64-1 it would only be correct for
the null pointer.)

Copying _less_ would obviously be incorrect. I couldn't reproduce that though.
(The _more_ case even appears on Ubuntu with eglibc on both x86 and amd64.)

Also, this is probably still a bug and not the anticipated behaviour. (As in:
whoever wrote that didn't think of those cases and the overflow happens by
mistake.)

~~~
stephencanon
> The strncat man page (and similarly the C99 standard) says: "[...] the size
> of dest must be at least strlen(dest)+n+1."

No, the standard does not say that. It (both C99 and C11) says, verbatim: "the
maximum number of characters that can end up in the array pointed to by s1 is
strlen(s1)+n+1." If the manpage on your system says that, that's a fairly
serious deficiency in the manpage.

The standard also says "characters that follow it [a null character] are not
appended", so copying extra bytes is explicitly not correct.

------
ambrop7
Huh, I never use str(n)cat, because it's inefficient by design.

~~~
halayli
what's inefficient about it?

~~~
ambrop7
The typical strcat code you see in the wild:

char buf[100]; buf[0] = '\0'; strcat(buf, "Hello"); strcat(buf, " ");
strcat(buf, "World"); strcat(buf, "!\n");

The inefficiency is that each time you do strcat, the existing characters of
the string are iterated. So something like a common sequence of building a
string by appending little pieces has quadratic inefficiency.

~~~
bmm6o
It's unfortunate that strcat & friends don't return a pointer to the new end
of the string, or a count of the characters copied. You could use those to
make subsequent cats more efficient.

