
Adding strlcpy() to glibc - corbet
http://lwn.net/SubscriberLink/612244/8a95e0fc58bf17f6/
======
nkurz
Deep in the comments, 'slibc' is mentioned. I hadn't known about it, but this
library provides str..._s() implementations of all the standard str...()
functions as defined in Annex K of the C11 standard (which I also hadn't known
about).

[https://code.google.com/p/slibc/](https://code.google.com/p/slibc/)

At a glance, this seems like a better solution than any of the alternatives:
require the buffer length to always be specified, and call a user controlled
handler instead of allowing an overflow. This handler defaults to abort().

While there are fine arguments to be made for using a real string library like
bstring, are there downsides to the str_s approach compared to using the str,
strp, or strl family of functions?

~~~
hyperrail
One issue with the str..._s() functions is that the invalid-parameter handler
is a global setting - all code that calls the slibc functions and hits an
invalid parameter problem will call the handler. This could cause anything
from your application code to some random library you never heard of to make
your program abort().

I personally don't think this is a big problem, as the kinds of issues that
would cause the invalid parameter handler to be called usually are not
recoverable no matter where they originate from.

But I can definitely see why the loss of control is technically troubling -
those random libraries could be calling set_constraint_handler_s() and thus
causing your code to keep running and not abort(), forcing you to rely on your
second defense of checking the return values from all the functions.

(There's also a potential political issue with str..._s(), but I won't go into
that one.)
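
To make the global-handler concern concrete, here is a minimal sketch. It does not use the real Annex K or slibc API: `sketch_set_handler`, `sketch_strcpy_s`, and the two handlers are hypothetical stand-ins with roughly the same shape as `set_constraint_handler_s()`, `strcpy_s()`, `abort_handler_s()`, and `ignore_handler_s()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the Annex K machinery, not the real API. */
typedef void (*sketch_handler_t)(const char *msg);

void sketch_abort_handler(const char *msg)  { (void)msg; abort(); }
void sketch_ignore_handler(const char *msg) { (void)msg; }

/* One handler for the whole process: this is the global state at issue. */
static sketch_handler_t current_handler = sketch_abort_handler;

/* Installs a new handler and returns the previous one. */
sketch_handler_t sketch_set_handler(sketch_handler_t h)
{
    sketch_handler_t old = current_handler;
    current_handler = h ? h : sketch_abort_handler;
    return old;
}

/* Returns 0 on success; nonzero if src did not fit and the handler returned. */
int sketch_strcpy_s(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (dstsize == 0 || len >= dstsize) {
        if (dstsize > 0)
            dst[0] = '\0';   /* leave an empty string, not a partial copy */
        current_handler("sketch_strcpy_s: source does not fit");
        return 1;
    }
    memcpy(dst, src, len + 1);
    return 0;
}
```

If any library in the process calls `sketch_set_handler(sketch_ignore_handler)`, an oversized copy no longer aborts and every caller has to fall back on checking the return value.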

------
Zardoz84
I usually use strncpy and force a NUL at the end, which is virtually the same
as what strlcpy does.

But if you know the lengths of the source and destination buffers (and if you
are using str[nl]cpy, you probably do), you could use memcpy and get a much
faster copy.
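
A rough sketch of both variants follows. The helper names `copy_strncpy_nul` and `copy_known_len` are made up for illustration, and both truncate silently when the source does not fit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: strncpy() followed by a forced terminator. */
void copy_strncpy_nul(char *dst, size_t dstsize, const char *src)
{
    assert(dstsize > 0);
    strncpy(dst, src, dstsize - 1);  /* also zero-pads the unused part */
    dst[dstsize - 1] = '\0';         /* strncpy() does not guarantee this */
}

/* Hypothetical helper for when the source length is already known.
 * Returns the number of bytes copied, not counting the terminator. */
size_t copy_known_len(char *dst, size_t dstsize,
                      const char *src, size_t srclen)
{
    assert(dstsize > 0);
    size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
    memcpy(dst, src, n);             /* no scanning for NUL, no padding */
    dst[n] = '\0';
    return n;
}
```

The memcpy version only pays off if the length really is known already; calling strlen() just to feed it gives up most of the advantage.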

~~~
adekok
strncpy() zero-fills the array. Which seems overly wasteful.

    
    
        char buffer[1 << 24];
        strncpy(buffer, "a", sizeof(buffer));
    

will set _all_ of the array to zero, instead of just copying two bytes.

And as you noted, it won't NUL terminate the string if the source doesn't fit
in the given length. Since strncpy() doesn't
return C strings, it's not really a "C string" function. It's a horrible,
inefficient bastardization which should be used by no one.
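
Both behaviors are easy to observe. This is a small sketch with two hypothetical helpers (`zero_padding_after_strncpy` and `is_terminated` are names invented here): the first counts how many bytes strncpy() zeroes, the second checks whether the result is terminated at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill buf with a marker byte, run strncpy(), and
 * count how many of the n bytes ended up as zero. */
size_t zero_padding_after_strncpy(char *buf, size_t n, const char *src)
{
    assert(buf != NULL && src != NULL);
    memset(buf, 'x', n);             /* poison so the padding is visible */
    strncpy(buf, src, n);
    size_t zeros = 0;
    for (size_t i = 0; i < n; i++)
        if (buf[i] == '\0')
            zeros++;
    return zeros;
}

/* Hypothetical helper: returns 1 if buf[0..n-1] contains a NUL byte. */
int is_terminated(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}
```

With a 16-byte buffer and the source "a", every byte after the first is zeroed; with a 4-byte buffer and the source "abcdef", no zero byte is written at all.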

~~~
ploxiln
Meanwhile, strlcpy() returns the length of the source. So if you just want to
copy a part of the source, or you're not sure if the source is terminated past
the length you want to copy, it's either wasteful or dangerous.

~~~
wahern
Those are completely made-up objections. strlcpy also fails to fix me coffee
in the morning, ergo it should be avoided.

------
TazeTSchnitzel
Safe string handling in C _can_ be done, but not with char*.

PHP's Zend Engine has safely-handled strings, for example, but we do that by
reference-counting them and having an explicit length.

~~~
michaelmior
SDS[0] from the creator of Redis is also useful. It doesn't do reference
counting, but is helpful for many other string manipulation issues.

[0] [https://github.com/antirez/sds](https://github.com/antirez/sds)

~~~
TazeTSchnitzel
Its clever implementation, which lets an SDS string be passed around and
indexed like any other char*, is what really sets it apart from the
alternatives.
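
The general trick can be sketched as follows: keep the length in a header placed just before the characters and hand out a pointer to the characters. This is only an illustration of the idea, not SDS's actual layout or API; `sketch_hdr` and the `sketch_str_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header stored immediately before the character data. */
typedef struct {
    size_t len;
    char   data[];   /* C99 flexible array member */
} sketch_hdr;

/* Allocates a string; the caller receives a plain char* into data[]. */
char *sketch_str_new(const char *init)
{
    assert(init != NULL);
    size_t len = strlen(init);
    sketch_hdr *h = malloc(sizeof *h + len + 1);
    if (h == NULL)
        return NULL;
    h->len = len;
    memcpy(h->data, init, len + 1);
    return h->data;
}

/* Recovers the header from the char* and returns the stored length. */
size_t sketch_str_len(const char *s)
{
    assert(s != NULL);
    const sketch_hdr *h =
        (const sketch_hdr *)(const void *)(s - offsetof(sketch_hdr, data));
    return h->len;
}

/* Frees the whole allocation, header included. */
void sketch_str_free(char *s)
{
    if (s != NULL)
        free(s - offsetof(sketch_hdr, data));
}
```

Because the caller holds an ordinary NUL-terminated char*, `s[1]`, strcmp(), and printf("%s") all keep working, while the length is available without a scan.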

------
ape4
Hopefully it will be added to the standard C library so it will be everywhere
- eg Microsoft, Apple, etc.

~~~
_delirium
C11 added a different set of safe string functions, although as an optional
standard annex (Annex K). They have _s suffixes, e.g. strcpy_s(), and somewhat
different behavior. It seems unlikely that ISO would also add strlcpy() at
this point, given that one set of "safe" functions is already (sort of) in the
standard. But maybe POSIX could standardize it.

~~~
yuhong
Which came directly from MS.

~~~
dottrap
Too bad Microsoft Visual Studio refuses to get with the times and update their
antiquated compiler from C89. They are now 2 standards and 25 years behind.

~~~
ygra
VS has a C++ compiler, not a C compiler. There is no getting with the times.
That's like advocating that the Java compiler should get with the times and
start compiling C++11. C and C++ are quite different languages, despite
looking superficially the same. Think of it as a DVCS history that evolved
into two very different branches at one point (C89) with the occasional
feature branch of C merged into C++.

~~~
dottrap
Visual Studio is the only major C++ compiler that is inept at handling C. It
is not unreasonable to expect proper C support, and there is something
terribly broken with Visual Studio in that it can't provide it. Even
Stroustrup has called for better compatibility with C.

~~~
_delirium
I don't see any particular reason you'd expect a C++ compiler to handle C,
except the very specific part of C (largely C89) that is a proper subset of
the C++ standards. They're very different languages at this point.

gcc's C++ compiler doesn't handle C past C89 either; you can't use C99
features when compiling with g++. There is a separate C compiler that's part
of the GNU Compiler Collection, but then there's also a Fortran and a Go
compiler: gcc includes a lot of languages. MSVC++ doesn't; it's just a C++
compiler, not a Fortran or Java or C compiler. Linux norms are a bit more
polyglot, so gcc is more polyglot than MSVC++ is.

I personally like C, so it'd be nice if Microsoft shipped a C compiler in
addition to a C++ compiler, but they've chosen not to. As a result, the "C"
you can write is precisely the C that is also valid C++, and you can't use any
features that aren't valid C++. If you really want there to be a language
"C/C++" again that all C++ compilers can be expected to handle, you'd have to
convince the C++ standards committee to add C99 or C11 features to the next
C++1x.

------
penguindev
I agree with drepper on this; it's a solution in search of a problem. You
should either know WTF you can accept or use a higher level construct that can
resize. Silent truncation seems bad - truncation attacks are a real attack
vector that SSL, for example, tries to prevent.

~~~
ori_b
strlcpy is designed to make it easy to detect truncation. You get back the
length of the string it tried to create, i.e. the length of the source. If
this is >= the size of the buffer, you truncated.
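
In code, the check looks roughly like this. Since not every libc ships strlcpy(), the sketch carries its own stand-in, `sketch_strlcpy`, written to follow the documented semantics; `copy_or_fail` is a hypothetical wrapper added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in with strlcpy() semantics: copies at most size-1 bytes,
 * always terminates when size > 0, and returns strlen(src). */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    assert(src != NULL);
    size_t srclen = strlen(src);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Hypothetical wrapper: returns 0 on success, -1 if src was truncated. */
int copy_or_fail(char *dst, size_t size, const char *src)
{
    return sketch_strlcpy(dst, src, size) >= size ? -1 : 0;
}
```

The comparison is `>=` rather than `>` because a source exactly as long as the buffer leaves no room for the terminator.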

~~~
to3m
You can just keep going, building up your string using strlcpy/strlcat as you
go, and check for truncation (whether from this call, or a previous one) using
the return value whenever it's convenient. Depending on what sort of programs
you write, truncation might not even be a problem, anywhere from some to
virtually all of the time, so in those cases you can just let it happen and
the code ends up even simpler.

(When I first discovered strlcpy/strlcat I went and changed a bunch of string-
fiddling code to use them and it was really amazing how much simpler
everything became. Virtually all of my bounds checks could go away, leaving
just the string stuff. Much nicer? Well, I wouldn't go that far. But certainly
fewer ways for it to go wrong.)
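
A sketch of that style, under some assumptions: `sketch_strlcpy` and `sketch_strlcat` are local stand-ins written to follow the documented strlcpy()/strlcat() semantics, and `build_path` is a hypothetical example. It collects the truncation status from every call and tests it once at the end, which also covers the corner case of appending an empty string after an earlier truncation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for strlcpy(): returns strlen(src). */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Stand-in for strlcat(): returns the length it tried to create. */
size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
    const char *end = memchr(dst, '\0', size);
    size_t dlen = end ? (size_t)(end - dst) : size;
    size_t slen = strlen(src);
    if (dlen < size) {
        size_t room = size - dlen - 1;
        size_t n = slen < room ? slen : room;
        memcpy(dst + dlen, src, n);
        dst[dlen + n] = '\0';
    }
    return dlen + slen;
}

/* Hypothetical example: joins dir, "/" and name with no bounds checks
 * between the calls. Returns 0 on success, -1 if anything was truncated. */
int build_path(char *out, size_t size, const char *dir, const char *name)
{
    assert(out != NULL && dir != NULL && name != NULL);
    int truncated = 0;
    truncated |= sketch_strlcpy(out, dir, size) >= size;
    truncated |= sketch_strlcat(out, "/", size) >= size;
    truncated |= sketch_strlcat(out, name, size) >= size;
    return truncated ? -1 : 0;
}
```

Whatever happens, `out` stays a valid terminated string, so a caller that does not care about truncation can ignore the return value entirely.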

------
sigzero
I am curious why strlcpy() was not modified to check the length going in and
out to check for truncation? I am not a C/C++ guy but that is one question I
had when he said that was one of the gripes for the function.

~~~
imanaccount247
Because its return value already tells you if truncation occurred.

------
smegel
> The primary complaint about strlcpy() is that it gives no indication of
> whether the copied string was truncated or not.

So how hard would it be to add a return value indicating this?

~~~
ploxiln
If you did, it wouldn't be strlcpy() anymore. The point is compatibility with
existing users and expectations.

------
bitwize
The correct approach is to use the strxxx_s family of functions. This is also
in the most recent C standard.

~~~
noselasd
The strxxx_s family blows up your program if it encounters something bad.

What most people want is some simple functions that can tell you if
something would have gone bad. (e.g. truncate the copied string and tell you
if it was truncated)

~~~
bitwize
Good. Attempting to copy or concatenate a too long string is indicative of a
logic bug. If it really bothers you, use set_constraint_handler_s() (or
_set_invalid_parameter_handler in Microsoft's CRT) to change the behavior to
something you can recover from.

