
C String Creation - fcambus
http://maxsi.org/coding/c-string-creation.html
======
jws
Good news in the article: apparently asprintf is scheduled for the next POSIX
standard. I wonder how one sees what is in the next POSIX standard; it would
inform my choices of what I use. I'll use something that works on all my
platforms if it is headed to a standard, but usually not if it may be headed
to oblivion.

Also open_memstream is POSIX 2008, now if it would just get into OS X…
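
For anyone who hasn't used it, here is a minimal sketch of building a string of unknown length with open_memstream, assuming a platform that provides it. The helper name join_words is made up for illustration and is not from the article.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: joins `count` words with `sep` into a newly
 * allocated string. Returns NULL on failure; the caller frees the result. */
char *join_words(const char *const *words, size_t count, const char *sep)
{
    assert(words != NULL && sep != NULL);

    char *buf = NULL;
    size_t len = 0;
    FILE *fp = open_memstream(&buf, &len);
    if (!fp)
        return NULL;

    int failed = 0;
    for (size_t i = 0; i < count; i++) {
        /* Write the separator before every word except the first. */
        if (fprintf(fp, "%s%s", i ? sep : "", words[i]) < 0) {
            failed = 1;
            break;
        }
    }

    /* fclose flushes the stream and finalizes buf and len. */
    if (fclose(fp) != 0)
        failed = 1;
    if (failed) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

The stream grows the buffer as needed, so the caller never has to guess a size up front; the only cleanup is a single free() on the returned pointer.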

~~~
audidude
If you want this today on all (C89) platforms, GLib ships an embedded copy of
gnulib and runs virtually everywhere.

~~~
sortie
gnulib is atrocious beyond belief though. I hear GLib tends to abort your
process on OOM, though I haven't done my research on this library, so I would
be careful using these to develop reliable software.

------
jhallenworld
Should we not be using wchar_t strings in modern C?

    
    
        #include <locale.h>
        #include <stdio.h>
        #include <wchar.h>

        int main(int argc, char *argv[]) {
            wchar_t buf[100];
            /* Use the environment's locale so multibyte input decodes correctly */
            setlocale(LC_ALL, "");
            wprintf(L"Hello, world!\ntype something>");
            if (fgetws(buf, 100, stdin))
                wprintf(L"You typed '%ls'\n", buf);
            if (argc > 1) {
                const char *s = argv[1];
                /* Convert char string to wchar_t string */
                size_t len = mbsrtowcs(buf, &s, 99, NULL);
                if (len != (size_t)-1) {
                    buf[len] = 0;
                    wprintf(L"argv[1] is '%ls'\n", buf);
                }
            }
            return 0;
        }
    

It's a pain, but the advantage is access to iswalpha() and friends.
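
As a rough illustration of what iswalpha() buys you, here is a small sketch that counts alphabetic characters in a wide string. The function name count_alpha is hypothetical, and the result depends on the current locale.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: counts the alphabetic characters in a wide string.
 * The classification depends on the current locale (see setlocale). */
size_t count_alpha(const wchar_t *s)
{
    assert(s != NULL);

    size_t n = 0;
    for (; *s != L'\0'; s++) {
        if (iswalpha((wint_t)*s))
            n++;
    }
    return n;
}
```

Because each wchar_t element is a whole character on platforms with a 32-bit wchar_t, the loop can classify one element at a time without decoding multibyte sequences by hand.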

~~~
apaprocki
"Modern" C is char16_t and char32_t. The old wchar_t type has many issues. You
can read more here:
[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1286.pdf](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1286.pdf)

~~~
sortie
char16_t and char32_t are useless. The C standard declares functions in
<uchar.h> for converting them to and from char, but not wchar_t. The
conversion to char may be lossy depending on the platform. No other interfaces
use those types. There's no portably lossless path for converting them to and
from wchar_t.

~~~
apaprocki
The point is that wchar_t should be removed entirely. Assume it doesn't exist
anymore and use char16_t and char32_t everywhere.

~~~
sortie
This is entirely undesirable. First of all, char16_t and char32_t are kinda
useless as there are no standard interfaces using them, and there are no
conversion functions to and from wchar_t.

Secondly, no, you're asking for a massive addition of 2 new versions for every
interface that mentions wchar_t. That's a huge addition to standard libraries.
That's error prone and bloats things up. Then additionally you're asking for a
rewrite of all software using wchar_t. And until everything is transitioned,
which isn't going to happen, the standard libraries will be much larger.

The solution is rather to embrace wchar_t and fix it. All sensible and modern
platforms, which is a premise of this article on modern POSIX functions, have
a 32-bit wchar_t type. That's excellent. It's only Windows that, due to
historical short-sightedness, has a 16-bit wchar_t. But writing portable C
for native Windows is a losing game; the winning move is not to play. (Do see
midipix, which is upcoming and will provide a new POSIX environment for
Windows with musl and 32-bit wchar_t.) In fact, 16-bit wchar_t violates the C
standard. The moment you give up broken platforms with 16-bit wchar_t,
wchar_t works as intended, and this is a non-problem. Embracing char16_t and
char32_t is a worse problem and isn't solving anything.

------
hthh
This is an interesting article, but the "Portability" comments could be a lot
more useful: strndup and open_memstream are both "POSIX 2008", but strndup can
be used on OS X while open_memstream cannot.

~~~
to3m
You'd probably be able to duplicate open_memstream on OS X using funopen.

~~~
btrask
Ran into this recently. It's not open_memstream but same idea:
[https://github.com/NimbusKit/memorymapping](https://github.com/NimbusKit/memorymapping)

------
rumcajz
This is advice to use C in a way that resembles higher level languages. But
if you want to do so, why not simply use a higher level language?

The power of C -- which distinguishes it from most other languages -- is the
ability to allocate almost everything statically.

In fact, the older I get, the more I appreciate the pre-Algol 60 way of
allocating stack frames statically.

~~~
sortie
Whether C is appropriate depends heavily on the project and context.
Higher level languages offer a lot and should be used when appropriate.

But when C is appropriate, and these problems arise, which they will in any C
codebase of appreciable complexity, these string creation interfaces are
waiting for you, and will help you write correct code. It's usually a
worthwhile effort to reconstruct higher level abstractions in C, in a good
manner, for the same reasons you use them in higher level languages.

Note that statically allocating everything is hardly always possible. See the
distinction between bounded and unbounded in the article, the unbounded case
is really common.
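
To make the bounded/unbounded distinction concrete, here is a rough sketch using only standard snprintf. Both function names are invented for this example and the formats are arbitrary.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Bounded case: the output has a known maximum size, so a caller-supplied
 * (possibly static) buffer works. Returns 0 on success, -1 if formatting
 * failed or the result did not fit. */
int format_point_bounded(char *dst, size_t size, int x, int y)
{
    assert(dst != NULL);
    int n = snprintf(dst, size, "(%d, %d)", x, y);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Unbounded case: the input length is not known in advance, so measure
 * first, then allocate exactly what is needed.
 * Returns NULL on failure; the caller frees the result. */
char *greeting_unbounded(const char *name)
{
    assert(name != NULL);
    int n = snprintf(NULL, 0, "Hello, %s!", name);
    if (n < 0)
        return NULL;

    char *out = malloc((size_t)n + 1);
    if (!out)
        return NULL;
    snprintf(out, (size_t)n + 1, "Hello, %s!", name);
    return out;
}
```

The first function can live entirely on static or stack storage; the second cannot, because no fixed buffer is large enough for every possible name.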

------
kevin_thibedeau
I see too much damage caused by strncpy() to ever recommend it for use. Code
that is blissfully unaware of the non-guaranteed NUL or that repetitively does
extra work to guarantee a NUL. Use strlcpy() if available or reimplement it.
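
If you do need to reimplement it, something along these lines should be close. This is a sketch under the name my_strlcpy (hypothetical, to avoid clashing with a system strlcpy), following the documented BSD semantics rather than copying any particular implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy: copies at most size - 1 bytes and
 * always NUL-terminates when size > 0. Returns strlen(src), so a result
 * >= size means the copy was truncated. */
size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    assert(src != NULL);

    size_t srclen = strlen(src);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

Truncation is detected by comparing the return value against the buffer size, for example `if (my_strlcpy(buf, s, sizeof buf) >= sizeof buf) { /* truncated */ }`.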

~~~
sortie
As in the article, strncpy has valid uses, but it's widely misunderstood due
to the poor name. strlcpy is what the name suggests. People are also surprised
by the zero padding.
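
A small sketch of the kind of use strncpy is suited to: filling a fixed-width field that is NUL-padded but not necessarily NUL-terminated. The field width and function name here are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fixed-width record field: not a C string. It is NUL-padded
 * when the name is shorter than the field and unterminated when the name
 * fills it completely. */
#define NAME_FIELD 8

void set_name_field(char field[NAME_FIELD], const char *name)
{
    assert(field != NULL && name != NULL);
    /* Copies at most NAME_FIELD bytes and fills the remainder with NULs. */
    strncpy(field, name, NAME_FIELD);
}
```

The padding guarantees that no stale bytes from a previous value remain in the field, which matters when the record is written to disk or compared with memcmp. Reading the field back as a string needs an explicit length limit, such as `printf("%.*s", NAME_FIELD, field)`.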

