Proposal for C23: Improve type generic programming [pdf] (open-std.org)
57 points by HexDecOctBin 6 days ago | 48 comments

Type inference is great and undeserving of the "bah humbug" type comments. Not a fan of it in function return types but that's easy to avoid in code review or linting (both of which you need plenty in C).

The closure proposal is much more interesting. I skimmed the discussion and it points out many of the benefits and some syntax proposals, but I'm wondering about memory management.

In C++ a lambda may not be convertible to a function pointer if the captured context is non zero sized, and if the captured context is larger than the size of a pointer it must use some kind of dynamic allocation. C++ has scoped destructors, making implementation of that relatively straightforward (free the context when the lambda goes out of scope), but what would C require? Defer statements? Require a free() without matching call to malloc?

Closures in C are still possible using void* as a user context, which is common both for closure semantics in libraries and for implementing closures in languages transpiled to or interpreted in C. The memory management is explicit and the types remain explicit (except for the closed-over context), albeit a bit verbose.
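For readers unfamiliar with the void* pattern described above, here is a minimal sketch. The names map_ints, add_ctx, add_offset and closure_demo are made up for illustration and do not come from any particular library.

```c
#include <assert.h>
#include <stddef.h>

/* Callback type: the library hands `ctx` back untouched on every call. */
typedef int (*map_fn)(int value, void *ctx);

/* Hypothetical library routine: applies `fn` to each element in place. */
static void map_ints(int *xs, size_t n, map_fn fn, void *ctx)
{
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++)
        xs[i] = fn(xs[i], ctx);
}

/* The "captured variables" of the closure, spelled out as a struct. */
struct add_ctx {
    int offset;
};

/* The closure body: recovers its typed context from the void pointer. */
static int add_offset(int value, void *ctx)
{
    const struct add_ctx *c = ctx;
    return value + c->offset;
}

/* Example use: the context lives on the caller's stack, so there is
   nothing to malloc or free. */
static int closure_demo(void)
{
    int xs[3] = {1, 2, 3};
    struct add_ctx ctx = { .offset = 10 };
    map_ints(xs, 3, add_offset, &ctx);
    return xs[0] + xs[1] + xs[2];
}
```

The caller owns the context and decides its lifetime, which is the explicit memory management the comment refers to. The cost is that the cast back from void* is unchecked.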

As much as I love functions as types and programming with them I'm confused how this is a positive feature for C, which lacks some of the semantics that other languages have making closures sensible.

> In C++ a lambda may not be convertible to a function pointer if the captured context is non zero sized, and if the captured context is larger than the size of a pointer it must use some kind of dynamic allocation

You are mixing things up here. A lambda will never require memory allocation, regardless of its size: https://godbolt.org/z/Mebn4h

std::function can't get around memory allocation for sufficiently large functor objects, but that is a matter of how libraries implement type erasure for functors and has nothing to do with the lambda expressions of the core language.

So this proposal retains the C++ feature of not allowing you to do anything with the captured context. Semantically, it's as if you created the captured variables as new function-local variables when you initialized the function call, and the wording makes it UB to modify the captured variables via a pointer.

Where the capture context lives in memory and how it is allocated and deallocated is not precisely specified by the specification, but the specification does permit it to be handled via C++-style scoped destruction. That C has no way for you to manually write this code does not preclude implementations from using that logic internally, especially since most C compilers are also C++ compilers and handling C lambdas as identically to C++ lambdas as possible would have some benefits.

In the section on lambdas, there is discussion of the "Objective-C blocks" language extension alongside C++ lambdas. They missed that the extension is to C, not Objective-C.

I'm not suggesting they standardize Clang's language extension, but I think it's worth noting that a similar feature has actually been implemented for the C language itself, rather than a higher-level one.

Given WG14's views on existing practice, it also escapes me why they haven't looked into it.

Please leave C alone. Thanks much.

Thanks to the sheer number of C programs, no compiler supporting later C standards will drop support for prior standards unless it's explicitly targeting the later standard and that standard is fundamentally breaking with earlier ones. And even then, the major players won't because they still have to support every program ever written in C.

So any changes to C at this point will not break your ability to use whichever prior C standard you happen to be interested in.

MSVC 2019 enters the bar with support for C89, C11 and C17.

Though they never actually dropped C99 support, they never had full C99 support to begin with. C has long been a second class citizen (compared to C++) under MS's development tools.

They had C99 support to the extent required by ISO C++ compliance.

The backtrack from "we don't need C anymore" is most likely driven by WSL, Azure Sphere OS, Azure RTOS, and above all probably some vocal customers with enough power to make it matter.

Even the Windows kernel team dropped the COM-based userspace drivers framework, replacing it with a C-based one.

In any case, everything from C99 that became optional in C11 isn't planned to ever be supported.

Except when you come to a new company and find out they're using the new glory all over the place and insist you do too.

Well, if you join a team you join whatever they're using. Which may not even be C. If you want control, you have to lead the team in the company or run your own company. That's part of being a "company man". You forfeit control over the kinds of things you work on and many decisions about what and how they're made unless you have the right ear or the right level of authority.

And that's why the commenter wants people to not touch C - if C doesn't change then the workplace won't change either.

I mean, you can try and hold back the tide but the world changes. Take control of your section of it if you want control, but you can't stop change. Perhaps try to become as relatively unpopular as Common Lisp so that no new standards are ever made (though the community continues to extend the language environment with things like QuickLisp and Alexandria). Or as truly unpopular as JOVIAL so that there really are no more changes.

Not everyone has the means to become their own employer. I agree with you about control and I myself am in business for this reason, but suggesting that they should simply use whatever they like doesn't address the problem.

Blowing up language is not nice even for me as a businessman - now I have to place special care about who's coding what and how, something I didn't need to do up until now (in case of C).

> Blowing up language is not nice even for me as a businessman - now I have to place special care about who's coding what and how, something I didn't need to do up until now (in case of C).

This part is fair, but I've never worked (professionally) in a language that had a BDFL who controlled its extension in a deliberate fashion. It's always been design-by-committee languages. As a consequence, since I began, we've always had coding standards that limited what portions we could use for which projects (usually with options to try out novel parts of the language/standard library with extra reviews). And this is even true with C, though the restrictions weren't as severe (working in embedded, typically mandating no/limited use of malloc/free and no recursion, direct or indirect).

Or companies will leave C behind.

Some sort of type inference would make a lot of sense to close some left-over holes from C99:

E.g. why does this work:

    struct bla_t bla = { .x=1, .y=2, .z=3 };
But when I want to assign to an existing variable a "type hint" is needed:

    bla = (struct bla_t) { .x=1, .y=2, .z=3 };
Why the (struct bla_t)? The compiler knows the type of "bla" after all.

Also this would be nice to create a zero-initialized struct value:

    const struct bla_t bla = {};
...this would basically be a C99 designated initialization without any initializers (it works as a language extension in gcc and clang, but is an error in MSVC).
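A small sketch of what works portably today, for comparison. The struct layout and the function names reassign_demo and zero_demo are invented for the example. Assignment needs a compound literal, and { 0 } is the C99-portable spelling of zero initialization.

```c
#include <assert.h>

struct bla_t { int x, y, z; };

static struct bla_t reassign_demo(void)
{
    /* Initialization: the type comes from the declaration. */
    struct bla_t bla = { .x = 1, .y = 2, .z = 3 };

    /* Assignment: a compound literal is required, so the type must be
       repeated. Members not named (.z here) are zeroed. */
    bla = (struct bla_t){ .x = 4, .y = 5 };
    assert(bla.z == 0);
    return bla;
}

static struct bla_t zero_demo(void)
{
    /* Portable in C99: the first member is set to 0 explicitly and the
       remaining members are zero-initialized implicitly. */
    const struct bla_t zero = { 0 };
    return zero;
}
```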

Second this. These are some of the most annoying syntax warts.

Another thing - that probably has no resolution - is the struct-tag thingie. Even after all those years I'm not decided if it's worth typedefing structs to remove the struct tag. Like

    typedef struct _FooBuffer FooBuffer;
    struct _FooBuffer { /* ... */ };

    // now we can declare FooBuffer variables without struct tag:
    FooBuffer foo;

    // had we just declared struct FooBuffer { ... } (without
    // the typedef), then we would have to write
    //     struct FooBuffer foo;
Like almost everybody, for a new language I would not want to have any struct tags, even though the argument from the Linux kernel coding style guide makes some sense to me ("don't typedef struct tags away because we want to see it's a struct"). Each struct keyword moves the following code 7 columns to the right, which is annoying in lines where there are 3 or more of them.

You can always set your compiler to accept only older standards...

Yes, and please leave C++ alone too, at least for 20 years.

There's a time and place for everything. Regarding the time, it is and has been the time for this since lisp emanated. But C is not the place, IMO. I don't want to be left to wonder about sizeof(anything) in C.

"Lisp Programmers know the value of everything and the cost of nothing."

C programmers ought to know the cost and sizeof everything...

I hope the committee sees it like this as well.

I also believe the feature (auto) makes sense for C++ with its verbose and lengthy types nobody but your compiler cares about (say, e.g., multimap::equal_range). For C? Get out of here..

this passes, and suddenly zig becomes a whole lot more interesting ;) (but zig also has type inference you say? Well, it's a consistent proposition, and not a tacked-on "feature".)

C has VLAs, so sizeof is a runtime operation. That's already more complex than C++, where sizeof is fixed at compile time.
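To make the run-time sizeof concrete, a minimal sketch (vla_bytes is a made-up name; it assumes a compiler with VLA support, which is optional since C11):

```c
#include <assert.h>
#include <stddef.h>

/* Returns the storage size of an n-element int VLA. */
static size_t vla_bytes(size_t n)
{
    assert(n > 0);       /* a zero-length VLA is undefined behavior */
    int vla[n];
    return sizeof vla;   /* evaluated at run time: n * sizeof(int) */
}
```

For every operand that is not a variable length array type, sizeof remains an integer constant expression.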

VLA has been "dropped" from C11 though (made optional), so it's essentially been degraded to a compiler-specific language extension.

IMHO VLAs were a "misfeature" that shouldn't have slipped into the standard in the first place.

With some slight tweaking, variably modified types are the simplest and cleanest way forward to supporting arrays/slices as function parameters.

  void foo(size_t n, char buf[n]) {
    size_t m = sizeof buf;
  }

In a saner world n and m should be equal, but in a cruel twist it doesn't do what you'd expect. I don't understand why the committee crippled the semantics here when adding VLAs and variably modified types. Never too late to fix things, though.

That's interesting. Can you elaborate on why?

The reason it doesn't work despite being syntactically correct is because the standard goes out of its way to declare that, despite appearances, buf still behaves as if declared as `char *buf`. Thus sizeof just evaluates to the size of a pointer.
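A quick way to see the adjustment in action (param_size is a hypothetical name, and GCC and Clang typically emit a warning for this sizeof):

```c
#include <assert.h>
#include <stddef.h>

/* `char buf[n]` in a parameter list is adjusted to `char *buf`,
   so sizeof reports the pointer size no matter what n is. */
static size_t param_size(size_t n, char buf[n])
{
    assert(buf != NULL);
    (void)n;             /* n documents intent but does not affect sizeof */
    return sizeof buf;   /* sizeof(char *), not n */
}
```

Calling it with a 64-byte array and a 1-byte array gives the same result, which is why the length has to travel as a separate argument in practice.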

I think the original intent of the variably modified parameter syntax was merely to support optimization--a compiler could hypothetically prefetch n elements. Whether or why they didn't care to consider the potential security and robustness benefits of making the semantics behave like automatic storage VLAs, I have no idea. But over the years I've seen multiple proposals to rehabilitate the semantics.

> I also believe the feature (auto) makes sense for C++ with its verbose and lengthy types nobody but your compiler cares about (say, e.g., multimap::equal_range). For C? Get out of here..

It's more than convenience. auto is effectively mandatory in many cases where the return type of templates/functions is private to the implementation (lambdas being the most common example, but libraries like hana make extensive use of such private types).

I'm all for it. auto, simple lambda and closures. The syntax also looks sane.

C needs operator overloading. Something like this: https://github.com/FrozenVoid/Infodump-DB/blob/943900c2dbf2e...

No, I'd say the fact that operators always result in a primitive operation is actually a feature - everything more complex warrants a proper function.

I'd say they should probably go the other way, defining even built-in operators like:

operator+(int a, int b) __gcc_internal_int_sum(a, b)

This way, we can change (through overriding) the way these operations work (rounding, overflow, coercing, etc.), without having to define non-idiomatic functions to do so. A lot of undefined behavior can be made user-defined this way.

There should be a middle ground. The main issue with operator overloading is that every overload exists in the global scope. Thus, you get issues and surprises with unknown or competing overloads. Sometimes, a workable solution would be to have overloads restricted to a kind of named scope that the calling code can enter explicitly. Maybe something like operator_context(my_vector_ops) { ... } which makes all operators defined in the context of my_vector_ops apply to expressions within the block, but nowhere else.

Math libraries are an exception. You want this:

    d + A*B*(c*(a*b))
Not this:


You can get this with clang's vector extension (https://clang.llvm.org/docs/LanguageExtensions.html#vectors-...), and this would make a lot of sense to add to the C standard.
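A rough sketch of what that looks like. It assumes the GCC-style vector_size attribute, which both GCC and Clang accept, rather than Clang's own ext_vector_type spelling. float4 and madd4 are names made up for the example. The operators are element-wise, so this covers vector arithmetic but not matrix products.

```c
#include <assert.h>

/* Four packed floats (16 bytes); a compiler extension, not standard C. */
typedef float float4 __attribute__((vector_size(16)));

/* d + a*b, computed per element with ordinary infix operators. */
static float4 madd4(float4 a, float4 b, float4 d)
{
    return d + a * b;
}
```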

IMHO vector math is about the only place where operator overloading is acceptable in C++.

It'd be cool to have the Haskell feature of putting backticks around a binary function so you can use it infix, e.g.:

    ((A `mmul` B) `mvmul` (a `vdot` b) * c) `vadd` d
Still not as nice as operators, but not terrible either.

so are expression grammar libraries, etc

Consider arbitrary precision, rationals, interval arithmetic, countless algebraic types, matrices, vectors with all of them requiring typing and composing long functions versus intuitive formula:


The following style is much better:

   v3 r = Add_v3(v, w);
   v3 s = Scale_v3(r, 1.0);
You can definitely write a little 3D graphics backend this way.

C's lack of operator overloading can be annoying for some tasks, however I still like to use C for almost everything to this day, because of the lightness of going through old code and seeing instantly what happens - never have to run any type resolution in my head.

And importantly, _knowing_ that I don't have to do this. Because I might almost never have to do this in cleanly written C++ code, but how do I know?

What you actually want is function overloading, which C can't support since C functions map directly to elf symbols, and all assembly functions are vararg with all arguments being WORDs. (That's how it is in elf, *nix and x64, don't know much about other platforms)

You don't need a vtable for overloading; that is just runtime polymorphism ("ad hoc polymorphism").

Function overloading can be done at compile time, e.g., parametrically:

    add(3.0, 1.0) -> compile to add_float32_float32
    add(1, 2)  -> compile to add_int32_int32

As I said, `add` function in C maps to a symbol named `add` in assembly. You're mangling the function name, which is what C++ does.

Sure, you're right.

I guess we'd need C to have a special syntax to compile in something like a vtable

add_f32_f32(x, y) calls add(MAGIC_NO, x, y)

and add has a big switch on MAGIC_NO for selecting implementation...

You'd need a keyword for this, "multi int add(int a, int b) ..."

such complications where this is what C++ gives you natively

_Generic macros provide functionality similar to function overloading (which in C++ involves name mangling, something that will never be accepted for C): https://en.cppreference.com/w/c/language/generic
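As a minimal illustration of the C11 _Generic approach (add_int, add_double and the add macro are example names, not anything standardized):

```c
#include <assert.h>

/* Each "overload" is an ordinary function with its own unique symbol. */
static int    add_int(int a, int b)          { return a + b; }
static double add_double(double a, double b) { return a + b; }

/* Select an implementation from the static type of the first argument.
   The choice is made at compile time and no name mangling is involved. */
#define add(a, b) _Generic((a), \
        int:    add_int,        \
        double: add_double      \
    )((a), (b))
```

With this, add(1, 2) expands to a call of add_int and add(1.5, 2.0) to a call of add_double. A first argument of any other type is a compile error, because there is no default association.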

Clang has C function overloading. I use it in a reasonable sized code base. It really cleans up the source code, you end up skipping a lot of the type name or package name parts of function names which accumulate in larger projects leaving a much cleaner source to read. 100% recommend.

Not necessarily, they could standardize it as a syntax like _Generic, to make it so that the user has to choose a unique name for each function, but then they can all be called by a common name.

In C++, name mangling is used to solve this issue
