Modern C for Fedora (and the World) (lwn.net)
110 points by signa11 5 months ago | hide | past | favorite | 87 comments



> C++ has already adopted type inference, where the compiler figures out what the appropriate type for the variable should be from how it is used, in this case. There are schemes afoot to add a similar feature to C, but type inference is incompatible with implicit int.

Please no. In C++ this only makes sense when you have huge templated types such as iterators etc. Or inside templates where the types are not fixed.

Neither exist in C. It is just a distraction with no benefits in C.


I can think of many counterexamples, e.g.:

auto v = MAX(a, b); where max infers the result type from the input types using generics or compiler intrinsics.

auto v = (struct foo){.a = 1}; where one wouldn’t want to repeat the type as in C++ auto v = make_unique<foo>(1);


an auto cast based on the destination type might be cleaner, a la

  struct foo v = (auto) {.a = 1 };


C has type generic math (which came first) and something called generic selection via _Generic (which came later, and provides a model for implementing the type generic math stuff).

I suspect that a C++-like auto could be of benefit in conjunction with that stuff.
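A minimal sketch of how generic selection can pick the result type of a MAX-style macro, assuming only int and double results need to be handled. The names max_int, max_double and TYPE_NAME are made up for illustration, and this MAX is not any real library's macro:

```c
/* Hypothetical helpers, one per supported result type. */
static inline int max_int(int a, int b) { return a > b ? a : b; }
static inline double max_double(double a, double b) { return a > b ? a : b; }

/* (a) + (b) is never evaluated here; it only supplies the type that the
   usual arithmetic conversions give the two operands. */
#define MAX(a, b) _Generic((a) + (b), \
    int: max_int,                     \
    double: max_double)((a), (b))

/* Reports the static type of an expression, for illustration only. */
#define TYPE_NAME(x) _Generic((x), \
    int: "int",                    \
    double: "double",              \
    default: "other")
```

With this, MAX(1, 2.5) selects max_double, so a C23 declaration such as auto v = MAX(1, 2.5); would presumably make v a double. Other operand types (long, float, ...) would need their own entries.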


C23 has that auto, although the semantics are more limiting.


In fact C always had type generic math in the form of the normal math operators.


Right: so this would make sense:

  char x = 1, y = 2;
  auto var = x * y;   // var inferred as int, due to promoted type
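The promotion can be checked without C23 at all. A small sketch (the function name is made up for illustration) that asks for the size of the product's type:

```c
#include <stddef.h>

/* Returns the size of the type of x * y for two char operands. */
size_t size_of_char_product(void) {
    char x = 1, y = 2;
    /* sizeof does not evaluate x * y; it only looks at the expression's
       type, which is int after the integer promotions. */
    return sizeof(x * y);
}
```

The result should equal sizeof(int) rather than sizeof(char), which is the type a C23 auto would pick up.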


It is already there in C23.


And it's broken! Consider:

auto i = 0.0;

i is a double in C23 but an int in earlier versions. Please don't use C23.
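To make the difference observable, here is a sketch that spells out both readings with explicit types, so it compiles the same way in any dialect. The function names are hypothetical, and a fractional initializer is assumed because 0.0 converts to 0 without visible loss:

```c
/* Pre-C23 reading of `auto i = <init>;`: implicit int, so the
   initializer is converted to int and the fraction is dropped. */
int pre_c23_reading(double init) {
    int i = init;
    return i;
}

/* C23 reading of the same line: the type is inferred from the
   initializer, which is a double here. */
double c23_reading(double init) {
    double i = init;
    return i;
}
```

For an initializer of 0.75, the first reading yields 0 and the second keeps 0.75.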


Seems like a simple fix, just do a global search & replace of 'auto' with 'int'.

In the rare case where someone used 'auto type foo' you'll get compiler errors and can easily fix those.

I have never actually seen auto used in a C code base.


Yes, auto is rarely used, and it should not be used in the first place. Still, I don't like silently breaking old code. You can't just do a search and replace. This is valid:

auto float f;

so a search and replace would break that code.


Search-replace has its problems:

    auto tautology = sizeof("automatons are on auto");
but int float f; is not a big one: The compiler will flag it as a syntax error, and you fix it as it comes.


I agree. The big problem here is that the meaning of a rare but valid piece of code has changed and almost no one knows about it. We don't break user space.


> We don't break user space.

That's one strategy. It has consequences, such as keeping around K&R declarations and implicit pointer/int conversion for decades after they started being warned against. Other strategies may have merit.


I don't think either of those examples is the same.

The problem isn't merely gaining or losing features or syntax, it's changing the meaning and behavior of an existing thing.

Code is the definitive source of truth for the thing someone devised.

It's not just a thing, it's the reference for how to implement the thing.

If you have a recipe from 100 years ago that refers to ingredients and tools that are no longer available, or are now called something else, or need to be translated into current equivalents, those are all no problem.

But if the old recipe uses a term that we do still have, but means something different now than it did when written, that's a problem. Now you're breaking the very concept of writing and communicating and documenting. That's not a sane thing to do intentionally. Think scientific and industrial processes not just cakes.

In this particular case, since the feature happens to be presumed to be rarely used, it might be fairly easily addressed by just having the compiler issue a full stopping error for any use of auto by default, including the reason why so that the user doesn't just blindly add the flag to allow it.

In the future, when someone tries to build some code, it will fail to build, and that is 1000% preferable to successfully building the wrong output.


It has not been valid C for almost 25 years, and compilers have warned about it for a long time. I doubt this is an issue at all.


If anyone actually ever wrote that, pre-C23, then more likely than not they intended for i to be a double. Otherwise, why write 0.0 when 0 would suffice? It could go either way: Switching to C23 could cause a bug, or it could fix a bug. Or maybe fix one bug and cause another.


I actually fixed this exact bug in a C library recently. It used auto, most likely because the author was used to using it in C++; it compiled and went through. Led to very spicy behaviour in C.


I've done a lot of C over the last 15 years, and have never seen "auto" used in any codebase. It's an ancient way to declare a local variable from the early "typeless" days of C.


You are right, it is bad to use typeless variables, and it's also bad to use auto in C23 to infer the type. So this code should never be written, but it's still bad to break existing code.


I suspect that nobody outside of C conformance suites or compiler unit tests has used auto in C, at least in the last 30 years.


0.0 is a double literal, how is that broken?


Because it changes the meaning of existing code. That's a big no-no.


Maybe people could grep for "auto" in their codebase, see it's not there, and then change to C23?

Though realistically if this looks like a problem it's a trivial compiler diagnostic. "Would this thing have a different type in C23 to earlier? Emit a warning, promotable to error".

The whole "changes meaning" panic is rather overdone and mostly acts to stop C improving over time.


You can't make a compiler diagnostic, because it was valid before and after C23; the code just means different things. In fact, just using auto without a type was bad practice in earlier versions and now it can be considered a feature, so it's less likely to give you a warning now than in the past.


Compilers already have diagnostics for these exact things ("because it was valid before and after C23, the code just means different things"), e.g. all the -Wc99-compat warnings


The obvious diagnostic would be "if auto and < C23, emit warning". That can definitely be implemented. Choosing the name would take longer than writing the code.

Lots of stuff is valid and gets warnings on request, like `if (a = b)`, or anything where people get precedence or aliasing rules wrong often enough that someone bothered to patch the compiler.


That is if you know about this issue. Very few people do. At some point C23 may become a default version of C in some compilers and someone who has been building code without specifying what version of C they use, will silently get a bug. That is the problem. There is no way for a C compiler to look at the code and determine what version it was written for.


Implicit int was removed in C99. Since then most compilers warn about this in default compilation mode.


Chesters' (green) field, where Chesterton's fence is. :)


Or when the type is hidden (lambdas). Or to avoid needless redundancy (e.g. auto foo = std::make_unique<Foo>()). And probably some more.


There is the "Almost Always Auto" idea in C++, but AFAICS it's (fortunately IMO) not catching on.


There is some discussion there about whether it's a good idea to enable -Werror: make all warnings errors (which breaks builds for users that use a newer compiler that introduces new warnings). Someone said, how about doing that for version control checkouts. But downstream users use version control checkouts too.

I have a solution for that kind of thing: ./configure --maintainer.

If the building user identifies as a maintainer, then some things can happen differently. Stricter compiler options may be one. Another thing you might do in maintainer mode is update generated files (e.g. Yacc parsers) rather than using the canned, shipped code.

In TXR's ./configure, maintainer mode enables parallel make, which is suppressed otherwise. There is a way to enable it without requesting maintainer mode. Parallel make asks for problems because it can introduce race conditions into build steps, causing builds to intermittently fail. So in keeping with the theme of not breaking users, we disable it --- unless they announce themselves to be developing users via maintainer mode.


A few things to note about this. Yes, feel free to enable -Werror, but warnings are implementation-specific, so there is no way to write portable C code that is warning-free. A compiler can issue a warning for anything, like you using a computer, or that you should mind your knees when walking down stairs.

There is a worrying trend around -Werror: people turn it on, and then when compilers add new warnings, builds fail and developers complain to the compiler developers. This is BAD. We should encourage compiler developers to find as many warnings as possible, and then users should turn off the ones they don't like. Just because a warning is stupid to you doesn't mean that it won't catch a bug for someone else. Compiler developers are responsible for correctly compiling your code, not for keeping your code warning-free.

I have spoken to several compiler developers who don't want to add new valid warnings because of the vitriol they will receive from developers.


Introduce an error epoch and let maintainers advance it at whatever pace they'd like. You could even select a more recent epoch for your local builds and keep your CI builds on an earlier epoch until you've qualified against the most recent one.


> people turn it on and then when compilers add new warnings, builds fail and developers complain to the compiler developers.

That's a strange reply to a comment which specifically acknowledges the issue and describes one solution.


> The page you have tried to view (Modern C for Fedora (and the world)) is currently available to LWN subscribers only. Reader subscriptions are a necessary way to fund the continued existence of LWN and the quality of its content.

Not sure if this somehow changed at LWN because I thought I'd been able to look at this article before.


> Inconsistent return statements

Returning the wrong data type has been an error with Clang since the earliest days. Why does GCC still allow that?


It doesn't:

  int main(void) {
    struct {
      int a;
    } s;
    return;
    return &s;
    return s;
  }

  $ gcc test.c
  test.c: In function ‘main’:
  test.c:5:3: warning: ‘return’ with no value, in function returning non-void
      5 |   return;
        |   ^~~~~~
  test.c:1:5: note: declared here
      1 | int main(void) {
        |     ^~~~
  test.c:6:10: warning: returning ‘struct <anonymous> *’ from a function with return type ‘int’ makes integer from pointer without a cast [-Wint-conversion]
      6 |   return &s;
        |          ^~
  test.c:7:10: error: incompatible types when returning type ‘struct <anonymous>’ but ‘int’ was expected
      7 |   return s;
        |          ^


Is there some place where I can see all the packages that need to be fixed and raise a pull request for any package? Maybe, I can help with some fixes during the holiday season.



Thanks. The link seems to list relevant links and repos. I'll see how I can help now.


https://src.fedoraproject.org/rpms/nbdkit (replace "nbdkit" with the package name)


Thanks. This helps even more. I need not go through all the packages. I'll get a fedora account now :)


Besides fixing bad C code: how many of those packages are still properly maintained? The Ruby ones don't seem to be. Fedora and other distros likely have to build their packages with patches.


Why not just change the defaults and then add some bypass flags to the compiler to allow old code to compile without hard errors?


Read the article comments for some of the pedantry on why compiling C has been painful forever.

There is no keeping people happy.


I did read the article. Unless I missed something, I didn't see anything about why they couldn't add some mechanism to bypass these warnings for old files.


"They" (e.g. GCC maintainers) do. You have dialect options like -std=c90, and fine-grained control over diagnostics: you can disable or enable individual warnings, or have them treated as errors.

GCC compiles C with a certain default dialect. It used to be gnu89 for the longest time, then gnu99 for a while. Now it is gnu11, if not higher. This default dialect is not ISO C; it's a GNU dialect.

So just to have your code processed as ISO C, you have to use a non-default option. Some GNU extensions are still recognized (rather than diagnosed as syntax errors) so if you wish not to accidentally use those, you need -pedantic also.

In Makefiles, you can tweak compiler options for individual files. E.g. with target-specific assignments in GNU Make.

  # don't warn about conversions in old_parser.c
  old_parser.o: CFLAGS += -Wno-conversion


The standard version is a TU-wide flag which doesn't help with headers. And it seems like the proposal here is to change the standard so that those legacy constructs are outright rejected. So my question was, if you're going to change the default in the standard, why not provide a per-header escape hatch for old code?


I'm not sure which meaning/nuance of standard you are referring to, but the ISO C standard doesn't specify any flags mechanism, translation-unit-wide or otherwise.

The GCC implementation (and likely Clang also) supports altering diagnostics over sections of a translation unit.

https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Pragmas.html
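A minimal sketch of what that looks like, assuming GCC or Clang (other compilers may ignore or complain about the pragma). The function legacy_sum is a made-up stand-in for old code, and -Wunused-variable is just one example of a warning to silence:

```c
/* Save the current diagnostic state, then silence one warning
   for the legacy section only. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable"

/* Hypothetical legacy function with a leftover local. */
int legacy_sum(int a, int b) {
    int scratch;   /* would trigger -Wunused-variable under -Wall */
    return a + b;
}

/* Restore the previous state: code below is checked as usual. */
#pragma GCC diagnostic pop
```

The push/pop pair can also be placed around an #include of an old header, which gets close to the per-header escape hatch asked about above.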


Or use TCC or CParser for C89/C99 and much faster compiling times.


gnu17 on latest.


I wonder if this item, "Assignments between incompatible pointer types" includes what I understood is (was?) actually recommended practice in C - `int *c = malloc(sizeof(int))`. Has the preferred style changed, or is this only a problem for non-void pointer types (i.e. `float *p = &some_int`) ?


You are right. void pointers can convert to any pointer type in C. That's why you don't need to cast malloc in C.


GCC will not warn with implicit void pointer conversions, I believe that’s explicitly allowed. It’s only when you have incompatible types like ‘float *y = (int *)x;’


I really prefer sizeof(*c) in that expression to protect against a badly-sized allocation if c's type changes.
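Both points in one sketch (make_counter is a hypothetical name): no cast on malloc's result, and the size taken from the pointer itself rather than from a repeated type name:

```c
#include <stdlib.h>

/* Allocates one int and initializes it; returns NULL on failure. */
int *make_counter(int start) {
    /* void * converts implicitly to int *, so no cast is needed.
       sizeof(*c) follows c's type if the declaration ever changes. */
    int *c = malloc(sizeof(*c));
    if (c != NULL)
        *c = start;
    return c;
}
```

The caller owns the result and is expected to free() it.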


One K&R C construct can sneak into new C by accident. A lost C++ developer might wander into a C program and write:

  void fun() {  }
which is fully declared in C++, but in C it is an old-style function definition that doesn't declare anything about its arguments.


Not exactly. As of C17, the function definition you wrote implies that the function has no arguments, just like in C++. However, a declaration that is not part of a definition, like

  void fun();
(which you typically find in a header file) indeed does not say anything about the function's arguments. You have to write

  void fun(void);
to specify that the function takes no arguments. For consistency, you can (and probably should) also use that style in the function's definition, but you don't have to.

Anyway, C23 will align C's behavior with C++: void fun() will mean the same thing as void fun(void), whether as part of a function definition or not.


I have N3096.pdf (April Fool's 2023 draft) perpetually open; will check it.


In C23 it has the same semantics as in C++.


The C Programming Language

Does someone know a source of the 2nd Edition “ANSI C” with a good print quality? The ones sold by Amazon seem to be scans and not based on digital typesetting, which is mentioned in the first pages.


I have one in HTML.


> first edition of The C Programming Language that many of us still have on our bookshelves

I suspect that plenty of people with the second edition on their bookshelves have retired already.


Still at least 10 years to go.


Interesting - normally lwn links posted here are shared with access (a promo link) - this one appears to be behind a paywall?


weird—it wasn't that way initially when I loaded the page earlier today, but refreshing the tab shows that it is indeed paywalled now. the earlier, unpaywalled version is available on archive.org: https://web.archive.org/web/20231222032328/https://lwn.net/A...


> Part of the problem, perhaps, is that it appears to have fallen on Fedora and GCC developers to make these fixes.

Well, they were playing with fire, and now are complaining.


Huh? Who's been playing with fire? Fedora ships 15k packages. They don't develop them. They're trying to modernize hundreds of packages on a volunteer basis.


Well... "it appears to have fallen on Fedora and GCC developers" suggests they are forced into doing this. But it's the reverse: they decided this is a sensible goal. OK, they are free to do that, but painting it as if they are unfairly left alone in their plight is framing it rather strongly.


Well, Clang also did this for Clang 16 and tried to ship it with 16. Most of these were K&R C constructs that were deprecated in C89. C23 kinda removes them and in one case even repurposes the syntax. So on the GCC and Clang side it was about C23 compliance, and on the Fedora and Gentoo side (Gentoo also helped) it was forced by compiler defaults.


Futile exercise. Compilers should follow the standard, even if it is ugly or dated.


They do. That's what the whole -std=c99 flag asks for. You either tell the compiler what language the text files are supposed to contain or it takes a guess.


Have you actually read the article? They want the default behaviour to deviate from _all_ known standards.


I haven't read the article, but following this effort from the compiler side, I do not think this is the case. Also there is exactly one ISO C standard. Newer versions supersede older ones from a legal point of view. The current one is ISO C17 which will soon be replaced by ISO C23 (officially next year). Older C standards are historical documents and code targeting them is not conforming to ISO C. The C committee takes extreme care not to break existing code except for very good reasons. So code that can only be compiled assuming older versions of ISO C will likely use questionable constructs which were removed in later versions for good reasons.


The current standard is C23.


Have you actually read the article? They want the default behaviour to deviate from _all_ known standards.


I have. Are you sure you're up to date on what the standards actually say?


Which standard? C has 6


Have you actually read the article? They want the default behaviour to deviate from _all_ known standards.


The paywalled article?


Not paywalled, I can freely access it.


It is paywalled, it's accessible only with a subscription.


Horrifying how old C was basically a dynamically typed language, but without type checks even at runtime.

I would love to help modernize these tools. Is the process to submit a merge request still based on mail lists? Can I just open a GitHub Pull Request somewhere if I decide to fix some package? Is there a list of packages that no one has picked up yet?

I'm sure you could get lots of people like me willing to help if you just made contributions a bit easier to make.


The term is weakly typed. Weak vs. strong and dynamic vs. static are orthogonal.


C was mostly a statically weakly typed language. I can't think of anything else in that space


The process depends on the package. If there is an active upstream project, a github pull request or similar would often do. It might be a good idea to coordinate with the distributions involved. For gentoo, a tracking bug seems to be here:

https://bugs.gentoo.org/show_bug.cgi?id=gcc-14



