I don't know anything about the author's code quality, but in my experience the people who complain the most about static checking and other guard rails are the ones who need them the most: in their hubris of believing they won't make those kinds of errors, they make the most egregious forms of those errors. They're smart enough to take pot shots at the possible downsides of the process, but not smart enough (or self-aware enough) to see their own fallibility.
I remember an argument with an engineer about static code analysis years ago. Without a trace of irony, he honestly said "I know it's high quality code. I wrote it. What could a program tell me about it that I don't already know?" You can guess how much of a nightmare his code was to work with.
This isn't a comment on the original article or the author at all. Just my experience with a certain type of engineer.
There must be a corollary for programmers who have their preferred set of static analysis tools, but bristle in the same manner at new static analysis tools, or other people's static analysis tools.
I'll give a legitimate example I've had with Python and Mypy. The codebase I work on uses static typing extensively, and largely it's great. However, Mypy... well, it fails open badly in some obviously incorrect cases. The problem is that the teams are by now socialized to trust Mypy and get surprised when very obvious things fail in production because they thought the tool would catch them.
This is the real risk of static analysis IMO, putting undue trust in the tool(s).
hahaha, my inclination is almost always the exact opposite. Half the things I wrote a year ago are shit and need to be rewritten and I know it and so do all of my users
In my experience, the people who develop static analysis tools and add warnings to compilers are aware of the problems. They know that turning on all warnings is the path of insanity.
It’s a knob that you turn. You get useful information as you start to raise the warning level, but as you increase it higher and higher, you start to spend more and more time managing and silencing warnings at specific locations in your code, or dithering about how to handle some particular warning that does not (or even can not) manifest as actual incorrect program behavior.
I’m the first person to enable -Wall, -Wextra and a few others, and I even turn on -Werror religiously during development. But there are some super noisy warnings around. Static analysis tools are wonderful but they usually contain rules that are unreasonably picky, for most people most of the time. Usually these rules are off by default.
I read this article not as complaining about static checking, but as a warning that a rule that compilation must not produce any warnings can be counterproductive if many of the reported potential problems are red herrings.
As an extreme example for C, if you had a compiler that issued a warning for every potential int overflow, I think every programmer would ignore those warnings because it would report a warning on such statements as
int foo = bar + 1;
and preventing that requires one to write
int foo;
if(bar != INT_MAX) {
    foo = bar + 1;
} else {
    ???
}
Even ignoring that figuring out what to write for ??? can take a lot of time, that blows up the code. Macros can make the source smaller, but your binary will still be larger and slower.
In better languages, the compiler will write that code for you, and eliminate it whenever it can prove that the condition will never be hit.
This is one of the philosophical differences between C and other languages that makes everyone consider C unsafe. A legacy language that should no longer be used, because all of its users are perpetually incentivised to take shortcuts precisely like the one you've just described.
Most programming languages are not expressive enough to reliably eliminate redundant overflow checks (or overflows); afaik doing so in a general way requires something in the ballpark of refinement/dependent types.
In which case the compiler will emit the overflow or bounds checks, which is fine.
If the developer is certain that a specific add operation should use unchecked add for performance, then they can override the behaviour as a spot micro-optimisation. That then becomes a "self documenting" thing, easily found by a simple keyword search.
To give you an idea of just how rarely this is actually needed or done, there are less than a thousand uses of "unchecked_add()" in all of the Rust code on GitHub!
If you don't know what ??? should be, your code's going to misbehave in production - at best crashing, at worst causing data corruption or an exploitable vulnerability.
If "???" is just "foo = INT_MIN;" clang will optimise it all down to a single addition (see https://godbolt.org/z/K1ee8h3TM). If it's always "foo = INT_MIN;" across all of your code you can just use "-fwrapv" to override the C standard and define signed addition as two's complement wrapping.
GCC will ignore the possibility of overflow in signed addition if you don't do anything special (e.g. https://godbolt.org/z/Pc76c5E9z).
You can use "-ftrapv" (or "-fsanitize-undefined-trap-on-error -fsanitize=signed-integer-overflow") to make bigger, slower, safer code that fails fast on unexpected overflows across the board.
You can do the rather unergonomic "__builtin_add_overflow(bar, 1, &foo);" to tell the (GCC or clang, at least) compiler "???" is two's complement wrapping. You can do the cheeky "int foo = bar + 1L;", which is non-portable but predictable on the same compiler and platform, to do the same thing. Combine these with "-ftrapv" and you'll get code that short-circuit blows up for unchecked signed overflow, with the opportunity to produce faster code where you either are damn sure it can't overflow or where you want two's complement wrapping on overflow.
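To make that concrete, here's a minimal sketch of the builtin-based version (my example, not anything from the article; the abort() is just one possible answer for "???"):

#include <stdio.h>
#include <stdlib.h>

/* __builtin_add_overflow (GCC and clang) performs the two's complement
   wrapping add and returns true if the mathematical result didn't fit. */
int increment_or_die(int bar)
{
    int foo;
    if (__builtin_add_overflow(bar, 1, &foo)) {
        fprintf(stderr, "overflow in increment_or_die\n");
        abort();   /* one possible answer for "???": fail fast */
    }
    return foo;
}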
If it's not in a tight inner loop, avoiding a check is premature optimisation - if you're consuming I/O and doing lightweight processing on it, an overflow flag test on signed additions will not make a measurable performance difference, nor will it blow out your binary size unreasonably.
IMO it's not a red herring to have signed integer overflow flagged as a potential error. The error is the failure to have an answer for "???".
> If you don't know what ??? should be, your code's going to misbehave in production
Not necessarily. The programmer may know the ??? code will be dead, but it’s not realistic to expect the compiler to deduce that.
For example, the programmer may know that, in this program, baz has been checked to be in the range [0,999] ten layers up the call stack, and bar was computed from it five layers up by calling quux, and thus will be in the range [31,98355], which is fine, given that this program tests that int can hold that range at startup.
Not all functions in a program have to be free of undefined behavior in isolation to make a program free of it.
That's a great example of knowing exactly what ??? means: the program is misbehaving and should terminate.
It's dead in the current call stack, but over time programs change and that is how errors creep in. If it's a precondition that _baz_ is in a particular range for the function to behave, why not check that in an assert and save yourself or a future programmer a ton of stress trying to work out why some unrelated part of the code blew up when they made what seemed like a simple change nine layers up the call stack?
There's no runtime cost to an assert in production code compiled with NDEBUG, and it makes your intent clear to both the compiler and other humans.
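A minimal sketch of what I mean, reusing the hypothetical range from the comment above (the function name is made up):

#include <assert.h>

/* bar is documented to be in [31, 98355], so bar + 1 cannot overflow
   an int on any platform this program supports. */
int next_value(int bar)
{
    assert(bar >= 31 && bar <= 98355);  /* precondition, checked in debug builds */
    return bar + 1;
}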
This is actually my annoyance with monadic error handling. When I'm writing a little command line utility, I don't really want to explicitly handle every edge case. Just let the damn program crash and shame on me.
What level of static checks are you comfortable with? Because I don’t exactly think you’re wrong, but writing deviation reports for nearly every cast is a bummer, and I think many who have worked with high levels of static checks are familiar with the situation where there is no possible way to write a line of code that does not cause at least one message. I’ve spent days writing justifications for disabling particular messages (justifications such as “analysis tool authors misunderstood misra-c, here is evidence” or “reason for message is based on single paper from 1978, does not align with programming practice of the past 30 years”).
I’m net positive on static analysis, but I will forgive a lot of grumbling. It’s a lot of effort and does not catch many issues.
Furthermore, point 4 in the article (fixing unimportant warnings can cause bugs elsewhere) is real. I mean, it indicates issues with test coverage, but who doesn’t have issues with test coverage? Oh, 100% unit test coverage? Is that branch coverage or MC/DC coverage?
But anyway, on one team, when making warning fixes we would compare the final binary to show that with our compiler (the one that matters) our change produced identical output to the previous commit. If there actually was a bug that the warning found, well that was a whole different procedure.
OTOH: if the warnings couldn't have false positives, they would be errors.
There's a lot to say about 'warning hygiene'; it's part of the daily life of C/C++ coders, since it's very common for an existing 'warning clean' code base to suddenly spam new warnings after upgrading to a new compiler version. (That's also why it's good advice not to lambast an unfamiliar code base for having warnings before investigating why: it may simply have been developed with a slightly older compiler version than yours.)
I don't agree with the author though: warnings should be turned up to 11, then investigated, and locally suppressed if they turn out to be false positives (which happens very frequently if you turn the warning level to max, also because some of the 'fringe warnings' are merely coding style suggestions).
Also: static analyser tools almost always depend on additional hints outside the language spec (usually provided through asserts); without those hints, all static analysers produce a ton of false positives, because C and C++ alone don't provide the built-in language features to express enough 'intent' to the analyser.
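A generic sketch of the kind of hint I mean (not tied to any particular analyser; many of them treat an assert as a constraint on the values that can reach the code below it):

#include <assert.h>
#include <stddef.h>

/* The assert documents, to humans and to the analyser, the bounds the
   caller promised: buf is non-null and len is at most 16. */
static int sum_prefix(const int *buf, size_t len)
{
    assert(buf != NULL && len <= 16);
    int sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}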
Don’t 100% agree with this, but it reminds me of a former coworker who got annoyed by all the “noise” of warnings (in a C++ codebase) so he set the flag warn=off. When I turned them back on, there were over 3000 warnings. Including one about a format string error that literally said “format string does not match arguments, this code will crash if executed.”
Warnings should be checked by CI. Adding a new warning should fail the build requiring you to either deal with it, or add a disable comment to the section.
He's good to very good when talking about performance and SIMD tricks, but I don't know if he's particularly sage when it comes to general programming practice.
In this case I think he's probably wrong about compiler warnings but basically right about a lot of static analysers.
I'm not sure of your familiarity, but I've worked with Daniel on several projects, and think you might be guessing wrong about his attitude toward warnings. He's not the sort who ignores warnings. His tendency is to change his code so they don't happen, even if he doesn't think a particular warning is helpful or necessary.
I'm pretty sure his point isn't that the end user should get in the habit of ignoring warnings, but that those who decide which warnings are produced at which level need to be conscious that adding too many false positives can be counterproductive. His target audience here isn't application programmers, but those who decide which warnings are made default.
Hmm...when the memory checker came out for Xcode, I ran it on my code-base, parts of it decades old.
I still haven't adopted ARC.
I expected mayhem.
Turns out it "found" two false positives, i.e. leaks that are mandated by the low-level Objective-C APIs. I put the required incantations around them to shut up the checker and then I was good to go.
Keeping it simple, I guess...
¯\_(ツ)_/¯
Oh, and back to my thesis about "safyeness"[1], I certainly felt much better about my code once those checks had passed, even though it hadn't changed.
There's an overwhelming bias in the profession towards writing code that creates some kind of result first, and then figuring out what it does later.
I don't think it's new or specific to tooling practices, it just reflects seasonal fashions based on whatever the last problem someone had was: "the old stuff was too restrictive" vs "the old stuff blew up too often". And there is always difficulty at the beginning of the specification in that feature design tends to change partway through, so guardrails in implementation can hinder the path towards seeing the result.
However, I do think it's countered in some sense by designing towards intentional wrongness: In my code it's always been habitual to first write skeleton code that is a stub that always "returns true", prints out string literals that are what I want, or other such things. Where something a little more complex like managing the state of a video game scene is called for, I've learned to turn towards first enumerating a large number of states, and then gradually parcelling it out into more specific data and algorithms. The reason why that works is because it means I am starting from a fundamentally static position: I've exactly specified what states the system can be in, and haven't taken on unexpected surprises. So I know that I don't cover everything, but I do understand what I covered.
In contrast I think there's also a mindset of "open-ended permissive, fix as you go" where the system is allowed to do anything at anytime, and features are shipped broken and maybe discovered in test. I believe this is what results in ball of mud designs because, with no central point of enumeration, it allows the specification to creep up from any direction. But - it gets a result. And as the system's ambitions grow, the temptation to throw open the doors of permissiveness also grow. And that leads to the "alternate hard and soft layers" pattern[0]. Where we fail isn't in not using that pattern, it's in not recognizing when it's forming and heaping too many concerns on a specific layer.
I prefer maximum warnings and use of #pragma to disable a warning at a specific point when I know the warning is useless/wrong. That way I can get to a totally clean build with zero warnings output. The last significant piece of C I wrote:
# Turn on absolutely all warnings and turn them into errors
CFLAGS += -g -Wall -Wextra -Wno-unused-parameter -Werror
The reason I do this is if I leave some warnings in I can get complacent "oh, there's always a few warnings" or "that's expected" and then miss something. So, I like all the warnings and warnings treated as errors.
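For reference, the kind of point suppression I mean looks roughly like this with GCC-style pragmas (clang accepts the same form with `clang` in place of `GCC`); the function is just a made-up example:

#include <stdio.h>

/* Locally silence one specific warning, then restore the previous state. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
static int is_exactly_zero(double x)
{
    return x == 0.0;   /* deliberate exact comparison */
}
#pragma GCC diagnostic pop

int main(void)
{
    printf("%d\n", is_exactly_zero(0.0));
    return 0;
}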
well, clearly not "maximum warnings" since you are missing a lot of them.
-pedantic -Wunreachable-code -Wcast-qual -Wswitch-default -Wconversion -Wshadow -Wstrict-aliasing -Winit-self -Wcast-align -Wpointer-arith -Wmissing-declarations -Wmissing-include-dirs -Wno-unused-parameter -Wuninitialized, and that's not even all of the useful ones.
Also, avoid -Werror if you can.
Warnings will change based on the compiler, the compiler version, and countless other factors, which makes working with code that turns them into errors a pain.
I actually do what parent does for work projects but I also control the compiler and can decide when to upgrade. Like everything it's a trade-off and if I wanted wide support for different platforms I would consider your advice.
A good strategy is to use -Werror in development mode and CI builds, but not in 'distribution mode' (or alternatively: use -Werror for your own code, but not for imported libraries).
Man, you should try Clang's `-Weverything`. It's… well, you know what, I'll try it on my stupidly high-quality cryptographic code:
$ make "CC=clang -std=c99" "CFLAGS=-O3 -Weverything"
clang -std=c99 -O3 -Weverything -I src -I src/optional -fPIC -c -o lib/monocypher.o src/monocypher.c
src/monocypher.c:262:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u8 tmp[64];
^
src/monocypher.c:231:9: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u32 pool[16];
^
src/monocypher.c:337:12: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
const u64 s0 = ctx->h[0] + (u64)s[0]; // s0 <= 1_fffffffe
^
In file included from src/monocypher.c:54:
src/monocypher.h:289:9: warning: padding size of 'crypto_poly1305_ctx' with 4 bytes to alignment boundary [-Wpadded]
typedef struct {
^
src/monocypher.c:410:9: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
size_t nb_blocks = message_size >> 4;
^
src/monocypher.c:437:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u64 c = 5;
^
src/monocypher.c:498:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u64 v0 = ctx->hash[0]; u64 v8 = iv[0];
^
src/monocypher.c:621:10: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
size_t nb_words = message_size >> 3;
^
src/monocypher.c:600:9: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
size_t nb_blocks = message_size >> 7;
^
src/monocypher.c:640:9: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
size_t hash_size = MIN(ctx->hash_size, 64);
^
src/monocypher.c:786:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u8 hash_area[1024];
^
src/monocypher.c:874:10: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u32 next_slice = ((slice + 1) % 4) * segment_size;
^
src/monocypher.c:801:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
int constant_time = config.algorithm != CRYPTO_ARGON2_D;
^
src/monocypher.c:1170:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
i32 q = (19 * t[9] + (((i32) 1) << 24)) >> 25;
^
src/monocypher.c:1337:5: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u8 isodd = s[0] & 1;
^
src/monocypher.c:1349:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
int isdifferent = crypto_verify32(fs, gs);
^
src/monocypher.c:1433:7: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
i32 *quartic = t1;
^
src/monocypher.c:1500:5: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
fe x2, z2, x3, z3, t0, t1;
^
src/monocypher.c:1650:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u64 carry = 1;
^
src/monocypher.c:1676:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u32 B[8]; load32_le_buf(B, b, 8);
^
src/monocypher.c:1756:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
int is_square = invsqrt(h->X, h->X);
^
src/monocypher.c:1956:8: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
int lsb = v & (~v + 1); // smallest bit of v
^
src/monocypher.c:2015:7: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
int h_digit = slide_step(&h_slide, P_W_WIDTH, i, h);
^
src/monocypher.c:1994:12: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
ge_cached lutA[P_W_SIZE];
^
src/monocypher.c:2185:5: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
fe tmp_a, tmp_b; // temporaries for addition
^
src/monocypher.c:2540:5: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
fe t1, t2;
^
src/monocypher.c:2641:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
int is_square = invsqrt(t1, t1);
^
src/monocypher.c:2696:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
int is_square = invsqrt(t3, t3); // t3 = sqrt(-1 / non_square * u * (u+A))
^
src/monocypher.c:2775:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u32 t[16] = {0};
^
src/monocypher.c:2815:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u32 m_scl[8];
^
src/monocypher.c:2870:22: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
crypto_poly1305_ctx poly_ctx; // auto wiped...
^
src/monocypher.c:2925:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
int mismatch = crypto_verify16(mac, real_mac);
^
src/monocypher.c:2953:6: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
int mismatch = crypto_aead_read(&ctx, plain_text, mac, ad, ad_size,
^
33 warnings generated.
clang -std=c99 -O3 -Weverything -I src -I src/optional -fPIC -c -o lib/monocypher-ed25519.o src/optional/monocypher-ed25519.c
src/optional/monocypher-ed25519.c:165:8: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
u64 in = K[i16 + j] + ctx->input[j];
^
src/optional/monocypher-ed25519.c:158:9: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
size_t i16 = 0;
^
2 warnings generated.
ar cr lib/libmonocypher.a lib/monocypher.o lib/monocypher-ed25519.o
clang -std=c99 -O3 -Weverything -shared -Wl,-soname,libmonocypher.so.4 -o lib/libmonocypher.so.4 lib/monocypher.o lib/monocypher-ed25519.o
ln -sf `basename lib/libmonocypher.so.4` lib/libmonocypher.so
doc/doc_gen.sh
---
Well…
Yeah.
That is cranking the warnings up to 11. Now, this is C99 we're talking about, so how about removing that obviously useless `-Wdeclaration-after-statement`?
$ make "CC=clang -std=c99" "CFLAGS=-O3 -Weverything -Wno-declaration-after-statement"
clang -std=c99 -O3 -Weverything -Wno-declaration-after-statement -I src -I src/optional -fPIC -c -o lib/monocypher.o src/monocypher.c
In file included from src/monocypher.c:54:
src/monocypher.h:289:9: warning: padding size of 'crypto_poly1305_ctx' with 4 bytes to alignment boundary [-Wpadded]
typedef struct {
^
1 warning generated.
Ah, this one may be real. Let's look at the struct:
typedef struct {
// Do not rely on the size or contents of this type,
// for they may change without notice.
uint8_t c[16]; // chunk of the message
size_t c_idx; // How many bytes are there in the chunk.
uint32_t r [4]; // constant multiplier (from the secret key)
uint32_t pad[4]; // random number added at the end (from the secret key)
uint32_t h [5]; // accumulated hash
} crypto_poly1305_ctx;
Ah, I see: we end with an odd number of 32-bit words, and on the 64-bit architecture I'm compiling this on, there will be 4 bytes of padding. But then I have two problems: first, who cares? Second, I also target 32-bit architectures where there will not be any padding, and adding it manually would be wasteful.
There's the `#pragma` to silence the warning of course, but it's ugly, and I'm not sure it is strictly portable (unless the standard says compilers should ignore a `#pragma` they don't understand, but I confess I haven't checked).
I use -Weverything and turn off two warnings by pragma, which are -Wpadded, the one that generated your last warning, and -Wc++98-compat because I use `alignof()`.
The pragmas may not be portable, but that doesn't matter in this specific case. I surround them with the equivalent of:
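/* Roughly this shape (the exact guard is my sketch, the idea is the same):
   clang-only pragmas, invisible to other compilers. */
#ifdef __clang__
#pragma clang diagnostic ignored "-Wpadded"
#pragma clang diagnostic ignored "-Wc++98-compat"
#endif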
> (unless the standard says compilers should ignore the `#pragma` if they don't understand it, but I confess haven't checked).
It does (Section 15.9 [cpp.pragma]):
> Any pragma that is not recognized by the implementation is ignored.
However, Clang has the -Wunknown-pragmas warning which will fire on pragmas it doesn't recognize, so you'll probably want to do what gavinhoward does and guard non-Clang pragmas with compiler-specific #ifdefs.
Given that your example actually uses "-std=c99", the "incompatible with standards before C99" warning really feels like it should be disabled, and the fact that it isn't feels like a bug to me.
So much dev time goes up in smoke because of pedantic and overzealous linters, formatters, etc. in PR checks and CI pipelines.
Your whole PR gets held up because of a style nit like an unused import or trailing whitespace. That forces you to pull up the PR in Github, dig into the CI failure, check out your branch again, make the fix, commit, push, and wait for CI all over again. If your PR was previously approved, then you have to bug the reviewer to re-approve since the review gets dismissed when you push new commits.
I know this isn't exactly what this article was talking about, but it's such a big frustration to me. I'm all for leaving warnings like this on and cleaning them up over time, but yeah I strongly believe that turning all lints and checks and warnings all the way up and then gating PRs behind them is a recipe for huge developer frustration and loss of velocity.
Sounds like your team may benefit from pre-commit hooks!
Our team has an onboarding script we have every new dev run that (among other things) installs pre-commit. Our main repo includes hooks for style, max file size, etc.
If there's a style nit, git will yell at you and not accept the commit. If possible, it'll actually correct the style issue (e.g. blackify a python file for you) and you can just stage the updated file.
We still have CI/CD, but that catches other errors, not linting issues. This method has worked pretty well for us so far.
These hooks must be really high quality — fast and reliable. Otherwise they will get in the way, slowing down development or even breaking git.
The list of problems includes:
• the hook runs tools directly on the working directory, and not on the stage/index, so it lints against a mixed version of the code, and not what is actually being committed.
• the hook tries to fix the problem above, but doesn't understand submodules and screws them up
• a new version of a linter adds a new warning that fires on existing code, and now nobody can commit anything without disabling the hook.
• the hook expects the tools to be on a PATH that isn't set as expected when git is used via an IDE or some other GUI tool
We have a fairly good middle ground - our linters only alert on changed lines. They run locally with a commit hook, so you can fix them before they hit the build. A clean linter run isn't required to merge, we trust devs to review the warnings and decide if they should fix them. Also you can turn off the github "feature" that dismisses the review when you push a new change. Trust devs to only make trivial changes after the review - or to re-request the review themselves if they make bigger changes.
A lot of this is a symptom of CIs being deliberately dumb to avoid being wrong, and of our workflows being shit. Fuck the CI; the velocity is quite bad just from the programmer's machine.
One small thing that arcanist (random example) does that raw git does not is making it very "automatic" to make a really simple change to some code. Even theoretically very trivial things like guessing which files you meant to add to the commit can be a real time saver.
In fact, contrary to what I just wrote, it's not even entirely about time - a lot of this is really about mental context switching.
(Also it's not universally true that you need re-approval)
Right-thinking people can also disagree on what is a false positive. I remember a dialog between two people about one such thing. The summary is:
1. No improper behavior occurs currently because of the item flagged
2. However, reasonable changes to other parts of the code could cause the problem to occur
3. There is an obvious way of rewriting that avoids #2.
One person argued it was a false positive because of #1, while the other argued it was a true positive because #2 and #3 combined mean you definitely want to change the code...
Our general rule is that if you think the linter/warnings/style preference is wrong, open a PR up with the upstream linter project and convince them they should remove it. Otherwise just change the submission to pass.
Compiler generated warnings are generally pretty high quality. Dynamic analysis tools generate high quality warnings. Static analysis tools that work on a significant amount of the AST seem to generally give good enough warnings.
It's the shallow, pattern-based heuristic tools that seem to generate a lot of noise.
When I have "start from an empty buffer" code I try to crank the warnings up to "ten" at least. It helps a lot.
But I agree that all the way to "eleven" is usually counterproductive. And rarely does one get to start with an empty buffer, especially given you're going to #include something.
- -Wall -Wextra, and -Weverything with some exceptions;
- address, memory, thread, and undefined sanitizers;
- fuzzing (at least some sort of), coverage-guided fuzzing with sanitizers;
- static analysis, starting with clang-tidy;
- other static analyzers, such as Coverity, CodeQL - have the least priority.
An example of how all of the above, everywhere, all at once is applied in public can be seen in the ClickHouse repository: https://github.com/ClickHouse/ClickHouse in the per-commit checks. But keep in mind that the checks are really heavy: they run for multiple hours, and we have to apply kludges to support GitHub Actions with spot instances...
C and C++ developers who have been around long enough understand why clang specifically says not to regularly use -Weverything.
Historically even -Wextra has resulted in spurious warnings for perfectly valid and clean code, as most if not all -Wextra warnings are heuristics. -Weverything is also all heuristics, but even more volatile and debatable. Any worthwhile -Weverything heuristic will eventually move to -Wextra.
So, sure, use -Weverything on occasion to see if there's anything interesting to take note of. But if you include it by default you're inviting fatigue and unnecessary rewrites, which may be just as likely to result in the introduction of bugs as to fix any bugs.
I love -Weverything. I turn off two warnings by pragma in headers (-Wpadded, -Wc++98-compat), but other than that, I make the code obey all of the warnings. This means things like explicitly marking unused arguments, marking pointer casts with a void pointer cast (with an assert to ensure alignment is okay), etc.
But my code will build cleanly everywhere. That's really nice.
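To illustrate the kind of change that involves, here's a rough sketch (not my actual code) of the two idioms mentioned above:

#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Explicitly mark an argument as unused so the unused-parameter warning stays quiet. */
static void on_timeout(void *context)
{
    (void)context;
}

/* Route the cast through void* and assert that the alignment is okay,
   instead of casting silently and hoping for the best. */
static uint32_t *as_u32(unsigned char *p)
{
    assert((uintptr_t)p % alignof(uint32_t) == 0);
    return (uint32_t *)(void *)p;
}

int main(void)
{
    uint32_t raw[2] = {0, 0};
    on_timeout((void *)0);
    return (int)*as_u32((unsigned char *)raw);
}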
In the epistemically Lovecraftian realm of UB insanity that is C and C++, I happily kneel at the altar of following petty rules if it means it can keep my code safer from bugs.
The post author is basically staring in the face of decades of mind-rending CVEs and, when glancing over to something that can preclude most of them, says, "Yeah but this is annoying and distracting. I need to not be annoyed or distracted so that I can focus on writing good code (unlike all the programmers that came before me)."
> when glancing over to something that can preclude most of them
That's where we vehemently disagree. IME some diagnostics promote worse code.
But my argument regarding -Weverything comes down to something more practical--your code may compile today, but it will stop compiling tomorrow when the suite of diagnostics and their implementation change. And in a project maintained by more than one person, attempting to keep abreast of -Weverything is a losing battle, with the only benefit being aesthetic. Again, useful diagnostics will migrate to -Wextra, and the best of them to -Wall.
You're arguing against a strawman, alluding to those who resist -Wextra or even -Wall. But -Weverything is defined as a testing bed for diagnostics. By definition even the compiler maintainers disagree about the value--how much it helps versus incentivizes unnecessary, even dangerous code modifications.
Rather than relying on -Weverything to identify flaky C or C++ code, perhaps one should look into not using those languages for the bulk of their application. That way you avoid the most problematic areas of C and C++. Running around throwing casts everywhere to satisfy -Weverything is running in the wrong direction.
-Weverything to my mind is something that might only seem useful to those copy+pasting code from Stack Exchange or now GPT. I'll stick with my well worn copy of the C and C++ standards, and spending my time proving the correctness of hairy code.
> But my argument regarding -Weverything comes down to something more practical--your code may compile today, but it will stop compiling tomorrow when the suite of diagnostics and their implementation change
Yes, but I always use the latest Clang. And I am a one-man show. Totally not a problem for me.
I think it's generally preferred to enable most warnings but have some annotations to silence false positives, preferably with sufficient comments accompanying those annotations to help the code reviewers.
I've never thought that's a very good example of introducing a bug while trying to fix warnings. The warning was correctly pointing out that the code was broken, and the obviously incorrect "fix" merely made it never work instead of sometimes work, but sometimes working really wasn't good enough either.
Iunno, for this example, isn't this actually UB according to standard C? I don't think it'd be that objectionable to put e.g. a semantic comment saying "yeah no this is actually not UB because we check in ./configure that your compiler is one known to not do bad things," *if* such a thing were standardized across linting tools.
This is not standard C, this is already in the land of compiler extensions and platform specific code. The behavior is defined by the compiler and no compiler should treat this as UB.
It is using compiler intrinsics, I grant, and taking advantage of IB is, yes, well-defined given an implementation (as we have here)… but it seems like it's still C?
Here, I'm not 100% certain this code isn't UB: we type-pun a __m512i and a uint64_t … I'm not certain that's not UB. If I'm forced into writing C, I try to avoid doing that at all costs … I'd rather not encounter nasal demons.
The warning also seems a bit merited, given that each "increment" is over more than a single object on the underlying array. (But, of course, that is the intent. But I'm just not sure the static analyzer can see it.)
Plus, a pointer to __m512i might require different alignment than a pointer to unsigned long.
This code received a pointer to unsigned long from somewhere that might have cared about the data it pointed to, so his code should acknowledge that (with type casts or assertions) or make a __m512i or void* pointer part of this function's signature, making the type change part of the contract with the caller.
Someone can perhaps update me for C11, which I'm not up to date on, but at least as of C99, while the specification was perhaps unclear on the details, the intent was clear that type-punning (of non-character types) without using unions is UB.
Is the code actually compiled with -fno-strict-aliasing (which IME is rare for userspace code, but I don't work with numerical code very much), or are you saying this is because of the types involved in the cast? I didn't see anything in [0] about a change in pointer semantics from using the vector extensions in the compilation unit?
In this case, where you're using the wrong pointer type, the worst thing that can happen is aliasing issues. For example, if I have an int* and a double*, the C compiler is free to assume that the pointers can never alias each other, so it can optimize loads and stores assuming that an update through the int pointer will never affect the value read through the double pointer. In general this can cause problems if the pointers actually do alias each other, but if there's no aliasing occurring then nothing bad will happen.
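A tiny sketch of what that optimisation freedom looks like (my own illustration, not the code under discussion):

/* Because int and double may not legally alias, the compiler can assume
   the store through d does not change *val, and reuse the value it
   already loaded -- effectively returning 2 * (*val). */
int read_after_store(int *val, double *d)
{
    int before = *val;
    *d = 1.0;
    return before + *val;
}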
My understanding was that the UB was defined such that:
int i = 0;
float f = 1.0f;
void* p;
if(rand() & 1)
    p = &i;
else
    p = &f;
int j = *(int*)p;
assert(i == j);
could legally be compiled to a no-op (or a single call to rand() with an ignored result, I suppose), despite there being no "weird" pointer stores, since the ill-typed-at-runtime load "can't happen," so the compiler can reason that (int*)p must equal &i, and hence j = i.
I agree that the GitHub analyses are worse than useless, but I also think it highlights the friction between intrinsics and the program in which they appear. There probably is a better model.
I think this "friction" is really important to understand. For code of this sort, usually you figure out what assembly you want produced, and then you figure out the intrinsics necessary to produce this. In your mind, there is a clear one-to-one correspondence between the what you write and the assembly you want generated, but too frequently the compiler may have other ideas. It constantly feels like your are fighting the compiler's attempts to subvert you. Personally, I think this best current solution is writing blocks or entire functions in assembly, but admittedly the tools for this aren't great.
> However, I fear that it is part of a larger trend where people come to rely more or more on overbearing static analysis to judge code quality. The more warnings, the better, they think.
Don’t get me started on security scanners which basically report all “open” ports as a security issue.
Or, maybe stop doing pointer arithmetic? We have had some pretty good new languages appear in the last 7 years. I know people are averse to change, me included. But you may want to ask yourself if continuing to use a 50 year old language is the right answer.
This is low-level library code written to make full use of AVX-512, which is inherently a processor-native feature that has to be targeted from a language close to assembly, because the processor only understands machine code. There is no way to avoid the assembly or the pointer arithmetic. Someone has to do the pointer arithmetic so the higher-level languages and libraries don't have to.
Let us throw away all the existing code written in languages made in a different age and use languages that are completely binary incompatible just because we can.