Hacker News new | past | comments | ask | show | jobs | submit login
Everybody makes mistakes when writing comparison functions (karpov2007.medium.com)
62 points by ingve 78 days ago | hide | past | favorite | 62 comments

Unused function argument is a warning any sane compiler can spit out and turn into a hard error (so this is not really the best example of what PVS studio can do for you), which makes one wonder: why was this not caught earlier? Warnings not enabled, or ignored, and is that something which is problematic wrt something as major as OpenSSL?

I also wonder this. "Unused parameter in function" is one of the most basic warning/linting checks. I'd be interested in finding out how this managed to reach release.

Having unused parameters is super common, especially when using function-pointer-based APIs like qsort. Arguably they should have annotated the arguments to suppress the warnings, but that starts to get into compiler compatibility (OpenSSL is used on a _lot_ of systems with a _lot_ of compilers) and becomes a much larger question than just setting -Wno-unused-parameter.

> especially when using function-pointer-based APIs like qsort

Well, the function should be called somewhere, even if it's ignored when null. I can't think of a legitimate case for an unused parameter - it might be #ifdef-ed out in some cases, but it should still be referenced somewhere.

For qsort, sure, but it's pretty common to not need the void* argument for the start_routine you pass to pthread_create, to come up with one example.

Both GCC and clang warn about unused parameters, but only when you run them with -Wextra.

As lots of us know from experience, it's very ver hard to get an in-production project into a warning free state with warnings at maximum and warnings as errors. I'm currently working on a new project this year and it has been such a joy to have near max compiler warnings as errors with multiple compilers (and clang tidy and clang format) right from the start.

Integration of a static analyzer into an existing project may be hard. Especially if the project is big and old. In fact, it’s not as scary as it seems to be. There are ways to do it as easy as possible: How to introduce a static code analyzer in a legacy project and not to discourage the team - https://pvs-studio.com/en/blog/posts/0743/

Warning definitions also change over time with new versions

it's very ver hard to get an in-production project

True, especially for older projects which never cared about warnings to begin with. Still when I encounter one I might try it anyway, gradually if possible, to see what comes out. Not just because it is indeed just joy to have no warnings, but also because more often than not some of those warnings actually tell you there are bugs. Or for example (just had this last week) because there are projects which compile with 10k+ warnings which mean that even if you just compile a single file, it is way to hard to tell whether your new code just introduced a new warning and if so which one it is.


One strategy I employed with a very old code base with a gajillion warnings was to output them to a file and compare against a reference file at the end of the build. If the output changed we would fix those and maybe any that kept happening if you added new compilation units etc. It at least helped us slowly reverse the trend of accumulating warnings.

I did the same, but it meant that the build could not go in parallel.

I also stored the warnings.txt in git so that before any commit I could simply do a git diff to see what warnings were changed by my additions.

You've got to love that -Wall rusted in place as meaning not, in fact, all warnings, but instead just some arbitrary set of warnings that some people wanted many years ago and now we mustn't change it because too many projects have code that builds under -Wall -Werror only because the warnings that code triggers weren't covered by -Wall many years ago.

https://xkcd.com/1172/ in action

GHC has the correct policy, that -Werror is not recommended, and if you use it there are no guarantees your build won't break on the future.

Every compiler should adopt it. And every build tool should stop printing thousands of lines of "build tool passed here" on the default verbosity.

Another benefit to Rust editions too.

In principle Rust isn't obliged to let your clearly bad program compile once the realisation is reached that it's broken. If there's a new warning emitted, and you've told the compiler not to allow warnings, well, to bad, fix the warning. However sheer weight of numbers, as with GCC, might beat that position for some future case, we can imagine that if an urgently needed warning breaks 80% of the most popular crates that's not going to be OK.

We get infinite extra tries - Rust editions mean that by definition there is no Rust 2022 code today, and so we can define up front that Rust 2022 code must not have whatever egregious yet widespread problem couldn't be fixed in Rust 2018 code due to the numbers involved.

When people write new code which defaults to Rust 2022 it gets flagged as bad, forcing them to fix it, but their old Rust 2018 code still compiles unless they want to go to the effort to migrate it.

[Yes I know Mara is poised to ship Rust 2021, but this is a hypothetical so I used a future year instead]

Reading https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html, which says

“In order to get a warning about an unused function parameter, you must either specify -Wextra -Wunused (note that -Wall implies -Wunused), or separately specify -Wunused-parameter.”

So, it seems one can do


So I want to add an explanation. As I see from the comments, many of you thought that the PVS-Studio analyzer warns about a function's unused argument. That's not quite so. Or, to be precise, this is so, but the analyzer processes this case smarter than most linters or compilers.

It's a bad idea to program an analyzer so that it just issues a warning to unused arguments. Such analyzer would produce many false positives, which is why many developers don't look at (or disable) these warnings in their compilers/analyzers.

The PVS-Studio analyzer implements sort of empirical "magic". PVS-Studio relies on the fact that there are arguments of the same type and some of them are not used, while the other ones are used several times. At the same time, there are a number of exceptions to the rule. For example, the diagnostic is not triggered if the number of unused arguments exceeds two.

All this allows the V751 diagnostic to issue few false positives, which makes the tool surpass its competitors. To be exact, when developing PVS-Studio, we do not implement rules if we cannot make them better than those of the compilers - https://pvs-studio.com/en/blog/posts/0802/ . Thanks to the diagnostic I described above, one can find interesting errors - https://pvs-studio.com/en/blog/examples/v751/ .

P.S. The PVS-Studio analyzer also provides a "stupid" version of this diagnostic - V2537 https://pvs-studio.com/en/docs/warnings/v2537/ . It was developed to check code against MISRA C and MISRA C++ standards. But the case above was special and by default this diagnostic was disabled - same as the other ones related to MISRA.

The author links to a much more detailed list of mistakes in comparison functions. It's worth reading: https://pvs-studio.com/en/blog/posts/cpp/0509/

The mistake I often see in interviews:

    int cmp(int a, int b) {
        return a - b;
This can easily overflow or underflow. For example, if a = INT_MAX and b < 0.

I don't count it against the person interviewing, because its common and arguably reasonable if your values are all < INT_MAX/2.

The author's article about comparison functions linked in the blog post is worth the read

Regarding the OpenSSL example presented here, wouldn't any decent IDE catch an unused parameter in a function? Why is a separate static analyzer necessary for this

> Why is a separate static analyzer necessary for this

Because this is a nail this co-founder of PVS-Studio saw while holding the hammer he wants to sell.

By the way, it seems to me this error does not have much to do with comparison functions, it is a mistake that can be made in many kinds of functions.

Unused parameters are warned against in C (hence all the UNUSED macro hacks) and no static analyzers are going to save you if you don't address compiler warnings, they are just going to add more warnings to that pile of warnings you already have (or maybe don't get because the right warnings are not enabled).

(I'm sure their blog is full of cases a static analyzer would handle which compilers won't warn you for though)

Still surprising to see such an error in OpenSSL and like others here I would be interested to know why it was not caught.

Last time OpenSSL was in the news for a bug that should generate a warning I took a cursory look at some of the code, it was stuffed with stuff that would generate similar warnings. So at that time I concluded that going through all the warnings would be a pretty big job, and that is likely why nobody did so. On the more speculative front, it might be that some maintainer is against taking whitespace patches, that would probably make anyone who tries give up pretty quickly.

I'm a bit skeptical of the claim "The code quality is excellent", if it was this would be a lone warning emitted by the compiler, and surely it would have been fixed then.

“The code quality is excellent” is either the author hedging pissing people off who are fans of OpenSSL or their analyzer doesn’t spot a lot of issues and thus they conclude it’s excellent for those reasons.

Actually the author said that the “The code quality is excellent” specifically about the new version 3.0 in contrast with the older OpenSSL versions, where he could find much more problems.

I have not looked yet at the new version, which has a very large number of changes, but it is quite possible that the author is right and that the rewritten code is very good.

I think comparison functions are a prime "use case" for this kind of error because it will pass the type check easily and comparison is often somewhat symmetric, so you will do a lot of copy and paste. Only unit or property tests could catch this (and a compiler warning, of course). The example here is of course extreme, but I bet you will find something like this quite often, especially in OO code like Java.

A lot of people don’t use IDEs to write C.

That said, I believe the two mainstream C compilers (GCC and Clang) can both be configured to emit a warning here.

If a diagnostic issues warnings to unused arguments - it's a weak diagnostic. PVS-Studio acts differently. Take a look at my clarifying comment below or above (not sure where it would be when you read it).

I think the point is if you integrate the analyzer in the workflow its hard to ignore. The ide warnings are easy to miss sometimes.

The Right Thing™ for a project that actually has multiple contributors is that the CI pipeline rejects code that based on the compiler diagnostics is either definitely wrong or, if it was correct, should be annotated in a way the compiler can understand and stop emitting the diagnostic. It doesn't matter whether one or another contributor has this switched off or ignored it.

That is, committing this code should have had the same impact as if it was missing the semi-colon, the library doesn't build, tree is on fire, fix before you do new work.

Of course if your code is in sufficiently bad state, you might find it's frustrating to have say 500 bugs to solve before you can get the CI pipeline to output artefacts. I think that means you didn't have good software and you must fix those bugs first, but if you're convinced the software is good it can be tempting to instead disable the diagnostics telling you otherwise and press on.

You have 178 seconds to live:


I have a friend that used to be an OpenSSL contributor (not sure if he still is). Not really a big deal. There were a lot of them.

He was generally cranky ["curmudgeonly"? -Ed.] about the code, and tried submitting quality fixes, where possible; but this was years ago.

PVS-Studio looks great. I wish it worked on Swift. SwiftLint has its limitations.

I was surprised to read in the article that the code quality was good, unless they some how pulled off a major refactor since LibreSSL basically called the code a mess.

If I recall, the claim was that OpenSSL prioritized issues paid for by companies, and the alot was left to rot. I'm curious if your friend saw the same things?

OpenSSL 3.0.0, which is what the article mentions, is a new version that is incompatible with older versions of OpenSSL. its internals operate in a completely different manner. It is a very changed codebase. That being said, I would not say its code quality is good, regardless of whether any tooling reports that.

I have never gazed upon the code, myself, so only have thirdhand anecdotal information.

He didn't speculate as to "why," but he said the project was very buggy, and rather "messy."

To be fair, we worked for a company (he was one of my employees) that was anal about Quality, and hard to please. It rubbed off on us.

There is a difference in when a tool considers code to be of high quality and when humans do.

Most linters and editors would highlight this. Also, if one writes test, one of the test cases should fail.

This is just a typo, not really a quality ad for some code quality monitor.

So I want to add an explanation. As I see from the comments, many of you thought that the PVS-Studio analyzer warns about a function's unused argument. That's not quite so. Or, to be precise, this is so, but the analyzer processes this case smarter than most linters or compilers. Continue: https://karpov2007.medium.com/a-few-more-words-about-the-pvs...

As someone used to tools like IntelliJ for Java and Kotlin, and Clippy for Rust, this is like the most basic kind of warning I can imagine.

It blows my mind sometimes what IntelliJ, in particular, catches... like code branches that can never execute due to intricate conditional blocks in the same block of code... forgetting to use a method argument is not even worth a mention :D

Look at just how many issues IntelliJ can detect for Java: https://www.jetbrains.com/help/idea/list-of-java-inspections...

Unfortunately, I don't fully understand your comment. PVS-Studio has a variety of diagnostics. However, such simple errors still exist. By the way, PVS-Studio found errors in IntelliJ IDEA as well :) - https://pvs-studio.com/en/blog/posts/java/0603/

I know it's a bit futile, but according to its name the function seems quite generic. So I guess it's used in various places, so I wonder how such a big (in consequences, not in lack of attention: I code all day long I make tons of mistakes myself; been there done that got the T-shirt) mistake could still be there... (I repeat: I don't blame the coder, I know the stuff, I just honestly wonder)

I was a bit disappointed when the focus is not related to cryptography.

I.e. a poorly written comparison function compares multiple expressions and the use of && short-circuit the logic, result in a timing attack.

Would love to read an article about this topic, i.e. how to write a function whose execution time is constant regardless of the inputs, and thus not leak any side-channel information to the attacker.

I agree, it’s a bit of one-sided point of view. It’s only about the code quality. The cryptography analysis is beyond static analysis tools’ capabilities. Static analysis is more high-level study.

I don't see how that's specific to comparison functions. It's just a result of someone not unit-testing their code.

I was expecting to see an example of structure comparisons such as struct date {int year; int month; int day;} which are easy to get wrong unless you've internalized the pattern (or use modern C++ where it'll generate the comparison operators automatically for you).

The main specific is that it’s one of the most common errors. People don’t check comparison functions, but those functions have errors. You can read more here: https://pvs-studio.com/en/blog/posts/cpp/0509/

I have to admit that I was expecting this to be about some subtle and often overlooked logic error in writing comparison functions (say assuming that aba > b). But this is about what essentially amounts to typographic errors in the code.

On a related note, I think the most errors I've seen in sizable C++ systems has been operator overloading for things like '=' and '=='. Too much hidden magic can occur.

If you don't override or explicitly delete them you get compiler defaults, which are even more magic and harder for the reader to reason about.

I agree, and I think it's probably a good idea for any class that isn't a simple structure with functions.

I suppose that a problem with C++ (and the more feature-filled source control systems) is that it's dangerous to mix different levels of programmer sophistication in a team. The people are reflected in the code particularly as the obfuscation increases.

Everybody makes mistakes writing any kind of function.

God I'm so tired of this take. "All code has bugs." "Everyone writes shitty code, some just make it less shitty." ad nauseum.

I don't know where this started (more, when it got popular) but it's such a cop out and lazy argument to anything safety-related.

It adds nothing to the discussion and provides no insight or substance, yet people parrot it often, especially around "language lawyer"-type HN articles.

Moreover, it's wrong. Not every function has a bug. Yes, there are ways to prove this. No, not every function needs to protect against the computer being struck by lightning, or any other fault, in every case. It's such a weird and wrong argument and it's always said with such confidence.

The idea that all software is terrible or all code has bugs is leveraged at my work as a form of psychological safety. "Yes, 'all' software has bugs, so you should feel safe exposing a bug you made and we can all learn from it." A person who took the saying as an excuse to be sloppy would get called out hard on our teams and wouldn't last if they kept that view.

I think every code has bugs in the same way that every gun is always loaded. Yes you can throughly check for it in some place and be rather sure that it's not loaded, but the way to act around it is as if it could always be loaded.

This doesn't excuse writing buggy code, same as you should always handle a gun with care, but it promotes good behaviour around it.

This is nonsense. Comparing the two makes no sense.

> Yes you can throughly check for it in some place and be rather sure

Look up formal verification. You can prove, mathematically, that code is bug free.

I write code in Agda (a proof assistant) for a living and this is rarely true. Because (1) even if you prove your program is correct as per spec it doesn't prove it's correct according to user/PM since spec can/will be buggy (2) proving every single theorem about your system is an extraordinary time sink, an engineer needs to know what parts are higher risk and needs to be proven and which parts are corrolaries of basic algebra and proving is of academic interest.

EDIT: In my experience the biggest benefit you get from proof assistants like Coq, Agda etc is being able to run arbitrary unittests within your type system. It's convenient because it makes all bugs a type error and therefore possible to check at compile time. This doesn't mean agda will magically find all bugs, but it means you have more tools to do so.

Bugs in spec != bugs in code. This is expressly moving the goalposts.

Even if this were true my point (2) still stands.

But it really isn't true because spec is part of the code. If your formalism is wrong, irregardless of whether implementation is correct, your code will behave wrong. There is absolutely no goal post moving. If my program is wrong, I can't tell my user "well, my unittests are all passing and I have 100% code coverage".

> Even if this were true my point (2) still stands.

And it is completely beside the point of the original argument. I was never arguing against your second point.

No, that does not follow, it is actually a common logical flaw.

What you can prove with formal verification is that the code conforms to the sacro-saint spec. But who says the spec is bug-free?

You can formally prove that a function always does exactly what the author intended, no less, no more. But it still might be the wrong thing to do. The question is, do you count that as a bug or something else.

That's moving the goalposts, and not the point originally being made nor the point I'm addressing.

> Not every function has a bug

Ok, you're right. Every non-trivial function has a bug.

My favorite thing about Clojure is that comparison functions work on any number of arguments and correctly test equality of nested data structures.

I like comparing how different birds interact with each other in the park. Some flock in little groups, but the ducks swim around the park pond.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact