A brief history of one line fixes (tedunangst.com)
264 points by coconutrandom on March 1, 2014 | 154 comments



> What do all these earlier mistakes have in common, apart from the obvious: being exemplars of “catastrophic loss of structural integrity”? They all date from before 2013. That’s how we know the NSA wasn’t involved.

I get that this is trying to make fun of the response to Apple's "goto fail;", but this logic ("These similar errors predate 2013 → Apple's similar error was not an NSA backdoor") seems rather faulty. There are a number of flaws with this line of reasoning. To name a few:

* The NSA could have been involved with backdoors before 2013 (unlikely in Debian, but mentioning 2013 is a bit of a red herring)

* Apple could have been encouraged to insert a backdoor and did so in a way that gave them plausible deniability (either because the NSA specified that, or because they wanted to cover their behinds)

Whether or not this incident was the result of the NSA's prompting is something we'll probably never know[0], but this article doesn't do much to argue one way or the other.

[0] The only way we could know is if someone literally came out and admitted it (or someone like Snowden were to leak it). It's possible to prove the existence of something (i.e., a backdoor attempt), but impossible to prove the absence of something.


My takeaway was that the NSA could have been involved in all the above, we just weren't paying as much attention before ;-)

Of course, whether the NSA was involved or not isn't the point of this article. The point is that high severity one-liner bugs have been made before.

I feel partly to blame. Until my comment on Hacker News revealing where the source code was, no one seemed willing to post more details about the bug (based on Twitter posts). But I think it will end well, as more people will pay attention to the code in the future. Sadly, as far as I'm aware, Apple has yet to share revised source code for the security library shipping in 10.9.2.


What you've argued is faulty logic was actually sarcasm. One major clue is that the previous paragraph also contained heavy sarcasm.


Uh, yeah, you're right. I'm surprised that didn't jump off the page at me, but hey, I read the comments first and then the article, failing "like a good banker": succeeding in not spotting the bullshit, but no worse than my fellow HN-ers who also upvoted it.

I'm amazed you're at the bottom of the comments. But then again, I was silly enough to upvote the parent before I read the full text.

"A sound banker, alas, is not one who foresees danger and avoids it, but one who, when he is ruined, is ruined in a conventional and orthodox way with his fellows, so that no-one can really blame him" - JM Keynes


The logic at the end is still in parody mode. Flawed is the point.


It's pointless to discuss whether this is an intentional flaw or a mistake, because the problem lies in code review and tooling. People have told me in the past that compilers can catch these errors; if so, what flags do they use, and should we have a standard list of flags to enable to test for these problems? Do we have tools to surface these errors instead of making us go through a 1000-page compile log?


i don't think it's pointless to discuss. sometimes it is, sometimes it isn't.

in the apple case though, the two revision numbers we have are 55179.13 and 55471. i don't know how apple numbers its releases, so i'm not entirely sure what the .13 is, but theoretically we could have 292 patches between the two we see publicly.

it's impossible for us to know if there would have been a code change that could have caused a merge conflict at that place.


> impossible to prove the absence of something.

If p => q, then ¬q => ¬p. So if p is "The goto fail was setup by the NSA", then surely we can come up with a reasonable q which could be (dis)provable?


would it be something like this:

  if "setup by the NSA" then "goto fail"
I can think of a q that disproves p then!

  not goto fail.
problem is

  goto fail.
since q is true, p is either false or true, with no distinction either way. So... impossible to prove.


Nope. P is "the nsa is responsible for goto fail".

A possible q could be "someone from nsa communicated with someone from apple"

Another possible q could be "goto fail was introduced on purpose"

These specific qs might be difficult/impossible to disprove, but that doesn't mean that there is no q we can disprove.


[deleted]


I thought it was sarcasm. It's funny because there is an unbraced if and a goto, but the error is somewhere else.


A favorite of mine, in C:

  ASSERT(apples = 1);
  VERIFY(oranges = 2);
(Note the '=' means assignment, not comparison.)

Both are assertion-testers resulting in a core dump if the test fails, but ASSERT is only defined for DEBUG builds.

Since the assignments are to nonzero (true) values, buggy values of 'apples' are silently corrected -- but only during test runs. Buggy values of 'oranges' are always silently corrected.

Hilarity ensued. Afterwards, 'gcc -Wparentheses' became mandatory.
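For reference, the bare form of the mistake that -Wparentheses catches, as a minimal sketch (whether the warning fires inside an assertion macro depends on how the macro parenthesizes its argument):

    #include <stdio.h>

    int main(void)
    {
        int apples = 0;
        if (apples = 1)    /* gcc -Wall (via -Wparentheses) warns: suggest
                              parentheses around assignment used as truth value */
            printf("apples was silently set to 1\n");
        return 0;
    }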


One way to avoid this kind of mistake is to use Yoda conditions [0]:

    ASSERT(1 == apples);
If you forget one `=`, it doesn't compile.

[0]: https://en.wikipedia.org/wiki/Yoda_conditions
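To see the failure mode concretely, a minimal sketch (hypothetical variable; gcc rejects the broken line with "lvalue required as left operand of assignment"):

    int apples = 2;

    /* if (1 = apples)   -- compile error: 1 is not an lvalue */
    if (1 == apples) {
        /* the Yoda comparison compiles fine */
    }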


I don't think learning this is worth it. Using this will not always warn you (complicated expressions or variables), and you still might miss something.

+EDIT: On those occasions when you actually compare to an integer, the expression is so simple that you immediately understand the whole line. In case of an actual typo the compiler will warn you, and the inversion is not needed. Nowadays, there is simply no excuse for an ignored or missed (bad compiler) warning.

Assignment in an if statement must produce a warning!


Don't you understand? If the constant is on the left, then forgetting one of the equals signs will always be an error.


Yes but I think he is saying that if the left/what you compare to is not a constant then it will not warn you (if the type agrees, as it should anyway for a comparison).


If the left is a function call, it will still error out because you can't assign to it.

If you're doing

    if (var1 == var2)
then yoda expressions won't save you, however.


Also, in C++ you can assign to the return value of a function returning a reference.


The word learning in this context doesn't mean understanding this simple trick, which any newbie should grasp immediately.


I like Yoda conditions not only because they protect against the missing "=", but also because often the constant is the shorter side, so when reading

    if (42 == some_long_expression_here)
    {
        ...
    }
it is immediately clear that the rest of the line after the two equals signs must be an expression, because of the curly brace at the beginning of the next line. So folks don't have to bother that much about counting parens.


Well, you won't see "magic values" in production code, or at least they should be minimized.


On the other hand, Yoda conditions are - as implied by the name - counterintuitive and, as pointed out below, aren't really necessary when compilers pick up this kind of error anyway. The real problem is that this kind of mistake is even possible in the first place.


This technique was taught to me by Charles Fiterman, before he was at Geodesic. And while compilers are a bit better about this nowadays, back then this was a good technique to use.


Don't most compilers give a warning for this sort of thing?


Side fun question: what were Yoda conditions called before Star Wars was in the cinemas?


Maybe it didn't have a name? After all, A New Hope was only released a few years after C, and earlier languages (BCPL, Pascal) used a different expression for assignment (:=), so they didn't suffer from this problem.


Coverity has ASSERT_SIDE_EFFECT for exactly this sort of case.


Cough, there's one major thing all of these bugs have in common - they're all in C code.

This is mostly due to the volume and prominence of C. But a language with the same fundamental semantics as C but lacking the cleverbait syntax and weakened types would have prevented over half of these mistakes.


I would be more willing to blame C if these bugs were unsafe memory operations. Unused parameters, logic errors, and assignment instead of comparison aren't unique to C, and the language isn't without countermeasures for them.

The X one is probably the only distinctly C bug on the list, and compiler warnings and/or a good static code checker would have found it.


I deliberately did not address unsafe memory operations, because the current widely-accepted state of the art for memory safety is garbage collection, which has significant overhead. I could have also tooted the horn for Rust (or Felix/BitC/Deca), but my goal was not to point out blank-slate solutions, but just how much of C itself could be trivially fixed.


Among others, the Objective-C crowd might want to have a chat with you about it being the "widely-accepted state of the art." This is ignoring the fact that memory management does not really address C's ability to do unsafe memory operations so much as it's just resource management.


I really should have said garbage collection + RAII. I'm not saying that GC is accepted by everyone, just among the people working in current memory-safe languages. (Or are you saying that ObjC does something even better that's yet to catch on?)

It's not so much about a language's ability to do unsafe memory operations, but the necessity of doing them to write useful code in the language.


Compilers can't tell if a shared library symbol resolves to null.


GCC's -Waddress triggers a warning on the use of a function in a conditional expression like in that X bug.


"Over half" is dubious.

The compiler could have prevented 'X' by not permitting "function == int" comparisons. That's 1/5.

The only thing you might be referring to with "cleverbait syntax" is the ++ fix in tarsnap, but note that the bug was the failure to include ++. Being generous, we could say that if ++ wasn't allowed, it might have been more obvious that the code was broken, getting us 2/5. And being extra generous, maybe if `!i` was disallowed, they would have gone straight for `i <= 0` instead of `i != 0`. So 3/5, but yeah, dubious.

The only other one that's remotely like a C misfeature is that the compiler apparently didn't warn about 'unused parameter' in the Android bug.


Yes, those are the three I am referring to, but I don't agree that any are generous or dubious. Out of the C context, what does "not integer" possibly mean? `i` should either have been a boolean, or apparently an explicit multi-state enum, rather than the ever-favored weakly-typed ubiquitous `int` that C makes so attractive.

I'm not claiming the unused parameter bug, because there can be legitimate unused function parameters, and we can easily envision a very similar bug that still used all parameters.

But add in the current Apple bug that precipitated this post, and you have 4/6.

The general defense by C programmers is that everything works as long as you do the right thing. I have been there. However as these bugs illustrate, vigilance does not scale.


C has enums, and it's true that they can be implicitly used as ints and sometimes that can be bad, but here they just weren't used. I'm not sure what you think C could have done differently. It's not obvious to me that the current enums aren't good enough for this particular purpose, for example. This feels more like a culture thing than a language thing.

And note that 2/3 of the bugs you're claiming, apply to all weakly typed languages, of which there are many. Python would not have prevented them, for example. Showcasing them as problems with C seems somehow fishy. (I don't mean to imply dubious motivations on your part, I just get the feeling of "there's something not quite right with this argument".)


A lot of it is culture, but I feel that culture fuels itself with the tricks available in the language. It just feels so clever to use the return value of `i++` - the number of lines matters one level down in ASM, so why not here. And things like `if (!i)` are quite cute, while it's hard to aggregate unions of existing types, so return-value definitions especially tend to stay informal.

You're right about my analysis applying to all weakly typed languages, or at least languages with weakly-typed core functions. Python pretty much only brings memory safety to the table (which, as I said in a sibling comment, has an actual overhead, so I didn't touch upon it here). But it does show that the Tarsnap and Apple bugs were caused by syntax, which is a pretty heavy indictment.


Python is a strongly typed language.


For every language, there are senses in which it is strongly typed and others in which it is weakly typed.

In this case, we're referring to none of these being errors as weak typing:

    if 0: pass
    not 42
    'a' == 9


> The compiler could have prevented 'X' by not permitting "function == int" comparisons. That's 1/5.

0 is also a special-case literal for the null pointer, and function names are valid in pointer contexts, so I'm not even sure that "f == 0" is badly typed.


Only in C++ pre-C++11 - most C compilers use

    #define NULL ((void *)0)

in which case, using NULL, you're comparing a function to a pointer.


And C has this thing called lint that's only been around since 1979 that can detect many of these issues.


Yes, many tools have been built to address the problems with C, and one is remiss if they ignore the greater development ecosystem.

But, that they're not integrated into the language itself means those tools can only ever make suggestions in a separate context, and so cannot function cohesively to permanently rule out classes of bugs in a way that can simply be taken for granted.

In fact, the Debian OpenSSL bug was actually caused [1] by a message from one of these tools being uncritically acted upon.

[1] the job of failure analysis is to find all contributing factors, not just pin everything on one.


Static code analysis tools exist in many languages even very high level, safe ones. A lot of them are even named after lint or as a tip of the hat to it.

Their presence isn't a sign of flaws of the language. They are signs of flaws in programmers. They give the designers the ability to extend the power of the compiler's warnings to help catch common mistakes and tune the automated feedback to fit the needs of the project.


Sure, but static analysis can only add so much, as the programmer is still working in the base language and can't convey higher level concepts to the static analysis tool, so it's forced to backfit them by guessing intent. And to the extent you can make annotations for this, you're effectively working in a new formal language without the benefit of a compiler.


I heard people working in those other languages never make mistakes.


Yup. We found an exhaustive list of all ze software bugs and ... cough, cough, gasp, gasp... all happen to be in C. Whatever shall we do?


C is the least of the problems here.


The commentary on the last example, from Tarsnap, seems wrong. The error was that the nonce stopped being incremented (see the linked Tarsnap blog post), but the author suggests the issue there is the unbraced if. The unbraced if is still not great style, but it wasn't the cause of the security blunder.


It is kind of on purpose, that's the whole point of the writeup. It's taking a little jab at everyone who proclaims a "simple and obvious" way any given error could've been avoided. Fact is, making the programmer type more (braces or other stuff) doesn't save him from himself; the goto fail error could've been made with braces just as well as without.

The other examples make the same point. There are ways to help catch specific bugs. But there is no magic "fix it" button. Consider static analysis, for example: you actually have to run it and inspect the output. You have to interpret it right. You have to fix it right. And so on. Human error can ruin each step, as has happened.


The commentary does fit the recent Apple 'goto fail' bug. I think this post may've originally included it. Perhaps the wrong commentary got deleted.


Yes. That was the joke.


I have done a similar thing to myself in Python, sabotaging my own little project. The following is not a rare thing. Imagine this were initializing a random key in TLS... Python syntax is arguably cleaner and more readable, so it's easier to catch, but that's a bet we have to put up with...

    >>> def f(a,b=[]):
    ...     b.append(a)
    ...     return b
    ...
    >>> print f(1)
    [1]
    >>> print f(1)
    [1, 1]
    >>> print f(1)
    [1, 1, 1]


    >>> import random
    >>> def key():
    ...     return random.randint(1,10000)
    ...

    >>> def f2(a, b=key):
    ...     return b()
    ...
    >>> def f3(a, b=key()):
    ...     return b
    ...
    >>> f2(1)
    3974
    >>> f2(1)
    8684
    >>> f3(1)
    2867
    >>> f3(1)
    2867


Well, any Python developer worth his salt knows that initializing default parameters that way is a big no-no. I actually use code with those exact mistakes in interviews, as it is a good way to tell Python beginners from more enlightened users.


It's easy to make mistakes, especially when the compiler/interpreter gives no assistance/warnings.


Pylint catches this and categorizes it as a warning.

If your language has a lint tool and your code is remotely important, please run the lint tool as part of your build.


Correct

Unfortunately a lot of people still use it

(and it's actually correct, IF you don't modify it)

It's a risk not worth taking (to me, at least)


That's the only way of creating a function with state without introducing a closure or making an object with __call__. It's very handy for accumulators and other bits of state which are inherent to the function you write. It's essentially a `static` variable declared inside a function in C, and believe it or not, it has its uses.

Depending on the problem it can be worthwhile to use other techniques for creating stateful callables, but that doesn't mean you should never create a function with a mutable default argument. It's there in the language - learn about it, understand how it works and why it works like this, then understand where to use it and use it where it makes sense. That's pretty generic advice, valid for almost any language feature.
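For reference, a minimal sketch of the C static analogue mentioned above:

    #include <stdio.h>

    /* n is initialized once and persists across calls: function-local state. */
    int next_id(void)
    {
        static int n = 0;
        return ++n;
    }

    int main(void)
    {
        printf("%d\n", next_id());   /* prints 1 */
        printf("%d\n", next_id());   /* prints 2 */
        return 0;
    }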


> That's the only way of creating function with state without introducing closure or making an object with __call__

What about this?

  def foo():
      if not hasattr(foo, 'static_list'):
          foo.static_list = []
      foo.static_list.append(1)      # the list persists across calls
      return foo.static_list
At least it's a bit easier to see that something funky is going on.


The "No True Scotsman" helps out for essentially every language though.


    > But since Python syntax is arguably cleaner
    > and more readable, it's easier to catch.
I've seen nasty bugs in Python code having to do with the end of blocks losing their indentation due to someone's merge/editor mistake.

Having braces or other explicit start/end markers for blocks would have prevented those issues. So Python's syntax can in some cases encourage these sorts of mistakes.


Yeah, that's an excellent point. I remember arguing with friends about this vs. braces. A mis-indent error is also a hard-to-detect bug in general. Also, a joke from python-dev:

>>> from __future__ import braces


Python with braces ... aka Javascript

And I'm actually being serious. I program in both quite regularly. The Python ecosystem is amazing, but the web defaults to Javascript.


The __future__ may not be far off: Python with Braces (http://www.pythonb.org/).


This certainly is an underdog. I'm probably not the only one who hadn't heard of it yet. Interesting!


> How is this possible? Does nobody use a compiler that warns about comparisons always being false?

External symbols are resolved at link time; as far as the compiler is concerned, setuid might be a symbol declared as __weak.

I don't particularly like the snide remarks towards the maintainers of these libraries.


I ran the following code through GCC:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>

    int main(void)
    {
      if (getuid() != 0 && geteuid == 0)
      {
        printf("only root\n");
        return 1;
      }
      printf("okay\n");
      return 0;
    }
With just a plain GCC (no options), the code compiled and exhibited the bug (it always printed "okay"). That's to be expected. What was not expected was what happened when I compiled the code with "gcc -ansi -Wall -Wextra -pedantic": NOT ONE ERROR OR WARNING!

How is that happening?

Well, C allows function pointers. Also, in C, a value of 0 used in a pointer context is treated as a NULL pointer. So that one line is seen as:

    if (getuid() != 0 && geteuid == NULL)
Now, let's play compiler. geteuid() is a function and it's defined. So the compiler can generate:

    if (getuid() != 0 && false)
The result of this expression can never be true. But because the call to getuid() could cause side-effects (the prototype for getuid() does not include the GCC extension marking it as a "pure" function) the compiler has to emit a call to the function, but otherwise, it ignores the result since it doesn't matter what the result is. The resultant code is:

    int main(void)
    {
      getuid();
      printf("okay\n");
      return 0;
    }
And thus, it's possible to miss this even with the commonly used "-Wall -Wextra" options to GCC. To detect this, you will also need to add "-Wunreachable-code" (why that isn't included with "-Wall" is a good question).


> "gcc -ansi -Wall -Wextra -pedantic". NOT ONE ERROR OR WARNING!

  $ rpm -q gcc
  gcc-4.8.2-7.fc20.x86_64
  $ gcc -ansi -Wall -Wextra -pedantic test.c
  test.c: In function ‘main’:
  test.c:7:36: warning: the comparison will always evaluate as ‘false’ for the address of ‘geteuid’ will never be NULL [-Waddress]
         if (getuid() != 0 && geteuid == 0)
                                      ^
In fact, just -Wall is enough to enable that warning. Maybe the coverage of -Waddress has changed since your version?


Could very well be. I am using a slightly older version of GCC.


huh?

gcc 4.2 -Wall:

  warning: the address of ‘geteuid’ will never be NULL
gcc 4.8 -Wall:

  warning: the comparison will always evaluate as 'false' for the address of 'geteuid' will never be NULL [-Waddress]
       if (getuid() != 0 && geteuid == 0)

Also, as came up in the goto fail discussion, -Wunreachable-code was removed from gcc and doesn't warn on anything.


It appears that GCC has changed a bit since that article was originally written (2010). And I'm using a GCC that is a bit old.


It appears that you are spreading FUD. In a thread discussing whether one-line bugs could be written on purpose, it is only fitting to ask whether demonstrating gcc's failing by using a version older than 4.2 was also on purpose.

By the way, I'm interested to know which version you used.


How am I spreading FUD? By using an older version of GCC and reporting what I saw? Should I edit my post to remove all mentions of GCC and just leave how the code is interpreted as is?


It has to be before 4.2 (you never said which version you used), which is really old. And you went to all this trouble, but it never occurred to you that the compiler version could be a factor?

Sure it could be an honest mistake, but that's also true for Apple's bug (and the rest).


I suspect the author is parodying the reaction towards Apple's mistake, and bears these maintainers no ill will.


Regarding the Debian OpenSSL disaster, the question was brought up on the OpenSSL mailing list (and read there); regarding the Tarsnap bug, it is not an unbraced if but a non-incremented nonce. If two of the five comments in an article are obviously wrong, how much trust should I possibly put into the remaining 60%?


It's sarcasm. You've missed the rhetorical point of the article.

(To be more clear, each comment mimics the comments made about the "goto fail" bug, and the crucial point is not pure mockery or a defence of Apple, but rather the final paragraph.)


Probably a lot of people missed this point of the article. It certainly seemed odd to me.


If there is sarcasm or rhetorical point to the article, it is not made clear at all. This is just poor writing.


I agree with you, poor writing.

You were being voted down earlier, but not any more. So there are others who agree with you.


The point is that because there are always mistakes in everything, the NSA is never involved? Strange, because I'm sure they've been implicated more than zero times.

NOTE: I have no opinion on their involvement in the goto fail;


> The point is that because there are always mistakes in everything, that the NSA is never involved?

I'm pretty sure he's being sarcastic.


The logical negation of "there are rarely these easy mistakes in something, so the NSA must always be involved", is not "because there are always mistakes in everything, that the NSA is never involved".


Oh. Okay. Sorry.


Another thing they have in common is trying to reason about security of SSL certificate chains and other high abstraction concepts in terms of C pointers and function return values.


Maybe that will be my new way to measure the effectiveness of my test suite.

"Imagine the NSA snuck a one line 'fix' into your software overnight. Do your tests quickly and accurately detect the problem and point to the code that is broken? If not, your unit tests are broken."


If, say, the NSA has your codebase, surely they would be able to find something you haven't tested. Your test cases should be extensive, but it's impossible to make them exhaustive.


And even if you do, now you have a large and probably-buggy test base.


But what if they "fixed" your unit tests? Gotta test the tests!


Lint tools exist.

My number one takeaway from this is that we should all be using them more.


And ignoring compiler warnings less.


I only recently learnt C, and -Wall -Wextra -Werror was drummed into me as being "the right way". It certainly does catch a lot of my idiotic mistakes.


Unfortunately, if you're using anyone else's code (especially if it's multi-platform code), -Werror is sometimes not feasible. For one, lots of third-party code is full of warning-level mistakes, but most of them are not errors, and fixing them up would be a job for a whole separate company.

For another, sometimes warnings are there because the code generated by compiler macros isn't available to the compiler for the purpose of e.g. checking for unused variables/parameters or type range checking. For example, there are some Linux kernel macros that will ignore certain parameters on a lot of architectures; this will give unused-variable warnings on those architectures if the variable/param isn't used elsewhere. In other places, data types change between signed and unsigned based on architecture, meaning that an in-general-meaningful test for >= zero will give you a warning on platforms where the variable is unsigned.

Long story short, the preprocessor, while super-useful, is not good for static checking.
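One common workaround for the parameters-ignored-by-macros case is the cast-to-void idiom, sketched here (UNUSED is a conventional name, not a standard macro):

    #include <stdio.h>

    #define UNUSED(x) ((void)(x))

    void handler(int sig, void *ctx)
    {
        UNUSED(ctx);   /* silences -Wunused-parameter for ctx on every platform */
        printf("got signal %d\n", sig);
    }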


Although be aware that -Wall and -Wextra activate different warnings on different compilers and even on different versions of the same compiler. E.g., gcc added lots of warnings to them in recent versions.

So your code might not compile anymore on newer compiler versions. This is not a problem as such, but in combination with, for example, updates to your continuous integration environment, this might cause build failures.


Yeah warnings are out of control. New architectures and platforms bring up new things to be concerned about, thus new warnings. But old code quits working solely because of the build issue you mention.

Wonder what the solution would be.


Does C have the ability to locally silence warnings in particular sections of code?


Yes, but it's compiler dependent.

For GCC, see http://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Pragmas.html
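For example, suppressing a single warning around a specific stretch of code looks roughly like this (callback is a hypothetical function):

    #include <stdio.h>

    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wunused-parameter"
    void callback(int event, void *user_data)   /* user_data deliberately unused */
    {
        printf("event %d\n", event);
    }
    #pragma GCC diagnostic pop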


Useful, but no solution to the issue of new warnings. You'd have to be able to say "I just want THESE warnings" to guarantee code continues to compile.


Oh, thanks, I actually didn't know that! I'm only working on personal projects at the moment, but that's super useful to know.


"This is an important lesson to learn: Mistakes can happen any time a piece of code is modified."

From http://www.daemonology.net/blog/2011-01-18-tarsnap-critical-... regarding the Tarsnap bug.

Solution: write more unit tests.


Or use better abstractions / static analysis.


A lot of arguments center around "well, the compiler should have done X!"

I don't write a lot of C code, but when I did, and even more relevant, when I compile packages, the compiler tends to spit out chapters' worth of warnings that I just ignore. How accepted is compiling C code with -Werror in practice? Could it be the compiler did warn about goto fail, and the OP's error, but it was 1 warning out of a million? Really interested in an answer from those who work in C shops.


The project I work on (QEMU) uses -Werror in development but defaults it to off for releases. This means that warnings don't sneak into the codebase but end users don't have to deal with unnecessary build failures just because they're using a newer and pickier compiler or oddball host system.

Putting in the effort to get down to "emit no warnings" (and then enforcing that you stay there with -Werror) is worth doing exactly because you then do get new problems drawn forcibly to your attention, IMHO.


I write my code with -Wall, -Werror, -pedantic, and a standard specified. If my code does something questionable, I need to instruct the compiler that I really mean for it to do that. Why tolerate unsafe and ambiguous behavior?


One problem is that -Wall doesn't actually enable all warnings. And that's nuts.


You can enable -Wextra as well, but there are still more warnings you have to enable manually. At that point though, you're looking at weird not-necessarily-bad and language-specific things.


In my past experience writing and compiling C and C++ for radar software: the project was big, with a fair amount of legacy code, and there were a fair number of warnings in the C code. A lot of code was in Ada, which I grew to really appreciate, but for some things we needed code in C.

I started work on a new, from-the-ground-up simulator in C++. The 12+ coders got used to warnings-as-errors (-Werror). I think it made us better coders. We were able to do that because the project was new from the ground up. Most projects in the radar field use a lot of previously tested legacy code that does issue warnings.


In the case of Xorg, if they had warnings enabled, it was one warning among pages and pages of them. Recently, though, they cleaned it up; 1.16 and onward should compile without warnings[0].

[0] http://keithp.com/blogs/xserver-warnings/


It's probably putting out warnings because you're being sloppy. Maybe that's ok, maybe it's not, but cleaning up warnings is like washing your hands on the way out of the restroom, it's good hygiene.


1. Start writing tests. Tests will save your ass when you're merging a feature branch. Because: are you sure everything still behaves correctly after those ingenious refactorings?

2. Use an IDE and/or compiler that actually warns you about common human errors like unused variables, unreachable statements, and dumb mistakes like a case-block with silent fallthrough to the next case-block.

3. Acknowledge responsibility when writing security related code. Know that there's no room for mistakes.


With the X bug: building X has always triggered tons of warnings, so it's easy to see why they're overlooked. http://keithp.com/blogs/xserver-warnings/


OpenBSD credo: security bugs are just bugs.

Mostly human, and avoidable, but always present because people don't take the time to do things correctly. Quality, taking the time to make things right, slowly, correctly, matters more than genius. Quality code at every level is the key to secure development. Every little thing counts.

I know a few OpenBSD developers, and their song is about doing the right things, taking our time to do them correctly: simpler, with quality, because that is the shortest path to correctness and thus speed (what is the use of an incorrect program? none). This song is appealing. The fact that they are strangely psychopathic paranoid nerds is not ;)

I think the computer industry would be far better off if we slowed down and focused on writing less software, but better built.

Less functionality maybe, but what remains may be more powerful: a stable foundation for sustainable progress.

And, if you like them, even though they are repulsive antipathetic nerds, you can support them :)


careful there. we're not all the same -- some are MUCH more paranoid than others, and I have yet to meet an antipathetic developer. ;)

But thanks to those who support what we do!


To those learning C programming "by reading others' code", the following is an example of what not to do (either the subtracted or the added line).

[Now I wait for a deluge of replies about how it's "idiomatic" C. All I have to say to that is, s/ma// ]

  --- libc-a/memset.c
  +++ libc-b/memset.c
  @@ -1,6 +1,6 @@
  void *memset(void *_p, unsigned v, unsigned count)
   { 
       unsigned char *p = _p;
  -    while(count-- > 0) *p++ = 0;
  +    while(count-- > 0) *p++ = v;
       return _p;
   }
[Edited later to use the code mode]


I was going to complain about the 4 errors in 4 lines of code, until I realized that HN converts asterisk, text, asterisk into italicized text.

I assume you wanted your code to look like this:

    void *memset(void *_p, unsigned v, unsigned count) {
        unsigned char *p = _p;

    -   while(count-- > 0) *p++ = 0;
    +   while(count-- > 0) *p++ = v;

        return _p;
    }
In which case, I'm not sure what you're complaining about. Is it that memset here has a non-standard signature (v is supposed to be int, not unsigned)? Or that the "before" code ignored v? Or the un-braced loop body on the same line? Or is there some other problem I'm overlooking (which is possible, it's been about a decade since C was my main day-to-day language)?

[Edited to try to coax better text formatting from HN]


Thanks for the formatting. I fixed my comment too. As for the bad legibility of the code as written (not due to formatting), where do I start...

Oh I know, the writer obviously wants very very badly to write this while loop as a one-liner. That's the start of all the other badness. As a reader of this code now, I have to keep every possible combination of post-increment side effects and precedence rules in mind before this code makes sense.

And yes, I know code like this is quoted in books as an example of how cleverly you can write "powerful" one-liners. As far as I'm concerned, it's just a dumb show of alpha-coder pride, and it's not a surprise to me that a serious bug was buried here.


I think it's ok. I was going to say "He should have used memset instead of this low level memory walking that belongs inside the implementation of... oh I see". This is a critical function; doing it the proper "machine-workings, helping the compiler" way is more important than having a first read immediate understanding. Not that an experienced C programmer won't have it anyway.


What is particularly bad? How would you have done it differently? I've not done much C, so I can't see the issue, although is it perhaps the types? Would count be modified by side-effect in the calling context also?


Everything. The information density of the statement is enormous. Do you know the precedence of --, >, ++ and = by heart? Also, if you lose a space you are relying on compiler maximal munch to process (count-->0) as --, > and not -, ->. Did p get incremented or did *p get incremented? etc.

Finally, the same people who write that kind of code also write things like f(x++, (x+1)) without realizing that this is undefined in C.

Also, as an added bonus, postincrement/decrement sometimes creates extra unwanted copies in C++ so you need to use the pre- versions.

Compilers are good enough to handle tight loops like this even on very low optimization settings. Better to write that stuff out.

I have found that as I program more, I kind of agree with Crockford that ++ and -- are evil and should always be replaced by p += 1. What that does is force you to avoid actions in conditional expressions and pull the mutators inside the block. It also forces you to be explicit about when you expect the increments to happen.
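To make that concrete, the memset loop from elsewhere in the thread, rewritten in that style, might look like this (a sketch, not the actual code):

    void *memset(void *_p, unsigned v, unsigned count)
    {
        unsigned char *p = _p;
        while (count > 0) {
            *p = (unsigned char)v;   /* the assignment stands alone */
            p += 1;                  /* mutators pulled out of the condition */
            count -= 1;
        }
        return _p;
    }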


> Do you know the precedence of --, >, ++ and = by heart?

> Also if you lose a space you are relying on compiler maximal munch to process (count-->0) as --, > and not -, ->.

Those are fundamental things that should be second-nature to you if you claim to really know C, just as a mathematician would know that exponentiation has higher precedence than multiplication which has higher precedence than addition.

> Did p get incremented or did *p get incremented?

There's an easy and concise mnemonic for the precedence of ++/-- and dereference - just remember that this is a strcpy, as shown in K&R:

    while(*dst++ = *src++)
      ;
So here it is the pointer that is incremented, and not what it points to.

> Finally, the same people who write that kind of code also write things like f(x++, (x+1)) without realizing that this is undefined in C.

Those are the people who think they know C, but actually don't; it's an issue of not knowing what you don't know, and in such situations you should either seek to find the answer, or be conservative.


For any junior engineers keeping score at home:

NEVER WRITE THIS CODE! EVER!

This is the code that gives us the classic buffer overflow vulnerability all because checking for a length isn't elegant.

Also, how does this code work if what one of those pointers points to is declared volatile? What if both pointers point to volatile things? What happens if one of those pointers crosses through 0 because it rolled over from the max?

Thanks. I doubt you could have provided a better example of the precise problem.


> This is the code that gives us the classic buffer overflow vulnerability all because checking for a length isn't elegant.

The few characters it would take to check length didn't affect this API; it was just part of the decision to use null-terminated strings instead of storing lengths.

>Also, how does this code work if what one of those pointers points to are declared volatile? What if both pointers point to volatile things?

I can't imagine any way in which volatile would cause problems. What are you pondering? strcpy doesn't take volatile pointers anyway.

> What happens if one of those pointers crosses through 0 because it rolled over from max?

That's just a very specific type of buffer overflow. strcpy is the wrong function to use in such a scenario.

> Thanks. I doubt you could have provided a better example of the precise problem.

strcpy is for very specific situations. Writing it in a simple way isn't the problem. It's not like it's even theoretically possible for strcpy to avoid overflowing. It doesn't have enough information, only two pointers.

In short, this is a bad API, it is not a bad implementation.
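For contrast, a rough sketch of a size-aware interface in the spirit of OpenBSD's strlcpy (simplified: the real strlcpy returns strlen(src) so callers can detect truncation, and the name here is made up):

    #include <stddef.h>

    /* Copy at most dstsize-1 bytes and always NUL-terminate (if dstsize > 0). */
    size_t copy_bounded(char *dst, const char *src, size_t dstsize)
    {
        size_t i = 0;
        if (dstsize == 0)
            return 0;
        while (i + 1 < dstsize && src[i] != '\0') {
            dst[i] = src[i];
            i++;
        }
        dst[i] = '\0';
        return i;   /* bytes copied, excluding the terminator */
    }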


> I can't imagine any way in which volatile would cause problems. What are you pondering?

Termination condition. Which data applies and does it have to be reread after the assignment?


It is read once and written once, and the value that was read is the value of the expression.

Note that this "elaborated" version,

    while(*src) {
      char c = *src;
      src++;
      *dst = c;
      dst++;
    }
if the pointers were volatile, would actually read the source value twice.


The original while loop is functioning more like a do {} while() loop and the c is declared outside.

  char c;
  do {
    c = *src;
    src++;
    *dst = c;
    dst++;
  } while (c != '\0');
But, even here, people reading the code can reason about the volatiles without having to go look up the standard.

And don't even get me started about compiler implementation of volatile.


So your conclusion is that the original code is more logical with regard to volatile pointers, even though calling it with one would mean casting away the volatile and basically inviting disaster?

No coding style is immune to problems caused by changing the number of reads to a variable. Volatile is an exceptional thing that must be treated with care and heavy documentation. It has no relevance to the quality of a normal-code strcpy.


> This is the code that gives us the classic buffer overflow vulnerability all because checking for a length isn't elegant.

Or rather, buffer overflows arise because someone did not think of lengths, and somehow this mentality became ingrained in a rather large number of programmers; this code is perfectly fine if the length of the source string is always less than or equal to that of the destination. I think saying "never use X" is almost always a bad idea, since it discourages reasoning, and not thinking about lengths is what led to this problem in the first place.

On the other hand, "never use gets()" is more appropriate, since whoever came up with that function clearly never thought of lengths...


"Those are fundamental things that should be second-nature to you if you claim to really know C"

That's a bogus argument. There's an easy way to write code that's readable and maintainable by mixed teams and then there's this macho way of showing off to no particular advantage.

> just remember that this is a strcpy, as shown in K&R: while(*dst++ = *src++) ;

Or -- just don't. Just write code that doesn't rely on every reader to mentally clear through a dense thicket of precedence and side-effects to understand your strcpy and get your job done in a pragmatic fashion with 4 lines of easily understood C code instead of 1 line that makes your heart glow with momentary pride.


The problem here is that the bug is not in the "dense" part - it's more likely that whoever wrote that code thought memset was always used to set blocks of memory to 0, and as for why it wasn't discovered sooner: it wasn't being used for anything else! Someone with that misunderstanding could've just as well have written what you consider more "readable" code, but with the same bug:

    *s = 0;
    s++;


You're right, the bug is not in the dense part, but adjacent to it. My argument was that the extra cognitive overhead of digesting this complexity leads the original authors and reviewers of the code to miss other obvious bugs.

One could reasonably speculate that if a code-reviewer had sent the original code back to be written in a readable fashion, the actual bug in the code would have been caught too.


As far as I'm concerned, that while expression is simpler than any possible three-statement for loop could be. Similarly, as long as the line is short, I have to dedicate less thought to tracking a single-line bracketless loop than a multi-line bracketed loop. (But don't use multiple lines without brackets, that makes it too hard to find the start and end and slows me down.)

I can see your point about *p++ being a speed bump, but overall this version seems less complex to me.


Not that dense. Difference is professional vs hobbyist. Don't imagine the professionals are going to write down to the hobbyist.


Sorry our definitions of what constitutes a "professional" are entirely different. The best teams I've worked in take great care to avoid needless complexity in code. Readability and maintainability take precedence over everything else... (even performance unless proven otherwise by measurement).

Given the kinds of projects I've been part of are already swarming with (externally imposed) system complexity to begin with, an attitude of "I write dense code since I'm a professional" won't get anyone very far in such teams.


I'm sorry if that seems dense to you. That's pretty much beginner C stuff; after years of writing it, it's absolutely transparent to an experienced programmer. I'm astonished at the vitriol; every language takes some learning, whether it be lambdas or annotations or declarations or whatever; if it seems opaque to you, perhaps some reading is in order.


"That's pretty much beginner C stuff"

And yet we started off discussing a serious bug in libc discovered in exactly that kind of code. That's the libc in Android!! My argument is that the extra cognitive overhead of digesting needless complexity leads the original authors and reviewers of the code to miss other obvious bugs.

> I'm astonished at the vitriol

I'm sorry if my straightforward and respectful response to your dismissive and patronizing remarks sounds vitriolic to you.


The fact that somebody is writing libc code says nothing about their skill; just the opposite, because who wants to be in charge of reimplementing tired old library routines? The new guy, I would imagine.

This bug is simply a failure to test. Where was the trivial automated unit test for this library routine? Never written I imagine.

Reminds me of another buggy library I had to work with: the Linux C runtime. I was porting to a RISC processor oh, ten years ago. Something was wrong with the memcpy routine. So I wrote a unit test. A very thorough unit test. Found 12(!) bugs before I was through.

Test was: copy from aligned source + (0..128) to aligned destination + (0..128) length (0..128). Simple triple loop. Anybody could have written it. Nobody ever had.

This is not a C problem, or an Android problem. Its an industry problem. Intel shipped a Pentium processor once upon a time that couldn't divide by 10.
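For the curious, that triple loop might look something like this (a sketch; the buffer sizes and the guard-byte checks are my own choices):

    #include <assert.h>
    #include <string.h>

    void test_memcpy(void)
    {
        unsigned char src[512], dst[512];
        unsigned s, d, len, i;

        for (s = 0; s <= 128; s++)
            for (d = 0; d <= 128; d++)
                for (len = 0; len <= 128; len++) {
                    for (i = 0; i < sizeof src; i++)
                        src[i] = (unsigned char)i;
                    memset(dst, 0xAA, sizeof dst);

                    memcpy(dst + d, src + s, len);

                    /* the copied region matches the source... */
                    assert(memcmp(dst + d, src + s, len) == 0);
                    /* ...and the bytes around it were not touched */
                    for (i = 0; i < d; i++)
                        assert(dst[i] == 0xAA);
                    for (i = d + len; i < sizeof dst; i++)
                        assert(dst[i] == 0xAA);
                }
    }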


Apology accepted. I didn't mean you; there is plenty of vitriol elsewhere in this thread.


A "professional" would never write that sort of dense C code.


Again, that's normal C code. Maybe you don't like the language; maybe you don't want to learn it. But it's absolutely common, and you won't get very far without mastering that stuff.

Would we criticize assembler because it's full of confusing registers and stuff? Sure, go ahead; but it sounds silly.


"Would count be modified by side-effect in the calling context also?"

Questions like this and others (e.g., does the pointer deref operation apply to the incremented or non-incremented value of p?) have to be answered before you can make sense of the code. The cognitive overload on the reader's brain is at least part of the reason that a bug like that went past code review in the memset function on a major platform. MEMSET!! Was the needless cleverness and terseness of the code worth it?

P.S. The answer to your question is no. The variable count as used in the body of this function is local to this function (it's passed by value... so it's a copy of what the caller passed).


Did you just argue that all parameters that will be modified should be copied into new variables?

You have an interesting take on C.

And I don't see what's 'clever' about decrementing a count instead of incrementing a count.

It would be just as easy to assign 0 in a larger function, and I see no way of reducing the cognitive load. This line isn't doing anything complex or weird.


> Did you just argue that all parameters that will be modified should be copied into new variables?

Can't tell if this question is referring to something I said? If so, I'd like to know what I said that implied such a thing.


You seemed to be labeling the uncertainty of whether modifying count affects the caller 'needless cognitive overhead'.

The only way to avoid altering the count parameter is to make a new variable for counting.


Well, I re-read that comment, and it does sound like I'm lumping the call-by-value question in with the other needlessly complex stuff (mainly the single-line while loop depending on multiple operator precedence issues). Let me correct myself: I didn't mean to do that.

The single line while loop there is just bristling with pitfalls. I'd reject such code in code review and ask it to be rewritten in more readable C. As an added bonus, this exercise would probably also have caught the original bug that started this discussion.

The pass-by-value question came from someone who hadn't written C code in years, and I'm pretty sure a practicing C coder (in production code) would not have that question in the first place. Or if it does come from a fresh programmer just getting started, it'll only take one code review to get that straight (and probably earn that programmer's work some added scrutiny for some time).


Cheers mate, ah, so unless the function declares that it takes a pointer then it's pass-by-val?


Everything is pass by value in C, but pointers are values. Functions can't modify the value of a pointer, but they can modify whatever's in the memory pointed to by the pointer. To change the pointer itself, you'd have to pass a pointer to a pointer. Functions like asprintf do this.

C++ added pass by reference (&), which can be a handy bit of syntactic sugar.
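A minimal sketch of the pointer-to-pointer pattern (alloc_greeting is a made-up example, not asprintf itself):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Writes a new pointer into the caller's variable through char **. */
    int alloc_greeting(char **out)
    {
        *out = malloc(6);
        if (*out == NULL)
            return -1;
        strcpy(*out, "hello");
        return 0;
    }

    int main(void)
    {
        char *msg = NULL;
        if (alloc_greeting(&msg) == 0) {   /* pass the address of the pointer */
            printf("%s\n", msg);           /* msg itself was changed */
            free(msg);
        }
        return 0;
    }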


"so unless the function declares that it takes a pointer"

What you're saying is correct in the practical sense. But ggreer's answer is the real, technically detailed explanation (everything is passed by value, including pointers to memory; but then you can manipulate the memory you have the pointer to, and that makes it effectively pass-by-reference). Also, C++ adds a language-defined pass-by-reference operation... so there's that.


Actually, with the Debian SSL "fix" that removed entropy, the dev who made the change did ask about it on IRC. Though a bad mistake, it's understandable how it happened.


This one may not be as bad as the ones in the article, but it still gave us a few weeks of work upgrading kernels on a couple of thousand production servers: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.g...


In the example of the Apple bug, I've read it took eighteen months (!!) to discover it after it had been committed to the source code base. As a consequence, I would be interested in knowing, for each bug, the time it took to discover it.

Also, I assume that once discovered, the time to fix it is usually negligible. Usually. And as long as you don't count the time for full deployment in production...


i'm actually a little surprised about the openssl one. I always assumed it would be good practice to number error codes negatively. OTOH that's also kinda expected behavior for reverse engineers. if the return value is negative you can be almost certain it's an error code

i guess whoever wrote that line thought the same.


I've come across a bug that was probably caused by a developer who held the Ctrl key while pressing the up arrow, which caused the selected line to move above another.

It luckily only took me a full day to find within the codebase...


"Never write your own security code, because you'll get it wrong. Leave it to the smart people."



