Hacker News new | past | comments | ask | show | jobs | submit login
Error handling style in C (pixelstech.net)
42 points by sonic0002 on April 24, 2012 | hide | past | favorite | 42 comments



A forward jumping goto (to a single target inside a function) is just perfect for C error handling code. Don't be misguided by a silly principle of goto's being always bad. They get the job done in the cleanest possible way, so you should use them for doing cleanups.

The examples did not have any resources to clean up, and that is what makes error handling in C painful. In the absence of any cleanup routines, this will do:

  return (
    do_something() == SUCCESS &&
    do_something_else() == SUCCESS &&
    do_final_thing() == SUCCESS) ? SUCCESS : FAILURE;
Of course, once you add resources to clean up or error codes that are meaningful (not just success/fail) error handling gets more painful.

You should not try to perfect something as mundane as error handling. Just write the damn code and get over it.


Why should the goto be to one single target? Multiple goto statements are good for multiple clean ups without adding indentation levels and without having artificially long logic ands. For example:

    int init_abc()
    {
        if (!init_a())
            goto err_a;
        if (!init_b())
            goto err_b;
        if (!init_c())
            goto err_c;
        return 1;

     err_c:
        cleanup_b();
     err_b:
        cleanup_a();
     err_a:
        return 0;
    }
seems to be the cleanest way to do what it does in C. For what it's worth, it is the way a lot of error handling is done in the Linux kernel.


I guess it's fine to use multiple targets too. However, usually you can get away with one, because free(NULL) and similar cleanups tend to be no-ops. So you have something like:

    char *foo = 0, *bar = 0;
    if((foo = malloc(X)) == NULL || (bar = malloc(Y)) == NULL)
      goto cleanup;
    make_me_millions(foo, bar);

  cleanup:
    free(bar);
    free(foo);
In this case, and many cases like it, there's no need to have two jump targets, because one is good enough. You'll have to declare the variables early on anyway to avoid warnings/errors from definitions that cross jump labels.

So there's probably nothing wrong with multiple jump targets but that might not be needed with well-behaving cleanup functions.


because free(NULL) and similar cleanups tend to be no-ops. So you have something like

You really need to check the specification on each function. free is defined that free(NULL) is no-op, but there are other things where that is not the case. Also, that code is not portable since NULL does not have to be 0.


Just a correction: the code is actually perfectly portable. The integer constant 0 is the canonical definition of the null pointer by definition in the standard (See Section 6.2.2.3 "Pointers" in C89). The null pointer constant (NULL) is defined primarily for convenience (so a reader knows you mean a null pointer instead of a arithmetic zero). Of course, the bitwise representation of the null pointer need not be all-bits-zero; that is, NULL = (void )0 != ((int *)&0).


I stand corrected, it does define the integer constant 0 to be promoted to the null pointer. In C11 this is Section 6.3.2.3. To make it even more confusing Section 7.19 Common Definitions <stddef.h> defines NULL to be an implementation depefined null pointer constant.

I was always taught that (void *)0 is a valid definition of NULL, but that 0 was not necessarily.


> You really need to check the specification on each function.

Sure, but I need to do that anyway and constantly, since I use far too many API's to memorize. It's also convenient to make cleanup(NULL) a no-op for the API's you create.

NULL is zero. Even if the standard would allow it to be something else, it's not relevant with the compilers and platforms I use. There's no point in being a language lawyer just to achieve "portability" to some imaginary platform where a byte is 7 bits and NULL == 0xdeadbeef. Make a list of compilers and platforms you support and forget the rest if you want to get stuff done.


I'm actually curious how the compilers would handle this. The compiler may inline the gotos so it's really:

  int init_abc()
  {
      if(!init_a())
          return 0;
      if(!init_b())
          cleanup_a();
          return 0;
      if(!init_c())
          cleanup_a();
          cleanup_b();
          return 0;
      return 1;
  }


Good points - it also a lot easier to debug what's being returned by the function as you have fewer breakpoints to set in gdb.


Why goto is bad? Again this "structured programming" bullshit. Kernel uses 2 a lot to handle errors like here [1].

[1] https://github.com/torvalds/linux/blob/master/mm/shmem.c#L99...


THANK you. I was thinking the exact same thing. This is seen all over the kernel, and I found it to be a very readable, simple solution, and a very good use of goto.


In general, gotos are bad. For error handling in C, they're perfect. Just don't jump backwards in code.


Jumping backwards is fine, too. It's okay to ignore Dijkstra when he is wrong.


Dijkstra justified his paper against gotos with reasoned statements that make a lot of sense.

Unstructured goto's that do not form simple loops or skips are going to make the position in code a much more complicated notion. And the position in code is crucial to understanding the meaning of variables, for example.


How dare you imply that programming could possibly be subtle or nuanced or require thought? The very idea that our entire field of expertise could not be adequately conveyed by an eight-page Microsoft Word document of absolute endorsements and prohibitions—laughable!


C already has a command to jump backwards in a loop. Continue.


Just because linux kernel use goto's doesn't mean that goto isn't bad. We need another argumentation.


I think that this[1] email thread between Torvalds and various other kernel developers sums up the use of goto in C the best.

I find that those who think that any given programming concept is "inherently" bad to be dangerous, especially if they were just taught that way.

[1] http://kerneltrap.org/node/553/2131


Here's the error handling I teach for C:

http://c.learncodethehardway.org/book/learn-c-the-hard-waych...

It's using the gotos, but I make macros that hide the actual word "goto" away so that idiots (who think goto is bad because of a ranting "paper" written decades past) will not notice it.

That and I print out error diagnostics, the location of the error at the time, and the errno. These macros probably saved me a good decade of time finding errors in C.


>That and I print out error diagnostics, the location of the error at the time, and the errno. These macros probably saved me a good decade of time finding errors in C.

Line number and file macros are freaking awesome. Out of everything on that page those should be the major takeaway for the reader.

I also think it would be a good idea to add some context to why the goto is so useful in those examples. If any of the checks fail, you're in a full-stop, stick-a-fork-in-us-we're-done error state. Since you know that the function's operations have failed, and that you should be cleaning up everything immediately, goto is the right choice.


gcc/g++ has had __FUNCTION__ and __PRETTY_FUNCTION__ since forever (the first is just the function name; the 2nd in c++ has the argument types, class name and more) - might be a useful addition to your toolkit. It's easy to trace back from line numbers to file/function etc, but I find that the function name is actually more informative for me -- and is easily incorporated into macros such as yours at no extra cost.

C++11 and the latest C have standard equivalents for these, but I'm too lazy to look them up right now..


It lacks 4th method that I find the best: with long jumps. I mean for example:

  jmp_buf errbuf;
  int result;
  if( !( result = set_jmp ) )
  { 
      /* Code to do on fail. Result may be some error code */
  }
  else
  {
      someaction( par1, par2, ..., errbuf);
      anotheraction( par1, par2, ..., errbuf);
      /* Etc. */
  }
errbuf might be global if you prefer, but I'd rather avoid them. When something is wrong called function calls longjmp(errbuf, errorcode).


Unfortunately longjmp() is not available in the Linux kernel, so I integrated the BSD implementation of it into my library/kmodule for this purpose: https://github.com/haberman/upb/blob/master/bindings/linux/s...


A longjmp is just a fancy goto. If it's all local to the function, just use goto.


That's more than goto:

1. It can be called from any place in code.

2. Its call can be written in function. You don't write so much ifs whenever you call it.


There's the 5th method that I tend to use:

  if (!init_stuff(bar)) {
    return FALSE;
  }
  
  if (!do_the_thing(bar)) {
    return FALSE;
  }
  
  return TRUE;


I do the same thing with my code, but I think it clutters it up and it's hard to follow the logic this way with clean up code all over the place.


this is correct and why the original post is all wrong. what is calling foo()? it should be making sure foo() is happy, and if not, then continue your logic from there.

that is how you handle errors in C.


It's pretty hard to make a good argument against the forward goto method. I have found it handy to refer those who want to continually argue this point to the following CERT article:

https://www.securecoding.cert.org/confluence/display/seccode...


As with anything else, a balance of these is most useful.

If there is nothing to clean up, error checks and early exit. But only if it is one nested level deep. Any more and someone reading your code may miss it.

When there is a stack of things to be cleaned up use multiple goto targets at the end and fall through them when everything is ok.

Try to keep the nesting to a minimum. I used to indent my C with 2 spaces. Now that I use mainly use Python I have switched to 4 spaces in C and it helps keep things readable and reminds me not to get too deeply nested.

Mix and match as needed. But I think the most important thing is to keep the nesting to a minimum, that makes it much easier for those who come after you to maintain things. If you must have deep nesting or complicated logic at least make sure you do a good job of commenting what it is supposed to be doing.


The Page seems to be down now, here is the google cached version http://webcache.googleusercontent.com/search?q=cache:pixelst...


Dijkstra wrote a reasonable argument against use of a prevalent structure at the time, and titled it "a case against the goto statement". The editor of journal, Niklaus Wirth (of Pascal, Module and Oberon fame) changed the title, and the rest is history.

Historical context is VERY important: At the time - 1968, goto was often used as the main form of control. "break" and "continue" were not universally accepted as a good idea - in fact, the idea of block scopes for if/then/else wasn't universally accepted!

e.g. in the Fortran of the day, the IF statement was followed by up to 3 numbers: line numbers to go to if the IF argument was positive, zero or negative. Wanted to break out of a loop? locate a suitable number and jump to it. That was also the case for BASIC, APL, and other languages of the day. Note: line numbers, not descriptive labels (which were a luxury added to most of these languages some 20 years later). That was a big part of the problem.

The result was a lot of what became known as "spaghetti code" - even code that had no conditionals would jump back and forth to the point that identifying an unconditional thread was similar to pulling one spaghetto out of a lump of spaghetti.

Goto is not inherently bad - but at the time, its use was (in the wild) more often than not problematic. With better language semantics -- chiefly block scoping and break/continue style loop exits -- there became much fewer reasons to use goto, and it became unfashionable to use it at all.

But it still has its place:

a) breaking out of multiple scopes when the language does not natively support it. (perl, java do; c, c++ don't)

b) one way switching to a scope that doesn't naturally nest with other scopes, such as in the case of error handling. Exceptions can sometime provide a good way to do that without goto, but not always (and not every language has them)

c) efficiently threading code. The python interpreter got a boost of ~20% a few years back IIRC by the introduction of a (computed) goto alternative to the plain C "switch" decoder. One could argue that it's an optimization the compiler should do, but apparently no compiler did at the time (And I am not sure one does today).

"goto" is not inherently evil; like C macros, it solves problems that the underlying language did not get to solve in the "most proper" way. You could eliminate them for having evil uses, but you better address all the use cases, or the language expressiveness will suffer. As far as I know, no language -- including LISP -- has made it always simpler to NOT use goto.


the goto style. you can escape goto by the following trick:

  do
  {
  if (!do_something( bar )) {
                break;
        }
        if (!init_stuff( bar )) {
                break;
        }
        if (!prepare_stuff( bar )) {
                break;
        }
        return do_the_thing( bar );
  }while(0);
  return 0;


The “goto” style is better; it avoids excessive nesting and abusing “do { … } while (…)” for something wholly unrelated to looping. You want a jump, use a jump—it’s okay. “goto” lets you unwind from any number of errors without undue nesting or repetition:

    int foo(int bar) {

        if (!do_something(bar))
            goto do_something_error;

        if (!init_stuff(bar))
            goto init_stuff_error;

        if (!prepare_stuff(bar))
            goto prepare_stuff_error;

        return do_the_thing(bar);

    prepare_stuff_error:
        unprepare_stuff(bar);

    init_stuff_error:
        deinit_stuff(bar);

    do_something_error:
        undo_something(bar);

        return 0;

    }


Yup, jumps are the best way to correctly unwrap things (e.g. if you acquired resources), and if you're using it when you need to you might as well use it everywhere so your codebase is consistent.

Although I believe your jumps are in the wrong place logically speaking: if do_something() errored out, logically speaking you should not have to undo_something because do_something cleaned up its crap. So the jumps should be after the cleanups (and you should never need to unprepare_stuff).

Just as the opening of a file is not in a protected scope (if opening the file fails, you don't close it)


You’re right, you shouldn’t need to clean up after the first failure. I was thinking, say, new_a() succeeds and new_b() fails: you don’t need to free_b() but you definitely need to free_a(). The calling function is responsible for coordinating error handling amongst those functions it calls.

“…you should never need to unprepare_stuff”

I was just poking fun at the useless names.


The second one (using GOTO) is actually fairly common:

http://vilimpoc.org/research/raii-in-c/

Despite the evil that can be made of GOTO, it has it's uses.


a related question--how do people generally build up decorated strerror messages? e.g. in some versions of ls, "ls foo" (when foo doesn't exist) will print "ls: stat: foo: No such file or directory". how is that string built up?


Look at the man page for err(3) and related functions.


The site appears to be down. Does anyone have a mirror?



goto :3




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: