

A Manifesto For Error Reporting - andyjpb
http://www.drmaciver.com/2013/03/a-manifesto-for-error-reporting/

======
jerf
I'd expand on one of his points and observe that one of the core problems is
that they are called _exceptions_ , which prejudices the discourse in advance.
They _aren't_ exceptions. They are a way of declaring a handler that is scoped
to receive certain types of objects on a second control plane (beyond normal
control flow), which when invoked, destructively unwinds the stack until a
matching handler is found or the program terminates due to falling off the
top.

That is what they _are_. One of the things this mechanism is _used for_ is
exceptions, but that is a separate concept. It is possible that exceptions are
generally a good idea, but that is the wrong mechanism to use for them (see,
for instance, Lisp conditions & restarts for a good argument to that effect
[1]). It is also possible that exceptions in general are a bad idea and it is
better to use inline error codes rather than the separate control plane _but_
that there is some other valid use for this second control plane (see all the
non-exception uses of "exceptions", and note this is not hypothetical; see
Python's StopIteration exception [2]).

Conflating the feature with the most popular use just leads to lots of
confusing debates with people talking past each other.

[1]: <http://www.gigamonkeys.com/book/beyond-exception-handling-conditions-and-restarts.html>

[2]: <http://www.python.org/dev/peps/pep-0234/> , use Find to search for "It
has been questioned" for a direct response to this issue
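
To make the StopIteration point concrete, here is a minimal Python 3 sketch of roughly what a `for` loop does: it keeps calling `next()` until `StopIteration` arrives on the second control plane. The helper name `drain` is made up for this illustration.

```python
def drain(iterable):
    """Collect items the way a for loop does, with StopIteration made visible."""
    it = iter(iterable)
    items = []
    while True:
        try:
            items.append(next(it))
        except StopIteration:
            # Not an error: just the signal that the iterator is exhausted.
            break
    return items
```

Nothing went wrong when the loop ended; the mechanism was simply used for ordinary control flow.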

~~~
RickHull
> _I'd expand on one of his points and observe that one of the core problems
> is that they are called exceptions, which prejudices the discourse in
> advance. They aren't exceptions. They are a way of declaring a handler ..._

I can't figure out what you're referring to with _they_. I am guessing wrapped
exceptions, but also maybe the example where nil was passed to the constructor
instead of a hash?

~~~
jerf
Exceptions are called exceptions. But they really aren't. They're two concepts
conflated together.

~~~
RickHull
> _Exceptions are called exceptions. But they really aren't. They're two
> concepts conflated together._

> _They aren't exceptions. They are a way of declaring a handler ..._

Hm, if that's the case, then you described the handling of exceptions, which
is distinct from the declaration of the exceptional case as well as the act of
throwing or raising the exception.

I feel like your comment unhelpfully adds mystique rather than clarifies. I
may be holding you to an "unfair" standard though, based on your comment
history, of which I am an enthusiastic fan.

~~~
jerf
It's hard to say "exceptions aren't actually exceptions" without essentially
being guilty of equivocation in advance. The important part of my message was
where I broke it down in two pieces, neither of which I call exceptions in a
desperate and apparently failed attempt to avoid further confusion. Sorry.
There's the control flow construct, and there's the use of the control flow
construct for handling certain types of errors, and the two get bundled
together under one word in a way I think is misleading.

------
mcherm
> If you catch an exception, I need to know about it unless you’re really
> goddamn sure I don’t

I don't think I agree with this one. We agree that when a piece of code
encounters a problem it should trigger an exception with stack trace and
useful information. But you seem to believe that when an exception handler
catches an exception it should log it, then do something about it. I think
that exception handlers should do one of two things:

(1) Catch the exception, add additional information that the lower-level
function didn't have access to (eg: what file was being processed when it
occurred), then re-throw the exception.

(2) HANDLE the problem somehow.

(Anything else and you just shouldn't handle the exception.)

Now, (2) has several possible things. Perhaps if the web service is down we
can fall back on using the cached values from the database -- that's an
example of FIXING the problem. Perhaps the value isn't really needed and we
can leave it out -- that's AVOIDING the issue. And the most common is to show
an error message to the user in one form or another -- that's REPORTING it.

If you REPORT the problem, I believe you should always output the exception
and stack trace someplace. But if you FIX or AVOID the problem, then it may or
may not be appropriate to log it. A FIX or AVOID situation that occurs quite
rarely is probably worth logging; one which occurs under normal circumstances
(the web service goes down for maintenance for several hours each week) may
only need a counter in some admin console.

(PS: using exceptions as control flow is an extreme case of FIX -- reaching
the end of the list was an exception that is FIXED by moving on to the next
piece of work.)
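
A rough Python 3 sketch of the two options, under stated assumptions: `ServiceDown`, `parse_lines` and `current_price` are hypothetical names, and the cache is just a dict. The first function adds context and re-raises; the second FIXES the problem by falling back on a cached value.

```python
import logging

logger = logging.getLogger("example")


class ServiceDown(Exception):
    """Hypothetical error raised when the web service is unreachable."""


def parse_lines(filename, lines):
    """Option (1): add context the lower level did not have, then re-raise."""
    results = []
    for lineno, line in enumerate(lines, start=1):
        try:
            results.append(int(line))
        except ValueError as exc:
            # int() knows nothing about files or line numbers; we do.
            raise ValueError(f"{filename}:{lineno}: bad integer {line!r}") from exc
    return results


def current_price(fetch_live, cache):
    """Option (2), FIX: fall back on the cached value when the service is down."""
    try:
        return fetch_live()
    except ServiceDown:
        # Expected now and then, so a quiet log line (or a counter) is enough.
        logger.info("price service down, using cached value")
        return cache["price"]
```

Whether the FIX branch should log at all is exactly the judgment call described above.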

~~~
DRMacIver
Yes, in retrospect I'm not sure I agree with this one as strongly as I wrote
it. Perhaps better to say that you should log by default.

But that being said, problems that the code needs to fix may be symptoms and
you may need them for debugging. I think they're worth logging more often than
not. Maybe just less noisily than your normal error reporting.

------
gingerlime
I try to apply this to my logger statements. For example:

    
    
        unless signals.has_key? key.to_sym
          logger.error("wrong signal received: #{key.inspect} not in #{signals.inspect}")
          raise ActiveRecord::RecordNotFound
        end
    

Classifying what's an `error`, `warning` and `info` can be confusing
sometimes, but I have quite clear guidelines, and it helps to better deal with
errors overall.

error/fatal - anything that I simply can't recover from. e.g. if a parameter
is missing, there's no way to even guess. error logging is almost always
accompanied by an exception being raised. (btw, on our rails app we use the
logging-rails[1] gem, that emails those errors to us)

warning - something seems wrong, but we can somehow still continue, or it's an
error that I don't want an email about. For example, blocking spam on a form
submission.

info - useful stuff to know what's going on with the app. User registered /
logged in, payment received. Those are also sent to graphite for measuring

debug - all the other stuff you need when writing code.

[1]: <https://github.com/TwP/logging-rails>
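
The same guidelines, sketched with Python's standard `logging` module rather than Rails. The `SIGNALS` table and both functions are hypothetical; only the choice of level per situation is the point.

```python
import logging

logger = logging.getLogger("signals")

SIGNALS = {"start": 1, "stop": 2}  # hypothetical lookup table


def lookup_signal(key):
    if key not in SIGNALS:
        # error: cannot recover, so log the offending value and raise
        logger.error("wrong signal received: %r not in %r", key, sorted(SIGNALS))
        raise KeyError(key)
    # debug: detail that only matters while writing code
    logger.debug("signal %r resolved", key)
    return SIGNALS[key]


def submit_form(fields):
    if fields.get("honeypot"):
        # warning: something is off, but we can carry on
        logger.warning("blocked probable spam submission: %r", fields)
        return False
    # info: useful to know what is going on with the app
    logger.info("form accepted from %s", fields.get("email"))
    return True
```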

~~~
adrianmsmith
Yes I've thought about this topic as well.

All loggers in all languages have these different levels. But rarely is it
defined which to use when! It's considered "obvious" but, unless it's actually
defined, each programmer will find it obvious in a different way. Reading the
logfile in production or using rules to only display logs beyond a certain
severity won't be useful if every piece of code uses different levels to mean
different things.

Here are the rules I came up with:

<http://www.databasesandlife.com/which-log-levels-to-use-when/>

~~~
gingerlime
looks like we have the same idea of which levels make sense. It's surprising
how many developers don't bother thinking about it or realising the importance
of logging in general, and consistent logging in particular.

------
henrik_w
Lots of good advice. I am constantly amazed at how many error messages don't
contain any dynamic information (as mentioned in the article - what the
offending value was, and why it was wrong).

One of the best fixes is for developers to have to spend time debugging and
troubleshooting. If nothing else, it teaches you the importance of good
error/log messages.

------
Roboprog
God knows I've seen enough "two year old" error messages: "I don't like it!
<spits out>". Well, what _would_ you like, you sniveling little diaper wetting
sot of a program?!?

I could not agree more with his comment about "Bad value {X} should be ...".

Oh, and the part about the amazing disappearing stack trace -- I've seen way
too much "print e.toString()" which discards all that wonderful "where"
information.
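
The Python 3 analogue of `print e.toString()`, as a small sketch with made-up helper names: `str(exc)` keeps only the message, while `traceback.format_exception` keeps the "where".

```python
import traceback


def describe_badly(exc):
    """Like print(e.toString()): the message only, no location."""
    return str(exc)


def describe_well(exc):
    """Message plus the full traceback attached to the exception."""
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))


def broken():
    # Deliberately fails so there is something to report.
    return {}["missing"]
```

Calling `broken()` inside a `try` and passing the caught exception to each helper shows the difference: only the second output names the function that failed.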

~~~
moe
_disappearing stack trace_

In many (most?) OO-languages it takes a surprising amount of gymnastics to
properly chain exceptions. Especially if you're a library-author depending on
other libraries.

The blame goes squarely to the language designers here. This feature should be
baked into the core of _every_ language because adding it with a 3rd party
library is far from trivial in most.

Here's an example of such a library (ruby):
<https://github.com/pangloss/nested_exceptions>

Use it!

~~~
Roboprog
For all my grumbling about Java, I guess exception chaining is one thing they
actually did get right from the beginning:

throw new RuntimeException( "App feature X broke ...", e);

Could be worse, could be C -- setjmp/longjmp :-)

Actually, longjmp is pretty useful, but it typically makes for "coarse
grained" error handling, and you'd better have a good error message logged
before jumping back. As well as protect yourself from resource leaks...

------
praptak
Python specific advice on re-raising upon catching and keeping the original
trace: in Python 3, please use chained exceptions:
<http://www.python.org/dev/peps/pep-3134/>

For older Python versions please see
<http://blog.ianbicking.org/2007/09/12/re-raising-exceptions/>
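
A minimal sketch of PEP 3134 chaining in Python 3. `ConfigError` and `load_port` are hypothetical; the part that matters is `raise ... from exc`, which stores the original exception (traceback included) on `__cause__`.

```python
class ConfigError(Exception):
    """Hypothetical higher-level error for this sketch."""


def load_port(settings):
    try:
        return int(settings["port"])
    except (KeyError, ValueError) as exc:
        # The original exception stays reachable via __cause__ and is
        # printed as part of the chained traceback.
        raise ConfigError(f"bad port setting in {settings!r}") from exc
```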

~~~
masklinn
> For older Python versions please see
> <http://blog.ianbicking.org/2007/09/12/re-raising-exceptions/>

This is super important, I recently rediscovered this with a colleague (the
company's tool was fucking up exceptions everywhere, either over-logging
things or throwing stacktraces out in an attempt to rewrap them and we decided
to fix this).

Although the second-to-last example:

    
    
        new_exc = Exception("Error in line %s: %s"
                        % (lineno, exc or exc_class))
        raise new_exc.__class__, new_exc, tb
    

would be much better written as:

    
    
        raise Exception, "Error in line %s: %s" % (lineno, exc or exc_class), tb
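
For readers on Python 3, where the three-argument `raise` no longer exists, a rough equivalent of the one-liner above might look like this. The `process` function and its input are invented for illustration.

```python
import sys


def process(lines):
    """Parse each line as an integer, reporting the line number on failure."""
    for lineno, line in enumerate(lines, start=1):
        try:
            int(line)
        except ValueError:
            exc_class, exc, tb = sys.exc_info()
            # Python 3 spelling of: raise Exception, "...", tb
            raise Exception(
                "Error in line %s: %s" % (lineno, exc or exc_class)
            ).with_traceback(tb)
```

Python 3 also records the original exception on `__context__` automatically here, so the explicit traceback juggling is needed less often than it was in Python 2.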

------
islon
When I'm coding I normally think about 2 things:

\- What should I do here to help me and other developers understand what's
gone wrong. This generally results in one of the three categories:

    
    
        - Is it a "shouldn't happen" error like hardware malfunction, internet down, etc.
        - Is it an invalid data error, someone screwed up the data and I got the error here.
        - Is it an algorithm error, I got something wrong in my code and this wasn't supposed to happen.
    

\- What should I show to my user. Divided in three categories too:

    
    
        - Unrecoverable error: should show a big red screen.
        - Recoverable error: should show an informative message with instructions.
        - Errors that don't affect the user: no need to show anything, just log the error.

~~~
Roboprog
Who is this "user" you speak of? :-)

I'd extend that a bit for batch systems: invoke some kind of tool to file a
trouble ticket to get the operations/support group's attention, as there won't
be somebody sitting at a console to watch. Alternately, you could view this as
generating a big red screen, just realize that any users are sitting
"someplace else".

Yeah, some of us still have batch operations going, as well as a web app or
two.

------
TheOnly92
I think that, depending on your users, they may not understand your backend
well enough to report errors properly.

Take the "Bad argument" example: they don't know how many different situations
trigger the same error, and they don't know whether what they did beforehand
will help you figure out their problem. In short, they don't know enough to be
helpful.

I don't have a great solution to this, but a middleman might help: someone
who can't help you with the programming but knows what is and isn't useful in
an error report.

~~~
DRMacIver
I agree that this can be a problem.

I think a good default here is that if you don't know enough about what you
should report, you should report as much as you reasonably can - for example
if you can't say what's wrong with a value, at least say what that value was.
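
As a small sketch of that default in Python 3 (the validator `parse_percentage` is hypothetical): every failure path names the offending value, and says what was expected whenever that is known.

```python
def parse_percentage(raw):
    """Hypothetical validator whose errors say what the value was and why it is wrong."""
    try:
        value = float(raw)
    except (TypeError, ValueError) as exc:
        # We may not know what the caller meant, but we can show what we got.
        raise ValueError(f"Bad percentage {raw!r}: expected a number") from exc
    if not 0 <= value <= 100:
        raise ValueError(f"Bad percentage {raw!r}: should be between 0 and 100")
    return value
```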

------
eksith
This is why we had a logger running in the background that received all
anomalous events from warnings to exceptions (the class, method and passed
data including passed parameters and the relevant line). And we had a catch-
all trigger for when something goes really, really, really, badly wrong and
there's nothing else available to dump to the logger.

Annoying during development, a godsend in production. No one is immune from
stupid mistakes.
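
One way such a catch-all could look in Python 3, as a sketch only: `sys.excepthook` is the interpreter's hook for uncaught exceptions, while the logger name and the two function names are assumptions made for this example.

```python
import logging
import sys

logger = logging.getLogger("catch_all")


def log_uncaught(exc_type, exc, tb):
    """Last-resort hook: record anything that escaped every other handler."""
    logger.critical("uncaught exception", exc_info=(exc_type, exc, tb))


def install_catch_all():
    # The interpreter calls sys.excepthook for exceptions nobody caught.
    sys.excepthook = log_uncaught
```

Calling `install_catch_all()` once at startup is enough; exceptions raised in other threads go through `threading.excepthook` instead and would need their own hook.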

~~~
DRMacIver
Yes, indeed, we have the same, but it becomes much more useful if the
exceptions being thrown by your dependencies are of high quality.

------
br1
Windows developers have it easier because debuggers are better. The same
debugger can work on VB6 code calling C++ calling C#. Memory dumps also seem
to be more common and useful on Windows than on Unix. If you get a dump for
every crash, stack traces seem almost useless in comparison.

------
wilmoore
The only tragedy about this thread is that more people aren't commenting.
Either people are taking the advice and moving on or they don't care. I really
hope it is the former :)

