
Ask HN: How do you automate logging bugs in your product? - _virtu
TL;DR: How do you log bugs in your product in an automated manner without dirtying up your bug tracker?

Recently, at my place of work, we've run into an issue with the way that we log our bugs. Currently we automate logging many different types of exceptions and errors to our bug tracking system from our product. We're currently using unfuddle (which is a whole different religious war) to track bugs.

We've noticed that automated bug logging has decreased the signal/noise ratio in our bug tracker because of the large number of duplicates that we're seeing. This makes triaging time consuming, as we aren't sure which bugs are legitimate issues and which are possible duplicates of other bugs we've seen before. Our team has been on a recent stint of patching noisy bugs as they come in, but it feels like these are merely band-aids on a larger problem.

How do you automate logging bugs in your product?

Some examples of bugs that we've been having issues with:

- Loops in automated processes can hit states that raise exceptions. These can generate a large number of bugs.

- Database restores that may fail in a staging/production environment
======
dexterbt1
Wait, you seem to equate a runtime exception with a bug, which to me are two
different but related concepts.

Directly posting every single exception raised to your bug tracker is a
surefire way to flood it and overwhelm you. Logging to unfuddle, to me, is the
wrong way to do it.

Learn proper Ops. Learn the concepts properly: logging for application
monitoring and trace debugging is one thing. Posting a confirm functional bug
report is another.

Error detection/escalation is best left to proper tools such as nagios/zabbix
where they have grouping of errors etc.

Application level tracing is best left to logging tools like
sentry/logstash/new relic/etc.

Use the d

~~~
_virtu
What is a confirm functional bug report?

What does your workflow for logging exceptions and errors look like?

What sort of separation do you have between application exceptions and other
types of exceptions such as database connectivity? How do either of the
aforementioned exceptions/errors materialize into a bug?

~~~
dexterbt1
Sorry for the typos, I was typing this on mobile.

I meant "confirmed functional bug report"; and by that I mean a reproducible
scenario that is actually caused by a faulty code path leading to a wrong
result/behaviour. In my experience, bugs are ideally reported by people after
some repeatable / confirmable, thought-out process.

If I understood you correctly, you are automatically posting every exception
to your bug tracker. My point is that not all exceptions are bugs. And
certainly not all bugs are discoverable through exceptions.

> What sort of separation do you have between application exceptions and
> other types of exceptions such as database connectivity? How do either of
> the aforementioned exceptions/errors materialize into a bug?

These are great questions that you should be asking while architecting /
implementing / testing / running your application (i.e. whole lifecycle).
Again, not all exceptions are bugs. Many, but not all, exceptions should
crash your program. Many exceptions are definitely recoverable (unexpected
input), while others are not (failed network, unreachable DB). If you are
using a decent logging framework, you will notice that it has different levels
for logging messages. FATAL/ERROR usually mean unrecoverable, while WARNINGs
are typically recoverable but may prompt further investigation for the log
audience, and INFO/DEBUG are generally harmless, non-error messages. This
design has become widespread across many languages/frameworks and makes you
ponder why.
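
To make the level split concrete, here is a minimal sketch using Python's standard `logging` module. The names `RecoverableError`, `parse_quantity`, `process` and `run` are made up for illustration, and the mapping of "bad input" to WARNING and "unreachable DB" to CRITICAL is just one reasonable reading of the above, not a rule.

```python
import logging

logger = logging.getLogger("app")


class RecoverableError(Exception):
    """Hypothetical error type: bad input that can be skipped safely."""


def parse_quantity(raw):
    # Translate a low-level ValueError into a domain-level, recoverable error.
    try:
        return int(raw)
    except ValueError as exc:
        raise RecoverableError(f"bad quantity: {raw!r}") from exc


def process(rows):
    """Parse every row; log recoverable problems at WARNING and carry on."""
    parsed = []
    for raw in rows:
        try:
            parsed.append(parse_quantity(raw))
        except RecoverableError as exc:
            logger.warning("skipping row: %s", exc)
    return parsed


def run(rows, connect):
    """`connect` is a stand-in for whatever opens the database connection."""
    try:
        connect()
    except ConnectionError:
        # Unrecoverable here: log once at the highest level, then re-raise
        # so the failure is not hidden from the caller.
        logger.critical("database unreachable, giving up", exc_info=True)
        raise
    return process(rows)
```

With this split, only the CRITICAL/ERROR stream is a candidate for escalation; the WARNING stream is something a human skims for patterns.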

> What does your workflow for logging exceptions and errors look like?

My view is that error handling is generally a delicate art and science (see
why Golang doesn’t have exceptions, a decision made by veteran language
designers!). I follow the principle of Fail Fast, especially for unrecoverable
errors; that is, don’t hide fatal errors. Bubble up exceptions. Carefully
design your layers to raise and catch the appropriate exceptions, if at all
(see also Erlang’s let it crash principle). c2 wiki has some good starting
discussions:
[http://c2.com/cgi/wiki?FailFast](http://c2.com/cgi/wiki?FailFast)
[http://c2.com/cgi/wiki?FirstRuleOfLogging](http://c2.com/cgi/wiki?FirstRuleOfLogging)
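
As a small illustration of Fail Fast, the sketch below validates configuration at startup and raises immediately instead of letting a missing value surface later as an unrelated exception. `REQUIRED_KEYS`, `load_config` and the key names are hypothetical, chosen only for the example.

```python
# Hypothetical settings this imaginary service cannot run without.
REQUIRED_KEYS = ("DATABASE_URL", "QUEUE_URL")


def load_config(env):
    """Return the required settings, or fail immediately if any is missing.

    `env` is any mapping, for example os.environ.
    """
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        # Fail fast: one clear error at startup beats many confusing
        # exceptions deep inside request handling.
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

The point is that the process dies once, loudly, with a message that names the cause, rather than generating a stream of secondary exceptions.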

Note, these are all abstract and conceptual, and not tool oriented. They go
back to the design of your system and your processes, to fit your organization
/ communication structure.

Also not stated is the great importance of automated tests (unit tests /
integration tests / functional tests). Remember I mentioned confirmed bugs?
Because at the very least, every bug discovered should have a corresponding
test written to prevent regression. A decent engineering team should have good
grasp and practice of many levels of automated testing.
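
A minimal sketch of what "one test per confirmed bug" can look like with Python's built-in `unittest`. The `average` function and the bug it once had (crashing on an empty list) are invented for the example.

```python
import unittest


def average(values):
    """Fixed version of a hypothetical function.

    The imagined bug report: a ZeroDivisionError when `values` was empty.
    """
    if not values:
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_does_not_crash(self):
        # Pins the confirmed bug so it cannot silently come back.
        self.assertEqual(average([]), 0.0)

    def test_normal_input(self):
        self.assertEqual(average([2, 4]), 3.0)
```

Running `python -m unittest` in the directory containing this file would pick the tests up, assuming the file name matches the default `test*.py` discovery pattern.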

P.S. as for our tools, our stack uses logstash/elasticsearch for distributed
logging + opsview/nagios NRPE for ops-related detection/escalation of issues
that warrant attention.

~~~
_virtu
Thanks for the reply. I was actually looking for more of an abstract solution.
Tools are great and all but building processes and architecting a better way
to log bugs is more important.

------
redpillbluepill
Our dev team uses Sentry (getsentry.com). We had the hosted plan, and we also
have one installed on a local server.

You can check out their github repo as well
[https://github.com/getsentry/sentry](https://github.com/getsentry/sentry)

~~~
jbrooksuk
+1 for Sentry. This is what we use at work.

For the side business I work on ([https://styleci.io](https://styleci.io)) we
use Bugsnag ([https://bugsnag.com](https://bugsnag.com)) and a custom
application.

------
kornish
We log runtime errors in Slack using a webhook (have an #errors channel which
receives a post with a stack trace and useful metadata), then notice patterns
in error generation manually, discuss the errors and why they might be
happening in Slack, and eventually funnel urgent or important ones into our
task management system.

Right now, we don't have a ton of traffic so the errors channel receives a
very manageable volume. I imagine this sort of approach would get overwhelming
for a product with more concurrent users or more runtime-error-generating code
paths.
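
For anyone curious, a rough sketch of this kind of setup in Python, using only the standard library. It assumes a Slack incoming webhook, which accepts a JSON body with a `text` field; the function names and the metadata keys are made up, and the webhook URL is something you would supply yourself.

```python
import json
import traceback
import urllib.request


def build_error_payload(exc, metadata):
    """Format an exception plus metadata as a Slack incoming-webhook body."""
    trace = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    meta = "\n".join(f"{key}: {value}" for key, value in sorted(metadata.items()))
    return {"text": f"{type(exc).__name__}: {exc}\n{meta}\n{trace}"}


def post_to_slack(webhook_url, payload):
    """POST the payload; `webhook_url` is your own incoming-webhook URL."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Short timeout so a slow Slack call cannot stall the application.
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

In a real handler you would probably wrap `post_to_slack` in its own try/except so that a failed notification never masks the original error.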

------
jamies888888
I log JavaScript-based errors using Google Analytics events. Easy to set up
(especially if you already have GA set up) and the graphing and grouping
capabilities of GA are quite well-suited for bug trend tracking.

~~~
zwetan
almost the same but I log using the exception call instead of event

[https://developers.google.com/analytics/devguides/collection...](https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters#exception)

------
nowarninglabel
We have a "log cop" responsible for looking at the errors in the logs and
being the human conduit to filing a ticket for a bug as needed. We've got it
down pretty well now: we have trained folks on it and rotate the
responsibility each sprint, and it doesn't take more than an hour or so a week
of the designated person's time. Part of making it doable though was getting rid
of a lot of the superfluous logging we were doing.

------
jakozaur
I use Sumo Logic and their Log Reduce feature. That automatically aggregates
exception patterns. That eliminates duplicates and gives a nice summary:
[https://www.sumologic.com/resource/featured-videos/demo-sumo...](https://www.sumologic.com/resource/featured-videos/demo-sumo-logic-log-reduce-next-generation-log-analytics-featured-video/)

Disclaimer: I work for Sumo Logic.

------
michaelmior
I'm going to throw out a recommendation for Rollbar[0]. Of the tools I've
used, I found it's great at providing the info necessary to quickly track down
and fix errors. It's also got some useful tools for triage built in.

[0] [https://rollbar.com/](https://rollbar.com/)

------
njovin
Rollbar (rollbar.com) dedupes error occurrences. It automatically integrates
with JIRA and github but you could probably tap into their API to integrate it
with unfuddle.
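
To illustrate the general idea of deduping (this is not Rollbar's actual algorithm, just a sketch of the technique), one can fingerprint each exception by its type and stack frames and only file a ticket the first time a fingerprint is seen. `fingerprint` and `ErrorAggregator` are hypothetical names.

```python
import hashlib
import traceback
from collections import Counter


def fingerprint(exc):
    """Hash the exception type and its stack frames, ignoring the message.

    Messages often contain variable data (ids, timestamps), so leaving them
    out lets repeated occurrences collapse into a single group.
    """
    frames = traceback.extract_tb(exc.__traceback__)
    parts = [type(exc).__name__]
    parts += [f"{frame.filename}:{frame.name}:{frame.lineno}" for frame in frames]
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


class ErrorAggregator:
    """Counts occurrences per fingerprint, in memory only."""

    def __init__(self):
        self.counts = Counter()

    def record(self, exc):
        """Return True only the first time this fingerprint is seen."""
        key = fingerprint(exc)
        self.counts[key] += 1
        return self.counts[key] == 1
```

A caller would file a ticket only when `record` returns True and otherwise just bump the count, which is what keeps a looping process from opening hundreds of tickets for the same failure.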

------
benwoodward
After trying a few alternatives, I've settled on
[https://bugsnag.com](https://bugsnag.com)

~~~
abluecloud
+1, their exception aggregation is lovely

------
kuahyeow
We have been trialling [http://squash.io/](http://squash.io/) at work as an
aggregator for our errors and it's working out well so far. Next step is to
integrate into our ticket tracker.

------
fizerkhan
Give it a try: [https://www.atatus.com](https://www.atatus.com) which provides
JavaScript error tracking and Real user monitoring.

------
techaddict009
One company I worked for used:
[https://www.graylog.org/](https://www.graylog.org/)

------
simonmorley
We use Airbrake. It also allows us to create errors manually, which we
generally use for logging exceptions from within rescue blocks at the moment.

