Hacker News new | past | comments | ask | show | jobs | submit login
Node.js Logging Made Right (itnext.io)
85 points by keenondrums 13 days ago | hide | past | web | favorite | 44 comments

In my experience, it's better to create a "request context" at the beginning of handling a request and pass it down explicitly to subsequent async calls that handle the request. I think that this approach is similar to Golang's "context" package, for example.

As other commenters note, using CLS (via async hooks or domains) is often too much magic and can possibly cause leaks or at least it results in program logic that's hard to reason about and hard to test.

I disagree. Logging is a cross-cutting concern. It is basically the example used to define what a cross-cutting concern is.

I want my cross-cutting concerns to be as invisible as possible, I definitely do not want them to proliferate through all function calls I make in the app.

I do not believe domains or async hooks is too much magic, and they are pretty sound when used properly. As long as they are used responsibly and in a limited setting it is just the right amount of magic.

Cross-cutting concerns are an antipattern.

Logging should not happen inside libraries or submodules. Libraries should expose errors and exceptions either via throwing or message passing (e.g. dispatching error events). Logging should happen at the top-level of the application from a central place (logging logic should not be scattered all over the source code). The top level application logic should aggregate errors from children components/libraries and decide how to log them. This makes it easy to change the default logger and to customize various aspects of the logging.

When you scatter logging logic everywhere throughout your source code, you break the principle of separation of concerns which is probably the most important principle of software development. Even the term 'cross-cutting concern' implies a violation of separation of concerns (a violation of the cross-cutting kind to be exact). A concern cannot be both separate and cross-cutting.

>When you scatter logging logic everywhere throughout your source code, you break the principle of separation of concerns which is probably the most important principle of software development. Even the term 'cross-cutting concern' implies a violation of separation of concerns (a violation of the cross-cutting kind to be exact). A concern cannot be both separate and cross-cutting.

That's because logging is not a concern of your app -- it's a secondary need that's orthogonal (hence cross-cutting) to its actual concerns, and which is there regardless of the app.

Changing your whole app's structure just so that you collect things to log in a central place (as you propose), would be the real violation of concerns.

Not to mention this way you pass around information that might be useless for the purposes of the app (and is just needed for debugging), that it's enough that separate components know themselves internally.

>> Changing your whole app's structure just so that you collect things to log in a central place (as you propose), would be the real violation of concerns.

Maybe it depends on what kind of language/platform you use but if each component is able to dispatch events on itself instead of logging, then that does not require changing the whole app's structure.

The only concern of the top level application logic would be monitoring/reporting so logging fits naturally under that label.

When I write code, I don't want anything to be invisible. I should be able to look at a single file, and understand how it works without reading any others. Sure, there may be some imports at the top of the file, but clearly named external API's will infer what they do.

Yes, and when `const cls = require('cls-hooked')` is written at the top of the file it's pretty clear that the file is using CLS. Nothing invisible about it.

Yes. This is correct way to do logging. Pass an object with request, user, logger and other attributes around as you process it. This is basically "session"

This is the exact problem I solved with CabinJS; making this simple on both client and server. I've tried nearly every logging service and they continually get one thing or the other wrong (even after my attempts to fix them).


Hapi.js does this natively.

No, it does not do it to best standards and simplicity. Hapi itself is also incredibly convoluted and bloated.

I believe just the opposite.

Explicitly passing the context is robust but in my experience becomes neglected because of the extra work involved everywhere and so just doesn't happen. (The tracing equivalent of requiring a password so strong that people write it on sticky notes.)

My C++, Python, Ruby, C#, and Java runtimes provides thread local storage without cooperation from every function; I don't see why it's improper for my JS runtime to do the same.

It's single threaded. You can usually stick the context in the request object itself.

The issue becomes when you're calling some generic library that's unaware that it is running in a request handler. If you want to log from deeper in some logic, maybe you've passed some data or domain objects but not a request/context/session object. Do you pollute that logic with additional data passed in or does your library just create its own logger and call `log.info('intermediate result of some thing: XXX')` and let the CLS magic associate that with the request that generated the call?

And generally speaking, you want to exit the domain of request/response as early as you can, because it's your interface with the outside world, rather than something that should be pervasive throughout your system.

You still have to rely on async hooks since you won’t be able to pass the context through to a library that doesn’t account for such API.

Along the way you might have some generic data transformation functions that require no notion of a request. They're just pure functions that transform data and can be easily unit tested. Seems nuts to mock a request just to test something like that.

If they don't require a notion of a request then don't pass it to them?

Or you could just pass the function an optional tracing ID as an argument.

Or just keep logging / requests out of the pure functions, and leave it to the context around them. You can always have the pure functions return a `Result` or `Either` type, if you need information about a failure to be logged.

This golang pattern is the worst. Why not just built in thread resolution when the parent cancels?

The context pattern exists because go routines don't have parent child relationships.

I haven't looked into this recently so I don't know if anything has changed, but when I last looked up Proxy benchmarks I found that it performed very poorly. Does anyone know if this is no longer the case?

My initial reaction to this is "just because you can doesn't mean you should". Perhaps that's a bit harsh, but it seems wasteful to be creating these wrappers instead of just adding the desired functionality directly to your favorite logging library. Did you try bringing it up with the people maintaining the wrapped libraries? It's also worth mentioning that async_hooks is considered an experimental API.

The simpler solution is to stick a unique id in the context object of each request, perhaps passing around a request-specific log function as well if needed. You should have access to it from all your call sites so you can pass it in as needed, and there's no need for magic.

Agreed, we append a logging function to the request context and inject it where needed. With async/await it actually isnt too unwieldily, though it was a bit more intrusive with promises. I think a lot of guys from multithreaded backgrounds (in my case java) often struggle with this approach because looks like such an anti-pattern for them.

You will lose this context when an error is thrown deep in a library somewhere and your error stack ends up missing most of the useful info. Also how do you deal with I caught exceptions?

This isn't an issue if you're using promises. You can use try/catch from within async functions and it works exactly as you'd expect.

A few years before async functions were available you could also achieve the same thing with generators. In fact, the original release of koa used co [0] under the hood.

I'll note my response is written with koa middleware in mind. I haven't kept up much with express, although I think it should be possible to achieve the same thing since they also support async functions.

With koa you can add a middleware that wraps all downstream middleware in a try/catch:

    app.use(async (ctx, next) => {
      try {
        await next()
      } catch (error) {
        // handle errors

[0] https://github.com/tj/co

The other main difference between koa and express is that, in express, you send responses from handlers while in koa, you bubble up a {req,res} context back to koa which sends it which gives you real middleware since you can still mutate the response.

The promise chaining nature also makes it so that your example handler can catch all downstream errors unlike express where you must call next(err) and handle it in a downstream handler vs koa's upstream middleware pattern (clojure's ring works the same way).

I thank co and early koa for making node bearable for me long before async/await were available.

Thus is about adding trace ids to the log statement. This is definitely helpful, but from the title I had expected something else.

Async Hooks? They are insanely slow. If you are using many promises (lie graphql server) then it will result in x100 performance penalty.

while there is a performance impact, it shouldn‘t be anywhere near 100x. for reference: https://github.com/bmeurer/async-hooks-performance-impact/bl...

The implementation for promises is currently a pretty heavy performance hit:


The node and v8 teams are working to improve that, but the work hasn't all landed yet

It's not 100x as stated by the OP, but I wouldn't use it (async_hooks) in situations where performance is actually a concern. I wouldn't use node there either, but that's besides the point.

async_hooks are 100% usable in production right now, we use it at my current gig for tracing and its had minimal impact on the services using it.

No, they are not usable in production. We tried and was forced to rewrite whole app after this ASAP and eliminate usage of async hooks. We used them exactly same way: logs and some query-level context data.

Bluebird is bad for measuring async hooks because they have custom scheduler and most of the execution is done in single tick.

Overhead is for each tick, not each promise. If you have ~10 promises per request then you have N overhead if you will have 1000 promises you will get 100N overhead.

Our backend barely survived this after async hooks was just enabled and just disabling them make everything x100 faster.

Are you saying that async hooks perform fine with Bluebird but not native promises? Why not use Bluebird everywhere then?

Because this is not a solution, you still get penalty and bluebird is slower in latest nodejs versions.

Not really a long term solution.

We adapted approach with context from golang and now it works really well.

I see this "I'd never use node" sentiment quite a bit in the Java community. I'm always curious for the reason. Is it the JavaScript? or is Node unable to perform? After reading so many success stories from Netflix and Walmart, I'm wondering what the missing piece is....

Specifically in the area of high CPU load tasks. NodeJS is perfectly fine for heavily async tasks, especially calling out to external services. When you have CPU intensive tasks the lack of control over the event loop (task list) can be painful.

It’s not just Java, it’s any programmer with proficiency in any other language, including those that are quite proficient at JavaScript. The sentiment after our team wrapped up a recent Node project for a client was basically “the internet keeps lying about the maturity of the node ecosystem”. That was with the latest TypeScript and an architect who was very good.

The event loop model that Node uses is easily replicated in other programming languages. It’s not a sensible default though, blocking code is better for most domains.

What if you don’t utilize promiseResolved hook? What is the practical performance hit for the rest of the tickobject/tcpwrap/etc hooked resources?

Is Angular's zone.js any different?

I've used it to do essentially the same thing. Create a new zone for every request that comes in and all async functions within that zone can utilize the id assigned to it's zone for request tracking.


I like the async hook idea. But I’m having a hard time imagining his unit tests would work?

I worked on a codebase that used the old "domains" this way, and it was very kludgy. We could get unit tests working, but we essentially mocked calling that 'set' method before the function we were testing used it. We eventually ripped it all out in favor of explicitly passing arguments down the async chain because we found that for some reason those "magic" variables were leaking across requests. Could have been another factor that caused it, but the magic was too much for us.

I just use a incremental counter and closure. For the logging itself you have to choose between being slow and sync, or fast and async but miss some log events if you get a crash.

The package he uses (cls-hooked) hasn't been updated for 2 years. Does anyone know any good alternatives?

Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact