Structured logging is just taking whatever information you already intended to log anyway and giving it a consistent structure. I.e., instead of
"$TIMESTAMP $LEVEL: Hey look index $INDEX works!"
you have
$TIMESTAMP $LEVEL $FORMATTER $MSG $INDEX
I'm quite intentionally drawing a parallel to `echo` vs `printf` here, where `printf` has an interface that communicates intended format while `echo` does not.
The only overhead of structured logs done right is that you need to be explicit about your intended formatting on some level, but now with the bonus of reliable structure. Whether you use glog or spdlog or whatever is somewhat inconsequential. It's about whether you have an essentially unknown format for your data, having thrown away the context of whatever you wish to convey, or whether you embed the formatting with the string itself so that it can later be parsed and understood by more than just a human.
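To make that concrete, here's a minimal sketch of the difference, with a made-up record type (the names are mine, not glog's or spdlog's): the structured variant keeps the template and the fields around so they can be parsed later, while the flat variant throws that context away the moment the string is built.

```cpp
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type: the template and the fields outlive the call
// site, so a machine can still recover "index" later, instead of only a
// human being able to read the prose.
struct LogRecord {
    std::string tmpl;                                         // "Hey look index {index} works!"
    std::vector<std::pair<std::string, std::string>> fields;  // {"index", "5"}
};

int main() {
    int index = 5;

    // Unstructured ("echo"-style): once the string exists, the context is gone.
    std::string flat = "Hey look index " + std::to_string(index) + " works!";
    std::printf("%s\n", flat.c_str());

    // Structured ("printf"-style): the intended format travels with the data.
    LogRecord rec{"Hey look index {index} works!",
                  {{"index", std::to_string(index)}}};
    std::printf("%s | index=%s\n", rec.tmpl.c_str(), rec.fields[0].second.c_str());
}
```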
If you're concerned about the extra volume of including something like "[%H:%M:%S %z] [%n] [%^---%L---%$] [thread %t] %v" on every log entry, then you use, e.g. (in glog terms, or the equivalent for your language), LOG_IF, DLOG, CHECK, or whatever additional macros you need for performance.
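For reference, those glog macros look roughly like this (the macros are real glog; the conditions and messages are just placeholders):

```cpp
#include <glog/logging.h>

int main(int argc, char* argv[]) {
    google::InitGoogleLogging(argv[0]);
    FLAGS_logtostderr = true;  // log to stderr instead of files, for the demo

    int index = 5;

    LOG(INFO) << "Hey look index " << index << " works!";     // unconditional
    LOG_IF(WARNING, index > 3) << "index is getting large";   // only when the condition holds
    DLOG(INFO) << "debug builds only";                         // compiled out with NDEBUG
    VLOG(2) << "extra detail, enabled at runtime with --v=2";  // verbosity-gated
    CHECK(index >= 0) << "index must be non-negative";         // fatal if the check fails
    return 0;
}
```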
If I'm wrong or just misunderstanding, please do correct me.
LOG_IF and VLOG, and related concepts from other libraries, have their places, but I don't consider not logging to be a solution to expensive logging. In fact, that was my whole original point.
With structured logging the outputs may be trivial, in which case the compiler can do a good job. But if you are inserting a std::string into a log line and your log output format is JSON, there's nothing the compiler can do about the fact that the string must be scanned for metacharacters and possibly escaped, which will be strictly more costly than just copying the string into the output.
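A rough sketch of the cost in question, independent of any particular library: emitting the value as a JSON string forces a look at every byte, whereas the plain copy does not.

```cpp
#include <cstdio>
#include <string>

// Plain copy: the value goes into the output as-is.
void append_raw(std::string& out, const std::string& value) {
    out += value;
}

// JSON string emission: every byte has to be inspected and possibly
// replaced by an escape sequence, which is strictly more work than the copy.
void append_json_escaped(std::string& out, const std::string& value) {
    out += '"';
    for (unsigned char c : value) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {  // remaining control characters
                    char buf[8];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    out += '"';
}
```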
Please understand my line of inquiry is sincere and not intended to be read as bad-faith-argumentative. I say this only because tone is so hard to convey.
Anyway.
Isn’t the choice of e.g. JSON-string vs any other string format somewhat beside the point? Wouldn’t you either 1) need to scan for metacharacters or 2) NOT need to scan for metacharacters, regardless of whether your log is structured or unstructured, at time of output?
Ignoring for a moment the additional cost per character of outputting additional data, but ignoring nothing else in the scenario, wouldn’t something like, for example:
“My index is {index}␟index=5”
cost the same to output as:
“My index is 5”?
It seems to me that the cost of interpreting the formatting doesn’t need to be paid until you wish to read the log message. Presumably you don’t strictly need to do that at all until you actually care about the content of the message, or at the very least you can defer the cost until a less critical moment, or do some “progressive enhancement” dependent on some set of prerequisite conditions.
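A small sketch of the deferral I have in mind, using the ␟-separated record from the example above (my own format, not any library's): writing the record is a plain concatenation, and splitting/interpreting it only happens when someone actually reads it.

```cpp
#include <cstdio>
#include <string>

// Write time: append template and raw fields verbatim, separated by the
// unit separator (0x1f, the ␟ above). No escaping or interpretation yet.
std::string make_record(const std::string& tmpl, const std::string& fields) {
    return tmpl + '\x1f' + fields;
}

// Read time: only now pay to split and interpret the record, and only
// for the records somebody actually wants to look at.
void read_record(const std::string& record) {
    const auto sep = record.find('\x1f');
    const std::string tmpl   = record.substr(0, sep);
    const std::string fields = (sep == std::string::npos) ? "" : record.substr(sep + 1);
    std::printf("template: %s\nfields:   %s\n", tmpl.c_str(), fields.c_str());
}

int main() {
    const std::string rec = make_record("My index is {index}", "index=5");
    read_record(rec);  // deferred: parsed only when the log is actually read
}
```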
"$TIMESTAMP $LEVEL: Hey look index $INDEX works!"
you have
$TIMESTAMP $LEVEL $FORMATTER $MSG $INDEX
I'm quite intentionally drawing a parallel to `echo` vs `printf` here, where printf has an interface that communicates intended format while echo does not.
The only overhead of structured logs done right is that you need to be explicit about your intended formatting on some level, but now with the bonus of reliable structure. Whether you use glog or spdlog or whatever is somewhat inconsequential. It's about whether you have an essentially unknown format for your data, having thrown away the context of whatever you wish to convey, or whether you embed the formatting with the string itself so that it can later be parsed and understood by more than just a human.
If you're concerned about the extra volume of including something like "[%H:%M:%S %z] [%n] [%^---%L---%$] [thread %t] %v" on every log entry, then you use eg (in GLOG equivalent for your language) LOG_IF, DLOG, CHECK, or whatever additional macros you need for performance.
If I'm wrong or just misunderstanding, please do correct me.