

Log for Machines - pquerna
http://journal.paul.querna.org/articles/2011/12/26/log-for-machines-in-json/

======
thaumaturgy
NO! No, no, no, a thousand times, no.

I spend all day every day troubleshooting broken things. Human readable logs
are the very last thing that I have left that leaves me with any energy to
keep troubleshooting things. If you take that away from me too, I swear I'm
going to shoot somebody. (If I have any sense left when I go 'round the bend,
I'll target a software engineer that thought it would be a good idea to fix
what wasn't broke.)

Here's the thing: I can do anything with a human readable text log. I can scan
it visually and look for anomalies; I can use my favorite text editor with it;
I can grep it; I can cobble together some magic incantation on a command line
to munge it to death until it cries out in mercy and gives me _exactly what
I'm looking for_.

If I have to use somebody's special web-based woo-widget to view and search
the contents of a log, if I'm restricted to the limits of their imagination
when it comes to trying to get useful information out of the log, if I have to
debug their dumbass invention while somebody's time-critical server is
flopping around on the floor before I can even begin to figure out what in the
heck is actually wrong with the server, I'd be stupid not to give up on
support completely.

If you want to log information _which is useful to you_, you can do that
without hardly any trouble at all, without replacing perfectly good, reliable
logging facilities which are the way they are for damned good reasons. Just
write a Ruby whatchamacallit or a Python thingamajig or a PHP flibbedyfloo and
log it to your MySQL database or your Postgres database or your no-sql nuh-uh-
it's-not-a-database. I don't care.

But if I ever open up a server log and get blasted by a jillion lines of JSON
that make it a thousand times harder for me to figure out how something
broke, I'm going to go on a shooting spree.

~~~
CoffeeDregs

        >Here's the thing: I can do anything with a human 
        >readable text log. I can scan it visually and look for 
        >anomalies; I can use my favorite text editor with it; I 
        >can grep it; I can cobble together some magic 
        >incantation on a command line to munge it to death 
        >until it cries out in mercy and gives me exactly what I'm looking for.
    

That's basically _why_ the article is proposing using JSON: it's easily
readable/scannable by a human, can be easily opened in a text editor, isn't
all garbagy like XML, has near perfect support in every scripting language and
is rumored to be Christ 2.0.

    
    
        >But if I ever open up a server log and get blasted by a jillion
        >lines of JSON that make it a thousand times harder for me to figure
        >out how something broke I'm going to go on a shooting spree.
    

Are these really that different?:

    
    
        >host.com:80 w.x.y.z - - [25/Dec/2011:06:07:42 -0600] "POST /page/open HTTP/1.1"
        >200 1200 "-" "Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20100101 Firefox/8.0"
    
        >{"host": "host.com", "timestamp": "2011-12-25 06:07:42", "method": "POST",
        >"path": "/page/open", "protocol": "HTTP/1.1", "response_code": 200,
        >"content_length": 1200, "user_agent": "Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20100101 Firefox/8.0"}
    

There's a bit of extra verbiage, but the JSON version is perfectly extensible
and would allow the devs to add additional debugging information to help you
find issues. That said, I can see the downside to letting devs insert random
crap, because you could get reams of garbage in the logs. As an example of
possibly helpful info, it'd be easy to add db and app times:

    
    
        >{"host": "host.com", "timestamp": "2011-12-25 06:07:42", "method": "POST", "path": "/page/open",
        >"db_time": 0.157, "app_time": 0.43, "protocol": "HTTP/1.1", "response_code": 200,
        >"content_length": 1200, "user_agent": "Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20100101 Firefox/8.0"}

~~~
thaumaturgy
> _...isn't all garbagy like XML..._

I don't think it's fair to take something that isn't garbagy at all and then
say that the proposed alternative isn't bad because it's not as garbagy as one
of the garbagiest things ever invented. :-)

JSON may not be as garbagy as XML, but it's still garbagy. With logs in their
current format, often my very first tool in log analysis is the scroll wheel
on my mouse. It's so lovely because all the lines are unbroken, the columns
line up so neatly, and the data in the columns are so distinct from each
other. JSON dicks all that up.

Now, OK, I'll admit it wouldn't be very hard for me to whip up something that
will take a JSON-formatted log and turn it into something I actually want to
read. And in fact that's exactly what I'd end up doing.
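
A minimal sketch of such a filter, in Python, assuming field names like the
ones in the example log lines above:

```python
import json
import sys

# Assumed field names -- adjust to whatever the logger actually emits.
FIELDS = ["timestamp", "host", "method", "path", "response_code", "content_length"]

def flatten(line):
    """Turn one JSON log line into a fixed-column, scroll-wheel-friendly line."""
    try:
        entry = json.loads(line)
    except ValueError:
        return line.rstrip()  # pass non-JSON lines through untouched
    return "  ".join(str(entry.get(field, "-")) for field in FIELDS)

if __name__ == "__main__":
    for line in sys.stdin:
        print(flatten(line))
```

Point it at a log with `python flatten.py < access.json.log | less` and you're
back to nicely aligned columns.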

But, logs are already getting big enough that some companies are doing very
stupid things to them. (Bluehost, by default, logs only the last 24 hours for
all hosted sites; if you turn on their "extended" logs feature, their logs
still have huge gaps in them.) JSON makes that problem a lot worse for those
companies. And, if their log rotation isn't set up correctly -- and if they're
doing vhosting I'll bet you a donut it's not -- then what would have been a
silly amount of data to process now becomes nearly impossible to process.

~~~
CoffeeDregs

        >> ...isn't all garbagy like XML...
        >I don't think it's fair to take something that isn't garbagy at all 
        >and then say that the proposed alternative isn't bad because it's not as
        >garbagy as one of the garbagiest things ever invented. :-)
    

I can see how what I wrote could be interpreted to say "JSON is a little
garbagy", but I didn't mean to imply that JSON is at all garbagy. In fact, in
comparison to the standard web log, it's very readable. Further, as a dev, I
read web logs pretty frequently and I've forgotten what the "- -" was in the
log line I pasted...

EDIT: hey, I ignored most of your reply and that's rude:

My first tool for debugging Apache is 'less', so I would be screaming and
yelling if JSON logs were harder to read than non-JSON logs.

Agreed about the size of the logs, but JSON logging seems to only change the
caliber of the handgun used to shoot yourself in the foot, not turn it into a
Howitzer.

------
zdw
This falls under "good intention, but not well planned out" IMO. Single line
logs are MUCH easier to process with text-oriented tools, which is why they've
stuck around for so long.

This solves the encoding problem, but not the "how do I read/search it"
problem, which is kicked down the line to other tools. And having a different
format for every daemon is just a new problem.

I'd look at how others have dealt with the problem - for example, Logstash,
which is a log parsing and movement engine:

<http://logstash.net/docs/1.0.17/learn>

Also, raw Unix timestamps aren't recommended for storing time data - they're
not human readable, and convey less information than something like ISO 8601,
which covers more edge cases.
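
For instance (hypothetical timestamp value, Python's stdlib formatting):

```python
from datetime import datetime, timezone

# A raw Unix timestamp is opaque to a human reader...
epoch = 1324814862

# ...while the same instant in ISO 8601 is readable, sorts correctly as
# plain text, and carries its UTC offset explicitly.
iso = datetime.fromtimestamp(epoch, tz=timezone.utc).isoformat()
print(epoch)  # 1324814862
print(iso)    # 2011-12-25T12:07:42+00:00
```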

------
CrLf
Not this again...

    
    
        * The first level consumer of a log message is a human.
        * The programmer knows what information is needed to debug an issue.
    
      I believe these presumptions are no longer correct in server side software.
    

The author's assumptions are wrong in both cases:

There may be any number of automated agents consuming the log, so a human may
not be the largest consumer of log messages (in volume), but a human is still
the first consumer, as it is a human who will ultimately need them to
troubleshoot issues.

Programmers are not the ones who "debug" issues in production environments.

Both of these are just symptoms of the same base misconception about how
systems administrators go about their work. The same misconception that
brought us XML configuration files.

When troubleshooting production issues I want to be able to grep for specific
strings or regular expressions. I want to be able to tail the log. I do not
want to have to parse structure (programmatically or mentally), I do not want
multi-line log entries. I do not want to be forced to use a tool just to
transform the log from whatever format into something I can read.

Going even further: often production issues arise when only people less
familiar with the system are available locally. In these cases I don't want to
explain log structure over the phone.

------
mdwrigh2
There's actually a plan to replace syslog with something that's easier to
create tools for.

[https://docs.google.com/document/pub?id=1IC9yOXj7j6cdLLxWEBA...](https://docs.google.com/document/pub?id=1IC9yOXj7j6cdLLxWEBAGRL6wl97tFxgjLUEHIX3MSTs&pli=1)

------
btipling
I wonder if we need text logs at all anymore. It seems silly to expect people
to be able to read millions of log entries from tens, hundreds or even
thousands of servers. Maybe binary logs using protocol buffers or thrift are
the way of the future.

~~~
pquerna
Maybe.

Protobuffers or Thrift can definitely be smaller in byte size, which does
matter somewhat, but JSON right now is just more easily accessible.

I think in a few years, assuming JSON logging really takes off, you'll see
people using these other RPC formats in order to optimize and reduce the size
of their messages.

I view 'better' RPC formats as a performance optimization of the same basic
idea though: Easily add new fields, and use a format designed for computers to
consume.

~~~
jarito
JSON also lends itself pretty well to compression, so a binary format might
not buy you as much as you think.

------
donut
Regardless of the syntax for encoding a server event, having a request id
(txnId in the post) is an often overlooked detail. See how Google does this in
their systems: [http://highscalability.com/blog/2010/4/27/paper-dapper-googles-large-scale-distributed-systems-tracing.html](http://highscalability.com/blog/2010/4/27/paper-dapper-googles-large-scale-distributed-systems-tracing.html)

------
gosub
I'd prefer to have two logs, one for human reading and one for machine
consumption.

~~~
sneak
DRY. Just make a text filter that takes the machine ones on stdin and prints
out the human ones on stdout. Unix, baby!

------
latchkey
Apache Flume also xfers things in JSON.

------
loeg
s,JSON,BSON,g

These logs are for machines, and should be compact -- why use JSON at all?

~~~
CoffeeDregs
Two reasons:

    
    
        * Easily read by human *and* by machine;
    
        * Highly compressible, since it's full of repeating strings.
    

My Apache logs compress by about 94%. Chances are good that you wouldn't find
much of a size difference between gzipped JSON and gzipped BSON (though
parsing BSON would be faster).
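
A quick way to see the effect (a sketch with a made-up log entry; identical
repeated lines like this overstate the ratio compared to a real log):

```python
import gzip
import json

# Hypothetical access-log entry, repeated to mimic a log's redundancy.
entry = {
    "host": "host.com",
    "timestamp": "2011-12-25 06:07:42",
    "method": "POST",
    "path": "/page/open",
    "protocol": "HTTP/1.1",
    "response_code": 200,
    "content_length": 1200,
}
raw = ("\n".join(json.dumps(entry) for _ in range(10000)) + "\n").encode()
packed = gzip.compress(raw)
print(f"{(1 - len(packed) / len(raw)) * 100:.1f}% smaller")
```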

~~~
loeg
One could imagine a "bsoncat" utility which simply reads BSON objects on stdin
and emits formatted JSON objects on stdout. I don't know, the difference
probably doesn't matter.

------
angersock
Oh christ. I'm pretty sure my sysadmin friends would murder anyone that tried
this on their boxen. As an engineer, while I appreciate the amount of shiny
going on here, I doubt that adding yet another layer of tooling requirements
to dealing with logs is the right answer--this could lead to serious ax-
sharpening over picking the best JSON-to-readable scripts, and you can bet
your ass that'll be different on every machine, install, site, etc. etc.

All that being said, I could see this being hilariously useful in game
development. Having little JSON crumbs dribble out during playtesting and QA
would be really handy, and that is an area where data mining is actually quite
useful. Having a trivial way of sorting through crumbs by issue or tester, and
then being able to, say, project the bugs into an HTML5/WebGL-based tool for
devs to use would be really nifty.

~~~
HalibetLector
Social games have been doing this for 2+ years already, with JSON log files
scraped into various databases and viewed using web tools for everything from
A/B test impressions to virality to monetization to error logging (especially
useful when you can't get the error to reproduce locally).

------
johnjhayes
Before tea, standard sysadmin response: NO

After tea and some thought: No thanks

