Monday, November 03, 2008

An unusual approach to log parsing

I saw an interesting article about logging today on reddit, and it struck a nerve with me, specifically how most text logs are not designed for easy parsing.  (I don't agree with the second point, though -- sometimes logging is tracing, or perhaps more accurately, sometimes tracing is logging.)

We had a lot of log and trace data at Mozy, several GB per day.  A traditional parsing approach would have been tedious and prone to regressions when the messages generated by the server changed.  So Paul Cannon, a frustrated lisp programmer, designed a system where the log API looked something like this:

self.log('command_reply(%r, %r)' % (command, arg))

Then the log processor would define the vocabulary (command_reply, etc.) and, instead of parsing the log messages, eval them!   This is an approach that wouldn't have occurred to me, nor would I have thought of using partial function application to simplify passing state (from the log processor and/or previous log entries) to these functions.  (e.g., the entry for command_reply in the eval namespace might be 'command_reply': partial(self.command_reply, db_cursor, thread_id))

There are drawbacks to this approach; perhaps the largest is that this works best in homogeneous systems.  Python's repr function (invoked by the %r formatting character) is great at taking care of any quoting issues necessary when dealing with Python primitives, as well as custom objects with some help from the programmer.  But when we started having a C++ system also log messages to this system, it took them several tries to fix all the corner cases involved in generating messages that were valid python code.

On balance, I think this un-parsing approach was a huge win, and as the first application of "code is data" that made more than theoretical sense to me it was a real "eureka!" moment.