The “Spot the log entry” contest
February 6, 2007
As programmers, most of us were taught to write comments, externalize constants, indent code and, of course, log the flow of control. Credit goes to the open source community for providing wonderful frameworks such as Log4j to enable logging.
The framework authors did foresee the problem of growing log files and were smart enough to provide a means of controlling the level of logging. While this helps a lot in categorizing log entries, it does not solve the most commonly occurring problem for a developer:
That of “Spotting the log entry” that is most relevant to the situation.
Let me explain. Picture this: a large application has been deployed in production for a while. The log level has been set to INFO because the application support team argues that ERROR does not give them enough detail for troubleshooting. In the application code, a few developers have gotten carried away and have logged the entry and exit of each method call (re-inventing the “around” advice in AOP, if you may call it that). The result: log files that have rolled over many times, adding up to around 100 MB across just as many files.
These files are not always easily available to the developer debugging a problem that has been reported in production. The developer needs to quickly find the log entry that helps to diagnose the issue, and here the “Spot the log entry” contest begins.
Obstacles on this course include, but are not limited to:
- Having to sift through potentially hundreds of thousands of log entries
- Log files located on remote machines accessible via protocols like SSH/SFTP.
- Noise in log files – entries that log entry and exit from methods
- Absence of tools to analyze the structured data contained in log entries. Text editors are an often-used, but poor, choice.
- Inability to analyze contiguous log events. For example, log entries made from one thread do not appear together in a log file in medium to heavy use systems.
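To make the last obstacle concrete, here is a minimal Java sketch that regroups interleaved entries by thread. It assumes each entry sits on one line in the layout `<date> <time> [<thread>] <LEVEL> <category> - <message>`, roughly what a Log4j pattern such as `%d{ISO8601} [%t] %-5p %c - %m%n` would produce. The class name `ThreadGrouper` and the sample entries are hypothetical, and your own layout may well differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: regroups interleaved log lines so that each thread's
// entries can be read together, in their original order.
final class ThreadGrouper {

    // Assumed layout: "<date> <time> [<thread>] <LEVEL> <category> - <message>"
    private static final Pattern LINE =
        Pattern.compile("^(\\S+ \\S+) \\[([^\\]]+)\\] (\\w+)\\s+(\\S+) - (.*)$");

    static Map<String, List<String>> groupByThread(List<String> lines) {
        // LinkedHashMap keeps threads in order of first appearance.
        Map<String, List<String>> byThread = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                continue; // e.g. stack trace lines; ignored in this sketch
            }
            byThread.computeIfAbsent(m.group(2), k -> new ArrayList<>()).add(line);
        }
        return byThread;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "2007-02-06 10:15:01,120 [http-1] INFO  com.example.Orders - start order 17",
            "2007-02-06 10:15:01,121 [http-2] INFO  com.example.Orders - start order 18",
            "2007-02-06 10:15:01,300 [http-1] ERROR com.example.Orders - order 17 failed");
        groupByThread(lines).forEach((thread, entries) -> {
            System.out.println(thread);
            entries.forEach(e -> System.out.println("  " + e));
        });
    }
}
```

Lines that do not match the assumed layout are simply dropped here; a real tool would want to attach them to the preceding entry instead.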
There are a few ways in which this problem can be addressed:
- Avoid AOP-like log statements in code. Use AOP at run time to instrument byte-code on the fly if required.
- Define a clear logging strategy.
- Log an error where it is first encountered, not each time it is handled as it propagates up the call stack.
- Use appropriate log levels to differentiate between debug information, messages, warnings and errors.
- Use log patterns that provide sufficient information to analyze the flow, for example timestamp, thread, category and priority.
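The "log it once" advice above can be sketched as follows. To keep the example self-contained it uses `java.util.logging` from the JDK rather than Log4j, so `Level.SEVERE` stands in for Log4j's ERROR. The class `OrderService` and its methods are hypothetical names invented for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical service: the failure is logged where it is first seen,
// and the layers above only propagate it.
final class OrderService {

    private static final Logger LOG = Logger.getLogger(OrderService.class.getName());

    // Lowest layer: the error is first encountered here, so it is logged
    // here, once, with its stack trace.
    static int parseQuantity(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            LOG.log(Level.SEVERE, "Invalid quantity: " + raw, e);
            throw new IllegalArgumentException("invalid quantity", e);
        }
    }

    // Middle layer: no catch-log-rethrow, the exception just passes through.
    static int placeOrder(String rawQuantity) {
        return parseQuantity(rawQuantity);
    }

    // Top layer: also propagates without logging the same failure again.
    static int handleRequest(String rawQuantity) {
        return placeOrder(rawQuantity);
    }
}
```

With this shape, a bad request produces one error entry instead of one per stack frame, which is one less source of noise when the log has to be searched later.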
What if your log files are still huge after all this? It's time to invest in tools that help you spot your log entry.
Some of us at MindTree (http://www.mindtree.com) looked around for open source tools for log analysis when we had to inspect logs from around a dozen servers. Chainsaw (http://logging.apache.org/log4j/docs/chainsaw.html) was a decent implementation but not good enough. Commercial tools were not satisfactory either.
That's when we decided to implement Insight – an application analysis tool.
To start with, it was conceived to do comprehensive log analysis. In brief it provides the following:
- Provides visual analysis of any pattern-based log files
- Analyzes logs from remote servers over (S)FTP and HTTP
- Supports tailing of local files and a plug-in for Eclipse
- Provides summary and detailed view of the log event
- Supports “non-mutating” analysis of the data set – such as search and sort.
- Supports “mutating” analysis of data set – via progressive filtering
- Helps to locate the “context” of an event, i.e. a snapshot of the log entries around a specific log entry.
- Optimized for performance and footprint size
- Loads 1000 entries in around 375 ms
- VM size between 45 and 60 MB even after loading 110 000 entries
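Two of the ideas in the list above, the "context" of an event and progressive filtering, can be pictured with a small Java sketch. This is not Insight's implementation, only an illustration of the concepts on a plain list of strings; the class `LogView` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration of "context" and progressive filtering.
final class LogView {

    // Context: up to `radius` entries before and after the entry at `index`,
    // clipped at the start and end of the data set.
    static List<String> context(List<String> entries, int index, int radius) {
        if (index < 0 || index >= entries.size()) {
            throw new IndexOutOfBoundsException("no entry at index " + index);
        }
        int from = Math.max(0, index - radius);
        int to = Math.min(entries.size(), index + radius + 1);
        return new ArrayList<>(entries.subList(from, to));
    }

    // One filtering step: keeps only the entries accepted by `keep`.
    // Applying it again to its own result narrows the working set further.
    static List<String> filter(List<String> entries, Predicate<String> keep) {
        List<String> kept = new ArrayList<>();
        for (String entry : entries) {
            if (keep.test(entry)) {
                kept.add(entry);
            }
        }
        return kept;
    }
}
```

A session might first call `filter` to keep only ERROR entries, then `filter` again on that result for one category, and finally call `context` on the full data set to see what happened around the entry of interest.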
See the attached presentation for details on Insight and testimonials: Insight features
Our developers are now front runners in the “Spot the log entry” contest :)
MindTree Insight is now an open source project on SourceForge and is available at:
The download of the latest release is available at: