LinuxDevCenter.com

oreilly.comSafari Books Online.Conferences.

We've expanded our Linux news coverage and improved our search! Search for all things Linux across O'Reilly!

Search
Search Tips

advertisement

Listen Print Subscribe to Linux Subscribe to Newsletters

Secure Cooking with Linux, Part 1
Pages: 1, 2, 3

Recipe 9.35: Combining Log Files

Author's note: Each system log file in /var/log contains different information about your running Linux system. This recipe from Chapter 9 on "Testing and Monitoring" shows how to merge multiple log files, keeping the lines in sorted order by timestamp, which is trickier than you might think.



Problem

You want to merge a collection of log files into a single, chronological log file.

Solution

#!/bin/sh
perl -ne \
   'print $last, /last message repeated \d+ times$/ ? "\0" : "\n" if $last;
    chomp($last = $_);    
    if (eof) {
        print;
        undef $last;
    }' "$@" | sort -s -k 1,1M -k 2,2n -k 3,3 | tr '\0' '\n'

Discussion

The system logger automatically prepends a timestamp to each message, like this:

Feb 21 12:34:56 buster kernel: device eth0 entered promiscuous mode

To merge log files, sort each one by its timestamp entries, using the first three fields (month, date, and time) as keys.

A complication arises because the system logger inserts "repetition messages" to conserve log file space:

Feb 21 12:48:16 buster last message repeated 7923 times

The timestamp for the repetition message is often later than the last message. It would be terribly misleading if possibly unrelated messages from other log files were merged between the last message and its associated repetition message.

To avoid this, our Perl script glues together the last message with a subsequent repetition message (if present), inserting a null character between them: this is reliable because the system logger never writes null characters to log files. The script writes out the final line before the end of each file and then forgets the last line, to avoid any possibility of confusion if the next file happens to start with an unrelated repetition message.

The sort command sees these null-glued combinations as single lines, and keeps them together as the files are merged. The null characters are translated back to newlines after the files are sorted, to split the combinations back into separate lines.

We use sort -s to avoid sorting entire lines if all of the keys are equal: this preserves the original order of messages with the same timestamp, at least within each original log file.

If you have configured the system logger to write messages to multiple log files, then you may wish to remove duplicates as you merge. This can be done by using sort -u instead of -s, and adding an extra sort key -k 4 to compare the message contents. There is a drawback, however: messages could be rearranged if they have the same timestamp. All of the issues related to sort -s and -u are consequences of the one-second resolution of the timestamps used by the system logger.

We'll note a few other pitfalls related to timestamps. The system logger does not record the year, so if your log files cross a year boundary, then you will need to merge the log files for each year separately, and concatenate the results. Similarly, the system logger writes timestamps using the local time zone, so you should avoid merging log files that cross a daylight saving time boundary, when the timestamps can go backward. Again, split the log files on either side of the discontinuity, merge separately, and then concatenate.

If your system logger is configured to receive messages from other machines, note that the timestamps are generated on the machine where the log files are stored. This allows consistent sorting of messages even from machines in different time zones.

See Also

sort(1).

Be sure to check back here next week for recipes from Linux Security Cookbook on how to restrict access to network services by time of day, and on how to use sudo to permit read-only access to a shared file.


Return to Linux DevCenter.




Tagged Articles

Be the first to post this article to del.icio.us

Sponsored Resources

  • Inside Lightroom
Advertisement

Sponsored by:

O'Reilly Media

©2009, O'Reilly Media, Inc.
(707) 827-7000 / (800) 998-9938
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
About O'Reilly
Academic Solutions
Authors
Contacts
Customer Service
Jobs
Newsletters
O'Reilly Labs
Press Room
Privacy Policy
RSS Feeds
Terms of Service
User Groups
Writing for O'Reilly
Content Archive
Business Technology
Computer Technology
Google
Microsoft
Mobile
Network
Operating System
Digital Photography
Programming
Software
Web
Web Design
More O'Reilly Sites
O'Reilly Radar
Ignite
Tools of Change for Publishing
Digital Media
Inside iPhone
O'Reilly FYI
makezine.com
craftzine.com
hackszine.com
perl.com
xml.com

Partner Sites
InsideRIA
java.net
O'Reilly Insights on Forbes.com