Fast extraction of a time range from syslog logfile?
- by mike
I've got a logfile in the standard syslog format. It looks like this, except with hundreds of lines per second:
Jan 11 07:48:46 blahblahblah...
Jan 11 07:49:00 blahblahblah...
Jan 11 07:50:13 blahblahblah...
Jan 11 07:51:22 blahblahblah...
Jan 11 07:58:04 blahblahblah...
The log doesn't rotate at exactly midnight, but it never contains more than two days of entries.
I often have to extract a time slice from this file. I'd like to write a general-purpose script for this that I can call like this:
$ timegrep 22:30-02:00 /logs/something.log
...and have it pull out the lines from 22:30, onward across the midnight boundary, until 2am the next day.
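As a rough illustration of the command-line argument described above, here is a minimal Python sketch of parsing the `HH:MM-HH:MM` spec and detecting whether it wraps past midnight. The function name `parse_time_range` is hypothetical, and treating an end time at or before the start time as "next day" is an assumption about the intended behaviour.

```python
from datetime import time


def parse_time_range(spec):
    """Parse 'HH:MM-HH:MM' into (start, end, crosses_midnight)."""
    start_text, end_text = spec.split("-", 1)
    start = time.fromisoformat(start_text)
    end = time.fromisoformat(end_text)
    # Assumption: an end time at or before the start time means the
    # range runs across midnight into the following day.
    return start, end, end <= start


start, end, wraps = parse_time_range("22:30-02:00")
print(start, end, wraps)
```

The dates themselves would still have to be filled in from the timestamps found in the log, which this sketch does not attempt.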
There are a few caveats:
I don't want to have to bother typing the date(s) on the command line, just the times. The program should be smart enough to figure them out.
The log date format doesn't include the year, so it should guess based on the current year, but nonetheless do the right thing around New Year's Day.
I want it to be fast -- since the lines are in chronological order, it should seek around in the file and binary-search for the start and end of the range instead of reading every line.
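To make the last two caveats concrete, here is a minimal Python sketch of the year guess and the seek-based binary search. It is an illustration under stated assumptions, not an existing tool: the names `parse_stamp`, `find_offset`, and `extract` are hypothetical, every line is assumed to start with a well-formed 15-character syslog stamp with English month names, the file is assumed to be sorted, and a stamp more than a day ahead of "now" is assumed to belong to the previous year.

```python
import os
from datetime import datetime, timedelta


def parse_stamp(line, today):
    """Parse a leading 'Jan 11 07:48:46' stamp from a bytes line."""
    text = line[:15].decode("ascii", "replace")
    # Assumption: month names are English, as in standard syslog output.
    stamp = datetime.strptime(f"{today.year} {text}", "%Y %b %d %H:%M:%S")
    # Year guess: a stamp well in the future must be from last year
    # (e.g. 'Dec 31' lines read on Jan 1).
    if stamp > today + timedelta(days=1):
        stamp = stamp.replace(year=today.year - 1)
    return stamp


def find_offset(f, target, today):
    """Byte offset of the first line stamped >= target (f opened in 'rb')."""
    f.seek(0, os.SEEK_END)
    lo, hi = 0, f.tell()
    while lo < hi:
        mid = (lo + hi) // 2
        # Start one byte early so that a mid sitting exactly on a line
        # start is examined rather than skipped.
        f.seek(max(mid - 1, 0))
        if mid > 0:
            f.readline()  # discard the (possibly partial) current line
        line = f.readline()
        if not line or parse_stamp(line, today) >= target:
            hi = mid
        else:
            lo = f.tell()  # start of the line after the one just tested
    return lo


def extract(path, start, end, today):
    """Return the bytes of all lines with start <= stamp < end."""
    with open(path, "rb") as f:
        begin = find_offset(f, start, today)
        stop = find_offset(f, end, today)
        f.seek(begin)
        return f.read(max(stop - begin, 0))
```

Each lookup reads only a logarithmic number of lines, so the cost depends mostly on the size of the extracted slice rather than on the size of the file. Passing `today` explicitly, instead of calling `datetime.now()` inside, keeps the year guess easy to test around New Year's Day.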
Before I spend a bunch of time writing this, does it already exist?