AWK: compare Apache dates without using regular expressions

Posted by smallmeans on Stack Overflow
Published on 2010-05-15T20:34:21Z Indexed on 2010/05/15 20:44 UTC

I'm writing a log analysis application and want to grab Apache log records between two given dates. Assume that a date is formatted like this: 22/Dec/2009:00:19 (day/month/year:hour:minute)

Currently, I'm using a regular expression to replace the month name with its numeric value and remove the separators, so the above date is converted to 200912220019 (year, month, day, hour, minute), which makes date comparison trivial. But...

Running a regex on each record of a large file, say one containing a quarter of a million records, is extremely costly. Is there any other method that doesn't involve regex substitution?

Thanks in advance

Edit: here's the function doing the conversion/comparison:

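function dateInRange(t, from, to,    a, s) {   # a and s are local variables
    sub(/[[]/, "", t);                         # drop the leading "["
    split(t, a, "[/:]");                       # a[1]=day a[2]=month name a[3]=year a[4]=hour a[5]=minute
    # Find the month name in the packed string: RSTART is 1 for Jan, 4 for Feb, ..., 34 for Dec.
    match("JanFebMarAprMayJunJulAugSepOctNovDec", a[2]);
    a[2] = sprintf("%02d", (RSTART + 2) / 3);  # month number, zero-padded
    s = a[3] a[2] a[1] a[4] a[5];              # year month day hour minute, e.g. 200912220019
    return s >= from && s <= to;               # s is a string, so this is a string comparison
}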

"from" and "to" are the bounds of the range in the aforementioned converted format, and "t" is the raw Apache log date/time field (e.g. [22/Dec/2009:00:19:36).
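
For comparison, here is a minimal sketch of one possible regex-free variant. It relies on the timestamp being fixed-width, so the pieces can be cut out with substr() and the month looked up in an array built once in BEGIN. The names filter_range, mon, names and key are made up for illustration, and it assumes the timestamp is field 4 of each record (as in the common/combined log format) and that the bounds are given as YYYYMMDDHHMM. Whether this is actually faster on a given awk would need to be measured.

```sh
#!/bin/sh
# Hypothetical helper (not from the original post): print only the log
# lines whose timestamp lies between $1 and $2, both given as YYYYMMDDHHMM.
# Assumes field 4 looks like "[22/Dec/2009:00:19:36".
filter_range() {
    awk -v from="$1" -v to="$2" '
    BEGIN {
        # Build the month-name -> "01".."12" lookup table once.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++)
            mon[names[i]] = sprintf("%02d", i)
        # Force the bounds to strings so the comparisons below are string comparisons.
        from = from ""
        to = to ""
    }
    {
        t = $4
        # Fixed offsets inside "[DD/Mon/YYYY:HH:MM:SS":
        # year at 9, month name at 5, day at 2, hour at 14, minute at 17.
        key = substr(t, 9, 4) mon[substr(t, 5, 3)] substr(t, 2, 2) substr(t, 14, 2) substr(t, 17, 2)
        if (key >= from && key <= to)
            print
    }'
}
```

It could be used as: filter_range 200912220000 200912222359 < access.log (access.log being a placeholder file name). Lines whose fourth field does not match the assumed layout produce a malformed key, so this sketch only makes sense on well-formed logs.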

© Stack Overflow or respective owner
