Apache logging issues
- by Dan
I'm trying to parse apache log files, but I'm finding some strange results and I'm not sure what they mean. Hopefully someone can provide some insight. (all of the IP addresses were altered. none actually start with 192, I didn't figure the search engines mattered though.)
In the first example, multiple ip addresses are showing up in the host field:
192.249.71.25 - - [04/Aug/2009:04:21:44 -0500] "GET /publications/example.pdf HTTP/1.1" 200 2738
192.0.100.93, 192.20.31.86 - - [04/Aug/2009:04:21:22 -0500] "GET /docs/another.pdf HTTP/1.0" 206 371469
What causes this? Does it have to do with proxy servers? Is there a way to have Apache only log one?
In the second example, a bunch of information is just completely missing! What would cause this?
msnbot-65-55-207-50.search.msn.com - - [29/Dec/2009:15:45:16 -0600] "GET /publications/example.pdf HTTP/1.1" 200 3470073 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)" 266 3476792
- - - - "-" - - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.1)" 285 594
- - - - "-" - - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.1)" 285 4195
- - - - "-" - - "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.1)" 299 109218
crawl-17c.cuil.com - - [29/Dec/2009:15:45:46 -0600] "GET /publications/another.pdf HTTP/1.0" 200 101481 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)" 253 101704
My CustomLog configuration says:
LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\" %I %O" common