Nagios wrongly reports packet loss

Posted by Alien Life Form on Server Fault
Published on 2012-11-02T09:11:55Z Indexed on 2012/11/02 11:03 UTC

Lately, my Nagios 3.2.3 install (CentOS 5, monitoring ~300 hosts and 1150 services) has started to occasionally report high packet loss on 50-60 hosts at a time. The problem is that it's bogus: manual runs of ping (or of Nagios's own check_ping binary) find no fault with any of the affected hosts. The only possible cures I have found so far are:

  1. run all the checks manually (they will succeed, but it may act up again on the next check)
  2. acknowledge and wait for the problem to go away (may take several hours)

I suspect (though my only evidence is that individually rescheduled checks succeed) that the problem may lie with all the checks being mass-scheduled together, in which case introducing some jitter in the scheduling (how?) might help. Or it may be something completely different.
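
To make the jitter idea concrete, here is a minimal Python sketch of what spreading checks over a time window would look like. This is not Nagios code and does not touch the Nagios scheduler; the function names, host names, and the 300-second window are all assumptions made purely for illustration.

```python
import random


def jittered_offsets(hosts, window_seconds, seed=None):
    """Give each host a random start offset (in seconds) inside the window.

    Hypothetical helper: it only illustrates the idea of jitter.
    """
    rng = random.Random(seed)
    return {host: rng.uniform(0, window_seconds) for host in hosts}


def evenly_spread_offsets(hosts, window_seconds):
    """Deterministic alternative: space the checks evenly across the window."""
    step = window_seconds / max(len(hosts), 1)
    return {host: index * step for index, host in enumerate(hosts)}


if __name__ == "__main__":
    # Hypothetical host names standing in for the 50-60 affected hosts.
    hosts = [f"host{i:03d}" for i in range(60)]
    offsets = jittered_offsets(hosts, window_seconds=300, seed=1)
    # Show the first few checks in the order they would fire.
    for host, offset in sorted(offsets.items(), key=lambda kv: kv[1])[:5]:
        print(f"{host}: start check at +{offset:.1f}s")
```

Either way the intent is the same: instead of 60 ping checks firing in the same instant, each one starts at a different point in the window, so a burst of simultaneous ICMP traffic is less likely to be the cause of the reported loss.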

Ideas, anyone?

© Server Fault or respective owner