nagios wrongly reports packet loss
Posted by Alien Life Form on Server Fault
Published on 2012-11-02T09:11:55Z
Indexed on 2012/11/02 11:03 UTC
Lately, my nagios 3.2.3 install (CentOS 5, monitoring ~300 hosts, 1150 services) has started to occasionally report high packet loss on 50-60 hosts at a time. The problem is that it's bogus: manual runs of ping (or of nagios's own check_ping binary) find no fault with any of the affected hosts. The only possible cures I have found so far are:
- run all the checks manually (they will succeed but it may act up again on next check)
- acknowledge and wait for the problem to go away (may take several hours)
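For reference, the manual re-check described in the first item could be scripted roughly as follows. This is only a sketch under assumptions: the plugin path, the warning/critical thresholds, and the helper names `build_check_ping_cmd` and `recheck` are illustrative and not taken from the setup above. The `-H`, `-w`, `-c` and `-p` options are the standard check_ping ones.

```python
import subprocess

# Assumed plugin location; it differs between distributions and architectures.
CHECK_PING = "/usr/lib64/nagios/plugins/check_ping"


def build_check_ping_cmd(host, warn="100.0,20%", crit="500.0,60%", packets=5):
    """Return the argv list for one manual check_ping run against `host`.

    The thresholds are illustrative (round-trip ms, packet loss %),
    not the ones used by the install described in the post.
    """
    return [CHECK_PING, "-H", host, "-w", warn, "-c", crit, "-p", str(packets)]


def recheck(hosts):
    """Run check_ping once per host.

    Returns {host: (exit_code, first_output_line)}; by plugin convention
    exit code 0 is OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
    Raises FileNotFoundError if CHECK_PING does not exist on this machine.
    """
    results = {}
    for host in hosts:
        proc = subprocess.run(
            build_check_ping_cmd(host),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        lines = proc.stdout.splitlines()
        results[host] = (proc.returncode, lines[0] if lines else "")
    return results
```

Calling `recheck(["192.0.2.10", "192.0.2.11"])` (placeholder addresses) from the monitoring server would show whether the plugin itself sees any loss when the hosts are checked one at a time rather than in a burst.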
I suspect (with no particular evidence other than individually rescheduled checks succeeding) that the problem may lie with all the checks being mass-scheduled together, in which case introducing some jitter in the scheduling (how?) might help. Or it may be something completely different.
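To make the jitter idea concrete, here is a minimal illustration of what spreading checks over a window means. It is not how nagios schedules checks internally, and the function name and parameters are hypothetical. Nagios 3 does expose related settings in nagios.cfg (for example `service_inter_check_delay_method` and `max_service_check_spread`), but whether they address this particular symptom is an open question.

```python
import random


def jittered_offsets(n_checks, window_seconds, jitter_fraction=0.5, seed=None):
    """Spread n_checks start times across window_seconds.

    Each check gets an evenly spaced slot, and a random offset of up to
    jitter_fraction of one slot width is added so that checks do not all
    fire at the same instant. Returns offsets in seconds from window start.
    """
    if n_checks <= 0:
        return []
    rng = random.Random(seed)
    slot = window_seconds / n_checks
    return [i * slot + rng.uniform(0, jitter_fraction * slot)
            for i in range(n_checks)]
```

For instance, `jittered_offsets(300, 300)` would give one start time per host spread across a five-minute window, instead of 300 pings leaving the monitoring server in the same second.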
Ideas, anyone?
© Server Fault or respective owner