Search Results

Search found 334 results on 14 pages for 'nagios'.

Page 3/14 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Nagios notifications definitions

    - by Colin
    I am trying to monitor a web server by searching for a particular string on a page via HTTP. The commands are defined in command.cfg as follows:

        # 'check_http-mysite command definition'
        define command {
            command_name check_http-mysite
            command_line /usr/lib/nagios/plugins/check_http -H mysite.example.com -s "Some text"
        }

        # 'notify-host-by-sms' command definition
        define command {
            command_name notify-host-by-sms
            command_line /usr/bin/send_sms $CONTACTPAGER$ "Nagios - $NOTIFICATIONTYPE$ :Host$HOSTALIAS$ is $HOSTSTATE$ ($OUTPUT$)"
        }

        # 'notify-service-by-sms' command definition
        define command {
            command_name notify-service-by-sms
            command_line /usr/bin/send_sms $CONTACTPAGER$ "Nagios - $NOTIFICATIONTYPE$: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ ($OUTPUT$)"
        }

    If nagios doesn't find "Some text" on the home page of mysite.example.com, it should notify a contact via SMS through the Clickatell HTTP API. I have a script for that, which I have tested and found to work fine. Whenever I change the command definition to search for a string that is not on the page and restart nagios, I can see on the web interface that the string was not found. What I don't understand is why the notification isn't sent, even though I have defined the host, hostgroup, contact, contactgroup, service and so forth. In the web interface (through the CGI) I can see that notifications have been defined and enabled, yet I get neither email nor SMS notifications during hard state changes. What am I missing? These are my definitions:

    host.cfg

        define host {
            use generic-host
            host_name HAL
            alias IBM-1
            address xxx.xxx.xxx.xxx
            check_command check_http-mysite
        }

    *hostgroups_nagios2.cfg*

        # my website
        define hostgroup{
            hostgroup_name my-servers
            alias All My Servers
            members HAL
        }

    *contacts_nagios2.cfg*

        define contact {
            contact_name colin
            alias Colin Y
            service_notification_period 24x7
            host_notification_period 24x7
            service_notification_options w,u,c,r,f,s
            host_notification_options d,u,r,f,s
            service_notification_commands notify-service-by-email,notify-service-by-sms
            host_notification_commands notify-host-by-email,notify-host-by-sms
            email [email protected]
            pager +254xxxxxxxxx
        }

        define contactgroup{
            contactgroup_name site_admin
            alias Site Administrator
            members colin
        }

    *services_nagios2.cfg*

        # check for particular string in page via http
        define service {
            hostgroup_name my-servers
            service_description STRING CHECK
            check_command check_http-mysite
            use generic-service
            notification_interval 0 ; set > 0 if you want to be renotified
            contacts colin
            contact_groups site_admin
        }

    Could someone please tell me where I'm going wrong? Here are the generic-host and generic-service definitions:

    *generic-service_nagios2.cfg*

        # generic service template definition
        define service{
            name generic-service ; The 'name' of this service template
            active_checks_enabled 1 ; Active service checks are enabled
            passive_checks_enabled 1 ; Passive service checks are enabled/accepted
            parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
            obsess_over_service 1 ; We should obsess over this service (if necessary)
            check_freshness 0 ; Default is to NOT check service 'freshness'
            notifications_enabled 1 ; Service notifications are enabled
            event_handler_enabled 1 ; Service event handler is enabled
            flap_detection_enabled 1 ; Flap detection is enabled
            failure_prediction_enabled 1 ; Failure prediction is enabled
            process_perf_data 1 ; Process performance data
            retain_status_information 1 ; Retain status information across program restarts
            retain_nonstatus_information 1 ; Retain non-status information across program restarts
            notification_interval 0 ; Only send notifications on status change by default.
            is_volatile 0
            check_period 24x7
            normal_check_interval 5
            retry_check_interval 1
            max_check_attempts 4
            notification_period 24x7
            notification_options w,u,c,r
            contact_groups site_admin
            register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

    *generic-host_nagios2.cfg*

        define host{
            name generic-host ; The name of this host template
            notifications_enabled 1 ; Host notifications are enabled
            event_handler_enabled 1 ; Host event handler is enabled
            flap_detection_enabled 1 ; Flap detection is enabled
            failure_prediction_enabled 1 ; Failure prediction is enabled
            process_perf_data 1 ; Process performance data
            retain_status_information 1 ; Retain status information across program restarts
            retain_nonstatus_information 1 ; Retain non-status information across program restarts
            max_check_attempts 10
            notification_interval 0
            notification_period 24x7
            notification_options d,u,r
            contact_groups site_admin
            register 1 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

    Read the article

  • Integrating Nagios with a ticketing system/incident management system

    - by sektor
    Is there a free ticketing system/incident management system which will help me in achieving the following? 1) If a service goes down then Nagios alerts the on-duty staff and pushes the status to some backend or DB as a ticket, say the initial status is "New". 2) The on-duty staff logs in through a frontend and acknowledges the new ticket by marking it as "In progress", so now the status of the ticket changes from "New" to "In progress". 3) If even after "n" number of minutes no person from on-duty staff has changed the ticket status to "In progress" then Nagios alerts the next level of contacts. Although if the on-duty staff has acknowledged the ticket then there is no need to alert the next level. 4) When the service comes up Nagios closes the ticket by marking it "Closed" Now I already have Nagios monitoring set up and currently it alerts by sending text messages and mails, what I'm looking for is some framework which only escalates the issue(alerts the second level) if the first level(on-duty staff) fails to respond to the initial alert. By "responding to the alert" I mean, the on-duty staff can login via some frontend and basically change the status to something like "Acknowledged" or "In progress".

    Read the article

  • Nagios check_host_alive and check_ping not showing host as down

    - by Kyle
    I am using the check_host_alive command to send 5 packets every minute to all my routers at remote locations. Today I received a notification from the AT&T Global Client Support Center that a router was down (these notices can take 5-30 minutes to arrive), but I never received a notice from Nagios. I went into Nagios and it was showing the host as alive with a latency of 0ms. This tells me it is treating the automated "TTL expired in transit" response from my router in the data center as a reply from the remote router. Is there any way for me to tell Nagios to check where the reply is coming from? I feel like other people must have had this issue... I tested it with the check_ping command and it produced the same results. I have the command defined as %hostname% and the proper IP in the host definition, and it works fine for telling me the latency is high. Any ideas are welcome; I have already exercised my Google skills with no results.

    EDIT:

        root@IM-UBTU:/# /usr/local/nagios/libexec/check_ping -H 192.168.250.1 -w 100.0,10% -c 200.0,20% -vvv
        CMD: /bin/ping -n -U -w 10 -c 5 192.168.250.1
        Output: PING 192.168.250.1 (192.168.250.1) 56(84) bytes of data.
        Output: From 10.69.10.2 icmp_seq=1 Time to live exceeded

    It knows something is wrong, so why doesn't it give me a warning?
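The verbose output in the question shows the answer coming from 10.69.10.2 rather than from the target address. As a minimal sketch of the idea of "checking where the reply comes from", a wrapper could accept only echo replies whose source matches the target. The function name `classify_ping_output` is hypothetical and not part of check_ping, and the reply format is assumed to be that of Linux `ping -n`.

```python
import re

# Standard Nagios plugin exit codes used in this sketch.
OK, CRITICAL = 0, 2


def classify_ping_output(output: str, target_ip: str) -> int:
    """Return OK only if at least one echo reply came from target_ip.

    Assumes Linux `ping -n` style reply lines such as
    "64 bytes from 192.168.250.1: icmp_seq=1 ttl=64 time=0.5 ms".
    ICMP errors such as "From 10.69.10.2 icmp_seq=1 Time to live exceeded"
    do not match this pattern and therefore never count as replies.
    """
    for line in output.splitlines():
        match = re.search(r"bytes from ([0-9.]+):", line)
        if match and match.group(1) == target_ip:
            return OK
    return CRITICAL
```

With the output quoted in the question, this sketch returns CRITICAL, because no line is an echo reply from 192.168.250.1.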

    Read the article

  • Not all events appear in Nagios history (archive)

    - by Lars
    In the "Host & Service history" of my check_mk interface I can see various events, but a lot of events are missing. The default interface at "View Alert History For This Service" and the logfiles in /var/log/nagios/archives/*.log show the same issue: I can see many events from the last few days, but not all of them. In /etc/nagios/nagios.cfg the options log_event_handlers, log_initial_status and log_passive_checks are set to 0; the other log_... options are set to 1. I don't think any of these options is the reason that not all events are logged. What could cause this problem?

    Read the article

  • Distributed Nagios Installation

    - by kruczkowski
    I'm looking for a plug-in or product that will act as a remote probe and perform tests then send back the results to the central Nagios server. Reason for this is that I'd like to monitor internal systems and servers at customers, but don't want to allow all the traffic passing the firewalls. Ideally I'd like a soft-probe that would be installed and then perform the tests and send back the results (via SSH) to the central Nagios installation. Does anyone know of a product or plug-in that would offer such service? If not Nagios, is there any other monitoring system that does such a thing (ideally open-source)?

    Read the article

  • Configuring Nagios BGP plugin on Ubuntu

    - by user141610
    I am trying to configure nagios check_bgp_neighbors plug-in on Ubuntu and followed README file of check_bgp_neighbors plug-in. I have made following changes: define command{ command_name check_bgp_all command_line $USER1$/check_bgp_neighbors -H $HOSTADDRESS$ -C $USER3$ -n $ARG1$ -n $ARG2$ } to define command{ command_name check_bgp_all command_line /usr/local/nagios/libexec/check_bgp_neighbors.sh -H xx.xx.xx.49 -C xx.xx.xx.50 And define service{ use server-service hostgroup_name svc-bgp1 service_description BGP Check 1 check_command check_bgp_all!10.0.0.1!172.16.0.2 } to define service{ use generic-service hostgroup_name svc-bgp1 service_description BGP Check 1 check_command check_bgp_all!xx.xx.xx.50 } xx.xx.xx.49 is the IP of the host router and xx.xx.xx.50 is the IP of eBGP neighbour. After that it shows critical status. I know my command is not correct but cannot detect the problem. I learned that in this plug-in user-name and password of the host router are required but don't know how and where to provide it. Nagios log does not show any error message. Status information: Failed: status:0 prefixes:0 sent:0 received:0

    Read the article

  • Nagios service active only when other service is failing

    - by Laimoncijus
    Is it possible to define a service that is active only while another service is failing? Consider the following example: 2 hosts are available, HostA (primary) and HostB (backup). A Nagios service monitors the number of active connections to the host: it gives OK when the number of connections to the host is > 0 and gives FAILURE when the number of connections to the host = 0. If I set up the nagios service to monitor both hosts, HostA and HostB, it will give me OK for HostA (while it is primary and all connections normally go to it) and FAIL for HostB (while it is backup and receives no connections as long as HostA is alive). Can I make the nagios service for HostB somehow depend on the service of HostA and give no failures (or maybe be inactive) until the service of HostA starts failing?

    Read the article

  • Setting nagios location in map

    - by Mech Software
    I have Nagios installed and I'm working on getting the network map correct. The problem I have is that "Nagios" appears to be in the "internet" when it should be located on the MechNAS server. What I want is for the Nagios Process to show up inside the local network, at the same layer as MechNAS and development. Where exactly is that configured? I didn't see any place to set that up, and it now looks like it's out there on its own. Documentation and Googling didn't turn up anything either.

    Read the article

  • Nagios state transition and event handler issue

    - by Dattatray
    We are using Nagios to check for duplicate processes.

        define service {
            use local-service
            host_name xxx
            service_description xxx Duplicate Processes
            check_interval 1
            max_check_attempts 1
            contact_groups admins
            event_handler restart-dependent-processes
            check_command check_procs_duplicate!2!3!2!2!2
        }

    check_procs_duplicate checks whether there are any duplicate processes and returns the state, e.g. CRITICAL. The event handler kills the duplicate processes and their dependent processes, then starts one instance of the process and of each dependent process. After that, Nagios checks again for duplicate processes and sets the state accordingly: OK/WARNING/CRITICAL. The event handler takes some time to start the processes, and if someone manually starts the process during this time, the state remains CRITICAL. At the next interval, Nagios checks for duplicate processes again and finds the state CRITICAL again. The event handler will not be executed this time, because the previous and current states are both CRITICAL. Any pointers on how to fix this issue?

    Read the article

  • nagios service check

    - by DRH
    I am new to nagios and we have a small issue I need assistance with. Many of the machines that we monitor can become unresponsive for a while when very CPU-intensive tasks are run. While these hosts are busy, nagios sends warnings and alerts reporting things like 'ping timeout', 'zombie processes' and even swap space warnings, but in reality there is no problem. Is there a way to configure nagios not to send such alerts immediately, but to check x number of times over a period of time and only send an alert at the end of that time if the server in question has not recovered? Looking at the commands.cfg file, I see entries like this:

        define command{
            command_name check_local_swap
            command_line $USER1$/check_swap -w $ARG1$ -c $ARG2$
        }

    How could I modify this example to accomplish what I want above? Thanks
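Nagios only notifies once a problem reaches a hard state, which happens after `max_check_attempts` consecutive failed checks, with rechecks spaced by `retry_interval` (named `retry_check_interval` in older configurations). These directives belong to the service or host definition, not to the command definition quoted in the question. The arithmetic can be sketched as follows; the function name is hypothetical, and the sketch assumes rechecks run exactly on schedule with the default interval length of one minute.

```python
def minutes_until_hard_state(max_check_attempts: int, retry_interval: float) -> float:
    """Minutes between the first failed check and the hard (notifying) state.

    The first failure counts as attempt 1, so only (max_check_attempts - 1)
    rechecks, each retry_interval minutes apart, are needed afterwards.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return (max_check_attempts - 1) * retry_interval
```

For example, `max_check_attempts 4` with `retry_interval 1` delays the first notification by about 3 minutes after the first failed check, and raising either value lengthens the grace period.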

    Read the article

  • Monitoring MySQL SELECT/WRITE/UPDATE/SLOW queries in Nagios

    - by imaginative
    There are ways to get performance graphs with several monitoring software packages out there, such as ZenOSS. There's a plugin available that will graph MySQL SELECT/WRITE/SLOW queries in a nice rrd-style graph. I'm curious whether there is a way to get similar graphs in Nagios 3.0. I know Nagios has tools like pnp and can integrate rrd, but is there something readily available that can plug in to monitor those MySQL specifics?

    Read the article

  • Nagios check for wuauserv on Windows Server 2008+

    - by Mechaflash
    From Windows Server 2008 onwards, wuauserv is no longer a service that runs all of the time and is instead run as a scheduled task. I'm not sure exactly how the scheduled task is created, as the schedule seems to be generated and edited by another service. Prior to this, we set up nagios to just check for the running service to ensure it was accepting updates. My question is: how does one use nagios to track the proper execution of the wuauserv service in Windows Server 2008+ to ensure it is accepting updates?

    Read the article

  • Monitor uWSGI via Nagios: invalid socket

    - by webjay
    I'm trying to monitor uWSGI via Nagios, but according to uWSGI I have specified an invalid socket. I got the socket path from the JSON config file, which also says chmod-socket: 666, so I have a hunch that the problem is permission based. The socket file is owned by www-data, which I don't want to tinker with, so are there any other ways?

        uwsgi --socket=/tmp/app.sock --nagios
        detected binary path: /usr/local/bin/uwsgi
        UWSGI UNKNOWN: you have specified an invalid socket

        ls -l /tmp/app.sock
        srw-rw-rw- 1 www-data www-data 0 2012-10-26 17:00 /tmp/app.sock

    Read the article

  • I'd like to configure nagios to alert only when there are no more mx servers available

    - by user37991
    In my company there are two redundant MX servers, I would like to tell nagios to wake me in the night ONLY if both servers are down. The default behavior is to alert whenever one of the MX servers is down. I would like to set a timeperiod i.e. 23:00 to 06:00 when nagios only alerts me by sms in case both servers are down. I am using nagios3 but I couldn't find something like this in the docs. Thanks
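One way to reason about "alert only if both are down" is to aggregate the member states into a single cluster state, similar in spirit to the check_cluster plugin shipped with the standard plugin collections. The sketch below is an illustration under that assumption only; `cluster_state` and its threshold parameters are hypothetical names.

```python
# Standard Nagios plugin exit codes used in this sketch.
OK, WARNING, CRITICAL = 0, 1, 2


def cluster_state(member_states, warn_at: int, crit_at: int) -> int:
    """Aggregate individual service states into one cluster state.

    member_states: iterable of plugin exit codes, one per MX server.
    warn_at / crit_at: number of non-OK members that triggers each level.
    """
    failed = sum(1 for state in member_states if state != OK)
    if failed >= crit_at:
        return CRITICAL
    if failed >= warn_at:
        return WARNING
    return OK
```

With two MX servers, `warn_at=1` and `crit_at=2` yield WARNING when one server is down and CRITICAL only when both are. The night-time SMS contact could then be limited to critical notifications of the aggregate service, using a notification period covering 23:00 to 06:00.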

    Read the article

  • Nagios shell script cannot be executed

    - by MeinAccount
    I'm trying to monitor GitLab with nagios. I've created the following command definition and shell script but when checking the service I'm receiving the following e-mail. How can I solve this? The file is executable. [...] nagios : 3 incorrect password attempts ; TTY=unknown ; PWD=/ ; USER=git ; COMMAND=/bin/bash -c /var/lib/nagios/custom_plugins/check_gitlab.sh Command definition: define command { command_name custom_check_gitlab command_line /var/lib/nagios/custom_plugins/check_gitlab.sh } Shell script: #! /bin/sh # [...] RAILS_ENV="production" # Script variable names should be lower-case not to conflict with internal /bin/sh variables such as PATH, EDITOR or SHELL. app_root="/home/git/gitlab" app_user="git" unicorn_conf="$app_root/config/unicorn.rb" pid_path="$app_root/tmp/pids" socket_path="$app_root/tmp/sockets" web_server_pid_path="$pid_path/unicorn.pid" sidekiq_pid_path="$pid_path/sidekiq.pid" ### Here ends user configuration ### # Switch to the app_user if it is not he/she who is running the script. if [ "$USER" != "$app_user" ]; then sudo -u "$app_user" -H -i $0 "$@"; exit; fi # Switch to the gitlab path, if it fails exit with an error. if ! cd "$app_root" ; then echo "Failed to cd into $app_root, exiting!"; exit 1 fi ### Init Script functions check_pids(){ if ! mkdir -p "$pid_path"; then echo "Could not create the path $pid_path needed to store the pids." exit 1 fi # If there exists a file which should hold the value of the Unicorn pid: read it. if [ -f "$web_server_pid_path" ]; then wpid=$(cat "$web_server_pid_path") else wpid=0 fi if [ -f "$sidekiq_pid_path" ]; then spid=$(cat "$sidekiq_pid_path") else spid=0 fi } # Checks whether the different parts of the service are already running or not. check_status(){ check_pids # If the web server is running kill -0 $wpid returns true, or rather 0. # Checks of *_status should only check for == 0 or != 0, never anything else. if [ $wpid -ne 0 ]; then kill -0 "$wpid" 2>/dev/null web_status="$?" 
else web_status="-1" fi if [ $spid -ne 0 ]; then kill -0 "$spid" 2>/dev/null sidekiq_status="$?" else sidekiq_status="-1" fi } check_pids check_status if [ "$web_status" != "0" -a "$sidekiq_status" != "0" ]; then echo "GitLab is not running." exit 2 fi if [ "$web_status" != "0" ]; then printf "The GitLab Unicorn webserver is \033[31mnot running\033[0m.\n" exit 1 fi if [ "$sidekiq_status" != "0" ]; then printf "The GitLab Sidekiq job dispatcher is \033[31mnot running\033[0m.\n" exit 1 fi if [ "$web_status" = "0" -a "$sidekiq_status" = "0" ]; then printf "GitLab and all it's components are \033[32mup and running\033[0m.\n" exit 0 fi

    Read the article

  • Nagios simple dashboard

    - by Thomas
    I am looking for a dead simple dashboard for Nagios so our IT team can view the status of our services. An old version of What's Up Gold had a nice dashboard with rectangles that were red, yellow or green depending on the status of the service, and it could easily be displayed on a screen. Is there some copycat dashboard for nagios? Any better recommendation? I want something you can see from your desk 15 meters away: red or green, no need for details.

    Read the article

  • Nagios test of smtp configuration

    - by Funky Si
    Is there a way of configuring a nagios check to verify that an smtp service is correctly configured and emails are going out? I have a check that the service is running, but recently we noticed that the configuration had been altered and no emails were going out, even though the service was still running. One idea I had was to schedule a regular email to be sent. Is it possible for nagios to check for that email and throw an alert if it didn't detect it? Any other ideas for monitoring this would be gratefully received.

    Read the article

  • Nagios only create warning for a http service

    - by MeinAccount
    I would also like to monitor non-crucial services with nagios, for example our GitLab server or phpMyAdmin instance. Is there any way to create just warnings instead of critical errors for some services? At the moment I'm using the following:

        define service {
            host_name localhost
            use generic-service
            service_description HTTP GitLab
            check_command check_www!git.example.com!'/users/sign_in'
        }

        define command {
            command_name check_www
            command_line /usr/lib/nagios/plugins/check_http -H '$ARG1$' -I '$HOSTADDRESS$' -e 'HTTP/1.1 200 OK' -u '$ARG2$'
        }
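Since Nagios derives the service state from the plugin's exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), one possible approach is a small wrapper that runs the real plugin and caps CRITICAL at WARNING. The following is only a sketch under that assumption; `cap_at_warning` and `run_capped` are hypothetical names, not an existing Nagios feature.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def cap_at_warning(exit_code: int) -> int:
    """Map CRITICAL to WARNING and leave every other exit code unchanged."""
    return WARNING if exit_code == CRITICAL else exit_code


def run_capped(argv):
    """Run the wrapped plugin command and return (capped_exit_code, output)."""
    if not argv:
        return UNKNOWN, "UNKNOWN: no plugin command given"
    result = subprocess.run(argv, capture_output=True, text=True)
    return cap_at_warning(result.returncode), result.stdout.strip()
```

A command definition could then call a script built around `run_capped` with the original check_http command line as its arguments, printing the returned output and exiting with the capped code.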

    Read the article

  • Properly escaping check_command in nagios

    - by shadyabhi
    When I execute sudo -u nagios /usr/lib64/nagios/plugins/check_by_ssh.sh hostname "check_haproxy -u \"http://localhost:10000/haproxy?stats\;csv\"" it runs perfectly on the server. For this, I have this in my HAProxy.cfg define service { use generic-service hostgroup_name pwmail-ee-oxweb service_description HAProxy-ee servicegroups ssh-dep check_command check_by_ssh!check_haproxy -u \"http://localhost:10000/haproxy?stats\;csv\" contacts sysad,mail-hosting-rt } It doesn't work. Says that Return code of 127 is out of bounds - plugin may be missing. What am I doing wrong?

    Read the article

  • Message from Nagios Server

    - by user12213
    Nagios Server is monitoring my Server which hosts Windows Sharepoint. I am getting the following 2 alerts in my inbox from Nagios Server 1. Service: C:\ Drive Space State: CRITICAL Additional Info: CRITICAL - Socket timeout after 10 seconds 2. Service: CPU Load State: CRITICAL Additional Info: CRITICAL - Socket timeout after 10 seconds What do I infer from these?

    Read the article

  • PNP4Nagios, nagiosgraph, separate Cacti, or something else for Nagios trending

    - by Matt
    I've been using Nagios for a while now and recently started using Cacti after being dissatisfied with the lack of scaling and lack of any GUI in MRTG. I'm interested in adding trending to my Nagios installation and wondered what was the best route to go. I've looked around a bit and have seen what's available, but there's not a lot of information around to differentiate them from each other. My Nagios install has about 250 hosts and 1100 service checks, but many of them are just simple network devices and there's only about 20 servers and 300 services associated with them. All servers but 2 are running Windows Server 2003. What are the main highlights of PNP4Nagios vs. nagiosgraph, or would I be better off using some sort of tool to convert the data to RRD form and just view it directly in Cacti? Is there a completely different direction I could go that would be even better? Please comment if you need any more information, I tend to be too wordy and tried to keep this question brief. Thanks!

    Read the article

  • Nagios escalation debugging

    - by Oesor
    I'm having some issues with escalations happening properly and I'm not sure if it's because of my config or because the nagios binary is nonstandard and something may be broken. I've got little experience with nagios, and just want to make sure this is being set appropriately. Should the following config file definition allow the escalations to take over and increment the notification interval as expected? Is there somewhere else in the config files I should be looking at to figure out what's going on? I've enabled debug 32 in the config and it's simply spitting out 'Host notification will NOT be escalated.' for each notification. The configuration does pass the pre flight check with no issues, and reports that it's parsing the three host escalations in the config. # test host definition define host { host_name test alias test address 10.0.0.10 hostgroups test check_interval 0 retry_interval 1 max_check_attempts 2 flap_detection_enabled 0 icon_image windows.png icon_image_alt LOGO - Windows vrml_image windows.png statusmap_image windows.png action_url /info/host/275 check_period 24x7 contact_groups hostgroup15_servicegroup1,hostgroup15_servicegroup10,hostgroup15_servicegroup13,hostgroup15_servicegroup14,hostgroup15_servicegroup2,hostgroup15_servicegroup3,hostgroup15_servicegroup4,hostgroup15_servicegroup42,hostgroup15_servicegroup45,hostgroup15_servicegroup46,hostgroup15_servicegroup47,hostgroup15_servicegroup5,hostgroup15_servicegroup8,hostgroup15_servicegroup9,ov_monitored_by_master check_command check_host_15!-H $HOSTADDRESS$ -t 3 -w 500.0,80% -c 1000.0,100% parents nagios notifications_enabled 1 notification_interval 3 notification_period 24x7 notification_options u,d,r use host-global } define hostescalation{ host_name test first_notification 3 last_notification 4 notification_interval 10 contact_groups 
hostgroup15_servicegroup1,hostgroup15_servicegroup10,hostgroup15_servicegroup13,hostgroup15_servicegroup14,hostgroup15_servicegroup2,hostgroup15_servicegroup3,hostgroup15_servicegroup4,hostgroup15_servicegroup42,hostgroup15_servicegroup45,hostgroup15_servicegroup46,hostgroup15_servicegroup47,hostgroup15_servicegroup5,hostgroup15_servicegroup8,hostgroup15_servicegroup9,ov_monitored_by_master } define hostescalation{ host_name test first_notification 4 last_notification 5 notification_interval 30 contact_groups hostgroup15_servicegroup1,hostgroup15_servicegroup10,hostgroup15_servicegroup13,hostgroup15_servicegroup14,hostgroup15_servicegroup2,hostgroup15_servicegroup3,hostgroup15_servicegroup4,hostgroup15_servicegroup42,hostgroup15_servicegroup45,hostgroup15_servicegroup46,hostgroup15_servicegroup47,hostgroup15_servicegroup5,hostgroup15_servicegroup8,hostgroup15_servicegroup9,ov_monitored_by_master } define hostescalation{ host_name test first_notification 5 last_notification 0 notification_interval 240 contact_groups hostgroup15_servicegroup1,hostgroup15_servicegroup10,hostgroup15_servicegroup13,hostgroup15_servicegroup14,hostgroup15_servicegroup2,hostgroup15_servicegroup3,hostgroup15_servicegroup4,hostgroup15_servicegroup42,hostgroup15_servicegroup45,hostgroup15_servicegroup46,hostgroup15_servicegroup47,hostgroup15_servicegroup5,hostgroup15_servicegroup8,hostgroup15_servicegroup9,ov_monitored_by_master }
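When debugging escalations it can help to check by hand which escalation ranges cover a given notification number. An escalation covers notification n when n is at least `first_notification` and either `last_notification` is 0 (no upper limit) or n is at most `last_notification`. The sketch below applies that rule to the three ranges from the question; the function names are hypothetical, and this does not model the other conditions Nagios evaluates (time periods, escalation options).

```python
# (first_notification, last_notification, notification_interval) from the question.
ESCALATIONS = [(3, 4, 10), (4, 5, 30), (5, 0, 240)]


def escalation_applies(notification_number: int, first: int, last: int) -> bool:
    """True if the range [first, last] covers this notification number.

    A last value of 0 means the escalation has no upper limit.
    """
    if notification_number < first:
        return False
    return last == 0 or notification_number <= last


def matching_intervals(notification_number: int):
    """Notification intervals of all escalations covering this notification."""
    return [
        interval
        for first, last, interval in ESCALATIONS
        if escalation_applies(notification_number, first, last)
    ]
```

Notifications 1 and 2 match none of these ranges, so a "will NOT be escalated" message is expected for them. Where ranges overlap, as for notification 4, Nagios is documented to use the smallest matching interval.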

    Read the article

  • Nagios plugin script not working as expected

    - by Linker3000
    I have modified an off-the-shelf Nagios plugin perl script to (in theory) return a one or zero according to the existence, or not, of a file on a remote linux server. The script runs a remote ssh session and logs in as the nagios user. The remote linux servers have private keys setup for that user, and on the bash command line the script works as expected, but when run as a plugin it always returns '1' (true) even if the file does not exist. Some help with the logic or a comment on why things are not working as expected within Nagios would be appreciated. I'd prefer to use this ssh login method rather than having to install nrpe on all the linux servers. To run from a command line (assuming remote server has a user called nagios with a valid private key): ./check_reboot_required -e ssh -H remote-servers-ip-addr -p 'filename-to-check' -v Ta. #! /usr/bin/perl -w # # # License Information: # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
# ############################################################################ use POSIX; use strict; use Getopt::Long; use lib "/usr/lib/nagios/plugins" ; use vars qw($host $opt_V $opt_h $opt_v $verbose $PROGNAME $pattern $opt_p $mmin $opt_e $opt_t $opt_H $status $state $msg $msg_q $MAILQ $SHELL $device $used $avail $percent $fs $blocks $CMD $RMTOS); use utils qw(%ERRORS &print_revision &support &usage ); sub print_help (); sub print_usage (); sub process_arguments (); $ENV{'PATH'}=''; $ENV{'BASH_ENV'}=''; $ENV{'ENV'}=''; $PROGNAME = "check_reboot_required"; Getopt::Long::Configure('bundling'); $status = process_arguments(); if ($status){ print "ERROR: processing arguments\n"; exit $ERRORS{'UNKNOWN'}; } $SIG{'ALRM'} = sub { print ("ERROR: timed out waiting for $CMD on $host\n"); exit $ERRORS{'WARNING'}; }; $host = $opt_H; $pattern = $opt_p; print "Pattern >" . $pattern . "< " if $verbose; alarm($opt_t); #$CMD = "/usr/bin/find " . $pattern . " -type f 2>/dev/null| /usr/bin/wc -l"; $CMD = "[ -f " . $pattern . " ] && echo 1 || echo 0"; alarm($opt_t); ## get cmd output from remote system if (! open (OUTPUT, "$SHELL $host $CMD|" ) ) { print "ERROR: could not open $CMD on $host\n"; exit $ERRORS{'UNKNOWN'}; } my $perfdata = ""; my $state = "3"; my $msg = "Indeterminate result"; # only first line is relevant in this iteration. while (<OUTPUT>) { my $result = chomp($_); $msg = $result; print "Shell returned >" . $result . "< length is " . length($result) . " " if $verbose; if ( $result == 1 ) { $msg = "Reboot required (NB: Result still not accurate)" . $result ; $state = $ERRORS{'WARNING'}; last; } elsif ( $result == 0 ) { $msg = "No reboot required (NB: Result still not accurate) " . 
$result ; $state = $ERRORS{'OK'}; last; } else { $msg = "Output received, but it was neither a 1 nor a 0" ; last; } } close (OUTPUT); print "$msg | $perfdata\n"; exit $state; ##################################### #### subs sub process_arguments(){ GetOptions ("V" => \$opt_V, "version" => \$opt_V, "v" => \$opt_v, "verbose" => \$opt_v, "h" => \$opt_h, "help" => \$opt_h, "e=s" => \$opt_e, "shell=s" => \$opt_e, "p=s" => \$opt_p, "pattern=s" => \$opt_p, "t=i" => \$opt_t, "timeout=i" => \$opt_t, "H=s" => \$opt_H, "hostname=s" => \$opt_H ); if ($opt_V) { print_revision($PROGNAME,'$Revision: 1.0 $ '); exit $ERRORS{'OK'}; } if ($opt_h) { print_help(); exit $ERRORS{'OK'}; } if (defined $opt_v ){ $verbose = $opt_v; } if (defined $opt_e ){ if ( $opt_e eq "ssh" ) { if (-x "/usr/local/bin/ssh") { $SHELL = "/usr/local/bin/ssh"; } elsif ( -x "/usr/bin/ssh" ) { $SHELL = "/usr/bin/ssh"; } else { print_usage(); exit $ERRORS{'UNKNOWN'}; } } elsif ( $opt_e eq "rsh" ) { $SHELL = "/usr/bin/rsh"; } else { print_usage(); exit $ERRORS{'UNKNOWN'}; } } else { print_usage(); exit $ERRORS{'UNKNOWN'}; } unless (defined $opt_t) { $opt_t = $utils::TIMEOUT ; # default timeout } unless (defined $opt_H) { print_usage(); exit $ERRORS{'UNKNOWN'}; } return $ERRORS{'OK'}; } sub print_usage () { print "Usage: $PROGNAME -e <shell> -H <hostname> -p <directory/file pattern> [-t <timeout>] [-v verbose]\n"; } sub print_help () { print_revision($PROGNAME,'$Revision: 0.1 $'); print "\n"; print_usage(); print "\n"; print " Checks for the presence of a 'reboot-required' file on a remote host via SSH or RSH\n"; print "-e (--shell) = ssh or rsh (required)\n"; print "-H (--hostname) = remote server name (required)"; print "-p (--pattern) = File pattern for find command (default = /var/run/reboot-required)\n"; print "-t (--timeout) = Plugin timeout in seconds (default = $utils::TIMEOUT)\n"; print "-h (--help)\n"; print "-V (--version)\n"; print "-v (--verbose) = debugging output\n"; print "\n\n"; support(); }
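One detail worth noting in the script above: Perl's `chomp` returns the number of characters it removed, not the trimmed string, so `my $result = chomp($_);` yields 1 for any line that ends in a newline, regardless of what the remote command printed. The intended mapping from the remote output to a plugin state can be sketched in Python as follows; `interpret_remote_output` is a hypothetical name, and the messages are shortened from the original.

```python
# Standard Nagios plugin exit codes used in this sketch.
OK, WARNING, UNKNOWN = 0, 1, 3


def interpret_remote_output(line: str):
    """Map the first line printed by the remote test to (state, message).

    The remote command is expected to print "1" if the file exists
    and "0" if it does not; anything else is treated as UNKNOWN.
    """
    value = line.strip()  # trim the newline, keep the actual content
    if value == "1":
        return WARNING, "Reboot required"
    if value == "0":
        return OK, "No reboot required"
    return UNKNOWN, "Output received, but it was neither a 1 nor a 0"
```

The key difference from the Perl version is that the comparison is made against the trimmed text of the line rather than against the return value of the trimming call.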

    Read the article

  • Nagios orphaned services warnings

    - by Gordon
    We have had Nagios running on one of our servers without any problems for a while, but lately certain old service warnings have been reappearing and then disappearing on the service detail page. Looking at the logs, I found warnings like the following: Warning: The check of service 'Tomcat' on host 'virtual1' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service... Has anyone ever come across this before, or does anyone at least know a way to delete the old orphaned warnings? The Nagios version we are running is 3.0b7, so an update might be in order. Thanks.

    Read the article
