I wrote a few quick posts on using Nagios event handlers to restart a service on the local system. Mostly I followed the example from the Nagios documentation, but using sudo to restart a service was a little tricky. Once I solved that, the logical next step was restarting a service on a REMOTE system using event handlers and NRPE.
NAGIOSSVR runs nagios and monitors itself and WEBSVR. I use the "check_linux_procs" script which is also known as "check_system_procs". On the remote server WEBSVR, the script configuration lines look something like:
# Processes to check
PROCLIST_RED="httpd sendmail nrpe"
PROCLIST_YELLOW="crond"
# Ports to check
PORTLIST="25 80 5666"
The check_linux_procs script is executed on the remote server via NRPE. NRPE can remotely execute event handlers as well as service checks, although the setup is a bit more complex than for a local host configuration.
Proper sudo configuration is required on the remote system, WEBSVR. Read my other post on the Nagios Local Service Restart with Event_Handlers for more information on the sudo settings.
On WEBSVR I created a very simple script that uses sudo to restart the services. Something like:
#!/bin/sh
#
/usr/bin/sudo /sbin/service httpd restart
/usr/bin/sudo /sbin/service sendmail restart
exit 0
NRPE is not listed because... well, if NRPE crashes, the event handler cannot run, since it uses NRPE to connect and execute the script. Do not forget to add your script to your nrpe.cfg file as shown below, then restart NRPE on WEBSVR.
command[remote_restart]=/usr/local/nagios/libexec/eventhandlers/remote-restart
On NAGIOSSVR, this service check has a max_check_attempts of 3. So I had to tweak the script I used before. The trick here is passing the right variables through. In my Nagios commands.cfg I added the $HOSTADDRESS$ value to the end of the line like so:
command_line $USER1$/eventhandlers/restart-services-remote \
    $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$
And the local "event_handler" script on NAGIOSSVR looks similar to the localhost example, but the "sudo" restart command is replaced with:
/usr/local/nagios/libexec/check_nrpe -H $4 -c remote_restart
Don't forget to change the case logic if you need to adjust for a different max_check_attempts value in your config.
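To make the case logic concrete, here is a minimal sketch of what the restart-services-remote handler on NAGIOSSVR could look like. It is not the exact script from this setup. It assumes the four arguments arrive in the order given in the command_line above, and that with max_check_attempts set to 3 you want to try a restart on the last SOFT attempt (attempt 2) and again once the state goes HARD. The CHECK_NRPE variable and the two function names are hypothetical helpers added only to keep the example short and easy to dry-run.

```shell
#!/bin/sh
# Hypothetical sketch of the restart-services-remote event handler.
# Arguments, in the order passed by the command_line definition:
#   $1 = $SERVICESTATE$      (OK, WARNING, UNKNOWN, CRITICAL)
#   $2 = $SERVICESTATETYPE$  (SOFT or HARD)
#   $3 = $SERVICEATTEMPT$    (current check attempt number)
#   $4 = $HOSTADDRESS$       (address of the remote host)

# Path to check_nrpe; overridable so the logic can be tested without NRPE.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

# Ask NRPE on the remote host to run the remote_restart command.
remote_restart() {
    "$CHECK_NRPE" -H "$1" -c remote_restart
}

handle_event() {
    case "$1" in
    CRITICAL)
        case "$2" in
        SOFT)
            # Assumption: max_check_attempts is 3, so attempt 2 is the
            # last SOFT state before the problem turns HARD.
            case "$3" in
            2) remote_restart "$4" ;;
            esac
            ;;
        HARD)
            # Last-ditch restart once the problem is confirmed.
            remote_restart "$4"
            ;;
        esac
        ;;
    esac
    # OK, WARNING and UNKNOWN fall through: nothing to restart.
    return 0
}

handle_event "$@"
```

If your max_check_attempts is different, the "2" in the SOFT branch is the value to adjust. For a dry run, setting CHECK_NRPE=echo in the environment prints the check_nrpe arguments instead of contacting the remote host.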