How to monitor resources on other servers
Modified: 28 May 2006
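To monitor the resources of another Linux server on the network, you need to install NRPE (Nagios Remote Plugin Executor), a resource-monitoring tool, on the server to be monitored.
I tried this out by following the explanation here:
http://anabuki.dip.jp/tips/nagios.htm
Download NRPE from:
http://www.nagiosexchange.org/NRPE.77.0.html?&tx_netnagext_pi1[p_view]=126
Extract the archive and build it as follows.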
# tar zxvf nrpe-2.0.tar.gz
# cd nrpe-2.0
# ./configure
# make all
The build fails if the OpenSSL libraries are missing, so install them if you get an error. (This applies to 2.0.)
# yum install openssl
# yum install openssl-devel
As the reference page also notes, 2.0 did not actually work for me because of the following error, so I used 1.9 instead.
CHECK_NRPE: Error - Could not complete SSL handshake
The following FAQ describes a workaround, although I went with 1.9 anyway.
http://www.nagios.org/faqs/viewfaq.php?faq_id=191
Before installing NRPE, download the plugins from:
http://www.nagios.org/download/
and install them on the remote server as well, using the following steps.
# tar zxvf nagios-plugins-1.4.2.tar.gz
# cd nagios-plugins-1.4.2
# ./configure
# make
# make install
Next, install the nrpe module.
# cp src/nrpe /usr/local/nagios/libexec/
# cp nrpe.cfg /usr/local/nagios/etc/
Edit the configuration file ("/usr/local/nagios/etc/nrpe.cfg") as follows.
#############################################################################
# Sample NRPE Config File
# Written by: Ethan Galstad (nagios@nagios.org)
#
# Last Modified: 03-05-2003
#
# NOTES:
# This is a sample configuration file for the NRPE daemon. It needs to be
# located on the remote host that is running the NRPE daemon, not the host
# from which the check_nrpe client is being executed.
#############################################################################

# PORT NUMBER
# Port number we should wait for connections on.
# NOTE: This must be a non-privileged port (i.e. > 1024).
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
server_port=5666

# SERVER ADDRESS
# Address that nrpe should bind to in case there are more than one interface
# and you do not want nrpe to bind on all interfaces.
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
#server_address=192.168.1.1

# ALLOWED HOST ADDRESSES
# This is a comma-delimited list of IP address of hosts that are allowed
# to talk to the NRPE daemon.
#
# NOTE: The daemon only does rudimentary checking of the client's IP
# address. I would highly recommend adding entries in your
# /etc/hosts.allow file to allow only the specified host to connect
# to the port you are running this daemon on.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
allowed_hosts=127.0.0.1

# NRPE USER
# This determines the effective user that the NRPE daemon should run as.
# You can either supply a username or a UID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
nrpe_user=nagios

# NRPE GROUP
# This determines the effective group that the NRPE daemon should run as.
# You can either supply a group name or a GID.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd
nrpe_group=nagios

# COMMAND ARGUMENT PROCESSING
# This option determines whether or not the NRPE daemon will allow clients
# to specify arguments to commands that are executed. This option only works
# if the daemon was configured with the --enable-command-args configure script
# option.
#
# *** ENABLING THIS OPTION IS A SECURITY RISK! ***
# Read the SECURITY file for information on some of the security implications
# of enabling this variable.
#
# Values: 0=do not allow arguments, 1=allow command arguments
dont_blame_nrpe=0

# DEBUGGING OPTION
# This option determines whether or not debugging messages are logged to the
# syslog facility.
# Values: 0=debugging off, 1=debugging on
debug=0

# COMMAND TIMEOUT
# This specifies the maximum number of seconds that the NRPE daemon will
# allow plugins to finish executing before killing them off.
command_timeout=60

# INCLUDE CONFIG FILE
# This directive allows you to include definitions from an external config file.
#include=<somefile.cfg>

# INCLUDE CONFIG DIRECTORY
# This directive allows you to include definitions from config files (with a
# .cfg extension) in one or more directories (with recursion).
#include_dir=<somedirectory>
#include_dir=<someotherdirectory>

# COMMAND DEFINITIONS
# Command definitions that this daemon will run. Definitions
# are in the following format:
#
# command[<command_name>]=<command_line>
#
# When the daemon receives a request to return the results of <command_name>
# it will execute the command specified by the <command_line> argument.
#
# Unlike Nagios, the command line cannot contain macros - it must be
# typed exactly as it should be executed.
#
# Note: Any plugins that are used in the command lines must reside
# on the machine that this daemon is running on! The examples below
# assume that you have plugins installed in a /usr/local/nagios/libexec
# directory. Also note that you will have to modify the definitions below
# to match the argument format the plugins expect. Remember, these are
# examples only!

# The following examples use hardcoded command arguments...
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_disk1]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /dev/sda1
command[check_disk2]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /dev/sda2
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200

# The following examples allow user-supplied arguments and can
# only be used if the NRPE daemon was compiled with support for
# command arguments *AND* the dont_blame_nrpe directive in this
# config file is set to '1'...
#command[check_users]=/usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$
#command[check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
#command[check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
#command[check_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
#
Create the user that the daemon runs as.
# useradd -d /usr/local/nagios nagios
Add the following line to "/etc/services".
nrpe 5666/tcp # NRPE
Create "/etc/xinetd.d/nrpe".
service nrpe
{
        flags           = REUSE
        log_on_failure  += USERID
        port            = 5666
        socket_type     = stream
        protocol        = tcp
        user            = nagios
        server          = /usr/local/nagios/libexec/nrpe
        server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
        type            = UNLISTED
        wait            = no
}
Restart "xinetd".
# service xinetd restart
On the monitoring (Nagios) server, copy (install) the check command (plugin).
# cp src/check_nrpe /usr/local/nagios/libexec/
Add the following line to "/etc/services" on this server as well.
nrpe 5666/tcp # NRPE
Add the following to "/usr/local/nagios/etc/checkcommands.cfg".
# 'check_nrpe' command definition
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }
Add the following to "/usr/local/nagios/etc/services.cfg".
### Resource monitor ###
define service{
        use                     generic-service1
        host_name               recipebase
        service_description     HDD
        check_command           check_nrpe!check_disk1
        }
Restart nagios.
# service nagios restart
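To make the link between the service definition and the command definition concrete, here is a minimal sketch of how "check_nrpe!check_disk1" ends up as a command line. expand_check_command is a hypothetical helper written for illustration, not part of Nagios, and the address 192.168.1.10 is an assumed example value. It also assumes $USER1$ points at the plugin directory used throughout this page.

```shell
# Sketch only: mimics the macro substitution for the check_nrpe command
# definition shown above. Not how Nagios is actually implemented.
expand_check_command() {
    check_command=$1   # e.g. "check_nrpe!check_disk1"
    host_address=$2    # stands in for $HOSTADDRESS$
    user1=$3           # stands in for $USER1$ (assumed plugin directory)

    # Everything after the first "!" is passed as $ARG1$.
    arg1=${check_command#*!}

    printf '%s/check_nrpe -H %s -c %s\n' "$user1" "$host_address" "$arg1"
}
```

Calling expand_check_command 'check_nrpe!check_disk1' 192.168.1.10 /usr/local/nagios/libexec prints "/usr/local/nagios/libexec/check_nrpe -H 192.168.1.10 -c check_disk1". The name after -c must match a command[...] entry in nrpe.cfg on the remote server.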
Verifying operation on the remote side
Confirm that the check commands configured in "/usr/local/nagios/etc/nrpe.cfg" run correctly.
# /usr/local/nagios/libexec/check_users -w 5 -c 10
USERS OK - 1 users currently logged in |users=1;5;10;0
# /usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
OK - load average: 0.04, 0.02, 0.00|load1=0.040;15.000;30.000;0; load5=0.020;10.000;25.000;0; load15=0.000;5.000;20.000;0;
# /usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /dev/sda1
DISK OK - free space: /boot 58 MB (59%);| /boot=40MB;78;88;0;98
# /usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /dev/sda2
DISK OK - free space: / 722 MB (21%);| /=2707MB;3408;3418;0;3428
# /usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
PROCS OK: 0 processes with STATE = Z
# /usr/local/nagios/libexec/check_procs -w 150 -c 200
PROCS OK: 72 processes
#
Confirm that the port is open.
# netstat -ln
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:901             0.0.0.0:*               LISTEN
:
#
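The port check can also be scripted instead of read by eye. The following is a rough sketch that scans "netstat -ln" output for a listening TCP socket. is_listening is a hypothetical helper name, and it assumes the column layout shown in the listing (local address in the fourth column, state in the last).

```shell
# Sketch only: reads "netstat -ln" output on stdin and succeeds if a TCP
# socket is listening on the given port. Assumes the column layout above.
is_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            # Local address looks like 0.0.0.0:5666; the port is the last field.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, netstat -ln | is_listening 5666 && echo "nrpe port is open" prints the message only when xinetd is accepting connections on 5666.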
If errors occur
If the data cannot be retrieved, the following error is displayed.
CHECK_NRPE: Error receiving data from host.
Open "/usr/local/nagios/etc/nrpe.cfg", set "debug=1", and check the output of "tail /var/log/messages".
In my case, I got errors like the following.
Jan 8 00:55:32 host nrpe[18690]: Unable to open config file '/usr/local/nagios/etc/nrpe.cfg' for reading
Jan 8 00:55:32 host nrpe[18690]: Config file '/usr/local/nagios/etc/nrpe.cfg' contained errors, bailing out...
Jan 8 01:00:34 host nrpe[18755]: No variable value specified in config file '/usr/local/nagios/etc/nrpe.cfg' - Line 110
Jan 8 01:00:34 host nrpe[18755]: Config file '/usr/local/nagios/etc/nrpe.cfg' contained errors, bailing out...
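Both errors can be caught before restarting xinetd with a quick check of the config file. The sketch below is a hypothetical helper (lint_nrpe_cfg is not shipped with NRPE), and it rests on the assumption that every line that is neither blank nor a comment must have the form name=value. The readability test applies to the user running the script, not to the nagios user, so it only approximates the first error.

```shell
# Sketch only: flags lines that would plausibly trigger
# "No variable value specified", under the name=value assumption above.
lint_nrpe_cfg() {
    cfg=$1

    # Checked for the current user, not for the "nagios" user.
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg"
        return 2
    fi

    awk '
        /^[ \t]*#/ { next }          # skip comment lines
        /^[ \t]*$/ { next }          # skip blank lines
        !/[=][ \t]*[^ \t]/ {         # no "=" followed by a value
            printf "line %d: expected name=value\n", NR
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$cfg"
}
```

Running lint_nrpe_cfg /usr/local/nagios/etc/nrpe.cfg prints one line per suspicious entry and returns a non-zero status if any were found, which makes it easy to compare against the line number in the syslog message.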