Monitoring Cisco Nexus 7000 switches with Icinga/Nagios
Cisco uses a different MIB to expose information over SNMP for its Nexus hardware. tcomm.es has provided some plugins to monitor such switches, especially the FRU hardware. The original website is not available at the moment, so I’ve made a copy on GitHub.
All configuration examples were tested with a Nexus 7000 C7010 running software versions 5.1(2) and 6.2(8). The following sections show some configurations:
checking the power supplies
I use the check_cisco_fru_ps plugin.
First, call the switch in test mode to get the necessary power IDs:
./check_cisco_fru_ps.pl -H <IP or hostname> -C <community> -E <snmp-version>
The entries with KW in their description are the power supplies in use.
CISCO_FRU_PS OK - TEST MODE
CISCO POWERS DATA
Power id = 471
Power status = 2 (On)
Power description = N7K-AC-6.0KW
...
Power id = 470
Power status = 2 (On)
Power description = N7K-AC-6.0KW
...
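Picking the IDs out by hand gets tedious on a fully populated chassis. Here is a minimal sketch of a filter that does it, assuming the plugin prints every "Power id" and "Power description" field on its own line. The function name ps_ids_from_test_output is my own and not part of the plugin.

```shell
# Hypothetical helper, not part of check_cisco_fru_ps.pl.
# Reads test-mode output on stdin and prints a comma-separated list of the
# IDs whose description contains "KW".
# Assumption: each "Power id = N" / "Power description = ..." field sits on
# its own line.
ps_ids_from_test_output() {
  awk -F' = ' '
    /^Power id/ { id = $2 }                  # remember the most recent ID
    /^Power description/ && /KW/ {           # description of a real power supply
      ids = (ids == "" ? id : ids "," id)
    }
    END { print ids }
  '
}
```

You would pipe the test-mode call from above into ps_ids_from_test_output and use the result as the value for -e.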
Now you should run this command in check-mode:
./check_cisco_fru_ps.pl -H <IP or hostname> -C <community> -E <snmp-version> -e 470,471 -w 9,12 -c 5,8
The values for warning and critical are fixed and explained in the online help (./check_cisco_fru_ps.pl -h), but for critical checks the value 5 should also be applied to match the PS_FAIL error. The last step is to define a command and a service check. This service check has no perfdata output, so set process_perf_data to 0.
define command {
    command_name    check_snmp_cisco_fru_ps
    command_line    $USER1$/check_cisco_fru_ps.pl -H '$HOSTADDRESS$' -C '$ARG1$' -E 2c -e '$ARG2$','$ARG3$' -w 9,12 -c 5,8
}

define service {
    use                     generic-service
    service_description     Cisco Nexus7010 Power supply status
    host_name               nexus7010
    process_perf_data       0
    check_command           check_snmp_cisco_fru_ps!<community>!470!471
}
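To make the -w and -c lists less cryptic, here is a small sketch of how I read them: each list names the status values that should raise that state, and critical wins over warning. This is my interpretation of the options, not the plugin's actual code, and the function names in_list and nagios_state are made up for the illustration. The plugin may treat unlisted values differently, for example as UNKNOWN.

```shell
# Hypothetical sketch, not taken from the plugin source.
# Status names as I understand them from CISCO-ENTITY-FRU-CONTROL-MIB:
#   2 = on, 5 = offEnvPower, 8 = failed,
#   9 = onButFanFail, 12 = onButInlinePowerFail

# in_list VALUE LIST: succeeds if VALUE is an element of the
# comma-separated LIST (exact match, so 1 does not match 12).
in_list() {
  case ",$2," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# nagios_state STATUS WARN_LIST CRIT_LIST: prints the resulting state.
# Assumption: anything not listed counts as OK.
nagios_state() {
  if in_list "$1" "$3"; then
    echo CRITICAL
  elif in_list "$1" "$2"; then
    echo WARNING
  else
    echo OK
  fi
}
```

With the lists used above, nagios_state 5 9,12 5,8 gives CRITICAL and nagios_state 9 9,12 5,8 gives WARNING.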
checking the fans
The next plugin I use is check_cisco_fru_fan.
Once again, the first step is to call the switch in test mode to get the necessary fan IDs:
./check_cisco_fru_fan.pl -H <IP or hostname> -C <community> -E <snmp-version>
You’ll get lines with information about the fans and their IDs.
CISCO_FRU_FAN OK - TEST MODE
CISCO FANS DATA
Fan id = 537
Fan status = 2 (Up)
Fan description = Fan Module-4
Fan id = 535
Fan status = 2 (Up)
Fan description = Fan Module-2
Fan id = 534
Fan status = 2 (Up)
Fan description = Fan Module-1
Fan id = 536
Fan status = 2 (Up)
Fan description = Fan Module-3
Now you should run this command in check-mode:
./check_cisco_fru_fan.pl -H <IP or hostname> -C <community> -E <snmp-version> -e 534,535,536,537 -w 4 -c 3
The values for warning and critical are fixed and also documented in the online help (./check_cisco_fru_fan.pl -h). The last step is to define a command and a service check. This service check has no perfdata output, so set process_perf_data to 0.
define command {
    command_name    check_snmp_cisco_fru_fan
    command_line    $USER1$/check_cisco_fru_fan.pl -H '$HOSTADDRESS$' -C '$ARG1$' -E 2c -e '$ARG2$' -w 4 -c 3
}

define service {
    use                     generic-service
    service_description     Cisco Nexus7010 Fan status
    host_name               nexus7010
    process_perf_data       0
    check_command           check_snmp_cisco_fru_fan!<community>!534,535,536,537
}
checking the modules
The last plugin I use for hardware monitoring is check_cisco_fru_module.
As before, first call the switch in test mode to get the necessary module IDs:
./check_cisco_fru_module.pl -H <IP or hostname> -C <community> -E <snmp-version>
You’ll get lines with information about the modules and their IDs.
CISCO_FRU_MODULE OK - TEST MODE
CISCO MODULES DATA
Module id = 33
Module status = 2 (Ok)
Module description = Fabric card module
Module id = 34
Module status = 2 (Ok)
Module description = Fabric card module
Module id = 22
Module status = 2 (Ok)
Module description = 10/100/1000 Mbps Ethernet Module
Module id = 32
Module status = 2 (Ok)
Module description = Fabric card module
Module id = 27
Module status = 2 (Ok)
Module description = Supervisor module-1X
Module id = 31
Module status = 2 (Ok)
Module description = 10 Gbps Ethernet Module
Module id = 26
Module status = 2 (Ok)
Module description = Supervisor module-1X
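The IDs come back in no particular order, so a small filter that collects and sorts them saves some typing. The following is only a sketch: ids_sorted is a name I made up, and it relies on grep -o, which GNU and BSD grep support but POSIX does not require. Because it matches the "id = N" pattern directly, it should not matter whether the plugin prints the fields on one line or on several.

```shell
# Hypothetical helper, not part of the plugins.
# ids_sorted LABEL: reads test-mode output on stdin and prints all
# "<LABEL> id = N" numbers as a numerically sorted, comma-separated list.
# LABEL is "Module" here; "Fan" or "Power" should work the same way.
ids_sorted() {
  grep -o "$1 id = [0-9][0-9]*" |   # keep only the "<LABEL> id = N" matches
    awk '{ print $NF }' |           # last field is the numeric ID
    sort -n |                       # numeric order
    paste -sd, -                    # join the lines with commas
}
```

Piping the module test-mode call into ids_sorted Module should give you the list for the -e option.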
Now you should run this command in check-mode:
./check_cisco_fru_module.pl -H <IP or hostname> -C <community> -E <snmp-version> -e 22,26,27,31,32,33,34 -w 4,5,13,14,19,23 -c 7,8,20
The values for warning and critical are fixed and also documented in the online help (./check_cisco_fru_module.pl -h). The last step is to define a command and a service check. This service check has no perfdata output, so set process_perf_data to 0.
define command {
    command_name    check_snmp_cisco_fru_module
    command_line    $USER1$/check_cisco_fru_module.pl -H '$HOSTADDRESS$' -C '$ARG1$' -E 2c -e '$ARG2$' -w 4,5,13,14,19,23 -c 7,8,20
}

define service {
    use                     generic-service
    service_description     Cisco Nexus7010 Module status
    host_name               nexus7010
    process_perf_data       0
    check_command           check_snmp_cisco_fru_module!<community>!22,26,27,31,32,33,34
}
More check commands will follow soon.
Tom