
Yesterday I upgraded a server from Debian 9 to Debian 10. This server is monitored with Nagios. Since the upgrade, I get an alert with status Unknown saying:

Volumegroup array03-0 wasn't valid or wasn't specified with "-v Volumegroup", bye. false

The service is named "VG baie03-0 usage" and its command is check_nrpe!check_vgs_array03-0. Its goal is to raise an alert when storage on the array is almost full.

The check_nrpe command definition is the standard one:

# 'check_NRPE' command definition
define command{
        command_name check_nrpe
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

If I'm not mistaken, this means there must be a check_vgs_array03-0 command in /etc/nagios/nrpe.cfg on the supervised server. Here it is:

command[check_vgs_array03-0]=/usr/lib/nagios/plugins/check_vg_size -w 20 -c 10 -v array03-0

If I run this command directly on the supervised server, I get no error; it works:

VG array03-0 OK Available space is 805 GB;| array03-0=805GB;20;10;0;19155

I only get the error if, for example, I pass a volume group name that doesn't exist.

The check_vg_size plugin script looks like this:
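To double-check which command line NRPE has configured for a given name such as check_vgs_array03-0, the lookup can be scripted. This is only a minimal sketch: nrpe_command_for is a hypothetical helper (not part of NRPE), the default path is the /etc/nagios/nrpe.cfg mentioned above, and the demo runs against a throwaway file rather than the real configuration.

```shell
#!/bin/bash
# Sketch: print the command line configured in an NRPE config file
# for a given command name.
# ASSUMPTION: nrpe_command_for is a made-up helper name, not an NRPE tool.
nrpe_command_for() {
    local name="$1"
    local cfg="${2:-/etc/nagios/nrpe.cfg}"
    # Match "command[<name>]=" literally at the start of a line
    # and print everything after the "=".
    awk -v key="command[$name]=" \
        'index($0, key) == 1 { print substr($0, length(key) + 1) }' "$cfg"
}

# Demo against a temporary copy of the line quoted above.
demo_cfg=$(mktemp)
echo 'command[check_vgs_array03-0]=/usr/lib/nagios/plugins/check_vg_size -w 20 -c 10 -v array03-0' > "$demo_cfg"
nrpe_command_for check_vgs_array03-0 "$demo_cfg"
rm -f "$demo_cfg"
```

The sketch reads a single file only; if the configuration pulls in other files through include or include_dir directives, those would have to be searched as well.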

#!/bin/bash
#check_vg_size
#set -x
# Plugin for Nagios
# Written by M. Koettenstorfer ([email protected])
# Some additions by J. Schoepfer ([email protected])
# Major changes into functions and input/output values J. Veverka ([email protected])
# Last Modified: 2012-11-06
#
# Description:
#
# This plugin will check how much space in volume groups is free

# Nagios return codes
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
STATE_DEPENDENT=4

SERVICEOUTPUT=""
SERVICEPERFDATA=""

PROGNAME=$(basename $0)

vgs_bin=`/usr/bin/whereis -b -B /sbin /bin /usr/bin /usr/sbin -f vgs | awk '{ print $2 }'`
_vgs="$vgs_bin --units=g"

bc_bin=`/usr/bin/whereis -b -B /sbin /bin /usr/bin /usr/sbin -f bc | awk '{ print $2 }'`

exitstatus=$STATE_OK #default
declare -a volumeGroups;
novg=0; #number of volume groups
allVG=false; #Will we use all volume groups we can find on system?
inPercent=false; #Use percentage for comparison?

unitsGB="GB"
unitsPercent="%"
units=$unitsGB

########################################################################
### DEFINE FUNCTIONS
########################################################################

print_usage() {
        echo "Usage: $PROGNAME  -w <min size warning level in gb> -c <min size critical level in gb> -v <volumegroupname> [-a] [-p]"
        echo "If '-a' and '-v' are specified: all volumegroups defined by -v will be omitted and the remaining groups which are found on system are checked"
        echo "If '-p' is specified: the warning and critical levels are represented as the percent space left on device"
        echo ""
}

print_help() {
        print_usage
        echo ""
        echo "This plugin will check how much space is free in volume groups"
        echo "usage: "
        exit $STATE_UNKNOWN
}


checkArgValidity () {
# Check arguments for validity
        if [[ -z $critlevel || -z $warnlevel ]] # Did we get warn and crit values?
        then
                echo "You must specify a warning and critical level"
                print_usage
                exitstatus=$STATE_UNKNOWN
                exit $exitstatus
        elif [ $warnlevel -le $critlevel ] # Do the warn/crit values make sense?
        then
                # Both levels are free-space thresholds, so CRITICAL must be below WARNING
                if [ $inPercent != 'true' ]
                then
                        echo "CRITICAL value of $critlevel GB is not lower than WARNING level of $warnlevel GB"
                        print_usage
                        exitstatus=$STATE_UNKNOWN
                        exit $exitstatus
                else
                        echo "CRITICAL value of $critlevel % is not lower than WARNING level of $warnlevel %"
                        print_usage
                        exitstatus=$STATE_UNKNOWN
                        exit $exitstatus
                fi
        fi
}

#Does volume group actually exist?
volumeGroupExists () {
        local volGroup="$@"
        VGValid=$($_vgs 2>/dev/null | grep "$volGroup" | wc -l )

        if [[  -z "$volGroup" ||  $VGValid = 0 ]]
        then
                echo "Volumegroup $volGroup wasn't valid or wasn't specified"
                echo "with \"-v Volumegroup\", bye."
                echo false
                return 1
        else
                #The volume group exists
                echo true
                return 0
        fi
}

getNumberOfVGOnSystem () {
        local novg=$($_vgs 2>/dev/null | wc -l)
        let novg--
        echo $novg
}

getAllVGOnSystem () {
        novg=$(getNumberOfVGOnSystem)
        local found=false;
        for (( i=0; i < novg; i++)); do
                volumeGroups[$i]=$($_vgs | tail -n  $(($i+1)) | head -n 1 | awk '{print $1}')
                found=true;
        done
        if ( ! $found ); then
                echo "$found"
                echo "No valid volume group was found"
                exit $STATE_UNKNOWN
        fi
}

getColumnNoByName () {
        columnName=$1
        result=$($_vgs 2>/dev/null | head -n1 | awk -v name=$columnName '
                BEGIN{}
                        { for(i=1;i<=NF;i++){
                              if ($i ~ name)
                                  {print i } }
                        }')

        echo $result
}

convertToPercent () {
#$1 = xx%
#$2 = 100%
    # Make values numbers only
        local input="$(echo $1 | sed 's/g//i')"
        local max="$(echo $2 | sed 's/g//i')"
        local onePercent='';
        local freePercent='';
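        if [ -x "$bc_bin" ] ; then
                # Use the bc binary that was located and tested above
                onePercent=$( echo "scale=2; $max / 100" | "$bc_bin" );
                freePercent=$( echo "$input / $onePercent" | "$bc_bin" );
        else
                # Fallback without bc: free space as an integer percentage of the total
                freePercent=$(perl -e "print int($input*100/$max)")
        fi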
        echo $freePercent;
        return 0;
}

getSizesOfVolume () {
        volumeName="$1";
        #Check the actual sizes
        cnFree=`getColumnNoByName "VFree"`;
        cnSize=`getColumnNoByName "VSize"`;
        freespace=`$_vgs $volumeName 2>/dev/null | awk -v n=$cnFree '/[0-9]/{print $n}' | sed -e 's/[\.,\,].*//'`;
        fullspace=`$_vgs $volumeName 2>/dev/null | awk -v n=$cnSize '/[0-9]/{print $n}' | sed -e 's/[\.,\,].*//'`;

        if ( $inPercent ); then
        #Convert to Percents
                freespace="$(convertToPercent $freespace $fullspace)"
        fi
}

setExitStatus () {
        local status=$1
        local volGroup="$2"
        local formerStatus=$exitstatus

        if [ $status -gt $formerStatus ]
        then
                formerStatus=$status
        fi

        if [ $status = $STATE_UNKNOWN ] ; then
                SERVICEOUTPUT="${volGroup}"
                exitstatus=$STATE_UNKNOWN
                return
        fi

        if [ "$freespace" -le "$critlevel" ]
        then
                SERVICEOUTPUT=$SERVICEOUTPUT" VG $volGroup CRITICAL Available space is $freespace $units;"
                exitstatus=$STATE_CRITICAL
        elif [ "$freespace" -le "$warnlevel" ]
        then
                SERVICEOUTPUT=$SERVICEOUTPUT"VG $volGroup WARNING Available space is $freespace $units;"
                exitstatus=$STATE_WARNING
        else
                SERVICEOUTPUT=$SERVICEOUTPUT"VG $volGroup OK Available space is $freespace $units;"
                exitstatus=$STATE_OK
        fi

        SERVICEPERFDATA="$SERVICEPERFDATA $volGroup=$freespace$units;$warnlevel;$critlevel"
        if [ $inPercent != 'true' ] ; then

                SERVICEPERFDATA="${SERVICEPERFDATA};0;$fullspace"
        fi

        if [ $formerStatus -gt $exitstatus ]
        then
                exitstatus=$formerStatus
        fi
}


checkVolumeGroups () {
        checkArgValidity
        for (( i=0; i < novg; i++ )); do
                local status="$STATE_OK"
                local currentVG="${volumeGroups[$i]}"

                local groupExists="$(volumeGroupExists "$currentVG" )"

                if [ "$groupExists" = 'true' ]; then
                        getSizesOfVolume "$currentVG"
                        status=$STATE_OK
                else
                        status=$STATE_UNKNOWN
                        setExitStatus $status "${groupExists}"
                        break
                fi

                setExitStatus $status "$currentVG"
        done
}

########################################################################
### RUN PROGRAM
########################################################################


########################################################################
#Read input values
while getopts ":w:c:v:hap" opt ;do
        case $opt in
                h)
                        print_help;
                        exit $exitstatus;
                        ;;
                w)
                        warnlevel=$OPTARG;
                        ;;
                c)
                        critlevel=$OPTARG;
                        ;;
                v)
                        if ( ! $allVG ); then
                                volumeGroups[$novg]=$OPTARG;
                                let novg++;
                        fi
                        ;;
                a)
                        allVG=true;
                        getAllVGOnSystem;
                        ;;
                p)
                        inPercent=true;
                        units=$unitsPercent
                        ;;
                \?)
                        echo "Invalid option: -$OPTARG" >&2
                        ;;
        esac
done

checkVolumeGroups


echo $SERVICEOUTPUT"|"$SERVICEPERFDATA
exit $exitstatus
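One detail of the script above matters when debugging: volumeGroupExists sends the stderr of vgs to /dev/null and only counts matching lines, so "vgs could not run" and "no such volume group" end up producing the same message. A minimal sketch of that effect follows. Both fake_vgs_* functions are hypothetical stand-ins with simulated output (they are not real vgs output), and count_matches only mirrors the pipeline used in the plugin.

```shell
#!/bin/bash
# ASSUMPTION: simulated "vgs" that works (e.g. when run as root).
# The table below is made-up sample output, not copied from a real system.
fake_vgs_root() {
    echo "  VG        #PV #LV #SN Attr   VSize  VFree"
    echo "  array03-0   1   5   0 wz--n- 19155g 805g"
}

# ASSUMPTION: simulated "vgs" that fails (e.g. for lack of privileges):
# error text on stderr, nothing on stdout, non-zero exit status.
fake_vgs_unprivileged() {
    echo "permission denied (simulated)" >&2
    return 5
}

# Same pattern as volumeGroupExists: discard stderr, count matching lines.
count_matches() {
    local vgs_cmd="$1"
    local vg="$2"
    $vgs_cmd 2>/dev/null | grep "$vg" | wc -l
}

echo "working vgs, existing VG: $(count_matches fake_vgs_root array03-0)"
echo "failing vgs, existing VG: $(count_matches fake_vgs_unprivileged array03-0)"
echo "working vgs, missing VG:  $(count_matches fake_vgs_root nosuchvg)"
```

The second and third cases both report zero matches, which is why the plugin's message alone cannot tell them apart. Temporarily removing the 2>/dev/null while testing would let the underlying error through.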

If I pass another command (another plugin) to check_nrpe, it works.

For example:

root@nagiosserver:/usr/local/nagios# /usr/local/nagios/libexec/check_nrpe -H srv-supervised04 -c check_load
OK - load average: 3.79, 2.99, 1.83|load1=3.790;25.000;30.000;0; load5=2.990;20.000;25.000;0; load15=1.830;15.000;20.000;0;

The VG array03-0 does exist:

root@srv-supervised04:/usr/lib/nagios/plugins# vgdisplay
  --- Volume group ---
  VG Name               array03-0
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  34
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                5
  Open LV               4
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               <18,71 TiB
  PE Size               4,00 MiB
  Total PE              4903887
  Alloc PE / Size       4697600 / <17,92 TiB
  Free  PE / Size       206287 / <805,81 GiB
  VG UUID               OgzAMF-DGbW-3t3L-Wk7k-gY1g-s6fH-zYEKad

So the VG does exist. The check_vg_size plugin works when run locally, and check_nrpe works from the Nagios server with another plugin, but check_vg_size does not work when called through check_nrpe from the Nagios server. The error message claims that array03-0 doesn't exist, although it does. I haven't changed any of the files involved; the problem appeared with the upgrade from Debian 9 to Debian 10 (during the upgrade, I chose to keep my modified nrpe.cfg).

Does anyone know where this could come from? The Debian version? A new bash version, maybe? An incompatibility between the Nagios server (still on Debian 9) and the supervised server (Debian 10)?

1 Answer


I fixed it.

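It was a permission problem. NRPE runs the plugin as the nagios user, whereas my manual test was run as root, and because the script discards the stderr of vgs (2>/dev/null), a vgs call that fails for lack of privileges looks exactly like a missing volume group. I had to give the nagios user sudo permission on the plugin: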

nagios ALL=(root) NOPASSWD: /usr/lib/nagios/plugins/check_vg_size

Then I modified /etc/nagios/nrpe.cfg to put sudo in front of the command:

command[check_vgs_array03-0]= sudo /usr/lib/nagios/plugins/check_vg_size -w 20 -c 10 -v array03-0
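To confirm the fix, it can help to run the check the same way NRPE does, that is, as the nagios user rather than as root. The sketch below is a dry run by default: it only prints the commands, and executes them when RUN=1 is set. It assumes the NRPE service account is named nagios and reuses the paths and host name quoted in the question; the run helper and the two check_* functions are hypothetical names for this example.

```shell
#!/bin/bash
# Sketch: re-run the check the way NRPE does (as the nagios user, through sudo).
# ASSUMPTIONS: service account "nagios"; paths and host name as quoted above.
PLUGIN="/usr/lib/nagios/plugins/check_vg_size -w 20 -c 10 -v array03-0"

# Print the command; execute it only when RUN=1 (default is a dry run).
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# On the supervised server, as root. "sudo -n" never prompts: it fails
# instead if the NOPASSWD rule is missing or does not match.
check_on_supervised_host() {
    # $PLUGIN is left unquoted on purpose so it splits into separate arguments.
    run sudo -u nagios sudo -n $PLUGIN
}

# On the Nagios server: the end-to-end check through NRPE.
check_from_nagios_server() {
    run /usr/local/nagios/libexec/check_nrpe -H srv-supervised04 -c check_vgs_array03-0
}

check_on_supervised_host
check_from_nagios_server
```

If the first command prints the usual "VG array03-0 OK ..." line but the second still fails, the NRPE daemon probably has not reloaded nrpe.cfg yet.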
