VMS-to-Unix Phrase Book

3.5  Detecting Program Errors

Problem

You want your script to detect whether a program has succeeded or failed to do the operation requested.

Solution

Test the special shell variable $? for the program's exit status.

Discussion

As discussed in 3.2, your shell scripts should always exit with a status code indicating whether the script succeeded or encountered errors. Your script can likewise test the exit status of other scripts, built-in commands, and programs. All of the individual examples below can be seen and contrasted together in this example script.

The most common way of testing a program's exit status is to use an if-then-else statement.

    if (rm sample.dat)
    then do-this-if-no-error
    else do-this-if-there-was-an-error
    fi;
In general, a program or built-in command is supposed to return an exit status of 0 if it succeeded, and non-zero if there was an error. The construct if (command) takes the then branch if the command succeeded, or the else branch if it failed. The parentheses are optional: they run the command in a subshell, and if rm sample.dat tests the same exit status without them. Often nothing more needs to be done when the command succeeds, so a colon (:) can serve as a null statement for the then clause:
    if (rm sample.dat)
    then :
    else do-this-if-there-was-an-error
    fi;
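As a concrete illustration of the if-then-else pattern above, here is a minimal sketch. The function name remove_file and the scratch directory are made up for this example, and the "if ! command" form assumes a POSIX shell (the original Bourne shell does not support the ! negation).

```shell
#!/bin/sh
# Hypothetical helper: report whether rm could remove the named file.
remove_file() {
    if rm "$1" 2>/dev/null
    then echo "removed"
    else echo "failed"
    fi
}

# Work in a scratch directory so nothing real is touched.
scratch=${TMPDIR:-/tmp}/exitdemo.$$
mkdir "$scratch"
: > "$scratch/sample.dat"

first=$(remove_file "$scratch/sample.dat")    # file exists: rm succeeds
second=$(remove_file "$scratch/sample.dat")   # file is gone: rm fails

# A POSIX shell can negate the status instead of using a null then clause.
negated="succeeded"
if ! rm "$scratch/sample.dat" 2>/dev/null
then negated="failed"
fi

rmdir "$scratch"
echo "$first $second $negated"
```

The negated form reads more directly than "then :" but is only an option where the shell is known to be POSIX-compatible.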
Another common construct uses the logical 'or' and 'and' operators (|| and &&) to link two statements together.
    do-this-command || die $? 'program error'
The above statement says perform the first command, or if it fails, do the second.
    do-this-command && rm temp.bak
The above statement says perform the first command, and if it succeeds, do the second as well.
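The two one-liners above can be tried out with the following sketch. The die function here is only an assumed stand-in (the helper used in this section is defined elsewhere and may differ), and the commands being tested are placeholders. The failing case runs in a subshell so that die's exit ends only the subshell.

```shell
#!/bin/sh
# Assumed stand-in for the die helper: print a message to stderr and
# exit with the given status.
die() {
    status=$1
    shift
    echo "error: $*" >&2
    exit "$status"
}

# The first command exits with 7, so die runs and passes that status on.
orStatus=0
( sh -c 'exit 7' || die $? 'program error' ) 2>/dev/null || orStatus=$?

# The first command succeeds, so die is skipped.
okStatus=0
( true || die $? 'program error' ) || okStatus=$?

# && runs the second command only when the first succeeds.
andResult="skipped"
true && andResult="cleanup ran"
skipResult="skipped"
false && skipResult="cleanup ran"

echo "$orStatus $okStatus $andResult / $skipResult"
```

Passing $? as the first argument to die preserves the failing command's own status, which is usually more useful to the caller than a fixed value.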

Think of these constructs as lazy (short-circuit) logical evaluation, which is common in many languages. Given a logical expression in some high-level language:

    ( var1 or var2 )
a lazy compiler realizes that if the first var is true, it does not need to evaluate the second to know if the entire expression will be true. Likewise given:
    ( var1 and var2 )
a lazy compiler realizes that if the first var is false, it does not need to evaluate the second to know if the entire expression will be false.
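The same lazy evaluation can be observed in the shell itself. In this sketch the functions succeed, fail, and second are hypothetical helpers that record each call in a trace variable, so the trace shows which operands were actually evaluated.

```shell
#!/bin/sh
# Hypothetical helpers that record when they are evaluated.
trace=""
succeed() { trace="$trace succeed"; return 0; }
fail()    { trace="$trace fail";    return 1; }
second()  { trace="$trace second";  return 0; }

trace=""; succeed || second; orTrue=$trace     # second is skipped
trace=""; fail    || second; orFalse=$trace    # second must run
trace=""; succeed && second; andTrue=$trace    # second must run
trace=""; fail    && second; andFalse=$trace   # second is skipped

echo "succeed || second:$orTrue"
echo "fail    || second:$orFalse"
echo "succeed && second:$andTrue"
echo "fail    && second:$andFalse"
```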

The return value of a command can be explicitly tested using the $? shell variable.

    mv delete.me sample.data;
    exitStat=$?;
    if [ $exitStat -ne 0 ]
    then die $exitStat 'unable to rename file'
    fi;
In this case the test command compares the saved value with zero; test itself returns 0 if the comparison is true, or a non-zero value if it is false or an error occurred. The status is copied into exitStat right away because every subsequent command, including test, overwrites $?. By the way, the character [ is another name for the test command. The above could also have been expressed as:
    mv delete.me sample.data;
    exitStat=$?;
    if ( test $exitStat -ne 0 )
    then die $exitStat 'unable to rename file'
    fi;
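The reason for copying $? into a variable first can be shown with a short sketch. The function fails_with_4 is a hypothetical command invented for this example; it simply returns a distinctive non-zero status.

```shell
#!/bin/sh
# Hypothetical command that fails with a distinctive status.
fails_with_4() { return 4; }

fails_with_4
saved=$?                 # captured immediately: the status of fails_with_4

fails_with_4
echo "checking" >/dev/null
late=$?                  # too late: this is now the status of echo

echo "saved=$saved late=$late"
```

Any command that runs between the program and the test resets $?, so the assignment should be the very next statement.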

A testing construct keenly missed by DCL programmers when working in the Unix world is the ON condition THEN command. This command establishes a condition handler which is invoked whenever any command or program meets or exceeds the established error threshold. The Bourne shell has a somewhat similar trapping mechanism, but it is designed for a different class of errors, as explained in 3.6. However, a slight subversion of the for loop block can be used to create a construct similar to the DCL ON condition THEN command:

     for blockOK in false; do
        man man >delete.me || break
        mv delete.me sample.dat || break
        rm sample.dat || break
        blockOK=true
     done

     if [ "$blockOK" = "true" ]
     then info 'we made it'
     else die 1 "we didn't make it"
     fi;
Here's how it works. The for statement assigns the string "false" to the loop variable blockOK. Since there is only one value, the loop executes exactly once. This gives us a block in which to execute and test several statements in a row. If any one of the statements fails, the break command exits the block before the sentinel variable is set to "true". Finally, we test the sentinel variable with a string comparison.
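To see the sentinel in action, the sketch below runs the same kind of block twice, once with every step succeeding and once with a failure injected in the middle. The helpers step and run_block are hypothetical names introduced for this illustration, and true and false stand in for real commands.

```shell
#!/bin/sh
# Hypothetical helper: record the step name, then run the given command.
step() {
    steps="$steps $1"
    shift
    "$@"
}

# Hypothetical wrapper around the one-pass for block.
# $1 is the command used for the middle step (true or false here).
run_block() {
    steps=""
    for blockOK in false; do
        step one   true || break
        step two   "$1" || break
        step three true || break
        blockOK=true
    done
}

run_block true
goodOK=$blockOK;  goodSteps=$steps

run_block false
badOK=$blockOK;   badSteps=$steps

echo "all steps succeed: blockOK=$goodOK, ran:$goodSteps"
echo "step two fails:    blockOK=$badOK, ran:$badSteps"
```

In the failing run, step three is never attempted and blockOK keeps its initial value of "false", which is the behavior the DCL ON condition THEN handler would otherwise provide.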

See Also

3.2 - Using STDERR and Exit Codes;
Chapter 10 of Unix for OpenVMS Users;
Chapter 46 of Unix Power Tools.

Colophon:
This page maintained by:
    Bill.Costa@unh.edu
    of the Enterprise Computing Group
    in the dept of Computing & Information Services
    at the University of New Hampshire

Created:  31-Jan-2001 BC
Revised:   6-Dec-2001 BC