Scripts

Steven Zeil

Last modified: Aug 29, 2023

You can put any sequence of Unix commands into a file and turn that file into a command. Such a file is called a script. For example, suppose that you are working on a program myprog and have several files of test data that you run through it each time you make a change. You might create a file dotest1 with the following lines:

./myprog < test1.dat > test1.dat.out
./myprog < test2.dat > test2.dat.out
./myprog < test3.dat > test3.dat.out
./myprog < test4.dat > test4.dat.out
./myprog < test5.dat > test5.dat.out 

You can’t execute dotest1 yet, because you don’t have execute permission. So you would use the chmod command to add execute permission:

chmod u+x dotest1

Now you could execute dotest1 by simply typing

./dotest1

Most shells provide special facilities for use in scripts. Since these differ from one shell to another, it’s a good idea to tell Unix which shell to use when running the script. You do this by placing the command #!path-to-the-shell in the first line of the script.
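
For example, if we wanted the dotest1 script from above to always be run by the sh shell, its first line would name that shell (a minimal sketch; /bin/sh is the usual location of sh on nearly every Unix system):

#!/bin/sh
# The #! line above tells Unix to run the rest of this file with /bin/sh.
./myprog < test1.dat > test1.dat.out
./myprog < test2.dat > test2.dat.out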


There are two main families of shells. The sh family includes sh and bash. The csh family includes csh and tcsh.

bash and tcsh have a number of features that are oriented towards people typing in commands directly from the keyboard. For example, both feature command and filename completion when the Tab key is pressed.

Such interactive features are unnecessary when writing scripts and, in some cases, could actually create problems. So we tend to fall back on the simpler sh and csh shells when writing command scripts.

In this lesson, we will focus on sh.

1 Parameters

We can pass parameters to shell scripts from the command line. For example, suppose we wanted a script to execute a single test like this:

./dotest2 test1.dat

that would feed test1.dat (or whatever) to the input of myprog, saving the output in test1.dat.out.

We use the symbol $k to stand for the k-th argument given to the script. So we can write our script dotest2 as follows:

#!/bin/sh
./myprog < $1 > $1.out

After the appropriate chmod, this could then be invoked once for each test file:

./dotest2 test1.dat
./dotest2 test2.dat
./dotest2 test3.dat
./dotest2 test4.dat
./dotest2 test5.dat

Of course, scripts can have more than one parameter.

Example 1: Try This: A script with command line parameters
  1. Use emacs or any other editor to create the following file, saving it as mcdonald

    #!/bin/sh
    echo Old McDonald had a farm.
    echo EIEIO
    echo And on that farm he had a $1.
    echo EIEIO
    echo With a $2, $2 here,
    echo And a $2, $2 there.
    
  2. Do

    chmod +x mcdonald
    

    to make it executable, and then try invoking it this way:

    ./mcdonald cow moo 
    ./mcdonald dog bark
    

Whatever we supply on the command line gets substituted for the $1, $2, etc., parameters before that line of the script is run. That can cause problems if we supply characters that are special to the shell. The solution, just as when we worry about special characters while working from the command line, is quoting. Prudent script programmers will consider the possibility that $1 and the other parameters might contain blanks or other special characters, and will use quotes to control the problem, e.g.,

    echo And on that farm he had a "$1".

(although, to be fair, even special characters aren’t all that likely to mess up an echo command).
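
Where such quoting really matters is with commands that treat each word as a separate argument. Here is a small sketch (the script name showfile and the file name "my notes.txt" are made up for illustration) showing the difference when a parameter contains a blank:

#!/bin/sh
# Suppose this script is invoked as:   ./showfile "my notes.txt"
ls -l $1      # becomes: ls -l my notes.txt   (two separate arguments, so ls complains)
ls -l "$1"    # becomes: ls -l "my notes.txt" (one argument, as intended)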

2 Control Flow

2.1 If statements

Shells feature control flow based on “status codes” returned by programs to indicate whether the program execution “succeeded” or “failed”. For example, this script tests to see if the gcc compiler has successfully compiled a file.

if gcc -c expr.c;
then
  echo All is well.
else
  echo Did not compile!
fi

In sh (and its relatives, such as bash), the if is followed by a list of commands, each terminated by a ‘;’. The status code of the last command in that list is used as the if condition.
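
For example (a small sketch reusing the expr.c compile from above), more than one command can appear in front of the then; only the status of the last one decides which branch runs:

#!/bin/sh
# Two commands in the condition list; the echo always succeeds, so it is
# the status of the gcc compile that actually controls the branch.
if echo Compiling expr.c...; gcc -c expr.c;
then
  echo All is well.
else
  echo Did not compile!
fi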

This idea of returning a status code explains why, in C and C++, the function main is always supposed to return an int:

int main (int argc, char** argv)
{
   ⋮
   return 0;
}

The returned value is the status code. You are supposed to return a zero when your program works normally, but you are supposed to return a non-zero “error code” if your program terminates abnormally (e.g., if it discovers that a file it needs is missing).

The shell if statements will take the “then” branch when a command returns a status code of 0, but any non-zero value denotes a failure and will cause the script to follow the “else” branch. Most Unix commands will list the status codes they return on their man pages (the pages you get when running the command man command).
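
You can see that convention in action with the standard true and false commands, which do nothing except return a status code of 0 and a non-zero status, respectively. A quick sketch you can type directly at the prompt:

if true; then echo true reported success; fi
if false; then echo this never prints; else echo false reported failure; fi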

How do you know when a command is supposed to return “success” (0) or “failed” (non-zero) signals? The safest way to tell is to look at the man page for that command. But in many cases you can guess. For example, g++ will signal “success” if no compilation or linking errors were encountered. grep will signal “success” if it finds at least one line that matches your search pattern.

Example 2: Try This: Conditions Based on Command Success/Fail
  1. cd ~/playing

  2. Open a text editor and create the following as a file wheresWaldo.sh:

    #!/bin/sh
    if grep -i waldo "$1";
    then
        echo 'There he is!'
    else
        echo "I don't see him."
    fi
    
  3. After saving your file, do

    chmod +x wheresWaldo.sh

  4. Now let’s create some test data.

    echo "Hello." > yes.txt
    echo "I'm Waldo." >> yes.txt   # Note the >> for appending
    echo "Hi there." > no.txt
    echo "My name is Walter." >> no.txt
    
  5. Try out the script.

    ./wheresWaldo.sh yes.txt
    ./wheresWaldo.sh no.txt
    
  6. Although the script works (unless you have mistyped it), it’s a little distracting to have grep’s output mixed with the output of the script. We could refine our script by sending the grep output to a file instead of to the screen. (There’s even a special file/device that we can send output to that just makes it vanish, /dev/null. E.g., grep -i waldo "$1" > /dev/null).

    But using grep in scripts like this is so common that there’s a special flag, -q or --quiet, that suppresses the output.

    Edit wheresWaldo.sh and make a small change:

    #!/bin/sh
    if grep --quiet -i waldo "$1";
    then
        echo 'There he is!'
    else
        echo "I don't see him."
    fi
    
  7. Try out the script again.

    ./wheresWaldo.sh yes.txt
    ./wheresWaldo.sh no.txt
    

2.2 Loops

Looping is also available in the shells. One of the most commonly used loop forms is that of visiting every item in some list. The list is often all files satisfying some wildcard pattern:

for file in *.txt
do
echo $file
done

Another common kind of loop visits every parameter passed to the script

for x in $*
do
echo $x
done

$* is a list of all the command parameters. If the second of these scripts were stored in a file testp and then invoked as

./testp a b c

the output would be

a
b
c

If invoked as

./testp *.txt

the output would be a list of all .txt files in the working directory.

Another common scripting pattern is a script that has a few special parameters at the beginning, followed by an arbitrary number of remaining parameters (often filenames). The shift command helps us to handle these. Each call to shift removes the first element of the $* list.

Example 3: Try This: Looping through parameters

Create a file testp2 containing this script:

#!/bin/sh
p1="$1"
shift
p2="$1"
shift
echo The first two parameters are \
$p1 and $p2
echo The remaining parameters are $*
for x in $*
do
echo $x
done

Use chmod to make that file executable and invoke it as

./testp2 a b c d

Study the output. Do you see the effects of the shifts? If not, try commenting out the shift commands (put a ‘#’ character in front of them) and run the command again.

Once you are satisfied that you know what is happening and why, try invoking it like this:

./testp2 a "b c" 'd e f' g

The “$*” is expanded into a simple string “a b c d e f g” and the for loop actually breaks that string apart at each blank space. The result does not respect the grouping of parameters imposed by the quotes in the original command. Hence 'd e f' gets broken into three pieces. On the other hand, the shift command does respect the original quoting. Hence the "b c" were kept together by the shifting.

A while loop is also available.

while condition;
do
commands
done

It’s hard to see how we can use this with the tools we have discussed so far. In the next section, we’ll look at some useful ways to write conditions that can take advantage of this style of loop, including providing a more accurate way to loop through command line arguments.

3 The test and expr Programs

One thing that becomes obvious quickly is that status-code based testing is limited. Often we want to ask questions about files (does a particular file exist? is it readable?), or to compare strings or numbers, but status codes indicating whether or not a program executed successfully don’t seem to help us very much.

The solution is to use a program whose whole job is to evaluate a condition and to return a status code of zero if that condition is true and a non-zero code if it is false.

The Unix program that does this is called test:

gcc -c expr.c
if test -r expr.o;
then
echo All is well.
else
echo Did not compile!
fi

test takes a bewildering variety of possible parameters. You can see the whole list by giving the command man test. Many of these are used for checking the status of various files. The -r expr.o in the above script checks to see if a file named expr.o exists and if we have permission to read it. Some common file tests are:

test       is true if
-r file    file exists and is readable
-w file    file exists and is writable
-x file    file exists and is executable
-d file    file exists and is a directory
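
For example, here is a small sketch (reusing the myprog and test data files from the start of this lesson) that only runs a test if the program has actually been built:

#!/bin/sh
# Run the test only if ./myprog exists and is executable.
if test -x ./myprog;
then
  ./myprog < test1.dat > test1.dat.out
else
  echo myprog is missing or not executable.
fi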

Strings are compared with = and != in sh.

if test $USER = zeil;
then
echo Nice guy!
else
echo Who are you?
fi

We can use the ability to compare strings to provide a more accurate solution to the problem of looping through command-line arguments than we were able to achieve in the previous section.

Example 4: Try This: Looping through parameters with while

Create a file testp3 containing this script:

#!/bin/sh
p1="$1"
shift
p2="$1"
shift
echo The first two parameters are \
$p1 and $p2
echo The remaining parameters are $*
while test "$1" != "";
do
file="$1"
shift
echo $file
done

Use chmod to make that file executable and invoke it as

./testp3 a b c d

Study the output. Do you see the effects of the shifts? If not, try commenting out the shift commands (put a ‘#’ character in front of them) and run the command again.

Once you are satisfied that you know what is happening and why, try invoking it like this:

./testp3 a "b c" 'd e f' g

This time you should see that the original grouping of the commands via quotes is being preserved. That could be important if we had file names that had blank characters inside the name. Such names are somewhat rare in Unix practice, but common in Windows, so if you tend to work with files created on both kinds of systems, you might want to be sure that your scripts can handle such names.

Numbers are compared with rather clumsy operators in sh (-eq -ne -lt -gt -le -ge):

if test $count -eq 0;
then
echo zero
else
echo non-zero
fi

But where do numbers come from if all the variables contain strings?

From yet another program, of course. The shells themselves have no built-in numeric capability. Calculations can be performed by the expr program. This program treats its arguments as an arithmetic expression, evaluates that expression, and prints the result on standard output. For example:

expr 1
1
expr 2 + 3 \* 5
17

Note the use of \ to “quote” the following character (*). Without that backwards slash, the shell into which we typed the command would treat the * as the filename wildcard, replacing it with a list of all files in the current directory, and expr would have actually seen something along the lines of

expr 2 + 3 file1.txt file2.txt myfile.dat 5

Now, how do we get the output of an expr evaluation into a variable or a script expression where it can do some good? For this we use the convention that backquotes (the backwards apostrophe character, `), when used to quote a string, mean “execute this string as a shell command and put its standard output right here in this command”.

For example:

echo Snow White and the expr 6 + 1 Dwarves
Snow White and the expr 6 + 1 Dwarves
echo Snow White and the `expr 6 + 1` Dwarves
Snow White and the 7 Dwarves

With these two ideas, we can now do numerical calculations in our scripts:

count=0
for file in *
do
count=`expr $count + 1`
done
echo There are $count files \
in this directory.

Example 5: Try This: Looping from the command line

Although control flow commands are most commonly used inside scripts, they are just part of the same shell language that you use when you type commands directly at the keyboard. Try the following commands:

If your login shell is bash:

for file in ~/*
do
echo My home directory contains $file
done

If your login shell is tcsh:

foreach file (~/*)
echo My home directory contains $file
end

This can be a useful trick sometimes when you want to apply a command to every file in some directory (although you can usually accomplish the same task with some combination of find and xargs).
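
For instance, a rough (and hedged) equivalent of the bash loop above, built from find and xargs, might look like this; it assumes the -maxdepth option, which is available in GNU and BSD find:

find ~ -maxdepth 1 -print | xargs -n 1 echo My home directory contains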

4 Scripting Example: The Student Test Manager

As an example of how to bring all those scripting details together, let’s look at some scripts to aid in testing programs. Many programming students wind up adopting a hit-and-miss approach to testing their code, in part because they don’t set themselves up with an easy way to repeat a large number of tests every time they make a change to their programs.

What we’d like to end up with is a simple system for testing non-GUI programs. The idea is that the student designs a number of tests and can then issue a command to run all or a selected subset of those tests. The command should run those tests, capturing the program outputs in files that the student can examine later. Furthermore, we can save the student a bit of time by letting him or her know if the program output has changed on any of those tests.

4.1 Sample Scenario

So, suppose the student is working on a program named myProg and has designed 20 test cases. Once the program compiles successfully, the student could say

./runTests ./myProg 1 20

to run all 20 tests. The output might look something like:

Starting test 1...
Starting test 2...
Starting test 3...
Starting test 4...

Maybe at this point the output stops, suggesting that the program has been caught in an infinite loop. The student kills the program with Ctrl-C. Now looking in the directory, the student finds files testOut1.txt, testOut2.txt, testOut3.txt, and testOut4.txt corresponding to the tests that were actually started. The student looks at the first three of these, decides that the captured output looks correct, and starts debugging the program to figure out why it hung up on test 4. Eventually the student makes a change to the program, recompiles it, and tries again, this time just running test 4.

./runTests ./myProg 4 4
Starting test 4...
** Test 4 output has changed

Not surprisingly, the output from test 4 is different, because the infinite loop has now, apparently, been fixed. Checking testOut4.txt, everything looks good.

Encouraged, the student launches the whole test set once again

./runTests ./myProg 1 20
Starting test 1...
Starting test 2...
** Test 2 output has changed
Starting test 3...
** Test 3 output has changed
Starting test 4...
Starting test 5...
⋮
Starting test 20...

The unexpected has occurred. The fix to make test 4 work has changed the behavior of the program on tests 2 and 3, which had previously been believed to be OK. The student must go back and check these (as well as the outputs of tests 5…20) to see what has happened. Annoying? Yes, but it’s better that the student should discover these changes in behavior before submitting than that the grader should do so after submission!

4.2 The Script

As we have envisioned it, our runTests script takes three parameters:

  1. The name of the program to run

  2. The number of the first test to be performed

  3. The number of the last test to be performed

So we start our script by gathering those three parameters:

#!/bin/sh
programName=$1
firstTest=$2
lastTest=$3

Clearly, the main control flow here will be a loop going through the requested test numbers.

#!/bin/sh
programName=$1
firstTest=$2
lastTest=$3
#
# Loop through all tests
testNum=$firstTest
while test $testNum -le $lastTest;
do
⋮
testNum=`expr $testNum + 1`
done

For each test, we will eventually want to compare the output from this run of the test, stored in testOut1.txt, testOut2.txt, etc., to the output from the previous run of the same test, which we will assume is stored in the corresponding testOut1.old.txt, testOut2.old.txt, etc. The cmp command lets us compare two files to see if their contents are identical.
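
cmp follows the same status-code convention we used with grep: it returns 0 when the two files are identical and a non-zero status when they differ, so it can sit directly inside an if. A quick sketch with throwaway files:

echo hello > copy1.txt
echo hello > copy2.txt
if cmp copy1.txt copy2.txt;
then
  echo The files are identical.
else
  echo The files differ.
fi

Adding that comparison to our loop gives: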

#!/bin/sh
programName=$1
firstTest=$2
lastTest=$3
#
# Loop through all tests
testNum=$firstTest
while test $testNum -le $lastTest;
do
⋮
# Has the output changed?
if test -r testOut$testNum.old.txt;
then
if cmp testOut$testNum.old.txt testOut$testNum.txt;
then
donothing=0
else
echo \*\* Test $testNum output has changed
fi
fi
testNum=`expr $testNum + 1`
done

To set this up, we must determine just where the .old.txt files come from. They are simply the previous version of the test output files (if that particular test has ever been run).

#!/bin/sh
programName=$1
firstTest=$2
lastTest=$3
#
# Loop through all tests
testNum=$firstTest
while test $testNum -le $lastTest;
do
# Save the previous output from this test
if test -r testOut$testNum.txt;
then
/bin/mv testOut$testNum.txt testOut$testNum.old.txt
fi
⋮
#
# Has the output changed?
if test -r testOut$testNum.old.txt;
then
if cmp testOut$testNum.old.txt testOut$testNum.txt;
then
donothing=0
else
echo \*\* Test $testNum output has changed
fi
fi
testNum=`expr $testNum + 1`
done

Finally, we come to the heart of the matter. We need to actually execute the program, saving the output in the appropriate testOut... file. Exactly how we want to execute the program depends upon how the program gets its input data. I’m going to assume, for the moment, that this program reads its input data from the standard input stream, and that the student saves the input test cases in testIn1.txt, testIn2.txt, …

#!/bin/sh
programName=$1
firstTest=$2
lastTest=$3
#
# Loop through all tests
testNum=$firstTest
while test $testNum -le $lastTest;
do
#
# Save the previous output from this test
if test -r testOut$testNum.txt;
then
/bin/mv testOut$testNum.txt testOut$testNum.old.txt
fi
# Run the test!   
echo Starting test $testNum...
$programName < testIn$testNum.txt > testOut$testNum.txt 2>&1
#
# Has the output changed?
if test -r testOut$testNum.old.txt;
then
if cmp testOut$testNum.old.txt testOut$testNum.txt;
then
donothing=0
else
echo \*\* Test $testNum output has changed
fi
fi
testNum=`expr $testNum + 1`
done

By making minor changes to the way the program is run, we can accommodate a number of different possible program styles. How would you change this script for a program that read no inputs at all, but could be invoked with different command-line parameters?

There are a number of possibilities, but I would put the various parameters into the testIn... files, and run them this way:

#!/bin/sh
⋮
#
# Run the test!   
echo Starting test $testNum...
$programName `sed -e 's/[\r\n]//g' testIn$testNum.txt` \
> testOut$testNum.txt 2>&1
⋮
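
As a purely hypothetical illustration (the flag and data file names below depend entirely on your own program), a test case for this style of testing would be a one-line file of command-line arguments; the backquoted sed strips the line ending, and the shell splits what remains into separate parameters:

echo "data3.csv --limit 10" > testIn3.txt    # hypothetical arguments for myProg
./runTests ./myProg 3 3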