Executing Programs

Before we can execute a file, we have to find it. Well, when I say “we”, I really mean the command shell, the program whose job it is to process the command lines we type and interpret them as things to actually do.

One way to make sure that the command shell can find the file we want is if we always type out a correct path to that file. But for the commands that we use most often, that would get to be tiresome rather quickly.

For example, the cp and ls commands are actually programs located at /usr/bin/cp and /usr/bin/ls. We certainly could type

/usr/bin/ls ...

every time we wanted to list out a directory, but who would want to?

So the command shell has a mechanism in place that allows us to drop certain paths from our command names. If the command name we type is a simple name containing no ‘/’ characters, the shell will look for the file in each of the directories listed in an environment variable named PATH.

Environment variables are rather like variables in a programming language, but are stored by the command shell. We access an environment variable by putting a ‘$’ in front of its name.

For example, if I print the value of PATH:

echo $PATH

I will probably see something like

/home/zeil/bin:/usr/local/bin:/usr/bin:/bin

This is actually a list of 4 paths, separated by colons:

/home/zeil/bin This is my own directory, one in which I place a number of programs of my own that I use frequently. I have added it to my PATH using a technique I’ll show later in this lesson.
/usr/local/bin This is a system path, used mainly for programs that are installed for use by anyone on a Linux machine but that are not considered a fundamental component of Linux.
/usr/bin Another system path, this one for programs that are common to most Linux distributions.
/bin Another system path, used for the most common, core commands in a Linux system. (Actually, on the CS Linux servers, /usr/bin and /bin share the same contents.)

When we type a command beginning with a simple name like “cp”, the shell will search through each of those paths in turn. For me, this means that the shell first looks to see if there is a file named /home/zeil/bin/cp. There isn’t, so next the shell looks for /usr/local/bin/cp. But no such file exists. Then the shell looks at /usr/bin/cp, and there it finds a program that just happens to be our old friend, the file copy command.
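
The search just described can be imitated in a few lines of shell. This is only a sketch of the idea, not how the shell is actually implemented: find_in_path is a made-up name, and the sketch ignores aliases and commands built into the shell. It also uses an if test, which is covered in a later lesson.

```shell
#!/bin/sh
# find_in_path NAME: print the first file named NAME that is found
# while walking through the directories listed in PATH.
# (find_in_path is a hypothetical helper, not a standard command.)
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:                        # split PATH at the colons
    for dir in $PATH
    do
        IFS=$old_ifs
        [ -n "$dir" ] || dir=.   # an empty entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]
        then
            echo "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1                     # not found anywhere in PATH
}

find_in_path cp
```

With the PATH shown earlier, this should print /usr/bin/cp, the same file that the shell settles on.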

Example 1: Try This: the PATH

Issue the following commands:
date
which date
The which command searches the PATH for a file and tells us where it finds it.
cp /usr/bin/date ~/playing/date2
You have made a copy of the date program within your own directory.
~/playing/date2
date2
which date2
The first of these two commands works because we gave the absolute path to our new date2 program. The second one fails, because ~/playing is not in our PATH, so the shell cannot find date2. Similarly, the which command is silent because it cannot find date2.

How do we know that ~/playing is not in our PATH? The which command above is a pretty good clue. But we can check directly:
echo $PATH
Clearly, ~/playing is not in our PATH.

Let’s change that:
export PATH=~/playing/:$PATH
This command resets the PATH to be “~/playing” followed by a colon (:) followed by the former value of PATH. Check it out.
echo $PATH
which date2
date2
date2 is now found by the shell as part of the normal command search.

This change to PATH will last until we change the PATH again or until we log out of this session. If you have other windows open with SSH sessions to the same remote Linux machine, they are not affected by this change to PATH.
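
You can see the same isolation without opening a second window. In the sketch below, a separately started sh stands in for “another session” (that substitution is an assumption of the sketch), and /tmp/demo-dir is a hypothetical directory that does not need to exist.

```shell
before=$PATH

# Start a separate shell, change PATH inside it, and let it finish.
sh -c 'PATH=/tmp/demo-dir:$PATH; export PATH; echo "inside: $PATH"'

after=$PATH
if [ "$before" = "$after" ]
then
    echo "outside: PATH is unchanged"
fi
```

The change made inside the separate shell disappears when that shell exits, and the shell you are typing in never sees it.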

1.2 The Execute Permission

We have already studied file permissions. Recall that the three basic permissions we can set are read, write, and execute. It’s easy to see the effect of the execute permission.

Example 2: Try This - Execute Permission

Give the following commands:
which date2
date2
Now change the permission on that file:
ls -l ~/playing/date2
chmod -x ~/playing/date2
ls -l ~/playing/date2
And try running it again:
date2
which date2
~/playing/date2
We broke it! Well, really we made it un-executable. Luckily, the fix is easy.
chmod +x ~/playing/date2
ls -l ~/playing/date2
date2
which date2
~/playing/date2
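
A script can ask the same question that the shell asks before launching a file. The sketch below works in a scratch directory so that nothing in ~/playing is touched; hello.sh is a hypothetical file name, and the [ -x file ] test it relies on belongs to a later lesson on conditions.

```shell
scratch=$(mktemp -d)
prog=$scratch/hello.sh          # hypothetical file name

printf '%s\n' '#!/bin/sh' 'echo hello' > "$prog"

chmod +x "$prog"
if [ -x "$prog" ]               # true when we are allowed to execute the file
then
    with_x=yes
else
    with_x=no
fi

chmod -x "$prog"
if [ -x "$prog" ]
then
    without_x=yes
else
    without_x=no
fi

echo "after chmod +x: executable? $with_x"
echo "after chmod -x: executable? $without_x"
rm -r "$scratch"
```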

2 The Content of an Executable File

Earlier, we said that one of the requirements for a file to be executable is that

“3. It must have contents that ‘make sense’ to the program that will launch it.”

In this section, we will look at the two key ideas in that statement: 1) the idea of a program that launches the file, and 2) what it means to “make sense” to that launcher.

2.1 Binary Executables

Most of the executable programs you have encountered so far have been native binary executables. Most of the Linux commands that we have learned fall into this category. This is also what we get when we compile C++ or C code to produce an executable.

Example 3: Try This: a binary executable

Let’s set up a C++ program that counts the number of words in its input.

mkdir ~/playing/wordCounter
cd ~/playing/wordCounter

Open your favorite editor and create the following as a file wordCounter.cpp in ~/playing/wordCounter:

#include <string>
#include <iostream>

using namespace std;

int main(int argc, char** argv)
{
    string word;
    int counter = 0;
    while (cin >> word) {
        ++counter;
    }
    cout << argv[1];
    cout << ", I have read " << counter << " words." << endl;
    return 0;
}

Now, let’s compile that.

g++ -o wordCount -g wordCounter.cpp

This should compile cleanly. If you get error messages, open the .cpp file in your editor and fix it. Then recompile.

Let’s look at what we have:

ls -l

You should see that we have a new file, wordCount. Notice that the compiler has, conveniently, already set the execute permission on that file.

Normally, we can’t view binary files (executable or not), but there is a command, hd, which stands for “Hexadecimal Dump”, that can show us the bytes that make up any file.

Try it first on a file whose contents we already know:

hd wordCounter.cpp | head

(The head command clips the output after the first few lines.)

In the leftmost column you see a counter showing the number of bytes since the start of the file. After that you see the actual bytes in the file, 16 per line. All of the numbers shown are in hexadecimal (base-16).

On the right of each line, you see another representation of those same 16 bytes. If a byte contains the ASCII code of a printable character, that character is shown. Numbers that are outside the ASCII range of 0..127 or that denote non-printable characters (e.g., tabs and newline characters) are shown as a period (.).

You should be able to make out the familiar contents of the C++ code in this file.

Now let’s try the same thing with our executable:

hd wordCount | head

This is considerably messier. Lots of the bytes contain values that are not printable ASCII. Most of the printable characters are occurring by happenstance — some of the numbers that make up the machine code just happen to correspond to ASCII characters. (If you were to look far enough into the file, you might actually find recognizable strings that represent data values like “I have read ”.)

Look at the very beginning however. The first 4 bytes are 7F followed by the ASCII codes for ‘E’, ‘L’, and ‘F’. Every binary executable on Linux will start with those same four bytes. It’s called a magic number, meaning that it is a special code that identifies what kind of file we are looking at.

This particular magic number is a signal to the command shell that the file contains a binary executable in ELF format, and that the command shell can simply load and launch that program.
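
If you only care about the magic number, you do not need a full dump. The sketch below uses od, a standard command closely related to hd, to print just the first four bytes of a file. first4 is a made-up helper name, and the two files it is tried on are fakes created on the spot so that the example does not depend on what is installed on your machine.

```shell
# first4 FILE: print the first four bytes of FILE as hexadecimal digits.
# (first4 is a hypothetical helper, not a standard command.)
first4() {
    od -A n -t x1 -N 4 "$1" | tr -d ' \n'
    echo
}

scratch=$(mktemp -d)
printf '\177ELF and so on' > "$scratch/fake-binary"    # starts like an ELF file
printf '#!/bin/sh\n' > "$scratch/fake-script"          # starts like a script

first4 "$scratch/fake-binary"     # prints 7f454c46
first4 "$scratch/fake-script"     # prints 23212f62
```

Running first4 wordCount on the program compiled above should print 7f454c46 as well.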

OK, let’s run the program.
./wordCount Hello
(Why do we need to put the “./” in front of the program name? Because ~/playing/wordCounter/ is not in our PATH!)

The program will appear to pause. Actually, it is reading from standard in, your keyboard.

Type
This is my
input. I am
done.
After the final line, make sure you have typed Enter and then type a Ctrl-d. This is the end-of-input signal on Linux.

The program should respond:
Hello, I have read 7 words.
Remember that standard in doesn’t always have to come from the keyboard.

Open your favorite editor and create a file in that same directory named data1.txt containing
This is my
input. I am
done.
and another file named data2.txt containing
 blah blah
 blah blah
 blah blah
 blah blah
 blah blah
Now run the program again, using redirection:
./wordCount Hooray < data1.txt
./wordCount Yawn < data2.txt
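
As a cross-check on the counts, the standard wc command can be fed the same files through the same kind of redirection. This sketch recreates the two data files in a scratch directory (an assumption made so that it can be run anywhere) and counts their words with wc -w.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

printf '%s\n' 'This is my' 'input. I am' 'done.' > data1.txt
printf '%s\n' ' blah blah' ' blah blah' ' blah blah' ' blah blah' ' blah blah' > data2.txt

# wc -w counts words; with < it reads standard in, just as wordCount does.
wc -w < data1.txt     # 7
wc -w < data2.txt     # 10
```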

2.2 Text Executables

7F E L F, as a sequence of ASCII codes, is the magic number that, when it appears at the start of a file, tells the shell to treat that file as a binary executable.

But text files can be made executable as well. To do that, we need to

Set the execute permission on the file.
Use the magic number “#!” to tell the shell how to run the file.

The first of these is simply a matter of issuing the appropriate chmod command.

For the second, we add a line to the start of the text file:

#!path-to-a-program

This is interpreted as an instruction to launch the program named in path-to-a-program and have that program read and process this text file.

The most common application for this idea is to feed a series of commands to a command shell. A file with a sequence of commands like that is called a shell script.
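
The program named after the #! does not have to be a shell. As a small experiment, the sketch below names /bin/cat (assumed to exist at that path on your system) as the launcher for a hypothetical file called selfprint. Since all cat does is print what it is given, running the file prints the file.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# selfprint is a hypothetical file name; its first line names /bin/cat
# as the program that should process the file.
printf '%s\n' '#!/bin/cat' 'These lines are not commands.' 'cat simply prints them.' > selfprint

chmod +x selfprint
./selfprint
```

Notice that the #! line itself is printed too: cat is handed the whole file. A shell used as the launcher would treat that first line as a comment, because it begins with #.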

3 Shell scripts

We now know how to feed text to a program in an executable file.

Suppose that the program that we named in the #! header was a command shell. A command shell reads lines of text (usually from the keyboard) and executes each line as a command. So we could arrange a whole sequence of commands to be executed by listing them in a file.

We call a file like this a shell script (because it’s a “script” of commands that we feed to a “shell”). This lets us capture a sequence of commands that we type over and over and turn them into a single command.

For example, when working with a program, we might want to be able to recompile that program and run our tests on it immediately.

Example 4: Try This: capturing commands in a script
Return to your ~/playing/wordCounter directory, if necessary.
Use an editor to create this file, buildIt.sh:
#!/bin/sh
g++ -o wordCount -g wordCounter.cpp
./wordCount Hello < data1.txt
./wordCount $USER < data2.txt
The first line names /bin/sh as the command shell we want to use to process the commands that are in the rest of the file. sh is a “stripped-down” version of the normal command shell that eliminates some of the features specifically oriented towards keyboard entry of commands.
Make this executable, and run it:
chmod +x buildIt.sh
./buildIt.sh
Now edit wordCounter.cpp and change “words” to “phrases”.
Run our script again:
./buildIt.sh
It should be obvious that the changed code has been recompiled and the new program yields different output on our tests.

3.1 Parameters

We can pass parameters to shell scripts from the command line. For example, suppose we wanted a script to execute a single test like this:

dotest2 test1.dat

that would feed test1.dat (or whatever) to the input of myprog, saving the output in test1.dat.out.

We use the symbol $k to stand for the k-th argument given to the script ($1 is the first, $2 the second, and so on). So we can write our script dotest2, as follows:

#!/bin/sh
./myprog < $1 > $1.out

After the appropriate chmod, this could then be invoked as

./dotest2 test1.dat
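
If you have no myprog at hand, you can still watch the substitution happen. In this sketch, myprog is replaced by a stand-in that converts its input to upper-case (purely an assumption for illustration), and everything is created in a scratch directory.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Stand-in for myprog: copy input to output, upper-casing the letters.
printf '%s\n' '#!/bin/sh' 'tr a-z A-Z' > myprog

# The dotest2 script, exactly as shown above.
printf '%s\n' '#!/bin/sh' './myprog < $1 > $1.out' > dotest2

chmod +x myprog dotest2

echo 'some test input' > test1.dat
./dotest2 test1.dat
cat test1.dat.out
```

The last command should show SOME TEST INPUT, confirming that $1 was replaced by test1.dat in both places.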

Of course, scripts can have more than one parameter.

Example 5: Try This: A script with command line parameters
Use your favorite text editor to create the following file, saving it as mcdonald.sh
#!/bin/sh
echo Old McDonald had a farm.
echo EIEIO
echo And on that farm he had a $1.
echo EIEIO
echo With a $2, $2 here,
echo And a $2, $2 there.
Do
chmod +x mcdonald.sh
to make it executable, and then try invoking it this way:
./mcdonald.sh cow moo 
./mcdonald.sh dog bark

Whatever we supply in the command line gets substituted for the $1, $2, etc., parameters before that line of the script is run. That can cause problems if we supply characters that are special to the shell. The solution, just as when we worry about special characters when working from the command line, is quoting. Prudent script programmers will consider the possibilities that $1 and other parameters might contain blanks or other special characters and use quotes to control the problem, e.g.,

    echo And on that farm he had a "$1".

(although, to be fair, even special characters aren’t all that likely to mess up an echo command).
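
A command that cares about how many arguments it receives shows the difference more clearly than echo does. The sketch below uses two made-up shell functions; inside a function, $1 refers to the function's first argument in the same way that $1 in a script refers to the script's first parameter. It also uses $#, the number of arguments received, which has not been introduced in this lesson.

```shell
# how_many prints the number of arguments it was given.
how_many() {
    echo $#
}

# try_both passes its first argument along twice:
# once without quotes and once with them.
try_both() {
    unquoted=`how_many $1`
    quoted=`how_many "$1"`
}

try_both "brown cow"
echo "Without quotes: $unquoted arguments"
echo "With quotes: $quoted argument"
```

Without the quotes, the blank inside “brown cow” splits the value into two separate arguments; with the quotes, it stays together as one.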

Closely related to the numbered parameters $1, $2, etc., is the parameter $*, which holds all of the command line parameters gathered together in one string. This is often used in writing loops to process the command parameters, one by one:

for x in $*
do
    echo I am looking at $x
done

The first line indicates that we are going to loop through each of the parameters in $*. Each time around the loop, a different parameter value will be selected and assigned to $x. Of course, you can change the name of the loop variable $x to whatever you like, and you can put any sequence of commands into the loop body.
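
One caution, going slightly beyond this lesson: because $* gathers everything into one string, a parameter that contains a blank gets split apart by the loop. The standard shell offers "$@" (with the quotes) as a way to keep each parameter intact. The sketch below compares the two using made-up functions that simply count how many times the loop body runs.

```shell
# Count the loop iterations when looping over $*
count_with_star() {
    n=0
    for x in $*
    do
        n=`expr $n + 1`
    done
    echo $n
}

# Count the loop iterations when looping over "$@"
count_with_at() {
    n=0
    for x in "$@"
    do
        n=`expr $n + 1`
    done
    echo $n
}

star_count=`count_with_star "old mcdonald" cow`
at_count=`count_with_at "old mcdonald" cow`
echo 'Looping over $*   :' $star_count
echo 'Looping over "$@" :' $at_count
```

Two parameters were supplied, but the $* loop runs three times because “old mcdonald” is split at its blank.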

Example 6: Try This: A script with a loop
Let’s re-use a directory from an earlier exercise.
cd ~/playing/compilation
Use your favorite text editor to create the following file, saving it as whatisit.sh
#!/bin/sh
for file in $*
do
    file $file
    echo
done
Do
chmod +x whatisit.sh
to make it executable, and then try invoking it this way:
./whatisit.sh sayhello.h
./whatisit.sh sayhello.cpp
./whatisit.sh sayhello.* hello1 what*
./whatisit.sh *

There’s a lot more that can be done with shell scripts, including if-then-else structures and programs/commands that act as conditions to obtain more complicated behaviors. These are covered in a later lesson.

3.2 Debugging Your Scripts

Scripting is just another form of programming. Just like programs that you write in C++ or other “traditional” programming languages, you need to test your scripts to see if they work, and, just like those other programs, your scripts are probably not going to work correctly on the first try.

Debugging your scripts is not all that different from debugging traditional programs, either. Many of the basic techniques are similar:

3.2.1 Isolate the Problem

If you have a script that does several things in succession, try to determine how far along that sequence you have been getting before things go wrong. The whole reasoning process that we call “debugging” is a lot easier if you know where to look.

You can try to do this by examining any intermediate results to see if they are correct. In particular, if your script produces some temporary or working files, look at those. If your script rms those files when it’s done with them, try commenting out the rm statements (put a # in front of them) until you’re sure everything is working. Another way to examine intermediate results is by adding debugging output to print them out (see below).

If that doesn’t work, you can try “shortening” your script by commenting out the later steps and seeing if the partial script seems to work, then uncommenting the next step and testing the script again, and so on.

3.2.2 Add Debugging Output

Just as you might add extra output statements to a C++ program to reveal the values of selected important variables, you can do the same thing with scripts. The echo command is used for this purpose. For example, if we were having problems with this script:

#!/bin/sh
count=0
for file in *
do
  count=`expr $count + 1`
done
echo There are $count files \
in this directory.

we might add some temporary output to see just what was going on inside the loop:

#!/bin/sh
count=0
for file in *
do
  echo Looking at $file.
  count=`expr $count + 1`
  echo Count is $count.
done
echo There are $count files \
in this directory.

Debugging output like this not only reveals the values of variables, but also can be valuable in showing which branch of an if was selected, how many times a loop is executed, etc.

Because echo can print pretty much anything you give it as an argument, it’s particularly useful when you have a script that runs other programs and suspect that your script may not be invoking those programs correctly. For example, if you have a script huntFor:

#!/bin/sh
#
# Hunt through a list of files for
# a string. E.g.,
#   huntFor sqrt *.cpp
#   
Search=$1
shift
grep -l $Search $*

and are wondering if the grep command is being issued the way you expect, you might try:

#!/bin/sh
#
# Hunt through a list of files for
# a string. E.g.,
#   huntFor sqrt *.cpp
#   
Search=$1
shift
echo grep -l $Search $* 
grep -l $Search $*

to see the command just before it gets issued.

You might find it valuable to actually create the above script, storing it in a file named huntFor, and try testing it like this:

./huntFor yes ~/UnixCourse/compileAsst/*.cpp
./huntFor bigger ~/UnixCourse/compileAsst/*.cpp
./huntFor "bigger than" ~/UnixCourse/compileAsst/*.cpp

The last case reveals a problem, which is easily fixed:

#!/bin/sh
#
# Hunt through a list of files for
# a string. E.g.,
#   huntFor sqrt *.cpp
#   
Search=$1
shift
grep -l "$Search" $*

If you have tried those tests and don’t see how this fix works, review Quoting.

The use of echo to print out entire commands does carry a bit of a risk if the commands involve output redirection or pipes. For example, if the original command is redirecting its output into a file:

cat $1 $2 > $3

then throwing an echo version in front:

echo cat $1 $2 > $3
cat $1 $2 > $3

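will actually result in your debugging output being written into the output file instead of appearing on your screen. You never get to see it, and if the real command appends to its output file or is part of a pipe, the stray text will probably mess things up even more than they were before.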
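
One way around this, sketched below, is to put the whole command inside double quotes when echoing it. Inside quotes the > is ordinary text, so the debugging line goes to the screen. The file names and the set -- line, which pretends that the script was given three parameters, are assumptions made so that the sketch can be run on its own.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo one > a.txt
echo two > b.txt

# Pretend the script was invoked with these three parameters.
set -- a.txt b.txt out.txt

echo "cat $1 $2 > $3"     # inside quotes, > is ordinary text
cat $1 $2 > $3

cat out.txt
```

The first line of output is the command itself, cat a.txt b.txt > out.txt, and out.txt contains only the two lines that came from the input files.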

In cases like that, you can obtain the same information more easily by our next technique, tracing the execution.

3.2.3 Trace the Execution

If you were working with a C++ program, you could use gdb or ddd to step through the program and see exactly which statements were being executed, in which order.

Shells provide a more primitive, but still useful, tracing facility. The “-x” option, supplied to sh or csh, asks it to list each command before it executes it (and after replacing any variables by their values).

So, if you have written a shell script named “myScript” with the first line:

#!/bin/sh

you might test it this way:

sh -x myScript parameters
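
Here is a self-contained sketch of what that looks like. greet.sh is a hypothetical two-line script created in a scratch directory just for this demonstration.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

printf '%s\n' '#!/bin/sh' 'echo Hello, $1' > greet.sh
chmod +x greet.sh

./greet.sh world          # normal run
sh -x ./greet.sh world    # traced run
```

The traced run prints an extra line showing the echo command with $1 already replaced by world. Trace lines are typically marked with a leading +, so they are easy to tell apart from the script's real output.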

Example 7: Tracing a Script
Let’s return to our last example.
cd ~/playing/compilation
ls
You should see your script whatisit.sh from the earlier exercise.

Take a good look at it and remember what it looks like:
cat whatisit.sh
Just to remember how it looks when it is working, try
./whatisit.sh sayhello.* hello1 what*
./whatisit.sh *
Now trace the execution:
sh -x ./whatisit.sh sayhello.* hello1 what*
sh -x ./whatisit.sh *
You can see that each command is printed before it is actually executed.