Quoting Special Characters

Last modified: Aug 29, 2023

1 Special Characters

In the grep examples of the previous section, all of our searches were for single words consisting of alphabetic characters only.

But what if we wanted to search for a phrase (with blanks between the words) or for some pattern involving punctuation? That could be a problem because blanks and many punctuation characters have special meanings to the command shell, the program that reads our commands from the keyboard and launches the programs to carry out those commands.

For example, a blank normally means “we are done typing one command parameter and are ready to start on a new one.” So if we tried to use a grep command, which normally has the form

grep flags pattern one-or-more-file-paths

to search for the phrase “Old Dominion” in a file foo.txt, we might try writing

grep Old Dominion foo.txt

But what we would get is an error message complaining that grep could not find a file named “Dominion”, because the blank space after the ‘d’ in “Old” means “we are done typing one command parameter (the pattern) and are ready to start typing the next (the first of one-or-more-file-paths).” So “Dominion” would be interpreted as a file name rather than as part of the pattern to search for.
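One way to see this splitting for ourselves is the minimal sketch below. It relies on the fact that printf repeats its format once for each parameter it receives; the brackets are only there to make the parameter boundaries visible, and the quotes around the format are one of the quoting forms covered later in this lesson.

```shell
# Each blank ends one parameter and starts the next, so printf
# receives three separate parameters after its format string.
printf '[%s]\n' Old Dominion foo.txt
# prints: [Old]
#         [Dominion]
#         [foo.txt]
```

grep sees its parameters the same way: “Old” arrives as the pattern, and “Dominion” and “foo.txt” arrive as two file paths.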

Among the characters that the command shell will treat as “special” are blanks and

/ * < > | ? \ ; , ! $ ' ` "

We’ve already seen special uses for the first two of these, and will encounter some of the others in later lessons.

Because these characters tend to cause problems when we type them, we generally avoid unnecessary uses of them. For example, you’ll seldom see any of them as part of a file name. That’s not because it’s illegal to use them in file names. Unix is amazingly tolerant of what characters get put into file names, but only masochists would use these special characters because the command shell will interpret them as something else, making it difficult to type the file name.

An exception to that rule: because many of us work in a world of mixed Unix and Windows PC users, and Windows seems to encourage the use of blanks in file names, people working in Unix often have to deal with file names with embedded blanks. Again, that can cause issues. For example, if we have a file named “daily data.dat” and we wanted to copy it to a directory named “archive”, we might try this

cp daily data.dat archive/

only to receive error messages complaining that cp could find neither a file named “daily” nor a file named “data.dat”.

2 Quoting – Three Ways

What do we do if we need to type one of these special characters into a command but not have it treated specially? For example, suppose that we had a file foo.txt and we wished to list all the lines in that file that contained a "*". We can’t do

grep * foo.txt

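because * is special to the shell: it is the wild card that “matches anything”, so the shell would replace it with the names of the files in the current directory before grep ever started.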
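The sketch below shows that replacement happening. It assumes the mktemp command is available (it is on most Linux and macOS systems) and works in a throwaway directory holding two hypothetical files, foo.txt and bar.txt, so that the result is predictable.

```shell
scratch=$(mktemp -d)                  # throwaway directory for the demo
touch "$scratch/foo.txt" "$scratch/bar.txt"

# The cd inside $( ) is temporary. The unquoted * is replaced by the
# matching file names before echo starts, so echo never sees an asterisk.
expanded=$(cd "$scratch" && echo *)
echo "$expanded"                      # prints: bar.txt foo.txt

rm -r "$scratch"                      # clean up
```

In that directory, grep * foo.txt would actually run as grep bar.txt foo.txt foo.txt, searching foo.txt twice for the pattern “bar.txt”.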

What we need to do is to quote that special character somehow to prevent the command shell from treating it specially. There are three ways we can do this:

  1. We can place a backslash (\) in front of the special character:

    grep \* foo.txt
    

    Note that backslashes can quote themselves. So if we wanted to print on our terminal screen a backslash followed by an asterisk, we could write

    echo \\\*
    

    The first backslash quotes the second one. The third one quotes the asterisk.

  2. We can enclose all or part of the argument in single quotation marks. This suppresses the special meaning of every character between the quotes. Also, if the enclosed portion includes blanks, it combines what would otherwise have been seen as multiple parameters into a single parameter.

    grep Hello there foo.txt
    

    This would look in the files named “there” and “foo.txt” for any lines containing the word “Hello”.

    grep 'Hello there' foo.txt
    

    This would look in the file named “foo.txt” for any lines containing the phrase “Hello there”.

    grep 'Hello there!' foo.txt
    

    This would look in the file named “foo.txt” for any lines containing the phrase “Hello there!”. Note that, without the quotes, the “!” would have been treated as a special character.

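  3. Finally, we can enclose all or part of the argument in double quotation marks. This suppresses most special characters, but not all: $ keeps its special meaning, as do the backquote (`) and the backslash (and, in some interactive shells, the !). Like single quotes, double quotes also gather their contents into a single parameter.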
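The three forms can be compared side by side in a minimal sketch like the one below. The variable name greeting is hypothetical, chosen only for illustration, and printf is used because it prints each parameter it receives on its own line, making the boundaries visible.

```shell
greeting=hi    # hypothetical variable, used only to show $ expansion

# 1. Backslash: quotes the single character that follows it.
printf '[%s]\n' \\\* Hello\ there
# prints: [\*]
#         [Hello there]

# 2. Single quotes: every enclosed character is literal.
printf '[%s]\n' '$greeting *'
# prints: [$greeting *]

# 3. Double quotes: * is literal, but $greeting is still replaced.
printf '[%s]\n' "$greeting *"
# prints: [hi *]
```

In each case the quoted blank stays inside a single parameter instead of separating two of them.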

Example 1: Try This: Single versus Double Quotes
echo $USER *
echo '$USER *'
echo "$USER *"

Can you see the differences in how the two special characters are treated in each case?

2.1 Examples: Unconventional File Names

One of the common uses of quoting is in dealing with file names containing spaces (or other unusual characters). Unix users tend to avoid file names with spaces, but they aren’t actually illegal. And if you are hopping back and forth between Unix and Windows systems, you will find that Windows users love to put blank spaces inside file names. (Unless they are Windows programmers who use the Windows cmd program to do text-based commands. They hate spaces inside file names, too.)

Example 2: Try This: Unusual Characters in File Names

Give the commands:

cd ~/playing
ls
cp math.h "ax bx cx.h"
ls

You now have a new file. Notice anything odd about it? Let’s look inside it.

Give the command

more cx.h

That doesn’t work, even though the preceding ls listing makes it easy to guess that a file named cx.h exists.

Give the commands

more ax bx cx.h
more "ax bx cx".h
more ax\ bx\ cx.h

Which ones work, and why?

Now try this. Type more ax and then hit the Tab key to request automatic file name completion.

Notice how the automatic feature fills in appropriate quoting for you.
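If you would like to check your answers outside the playing directory, here is a minimal sketch. It assumes the mktemp command is available and creates its own empty file with the same name in a throwaway directory, so nothing in your account is touched.

```shell
scratch=$(mktemp -d)                  # throwaway directory for the demo
touch "$scratch/ax bx cx.h"           # one file whose name contains blanks

# Each ls below receives exactly one parameter: the complete file name.
listing=$(
    cd "$scratch" || exit
    ls "ax bx cx.h"                   # whole name in double quotes
    ls 'ax bx cx.h'                   # whole name in single quotes
    ls ax\ bx\ cx.h                   # each blank quoted by a backslash
    ls "ax bx cx".h                   # only part of the name quoted
)
echo "$listing"                       # the same name, once per ls command

rm -r "$scratch"                      # clean up
```

Quoting only part of a name works because the shell joins the quoted and unquoted pieces into one parameter as long as no unquoted blank comes between them.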

There are worse things than blanks that can legally occur in file names. In fact, just about any special character could be put into a file name if we were masochistic enough.

Example 3: Try This: Even More Unusual Characters in File Names

Give the commands:

cp "math.h" "ma*+;.h"
ls

Now suppose you want to access the new file. Try this:

echo ma*
echo ma**

Why did you get both files?

To get just the new one, again the answer is proper quoting. Try:

echo ma\**

Do you understand why the two asterisks are treated differently?

Why would

echo ma*\*

not do the same thing?