Redirection allows you to alter the behavior of individual commands by
changing where they get their input and where they store their output. Pipes
allow you to combine commands in interesting ways. Next we will look at a
pair of commands that also let you combine selective inputs and combinations
of commands. These two commands, xargs and find, are interesting in part
because they allow you to issue other commands from them.
xargs reads a list of file names from the standard input and fills those
file names into a command of your choosing. In its simplest form, you can
use xargs like this:
    xargs partial-command
In this form xargs will read a list of file names (paths) from the standard
input and will simply tack them on to the end of the partial command.
Where do the file names come from? You're not likely to type them
directly at the keyboard. (If you were going to do that, you would
probably just have typed the whole command.) So, usually, you will run
this with a list of file names that you have collected into a text file,
or you may pipe into xargs the output of a command that lists the files
you want.
Example 2.32. Try This:
    ls /bin
    ls /bin | xargs echo Issuing a command on
Most systems limit just how long a single command can get, and xargs tries
to be smart about the way it constructs commands. If the standard input
contains a very large number of files, xargs will break the list up into
pieces and run the command once for each piece. Look at the output above.
Can you see evidence that xargs has split the list into separate commands?
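A short listing may fit into a single command, so the splitting is easier to see with a much longer input. The following is a minimal sketch, assuming the seq utility is available (it is common but not part of POSIX). Each echo that xargs runs prints one line, so counting output lines counts the generated commands. The exact count depends on the system's limits.

```shell
# Feed 100000 short arguments to xargs. Every echo command that xargs
# generates prints exactly one line, so wc -l counts the commands.
# (tr strips the padding that some versions of wc add.)
lines=$(seq 1 100000 | xargs echo | wc -l | tr -d ' ')
echo "xargs ran echo $lines times"
```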
You can control the maximum number of files that xargs will pack into one
command using the -n flag. The most common reason for doing this is that
not all Unix commands work on arbitrarily long lists of files. Some work
only on a single file, making -n 1 a useful option.
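As a small illustration of -n, the sketch below feeds five made-up names to xargs and allows at most two per generated command. The names a through e are placeholders, not real files.

```shell
# Pack at most two arguments into each generated echo command.
# Five inputs therefore produce three commands: two full and one partial.
out=$(printf '%s\n' a b c d e | xargs -n 2 echo)
echo "$out"
```

Each line of output corresponds to one echo command that xargs ran.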
In this basic form, xargs tacks the file names onto the end of the
generated command. But sometimes you might want the file names placed into
the middle of the command. The -i option permits that. If xargs is executed
with a -i flag, then it looks in your partial command for the characters
“{}” and places your files there (one at a time, as if you had said
“-n 1”). GNU xargs documents -i as deprecated in favor of the equivalent
-I {}, which is also the form specified by POSIX.
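The placeholder substitution can be tried without touching any files. This sketch uses -I {}, the POSIX spelling of the same idea as -i, and two made-up names (alpha and beta) in place of real file names.

```shell
# Each input line replaces {} in the middle of the command,
# and the command runs once per input line.
out=$(printf '%s\n' alpha beta | xargs -I {} echo "processing {} now")
echo "$out"
```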
As an example of using xargs, suppose that you have a directory with a
large number of data files. You want to copy those files into a new
directory. However, you have edited many of these files, so the original
directory is littered with backup files left by the editors. You would
prefer not to copy those backups. The backups can be recognized because
some end in “.bak” and some end in “~”.
Now, normally, you would copy files from one directory to another like this:
    cp oldDirectory/* newDirectory/
But there is no obvious way to rewrite the wildcard pattern * to exclude
the backup files (wildcards are great for including things, not so great
for excluding them).
But a list of file names is nothing more than text, and we have some
powerful tools like grep for editing and selecting text.
Example 2.35. Try This:
    mkdir ~/xargs
    cd ~cs252/Assignments/xargs
    ls
    ls | grep -v '\.bak$'
    ls | grep -v '\.bak$' | grep -v '~$'
    ls | grep -v '\.bak$' | grep -v '~$' | xargs -i cp {} ~/xargs
    ls ~/xargs
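If the course directory is not available, the same pipeline can be tried in a scratch area. This is a minimal sketch under a few assumptions: the file names are invented, the directories come from mktemp -d, and -I {} is used as the POSIX spelling of -i.

```shell
# Build a throwaway source directory holding data files and editor backups.
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/a.dat" "$src/b.dat" "$src/a.dat.bak" "$src/b.dat~"

# List the names, filter out both kinds of backup, and copy what remains.
(cd "$src" && ls | grep -v '\.bak$' | grep -v '~$' | xargs -I {} cp {} "$dest")

# Only the two data files should have been copied.
copied=$(ls "$dest")
echo "$copied"

rm -rf "$src" "$dest"
```

Like the original pipeline, this sketch assumes file names without spaces or newlines, since xargs treats whitespace in its input specially.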
Wildcards give us a way to describe a number of files at once. But wildcards have a limitation. They can only describe one "level" of directories at a time. You can write a wildcard expression to look at a variety of files in one directory, or at a variety of files in one or more subdirectories of that directory, or in one or more sub-subdirectories of that one. But you cannot write a wildcard expression that will simultaneously describe files in a directory and in its subdirectories.
Some commands will try to help with this. Both cp and rm, for example,
offer a -r flag (“r” for “recursive”) that will descend into an arbitrary
number of levels of subdirectories, but these are all-or-nothing
selections. You can't be very selective about what files get processed
this way.
This is where find comes in. find is the Swiss army knife of Unix commands.
It provides all kinds of ways to select files, no matter how deep they are
in your directory structure. It provides a variety of things it can do with
selected files, or it can fill their names into an arbitrary command in a
manner similar to xargs -i.
The general form of a find command is
    find list-of-files-and-directories list-of-actions
find looks at each file and directory given in the list. For directories,
it also looks at all files and directories inside those, descending as far
as it can from directory to directory.
The actions in the command are all given as flags (beginning with “-”). Some actions will “do something” to a file. Others are used to select which files will be passed on to the later actions in the list.
The most common selection action is -name, which is given a wildcard
expression to match file names against.
Example 2.37. Try This:
    ls /usr/include/w*.h
    ls /usr/include/*/w*.h
    find /usr/include -name 'w*.h'
The wildcard expression for -name must be quoted, because you don't want
the command shell to expand it before it launches find.
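The contents of /usr/include vary from system to system, so here is a self-contained sketch of the same idea. It builds a small directory tree with invented header names and lets find match the quoted pattern at every level.

```shell
# Create a scratch tree with matching files at three different depths.
top=$(mktemp -d)
mkdir -p "$top/sub/deeper"
touch "$top/w1.h" "$top/sub/w2.h" "$top/sub/deeper/w3.h" "$top/sub/x.h"

# The quotes keep the shell from expanding w*.h itself; find receives
# the pattern unchanged and applies it in every directory it visits.
found=$(cd "$top" && find . -name 'w*.h' | LC_ALL=C sort)
echo "$found"

rm -rf "$top"
```

A single wildcard such as w*.h or */w*.h would each have covered only one of these levels.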
Other useful ways to select files include -type, which chooses different
"types" of files. Directories are type 'd' and ordinary files are type 'f'.
Example 2.38. Try This:
    find /usr/include -type d -name 'u*'
    find /usr/include -type d -name 'u*' -ls
Note that all the files listed are directories.
    find /usr/include -type f -name 'u*'
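The effect of combining -type with -name can also be seen on a tiny scratch tree. In this sketch the names util and unistd.h are placeholders chosen only because both start with “u”.

```shell
# One directory and one ordinary file whose names both match 'u*'.
top=$(mktemp -d)
mkdir "$top/util"
touch "$top/unistd.h" "$top/util/misc.h"

# The same name pattern, narrowed first to directories, then to files.
dirs=$(cd "$top" && find . -type d -name 'u*')
files=$(cd "$top" && find . -type f -name 'u*')
echo "directories: $dirs"
echo "files: $files"

rm -rf "$top"
```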
You can also select files based on how long ago they were modified.
Example 2.39. Try This:
    find ~ -mtime +7
    find ~ -mtime -7
One of these lists only files you have modified within the past 7 days. The other lists files whose last modification is more than 7 days in the past.
Not sure which is which? Try
    find ~ -mtime +7 | xargs ls -ld
    find ~ -mtime -7 | xargs ls -ld
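Another way to check the meaning of the two signs is to create files with known ages. This sketch assumes GNU touch, whose -d option accepts a relative date such as '10 days ago'. Other versions of touch may need an explicit timestamp instead.

```shell
# One file modified just now and one backdated by ten days.
top=$(mktemp -d)
touch "$top/new.txt"
touch -d '10 days ago' "$top/old.txt"   # GNU touch syntax (an assumption)

# +7 selects files last modified more than 7 days ago;
# -7 selects files modified within the past 7 days.
older=$(cd "$top" && find . -type f -mtime +7)
newer=$(cd "$top" && find . -type f -mtime -7)
echo "older: $older"
echo "newer: $newer"

rm -rf "$top"
```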
What can find do to files it selects? The simplest possibility is to print
the file name, which is done by the action “-print”. In fact, that's the
behavior we have seen in all the examples so far, because it's the default
if you don't do anything else to the selected files. Sometimes, though, you
use -print explicitly because you want to print a file name and also do
something else to the file.
You can get a bit more information than just the name with the -ls action used in an earlier example, which prints a long listing for each selected file.
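For instance, the -ls action that appeared in an earlier example prints a long listing for each selected file. It is supported by GNU and BSD find, though it is not part of POSIX. A minimal sketch, using an invented file name:

```shell
# Create one small file and ask find for a long listing of it.
top=$(mktemp -d)
printf 'hello\n' > "$top/note.txt"

# -ls prints permissions, owner, size, date, and path for each match.
listing=$(find "$top" -type f -name 'note.txt' -ls)
echo "$listing"

rm -rf "$top"
```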
The most powerful use of find comes from the -exec action, which allows you
to specify an arbitrary Unix command that you want applied to selected
files. The command is terminated by an escaped semicolon (“\;”) and should
contain the characters “{}” at the point where you want to insert the file
name.
Example 2.41. Try This:
    cp ~cs252/Assignments/xargs/* ~/xargs
    ls ~/xargs
    find ~/xargs -name '*.bak' -print -exec rm {} \;
    ls ~/xargs
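The same cleanup can be rehearsed safely in a scratch directory before running it on real files. In this sketch the file names are invented. -print reports each match and -exec then removes it.

```shell
# A scratch directory with one file to keep and two backups to delete.
top=$(mktemp -d)
touch "$top/data.txt" "$top/data.txt.bak" "$top/notes.bak"

# For each match, print its name, then run rm on it.
removed=$(cd "$top" && find . -name '*.bak' -print -exec rm {} \; | LC_ALL=C sort)
left=$(ls "$top")
echo "removed: $removed"
echo "left: $left"

rm -rf "$top"
```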
Closely related is -ok, which asks for permission before applying a
command.
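Because many commands, such as grep, can be used to test files for certain
properties, -exec can be used to select files as well as to operate on
them. An -exec action counts as true only when its command exits with
status zero, so any later actions run only for files that pass the test.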
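As a sketch of -exec used as a selector, the command below prints only the files whose contents include the word TODO. The file names and contents are made up for the illustration. grep -q prints nothing and simply reports through its exit status whether a match was found.

```shell
# Two files, only one of which contains the text being searched for.
top=$(mktemp -d)
printf 'TODO: fix this\n' > "$top/a.txt"
printf 'all done\n' > "$top/b.txt"

# -print is reached only for files where grep -q exits with status 0.
matches=$(cd "$top" && find . -type f -exec grep -q 'TODO' {} \; -print)
echo "$matches"

rm -rf "$top"
```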
There are many other possible actions as well. See the man page for
details.