Filters & Text Manipulation Tools
1. Printing: pr-sh
% ls /home/cs476 | pr -h "--cs 476-" -l 20 -2 -n -d | more -20
-h header, -l page length, -2 columns, -n add line numbers, -d double space
2. File Comparison: compare-sh
% paste f1 f2
a       a
b       b
c       d
        e
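% cmp f1 f2
f1 f2 differ: char 5, line 3
% comm f1 f2
                a
                b
c
        d
        e
% comm -12 f1 f2
a
b
comm prints three columns: lines only in f1, lines only in f2, lines in both.
-12 suppresses columns 1 and 2, leaving only the lines common to both files.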
% diff f1 f2
3c3,4
< c
---
> d
> e
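The column-suppression flags of comm can be combined to answer the three usual questions about two sorted files. The following is a minimal sketch that recreates f1 and f2 in a temporary directory (the directory and variable names are assumptions made for illustration) and also shows cmp -s, which reports only through its exit status.

```shell
# Recreate the two sample files used in this section.
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/f1"
printf 'a\nb\nd\ne\n' > "$tmp/f2"

# comm expects sorted input; these files already are.
common=$(comm -12 "$tmp/f1" "$tmp/f2")    # lines in both files
only_f1=$(comm -23 "$tmp/f1" "$tmp/f2")   # lines only in f1
only_f2=$(comm -13 "$tmp/f1" "$tmp/f2")   # lines only in f2

printf 'common:\n%s\nonly in f1:\n%s\nonly in f2:\n%s\n' "$common" "$only_f1" "$only_f2"

# cmp -s prints nothing; only its exit status says whether the files match.
if cmp -s "$tmp/f1" "$tmp/f2"; then same=yes; else same=no; fi
echo "identical: $same"
rm -rf "$tmp"
```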
3. Head & Tail: headtail-sh
% cat f1
a
b
c
% head -n 1 f1
a
% tail -n 1 -f f1
c
/usr/bin/tail options:
-n number of lines, -f follow the file (keep waiting for appended lines), -r reverse lines
Exercise: getting the middle middle-sh
% head -n 1 f1 > h
% tail -n 1 f1 > t
% comm -23 f1 h > f1_h
% comm -23 f1_h t > m
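The comm-based solution relies on f1 being sorted, since comm expects sorted input. As an alternative sketch that does not depend on ordering, sed can delete the first and the last line in one pass (the temporary file and the variable name are illustrative assumptions).

```shell
# Sample file from the Head & Tail section.
f=$(mktemp)
printf 'a\nb\nc\n' > "$f"

# 1d deletes line 1; $d deletes the last line.
middle=$(sed '1d;$d' "$f")
echo "$middle"
rm -f "$f"
```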
4. Cut & Paste: cutpaste-sh
% cat fields
Wahab, Hussein
Maly, Kurt
% cut -d, -f2 fields > F
% cat F
Hussein
Kurt
-d delimiter, -f field number
% cut -d, -f1 fields > L
% cat L
Wahab
Maly
% paste -d " " F L > names
% cat names
Hussein Wahab
Kurt Maly
% cat names emails
Hussein Wahab
Kurt Maly
wahab@cs.odu.edu
maly@cs.odu.edu
% paste names emails
Hussein Wahab wahab@cs.odu.edu
Kurt Maly maly@cs.odu.edu
% cat info
Hussein Wahab
wahab@cs.odu.edu
x4512
Kurt Maly
maly@cs.odu.edu
x3915
% paste -s -d":;\n" info > info2
-s join successive lines of one file, -d list of delimiters (used in rotation)
% cat info2
Hussein Wahab:wahab@cs.odu.edu;x4512
Kurt Maly:maly@cs.odu.edu;x3915
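The cut and paste steps of this section can be replayed end to end. This is a sketch under the assumption that the files live in a temporary directory; the sed step is an addition that removes the blank which cut leaves in front of field 2 when the input is written as "Wahab, Hussein".

```shell
tmp=$(mktemp -d)
printf 'Wahab, Hussein\nMaly, Kurt\n' > "$tmp/fields"

# Field 2 keeps the blank that follows the comma, so strip it with sed.
cut -d, -f2 "$tmp/fields" | sed 's/^ //' > "$tmp/F"
cut -d, -f1 "$tmp/fields" > "$tmp/L"
names=$(paste -d ' ' "$tmp/F" "$tmp/L")
echo "$names"

# paste -s joins the lines of one file; the delimiters : ; newline are used
# in rotation, so every third line ends a record.
printf 'Hussein Wahab\nwahab@cs.odu.edu\nx4512\nKurt Maly\nmaly@cs.odu.edu\nx3915\n' > "$tmp/info"
info2=$(paste -s -d ':;\n' "$tmp/info")
echo "$info2"
rm -rf "$tmp"
```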
5. Sort: sort-sh
% cat sortdata
Wahab,Hussein
Maly,Kurt
Wahab,Hussein
Maly,Kurt
% sort sortdata
Maly,Kurt
Maly,Kurt
Wahab,Hussein
Wahab,Hussein
% sort -u sortdata
Maly,Kurt
Wahab,Hussein
% sort -t, -k 2 sortdata
Wahab,Hussein
Wahab,Hussein
Maly,Kurt
Maly,Kurt
% paste f1 f2
a       a
b       b
c       d
        e
% sort -u -m f1 f2
a
b
c
d
e
-u unique, -t field separator, -k field number, -m merge
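A key given as -k 2 extends from field 2 to the end of the line; writing -k 2,2 restricts it to that one field. The sketch below uses the sortdata contents held in a shell variable (an assumption made to keep the example self-contained) and adds sort -c, which only checks whether the input is already in order.

```shell
data=$(printf 'Wahab,Hussein\nMaly,Kurt\nWahab,Hussein\nMaly,Kurt\n')

# -k 2,2 limits the key to field 2; -u keeps one line per distinct key.
by_first=$(printf '%s\n' "$data" | sort -t, -k 2,2 -u)
echo "$by_first"

# -c writes no sorted output; exit status 0 means the input was already sorted.
if printf '%s\n' "$data" | sort -c 2>/dev/null; then sorted=yes; else sorted=no; fi
echo "already sorted: $sorted"
```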
6. Uniq: uniq-sh
% cat grades
1: A
2: B
3: C
4: D
5: F
6: A
7: A
8: B
9: F
% cut -d: -f2 grades | sort | uniq -c
3 A
2 B
1 C
1 D
2 F
-c count the occurrences of each value
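uniq only collapses adjacent duplicate lines, which is why the pipeline sorts first. A small sketch with the grades data held in a variable (an assumption for self-containment) shows the difference; awk is used only to normalize the padded columns that uniq -c prints.

```shell
grades=$(printf '1: A\n2: B\n3: C\n4: D\n5: F\n6: A\n7: A\n8: B\n9: F\n')

# Sorted input: one group per grade, rewritten as grade=count.
hist=$(printf '%s\n' "$grades" | cut -d: -f2 | sort | uniq -c | awk '{ print $2 "=" $1 }')
echo "$hist"

# Unsorted input: repeated grades that are not adjacent form separate groups.
unsorted_groups=$(printf '%s\n' "$grades" | cut -d: -f2 | uniq -c | wc -l | tr -d ' ')
echo "groups without sort: $unsorted_groups"
```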
7. Translate: translate-sh
% cat fields | tee /dev/tty | tr 'a-z' 'A-Z' | tee /dev/tty | tr 'A-Z' 'a-z'
Wahab, Hussein
Maly, Kurt
WAHAB, HUSSEIN
MALY, KURT
wahab, hussein
maly, kurt
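When no terminal is attached, the same idea works with tee writing the intermediate stream to a file instead of /dev/tty. This is a sketch assuming a temporary directory and illustrative variable names.

```shell
tmp=$(mktemp -d)
printf 'Wahab, Hussein\nMaly, Kurt\n' > "$tmp/fields"

# tee saves the upper-case stream to a file while passing it on to the next tr.
lower=$(tr 'a-z' 'A-Z' < "$tmp/fields" | tee "$tmp/upper" | tr 'A-Z' 'a-z')
upper=$(cat "$tmp/upper")
echo "$upper"
echo "$lower"
rm -rf "$tmp"
```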
8. Egrep: egrep-sh
% ls -l /home/cs476 | egrep "^d.*[Mm]ail.*"
drwx------ 2 cs476 cs476 512 Oct 22 1999 Mail
drwx------ 2 cs476 cs476 512 Dec 9 2003 mail
drwx------ 2 cs476 cs476 512 Nov 3 1998 nsmail
% ypcat passwd | egrep "^cs[5-9][0-9]+:" | cut -d: -f1 | sort -u | tee /dev/tty | wc -l
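The account-counting pipeline can be tried without NIS by feeding it a few passwd-style lines. The records below are hypothetical stand-ins for ypcat passwd output, and grep -E is used as the POSIX spelling of egrep. The last command illustrates that a bar inside a bracket expression is an ordinary character, not alternation.

```shell
# Hypothetical passwd-style lines standing in for `ypcat passwd` output.
passwd_sample='cs656:x:656:20:cs656 Class Account:/home/cs656:/bin/sh
cs554:x:554:20:Networking class:/home/cs554:/bin/sh
cs476:x:476:20:cs476:/home/cs476:/bin/sh
wahab:x:100:13:Dr. Wahab:/home/wahab:/bin/sh
cs656:x:656:20:cs656 Class Account:/home/cs656:/bin/sh'

# Logins that start with cs followed by a digit 5-9 and at least one more digit.
logins=$(printf '%s\n' "$passwd_sample" | grep -E '^cs[5-9][0-9]+:' | cut -d: -f1 | sort -u)
count=$(printf '%s\n' "$logins" | wc -l | tr -d ' ')
echo "$logins"
echo "count: $count"

# Inside [...] the bar is literal, so this pattern also matches "|ail".
bar_match=$(printf '|ail\n' | grep -E -c '[M|m]ail')
echo "lines matching with a literal bar: $bar_match"
```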
9. sed: sed-sh
% sed "s/^/000/; s/A/Excellent/; s/B/Very Good/; s/C/Good/; s/D/Pass/; s/F/Fail/" grades
Exercise: getting the middle midsed-sh
low=$1
high=$2
filename=$3
set `wc $filename`
filelen=$1
sed "1,$low d; $high,$filelen d" $filename
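Because sed understands $ as the address of the last line, the file length does not have to be computed with wc. The following is a sketch of that variant as a shell function; the function name middle is an assumption, and its arguments mirror those of midsed-sh.

```shell
# middle LOW HIGH FILE: print the lines strictly between LOW and HIGH.
middle() {
    low=$1 high=$2 file=$3
    # \$ reaches sed as $, the address of the last line.
    sed "1,${low}d; ${high},\$d" "$file"
}

f=$(mktemp)
printf '1\n2\n3\n4\n5\n6\n' > "$f"
result=$(middle 2 5 "$f")
echo "$result"
rm -f "$f"
```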
% more fac
Hussein Wahab:1:Networks
Kurt Maly:1:Digital Lib
Steve Olariu:1:Theory
Mike Oversteet:2:Software Engineering
Michele Weigle:3:Networking
Shun Toida:4:Graph Theory
Ravi Mukkamala:1:Distributed Systems
Mohammed Zubair:1:Digital Lib
Irwin Levenstien:2:Databases
Stewart Shen:4:Web Technology
3:Assistant Professor:starting rank
2:Associate Professor: middle rank
1:Hussein Wahab:Networks:Professor: top rank
1:Kurt Maly:Digital Lib:Professor: top rank
1:Mohammed Zubair:Digital Lib:Professor: top rank
1:Ravi Mukkamala:Distributed Systems:Professor: top rank
1:Steve Olariu:Theory:Professor: top rank
1:Stewart Shen:Web Technology:Professor: top rank
2:Irwin Levenstien:Databases:Associate Professor: middle rank
2:Mike Oversteet:Software Engineering:Associate Professor: middle rank
3:Michele Weigle:Networking:Assistant Professor:starting rank
4:Shun Toida:Graph Theory:Emeritus Professor: Retired
sort -t: -n -k2 fac > facS
sort -t: -n -k1 rank > rankS
join -t: -1 2 -2 1 facS rankS > join_result.all
join -o 1.1,2.2,1.3 -t: -1 2 -2 1 facS rankS > join_result.some
-n numeric sort
-1 2 -2 1 join on field 2 of file1 and field 1 of file2
-o output fields, each written as file.field
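The join step can be tried on a cut-down data set. In this sketch the three faculty rows come from the fac listing, while the rank file is a simplified, assumed version with only a number and a title; both files are sorted on their join fields first, as join requires.

```shell
tmp=$(mktemp -d)
# name:rank:area rows, and a simplified rank:title table (assumed contents).
printf 'Kurt Maly:1:Digital Lib\nShun Toida:4:Graph Theory\nIrwin Levenstien:2:Databases\n' > "$tmp/fac"
printf '2:Associate Professor\n1:Professor\n4:Emeritus Professor\n' > "$tmp/rank"

# Sort each file on the field it will be joined on.
sort -t: -k2,2 "$tmp/fac"  > "$tmp/facS"
sort -t: -k1,1 "$tmp/rank" > "$tmp/rankS"

# -o selects output columns as file.field: name, title, area.
joined=$(join -t: -1 2 -2 1 -o 1.1,2.2,1.3 "$tmp/facS" "$tmp/rankS")
echo "$joined"
rm -rf "$tmp"
```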
1. Selecting and printing lines: ex1-sh
Course accounts: print login and name:
% ypcat passwd | awk -F: ' /^cs[5-9][0-9]+/ { print $1, $5 } '
cs656 cs656 Class Account
cs554 ajay's grad Networking class
cs775 CS775 Grader Account
-F Field separator
Faculty accounts:
→ print login and name:
% ypcat passwd | awk -F: ' $4 == 13 { print $1 " --> " $5 } '
Shen --> Stewart Shen
Wahab --> Dr. wahab
→ print all fields except the password field:
% ypcat passwd | awk -F: ' $4 == 13 { $2 = "" ; print } '
→ print line number, login and name:
% ypcat passwd | awk -F: ' $4 == 13 { print ++count ": " $1 " --> " $5 } '
5: shen --> Stewart Shen
11: Wahab --> Dr. Wahab
Exercise: getting file name from a path pathtofile-sh
echo $1 | awk -F/ '{print $NF}'
Example:
% pathtofile-sh /home/wahab/public_html
public_html
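NF holds the number of fields in the current record, so $NF is the last field and $(NF-1) the one before it. The sketch below wraps the exercise in a function (the name basename_awk is an assumption) and shows one edge case: a trailing slash leaves an empty last field.

```shell
# Print the last /-separated component of a path.
basename_awk() {
    printf '%s\n' "$1" | awk -F/ '{ print $NF }'
}

file=$(basename_awk /home/wahab/public_html)
echo "$file"

# With a trailing slash the last field is empty.
empty=$(basename_awk /home/wahab/public_html/)
echo "[$empty]"

# $(NF-1) selects the parent directory component.
parent=$(printf '%s\n' /home/wahab/public_html | awk -F/ '{ print $(NF-1) }')
echo "$parent"
```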
ex2-sh:
ypcat passwd | awk -F: -f ex2-awk
ex2-awk:
$4 == 13 { print ++count, $1 " --> " $5 }
-f file containing instructions
Usage:
% ex2-sh
3. BEGIN and END: ex3-sh & ex3-awk
ex3-sh:
ypcat passwd | awk -F: -f ex3-awk
ex3-awk:
BEGIN {
    system("date");
    printf " HOME = %s \n", ENVIRON["HOME"]
}
$4 == 13 { print ++count, $1 " --> " $5 }
END {
    printf "Total number is %d\n", count
    printf "PATH = %s \n", ENVIRON["PATH"]
}
Usage:
% ex3-sh
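The order in which awk runs its BEGIN block, its pattern rules, and its END block can be seen on a small input. The records here are hypothetical passwd-style lines (group id 13 marks faculty, as in the examples above), and NR is printed to show the input line number of each match.

```shell
# Hypothetical passwd-style records.
sample='shen:x:201:13:Stewart Shen:/home/shen:/bin/sh
cs656:x:656:20:cs656 Class Account:/home/cs656:/bin/sh
wahab:x:100:13:Dr. Wahab:/home/wahab:/bin/sh'

report=$(printf '%s\n' "$sample" | awk -F: '
BEGIN     { print "faculty accounts" }               # runs before any input is read
$4 == 13  { print NR ": " $1 " --> " $5; count++ }   # runs for each matching record
END       { printf "Total number is %d\n", count }   # runs after the last record
')
echo "$report"
```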
1. Selecting and printing lines: ex1pl-sh & ex1-pl
ex1pl-sh:
ypcat passwd | ex1-pl
ex1-pl:
system("date");
print("BEGIN - finding cs accounts\n");
while (<>) {
    if (/^cs[4-9][0-9]+:/) {
        $count++;
        print("$. --> $_");
    }
}
print("END - total: $count\n");
$. Current line number, $_ Content of current line.
Usage:
% ex1pl-sh
2. Selecting fields: ex2pl-sh & ex2-pl
ex2pl-sh:
ypcat passwd | ex2-pl
ex2-pl:
print("ENV-HOME: $ENV{'HOME'} \n");
print("BEGIN - finding cs accounts \n");
$lineno = 0;
while (<>) {
    if (/^cs[4-9][0-9]+:/) {
        @_ = split(/:/);    # assign explicitly; newer Perl versions no longer fill @_ implicitly
        $count++;
        print("$. --> $_[0] $_[4] \n");
    }
}
print("END - total: $count \n");
Usage:
% ex2pl-sh
Exercise: getting file name from a path
% path2file-sh /home/wahab/public_html
public_html