Simple Introduction to UNIX
R.B. Lammers, updated Feb 3, 2003
This introduction assumes you know a little about UNIX (you know about
simple navigation of the directory structure and can view files with
ls, cd, more, pwd, cp, mv,
ln -s, chmod). See Simpler
Introduction to UNIX - The Basics for more details.
The following
commands will allow you to process text files.
Oct 23, 2002: See also a recent article in Linux Journal: "Dogs" of the Linux Shell by L.J. Iacona. Several of the commands in the article appear quite useful (e.g. tac, fold, and dirname).
man get manual page on a UNIX command
example: man uniq
cut extract columns of data
example: cut -f -3,5,7-9 -d ' ' infile1 > outfile1
-f 2,4-6 field
-c 35-44 character
-d ':' delimiter (default is a tab)
sort sort lines of a file (Warning: default delimiter is white space/character transition)
example: sort -nr infile1 | more
-n numeric sort
-r reverse sort
-k 3,5 start key
wc count lines, words, and characters in a file
example: wc -l infile1
-l count lines
-w count words
-c count characters
paste reattach columns of data
example: paste infile1 infile2 > outfile2
cat concatenate files together
example: cat infile1 infile2 > outfile2
-n number lines
-vet show non-printing characters (good
for finding problems)
uniq remove duplicate lines (normally from a sorted file)
example: sort infile1 | uniq -c > outfile2
-c show count of lines
-d only show duplicate lines
join perform a relational join on two files
example: join -1 1 -2 3 infile1 infile2 > outfile1
-1 FIELD join field of infile1
-2 FIELD join field of infile2
cmp compare two files
example: cmp infile1 infile2
diff or diff3 compare 2 or 3 files - show differences
example: diff infile1 infile2 | more
example: diff3 infile1 infile2 infile3 > outfile1
head extract lines from a file counting from the beginning
example: head -100 infile1 > outfile1
tail extract lines from a file counting from the end
example: tail +2 infile1 > outfile1
-n count from end of file (n is an integer)
+n count from beginning of file (n is an integer)
dos2unix convert dos-based characters to UNIX format (the file is
overwritten).
example: dos2unix infile1
tr translate characters - example shows replacement of spaces
with newline character
example: tr " " "[\012*]" < infile1 > outfile
grep extract lines from a file based on search strings and
regular expressions
example: grep 'Basin1' infile1 > outfile2
example: grep -E '15:20|15:01' infile1 | more
sed search and replace parts of a file based on regular
expressions
example: sed -e 's/450/45/g' infile1 > outfile3
Regular Expressions
Regular expressions can be used with many programs including ls, grep, sed,
vi, emacs, perl, etc. Be aware that each program has variations on usage.
ls examples:
ls Data*.txt
ls Data4[5-9].ps list ps files beginning with Data numbered 45-49
sed examples: (these are the regex part of the sed command only)
s/450/45/g search for '450' replace with '45' everywhere
s/99/-9999\.00/g search for all '99' replace with '-9999.00'
s/Basin[0-9]//g remove the word Basin followed by a single digit
s/^12/12XX/ search for '12' at the beginning of a line,
insert XX
s/Basin$// remove the word Basin if it is at the end of
the line.
s/^Basin$// remove the word Basin if it is the only word on
the line.
s/[cC]/100/g search for 'c' or 'C' replace with 100
45,$s/\([0-9][0-9]\)\.\([0-9][0-9]\)/\2\.\1/g
on lines 45 to the end of file, search for two digits
followed by a '.' followed by two digits. replace
with the digit pairs reversed.
2,$s/,\([^,]*\),/,\"\1\",/
on all lines except the first, search for a comma,
followed by any text, followed by a comma. replace
the found text surrounded by double quotes.
s/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9][0-9][0-9]\)/Year = \3, Month = \2, Day = \1/
search for 2 digits, followed by a colon, followed by 2 digits,
followed by a colon, followed by 4 digits. replace with
text plus values in a different order.
Pipes, standard input, standard output:
Standard output, ">", places the results of a command into the file named
after the ">". A new file will be written (an old file with the same name
will be removed). In order to append to an existing file use ">>".
Pipes allow you to connect multiple commands together to form a data stream.
For example, to count the number of times the string "Nile" occurs in the
3rd column of a file run this:
cut -f 3 infile1 | sort | uniq -c | grep 'Nile'
or do this:
cut -f 3 infile1 | grep 'Nile' | wc -l
From a global STN Attributes data set (tab delimited):
- extract all North American basins draining into the Atlantic Ocean
- select only columns 2,3,4,5,11,12,13, and 17
- replace all missing data values (either -99 or -999) with -9999.0
- remove duplicate lines
- sort by the first column
- number all lines sequentially
- save to a new file
grep 'North America' STNAttributes.txt | grep 'Atlantic Ocean' \
| cut -f 2-5,11-13,17 | sed -e 's/-99\|-999/-9999\.0/g' \
| sort | uniq | cat -n > NewSTNAttributes.txt
--