Simple Introduction to UNIX
R.B. Lammers, updated Feb 3, 2003
This introduction assumes you know a little about UNIX (you know about
simple navigation of the directory structure and can view files with
ls, cd, more, pwd, cp, mv,
ln -s, chmod). See Simpler
Introduction to UNIX - The Basics for more details.
The following commands will allow you to process text files.
Oct 23, 2002: See also a recent article in Linux Journal: "Dogs" of the Linux Shell by L.J. Iacona. Several of the commands in the article appear quite useful (e.g. tac, fold, and dirname).
man       get the manual page for a UNIX command
          example: man uniq

cut       extract columns of data
          example: cut -f -3,5,7-9 -d ' ' infile1 > outfile1
          -f 2,4-6    select fields
          -c 35-44    select characters
          -d ':'      field delimiter (default is a tab)

sort      sort the lines of a file (Warning: by default, fields are
          separated at the transition from non-blank to blank characters;
          use -t to set a delimiter)
          example: sort -nr infile1 | more
          -n          numeric sort
          -r          reverse sort
          -k 3,5      sort key starts at field 3 and ends at field 5

wc        count lines, words, and characters in a file
          example: wc -l infile1
          -l          count lines
          -w          count words
          -c          count characters

paste     reattach columns of data
          example: paste infile1 infile2 > outfile2

cat       concatenate files together
          example: cat infile1 infile2 > outfile2
          -n          number lines
          -vet        show non-printing characters (good for finding problems)

uniq      remove duplicate lines (normally from a sorted file)
          example: sort infile1 | uniq -c > outfile2
          -c          prefix each line with the number of times it occurs
          -d          only show duplicated lines

join      perform a relational join on two files (both files must be
          sorted on the join field)
          example: join -1 1 -2 3 infile1 infile2 > outfile1
          -1 FIELD    join field of infile1
          -2 FIELD    join field of infile2

cmp       compare two files
          example: cmp infile1 infile2

diff or diff3
          compare 2 or 3 files and show the differences
          example: diff infile1 infile2 | more
          example: diff3 infile1 infile2 infile3 > outfile1

head      extract lines from a file counting from the beginning
          example: head -n 100 infile1 > outfile1

tail      extract lines from a file counting from the end
          example: tail -n +2 infile1 > outfile1
          -n N        output the last N lines (N is an integer)
          -n +N       output from line N to the end of the file
          (older systems write these as -N and +N, e.g. tail +2)

dos2unix  convert DOS-format text (CR/LF line endings) to UNIX format (the file is overwritten).
          example: dos2unix infile1

tr        translate characters; the example replaces each space with a
          newline character
          example: tr " " "[\012*]" < infile1 > outfile

grep      extract lines from a file based on search strings and regular
          expressions
          example: grep 'Basin1' infile1 > outfile2
          example: grep -E '15:20|15:01' infile1 | more

sed       search and replace parts of a file based on regular expressions
          example: sed -e 's/450/45/g' infile1 > outfile3

Regular Expressions

Regular expressions can be used with many programs including grep, sed, vi, emacs, perl, etc. The shell offers a simpler, related pattern syntax (wildcards) for matching file names, as in the ls examples below. Be aware that each program has variations on usage.

ls examples:

          ls Data*.txt
              list txt files beginning with Data
          ls Data4[5-9].ps
              list ps files beginning with Data numbered 45-49

sed examples: (these are the regex part of the sed command only)

          s/450/45/g
              search for '450', replace with '45' everywhere
          s/99/-9999\.00/g
              search for all '99', replace with '-9999.00'
          s/Basin[0-9]//g
              remove the word Basin followed by a single digit
          s/^12/12XX/
              search for '12' at the beginning of a line, insert XX after it
          s/Basin$//
              remove the word Basin if it is at the end of the line
          s/^Basin$//
              remove the word Basin if it is the only word on the line
          s/[cC]/100/g
              search for 'c' or 'C', replace with 100
          45,$s/\([0-9][0-9]\)\.\([0-9][0-9]\)/\2\.\1/g
              on lines 45 to the end of file, search for two digits followed
              by a '.' followed by two digits. Replace with the digit pairs
              reversed.
          2,$s/,\([^,]*\),/,\"\1\",/
              on all lines except the first, search for the first comma,
              followed by any text containing no comma, followed by a comma.
              Replace with the found text surrounded by double quotes.
          s/\([0-9][0-9]\):\([0-9][0-9]\):\([0-9][0-9][0-9][0-9]\)/Year = \3, Month = \2, Day = \1/
              search for 2 digits, followed by a colon, followed by 2 digits,
              followed by a colon, followed by 4 digits. Replace with text
              plus the values in a different order.

Pipes, standard input, standard output:

Redirecting standard output with ">" places the results of a command into the file named after the ">".
A new file is written (an existing file with the same name is overwritten). To append to an existing file, use ">>". Pipes ("|") connect multiple commands into a data stream: the standard output of one command becomes the standard input of the next.

For example, to count the number of times the string "Nile" occurs in the 3rd column of a file, run this (it prints a count for each distinct value containing "Nile"):

          cut -f 3 infile1 | sort | uniq -c | grep 'Nile'

or do this (it prints the total number of matching lines):

          cut -f 3 infile1 | grep 'Nile' | wc -l

From a global STN Attributes data set (tab delimited):

- extract all North American basins draining into the Atlantic Ocean
- select only columns 2,3,4,5,11,12,13, and 17
- replace all missing data values (either -99 or -999) with -9999.0
- remove duplicate lines
- sort by the first column
- number all lines sequentially
- save to a new file

          grep 'North America' STNAttributes.txt | grep 'Atlantic Ocean' \
            | cut -f 2-5,11-13,17 | sed -e 's/-999\|-99/-9999.0/g' \
            | sort | uniq | cat -n > NewSTNAttributes.txt

The \| alternation in the sed pattern is a GNU sed extension. Note that the pattern matches these digits anywhere on the line, so it would also alter part of a value such as -995.
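To try the two "Nile" counting pipelines above without a real data set, here is a minimal sketch on a tiny made-up file. The file names sample.txt and nile_count.txt and the four data rows are assumptions for illustration only; they are not part of the STN data set.

```shell
# Hypothetical sample data: three tab-delimited columns (id, continent, river).
printf 'b1\tAfrica\tNile\n'   >  sample.txt   # ">" creates or overwrites sample.txt
printf 'b2\tAfrica\tCongo\n'  >> sample.txt   # ">>" appends to it
printf 'b3\tAfrica\tNile\n'   >> sample.txt
printf 'b4\tAsia\tMekong\n'   >> sample.txt

# Count each distinct value in column 3, then keep only the Nile row.
cut -f 3 sample.txt | sort | uniq -c | grep 'Nile'

# Count the lines whose column 3 contains "Nile" and save the result to a file.
cut -f 3 sample.txt | grep 'Nile' | wc -l > nile_count.txt
cat nile_count.txt
```

Both pipelines report 2 for this sample. The first also prints the matched value next to the count, and the amount of leading white space in the uniq -c and wc -l output varies between systems.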