awk
Pattern-directed scanning and processing language for text manipulation.
Basic Usage
Print entire file.
awk '{ print }' [file]
Or:
awk '{ print $0 }' [file]
Field Separators
Default (whitespace)
awk '{ print $1, $2 }' [file]
Tab Delimiter
awk -F'\t' '{ print $1, $2 }' [file]
Comma Delimiter (CSV)
awk -F',' '{ print $1, $2 }' [file]
Custom Delimiter
awk -F':' '{ print $1, $3 }' /etc/passwd
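A minimal sketch of how the separator choice changes field splitting. The record below is a made-up passwd-style line (hypothetical user, not read from the real /etc/passwd), and the variable names are illustrative only.

```shell
# Hypothetical passwd-style record used only for illustration.
line='alice:x:1001:1001:Alice Example:/home/alice:/bin/sh'

# With -F':' the first and third fields are the login name and the UID.
by_colon=$(printf '%s\n' "$line" | awk -F':' '{ print $1, $3 }')

# With the default separator the same line splits at its single space instead.
by_space=$(printf '%s\n' "$line" | awk '{ print NF }')

echo "$by_colon"   # alice 1001
echo "$by_space"   # 2
```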
Print Columns
Specific Columns
First column:
awk '{ print $1 }' [file]
First and third columns:
awk '{ print $1, $3 }' [file]
Last Column
awk '{ print $NF }' [file]
Second to Last
awk '{ print $(NF-1) }' [file]
All But First Column
awk '{ $1=""; print $0 }' [file]
Or, with a loop (the form above leaves a leading space; this one leaves a trailing space):
awk '{ for(i=2; i<=NF; i++) printf "%s ", $i; print "" }' [file]
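The two forms above differ in where they leave stray whitespace. The sketch below makes that visible on a made-up input line and shows one possible variant without either stray space; the variant is an illustration, not the only way to do it.

```shell
line='a b c d'

# Assigning to $1 rebuilds $0 with OFS, so the emptied first field leaves a leading space.
blanked=$(printf '%s\n' "$line" | awk '{ $1=""; print $0 }')

# The loop prints a space after every field, including the last one.
looped=$(printf '%s\n' "$line" | awk '{ for(i=2; i<=NF; i++) printf "%s ", $i; print "" }')

# Variant: OFS between fields, ORS after the last.
# Caveat: a line with only one field produces no output at all here.
clean=$(printf '%s\n' "$line" | awk '{ for(i=2; i<=NF; i++) printf "%s%s", $i, (i<NF ? OFS : ORS) }')

# Brackets make the stray spaces visible.
printf '[%s]\n' "$blanked" "$looped" "$clean"
```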
Pattern Matching
Match Specific Value
Lines where first column equals "value":
awk '$1 == "value"' [file]
Lines where first column does not equal "value":
awk '$1 != "value"' [file]
Match Pattern
Lines containing "pattern":
awk '/pattern/' [file]
Lines NOT containing "pattern":
awk '!/pattern/' [file]
Regular Expression Match
awk '$1 ~ /^[0-9]+$/' [file]
Not matching:
awk '$1 !~ /^[0-9]+$/' [file]
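As a rough illustration of the field-level match, the sketch below runs both forms on made-up sample rows. The anchors ^ and $ force the whole first field to be digits, so "x7" and "3.5" are rejected.

```shell
# Made-up sample data: first column is sometimes a plain integer.
data='42 apples
x7 pears
007 plums
3.5 figs'

# Second column of rows whose first field is all digits.
numeric=$(printf '%s\n' "$data" | awk '$1 ~ /^[0-9]+$/ { print $2 }')

# Second column of the remaining rows.
other=$(printf '%s\n' "$data" | awk '$1 !~ /^[0-9]+$/ { print $2 }')

echo "$numeric"   # apples, plums (one per line)
echo "$other"     # pears, figs (one per line)
```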
Conditional Operations
Greater Than / Less Than
awk '$3 > 100' [file]
awk '$2 <= 50' [file]
Multiple Conditions (AND)
awk '$1 == "value" && $2 > 100' [file]
Multiple Conditions (OR)
awk '$1 == "value" || $2 > 100' [file]
Built-in Variables
| Variable | Description |
|---|---|
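| NR | Current record (line) number, counted across all input files |
| FNR | Current record number within the current file |
| NF | Number of fields in current record |
| FS | Input field separator (default: a single space, which splits on runs of whitespace) |
| OFS | Output field separator (default: a single space) |
| RS | Record separator (default: newline) |
| ORS | Output record separator (default: newline) |
| FILENAME | Current input file name |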
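A small sketch of how these variables behave across two inputs. The file names and contents are made up for the example. NR keeps counting from one file to the next, while FNR, its per-file counterpart, restarts at 1.

```shell
# Temporary, hypothetical input files.
tmp=$(mktemp -d)
printf 'a b\nc d e\n' > "$tmp/one.txt"
printf 'f\n' > "$tmp/two.txt"

# Columns printed: file name, overall record number, per-file record number, field count.
out=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR, NF }' one.txt two.txt)
echo "$out"
# one.txt 1 1 2
# one.txt 2 2 3
# two.txt 3 1 1

rm -rf "$tmp"
```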
Examples
Print line numbers:
awk '{ print NR, $0 }' [file]
Print number of fields per line:
awk '{ print NF }' [file]
BEGIN and END
BEGIN Block
Executed before processing any lines:
awk 'BEGIN { print "Header" } { print }' [file]
END Block
Executed after processing all lines:
awk '{ sum += $1 } END { print sum }' [file]
Arithmetic Operations
Sum Column
awk '{ sum += $2 } END { print sum }' [file]
Average
awk '{ sum += $1; count++ } END { if (count) print sum/count }' [file]
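A sketch of the average on made-up numbers, with a guard for empty input, where dividing by a zero count would otherwise be an error. The guard prints nothing rather than failing; printing 0 or a message would be an equally reasonable choice.

```shell
# Three sample values: (10 + 20 + 60) / 3
avg=$(printf '10\n20\n60\n' | awk '{ sum += $1; count++ } END { if (count > 0) print sum / count }')

# Empty input: count stays unset, so the END block prints nothing.
empty=$(printf '' | awk '{ sum += $1; count++ } END { if (count > 0) print sum / count }')

echo "$avg"        # 30
echo "[$empty]"    # []
```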
Count Lines
awk 'END { print NR }' [file]
Output Formatting
Custom Output Field Separator
awk 'BEGIN { OFS="|" } { print $1, $2, $3 }' [file]
Printf Formatting
awk '{ printf "%-10s %5d\n", $1, $2 }' [file]
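To make the padding visible, the sketch below uses the same width specifiers on made-up rows, with square brackets added around each field purely for illustration. %-10s left-justifies a string in 10 columns; %5d right-justifies an integer in 5.

```shell
table=$(printf 'apple 7\nkiwi 1234\n' | awk '{ printf "[%-10s][%5d]\n", $1, $2 }')
echo "$table"
# [apple     ][    7]
# [kiwi      ][ 1234]
```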
Advanced Examples
Remove Duplicate Lines
awk '!seen[$0]++' [file]
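How the one-liner works, sketched on made-up input: seen[$0]++ yields the count before incrementing, which is 0 (false) the first time a line appears, so the negation is true only for first occurrences. Input order is preserved and no sorting is needed.

```shell
# Lines b and a each repeat once.
unique=$(printf 'b\na\nb\nc\na\n' | awk '!seen[$0]++')
echo "$unique"
# b
# a
# c
```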
Print Lines Longer Than 80 Characters
awk 'length > 80' [file]
Print Specific Line Range
Lines 10 to 20:
awk 'NR>=10 && NR<=20' [file]
Calculate Column Sum by Group
awk '{ sum[$1] += $2 } END { for (key in sum) print key, sum[key] }' [file]
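A sketch of the grouped sum on made-up data. The iteration order of "for (key in sum)" is unspecified, so the example pipes the result through sort to get stable output; that extra step is an addition for the example, not part of the command above.

```shell
# Made-up rows: group name, then a number.
data='fruit 3
veg 5
fruit 4
veg 1'

totals=$(printf '%s\n' "$data" \
  | awk '{ sum[$1] += $2 } END { for (key in sum) print key, sum[key] }' \
  | LC_ALL=C sort)

echo "$totals"
# fruit 7
# veg 6
```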
Print Every Nth Line
Every 5th line:
awk 'NR % 5 == 0' [file]
Working with Multiple Files
Process Multiple Files
awk '{ print FILENAME, $0 }' file1.txt file2.txt
Join Files by Column
awk 'NR==FNR { a[$1]=$2; next } { print $0, a[$1] }' file1.txt file2.txt
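A worked sketch of the join on two made-up files (names and contents are hypothetical). NR==FNR holds only while the first file is being read, so that block loads a lookup table; every later record is printed with the matching value appended. Keys missing from the first file get an empty value, which shows up as a trailing space. Note that if the first file is empty, NR==FNR is also true for the second file and nothing is printed.

```shell
tmp=$(mktemp -d)
printf '1 alice\n2 bob\n' > "$tmp/names.txt"            # key, name
printf '2 admin\n1 user\n3 guest\n' > "$tmp/roles.txt"  # key, role

joined=$(awk 'NR==FNR { a[$1]=$2; next } { print $0, a[$1] }' \
  "$tmp/names.txt" "$tmp/roles.txt")

echo "$joined"
# 2 admin bob
# 1 user alice
# 3 guest        (no name for key 3, so only a trailing space is added)

rm -rf "$tmp"
```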
Common Use Cases
Extract Email Domains
Print the text after the first "@" on each line that contains one:
awk -F'@' '/@/ { print $2 }' [file]
Count Word Frequency
awk '{ for(i=1;i<=NF;i++) freq[$i]++ } END { for(word in freq) print word, freq[word] }' [file]
Convert CSV to TSV
Simple CSV only; quoted fields that contain commas are not handled:
awk -F',' 'BEGIN { OFS="\t" } { $1=$1; print }' input.csv
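Why the $1=$1 is there, sketched on made-up rows: awk only rebuilds $0 with OFS when a field is assigned, so without the self-assignment the line is printed unchanged. The last command illustrates the limit of splitting on commas: a quoted field containing a comma is counted as two fields.

```shell
row='id,name,city'

# Self-assignment forces $0 to be rebuilt with tabs between fields.
with_rebuild=$(printf '%s\n' "$row" | awk -F',' 'BEGIN { OFS="\t" } { $1=$1; print }')

# Without it, OFS is never applied and the commas remain.
without=$(printf '%s\n' "$row" | awk -F',' 'BEGIN { OFS="\t" } { print }')

# A quoted field with an embedded comma is split in two: 3 fields instead of 2.
quoted_nf=$(printf '%s\n' '"Doe, Jane",42' | awk -F',' '{ print NF }')

echo "$with_rebuild"
echo "$without"      # id,name,city
echo "$quoted_nf"    # 3
```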