Awk Command Daily Work Summary

awk is designed for data extraction and reporting.

awk is a programming language in its own right and comes with a rich set of built-in tools. It lets a programmer write tiny but effective programs as statements, each of which defines a text pattern to search for in every line of the input and the action to take when a line matches.

References: GeeksforGeeks, awk in 20 mins

WHAT CAN WE DO WITH AWK?

  1. AWK Operations: (a) Scans a file line by line (b) Splits each input line into fields (c) Compares input line/fields to pattern (d) Performs action(s) on matched lines

  2. Useful For: (a) Transform data files (b) Produce formatted reports

  3. Programming Constructs: (a) Format output lines (b) Arithmetic and string operations (c) Conditionals and loops
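As a minimal sketch of the scan, split, match, act cycle listed above (the sample names and scores are made up for illustration):

```shell
# Hypothetical sample input: a name and a score separated by a space.
input='alice 90
bob 55
carol 72'

# awk reads the input line by line and splits each line into fields ($1, $2, ...).
# The pattern $2 > 60 selects lines; the action prints a formatted report line.
report=$(printf '%s\n' "$input" | awk '$2 > 60 {printf "%-6s %3d\n", $1, $2}')
printf '%s\n' "$report"
```

Only the lines for alice and carol pass the pattern, so bob is left out of the report.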

The dated entries below are scattered notes from things I run into in day-to-day work:

################################################################
#   Date           Description
#   09/11/2019     skip first line
#   02/28/2019     print last column
#   02/26/2019     awk remote execution
#
################################################################

02/26/2019

When an awk command is wrapped in double quotes inside a script, the shell may expand $2 before awk ever sees it:

ssh -o StrictHostKeyChecking=no sshrm1 "ifconfig eth0 | grep \"inet\" | awk '{print $2}'"

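The command above does not return the right data, because the local shell expands $2 inside the double quotes before ssh runs. Escape the $ with a preceding \ instead: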

ssh -o StrictHostKeyChecking=no sshrm1 "ifconfig eth0 | grep \"inet\" | awk '{print \$2}'"

Another method is to run awk locally on the output returned by ssh, rather than wrapping awk inside the ssh command.
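The three variants can be compared locally without a remote host. The sketch below uses sh -c as a stand-in for ssh (an assumption for illustration, since both receive the command as one double-quoted string), and the sample line is hypothetical:

```shell
# Hypothetical stand-in for the output of "ifconfig eth0 | grep inet" on the remote host.
sample='inet 10.0.0.5 netmask 255.255.255.0'

# Give the outer shell its own $2 so the unwanted expansion is visible.
set -- first second

# Unescaped: the outer shell replaces $2 with "second" inside the double quotes,
# so awk runs '{print second}' and prints an unset (empty) awk variable.
unescaped=$(sh -c "echo '$sample' | awk '{print $2}'")

# Escaped: \$2 reaches awk as $2, the second field.
escaped=$(sh -c "echo '$sample' | awk '{print \$2}'")

# Alternative: keep awk outside the quoted command and run it on the returned output.
piped=$(sh -c "echo '$sample'" | awk '{print $2}')

echo "unescaped=[$unescaped] escaped=[$escaped] piped=[$piped]"
```

With the host from the example above, the last variant would look like ssh -o StrictHostKeyChecking=no sshrm1 "ifconfig eth0 | grep inet" | awk '{print $2}', where the single-quoted awk program needs no escaping.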

02/28/2019

Print the last whitespace-separated column:

## NF: number of fields in the current line
awk '{print $NF}' <file>
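A small sketch of how NF behaves when lines have different field counts; the sample input is made up, and $(NF-1) is shown only as a related idiom:

```shell
# Hypothetical sample: lines with different numbers of fields.
input='a b c
d e'

# $NF is the last field of each line, whatever the field count is.
last=$(printf '%s\n' "$input" | awk '{print $NF}')

# $(NF-1) is the field just before the last one.
second_last=$(printf '%s\n' "$input" | awk '{print $(NF-1)}')

printf 'last:\n%s\nsecond to last:\n%s\n' "$last" "$second_last"
```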

09/11/2019

Skip the first line:

## NR: number of the current line (records read so far)
awk 'NR>1 {print $1}' <file>

You can use conditions such as NR>=2, NR<5, or NR==3 to limit the range of lines.
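For instance, two NR conditions can be combined with && to select a window of lines. This is a sketch on made-up input:

```shell
# Hypothetical sample: a header line followed by four data rows.
input='header
row1
row2
row3
row4'

# NR>=2 && NR<5 keeps lines 2 through 4; the action prefixes each with its line number.
range=$(printf '%s\n' "$input" | awk 'NR>=2 && NR<5 {print NR": "$0}')
printf '%s\n' "$range"
```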

Quick Start

## check version
awk -W version
## --version also works, at least with GNU awk
awk --version

An awk program can have a BEGIN block and an END block; the rules between them form the body:

## BEGIN and END each run only once
## the body runs once per input line
awk 'BEGIN {print "start..."} {print NR, $0} END {print NR}' /etc/hosts

## BEGIN
start...
## body
1 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
2 ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
3 172.16.163.83 myk8s1.fyre.ibm.com myk8s1
4 172.16.182.156 myk8s2.fyre.ibm.com myk8s2
5 172.16.182.187 myk8s3.fyre.ibm.com myk8s3
## END
5
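Because the body runs per line and END runs once afterwards, the three blocks are a natural fit for accumulating a value and reporting it at the end. A minimal sketch on made-up numbers (the variable name sum is arbitrary):

```shell
# Hypothetical sample: one number per line.
input='10
20
30'

# BEGIN prints a banner once, the body adds each line's first field to sum,
# and END prints the line count and the total once all input is read.
result=$(printf '%s\n' "$input" | awk 'BEGIN {print "start..."} {sum += $1} END {print "lines:", NR, "sum:", sum}')
printf '%s\n' "$result"
```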

We can also put the awk program into a script file and pass it with -f:

awk -f file.awk /etc/passwd

file.awk content:

## FS specifies the field delimiter used to parse each line; by default awk splits on whitespace
BEGIN { FS=":" ; print "User Name:"}
## $3 > 999 is the pattern (condition); the action runs only on matching lines
## NR is a built-in awk variable
$3 > 999 {print NR, $0; count++ }
END {print "Total Lines: " NR " Count Lines: " count}
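The same idea can be tried as a one-liner on sample data. In this sketch the passwd-style lines are hypothetical, -F: replaces the FS assignment in BEGIN, and count is wrapped as (count+0) so it prints 0 rather than an empty string when no line matches:

```shell
# Hypothetical passwd-style sample; the third field is the UID.
input='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000::/home/alice:/bin/bash
bob:x:1001:1001::/home/bob:/bin/bash'

# -F: on the command line has the same effect as FS=":" in BEGIN.
# The pattern counts lines whose UID is above 999; END reports both totals.
summary=$(printf '%s\n' "$input" | awk -F: '$3 > 999 {count++} END {print "Total Lines: " NR " Count Lines: " (count+0)}')
printf '%s\n' "$summary"
```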

Let’s see more examples. sed could perform some of the same tasks, but awk is more readable.

## set "," as delimiter, $1 to uppercase, $2 to lowercase
## toupper and tolower are built-in awk functions
awk -F"," '{print toupper($1), tolower($2), $3}' <file>
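A quick run on made-up CSV lines shows the effect. Note that print joins its arguments with the output separator OFS, which defaults to a space; the second command sets OFS to keep commas in the output:

```shell
# Hypothetical CSV sample: first name, last name, age.
input='Alice,SMITH,30
bob,Jones,25'

# Fields are split on ","; the output fields are joined with a space (default OFS).
spaced=$(printf '%s\n' "$input" | awk -F"," '{print toupper($1), tolower($2), $3}')

# Setting OFS with -v keeps the comma as the output delimiter too.
csv=$(printf '%s\n' "$input" | awk -F"," -v OFS="," '{print toupper($1), tolower($2), $3}')

printf '%s\n%s\n' "$spaced" "$csv"
```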

lastlog.awk: an awk script that shows login statistics for non-root users

## skip lines that match any of these:
!(/Never logged in/ || /^Username/ || /^root/) {
    cnt++
    ## the line has exactly 8 fields
    if (NF == 8)
        printf "%8s %2s %3s %4s\n", $1, $5, $4, $8
    else
        printf "%8s %2s %3s %4s\n", $1, $6, $5, $9
}
END {
    print "==============================="
    print "Total # of users processed: " cnt
}
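The script is presumably meant to be fed the output of the lastlog command, for example lastlog | awk -f lastlog.awk. Since that output varies by system, the sketch below runs the same rules inline on hypothetical sample lines that imitate it; the shortened END message is only for brevity:

```shell
# Hypothetical sample imitating lastlog output; real output varies by system.
input='Username         Port     From             Latest
root             pts/0    10.0.0.1         Mon Feb 25 10:00:00 -0800 2019
alice            pts/1    10.0.0.2         Tue Feb 26 09:30:00 -0800 2019
bob              tty1                      Wed Feb 27 08:15:00 -0800 2019
daemon                                     **Never logged in**'

# Same filter and field selection as lastlog.awk:
# lines with 8 fields lack the From column, so the date fields shift left by one.
report=$(printf '%s\n' "$input" | awk '
!(/Never logged in/ || /^Username/ || /^root/) {
    cnt++
    if (NF == 8)
        printf "%8s %2s %3s %4s\n", $1, $5, $4, $8
    else
        printf "%8s %2s %3s %4s\n", $1, $6, $5, $9
}
END { print "Total: " cnt }')
printf '%s\n' "$report"
```

In this sample only alice and bob are reported: the header, the root line, and the never-logged-in account are all skipped by the negated pattern.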