Notes on some awk tricks

When awk is given a field separator, what happens to lines that do not contain it?

A classic exercise: extract HH:MM from the file below

[root@Almalinux-VM1 awk]# cat HM
183.250.220.178|-l[20/jul/2017:10:35:14 +0800]|POST /audiosearch/search
HTTP/1.1|200|54|-lDalvik/1.6.0(linux;U;Android 4,4,4;Konka Android TV 638
Build/KTU84P)|-l-[5.069|5.001,0.005|www.kuyun.com|8771|172.21.19.67:8084,172.21.19.66:8084]
183.250.220.178|-l[20/jul/2017:10:35:14 +0800]|POST /audiosearch/search
HTTP/1.1|200|54|-lDalvik/1.6.0(linux;U;Android 4,4,4;Konka Android TV 638
Build/KTU84P)|-l-[5.069|5.001,0.005|www.kuyun.com|8771|172.21.19.67:8084,172.21.19.66:8084]

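My first idea: pipe the file into awk, select the lines whose second field contains the date with a regular expression ($2 ~ /2017/), and use gsub to delete everything around HH:MM. See below.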

[root@Almalinux-VM1 awk]# cat HM |awk 'BEGIN {FS="|"}$2 ~/2017/ {gsub("[0-9]{3}.*\[0-9]{4}:|:14.*search","");print $0}'
awk: cmd. line:1: warning: escape sequence `\[' treated as plain `['
10:35
10:35

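As you can see, this meets the requirement (the warning only reports that '\[' inside a string constant is read as a plain '[').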

Then I saw the model answer someone else had written:

[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="2017:|:14"}  {print $2}' HM
10:35


10:35

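This surprised me; I had no idea it could be written like that. Here FS is a regular expression that matches either '2017:' or ':14'. It raises a question: some lines in the file contain neither '2017:' nor ':14', so how does awk process those lines?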

[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="2017:|:14"}  {print $1}' HM
183.250.220.178|-l[20/jul/
HTTP/1.1|200|54|-lDalvik/1.6.0(linux;U;Android 4,4,4;Konka Android TV 638
Build/KTU84P)|-l-[5.069|5.001,0.005|www.kuyun.com|8771|172.21.19.67:8084,172.21.19.66:8084]
183.250.220.178|-l[20/jul/
HTTP/1.1|200|54|-lDalvik/1.6.0(linux;U;Android 4,4,4;Konka Android TV 638
Build/KTU84P)|-l-[5.069|5.001,0.005|www.kuyun.com|8771|172.21.19.67:8084,172.21.19.66:8084]
[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="2017:|:14"}  {print $2}' HM
10:35


10:35


[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="2017:|:14"}  {print $3}' HM
 +0800]|POST /audiosearch/search


 +0800]|POST /audiosearch/search

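Comparing the runs above shows that when FS is set and a line contains no match for it, awk does not split that line: the whole line becomes $1, and $2, $3 and so on are empty.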
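
The same behaviour can be checked by printing NF, the number of fields awk found on each line. This is only a minimal sketch: the two input lines are made up for illustration and are not taken from HM.

```shell
# Hypothetical two-line input: the first line contains the separator "2017:", the second does not.
out=$(printf 'a2017:b\nno separator here\n' |
  awk 'BEGIN {FS="2017:"} {print "NF=" NF " $1=[" $1 "] $2=[" $2 "]"}')
printf '%s\n' "$out"
```

For the line without a match, NF should be 1 and $2 should be empty, which agrees with the blank lines printed by the commands above.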

Let's test this again with another file:

[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="MP3"} {print}' items.txt 
101,HD Camcorder,Video,210,10 
102,Refrigerator,Appliance,850,2 
103,MP3 Player,Audio,270,15 
104,Tennis Racket,Sports,190,20 
105,Laser Printer,Office,475,5
[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="MP3"} {print $1}' items.txt 
101,HD Camcorder,Video,210,10 
102,Refrigerator,Appliance,850,2 
103,
104,Tennis Racket,Sports,190,20 
105,Laser Printer,Office,475,5
[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="MP3"} {print $2}' items.txt 


 Player,Audio,270,15
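
Since an unsplit line has only one field, one possible way to avoid the blank output lines is to print only when NF is greater than 1. The sketch below assumes a shortened, hypothetical three-line sample in a temporary file rather than the real HM file.

```shell
# Hypothetical sample that mimics one wrapped log record from HM (shortened for illustration).
sample=$(mktemp)
printf '%s\n' \
  '183.250.220.178|-l[20/jul/2017:10:35:14 +0800]|POST /audiosearch/search' \
  'HTTP/1.1|200|54|-lDalvik/1.6.0' \
  'Build/KTU84P)|-l-[5.069|5.001' > "$sample"

# NF > 1 is true only for lines that were actually split by FS,
# so lines without "2017:" or ":14" are skipped instead of printing an empty $2.
hhmm=$(awk 'BEGIN {FS="2017:|:14"} NF > 1 {print $2}' "$sample")
printf '%s\n' "$hhmm"
rm -f "$sample"
```

With this sample only the first line is split, so the command prints a single 10:35 and no blank lines.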