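Notes on some awk tips
When awk is given a field separator, what happens to lines that do not contain it?
A classic exercise: extract HH:MM from the file below.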
[root@Almalinux-VM1 awk]# cat HM
183.250.220.178|-l[20/jul/2017:10:35:14 +0800]|POST /audiosearch/search
HTTP/1.1|200|54|-lDalvik/1.6.0(linux;U;Android 4,4,4;Konka Android TV 638
Build/KTU84P)|-l-[5.069|5.001,0.005|www.kuyun.com|8771|172.21.19.67:8084,172.21.19.66:8084]
183.250.220.178|-l[20/jul/2017:10:35:14 +0800]|POST /audiosearch/search
HTTP/1.1|200|54|-lDalvik/1.6.0(linux;U;Android 4,4,4;Konka Android TV 638
Build/KTU84P)|-l-[5.069|5.001,0.005|www.kuyun.com|8771|172.21.19.67:8084,172.21.19.66:8084]
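My first idea: use a regex to select the lines that contain the date '[20/jul/2017', pipe the file into awk, and strip everything around HH:MM with gsub. See below.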
[root@Almalinux-VM1 awk]# cat HM |awk 'BEGIN {FS="|"}$2 ~/2017/ {gsub("[0-9]{3}.*\[0-9]{4}:|:14.*search","");print $0}'
awk: cmd. line:1: warning: escape sequence `\[' treated as plain `['
10:35
10:35
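As the output shows, this meets the requirement. The warning appears because `\[` is not a valid escape inside an awk string, so awk drops the backslash and reads `[0-9]{4}:` as an ordinary bracket expression.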
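A minimal sketch of the same approach without the warning, assuming it is acceptable to write the patterns as /regex/ literals and to use two sub() calls in place of the single gsub(). The variable names `line` and `hhmm` are only for illustration, and the sample line is copied from HM.

```shell
# One sample line copied from the HM file above.
line='183.250.220.178|-l[20/jul/2017:10:35:14 +0800]|POST /audiosearch/search'

# Keep only lines whose second |-separated field mentions 2017, then
# delete everything up to "2017:" and everything from ":14" onward.
hhmm=$(printf '%s\n' "$line" | awk -F'|' '
  $2 ~ /2017/ {
    sub(/^.*2017:/, "")   # drop the prefix, including the year and colon
    sub(/:14.*$/, "")     # drop the seconds and the rest of the line
    print
  }')

echo "$hhmm"   # prints 10:35
```

Reading the file directly (awk '...' HM) would also make the cat and the pipe unnecessary.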
Then I saw the model answer that someone else wrote:
[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="2017:|:14"} {print $2}' HM
10:35
10:35
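Why this works: an FS value longer than one character is treated as a regular expression, so "2017:|:14" means "split at either 2017: or :14". A rough sketch of the same idea that does not hard-code the year or the seconds, assuming the timestamp always looks like /YYYY:HH:MM:SS followed by a space (the pattern and the variable names are my own, not part of the original answer):

```shell
line='183.250.220.178|-l[20/jul/2017:10:35:14 +0800]|POST /audiosearch/search'

# FS is a regex with two alternatives:
#   /[0-9][0-9][0-9][0-9]:   a slash, a four-digit year, and a colon
#   :[0-9][0-9][ ]           a colon, two digits of seconds, and a space
hhmm=$(printf '%s\n' "$line" |
  awk -F'/[0-9][0-9][0-9][0-9]:|:[0-9][0-9][ ]' '{ print $2 }')

echo "$hhmm"   # prints 10:35
```

The digits are spelled out instead of using {4} because some awk implementations do not support interval expressions.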
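I was surprised that it could be written this way. It also raises a question: some lines in the file contain neither '2017:' nor ':14', so how does awk handle those lines?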
[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="2017:|:14"} {print $1}' HM
183.250.220.178|-l[20/jul/
HTTP/1.1|200|54|-lDalvik/1.6.0(linux;U;Android 4,4,4;Konka Android TV 638
Build/KTU84P)|-l-[5.069|5.001,0.005|www.kuyun.com|8771|172.21.19.67:8084,172.21.19.66:8084]
183.250.220.178|-l[20/jul/
HTTP/1.1|200|54|-lDalvik/1.6.0(linux;U;Android 4,4,4;Konka Android TV 638
Build/KTU84P)|-l-[5.069|5.001,0.005|www.kuyun.com|8771|172.21.19.67:8084,172.21.19.66:8084]
[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="2017:|:14"} {print $2}' HM
10:35
10:35
[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="2017:|:14"} {print $3}' HM
+0800]|POST /audiosearch/search
+0800]|POST /audiosearch/search
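Comparing the runs above shows that when FS is set and a line contains no match for it, the whole line becomes $1 (NF is 1, and $2 onward are empty).
Let's test this again with another file: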
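One way to check this directly is to print NF next to $2. A small sketch using two lines copied from HM, of which only the first contains the separator (the variable name `out` and the output format are arbitrary):

```shell
# Two sample lines from HM: only the first contains "2017:" or ":14".
out=$(printf '%s\n' \
  '183.250.220.178|-l[20/jul/2017:10:35:14 +0800]|POST /audiosearch/search' \
  'HTTP/1.1|200|54|-lDalvik/1.6.0(linux;U;Android 4,4,4;Konka Android TV 638' |
  awk -F'2017:|:14' '{ printf "NF=%d $2=[%s]\n", NF, $2 }')

echo "$out"
# NF=3 $2=[10:35]
# NF=1 $2=[]
```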
[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="MP3"} {print}' items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="MP3"} {print $1}' items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5
[root@Almalinux-VM1 awk]# awk 'BEGIN {FS="MP3"} {print $2}' items.txt
Player,Audio,270,15
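Since $2 is empty on every line that does not contain the separator, print $2 emits an empty line for each of them. If only the lines that were actually split are wanted, a guard on NF is one possible way to do it. A sketch with two lines from items.txt (the variable name `second` is only for illustration):

```shell
# NF > 1 is true only for lines that were actually split on "MP3".
second=$(printf '%s\n' \
  '102,Refrigerator,Appliance,850,2' \
  '103,MP3 Player,Audio,270,15' |
  awk -F'MP3' 'NF > 1 { print $2 }')

echo "[$second]"   # prints [ Player,Audio,270,15]
```

The leading space is part of $2, because the separator is only "MP3" and not "MP3 ".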