Nginx Log Analysis

Sample log entry:

172.16.20.25 - - [25/Apr/2020:16:41:13 +0800] "GET / HTTP/1.1" 200 4833 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36" "-"

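Field positions of commonly used variables (awk fields, split on whitespace)

$remote_addr $1

$time_local $4 (date and time with a leading "["; the timezone offset is $5)

$request $6 to $8 (method, URI, protocol); the URI alone is $7

$status $9

$body_bytes_sent $10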
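To check these positions, the sample entry can be run through awk. This is only a quick sketch; the variable names `line` and `fields` are chosen for illustration.

```shell
# The sample entry from this article, stored in a variable (the name is illustrative).
line='172.16.20.25 - - [25/Apr/2020:16:41:13 +0800] "GET / HTTP/1.1" 200 4833 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36" "-"'

# Print the fields used by the recipes, one per line.
fields=$(printf '%s\n' "$line" | awk '{
    print "remote_addr=" $1
    print "time_local=" substr($4, 2)   # drop the leading bracket
    print "uri=" $7
    print "status=" $9
    print "body_bytes_sent=" $10
}')
printf '%s\n' "$fields"
```

The user agent contains spaces, so fields after $10 do not map cleanly onto log variables; the recipes here only rely on $1 through $10.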

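1. Count page views (PV) for 5 September 2017

grep '05/Sep/2017' cd.mobiletrain.org.log | wc -l

# Count page views within a time window. $4 is compared as a string, so this works reliably only when both bounds lie in the same month.

awk '$4>="[21/Apr/2020:20:39:02" && $4<="[21/Apr/2020:20:39:03"{print $0}' access.log-20200423 | wc -l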
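The time-window filter can be wrapped in a small function so the bounds are passed as arguments. A minimal sketch: `count_between` is a hypothetical helper name, and the sample entries are made up for the demonstration.

```shell
# count_between LO HI [FILE...]: count entries whose $4 timestamp lies in [LO, HI].
# Hypothetical helper, not part of nginx. LO and HI use the log's own format
# (dd/Mon/yyyy:HH:MM:SS) and are compared as strings.
count_between() {
    lo=$1; hi=$2; shift 2
    # Prepend "[" because $4 in the log starts with it.
    awk -v lo="[$lo" -v hi="[$hi" '$4 >= lo && $4 <= hi { n++ } END { print n + 0 }' "$@"
}

# Made-up entries, one per second from 20:39:01 to 20:39:04.
sample='192.0.2.10 - - [21/Apr/2020:20:39:01 +0800] "GET / HTTP/1.1" 200 612
192.0.2.10 - - [21/Apr/2020:20:39:02 +0800] "GET / HTTP/1.1" 200 612
192.0.2.11 - - [21/Apr/2020:20:39:03 +0800] "GET / HTTP/1.1" 200 612
192.0.2.11 - - [21/Apr/2020:20:39:04 +0800] "GET / HTTP/1.1" 200 612'

printf '%s\n' "$sample" | count_between 21/Apr/2020:20:39:02 21/Apr/2020:20:39:03
```

With no file argument the function reads standard input; with one or more file names it reads those files instead. Printing `n + 0` gives 0 rather than an empty line when nothing matches.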

2. Top 10 client IPs by request count on 5 September 2017

awk '/05\/Sep\/2017/{ips[$1]++} END {for(i in ips){print i,ips[i]}}' access.log-20200423 | sort -k2rn | head -10

3. Client IPs with more than 100 requests on 5 September 2017

awk '/05\/Sep\/2017/{ips[$1]++} END {for(i in ips){if(ips[i]>100)print i,ips[i]}}' access.log-20200423 | sort -k2rn | head -10
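The per-IP counting can also be written without a regex, which avoids having to escape the slashes in the date. The sketch below uses `index()` on $4 instead; `top_ips` is a hypothetical helper name and the sample entries are made up and shortened.

```shell
# top_ips DAY N [FILE...]: the N client IPs with the most requests on DAY.
# Hypothetical helper. DAY uses the log's date format, e.g. 05/Sep/2017.
top_ips() {
    day=$1; n=$2; shift 2
    # $4 looks like "[05/Sep/2017:10:00:01", so the date starts at position 2.
    awk -v day="$day" 'index($4, day) == 2 { hits[$1]++ }
        END { for (ip in hits) print ip, hits[ip] }' "$@" |
        sort -k2,2nr | head -n "$n"
}

# Made-up, shortened entries: three requests from one IP and one from another
# on 5 September, plus two from a third IP on the following day.
sample='192.0.2.1 - - [05/Sep/2017:10:00:01 +0800] "GET / HTTP/1.1" 200 612
192.0.2.1 - - [05/Sep/2017:10:00:02 +0800] "GET / HTTP/1.1" 200 612
192.0.2.2 - - [05/Sep/2017:10:00:03 +0800] "GET / HTTP/1.1" 200 612
192.0.2.1 - - [05/Sep/2017:10:00:04 +0800] "GET / HTTP/1.1" 200 612
192.0.2.3 - - [06/Sep/2017:09:00:00 +0800] "GET / HTTP/1.1" 200 612
192.0.2.3 - - [06/Sep/2017:09:00:01 +0800] "GET / HTTP/1.1" 200 612'

printf '%s\n' "$sample" | top_ips 05/Sep/2017 10
```

Matching on $4 rather than on the whole line also keeps a date that happens to appear in a URL or referrer from being counted.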

4. Count requests per URL on 5 September 2017

awk '/05\/Sep\/2017/{url[$7]++}END{for(i in url){print i,url[i]}}' /var/log/nginx/access.log-20200423 | sort -k2rn | head

/ 27
/wl.js 20
/price.min.js 19
/RX3K6PAI.js 19
/base.js 18
/channel.js 18
/hotwords.js 18
/o2_ua.min.js 17
/img/centos-logo.png 15
/img/header-background.png 15

5. Total bytes sent per URL ($body_bytes_sent) on 5 September 2017

[root@ansible ~]# awk '/05\/Sep\/2017/{size[$7]+=$10}END{for(i in size){print i,size[i]}}' /var/log/nginx/access.log-20200423 | sort -k2rn | head

/images/focus4.png 1393378
/img/header-background.png 1243440
/images/focus2.jpg 987973
/ 924662
/images/574bcbdbNe83d1983.jpg 914070
/images/focus3.jpg 912254
/images/57ee2f5eN9ad17bf4.jpg 903375
/images/focus1.jpg 900067
/images/focus5.jpg 785911
/images/5819cbc0Na5bad5e9.jpg 727950

# Total bytes sent, URL, and request count, sorted by bytes

[root@ansible ~]# awk '{url[$7]++;size[$7]+=$10}END{for(i in url){print size[i],i,url[i]}}' /var/log/nginx/access.log-20200423 | sort -k1rn | head

1393378 /images/focus4.png 15
1243440 /img/header-background.png 9
987973 /images/focus2.jpg 27
924662 / 7
914070 /images/574bcbdbNe83d1983.jpg 9
912254 /images/focus3.jpg 7
903375 /images/57ee2f5eN9ad17bf4.jpg 9
900067 /images/focus1.jpg 9
785911 /images/focus5.jpg 7
727950 /images/5819cbc0Na5bad5e9.jpg 8

6. Count requests per client IP and status code

[root@ansible ~]# awk '{status[$1" "$9]++}END{for(i in status){print i,status[i]}}' /var/log/nginx/access.log-20200423

192.168.101.176 200 1332
192.168.101.176 404 170
192.168.101.176 499 4
192.168.101.176 502 1
192.168.101.176 304 487
192.168.101.174 304 422

7. Count 404 responses per client IP

[root@ansible ~]# awk '$9=="404"{b[$1" "$9]++}END{for(i in b){print i,b[i]}}' /var/log/nginx/access.log-20200423

192.168.101.176 404 170
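When filtering by status code, it helps to test field $9 directly instead of matching "404" anywhere in the line, because the same digits can appear in a URI or in the byte count. The sketch below shows the difference on two made-up entries.

```shell
# Two made-up entries: only the first has status 404. The second is a 200
# whose URI and byte count both happen to contain "404".
sample='192.0.2.10 - - [21/Apr/2020:20:39:02 +0800] "GET /missing HTTP/1.1" 404 153 "-" "curl/7.61.1" "-"
192.0.2.11 - - [21/Apr/2020:20:39:03 +0800] "GET /docs/404.html HTTP/1.1" 200 4040 "-" "curl/7.61.1" "-"'

# Regex over the whole line: counts both entries.
loose=$(printf '%s\n' "$sample" | awk '/404/ { n++ } END { print n + 0 }')

# Comparison on the status field only: counts just the real 404.
strict=$(printf '%s\n' "$sample" | awk '$9 == 404 { n++ } END { print n + 0 }')

echo "regex match: $loose, field match: $strict"
```

A word-boundary regex would exclude the byte count 4040 but would still match the URI /docs/404.html, so the field comparison is the safer choice.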

8. Count page views in the previous minute

date=$(date -d '-1 minute' +%d/%b/%Y:%H:%M); awk -v date="$date" '$0 ~ date {i++} END {print i+0}' /var/log/nginx/access.log-20200423
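The previous-minute count can be split into two steps: build the timestamp prefix, then count the entries whose $4 starts with it. A sketch under a few assumptions: `pv_for_minute` is a hypothetical helper name, `date -d` is a GNU extension, and `LC_ALL=C` is set so the month abbreviation comes out in English as nginx writes it.

```shell
# pv_for_minute STAMP [FILE...]: count entries logged during the minute STAMP,
# where STAMP has the form dd/Mon/yyyy:HH:MM. Hypothetical helper.
pv_for_minute() {
    stamp=$1; shift
    # $4 looks like "[21/Apr/2020:20:39:02"; require it to start with "[STAMP:".
    awk -v stamp="[$stamp:" 'index($4, stamp) == 1 { n++ } END { print n + 0 }' "$@"
}

# Previous minute in the log's timestamp format (GNU date only).
prev_minute=$(LC_ALL=C date -d '-1 minute' +%d/%b/%Y:%H:%M)
echo "previous minute: $prev_minute"
```

A call such as `pv_for_minute "$prev_minute" /var/log/nginx/access.log-20200423` then prints the count for that file. Note that $time_local is written in the server's local time zone, so the command should run with the same time zone setting as nginx.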

9. Count 404 responses per client IP between 20:39:02 and 20:39:03 on 21 April 2020

awk '$4>="[21/Apr/2020:20:39:02" && $4<="[21/Apr/2020:20:39:03"{if($9=="404"){ip_code[$1" "$9]++}} END{for(i in ip_code){print i,ip_code[i]}}' /var/log/nginx/access.log-20200423

192.168.101.176 404 243
192.168.101.172 404 242

10. Count requests per status code on 21 April 2020

[root@ansible ~]# awk '/21\/Apr\/2020/{status[$9]++}END{for(i in status){print i,status[i]}}' /var/log/nginx/access.log-20200423

304 245
200 972
404 114
502 1

# The same counts with each status code's share of the total

[root@ansible ~]# awk '/21\/Apr\/2020/{status[$9]++;total++}END{for(i in status){printf "%s %d\t%.2f%%\n",i,status[i],status[i]/total*100}}' /var/log/nginx/access.log-20200423

304 245 18.39%
200 972 72.97%
404 114 8.56%
502 1 0.08%
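The percentage report is easier to read when sorted by count. A minimal sketch: `status_share` is a hypothetical helper name, it does not filter by date, and the sample entries are made up.

```shell
# status_share [FILE...]: print "status count percent" for every status code,
# most frequent first. Hypothetical helper; reads stdin when no file is given.
status_share() {
    awk '{ count[$9]++; total++ }
        END {
            for (s in count)
                printf "%s %d %.2f%%\n", s, count[s], count[s] / total * 100
        }' "$@" |
        sort -k2,2nr
}

# Made-up, shortened entries: three 200 responses and one 404.
sample='192.0.2.1 - - [21/Apr/2020:20:39:01 +0800] "GET / HTTP/1.1" 200 612
192.0.2.1 - - [21/Apr/2020:20:39:02 +0800] "GET /a HTTP/1.1" 200 612
192.0.2.2 - - [21/Apr/2020:20:39:03 +0800] "GET /b HTTP/1.1" 404 153
192.0.2.2 - - [21/Apr/2020:20:39:04 +0800] "GET / HTTP/1.1" 200 612'

printf '%s\n' "$sample" | status_share
```

Using a fixed format string with `%s` keeps data from the log out of the printf format, and `%%` prints a literal percent sign.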