Nginx Log Analysis
Sample log entry:
172.16.20.25 - - [25/Apr/2020:16:41:13 +0800] "GET / HTTP/1.1" 200 4833 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36" "-"
Field positions of commonly used variables (awk, split on whitespace)
$remote_addr $1
$time_local $4
$request $7
$status $9
$body_bytes_sent $10
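As a quick check of the positions listed above, the sketch below feeds the sample entry through awk and prints the five fields. The helper name show_fields is made up for this example. Note that $4 keeps the leading "[" of the timestamp, and $7 is only the request path rather than the whole $request string.

```shell
# Sample entry from above, with the plain ASCII quotes nginx writes.
line='172.16.20.25 - - [25/Apr/2020:16:41:13 +0800] "GET / HTTP/1.1" 200 4833 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36" "-"'

# show_fields (hypothetical helper): print client IP, timestamp, path,
# status and body size by their whitespace-separated positions.
show_fields() {
    awk '{ print $1, $4, $7, $9, $10 }'
}

printf '%s\n' "$line" | show_fields
# prints: 172.16.20.25 [25/Apr/2020:16:41:13 / 200 4833
```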
1. Count page views (PV) on September 5, 2017
grep '05/Sep/2017' cd.mobiletrain.org.log | wc -l
# Count page views within a time range
awk '$4>="[21/Apr/2020:20:39:02" && $4<="[21/Apr/2020:20:39:03"{print $0}' access.log-20200423 | wc -l
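The time-range filter compares $4 as a plain string, which is only reliable while both bounds fall on the same day: "[01/May/2020" sorts before "[21/Apr/2020" even though it is later. One possible workaround, sketched below, is to rebuild each timestamp as a sortable YYYYMMDDHHMMSS key before comparing. The function name count_between and the bound format are assumptions made for this example.

```shell
# count_between FROM TO: count log lines on stdin whose timestamp lies in
# [FROM, TO]. FROM and TO are YYYYMMDDHHMMSS strings (format chosen here).
count_between() {
    awk -v from="$1" -v to="$2" '
    BEGIN {
        # Map month abbreviations to two-digit numbers.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (m = 1; m <= n; m++) month[names[m]] = sprintf("%02d", m)
    }
    {
        # Drop the leading bracket, then split 21/Apr/2020:20:39:02 on / and :
        split(substr($4, 2), t, "[/:]")
        key = t[3] month[t[2]] t[1] t[4] t[5] t[6]
        if (key >= from && key <= to) hits++
    }
    END { print hits + 0 }'
}

# Usage (not run here):
# count_between 20200421203902 20200421203903 < access.log-20200423
```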
2. Top 10 client IPs by request count on September 5, 2017
awk '/05\/Sep\/2017/{ips[$1]++} END {for(i in ips){print i,ips[i]}}' access.log-20200423 | sort -k2rn | head -10
3. IPs with more than 100 requests on September 5, 2017
awk '/05\/Sep\/2017/{ips[$1]++} END {for(i in ips){if(ips[i]>100)print i,ips[i]}}' access.log-20200423 | sort -k2rn | head -10
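Inside an awk regex literal every "/" in the date has to be escaped. A variant that avoids the escaping is to pass the day in with -v and match it as a fixed string against $4. The sketch below does that; top_ips is a hypothetical helper name, and the tie-break on the IP column is an addition of this example to keep the output order stable.

```shell
# top_ips DAY [N]: read log lines on stdin and print the N busiest client
# IPs for DAY (e.g. 05/Sep/2017). N defaults to 10.
top_ips() {
    awk -v day="$1" '
    # $4 is "[DD/Mon/YYYY:HH:MM:SS", so DAY must start at position 2.
    index($4, day) == 2 { hits[$1]++ }
    END { for (ip in hits) print ip, hits[ip] }' |
    sort -k2,2nr -k1,1 |
    head -n "${2:-10}"
}

# Usage (not run here):
# top_ips 05/Sep/2017 10 < access.log-20200423
```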
4. Count requests per URL on September 5, 2017
awk '/05\/Sep\/2017/{url[$7]++}END{for(i in url){print i,url[i]}}' /var/log/nginx/access.log-20200423 | sort -k2rn | head
/ 27
/wl.js 20
/price.min.js 19
/RX3K6PAI.js 19
/base.js 18
/channel.js 18
/hotwords.js 18
/o2_ua.min.js 17
/img/centos-logo.png 15
/img/header-background.png 15
5. Total bytes sent per URL ($body_bytes_sent) on September 5, 2017
[root@ansible ~]# awk '/05\/Sep\/2017/{size[$7]+=$10}END{for(i in size){print i,size[i]}}' /var/log/nginx/access.log-20200423 | sort -k2rn | head
/images/focus4.png 1393378
/img/header-background.png 1243440
/images/focus2.jpg 987973
/ 924662
/images/574bcbdbNe83d1983.jpg 914070
/images/focus3.jpg 912254
/images/57ee2f5eN9ad17bf4.jpg 903375
/images/focus1.jpg 900067
/images/focus5.jpg 785911
/images/5819cbc0Na5bad5e9.jpg 727950
# Total bytes plus request count per URL (columns: bytes, URL, requests)
[root@ansible ~]# awk '{url[$7]++;size[$7]+=$10}END{for(i in url){print size[i],i,url[i]}}' /var/log/nginx/access.log-20200423 | sort -k1rn | head -n 10
1393378 /images/focus4.png 15
1243440 /img/header-background.png 9
987973 /images/focus2.jpg 27
924662 / 7
914070 /images/574bcbdbNe83d1983.jpg 9
912254 /images/focus3.jpg 7
903375 /images/57ee2f5eN9ad17bf4.jpg 9
900067 /images/focus1.jpg 9
785911 /images/focus5.jpg 7
727950 /images/5819cbc0Na5bad5e9.jpg 8
6. Count requests per client IP and status code
[root@ansible ~]# awk '{status[$1" "$9]++}END{for(i in status){print i,status[i]}}' /var/log/nginx/access.log-20200423
192.168.101.176 200 1332
192.168.101.176 404 170
192.168.101.176 499 4
192.168.101.176 502 1
192.168.101.176 304 487
192.168.101.174 304 422
7. Count 404 responses per client IP
[root@ansible ~]# awk '$9==404{b[$1" "$9]++}END{for(i in b){print i,b[i]}}' /var/log/nginx/access.log-20200423
192.168.101.176 404 170
8. Count page views in the previous minute
date=$(date -d '-1 minute' +%d/%b/%Y:%H:%M); awk -v date="$date" '$0 ~ date {i++} END {print i+0}' /var/log/nginx/access.log-20200423
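The previous-minute count can also be written as a small function that takes the minute stamp as an argument, which makes it easy to try against any minute. This is only a sketch: pv_for_minute is a hypothetical name, the match is a fixed-string prefix test on $4 instead of a regex, and the usage line assumes GNU date (for -d) with LC_ALL=C so that %b yields English month names as they appear in the log.

```shell
# pv_for_minute STAMP: count log lines on stdin whose timestamp starts with
# STAMP, where STAMP looks like 25/Apr/2020:16:41 (day/month/year:hour:minute).
pv_for_minute() {
    # $4 is "[DD/Mon/YYYY:HH:MM:SS"; anchor on "[" and the ":" before seconds.
    awk -v stamp="[$1:" 'index($4, stamp) == 1 { n++ } END { print n + 0 }'
}

# Usage (not run here):
# pv_for_minute "$(LC_ALL=C date -d '-1 minute' +%d/%b/%Y:%H:%M)" < /var/log/nginx/access.log
```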
9. Count 404 responses per client IP between 20:39:02 and 20:39:03 on April 21, 2020
awk '$4>="[21/Apr/2020:20:39:02" && $4<="[21/Apr/2020:20:39:03"{if($9=="404"){ip_code[$1" "$9]++}} END{for(i in ip_code){print i,ip_code[i]}}' /var/log/nginx/access.log-20200423
192.168.101.176 404 243
192.168.101.172 404 242
10. Count requests per status code on April 21, 2020
[root@ansible ~]# awk '/21\/Apr\/2020/{status[$9]++}END{for(i in status){print i,status[i]}}' /var/log/nginx/access.log-20200423
304 245
200 972
404 114
502 1
# Same count, with each status code's share of all requests
[root@ansible ~]# awk '/21\/Apr\/2020/{status[$9]++;total++}END{for(i in status){printf "%s %d\t%.2f%%\n",i,status[i],status[i]/total*100}}' /var/log/nginx/access.log-20200423
304 245 18.39%
200 972 72.97%
404 114 8.56%
502 1 0.08%
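Because awk's for-in loop visits array keys in no guaranteed order, the status lines above can come out in any order. A possible variant, sketched below, sorts them by request count. The name status_share is made up for this example, and the date filter is left out so that the function works on whatever lines it is given.

```shell
# status_share: read log lines on stdin and print "status count percent",
# most frequent status code first.
status_share() {
    awk '{ count[$9]++; total++ }
    END {
        # The loop body never runs on empty input, so total is never 0 here.
        for (code in count)
            printf "%s %d %.2f%%\n", code, count[code], count[code] / total * 100
    }' | sort -k2,2nr
}

# Usage (not run here):
# grep '21/Apr/2020' /var/log/nginx/access.log-20200423 | status_share
```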