AWStats: A Powerful Log Analysis Tool
 
 
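Note: this article follows the official documentation step by step; during installation I consulted related documents and made the necessary changes.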
 
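Environment:
Red Hat 9
Apache/2.0.54, installed from source under /usr/local/apache2
Host IP 192.168.0.111, domain s1.domain1.com
AWStats version: awstats-6.5.tar.gz
Goal:
Use AWStats to collect access statistics for s1.domain1.com and provide a web page for viewing them.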
 
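AWStats has many features; see the official site for details. Here I mainly use it to analyze Apache server logs. Before using it, a brief word on how it works: AWStats provides a set of Perl scripts that handle service configuration, log reading, report generation, and so on. The actual flow is as follows. First, Apache records each access in its log. Each time an AWStats update runs, it reads the log, analyzes the data, and stores the results in a database (this database is AWStats' own and requires no third-party software). Finally, AWStats provides a CGI program that displays the statistics from the database on a web page.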
 
First, look at the current Apache configuration (vi httpd.conf):
<VirtualHost *:80>
    ServerAdmin yahoon@xxx.com
    DocumentRoot /var/www/html/s1
    ServerName  s1.domain1.com
    ErrorLog logs/s1_web-error_log
    TransferLog logs/s1_web-access_log
</VirtualHost>
Requests to s1.domain1.com are logged to /usr/local/apache2/logs/s1_web-access_log.
 
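The installation proceeds as follows.
Unpack the package to /usr/local/awstats.
The first step is to generate a configuration file for the site. There are several ways to do this; the official documentation recommends using the script, which is also one of the simpler approaches.
Change to /usr/local/awstats/tools; this directory contains many Perl scripts with different functions. Run:
[root@server1 tools]# perl awstats_configure.pl
This starts an interactive configuration program that asks a number of questions; all the answers can be changed later in the configuration file. The full output, with my annotations, is listed below.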
----- AWStats awstats_configure 1.0 (build 1.6) (c) Laurent Destailleur -----
This tool will help you to configure AWStats to analyze statistics for
one web server. You can try to use it to let it do all that is possible
in AWStats setup, however following the step by step manual setup
documentation (docs/index.html) is often a better idea. Above all if:
- You are not an administrator user,
- You want to analyze downloaded log files without web server,
- You want to analyze mail or ftp log files instead of web log files,
- You need to analyze load balanced servers log files,
- You want to 'understand' all possible ways to use AWStats...
Read the AWStats documentation (docs/index.html).
-----> Running OS detected: Linux, BSD or Unix
It detected my current operating system.
-----> Check for web server install
  Found Web server Apache config file '/usr/local/apache2/conf/httpd.conf'
It detected my current Apache configuration file.
-----> Check and complete web server config file '/usr/local/apache2/conf/httpd.conf'
Warning: You Apache config file contains directives to write 'common' log files
This means that some features can't work (os, browsers and keywords detection).
Do you want me to setup Apache to write 'combined' log files [y/N] ?y
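AWStats recommends combined-format logs, while Apache uses the common format by default. This prompt asks whether to modify the Apache configuration file to switch the log format to combined; answer yes, of course.
  Add 'Alias /awstatsclasses "/usr/local/awstats/wwwroot/classes/"'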
  Add 'Alias /awstatscss "/usr/local/awstats/wwwroot/css/"'
  Add 'Alias /awstatsicons "/usr/local/awstats/wwwroot/icon/"'
  Add 'ScriptAlias /awstats/ "/usr/local/awstats/wwwroot/cgi-bin/"'
  Add '<Directory>' directive
  AWStats directives added to Apache config file.
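The content above was added to the Apache configuration file; it consists of directory and alias settings. It is normally appended to the end of the file.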
-----> Update model config file '/usr/local/awstats/wwwroot/cgi-bin/awstats.model.conf'
  File awstats.model.conf updated.
The model (template) file was updated.
-----> Need to create a new config file ?
Do you want me to build a new AWStats config/profile
file (required if first install) [y/N] ? y
Whether to create a new configuration file: y.
-----> Define config file name to create
What is the name of your web site or profile analysis ?
Example: www.mysite.com
Example: demo
Your web site, virtual server or profile name:
> s1.domain1.com
Enter the site's domain name here.
-----> Define config file path
In which directory do you plan to store your config file(s) ?
Default: /etc/awstats
Directory path to store config file(s) (Enter for default):
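> (press Enter)
This is the directory in which the generated configuration file is stored. The default is /etc/awstats; I keep it and just press Enter.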
-----> Create config file '/etc/awstats/awstats.s1.domain1.com.conf'
 Config file /etc/awstats/awstats.s1.domain1.com.conf created.
As you can see, the file has been created at /etc/awstats/awstats.s1.domain1.com.conf.
-----> Restart Web server with '/sbin/service httpd restart'
Stopping httpd: [  OK  ]
Starting httpd: httpd: Could not determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
[  OK  ]
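Because httpd.conf was modified, the script restarts Apache automatically so that the change takes effect. There is a catch, though: my Apache was built from source rather than taken from the system packages, and it is not registered as a service. So although this restart succeeded, it started the system's own httpd, not the one I need. The fix is simple: stop it in a moment and start my own.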
-----> Add update process inside a scheduler
Sorry, configure.pl does not support automatic add to cron yet.
You can do it manually by adding the following command to your cron:
/usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update -config=s1.domain1.com
Or if you have several config files and prefer having only one command:
/usr/local/awstats/tools/awstats_updateall.pl now
Press ENTER to continue...
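To pick up new information, the database has to be updated: the log is read again, the newly added entries are extracted and analyzed, and the updated data is written to the database. It is best to run this on a schedule. This message is a reminder that to do so you must manually add either of the two commands above to crontab. The first updates s1.domain1.com only; the second updates all sites when there are several. I skip this step here and press Enter.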
A SIMPLE config file has been created: /etc/awstats/awstats.s1.domain1.com.conf
You should have a look inside to check and change manually main parameters.
You can then manually update your statistics for 's1.domain1.com' with command:
> perl awstats.pl -update -config=s1.domain1.com
You can also read your statistics for 's1.domain1.com' with URL:
> http://localhost/awstats/awstats.pl?config=s1.domain1.com
Press ENTER to finish...
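Press Enter to finish the installation. The message is clear; the remaining work is:
Check the configuration file
Run perl awstats.pl -update -config=s1.domain1.com to update the database
View the statistics at http://localhost/awstats/awstats.pl?config=s1.domain1.com
Once the script finishes, httpd.conf has also been updated. Its main changes are as follows.
CustomLog logs/access_log common was changed to
CustomLog logs/access_log combined
and the following was appended to the end of the file: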
# Directives to allow use of AWStats as a CGI
Alias /awstatsclasses "/usr/local/awstats/wwwroot/classes/"
Alias /awstatscss "/usr/local/awstats/wwwroot/css/"
Alias /awstatsicons "/usr/local/awstats/wwwroot/icon/"
ScriptAlias /awstats/ "/usr/local/awstats/wwwroot/cgi-bin/"
# This is to permit URL access to scripts/files in AWStats directory.
<Directory "/usr/local/awstats/wwwroot">
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
 
Before going further, stop the Apache instance started by the script and start the correct one:
[root@server1 tools]# service httpd stop
[root@server1 tools]# /usr/local/apache2/bin/apachectl start
 
The site s1.domain1.com displays normally, and this visit should now have been recorded in the log.
Visiting http://192.168.0.111/awstats/awstats.pl?config=s1.domain1.com returns:

Forbidden

You don't have permission to access /awstats/awstats.pl on this server
This is clearly a permissions problem. In /usr/local I ran chmod -R 777 awstats (755 should be enough; I used 777 only because this is a test setup).
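For anything beyond a test machine, a tighter alternative to 777 could look like the sketch below. It assumes that the only files needing the execute bit are the .pl scripts and that everything else only has to be readable by Apache; tighten_awstats_perms is a name I made up for illustration, not an AWStats tool.

```shell
# Hypothetical helper: apply 755/644-style permissions to an AWStats tree.
# Assumption: only the *.pl scripts need to be executable.
tighten_awstats_perms() {
    dir=$1
    # Directories: rwxr-xr-x so the web server can traverse them.
    find "$dir" -type d -exec chmod 755 {} +
    # Regular files: rw-r--r-- by default.
    find "$dir" -type f -exec chmod 644 {} +
    # Perl scripts must stay executable to run as CGI.
    find "$dir" -type f -name '*.pl' -exec chmod 755 {} +
}

# Example for the layout used in this article:
#   tighten_awstats_perms /usr/local/awstats
```

If the Forbidden error comes back after this, the cause is probably something other than file modes and 755 on the whole tree is the next thing to try.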
 
Check the key items in the configuration file:
vi /etc/awstats/awstats.s1.domain1.com.conf
Check the following and change them as shown:
# LogFile="/var/log/httpd/mylog.log"
LogFile="/usr/local/apache2/logs/s1_web-access_log"
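Points to the Apache log.
LogType=W
Indicates that a web log is being analyzed.
LogFormat=1
Indicates that the log format is combined.
SiteDomain="s1.domain1.com"
The domain name.
HostAliases="s1.domain1.com www.s1.domain1.com 127.0.0.1 localhost"
This parameter lists the aliases of the domain, for the case where several domain names map to the same site. The line is generated automatically; I do not use aliases here, so I left it unchanged.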
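Rather than paging through the whole file in vi, the active values of these parameters can be listed in one go. The sketch below is only a convenience; show_awstats_params is a hypothetical name, and it assumes each parameter starts at the beginning of its line, so commented-out lines are skipped. DirData is included because it comes up in the next step.

```shell
# Hypothetical helper: print the parameters this walkthrough touches.
# Lines starting with '#' do not match, so only active settings are shown.
show_awstats_params() {
    grep -E '^(LogFile|LogType|LogFormat|SiteDomain|HostAliases|DirData)=' "$1"
}

# Example:
#   show_awstats_params /etc/awstats/awstats.s1.domain1.com.conf
```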
 
Update the database:
cd /usr/local/awstats/wwwroot/cgi-bin
perl awstats.pl -config=s1.domain1.com -update
This produces the following error:
Error: AWStats database directory defined in config file by 'DirData' parameter (/var/lib/awstats) does not exist or is not writable.
Setup ('/etc/awstats/awstats.s1.domain1.com.conf' file, web server or permissions) may be wrong.
Check config file, permissions and AWStats documentation (in 'docs' directory).
At the same time, visiting http://192.168.0.111/awstats/awstats.pl?config=s1.domain1.com shows:
Error: AWStats database directory defined in config file by 'DirData' parameter (/var/lib/awstats) does not exist or is not writable.
Setup ('/etc/awstats/awstats.s1.domain1.com.conf' file, web server or permissions) may be wrong.
Check config file, permissions and AWStats documentation (in 'docs' directory).
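Clearly the directory /var/lib/awstats does not exist.
Note that this path is set by the DirData parameter in the configuration file /etc/awstats/awstats.s1.domain1.com.conf and specifies where the database is stored. I did not change it, so it still has the default value DirData="/var/lib/awstats".
Since it is reported as missing, create it: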
cd /var/lib
mkdir awstats
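The same step can be done without hard-coding the path, by reading DirData from the configuration file and checking that the result is writable. This is a minimal sketch: both function names are hypothetical, and the sed pattern assumes the value is written as DirData="..." at the start of a line, as in the generated file.

```shell
# Hypothetical helper: read the DirData value out of an AWStats config file.
dirdata_from_conf() {
    sed -n 's/^DirData="\(.*\)"[[:space:]]*$/\1/p' "$1"
}

# Create the directory if needed; succeed only if the current user can write to it.
ensure_dirdata() {
    mkdir -p "$1" && [ -d "$1" ] && [ -w "$1" ]
}

# Example (run as the user that performs the update):
#   ensure_dirdata "$(dirdata_from_conf /etc/awstats/awstats.s1.domain1.com.conf)"
```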
 
Run the database update again:
[root@server1 cgi-bin]# perl awstats.pl -config=s1.domain1.com -update
The output is:
Update for config "/etc/awstats/awstats.s1.domain1.com.conf"
With data in log file "/usr/local/apache2/logs/s1_web-access_log"...
Phase 1 : First bypass old records, searching new record...
Searching new records from beginning of log file...
Jumped lines in file: 0
Parsed lines in file: 1    (total lines processed; each line in an Apache log is one record)
 Found 0 dropped records,
 Found 1 corrupted records,    (one corrupted record was found)
 Found 0 old records,
 Found 0 new qualified records.
 
Look at the log contents:
[root@server1 cgi-bin]# less /usr/local/apache2/logs/s1_web-access_log
192.168.0.28 - - [30/Aug/2007:10:19:59 +0800] "GET / HTTP/1.1" 200 23
 
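Visit http://192.168.0.111/awstats/awstats.pl?config=s1.domain1.com to view the statistics.
This is puzzling: why was the only record treated as corrupted? Opening httpd.conf shows that the script did not touch the virtual host section, so the virtual host's log is still written in the old format. The line above ends at the byte count and has no referer or user-agent fields, which is why it did not match LogFormat=1. The log format has to be set separately for the virtual host: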
<VirtualHost *:80>
    ServerAdmin yahoon@xxx.com
    DocumentRoot /var/www/html/s1
    ServerName s1.domain1.com
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined1
    CustomLog logs/s1_web-access_log combined1
    ErrorLog logs/s1_web-error_log
</VirtualHost>
As you can see, I named this log format combined1; it is only a label and can be anything. Restart Apache after making the change.
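A quick way to check which format a log is really being written in, before running an update, is to look at the end of a line: a combined line ends with two quoted fields (referer and user agent), while a common line ends with the byte count. The helper below is a minimal sketch of that check. log_line_format is a name I made up for illustration, not part of AWStats or Apache, and the patterns only cover these two formats.

```shell
# Hypothetical helper: classify one access-log line as common or combined.
log_line_format() {
    # combined: ... "request" status bytes "referer" "user-agent"
    if printf '%s\n' "$1" | grep -Eq '" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"$'; then
        echo combined
    # common: ... "request" status bytes
    elif printf '%s\n' "$1" | grep -Eq '" [0-9]{3} ([0-9]+|-)$'; then
        echo common
    else
        echo unknown
    fi
}

# Example: check the first line of the virtual host's log.
#   log_line_format "$(head -n 1 /usr/local/apache2/logs/s1_web-access_log)"
```

Applied to the single line logged before the change, this reports common, which matches the corrupted record seen in the first update.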
 
Empty the log, then visit s1.domain1.com; the following entries are produced:
[root@server1 cgi-bin]# less /usr/local/apache2/logs/s1_web-access_log
192.168.0.28 - - [30/Aug/2007:13:04:05 +0800] "GET / HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;
SV1; TencentTraveler )"
192.168.0.28 - - [30/Aug/2007:13:05:30 +0800] "GET / HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;
SV1; TencentTraveler )"
192.168.0.28 - - [30/Aug/2007:13:05:30 +0800] "GET / HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;
SV1; TencentTraveler )"
192.168.0.28 - - [30/Aug/2007:13:05:30 +0800] "GET / HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;
SV1; TencentTraveler )"
192.168.0.28 - - [30/Aug/2007:13:05:31 +0800] "GET / HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;
SV1; TencentTraveler )"
192.168.0.28 - - [30/Aug/2007:13:05:31 +0800] "GET / HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;
SV1; TencentTraveler )"
……
 
Update the database again:
[root@server1 cgi-bin]# perl awstats.pl -config=s1.domain1.com -update
Update for config "/etc/awstats/awstats.s1.domain1.com.conf"
With data in log file "/usr/local/apache2/logs/s1_web-access_log"...
Phase 1 : First bypass old records, searching new record...
Direct access to last remembered record has fallen on another record.
So searching new records from beginning of log file...
Phase 2 : Now process new records (Flush history on disk after 20000 hosts)...
Jumped lines in file: 0
Parsed lines in file: 6
 Found 0 dropped records,
 Found 0 corrupted records,
 Found 0 old records,
 Found 6 new qualified records.
 
Visit http://192.168.0.111/awstats/awstats.pl?config=s1.domain1.com to view the statistics.
 
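One last point, already mentioned several times: the database must be updated for the statistics to reflect the latest information. You may have visited the site many times and still see no change on the statistics page, because those log entries have not yet been loaded into the database. The simple solution is to run the update on a schedule.
This article covers only the basic installation and use of the software; it can do far more. In particular, it touches on Apache logging topics such as log rotation and merging logs from load-balanced servers. I am new to this myself, so comments and suggestions are welcome.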
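A minimal sketch of such a scheduled update, built from the command that awstats_configure.pl printed earlier. The hourly schedule and the output redirection are my assumptions, not something AWStats prescribes; adjust them to how fresh the statistics need to be.

```shell
# Assumption: run the update once an hour, at minute 5.
# The command itself is the one suggested by awstats_configure.pl.
AWSTATS_CMD="/usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update -config=s1.domain1.com"
CRON_LINE="5 * * * * $AWSTATS_CMD >/dev/null 2>&1"

# Print the entry for review. To install it for the current user, one option is:
#   (crontab -l 2>/dev/null; echo "$CRON_LINE") | crontab -
echo "$CRON_LINE"
```

With several sites configured, the awstats_updateall.pl command from the same message could be substituted for AWSTATS_CMD.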
 
Related pages: