Source: www.net1980.com
 
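The network topology between two business systems at a carrier is as follows:
Four routers between the two business systems form a square redundant topology; the primary and backup links each consist of two bundled 2 Mbit/s (2M) transmission circuits.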
 
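【Fault description】:
  The administrator of the business systems reported that the network between the two systems had recently been unstable, with an outage occurring in the early hours of every morning.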
 
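【Fault cause analysis】:
  The symptom was unusual in that it occurred at around 1 a.m. every day. To find the cause, we carried out the following analysis: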

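1. Extract the traffic chart for the link from the SNMP network management system.
         The chart shows no sudden change in traffic on the link in the early morning; traffic is low and steady, so the hypothesis that a traffic surge caused the outage does not hold.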
 
2. Compare the logs of routers R3640a and R3640b over the past few days.
[R3640a]
%2008/11/18 01:24:51-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is DOWN
%2008/11/18 01:24:51-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is DOWN
%2008/11/18 01:24:51-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is UP
%2008/11/18 01:24:51-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is UP
%2008/11/21 01:23:50-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is DOWN
%2008/11/21 01:23:50-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is DOWN
%2008/11/21 01:23:50-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is UP
%2008/11/21 01:23:50-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is UP
%2008/11/22 01:23:32-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is DOWN
%2008/11/22 01:23:32-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is DOWN
%2008/11/22 01:23:32-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is UP
%2008/11/22 01:23:32-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is UP
[DGyhIODDX2Rh3640a]disp clock
  Current router time:13:50:52 Nov 22 2008

[R3640b]
%2008/11/18 01:21:36-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is DOWN
%2008/11/18 01:21:36-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is DOWN
%2008/11/18 01:22:11-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is UP
%2008/11/18 01:22:11-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is UP
%2008/11/21 01:20:41-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is DOWN
%2008/11/21 01:20:41-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is DOWN
%2008/11/21 01:20:41-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is UP
%2008/11/21 01:20:41-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is UP
%2008/11/22 01:20:22-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is DOWN
%2008/11/22 01:20:22-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is DOWN
%2008/11/22 01:21:01-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is UP
%2008/11/22 01:21:01-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is UP
[DGyhIODDX2Rh3640b]disp clock
  Current router time:13:49:08 Nov 22 2008
 
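         Comparing the two logs, R3640b always records the outage about 3 minutes earlier than R3640a (for example, 01:21:36 versus 01:24:51 on 2008/11/18). The disp clock output shows that the two system clocks are offset in the same direction by a similar amount, roughly 2 minutes, with R3640a's clock running fast; the two commands were not issued at exactly the same instant, so this figure is approximate. The two routers can therefore be considered to have gone down at essentially the same moment. A simultaneous outage on both routers can only come from an external factor, such as a power problem or a transmission problem. Neither router logged a power loss or reboot, and no other ports on the routers logged an outage, so a fault on the 2M transmission circuits is the most likely cause.
         Finally, we contacted the transmission department; after the transmission circuits were replaced, the network returned to normal.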
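
The timestamp comparison in step 2 can also be done mechanically. The following is a minimal Python sketch, not part of the original troubleshooting: it assumes the log format shown in the excerpts above, and the names parse_events, first_down_per_day, and offsets_in_seconds are hypothetical helpers introduced only for illustration. The sample strings contain just the first DOWN entry of each day, copied from the two router logs.

```python
import re
from datetime import datetime

# Header line of a log entry, e.g. "%2008/11/18 01:24:51-INTERFACE-6:"
HEADER = re.compile(r"^%(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})-INTERFACE-\d+:")
# Message line, e.g. " Line protocol ip on the interface Serial1:0 is DOWN"
MESSAGE = re.compile(r"interface (\S+) is (UP|DOWN)")


def parse_events(log_text):
    """Return a list of (timestamp, interface, state) tuples from a log excerpt."""
    events = []
    stamp = None
    for line in log_text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            stamp = datetime.strptime(header.group(1), "%Y/%m/%d %H:%M:%S")
            continue
        message = MESSAGE.search(line)
        if message and stamp is not None:
            events.append((stamp, message.group(1), message.group(2)))
            stamp = None  # each header line is followed by exactly one message line
    return events


def first_down_per_day(events):
    """Map each calendar date to the earliest DOWN timestamp logged that day."""
    first = {}
    for stamp, _interface, state in events:
        if state == "DOWN":
            day = stamp.date()
            if day not in first or stamp < first[day]:
                first[day] = stamp
    return first


def offsets_in_seconds(log_a, log_b):
    """For each day in both logs, return (first DOWN in A) - (first DOWN in B)."""
    down_a = first_down_per_day(parse_events(log_a))
    down_b = first_down_per_day(parse_events(log_b))
    return {
        day.isoformat(): int((down_a[day] - down_b[day]).total_seconds())
        for day in sorted(down_a.keys() & down_b.keys())
    }


# First DOWN entry of each day, copied from the excerpts above.
LOG_R3640A = """
%2008/11/18 01:24:51-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is DOWN
%2008/11/21 01:23:50-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is DOWN
%2008/11/22 01:23:32-INTERFACE-6:
 Line protocol ip on the interface Serial1:0 is DOWN
"""

LOG_R3640B = """
%2008/11/18 01:21:36-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is DOWN
%2008/11/21 01:20:41-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is DOWN
%2008/11/22 01:20:22-INTERFACE-6:
 Line protocol ip on the interface Serial0:0 is DOWN
"""

for day, seconds in offsets_in_seconds(LOG_R3640A, LOG_R3640B).items():
    print(day, seconds, "seconds")
```

With the sample entries the offsets are 195, 189, and 190 seconds. An offset that stays nearly constant from day to day is what a fixed clock difference between the two routers would produce, which supports reading the two sets of entries as one simultaneous event rather than two separate faults.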
 
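【Fault summary】:
         Careful analysis of network device logs can reveal the cause of many network faults, and this technique should be put to good use in routine network maintenance.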