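Source: www.net1980.com
Symptom:
         The enterprise network is laid out as shown in the figure below. The business-system servers connect directly to two Cisco 7609 routers, which uplink to two Huawei NE80 routers that communicate with the backbone network. During network testing, maintenance staff found that after the server of business system A was rebooted, the network went down, and the GE9/12 ports on both Cisco 7609s (the ports facing the business-system server) entered the err-disable state.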
 
 
[Figure: network topology. Caption: Cisco 7609 router port repeatedly going UP/Down, resulting in err-disable]
 
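Cause analysis:
         Both Cisco 7609 routers run IOS version 12.2, and the business-system servers are Huawei equipment. The analysis below uses the port that connects the Cisco 7609 to business system A.
1. Log in to CISCO7609-1 and check the log. At about 00:42, port GigabitEthernet9/12, which connects CISCO7609-1 to business system A, repeatedly raised UP/Down alarms. At Feb 9 00:42:31.967 the port changed to the err-disable state.
Part of the CISCO7609-1 log follows: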
*Feb 9 00:42:26.435 BEIJING: %LINK-3-UPDOWN: Interface GigabitEthernet9/12, changed state to up
*Feb 9 00:42:27.015 BEIJING: %LINK-3-UPDOWN: Interface GigabitEthernet9/12, changed state to down
*Feb 9 00:42:26.435 BEIJING: %LINK-SP-3-UPDOWN: Interface GigabitEthernet9/12, changed state to up
*Feb 9 00:42:27.019 BEIJING: %LINK-SP-3-UPDOWN: Interface GigabitEthernet9/12, changed state to down
*Feb 9 00:42:27.815 BEIJING: %LINK-3-UPDOWN: Interface GigabitEthernet9/12, changed state to up
*Feb 9 00:42:28.355 BEIJING: %LINK-3-UPDOWN: Interface GigabitEthernet9/12, changed state to down
*Feb 9 00:42:27.819 BEIJING: %LINK-SP-3-UPDOWN: Interface GigabitEthernet9/12, changed state to up
*Feb 9 00:42:28.355 BEIJING: %LINK-SP-3-UPDOWN: Interface GigabitEthernet9/12, changed state to down
*Feb 9 00:42:29.315 BEIJING: %LINK-3-UPDOWN: Interface GigabitEthernet9/12, changed state to up
*Feb 9 00:42:29.843 BEIJING: %LINK-3-UPDOWN: Interface GigabitEthernet9/12, changed state to down
*Feb 9 00:42:29.315 BEIJING: %LINK-SP-3-UPDOWN: Interface GigabitEthernet9/12, changed state to up
*Feb 9 00:42:29.847 BEIJING: %LINK-SP-3-UPDOWN: Interface GigabitEthernet9/12, changed state to down
*Feb 9 00:42:31.967 BEIJING: %PM-SP-4-ERR_DISABLE: link-flap error detected on Gi9/12, putting Gi9/12 in err-disable state
*Feb 9 00:42:32.147 BEIJING: %PM-SP-STDBY-4-ERR_DISABLE: link-flap error detected on Gi9/12, putting Gi9/12 in err-disable state
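
To make the flapping in a log like this easier to count, the lines can be parsed into timestamped events. The following is a minimal sketch, not a general syslog parser: the names `LINK_RE` and `parse_link_events` are hypothetical, the pattern is modelled only on the excerpt above, and because the log carries no year, the `year` argument is a placeholder assumption. It keeps the `%LINK-3-UPDOWN` lines and skips the `%LINK-SP-3-UPDOWN` lines, which report the same transitions a second time.

```python
import re
from datetime import datetime

# Pattern modelled on the excerpt above, e.g.
# "*Feb 9 00:42:26.435 BEIJING: %LINK-3-UPDOWN: Interface GigabitEthernet9/12, changed state to up"
LINK_RE = re.compile(
    r"\*?(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+\S+:\s+"
    r"%LINK-3-UPDOWN: Interface (?P<intf>\S+), changed state to (?P<state>up|down)"
)


def parse_link_events(lines, year=2000):
    """Return (timestamp, interface, state) tuples for %LINK-3-UPDOWN lines.

    The log has no year, so `year` is an assumed placeholder value.
    """
    events = []
    for line in lines:
        m = LINK_RE.search(line)
        if not m:
            continue  # skips %LINK-SP-3-UPDOWN duplicates and unrelated messages
        ts = datetime.strptime(
            f"{year} {m['mon']} {m['day']} {m['time']}", "%Y %b %d %H:%M:%S.%f"
        )
        events.append((ts, m["intf"], m["state"]))
    return events
```

Calling `parse_link_events(open("router.log"))` (the file name is hypothetical) yields one tuple per state change, ready to be counted per interface.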
 
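2. To protect network reliability, Cisco devices enable a number of protection mechanisms. For example, if an Ethernet port on the router goes up/down 5 times within 10 seconds, the router detects a link-flap error and puts the port into the err-disable state.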
CISCO7609-1#sh errdisable detect
ErrDisable Reason Detection status
----------------- ----------------
udld Enabled
bpduguard Enabled
security-violatio Enabled
channel-misconfig Enabled
psecure-violation Enabled
mac-limit Enabled
unicast-flood Enabled
vmps Enabled
loopback Enabled
pagp-flap Enabled
dtp-flap Enabled
link-flap Enabled
l2ptguard Enabled
gbic-invalid Enabled
dhcp-rate-limit Enabled
storm-control Enabled
inline-power Enabled
arp-inspection Enabled
packet-buffer Enabled
link-monitor-fail Enabled
oam-remote-failur Enabled
transceiver-incom Enabled
dot1ad-incomp-ety Enabled
dot1ad-incomp-tun Enabled
CISCO7609-1#sh errdisable flap-values
ErrDisable Reason Flaps Time (sec)
----------------- ------ ----------
pagp-flap 3 30
dtp-flap 3 30
link-flap 5 10
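
The `link-flap 5 10` row can be read as a sliding-window threshold. The sketch below illustrates that idea only and is not how IOS implements it: the function name `first_errdisable_time` is hypothetical, each state change is assumed to count as one flap, and the threshold is assumed to trip when the count reaches `max_flaps` (the quoted Cisco text further down says "more than five times", so the exact boundary is uncertain).

```python
from collections import deque


def first_errdisable_time(change_times, max_flaps=5, window=10.0):
    """Return the time of the change that brings the count inside the
    sliding window up to max_flaps, or None if that never happens.

    change_times: state-change times in seconds (any common origin).
    Assumption: every up or down transition counts as one flap.
    """
    recent = deque()
    for t in sorted(change_times):
        recent.append(t)
        # Drop changes that are no longer inside the window ending at t.
        while t - recent[0] >= window:
            recent.popleft()
        if len(recent) >= max_flaps:
            return t
    return None
```

With changes one second apart, `first_errdisable_time([0, 1, 2, 3, 4])` returns `4`; with changes three seconds apart, the window never holds five of them and the result is `None`.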
 
3. Possible causes of a link-flap error on the port (from Cisco's documentation):
Link-flap error
Link flap means that the interface continually goes up and down. The interface is put into the errdisabled state if it flaps more than five times in 10 seconds. The common cause of link flap is a Layer 1 issue such as a bad cable, duplex mismatch, or bad Gigabit Interface Converter (GBIC) card. Look at the console messages or the messages that were sent to the syslog server that state the reason for the port shutdown.
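         Further checks showed that, while the servers were rebooting, every port on the Cisco 7609 that connects to a business-system server logged multiple UP/Down alarms, not only the port for business system A. In other words, the behaviour is general and tied to the server reboot, so the link-flap error is unlikely to be caused by the bad cable or bad Gigabit Interface Converter (GBIC) card mentioned above. The speed and duplex mode of the connected ports were also checked and had already been set to fixed values, which makes a duplex mismatch unlikely as well.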
 
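Solution:
         When a Cisco 7609 router detects a link-flap error on a port, it puts the port into the err-disable state. Cisco devices also provide an automatic recovery mechanism to go with this protection.
         Once a port has been err-disabled because of a link-flap error, a delay can be configured after which the port is automatically returned to normal operation.
The command is: errdisable recovery cause link-flap
         By default, automatic recovery is disabled and the recovery delay is 300 seconds. It is enough to enable the feature and set a suitable recovery time (the delay is set with errdisable recovery interval followed by the number of seconds). The output below shows the state before the change: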
CISCO7609-1#sh errdisable recovery
ErrDisable Reason Timer Status
----------------- --------------
udld Disabled
bpduguard Disabled
security-violatio Disabled
channel-misconfig Disabled
vmps Disabled
pagp-flap Disabled
dtp-flap Disabled
link-flap Disabled
l2ptguard Disabled
psecure-violation Disabled
gbic-invalid Disabled
dhcp-rate-limit Disabled
mac-limit Disabled
unicast-flood Disabled
storm-control Disabled
arp-inspection Disabled
loopback Disabled
link-monitor-fail Disabled
oam-remote-failur Disabled
dot1ad-incomp-ety Disabled
dot1ad-incomp-tun Disabled
Timer interval: 300 seconds
Interfaces that will be enabled at the next timeout:
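
After enabling recovery, the same command can be re-run to confirm that the link-flap timer is on. As a rough sketch, the output can be checked programmatically; the function name `parse_errdisable_recovery` is hypothetical, and the parsing assumes the output keeps the layout of the excerpt above (one reason and one Enabled/Disabled status per line, plus a "Timer interval:" line).

```python
def parse_errdisable_recovery(output):
    """Parse 'show errdisable recovery' text.

    Returns (status_by_reason, interval_seconds), where status_by_reason
    maps each reason name to True (Enabled) or False (Disabled).
    Assumes the layout shown in the excerpt above.
    """
    status = {}
    interval = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Timer interval:"):
            # e.g. "Timer interval: 300 seconds"
            interval = int(line.split(":", 1)[1].split()[0])
            continue
        parts = line.split()
        # Reason rows have exactly two fields; headers and rules do not match.
        if len(parts) == 2 and parts[1] in ("Enabled", "Disabled"):
            status[parts[0]] = parts[1] == "Enabled"
    return status, interval
```

For the output above, `status["link-flap"]` is `False` and the interval is `300`; once `errdisable recovery cause link-flap` has been configured, the same check should report `True`.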