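Problem description:
After the storage side finished its configuration, the allocated size showed as 779G, but fdisk -l on the system reported 853G, which is a large discrepancy. At the same time, dmesg was reporting I/O errors, although the storage side raised no alarms.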
[27059.003909] end_request: I/O error, dev dm-8, sector 0
[27059.003930] end_request: I/O error, dev dm-8, sector 0
[27059.003940] end_request: I/O error, dev dm-8, sector 24
[27059.543520] end_request: I/O error, dev dm-5, sector 0
[27059.543533] end_request: I/O error, dev dm-5, sector 0
[27059.543544] end_request: I/O error, dev dm-5, sector 0
[27059.543556] end_request: I/O error, dev dm-5, sector 0
[27059.543571] end_request: I/O error, dev dm-5, sector 56
[27059.543583] end_request: I/O error, dev dm-5, sector 0
[27059.543593] end_request: I/O error, dev dm-5, sector 0
[27059.543613] end_request: I/O error, dev dm-5, sector 209715072
[27059.543628] end_request: I/O error, dev dm-5, sector 209715184
[27059.543638] end_request: I/O error, dev dm-5, sector 0
[27059.543649] end_request: I/O error, dev dm-5, sector 8
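To make the log easier to read, here is a minimal Python sketch that groups the failing sectors by device and converts a sector number to GiB. The function names are my own, and the sample is a few lines copied from the log above. The kernel counts these sectors in 512-byte units, so the two high sectors on dm-5 sit just under 100 GiB, which would be consistent with reads near the end of a 100G LUN (the multipath -ll output further down lists dm-5 as 100G).

```python
import re
from collections import defaultdict

SECTOR_BYTES = 512  # the kernel reports sectors in 512-byte units

# Matches lines such as:
# [27059.543613] end_request: I/O error, dev dm-5, sector 209715072
LINE_RE = re.compile(r"end_request: I/O error, dev (\S+), sector (\d+)")


def summarize_io_errors(dmesg_text):
    """Return {device: [failing sectors in log order]}."""
    sectors = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            sectors[match.group(1)].append(int(match.group(2)))
    return dict(sectors)


def sector_to_gib(sector):
    """Byte offset of a sector, expressed in GiB."""
    return sector * SECTOR_BYTES / 2**30


# A few lines taken from the dmesg output above.
sample = """\
[27059.003940] end_request: I/O error, dev dm-8, sector 24
[27059.543520] end_request: I/O error, dev dm-5, sector 0
[27059.543613] end_request: I/O error, dev dm-5, sector 209715072
[27059.543628] end_request: I/O error, dev dm-5, sector 209715184
"""

for dev, failed in summarize_io_errors(sample).items():
    print(dev, failed, "highest offset: %.4f GiB" % sector_to_gib(max(failed)))
```

On a live system the same function could be fed the text of dmesg instead of the sample string.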
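In addition, multipath was not installed on the system. fdisk -l could only see a single disk, sdb, yet /dev/ contained many others, from sde to sdi. For now I do not know how Ubuntu is aggregating these disks, whether through RDAC or some other mechanism.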
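As a quick sanity check on the 779G versus 853G gap, the sketch below converts between binary (GiB) and decimal (GB) units. It assumes, and this is only my assumption, that the storage console counts in GiB while this fdisk version prints decimal GB. Under that assumption neither conversion lands on the other figure, so a unit mismatch alone probably does not explain the difference.

```python
GIB = 2**30   # bytes in one binary gigabyte
GB = 10**9    # bytes in one decimal gigabyte


def gib_to_gb(gib):
    """Convert a size in GiB to decimal GB."""
    return gib * GIB / GB


def gb_to_gib(gb):
    """Convert a size in decimal GB to GiB."""
    return gb * GB / GIB


# Assumption: 779 is in GiB (storage side), 853 is in decimal GB (fdisk).
print(round(gib_to_gb(779), 1))  # 836.4
print(round(gb_to_gib(853), 1))  # 794.4
```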
Solution:
I installed multipath-tools and checked the state with the multipath -ll command:
sdf: checker msg is "directio checker reports path is down"
sdk: checker msg is "directio checker reports path is down"
3600a0b800021d4f100002b2850ff4892dm-6 IBM ,1722-600
[size=100G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:2:2 sdf 8:80 [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:2:2 sdk 8:160 [failed][faulty]
sdj: checker msg is "directio checker reports path is down"
sdn: checker msg is "directio checker reports path is down"
3600a0b800021d4f100002b2a50ff48cadm-8 IBM ,1722-600
[size=160G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:2:4 sdj 8:144 [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:2:4 sdn 8:208 [failed][faulty]
sde: checker msg is "directio checker reports path is down"
3600a0b800026541800000bca50ee104cdm-2 IBM ,1815 FASt
[size=777G][features=0][hwhandler=0]
\_ round-robin 0 [prio=1][active]
\_ 1:0:0:1 sdb 8:16 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:1:1 sde 8:64 [active][faulty]
sdl: checker msg is "directio checker reports path is down"
sdo: checker msg is "directio checker reports path is down"
3600a0b800021cf1200004df750ff52f3dm-7 IBM ,1722-600
[size=121G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:2:5 sdl 8:176 [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:2:5 sdo 8:224 [failed][faulty]
sdh: checker msg is "directio checker reports path is down"
sdm: checker msg is "directio checker reports path is down"
3600a0b800021cf1200004df650ff52bddm-4 IBM ,1722-600
[size=100G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:2:3 sdh 8:112 [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:2:3 sdm 8:192 [failed][faulty]
sdc: checker msg is "directio checker reports path is down"
sdg: checker msg is "directio checker reports path is down"
3600a0b800021d4f100002b2550ff485adm-5 IBM ,1722-600
[size=100G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:2:0 sdc 8:32 [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:2:0 sdg 8:96 [failed][faulty]
sdd: checker msg is "directio checker reports path is down"
sdi: checker msg is "directio checker reports path is down"
3600a0b800021cf1200004df550ff528ddm-3 IBM ,1722-600
[size=100G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:2:1 sdd 8:48 [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:2:1 sdi 8:128 [failed][faulty]
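Clearly these are different disks shared from more than one storage array: 777G on a DS4800 and 100G on a DS4300. You can work this out by comparing the Machine Type information (1815 and 1722 in the output above) between the IBM FAStT and DS model names.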
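The multipath -ll listing can be boiled down with a small Python sketch like the one below. It only targets the old output layout shown above, where the WWID runs straight into the dm-N name. The function name and the regular expressions are my own, and the model table contains nothing beyond the two machine types identified in this post.

```python
import re

# Machine Type -> model, as identified in this post (not a complete table).
MODEL_BY_MACHINE_TYPE = {"1722": "DS4300", "1815": "DS4800"}

# Map header, e.g. "3600a0b8...4892dm-6 IBM ,1722-600"
MAP_RE = re.compile(r"^([0-9a-f]+)\s*(dm-\d+)\s+(\S+)\s*,\s*(.+)$")
SIZE_RE = re.compile(r"\[size=(\w+)\]")
# Path line, e.g. "\_ 1:0:2:2 sdf 8:80 [failed][faulty]"
PATH_RE = re.compile(r"(\d+:\d+:\d+:\d+)\s+(\w+)\s+\d+:\d+\s+\[(\w+)\]\[(\w+)\]")


def parse_multipath(text):
    """Return one dict per multipath map with its size, model and paths."""
    maps = []
    for raw in text.splitlines():
        line = raw.strip()
        header = MAP_RE.match(line)
        if header:
            product = header.group(4).strip()
            maps.append({
                "wwid": header.group(1),
                "dm": header.group(2),
                "product": product,
                "model": MODEL_BY_MACHINE_TYPE.get(product[:4], "unknown"),
                "size": None,
                "paths": [],
            })
            continue
        if not maps:
            continue  # checker messages before the first map header
        size = SIZE_RE.search(line)
        if size:
            maps[-1]["size"] = size.group(1)
            continue
        path = PATH_RE.search(line)
        if path:
            # (device, device-mapper state, checker state)
            maps[-1]["paths"].append((path.group(2), path.group(3), path.group(4)))
    return maps


# Two maps copied from the output above (raw string keeps the backslashes).
sample = r"""3600a0b800021d4f100002b2850ff4892dm-6 IBM ,1722-600
[size=100G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:2:2 sdf 8:80 [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:2:2 sdk 8:160 [failed][faulty]
3600a0b800026541800000bca50ee104cdm-2 IBM ,1815 FASt
[size=777G][features=0][hwhandler=0]
\_ round-robin 0 [prio=1][active]
\_ 1:0:0:1 sdb 8:16 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 2:0:1:1 sde 8:64 [active][faulty]
"""

for m in parse_multipath(sample):
    print(m["dm"], m["size"], m["model"], m["paths"])
```

Seen this way, only sdb on the DS4800 map has a healthy path; every path to the 1722-600 LUNs is failed.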
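That explains the errors in dmesg: by default the system uses the directio checker to test each path periodically, and by default it tries to read the first sector, sector 0.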
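To illustrate the idea, here is a rough Python imitation of such a check: open the device with O_DIRECT and try to read one block from offset 0. This is not the multipath-tools implementation, only a sketch under my own assumptions. The function name, the 4096-byte block size and the `direct` switch (which lets the function be tried on an ordinary file) are all hypothetical. O_DIRECT is Linux-specific and needs an aligned buffer, which is why an anonymous mmap is used.

```python
import mmap
import os


def path_is_up(device, block_size=4096, direct=True):
    """Return True if one block can be read from the start of `device`."""
    flags = os.O_RDONLY | (os.O_DIRECT if direct else 0)
    try:
        fd = os.open(device, flags)
    except OSError:
        return False  # cannot even open the device node
    try:
        # An anonymous mmap is page-aligned, as O_DIRECT reads require.
        buf = mmap.mmap(-1, block_size)
        try:
            # The descriptor is freshly opened, so this reads from offset 0.
            return os.readv(fd, [buf]) == block_size
        finally:
            buf.close()
    except OSError:
        return False  # the read failed, comparable to an I/O error on sector 0
    finally:
        os.close(fd)


# Example (needs root and a real block device):
# print(path_is_up("/dev/sdf"))
```

A failed read of this kind on a dead path is what I would expect to surface as an I/O error at sector 0 in dmesg.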
Ubuntu documentation for reference: https://help.ubuntu.com/12.10/serverguide/dm-multipath-chapter.html
To be continued.