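Description:
An OSD in a Ceph cluster hit a write error and failed an assertion because the disk backing it is damaged.
Log output: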
2017-12-13 03:40:38.596764 7f5e32df2700 -1 filestore(/var/lib/ceph/osd/ceph-44) FileStore::_do_copy_range: write error at 1118208~-5, (5) Input/output error
os/filestore/FileStore.cc: In function 'int FileStore::_do_copy_range(int, int, uint64_t, uint64_t, uint64_t, bool)' thread 7f5e32df2700 time 2017-12-13 03:40:38.596798
os/filestore/FileStore.cc: 3628: FAILED assert(pos == end)
ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x562d99d0e6db]
2: (FileStore::_do_copy_range(int, int, unsigned long, unsigned long, unsigned long, bool)+0x18ec) [0x562d999ca53c]
3: (GenericFileStoreBackend::clone_range(int, int, unsigned long, unsigned long, unsigned long)+0x7b) [0x562d99a17b0b]
4: (FileStore::_do_clone_range(int, int, unsigned long, unsigned long, unsigned long)+0x80) [0x562d999c8a50]
5: (FileStore::_clone_range(coll_t const&, ghobject_t const&, ghobject_t const&, unsigned long, unsigned long, unsigned long, SequencerPosition const&)+0x1a1) [0x562d999f99c1]
6: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0x42ca) [0x562d99a0411a]
7: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, unsigned long, ThreadPool::TPHandle*)+0x3b) [0x562d99a06a8b]
8: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x2b5) [0x562d99a06d75]
9: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa6e) [0x562d99cffabe]
10: (ThreadPool::WorkThread::entry()+0x10) [0x562d99d009a0]
11: (()+0x8184) [0x7f5e4dbbb184]
12: (clone()+0x6d) [0x7f5e4bce4ffd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
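To pick this kind of failure out of a large OSD log, the write-error line can be matched with a regular expression. The following is only a minimal sketch: the function name `parse_write_error` and the returned field names are illustrative, and the pattern is written against the single log line shown above, so other Ceph versions may format the message differently.

```python
import re

# Pattern modelled on the one FileStore write-error line in the log above.
WRITE_ERROR = re.compile(
    r"filestore\((?P<path>[^)]+)\) FileStore::_do_copy_range: "
    r"write error at \S+, \((?P<errno>\d+)\) (?P<message>.+)"
)

def parse_write_error(line):
    """Return OSD data path, errno and message, or None if the line does not match."""
    match = WRITE_ERROR.search(line)
    if match is None:
        return None
    return {
        "osd_path": match.group("path"),
        "errno": int(match.group("errno")),
        "message": match.group("message").strip(),
    }

sample = (
    "2017-12-13 03:40:38.596764 7f5e32df2700 -1 "
    "filestore(/var/lib/ceph/osd/ceph-44) FileStore::_do_copy_range: "
    "write error at 1118208~-5, (5) Input/output error"
)
print(parse_write_error(sample))
```

Errno 5 is EIO on Linux, which is why the next step is to check the kernel log for disk errors.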
dmesg output:
[ 8674.029792] sd 0:0:2:0: [sdc] tag#21 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 8674.029800] sd 0:0:2:0: [sdc] tag#21 Sense Key : Medium Error [current]
[ 8674.029802] sd 0:0:2:0: [sdc] tag#21 Add. Sense: Unrecovered read error
[ 8674.029804] sd 0:0:2:0: [sdc] tag#21 CDB: Read(16) 88 00 00 00 00 00 02 8c 49 a8 00 00 01 00 00 00
[ 8674.029806] blk_update_request: critical medium error, dev sdc, sector 42748543
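The kernel lines above name the failing device (sdc) and the bad sector. A small sketch for extracting those two values from dmesg text is shown here; the function name `find_medium_errors` is illustrative, and the pattern only covers the "critical medium error" wording seen in this output.

```python
import re

# Matches kernel lines such as:
# blk_update_request: critical medium error, dev sdc, sector 42748543
MEDIUM_ERROR = re.compile(r"critical medium error, dev (\w+), sector (\d+)")

def find_medium_errors(dmesg_text):
    """Return a list of (device, sector) tuples found in dmesg output."""
    return [
        (match.group(1), int(match.group(2)))
        for match in MEDIUM_ERROR.finditer(dmesg_text)
    ]

sample = (
    "[ 8674.029804] sd 0:0:2:0: [sdc] tag#21 CDB: Read(16) 88 00 00 00 00 00 02 8c 49 a8 00 00 01 00 00 00\n"
    "[ 8674.029806] blk_update_request: critical medium error, dev sdc, sector 42748543\n"
)
print(find_medium_errors(sample))  # [('sdc', 42748543)]
```

In practice the input could come from `dmesg` itself. Mapping the reported device to an OSD (for example by checking what is mounted at `/var/lib/ceph/osd/ceph-44`) still has to be done on the host.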
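Resolution:
The OSD fails with an Input/output error and the kernel reports an unrecovered read error (medium error) on sdc, so the cause is judged to be a damaged disk under the OSD. Replace the disk that stores this OSD's data.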
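Before the disk is physically swapped, the failed OSD is usually taken out of the cluster. The sketch below only builds the command list for the manual removal sequence, without running anything. It is not taken from the original note: it assumes the OSD id 44 from the log path, a systemd-managed host (the `systemctl` unit name is an assumption, and older hosts may use upstart or sysvinit instead), and the helper name `build_osd_removal_commands` is hypothetical. Check the sequence against the Ceph documentation for your release before running it.

```python
def build_osd_removal_commands(osd_id):
    """Return the manual OSD removal steps as argument lists (nothing is executed)."""
    osd_id = str(osd_id)
    osd_name = "osd." + osd_id
    return [
        ["ceph", "osd", "out", osd_id],                # stop placing data on the OSD
        ["systemctl", "stop", "ceph-osd@" + osd_id],   # assumption: systemd host
        ["ceph", "osd", "crush", "remove", osd_name],  # remove it from the CRUSH map
        ["ceph", "auth", "del", osd_name],             # delete its authentication key
        ["ceph", "osd", "rm", osd_id],                 # remove the OSD from the cluster
    ]

# OSD id 44 is inferred from /var/lib/ceph/osd/ceph-44 in the log above.
for command in build_osd_removal_commands(44):
    print(" ".join(command))
```

Printing the commands first makes it possible to review them and run each one by hand, watching `ceph -s` between steps while the cluster rebalances.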