An OSD went down in a production Luminous cluster, and it was not immediately clear whether the disk had failed. I knew the FileStore troubleshooting routine well, but some of the details changed after we moved to BlueStore, and these OSDs are SSD-backed, hence this troubleshooting write-up.
Troubleshooting process
Locating the failed node
[root@demo-host ceph]# ceph osd tree|grep down
20 1.00000 osd.20 down 0 1.00000
[root@demo-host ceph]# ceph osd find 20
{
"osd": 20,
"ip": "192.168.8.124:6800/1298894",
"osd_fsid": "a99bc25c-4cf4-5429-9171-4084555af14b",
"crush_location": {
"host": "demo-host-ssd",
"media": "site1-rack1-ssd",
"mediagroup": "site1-ssd",
"root": "default"
}
}
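To script this lookup, the JSON printed by `ceph osd find` can be parsed directly. A minimal sketch against the sample output above; on a live cluster you would pipe `ceph osd find 20` straight into python3 instead of the captured string:

```shell
# Extract the host IP from "ceph osd find" JSON (sample captured above).
osd_find_json='{
  "osd": 20,
  "ip": "192.168.8.124:6800/1298894",
  "osd_fsid": "a99bc25c-4cf4-5429-9171-4084555af14b"
}'
echo "$osd_find_json" | python3 -c '
import json, sys
d = json.load(sys.stdin)
# drop the ":port/nonce" suffix to get a bare IP
print(d["ip"].split(":")[0])
'
# -> 192.168.8.124
```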
Log in to 192.168.8.124 from the output above and run "dmesg -T": the dm-0 device is reporting I/O errors.
[Wed Feb 27 16:24:02 2019] hpsa 0000:03:00.0: Acknowledging event: 0x40000032 (HP SSD Smart Path state change)
[Wed Feb 27 16:24:02 2019] hpsa 0000:03:00.0: hpsa_update_device_info: LV failed, device will be skipped.
[Wed Feb 27 16:24:02 2019] hpsa 0000:03:00.0: scsi 0:1:0:0: updated Direct-Access HP LOGICAL VOLUME RAID-1( 0) SSDSmartPathCap En Exp=1
[Wed Feb 27 16:24:02 2019] hpsa 0000:03:00.0: scsi 0:1:0:2: updated Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap En Exp=1
[Wed Feb 27 16:24:21 2019] buffer_io_error: 1 callbacks suppressed
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 468834288, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 468834288, async page read
[Wed Feb 27 16:24:22 2019] Buffer I/O error on dev dm-0, logical block 468834288, async page read
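This check can also be done non-interactively by tallying buffer I/O errors per device. A small sketch, demonstrated here against a captured dmesg excerpt rather than the live ring buffer (on the host, pipe `dmesg -T` in instead of the here-doc):

```shell
# Count "Buffer I/O error on dev <X>" lines per device; the device name is
# field 11 in "dmesg -T" output ("[Wed Feb 27 ... 2019] Buffer I/O error on dev dm-0, ...").
awk '/Buffer I\/O error/ { sub(/,$/, "", $11); devs[$11]++ }
     END { for (d in devs) printf "%s: %d errors\n", d, devs[d] }' <<'EOF'
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 468834288, async page read
EOF
# -> dm-0: 2 errors
```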
The OSD log shows "ERROR: osd init failed: (5) Input/output error".
[root@demo-host ceph]# tail -100 /var/log/ceph/ceph-osd.20.log
2019-02-27 16:31:34.492858 7fc0f33aed80 1 bdev(0x55d1a3c16000 /var/lib/ceph/osd/ceph-20/block) open size 1920345309184 (0x1bf1d800000, 1.75TiB) block_size 4096 (4KiB) non-rotational
2019-02-27 16:31:34.492906 7fc0f33aed80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:34.492917 7fc0f33aed80 1 bdev(0x55d1a3c16000 /var/lib/ceph/osd/ceph-20/block) close
2019-02-27 16:31:34.751175 7fc0f33aed80 1 bluestore(/var/lib/ceph/osd/ceph-20) _mount path /var/lib/ceph/osd/ceph-20
2019-02-27 16:31:34.751738 7fc0f33aed80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:34.751776 7fc0f33aed80 1 bdev create path /var/lib/ceph/osd/ceph-20/block type kernel
2019-02-27 16:31:34.751779 7fc0f33aed80 1 bdev(0x55d1a3c16200 /var/lib/ceph/osd/ceph-20/block) open path /var/lib/ceph/osd/ceph-20/block
2019-02-27 16:31:34.751978 7fc0f33aed80 1 bdev(0x55d1a3c16200 /var/lib/ceph/osd/ceph-20/block) open size 1920345309184 (0x1bf1d800000, 1.75TiB) block_size 4096 (4KiB) non-rotational
2019-02-27 16:31:34.752485 7fc0f33aed80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:34.752495 7fc0f33aed80 1 bdev(0x55d1a3c16200 /var/lib/ceph/osd/ceph-20/block) close
2019-02-27 16:31:35.009776 7fc0f33aed80 -1 osd.20 0 OSD:init: unable to mount object store
2019-02-27 16:31:35.009796 7fc0f33aed80 -1 ** ERROR: osd init failed: (5) Input/output error
2019-02-27 16:31:55.220715 7ff4da40cd80 0 set uid:gid to 167:167 (ceph:ceph)
2019-02-27 16:31:55.220746 7ff4da40cd80 0 ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable), process ceph-osd, pid 1564222
2019-02-27 16:31:55.221547 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.221977 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.222331 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.222747 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.226811 7ff4da40cd80 0 pidfile_write: ignore empty --pid-file
2019-02-27 16:31:55.235463 7ff4da40cd80 0 load: jerasure load: lrc load: isa
2019-02-27 16:31:55.235531 7ff4da40cd80 1 bdev create path /var/lib/ceph/osd/ceph-20/block type kernel
2019-02-27 16:31:55.235538 7ff4da40cd80 1 bdev(0x5608d71b6000 /var/lib/ceph/osd/ceph-20/block) open path /var/lib/ceph/osd/ceph-20/block
2019-02-27 16:31:55.236101 7ff4da40cd80 1 bdev(0x5608d71b6000 /var/lib/ceph/osd/ceph-20/block) open size 1920345309184 (0x1bf1d800000, 1.75TiB) block_size 4096 (4KiB) non-rotational
2019-02-27 16:31:55.236467 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.236478 7ff4da40cd80 1 bdev(0x5608d71b6000 /var/lib/ceph/osd/ceph-20/block) close
2019-02-27 16:31:55.494201 7ff4da40cd80 1 bluestore(/var/lib/ceph/osd/ceph-20) _mount path /var/lib/ceph/osd/ceph-20
2019-02-27 16:31:55.494686 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.494724 7ff4da40cd80 1 bdev create path /var/lib/ceph/osd/ceph-20/block type kernel
2019-02-27 16:31:55.494727 7ff4da40cd80 1 bdev(0x5608d71b6200 /var/lib/ceph/osd/ceph-20/block) open path /var/lib/ceph/osd/ceph-20/block
2019-02-27 16:31:55.494921 7ff4da40cd80 1 bdev(0x5608d71b6200 /var/lib/ceph/osd/ceph-20/block) open size 1920345309184 (0x1bf1d800000, 1.75TiB) block_size 4096 (4KiB) non-rotational
2019-02-27 16:31:55.495323 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.495335 7ff4da40cd80 1 bdev(0x5608d71b6200 /var/lib/ceph/osd/ceph-20/block) close
2019-02-27 16:31:55.758790 7ff4da40cd80 -1 osd.20 0 OSD:init: unable to mount object store
2019-02-27 16:31:55.758804 7ff4da40cd80 -1 ** ERROR: osd init failed: (5) Input/output error
Next, confirm that dm-0 really belongs to osd.20. The familiar trick of following the symlink chain is shown below; note the warning about a missing PV.
[root@demo-host ceph]# ls -l /var/lib/ceph/osd/ceph-20/
total 48
-rw-r--r-- 1 ceph ceph 456 Feb 25 19:56 activate.monmap
lrwxrwxrwx 1 ceph ceph 93 Feb 25 19:56 block -> /dev/ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37/osd-block-a99bc25c-4cf4-5429-9171-4084555af14b # note the LV and VG it points to
-rw-r--r-- 1 ceph ceph 2 Feb 25 19:56 bluefs
-rw-r--r-- 1 ceph ceph 37 Feb 25 19:56 ceph_fsid
-rw-r--r-- 1 ceph ceph 37 Feb 25 19:56 fsid
-rw------- 1 ceph ceph 56 Feb 25 19:56 keyring
-rw-r--r-- 1 ceph ceph 8 Feb 25 19:56 kv_backend
-rw-r--r-- 1 ceph ceph 21 Feb 25 19:56 magic
-rw-r--r-- 1 ceph ceph 4 Feb 25 19:56 mkfs_done
-rw-r--r-- 1 ceph ceph 41 Feb 25 19:56 osd_key
-rw-r--r-- 1 ceph ceph 6 Feb 25 19:56 ready
-rw-r--r-- 1 ceph ceph 10 Feb 25 19:56 type
-rw-r--r-- 1 ceph ceph 3 Feb 25 19:56 whoami
[root@demo-host ceph]# vgs
WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
VG #PV #LV #SN Attr VSize VFree
ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd 1 1 0 wz--n- <5.46t 0
ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 1 1 0 wz--n- <5.46t 0
ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 1 1 0 wz-pn- <1.75t 0 # note the "p" (partial) attribute
ceph-782b8301-ed74-4809-b39c-755bebd86a81 1 1 0 wz--n- <1.75t 0
ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 1 1 0 wz--n- <5.46t 0
ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 1 1 0 wz--n- <5.46t 0
ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd 1 1 0 wz--n- <5.46t 0
ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 1 1 0 wz--n- <5.46t 0
ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 1 1 0 wz--n- <5.46t 0
ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 1 1 0 wz--n- <5.46t 0
[root@demo-host ceph]# lvs
WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
osd-block-737138bb-53f8-5f20-b131-d776fec5e62e ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd -wi-ao---- <5.46t
osd-block-31724c12-5cab-54ba-a0ea-f7bd0c5bdb39 ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 -wi-ao---- <5.46t
osd-block-a99bc25c-4cf4-5429-9171-4084555af14b ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 -wi-a---p- <1.75t # note the "p" (partial) attribute
osd-block-8505d8f5-4ea3-59d0-870e-59d360f5015c ceph-782b8301-ed74-4809-b39c-755bebd86a81 -wi-ao---- <1.75t
osd-block-e9a70833-590b-5993-9638-179baaa782a5 ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 -wi-ao---- <5.46t
osd-block-31541688-fb32-5337-af90-09d185613075 ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 -wi-ao---- <5.46t
osd-block-df6cd15a-1b5c-5443-a062-50fa64fa9d07 ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd -wi-ao---- <5.46t
osd-block-b28a126d-0a7b-503d-80c5-7cbaa04d0a9b ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 -wi-ao---- <5.46t
osd-block-377ff375-d2bf-5ad9-94b4-2127b6dcf9e7 ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 -wi-ao---- <5.46t
osd-block-4f147edf-9cb7-5263-bec0-3fa34dc0373f ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 -wi-ao---- <5.46t
[root@demo-host ceph]# ls -l /dev/ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37/osd-block-a99bc25c-4cf4-5429-9171-4084555af14b
lrwxrwxrwx 1 ceph ceph 7 Feb 27 16:32 /dev/ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37/osd-block-a99bc25c-4cf4-5429-9171-4084555af14b -> ../dm-0 # confirmed: the OSD's block device is dm-0
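The whole block -> LV -> dm-N chain can be resolved in one step with `readlink -f`, which canonicalizes every symlink along the way. A sketch demonstrated on throwaway symlinks so it runs anywhere; on the real host you would simply run `readlink -f /var/lib/ceph/osd/ceph-20/block`:

```shell
# Build a stand-in two-hop symlink chain mimicking
#   /var/lib/ceph/osd/ceph-20/block -> <LV symlink> -> /dev/dm-0
# and resolve it to the final target in one command.
t=$(mktemp -d)
touch "$t/dm-0"                      # stand-in for the real /dev/dm-0
ln -s "$t/dm-0" "$t/osd-block"       # stand-in for the LV symlink
ln -s "$t/osd-block" "$t/block"      # stand-in for .../ceph-20/block
readlink -f "$t/block"               # prints the path ending in /dm-0
rm -rf "$t"
```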
Next, check the RAID controller to confirm the physical drive state.
[root@demo-host ceph]# hpssacli ctrl slot=0 show config detail
Array: B
Interface Type: Solid State SATA
Unused Space: 0 MB (0.0%)
Used Space: 1.7 TB (100.0%)
Status: Failed Physical Drive # the drive is gone, matching the LVM warning
MultiDomain Status: OK
Array Type: Data
HPE SSD Smart Path: enable
Warning: One of the drives on this array have failed or has been removed.
Logical Drive: 2
Size: 1.7 TB
Fault Tolerance: 0
Heads: 255
Sectors Per Track: 32
Cylinders: 65535
Strip Size: 256 KB
Full Stripe Size: 256 KB
Status: Failed # the logical drive has failed
MultiDomain Status: OK
Caching: Disabled
Unique Identifier: 600508B1001C3FBF225890CDE3612E98
Logical Drive Label: 0606978APVYKH0BRH9507N6082
Drive Type: Data
LD Acceleration Method: HPE SSD Smart Path
physicaldrive 2I:4:1 # note the slot location (port:box:bay) for later
Port: 2I
Box: 4
Bay: 1
Status: Failed
Last Failure Reason: Hot removed
Drive Type: Data Drive
Interface Type: Solid State SATA
Size: 1920.3 GB
Drive exposed to OS: False
Native Block Size: 4096
Firmware Revision: 4IYVHPG1
Serial Number: BTYS802201ZJ1P9DGN
Model: ATA VK001920GWJPH
SATA NCQ Capable: True
SATA NCQ Enabled: True
Maximum Temperature (C): 41
Usage remaining: 99.80%
Power On Hours: 4868 # a short-lived SSD indeed
Estimated Life Remaining based on workload to date: 101213 days
SSD Smart Trip Wearout: False
PHY Count: 1
PHY Transfer Rate: Unknown
Drive Authentication Status: Not Applicable
Sanitize Erase Supported: False
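When there are many drives, picking the failed ones out of the controller report can be scripted. A sketch that pairs each `physicaldrive` slot ID with a following `Status: Failed` line, demonstrated on an excerpt of the output above (on the host, pipe the real `hpssacli ctrl slot=0 show config detail` output in instead of the here-doc):

```shell
# Print the port:box:bay of every physical drive whose Status is Failed.
awk '/physicaldrive/      { pd = $2 }
     pd && /Status: Failed/ { print pd; pd = "" }' <<'EOF'
      physicaldrive 2I:4:1
         Port: 2I
         Box: 4
         Bay: 1
         Status: Failed
EOF
# -> 2I:4:1
```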
At this point it is fairly certain the SSD has failed. Next, clean up the stale LVM metadata left in the system, starting by umounting the corresponding directory.
[root@demo-host ceph]# mount -l|grep ceph
tmpfs on /var/lib/ceph/osd/ceph-20 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-21 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-22 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-23 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-24 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-25 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-26 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-27 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-28 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-29 type tmpfs (rw,relatime)
[root@demo-host ceph]# umount /var/lib/ceph/osd/ceph-20
[root@demo-host ceph]# mount -l|grep ceph
tmpfs on /var/lib/ceph/osd/ceph-21 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-22 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-23 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-24 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-25 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-26 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-27 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-28 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-29 type tmpfs (rw,relatime)
Check the VG and LV state again; note that the tools still warn about the missing PV.
[root@demo-host ceph]# vgs
WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
VG #PV #LV #SN Attr VSize VFree
ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd 1 1 0 wz--n- <5.46t 0
ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 1 1 0 wz--n- <5.46t 0
ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 1 1 0 wz-pn- <1.75t 0
ceph-782b8301-ed74-4809-b39c-755bebd86a81 1 1 0 wz--n- <1.75t 0
ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 1 1 0 wz--n- <5.46t 0
ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 1 1 0 wz--n- <5.46t 0
ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd 1 1 0 wz--n- <5.46t 0
ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 1 1 0 wz--n- <5.46t 0
ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 1 1 0 wz--n- <5.46t 0
ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 1 1 0 wz--n- <5.46t 0
[root@demo-host ceph]# lvs
WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
osd-block-737138bb-53f8-5f20-b131-d776fec5e62e ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd -wi-ao---- <5.46t
osd-block-31724c12-5cab-54ba-a0ea-f7bd0c5bdb39 ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 -wi-ao---- <5.46t
osd-block-a99bc25c-4cf4-5429-9171-4084555af14b ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 -wi-a---p- <1.75t
osd-block-8505d8f5-4ea3-59d0-870e-59d360f5015c ceph-782b8301-ed74-4809-b39c-755bebd86a81 -wi-ao---- <1.75t
osd-block-e9a70833-590b-5993-9638-179baaa782a5 ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 -wi-ao---- <5.46t
osd-block-31541688-fb32-5337-af90-09d185613075 ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 -wi-ao---- <5.46t
osd-block-df6cd15a-1b5c-5443-a062-50fa64fa9d07 ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd -wi-ao---- <5.46t
osd-block-b28a126d-0a7b-503d-80c5-7cbaa04d0a9b ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 -wi-ao---- <5.46t
osd-block-377ff375-d2bf-5ad9-94b4-2127b6dcf9e7 ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 -wi-ao---- <5.46t
osd-block-4f147edf-9cb7-5263-bec0-3fa34dc0373f ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 -wi-ao---- <5.46t
Following the LVM hierarchy (LV -> VG -> PV), try removing the LV and VG first; note that stale entries still remain afterwards.
[root@demo-host ceph]# vgremove ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37
WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
WARNING: 1 physical volumes are currently missing from the system.
Do you really want to remove volume group "ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37" containing 1 logical volumes? [y/n]: y
Do you really want to remove active logical volume ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37/osd-block-a99bc25c-4cf4-5429-9171-4084555af14b? [y/n]: y
Aborting vg_write: No metadata areas to write to!
[root@demo-host ceph]# lvs
WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
osd-block-737138bb-53f8-5f20-b131-d776fec5e62e ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd -wi-ao---- <5.46t
osd-block-31724c12-5cab-54ba-a0ea-f7bd0c5bdb39 ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 -wi-ao---- <5.46t
osd-block-a99bc25c-4cf4-5429-9171-4084555af14b ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 -wi-----p- <1.75t
osd-block-8505d8f5-4ea3-59d0-870e-59d360f5015c ceph-782b8301-ed74-4809-b39c-755bebd86a81 -wi-ao---- <1.75t
osd-block-e9a70833-590b-5993-9638-179baaa782a5 ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 -wi-ao---- <5.46t
osd-block-31541688-fb32-5337-af90-09d185613075 ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 -wi-ao---- <5.46t
osd-block-df6cd15a-1b5c-5443-a062-50fa64fa9d07 ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd -wi-ao---- <5.46t
osd-block-b28a126d-0a7b-503d-80c5-7cbaa04d0a9b ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 -wi-ao---- <5.46t
osd-block-377ff375-d2bf-5ad9-94b4-2127b6dcf9e7 ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 -wi-ao---- <5.46t
osd-block-4f147edf-9cb7-5263-bec0-3fa34dc0373f ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 -wi-ao---- <5.46t
[root@demo-host ceph]# vgs
WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
VG #PV #LV #SN Attr VSize VFree
ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd 1 1 0 wz--n- <5.46t 0
ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 1 1 0 wz--n- <5.46t 0
ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 1 1 0 wz-pn- <1.75t 0
ceph-782b8301-ed74-4809-b39c-755bebd86a81 1 1 0 wz--n- <1.75t 0
ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 1 1 0 wz--n- <5.46t 0
ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 1 1 0 wz--n- <5.46t 0
ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd 1 1 0 wz--n- <5.46t 0
ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 1 1 0 wz--n- <5.46t 0
ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 1 1 0 wz--n- <5.46t 0
ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 1 1 0 wz--n- <5.46t 0
Check the PV state: there is an [unknown] device, and the warning names the lost PV "BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu".
[root@demo-host ceph]# pvs
WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
PV VG Fmt Attr PSize PFree
/dev/sdc ceph-782b8301-ed74-4809-b39c-755bebd86a81 lvm2 a-- <1.75t 0
/dev/sdd ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd lvm2 a-- <5.46t 0
/dev/sde ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd lvm2 a-- <5.46t 0
/dev/sdf ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 lvm2 a-- <5.46t 0
/dev/sdg ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 lvm2 a-- <5.46t 0
/dev/sdh ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 lvm2 a-- <5.46t 0
/dev/sdi ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 lvm2 a-- <5.46t 0
/dev/sdj ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 lvm2 a-- <5.46t 0
/dev/sdk ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 lvm2 a-- <5.46t 0
[unknown] ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 lvm2 a-m <1.75t 0
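The "m" in the PV attribute column (`a-m` above) is what marks a missing device, so this check can be automated. A sketch filtering `pvs`-style rows for the missing flag, demonstrated on sample rows matching the output above (on the host you would feed it `pvs --noheadings -o pv_name,vg_name,pv_attr`):

```shell
# Report any PV whose attr string (3rd column) contains "m" = missing.
awk '$3 ~ /m/ { printf "missing PV in VG %s (device %s)\n", $2, $1 }' <<'EOF'
/dev/sdc   ceph-782b8301-ed74-4809-b39c-755bebd86a81 a--
[unknown]  ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 a-m
EOF
# -> missing PV in VG ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 (device [unknown])
```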
Removing the PV by hand won't work here. What does work is `pvscan --cache`, which refreshes the LVM device cache; afterwards the stale PV, VG and LV entries are all gone.
[root@demo-host ceph]# pvscan --cache
[root@demo-host ceph]# pvs
PV VG Fmt Attr PSize PFree
/dev/sdc ceph-782b8301-ed74-4809-b39c-755bebd86a81 lvm2 a-- <1.75t 0
/dev/sdd ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd lvm2 a-- <5.46t 0
/dev/sde ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd lvm2 a-- <5.46t 0
/dev/sdf ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 lvm2 a-- <5.46t 0
/dev/sdg ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 lvm2 a-- <5.46t 0
/dev/sdh ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 lvm2 a-- <5.46t 0
/dev/sdi ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 lvm2 a-- <5.46t 0
/dev/sdj ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 lvm2 a-- <5.46t 0
/dev/sdk ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 lvm2 a-- <5.46t 0
[root@demo-host ceph]# vgs
VG #PV #LV #SN Attr VSize VFree
ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd 1 1 0 wz--n- <5.46t 0
ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 1 1 0 wz--n- <5.46t 0
ceph-782b8301-ed74-4809-b39c-755bebd86a81 1 1 0 wz--n- <1.75t 0
ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 1 1 0 wz--n- <5.46t 0
ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 1 1 0 wz--n- <5.46t 0
ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd 1 1 0 wz--n- <5.46t 0
ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 1 1 0 wz--n- <5.46t 0
ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 1 1 0 wz--n- <5.46t 0
ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 1 1 0 wz--n- <5.46t 0
[root@demo-host ceph]# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
osd-block-737138bb-53f8-5f20-b131-d776fec5e62e ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd -wi-ao---- <5.46t
osd-block-31724c12-5cab-54ba-a0ea-f7bd0c5bdb39 ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 -wi-ao---- <5.46t
osd-block-8505d8f5-4ea3-59d0-870e-59d360f5015c ceph-782b8301-ed74-4809-b39c-755bebd86a81 -wi-ao---- <1.75t
osd-block-e9a70833-590b-5993-9638-179baaa782a5 ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 -wi-ao---- <5.46t
osd-block-31541688-fb32-5337-af90-09d185613075 ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 -wi-ao---- <5.46t
osd-block-df6cd15a-1b5c-5443-a062-50fa64fa9d07 ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd -wi-ao---- <5.46t
osd-block-b28a126d-0a7b-503d-80c5-7cbaa04d0a9b ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 -wi-ao---- <5.46t
osd-block-377ff375-d2bf-5ad9-94b4-2127b6dcf9e7 ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 -wi-ao---- <5.46t
osd-block-4f147edf-9cb7-5263-bec0-3fa34dc0373f ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 -wi-ao---- <5.46t
Finally, to be on the safe side, light the drive's fault LED manually and ask the data-center staff to replace the disk.
[root@demo-host ceph]# hpssacli ctrl slot=0 pd 2I:4:1 modify led=on
Summary
BlueStore is built on LVM, so before pulling a physical disk, always follow the LVM hierarchy and clean up the stale metadata; otherwise the leftovers will cause management confusion when the replacement disk is installed.
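As a checklist, the cleanup order used above can be captured in a small dry-run script. The VG and LV names are from this incident, and the script only echoes the commands; drop the echo wrapper to execute for real. Note that, as seen above, with the PV physically gone `vgremove` may abort, and `pvscan --cache` is what actually clears the stale entries:

```shell
# Dry-run of the LVM cleanup order for a dead BlueStore OSD:
# umount -> lvremove -> vgremove -> refresh the LVM device cache.
OSD_ID=20
VG=ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37
LV=osd-block-a99bc25c-4cf4-5429-9171-4084555af14b

run() { echo "would run: $*"; }   # replace the echo with "$@" to execute

run umount /var/lib/ceph/osd/ceph-$OSD_ID
run lvremove -y $VG/$LV
run vgremove -y $VG
run pvscan --cache                # clears entries for the vanished PV
```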