
ceph-13.2.5: removing an OSD and creating a new one


Background

While setting up the cluster, I had created an LVM-backed OSD on node2 (VG1/MyLvm, 10 GiB). Since it was LVM, I figured it could simply be grown, so I added another 10 GiB virtual disk and extended it into VG1/MyLvm.
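The expansion itself was done roughly like this (a rough sketch of the commands; as the lsblk output below shows, the new disk came up as /dev/sdc with a single partition /dev/sdc1):

$ sudo pvcreate /dev/sdc1                    # turn the new partition into a physical volume
$ sudo vgextend VG1 /dev/sdc1                # add it to the existing volume group VG1
$ sudo lvextend -l +100%FREE /dev/VG1/MyLvm  # grow MyLvm over all the newly added extents

The result: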

$ lsblk

NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 20G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 19G 0 part
├─centos-root 253:0 0 17G 0 lvm /
└─centos-swap 253:1 0 2G 0 lvm [SWAP]
sdb 8:16 0 10G 0 disk
└─sdb1 8:17 0 10G 0 part
└─VG1-MyLvm 253:2 0 20G 0 lvm
sdc 8:32 0 10G 0 disk
└─sdc1 8:33 0 10G 0 part
└─VG1-MyLvm 253:2 0 20G 0 lvm
sr0 11:0 1 1024M 0 rom

$ sudo vgdisplay

--- Volume group ---
VG Name centos
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 3
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 2
Open LV 2
Max PV 0
Cur PV 1
Act PV 1
VG Size <19.00 GiB
PE Size 4.00 MiB
Total PE 4863
Alloc PE / Size 4863 / <19.00 GiB
Free PE / Size 0 / 0
VG UUID fEJqFf-qaqN-ZeZe-3FG1-Jeya-SyAi-WCtB0u

--- Volume group ---
VG Name VG1
System ID
Format lvm2
Metadata Areas 2
Metadata Sequence No 39
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 1
Open LV 1
Max PV 0
Cur PV 2
Act PV 2
VG Size 19.99 GiB
PE Size 4.00 MiB
Total PE 5118
Alloc PE / Size 5118 / 19.99 GiB
Free PE / Size 0 / 0
VG UUID OMx43d-hsd1-80vh-DCvO-870t-zPSn-123yoU

First of all, you can see that the logical volume MyLvm has indeed grown to 20 GiB.

$ sudo ceph osd df tree

ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
-1 0.02939 - 40 GiB 13 GiB 27 GiB 33.56 1.00 - root default
-3 0.00980 - 10 GiB 1.1 GiB 8.9 GiB 11.42 0.34 - host node1
0 hdd 0.00980 1.00000 10 GiB 1.1 GiB 8.9 GiB 11.42 0.34 68 osd.0
-5 0.00980 - 20 GiB 11 GiB 8.9 GiB 55.71 1.66 - host node2
1 hdd 0.00980 1.00000 20 GiB 11 GiB 8.9 GiB 55.71 1.66 0 osd.1
-7 0.00980 - 10 GiB 1.1 GiB 8.9 GiB 11.42 0.34 - host node3
2 hdd 0.00980 1.00000 10 GiB 1.1 GiB 8.9 GiB 11.42 0.34 68 osd.2
TOTAL 40 GiB 13 GiB 27 GiB 33.56

You can see that osd.1's capacity has indeed become 20 GiB, but 11 GiB of it is already in use: the 10 GiB that was just added is all marked as used.

I don't fully understand why this happens (presumably BlueStore sizes itself at mkfs time and does not automatically pick up space added to the underlying LV afterwards), so expanding the OSD this way is probably just wrong. To reclaim those 10 GiB, I decided to remove osd.1, wipe the LVM volume, and add it back.


Removing the OSD

This follows the official guide, ADDING/REMOVING OSDS.

The steps are:

  1. Mark the OSD out.
  2. Stop the running OSD daemon.
  3. Remove it from the CRUSH map.
  4. Delete its authentication key.
  5. Remove the OSD.

The commands:

$ sudo ceph osd out osd.1

marked out osd.1.

$ sudo systemctl stop ceph-osd@1

$ sudo ceph osd crush remove osd.1

removed item id 1 name 'osd.1' from crush map

$ sudo ceph auth del osd.1

updated

$ sudo ceph osd rm 1

removed osd.1

Check the cluster status:

$ sudo ceph -s

cluster:
id: 082d1625-1d68-4261-82c8-3fe9fe3ef489
health: HEALTH_WARN
Degraded data redundancy: 256/768 objects degraded (33.333%), 32 pgs degraded, 68 pgs undersized

services:
mon: 3 daemons, quorum node1,node2,node3
mgr: node1(active), standbys: node3, node2
mds: fs_test-1/1/1 up {0=node1=up:active}
osd: 2 osds: 2 up, 2 in
rgw: 1 daemon active

data:
pools: 8 pools, 68 pgs
objects: 256 objects, 136 MiB
usage: 2.3 GiB used, 18 GiB / 20 GiB avail
pgs: 256/768 objects degraded (33.333%)
36 active+undersized
32 active+undersized+degraded

$ sudo ceph osd df tree

ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
-1 0.01959 - 10 GiB 1.1 GiB 8.8 GiB 0 0 - root default
-3 0.00980 - 10 GiB 1.1 GiB 8.8 GiB 11.48 1.00 - host node1
0 hdd 0.00980 1.00000 10 GiB 1.1 GiB 8.8 GiB 11.48 1.00 68 osd.0
-5 0 - 0 B 0 B 0 B 0 0 - host node2
-7 0.00980 - 10 GiB 1.1 GiB 8.8 GiB 11.48 1.00 - host node3
2 hdd 0.00980 1.00000 10 GiB 1.1 GiB 8.8 GiB 11.48 1.00 68 osd.2
TOTAL 20 GiB 2.3 GiB 18 GiB 11.48

You can see that osd.1 has been removed. The HEALTH_WARN here is simply because only two OSDs are left, so the three-way replicated PGs can no longer place all of their copies and show up as undersized/degraded.


Creating the OSD


Method 1: using ceph-deploy

Since this is effectively a disk being re-used after being pulled from the cluster, first wipe its contents, using ceph-volume:

$ sudo ceph-volume lvm zap VG1/MyLvm

--> Zapping: /dev/VG1/MyLvm
--> Unmounting /var/lib/ceph/osd/ceph-1
Running command: /bin/umount -v /var/lib/ceph/osd/ceph-1
stderr: umount: /var/lib/ceph/osd/ceph-1 (tmpfs) unmounted
Running command: /usr/sbin/wipefs --all /dev/VG1/MyLvm
Running command: /bin/dd if=/dev/zero of=/dev/VG1/MyLvm bs=1M count=10
Running command: /usr/sbin/lvchange --deltag ceph.type=block /dev/VG1/MyLvm
stdout: Logical volume VG1/MyLvm changed.
Running command: /usr/sbin/lvchange --deltag ceph.osd_id=1 /dev/VG1/MyLvm
stdout: Logical volume VG1/MyLvm changed.
Running command: /usr/sbin/lvchange --deltag ceph.cluster_fsid=082d1625-1d68-4261-82c8-3fe9fe3ef489 /dev/VG1/MyLvm
stdout: Logical volume VG1/MyLvm changed.
Running command: /usr/sbin/lvchange --deltag ceph.cluster_name=ceph /dev/VG1/MyLvm
stdout: Logical volume VG1/MyLvm changed.
Running command: /usr/sbin/lvchange --deltag ceph.osd_fsid=a59044b3-ff4c-4a99-978f-1336ff4504a0 /dev/VG1/MyLvm
stdout: Logical volume VG1/MyLvm changed.
Running command: /usr/sbin/lvchange --deltag ceph.encrypted=0 /dev/VG1/MyLvm
stdout: Logical volume VG1/MyLvm changed.
Running command: /usr/sbin/lvchange --deltag ceph.cephx_lockbox_secret= /dev/VG1/MyLvm
stdout: Logical volume VG1/MyLvm changed.
Running command: /usr/sbin/lvchange --deltag ceph.block_uuid=B553Ss-LYdv-3FEW-q5u9-XKph-qn2v-y3ecu0 /dev/VG1/MyLvm
stdout: Logical volume VG1/MyLvm changed.
Running command: /usr/sbin/lvchange --deltag ceph.block_device=/dev/VG1/MyLvm /dev/VG1/MyLvm
stdout: Logical volume VG1/MyLvm changed.
Running command: /usr/sbin/lvchange --deltag ceph.vdo=0 /dev/VG1/MyLvm
stdout: Logical volume VG1/MyLvm changed.
Running command: /usr/sbin/lvchange --deltag ceph.crush_device_class=None /dev/VG1/MyLvm
stdout: Logical volume VG1/MyLvm changed.
--> Zapping successful for: <LV: /dev/VG1/MyLvm>

Create the OSD with ceph-deploy:

$ ceph-deploy osd create --data VG1/MyLvm node2

......
......
[node2][DEBUG ] --> ceph-volume lvm activate successful for osd ID: 1
[node2][DEBUG ] --> ceph-volume lvm create successful for: VG1/MyLvm
[node2][INFO ] checking OSD status...
[node2][DEBUG ] find the location of an executable
[node2][INFO ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host node2 is now ready for osd use.

Check the cluster status again:

$ sudo ceph -s
cluster:
id: 082d1625-1d68-4261-82c8-3fe9fe3ef489
health: HEALTH_OK

services:
mon: 3 daemons, quorum node1,node2,node3
mgr: node1(active), standbys: node3, node2
mds: fs_test-1/1/1 up {0=node1=up:active}
osd: 3 osds: 3 up, 3 in
rgw: 1 daemon active

data:
pools: 8 pools, 68 pgs
objects: 256 objects, 136 MiB
usage: 3.5 GiB used, 37 GiB / 40 GiB avail
pgs: 68 active+clean

$ sudo ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
-1 0.03908 - 40 GiB 3.3 GiB 37 GiB 8.34 1.00 - root default
-3 0.00980 - 10 GiB 1.1 GiB 8.8 GiB 11.49 1.38 - host node1
0 hdd 0.00980 1.00000 10 GiB 1.1 GiB 8.8 GiB 11.49 1.38 68 osd.0
-5 0.01949 - 20 GiB 1.0 GiB 19 GiB 5.20 0.62 - host node2
1 hdd 0.01949 1.00000 20 GiB 1.0 GiB 19 GiB 5.20 0.62 55 osd.1
-7 0.00980 - 10 GiB 1.1 GiB 8.8 GiB 11.49 1.38 - host node3
2 hdd 0.00980 1.00000 10 GiB 1.1 GiB 8.8 GiB 11.49 1.38 68 osd.2
TOTAL 40 GiB 3.3 GiB 37 GiB 8.34
MIN/MAX VAR: 0.62/1.38 STDDEV: 3.14

The cluster is healthy again, and osd.1 now has 19 GiB available.


Method 2: manual creation

I tried the manual procedure from the official guide, but it failed:

$ sudo mount -o user_xattr /dev/VG1/MyLvm /var/lib/ceph/osd/ceph-1

mount: wrong fs type, bad option, bad superblock on /dev/mapper/VG1-MyLvm,
missing codepage or helper program, or other error

In some cases useful info is found in syslog - try
dmesg | tail or so.


$ sudo mount /dev/VG1/MyLvm /var/lib/ceph/osd/ceph-1


$ sudo ceph-osd -i 1 --mkfs --mkkey

2019-07-19 16:54:52.691 7efc821c7d80 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-1/keyring: (2) No such file or directory
2019-07-19 16:54:52.691 7efc821c7d80 -1 monclient: ERROR: missing keyring, cannot use cephx for authentication
failed to fetch mon config (--no-mon-config to skip)

Giving up on the manual approach for now.
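A plausible explanation for the last error is that ceph-osd here tries to fetch its configuration from the monitors at startup and needs a keyring for cephx, which does not exist yet at this stage. An untested follow-up, suggested by the error message itself, might be to skip the mon config fetch and then, per the official guide, register the newly generated key:

$ sudo ceph-osd -i 1 --mkfs --mkkey --no-mon-config   # skip fetching config from the mons, as the error hints
$ sudo ceph auth add osd.1 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-1/keyring

I have not verified this; the ceph-deploy route above is what actually worked for me.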


References

ADDING/REMOVING OSDS