Proxmox Add node to cluster and HA
- Category: Computer-related
- Last Updated: Thursday, 06 July 2017 17:21
- Published: Thursday, 06 July 2017 16:58
- Written by sam
Current cluster status:
root@px157:/etc/pve# pvecm status
Quorum information
------------------
Date: Thu Jul 6 13:49:32 2017
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000001
Ring ID: 1/144
Quorate: Yes
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.252.157 (local)
0x00000002 1 10.0.252.158
0x00000003 1 10.0.252.159
Install the new node and bring it up to date
root@px160:~# apt update && apt dist-upgrade
Network
root@px160:~# vi /etc/network/interfaces
auto lo
iface lo inet loopback

iface eno1 inet manual

auto vmbr0
iface vmbr0 inet static
        address 10.0.252.160
        netmask 255.255.255.0
        gateway 10.0.252.253
        bridge_ports eno1
        bridge_stp off
        bridge_fd 0

iface eno2 inet manual

auto vmbr1
iface vmbr1 inet static
        address 10.56.56.160
        netmask 255.255.255.0
        bridge_ports eno2
        bridge_stp off
        bridge_fd 0
root@px160:~# ifup vmbr1
root@px160:~# ping 10.56.56.157
PING 10.56.56.157 (10.56.56.157) 56(84) bytes of data.
64 bytes from 10.56.56.157: icmp_seq=1 ttl=64 time=0.098 ms
Add to cluster
root@px160:~# pvecm add 10.0.252.157
The authenticity of host '10.0.252.157 (10.0.252.157)' can't be established.
ECDSA key fingerprint is SHA256:PJRC6MdQfYMlD6IN4u+Wa7JeVJshKFm2okN9XG9Zu1c.
Are you sure you want to continue connecting (yes/no)? yes
root@10.0.252.157's password:
copy corosync auth key
stopping pve-cluster service
backup old database
waiting for quorum...OK
generating node certificates
merge known_hosts file
restart services
successfully added node 'px160' to cluster.
Check pvecm status
root@px157:/etc/pve# pvecm status
Quorum information
------------------
Date: Thu Jul 6 14:04:23 2017
Quorum provider: corosync_votequorum
Nodes: 4
Node ID: 0x00000001
Ring ID: 1/148
Quorate: Yes
Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 4
Quorum: 3
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.252.157 (local)
0x00000002 1 10.0.252.158
0x00000003 1 10.0.252.159
0x00000004 1 10.0.252.160
Install pveceph to add a new OSD and mon
root@px160:~# pveceph install
root@px160:~# pveceph createmon
ceph-mon: set fsid to 698c4b1b-9010-4dae-ae9e-1d70d43d48e9
ceph-mon: created monfs at /var/lib/ceph/mon/ceph-3 for mon.3
Created symlink /etc/systemd/system/ceph-mon.target.wants/ceph-mon@3.service -> /lib/systemd/system/ceph-mon@.service.
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
INFO:ceph-create-keys:ceph-mon is not in quorum: u'probing'
INFO:ceph-create-keys:ceph-mon is not in quorum: u'probing'
INFO:ceph-create-keys:ceph-mon is not in quorum: u'electing'
INFO:ceph-create-keys:ceph-mon is not in quorum: u'electing'
INFO:ceph-create-keys:ceph-mon is not in quorum: u'electing'
INFO:ceph-create-keys:ceph-mon is not in quorum: u'electing'
INFO:ceph-create-keys:Talking to monitor...
exported keyring for client.admin
updated caps for client.admin
INFO:ceph-create-keys:Talking to monitor...
INFO:ceph-create-keys:Talking to monitor...
INFO:ceph-create-keys:Talking to monitor...
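Before moving on, it is worth confirming that the new monitor actually joined the quorum (a suggested check, not in the original log; it should now report four monitors):
root@px160:~# ceph mon stat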
Then add a new OSD.
root@px160:~# fdisk -l /dev/sdb
Disk /dev/sdb: 558.4 GiB, 599550590976 bytes, 1170997248 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 36C7DB48-6E50-47EF-986E-E1A1A075B83B
Device Start End Sectors Size Type
/dev/sdb1 2048 206847 204800 100M Ceph OSD
/dev/sdb2 206848 1170997214 1170790367 558.3G unknown
My /dev/sdb was used before, so the old partitions need to be deleted before use.
root@px160:~# fdisk /dev/sdb
Welcome to fdisk (util-linux 2.29.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): d
Partition number (1,2, default 2):
Partition 2 has been deleted.
Command (m for help): d
Selected partition 1
Partition 1 has been deleted.
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
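As an alternative to deleting the partitions interactively, the whole disk can be wiped in one step; this is just a sketch, assuming the ceph-disk helper (shipped with Ceph at this time) or the gdisk package is installed:
root@px160:~# ceph-disk zap /dev/sdb      ### Ceph's own wipe helper
root@px160:~# sgdisk --zap-all /dev/sdb   ### or, equivalently, from gdisk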
Create osd
root@px160:~# pveceph createosd /dev/sdb
The operation has completed successfully.
Check the Ceph status (because we added a new OSD to the pool, just wait for the rebalance to finish):
root@px160:~# ceph status
cluster:
id: 698c4b1b-9010-4dae-ae9e-1d70d43d48e9
health: HEALTH_WARN
57 pgs backfill_wait
26 pgs degraded
26 pgs recovery_wait
5 pgs stuck unclean
recovery 3307/30432 objects degraded (10.867%)
recovery 4552/30432 objects misplaced (14.958%)
services:
mon: 4 daemons, quorum 0,1,2,3
mgr: 0(active), standbys: 1, 2
osd: 4 osds: 4 up, 4 in; 57 remapped pgs
data:
pools: 1 pools, 128 pgs
objects: 10144 objects, 39366 MB
usage: 118 GB used, 2114 GB / 2233 GB avail
pgs: 3307/30432 objects degraded (10.867%)
4552/30432 objects misplaced (14.958%)
57 active+remapped+backfill_wait
45 active+clean
26 active+recovery_wait+degraded
io:
client: 253 kB/s wr, 0 op/s rd, 42 op/s wr
recovery: 24191 kB/s, 6 objects/s
From ceph health detail:
recovery 1857/30432 objects degraded (6.102%)
recovery 4472/30432 objects misplaced (14.695%)
And keep going
root@px160:~# cp /etc/pve/priv/ceph.client.admin.keyring /etc/pve/priv/ceph/ceph.keyring
root@px160:~# vi /etc/pve/storage.cfg   ### add the new node's IP to monhost
dir: local
        path /var/lib/vz
        content vztmpl,backup,iso

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

rbd: ceph
        monhost 10.56.56.157;10.56.56.158;10.56.56.159;10.56.56.160
        content rootdir,images
        krbd 1
        pool ceph
        username admin

nfs: abc
        export /mnt/DATA
        path /mnt/pve/abc
        server 10.0.252.231
        content images,vztmpl,backup,iso,rootdir
        maxfiles 365
        options vers=3
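To verify that every node sees the updated storage definition, a quick check (a suggestion, not part of the original log) is:
root@px160:~# pvesm status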
That's all for adding a new Ceph OSD and mon to an existing cluster.
Once the cluster is healthy again, go on to the next step:
root@px160:~# ceph status
cluster:
id: 698c4b1b-9010-4dae-ae9e-1d70d43d48e9
health: HEALTH_OK
services:
mon: 4 daemons, quorum 0,1,2,3
mgr: 0(active), standbys: 1, 2
osd: 4 osds: 4 up, 4 in
data:
pools: 1 pools, 128 pgs
objects: 10260 objects, 39850 MB
usage: 119 GB used, 2113 GB / 2233 GB avail
pgs: 128 active+clean
io:
client: 182 kB/s wr, 0 op/s rd, 16 op/s wr
To test HA, prepare a VM and a node that you are willing to kill.
Here is mine:
root@px159:/# qm list
VMID NAME STATUS MEM(MB) BOOTDISK(GB) PID
6543 wanttodie running 512 10.00 31358
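The next step assigns the VM to an HA group named sam, which must already exist. If it does not, it can be created first; this is a minimal sketch, and the node list is only an assumption for this cluster:
root@px159:~# ha-manager groupadd sam --nodes "px157,px158,px159,px160"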
Add to HA
root@px159:~# ha-manager add vm:6543 --group sam
root@px159:~# ha-manager set vm:6543 --state started
root@px159:~# ha-manager config
vm:6543
state started
root@px159:~# ha-manager status
quorum OK
master px157 (active, Thu Jul 6 15:21:40 2017)
lrm px157 (active, Thu Jul 6 15:21:48 2017)
lrm px158 (active, Thu Jul 6 15:21:41 2017)
lrm px159 (active, Thu Jul 6 15:21:41 2017)
lrm px160 (active, Thu Jul 6 15:21:46 2017)
service vm:109 (px158, started)
service vm:111 (px158, started)
service vm:113 (px157, started)
service vm:114 (px157, started)
service vm:6543 (px160, started)
Now I want to make node px159 lose power and then come back.
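While waiting, the failover can be followed live (a suggestion, not from the original log):
root@px160:~# watch -n 5 ha-manager status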
root@px160:~# pvecm status
Quorum information
------------------
Date: Thu Jul 6 15:33:57 2017
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000004
Ring ID: 1/152
Quorate: Yes
Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 3
Quorum: 3
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.252.157
0x00000002 1 10.0.252.158
0x00000004 1 10.0.252.160 (local)
Now wait for Ceph to become ready again.
root@px160:~# ceph health
HEALTH_ERR 1 host (1 osds) down; 1 osds down; 1 mons down, quorum 0,1,3 0,1,3; 103 pgs are stuck inactive for more than 300 seconds; 103 pgs degraded; 103 pgs stuck degraded; 103 pgs stuck inactive; 103 pgs stuck unclean; 103 pgs stuck undersized; 103 pgs undersized; 158 requests are blocked > 32 sec; 3 osds have slow requests; recovery 8383/31032 objects degraded (27.014%)
OK, the recovery work is done:
root@px160:~# ceph health
HEALTH_WARN 1 mons down, quorum 0,1,3 0,1,3
And our VM 6543 is up, automatically moved to px157.
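To confirm where the service landed (output omitted; it will differ per cluster):
root@px160:~# ha-manager status | grep vm:6543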
Bring px159 back up, and everything is fine:
root@px160:~# ceph health
HEALTH_OK
Let us do one last test: what if px160 dies and never comes back?
root@px159:~# pvecm status
Quorum information
------------------
Date: Thu Jul 6 16:25:32 2017
Quorum provider: corosync_votequorum
Nodes: 4
Node ID: 0x00000003
Ring ID: 1/156
Quorate: Yes
Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 4
Quorum: 3
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.252.157
0x00000002 1 10.0.252.158
0x00000003 1 10.0.252.159 (local)
0x00000004 1 10.0.252.160
poweroff px160
root@px159:~# pvecm status
Quorum information
------------------
Date: Thu Jul 6 16:27:13 2017
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000003
Ring ID: 1/160
Quorate: Yes
Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 3
Quorum: 3
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.252.157
0x00000002 1 10.0.252.158
0x00000003 1 10.0.252.159 (local)
root@px159:~# pvecm nodes
Membership information
----------------------
Nodeid Votes Name
1 1 px157
2 1 px158
3 1 px159 (local)
Delete it
root@px159:~# pvecm delnode px160
Killing node 4
root@px159:~# pvecm status
Quorum information
------------------
Date: Thu Jul 6 16:29:29 2017
Quorum provider: corosync_votequorum
Nodes: 3
Node ID: 0x00000003
Ring ID: 1/160
Quorate: Yes
Votequorum information
----------------------
Expected votes: 3
Highest expected: 3
Total votes: 3
Quorum: 2
Flags: Quorate
Membership information
----------------------
Nodeid Votes Name
0x00000001 1 10.0.252.157
0x00000002 1 10.0.252.158
0x00000003 1 10.0.252.159 (local)
My px160 was mon.3 and osd.3. Remove the OSD first:
root@px159:~# ceph osd out osd.3
marked out osd.3.
root@px159:~# ceph osd crush remove osd.3
removed item id 3 name 'osd.3' from crush map
root@px159:~# ceph auth del osd.3
updated
root@px159:~# ceph osd rm osd.3
removed osd.3
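A quick check that the OSD is really gone from the CRUSH map (a suggested verification):
root@px159:~# ceph osd tree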
Then remove the mon:
root@px159:~# ceph mon remove 3
removing mon.3 at 10.56.56.160:6789/0, there will be 3 monitors
Delete the remaining references to px160:
root@px159:~# vi /etc/pve/storage.cfg
root@px159:/etc/pve# vi ceph.conf
root@px159:/etc/pve/ha# vi groups.cfg
root@px159:/etc/pve# vi storage.cfg
root@px159:/etc/pve/nodes# rm -rf px160/
root@px159:/etc/pve/priv# vi authorized_keys
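What actually changes in those files is small. In storage.cfg, drop the dead monitor from the monhost line shown earlier; in ceph.conf, remove the monitor's stanza for px160 (its exact section name and contents below are assumptions based on a typical Proxmox ceph.conf); groups.cfg and authorized_keys just lose their px160 entries.
### storage.cfg: monhost line after the edit
monhost 10.56.56.157;10.56.56.158;10.56.56.159
### ceph.conf: remove the whole stanza for the dead mon (names assumed)
[mon.3]
        host = px160
        mon addr = 10.56.56.160:6789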
root@px159:/etc/pve# ceph -w
cluster:
id: 698c4b1b-9010-4dae-ae9e-1d70d43d48e9
health: HEALTH_OK
services:
mon: 3 daemons, quorum 0,1,2
mgr: 0(active), standbys: 1, 2
osd: 3 osds: 3 up, 3 in
data:
pools: 1 pools, 128 pgs
objects: 10345 objects, 40188 MB
usage: 119 GB used, 1555 GB / 1674 GB avail
pgs: 128 active+clean
io:
client: 283 kB/s wr, 0 op/s rd, 39 op/s wr
Now we can reinstall the node and add it back to our cluster, just like at the start of this article.