
ceph recovery controlled

[root@… ceph-cluster]# cat ceph.conf
[global]
fsid = 380d4224-78e1-4d19-95c7-74c278712b0e
mon_initial_members = k8s-n2, k8s-m3, k8s-master-1, k8s-master-2, k8s-n1
#mon_host = 109.105.1.208,109.105.1.209,109.105.1.253,109.105.1.254,172.10.1.246
mon_host = 172.10.1.208,172.10.1.209,172.10.1.253,172.10.1.254,172.10.1.246
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 172.10.0.0/16
cluster network = 172.10.0.0/16

osd pool default size = 2
osd pool default min size = 1
mon clock drift allowed = 0.1
mon allow pool delete = true
mds recall state timeout = 150
mds cache size = 10737418240
mds max file size = 3298534883328
mds health cache threshold = 2.000000
[osd]
osd max write size = 512
osd client message size cap = 2147483648
osd deep scrub stride = 131072
osd disk threads = 4
osd map cache size = 512
osd scrub begin hour = 23
osd scrub end hour = 7
osd max backfills = 6
osd recovery max active = 15
osd_recovery_sleep_hdd = 0

Note: osd_recovery_sleep_hdd is the parameter with the biggest impact on recovery speed. With it left non-zero, tuning the other two parameters only gets a stable ~40 objects/s; setting it to 0 yields a stable ~800 objects/s.
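The three recovery knobs from the [osd] section above can also be pushed to running daemons without a restart via ceph tell. A minimal sketch, using the same values as the config file; the helper name is hypothetical:

```shell
# Hypothetical helper mirroring the [osd] section above, so the same
# recovery knobs can be applied to running OSDs without a restart:
apply_recovery_tuning() {
  ceph tell 'osd.*' injectargs '--osd_max_backfills=6'
  ceph tell 'osd.*' injectargs '--osd_recovery_max_active=15'
  ceph tell 'osd.*' injectargs '--osd_recovery_sleep_hdd=0'
}
```

Wrapping the calls in one function makes it easy to re-apply them all at once, e.g. after new OSDs join.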
[root@… ceph-cluster]# ansible ceph-nodes -m copy -a 'src=/etc/ceph/ceph.conf dest=/etc/ceph/'

Run on every OSD node (cluster-wide restart):
for i in $(ps aux | grep ceph-osd | awk '{print $16}'); do systemctl restart ceph-osd@$i; done
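The ps-based loop above depends on the --id value landing in awk field $16, which is fragile if the daemon's command line differs between releases. A sketch of a less position-dependent variant, assuming a systemd-managed deployment (ceph-osd@<id> is the standard unit name there):

```shell
# Parse the OSD ids out of running 'ceph-osd' command lines by locating the
# --id flag, instead of relying on a fixed awk field position.
extract_osd_ids() {
  # reads 'ps -eo args' output on stdin, prints each --id value;
  # the [c] bracket trick keeps awk from matching its own command line
  awk '/[c]eph-osd/ { for (i = 1; i < NF; i++) if ($i == "--id") print $(i + 1) }'
}

for i in $(ps -eo args | extract_osd_ids); do
  systemctl restart "ceph-osd@$i"
done
```

On a node with no ceph-osd processes the loop simply does nothing, so it is safe to dry-run.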

Without a restart:
Adjusting parameters on a single OSD
[root@… ~]# ceph daemon osd.12 config set debug_osd 10
[root@… ~]# ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep osd_max_backfills
"osd_max_backfills": "8",
For all OSDs:
Note: whenever new OSD nodes are added, the three commands below must be re-run, because the new OSDs come up with the defaults, i.e. the values from the config file.

[root@… ~]# ceph tell osd.* injectargs '--osd_max_backfills=7'
osd.0: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
osd.1: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
osd.2: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
osd.3: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
osd.4: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
osd.5: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
osd.6: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
osd.7: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
osd.8: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
osd.9: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
osd.10: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
osd.11: osd_max_backfills = '7' rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
Despite the "not observed" warning, in practice no OSD restart is needed for the change to take effect; the effect is visible in monitoring.
[root@… ceph-cluster]# ceph tell osd.* injectargs '--osd_recovery_max_active=15'
osd.0: osd_recovery_max_active = '15' (not observed, change may require restart)
osd.1: osd_recovery_max_active = '15' (not observed, change may require restart)
osd.2: osd_recovery_max_active = '15' (not observed, change may require restart)
osd.3: osd_recovery_max_active = '15' (not observed, change may require restart)
osd.4: osd_recovery_max_active = '15' (not observed, change may require restart)
osd.5: osd_recovery_max_active = '15' (not observed, change may require restart)
osd.8: osd_recovery_max_active = '15' (not observed, change may require restart)
osd.9: osd_recovery_max_active = '15' (not observed, change may require restart)
osd.10: osd_recovery_max_active = '15' (not observed, change may require restart)
osd.11: osd_recovery_max_active = '15' (not observed, change may require restart)

[root@… lyf3]# ceph tell osd.* injectargs '--osd_recovery_sleep_hdd=0'
osd.0: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
osd.1: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
osd.2: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
osd.3: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
osd.4: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
osd.5: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
osd.6: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
osd.7: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
osd.8: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
osd.9: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
osd.10: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
osd.11: osd_recovery_sleep_hdd = '0.000000' (not observed, change may require restart)
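The injectargs runs above open the recovery throttles wide, which competes with client I/O. Once recovery has caught up, you will likely want to revert. A minimal sketch, assuming the usual upstream defaults for these options (verify the values against your Ceph release before applying):

```shell
# Put the throttles back once recovery has caught up, so client I/O is
# protected again. The values are assumed upstream defaults; double-check
# them with 'ceph daemon osd.<id> config get <option>' on your release.
restore_recovery_defaults() {
  ceph tell 'osd.*' injectargs '--osd_max_backfills=1'
  ceph tell 'osd.*' injectargs '--osd_recovery_max_active=3'
  ceph tell 'osd.*' injectargs '--osd_recovery_sleep_hdd=0.1'
}
```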

watch one-liner (in my testing it did not work very well); note that the $ signs must be escaped, otherwise the shell expands $1..$18 before awk ever sees them:
watch -n 1 -d "ceph pg dump | grep recovering | awk '{print \$1,\$2,\$4,\$10,\$15,\$16,\$17,\$18}'"
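The quoting in that one-liner is fragile. A way to sidestep it entirely is to put the pipeline into a small script and point watch at the script; the path /tmp/pg_recovering is just an example name:

```shell
# Write the pipeline to a tiny script. The quoted heredoc ('EOF') prevents
# any shell expansion, so awk receives $1..$18 intact.
cat > /tmp/pg_recovering <<'EOF'
#!/bin/sh
ceph pg dump 2>/dev/null | grep recovering | awk '{print $1,$2,$4,$10,$15,$16,$17,$18}'
EOF
chmod +x /tmp/pg_recovering
```

Then run: watch -n 1 -d /tmp/pg_recovering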
Disk read/write activity can be watched with the dstat command;
use lsblk to list the individual disks.
Run:
dstat -td -D sdb
(dstat expects the device name without the /dev/ prefix, e.g. sdb rather than /dev/sdb.)