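Troubleshooting a VM That Fails to Boot Because of an Incorrect Block Device Mount Parameter
Symptom
After a volume was added to a virtual machine, the VM could not boot into the system on restart.
Cause
The filesystem type declared in /etc/fstab did not match the actual filesystem on the partition.
Resolution
Attach the VM's system volume to another virtual machine and correct /etc/fstab there.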
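Investigation
A colleague added a volume to a virtual machine. After the reboot the VM did not come up, and the VNC console showed that it had dropped into the repair screen.
The VM also seemed not to have been initialized properly when it was created: the root password was rejected, so there was no way to log in. The next idea was to boot into single-user mode and reset the root password: press e in the GRUB boot menu to open the editor, change ro on the linux16 line to rw init=/sysroot/bin/sh, then press Ctrl+x to boot.
Boot hung at this stage, and the console output confirmed that the problem was related to mounting a block device.
With no way into the system, the next attempt was to map the VM's system volume on the host and edit its /etc/fstab there: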
# rbd map vms/a3dc7178-6936-4a3e-a129-38de543b70c8_disk --id cinder
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable".
In some cases useful info is found in syslog - try "dmesg | tail" or so.
rbd: map failed: (6) No such device or address
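The mapping failed. The error says that the RBD image has features enabled that the host kernel does not support, and that they would have to be removed from the image with rbd feature disable first. This is a production machine, so that was not an option: there is no telling what it might break.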
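Before deciding how to proceed, it can help to see which image features are actually involved. The following is a minimal sketch, not something that was run during this incident. It assumes that rbd info prints a features: line with a comma-separated list, and that the host kernel client supports only layering; that set is an assumption and depends on the kernel version. The helper name unsupported_features is hypothetical.

```shell
# Features the kernel RBD client is ASSUMED to support on this host.
# This is an assumption; adjust it for the kernel you are running.
KERNEL_FEATURES="layering"

# Hypothetical helper: read `rbd info` output on stdin and print, one per line,
# every enabled image feature that is not listed in KERNEL_FEATURES.
unsupported_features() {
    sed -n 's/^[[:space:]]*features:[[:space:]]*//p' |
        tr ',' '\n' |
        while read -r feature; do
            [ -n "$feature" ] || continue
            case " $KERNEL_FEATURES " in
                *" $feature "*) ;;              # supported by the kernel, skip
                *) printf '%s\n' "$feature" ;;  # would block `rbd map`
            esac
        done
}
```

It would be used read-only, for example as rbd info vms/a3dc7178-6936-4a3e-a129-38de543b70c8_disk --id cinder | unsupported_features. It only reports; it does not change the image.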
The host has no nbd kernel module either, so rbd-nbd was not usable, and this path was a dead end as well.
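Whether the nbd module is present can be checked without loading anything. A small sketch, assuming modinfo is installed on the host; module_available is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if kernel module $1 is already loaded
# or is at least installed for the running kernel.
module_available() {
    # A loaded (or built-in) module usually shows up under /sys/module.
    if [ -d "/sys/module/$1" ]; then
        return 0
    fi
    # Otherwise ask modinfo, which exits non-zero when the module is not found.
    modinfo "$1" >/dev/null 2>&1
}
```

If module_available nbd succeeds, mapping the image with rbd-nbd becomes an option worth trying; on this host it did not.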
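The only remaining option was to attach the volume to another virtual machine. A new virtual machine, vm-01, was created, and blk.xml was written on its host: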
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <auth username='cinder'>
    <secret type='ceph' uuid='6f29947a-ae9b-477a-a31c-d0942177b734'/>
  </auth>
  <source protocol='rbd' name='vms/a3dc7178-6936-4a3e-a129-38de543b70c8_disk'>
    <host name='10.20.67.106' port='6789'/>
    <host name='10.20.67.107' port='6789'/>
    <host name='10.20.67.108' port='6789'/>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk>
Here source is the system volume of the faulty VM. Attach it to vm-01:
# virsh attach-device instance-00002965 blk.xml
Device attached successfully
Log in to vm-01 and list the block devices:
# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0     11:0    1  520K  0 rom
vda    253:0    0   60G  0 disk
└─vda1 253:1    0   60G  0 part /
vdb    253:16   0   60G  0 disk
└─vdb1 253:17   0   60G  0 part
vdb is the system volume of the faulty VM that was just attached. Mount it on /mnt:
# mount /dev/vdb1 /mnt
mount: wrong fs type, bad option, bad superblock on /dev/vdb1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so.
The mount failed. Following the hint in the error message, check dmesg:
# dmesg | tail
[ 3437.071352] pci 0000:00:06.0: reg 0x14: [mem 0x00000000-0x00000fff]
[ 3437.071927] pci 0000:00:06.0: BAR 1: assigned [mem 0xc0000000-0xc0000fff]
[ 3437.072845] pci 0000:00:06.0: BAR 0: assigned [io 0x1000-0x103f]
[ 3437.074055] virtio-pci 0000:00:06.0: enabling device (0000 -> 0003)
[ 3437.078750] ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 11
[ 3437.079535] virtio-pci 0000:00:06.0: virtio_pci: leaving for legacy driver
[ 3437.082683] virtio-pci 0000:00:06.0: irq 29 for MSI/MSI-X
[ 3437.082705] virtio-pci 0000:00:06.0: irq 30 for MSI/MSI-X
[ 3437.086378] vdb: vdb1
[ 3597.459912] XFS (vdb1): Filesystem has duplicate UUID fc1bfc5d-a5d1-4c3c-afda-167500654723 - can't mount
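It turned out that the filesystems on vdb1 and vda1 had the same UUID, because both VMs were cloned from the same image. XFS refuses to mount a filesystem whose UUID duplicates that of one already mounted.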
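An alternative that was not tried in this incident is the XFS nouuid mount option, which skips the duplicate-UUID check. The sketch below only decides which option to use; it assumes blkid is available in the rescue VM, and the helper names fs_uuid and xfs_mount_opts are hypothetical.

```shell
# Hypothetical helper: print the filesystem UUID of device $1 using blkid.
fs_uuid() {
    blkid -o value -s UUID "$1"
}

# Hypothetical helper: print the mount option needed to mount XFS device $1
# while device $2 (for example the running root filesystem) is already mounted.
xfs_mount_opts() {
    if [ "$(fs_uuid "$1")" = "$(fs_uuid "$2")" ]; then
        echo "nouuid"      # same UUID: tell XFS to skip the duplicate check
    else
        echo "defaults"    # different UUIDs: nothing special is needed
    fi
}
```

On vm-01 this could be used as mount -t xfs -o "$(xfs_mount_opts /dev/vdb1 /dev/vda1)" /dev/vdb1 /mnt, which would avoid having to build a second rescue VM from a different image.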
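So vm-02 was created from a different image and the steps above were repeated. This time the mount succeeded, and /etc/fstab could be inspected: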
# cat /mnt/etc/fstab
#
# /etc/fstab
# Created by anaconda on Thu Dec 17 17:11:31 2015
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=fc1bfc5d-a5d1-4c3c-afda-167500654723 / xfs defaults 0 0
#/dev/vdb none swap sw,comment=cloudconfig 0 0
/dev/vdb1 /data xfs defaults 0 0
/dev/vdc1 /data1 xfs defaults 0 0
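The colleague confirmed that he had formatted /dev/vdc1 as ext4, whereas /etc/fstab declared it as xfs, and that is what caused the mount failure. The entry in /etc/fstab was corrected to: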
/dev/vdc1 /data1 ext4 defaults 0 0
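A mismatch like this can be caught before rebooting by comparing each fstab entry with what is actually on the device. The following is a minimal sketch under stated assumptions: it is meant to run as root on the VM that owns the devices (device names such as /dev/vdc1 differ inside a rescue VM), it checks only entries written as /dev/... paths, and the helper names probe_fs_type and check_fstab_types are hypothetical.

```shell
# Hypothetical helper: print the real filesystem type of device $1 via blkid.
probe_fs_type() {
    blkid -o value -s TYPE "$1"
}

# Hypothetical helper: read the fstab file $1 and report every /dev/* entry
# whose declared type differs from the probed one. Returns 1 on any mismatch.
check_fstab_types() {
    status=0
    while read -r device mountpoint declared rest; do
        case "$device" in
            /dev/*) ;;        # plain device path: check it
            *) continue ;;    # comments, blank lines, UUID= or LABEL= entries
        esac
        [ "$declared" = "swap" ] && continue
        actual=$(probe_fs_type "$device")
        # Devices that cannot be probed are skipped rather than reported.
        if [ -n "$actual" ] && [ "$actual" != "$declared" ]; then
            echo "$device ($mountpoint): fstab says $declared, device is $actual"
            status=1
        fi
    done < "$1"
    return "$status"
}
```

Running check_fstab_types /etc/fstab right after editing the file, and before the reboot, would have flagged the /dev/vdc1 line in this case.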
Then unmount the volume on vm-02:
# umount /dev/vdb1
Detach the device on the host:
# virsh detach-device instance-0000296e blk.xml
Device detached successfully
After the virtual machine networktool01 was started, it booted normally and the block devices were mounted correctly:
# df -hT
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/vda1      xfs    60G  1.2G   59G   2% /
/dev/vdb1      xfs   200G   33M  200G   1% /data
/dev/vdc1      ext4  197G   61M  187G   1% /data1
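To make a repeat of this fault less likely, the fstab entry for a new data volume can be generated from the device itself instead of being typed by hand. This is a sketch rather than the procedure used here. It assumes blkid is available and that the data volume is not required for boot, which is why it adds the nofail option; the helper names probe_tag and fstab_line are hypothetical.

```shell
# Hypothetical helper: print blkid's value of tag $2 (UUID or TYPE) for device $1.
probe_tag() {
    blkid -o value -s "$2" "$1"
}

# Hypothetical helper: print an fstab line for device $1 mounted at $2,
# using the UUID and filesystem type probed from the device.
fstab_line() {
    uuid=$(probe_tag "$1" UUID)
    fstype=$(probe_tag "$1" TYPE)
    if [ -z "$uuid" ] || [ -z "$fstype" ]; then
        echo "cannot probe $1" >&2
        return 1
    fi
    # nofail is an ASSUMED policy choice: a failed data mount should not block boot.
    printf 'UUID=%s %s %s defaults,nofail 0 0\n' "$uuid" "$2" "$fstype"
}
```

For example, fstab_line /dev/vdc1 /data1 prints a line that can be reviewed and then appended to /etc/fstab. Because the type comes from the device, it cannot disagree with how the partition was formatted.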