Recovering a Lost Volume Group: LVM Back to Normal

Written by: Rocky
Date: 2016-06-02
Update:
20160602 First draft
20160603 Layout revision
20160607 Layout revision
20160615 Layout revision

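Summary

This post records how a volume group went missing, how it was recovered, and the roundabout path to getting its logical volume mounted again.

Main Text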

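One day a virtual machine could no longer be reached over ssh. The on-duty colleague checked the console and found the machine hung.
[Screenshot: server down]
After a reboot from the console, the system reported a corrupted file system and asked to enter single-user mode.
[Screenshot: maintenance]

The root password was not the standard one. A junior colleague found it by trying candidates one after another, which was no small feat.

Running fsck reported that the wls volume could not be found, and some further fiddling got nowhere. Working on the console is awkward, so the wls line in /etc/fstab was commented out first, and the fight continued over ssh after a reboot.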
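Commenting out the fstab entry can be done by hand in an editor, but here is a minimal sketch of the same step as a shell function. The helper name comment_out_mount and the match pattern are my own assumptions (the post does not show the actual fstab line); the function keeps a .bak copy before rewriting the file.

```shell
#!/bin/sh
# comment_out_mount FSTAB PATTERN
# Prefix every not-yet-commented line of FSTAB that matches PATTERN with '#'.
# A copy of the original file is kept as FSTAB.bak.
# Hypothetical helper; PATTERN is a basic regex and must not contain '|'.
comment_out_mount() {
    cp -p "$1" "$1.bak" &&
    sed "\\|$2| s/^[^#]/#&/" "$1.bak" > "$1"
}

# Example (as root), assuming the entry mentions the LV name LVwls:
#   comment_out_mount /etc/fstab 'LVwls'
```

With the entry disabled, the boot no longer stops at the file system check for that volume, so the machine comes up far enough for ssh.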

VolGroup01 Metadata Is Missing

[root@CNSZ041429 tmp]# pvs
  PV         VG         Fmt  Attr PSize  PFree
  /dev/sdb1  VolGroup00 lvm2 a--  20.97G    0
  /dev/sdd   VolGroup00 lvm2 a--   9.97G    0

[root@CNSZ041429 tmp]# vgs
  VG         #PV #LV #SN Attr   VSize  VFree
  VolGroup00   2   6   0 wz--n- 30.94G    0

[root@CNSZ041429 tmp]# lvs
  LV      VG         Attr   LSize   Origin Snap%  Move Log Copy%  Convert
  LVhome  VolGroup00 -wi-ao 480.00M
  LVroot  VolGroup00 -wi-ao  16.78G
  LVswap1 VolGroup00 -wi-ao   3.41G
  LVswap2 VolGroup00 -wi-ao   3.47G
  LVtmp   VolGroup00 -wi-ao   1.94G
  LVvar   VolGroup00 -wi-ao   4.88G

fdisk -l still shows the disk /dev/sdc:

[root@CNSZ041429 ~]# fdisk -l
Disk /dev/sda: 209 MB, 209715200 bytes
255 heads, 63 sectors/track, 25 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          25      200781   83  Linux

Disk /dev/sdb: 22.5 GB, 22548578304 bytes
255 heads, 63 sectors/track, 2741 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1        2741    22017051   8e  Linux LVM

** Disk /dev/sdc: 42.9 GB, 42949672960 bytes **
255 heads, 63 sectors/track, 5221 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1        5221    41937651   8e  Linux LVM

Disk /dev/sdd: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdd doesn't contain a valid partition table

Restore the PV's UUID

With the --restorefile option, pvcreate can restore the LVM metadata without making any change to the user data.

[root@CNSZ041429 backup]# pwd
/etc/lvm/backup

[root@CNSZ041429 backup]# cat VolGroup01 | grep -C1 /dev/sdc
                        id = "7BDdCd-JZtq-bdNk-qR1G-rhR5-BYDX-NJta8y"
                        device = "/dev/sdc"     # Hint only
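The UUID lookup above can also be scripted. This is a minimal sketch, assuming the backup file lists the id line of each PV before its device line, as in the excerpt above. The function name pv_uuid_for is hypothetical.

```shell
#!/bin/sh
# pv_uuid_for BACKUP_FILE DEVICE
# Print the PV UUID recorded for DEVICE in an LVM metadata backup file.
# Assumption: within a PV stanza, `id = "..."` appears before `device = "..."`.
pv_uuid_for() {
    awk -v dev="$2" '
        # Remember the most recent id seen so far.
        $1 == "id" && $2 == "=" { id = $3; gsub(/"/, "", id); last_id = id }
        # When the device hint matches, that id belongs to this PV.
        $1 == "device" && $2 == "=" {
            d = $3; gsub(/"/, "", d)
            if (d == dev) { print last_id; exit }
        }
    ' "$1"
}

# Example:
#   pv_uuid_for /etc/lvm/backup/VolGroup01 /dev/sdc
```

Keep in mind that the device entry is marked "Hint only" in the backup file, so the result is worth checking against what is actually known about the disk.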

[root@CNSZ041429 backup]# pvcreate /dev/sdc -u 7BDdCd-JZtq-bdNk-qR1G-rhR5-BYDX-NJta8y --restorefile /etc/lvm/backup/VolGroup01

It failed:

Couldn't find device with uuid 7BDdCd-JZtq-bdNk-qR1G-rhR5-BYDX-NJta8y.
Device /dev/sdc not found (or ignored by filtering).

fdisk -l can still see the disk /dev/sdc, so why is it reported as not found?
Print the verbose log to find out:

pvcreate -vvv /dev/sdc -u 7BDdCd-JZtq-bdNk-qR1G-rhR5-BYDX-NJta8y --restorefile /etc/lvm/backup/VolGroup01

(output omitted) ……
    /dev/sdc: size is 83886080 sectors
      /dev/sdc: block size is 4096 bytes

** /dev/sdc: Skipping: Partition table signature found **

      Closed /dev/sdc
Device /dev/sdc not found (or ignored by filtering).

The verbose log shows that pvcreate failed because /dev/sdc carries a partition table:
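A quick way to see what "Partition table signature found" is about is to look for the MBR boot signature, the bytes 0x55 0xAA at offset 510 of the first sector. The following is a minimal, read-only sketch with a hypothetical helper name, has_mbr_signature. It only approximates LVM's own check, which as far as I know also inspects the partition entries.

```shell
#!/bin/sh
# has_mbr_signature DEVICE_OR_IMAGE
# Succeed if bytes 510-511 of the first sector are 0x55 0xAA.
# Read-only: nothing is written to the device.
has_mbr_signature() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Example (as root):
#   has_mbr_signature /dev/sdc && echo "partition table signature present"
```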

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1        5221    41937651   8e  Linux LVM

To use a partition as the PV, the operation has to target the device file /dev/sdc1 instead. Let's try:

pvcreate -vvv /dev/sdc1 -u 7BDdCd-JZtq-bdNk-qR1G-rhR5-BYDX-NJta8y --restorefile /etc/lvm/backup/VolGroup01

**   Physical volume "/dev/sdc1" successfully created **

[root@CNSZ041429 ~]# pvs
  PV         VG         Fmt  Attr PSize  PFree
  /dev/sdb1  VolGroup00 lvm2 a--  20.97G    0
  /dev/sdc1  VolGroup01 lvm2 a--  40.00G    0
  /dev/sdd   VolGroup00 lvm2 a--   9.97G    0

So far so good. With the PV created, the next step is to restore the VG.

[root@CNSZ041429 ~]# vgcfgrestore VolGroup01
  Restored volume group VolGroup01

[root@CNSZ041429 ~]# vgs
  VG         #PV #LV #SN Attr   VSize  VFree
  VolGroup00   2   6   0 wz--n- 30.94G    0
  VolGroup01   1   1   0 wz--n- 40.00G    0

The LV is back as well:

[root@CNSZ041429 ~]# lvs
  LV      VG         Attr   LSize   Origin Snap%  Move Log Copy%  Convert
  LVhome  VolGroup00 -wi-ao 480.00M
  LVroot  VolGroup00 -wi-ao  16.78G
  LVswap1 VolGroup00 -wi-ao   3.41G
  LVswap2 VolGroup00 -wi-ao   3.47G
  LVtmp   VolGroup00 -wi-ao   1.94G
  LVvar   VolGroup00 -wi-ao   4.88G
  LVwls   VolGroup01 -wi---  40.00G

Activate the Volume Group

[root@CNSZ041429 ~]# vgchange -ay VolGroup01
  device-mapper: reload ioctl failed: Invalid argument
  1 logical volume(s) in volume group "VolGroup01" now active

An error came up: ** device-mapper: reload ioctl failed: Invalid argument **
Even so, the volume group VolGroup01 was reported as activated, so let's check the LV status.

LV      VG         Attr   LSize   Origin Snap%  Move Log Copy%  Convert
LVhome  VolGroup00 -wi-ao 480.00M
LVroot  VolGroup00 -wi-ao  16.78G
LVswap1 VolGroup00 -wi-ao   3.41G
LVswap2 VolGroup00 -wi-ao   3.47G
LVtmp   VolGroup00 -wi-ao   1.94G
LVvar   VolGroup00 -wi-ao   4.88G
LVwls   VolGroup01 -wi-d-  40.00G

The state of LVwls in VolGroup01 (-wi-d-) is wrong, and the /dev/VolGroup01 directory does not exist:

[root@CNSZ041429 ~]# ls -ld /dev/VolGroup0*
drwxr-xr-x 2 root root 160 May 30 12:29 /dev/VolGroup00
drwxr-xr-x 2 root root  60 May 30 12:36 /dev/VolGroup02
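For reading the Attr column: according to lvs(8), the fifth character is the LV state, where "a" means active and "d" means "mapped device present without tables". In other words, the device-mapper node was created but no mapping table could be loaded into it. A minimal sketch that decodes a few of the documented state characters (the helper name lv_state is hypothetical, and only a subset of states is covered):

```shell
#!/bin/sh
# lv_state ATTR
# Describe the 5th character (state) of an lvs Attr string such as "-wi-d-".
# Covers only a subset of the states documented in lvs(8).
lv_state() {
    state=$(printf '%s' "$1" | cut -c5)
    case $state in
        a) echo "active" ;;
        s) echo "suspended" ;;
        d) echo "mapped device present without tables" ;;
        i) echo "mapped device present with inactive table" ;;
        -) echo "inactive (no state flag set)" ;;
        *) echo "other state: $state" ;;
    esac
}

lv_state "-wi-ao"   # LVroot and the other healthy LVs
lv_state "-wi-d-"   # LVwls after the failed activation
```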

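Back to square one.
For wls volumes we normally use the whole disk as the PV, without creating any partition on it. Could someone have changed the partitioning of sdc? That would have altered the volume group information, but since the machine was not rebooted the change never took effect. After this crash and the reboot that followed, the volume group VolGroup01 could no longer be found.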

Digging further into the Invalid argument error, the messages log shows:

Jun  1 09:32:59 cnsz041429 kernel: device-mapper: table: device 8:33 too small for target
Jun  1 09:32:59 cnsz041429 kernel: device-mapper: table: 253:7: linear: dm-linear: Device lookup failed
Jun  1 09:32:59 cnsz041429 kernel: device-mapper: ioctl: error adding target to table
Jun  1 09:33:38 cnsz041429 kernel: device-mapper: table: device 8:33 too small for target
Jun  1 09:33:38 cnsz041429 kernel: device-mapper: table: 253:7: linear: dm-linear: Device lookup failed
Jun  1 09:33:38 cnsz041429 kernel: device-mapper: ioctl: error adding target to table
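The kernel names the device only by number, 8:33. For SCSI disks under block major 8, each disk owns 16 minor numbers, so the minor splits into a disk index and a partition number. A minimal sketch of that decoding follows; the helper name sd_name is hypothetical, and only major 8 (sda through sdp) is handled.

```shell
#!/bin/sh
# sd_name MAJOR MINOR
# Map a block device number to its sd* name.
# Assumption: major 8 only, i.e. sda..sdp with 16 minors per disk.
sd_name() {
    [ "$1" -eq 8 ] || { echo "unsupported major: $1" >&2; return 1; }
    [ "$2" -ge 0 ] && [ "$2" -le 255 ] || { echo "bad minor: $2" >&2; return 1; }
    disk=$(( $2 / 16 ))          # 0 = sda, 1 = sdb, 2 = sdc, ...
    part=$(( $2 % 16 ))          # 0 = whole disk, 1..15 = partition number
    letter=$(printf 'abcdefghijklmnop' | cut -c "$(( disk + 1 ))")
    if [ "$part" -eq 0 ]; then
        echo "sd$letter"
    else
        echo "sd$letter$part"
    fi
}

sd_name 8 33   # prints sdc1
```

So the device that is "too small for target" is /dev/sdc1, the partition the PV was just recreated on. The same numbers can be confirmed with ls -l /dev/sdc1, which shows the major and minor in place of the file size.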

This tells us that the device behind the PV is smaller than the LV that has to be mapped onto it. Looking at the LVM metadata:

PV         VG         Fmt  Attr PSize  PFree
/dev/sdc1  VolGroup01 lvm2 a--  40.00G    0

VG         #PV #LV #SN Attr   VSize  VFree
VolGroup01   1   1   0 wz--n- 40.00G    0

LV      VG         Attr   LSize   Origin Snap%  Move Log Copy%  Convert
LVwls   VolGroup01 -wi-d-  40.00G

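All of the LVM metadata above was restored by the earlier steps from the file /etc/lvm/backup/VolGroup01, and that file records the PV size as:

dev_size = 83886080 sectors (512 bytes each) = 41943040 KB

whereas the size of sdc1 is:

sdc1 41937651 KB

sdc1 is smaller than the recorded PV, while the size of sdc is exactly equal to the original PV size:

sdc  41943040 KB

In other words, the original PV of VolGroup01 was sdc, not sdc1. The root cause of the problem is that a partition, sdc1, was created on top of the PV sdc. If the partition was only created and no file system was ever made on it, it is worth deleting the partition and trying the restore again.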
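The size comparison can be reproduced with shell arithmetic. This is a minimal sketch that uses only the numbers shown in this post and assumes 512-byte sectors for dev_size and 1 KB units for the Blocks column of fdisk -l.

```shell
#!/bin/sh
# Convert a count of 512-byte sectors to KB (two sectors per KB).
sectors_to_kb() { echo $(( $1 / 2 )); }

pv_kb=$(sectors_to_kb 83886080)    # dev_size from /etc/lvm/backup/VolGroup01
sdc1_kb=41937651                   # Blocks column for /dev/sdc1 in fdisk -l
sdc_kb=$(( 42949672960 / 1024 ))   # byte size of /dev/sdc from fdisk -l

echo "PV expects:  $pv_kb KB"
echo "sdc1 offers: $sdc1_kb KB (short by $(( pv_kb - sdc1_kb )) KB)"
echo "sdc offers:  $sdc_kb KB"
```

The partition falls 5389 KB short of what the metadata expects, while the whole disk matches exactly, which is what points to sdc as the original PV.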

Delete the Partition

[root@CNSZ041429 log]# parted /dev/sdc rm 1
Information: Don't forget to update /etc/fstab, if necessary.

Check the Partition Table

[root@CNSZ041429 log]# fdisk -l

Disk /dev/sda: 209 MB, 209715200 bytes
255 heads, 63 sectors/track, 25 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          25      200781   83  Linux

Disk /dev/sdb: 22.5 GB, 22548578304 bytes
255 heads, 63 sectors/track, 2741 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1        2741    22017051   8e  Linux LVM

Disk /dev/sdc: 42.9 GB, 42949672960 bytes
255 heads, 63 sectors/track, 5221 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Force-Restore the UUID

[root@CNSZ041429 log]# pvcreate --restorefile /etc/lvm/backup/VolGroup01 --uuid 7BDdCd-JZtq-bdNk-qR1G-rhR5-BYDX-NJta8y /dev/sdc
  Can't initialize physical volume "/dev/sdc" of volume group "VolGroup01" without -ff

[root@CNSZ041429 log]# pvcreate -ff --restorefile /etc/lvm/backup/VolGroup01 --uuid 7BDdCd-JZtq-bdNk-qR1G-rhR5-BYDX-NJta8y /dev/sdc
Really INITIALIZE physical volume "/dev/sdc" of volume group "VolGroup01" [y/n]? y
  WARNING: Forcing physical volume creation on /dev/sdc of volume group "VolGroup01"
  Writing physical volume data to disk "/dev/sdc"
  Physical volume "/dev/sdc" successfully created

Restore the VG

[root@CNSZ041429 log]# vgcfgrestore --file /etc/lvm/backup/VolGroup01 VolGroup01
  Restored volume group VolGroup01

[root@CNSZ041429 log]# vgs
  VG         #PV #LV #SN Attr   VSize  VFree
  VolGroup00   2   6   0 wz--n- 30.94G    0
  VolGroup01   1   1   0 wz--n- 40.00G    0
  VolGroup02   1   1   0 wz--n- 40.00G    0

Activate the Volume Group

[root@CNSZ041429 log]# vgchange -ay VolGroup01
  1 logical volume(s) in volume group "VolGroup01" now active

Mount the Volume

[root@CNSZ041429 log]# mount /dev/VolGroup01/LVwls /wls-z/

/dev/mapper/VolGroup01-LVwls
                       40G   19G   19G  51% /wls-z

[root@CNSZ041429 wls-z]# ls -lh
total 2.1G
drwx------  2 iaopr  wls     4.0K Jun  8  2013 iaopr
drwx------  2 root   root     16K Apr 16  2012 lost+found
drwxr-s--- 21 oracle oracle  4.0K May 19 20:31 oracle
drwxr-xr-x  8 root   service 4.0K Apr  3  2013 paic
-rw-r--r--  1 root   sys     2.0G Jun 14  2013 swapfile3
drwxr-xr-x  2 root   root    4.0K Oct 10  2015 wls

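Good, the volume can still be read and written.
After several days of wrestling with it, the problem is finally solved, and I learned something along the way.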
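As a recap, the sequence that finally worked can be written as a small dry-run helper. It only prints the commands and does not run them, because pvcreate -ff rewrites the PV label and deserves a human looking at each step. The function name print_recovery_plan is hypothetical. The sketch assumes the stray partition has already been removed and that the backup file lives under /etc/lvm/backup, as it did here.

```shell
#!/bin/sh
# print_recovery_plan VG DEVICE UUID
# Print (do not execute) the restore sequence used in this post.
# Assumes the metadata backup is /etc/lvm/backup/VG.
print_recovery_plan() {
    vg=$1
    dev=$2
    uuid=$3
    backup=/etc/lvm/backup/$vg
    printf '%s\n' \
        "pvcreate -ff --restorefile $backup --uuid $uuid $dev" \
        "vgcfgrestore --file $backup $vg" \
        "vgchange -ay $vg"
}

print_recovery_plan VolGroup01 /dev/sdc 7BDdCd-JZtq-bdNk-qR1G-rhR5-BYDX-NJta8y
```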
