mdXX: bitmap superblock UUID mismatch
on 26.01.2011 21:58:25 by Reynald Borer
Hello guys,
I have been using md raids for quite a long time now and they always
worked fine, until I recently upgraded some hardware on my
workstation. Unfortunately, the new hardware proved to be very
unstable, and I encountered a lot of hard lockups while the system was
running. Those lockups recently made one of my raid 1 arrays fail with
the infamous error message "mdXX: bitmap superblock UUID mismatch".
Here is what I have found in the kernel logs when I try to activate
the given raid group:
-----------------
md/raid1:md126: active with 2 out of 2 mirrors
md126: bitmap superblock UUID mismatch
md126: bitmap file superblock:
magic: 6d746962
version: 4
uuid: 37102258.af9c1930.b8397fb8.eba356af
events: 199168
events cleared: 199166
state: 00000000
chunksize: 524288 B
daemon sleep: 5s
sync size: 248075584 KB
max write behind: 0
md126: failed to create bitmap (-22)
-----------------
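As an aside, the "magic: 6d746962" value in that dump is just the
ASCII string "bitm" stored as a little-endian 32-bit word, which a
quick shell loop can confirm:

```shell
# Decode the md bitmap magic (0x6d746962) byte by byte, least
# significant byte first (little-endian), into ASCII characters.
v=$((0x6d746962))
magic_str=""
for i in 0 1 2 3; do
    byte=$(( (v >> (8 * i)) & 0xFF ))
    magic_str="$magic_str$(printf "\\$(printf '%03o' "$byte")")"
done
echo "$magic_str"   # prints: bitm
```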
Such error messages are displayed each time I try to run the raid
group. The content of /proc/mdstat is:
-----------------
md126 : inactive sdb6[0] sda6[1]
496151168 blocks
-----------------
If I try to examine both disks with mdadm -E, it shows a checksum
mismatch for both partitions:
-----------------
root@bob # mdadm -E /dev/sda6
/dev/sda6:
Magic : a92b4efc
Version : 0.90.03
UUID : 37102258:bf9c1930:b8397fb8:eba356af
Creation Time : Mon Aug 7 21:06:47 2006
Raid Level : raid1
Used Dev Size : 248075584 (236.58 GiB 254.03 GB)
Array Size : 248075584 (236.58 GiB 254.03 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 6
Update Time : Wed Jan 12 00:12:44 2011
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : e4883f8e - expected e4883e8e
Events : 199168
Number Major Minor RaidDevice State
this 1 8 38 1 active sync
0 0 8 70 0 active sync
1 1 8 38 1 active sync
root@bob # mdadm -E /dev/sdb6
/dev/sdb6:
Magic : a92b4efc
Version : 0.90.03
UUID : 37102258:bf9c1930:b8397fb8:eba356af
Creation Time : Mon Aug 7 21:06:47 2006
Raid Level : raid1
Used Dev Size : 248075584 (236.58 GiB 254.03 GB)
Array Size : 248075584 (236.58 GiB 254.03 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 6
Update Time : Wed Jan 12 00:12:44 2011
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : e4883fac - expected e4883eac
Events : 199168
Number Major Minor RaidDevice State
this 0 8 70 0 active sync
0 0 8 70 0 active sync
1 1 8 38 1 active sync
-----------------
Any idea how I could try to save my raid group?
Thanks in advance for your help.
Best Regards,
Reynald
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: mdXX: bitmap superblock UUID mismatch
on 27.01.2011 21:53:46 by NeilBrown
On Wed, 26 Jan 2011 21:58:25 +0100 Reynald Borer wrote:
> [...]
> md126: bitmap superblock UUID mismatch
> md126: bitmap file superblock:
>          magic: 6d746962
>        version: 4
>           uuid: 37102258.af9c1930.b8397fb8.eba356af
                           ^ this is an 'a'
> [...]
> root@bob # mdadm -E /dev/sda6
> /dev/sda6:
>           Magic : a92b4efc
>         Version : 0.90.03
>            UUID : 37102258:bf9c1930:b8397fb8:eba356af
                             ^ this is a 'b'
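Note that on both devices the stored checksum sits exactly 0x100 above
the expected one, i.e. the superblock is off by a constant, as you
would expect from a small localized corruption under an additive
checksum. As intuition only, a toy additive sum over the four UUID
words -- not md's exact 0.90 superblock routine -- shows how a single
flipped bit moves such a checksum by a fixed, predictable amount:

```shell
# Toy additive checksum over the four 32-bit UUID groups. This is NOT
# md's exact superblock algorithm, only an illustration: flipping one
# bit in the summed data shifts the sum by exactly that bit's value.
sum_ok=$((  0x37102258 + 0xAF9C1930 + 0xB8397FB8 + 0xEBA356AF ))  # 'a' variant
sum_bad=$(( 0x37102258 + 0xBF9C1930 + 0xB8397FB8 + 0xEBA356AF ))  # 'b' variant
delta=$(( (sum_bad - sum_ok) & 0xFFFFFFFF ))
printf '0x%x\n' "$delta"   # prints: 0x10000000, the flipped bit's value
```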
So you certainly do have some sick hardware!!!
I suggest that you find some hardware that you can trust,
mount one of the two devices (sdb6 or sda6) ignoring the raid stuff,
and copy the data off to the device that you trust.
Then start again.
NeilBrown
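Neil's suggestion can be sketched as a short script; the device and
mount-point names here are assumptions, and the sick disk is only ever
mounted read-only (with 0.90 metadata stored at the end of the
component, the filesystem starts at offset 0, so one half of a raid 1
mirror can usually be mounted directly):

```shell
# Rescue sketch -- /dev/sdb6, /mnt/rescue and /mnt/trusted are
# placeholder names; substitute your own devices and paths.
DEV=/dev/sdb6        # one half of the broken mirror
MNT=/mnt/rescue      # temporary read-only mount point
DEST=/mnt/trusted    # filesystem on hardware you trust

if [ -b "$DEV" ]; then
    mkdir -p "$MNT"
    mount -o ro "$DEV" "$MNT"    # read-only: never write to the sick disk
    rsync -a "$MNT"/ "$DEST"/    # copy everything off
    umount "$MNT"
else
    echo "skipping: $DEV is not a block device on this machine"
fi
```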
Re: mdXX: bitmap superblock UUID mismatch
on 14.02.2011 17:33:29 by Reynald Borer
Hi,
Nice catch on the one-bit difference, I hadn't spotted it. My point
about bitmap reconstruction was that this raid was part of an LVM
setup: the volume group used two distinct raid 1 devices, and I was
not able to start the LVM correctly without this failing raid.
In the end, I was able to save my LVM by simply skipping the raid 1
and using one of the partitions directly. The LVM tools were clever
enough to detect the MD bits and offered to remove them so the
partition could be used directly, and it worked fine. Thus I was able
to save my data.
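For reference, the "MD bits" that the LVM tools offered to strip can
also be inspected and removed explicitly with wipefs; a sketch,
assuming the same device name, with the destructive step deliberately
commented out:

```shell
# Placeholder device name; only erase the raid signature once the data
# is safely copied elsewhere.
DEV=/dev/sda6

if [ -b "$DEV" ]; then
    wipefs "$DEV"        # with no options, wipefs only lists signatures
    # wipefs -a "$DEV"   # destructive: actually erases them
else
    echo "skipping: $DEV is not a block device on this machine"
fi
```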
Thanks for your answer though.
Regards,
Reynald
On Thu, Jan 27, 2011 at 9:53 PM, NeilBrown wrote:
> [...]
Re: mdXX: bitmap superblock UUID mismatch
on 14.02.2011 17:36:29 by mathias.buren
On 14 February 2011 16:33, Reynald Borer wrote:
> [...]
> In the end, I was able to save my LVM by simply skipping the raid 1
> and using one of the partitions directly. The LVM tools were clever
> enough to detect the MD bits and offered to remove them so the
> partition could be used directly, and it worked fine.
> [...]
Wow, that's cool. I suppose RAID1 is simple enough for this to be
possible, though. Still, cool. :-)
// M