First raid1 sector gets zeroed at first reboot
on 16.08.2011 19:42:14 by Asdo
Hello all,
Sometimes I put GRUB on the first sector of an MD RAID-1 device which
sits on a disk partition rather than on the whole disk.
(There is another bootloader in the MBR which chainloads this one;
that part is not the problem.)
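For context, the install is done roughly like this from the GRUB legacy
shell (the device name (hd0,1) is only an example, standing for one
member partition of the mirror):

   grub> root (hd0,1)
   grub> setup (hd0,1)

root selects the partition holding the GRUB files, and setup with a
partition argument writes stage1 into that partition's first sector
rather than the MBR.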
Sometimes, though I'm not yet able to reproduce it reliably, that
sector gets zeroed around the first reboot.
So the first reboot after installing the OS and GRUB succeeds, but the
next one fails: by then the first sector of the MD device has been
zeroed, so at the second reboot the bootloader is missing. At that
point I have to boot from a live CD again and reinstall GRUB there to
be able to boot at all.
I have verified that the sector is nonzero before the first reboot and
zero after the second. I'm not sure exactly when it gets zeroed, but it
is between those two points in time; I suspect it happens at the first
reassembly of the MD device.
Once the second reboot has passed, the problem never happens again on
that RAID: either the sector has already been wiped by then, or it
never will be.
I'm thinking of a bug in some RAID initialization step that is
deferred until the first reassembly of the device... does this ring
any bells?
The last time it happened to me (yesterday) it was with a degraded
RAID-1 (created with one device missing) with metadata=1.0. I can
confirm that dd'ing the first 512-byte sector from the MD device and
from the underlying partition both produced the same nonzero sector
before the first reboot; after the second reboot both were zero.
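Concretely, the check was along these lines (device names are only
examples):

   # first 512-byte sector of the array and of the underlying partition
   dd if=/dev/md0 of=md-sector.bin bs=512 count=1
   dd if=/dev/sda2 of=part-sector.bin bs=512 count=1
   cmp md-sector.bin part-sector.bin   # identical
   xxd md-sector.bin | head            # nonzero before, all zeros after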
Also, please note that since it was a degraded RAID-1 there could not
possibly have been any resync, which rules out a resync problem.
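For reference, the array had been created as a degraded mirror,
roughly like this (device name again an example):

   # RAID-1 with the second slot deliberately left empty
   mdadm --create /dev/md0 --level=1 --raid-devices=2 \
         --metadata=1.0 /dev/sda2 missing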
Also, the filesystem itself appears intact, so this "bug" affects only
the very beginning of the MD device.
Does anyone know what's happening?
By reboot I mean "reboot -h now", and that's a real reboot through the
BIOS, not a kexec.
I use Ubuntu, and it has been doing this with at least kernels 2.6.32
through 2.6.38. Maybe it has always done this.
It has happened on various recent 64-bit Intel machines, with various
HDD controllers and various brands of drives, all of which,
incidentally, had physical 512-byte sectors.
Re: First raid1 sector gets zeroed at first reboot
on 17.08.2011 15:43:34 by John Robinson
On 16/08/2011 18:42, Asdo wrote:
> Hello all,
> Sometimes I put GRUB on the first sector of an MD RAID-1 device which
> sits on a disk partition rather than on the whole disk.
[...]
> The last time it happened to me (yesterday) it was with a degraded
> RAID-1 (created with one device missing) with metadata=1.0.
[...]
> Does anyone know what's happening?
The first sector of an md RAID with metadata 1.0 is in its data area,
so there's no way md is writing to this area itself; it's almost
certainly the filesystem that's writing it.
I think installing GRUB on an md partition is a bad idea. You can use
metadata 1.2 to leave the first 4K free, but GRUB may write its stage
1.5 code to the first 31.5K of a device (whole drive or partition).
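You can check where the data starts on a member device with, for
example (device name illustrative):

   # with 1.0 metadata the superblock is at the end, so the reported
   # data offset is zero
   mdadm -E /dev/sda2 | grep -i offset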
Cheers,
John.
Re: First raid1 sector gets zeroed at first reboot
on 17.08.2011 16:00:13 by Asdo
On 08/17/11 15:43, John Robinson wrote:
>
> The first sector of an md RAID with metadata 1.0 is in its data area,
> so there's no way md is writing to this area itself; it's almost
> certainly the filesystem that's writing it.
>
> I think installing GRUB on an md partition is a bad idea. You can use
> metadata 1.2 to leave the first 4K free, but GRUB may write its stage
> 1.5 code to the first 31.5K of a device (whole drive or partition).
>
No, you are confusing it with metadata 1.1.
Metadata 1.0 has the data at the end, like 0.9.
Writing there is semantically correct.
Thank you
Re: First raid1 sector gets zeroed at first reboot
on 17.08.2011 16:34:46 by John Robinson
On 17/08/2011 15:00, Asdo wrote:
> On 08/17/11 15:43, John Robinson wrote:
[...]
> No, you are confusing it with metadata 1.1.
> Metadata 1.0 has the data at the end, like 0.9.
No, I'm afraid it's you who is confused. Metadata 1.0 has the
*metadata* at the end, like 0.9, so the data, i.e. the filesystem
area, is at the beginning. Take a look with mdadm -E: the data offset
is zero.
This is why partitions in a RAID-1 with 0.9 or 1.0 metadata can be
booted from individually: they look identical to partitions that just
have filesystems on them.
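A quick way to see this for yourself (device name is just an example):

   # a member of a 0.9/1.0 RAID-1 starts with the filesystem itself,
   # so it identifies like a plain partition
   file -s /dev/sda2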
Cheers,
John.
Re: First raid1 sector gets zeroed at first reboot
on 17.08.2011 18:45:26 by Asdo
On 08/17/11 16:34, John Robinson wrote:
> On 17/08/2011 15:00, Asdo wrote:
>> No, you are confusing it with metadata 1.1.
>> Metadata 1.0 has the data at the end, like 0.9.
>
> No, I'm afraid it's you who is confused. Metadata 1.0 has the
> *metadata* at the end, like 0.9, so the data, i.e. the filesystem
> area, is at the beginning. Take a look with mdadm -E: the data
> offset is zero.
>
> This is why partitions in a RAID-1 with 0.9 or 1.0 metadata can be
> booted from individually: they look identical to partitions that
> just have filesystems on them.
That's correct; that's what I meant too.
So I had misinterpreted you here:
> The first sector of an md RAID with metadata 1.0 is in its data area,
> so there's no way md is writing to this area itself; it's almost
> certainly the filesystem that's writing it.
That is an interesting observation then ("no way" is a bit extreme,
though). You are right that it might be the filesystem, not MD, doing
something at the first remount; I can't be sure it's MD anymore.
Still, this is wrong: why should the filesystem wipe its own boot
sector?
It's ext3, by the way. If nobody comes up with an explanation here on
linux-raid, I will also ask on the ext3 list.
I still maintain that what I am doing is correct: I am using the
partition's volume boot record as intended.
http://en.wikipedia.org/wiki/Volume_boot_record
Thank you
Re: First raid1 sector gets zeroed at first reboot
on 17.08.2011 19:32:05 by John Robinson
On 17/08/2011 17:45, Asdo wrote:
> On 08/17/11 16:34, John Robinson wrote:
[...]
>> The first sector of an md RAID with metadata 1.0 is in its data area,
>> so there's no way md is writing to this area itself; it's almost
>> certainly the filesystem that's writing it.
> That is an interesting observation then ("no way" is a bit extreme,
> though).
If it did, everybody's filesystems would be getting trashed, and I don't
think this is happening.
> You are right that it might be the filesystem, not MD, doing
> something at the first remount; I can't be sure it's MD anymore.
> Still, this is wrong: why should the filesystem wipe its own boot
> sector?
> It's ext3, by the way. If nobody comes up with an explanation here on
> linux-raid, I will also ask on the ext3 list.
>
> I still maintain that what I am doing is correct: I am using the
> partition's volume boot record as intended.
> http://en.wikipedia.org/wiki/Volume_boot_record
Not all filesystems, or other things you might put in a partition,
support a volume boot record. LVM doesn't. XFS doesn't. md doesn't
unless you use metadata 1.2. As it happens, it can work with md if you
use metadata 0.9 or 1.0 together with a filesystem which does support
a volume boot record.
ext2/3 does support a volume boot record. The first superblock starts
1K into the filesystem. With a 1K block size, the superblock is in
block 1, so mke2fs won't touch an existing volume boot record. With
the more likely 4K block size, the superblock is 1K into block 0, and
mke2fs will write zeros to the first 1K.
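You can check which case applies on an existing filesystem with, for
example (device name illustrative):

   # "Block size: 1024" means the superblock sits in block 1 and
   # mke2fs left the boot sector alone
   tune2fs -l /dev/md0 | grep -i 'block size'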
Once mke2fs has been run, however, I wouldn't expect ext2/3 to overwrite
the volume boot record, given that the developers bothered to support
one in the first place, but that's an ext2/3 question.
Cheers,
John.