Raid Checks

on 03.04.2011 10:31:32 by Jonathan Tripathy

Hi Everyone,

I'm running CentOS 5.5 with a stock version of mdadm. I have 2 physical
disks in a RAID1 setup. Each disk has 4 md partitions on it.

I'm experiencing the issues associated with the raid-check script every
Sunday morning: whatever state the arrays are in, a re-sync happens. Doing a
little reading around Google, I see that this is probably caused by
mismatch_cnt being non-zero, which is apparently normal for RAID1
devices. According to this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=566828

someone has made a patch for RedHat so that mismatch_cnt isn't checked
on RAID1 setups. However, since I don't use RedHat, and I don't really
want to compile mdadm from scratch, is there a workaround for this? I
don't want to just disable the raid-check script, as I think it does
some other important checks which are useful for RAID1.

Any help would be appreciated

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: Raid Checks

on 03.04.2011 10:48:47 by Jonathan Tripathy

Seems like I'm a little confused...

To apply that patch, I don't need to re-compile, as it's just a script
run by cron. However, looking at the script, it may not fix my "resync"
issue which is happening every Sunday. Looks like that script just stops
the mismatch warning email from being sent.

My main concern is that during these resyncs I lose redundancy, don't I?
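
The behaviour described above, where the patched script only suppresses the mismatch warning for RAID1 and does not change whether the weekly check runs, could be sketched roughly as follows. This is not the actual raid-check script; the function names and the IGNORED_LEVELS set are hypothetical, and only the sysfs attributes `level` and `mismatch_cnt` under /sys/block/mdX/md are taken from the md driver itself.

```python
from pathlib import Path

# Assumption for this sketch: only RAID1 is exempt from the warning,
# as described in the thread.
IGNORED_LEVELS = {"raid1"}


def read_attr(md_dir: Path, name: str) -> str:
    """Read one md sysfs attribute and strip the trailing newline."""
    return (md_dir / name).read_text().strip()


def should_warn(md_dir: Path) -> bool:
    """Return True if a mismatch warning would be sent for this array."""
    level = read_attr(md_dir, "level")
    mismatches = int(read_attr(md_dir, "mismatch_cnt"))
    return mismatches > 0 and level not in IGNORED_LEVELS


if __name__ == "__main__":
    # Each array exposes its attributes in /sys/block/mdX/md/.
    for md_dir in sorted(Path("/sys/block").glob("md*/md")):
        if should_warn(md_dir):
            print(f"WARNING: mismatch_cnt is not 0 on {md_dir.parent.name}")
```

Note that nothing here starts or stops a check; it only decides whether to report the counter, which matches the observation that the patch would not stop the Sunday activity.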

Re: Raid Checks

on 03.04.2011 13:01:57 by NeilBrown

On Sun, 03 Apr 2011 09:48:47 +0100 Jonathan Tripathy wrote:

> Seems like I'm a little confused...
>
> To apply that patch, I don't need to re-compile, as it's just a script
> run by cron. However, looking at the script, it may not fix my "resync"
> issue which is happening every Sunday. Looks like that script just stops
> the mismatch warning email from being sent.
>
> My main concern is that during these resyncs I lose redundancy, don't I?

No.

NeilBrown


Re: Raid Checks

on 03.04.2011 13:04:46 by Jonathan Tripathy

>> My main concern is that during these resyncs I lose redundancy, don't I?
> No.
>
> NeilBrown
>
Ah right, so by letting the raid-check script resync my RAID1 arrays
every Sunday, the only thing I lose is disk performance during the
resync? This is what state my server is currently in:

State : clean, resyncing
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Rebuild Status : 77% complete

Thanks

Re: Raid Checks

on 03.04.2011 13:19:37 by NeilBrown

On Sun, 03 Apr 2011 12:04:46 +0100 Jonathan Tripathy wrote:

>
> >> My main concern is that during these resyncs I lose redundancy, don't I?
> > No.
> >
> > NeilBrown
> >
> Ah right, so by letting the raid-check script resync my RAID1 arrays
> every Sunday, the only thing I lose is disk performance during the
> resync?

Correct.

> This is what state my server is currently in:
>
> State : clean, resyncing

With mdadm 3.2.1, this will show "checking" when it is checking, rather than
"resyncing".

NeilBrown


> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
>
> Rebuild Status : 77% complete
>
> Thanks
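
Independently of how a given mdadm version labels the state, the kernel reports the current operation in the array's `sync_action` sysfs attribute, which distinguishes a check from a resync. A minimal sketch of reading it is shown here; the function name and the "scrub"/"rebuild" grouping are illustrative assumptions, while the attribute and its values (idle, check, repair, resync, recover) come from the md driver.

```python
from pathlib import Path

# Operations that only verify (or verify and fix) an array that is in sync.
SCRUB_ACTIONS = {"check", "repair"}
# Operations that restore consistency or rebuild onto a device.
REBUILD_ACTIONS = {"resync", "recover"}


def classify(md_dir: Path) -> str:
    """Classify the current md operation from /sys/block/mdX/md/sync_action."""
    action = (md_dir / "sync_action").read_text().strip()
    if action in SCRUB_ACTIONS:
        return "scrub"
    if action in REBUILD_ACTIONS:
        return "rebuild"
    # "idle", or another state such as "reshape", is returned unchanged.
    return action
```

For example, `classify(Path("/sys/block/md0/md"))` would be expected to return "scrub" while the weekly raid-check run is in progress, assuming the array is md0.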


Re: Raid Checks

on 03.04.2011 13:30:03 by Jonathan Tripathy

>> This is what state my server is currently in:
>>
>> State : clean, resyncing
> With mdadm 3.2.1, this will show "checking" when it is checking, rather than
> "resyncing".
Interesting. So during a normal "resync", I'm guessing redundancy will
be lost? Also, if the "checking" process finds an error, where will I
find it? /var/log/messages?

Thanks

Re: Raid Checks

on 03.04.2011 14:18:52 by John Robinson

On 03/04/2011 12:30, Jonathan Tripathy wrote:
>>> This is what state my server is currently in:
>>>
>>> State : clean, resyncing
>> With mdadm 3.2.1, this will show "checking" when it is checking,
>> rather than "resyncing".
> Interesting. So during a normal "resync", I'm guessing redundancy will
> be lost?

You only need a resync when you've already lost redundancy. The only
normal resync is at array creation time, when mdadm will instantly
create the array with no redundancy and then proceed to generate the
mirror/parity information. Any other resync is only required after
something else caused a loss of redundancy e.g. a disc dying or a system
crash, or perhaps a drive threw up a bad sector. The check process will
attempt to discover any such problems.

As Doug Ledford noted in the Red Hat Bugzilla there are times when a md
RAID 1 can harmlessly end up out of sync.

> Also, if the "checking" process finds an error, where will I
> find it? /var/log/messages?

In the email the raid-check process sends you. Any errors found are not
reported in detail on RHEL/CentOS 5 - I think more recent kernel/mdadm
can be asked to be more verbose about the location of errors. Requesting
a repair rather than a check would get the check process to
automatically resync any stripes which had bad mirrors/parity.

In short, it is safe to ignore a modest mismatch_cnt on RAID 1 as long
as you aren't seeing disc errors, and that is what Doug Ledford's patch
to raid-check does.

Cheers,

John.
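
Requesting a repair rather than a check, as mentioned above, is done by writing to the array's `sync_action` attribute in sysfs. The following is only a sketch under that assumption: the helper name and the VALID_REQUESTS set are hypothetical, writing to the real sysfs file requires root, and the raid-check script has its own way of selecting the action.

```python
from pathlib import Path

# Values this sketch allows to be written to sync_action.
VALID_REQUESTS = {"check", "repair", "idle"}


def request_action(md_dir: Path, action: str) -> None:
    """Ask md to start a check or repair, or to stop one with 'idle'."""
    if action not in VALID_REQUESTS:
        raise ValueError(f"unsupported action: {action!r}")
    # Equivalent to: echo <action> > /sys/block/mdX/md/sync_action
    (md_dir / "sync_action").write_text(action + "\n")
```

A call such as `request_action(Path("/sys/block/md0/md"), "repair")` would start a repair pass on md0 (the array name is an assumption); once the pass has finished, `mismatch_cnt` in the same directory holds the count it found.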


Re: Raid Checks

on 03.04.2011 21:02:49 by Jonathan Tripathy

> On 03/04/2011 12:30, Jonathan Tripathy wrote:
>>>> This is what state my server is currently in:
>>>>
>>>> State : clean, resyncing
>>> With mdadm 3.2.1, this will show "checking" when it is checking,
>>> rather than "resyncing".
>> Interesting. So during a normal "resync", I'm guessing redundancy will
>> be lost?
>
> You only need a resync when you've already lost redundancy. The only
> normal resync is at array creation time, when mdadm will instantly
> create the array with no redundancy and then proceed to generate the
> mirror/parity information. Any other resync is only required after
> something else caused a loss of redundancy e.g. a disc dying or a
> system crash, or perhaps a drive threw up a bad sector. The check
> process will attempt to discover any such problems.
Understood. If a problem is detected, would I have to invoke a re-sync
manually (assuming that CHECK is enabled in the script)?
>
> As Doug Ledford noted in the Red Hat Bugzilla there are times when a
> md RAID 1 can harmlessly end up out of sync.
>
>> Also, if the "checking" process finds an error, where will I
>> find it? /var/log/messages?
>
> In the email the raid-check process sends you. Any errors found are
> not reported in detail on RHEL/CentOS 5 - I think more recent
> kernel/mdadm can be asked to be more verbose about the location of
> errors. Requesting a repair rather than a check would get the check
> process to automatically resync any stripes which had bad mirrors/parity.
How do I set up emails? Is it just the monitor daemon?
>
> In short, it is safe to ignore a modest mismatch_cnt on RAID 1 as long
> as you aren't seeing disc errors, and that is what Doug Ledford's
> patch to raid-check does.
>
