Re: How to force rewrite of a smart detected bad block with raid5:checkarray?

on 19.01.2011 19:36:12 by Richard Scobie

Marc Merlin wrote:

> Also, I didn't find anything about sync_action, check, and repair in
> the mdadm man page (a pointer to
> https://raid.wiki.kernel.org/index.php/RAID_Administration
> would be useful).
> Actually the above page still says that you can't check just a range
> of blocks.

> Is there more up to date documentation that I should be reading
> somewhere?

The kernel source, Documentation/md.txt.

Regards,

Richard
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: How to force rewrite of a smart detected bad block with raid5: checkarray?

on 19.01.2011 19:49:14 by Marc MERLIN

On Thu, Jan 20, 2011 at 07:36:12AM +1300, Richard Scobie wrote:
> Marc Merlin wrote:
>
> > Also, I didn't find anything about sync_action, check, and repair in
> > the mdadm man page (a pointer to
> > https://raid.wiki.kernel.org/index.php/RAID_Administration
> > would be useful).
> > Actually the above page still says that you can't check just a range
> > of blocks.
>
> > Is there more up to date documentation that I should be reading
> > somewhere?
>
> The kernel source, Documentation/md.txt.

Ah, yes of course. Didn't think about looking there, thanks.

Mmmh, so I was curious as to how repair, when reading all the blocks of a
stripe with no read errors, and finding a parity mismatch, would know which
block was corrupted and needs to be rewritten.
Given that I don't see how it can figure that out, I'm not even sure when
repair would be useful for raid5 when there are no underlying media errors
returned (raid6 obviously has redundant info and it's possible there).

Not that I need repair in my case (check indeed does the right thing), I'm
just not sure when repair would be useful.

Anyway, thanks again for the pointer.

Cheers,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems & security ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/

Re: How to force rewrite of a smart detected bad block with raid5:checkarray?

on 19.01.2011 21:01:33 by NeilBrown

On Wed, 19 Jan 2011 10:49:14 -0800 Marc MERLIN wrote:

> On Thu, Jan 20, 2011 at 07:36:12AM +1300, Richard Scobie wrote:
> > Marc Merlin wrote:
> >
> > > Also, I didn't find anything about sync_action, check, and repair in
> > > the mdadm man page (a pointer to
> > > https://raid.wiki.kernel.org/index.php/RAID_Administration
> > > would be useful).
> > > Actually the above page still says that you can't check just a range
> > > of blocks.
> >
> > > Is there more up to date documentation that I should be reading
> > > somewhere?
> >
> > The kernel source, Documentation/md.txt.
>
> Ah, yes of course. Didn't think about looking there, thanks.

"man md" is also an appropriate place to look.

>
> Mmmh, so I was curious as to how repair, when reading all the blocks of a
> stripe with no read errors, and finding a parity mismatch, would know which
> block was corrupted and needs to be rewritten.

It doesn't repair the data - that would be impossible. It repairs the
redundancy information which is all that raid really knows about.
i.e. if it finds an inconsistency it re-writes the parity block.
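
To illustrate the point above, here is a minimal Python sketch of single-parity (XOR) redundancy as used by RAID5. It is not md's code; the block contents and function names are hypothetical. It shows that a mismatch looks the same whether a data block or the parity block went bad, and that a "repair" in the sense described here only recomputes parity from whatever data is read.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def check_stripe(data_blocks, parity):
    """Return True if the stored parity matches the data (no mismatch)."""
    return xor_blocks(data_blocks) == parity


def repair_stripe(data_blocks):
    """Mimic "repair": keep the data as read and recompute parity from it."""
    return xor_blocks(data_blocks)


# Hypothetical 3-disk RAID5 stripe: two data blocks plus one parity block.
d0, d1 = b"\x0f\x0f", b"\xf0\x01"
parity = xor_blocks([d0, d1])

# Silently corrupt one data block (no read error is reported).
bad_d1 = b"\xf0\x00"
mismatch = not check_stripe([d0, bad_d1], parity)

# A bad parity block produces a mismatch too, so the comparison alone
# cannot say which block is the wrong one.
also_mismatch = not check_stripe([d0, d1], b"\xff\x0f")

# After "repair" the stripe is consistent again, but bad_d1 is still bad.
new_parity = repair_stripe([d0, bad_d1])
consistent_again = check_stripe([d0, bad_d1], new_parity)
```

The corrupted data block is left untouched in this sketch; only the redundancy is brought back in line with it.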

> Given that I don't see how it can figure that out, I'm not even sure when
> repair would be useful for raid5 when there are no underlying media errors
> returned (raid6 obviously has redundant info and it's possible there).

It isn't often useful. But if your parity blocks are wrong somehow, then it
can be useful. It will not recover data that you have already lost, but it
could make it less likely to lose more data.

md currently treats RAID6 just the same way as RAID5 - parity is re-written.
It is possible that more could be done, but it isn't completely clear that it
should - and it certainly isn't high on my priority list.

NeilBrown


>
> Not that I need repair in my case (check indeed does the right thing), I'm
> just not sure when repair would be useful.
>
> Anyway, thanks again for the pointer.
>
> Cheers,
> Marc


Re: How to force rewrite of a smart detected bad block with raid5: checkarray?

on 19.01.2011 21:57:39 by Marc MERLIN

On Thu, Jan 20, 2011 at 07:01:33AM +1100, NeilBrown wrote:
> > > The kernel source, Documentation/md.txt.
> >
> > Ah, yes of course. Didn't think about looking there, thanks.
>
> "man md" is also an appropriate place to look.

Ah yes. Thanks for that too.

> > Mmmh, so I was curious as to how repair, when reading all the blocks of a
> > stripe with no read errors, and finding a parity mismatch, would know which
> > block was corrupted and needs to be rewritten.
>
> It doesn't repair the data - that would be impossible. It repairs the

That's what I thought.

> redundancy information which is all that raid really knows about.
> i.e. if it finds an inconsistency it re-writes the parity block.

Right, so it has only one chance out of n to fix the right drive. Better than
nothing though.

> It isn't often useful. But if your parity blocks are wrong somehow, then it
> can be useful. It will not recover data that you have already lost, but it
> could make it less likely to lose more data.

Fair enough.

> md currently treats RAID6 just the same way as RAID5 - parity is re-written.
> It is possible that more could be done, but it isn't completely clear that it
> should - and it certainly isn't high on my priority list.

Understood.

So, I went back and read man md, and md.txt in the kernel Documentation
tree, but I could not find documentation on this:
echo 3907029168 > sync_min
echo 3907029170 > sync_max

and as per my other post, it didn't work for me on 2.6.36
(echo: write error: Invalid argument)
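
One possible cause, not confirmed anywhere in this thread: later versions of the md documentation say that sync_min and sync_max must be multiples of the chunk size, and the two values above are only 2 sectors apart. The Python sketch below widens a sector range to chunk boundaries. The 64 KiB chunk size and the function name are assumptions for illustration; the real chunk size would have to be read from the array.

```python
SECTOR_BYTES = 512  # assumption: sector counts are in 512-byte units


def aligned_range(start_sector, end_sector, chunk_bytes):
    """Widen [start_sector, end_sector] to multiples of the chunk size."""
    chunk_sectors = chunk_bytes // SECTOR_BYTES
    lo = (start_sector // chunk_sectors) * chunk_sectors   # round down
    hi = -(-end_sector // chunk_sectors) * chunk_sectors   # round up
    return lo, hi


# Hypothetical 64 KiB chunk, i.e. 128 sectors per chunk.
lo, hi = aligned_range(3907029168, 3907029170, chunk_bytes=64 * 1024)
print(lo, hi)  # 3907029120 3907029248
```

If the rounded-up end falls beyond the end of the device, writing 'max' to sync_max instead may be the simpler option.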

Any suggestions?

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems & security ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/

Re: How to force rewrite of a smart detected bad block with raid5:checkarray?

on 20.01.2011 01:30:39 by Richard Scobie

Marc MERLIN wrote:

> So, I went back and read man md, and md.txt in the kernel Documentation
> tree, but I could not find documentation on this:
> echo 3907029168 > sync_min
> echo 3907029170 > sync_max
>
> and as per my other post, it didn't work for me on 2.6.36
> (echo: write error: Invalid argument)

sync_max is mentioned in md.txt for 2.6.29.1 (currently running here):

This is a number of sectors at which point a resync/recovery
process will pause. When a resync is active, the value can
only ever be increased, never decreased. The value of 'max'
effectively disables the limit.

There is no mention of sync_min.
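
For what it is worth, newer copies of the md documentation do describe sync_min alongside sync_max as the range in which check/repair operates. A rough Python sketch of how a bounded check might be started follows. The helper name and the sector values are hypothetical, and the example writes into a scratch directory rather than a real array; on a live system the directory would be something like /sys/block/md0/md (md0 being an assumed device name) and root access would be needed.

```python
import os
import tempfile


def start_range_check(md_dir, sync_min, sync_max, action="check"):
    """Write the range first, then the action, into an md sysfs directory."""
    settings = (
        ("sync_min", sync_min),
        ("sync_max", sync_max),
        ("sync_action", action),
    )
    for name, value in settings:
        with open(os.path.join(md_dir, name), "w") as f:
            f.write(f"{value}\n")


def read_setting(md_dir, name):
    """Read one setting back as a stripped string."""
    with open(os.path.join(md_dir, name)) as f:
        return f.read().strip()


# Dry run against a scratch directory, using hypothetical chunk-aligned values.
scratch = tempfile.mkdtemp()
start_range_check(scratch, 3907029120, 3907029248)
written = {name: read_setting(scratch, name)
           for name in ("sync_min", "sync_max", "sync_action")}
```

The order of the writes is a guess at what is least likely to be rejected: the lower bound first, then the upper bound, then the action.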

Regards,

Richard