Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 08:23:33 by Matthias Urlichs

Hi,

my problem is that I have a bunch of crappy disks which seem unable to
reliably remap bad areas after a read error.

This obviously makes the read error rewrite feature of our beloved
RAID5/6 code somewhat less than useful.

What I would like to do is to re-map these sectors in userspace -- either
by browbeating the disk into it, or by using the Device Mapper. So I'd
need a way to tell a userspace daemon "this device+block is unreadable",
and wait until said daemon tells the RAID core to go ahead.

I can do the userspace side easily, but my time to dig through the RAID
code and implement that sort of channel in a maintainable way is somewhat
limited. (Plus, I need that code sooner rather than later.)

Would somebody be able to help out? There may be some money in it ...

--
Matthias Urlichs


Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 08:45:29 by berk walker

Matthias Urlichs wrote:
> Hi,
>
> my problem is that I have a bunch of crappy disks which seem unable to
> reliably remap bad areas after a read error.
>
> This obviously makes the read error rewrite feature of our beloved
> RAID5/6 code somewhat less than useful.
>
> What I would like to do is to re-map these sectors in userspace -- either
> by browbeating the disk into it, or by using the Device Mapper. So I'd
> need a way to tell a userspace daemon "this device+block is unreadable",
> and wait until said daemon tells the RAID core to go ahead.
>
> I can do the userspace side easily, but my time to dig through the RAID
> code and implement that sort of channel in a maintainable way is somewhat
> limited. (Plus, I need that code sooner rather than later.)
>
> Would somebody be able to help out? There may be some money in it ...
>
>
I can not believe the question. What file system might this be?




Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 09:13:08 by Alex Butcher

On Tue, 15 Sep 2009, Matthias Urlichs wrote:

> my problem is that I have a bunch of crappy disks which seem unable to
> reliably remap bad areas after a read error.

IME, discs don't remap after read errors, only on writes.

> This obviously makes the read error rewrite feature of our beloved
> RAID5/6 code somewhat less than useful.

Are you sure that refresh-writes triggered by read errors are expected
behaviour of md's RAID5/6 mode? Only they weren't for RAID1 until somewhat
recently (2.6.15, IIRC).

Best Regards,
Alex

Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 09:23:08 by Matthias Urlichs

On Tue, 15 Sep 2009 02:45:29 -0400, berk walker wrote:

> I can not believe the question. What file system might this be?

Umm, what's your problem with my question?

And why would it matter which file system I'm using?


_My_ problem is that I have a bunch of disks which are not as reliable as
I'd like. Yes I could go and buy a new heap of 1TB disks, but frankly I'd
like to avoid that. These disks are "good enough" for the data that's on
them. I'll replace one if it fails entirely -- assuming that I can
rebuild the RAID6 array when I do that. However, since the rewrite-after-
read code has caused bad sectors to accumulate on all of these disks, I
can't even do that at the moment.

(And, since there's no command which knows how to recover bad spots from
the other RAID disks yet (I hope to be able to work on _that_ problem
next week), I can't even use ddrescue to copy one almost-good disk to a
new one.)

--
Matthias Urlichs


Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 09:29:18 by Matthias Urlichs

On Tue, 15 Sep 2009 08:13:08 +0100, Alex Butcher wrote:

> IME, discs don't remap after read errors, only on writes.

Some may remap after recoverable read errors. However, the RAID code
does (I assume - see below) rewrite the data -- which the disk happily
acknowledges -- only to report the very same error next time that spot's
being read. :-(

> Are you sure that refresh-writes triggered by read errors are expected
> behaviour of md's RAID5/6 mode?

Not 100%, no -- but recovering the data while otherwise ignoring the error
(other than incrementing the error counter) would be a level of foolishness
I won't assume of the RAID code's authors.

--
Matthias Urlichs


Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 09:37:52 by Alex Butcher

On Tue, 15 Sep 2009, Matthias Urlichs wrote:

> On Tue, 15 Sep 2009 08:13:08 +0100, Alex Butcher wrote:
>
>> IME, discs don't remap after read errors, only on writes.
>
> Some may remap after recoverable read errors. However, the RAID code
> does (I assume - see below) rewrite the data -- which the disk happily
> acknowledges -- only to report the very same error next time that spot's
> being read. :-(

Odd. If I hadn't observed something similar myself with a 40G Maxtor
(badblocks -w fails, wipe with dd if=/dev/zero, badblocks -w succeeds,
badblocks -w fails again), I wouldn't believe it. SMART seems to think that
it's nowhere near the reallocated sector count threshold. The only
conclusion I can come to is that the firmware is trash, or being way too
forgiving of inconsistently-performing spinning media. Either way, it's not
suitable for data I even care a little bit about.

What does SMART say about reallocated and pending sectors on your disks? If
the reallocated threshold has been crossed, this might be the failure mode,
I guess.

What make/model are they?

>> Are you sure that refresh-writes triggered by read errors are expected
>> behaviour of md's RAID5/6 mode?
>
> Not 100%, no -- but recovering the data while otherwise ignoring the error
> (other than incrementing the error counter) would be a level of foolishness
> I won't assume of the RAID code's authors.

Well, the RAID code had been in the kernel and was being used in production
systems for quite some time before 2.6.15 came along. It took a BSD user to
point it out and a read through the kernel source for me to believe it...

Cheers,
Alex

Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 12:40:44 by majedb

Hello,

I'm facing a similar problem now with 2 disks. The
Current_Pending_Sectors and Offline_Uncorrectable are higher than 100,
on a RAID5. SMART monitoring tools failed to report these after each
test, so now I'm battling through...

I'm running the array degraded and yesterday while trying to copy the
data to another array (5.5TB), one disk jumped out of the dodgy array
and caused I/O errors... It won't even resync to another disk beyond
15.6%.

Currently, I'm cloning with dd_rescue and hoping to be able to copy
most of the data, and accept some data loss...

Would anyone suggest a better solution?

P.S.: The disks in question are WD, model: WDC WD10EACS-00ZJB0. I have
other WD disks and they're intact and have zero bad sectors...

On Tue, Sep 15, 2009 at 9:23 AM, Matthias Urlichs wrote:
> Hi,
>
> my problem is that I have a bunch of crappy disks which seem unable to
> reliably remap bad areas after a read error.
>
> This obviously makes the read error rewrite feature of our beloved
> RAID5/6 code somewhat less than useful.
>
> What I would like to do is to re-map these sectors in userspace -- either
> by browbeating the disk into it, or by using the Device Mapper. So I'd
> need a way to tell a userspace daemon "this device+block is unreadable",
> and wait until said daemon tells the RAID core to go ahead.
>
> I can do the userspace side easily, but my time to dig through the RAID
> code and implement that sort of channel in a maintainable way is somewhat
> limited. (Plus, I need that code sooner rather than later.)
>
> Would somebody be able to help out? There may be some money in it ...
>
> --
> Matthias Urlichs



--
Majed B.

Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 12:48:43 by Matthias Urlichs

On Tue, 15 Sep 2009 08:37:52 +0100, Alex Butcher wrote:

> Either way, it's not
> suitable for data I even care a little bit about.

Ordinarily I'd agree with you. In this case, however, the data is mostly
read-only and on backup media. So I don't really care if the disks fall
off the edge of a cliff; the data will survive.

I can justify a moderate amount of time working on this, with the
hardware I have. I can't really justify buying eight new disks.

NB: Please don't dismiss this kind of setup out of hand. I know that
disks are cheap enough these days that the typical professional user
won't ever need to worry about not being able to replace hardware which
behaves like this. However, many people happen to be in a different
situation. :-/

--
Matthias Urlichs


Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 12:52:07 by Matthias Urlichs

On Tue, 15 Sep 2009 13:40:44 +0300, Majed B. wrote:

> Would anyone suggest a better solution?

You should tell ddrescue to log which sectors it failed to copy. You can
then recover the missing data by reading the stuff at that offset from
the other disks, and XORing the bytes.
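
A minimal sketch of that XOR step in Python, assuming 512-byte sectors and
member devices that all start at the same offset (the device names and the
sector number below are placeholders, not from this thread):

import functools

SECTOR = 512
bad_sector = 1240859816                 # placeholder: sector unreadable on the bad member
members = ['/dev/sdc1', '/dev/sdd1',    # placeholders: the surviving RAID5 members
           '/dev/sde1', '/dev/sdf1']

def read_sector(dev, sector):
    # read one raw sector from a member device (needs root)
    with open(dev, 'rb') as f:
        f.seek(sector * SECTOR)
        return f.read(SECTOR)

# RAID5: parity is the XOR of the data blocks, so the missing member's
# sector is the XOR of the same-offset sector on every other member,
# no matter which of them holds parity for that stripe.
recovered = functools.reduce(
    lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
    (read_sector(dev, bad_sector) for dev in members))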

I plan to write a program which does that (and which also understands
RAID1 and RAID6). How long can you survive without your data?

--
Matthias Urlichs


Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 13:03:17 by majedb

I've been trying to migrate for 2 weeks. I can wait another 2 ... maybe 3 weeks.

Just to be clear, I'm using dd_rescue, not ddrescue (ddrescue is the GNU
one). I read the log option but forgot to use it... now I've wasted
over 20 hours... ugh ... /smacks self

That would be a very useful program for cases like this!

On Tue, Sep 15, 2009 at 1:52 PM, Matthias Urlichs wrote:
> On Tue, 15 Sep 2009 13:40:44 +0300, Majed B. wrote:
>
>> Would anyone suggest a better solution?
>
> You should tell ddrescue to log which sectors it failed to copy. You can
> then recover the missing data by reading the stuff at that offset from
> the other disks, and XORing the bytes.
>
> I plan to write a program which does that (and which also understands
> RAID1 and RAID6). How long can you survive without your data?
>
> --
> Matthias Urlichs



--
Majed B.

Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 19:02:41 by majedb

Matthias,

Out of curiosity, how will you find the sectors/blocks that
reconstruct a certain bad sector? Is the data spread to the same block
number on all disks?

On Tue, Sep 15, 2009 at 2:03 PM, Majed B. wrote:
> I've been trying to migrate for 2 weeks. I can wait another 2 ... maybe 3 weeks.
>
> Just to be clear, I'm using dd_rescue, not ddrescue (ddrescue is the GNU
> one). I read the log option but forgot to use it... now I've wasted
> over 20 hours... ugh ... /smacks self
>
> That would be a very useful program for cases like this!
>
> On Tue, Sep 15, 2009 at 1:52 PM, Matthias Urlichs wrote:
>> On Tue, 15 Sep 2009 13:40:44 +0300, Majed B. wrote:
>>
>>> Would anyone suggest a better solution?
>>
>> You should tell ddrescue to log which sectors it failed to copy. You can
>> then recover the missing data by reading the stuff at that offset from
>> the other disks, and XORing the bytes.
>>
>> I plan to write a program which does that (and which also understands
>> RAID1 and RAID6). How long can you survive without your data?
>>
>> --
>> Matthias Urlichs
>
> --
> Majed B.
>



--
Majed B.

Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 20:05:53 by Matthias Urlichs

On Tue, 2009-09-15 at 20:02 +0300, Majed B. wrote:
> Out of curiosity, how will you find the sectors/blocks that
> reconstruct a certain bad sector? Is the data spread to the same block
> number on all disks?

Yes. It's a byte-level operation, actually.

The only part that's moderately tricky is, on RAID6, determining which
partition holds the Q parity. Fortunately, mdadm already contains (almost)
all the necessary logic.
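
As a hedged sketch for md's default left-symmetric layout only (the chunk
size is a placeholder; verify the layout with mdadm -E before trusting it):

def pq_disks(stripe, raid_disks):
    # md's default left-symmetric layout: P moves back one disk per
    # stripe, and Q sits on the disk right after P (wrapping around).
    pd = raid_disks - 1 - (stripe % raid_disks)
    qd = (pd + 1) % raid_disks
    return pd, qd

def stripe_of(member_sector, chunk_sectors=128):   # placeholder: 64 KiB chunks
    # the rotation advances once per chunk-sized stripe on each member
    return member_sector // chunk_sectors

print(pq_disks(stripe_of(1240859816), 8))          # e.g. for an 8-disk array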


Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 20:14:55 by majedb

Hmm, so I guess I'm luckier since I run RAID5? (or not because I have
2 bad disks? :p)

When do you expect to have a working application done, by the way?

On Tue, Sep 15, 2009 at 9:05 PM, Matthias Urlichs wrote:
> On Tue, 2009-09-15 at 20:02 +0300, Majed B. wrote:
>> Out of curiosity, how will you find the sectors/blocks that
>> reconstruct a certain bad sector? Is the data spread to the same block
>> number on all disks?
>
> Yes. It's a byte-level operation, actually.
>
> The only part that's moderately tricky is, on RAID6, to determine which
> partition the Q drive is. Fortunately, mdadm already contains (almost)
> all the necessary logic.
>
>



--
Majed B.

Re: Re-map disk sectors in userspace when rewriting after read errors

on 15.09.2009 20:44:42 by Matthias Urlichs

On Tue, 2009-09-15 at 21:14 +0300, Majed B. wrote:
> Hmm, so I guess I'm luckier since I run RAID5? (or not because I have
> 2 bad disks? :p)
>
Well, depends on whether you have two errors in the same sector. If not,
you're going to be lucky.

> When do you expect to have a working application done, by the way?
>
Hopefully later this week. It'll probably be a patch to mdadm's
development branch of some sort.

Neil: In order to do that, I need to read badblock map files for some
(or all) disks, in GNU dd_rescue's format preferably. Do you have a
preference WRT how to tell mdadm about these?

I tend towards "mdadm --recover 0:foo,2:bar DISK_DEVICE...". This would
tell mdadm that the badblock map for disk 0 is in file 'foo', the map
for disk 2 is in 'bar', and the other disks are supposed to be cleanly
read/writeable.

mdadm would then read RAID info from these devices, make sure it's
consistent (or "mostly consistent" if using --force), read the bad block
map, recover the data that's indicated to be bad and write it to the
partitions in question, and zero out the blocks that are unrecoverable
(and restore P+Q vectors for them).
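
A sketch of the reading side, assuming GNU ddrescue's logfile format (hex
position, hex size, one status character, with '-' marking bad areas);
dd_rescue's own log format differs:

def bad_sectors(logfile, sector_size=512):
    """Yield (first_sector, sector_count) for each area marked bad ('-')."""
    with open(logfile) as f:
        for line in f:
            line = line.split('#', 1)[0].strip()   # strip comments
            fields = line.split()
            if len(fields) == 3 and fields[2] == '-':
                pos, size = int(fields[0], 16), int(fields[1], 16)
                yield pos // sector_size, size // sector_size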



Re: Re-map disk sectors in userspace when rewriting after read errors

on 16.09.2009 11:31:28 by majedb

Matthias,

I have a question which would probably sound stupid: If I have a bad
blocks output file from dd_rescue, can I reconstruct a bad sector's
data by reading the same sector from all disks (using dd if=/dev/sdx
of=./bbfix_#number bs=512 count=1 skip=bb_number-1), then run a
normal XOR operation, write zeros to the bad block to force sector
remap, then dd the XOR output to the said sector?

On Tue, Sep 15, 2009 at 9:44 PM, Matthias Urlichs wrote:
> On Tue, 2009-09-15 at 21:14 +0300, Majed B. wrote:
>> Hmm, so I guess I'm luckier since I run RAID5? (or not because I have
>> 2 bad disks? :p)
>>
> Well, depends on whether you have two errors in the same sector. If not,
> you're going to be lucky.
>
>> When do you expect to have a working application done, by the way?
>>
> Hopefully later this week. It'll probably be a patch to mdadm's
> development branch of some sort.
>
> Neil: In order to do that, I need to read badblock map files for some
> (or all) disks, in GNU dd_rescue's format preferably. Do you have a
> preference WRT how to tell mdadm about these?
>
> I tend towards "mdadm --recover 0:foo,2:bar DISK_DEVICE...". This would
> tell mdadm that the badblock map for disk 0 is in file 'foo', the map
> for disk 2 is in 'bar', and the other disks are supposed to be cleanly
> read/writeable.
>
> mdadm would then read RAID info from these devices, make sure it's
> consistent (or "mostly consistent" if using --force), read the bad block
> map, recover the data that's indicated to be bad and write it to the
> partitions in question, and zero out the blocks that are unrecoverable
> (and restore P+Q vectors for them).
>
>
>



--
Majed B.

Re: Re-map disk sectors in userspace when rewriting after read errors

on 16.09.2009 11:41:15 by Goswin von Brederlow

Matthias Urlichs writes:

> On Tue, 15 Sep 2009 08:37:52 +0100, Alex Butcher wrote:
>
>> Either way, it's not
>> suitable for data I even care a little bit about.
>
> Ordinarily I'd agree with you. In this case, however, the data is mostly
> read-only and on backup media. So I don't really care if the disks fall
> off the edge of a cliff; the data will survive.
>
> I can justify a moderate amount of time working on this, with the
> hardware I have. I can't really justify buying eight new disks.
>
> NB: Please don't dismiss this kind of setup out of hand. I know that
> disks are cheap enough these days that the typical professional user
> won't ever need to worry about not being able to replace hardware which
> behaves like this. However, many people happen to be in a different
> situation. :-/

How about making it re-read repaired blocks so it catches when the
disk didn't remap?

I'm assuming the following happens:

1) disk read fails
2) raid rebuilds the block from parity
3) raid writes block to bad disk
4) disk writes data to the old block and fails to detect a write error
that would trigger a remapping
5) re-read of the data succeeds because the data is still in the
drive's disk cache
6) later read of the data fails because nothing was remapped

So you would need to write some repair-check-daemon that remembers
repaired blocks, waits for enough data to have passed through the
drive to flush the disk cache and then retries the block again.
And again and again till it stops giving errors.
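
A sketch of the re-check such a daemon could do; O_DIRECT only keeps the
kernel's page cache out of the way, so the wait is what ages the data out
of the drive's own cache (sector number and timings are placeholders):

import os, mmap, time

def really_readable(dev, sector, retries=3, wait=600):
    buf = mmap.mmap(-1, 4096)          # page-aligned buffer, required by O_DIRECT
    fd = os.open(dev, os.O_RDONLY | os.O_DIRECT)
    try:
        for _ in range(retries):
            time.sleep(wait)           # let the drive's cache age out first
            os.lseek(fd, sector * 512 // 4096 * 4096, os.SEEK_SET)
            try:
                os.readv(fd, [buf])    # a media error surfaces as OSError (EIO)
                return True
            except OSError:
                continue               # still bad: the rewrite didn't remap
        return False
    finally:
        os.close(fd)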


Alternatively write a re-map device-mapper target that reserves some
space of the disk and remaps bad blocks itself.

Regards,
Goswin

Re: Re-map disk sectors in userspace when rewriting after read errors

on 16.09.2009 11:44:26 by Matthias Urlichs

On Wed, 2009-09-16 at 12:31 +0300, Majed B. wrote:
> I have a question which would probably sound stupid: If I have a bad
> blocks output file from dd_rescue, can I reconstruct a bad sector's
> data by reading the same sector from all disks (using dd if=/dev/sdx
> of=./bbfix_#number bs=512 count=1 skip=bb_number-1), then run an
> normal XOR operation, write zeros to the bad block to force sector
> remap, then dd the XOR output to the said sector?

Well, of course. Assuming that the disk's sector remap works, which was
my problem, and that we're talking about RAID5.
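
A sketch of the write-back half in Python, mirroring the dd steps above;
note that without O_DIRECT both writes go through the page cache, so they
may be merged before they reach the drive:

import os

def rewrite_sector(dev, sector, data):
    assert len(data) == 512
    fd = os.open(dev, os.O_WRONLY)     # needs root
    try:
        os.lseek(fd, sector * 512, os.SEEK_SET)
        os.write(fd, b'\0' * 512)      # zero first, giving the firmware a chance to remap
        os.fsync(fd)
        os.lseek(fd, sector * 512, os.SEEK_SET)
        os.write(fd, data)             # then put the reconstructed sector back
        os.fsync(fd)
    finally:
        os.close(fd)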



Re: Re-map disk sectors in userspace when rewriting after read errors

on 16.09.2009 11:52:42 by majedb

That's good, I guess, but I fell into what seems to be a problem yesterday.

I've mentioned before that I have 8 disks in an array, 7 of which
belong to it (degraded), and one doesn't. That outsider disk had bad
sectors. I wrote zeros to the disk yesterday and both Pending and
Offline counts have been reset, but Reallocation count didn't
increase. I did run an immediate offline smartd test after zeroing the
disk...

Does that make sense?!

On Wed, Sep 16, 2009 at 12:44 PM, Matthias Urlichs wrote:
> On Wed, 2009-09-16 at 12:31 +0300, Majed B. wrote:
>> I have a question which would probably sound stupid: If I have a bad
>> blocks output file from dd_rescue, can I reconstruct a bad sector's
>> data by reading the same sector from all disks (using dd if=/dev/sdx
>> of=./bbfix_#number bs=512 count=1 skip=bb_number-1), then run a
>> normal XOR operation, write zeros to the bad block to force sector
>> remap, then dd the XOR output to the said sector?
>
> Well, of course. Assuming that the disk's sector remap works, which was
> my problem, and that we're talking about RAID5.
>
>
>



--
Majed B.

Re: Re-map disk sectors in userspace when rewriting after read errors

on 16.09.2009 12:00:23 by Robin Hill


On Wed Sep 16, 2009 at 11:44:26AM +0200, Matthias Urlichs wrote:

> On Wed, 2009-09-16 at 12:31 +0300, Majed B. wrote:
> > I have a question which would probably sound stupid: If I have a bad
> > blocks output file from dd_rescue, can I reconstruct a bad sector's
> > data by reading the same sector from all disks (using dd if=/dev/sdx
> > of=./bbfix_#number bs=512 count=1 skip=bb_number-1), then run a
> > normal XOR operation, write zeros to the bad block to force sector
> > remap, then dd the XOR output to the said sector?
>
> Well, of course. Assuming that the disk's sector remap works, which was
> my problem, and that we're talking about RAID5.
>
And also assuming that the array starts from the same sector of each
disk.

Cheers,
Robin
--
    ___
  ( ' }   | Robin Hill                  |
  / / )   | Little Jim says ....        |
 // !!    | "He fallen in de water !!"  |


Re: Re-map disk sectors in userspace when rewriting after read errors

on 16.09.2009 12:07:43 by majedb

Thank you for the heads up, Robin.

I've just checked and it seems that they do start from the same sector:

/dev/sdg: WDC WD10EADS-65L5B1
/dev/sdh: MAXTOR STM31000340AS
root@Adam:/boot# fdisk -l /dev/sd[g-h]

Disk /dev/sdg: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdg1               1      121601   976760001   fd  Linux raid autodetect

Disk /dev/sdh: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdh1               1      121601   976760001   fd  Linux raid autodetect


There are other disks in the array, but the rest are all WD disks and
have a similar structure to the one above.

On Wed, Sep 16, 2009 at 1:00 PM, Robin Hill wrote:
> On Wed Sep 16, 2009 at 11:44:26AM +0200, Matthias Urlichs wrote:
>
>> On Wed, 2009-09-16 at 12:31 +0300, Majed B. wrote:
>> > I have a question which would probably sound stupid: If I have a bad
>> > blocks output file from dd_rescue, can I reconstruct a bad sector's
>> > data by reading the same sector from all disks (using dd if=/dev/sdx
>> > of=./bbfix_#number bs=512 count=1 skip=bb_number-1), then run a
>> > normal XOR operation, write zeros to the bad block to force sector
>> > remap, then dd the XOR output to the said sector?
>>
>> Well, of course. Assuming that the disk's sector remap works, which was
>> my problem, and that we're talking about RAID5.
>>
> And also assuming that the array starts from the same sector of each
> disk.
>
> Cheers,
>    Robin



--
Majed B.

Re: Re-map disk sectors in userspace when rewriting after read errors

on 16.09.2009 15:05:24 by Alex Butcher

On Wed, 16 Sep 2009, Majed B. wrote:

> I wrote zeros to the disk yesterday and both Pending and Offline counts
> have been reset, but Reallocation count didn't increase.

Soft, rather than hard errors, presumably. These can occur if a drive is
writing when power is unexpectedly removed.

HTH,
Alex

Re: Re-map disk sectors in userspace when rewriting after read errors

on 16.09.2009 15:13:33 by Matthias Urlichs

On Wed, 2009-09-16 at 11:41 +0200, Goswin von Brederlow wrote:
> Alternatively write a re-map device-mapper target that reserves some
> space of the disk and remaps bad blocks itself.
>
That'd require some place to store the mapping so that the whole thing
still works after a reboot. Which should probably be on a different
disk.

I tend to want to move (part of) that problem to userspace; you may want
to do more than a simple remapping of a few blocks when that happens
(e.g. test-reading the surrounding area).
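
A sketch of what that userspace side could hand to dmsetup using stock
dm-linear segments instead of a new target, with a reserved tail of the
disk as the remap area (all numbers are placeholders, and the bad-sector
list itself still has to be persisted elsewhere, as noted above):

def remap_table(dev, dev_sectors, reserved_sectors, bad_sectors):
    # Build a dm-linear table that presents the disk minus its reserved
    # tail, with each bad sector redirected into that tail.
    usable = dev_sectors - reserved_sectors
    table, cur = [], 0
    for i, bad in enumerate(sorted(bad_sectors)):
        if bad > cur:
            table.append('%d %d linear %s %d' % (cur, bad - cur, dev, cur))
        table.append('%d 1 linear %s %d' % (bad, dev, usable + i))
        cur = bad + 1
    if cur < usable:
        table.append('%d %d linear %s %d' % (cur, usable - cur, dev, cur))
    return '\n'.join(table)

# feed the output to: dmsetup create remapped
print(remap_table('/dev/sdb', 1953525168, 2048, [1240859816]))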


Re: Re-map disk sectors in userspace when rewriting after read errors

on 18.09.2009 10:17:27 by majedb

I've re-read this thread and I was wondering if: echo check >
/sys/block/$array/md/sync_action would help me (and possibly Matthias)
in any way.

I have a RAID5 array of 8 disks running degraded on 7. One of the 7
has bad sectors and the one that is not in the array also had bad
sectors.

I zeroed the one out of the array (with dd) and then cloned the one
with bad sectors in the array to it using dd_rescue.

Later, I reassembled the array using the cloned disk instead of the original.

So now, I'm sure I still have inconsistencies, but would doing the
action above force a correction? Also, would that work on a degraded
array?

Thank you.

On Wed, Sep 16, 2009 at 4:13 PM, Matthias Urlichs wrote:
> On Wed, 2009-09-16 at 11:41 +0200, Goswin von Brederlow wrote:
>> Alternatively write a re-map device-mapper target that reserves some
>> space of the disk and remaps bad blocks itself.
>>
> That'd require some place to store the mapping so that the whole thing
> still works after a reboot. Which should probably be on a different
> disk.
>
> I tend to want to move (part of) that problem to userspace; you may want
> to do more than a simple remapping of a few blocks when that happens
> (e.g. test-reading the surrounding area).
>



--
Majed B.

Re: Re-map disk sectors in userspace when rewriting after read errors

on 18.09.2009 10:28:51 by Robin Hill


On Fri Sep 18, 2009 at 11:17:27AM +0300, Majed B. wrote:

> I've re-read this thread and I was wondering if: echo check >
> /sys/block/$array/md/sync_action would help me (and possibly Matthias)
> in any way.
>
> I have a RAID5 array of 8 disks running degraded on 7. One of the 7
> has bad sectors and the one that is not in the array also had bad
> sectors.
>
> I zeroed the one out of the array (with dd) and then cloned the one
> with bad sectors in the array to it using dd_rescue.
>
> Later, I reassembled the array using the cloned disk instead of the original.
>
> So now, I'm sure I still have inconsistencies, but would doing the
> action above force a correction? Also, would that work on a degraded
> array?
>
All the 'check' action does is validate that the checksum matches the
data. By doing this, it will also be doing a full read check on the
array (though I'm not certain what action is taken on read failures).
The 'repair' action will also rewrite any checksums which don't match
the data.

All of this requires a non-degraded array, so I suspect the 'check' and
'repair' actions will get ignored altogether on a degraded array (and
certainly won't actually work). As the array is degraded, you _can't_
have any RAID inconsistencies. You may have some filesystem
inconsistencies (a fsck is definitely recommended) and/or data
inconsistencies (unless you have checksums or backups to compare against
then you're stuck on finding these though).
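
For reference, the sysfs side of that on a healthy array, as a sketch
('check' also updates md's mismatch_cnt; the array name is a placeholder):

import time

md = '/sys/block/md0/md'               # placeholder array

def md_read(name):
    with open('%s/%s' % (md, name)) as f:
        return f.read().strip()

with open(md + '/sync_action', 'w') as f:
    f.write('check\n')                 # 'repair' would also rewrite mismatches

while md_read('sync_action') != 'idle':
    time.sleep(10)                     # poll until the pass finishes

print('mismatch_cnt:', md_read('mismatch_cnt'))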

Cheers,
Robin
--
    ___
  ( ' }   | Robin Hill                  |
  / / )   | Little Jim says ....        |
 // !!    | "He fallen in de water !!"  |


Re: Re-map disk sectors in userspace when rewriting after read errors

on 18.09.2009 11:57:23 by majedb

Thank you for the insight, Robin.

I already have used dd_rescue to find which sectors are bad, so I
guess I could either wait for Matthias to finish his modifications to
mdadm, or I can reconstruct the bad sectors manually (read same sector
from other disks, xor all, write to damaged disk's clone).

Weird thing though, is that when I re-read some of the bad sectors, I
didn't get I/O errors ... it's confusing!

Also, I'd rather avoid a fsck when I have bad sectors to not lose
files. I'll run fsck once I've fixed the bad sectors and resynced the
array.

On Fri, Sep 18, 2009 at 11:28 AM, Robin Hill wrote:
> All the 'check' action does is validate that the checksum matches the
> data. By doing this, it will also be doing a full read check on the
> array (though I'm not certain what action is taken on read failures).
> The 'repair' action will also rewrite any checksums which don't match
> the data.
>
> All of this requires a non-degraded array, so I suspect the 'check' and
> 'repair' actions will get ignored altogether on a degraded array (and
> certainly won't actually work). As the array is degraded, you _can't_
> have any RAID inconsistencies. You may have some filesystem
> inconsistencies (a fsck is definitely recommended) and/or data
> inconsistencies (unless you have checksums or backups to compare against
> then you're stuck on finding these though).
>
> Cheers,
>    Robin



--
Majed B.

Re: Re-map disk sectors in userspace when rewriting after read errors

on 18.09.2009 12:22:34 by Robin Hill


On Fri Sep 18, 2009 at 12:57:23PM +0300, Majed B. wrote:

> Thank you for the insight, Robin.
>
> I already have used dd_rescue to find which sectors are bad, so I
> guess I could either wait for Matthias to finish his modifications to
> mdadm, or I can reconstruct the bad sectors manually (read same sector
> from other disks, xor all, write to damaged disk's clone).
>
This won't work if your array is degraded though - you don't have enough
data to do the reconstruction (unless you have two failed drives you can
partially read?).

> Weird thing though, is that when I re-read some of the bad sectors, I
> didn't get I/O errors ... it's confusing!
>
Odd. I'd recommend using ddrescue rather than dd_rescue - it's faster
and handles retries of bad sectors better.

> Also, I'd rather avoid a fsck when I have bad sectors to not lose
> files. I'll run fsck once I've fixed the bad sectors and resynced the
> array.
>
True - a fsck should only be done once the data's in the best possible
state.

Cheers,
Robin
--
    ___
  ( ' }   | Robin Hill                  |
  / / )   | Little Jim says ....        |
 // !!    | "He fallen in de water !!"  |


Re: Re-map disk sectors in userspace when rewriting after read errors

on 18.09.2009 12:52:14 by majedb

Well, I think my case is different from Matthias's and I can't reconstruct
the data anymore, as you said, Robin.

So this leaves me with a degraded array with bad sectors and a dodgy filesystem.

You see, I can mount the LVM Logical Volume (formatted with XFS), but
as soon as I hit some bad sectors, XFS complains and then one of the
array disks jumps out.
Just now, one disk exited the array and renamed itself from sdg to sdj
... (this is the first time this happens). According to smartctl -a
/dev/sdj, there are no bad sectors, but I still get this in
/var/log/messages

Sep 18 07:01:38 Adam kernel: [316599.950147] sd 6:0:0:0: [sdg] Result:
hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Sep 18 07:01:38 Adam kernel: [316599.950175] raid5:md0: read error not
correctable (sector 1240859816 on sdg1).
Sep 18 07:01:38 Adam kernel: [316599.950223] raid5:md0: read error not
correctable (sector 1240859824 on sdg1).
Sep 18 07:01:38 Adam kernel: [316599.950225] raid5:md0: read error not
correctable (sector 1240859832 on sdg1).
Sep 18 07:01:38 Adam kernel: [316599.950227] raid5:md0: read error not
correctable (sector 1240859840 on sdg1).
Sep 18 07:01:38 Adam kernel: [316599.950230] raid5:md0: read error not
correctable (sector 1240859848 on sdg1).
Sep 18 07:01:38 Adam kernel: [316599.950232] raid5:md0: read error not
correctable (sector 1240859856 on sdg1).
Sep 18 07:01:38 Adam kernel: [316599.950234] raid5:md0: read error not
correctable (sector 1240859864 on sdg1).
Sep 18 07:01:38 Adam kernel: [316599.950236] raid5:md0: read error not
correctable (sector 1240859872 on sdg1).
Sep 18 07:01:38 Adam kernel: [316599.950238] raid5:md0: read error not
correctable (sector 1240859880 on sdg1).
Sep 18 07:01:38 Adam kernel: [316599.950240] raid5:md0: read error not
correctable (sector 1240859888 on sdg1).

When the disk exits the array, the array becomes useless (6 out of 8 disks)
and XFS complains:

Sep 18 07:01:46 Adam kernel: [316607.896293] xfs_imap_to_bp:
xfs_trans_read_buf()returned an error 5 on dm-0. Returning error.
Sep 18 07:01:46 Adam kernel: [316607.896374] xfs_imap_to_bp:
xfs_trans_read_buf()returned an error 5 on dm-0. Returning error.
Sep 18 07:01:46 Adam kernel: [316607.896453] xfs_imap_to_bp:
xfs_trans_read_buf()returned an error 5 on dm-0. Returning error.

Here's some info on smartctl -a /dev/sdg
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

I can't find an explanation for why the disks are behaving this way...

==========================================

Plan B: Since I cloned the disk with bad sectors to another, what
would happen if I zeroed the damaged one then cloned the clone to it?!

I do realize that there will be zeros in the areas of bad sectors, but
how will mdadm/md behave? Would a resync fail?

I can run fsck at that point and files residing on bad sectors will be
the only affected ones, correct?

On Fri, Sep 18, 2009 at 1:22 PM, Robin Hill wrote:
> On Fri Sep 18, 2009 at 12:57:23PM +0300, Majed B. wrote:
>
>> Thank you for the insight, Robin.
>>
>> I already have used dd_rescue to find which sectors are bad, so I
>> guess I could either wait for Matthias to finish his modifications to
>> mdadm, or I can reconstruct the bad sectors manually (read same sector
>> from other disks, xor all, write to damaged disk's clone).
>>
> This won't work if your array is degraded though - you don't have enough
> data to do the reconstruction (unless you have two failed drives you can
> partially read?).
>
>> Weird thing though, is that when I re-read some of the bad sectors, I
>> didn't get I/O errors ... it's confusing!
>>
> Odd. I'd recommend using ddrescue rather than dd_rescue - it's faster
> and handles retries of bad sectors better.
>
>> Also, I'd rather avoid a fsck when I have bad sectors to not lose
>> files. I'll run fsck once I've fixed the bad sectors and resynced the
>> array.
>>
> True - a fsck should only be done once the data's in the best possible
> state.
>
> Cheers,
>    Robin



--
Majed B.

Re: Re-map disk sectors in userspace when rewriting after read errors

on 18.09.2009 13:15:11 by Robin Hill


On Fri Sep 18, 2009 at 01:52:14PM +0300, Majed B. wrote:

> Well, I think my case is different from Matthias's and I can't reconstruct
> the data anymore, as you said, Robin.
>
> So this leaves me with a degraded array with bad sectors and a dodgy
> filesystem.
>
> You see, I can mount the LVM Logical Volume (formatted with XFS), but
> as soon as I hit some bad sectors, XFS complains and then one of the
> array disks jumps out.
> Just now, one disk exited the array and renamed itself from sdg to sdj
> ... (this is the first time this happens). According to smartctl -a
> /dev/sdj, there are no bad sectors, but I still get this in
> /var/log/messages
>
The renaming would suggest a hard bus reset - not what I'd expect with
just a bad block.

> Here's some info on smartctl -a /dev/sdg
>   1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
>
A lot of these are only updated via offline tests, so won't change in
normal use, even if there are issues. Have you run any SMART tests on
the disk? The long test usually shows a failure if the disk has read
errors.

> Plan B: Since I cloned the disk with bad sectors to another, what
> would happen if I zeroed the damaged one then cloned the clone to it?!
>
Depends on what the actual condition of the disk is. The zeroing should
remap any bad blocks though.

> I do realize that there will be zeros in the areas of bad sectors, but
> how will mdadm/md behave? Would a resync fail?
>
mdadm doesn't care what data is on it, as long as the array metadata is
valid. Providing all disks are readable (and the new disk is writable)
then a resync would certainly work - whether the filesystem will be
usable afterwards depends on how many zeroed blocks there are and where
they fall.

> I can run fsck at that point and files residing on bad sectors will be
> the only affected ones, correct?
>
Files/directories yes - if the directory inodes get zeroed then all the
files within the directory will be affected (renamed & moved to
/lost+found).

I've had to do just this myself recently, and despite the low number of
zeroed blocks, there was an awful lot of filesystem damage (I ended up
restoring most of it from backup).


Robin
--
    ___
  ( ' }   | Robin Hill                  |
  / / )   | Little Jim says ....        |
 // !!    | "He fallen in de water !!"  |


Re: Re-map disk sectors in userspace when rewriting after read errors

on 18.09.2009 13:35:38 by Matthias Urlichs

On Fri, 2009-09-18 at 11:17 +0300, Majed B. wrote:
>
> I have a RAID5 array of 8 disks running degraded on 7. One of the 7
> has bad sectors and the one that is not in the array also had bad
> sectors.

If you run a check on a degraded array and the check runs into errors it
can't recover from, I assume that the disk will get kicked off and
you'll have a nonfunctional array instead.

Not something I'd do in your situation.

I'll try to finish my patch ASAP.

It should be possible to convince the code to read from the offline disk
when absolutely necessary, but no guarantee that I'll get that in right
away. (On second thought, this only matters for RAID6.)


Re: Re-map disk sectors in userspace when rewriting after read errors

on 18.09.2009 19:44:00 by John Robinson

On 18/09/2009 12:35, Matthias Urlichs wrote:
[...]
> If you run a check on a degraded array and the check runs into errors it
> can't recover from, I assume that the disk will get kicked off and
> you'll have a nonfunctional array instead.

No, I don't think so - at least with RAID-1, md doesn't drop the array
on errors on the one remaining functional disc, on the grounds that some
data is better than none, but I don't know whether the array gets
switched to read-only or what the situation is with other RAID levels.

Cheers,

John.

Re: Re-map disk sectors in userspace when rewriting after read errors

on 18.09.2009 20:02:10 by Greg Freemyer

All,

I keep forgetting to ask, but the subject of this thread makes me
wonder if you guys are familiar with the hdparm features of
"--make-bad-sector", "--read-sector", and "--write-sector".

I don't know if any of those can be used to force a sector to be
remapped, but I could see a user space process like:

identify corrupt sector
hdparm --make-bad-sector (to get it as corrupt as linux knows how).
calculate correct value
write new value to sector the normal way (hopefully the drive will
remap the bad sector)
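
A sketch of that cycle, assuming the hdparm flags behave as described
above; note that --write-sector fills the LBA with zeros, so the correct
contents still have to be written back afterwards:

import subprocess

def force_remap(dev, sector):
    # does a raw read of this LBA still fail?
    probe = subprocess.run(['hdparm', '--read-sector', str(sector), dev],
                           capture_output=True)
    if probe.returncode != 0:
        # rewrite the LBA directly; a sane drive remaps it here if the
        # medium is bad (hdparm insists on the safety flag for this)
        subprocess.run(['hdparm', '--yes-i-know-what-i-am-doing',
                        '--write-sector', str(sector), dev], check=True)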

hdparm --read-sector will do a low level read of the sector, including
the sector header and checksum as I understand it. I'm not sure all
that gets back to userspace.

hdparm --write-sector will force a sector to be rewritten. I don't
believe it is meant to ever cause a sector remap. Of course you never
know what a disk drive is going to do for any given command.

Mark Lord is of course the expert on all things hdparm.

Greg

Re: Re-map disk sectors in userspace when rewriting after read errors

on 18.09.2009 22:13:29 by majedb


Greg,

You don't really need to use hdparm. You can use dd to overwrite the
bad sectors with zeros which forces the disk to remap the sector.

As for calculating the new data, a friend of mine wrote me a Java
program that takes in any number of input files and XORs them, then
writes the output to a file.
The input files are the sectors' data from the other disks.

I have attached the program in case anyone is interested. Courtesy of
Eng. Hisham Farahat, who wrote the program ("sector xor", or "sexor" as I
call it).

java -jar sexor.jar file1 file2 ... fileN

The output file will always be called "out" -- do not include it in
the input list.
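
For anyone who would rather not run the jar, an equivalent sketch in
Python (same contract as the attached tool: XOR the first 512 bytes of
every input file into a file named "out"):

import sys

SECTOR = 512
res = bytearray(SECTOR)
for name in sys.argv[1:]:
    with open(name, 'rb') as f:
        buf = f.read(SECTOR)
    for i, b in enumerate(buf):
        res[i] ^= b                    # accumulate the XOR of all inputs
    print('File: %s is processed' % name)

with open('out', 'wb') as f:
    f.write(res)
print('Output file is generated')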

On Fri, Sep 18, 2009 at 9:02 PM, Greg Freemyer wrote:
> All,
>
> I keep forgetting to ask, but the subject of this thread makes me
> wonder if you guys are familiar with the hdparm features of
> "--make-bad-sector", "--read-sector", and "--write-sector".
>
> I don't know if any of those can be used to force a sector to be
> remapped, but I could see a user space process like:
>
> identify corrupt sector
> hdparm --make-bad-sector (to get it as corrupt as linux knows how).
> calculate correct value
> write new value to sector the normal way (hopefully the drive will
> remap the bad sector)
>
> hdparm --read-sector will do a low level read of the sector, including
> the sector header and checksum as I understand it. I'm not sure all
> that gets back to userspace.
>
> hdparm --write-sector will force a sector to be rewritten. I don't
> believe it is meant to ever cause a sector remap. Of course you never
> know what a disk drive is going to do for any given command.
>
> Mark Lord is of course the expert on all things hdparm.
>
> Greg



--
Majed B.

[Attachment: sexor.jar]

Re: Re-map disk sectors in userspace when rewriting after read errors

on 02.10.2009 15:55:22 by Bill Davidsen

Majed B. wrote:
> Greg,
>
> You don't really need to use hdparm. You can use dd to overwrite the
> bad sectors with zeros which forces the disk to remap the sector.
>

From the description of the problem, I would expect the md code to have
rewritten the sector, and the problem is that the failed write isn't
detected or somehow the write doesn't cause a relocate. That's my
reading of the previous discussion: disk firmware is crap.

Newegg.com had TB drives on sale for about $65 or so; hard to justify
the time to live with crap, not to mention that the same grotty firmware
which isn't getting the bad block remapped may be returning bad data
without warning. That would bother me.

--
Bill Davidsen
Unintended results are the well-earned reward for incompetence.
