Raid 6 - TLER/CCTL/ERC
on 06.10.2010 07:51:36 by Peter Zieba
Hey all,
I have a question regarding Linux raid and degraded arrays.
My configuration involves:
- 8x Samsung HD103UJ 1TB drives (terrible consumer-grade)
- AOC-USAS-L8i Controller
- CentOS 5.5 2.6.18-194.11.1.el5xen (64-bit)
- Each drive has one maximum-sized partition.
- 8-drives are configured in a raid 6.
My understanding is that with a raid 6, if a disk cannot return a given sector, it should still be possible to get what should have been returned from the first disk, from two other disks. My understanding is also that if this is successful, this should be written back to the disk that originally failed to read the given sector. I'm assuming that's what a message such as this indicates:
Sep 17 04:01:12 doorstop kernel: raid5:md0: read error corrected (8 sectors at 1647989048 on sde1)
I was hoping to confirm my suspicion on the meaning of that message.
On occasion, I'll also see this:
Oct 1 01:50:53 doorstop kernel: raid5:md0: read error not correctable (sector 1647369400 on sdh1).
This seems to involve the drive being kicked from the array, even though the drive is still readable for the most part (save for a few sectors).
What exactly are the criteria for a disk being kicked out of an array?
Furthermore, if an 8-disk raid 6 is running on the bare-minimum 6 disks, why on earth would it kick any more disks out? At this point, doesn't it make sense to simply return an error to whatever tried to read from that part of the array instead of killing the array?
In other words, I would rather be able to read from a degraded raid-6 using something like dd with "conv=sync,noerror" (as I would be able to with a single disk that has some bad sectors) than have it kick out the last drive it can possibly run on and die completely. Is there a good reason for this behavior?
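For concreteness, the sort of rescue copy I have in mind is nothing fancier than this (the output path is just an example):
dd if=/dev/md0 of=/mnt/rescue/md0.img bs=64k conv=sync,noerror
i.e. pad any unreadable block with zeros and keep going, rather than aborting the whole copy.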
Finally, why do the kernel messages all say "raid5:" when it is clearly a raid 6?:
[root@doorstop log]# cat /proc/mdstat
Personalities : [raid0] [raid6] [raid5] [raid4]
md0 : active raid6 sdc1[8](F) sdf1[7] sde1[6] sdd1[5] sda1[3] sdb1[1]
5860559616 blocks level 6, 64k chunk, algorithm 2 [8/5] [_U_U_UUU]
unused devices: <none>
As for intimate details about the behavior of the drives themselves, I've noticed the following:
- Over time, each disk develops a slowly increasing number of "Current_Pending_Sector" (ID 197; see the sketch after this list).
- The pending sector count returns to zero if a disk is removed from an array and filled with /dev/zero, or random data.
- Interestingly, on some occasions, the pending sector count did not return to zero after wiping only the partition (e.g. /dev/sda1).
- It did, however, return to zero when wiping the entire disk (/dev/sda).
- I had a feeling that this was the result of the drive "reading ahead" into the small area of unusable space between the end of the first partition and the end of the disk, and then making note of this in SMART, but not necessarily causing a noticeable problem, as the sector was never actually requested by the kernel.
- I dd'd just that part of the drive, and the pending sectors went away in those cases.
- I have on rare occasion had these drives go completely bad before (i.e., there were non-zero values for "Reallocated_Event_Count", "Reallocated_Sector_Ct", or "Offline_Uncorrectable" (#196, #5, and #198, respectively), and the drive seemed unwilling to read any sectors). These were RMA'd.
- As for the other drives, again, pending sectors do crop up, and always disappear when written to. I do not consider these drives bad. Flaky, sure. Slow to respond on error? Almost undoubtedly.
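For reference, such a check boils down to something like this (a sketch, not my actual script; device names are examples):
#!/bin/bash
# print the raw Current_Pending_Sector (attribute 197) count for each array member
for d in /dev/sd[a-h]; do
    printf '%s: ' "$d"
    smartctl -A "$d" | awk '/Current_Pending_Sector/ {print $10}'
done
# Overwriting only the slack space past the end of the last partition also cleared
# them for me; END here stands for the partition's last sector (e.g. from fdisk -lu):
# dd if=/dev/zero of=/dev/sda bs=512 seek=$((END + 1))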
Finally, I should mention that I have tried the smartctl erc commands:
http://www.csc.liv.ac.uk/~greg/projects/erc/
I could not pass them through the controller I was using, but was able to connect the drives to the controller on the motherboard, set the erc values, and still have drives dropping out.
As a terrible band-aid, if I make sure to remove a drive when I see pending sectors, nuke it with random data (or /dev/zero), and resync the array, I get the drive pending sector count to return to zero and the array is happy. Once I have too many drives with pending sectors, however, a resync is almost guaranteed to fail, and I end up having to copy my data off and rebuild the array.
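Roughly, that cycle looks like the following (a sketch with example device names, not my actual script; the array must of course still have enough redundancy left to rebuild):
mdadm /dev/md0 --fail /dev/sdc1 --remove /dev/sdc1   # drop the member showing pending sectors
dd if=/dev/zero of=/dev/sdc1 bs=1M                   # overwrite it so the pending sectors get rewritten/remapped
mdadm /dev/md0 --add /dev/sdc1                       # add it back and let the array resync
(or wipe the whole disk and re-partition it, per the read-ahead observation above).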
Instead of scripting the above (which, sadly, I have done), is there any hope of saving the investment in disks? I have a feeling that this is simply something hitting a timeout, and likely causing problems for many more than just myself.
I greatly appreciate the time taken to read this, and any feedback provided.
Thank you,
Peter Zieba
312-285-3794
Re: Raid 6 - TLER/CCTL/ERC
on 06.10.2010 13:57:58 by Phil Turmel
On 10/06/2010 01:51 AM, Peter Zieba wrote:
> Hey all,
>
> I have a question regarding Linux raid and degraded arrays.
>
> My configuration involves:
> - 8x Samsung HD103UJ 1TB drives (terrible consumer-grade)
> - AOC-USAS-L8i Controller
> - CentOS 5.5 2.6.18-194.11.1.el5xen (64-bit)
> - Each drive has one maximum-sized partition.
> - 8-drives are configured in a raid 6.
>
> My understanding is that with a raid 6, if a disk cannot return a given sector, it should still be possible to get what should have been returned from the first disk, from two other disks. My understanding is also that if this is successful, this should be written back to the disk that originally failed to read the given sector. I'm assuming that's what a message such as this indicates:
> Sep 17 04:01:12 doorstop kernel: raid5:md0: read error corrected (8 sectors at 1647989048 on sde1)
>
> I was hoping to confirm my suspicion on the meaning of that message.
>
> On occasion, I'll also see this:
> Oct 1 01:50:53 doorstop kernel: raid5:md0: read error not correctable (sector 1647369400 on sdh1).
>
> This seems to involve the drive being kicked from the array, even though the drive is still readable for the most part (save for a few sectors).
[snip /]
Hi Peter,
For read errors that aren't permanent (gone after writing to the affected sectors), a "repair" action is your friend. I used to deal with occasional kicked-out drives in my arrays until I started running the following script in a weekly cron job:
#!/bin/bash
# kick off a full check-and-repair pass on every md array
for x in /sys/block/md*/md/sync_action ; do
echo repair >$x
done
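Progress and results can then be watched through /proc/mdstat and the same sysfs directory, e.g. (md0 as an example):
cat /proc/mdstat
cat /sys/block/md0/md/sync_action     # shows "repair" while it runs, "idle" when done
cat /sys/block/md0/md/mismatch_cnt    # how many mismatches the last check/repair found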
HTH,
Phil
Re: Raid 6 - TLER/CCTL/ERC
on 06.10.2010 16:12:12 by Lemur Kryptering
I'll definitely give that a shot when I rebuild this thing.
In the meantime, is there anything that I can do to convince md not to kick the last disk (running on 6 out of 8 disks) when reading a bad spot? I've tried setting the array to read-only, but this didn't seem to help.
All I'm really trying to do is dd data off of it using "conv=sync,noerror". When it hits the unreadable spot, it simply kicks the drive from the array, leaving 4/8 disks active, taking down the array.
Again, I don't understand why md would take this action. It would make a lot more sense if it simply reported an IO error to whatever made the request.
Peter Zieba
312-285-3794
Re: Raid 6 - TLER/CCTL/ERC
on 06.10.2010 22:14:50 by Richard Scobie
Peter Zieba wrote:
> - AOC-USAS-L8i Controller
> I could not pass them through the controller I was using, but was able to connect the drives to the controller on the motherboard, set the erc values, and still have drives dropping out.
This controller uses the LSI 1068 controller chip and, up until kernel
2.6.36, is likely to offline attached drives if smartctl or smartd is used.
If you update to this kernel or later, or apply the one-line patch to
the LSI driver in earlier ones, you will be able to safely use these
monitoring utilities.
Patch as outlined by the author on the bug list and subsequently
accepted by LSI:
"It seems the mptsas driver could use blk_queue_dma_alignment() to advertise
a stricter alignment requirement. If it does, sd does the right thing and
bounces misaligned buffers (see block/blk-map.c line 57). The following
patch to 2.6.34-rc5 makes my symptoms go away. I'm sure this is the wrong
place for this code, but it gets my idea across."
diff --git a/drivers/message/fusion/mptscsih.c b/drivers/message/fusion/mptscsih.c
index 6796597..1e034ad 100644
--- a/drivers/message/fusion/mptscsih.c
+++ b/drivers/message/fusion/mptscsih.c
@@ -2450,6 +2450,8 @@ mptscsih_slave_configure(struct scsi_device *sdev)
 		ioc->name,sdev->tagged_supported, sdev->simple_tags,
 		sdev->ordered_tags));
 
+	blk_queue_dma_alignment (sdev->request_queue, 512 - 1);
+
 	return 0;
 }
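To try it on a pre-2.6.36 kernel, save the hunk above to a file and apply it in the kernel source tree before rebuilding the fusion/mptsas modules (paths and the filename here are only examples):
cd /usr/src/linux
patch -p1 --dry-run < ~/mptscsih-dma-align.patch   # check that it applies cleanly first
patch -p1 < ~/mptscsih-dma-align.patch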
Regards,
Richard
Re: Raid 6 - TLER/CCTL/ERC
on 06.10.2010 22:24:41 by John Robinson
On 06/10/2010 06:51, Peter Zieba wrote:
> Hey all,
>
> I have a question regarding Linux raid and degraded arrays.
>
> My configuration involves:
> - 8x Samsung HD103UJ 1TB drives (terrible consumer-grade)
I have some of these drives too. I wouldn't go so far as to call them
terrible, though 2 out of 3 did manage to get to a couple of pending
sectors, which went away when I ran badblocks and haven't reappeared.
> - AOC-USAS-L8i Controller
> - CentOS 5.5 2.6.18-194.11.1.el5xen (64-bit)
> - Each drive has one maximum-sized partition.
> - 8-drives are configured in a raid 6.
>
> My understanding is that with a raid 6, if a disk cannot return a given sector, it should still be possible to get what should have been returned from the first disk, from two other disks. My understanding is also that if this is successful, this should be written back to the disk that originally failed to read the given sector. I'm assuming that's what a message such as this indicates:
> Sep 17 04:01:12 doorstop kernel: raid5:md0: read error corrected (8 sectors at 1647989048 on sde1)
>
> I was hoping to confirm my suspicion on the meaning of that message.
Yup.
> On occasion, I'll also see this:
> Oct 1 01:50:53 doorstop kernel: raid5:md0: read error not correctable (sector 1647369400 on sdh1).
>
> This seems to involve the drive being kicked from the array, even though the drive is still readable for the most part (save for a few sectors).
The above indicates that a write failed. The drive should probably be
replaced, though if you're seeing a lot of these I'd start suspecting
cabling, drive chassis and/or SATA controller problems.
Hmm, is yours the SATA controller that doesn't like SMART commands? Or
at least didn't in older kernels? Do you run smartd? Try without it for
a bit... If that helps, look on Red Hat bugzilla and perhaps post a bug
report.
> What exactly are the criteria for a disk being kicked out of an array?
>
> Furthermore, if an 8-disk raid 6 is running on the bare-minimum 6 disks, why on earth would it kick any more disks out? At this point, doesn't it make sense to simply return an error to whatever tried to read from that part of the array instead of killing the array?
Because RAID isn't supposed to return bad data while bare drives are.
[...]
> Finally, why do the kernel messages all say "raid5:" when it is clearly a raid 6?:
RAIDs 4, 5 and 6 are handled by the raid5 kernel module. Again I think
the message has been changed in more recent kernels.
[...]
> Finally, I should mention that I have tried the smartctl erc commands:
> http://www.csc.liv.ac.uk/~greg/projects/erc/
>
> I could not pass them through the controller I was using, but was able to connect the drives to the controller on the motherboard, set the erc values, and still have drives dropping out.
Those settings don't stick across power cycles and presumably you
powered the drives down to change which controller they were connected
to, so your setting will have been lost.
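If you do want them back after every power-up, you would have to reapply them at boot (e.g. from rc.local); a rough sketch, assuming a smartctl new enough to support SCT ERC and that all eight drives are sda..sdh:
for d in /dev/sd[a-h]; do
    smartctl -q silent -l scterc,70,70 "$d"   # 7.0 second read/write error recovery timeout
done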
Hope this helps.
Cheers,
John.
Re: Raid 6 - TLER/CCTL/ERC
on 06.10.2010 23:22:36 by unknown
Hi,
it has been discussed many times before on the list ...
On 06.10.2010 16:12, Lemur Kryptering wrote:
> I'll definitely give that a shot when I rebuild this thing.
>
> In the meantime, is there anything that I can do to convince md not to kick the last disk (running on 6 out of 8 disks) when reading a bad spot? I've tried setting the array to read-only, but this didn't seem to help.
You can set the ERC values of your drives. Then they'll stop processing
their internal error recovery procedure after the timeout and continue
to react. Without ERC-timeout, the drive tries to correct the error on
its own (not reacting on any requests), mdraid assumes an error after a
while and tries to rewrite the "missing" sector (assembled from the
other disks). But the drive will still not react to the write request
as it is still doing its internal recovery procedure. Now mdraid
assumes the disk to be bad and kicks it.
There's nothing you can do about this vicious circle except either
enabling ERC or using Raid-Edition disks (which have ERC enabled by default).
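For what it's worth, sufficiently recent smartmontools can read and set the SCT ERC timers directly (values are in units of 100 ms and, as noted, do not survive a power cycle; /dev/sda is just an example):
smartctl -l scterc /dev/sda         # show the current read/write ERC timers
smartctl -l scterc,70,70 /dev/sda   # set both to 7.0 seconds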
Stefan
Re: Raid 6 - TLER/CCTL/ERC
on 07.10.2010 00:51:46 by Lemur Kryptering
----- "John Robinson" wrote:
> On 06/10/2010 06:51, Peter Zieba wrote:
> > Hey all,
> >
> > I have a question regarding Linux raid and degraded arrays.
> >
> > My configuration involves:
> > - 8x Samsung HD103UJ 1TB drives (terrible consumer-grade)
>
> I have some of these drives too. I wouldn't go so far as to call them
> terrible, though 2 out of 3 did manage to get to a couple of pending
> sectors, which went away when I ran badblocks and haven't reappeared.
>
Someone else suggested I echo "repair" into "sync_action" under /sys on a weekly basis. I know CentOS already has a similar cron job in there somewhere; I will take a closer look at this.
> > - AOC-USAS-L8i Controller
> > - CentOS 5.5 2.6.18-194.11.1.el5xen (64-bit)
> > - Each drive has one maximum-sized partition.
> > - 8-drives are configured in a raid 6.
> >
> > My understanding is that with a raid 6, if a disk cannot return a
> given sector, it should still be possible to get what should have been
> returned from the first disk, from two other disks. My understanding
> is also that if this is successful, this should be written back to the
> disk that originally failed to read the given sector. I'm assuming
> that's what a message such as this indicates:
> > Sep 17 04:01:12 doorstop kernel: raid5:md0: read error corrected (8
> sectors at 1647989048 on sde1)
> >
> > I was hoping to confirm my suspicion on the meaning of that
> message.
>
> Yup.
Thanks! It's a simple message but I wanted to make sure I got the meaning right. I appreciate it.
>
> > On occasion, I'll also see this:
> > Oct 1 01:50:53 doorstop kernel: raid5:md0: read error not
> correctable (sector 1647369400 on sdh1).
> >
> > This seems to involve the drive being kicked from the array, even
> though the drive is still readable for the most part (save for a few
> sectors).
>
> The above indicates that a write failed. The drive should probably be
> replaced, though if you're seeing a lot of these I'd start suspecting
> cabling, drive chassis and/or SATA controller problems.
>
> Hmm, is yours the SATA controller that doesn't like SMART commands? Or
> at least didn't in older kernels? Do you run smartd? Try without it for
> a bit... If that helps, look on Red Hat bugzilla and perhaps post a bug
> report.
>
Yes, it does seem that my controller is indeed the one that has the SMART issues. I'm fairly certain that I'm not actually experiencing any of the SMART-related issues, however, as I've had the exact same problems cropping up while the disks were connected to the motherboard. It seems that that particular problem is exacerbated by running SMART commands excessively, which I can do without seeing these errors. I will be looking into this a bit deeper to make sure, however.
> > What exactly are the criteria for a disk being kicked out of an
> array?
> >
> > Furthermore, if an 8-disk raid 6 is running on the bare-minimum
> 6-disks, why on earth would it kick any more disks out? At this point,
> doesn't it make sense to simply return an error to whatever tried to
> read from that part of the array instead of killing the array?
>
> Because RAID isn't supposed to return bad data while bare drives are.
>
If it has no choice, however, it seems like this behavior would be preferable to dying completely: it could mean the difference between one file being inaccessible and an entire machine going down. I'm starting to wonder what it would take to change this functionality...
> [...]
> > Finally, why do the kernel messages all say "raid5:" when it is
> clearly a raid 6?:
>
> RAIDs 4, 5 and 6 are handled by the raid5 kernel module. Again I think
> the message has been changed in more recent kernels.
>
Thanks! I figured it was something simple like that, but feel better knowing for sure.
> [...]
> > Finally, I should mention that I have tried the smartctl erc
> commands:
> > http://www.csc.liv.ac.uk/~greg/projects/erc/
> >
> > I could not pass them through the controller I was using, but was
> able to connect the drives to the controller on the motherboard, set
> the erc values, and still have drives dropping out.
>
> Those settings don't stick across power cycles and presumably you
> powered the drives down to change which controller they were connected
> to, so your setting will have been lost.
I'm aware the values don't stick across a power cycle. I had the array running off of the motherboard.
>
> Hope this helps.
>
> Cheers,
>
> John.
Thanks! I appreciate your feedback!
Peter Zieba
312-285-3794
Re: Raid 6 - TLER/CCTL/ERC
on 07.10.2010 01:11:11 by Lemur Kryptering
----- "Stefan /*St0fF*/ Hübner"
> wrote:
> Hi,
>=20
> it has been discussed many times before on the list ...
My apologies. I browsed a little into the past, but obviously not far e=
nough.
>=20
> Am 06.10.2010 16:12, schrieb Lemur Kryptering:
> > I'll definitely give that a shot when I rebuild this thing.
> >=20
> > In the meantime, is there anything that I can do to convince md not
> to kick the last disk (running on 6 out of 8 disks) when reading a ba=
d
> spot? I've tried setting the array to read-only, but this didn't seem
> to help.
>
> You can set the ERC values of your drives. Then they'll stop processing
> their internal error recovery procedure after the timeout and continue
> to react. Without ERC-timeout, the drive tries to correct the error on
> its own (not reacting on any requests), mdraid assumes an error after a
> while and tries to rewrite the "missing" sector (assembled from the
> other disks). But the drive will still not react to the write request
> as it is still doing its internal recovery procedure. Now mdraid
> assumes the disk to be bad and kicks it.
That sounds exactly like what I'm seeing in the logs -- the sector initially reported as bad is indeed unreadable via dd. All of the subsequent problems reported in other sectors aren't actually problems when I check on them at a later point. Couldn't this be worked around by exposing whatever timeouts there are in mdraid to something that could be adjusted in /sys?
>
> There's nothing you can do about this vicious circle except either
> enabling ERC or using Raid-Edition disks (which have ERC enabled by
> default).
>
I tried connecting the drives directly to my motherboard (my controller didn't seem to want to let me pass the ERC commands to the drives). The ERC commands took, insofar as I was able to read them back with what I set them to. This didn't seem to help much with the issues I was having, however.
Lesson learned on the non-raid edition disks. I would have spent the extra to avoid all this headache, but am now stuck with these things. I realize that not fixing the problem at the core (the drives themselves) essentially puts the burden on mdraid (which would be forced to block for a ridiculous amount of time waiting for the drive instead of just kicking it); however, in my particular case, this sort of delay would not be a cause for concern.
Would someone be able to nudge me in the right direction as far as where the logic that handles this is located?
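For what it's worth, one knob that already exists is the per-device SCSI command timeout; raising it should give a slow, non-ERC drive more time to finish its internal retries before the error is reported upward, though I haven't verified that this alone prevents the kick (sdc is an example, and the value resets on reboot):
cat /sys/block/sdc/device/timeout        # default is typically 30 (seconds)
echo 180 > /sys/block/sdc/device/timeout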
Re: Raid 6 - TLER/CCTL/ERC
on 07.10.2010 02:45:39 by Michael Sallaway
On 6/10/2010 3:51 PM, Peter Zieba wrote:
> I have a question regarding Linux raid and degraded arrays.
>
> My configuration involves:
> - 8x Samsung HD103UJ 1TB drives (terrible consumer-grade)
> - AOC-USAS-L8i Controller
> - CentOS 5.5 2.6.18-194.11.1.el5xen (64-bit)
> - Each drive has one maximum-sized partition.
> - 8-drives are configured in a raid 6.
>
> My understanding is that with a raid 6, if a disk cannot return a given sector, it should still be possible to get what should have been returned from the first disk, from two other disks. My understanding is also that if this is successful, this should be written back to the disk that originally failed to read the given sector. I'm assuming that's what a message such as this indicates:
> Sep 17 04:01:12 doorstop kernel: raid5:md0: read error corrected (8 sectors at 1647989048 on sde1)
>
> I was hoping to confirm my suspicion on the meaning of that message.
>
> On occasion, I'll also see this:
> Oct 1 01:50:53 doorstop kernel: raid5:md0: read error not correctable (sector 1647369400 on sdh1).
>
> This seems to involve the drive being kicked from the array, even though the drive is still readable for the most part (save for a few sectors).
Hi Peter,
I've just been in the *exact* same situation recently, so I can probably
answer some of your questions (only as another end-user, though!). I'm
using similar Samsung drives (the consumer 1.5TB drives), the
AOC-USASLP-L8i, and Ubuntu kernels.
First off, I don't think the LSI1068E really works properly in any
non-recent kernel; I was using 2.6.32 (stock Ubuntu 10.04 kernel), and
having all sorts of problems with the card (read errors, bus errors,
timeouts, etc.). I ended up going back to my old controller for a while.
However, I've recently changed kernel (to 2.6.35) for other reasons
(described below), and now the card is working fine. So I'm not sure how
different it will be in CentOS, but you may want to consider trying a
newer kernel in case the card is causing problems.
As for the read errors/kicking drives from the array, I'm not sure why
it gets kicked reading some sectors and not others, however I know there
were changes to the md stuff which handled that more gracefully earlier
this year. I had the same problem -- on my 2.6.32 kernel, a rebuild of
one drive would hit a bad sector on another and drop the drive, then hit
another bad sector on a different drive and drop it as well, making the
array unusable. However, with a 2.6.35 kernel it recovers gracefully and
keeps going with the rebuild. (I can't find the exact patch, but Neil
had it in an earlier email to me on the list; maybe a month or two ago?)
So again, I'd suggest trying a newer kernel if you're having trouble.
Mind you, this is only as another end-user, not a developer, so I'm sure
I've probably got something wrong in all that. :-) But that's what
worked for me.
Hope that helps,
Michael
Re: Raid 6 - TLER/CCTL/ERC
on 08.10.2010 07:47:25 by unknown
On 07.10.2010 01:11, Lemur Kryptering wrote:
>
> [...]
>
> That sounds exactly like what I'm seeing in the logs -- the sector initially reported as bad is indeed unreadable via dd. All of the subsequent problems reported in other sectors aren't actually problems when I check on them at a later point. Couldn't this be worked around by exposing whatever timeouts there are in mdraid to something that could be adjusted in /sys?
>
>>
>> There's nothing you can do about this vicious circle except either
>> enabling ERC or using Raid-Edition disks (which have ERC enabled by
>> default).
I must say yesterday we had our first Hitachi UltraStar drives - which
are supposed to be Raid-Edition. They didn't have ERC enabled. I'll
ask Hitachi about that today.
>>
>
> I tried connecting the drives directly to my motherboard (my controller didn't seem to want to let me pass the ERC commands to the drives). The ERC commands took, insofar as I was able to read them back with what I set them to. This didn't seem to help much with the issues I was having, however.
Which wouldn't work, as the SCT ERC settings are volatile, i.e. they're
gone after a power cycle.
>
> Lesson-learned on the non-raid edition disks. I would have spent the extra to avoid all this headache, but am now stuck with these things. I realize that not fixing the problem at the core (the drives themselves), essentially puts the burden on mdraid (which would be forced to block for a ridiculous amount of time waiting for the drive instead of just kicking it), however, in my particular case, this sort of delay would not be a cause for concern.
>
> Would someone be able to nudge me in the right direction as far as where the logic that handles this is located?
>
>> [...]
>>> #!/bin/bash
>>> #
>>> for x in /sys/block/md*/md/sync_action ; do
>>> echo repair >$x
>>> done
>>>[...]
That is probably the only thing you can try, as it does indeed try to
reconstruct the sector from the redundancy. But I'd try it with ERC
enabled. Maybe you'll find a way where this works (i.e. move the whole
RAID to the other computer...).
Stefan