possible bug in md
on 04.07.2011 18:26:14 by Iordan Iordanov
Hi,
I was doing some testing with an Ubuntu 10.04 installation (Linux
2.6.32, so my apologies if this has been noted and dealt with already),
and I noticed what I think may be a bug.
I had a system with RAID10, layout n2, where /dev/sda is one of the
devices, and the other is "missing". I wanted to add /dev/sdb to the
RAID10 array. Both drives are on their last legs (bad sectors and
stuff), and I was just doing a proof of concept for a guide I was
writing, so I didn't care.
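For anyone who wants to reproduce this setup without risking real disks, the same degraded array can be built on loop devices; this is a sketch, and the file paths, sizes, and loop-device names are illustrative, not the ones from the report (requires root):

```shell
# Create two small backing files and attach them as loop devices
# (stand-ins for the failing /dev/sda and /dev/sdb in the report).
dd if=/dev/zero of=/tmp/disk0.img bs=1M count=64
dd if=/dev/zero of=/tmp/disk1.img bs=1M count=64
losetup /dev/loop0 /tmp/disk0.img
losetup /dev/loop1 /tmp/disk1.img

# Build a degraded RAID10, layout n2, with the second member "missing".
mdadm --create /dev/md0 --level=10 --layout=n2 --raid-devices=2 \
      /dev/loop0 missing

# Later, add the second device; md starts recovery (the "sync") onto it.
mdadm /dev/md0 --add /dev/loop1
cat /proc/mdstat
```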
Here are the relevant dmesg messages for the drives detected:
====================================================
ata1.00: ATA-5: IC35L040AVER07-0, ER4OA44A, max UDMA/100
ata1.00: 80418240 sectors, multi 16: LBA
ata1.01: ATA-6: Maxtor 94610H6, BAC51KJ0, max UDMA/100
ata1.01: 90045648 sectors, multi 16: LBA
====================================================
On the system, ata1.00 is an IBM drive (/dev/sda), and ata1.01 is a
Maxtor drive (/dev/sdb). I have RAID10 (/dev/md0) on ata1.00 (/dev/sda)
and one "missing" device. I added the Maxtor (ata1.01, /dev/sdb), and
during the sync, an error occurred on ata1.00, which is the first disk
of the RAID10 array (the IBM, /dev/sda). However, mdadm wrongly reports
that an error has occurred on the device I had just ADDED (the Maxtor):
====================================================
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: BMDMA stat 0x65
ata1.00: failed command: READ DMA
ata1.00: cmd c8/00:00:00:e5:7b/00:00:00:00:00/e2 tag 0 dma 131072 in
res 51/40:39:c7:e5:7b/00:00:00:00:00/e2 Emask 0x9 (media error)
ata1.00: status: { DRDY ERR }
ata1.00: error: { UNC }
ata1.00: configured for UDMA/100
ata1.01: configured for UDMA/100
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.00: BMDMA stat 0x65
ata1.00: failed command: READ DMA
ata1.00: cmd c8/00:00:00:e5:7b/00:00:00:00:00/e2 tag 0 dma 131072 in
res 51/40:39:c7:e5:7b/00:00:00:00:00/e2 Emask 0x9 (media error)
ata1.00: status: { DRDY ERR }
ata1.00: error: { UNC }
ata1.00: configured for UDMA/100
ata1.01: configured for UDMA/100
sd 0:0:0:0: [sda] Unhandled sense code
sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
02 7b e5 c7
sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate
failed
sd 0:0:0:0: [sda] CDB: Read(10): 28 00 02 7b e5 00 00 01 00 00
end_request: I/O error, dev sda, sector 41674183
ata1: EH complete
md: md0: recovery done.
raid10: Disk failure on sdb, disabling device.
raid10: Operation continuing on 1 devices.
RAID10 conf printout:
--- wd:1 rd:2
disk 0, wo:0, o:1, dev:sda
disk 1, wo:1, o:0, dev:sdb
RAID10 conf printout:
--- wd:1 rd:2
disk 0, wo:0, o:1, dev:sda
====================================================
The relevant lines are the ones that show the errors on ata1.00 (the
IBM), and then the line which reports disk failure on /dev/sdb (ata1.01):
raid10: Disk failure on sdb, disabling device.
Sincerely,
Iordan Iordanov
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: possible bug in md
on 05.07.2011 02:24:19 by NeilBrown
On Mon, 04 Jul 2011 12:26:14 -0400 Iordan Iordanov wrote:
> Hi,
>
> I was doing some testing with an Ubuntu 10.04 installation (Linux
> 2.6.32, so my apologies if this has been noted and dealt with already),
> and I noticed what I think may be a bug.
>
> I had a system with RAID10, layout n2, where /dev/sda is one of the
> devices, and the other is "missing". I wanted to add /dev/sdb to the
> RAID10 array. Both drives are on their last legs (bad sectors and
> stuff), and I was just doing a proof of concept for a guide I was
> writing, so I didn't care.
>
> Here are the relevant dmesg messages for the drives detected:
> [dmesg output quoted in full above snipped]
>
> The relevant lines are the ones that show the errors on ata1.00 (the
> IBM), and then the line which reports disk failure on /dev/sdb (ata1.01):
>
> raid10: Disk failure on sdb, disabling device.
>
> Sincerely,
> Iordan Iordanov
Thanks for the report.
md/raid10 is behaving 'correctly' here though I agree that it is a bit
confusing.
When raid10 handles the error on sda it notes that sda is the only device, so removing it from the array would not do anyone any good, so it just passes the read error up.
The recovery process then gets to handle the read response, which it would normally do by writing the data to the spare. However, as there is no data to write, it just pretends that the write attempt failed, so the spare gets removed from the array.
This is correct in that the spare should be removed from the array, as there is nothing else useful that can be done. It is possibly not ideal in that the spare gets marked as 'faulty' when it isn't really.
I should probably fix that.
But mostly it is doing the 'right' thing.
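Since the spare in this situation is marked faulty without actually having failed, it should be possible to put it back into service once the read error on the source disk has been dealt with. A hedged sketch, assuming the /dev/md0 and /dev/sdb names from the report (requires root):

```shell
# Check how md currently sees the members; the spare will show as faulty.
mdadm --detail /dev/md0

# The spare was only marked faulty because recovery had to abort;
# remove it from the array and add it back to retry the recovery.
mdadm /dev/md0 --remove /dev/sdb
mdadm /dev/md0 --add /dev/sdb
```

A retry like this only makes sense if the original read error was transient or the bad sector on the source disk has since been rewritten or reallocated; otherwise recovery will just abort at the same spot.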
Thanks,
NeilBrown
Re: possible bug in md
on 05.07.2011 18:25:39 by Iordan Iordanov
Hi Neil,
On 07/04/11 20:24, NeilBrown wrote:
> This is correct in that the spare should be removed from the array as there
> is nothing else useful that can be done. It is possibly not ideal in that
> the spare gets marked as 'faulty' where it isn't really.
I agree that MD is doing the right thing in stopping the sync, since
there is nothing else that can be done. What it should say in the kernel
log in this case (in my opinion anyway) is something like:
raid10: Disk failure on sda, sync stopped, sdb marked faulty.
instead of:
raid10: Disk failure on sdb, disabling device.
only because /dev/sdb did not actually fail! I agree this is not
terribly important, I was reporting only for correctness, and I know
you're busy :).
Many thanks,
Iordan
Re: possible bug in md
on 14.07.2011 07:11:37 by NeilBrown
On Tue, 05 Jul 2011 12:25:39 -0400 Iordan Iordanov wrote:
> Hi Neil,
>
> On 07/04/11 20:24, NeilBrown wrote:
> > This is correct in that the spare should be removed from the array as there
> > is nothing else useful that can be done. It is possibly not ideal in that
> > the spare gets marked as 'faulty' where it isn't really.
>
> I agree that MD is doing the right thing in stopping the sync, since
> there is nothing else that can be done. What it should say in the kernel
> log in this case (in my opinion anyway) is something like:
>
> raid10: Disk failure on sda, sync stopped, sdb marked faulty.
>
> instead of:
>
> raid10: Disk failure on sdb, disabling device.
>
> only because /dev/sdb did not actually fail! I agree this is not
> terribly important, I was reporting only for correctness, and I know
> you're busy :).
>
> Many thanks,
> Iordan
I have made some changes to RAID10 so that it will not report that
a device has failed when really it hasn't. It will abort the recovery,
ensure that another recovery doesn't automatically restart, and will
report why the recovery was aborted.
Thanks,
NeilBrown
Re: possible bug in md
on 14.07.2011 18:17:37 by Iordan Iordanov
Hi Neil,
On 07/14/11 01:11, NeilBrown wrote:
> I have made some changes to RAID10 so that it will not report that
> a device has failed when really it hasn't. It will abort the recovery,
> ensure that another recovery doesn't automatically restart, and will
> report why the recovery was aborted.
Many thanks for taking care of that!
On a related note, do you know what would happen if, on a 3-device RAID1
(mirror), I failed one of the drives and triggered a rebuild onto a
spare, then determined which device is the "source" for the rebuild and
yanked it out (or failed it)?
Would the RAID1 recover and start syncing from the next available (last
remaining) valid device, or will it fail? If you don't know, I will
conduct a test and report the result.
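The experiment described above could be scripted along these lines, again on loop devices rather than real drives; this is an untested sketch, and the loop-device names (including which one turns out to be the rebuild source) are assumptions (requires root):

```shell
# Three-way RAID1 plus one spare, built on loop devices.
mdadm --create /dev/md1 --level=1 --raid-devices=3 --spare-devices=1 \
      /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3

# Fail one member; md should start rebuilding onto the spare.
mdadm /dev/md1 --fail /dev/loop0

# Watch /proc/mdstat to see the rebuild in progress, then fail the
# device being read from and check whether recovery continues from
# the remaining good member or aborts.
cat /proc/mdstat
mdadm /dev/md1 --fail /dev/loop1   # assuming loop1 is the read source
cat /proc/mdstat
```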
Cheers,
Iordan