check which disk is a problem

on 19.05.2011 12:34:39 by Pol Hallen

Hi folks :-)

I have a software RAID 6 on Debian stable, and a problem (!):

cat /proc/mdstat

Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdc1[0] sdf1[5] sdg1[4] sdh1[3] sdd1[1]
5860543744 blocks level 6, 64k chunk, algorithm 2 [6/5] [UU_UUU]

so I think /dev/sde is the faulty disk.

How can I identify this disk?

blkid:

/dev/sdc1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"
/dev/sdd1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"
/dev/sde1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"
/dev/sdf1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"
/dev/sdg1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"
/dev/sdh1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"

Why do they all have the same UUID?

And how can I resolve this?

thanks!


Pol

Re: check which disk is a problem

on 19.05.2011 13:20:12 by Kay Diederichs

On 05/19/2011 12:34 PM, Pol Hallen wrote:
> Hi folks :-)
>
> I have a software RAID 6 on Debian stable, and a problem (!):
>
> cat /proc/mdstat
>
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid6 sdc1[0] sdf1[5] sdg1[4] sdh1[3] sdd1[1]
> 5860543744 blocks level 6, 64k chunk, algorithm 2 [6/5] [UU_UUU]
>
> so I think /dev/sde is the faulty disk.
>
> How can I identify this disk?
>

hdparm -tT /dev/sde
will make its lights flicker for about 5 seconds.
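
If you can't watch the LEDs, another option is to read the drive's serial
number and match it against the label on the disk, e.g.:

hdparm -I /dev/sde | grep -i serial
ls -l /dev/disk/by-id/ | grep sde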

HTH,
Kay


Re: check which disk is a problem

on 19.05.2011 13:47:34 by John Robinson

On 19/05/2011 11:34, Pol Hallen wrote:
> Hi folks :-)
>
> I have a software RAID 6 on Debian stable, and a problem (!):
>
> cat /proc/mdstat
>
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid6 sdc1[0] sdf1[5] sdg1[4] sdh1[3] sdd1[1]
> 5860543744 blocks level 6, 64k chunk, algorithm 2 [6/5] [UU_UUU]
>
> so I think /dev/sde is the faulty disk.
>
> How can I identify this disk?
>
> blkid:
>
> /dev/sdc1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"
> /dev/sdd1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"
> /dev/sde1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"
> /dev/sdf1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"
> /dev/sdg1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"
> /dev/sdh1: UUID="9bd6372e-e2ea-b1d5-d2bd-c3cbad12f41d" TYPE="linux_raid_member"
>
> Why do they all have the same UUID?
>
> And how can I resolve this?

You can find out which discs/partitions are meant to be in the array with
mdadm -D /dev/md0

and if, as it appears, there's one missing, you can see what state it's in with
mdadm -E /dev/sde1
(or similar).
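
The Events counter and the State reported there show whether the member
simply fell out of sync or its superblock is unreadable, e.g.:

mdadm -E /dev/sde1 | grep -E 'State|Events'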

You should look through your logs to see if you can see what happened to
it. You should also check its SMART status with e.g.
smartctl -a /dev/sde
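
In the SMART output, the reallocated, pending and CRC error counts are the
attributes that usually matter, so something like this gives a quick check:

smartctl -A /dev/sde | grep -i -e reallocated -e pending -e crc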

If it's not dead or dying, you may be able to re-add it with
mdadm /dev/md0 --add /dev/sde1
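
(mdadm also has a --re-add option, which may avoid a full resync if the
array has a write-intent bitmap). You can then watch the recovery with

watch cat /proc/mdstat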

Hope this helps!

Cheers,

John.


Re: check which disk is a problem

on 19.05.2011 16:17:30 by Pol Hallen

> You should look through your logs to see if you can see what happened to
> it. You should also check its SMART status with e.g.
> smartctl -a /dev/sde

ok thanks :-)

What also worries me is this:

May 19 11:02:47 lorna kernel: [220141.636031] ata9: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
May 19 11:02:47 lorna kernel: [220141.708820] ata9.00: configured for UDMA/33
May 19 11:02:47 lorna kernel: [220141.710840] ata9: EH complete
May 19 11:03:13 lorna kernel: [220167.526518] ata9: exception Emask 0x10 SAct 0x0 SErr 0x10000 action 0xe frozen
May 19 11:03:13 lorna kernel: [220167.528619] ata9: irq_stat 0x00400000, PHY RDY changed
May 19 11:03:13 lorna kernel: [220167.530664] ata9: SError: { PHYRdyChg }
May 19 11:03:13 lorna kernel: [220167.532735] ata9: hard resetting link
May 19 11:03:14 lorna kernel: [220168.263086] EXT4-fs error (device md0): ext4_ext_search_right: bad header/extent in inode #260441748: invalid extent entries - magic f30a, entries 125, max 340(340), depth 0(0)
May 19 11:03:14 lorna kernel: [220168.464708] EXT4-fs (md0): delayed block allocation failed for inode 260441748 at logical offset 130984 with max blocks 1 with error -5
May 19 11:03:14 lorna kernel: [220168.467005] This should not happen!! Data will be lost

This shouldn't happen, should it? With RAID 6 I have only one problem disk...
or is it an ext4 problem?

I use the default Debian kernel, 2.6.32-5-686-bigmem, on Debian stable.

Thanks!

Pol