RAID10 failed with two disks
On 22.08.2011 12:39:42 by Piotr Legiecki
Hi
I've got RAID10 on 4 disks. Suddenly two of the disks failed (I doubt the
disks actually failed; rather it is a kernel failure or maybe the
motherboard SATA controller).
So after rebooting I cannot start the array. My first question is: on
RAID10 (default layout), which disks may fail and the array still survive?
mdadm --examine /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 00.90.00
UUID : fab2336d:71210520:990002ab:4fde9f0c (local to host bez)
Creation Time : Mon Aug 22 10:40:36 2011
Raid Level : raid10
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 4
Update Time : Mon Aug 22 10:40:36 2011
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 2
Spare Devices : 0
Checksum : d4ba8390 - correct
Events : 1
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 1 0 active sync /dev/sda1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 0 0 2 faulty
3 3 0 0 3 faulty
The last two disks (failed ones) are sde1 and sdf1.
So do I have any chance of getting the array running, or is it dead?
I've tried a few steps to run the array but with no luck.
Regards
P.
Re: RAID10 failed with two disks
On 22.08.2011 13:09:03 by NeilBrown
On Mon, 22 Aug 2011 12:39:42 +0200 Piotr Legiecki wrote:
> Hi
>
> I've got RAID10 on 4 disks. Suddenly two of the disks failed (I doubt the
> disks actually failed; rather it is a kernel failure or maybe the
> motherboard SATA controller).
>
> So after rebooting I cannot start the array. My first question is: on
> RAID10 (default layout), which disks may fail and the array still survive?
Not adjacent disks.
>
> mdadm --examine /dev/sda1
> /dev/sda1:
> Magic : a92b4efc
> Version : 00.90.00
> UUID : fab2336d:71210520:990002ab:4fde9f0c (local to host bez)
> Creation Time : Mon Aug 22 10:40:36 2011
> Raid Level : raid10
> Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
> Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
> Raid Devices : 4
> Total Devices : 4
> Preferred Minor : 4
>
> Update Time : Mon Aug 22 10:40:36 2011
> State : clean
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 2
> Spare Devices : 0
> Checksum : d4ba8390 - correct
> Events : 1
>
> Layout : near=2, far=1
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 0 8 1 0 active sync /dev/sda1
>
> 0 0 8 1 0 active sync /dev/sda1
> 1 1 8 17 1 active sync /dev/sdb1
> 2 2 0 0 2 faulty
> 3 3 0 0 3 faulty
>
> The last two disks (failed ones) are sde1 and sdf1.
>
> So do I have any chance of getting the array running, or is it dead?
Possible.
Report "mdadm --examine" of all devices that you believe should be part of
the array.
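For example (just a sketch; adjust the device list to whatever you believe
the members are):
  for d in /dev/sd{a,b,e,f}1; do echo "== $d"; mdadm --examine "$d"; done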
NeilBrown
>
> I've tried a few steps to run the array but with no luck.
>
> Regards
> P.
>
Re: RAID10 failed with two disks
On 22.08.2011 13:42:54 by Piotr Legiecki
>> mdadm --examine /dev/sda1
>> /dev/sda1:
>> Magic : a92b4efc
>> Version : 00.90.00
>> UUID : fab2336d:71210520:990002ab:4fde9f0c (local to host bez)
>> Creation Time : Mon Aug 22 10:40:36 2011
>> Raid Level : raid10
>> Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
>> Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
>> Raid Devices : 4
>> Total Devices : 4
>> Preferred Minor : 4
>>
>> Update Time : Mon Aug 22 10:40:36 2011
>> State : clean
>> Active Devices : 2
>> Working Devices : 2
>> Failed Devices : 2
>> Spare Devices : 0
>> Checksum : d4ba8390 - correct
>> Events : 1
>>
>> Layout : near=2, far=1
>> Chunk Size : 64K
>>
>> Number Major Minor RaidDevice State
>> this 0 8 1 0 active sync /dev/sda1
>>
>> 0 0 8 1 0 active sync /dev/sda1
>> 1 1 8 17 1 active sync /dev/sdb1
>> 2 2 0 0 2 faulty
>> 3 3 0 0 3 faulty
>>
>> The last two disks (failed ones) are sde1 and sdf1.
>>
>> So do I have any chance of getting the array running, or is it dead?
>
> Possible.
> Report "mdadm --examine" of all devices that you believe should be part of
> the array.
/dev/sdb1:
Magic : a92b4efc
Version : 00.90.00
UUID : fab2336d:71210520:990002ab:4fde9f0c (local to host bez)
Creation Time : Mon Aug 22 10:40:36 2011
Raid Level : raid10
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 4
Update Time : Mon Aug 22 10:40:36 2011
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 2
Spare Devices : 0
Checksum : d4ba83a2 - correct
Events : 1
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 1 8 17 1 active sync /dev/sdb1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 0 0 2 faulty
3 3 0 0 3 faulty
/dev/sde1:
Magic : a92b4efc
Version : 00.90.00
UUID : 157a7440:4502f6db:990002ab:4fde9f0c (local to host bez)
Creation Time : Fri Jun 3 12:18:33 2011
Raid Level : raid10
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 4
Update Time : Sat Aug 20 03:06:27 2011
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : c2f848c2 - correct
Events : 24
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 2 8 65 2 active sync /dev/sde1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 65 2 active sync /dev/sde1
3 3 8 81 3 active sync /dev/sdf1
/dev/sdf1:
Magic : a92b4efc
Version : 00.90.00
UUID : 157a7440:4502f6db:990002ab:4fde9f0c (local to host bez)
Creation Time : Fri Jun 3 12:18:33 2011
Raid Level : raid10
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 4
Update Time : Sat Aug 20 03:06:27 2011
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : c2f848d4 - correct
Events : 24
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 81 3 active sync /dev/sdf1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 65 2 active sync /dev/sde1
3 3 8 81 3 active sync /dev/sdf1
smartd reported the sde and sdf disks as failed, but after rebooting it
does not complain anymore.
You say adjacent disks must be healthy for RAID10. So in my situation I
have adjacent disks dead (sde and sdf). It does not look good.
And does the layout (near, far, etc.) influence this rule that adjacent
disks must be healthy?
Regards
P.
Re: RAID10 failed with two disks
On 22.08.2011 14:01:29 by NeilBrown
On Mon, 22 Aug 2011 13:42:54 +0200 Piotr Legiecki wrote:
> >> mdadm --examine /dev/sda1
> >> /dev/sda1:
> >> Magic : a92b4efc
> >> Version : 00.90.00
> >> UUID : fab2336d:71210520:990002ab:4fde9f0c (local to host bez)
> >> Creation Time : Mon Aug 22 10:40:36 2011
> >> Raid Level : raid10
> >> Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
> >> Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
> >> Raid Devices : 4
> >> Total Devices : 4
> >> Preferred Minor : 4
> >>
> >> Update Time : Mon Aug 22 10:40:36 2011
> >> State : clean
> >> Active Devices : 2
> >> Working Devices : 2
> >> Failed Devices : 2
> >> Spare Devices : 0
> >> Checksum : d4ba8390 - correct
> >> Events : 1
> >>
> >> Layout : near=2, far=1
> >> Chunk Size : 64K
> >>
> >> Number Major Minor RaidDevice State
> >> this 0 8 1 0 active sync /dev/sda1
> >>
> >> 0 0 8 1 0 active sync /dev/sda1
> >> 1 1 8 17 1 active sync /dev/sdb1
> >> 2 2 0 0 2 faulty
> >> 3 3 0 0 3 faulty
> >>
> >> The last two disks (failed ones) are sde1 and sdf1.
> >>
> >> So do I have any chance of getting the array running, or is it dead?
> >
> > Possible.
> > Report "mdadm --examine" of all devices that you believe should be part of
> > the array.
>
> /dev/sdb1:
> Magic : a92b4efc
> Version : 00.90.00
> UUID : fab2336d:71210520:990002ab:4fde9f0c (local to host bez)
> Creation Time : Mon Aug 22 10:40:36 2011
> Raid Level : raid10
> Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
> Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
> Raid Devices : 4
> Total Devices : 4
> Preferred Minor : 4
>
> Update Time : Mon Aug 22 10:40:36 2011
> State : clean
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 2
> Spare Devices : 0
> Checksum : d4ba83a2 - correct
> Events : 1
>
> Layout : near=2, far=1
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 1 8 17 1 active sync /dev/sdb1
>
> 0 0 8 1 0 active sync /dev/sda1
> 1 1 8 17 1 active sync /dev/sdb1
> 2 2 0 0 2 faulty
> 3 3 0 0 3 faulty
>
>
>
> /dev/sde1:
> Magic : a92b4efc
> Version : 00.90.00
> UUID : 157a7440:4502f6db:990002ab:4fde9f0c (local to host bez)
> Creation Time : Fri Jun 3 12:18:33 2011
> Raid Level : raid10
> Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
> Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
> Raid Devices : 4
> Total Devices : 4
> Preferred Minor : 4
>
> Update Time : Sat Aug 20 03:06:27 2011
> State : clean
> Active Devices : 4
> Working Devices : 4
> Failed Devices : 0
> Spare Devices : 0
> Checksum : c2f848c2 - correct
> Events : 24
>
> Layout : near=2, far=1
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 2 8 65 2 active sync /dev/sde1
>
> 0 0 8 1 0 active sync /dev/sda1
> 1 1 8 17 1 active sync /dev/sdb1
> 2 2 8 65 2 active sync /dev/sde1
> 3 3 8 81 3 active sync /dev/sdf1
>
> /dev/sdf1:
> Magic : a92b4efc
> Version : 00.90.00
> UUID : 157a7440:4502f6db:990002ab:4fde9f0c (local to host bez)
> Creation Time : Fri Jun 3 12:18:33 2011
> Raid Level : raid10
> Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
> Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
> Raid Devices : 4
> Total Devices : 4
> Preferred Minor : 4
>
> Update Time : Sat Aug 20 03:06:27 2011
> State : clean
> Active Devices : 4
> Working Devices : 4
> Failed Devices : 0
> Spare Devices : 0
> Checksum : c2f848d4 - correct
> Events : 24
>
> Layout : near=2, far=1
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 3 8 81 3 active sync /dev/sdf1
>
> 0 0 8 1 0 active sync /dev/sda1
> 1 1 8 17 1 active sync /dev/sdb1
> 2 2 8 65 2 active sync /dev/sde1
> 3 3 8 81 3 active sync /dev/sdf1
It looks like sde1 and sdf1 are unchanged since the "failure" which happened
shortly after 3am on Saturday. So the data on them is probably good.
It looks like someone (you?) tried to create a new array on sda1 and sdb1
thus destroying the old metadata (but probably not the data). I'm surprised
that mdadm would have let you create a RAID10 with just 2 devices... Is
that what happened? or something else?
Anyway it looks as though if you run the command:
mdadm --create /dev/md4 -l10 -n4 -e 0.90 /dev/sd{a,b,e,f}1 --assume-clean
there is a reasonable chance that /dev/md4 would have all your data.
You should then
fsck -fn /dev/md4
to check that it is all OK. If it is you can
echo check > /sys/block/md4/md/sync_action
to check if the mirrors are consistent. When it finished
cat /sys/block/md4/md/mismatch_cnt
will show '0' if all is consistent.
If it is not zero but a small number, you can feel safe doing
echo repair > /sys/block/md4/md/sync_action
to fix it up.
If it is a big number.... that would be troubling.
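Putting those steps together as one sketch (device letters assumed from this
thread; keep everything read-only until the fsck output and the mismatch
count look sane):
  mdadm --create /dev/md4 -l10 -n4 -e 0.90 --assume-clean /dev/sd{a,b,e,f}1
  fsck -fn /dev/md4                            # report only, change nothing
  echo check > /sys/block/md4/md/sync_action   # compare the two halves of each mirror
  cat /proc/mdstat                             # watch this until the check finishes
  cat /sys/block/md4/md/mismatch_cnt           # 0 means the copies agree
  # only if the fsck and the mismatch count look sane:
  # echo repair > /sys/block/md4/md/sync_action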
>
>
> smartd reported the sde and sdf disks as failed, but after rebooting it
> does not complain anymore.
>
> You say adjacent disks must be healthy for RAID10. So in my situation I
> have adjacent disks dead (sde and sdf). It does not look good.
>
> And does the layout (near, far, etc.) influence this rule that adjacent
> disks must be healthy?
I didn't say adjacent disks must be healthy. I said you cannot have
adjacent disks both failing. This is not affected by near/far.
It is a bit more subtle than that though. It is OK for 2nd and 3rd to both
fail. But not 1st and 2nd or 3rd and 4th.
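To make that concrete, a rough sketch of the default near=2 layout on 4
devices (slot numbers are the RaidDevice column in the output):
  # near=2, 4 devices: each chunk is written to two adjacent slots,
  # so chunks land on slots (0,1), then (2,3), then (0,1) again, and so on.
  # Slots 0+1 form one mirror pair and slots 2+3 the other: losing both
  # members of a pair loses data; losing one from each pair is survivable.
  mdadm --detail /dev/md4   # once assembled, the RaidDevice column shows which disk holds which slot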
NeilBrown
>
>
> Regards
> P.
Re: RAID10 failed with two disks
On 22.08.2011 14:52:50 by Piotr Legiecki
NeilBrown pisze:
> It looks like sde1 and sdf1 are unchanged since the "failure" which happened
> shortly after 3am on Saturday. So the data on them is probably good.
And I think so.
> It looks like someone (you?) tried to create a new array on sda1 and sdb1
> thus destroying the old metadata (but probably not the data). I'm surprised
> that mdadm would have let you create a RAID10 with just 2 devices... Is
> that what happened? or something else?
Well, it's me of course ;-) I've tried to run the array. It of course
didn't allow me to create a RAID10 on two disks only, so I used mdadm
--create .... with 'missing missing' parameters. But it didn't help.
> Anyway it looks as though if you run the command:
>
> mdadm --create /dev/md4 -l10 -n4 -e 0.90 /dev/sd{a,b,e,f}1 --assume-clean
Personalities : [raid1] [raid10]
md4 : active (auto-read-only) raid10 sdf1[3] sde1[2] sdb1[1] sda1[0]
1953519872 blocks 64K chunks 2 near-copies [4/4] [UUUU]
md3 : active raid1 sdc4[0] sdd4[1]
472752704 blocks [2/2] [UU]
md2 : active (auto-read-only) raid1 sdc3[0] sdd3[1]
979840 blocks [2/2] [UU]
md0 : active raid1 sdd1[0] sdc1[1]
9767424 blocks [2/2] [UU]
md1 : active raid1 sdd2[0] sdc2[1]
4883648 blocks [2/2] [UU]
Hurray, hurray, hurray! ;-) Well, I wonder why it didn't work for me ;-(
> there is a reasonable chance that /dev/md4 would have all your data.
> You should then
> fsck -fn /dev/md4
fsck issued some errors
.....
Illegal block #-1 (3126319976) in inode 14794786. IGNORED.
Error while iterating over blocks in inode 14794786: Illegal indirect
block found
e2fsck: aborted
md4 is read-only now.
> to check that it is all OK. If it is you can
> echo check > /sys/block/md4/md/sync_action
> to check if the mirrors are consistent. When it finished
> cat /sys/block/md4/md/mismatch_cnt
> will show '0' if all is consistent.
>
> If it is not zero but a small number, you can feel safe doing
> echo repair > /sys/block/md4/md/sync_action
> to fix it up.
> If it is a big number.... that would be troubling.
A bit of magic as I can see. Would it not be reasonable to put those
commands in mdadm?
>> And does the layout (near, far, etc.) influence this rule that adjacent
>> disks must be healthy?
>
> I didn't say adjacent disks must be healthy. I said you cannot have
> adjacent disks both failing. This is not affected by near/far.
> It is a bit more subtle than that though. It is OK for 2nd and 3rd to both
> fail. But not 1st and 2nd or 3rd and 4th.
I see. Just like ordinary RAID1+0. The first and second pairs of disks
are RAID1; when both disks in a pair fail, that mirror is dead.
I wonder what happens when I create RAID10 on 6 disks. So we would have:
sda1+sdb1 = RAID1
sdc1+sdd1 = RAID1
sde1+sdf1 = RAID1
Those three RAID1 sets are striped together as RAID0?
And assuming each disk is 1TB, I have 3TB of logical space?
In such a situation it still holds that the adjacent disks of each RAID1
pair must not both fail.
And I still wonder why it happened? Hardware issue (motherboard)? Or
kernel bug (2.6.26 - debian/lenny)?
Thank you very much for the help.
Regards
Piotr
Re: RAID10 failed with two disks
On 23.08.2011 01:56:34 by NeilBrown
On Mon, 22 Aug 2011 14:52:50 +0200 Piotr Legiecki wrote:
> NeilBrown pisze:
> > It looks like sde1 and sdf1 are unchanged since the "failure" which happened
> > shortly after 3am on Saturday. So the data on them is probably good.
>
> And I think so.
>
> > It looks like someone (you?) tried to create a new array on sda1 and sdb1
> > thus destroying the old metadata (but probably not the data). I'm surprised
> > that mdadm would have let you create a RAID10 with just 2 devices... Is
> > that what happened? or something else?
>
> Well, it's me of course ;-) I've tried to run the array. It of course
> didn't allow me to create a RAID10 on two disks only, so I used mdadm
> --create .... with 'missing missing' parameters. But it didn't help.
>
>
> > Anyway it looks as though if you run the command:
> >
> > mdadm --create /dev/md4 -l10 -n4 -e 0.90 /dev/sd{a,b,e,f}1 --assume-clean
>
> Personalities : [raid1] [raid10]
> md4 : active (auto-read-only) raid10 sdf1[3] sde1[2] sdb1[1] sda1[0]
> 1953519872 blocks 64K chunks 2 near-copies [4/4] [UUUU]
>
> md3 : active raid1 sdc4[0] sdd4[1]
> 472752704 blocks [2/2] [UU]
>
> md2 : active (auto-read-only) raid1 sdc3[0] sdd3[1]
> 979840 blocks [2/2] [UU]
>
> md0 : active raid1 sdd1[0] sdc1[1]
> 9767424 blocks [2/2] [UU]
>
> md1 : active raid1 sdd2[0] sdc2[1]
> 4883648 blocks [2/2] [UU]
>
> Hurray, hurray, hurray! ;-) Well, I wonder why it didn't work for me ;-(
Looks good so far, but is your data safe?
>
>
> > there is a reasonable chance that /dev/md4 would have all your data.
> > You should then
> > fsck -fn /dev/md4
>
> fsck issued some errors
> ....
> Illegal block #-1 (3126319976) in inode 14794786. IGNORED.
> Error while iterating over blocks in inode 14794786: Illegal indirect
> block found
> e2fsck: aborted
Mostly safe it seems .... assuming there weren't really serious things that
you hid behind the "...".
An "fsck -f /dev/md4" would probably fix it up.
>
> md4 is read-only now.
>
> > to check that it is all OK. If it is you can
> > echo check > /sys/block/md4/md/sync_action
> > to check if the mirrors are consistent. When it finished
> > cat /sys/block/md4/md/mismatch_cnt
> > will show '0' if all is consistent.
> >
> > If it is not zero but a small number, you can feel safe doing
> > echo repair > /sys/block/md4/md/sync_action
> > to fix it up.
> > If it is a big number.... that would be troubling.
>
> A bit of magic as I can see. Would it not be reasonable to put those
> commands in mdadm?
Maybe one day. So much to do, so little time!
>
> >> And does the layout (near, far, etc.) influence this rule that adjacent
> >> disks must be healthy?
> >
> > I didn't say adjacent disks must be healthy. I said you cannot have
> > adjacent disks both failing. This is not affected by near/far.
> > It is a bit more subtle than that though. It is OK for 2nd and 3rd to both
> > fail. But not 1st and 2nd or 3rd and 4th.
>
> I see. Just like ordinary RAID1+0. The first and second pairs of disks
> are RAID1; when both disks in a pair fail, that mirror is dead.
Like that - yes.
>
> I wonder what happens when I create RAID10 on 6 disks. So we would have:
> sda1+sdb1 = RAID1
> sdc1+sdd1 = RAID1
> sde1+sdf1 = RAID1
> Those three RAID1 sets are striped together as RAID0?
> And assuming each disk is 1TB, I have 3TB of logical space?
> In such a situation it still holds that the adjacent disks of each RAID1
> pair must not both fail.
This is correct assuming the default layout.
If you asked for "--layout=n3" you would get a 3-way mirror over a1,b1,c1 and
d1,e1,f1 and those would be raid0-ed.
If you had 5 devices then you would get data copied on
sda1+sdb1
sdc1+sdd1
sde1+sda1
sdb1+sdc1
sdd1+sde1
so if *any* pair of adjacent devices fails, you lose data.
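For the 6-disk case above, the create commands would look roughly like this
(a sketch only; the md and device names are placeholders, not the disks on
this system):
  mdadm --create /dev/md5 -l10 -n6 /dev/sd[b-g]1               # default near=2: three mirror pairs, ~3TB usable from 6x1TB
  mdadm --create /dev/md5 -l10 -n6 --layout=n3 /dev/sd[b-g]1   # near=3: two 3-way mirror sets, ~2TB usable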
>
>
> And I still wonder why it happened? Hardware issue (motherboard)? Or
> kernel bug (2.6.26 - debian/lenny)?
Hard to tell without seeing kernel logs. Almost certainly a hardware issue
of some sort. Maybe a loose or bumped cable. Maybe a power supply spike.
Maybe a stray cosmic ray....
NeilBrown
>
>
> Thank you very much for the help.
>
> Regards
> Piotr
Re: RAID10 failed with two disks
On 23.08.2011 10:35:20 by Piotr Legiecki
>> Hurray, hurray, hurray! ;-) Well, I wonder why it didn't work for me ;-(
>
> Looks good so far, but is your data safe?
I think so.
fsck has found some errors and corrected them.
resync done
cat /sys/block/md4/md/mismatch_cnt
0
Looks good.
> This is correct assuming the default layout.
> If you asked for "--layout=n3" you would get a 3-way mirror over a1,b1,c1 and
> d1,e1,f1 and those would be raid0-ed.
And the question of which one is the most efficient is beyond the scope of
our subject, I'm afraid? Or maybe there is some general rule of thumb
for this?
The more disks, the faster the array should be, *but* also the more data to
mirror at once when writing...
Anyway, my tests proved that RAID1 on two disks is *much* slower than
RAID10 on 4 disks. RAID10 on SATA can easily compete with HP Smart Array
P410i/BBC SAS RAIDs (but in RAID1 only ;-)). Well, at least during
iozone benchmarks.
> If you had 5 devices then you would get data copied on
> sda1+sdb1
> sdc1+sdd1
> sde1+sda1
> sdb1+sdc1
> sdd1+sde1
>
> so if *any* pair of adjacent devices fails, you lose data.
So from a safety point of view there is a need for more spare disks, or to go
for RAID6.
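Roughly, the two options would look like this (a sketch; the device names
and the extra spare disk are hypothetical):
  mdadm --create /dev/md5 -l10 -n6 -x1 /dev/sd[b-h]1   # RAID10 over 6 disks plus one hot spare (7 disks listed)
  mdadm --create /dev/md5 -l6 -n6 /dev/sd[b-g]1        # RAID6: any two disks may fail without losing data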
> Hard to tell without seeing kernel logs. Almost certainly a hardware issue
> of some sort. Maybe a loose or bumped cable. Maybe a power supply spike.
> Maybe a stray cosmic ray....
http://pastebin.com/iapZWm0S
Those 'failed' disks are connected to the motherboard SATA ports. I've also
got an Adaptec 1430 adapter with 2 free ports; maybe I should move those
disks there.
Thank you for all the help and time put into answering my questions.
Piotr