20 drive raid-10, CentOS 5.5, after reboot assemble fails - all drives "non-fresh"

20 drive raid-10, CentOS 5.5, after reboot assemble fails - all drives "non-fresh"

on 08.08.2011 04:37:04 by Jeff Johnson

Greetings,

I have a 20 drive raid-10 that has been running well for over one year.
After the most recent system boot the raid will not assemble.
/var/log/messages shows that all of the drives are "non-fresh".
Examining the drives shows that the raid partitions are present, the
superblocks have valid data, and the Event counters for the data drives
are all equal. The spare drives have a different Event count.

I am reluctant to use the --force switch with assemble until I
understand the problem better. There is very important data on this
volume and, to my knowledge, it is not backed up. I do not know how the
machine was brought down prior to this boot.
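
For reference, the forced assemble I am holding off on would presumably
look something like this (member list taken from the config below, and
possibly needing adjustment):

mdadm --assemble --force /dev/md3 /dev/sd[c-z]1

but I would rather understand why the Event counters differ before
running it.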

With all drives being "non-fresh" I can't start a partial array and then
re-add the remaining drives. I've unraveled some pretty messed up md
configs and recovered the underlying filesystem but this one has me at a
loss.

Any advice is greatly appreciated!

--Jeff

Below is the config file and output from mdadm examine commands:

/* Config file */
ARRAY /dev/md3 level=raid10 num-devices=20
UUID=e17a29e8:ec6bce5c:f13d343c:cfba4dc4
spares=4
devices=/dev/sdz1,/dev/sdy1,/dev/sdx1,/dev/sdw1,/dev/sdv1,/dev/sdu1,/dev/sdt1,/dev/sds1,/dev/sdr1,/dev/sdq1,/dev/sdp1,/dev/sdo1,/dev/sdn1,/dev/sdm1,/dev/sdl1,/dev/sdk1,/dev/sdj1,/dev/sdi1,/dev/sdh1,/dev/sdg1,/dev/sdf1,/dev/sde1,/dev/sdd1,/dev/sdc1

/* mdadm -E /dev/sd[cdefghijklmnopqrstuvwxyz]1 | grep Event */
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 90
Events : 92
Events : 92
Events : 92
Events : 92

/* mdadm -E /dev/sdc1 */
/dev/sdc1:
Magic : a92b4efc
Version : 0.90.00
UUID : e17a29e8:ec6bce5c:f13d343c:cfba4dc4
Creation Time : Fri Sep 24 12:06:37 2010
Raid Level : raid10
Used Dev Size : 99924096 (95.30 GiB 102.32 GB)
Array Size : 999240960 (952.95 GiB 1023.22 GB)
Raid Devices : 20
Total Devices : 24
Preferred Minor : 3

Update Time : Sat Aug 6 05:54:37 2011
State : clean
Active Devices : 20
Working Devices : 24
Failed Devices : 0
Spare Devices : 4
Checksum : d8d97049 - correct
Events : 90

Layout : near=2
Chunk Size : 128K

Number Major Minor RaidDevice State
this 0 8 33 0 active sync /dev/sdc1

0 0 8 33 0 active sync /dev/sdc1
1 1 8 49 1 active sync /dev/sdd1
2 2 8 65 2 active sync /dev/sde1
3 3 8 81 3 active sync /dev/sdf1
4 4 8 97 4 active sync /dev/sdg1
5 5 8 113 5 active sync /dev/sdh1
6 6 8 129 6 active sync /dev/sdi1
7 7 8 145 7 active sync /dev/sdj1
8 8 8 161 8 active sync /dev/sdk1
9 9 8 177 9 active sync /dev/sdl1
10 10 8 193 10 active sync /dev/sdm1
11 11 8 209 11 active sync /dev/sdn1
12 12 8 225 12 active sync /dev/sdo1
13 13 8 241 13 active sync /dev/sdp1
14 14 65 1 14 active sync /dev/sdq1
15 15 65 17 15 active sync /dev/sdr1
16 16 65 33 16 active sync /dev/sds1
17 17 65 49 17 active sync /dev/sdt1
18 18 65 65 18 active sync /dev/sdu1
19 19 65 81 19 active sync /dev/sdv1
20 20 65 145 20 spare /dev/sdz1
21 21 65 129 21 spare /dev/sdy1
22 22 65 113 22 spare /dev/sdx1
23 23 65 97 23 spare /dev/sdw1

--
------------------------------
Jeff Johnson
Manager
Aeon Computing

jeff.johnson "at" aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101 f: 858-412-3845

4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117


Re: 20 drive raid-10, CentOS 5.5, after reboot assemble fails - all drives "non-fresh"

on 08.08.2011 04:56:08 by NeilBrown

On Sun, 07 Aug 2011 19:37:04 -0700 Jeff Johnson wrote:

> Greetings,
>
> I have a 20 drive raid-10 that has been running well for over one year.
> After the most recent system boot the raid will not assemble.
> /var/log/messages shows that all of the drives are "non-fresh".
> Examining the drives shows that the raid partitions are present, the
> superblocks have valid data, and the Event counters for the data drives
> are all equal. The spare drives have a different Event count.
>
> I am reluctant to use the --force switch with assemble until I
> understand the problem better. There is very important data on this
> volume and, to my knowledge, it is not backed up. I do not know how the
> machine was brought down prior to this boot.
>
> With all drives being "non-fresh" I can't start a partial array and then
> re-add the remaining drives. I've unraveled some pretty messed up md
> configs and recovered the underlying filesystem but this one has me at a
> loss.
>
> Any advice is greatly appreciated!
>
> --Jeff
>
> Below is the config file and output from mdadm examine commands:
>
> /* Config file */
> ARRAY /dev/md3 level=raid10 num-devices=20
> UUID=e17a29e8:ec6bce5c:f13d343c:cfba4dc4
> spares=4
> devices=/dev/sdz1,/dev/sdy1,/dev/sdx1,/dev/sdw1,/dev/sdv1,/dev/sdu1,/dev/sdt1,/dev/sds1,/dev/sdr1,/dev/sdq1,/dev/sdp1,/dev/sdo1,/dev/sdn1,/dev/sdm1,/dev/sdl1,/dev/sdk1,/dev/sdj1,/dev/sdi1,/dev/sdh1,/dev/sdg1,/dev/sdf1,/dev/sde1,/dev/sdd1,/dev/sdc1

You really don't want that 'devices=' clause in there. Device names can
change...
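
A more robust ARRAY line identifies the array by UUID alone, something
like:

ARRAY /dev/md3 level=raid10 num-devices=20 UUID=e17a29e8:ec6bce5c:f13d343c:cfba4dc4

mdadm then finds the members by scanning for superblocks with that UUID,
wherever the kernel happens to name the disks.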


>
> /* mdadm -E /dev/sd[cdefghijklmnopqrstuvwxyz]1 | grep Event */
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 90
> Events : 92
> Events : 92
> Events : 92
> Events : 92

So the spares are '92' and the others are '90'. That is weird...

However you should be able to assemble the array by simply listing all the
non-spare devices:

mdadm -A /dev/md3 /dev/sd[c-v]1

NeilBrown



>
> /* mdadm -E /dev/sdc1 */
> /dev/sdc1:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : e17a29e8:ec6bce5c:f13d343c:cfba4dc4
> Creation Time : Fri Sep 24 12:06:37 2010
> Raid Level : raid10
> Used Dev Size : 99924096 (95.30 GiB 102.32 GB)
> Array Size : 999240960 (952.95 GiB 1023.22 GB)
> Raid Devices : 20
> Total Devices : 24
> Preferred Minor : 3
>
> Update Time : Sat Aug 6 05:54:37 2011
> State : clean
> Active Devices : 20
> Working Devices : 24
> Failed Devices : 0
> Spare Devices : 4
> Checksum : d8d97049 - correct
> Events : 90
>
> Layout : near=2
> Chunk Size : 128K
>
> Number Major Minor RaidDevice State
> this 0 8 33 0 active sync /dev/sdc1
>
> 0 0 8 33 0 active sync /dev/sdc1
> 1 1 8 49 1 active sync /dev/sdd1
> 2 2 8 65 2 active sync /dev/sde1
> 3 3 8 81 3 active sync /dev/sdf1
> 4 4 8 97 4 active sync /dev/sdg1
> 5 5 8 113 5 active sync /dev/sdh1
> 6 6 8 129 6 active sync /dev/sdi1
> 7 7 8 145 7 active sync /dev/sdj1
> 8 8 8 161 8 active sync /dev/sdk1
> 9 9 8 177 9 active sync /dev/sdl1
> 10 10 8 193 10 active sync /dev/sdm1
> 11 11 8 209 11 active sync /dev/sdn1
> 12 12 8 225 12 active sync /dev/sdo1
> 13 13 8 241 13 active sync /dev/sdp1
> 14 14 65 1 14 active sync /dev/sdq1
> 15 15 65 17 15 active sync /dev/sdr1
> 16 16 65 33 16 active sync /dev/sds1
> 17 17 65 49 17 active sync /dev/sdt1
> 18 18 65 65 18 active sync /dev/sdu1
> 19 19 65 81 19 active sync /dev/sdv1
> 20 20 65 145 20 spare /dev/sdz1
> 21 21 65 129 21 spare /dev/sdy1
> 22 22 65 113 22 spare /dev/sdx1
> 23 23 65 97 23 spare /dev/sdw1
>


Re: 20 drive raid-10, CentOS 5.5, after reboot assemble fails - all drives "non-fresh"

on 08.08.2011 06:32:50 by Jeff Johnson

I am now able (thanks to Neil's suggestion) to manually assemble the
/dev/md3 raid10 volume using:

mdadm -A /dev/md3 /dev/sd[cdefghijklmnopqrstuv]1

and then manually add the spares back with:

mdadm --add /dev/md3 /dev/sd[wxyz]1

The data is intact, phew! I am still unable to start the raid using a
config file. I gracefully stopped the raid with 'mdadm --stop /dev/md3'
and then tried 'mdadm -A /dev/md3 -c /etc/mdadm.conf.mdt', and it failed
to start.

I recreated the config file with 'mdadm --examine --scan >
/etc/mdadm.conf'. Then I stopped /dev/md3 and tried to assemble it again
with 'mdadm -A /dev/md3', and again it failed to assemble and start.

It is good I can start the raid manually but it isn't supposed to work
like that. Any idea why assembling from a config file would fail? Here
is the latest version of the config file line (made with mdadm --examine
--scan):

ARRAY /dev/md3 level=raid10 num-devices=20 metadata=0.90 spares=4
UUID=e17a29e8:ec6bce5c:f13d343c:cfba4dc4

--Jeff

On Sun, Aug 7, 2011 at 7:56 PM, NeilBrown wrote:

On Sun, 07 Aug 2011 19:37:04 -0700 Jeff Johnson wrote:

> Greetings,
>
> I have a 20 drive raid-10 that has been running well for over one year.
> After the most recent system boot the raid will not assemble.
> /var/log/messages shows that all of the drives are "non-fresh".

--snip--

You really don't want that 'devices=' clause in there. Device names can
change...

--snip--

> Events : 90
> Events : 90
> Events : 92
> Events : 92
> Events : 92
> Events : 92

So the spares are '92' and the others are '90'. That is weird...

However you should be able to assemble the array by simply listing all
the non-spare devices:

mdadm -A /dev/md3 /dev/sd[c-v]1

NeilBrown


--
------------------------------
Jeff Johnson
Manager
Aeon Computing

jeff.johnson "at" aeoncomputing.com
www.aeoncomputing.com
t:858-412-3810 x101 f:858-412-3845



4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117


Re: 20 drive raid-10, CentOS 5.5, after reboot assemble fails - all drives "non-fresh"

on 08.08.2011 06:40:43 by Joe Landman

On 08/08/2011 12:32 AM, Jeff Johnson wrote:

> It is good I can start the raid manually but it isn't supposed to work
> like that. Any idea why assembling from a config file would fail? Here
> is the latest version of the config file line (made with mdadm --examine
> --scan):

Jeff,

You might need to update the raid superblocks during the manual assemble.

mdadm --assemble --update=summaries /dev/md3 /dev/sd[c-v]1

Also, you can simplify the below a bit to the following:

>
> ARRAY /dev/md3 level=raid10 num-devices=20 metadata=0.90 spares=4
> UUID=e17a29e8:ec6bce5c:f13d343c:cfba4dc4

ARRAY /dev/md3 UUID=e17a29e8:ec6bce5c:f13d343c:cfba4dc4

>
> --Jeff
>
> On Sun, Aug 7, 2011 at 7:56 PM, NeilBrown wrote:
>
> On Sun, 07 Aug 2011 19:37:04 -0700 Jeff Johnson wrote:
>
> > Greetings,
> >
> > I have a 20 drive raid-10 that has been running well for over one year.
> > After the most recent system boot the raid will not assemble.
> > /var/log/messages shows that all of the drives are "non-fresh".
>
> --snip--
>
> You really don't want that 'devices=' clause in there. Device names can
> change...
> --snip--
>
> > Events : 90
> > Events : 90
> > Events : 92
> > Events : 92
> > Events : 92
> > Events : 92
>
> So the spares are '92' and the others are '90'. That is weird...
>
> However you should be able to assemble the array by simply listing all
> the non-spare devices:
>
> mdadm -A /dev/md3 /dev/sd[c-v]1
>
> NeilBrown
>
>
> --
> ------------------------------
> Jeff Johnson
> Manager
> Aeon Computing
>
> jeff.johnson "at" aeoncomputing.com
> www.aeoncomputing.com
> t:858-412-3810 x101 f:858-412-3845
>
>
>
> 4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117
>


Re: 20 drive raid-10, CentOS 5.5, after reboot assemble fails - all drives "non-fresh"

on 08.08.2011 06:54:31 by Jeff Johnson

Joe,

The raid still won't assemble via config file:

mdadm --assemble --update=summaries /dev/md3 /dev/sd[c-v]1
mdadm --add /dev/md3 /dev/sd[wxyz]1 (spares)
mdadm --examine --scan | grep md3 > /etc/mdadm.conf.mdt_new
mdadm --stop /dev/md3
mdadm -vv --assemble /dev/md3 -c /etc/mdadm.conf.mdt_new

Output:
mdadm: looking for devices for /dev/md3
mdadm: no recogniseable superblock on /dev/sdz2
mdadm: /dev/sdz2 has wrong uuid.
mdadm: /dev/sdz has wrong uuid.
mdadm: no RAID superblock on /dev/sdy2
mdadm: /dev/sdy2 has wrong uuid.
mdadm: /dev/sdy has wrong uuid.
mdadm: no RAID superblock on /dev/sdx2
mdadm: /dev/sdx2 has wrong uuid.
mdadm: /dev/sdx has wrong uuid.
mdadm: no RAID superblock on /dev/sdw2
mdadm: /dev/sdw2 has wrong uuid.
mdadm: /dev/sdw has wrong uuid.
mdadm: no RAID superblock on /dev/sdv2
mdadm: /dev/sdv2 has wrong uuid.
mdadm: /dev/sdv has wrong uuid.
mdadm: no RAID superblock on /dev/sdu2
mdadm: /dev/sdu2 has wrong uuid.
mdadm: /dev/sdu has wrong uuid.
mdadm: no RAID superblock on /dev/sdt2
mdadm: /dev/sdt2 has wrong uuid.
mdadm: /dev/sdt has wrong uuid.
mdadm: no RAID superblock on /dev/sds2
mdadm: /dev/sds2 has wrong uuid.
mdadm: /dev/sds has wrong uuid.
mdadm: no RAID superblock on /dev/sdr2
mdadm: /dev/sdr2 has wrong uuid.
mdadm: /dev/sdr has wrong uuid.
mdadm: no RAID superblock on /dev/sdq2
mdadm: /dev/sdq2 has wrong uuid.
mdadm: /dev/sdq has wrong uuid.
mdadm: no RAID superblock on /dev/sdp2
mdadm: /dev/sdp2 has wrong uuid.
mdadm: /dev/sdp has wrong uuid.
mdadm: no RAID superblock on /dev/sdo2
mdadm: /dev/sdo2 has wrong uuid.
mdadm: /dev/sdo has wrong uuid.
mdadm: no RAID superblock on /dev/sdn2
mdadm: /dev/sdn2 has wrong uuid.
mdadm: /dev/sdn has wrong uuid.
mdadm: no RAID superblock on /dev/sdm2
mdadm: /dev/sdm2 has wrong uuid.
mdadm: /dev/sdm has wrong uuid.
mdadm: no RAID superblock on /dev/sdl2
mdadm: /dev/sdl2 has wrong uuid.
mdadm: /dev/sdl has wrong uuid.
mdadm: no RAID superblock on /dev/sdk2
mdadm: /dev/sdk2 has wrong uuid.
mdadm: /dev/sdk has wrong uuid.
mdadm: no RAID superblock on /dev/sdj2
mdadm: /dev/sdj2 has wrong uuid.
mdadm: /dev/sdj has wrong uuid.
mdadm: no RAID superblock on /dev/sdi2
mdadm: /dev/sdi2 has wrong uuid.
mdadm: /dev/sdi has wrong uuid.
mdadm: no RAID superblock on /dev/sdh2
mdadm: /dev/sdh2 has wrong uuid.
mdadm: /dev/sdh has wrong uuid.
mdadm: no RAID superblock on /dev/sdg2
mdadm: /dev/sdg2 has wrong uuid.
mdadm: /dev/sdg has wrong uuid.
mdadm: no RAID superblock on /dev/sdf2
mdadm: /dev/sdf2 has wrong uuid.
mdadm: /dev/sdf has wrong uuid.
mdadm: no RAID superblock on /dev/sde2
mdadm: /dev/sde2 has wrong uuid.
mdadm: /dev/sde has wrong uuid.
mdadm: no RAID superblock on /dev/sdd2
mdadm: /dev/sdd2 has wrong uuid.
mdadm: /dev/sdd has wrong uuid.
mdadm: no RAID superblock on /dev/sdc2
mdadm: /dev/sdc2 has wrong uuid.
mdadm: /dev/sdc has wrong uuid.
mdadm: cannot open device
/dev/disk/by-uuid/55389e74-b43e-4a6b-97c5-573fcd91a4b7: Device or
resource busy
mdadm: /dev/disk/by-uuid/55389e74-b43e-4a6b-97c5-573fcd91a4b7 has wrong
uuid.
mdadm: cannot open device
/dev/disk/by-uuid/ab90577f-7c58-4e91-95e4-25025cf01790: Device or
resource busy
mdadm: /dev/disk/by-uuid/ab90577f-7c58-4e91-95e4-25025cf01790 has wrong
uuid.
mdadm: cannot open device /dev/root: Device or resource busy
mdadm: /dev/root has wrong uuid.
mdadm: cannot open device /dev/sdb3: Device or resource busy
mdadm: /dev/sdb3 has wrong uuid.
mdadm: cannot open device /dev/sdb2: Device or resource busy
mdadm: /dev/sdb2 has wrong uuid.
mdadm: cannot open device /dev/sdb1: Device or resource busy
mdadm: /dev/sdb1 has wrong uuid.
mdadm: cannot open device /dev/sdb: Device or resource busy
mdadm: /dev/sdb has wrong uuid.
mdadm: cannot open device /dev/sda3: Device or resource busy
mdadm: /dev/sda3 has wrong uuid.
mdadm: cannot open device /dev/sda2: Device or resource busy
mdadm: /dev/sda2 has wrong uuid.
mdadm: cannot open device /dev/sda1: Device or resource busy
mdadm: /dev/sda1 has wrong uuid.
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/sda has wrong uuid.
mdadm: /dev/sdz1 is identified as a member of /dev/md3, slot 20.
mdadm: /dev/sdy1 is identified as a member of /dev/md3, slot 21.
mdadm: /dev/sdx1 is identified as a member of /dev/md3, slot 22.
mdadm: /dev/sdw1 is identified as a member of /dev/md3, slot 23.
mdadm: /dev/sdv1 is identified as a member of /dev/md3, slot 19.
mdadm: /dev/sdu1 is identified as a member of /dev/md3, slot 18.
mdadm: /dev/sdt1 is identified as a member of /dev/md3, slot 17.
mdadm: /dev/sds1 is identified as a member of /dev/md3, slot 16.
mdadm: /dev/sdr1 is identified as a member of /dev/md3, slot 15.
mdadm: /dev/sdq1 is identified as a member of /dev/md3, slot 14.
mdadm: /dev/sdp1 is identified as a member of /dev/md3, slot 13.
mdadm: /dev/sdo1 is identified as a member of /dev/md3, slot 12.
mdadm: /dev/sdn1 is identified as a member of /dev/md3, slot 11.
mdadm: /dev/sdm1 is identified as a member of /dev/md3, slot 10.
mdadm: /dev/sdl1 is identified as a member of /dev/md3, slot 9.
mdadm: /dev/sdk1 is identified as a member of /dev/md3, slot 8.
mdadm: /dev/sdj1 is identified as a member of /dev/md3, slot 7.
mdadm: /dev/sdi1 is identified as a member of /dev/md3, slot 6.
mdadm: /dev/sdh1 is identified as a member of /dev/md3, slot 5.
mdadm: /dev/sdg1 is identified as a member of /dev/md3, slot 4.
mdadm: /dev/sdf1 is identified as a member of /dev/md3, slot 3.
mdadm: /dev/sde1 is identified as a member of /dev/md3, slot 2.
mdadm: /dev/sdd1 is identified as a member of /dev/md3, slot 1.
mdadm: /dev/sdc1 is identified as a member of /dev/md3, slot 0.
mdadm: No suitable drives found for /dev/md3


Maybe '--update=uuid' ??

--Jeff


On 8/7/11 9:40 PM, Joe Landman wrote:
> On 08/08/2011 12:32 AM, Jeff Johnson wrote:
>
>> It is good I can start the raid manually but it isn't supposed to work
>> like that. Any idea why assembling from a config file would fail? Here
>> is the latest version of the config file line (made with mdadm --examine
>> --scan):
>
> Jeff,
>
> You might need to update the raid superblocks during the manual
> assemble.
>
> mdadm --assemble --update=summaries /dev/md3 /dev/sd[c-v]1
>
> Also, you can simplify the below a bit to the following:
>
>>
>> ARRAY /dev/md3 level=raid10 num-devices=20 metadata=0.90 spares=4
>> UUID=e17a29e8:ec6bce5c:f13d343c:cfba4dc4
>
> ARRAY /dev/md3 UUID=e17a29e8:ec6bce5c:f13d343c:cfba4dc4
>

--
------------------------------
Jeff Johnson
Manager
Aeon Computing

jeff.johnson "at" aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101 f: 858-412-3845


4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117


Re: 20 drive raid-10, CentOS 5.5, after reboot assemble fails - all drives "non-fresh"

on 08.08.2011 07:04:13 by Joe Landman

On 08/08/2011 12:54 AM, Jeff Johnson wrote:
> Joe,
>
> The raid still won't assemble via config file:

[...]

> mdadm: /dev/sdz1 is identified as a member of /dev/md3, slot 20.
> mdadm: /dev/sdy1 is identified as a member of /dev/md3, slot 21.
> mdadm: /dev/sdx1 is identified as a member of /dev/md3, slot 22.
> mdadm: /dev/sdw1 is identified as a member of /dev/md3, slot 23.
> mdadm: /dev/sdv1 is identified as a member of /dev/md3, slot 19.
> mdadm: /dev/sdu1 is identified as a member of /dev/md3, slot 18.
> mdadm: /dev/sdt1 is identified as a member of /dev/md3, slot 17.
> mdadm: /dev/sds1 is identified as a member of /dev/md3, slot 16.
> mdadm: /dev/sdr1 is identified as a member of /dev/md3, slot 15.
> mdadm: /dev/sdq1 is identified as a member of /dev/md3, slot 14.
> mdadm: /dev/sdp1 is identified as a member of /dev/md3, slot 13.
> mdadm: /dev/sdo1 is identified as a member of /dev/md3, slot 12.
> mdadm: /dev/sdn1 is identified as a member of /dev/md3, slot 11.
> mdadm: /dev/sdm1 is identified as a member of /dev/md3, slot 10.
> mdadm: /dev/sdl1 is identified as a member of /dev/md3, slot 9.
> mdadm: /dev/sdk1 is identified as a member of /dev/md3, slot 8.
> mdadm: /dev/sdj1 is identified as a member of /dev/md3, slot 7.
> mdadm: /dev/sdi1 is identified as a member of /dev/md3, slot 6.
> mdadm: /dev/sdh1 is identified as a member of /dev/md3, slot 5.
> mdadm: /dev/sdg1 is identified as a member of /dev/md3, slot 4.
> mdadm: /dev/sdf1 is identified as a member of /dev/md3, slot 3.
> mdadm: /dev/sde1 is identified as a member of /dev/md3, slot 2.
> mdadm: /dev/sdd1 is identified as a member of /dev/md3, slot 1.
> mdadm: /dev/sdc1 is identified as a member of /dev/md3, slot 0.
> mdadm: No suitable drives found for /dev/md3
>
>
> Maybe '--update=uuid' ??

It looks like it correctly finds /dev/sd[c-z]1 as elements of /dev/md3

Which mdadm are you using?

mdadm -V

and which kernel?
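
For the kernel version, something like this should do:

uname -r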

Try the UUID update, and let us know if it helps. Also if your mdadm is
old (2.6.x), try updating to 3.1.x.
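
If you go the uuid route, the invocation would presumably be the same
assemble as before with the update option changed, e.g. (device list as
in your earlier runs):

mdadm --assemble --update=uuid /dev/md3 /dev/sd[c-v]1

followed by re-adding the spares and regenerating the config file as you
did before.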

FWIW: we've found problems in the past with CentOS 5.4 to 5.5 kernels
with MD arrays. Oftentimes our only real solution would be to update
the full OS on the boot drives. This is for distro-specific kernels;
for our own kernels, we don't run into this issue.


--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman@scalableinformatics.com
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615

Re: 20 drive raid-10, CentOS 5.5, after reboot assemble fails - all drives "non-fresh"

on 08.08.2011 07:55:21 by Jeff Johnson

Joe / et-al,

The '--assemble --update=uuid' appears to have done the trick. It is
weird, because the UUID in the config file matched the UUID of the raid
volume shown by 'mdadm -D /dev/md3' and the UUID on each of the drives
shown by 'mdadm -E /dev/sdc1'.

The '--update=summaries' did not work. Assigning a new random uuid
appears to have repaired whatever bit in the superblock was mucked up.

Strange...
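
For the archives, the sequence that ended up working for me was roughly
the following (device lists as in my earlier commands):

mdadm --assemble --update=uuid /dev/md3 /dev/sd[c-v]1
mdadm --add /dev/md3 /dev/sd[wxyz]1
mdadm --examine --scan > /etc/mdadm.conf
mdadm --stop /dev/md3
mdadm -A /dev/md3

After the uuid update, assembling from the config file worked.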

Joe, thanks for your help. Find me at SC11, I'm buying you beers.

--Jeff



On 8/7/11 10:04 PM, Joe Landman wrote:
>> Maybe '--update=uuid' ??
>
>
> It looks like it correctly finds /dev/sd[c-z]1 as elements of /dev/md3
>
> Which mdadm are you using?
>
> mdadm -V
>
> and which kernel?
>
> Try the UUID update, and let us know if it helps. Also if your mdadm
> is old (2.6.x), try updating to 3.1.x.
>
> FWIW: we've found problems in the past with CentOS 5.4 to 5.5 kernels
> with MD arrays. Oftentimes our only real solution would be to update
> the full OS on the boot drives. This is for distro-specific kernels;
> for our own kernels, we don't run into this issue.
>
>


--
------------------------------
Jeff Johnson
Manager
Aeon Computing

jeff.johnson "at" aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101 f: 858-412-3845

4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117


Re: 20 drive raid-10, CentOS 5.5, after reboot assemble fails - all drives "non-fresh"

on 08.08.2011 18:17:48 by Joe Landman

On 08/08/2011 01:55 AM, Jeff Johnson wrote:
> Joe / et-al,
>
> The '--assemble --update=uuid' appears to have done the trick. It is
> weird, because the UUID in the config file matched the UUID of the raid
> volume shown by 'mdadm -D /dev/md3' and the UUID on each of the drives
> shown by 'mdadm -E /dev/sdc1'.

Interesting.

>
> The '--update=summaries' did not work. Assigning a new random uuid
> appears to have repaired whatever bit in the superblock was mucked up.
>
> Strange...
>
> Joe, thanks for your help. Find me at SC11, I'm buying you beers.
>

:) see you there




--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman@scalableinformatics.com
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615