#1: Problems with raid after reboot.

Posted on 2011-07-20 23:24:49 by mtice

Hello,

I had to shut down my machine for a move - when I powered it back up my
raid-5 array was in a bad state:

# mdadm -A -s
mdadm: no devices found for /dev/md0
mdadm: /dev/md/0 assembled from 2 drives - not enough to start the array.

I ended up forcing the assembly:

# mdadm -A -s -f
mdadm: no devices found for /dev/md0
mdadm: forcing event count in /dev/sde(1) from 177 upto 181
mdadm: clearing FAULTY flag for device 1 in /dev/md/0 for /dev/sde
mdadm: /dev/md/0 has been started with 3 drives (out of 4).

Looking at the detailed output, the missing disk (/dev/sdc) is shown as "removed":


# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Sat Mar 12 21:22:34 2011
Raid Level : raid5
Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Tue Jul 19 20:44:45 2011
State : clean, degraded
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 64K

UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
Events : 0.181

Number Major Minor RaidDevice State
0 8 80 0 active sync /dev/sdf
1 8 64 1 active sync /dev/sde
2 8 48 2 active sync /dev/sdd
3 0 0 3 removed


I can examine the disk but I'm unable to add it (I don't recall if it
needs to be removed first or not):

# mdadm --examine /dev/sdc
/dev/sdc:
Magic : a92b4efc
Version : 00.90.00
UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
Creation Time : Sat Mar 12 21:22:34 2011
Raid Level : raid5
Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 0

Update Time : Tue Jul 19 20:44:45 2011
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0
Checksum : 22ac1229 - correct
Events : 181

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 3 8 32 3 active sync /dev/sdc

0 0 8 80 0 active sync /dev/sdf
1 1 0 0 1 faulty removed
2 2 8 48 2 active sync /dev/sdd
3 3 8 32 3 active sync /dev/sdc

# mdadm --add /dev/md0 /dev/sdc
mdadm: Cannot open /dev/sdc: Device or resource busy

So a couple questions.

1. Any thoughts on what would cause this? I seem to have bad luck
with my raid arrays whenever I reboot.
2. How do I fix it? Everything *seems* to be as it should be . . .

Here is the mdadm.conf:

# cat /etc/mdadm/mdadm.conf | grep -v ^#

DEVICE partitions
CREATE owner=root group=disk mode=0660 auto=yes
HOMEHOST <system>
MAILADDR root
ARRAY /dev/md0 level=raid5 num-devices=4 UUID=11c1cdd8:60ec9a90:2e29483d:f114274d
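
One thing that may be worth checking at this point: the UUID on the ARRAY line in mdadm.conf starts with 11c1cdd8:60ec9a90, while --detail and --examine both report daf06d5a:b80528b1 for the array. That mismatch could be why "mdadm -A -s" prints "no devices found for /dev/md0". Below is a minimal Python sketch of the comparison, using only the two strings pasted in this message. The helper name first_uuid is made up for illustration and is not part of mdadm.

```python
import re

# Matches "UUID=..." (mdadm.conf style) and "UUID : ..." (--examine style).
UUID = re.compile(r"UUID\s*[=:]\s*([0-9a-fA-F:]+)")

def first_uuid(text):
    """Return the first UUID found in the text, lower-cased, or None."""
    m = UUID.search(text)
    return m.group(1).lower() if m else None

# Strings copied from the mdadm.conf and --examine output in this message.
conf = "ARRAY /dev/md0 level=raid5 num-devices=4 UUID=11c1cdd8:60ec9a90:2e29483d:f114274d"
examine = "UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)"

conf_uuid = first_uuid(conf)
disk_uuid = first_uuid(examine)
matches = conf_uuid == disk_uuid

# Indexes of the 32-bit words that differ between the two UUIDs.
differing = [i for i, (a, b) in
             enumerate(zip(conf_uuid.split(":"), disk_uuid.split(":")))
             if a != b]

print("config :", conf_uuid)
print("on disk:", disk_uuid)
print("match  :", matches, "- differing words:", differing)
```

If the two values differ on a real system, "mdadm --detail --scan" prints an ARRAY line built from the running array, which can be compared against the one in mdadm.conf.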

Any help is greatly appreciated.

Matt
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

#2: Re: Problems with raid after reboot.

Posted on 2011-07-23 05:12:25 by mtice

One thing I just noticed that seems kind of strange is that when I do
the --examine /dev/sdc it shows the drive in the list and another is
listed as faulty? Am I reading that right?


# mdadm --examine /dev/sdc
/dev/sdc:
Magic : a92b4efc
Version : 00.90.00
UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
Creation Time : Sat Mar 12 21:22:34 2011
Raid Level : raid5
Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 0

Update Time : Tue Jul 19 20:44:45 2011
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0
Checksum : 22ac1229 - correct
Events : 181

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 3 8 32 3 active sync /dev/sdc

0 0 8 80 0 active sync /dev/sdf
1 1 0 0 1 faulty removed
2 2 8 48 2 active sync /dev/sdd
3 3 8 32 3 active sync /dev/sdc

#3: Re: Problems with raid after reboot.

Posted on 2011-07-25 23:04:34 by mtice

Well, things are a lot different now - I'm unable to start the array
successfully. I removed an older, unrelated drive that was giving
me SMART errors - when I rebooted, the drive assignments shifted (not
sure this really matters, though).

Now when I try to start the array I get:

# mdadm -A -f /dev/md0
mdadm: no devices found for /dev/md0

I can nudge it slightly with auto-detect:

# mdadm --auto-detect

Then I try to assemble the array with:

# mdadm -A -f /dev/md0 /dev/sd[bcde]
mdadm: cannot open device /dev/sde: Device or resource busy
mdadm: /dev/sde has no superblock - assembly aborted

I ran Phil's lsdrv script (and an mdadm --examine /dev/sd[bcde]), so
hopefully that helps:

# lsdrv

PCI [ata_piix] 00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
 ├─scsi 0:0:0:0 LITE-ON COMBO SOHC-4836K {2006061700044437}
 │  └─sr0: [11:0] Empty/Unknown 1.00g
 └─scsi 1:x:x:x [Empty]
PCI [ata_piix] 00:1f.2 IDE interface: Intel Corporation N10/ICH7 Family SATA IDE Controller (rev 01)
 ├─scsi 2:x:x:x [Empty]
 └─scsi 3:0:0:0 ATA HDS728080PLA380 {PFDB20S4SNLT6J}
    └─sda: [8:0] Partitioned (dos) 76.69g
       ├─sda1: [8:1] (ext4) 75.23g {960433b3-af56-41bd-bb9a-d0a0fb5ffb45}
       │  └─Mounted as /dev/disk/by-uuid/960433b3-af56-41bd-bb9a-d0a0fb5ffb45 @ /
       ├─sda2: [8:2] Partitioned (dos) 1.00k
       └─sda5: [8:5] (swap) 1.46g {10c3b226-16d4-44ea-ad1e-6296bb92969d}
PCI [sata_sil24] 04:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
 ├─scsi 4:0:0:0 ATA WDC WD7500AADS-0 {WD-WCAV59574584}
 │  └─sdb: [8:16] MD raid5 (4) 698.64g inactive {daf06d5a-b805-28b1-2e29-483df114274d}
 ├─scsi 4:1:0:0 ATA WDC WD7500AADS-0 {WD-WCAV59459025}
 │  └─sdc: [8:32] MD raid5 (4) 698.64g inactive {daf06d5a-b805-28b1-2e29-483df114274d}
 ├─scsi 4:2:0:0 ATA Hitachi HDS72101 {JP9911HZ1SKHNU}
 │  └─sdd: [8:48] MD raid5 (4) 931.51g inactive {daf06d5a-b805-28b1-2e29-483df114274d}
 ├─scsi 4:3:0:0 ATA Hitachi HDS72101 {JP9960HZ1VK96U}
 │  └─sde: [8:64] MD raid5 (none/4) 931.51g md_d0 inactive spare {daf06d5a-b805-28b1-2e29-483df114274d}
 │     └─md_d0: [254:0] Empty/Unknown 0.00k
 └─scsi 7:x:x:x [Empty]
PCI [pata_via] 02:00.0 IDE interface: VIA Technologies, Inc. PATA IDE Host Controller
 ├─scsi 5:x:x:x [Empty]
 └─scsi 6:x:x:x [Empty]
PCI [sata_sil24] 05:01.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
 ├─scsi 8:x:x:x [Empty]
 ├─scsi 9:x:x:x [Empty]
 ├─scsi 10:x:x:x [Empty]
 └─scsi 11:x:x:x [Empty]
Other Block Devices
 ├─ram0: [1:0] Empty/Unknown 64.00m
 ├─ram1: [1:1] Empty/Unknown 64.00m
 ├─ram2: [1:2] Empty/Unknown 64.00m
 ├─ram3: [1:3] Empty/Unknown 64.00m
 ├─ram4: [1:4] Empty/Unknown 64.00m
 ├─ram5: [1:5] Empty/Unknown 64.00m
 ├─ram6: [1:6] Empty/Unknown 64.00m
 ├─ram7: [1:7] Empty/Unknown 64.00m
 ├─ram8: [1:8] Empty/Unknown 64.00m
 ├─ram9: [1:9] Empty/Unknown 64.00m
 ├─ram10: [1:10] Empty/Unknown 64.00m
 ├─ram11: [1:11] Empty/Unknown 64.00m
 ├─ram12: [1:12] Empty/Unknown 64.00m
 ├─ram13: [1:13] Empty/Unknown 64.00m
 ├─ram14: [1:14] Empty/Unknown 64.00m
 └─ram15: [1:15] Empty/Unknown 64.00m


# mdadm --examine /dev/sd[bcde]

/dev/sdb:
Magic : a92b4efc
Version : 00.90.00
UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
Creation Time : Sat Mar 12 21:22:34 2011
Raid Level : raid5
Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 0

Update Time : Mon Jul 25 14:08:30 2011
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Checksum : 22b3c880 - correct
Events : 5593

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 3 8 64 3 active sync /dev/sde

0 0 0 0 0 removed
1 1 8 32 1 active sync /dev/sdc
2 2 8 16 2 active sync /dev/sdb
3 3 8 64 3 active sync /dev/sde
/dev/sdc:
Magic : a92b4efc
Version : 00.90.00
UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
Creation Time : Sat Mar 12 21:22:34 2011
Raid Level : raid5
Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 0

Update Time : Mon Jul 25 14:08:30 2011
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Checksum : 22b3c84e - correct
Events : 5593

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 2 8 16 2 active sync /dev/sdb

0 0 0 0 0 removed
1 1 8 32 1 active sync /dev/sdc
2 2 8 16 2 active sync /dev/sdb
3 3 8 64 3 active sync /dev/sde
/dev/sdd:
Magic : a92b4efc
Version : 00.90.00
UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
Creation Time : Sat Mar 12 21:22:34 2011
Raid Level : raid5
Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 0

Update Time : Mon Jul 25 14:08:30 2011
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Checksum : 22b3c85c - correct
Events : 5593

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 1 8 32 1 active sync /dev/sdc

0 0 0 0 0 removed
1 1 8 32 1 active sync /dev/sdc
2 2 8 16 2 active sync /dev/sdb
3 3 8 64 3 active sync /dev/sde
/dev/sde:
Magic : a92b4efc
Version : 00.90.00
UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
Creation Time : Sat Mar 12 21:22:34 2011
Raid Level : raid5
Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 0

Update Time : Sun Jul 24 11:46:10 2011
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0
Checksum : 22b255fc - correct
Events : 5591

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 0 8 80 0 active sync

0 0 8 80 0 active sync
1 1 8 64 1 active sync /dev/sde
2 2 8 48 2 active sync /dev/sdd
3 3 0 0 3 faulty removed
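
In the --examine output above, /dev/sdb, /dev/sdc and /dev/sdd all report Events 5593 with an update time of Mon Jul 25, while /dev/sde reports Events 5591 with an update time of Sun Jul 24. md uses the event counter to decide which members are current, so /dev/sde looks like the stale one. Below is a rough Python sketch of that comparison. The function names are hypothetical, and the sample text is cut down to the device headers and Events lines from the output above.

```python
import re

def event_counts(examine_text):
    """Map each '/dev/...:' section of `mdadm --examine` output to its Events value."""
    counts = {}
    device = None
    for line in examine_text.splitlines():
        line = line.strip()
        header = re.fullmatch(r"(/dev/\S+):", line)
        if header:
            device = header.group(1)
            continue
        # Accept both "Events : 5593" and the "0.5593" form printed by --detail.
        m = re.fullmatch(r"Events\s*:\s*(?:\d+\.)?(\d+)", line)
        if m and device:
            counts[device] = int(m.group(1))
    return counts

def stale_devices(counts):
    """Return {device: events_behind} for members behind the newest event count."""
    if not counts:
        return {}
    newest = max(counts.values())
    return {dev: newest - n for dev, n in counts.items() if n < newest}

# Device headers and Events lines taken from the --examine output above.
sample = """\
/dev/sdb:
Events : 5593
/dev/sdc:
Events : 5593
/dev/sdd:
Events : 5593
/dev/sde:
Events : 5591
"""

counts = event_counts(sample)
print(stale_devices(counts))
```

A member that is only a couple of events behind is the case "mdadm -A -f" is meant for. A larger gap would suggest re-adding the disk and letting it resync.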

I've looked but I'm unable to find where the drive is in use.

#4: Re: Problems with raid after reboot.

Posted on 2011-07-25 23:21:07 by Robin Hill

On Mon Jul 25, 2011 at 03:04:34PM -0600, Matthew Tice wrote:

> Well things are a lot different now - I'm unable to start the array
> successfully. I removed an older non-relevant drive that was giving
> me smart errors - when I rebooted the drive assignments shifted (not
> sure this really matters, though).
>
> Now when I try to start the array I get:
>
> # mdadm -A -f /dev/md0
> mdadm: no devices found for /dev/md0
>
> I can nudge it slightly with auto-detect:
>
> # mdadm --auto-detect
>
> Then I try to assemble the array with:
>
> # mdadm -A -f /dev/md0 /dev/sd[bcde]
> mdadm: cannot open device /dev/sde: Device or resource busy
> mdadm: /dev/sde has no superblock - assembly aborted
>
<- SNIP ->
>  │  └─sde: [8:64] MD raid5 (none/4) 931.51g md_d0 inactive spare
<- SNIP ->
>
> I've looked but I'm unable to find where the drive is in use.

lsdrv shows that it's in use in array md_d0 - presumably this is a
part-assembled array (possibly auto-assembled by the kernel). Try
stopping that first, then doing the "mdadm -A -f /dev/md0 /dev/sd[bcde]"
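
The same check can be made without lsdrv by looking at /proc/mdstat for an array that already lists the busy disk as a member. Below is a minimal Python sketch, assuming the usual "name : state members" layout of /proc/mdstat. The sample text is hand-written for illustration and is not output from this machine, and arrays_holding is a made-up helper name. The script only prints the commands suggested above; it does not run them.

```python
import re

def arrays_holding(mdstat_text, device):
    """Names of md arrays in /proc/mdstat-style text that list `device` as a member."""
    name = device.rsplit("/", 1)[-1]  # "/dev/sde" -> "sde"
    holders = []
    for line in mdstat_text.splitlines():
        m = re.match(r"^(\S+) : (?:active|inactive)\b(.*)$", line)
        if not m:
            continue
        # Members look like "sde[0]" or "sde[0](S)".
        members = re.findall(r"([\w-]+)\[\d+\]", m.group(2))
        if name in members:
            holders.append(m.group(1))
    return holders

# Hand-written example of a part-assembled array (illustrative only).
sample = """\
Personalities : [raid6] [raid5] [raid4]
md_d0 : inactive sde[0](S)

unused devices: <none>
"""

for md in arrays_holding(sample, "/dev/sde"):
    print(f"mdadm --stop /dev/{md}")
print("mdadm -A -f /dev/md0 /dev/sd[bcde]")
```

On a real system the text would come from reading /proc/mdstat rather than from the sample string.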

Cheers,
Robin

#5: Re: Problems with raid after reboot.

Posted on 2011-07-25 23:30:32 by mtice

On Mon, Jul 25, 2011 at 3:21 PM, Robin Hill <robin@robinhill.me.uk> wrote:
> lsdrv shows that it's in use in array md_d0 - presumably this is a
> part-assembled array (possibly auto-assembled by the kernel). Try
> stopping that first, then doing the "mdadm -A -f /dev/md0 /dev/sd[bcde]"

Nice catch, thanks, Robin.

I stopped /dev/md_d0, then started the array on /dev/md0:

# mdadm -A -f /dev/md0 /dev/sd[bcde]
mdadm: /dev/md0 has been started with 3 drives (out of 4).

It's only seeing the three drives. I did an fsck on it just in case
but it failed:

# fsck -n /dev/md0
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
Superblock has an invalid journal (inode 8).
Clear? no

fsck.ext4: Illegal inode number while checking ext3 journal for /dev/md0

Looks like /dev/sde is missing (as also noted above):

# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Sat Mar 12 21:22:34 2011
Raid Level : raid5
Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Mon Jul 25 14:08:30 2011
State : clean, degraded
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 64K

UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
Events : 0.5593

Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 48 1 active sync /dev/sdd
2 8 32 2 active sync /dev/sdc
3 8 16 3 active sync /dev/sdb

#6: Re: Problems with raid after reboot.

Posted on 2011-07-25 23:33:42 by mtice

One other strange thing I just noticed - /dev/sde keeps getting added
back into /dev/md_d0 (after I start the array on /dev/md0):

# /usr/local/bin/lsdrv
**Warning** The following utility(ies) failed to execute:
pvs
lvs
Some information may be missing.

PCI [ata_piix] 00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
 ├─scsi 0:0:0:0 LITE-ON COMBO SOHC-4836K {2006061700044437}
 │  └─sr0: [11:0] Empty/Unknown 1.00g
 └─scsi 1:x:x:x [Empty]
PCI [ata_piix] 00:1f.2 IDE interface: Intel Corporation N10/ICH7 Family SATA IDE Controller (rev 01)
 ├─scsi 2:x:x:x [Empty]
 └─scsi 3:0:0:0 ATA HDS728080PLA380 {PFDB20S4SNLT6J}
    └─sda: [8:0] Partitioned (dos) 76.69g
       ├─sda1: [8:1] (ext4) 75.23g {960433b3-af56-41bd-bb9a-d0a0fb5ffb45}
       │  └─Mounted as /dev/disk/by-uuid/960433b3-af56-41bd-bb9a-d0a0fb5ffb45 @ /
       ├─sda2: [8:2] Partitioned (dos) 1.00k
       └─sda5: [8:5] (swap) 1.46g {10c3b226-16d4-44ea-ad1e-6296bb92969d}
PCI [sata_sil24] 04:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
 ├─scsi 4:0:0:0 ATA WDC WD7500AADS-0 {WD-WCAV59574584}
 │  └─sdb: [8:16] MD raid5 (3/4) 698.64g md0 clean in_sync {daf06d5a-b805-28b1-2e29-483df114274d}
 │     └─md0: [9:0] (ext3) 2.05t {a9a38e8e-d54d-407d-a786-31410ad6e17d}
 ├─scsi 4:1:0:0 ATA WDC WD7500AADS-0 {WD-WCAV59459025}
 │  └─sdc: [8:32] MD raid5 (2/4) 698.64g md0 clean in_sync {daf06d5a-b805-28b1-2e29-483df114274d}
 ├─scsi 4:2:0:0 ATA Hitachi HDS72101 {JP9911HZ1SKHNU}
 │  └─sdd: [8:48] MD raid5 (1/4) 931.51g md0 clean in_sync {daf06d5a-b805-28b1-2e29-483df114274d}
 ├─scsi 4:3:0:0 ATA Hitachi HDS72101 {JP9960HZ1VK96U}
 │  └─sde: [8:64] MD raid5 (none/4) 931.51g md_d0 inactive spare {daf06d5a-b805-28b1-2e29-483df114274d}
 │     └─md_d0: [254:0] Empty/Unknown 0.00k
 └─scsi 7:x:x:x [Empty]

#7: Re: Problems with raid after reboot.

Posted on 2011-07-25 23:42:09 by mtice

On Mon, Jul 25, 2011 at 3:33 PM, Matthew Tice <mjtice@gmail.com> wrote:
> On Mon, Jul 25, 2011 at 3:30 PM, Matthew Tice <mjtice@gmail.com> wrot=
e:
>> On Mon, Jul 25, 2011 at 3:21 PM, Robin Hill <robin@robinhill.me.uk> =
wrote:
>>> On Mon Jul 25, 2011 at 03:04:34PM -0600, Matthew Tice wrote:
>>>
>>>> Well things are a lot different now - I'm unable to start the arra=
y
>>>> successfully.  I removed an older non-relevant drive that was=
giving
>>>> me smart errors - when I rebooted the drive assignments shifted (n=
ot
>>>> sure this really matters, though).
>>>>
>>>> Now when I try to start the array I get:
>>>>
>>>> # mdadm -A -f /dev/md0
>>>> mdadm: no devices found for /dev/md0
>>>>
>>>> I can nudge it slightly with auto-detect:
>>>>
>>>> # mdadm --auto-detect
>>>>
>>>> Then I try to assemble the array with:
>>>>
>>>> # mdadm -A -f /dev/md0 /dev/sd[bcde]
>>>> mdadm: cannot open device /dev/sde: Device or resource busy
>>>> mdadm: /dev/sde has no superblock - assembly aborted
>>>>
>>> <- SNIP ->
>>>>  â”=82  └─sde: [8:64] MD raid5 (none=
/4) 931.51g md_d0 inactive spare
>>> <- SNIP ->
>>>>
>>>> I've looked but I'm unable to find where the drive is in use.
>>>
>>> lsdrv shows that it's in use in array md_d0 - presumably this is a
>>> part-assembled array (possibly auto-assembled by the kernel). Try
>>> stopping that first, then doing the "mdadm -A -f /dev/md0 /dev/sd[b=
cde]"
>>>
>>
>> Nice catch, thanks, Robin.
>>
>> I stopped /dev/md_d0 then started the array on /dev/md0
>>
>> # mdadm -A -f /dev/md0 /dev/sd[bcde]
>> mdadm: /dev/md0 has been started with 3 drives (out of 4).
>>
>> It's only seeing the three drives.  I did an fsck on it just in=
case
>> but it failed:
>>
>> # fsck -n /dev/md0
>> fsck from util-linux-ng 2.17.2
>> e2fsck 1.41.12 (17-May-2010)
>> Superblock has an invalid journal (inode 8).
>> Clear? no
>>
>> fsck.ext4: Illegal inode number while checking ext3 journal for /dev=
/md0
>>
>> Looks like /dev/sde is missing (as also noted above):
>>
>> # mdadm --detail /dev/md0
>> /dev/md0:
>>        Version : 00.90
>>  Creation Time : Sat Mar 12 21:22:34 2011
>>     Raid Level : raid5
>>     Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
>>  Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
>>   Raid Devices : 4
>>  Total Devices : 3
>> Preferred Minor : 0
>>    Persistence : Superblock is persistent
>>
>>    Update Time : Mon Jul 25 14:08:30 2011
>>          State : clean, degraded
>>  Active Devices : 3
>> Working Devices : 3
>>  Failed Devices : 0
>>  Spare Devices : 0
>>
>>         Layout : left-symmetric
>>     Chunk Size : 64K
>>
>>           UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
>>         Events : 0.5593
>>
>>    Number   Major   Minor   RaidDevice State
>>       0       0        0        0      removed
>>       1       8       48        1      active sync   /dev/sdd
>>       2       8       32        2      active sync   /dev/sdc
>>       3       8       16        3      active sync   /dev/sdb
>>
>
> One other strange thing I just noticed - /dev/sde keeps getting added
> back into /dev/md_d0 (after I start the array on /dev/md0)
>
> # /usr/local/bin/lsdrv
> **Warning** The following utility(ies) failed to execute:
>  pvs
>  lvs
> Some information may be missing.
>
> PCI [ata_piix] 00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
>  ├─scsi 0:0:0:0 LITE-ON COMBO SOHC-4836K {2006061700044437}
>  │  └─sr0: [11:0] Empty/Unknown 1.00g
>  └─scsi 1:x:x:x [Empty]
> PCI [ata_piix] 00:1f.2 IDE interface: Intel Corporation N10/ICH7 Family SATA IDE Controller (rev 01)
>  ├─scsi 2:x:x:x [Empty]
>  └─scsi 3:0:0:0 ATA HDS728080PLA380 {PFDB20S4SNLT6J}
>    └─sda: [8:0] Partitioned (dos) 76.69g
>       ├─sda1: [8:1] (ext4) 75.23g {960433b3-af56-41bd-bb9a-d0a0fb5ffb45}
>       │  └─Mounted as /dev/disk/by-uuid/960433b3-af56-41bd-bb9a-d0a0fb5ffb45 @ /
>       ├─sda2: [8:2] Partitioned (dos) 1.00k
>       └─sda5: [8:5] (swap) 1.46g {10c3b226-16d4-44ea-ad1e-6296bb92969d}
> PCI [sata_sil24] 04:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
>  ├─scsi 4:0:0:0 ATA WDC WD7500AADS-0 {WD-WCAV59574584}
>  │  └─sdb: [8:16] MD raid5 (3/4) 698.64g md0 clean in_sync {daf06d5a-b805-28b1-2e29-483df114274d}
>  │     └─md0: [9:0] (ext3) 2.05t {a9a38e8e-d54d-407d-a786-31410ad6e17d}
>  ├─scsi 4:1:0:0 ATA WDC WD7500AADS-0 {WD-WCAV59459025}
>  │  └─sdc: [8:32] MD raid5 (2/4) 698.64g md0 clean in_sync {daf06d5a-b805-28b1-2e29-483df114274d}
>  ├─scsi 4:2:0:0 ATA Hitachi HDS72101 {JP9911HZ1SKHNU}
>  │  └─sdd: [8:48] MD raid5 (1/4) 931.51g md0 clean in_sync {daf06d5a-b805-28b1-2e29-483df114274d}
>  ├─scsi 4:3:0:0 ATA Hitachi HDS72101 {JP9960HZ1VK96U}
>  │  └─sde: [8:64] MD raid5 (none/4) 931.51g md_d0 inactive spare {daf06d5a-b805-28b1-2e29-483df114274d}
>  │     └─md_d0: [254:0] Empty/Unknown 0.00k
>  └─scsi 7:x:x:x [Empty]
>

Here is something interesting from syslog:

1. I stop /dev/md_d0
Jul 25 15:38:56 localhost kernel: [ 4272.658244] md: md_d0 stopped.
Jul 25 15:38:56 localhost kernel: [ 4272.658258] md: unbind<sde>
Jul 25 15:38:56 localhost kernel: [ 4272.658271] md: export_rdev(sde)

2. I assemble /dev/md0 with:
# mdadm -A /dev/md0 /dev/sd[bcde]
mdadm: /dev/md0 has been started with 3 drives (out of 4).

Jul 25 15:41:33 localhost kernel: [ 4429.537035] md: md0 stopped.
Jul 25 15:41:33 localhost kernel: [ 4429.545447] md: bind<sde>
Jul 25 15:41:33 localhost kernel: [ 4429.545644] md: bind<sdc>
Jul 25 15:41:33 localhost kernel: [ 4429.545810] md: bind<sdb>
Jul 25 15:41:33 localhost kernel: [ 4429.546827] md: bind<sdd>
Jul 25 15:41:33 localhost kernel: [ 4429.546876] md: kicking non-fresh sde from array!
Jul 25 15:41:33 localhost kernel: [ 4429.546883] md: unbind<sde>
Jul 25 15:41:33 localhost kernel: [ 4429.546890] md: export_rdev(sde)
Jul 25 15:41:33 localhost kernel: [ 4429.565035] md/raid:md0: device sdd operational as raid disk 1
Jul 25 15:41:33 localhost kernel: [ 4429.565041] md/raid:md0: device sdb operational as raid disk 3
Jul 25 15:41:33 localhost kernel: [ 4429.565045] md/raid:md0: device sdc operational as raid disk 2
Jul 25 15:41:33 localhost kernel: [ 4429.565631] md/raid:md0: allocated 4222kB
Jul 25 15:41:33 localhost kernel: [ 4429.573438] md/raid:md0: raid level 5 active with 3 out of 4 devices, algorithm 2
Jul 25 15:41:33 localhost kernel: [ 4429.574754] RAID conf printout:
Jul 25 15:41:33 localhost kernel: [ 4429.574757] --- level:5 rd:4 wd:3
Jul 25 15:41:33 localhost kernel: [ 4429.574761] disk 1, o:1, dev:sdd
Jul 25 15:41:33 localhost kernel: [ 4429.574765] disk 2, o:1, dev:sdc
Jul 25 15:41:33 localhost kernel: [ 4429.574768] disk 3, o:1, dev:sdb
Jul 25 15:41:33 localhost kernel: [ 4429.574863] md0: detected capacity change from 0 to 2250468753408
Jul 25 15:41:33 localhost kernel: [ 4429.575092] md0: unknown partition table
Jul 25 15:41:33 localhost kernel: [ 4429.626140] md: bind<sde>

So /dev/sde is "non-fresh" and has an unknown partition table.
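
For anyone reading along, the kernel messages above can be picked apart mechanically. What follows is a minimal Python sketch (not part of the original post) that reports which device was kicked as non-fresh, which devices came up in which slot, and which device the "unknown partition table" message names. The sample lines are copied from the syslog excerpt with the timestamp prefix dropped; the pattern and function names are made up for illustration. Note that in this excerpt the partition-table message is prefixed with md0, not sde.

```python
import re

# Lines taken from the syslog excerpt above, timestamp prefix removed.
LOG = """\
md: bind<sde>
md: bind<sdc>
md: bind<sdb>
md: bind<sdd>
md: kicking non-fresh sde from array!
md: unbind<sde>
md/raid:md0: device sdd operational as raid disk 1
md/raid:md0: device sdb operational as raid disk 3
md/raid:md0: device sdc operational as raid disk 2
md0: unknown partition table
"""

# Hypothetical helper patterns, written against the message formats shown above.
KICKED = re.compile(r"md: kicking non-fresh (\S+) from array!")
OPERATIONAL = re.compile(r"device (\S+) operational as raid disk (\d+)")
NO_PTABLE = re.compile(r"(\S+): unknown partition table")


def summarize(log_text):
    """Return (kicked devices, {slot: device}, devices lacking a partition table)."""
    kicked, slots, no_ptable = [], {}, []
    for line in log_text.splitlines():
        if m := KICKED.search(line):
            kicked.append(m.group(1))
        elif m := OPERATIONAL.search(line):
            slots[int(m.group(2))] = m.group(1)
        elif m := NO_PTABLE.search(line):
            no_ptable.append(m.group(1))
    return kicked, slots, no_ptable


kicked, slots, no_ptable = summarize(LOG)
print("kicked:", kicked)
print("slots:", slots)
print("no partition table:", no_ptable)
```

The patterns use `search`, so they should also match the full lines with the "Jul 25 ... kernel: [...]" prefix left in place.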
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html


#8: Re: Problems with raid after reboot.

Posted on 2011-07-25 23:55:02 by mtice

Okay, I was able to add it back in by stopping this /dev/md_d0 and then:
# mdadm /dev/md0 --add /dev/sde
mdadm: re-added /dev/sde

So now it's syncing:
# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Sat Mar 12 21:22:34 2011
Raid Level : raid5
Array Size : 2197723392 (2095.91 GiB 2250.47 GB)
Used Dev Size : 732574464 (698.64 GiB 750.16 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Mon Jul 25 15:52:29 2011
State : clean, degraded, recovering
Active Devices : 3
Working Devices : 4
Failed Devices : 0
Spare Devices : 1

Layout : left-symmetric
Chunk Size : 64K

Rebuild Status : 0% complete

UUID : daf06d5a:b80528b1:2e29483d:f114274d (local to host storage)
Events : 0.5599

Number   Major   Minor   RaidDevice State
   4       8       64        0      spare rebuilding   /dev/sde
   1       8       48        1      active sync   /dev/sdd
   2       8       32        2      active sync   /dev/sdc
   3       8       16        3      active sync   /dev/sdb

# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sde[4] sdd[1] sdb[3] sdc[2]
      2197723392 blocks level 5, 64k chunk, algorithm 2 [4/3] [_UUU]
      [>....................]  recovery = 0.4% (3470464/732574464) finish=365.0min speed=33284K/sec

unused devices: <none>


However, it's still failing an fsck - so does order matter when I
re-assemble the array? I see conflicting answers online.
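
As a side note on the /proc/mdstat output above: the finish estimate can be cross-checked from the other numbers on the recovery line. This is a rough Python sketch, assuming the two counts in parentheses are 1K blocks done and total, and that speed is in K/sec; the function name and the regular expression are illustrative, not anything mdadm provides.

```python
import re

# Recovery line as shown in /proc/mdstat above.
MDSTAT_LINE = ("[>....................]  recovery = 0.4% (3470464/732574464) "
               "finish=365.0min speed=33284K/sec")

# Captures blocks done, blocks total, and the current speed in K/sec.
PATTERN = re.compile(r"\((\d+)/(\d+)\).*speed=(\d+)K/sec")


def recovery_eta_minutes(line):
    """Estimate the minutes remaining: (total - done) / speed, converted to minutes."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recovery progress line")
    done, total, speed = (int(g) for g in m.groups())
    return (total - done) / speed / 60


print(f"{recovery_eta_minutes(MDSTAT_LINE):.1f} min remaining")
```

The result comes out at roughly 365 minutes, in line with the finish=365.0min the kernel reports, so a full resync of this disk takes on the order of six hours at that speed.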


#9: Re: Problems with raid after reboot.

Posted on 2011-07-26 10:37:20 by Robin Hill

On Mon Jul 25, 2011 at 03:55:02PM -0600, Matthew Tice wrote:

> However, it's still failing an fsck - so does order matter when I
> re-assemble the array? I see conflicting answers online.
>
No, order only matters if you're recreating the array (which is a
last-ditch option for if assembly fails). The metadata on each drive
indicates where it should be in the array, so the assembly will use that
to order the drives.
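
To illustrate the point, here is a toy Python model of it (a sketch only, not how mdadm is actually implemented): each device carries its slot number in its own metadata, so the order in which devices are listed on the command line does not change the resulting layout. The slot numbers below are the RaidDevice values from the `mdadm --detail` output quoted in this thread; the dictionary and function names are hypothetical.

```python
# RaidDevice slot per device, as reported by `mdadm --detail` in this thread.
# In a real array this information lives in each member's superblock.
RECORDED_SLOT = {
    "/dev/sdb": 3,
    "/dev/sdc": 2,
    "/dev/sdd": 1,
    "/dev/sde": 0,
}


def assembly_order(devices, recorded_slot):
    """Order devices by the slot stored in their metadata, ignoring input order."""
    return sorted(devices, key=lambda dev: recorded_slot[dev])


# Two different command-line orders give the same layout.
print(assembly_order(["/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"], RECORDED_SLOT))
print(assembly_order(["/dev/sde", "/dev/sdb", "/dev/sdd", "/dev/sdc"], RECORDED_SLOT))
```

Recreating an array is different because it writes fresh metadata, which is why the device order given at that point does matter.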

The fsck errors would look to be genuine issues with the filesystem. Was
the array shut down cleanly before you moved it? You did have to force
the assembly initially, which would suggest not (and could point to some
minor corruption). I'm not sure you have much of an option other than to
go through with a fsck & repair any issues now though (if you've got the
space then I'd suggest imaging the array as a backup though).
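
For the imaging step, a block-level copy is what is meant. The sketch below is a plain-Python stand-in for what a tool such as dd would normally do, written under a few assumptions: it runs as root, the array is not mounted, and the destination sits on separate storage with room for the whole device (the syslog earlier in the thread reports 2250468753408 bytes for md0). The function name and the example paths are hypothetical.

```python
def image_device(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path in fixed-size chunks and return the bytes copied."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # end of device or file
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied


# Example call (paths are placeholders, adjust before use):
# image_device("/dev/md0", "/mnt/backup/md0.img")
```

This version stops on the first read error; it makes no attempt to skip bad sectors, so a dedicated rescue tool is the better choice if the disks themselves are suspect.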

Cheers,
Robin
--
___
( ' } | Robin Hill <robin@robinhill.me.uk> |
/ / ) | Little Jim says .... |
// !! | "He fallen in de water !!" |

