Accidental grow before add
on 26.09.2010 09:27:16 by Mike Hartman
I think I may have mucked up my array, but I'm hoping somebody can
give me a tip to retrieve the situation.
I had just added a new disk to my system and partitioned it in
preparation for adding it to my RAID 6 array, growing it from 7
devices to 8. However, I jumped the gun (guess I'm more tired than I
thought) and ran the grow command before I added the new disk to the
array as a spare.
In other words, I should have run:
mdadm --add /dev/md0 /dev/md3p1
mdadm --grow /dev/md0 --raid-devices=8 --backup-file=/grow_md0.bak
but instead I just ran
mdadm --grow /dev/md0 --raid-devices=8 --backup-file=/grow_md0.bak
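A quick aside for anyone retracing this later: before growing, the new device should already show up as a spare. A minimal way to check, assuming the same device names as above:

mdadm --detail /dev/md0 | grep -i spare   # md3p1 should be listed as a spare
grep -A 3 ^md0 /proc/mdstat               # a spare shows up with an (S) suffix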
I immediately checked /proc/mdstat and got the following output:
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid6 sdk1[0] md2p1[7] sde1[6] sdf1[5] md1p1[4] sdl1[3] sdj1[1]
      7324227840 blocks super 1.2 level 6, 256k chunk, algorithm 2 [8/7] [UUUUUUU_]
      [>....................]  reshape =  0.0% (79600/1464845568) finish=3066.3min speed=7960K/sec
md3 : active raid0 sdb1[0] sdh1[1]
1465141760 blocks super 1.2 128k chunks
md2 : active raid0 sdc1[0] sdd1[1]
1465141760 blocks super 1.2 128k chunks
md1 : active raid0 sdi1[0] sdm1[1]
1465141760 blocks super 1.2 128k chunks
unused devices:
At this point I figured I was probably ok. It looked like it was
restructuring the array to expect 8 disks, and with only 7 it would
just end up being in a degraded state. So I figured I'd just cost
myself some time - one reshape to get to the degraded 8 disk state,
and another reshape to activate the new disk instead of just the one
reshape onto the new disk. I went ahead and added the new disk as a
spare, figuring the current reshape operation would ignore it until it
completed, and then the system would notice it was degraded with a
spare available and rebuild it.
However, things have slowed to a crawl (relative to the time it
normally takes to regrow this array) so I'm afraid something has gone
wrong. As you can see in the initial mdstat above, it started at
7960K/sec - quite fast for a reshape on this array. But just a couple
minutes after that it had dropped down to only 667K. It worked its way
back up through 1801K to 10277K, which is about average for a reshape
on this array. Not sure how long it stayed at that level, but now
(still only 10 or 15 minutes after the original mistake) it's plunged
all the way down to 40K/s. It's been down at this level for several
minutes and still dropping slowly. This doesn't strike me as a good
sign for the health of the unusual regrow operation.
Anybody have a theory on what could be causing the slowness? Does it
seem like a reasonable consequence of growing an array without a spare
attached? I'm hoping that this particular growing mistake isn't
automatically fatal, or mdadm would have warned me or asked for a
confirmation or something. Worst case scenario, I'm hoping the array
survives even if I just have to live with this speed and wait for it
to finish - although at the current rate that would take over a
year... Dare I mount the array's partition to check on the contents,
or would that risk messing it up worse?
Here's the latest /proc/mdstat:
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid6 md3p1[8](S) sdk1[0] md2p1[7] sde1[6] sdf1[5] md1p1[4] sdl1[3] sdj1[1]
      7324227840 blocks super 1.2 level 6, 256k chunk, algorithm 2 [8/7] [UUUUUUU_]
      [>....................]  reshape =  0.1% (1862640/1464845568) finish=628568.8min speed=38K/sec
md3 : active raid0 sdb1[0] sdh1[1]
1465141760 blocks super 1.2 128k chunks
md2 : active raid0 sdc1[0] sdd1[1]
1465141760 blocks super 1.2 128k chunks
md1 : active raid0 sdi1[0] sdm1[1]
1465141760 blocks super 1.2 128k chunks
unused devices:
Mike
Re: Accidental grow before add
on 26.09.2010 11:39:07 by Mike Hartman
And now the speed has picked back up to the normal rate (for now), but
for some reason it has marked one of the existing drives as failed.
Especially weird since that "drive" is one of my RAID 0s, and its
component disks look fine. Since I was already "missing" the drive I
forgot to add, that leaves me with no more room for failures. I have
no idea why mdadm has decided this other drive failed (the timing is
awfully coincidental) but if whatever it is happens again I'm really
in trouble. Here's the latest mdstat:
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid6 md3p1[8](S) sdk1[0] md2p1[7](F) sde1[6] sdf1[5] md1p1[4] sdl1[3] sdj1[1]
      7324227840 blocks super 1.2 level 6, 256k chunk, algorithm 2 [8/6] [UUUUUU__]
      [>....................]  reshape =  3.1% (45582368/1464845568) finish=2251.5min speed=10505K/sec
md3 : active raid0 sdb1[0] sdh1[1]
1465141760 blocks super 1.2 128k chunks
md2 : active raid0 sdc1[0] sdd1[1]
1465141760 blocks super 1.2 128k chunks
md1 : active raid0 sdi1[0] sdm1[1]
1465141760 blocks super 1.2 128k chunks
unused devices:
Mike
Re: Accidental grow before add
on 26.09.2010 11:54:39 by Mike Hartman
I've stopped /dev/md0 with "mdadm --stop /dev/md0" because I'm just
too worried about what might happen to the array if another component
mysteriously fails.
Most pressing question: if md2 reports all component drives healthy
and correct in mdstat, then why would md2's md2p1 partition suddenly
show up as a failed component of md0 (since it's obvious the
underlying hardware is ok)? And does a drive "failing" during a
reshape corrupt the reshape (in which case irreparable damage has
already been done)?
Assuming that my array isn't already destroyed, and assuming that this
mysterious failure without any hard drives failing doesn't crop up
again, is there any way to force the array to immediately start
incorporating spares introduced after the reshape began (both the new
drive and now the one that "failed")? Because right now I'm in the
position of needing to complete a multi-day reshape operation with no
safety net at all and that scares the hell out of me.
Mike
Re: Accidental grow before add
on 26.09.2010 11:59:57 by Mikael Abrahamsson
On Sun, 26 Sep 2010, Mike Hartman wrote:
> I've stopped /dev/md0 with "mdadm --stop /dev/md0" because I'm just
> too worried about what might happen to the array if another component
> mysteriously fails.
You need to start looking in dmesg / other logs to see what has happened
and why things have failed. Without that information it's impossible to
tell what's going on.
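In practice that means something along these lines - the device names are just the ones from the mdstat output above, and smartctl needs smartmontools installed:

dmesg | grep -i -E 'md0|ata|error'   # kernel messages from md and the disk layer
smartctl -a /dev/sdd                 # SMART health of a suspect member disk
mdadm --detail /dev/md0              # md's view of the array state
mdadm --examine /dev/md2p1           # superblock of an individual member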
--
Mikael Abrahamsson email: swmike@swm.pp.se
Re: Accidental grow before add
on 26.09.2010 12:18:09 by Mike Hartman
> You need to start looking in dmesg / other logs to see what has happened and
> why things have failed. Without that information it's impossible to tell
> what's going on.
>
> --
> Mikael Abrahamsson    email: swmike@swm.pp.se
>
I've uploaded the dmesg output starting with the reshape to
www.hartmanipulation.com/raid/dmesg_6.txt. It looks like /dev/sdd is
having some kind of intermittent read issues (which wasn't happening
before the reshape started) but I still don't understand why it
wouldn't be marked as failed in the md2 section of mdstat, since md0
is accessing it via md2.
At any rate, that doesn't help me with my most immediate issue: does a
drive failing during a reshape corrupt the array? Or am I safe to
resume the reshape? Is there any way to restore my safety net a bit
before resuming the reshape, or will I just have to hope nothing else
goes wrong between now and the time the new hot spare is finally
incorporated?
Mike
Re: Accidental grow before add
on 26.09.2010 12:38:18 by Robin Hill
On Sun Sep 26, 2010 at 06:18:09AM -0400, Mike Hartman wrote:
> > You need to start looking in dmesg / other logs to see what has happened and
> > why things have failed. Without that information it's impossible to tell
> > what's going on.
> >
> I've uploaded the dmesg output starting with the reshape to
> www.hartmanipulation.com/raid/dmesg_6.txt. It looks like /dev/sdd is
> having some kind of intermittent read issues (which wasn't happening
> before the reshape started) but I still don't understand why it
> wouldn't be marked as failed in the md2 section of mdstat, since md0
> is accessing it via md2.
>
I think this is because it's a RAID0 array. It can't fail the device
without (irrecoverably) failing the array, so it's left to the normal
block device error reporting/handling process.
> At any rate, that doesn't help me with my most immediate issue: does a
> drive failing during a reshape corrupt the array? Or am I safe to
> resume the reshape? Is there any way to restore my safety net a bit
> before resuming the reshape, or will I just have to hope nothing else
> goes wrong between now and the time the new hot spare is finally
> incorporated?
>
Failure of a device during the reshape certainly shouldn't corrupt the
array (I don't see how it would anyway, unless there's a screw-up in the
code). I don't think there's any way to "restore your safety net"
though (short of imaging all the drives as backups), but it's probably
worth while doing a read test of all member devices before you continue.
Cheers,
Robin
--
    ___
   ( ' }     | Robin Hill                  |
  / / )      | Little Jim says ....        |
 // !!       | "He fallen in de water !!"  |
Re: Accidental grow before add
on 26.09.2010 21:34:17 by Mike Hartman
> I think this is because it's a RAID0 array.  It can't fail the device
> without (irrecoverably) failing the array, so it's left to the normal
> block device error reporting/handling process.
I guess that makes sense.
>
>> At any rate, that doesn't help me with my most immediate issue: does a
>> drive failing during a reshape corrupt the array? Or am I safe to
>> resume the reshape? Is there any way to restore my safety net a bit
>> before resuming the reshape, or will I just have to hope nothing else
>> goes wrong between now and the time the new hot spare is finally
>> incorporated?
>>
> Failure of a device during the reshape certainly shouldn't corrupt the
> array (I don't see how it would anyway, unless there's a screw-up in the
> code).
I guess I was thinking that the reshape was restriping all the data
under the assumption of 7 (or 8) drives, and one failing might change
the restriping requirements in midstream and leave it in an
unrecoverable in-between state. Very glad to hear that's not the case.
> I don't think there's any way to "restore your safety net"
> though (short of imaging all the drives as backups), but it's probably
> worth while doing a read test of all member devices before you continue.
Can you recommend a good way to perform such a read test? Would I just
dd the entire contents of each disk to /dev/null or is there a more
efficient way of doing it?
Thanks for the help Robin!
Mike
Re: Accidental grow before add
on 26.09.2010 23:22:32 by Robin Hill
On Sun Sep 26, 2010 at 03:34:17PM -0400, Mike Hartman wrote:
> I guess I was thinking that the reshape was restriping all the data
> under the assumption of 7 (or 8) drives, and one failing might change
> the restriping requirements in midstream and leave it in an
> unrecoverable in-between state. Very glad to hear that's not the case.
>
I've not looked at the code, so I can't be certain. I'm pretty sure
I've had this happen to me during a reshape though, without any issues.
> > I don't think there's any way to "restore your safety net"
> > though (short of imaging all the drives as backups), but it's probably
> > worth while doing a read test of all member devices before you continue.
>
> Can you recommend a good way to perform such a read test? Would I just
> dd the entire contents of each disk to /dev/null or is there a more
> efficient way of doing it?
>
That's what I'd do anyway. You could also try running a full SMART
test - that should pick up anything. I'd still go with dd though, as
I'm more confident of what that's actually doing.
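A sketch of that kind of read test, for anyone following along - the device list is an assumption and should be adjusted to the actual members, and smartctl again comes from smartmontools:

for d in /dev/sd[b-m]1; do
    echo "== $d"
    dd if="$d" of=/dev/null bs=1M || echo "read error on $d"
done
# or a long surface self-test per underlying disk:
smartctl -t long /dev/sdd
smartctl -l selftest /dev/sdd    # check the result once the test completes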
Good luck,
Robin
--
    ___
   ( ' }     | Robin Hill                  |
  / / )      | Little Jim says ....        |
 // !!       | "He fallen in de water !!"  |
Re: Accidental grow before add
on 27.09.2010 10:11:19 by Jon Hardcastle
I am more interested to know why it kicked off a reshape that would leave
the array in a degraded state without a warning and needing a '--force'.
Are you sure there wasn't capacity to 'grow' anyway?
Also, when I first ran my reshape it was incredibly slow going from
RAID 5 to 6 though.. it literally took days.
Re: Accidental grow before add
on 27.09.2010 11:05:05 by Mike Hartman
> I am more interested to know why it kicked off a reshape that would leave the array in a degraded state without a warning and
> needing a '--force' are you sure there wasn't capacity to 'grow' anyway?
Positive. I had no spare of any kind and mdstat was showing all disks
were in use. Now I've got the new drive in there as a spare, but it
was added after the reshape started and mdadm doesn't seem to be
trying to use it yet. I'm thinking it's going through the original
reshape I kicked off (transforming it from an intact 7 disk RAID 6 to
a degraded 8 disk RAID 6) and then when it gets to the end it will run
another reshape to pick up the new spare. I too am surprised there
wasn't at least a warning if not a confirmation.
> Also, when i first ran my reshape it was incredibly slow from Raid5~6 tho.. it literally took days.
I did a RAID 5 -> RAID 6 conversion the other week and it was also
slower than a normal resizing, but only 2-2.5 times as slow. Adding a
new disk usually takes a bit less than 2 days on this array and that
conversion took closer to 4. However, at the slowest rate I reported
above it would have taken something 11 months - definitely a whole
different ballpark.
At any rate, apparently one of my other drives in the array was
throwing some read errors. Eventually it did something unrecoverable
and was dropped from the array. Once that happened the speed returned
to a more normal level, but I stopped the arrays to run a complete
read test on every drive before continuing. With an already degraded
array, losing that drive killed any failure buffer I had left. I want
to make quite sure all the other drives will finish the reshape
properly before risking it. Then I guess it's just a matter of waiting
3 or 4 days for both reshapes to complete.
Mike
Re: Accidental grow before add
on 28.09.2010 17:14:51 by nagilum
----- Message from mike@hartmanipulation.com ---------
>> I am more interested to know why it kicked off a reshape that would
>> leave the array in a degraded state without a warning and
>> needing a '--force' are you sure there wasn't capacity to 'grow' anyway?
>
> Positive. I had no spare of any kind and mdstat was showing all disks
> were in use.
Yep, a warning/safety net would be good. At the moment mdadm assumes
you know what you're doing.
> Now I've got the new drive in there as a spare, but it
> was added after the reshape started and mdadm doesn't seem to be
> trying to use it yet. I'm thinking it's going through the original
> reshape I kicked off (transforming it from an intact 7 disk RAID 6 to
> a degraded 8 disk RAID 6) and then when it gets to the end it will run
> another reshape to pick up the new spare.
Yes, that's what's going to happen.
>> Also, when i first ran my reshape it was incredibly slow from
>> Raid5~6 tho.. it literally took days.
> I did a RAID 5 -> RAID 6 conversion the other week and it was also
> slower than a normal resizing, but only 2-2.5 times as slow. Adding a
> new disk usually takes a bit less than 2 days on this array and that
> conversion took closer to 4. However, at the slowest rate I reported
> above it would have taken something 11 months - definitely a whole
> different ballpark.
Yeah that was due to the disk errors.
I find "iostat -d 2 -kx" helpful to understand what's going on.
> At any rate, apparently one of my other drives in the array was
> throwing some read errors. Eventually it did something unrecoverable
> and was dropped from the array. Once that happened the speed returned
> to a more normal level, but I stopped the arrays to run a complete
> read test on every drive before continuing. With an already degraded
> array, losing that drive killed any failure buffer I had left. I want
> to make quite sure all the other drives will finish the reshape
> properly before risking it. Then I guess it's just a matter of waiting
> 3 or 4 days for both reshapes to complete.
Yep, I once got bitten by a linux kernel bug that caused the RAID5 to
corrupt when a drive failed during reshape. I managed to recover though.
Since then I always do a raid-check before starting any changes.
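For reference, a sketch of such a check through sysfs, assuming md0 and a kernel that exposes the md sync_action interface:

echo check > /sys/block/md0/md/sync_action   # read-only scrub of the whole array
cat /proc/mdstat                             # progress is reported as "check"
cat /sys/block/md0/md/mismatch_cnt           # ideally 0 once the check finishes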
Good luck and thanks for the story so far.
Alex.
============================================================ ============
# _ __ _ __ http://www.nagilum.org/ \n icq://69646724 #
# / |/ /__ ____ _(_) /_ ____ _ nagilum@nagilum.org \n +491776461165 #
# / / _ `/ _ `/ / / // / ' \ Amiga (68k/PPC): AOS/NetBSD/Linux #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/ Mac (PPC): MacOS-X / NetBSD /Linux #
# /___/ x86: FreeBSD/Linux/Solaris/Win2k ARM9: EPOC EV6 #
============================================================ ============
------------------------------------------------------------ ----
cakebox.homeunix.net - all the machine one needs..
Re: Accidental grow before add
on 30.09.2010 18:13:27 by Mike Hartman
In the spirit of providing full updates for interested parties/future Googlers:
> I'm thinking it's going through the original
> reshape I kicked off (transforming it from an intact 7 disk RAID 6 to
> a degraded 8 disk RAID 6) and then when it gets to the end it will run
> another reshape to pick up the new spare.
Well that "first" reshape finally finished and it looks like it
actually did switch over to bringing in the new spare at some point in
midstream. I only noticed it after the reshape completed, but here's
the window where it happened.
23:02 (New spare still unused):
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid6 sdk1[0] md3p1[8](S) sde1[6] sdf1[5] md1p1[4] sdl1[3] sdj1[1]
      7324227840 blocks super 1.2 level 6, 256k chunk, algorithm 2 [8/6] [UUUUUU__]
      [===============>.....]  reshape = 76.4% (1119168512/1464845568) finish=654.5min speed=8801K/sec
md3 : active raid0 sdb1[0] sdh1[1]
1465141760 blocks super 1.2 128k chunks
md1 : active raid0 sdi1[0] sdm1[1]
1465141760 blocks super 1.2 128k chunks
unused devices:
23:03 (Spare flag is gone, although it's not marked as "Up" yet further down):
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid6 sdk1[0] md3p1[8] sde1[6] sdf1[5] md1p1[4] sdl1[3] sdj1[1]
      8789073408 blocks super 1.2 level 6, 256k chunk, algorithm 2 [8/6] [UUUUUU__]
      [===============>.....]  recovery = 78.7% (1152999432/1464845568) finish=161.1min speed=32245K/sec
md3 : active raid0 sdb1[0] sdh1[1]
1465141760 blocks super 1.2 128k chunks
md1 : active raid0 sdi1[0] sdm1[1]
1465141760 blocks super 1.2 128k chunks
unused devices:
14:57 (It seemed to stall at the percent complete above for about 16 hours):
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid6 sdk1[0] md3p1[8] sde1[6] sdf1[5] md1p1[4] sdl1[3] sdj1[1]
      8789073408 blocks super 1.2 level 6, 256k chunk, algorithm 2 [8/6] [UUUUUU__]
      [===============>.....]  recovery = 79.1% (1160057740/1464845568) finish=161.3min speed=31488K/sec
md3 : active raid0 sdb1[0] sdh1[1]
1465141760 blocks super 1.2 128k chunks
md1 : active raid0 sdi1[0] sdm1[1]
1465141760 blocks super 1.2 128k chunks
unused devices:
15:01 (And the leap forward):
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid6 sdk1[0] md3p1[8] sde1[6] sdf1[5] md1p1[4] sdl1[3] sdj1[1]
      8789073408 blocks super 1.2 level 6, 256k chunk, algorithm 2 [8/6] [UUUUUU__]
      [==================>..]  recovery = 92.3% (1352535224/1464845568) finish=58.9min speed=31729K/sec
md3 : active raid0 sdb1[0] sdh1[1]
1465141760 blocks super 1.2 128k chunks
md1 : active raid0 sdi1[0] sdm1[1]
1465141760 blocks super 1.2 128k chunks
unused devices:
16:05 (Finishing clean, with only the drive that failed in mid-reshape
still missing):
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid6 sdk1[0] md3p1[8] sde1[6] sdf1[5] md1p1[4] sdl1[3] sdj1[1]
      8789073408 blocks super 1.2 level 6, 256k chunk, algorithm 2 [8/7] [UUUUUUU_]
md3 : active raid0 sdb1[0] sdh1[1]
1465141760 blocks super 1.2 128k chunks
md1 : active raid0 sdi1[0] sdm1[1]
1465141760 blocks super 1.2 128k chunks
unused devices:
So it seemed to pause for about 16 hours to pull in the spare, but
that's 4-5 times faster than it would normally take to grow the array
onto a new one. I assume that's because I was already reshaping the
array to fit across 8 disks (they just weren't all there) so when it
saw the new one it only had to update the new disk. Hopefully it will
go that fast when I replace the other disk that died.
Everything seems to have worked out ok - I just did a forced fsck on
the filesystem and it didn't mention correcting anything. Mounted it
and everything seems to be intact. Hopefully this whole thread will be
useful for someone in a similar situation. Thanks to everyone for the
help.
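For completeness, a sketch of that final verification - the filesystem device and mount point here are assumptions, since the thread never names them:

fsck -f /dev/md0p1           # force a full check even if the fs is marked clean
mount -o ro /dev/md0p1 /mnt  # mount read-only first for a careful look around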
Mike
Re: Accidental grow before add
on 05.10.2010 07:18:03 by NeilBrown
On Tue, 28 Sep 2010 17:14:51 +0200
Nagilum wrote:
>
> ----- Message from mike@hartmanipulation.com ---------
>
> >> I am more interested to know why it kicked off a reshape that would
> >> leave the array in a degraded state without a warning and
> >> needing a '--force' are you sure there wasn't capacity to 'grow' anyway?
> >
> > Positive. I had no spare of any kind and mdstat was showing all disks
> > were in use.
>
> Yep, a warning/safety net would be good. At the moment mdadm assumes
> you know what you're doing.
>
I've added this to my list of possible enhancements for mdadm-3.2
Thanks,
NeilBrown
Re: Accidental grow before add
on 05.10.2010 08:24:25 by NeilBrown
On Thu, 30 Sep 2010 12:13:27 -0400
Mike Hartman wrote:
> In the spirit of providing full updates for interested parties/future Googlers:
>
> > I'm thinking it's going through the original
> > reshape I kicked off (transforming it from an intact 7 disk RAID 6 to
> > a degraded 8 disk RAID 6) and then when it gets to the end it will run
> > another reshape to pick up the new spare.
>
> Well that "first" reshape finally finished and it looks like it
> actually did switch over to bringing in the new spare at some point in
> midstream. I only noticed it after the reshape completed, but here's
> the window where it happened.
>
>
> 23:02 (New spare still unused):
>
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> md0 : active raid6 sdk1[0] md3p1[8](S) sde1[6] sdf1[5] md1p1[4] sdl1[3] sdj1[1]
> 7324227840 blocks super 1.2 level 6, 256k chunk, algorithm 2
> [8/6] [UUUUUU__]
> [===============>.....] reshape = 76.4% (1119168512/1464845568)
> finish=654.5min speed=8801K/sec
>
> md3 : active raid0 sdb1[0] sdh1[1]
> 1465141760 blocks super 1.2 128k chunks
>
> md1 : active raid0 sdi1[0] sdm1[1]
> 1465141760 blocks super 1.2 128k chunks
>
> unused devices:
>
>
> 23:03 (Spare flag is gone, although it's not marked as "Up" yet further down):
>
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> md0 : active raid6 sdk1[0] md3p1[8] sde1[6] sdf1[5] md1p1[4] sdl1[3] sdj1[1]
> 8789073408 blocks super 1.2 level 6, 256k chunk, algorithm 2
> [8/6] [UUUUUU__]
> [===============>.....] recovery = 78.7%
> (1152999432/1464845568) finish=161.1min speed=32245K/sec
>
> md3 : active raid0 sdb1[0] sdh1[1]
> 1465141760 blocks super 1.2 128k chunks
>
> md1 : active raid0 sdi1[0] sdm1[1]
> 1465141760 blocks super 1.2 128k chunks
>
> unused devices:
This is really strange. I cannot reproduce any behaviour like this.
What kernel are you using?
What should happen is that the reshape will continue to the end, and then a
recovery will start from the beginning of the array, incorporating the new
device. This is what happens in my tests.
At about 84% the reshape should start going a lot faster as it no longer
needs to read data - it just writes zeros. But there is nothing interesting
that can happen around 77%.
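(Presumably the ~84% mark is where the old data runs out: the old 7-device RAID 6 has 5 data disks, the new 8-device layout has 6, so the existing data only fills 5/6 of the new stripes - 5/6 of the 1464845568 per-device blocks is 1220704640, i.e. roughly 83.3% of the reshape range.)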
>
>
>
> 14:57 (It seemed to stall at the percent complete above for about 16 hours):
This is also extremely odd. I think you are saying that the 'speed' stayed
at a fairly normal level, but the 'recovery =' percent didn't change.
Looking at the code - that cannot happen!
Maybe there is a perfectly reasonable explanation - possibly dependent on the
particular kernel you are using - but I cannot see it.
I would certainly recommend a 'check' and a 'fsck' (if you haven't already).
NeilBrown