Degraded Array
on 04.12.2010 03:42:27 by Leslie Rhorer
Hello everyone.
I was just growing one of my RAID6 arrays from 13 to 14 members.  The
array growth had passed its critical stage and had been growing for several
minutes when the system came to a screeching halt.  It hit the big red
switch, and when the system rebooted, the array assembled, but two members
are missing.  One of the members is the new drive and the other is the 13th
drive in the RAID set.  Of course, the array can run well enough with only
12 members, but it's definitely not the best situation, especially since the
re-shape will take another day and a half.  Is it best I go ahead and leave
the array in its current state until the re-shape is done, or should I go
ahead and add back the two failed drives?
Re: Degraded Array
on 04.12.2010 05:26:36 by majedb
You have a degraded array now with 1 disk down. If you proceed, more
disks might pop out due to errors.
It's best to back up your data, run a check on the array, fix it, then
try to resume the reshape.
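
For reference, a minimal sketch of how such a check might be run and
monitored, assuming the array is /dev/md0 (the device name is only an
example); md runs a single sync action at a time, so a check could only
start once the reshape is no longer running:

    # request a read-only consistency check of the array
    echo check > /sys/block/md0/md/sync_action

    # watch progress and the count of detected mismatches
    cat /proc/mdstat
    cat /sys/block/md0/md/mismatch_cnt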
--
    Majed B.
RE: Degraded Array
on 04.12.2010 05:47:32 by Leslie Rhorer
> -----Original Message-----
> From: Majed B. [mailto:majedb@gmail.com]
> Sent: Friday, December 03, 2010 10:27 PM
> To: lrhorer@satx.rr.com
> Cc: linux-raid@vger.kernel.org
> Subject: Re: Degraded Array
>
> You have a degraded array now with 1 disk down. If you proceed, more
> disks might pop out due to errors.
Well, sort of.  A significant fraction of the data is now striped across
12 + 0 drives, rather than 11 + 1.  There are no errors occurring on the
drives, although of course an unrecoverable error could happen at any time.
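
A couple of quick, non-destructive ways to back that claim up, assuming
/dev/sdb is one of the member drives (an example name only):

    # look for recent I/O errors reported by the kernel
    dmesg | grep -iE 'ata|sd[a-z]|error' | tail -n 50

    # query a member drive's SMART health summary
    smartctl -H /dev/sdb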
> It's best to back up your data,

The data is backed up.  Except in extreme circumstances, I would never
start a re-shape without a current backup.

> run a check on the array, fix it, then try to resume the reshape.

The array is in good health, other than the two kicked drives.  I'm not sure
I understand what you mean, though.  I'm asking about the two offline
drives.  Should I add the 13th back?  It still has substantially the same
data as the other 12 drives, discounting the amount that has been
re-written.  If so, how can I safely stop the array re-shape and re-add the
drive?  (This is under mdadm 2.6.7.2.)
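
Before deciding, the current state of the array and of the kicked member can
be inspected read-only; a sketch assuming the array is /dev/md0 and the
kicked 13th member is /dev/sdm1 (both names are assumptions):

    # overall state, reshape progress, and which slots are missing
    cat /proc/mdstat
    mdadm --detail /dev/md0

    # compare the event count and reshape position recorded on the
    # kicked member with what the active members report
    mdadm --examine /dev/sdm1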
> > growing for several minutes when the system came to a screeching halt.
> > It hit the big red switch, and when the system rebooted, the array

I meant to type *I*, not *It*.
Re: Degraded Array
on 04.12.2010 07:44:57 by NeilBrown
On Sat, 4 Dec 2010 07:26:36 +0300 "Majed B." wrote:
> You have a degraded array now with 1 disk down. If you proceed, more
> disks might pop out due to errors.
>
> It's best to back up your data, run a check on the array, fix it, then
> try to resume the reshape.
Backups are always a good idea, but are sometimes impractical.

I don't think running a 'check' would help at all.  A 'reshape' will do much
the same sort of work, and more.

It isn't strictly true that the array is '1 disk down'.  Parts of it are 1
disk down, parts are 2 disks down.  As the reshape progresses more and more
will be 2 disks down.  We don't really want that.

This case isn't really handled well at present.  You want to do a 'recovery'
and a 'reshape' at the same time.  This is quite possible, but doesn't
currently happen when you restart a reshape in the middle (added to my todo
list).

I suggest you:
 - apply the patch below to mdadm.
 - assemble the array with --update=revert-reshape.  You should give
   it a --backup-file too.
 - let the reshape complete so you are back to 13 devices.
 - add a spare and let it recover.
 - then add a spare and reshape the array.

Of course you needed to be running a new enough kernel to be able to
decrease the number of devices in a raid5.
NeilBrown
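
A minimal command-line sketch of the sequence suggested above, using the
patched mdadm from the diff that follows; the device names (/dev/md0,
/dev/sd[b-m]1 for the twelve working members, /dev/sdn1 for the kicked 13th,
/dev/sdo1 for the new drive) and the backup-file paths are assumptions, not
part of the instructions:

    # stop the partially reshaped array, then re-assemble it with the
    # patched mdadm, reverting the grow back to 13 devices
    mdadm --stop /dev/md0
    mdadm --assemble /dev/md0 /dev/sd[b-m]1 /dev/sdn1 \
          --update=revert-reshape --backup-file=/root/md0-revert.backup

    # once the reverted reshape finishes, re-add a drive and let the
    # degraded array rebuild
    mdadm /dev/md0 --add /dev/sdn1

    # after the rebuild completes, add another drive and grow to 14 devices
    mdadm /dev/md0 --add /dev/sdo1
    mdadm --grow /dev/md0 --raid-devices=14 --backup-file=/root/md0-grow.backup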
commit 12bab17f765a4130c7bd133a0bbb3b83f3f492b0
Author: NeilBrown
Date:   Sat Dec 4 17:37:14 2010 +1100

    Support reverting of reshape.

    Allow --update=revert-reshape to do what you would expect.

    FIXME
    needs review.  Think about interface and use cases.
    Document.

diff --git a/Assemble.c b/Assemble.c
index afd4e60..c034e37 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -592,6 +592,12 @@ int Assemble(struct supertype *st, char *mddev,
     /* Ok, no bad inconsistancy, we can try updating etc */
     bitmap_done = 0;
     content->update_private = NULL;
+    if (update && strcmp(update, "revert-reshape") == 0 &&
+        (content->reshape_active == 0 || content->delta_disks <= 0)) {
+        fprintf(stderr, Name ": Cannot revert-reshape on this array\n");
+        close(mdfd);
+        return 1;
+    }
     for (tmpdev = devlist; tmpdev; tmpdev=tmpdev->next) if (tmpdev->used == 1) {
         char *devname = tmpdev->devname;
         struct stat stb;
diff --git a/mdadm.c b/mdadm.c
index 08e8ea4..7cf51b5 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -662,6 +662,8 @@ int main(int argc, char *argv[])
                 continue;
             if (strcmp(update, "devicesize")==0)
                 continue;
+            if (strcmp(update, "revert-reshape")==0)
+                continue;
             if (strcmp(update, "byteorder")==0) {
                 if (ss) {
                     fprintf(stderr, Name ": must not set metadata type with --update=byteorder.\n");
@@ -688,7 +690,8 @@ int main(int argc, char *argv[])
             }
             fprintf(outf, "Valid --update options are:\n"
             " 'sparc2.2', 'super-minor', 'uuid', 'name', 'resync',\n"
-            " 'summaries', 'homehost', 'byteorder', 'devicesize'.\n");
+            " 'summaries', 'homehost', 'byteorder', 'devicesize',\n"
+            " 'revert-reshape'.\n");
             exit(outf == stdout ? 0 : 2);

         case O(INCREMENTAL,NoDegraded):
diff --git a/super0.c b/super0.c
index ae3e885..01d5cfa 100644
--- a/super0.c
+++ b/super0.c
@@ -545,6 +545,19 @@ static int update_super0(struct supertype *st, struct mdinfo *info,
     }
     if (strcmp(update, "_reshape_progress")==0)
         sb->reshape_position = info->reshape_progress;
+    if (strcmp(update, "revert-reshape") == 0 &&
+        sb->minor_version > 90 && sb->delta_disks != 0) {
+        int tmp;
+        sb->raid_disks -= sb->delta_disks;
+        sb->delta_disks = - sb->delta_disks;
+        tmp = sb->new_layout;
+        sb->new_layout = sb->layout;
+        sb->layout = tmp;
+
+        tmp = sb->new_chunk;
+        sb->new_chunk = sb->chunk_size;
+        sb->chunk_size = tmp;
+    }

     sb->sb_csum = calc_sb0_csum(sb);
     return rv;
diff --git a/super1.c b/super1.c
index 0eb0323..805777e 100644
--- a/super1.c
+++ b/super1.c
@@ -781,6 +781,19 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
     }
     if (strcmp(update, "_reshape_progress")==0)
         sb->reshape_position = __cpu_to_le64(info->reshape_progress);
+    if (strcmp(update, "revert-reshape") == 0 && sb->delta_disks) {
+        __u32 temp;
+        sb->raid_disks = __cpu_to_le32(__le32_to_cpu(sb->raid_disks) + __le32_to_cpu(sb->delta_disks));
+        sb->delta_disks = __cpu_to_le32(-__le32_to_cpu(sb->delta_disks));
+        printf("REverted to %d\n", (int)__le32_to_cpu(sb->delta_disks));
+        temp = sb->new_layout;
+        sb->new_layout = sb->layout;
+        sb->layout = temp;
+
+        temp = sb->new_chunk;
+        sb->new_chunk = sb->chunksize;
+        sb->chunksize = temp;
+    }

     sb->sb_csum = calc_sb_1_csum(sb);
     return rv;
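
For completeness, a sketch of how the posted patch might be applied to an
mdadm source tree and rebuilt; the file name revert-reshape.patch and the
checkout location are assumptions:

    # save the mail body containing the diff as revert-reshape.patch,
    # then apply it inside an mdadm source checkout and rebuild
    cd mdadm
    patch -p1 < ../revert-reshape.patch
    make
    ./mdadm --version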
RE: Degraded Array
on 04.12.2010 09:53:45 by Leslie Rhorer
> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Neil Brown
> Sent: Saturday, December 04, 2010 12:45 AM
> To: Majed B.
> Cc: lrhorer@satx.rr.com; linux-raid@vger.kernel.org
> Subject: Re: Degraded Array
>
> On Sat, 4 Dec 2010 07:26:36 +0300 "Majed B." wrote:
>
> > You have a degraded array now with 1 disk down. If you proceed, more
> > disks might pop out due to errors.
> >
> > It's best to back up your data, run a check on the array, fix it, then
> > try to resume the reshape.
>
> Backups are always a good idea, but are sometimes impractical.
I always have backups.  I have a backup system running a RAID array that is
always kept a bit larger than the one on my primary server.  Every morning
at 04:00 I run an rsync (well, the system does, of course).
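
For illustration, a minimal sketch of that kind of nightly job as a cron
entry; the paths, the backup host name, and the log file are assumptions:

    # /etc/cron.d/array-backup -- mirror the primary array to the backup
    # server at 04:00 every day (paths and host are examples only)
    0 4 * * * root rsync -aH --delete /array/ backup:/array/ >> /var/log/array-rsync.log 2>&1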
> I don't think running a 'check' would help at all. A 'reshape' will do
> much
> the same sort of work, and more.
>
> It isn't strictly true that the array is '1 disk down'. Parts of it are 1
> disk down, parts are 2 disks down. As the reshape progresses more and
> more
> will be 2 disks down. We don't really want that.
Well, I'm not too fussed if there is no better option.
> This case isn't really handled well at present. You want to do a
> 'recovery'
> and a 'reshape' at the same time. This is quite possible, but doesn't
> currently happen when you restart a reshape in the middle (added to my
> todo
> list).
>
> I suggest you:
> - apply the patch below to mdadm.
> - assemble the array with --update=revert-reshape. You should give
> it a --backup-file too.
> - let the reshape complete so you are back to 13 devices.
> - add a spare and let it recover
> - then add a spare and reshape the array.
>
> Of course you needed to be running a new enough kernel to be able to decrease
> the number of devices in a raid5.
I don't think I am. Mdadm 2.6.7.2 and kernel 2.6.26-2-amd64.
RE: Degraded Array
on 11.12.2010 05:29:39 by Leslie Rhorer
Well, that was painful, and more than a little odd.  As I reported
before, the system halted dead during the re-shape from 13 disks to 14 on
the RAID6 array of the main server.  The array reassembled after reboot, but
with only 12 drives.  I'm pretty sure one drive was missing because it (the
14th) wasn't in mdadm.conf, because of course I had not put it there yet.
I'm not exactly sure why the 13th wasn't assembled in the array.

Anyway, during the continued re-shape, it halted again.  I brought it back
up again, and it eventually completed the re-shape, but with hundreds of
thousands of reported inconsistencies.  I re-added the two faulted drives
one at a time, and the recovery finished both times without apparent error.

When it was done, I started looking at the file system, and it was a mess.
At one point, XFS crashed altogether.  I ran xfs_repair, and it found
numerous problems at the file system level.  Several files were lost.  I ran
a cmp between every file on the backup and on the RAID array I had just
re-shaped, and nearly every large file was corrupted.  Most small files were
intact, but a few of them were also toast.  The large files were not totally
unreadable, however.  In fact, most of the videos were mostly intact, but
with frequent video breakups, stutters, and drop-outs encountered on every
file I checked that had failed the compare.

I then ran an rsync against the corrupted file system with the --checksum
option, but it did not copy most of the files back from the backup, although
it did copy quite a few.  Weird.  Checking a few of the known bad files with
md5sum, every pair had different checksums.  I also checked a few apparently
good files, and every pair of those had matching checksums.  I ran another
cmp, piping the list of failures to a log file, and then used the list to
copy the remaining failed files back to the main array.  Finally, I did one
last cmp between the two, and every file passed except those which were
expected not to.
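
A rough bash sketch of that verify-and-restore pass, assuming the re-shaped
array is mounted at /array and the backup at /backup (both hypothetical
paths):

    # compare every file against the backup and log the ones that differ
    cd /backup
    find . -type f -print0 | while IFS= read -r -d '' f; do
        cmp -s "$f" "/array/$f" || printf '%s\n' "$f"
    done > /tmp/failed-files.txt

    # copy only the files that failed the compare back to the main array
    rsync -a --files-from=/tmp/failed-files.txt /backup/ /array/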
I have no idea what could have caused this, but given the symptoms it seems
likely the stripes on one of the drives were accidentally put in the wrong
place while the re-shape took place, or something like that.

On the up side, the arrays have never performed better.  I'm very pleased.
Running two TCP transfers at once over a 1000 Mbps Ethernet link, the
transfers topped out at over 928 Mbps.  Single TCP transfers managed better
than 800 Mbps.  Some intra-machine processes topped out at nearly 2200 Mbps.
There is no sign at all of any corruption post re-shape.