[PATCH 0/4] IMSM Checkpointing Bug Fix Series (2)

[PATCH 0/4] IMSM Checkpointing Bug Fix Series (2)

am 09.06.2011 18:29:01 von adam.kwolek

The following series contains 2 fixes for IMSM Checkpointing:
1. imsm: FIX: Disable automatic metadata rollback for broken reshape
2. imsm: FIX: Use function to obtain array layout

It enables IMSM array creation:
1. FIX: Cannot create volume
2. imsm: FIX: Cannot create volume
This fix contains 2 patches but any single patch of those 2 help.
I think both should be placed in mdadm.

Checkpointing status:
- Raid0: OK
- Raid5: Data corruption problem when stripes are restored from backup (work in progress)


BR
Adam

---

Adam Kwolek (4):
imsm: FIX: Disable automatic metadata rollback for broken reshape
imsm: FIX: Use function to obtain array layout
FIX: Cannot create volume
imsm: FIX: Cannot create volume


Create.c | 4 ++--
super-intel.c | 24 +++++++++++++++++-------
2 files changed, 19 insertions(+), 9 deletions(-)

--
Signature
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 1/4] imsm: FIX: Cannot create volume

am 09.06.2011 18:29:12 von adam.kwolek

getinfo_super_imsm_volume() clears entire 'info' structure before filling with new
information. Disk number and raid_disk can be required later by caller
but it is lost.

Restore disk number information in getinfo_super_imsm_volume() call.

Signed-off-by: Adam Kwolek
---

super-intel.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index 5c840ec..e094b85 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -2075,11 +2075,18 @@ static void getinfo_super_imsm_volume(struct supertype *st, struct mdinfo *info,
unsigned int component_size_alligment;
int map_disks = info->array.raid_disks;

- memset(info, 0, sizeof(*info));
if (prev_map)
map_to_analyse = prev_map;

dl = super->disks;
+ while (dl) {
+ if (dl->index == info->disk.number)
+ break;
+ dl = dl->next;
+ }
+ if (!dl)
+ dl = super->disks;
+ memset(info, 0, sizeof(*info));

info->container_member = super->current_vol;
info->array.raid_disks = map->num_members;
@@ -2147,6 +2154,8 @@ static void getinfo_super_imsm_volume(struct supertype *st, struct mdinfo *info,
if (dl) {
info->disk.major = dl->major;
info->disk.minor = dl->minor;
+ info->disk.number = dl->index;
+ info->disk.raid_disk = dl->index;
}

info->data_offset = __le32_to_cpu(map_to_analyse->pba_of_lba0);

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 2/4] FIX: Cannot create volume

am 09.06.2011 18:29:21 von adam.kwolek

getinfo_super() can clear entire 'inf' structure before filling with new
information. Disk number required later is lost.

Restore disk number information after getinfo_super() call.

Signed-off-by: Adam Kwolek
---

Create.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Create.c b/Create.c
index 7b4d0fe..d01dea7 100644
--- a/Create.c
+++ b/Create.c
@@ -805,7 +805,6 @@ int Create(struct supertype *st, char *mddev,
switch(pass) {
case 1:
*inf = info;
-
inf->disk.number = dnum;
inf->disk.raid_disk = dnum;
if (inf->disk.raid_disk < raiddisks)
@@ -856,12 +855,13 @@ int Create(struct supertype *st, char *mddev,
/* getinfo_super might have lost these ... */
inf->disk.major = major(stb.st_rdev);
inf->disk.minor = minor(stb.st_rdev);
+ inf->disk.number = dnum;
+ inf->disk.raid_disk = dnum;
}
break;
case 2:
inf->errors = 0;
rv = 0;
-
rv = add_disk(mdfd, st, &info, inf);

if (rv) {

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 3/4] imsm: FIX: Use function to obtain array layout

am 09.06.2011 18:29:29 von adam.kwolek

Function imsm_level_to_layout() should be use to get array layout.

Signed-off-by: Adam Kwolek
---

super-intel.c | 6 ++----
1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index e094b85..8bfe40a 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -7733,8 +7733,7 @@ int save_backup_imsm(struct supertype *st,
if (open_backup_targets(info, new_disks, targets))
goto abort;

- if (map_dest->raid_level != 0)
- dest_layout = ALGORITHM_LEFT_ASYMMETRIC;
+ dest_layout = imsm_level_to_layout(map_dest->raid_level);
dest_chunk = __le16_to_cpu(map_dest->blocks_per_strip) * 512;

if (restore_stripes(targets, /* list of dest devices */
@@ -8781,8 +8780,7 @@ static int imsm_manage_reshape(
}

max_position = sra->component_size * ndata;
- if (map_src->raid_level != 0)
- source_layout = ALGORITHM_LEFT_ASYMMETRIC;
+ source_layout = imsm_level_to_layout(map_src->raid_level);

while (__le32_to_cpu(migr_rec->curr_migr_unit) <
__le32_to_cpu(migr_rec->num_migr_units)) {

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 4/4] imsm: FIX: Disable automatic metadata rollback for broken

am 09.06.2011 18:29:39 von adam.kwolek

mdmon cannot rollback metadata changes automatically.
It can break reshape process in the way that in case of reshape break
user will not be able to deal with broken reshape due to lack of information
about reshape geometry.

mdadm (process that invokes reshape) doesn't make any rollback to allow
for user action. mdmon should not do this either unless it knows for sure
it is save. such knowledge is not available for automatic rollback.

Signed-off-by: Adam Kwolek
---

super-intel.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index 8bfe40a..5e8b834 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -5865,14 +5865,17 @@ static int imsm_set_array_state(struct active_array *a, int consistent)
} else {
if (a->last_checkpoint == 0 && a->prev_action == reshape) {
/* for some reason we aborted the reshape.
- * Better clean up
- */
+ *
+ * disable automatic metadata rollback
+ * user action is required to recover process
+ *
struct imsm_map *map2 = get_imsm_map(dev, 1);
dev->vol.migr_state = 0;
dev->vol.migr_type = 0;
dev->vol.curr_migr_unit = 0;
memcpy(map, map2, sizeof_imsm_map(map2));
super->updates_pending++;
+ */
}
if (a->last_checkpoint >= a->info.component_size) {
unsigned long long array_blocks;

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/4] imsm: FIX: Cannot create volume

am 14.06.2011 04:07:04 von NeilBrown

On Thu, 09 Jun 2011 18:29:12 +0200 Adam Kwolek wrote:

> getinfo_super_imsm_volume() clears entire 'info' structure before filling with new
> information. Disk number and raid_disk can be required later by caller
> but it is lost.
>
> Restore disk number information in getinfo_super_imsm_volume() call.

I need a better explanation than this.

When is getinfo_super_imsm or getinfo_super_imsm_volume *ever* called in a
situation where 'info' contains a meaningful value for disk.number???
I could not find it.

Alternately, explain what goes wrong with the current code (yes, I could
probably test it myself ... but if you send a patch you should justify it).

So for now this patch has not been applied.

Thanks,
NeilBrown


>
> Signed-off-by: Adam Kwolek
> ---
>
> super-intel.c | 11 ++++++++++-
> 1 files changed, 10 insertions(+), 1 deletions(-)
>
> diff --git a/super-intel.c b/super-intel.c
> index 5c840ec..e094b85 100644
> --- a/super-intel.c
> +++ b/super-intel.c
> @@ -2075,11 +2075,18 @@ static void getinfo_super_imsm_volume(struct supertype *st, struct mdinfo *info,
> unsigned int component_size_alligment;
> int map_disks = info->array.raid_disks;
>
> - memset(info, 0, sizeof(*info));
> if (prev_map)
> map_to_analyse = prev_map;
>
> dl = super->disks;
> + while (dl) {
> + if (dl->index == info->disk.number)
> + break;
> + dl = dl->next;
> + }
> + if (!dl)
> + dl = super->disks;
> + memset(info, 0, sizeof(*info));
>
> info->container_member = super->current_vol;
> info->array.raid_disks = map->num_members;
> @@ -2147,6 +2154,8 @@ static void getinfo_super_imsm_volume(struct supertype *st, struct mdinfo *info,
> if (dl) {
> info->disk.major = dl->major;
> info->disk.minor = dl->minor;
> + info->disk.number = dl->index;
> + info->disk.raid_disk = dl->index;
> }
>
> info->data_offset = __le32_to_cpu(map_to_analyse->pba_of_lba0);

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/4] FIX: Cannot create volume

am 14.06.2011 04:35:30 von NeilBrown

On Thu, 09 Jun 2011 18:29:21 +0200 Adam Kwolek wrote:

> getinfo_super() can clear entire 'inf' structure before filling with new
> information. Disk number required later is lost.
>
> Restore disk number information after getinfo_super() call.
>
> Signed-off-by: Adam Kwolek
> ---
>
> Create.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Create.c b/Create.c
> index 7b4d0fe..d01dea7 100644
> --- a/Create.c
> +++ b/Create.c
> @@ -805,7 +805,6 @@ int Create(struct supertype *st, char *mddev,
> switch(pass) {
> case 1:
> *inf = info;
> -
> inf->disk.number = dnum;
> inf->disk.raid_disk = dnum;
> if (inf->disk.raid_disk < raiddisks)
> @@ -856,12 +855,13 @@ int Create(struct supertype *st, char *mddev,
> /* getinfo_super might have lost these ... */
> inf->disk.major = major(stb.st_rdev);
> inf->disk.minor = minor(stb.st_rdev);
> + inf->disk.number = dnum;
> + inf->disk.raid_disk = dnum;
> }
> break;
> case 2:
> inf->errors = 0;
> rv = 0;
> -
> rv = add_disk(mdfd, st, &info, inf);
>
> if (rv) {

Unfortunately this is wrong too.

There should be no need to set disk.number and disk.raid_disk as the
->getinfo_super is supposed to have set them.

As the comment in mdadm.h says:

/* Extract generic details from metadata. This could be details about
* the container, or about an individual array within the container.
* The determination is made either by:
* load_super being given a 'component' string.
* validate_geometry determining what to create.
* The info includes both array information and device information.
* The particular device should be:
* The last device added by add_to_super
* The device the metadata was loaded from by load_super
* If 'map' is present, then it is an array raid_disks long
* (raid_disk must already be set and correct) and it is filled
* with 1 for slots that are thought to be active and 0 for slots which
* appear to be failed/missing.
* *info is zeroed out before data is added.
*/

In this case, ->add_to_super was recently called so it should record state so
that getinfo_super can return the correct information. That is what super0
and super1 does.

It seems that this has been wrong for a long time and my recent change to
zero the info structure showed it up.
So I'll let the patch through so as not to hold things up unnecessarily, but
I would really like you to see if you can fix add_to_super and getinfo_super
so that they follow the documented behaviour.

Thanks,
NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

RE: [PATCH 1/4] imsm: FIX: Cannot create volume

am 14.06.2011 11:21:16 von adam.kwolek

> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of NeilBrown
> Sent: Tuesday, June 14, 2011 4:07 AM
> To: Kwolek, Adam
> Cc: linux-raid@vger.kernel.org; Williams, Dan J; Ciechanowski, Ed;
> Neubauer, Wojciech
> Subject: Re: [PATCH 1/4] imsm: FIX: Cannot create volume
>
> On Thu, 09 Jun 2011 18:29:12 +0200 Adam Kwolek
> wrote:
>
> > getinfo_super_imsm_volume() clears entire 'info' structure before
> filling with new
> > information. Disk number and raid_disk can be required later by caller
> > but it is lost.
> >
> > Restore disk number information in getinfo_super_imsm_volume() call.
>
> I need a better explanation than this.
>
> When is getinfo_super_imsm or getinfo_super_imsm_volume *ever* called in
> a
> situation where 'info' contains a meaningful value for disk.number???
> I could not find it.

Is not a matter of meaningful value (it is ok), this is restore of cleared by memset() value only.
Function getinfo_super_imsm_volume() clears entire mdinfo structure now.
We should restore mdinfo->disk.raid_disk structure field information
Currently some information is set for first available disk in metadata, but it should be set to correct value.

Creation is made in 2 stages:
Pass 1: Collects disks in infos table (Create.c:789)
Pass 2: adds disks to md for array creation purposes (Create.c:873)

When during pass 1 inf->disk.raid_disk is not set correctly (or cleared /0/), during pass 2 it cannot be added by add_disk() correctly.
add_disk() thinks that all disks has to be configured for slot 0 (as it is 'clean' value),


disk.number is not important here (it is set to have identical values set in Create.c).
disk.raid_number is value that cannot be cleared by getinfo to have possibility for crresc slot number configuration.

Second patch, you partially applied (for Create.c) was some kind of "plan B" only.

Please let me know if this is clear enough for you.

BR
Adam




>
> Alternately, explain what goes wrong with the current code (yes, I could
> probably test it myself ... but if you send a patch you should justify
> it).
>
> So for now this patch has not been applied.
>
> Thanks,
> NeilBrown
>
>
> >
> > Signed-off-by: Adam Kwolek
> > ---
> >
> > super-intel.c | 11 ++++++++++-
> > 1 files changed, 10 insertions(+), 1 deletions(-)
> >
> > diff --git a/super-intel.c b/super-intel.c
> > index 5c840ec..e094b85 100644
> > --- a/super-intel.c
> > +++ b/super-intel.c
> > @@ -2075,11 +2075,18 @@ static void getinfo_super_imsm_volume(struct
> supertype *st, struct mdinfo *info,
> > unsigned int component_size_alligment;
> > int map_disks = info->array.raid_disks;
> >
> > - memset(info, 0, sizeof(*info));
> > if (prev_map)
> > map_to_analyse = prev_map;
> >
> > dl = super->disks;
> > + while (dl) {
> > + if (dl->index == info->disk.number)
> > + break;
> > + dl = dl->next;
> > + }
> > + if (!dl)
> > + dl = super->disks;
> > + memset(info, 0, sizeof(*info));
> >
> > info->container_member = super->current_vol;
> > info->array.raid_disks = map->num_members;
> > @@ -2147,6 +2154,8 @@ static void getinfo_super_imsm_volume(struct
> supertype *st, struct mdinfo *info,
> > if (dl) {
> > info->disk.major = dl->major;
> > info->disk.minor = dl->minor;
> > + info->disk.number = dl->index;
> > + info->disk.raid_disk = dl->index;
> > }
> >
> > info->data_offset = __le32_to_cpu(map_to_analyse->pba_of_lba0);
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html