[PATCH 00/22] IMSM checkpointing implementation

on 02.06.2011 16:48:08 by krzysztof.wojcik

To secure the reshape process, IMSM uses a special disk area outside the
metadata to back up the area being reshaped. If the area about to be
reshaped requires a backup, the set of array stripes prepared for reshape
is stored in the Migration Copy Area. If the reshape is interrupted, the
Option ROM (during restart) or mdadm (during reshape restart, when no
reboot occurs) restores the Migration Copy Area to the destination array.
The reshape can then be continued from a stable array state.

The following series implements the IMSM checkpointing procedure.

---

Adam Kwolek, Krzysztof Wojcik (22):
imsm: Add migration record to intel_super
Support restore_stripes() from the given buffer
Define dummy functions to mdmon.c
imsm: Add support for copy area and backup operations
imsm: check migration compatibility
FIX: Initialize reshape structure
imsm: Add wait_for_reshape_imsm() implementation
imsm: Implement imsm_manage_reshape(), reshape workhorse
imsm: Check if array degradation has been changed
imsm: Clear migration record when no migration in progress
imsm: Add information about migration record to mdadm '-E' option
imsm: update blocks_per_migr_unit() to support migration record
Add reshape restart support for external metadata
imsm: Implement recover_backup_imsm() for imsm metadata
imsm: Disable checkpoint updating by mdmon for general migration
imsm: Add metadata update type for general migration check-pointing
imsm: Prepare checkpoint update for general migration
imsm: Apply checkpoint metadata update for general migration
FIX: Enable metadata updates for raid0
Do not use backup file for external metadata
imsm: Remove user warning before reshape start
imsm: Unit Tests - remove backup-file during grow command


Assemble.c | 10
Grow.c | 50 +-
mdadm.h | 7
mdmon.c | 23 +
restripe.c | 101 +++-
super-intel.c | 1241 ++++++++++++++++++++++++++++++++++++++++++++--
tests/imsm-grow-template | 5
7 files changed, 1322 insertions(+), 115 deletions(-)

--
Krzysztof Wojcik
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 02/22] Support restore_stripes() from the given buffer

on 02.06.2011 16:48:26 by krzysztof.wojcik

From: Adam Kwolek

For external metadata, the backup location and saving method depend on
metadata-specific implementation details. Currently restore_stripes()
can restore data only from the given backup file handles, and it is
used only for assembling partially reshaped arrays.
As this function will be very helpful for the external metadata backup
mechanism, add support for restoring data from a given source buffer.
Also allow save_stripes() to work without destination targets:
save_stripes() can now prepare data for restore_stripes() only.
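
As a rough illustration of the buffer-or-file choice described above, the
sketch below fetches one chunk either from an in-memory buffer or from a
file descriptor. read_chunk() is a hypothetical helper written for this
note, not a function in mdadm; it only mirrors the src_buf == NULL test
that the patch adds to restore_stripes().

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of mdadm): copy one chunk into dst.
 * If src_buf is non-NULL the chunk comes from memory at read_offset,
 * otherwise it is read from the file descriptor 'source'.
 * Returns 0 on success, -1 on failure. */
static int read_chunk(int source, const char *src_buf,
		      unsigned long long read_offset,
		      char *dst, size_t chunk_size)
{
	assert(dst != NULL);

	if (src_buf != NULL) {
		/* read from input buffer */
		memcpy(dst, src_buf + read_offset, chunk_size);
		return 0;
	}
	/* read from file */
	if (lseek(source, (off_t)read_offset, SEEK_SET) != (off_t)read_offset)
		return -1;
	if (read(source, dst, chunk_size) != (ssize_t)chunk_size)
		return -1;
	return 0;
}
```

A caller that passes NULL for src_buf keeps the old file-based behaviour,
which is why the existing Grow_restart() call sites only gain a trailing
NULL argument.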

Signed-off-by: Maciej Trela
Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
Grow.c | 4 +-
mdadm.h | 3 +-
restripe.c | 101 +++++++++++++++++++++++++++++++++++++++++++-----------------
3 files changed, 77 insertions(+), 31 deletions(-)

diff --git a/Grow.c b/Grow.c
index 62622bd..7a8ffdb 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3380,7 +3380,7 @@ int Grow_restart(struct supertype *st, struct mdinfo *info, int *fdlist, int cnt
info->new_layout,
fd, __le64_to_cpu(bsb.devstart)*512,
__le64_to_cpu(bsb.arraystart)*512,
- __le64_to_cpu(bsb.length)*512)) {
+ __le64_to_cpu(bsb.length)*512, NULL)) {
/* didn't succeed, so giveup */
if (verbose)
fprintf(stderr, Name ": Error restoring backup from %s\n",
@@ -3397,7 +3397,7 @@ int Grow_restart(struct supertype *st, struct mdinfo *info, int *fdlist, int cnt
fd, __le64_to_cpu(bsb.devstart)*512 +
__le64_to_cpu(bsb.devstart2)*512,
__le64_to_cpu(bsb.arraystart2)*512,
- __le64_to_cpu(bsb.length2)*512)) {
+ __le64_to_cpu(bsb.length2)*512, NULL)) {
/* didn't succeed, so giveup */
if (verbose)
fprintf(stderr, Name ": Error restoring second backup from %s\n",
diff --git a/mdadm.h b/mdadm.h
index 9437d04..e97cab0 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -507,7 +507,8 @@ extern int save_stripes(int *source, unsigned long long *offsets,
extern int restore_stripes(int *dest, unsigned long long *offsets,
int raid_disks, int chunk_size, int level, int layout,
int source, unsigned long long read_offset,
- unsigned long long start, unsigned long long length);
+ unsigned long long start, unsigned long long length,
+ char *src_buf);

#ifndef Sendmail
#define Sendmail "/usr/lib/sendmail -t"
diff --git a/restripe.c b/restripe.c
index 63a2a64..648fe78 100644
--- a/restripe.c
+++ b/restripe.c
@@ -467,16 +467,35 @@ int raid6_check_disks(int data_disks, int start, int chunk_size,
return curr_broken_disk;
}

-/* Save data:
- * We are given:
- * A list of 'fds' of the active disks. Some may be absent.
- * A geometry: raid_disks, chunk_size, level, layout
- * A list of 'fds' for mirrored targets. They are already seeked to
- * right (Write) location
- * A start and length which must be stripe-aligned
- * 'buf' is large enough to hold one stripe, and is aligned
- */
-
+/*******************************************************************************
+ * Function: save_stripes
+ * Description:
+ * Function reads data (only data without P and Q) from array and writes
+ * it to buf and optionally to backup files
+ * Parameters:
+ * source : A list of 'fds' of the active disks.
+ * Some may be absent
+ * offsets : A list of offsets on disk belonging
+ * to the array [bytes]
+ * raid_disks : geometry: number of disks in the array
+ * chunk_size : geometry: chunk size [bytes]
+ * level : geometry: RAID level
+ * layout : geometry: layout
+ * nwrites : number of backup files
+ * dest : A list of 'fds' for mirrored targets
+ * (e.g. backup files). They are already seeked to right
+ * (write) location. If NULL, data will be written
+ * to the buf only
+ * start : start address of data to read (must be stripe-aligned)
+ * [bytes]
+ * length : length of data to read (must be stripe-aligned)
+ * [bytes]
+ * buf : buffer for data. It is large enough to hold
+ * one stripe. It is stripe aligned
+ * Returns:
+ * 0 : success
+ * -1 : fail
+ ******************************************************************************/
int save_stripes(int *source, unsigned long long *offsets,
int raid_disks, int chunk_size, int level, int layout,
int nwrites, int *dest,
@@ -487,6 +506,7 @@ int save_stripes(int *source, unsigned long long *offsets,
int data_disks = raid_disks - (level == 0 ? 0 : level <=5 ? 1 : 2);
int disk;
int i;
+ unsigned long long length_test;

if (!tables_ready)
make_tables();
@@ -501,6 +521,18 @@ int save_stripes(int *source, unsigned long long *offsets,
}

len = data_disks * chunk_size;
+ length_test = length / len;
+ length_test *= len;
+
+ if (length != length_test) {
+ dprintf("Error: save_stripes(): Data are not alligned. EXIT\n");
+ dprintf("\tArea for saving stripes (length) = %llu\n", length);
+ dprintf("\tWork step (len) = %i\n", len);
+ dprintf("\tExpected save area (length_test) = %llu\n",
+ length_test);
+ abort();
+ }
+
while (length > 0) {
int failed = 0;
int fdisk[3], fblock[3];
@@ -620,11 +652,10 @@ int save_stripes(int *source, unsigned long long *offsets,
fdisk[0], fdisk[1], bufs);
}
}
-
- for (i=0; i<nwrites; i++)
- if (write(dest[i], buf, len) != len)
- return -1;
-
+ if (dest)
+ for (i = 0; i < nwrites; i++)
+ if (write(dest[i], buf, len) != len)
+ return -1;
length -= len;
start += len;
}
@@ -645,7 +676,8 @@ int save_stripes(int *source, unsigned long long *offsets,
int restore_stripes(int *dest, unsigned long long *offsets,
int raid_disks, int chunk_size, int level, int layout,
int source, unsigned long long read_offset,
- unsigned long long start, unsigned long long length)
+ unsigned long long start, unsigned long long length,
+ char *src_buf)
{
char *stripe_buf;
char **stripes = malloc(raid_disks * sizeof(char*));
@@ -668,13 +700,17 @@ int restore_stripes(int *dest, unsigned long long *offsets,

if (stripe_buf == NULL || stripes == NULL || blocks == NULL
|| zero == NULL) {
- free(stripe_buf);
- free(stripes);
- free(blocks);
- free(zero);
+ if (stripe_buf != NULL)
+ free(stripe_buf);
+ if (stripes != NULL)
+ free(stripes);
+ if (blocks != NULL)
+ free(blocks);
+ if (zero != NULL)
+ free(zero);
return -2;
}
- for (i=0; i<raid_disks; i++)
+ for (i = 0; i < raid_disks; i++)
stripes[i] = stripe_buf + i * chunk_size;
while (length > 0) {
unsigned int len = data_disks * chunk_size;
@@ -683,15 +719,24 @@ int restore_stripes(int *dest, unsigned long long *offsets,
int syndrome_disks;
if (length < len)
return -3;
- for (i=0; i < data_disks; i++) {
+ for (i = 0; i < data_disks; i++) {
int disk = geo_map(i, start/chunk_size/data_disks,
raid_disks, level, layout);
- if ((unsigned long long)lseek64(source, read_offset, 0)
- != read_offset)
- return -1;
- if (read(source, stripes[disk],
- chunk_size) != chunk_size)
- return -1;
+ if (src_buf == NULL) {
+ /* read from file */
+ if (lseek64(source,
+ read_offset, 0) != (off64_t)read_offset)
+ return -1;
+ if (read(source,
+ stripes[disk],
+ chunk_size) != chunk_size)
+ return -1;
+ } else {
+ /* read from input buffer */
+ memcpy(stripes[disk],
+ src_buf + read_offset,
+ chunk_size);
+ }
read_offset += chunk_size;
}
/* We have the data, now do the parity */

[PATCH 03/22] Define dummy functions to mdmon.c

on 02.06.2011 16:48:34 by krzysztof.wojcik

From: Adam Kwolek

These definitions are necessary to compile mdmon.
Metadata-specific source code is compiled into mdmon.
The functions used for reshape checkpointing:
- restore_stripes()
- save_stripes()
- abort_reshape()
are not used by mdmon, but the code that calls them is compiled into it.
To allow mdmon to build, dummy functions are defined.

Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
mdmon.c | 23 +++++++++++++++++++++++
1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/mdmon.c b/mdmon.c
index a51a94f..d633cb0 100644
--- a/mdmon.c
+++ b/mdmon.c
@@ -527,3 +527,26 @@ int child_monitor(int afd, struct mdinfo *sra, struct reshape *reshape,
{
return 0;
}
+
+int restore_stripes(int *dest, unsigned long long *offsets,
+ int raid_disks, int chunk_size, int level, int layout,
+ int source, unsigned long long read_offset,
+ unsigned long long start, unsigned long long length,
+ char *src_buf)
+{
+ return 1;
+}
+
+void abort_reshape(struct mdinfo *sra)
+{
+ return;
+}
+
+int save_stripes(int *source, unsigned long long *offsets,
+ int raid_disks, int chunk_size, int level, int layout,
+ int nwrites, int *dest,
+ unsigned long long start, unsigned long long length,
+ char *buf)
+{
+ return 0;
+}

[PATCH 04/22] imsm: Add support for copy area and backup operations

on 02.06.2011 16:48:43 by krzysztof.wojcik

From: Adam Kwolek

This patch adds methods for manipulating the migration record:
init_migr_record_imsm() - initializes the migration record at the
beginning of the reshape process.
write_imsm_migr_rec() - saves the migration record to the array.
The migration record is stored on the first 2 disks in the array only.
save_backup_imsm() - saves critical data stripes to the Migration Copy
Area and updates the current migration unit status.
Uses restore_stripes() to form a destination stripe and to write it
to the Migration Copy Area.
save_checkpoint_imsm() - updates the current unit status in the
migration record.

The migration record is written to the first 2 array disks only (similar
to the read operation).
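
A small sketch of where the record lands, based on the lseek64() and
write() calls in write_imsm_migr_rec() below: only slots 0 and 1 are
written, and the record occupies the last 512 bytes of the device.
holds_migr_rec(), migr_rec_offset() and MIGR_REC_SIZE are hypothetical
names introduced here for illustration only.

```c
#include <assert.h>

/* Record size in bytes, matching the 512-byte write in the patch.
 * The macro name is an assumption made for this sketch. */
#define MIGR_REC_SIZE 512ULL

/* Hypothetical helper: true only for the first two array slots. */
static int holds_migr_rec(int slot_index)
{
	return slot_index == 0 || slot_index == 1;
}

/* Hypothetical helper: byte offset of the record on a device of
 * dsize bytes, i.e. the start of its last 512 bytes. */
static unsigned long long migr_rec_offset(unsigned long long dsize)
{
	assert(dsize >= MIGR_REC_SIZE);
	return dsize - MIGR_REC_SIZE;
}
```

Spares and failed disks carry a negative index in the disk list, so the
slot test also skips them.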

Signed-off-by: Maciej Trela
Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
super-intel.c | 279 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 279 insertions(+), 0 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index ffc41e7..1f7d008 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -1885,6 +1885,57 @@ out:
return retval;
}

+/*******************************************************************************
+ * Function: write_imsm_migr_rec
+ * Description: Function writes imsm migration record
+ * (at the last sector of disk)
+ * Parameters:
+ * super : imsm internal array info
+ * Returns:
+ * 0 : success
+ * -1 : if fail
+ ******************************************************************************/
+static int write_imsm_migr_rec(struct supertype *st)
+{
+ struct intel_super *super = st->sb;
+ unsigned long long dsize;
+ char nm[30];
+ int fd = -1;
+ int retval = -1;
+ struct dl *sd;
+
+ for (sd = super->disks ; sd ; sd = sd->next) {
+ /* write to 2 first slots only */
+ if ((sd->index < 0) || (sd->index > 1))
+ continue;
+ sprintf(nm, "%d:%d", sd->major, sd->minor);
+ fd = dev_open(nm, O_RDWR);
+ if (fd < 0)
+ continue;
+ get_dev_size(fd, NULL, &dsize);
+ if (lseek64(fd, dsize - 512, SEEK_SET) < 0) {
+ fprintf(stderr,
+ Name ": Cannot seek to anchor block: %s\n",
+ strerror(errno));
+ goto out;
+ }
+ if (write(fd, super->migr_rec_buf, 512) != 512) {
+ fprintf(stderr,
+ Name ": Cannot write migr record block: %s\n",
+ strerror(errno));
+ goto out;
+ }
+ close(fd);
+ fd = -1;
+ }
+
+ retval = 0;
+ out:
+ if (fd >= 0)
+ close(fd);
+ return retval;
+}
+
static void getinfo_super_imsm_volume(struct supertype *st, struct mdinfo *info, char *dmap)
{
struct intel_super *super = st->sb;
@@ -7262,6 +7313,234 @@ static void imsm_delete(struct intel_super *super, struct dl **dlp, unsigned ind
}
}

+/*******************************************************************************
+ * Function: open_backup_targets
+ * Description: Function opens file descriptors for all devices given in
+ * info->devs
+ * Parameters:
+ * info : general array info
+ * raid_disks : number of disks
+ * raid_fds : table of device's file descriptors
+ * Returns:
+ * 0 : success
+ * -1 : fail
+ ******************************************************************************/
+int open_backup_targets(struct mdinfo *info, int raid_disks, int *raid_fds)
+{
+ struct mdinfo *sd;
+
+ for (sd = info->devs ; sd ; sd = sd->next) {
+ if (sd->disk.state & (1<<MD_DISK_FAULTY)) {
+ dprintf("disk is faulty!!\n");
+ continue;
+ }
+
+ if ((sd->disk.raid_disk >= raid_disks) ||
+ (sd->disk.raid_disk < 0)) {
+ raid_fds[sd->disk.raid_disk] = -1;
+ continue;
+ }
+ char *dn = map_dev(sd->disk.major,
+ sd->disk.minor, 1);
+ raid_fds[sd->disk.raid_disk] = dev_open(dn, O_RDWR);
+ if (raid_fds[sd->disk.raid_disk] < 0) {
+ fprintf(stderr, "cannot open component\n");
+ return -1;
+ }
+ }
+ return 0;
+}
+
+/*******************************************************************************
+ * Function: init_migr_record_imsm
+ * Description: Function inits imsm migration record
+ * Parameters:
+ * super : imsm internal array info
+ * dev : device under migration
+ * info : general array info to find the smallest device
+ * Returns:
+ * none
+ ******************************************************************************/
+void init_migr_record_imsm(struct supertype *st, struct imsm_dev *dev,
+ struct mdinfo *info)
+{
+ struct intel_super *super = st->sb;
+ struct migr_record *migr_rec = super->migr_rec;
+ int new_data_disks;
+ unsigned long long dsize, dev_sectors;
+ long long unsigned min_dev_sectors = -1LLU;
+ struct mdinfo *sd;
+ char nm[30];
+ int fd;
+ struct imsm_map *map_dest = get_imsm_map(dev, 0);
+ struct imsm_map *map_src = get_imsm_map(dev, 1);
+ unsigned long long num_migr_units;
+
+ unsigned long long array_blocks =
+ (((unsigned long long)__le32_to_cpu(dev->size_high)) << 32) +
+ __le32_to_cpu(dev->size_low);
+
+ memset(migr_rec, 0, sizeof(struct migr_record));
+ migr_rec->family_num = __cpu_to_le32(super->anchor->family_num);
+
+ /* only ascending reshape supported now */
+ migr_rec->ascending_migr = __cpu_to_le32(1);
+
+ migr_rec->dest_depth_per_unit = GEN_MIGR_AREA_SIZE /
+ max(map_dest->blocks_per_strip, map_src->blocks_per_strip);
+ migr_rec->dest_depth_per_unit *= map_dest->blocks_per_strip;
+ new_data_disks = imsm_num_data_members(dev, 0);
+ migr_rec->blocks_per_unit =
+ __cpu_to_le32(migr_rec->dest_depth_per_unit * new_data_disks);
+ migr_rec->dest_depth_per_unit =
+ __cpu_to_le32(migr_rec->dest_depth_per_unit);
+
+ num_migr_units =
+ array_blocks / __le32_to_cpu(migr_rec->blocks_per_unit);
+
+ if (array_blocks % __le32_to_cpu(migr_rec->blocks_per_unit))
+ num_migr_units++;
+ migr_rec->num_migr_units = __cpu_to_le32(num_migr_units);
+
+ migr_rec->post_migr_vol_cap = dev->size_low;
+ migr_rec->post_migr_vol_cap_hi = dev->size_high;
+
+
+ /* Find the smallest dev */
+ for (sd = info->devs ; sd ; sd = sd->next) {
+ sprintf(nm, "%d:%d", sd->disk.major, sd->disk.minor);
+ fd = dev_open(nm, O_RDONLY);
+ if (fd < 0)
+ continue;
+ get_dev_size(fd, NULL, &dsize);
+ dev_sectors = dsize / 512;
+ if (dev_sectors < min_dev_sectors)
+ min_dev_sectors = dev_sectors;
+ close(fd);
+ }
+ migr_rec->ckpt_area_pba = __cpu_to_le32(min_dev_sectors -
+ RAID_DISK_RESERVED_BLOCKS_IMSM_HI);
+
+ write_imsm_migr_rec(st);
+
+ return;
+}
+
+/*******************************************************************************
+ * Function: save_backup_imsm
+ * Description: Function saves critical data stripes to Migration Copy Area
+ * and updates the current migration unit status.
+ * Use restore_stripes() to form a destination stripe,
+ * and to write it to the Copy Area.
+ * Parameters:
+ * st : supertype information
+ * info : general array info
+ * buf : input buffer
+ * write_offset : address of data to backup
+ * length : length of data to backup (blocks_per_unit)
+ * Returns:
+ * 0 : success
+ * -1 : fail
+ ******************************************************************************/
+int save_backup_imsm(struct supertype *st,
+ struct imsm_dev *dev,
+ struct mdinfo *info,
+ void *buf,
+ int new_data,
+ int length)
+{
+ int rv = -1;
+ struct intel_super *super = st->sb;
+ unsigned long long *target_offsets = NULL;
+ int *targets = NULL;
+ int i;
+ struct imsm_map *map_dest = get_imsm_map(dev, 0);
+ int new_disks = map_dest->num_members;
+
+ targets = malloc(new_disks * sizeof(int));
+ if (!targets)
+ goto abort;
+
+ target_offsets = malloc(new_disks * sizeof(unsigned long long));
+ if (!target_offsets)
+ goto abort;
+
+ for (i = 0; i < new_disks; i++) {
+ targets[i] = -1;
+ target_offsets[i] = (unsigned long long)
+ __le32_to_cpu(super->migr_rec->ckpt_area_pba) * 512;
+ }
+
+ if (open_backup_targets(info, new_disks, targets))
+ goto abort;
+
+ if (restore_stripes(targets, /* list of dest devices */
+ target_offsets, /* migration record offsets */
+ new_disks,
+ info->new_chunk,
+ info->new_level,
+ info->new_layout,
+ 0, /* source backup file descriptor */
+ 0, /* input buf offset
+ * always 0 buf is already offseted */
+ 0,
+ length,
+ buf) != 0) {
+ fprintf(stderr, Name ": Error restoring stripes\n");
+ goto abort;
+ }
+
+ rv = 0;
+
+abort:
+ if (targets) {
+ for (i = 0; i < new_disks; i++)
+ if (targets[i] >= 0)
+ close(targets[i]);
+ free(targets);
+ }
+ if (target_offsets)
+ free(target_offsets);
+
+ return rv;
+}
+
+/*******************************************************************************
+ * Function: save_checkpoint_imsm
+ * Description: Function called for current unit status update
+ * in the migration record. It writes it to disk.
+ * Parameters:
+ * super : imsm internal array info
+ * info : general array info
+ * Returns:
+ * 0: success
+ * 1: failure
+ ******************************************************************************/
+int save_checkpoint_imsm(struct supertype *st, struct mdinfo *info, int state)
+{
+ struct intel_super *super = st->sb;
+ load_imsm_migr_rec(super, info);
+ if (__le32_to_cpu(super->migr_rec->blocks_per_unit) == 0) {
+ dprintf("ERROR: blocks_per_unit = 0!!!\n");
+ return 1;
+ }
+
+ super->migr_rec->curr_migr_unit =
+ __cpu_to_le32(info->reshape_progress /
+ __le32_to_cpu(super->migr_rec->blocks_per_unit));
+ super->migr_rec->rec_status = __cpu_to_le32(state);
+ super->migr_rec->dest_1st_member_lba =
+ __cpu_to_le32((__le32_to_cpu(super->migr_rec->curr_migr_unit ))
+ * __le32_to_cpu(super->migr_rec->dest_depth_per_unit));
+ if (write_imsm_migr_rec(st) < 0) {
+ dprintf("imsm: Cannot write migration record "
+ "outside backup area\n");
+ return 1;
+ }
+
+ return 0;
+}
+
static char disk_by_path[] = "/dev/disk/by-path/";

static const char *imsm_get_disk_controller_domain(const char *path)

[PATCH 05/22] imsm: check migration compatibility

on 02.06.2011 16:48:51 by krzysztof.wojcik

From: Adam Kwolek

Under Windows, IMSM can reshape arrays in 2 directions
(ascending and descending).
Under Linux, only one direction (ascending) is supported at the moment.
Block metadata loading when a descending reshape is detected.

Windows also uses an optimization area while reshaping an array.
Linux does not support it.
This patch blocks that case as well.

Signed-off-by: Maciej Trela
Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
super-intel.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index 1f7d008..31fae1e 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -2846,6 +2846,44 @@ struct bbm_log *__get_imsm_bbm_log(struct imsm_super *mpb)
return ptr;
}

+/*******************************************************************************
+ * Function: check_mpb_migr_compatibility
+ * Description: Function checks for unsupported migration features:
+ * - migration optimization area (pba_of_lba0)
+ * - descending reshape (ascending_migr)
+ * Parameters:
+ * super : imsm metadata information
+ * Returns:
+ * 0 : migration is compatible
+ * -1 : migration is not compatible
+ ******************************************************************************/
+int check_mpb_migr_compatibility(struct intel_super *super)
+{
+ struct imsm_map *map0, *map1;
+ struct migr_record *migr_rec = super->migr_rec;
+ int i;
+
+ for (i = 0; i < super->anchor->num_raid_devs; i++) {
+ struct imsm_dev *dev_iter = __get_imsm_dev(super->anchor, i);
+
+ if (dev_iter &&
+ dev_iter->vol.migr_state == 1 &&
+ dev_iter->vol.migr_type == MIGR_GEN_MIGR) {
+ /* This device is migrating */
+ map0 = get_imsm_map(dev_iter, 0);
+ map1 = get_imsm_map(dev_iter, 1);
+ if (map0->pba_of_lba0 != map1->pba_of_lba0)
+ /* migration optimization area was used */
+ return -1;
+ if (migr_rec->ascending_migr == 0
+ && migr_rec->dest_depth_per_unit > 0)
+ /* descending reshape not supported yet */
+ return -1;
+ }
+ }
+ return 0;
+}
+
static void __free_imsm(struct intel_super *super, int free_disks);

/* load_imsm_mpb - read matrix metadata
@@ -3575,6 +3613,19 @@ static int load_super_imsm_all(struct supertype *st, int fd, void **sbp,
err = 4;
goto error;
}
+
+ /* Check migration compatibility */
+ if (check_mpb_migr_compatibility(super) != 0) {
+ fprintf(stderr, Name ": Unsupported migration detected");
+ if (devname)
+ fprintf(stderr, " on %s\n", devname);
+ else
+ fprintf(stderr, " (IMSM).\n");
+
+ err = 5;
+ goto error;
+ }
+
err = 0;

error:
@@ -3657,6 +3708,16 @@ static int load_super_imsm(struct supertype *st, int fd, char *devname)
/* load migration record */
load_imsm_migr_rec(super, NULL);

+ /* Check for unsupported migration features */
+ if (check_mpb_migr_compatibility(super) != 0) {
+ fprintf(stderr, Name ": Unsupported migration detected");
+ if (devname)
+ fprintf(stderr, " on %s\n", devname);
+ else
+ fprintf(stderr, " (IMSM).\n");
+ return 3;
+ }
+
return 0;
}


[PATCH 06/22] FIX: Initialize reshape structure

on 02.06.2011 16:49:00 by krzysztof.wojcik

From: Adam Kwolek

The reshape structure can contain random values.
Because of this, analyse_change() can refuse to start a grow operation
without a real cause (e.g. a check of an uninitialized new_chunk).

Signed-off-by: Adam Kwolek
---
Grow.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/Grow.c b/Grow.c
index 7a8ffdb..11b2214 100644
--- a/Grow.c
+++ b/Grow.c
@@ -1745,6 +1745,7 @@ static int reshape_array(char *container, int fd, char *devname,
info->component_size = array_size / array.raid_disks;
}

+ memset(&reshape, 0, sizeof(reshape));
if (info->reshape_active) {
int new_level = info->new_level;
info->new_level = UnSet;

[PATCH 07/22] imsm: Add wait_for_reshape_imsm() implementation

on 02.06.2011 16:49:08 by krzysztof.wojcik

From: Adam Kwolek

After each checkpoint, mdadm should set the new area to reshape and wait
until md finishes the reshape. The function wait_for_reshape_imsm() sets
the new reshape range and waits for job completion.

Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
super-intel.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index 31fae1e..c395a48 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -8248,6 +8248,67 @@ exit_imsm_reshape_super:
return ret_val;
}

+/*******************************************************************************
+ * Function: wait_for_reshape_imsm
+ * Description: Function writes new sync_max value and waits until
+ * reshape process reaches new position
+ * Parameters:
+ * sra : general array info
+ * to_complete : new sync_max position
+ * ndata : number of disks in new array's layout
+ * Returns:
+ * 0 : success,
+ * 1 : there is no reshape in progress,
+ * -1 : fail
+ ******************************************************************************/
+int wait_for_reshape_imsm(struct mdinfo *sra, unsigned long long to_complete,
+ int ndata)
+{
+ int fd = sysfs_get_fd(sra, NULL, "reshape_position");
+ unsigned long long completed;
+
+ struct timeval timeout;
+
+ if (fd < 0)
+ return 1;
+
+ sysfs_fd_get_ll(fd, &completed);
+
+ if (to_complete == 0) {/* reshape till the end of array */
+ sysfs_set_str(sra, NULL, "sync_max", "max");
+ to_complete = MaxSector;
+ } else {
+ if (completed > to_complete)
+ return -1;
+ if (sysfs_set_num(sra, NULL, "sync_max",
+ to_complete / ndata) != 0) {
+ close(fd);
+ return -1;
+ }
+ }
+
+ timeout.tv_sec = 0;
+ timeout.tv_usec = 500000;
+ do {
+ char action[20];
+ fd_set rfds;
+ FD_ZERO(&rfds);
+ FD_SET(fd, &rfds);
+ select(fd+1, NULL, NULL, &rfds, &timeout);
+ if (sysfs_fd_get_ll(fd, &completed) < 0) {
+ close(fd);
+ return 1;
+ }
+ if (sysfs_get_str(sra, NULL, "sync_action",
+ action, 20) > 0 &&
+ strncmp(action, "reshape", 7) != 0)
+ continue;
+ } while (completed < to_complete);
+ close(fd);
+ return 0;
+
+}
+
static int imsm_manage_reshape(
int afd, struct mdinfo *sra, struct reshape *reshape,
struct supertype *st, unsigned long stripes,

[PATCH 08/22] imsm: Implement imsm_manage_reshape(), reshape workhorse

on 02.06.2011 16:49:17 by krzysztof.wojcik

From: Adam Kwolek

Before the reshape is started, mdadm should check again that only one
array (in the container) is under reshape. The function then "divides"
the array into "migration units" that fit in the migration copy area and
enters the main loop. It checks whether the current "migration unit"
needs to be backed up. If necessary, mdadm saves it to the copy area and
updates the migration record. The MD driver is then directed to perform a
reshape step (of "migration unit" size) and the checkpoint is moved
forward. The reshape proceeds in this way until the end of the array.
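
The "division" into migration units can be sketched as simple integer
arithmetic. The two helpers below are hypothetical (they do not exist in
mdadm under these names); they restate the rounding-up used in
init_migr_record_imsm() and the clamping of the last step used in the
main loop, with all sizes in blocks and endianness conversion left out.

```c
#include <assert.h>

/* Hypothetical helper: number of migration units needed to cover
 * array_blocks, rounding up when the last unit is partial. */
static unsigned long long count_migr_units(unsigned long long array_blocks,
					   unsigned int blocks_per_unit)
{
	unsigned long long units;

	assert(blocks_per_unit != 0);
	units = array_blocks / blocks_per_unit;
	if (array_blocks % blocks_per_unit)
		units++;
	return units;
}

/* Hypothetical helper: length in blocks of the step for unit curr_unit.
 * Every step is blocks_per_unit long except possibly the last one,
 * which is cut off at max_position (the array size in blocks). */
static unsigned long long step_blocks(unsigned int curr_unit,
				      unsigned int blocks_per_unit,
				      unsigned long long max_position)
{
	unsigned long long current_position =
		(unsigned long long)blocks_per_unit * curr_unit;
	unsigned long long next_step = blocks_per_unit;

	if (current_position + next_step > max_position)
		next_step = max_position - current_position;
	return next_step;
}
```

For example, a 1000-block array with 256 blocks per unit needs 4 units,
and the last unit covers only the remaining 232 blocks.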

Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
mdadm.h | 1
super-intel.c | 210 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 206 insertions(+), 5 deletions(-)

diff --git a/mdadm.h b/mdadm.h
index e97cab0..a1ae2b9 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1189,6 +1189,7 @@ extern int child_monitor(int afd, struct mdinfo *sra, struct reshape *reshape,
struct supertype *st, unsigned long stripes,
int *fds, unsigned long long *offsets,
int dests, int *destfd, unsigned long long *destoffsets);
+void abort_reshape(struct mdinfo *sra);

extern char *devnum2devname(int num);
extern void fmt_devname(char *name, int num);
diff --git a/super-intel.c b/super-intel.c
index c395a48..b4038d3 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -8309,16 +8309,216 @@ int wait_for_reshape_imsm(struct mdinfo *sra, unsigned long long to_complete,

}

+/*******************************************************************************
+ * Function: imsm_manage_reshape
+ * Description: Function finds array under reshape and it manages reshape
+ * process. It creates stripes backups (if required) and sets
+ * checkpoints.
+ * Parameters:
+ * afd : Backup handle (native) - not used
+ * sra : general array info
+ * reshape : reshape parameters - not used
+ * st : supertype structure
+ * blocks : size of critical section [blocks]
+ * fds : table of source device descriptor
+ * offsets : start of array (offset per devices)
+ * dests : not used
+ * destfd : table of destination device descriptor
+ * destoffsets : table of destination offsets (per device)
+ * Returns:
+ * 1 : success, reshape is done
+ * 0 : fail
+ ******************************************************************************/
static int imsm_manage_reshape(
int afd, struct mdinfo *sra, struct reshape *reshape,
- struct supertype *st, unsigned long stripes,
+ struct supertype *st, unsigned long backup_blocks,
int *fds, unsigned long long *offsets,
int dests, int *destfd, unsigned long long *destoffsets)
{
- /* Just use child_monitor for now */
- return child_monitor(
- afd, sra, reshape, st, stripes,
- fds, offsets, dests, destfd, destoffsets);
+ int ret_val = 0;
+ struct intel_super *super = st->sb;
+ struct intel_dev *dv = NULL;
+ struct imsm_dev *dev = NULL;
+ struct imsm_map *map_src, *map_dest;
+ int migr_vol_qan = 0;
+ int ndata, odata; /* [bytes] */
+ int chunk; /* [bytes] */
+ struct migr_record *migr_rec;
+ char *buf = NULL;
+ unsigned int buf_size; /* [bytes] */
+ unsigned long long max_position; /* array size [bytes] */
+ unsigned long long next_step; /* [blocks]/[bytes] */
+ unsigned long long old_data_stripe_length;
+ unsigned long long new_data_stripe_length;
+ unsigned long long start_src; /* [bytes] */
+ unsigned long long start; /* [bytes] */
+ unsigned long long start_buf_shift; /* [bytes] */
+
+ if (!fds || !offsets || !destfd || !destoffsets || !sra)
+ goto abort;
+
+ /* Find volume during the reshape */
+ for (dv = super->devlist; dv; dv = dv->next) {
+ if (dv->dev->vol.migr_type == MIGR_GEN_MIGR
+ && dv->dev->vol.migr_state == 1) {
+ dev = dv->dev;
+ migr_vol_qan++;
+ }
+ }
+ /* Only one volume can migrate at the same time */
+ if (migr_vol_qan != 1) {
+ fprintf(stderr, Name " : %s", migr_vol_qan ?
+ "Number of migrating volumes greater than 1\n" :
+ "There is no volume during migrationg\n");
+ goto abort;
+ }
+
+ map_src = get_imsm_map(dev, 1);
+ if (map_src == NULL)
+ goto abort;
+ map_dest = get_imsm_map(dev, 0);
+
+ ndata = imsm_num_data_members(dev, 0);
+ odata = imsm_num_data_members(dev, 1);
+
+ chunk = map_src->blocks_per_strip * 512;
+ old_data_stripe_length = odata * chunk;
+
+ migr_rec = super->migr_rec;
+
+ /* [bytes] */
+ sra->new_chunk = __le16_to_cpu(map_dest->blocks_per_strip) * 512;
+ sra->new_level = map_dest->raid_level;
+ new_data_stripe_length = sra->new_chunk * ndata;
+
+ /* initialize migration record for start condition */
+ if (sra->reshape_progress == 0)
+ init_migr_record_imsm(st, dev, sra);
+
+ /* size for data */
+ buf_size = __le32_to_cpu(migr_rec->blocks_per_unit) * 512;
+ /* extend buffer size for parity disk */
+ buf_size += __le32_to_cpu(migr_rec->dest_depth_per_unit) * 512;
+ /* add space for stripe alignment */
+ buf_size += old_data_stripe_length;
+ if (posix_memalign((void **)&buf, 4096, buf_size)) {
+ dprintf("imsm: Cannot allocate checkpoint buffer\n");
+ goto abort;
+ }
+
+ max_position =
+ __le32_to_cpu(migr_rec->post_migr_vol_cap) +
+ ((unsigned long long)__le32_to_cpu(
+ migr_rec->post_migr_vol_cap_hi) << 32);
+
+ while (__le32_to_cpu(migr_rec->curr_migr_unit) <
+ __le32_to_cpu(migr_rec->num_migr_units)) {
+ /* current reshape position [blocks] */
+ unsigned long long current_position =
+ __le32_to_cpu(migr_rec->blocks_per_unit)
+ * __le32_to_cpu(migr_rec->curr_migr_unit);
+ unsigned long long border;
+
+ next_step = __le32_to_cpu(migr_rec->blocks_per_unit);
+
+ if ((current_position + next_step) > max_position)
+ next_step = max_position - current_position;
+
+ start = (map_src->pba_of_lba0 + dev->reserved_blocks +
+ current_position) * 512;
+
+ /* align reading start to old geometry */
+ start_buf_shift = start % old_data_stripe_length;
+ start_src = start - start_buf_shift;
+
+ border = (start_src / odata) - (start / ndata);
+ border /= 512;
+ if (border <= __le32_to_cpu(migr_rec->dest_depth_per_unit)) {
+ /* save critical stripes to buf
+ * start - start address of current unit
+ * to backup [bytes]
+ * start_src - start address of current unit
+ * to backup aligned to source array
+ * [bytes]
+ */
+ unsigned long long next_step_filler = 0;
+ unsigned long long copy_length = next_step * 512;
+
+ /* align copy area length to stripe in old geometry */
+ next_step_filler = (copy_length + start_buf_shift)
+ % old_data_stripe_length;
+ if (next_step_filler)
+ next_step_filler = old_data_stripe_length
+ - next_step_filler;
+ dprintf("save_stripes() parameters: start = %llu,"
+ "\tstart_src = %llu,\tnext_step*512 = %llu,"
+ "\tstart_in_buf_shift = %llu,"
+ "\tnext_step_filler = %llu\n",
+ start, start_src, copy_length,
+ start_buf_shift, next_step_filler);
+
+ if (save_stripes(fds, offsets, map_src->num_members,
+ chunk, sra->array.level,
+ sra->array.layout, 0, NULL, start_src,
+ copy_length +
+ next_step_filler + start_buf_shift,
+ buf)) {
+ dprintf("imsm: Cannot save stripes"
+ " to buffer\n");
+ goto abort;
+ }
+ /* Convert data to destination format and store it
+ * in backup general migration area
+ */
+ if (save_backup_imsm(st, dev, sra,
+ buf + start_buf_shift,
+ ndata, copy_length)) {
+ dprintf("imsm: Cannot save stripes to "
+ "target devices\n");
+ goto abort;
+ }
+ if (save_checkpoint_imsm(st, sra,
+ UNIT_SRC_IN_CP_AREA)) {
+ dprintf("imsm: Cannot write checkpoint to "
+ "migration record (UNIT_SRC_IN_CP_AREA)\n");
+ goto abort;
+ }
+ /* decrease backup_blocks */
+ if (backup_blocks > (unsigned long)next_step)
+ backup_blocks -= next_step;
+ else
+ backup_blocks = 0;
+ }
+ /* When data backed up, checkpoint stored,
+ * kick the kernel to reshape unit of data
+ */
+ next_step = next_step + sra->reshape_progress;
+ sysfs_set_num(sra, NULL, "suspend_lo", sra->reshape_progress);
+ sysfs_set_num(sra, NULL, "suspend_hi", next_step);
+
+ /* wait until reshape finish */
+ if (wait_for_reshape_imsm(sra, next_step, ndata) < 0)
+ dprintf("wait_for_reshape_imsm returned error,"
+ " but we ignore it!\n");
+
+ sra->reshape_progress = next_step;
+
+ if (save_checkpoint_imsm(st, sra, UNIT_SRC_NORMAL)) {
+ dprintf("imsm: Cannot write checkpoint to "
+ "migration record (UNIT_SRC_NORMAL)\n");
+ goto abort;
+ }
+
+ }
+
+ /* return '1' if done */
+ ret_val = 1;
+abort:
+ if (buf)
+ free(buf);
+ abort_reshape(sra);
+
+ return ret_val;
}
#endif /* MDASSEMBLE */

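The stripe alignment arithmetic in imsm_manage_reshape() above can be checked in isolation. The following is only a sketch under stated assumptions: the type and function names (copy_window, align_copy_window) are hypothetical and do not exist in mdadm, and all values are plain host-endian byte counts. It reproduces how the read start is aligned down to the old geometry and how the filler pads the read to a whole stripe.

```c
#include <stdint.h>

/* Hypothetical helper type, not part of mdadm. */
struct copy_window {
	uint64_t start_src;       /* read start aligned down to a stripe [bytes] */
	uint64_t start_buf_shift; /* offset of the unit inside the buffer [bytes] */
	uint64_t filler;          /* padding so the read ends on a stripe [bytes] */
	uint64_t read_len;        /* total number of bytes to read */
};

/* Mirrors the start_buf_shift / start_src / next_step_filler computation
 * from the patch; stripe_len is the old-geometry data stripe length.
 */
static struct copy_window align_copy_window(uint64_t start,
					    uint64_t copy_length,
					    uint64_t stripe_len)
{
	struct copy_window w;

	w.start_buf_shift = start % stripe_len;
	w.start_src = start - w.start_buf_shift;
	w.filler = (copy_length + w.start_buf_shift) % stripe_len;
	if (w.filler)
		w.filler = stripe_len - w.filler;
	w.read_len = copy_length + w.filler + w.start_buf_shift;
	return w;
}
```

With a 256-byte stripe, start = 1000 and copy_length = 300, the read starts at byte 768 and covers 768 bytes, i.e. three whole stripes, while the payload begins 232 bytes into the buffer.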

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 09/22] imsm: Check if array degradation has been changed

on 02.06.2011 16:49:26 by krzysztof.wojcik

From: Adam Kwolek

Before reshaping each "migration unit", check whether the array is still usable.
If the number of failed disks is greater than the allowed degradation level,
the reshape has to be aborted.

Signed-off-by: Maciej Trela
Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
super-intel.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 57 insertions(+), 0 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index b4038d3..d2393c2 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -8310,6 +8310,53 @@ int wait_for_reshape_imsm(struct mdinfo *sra, unsigned long long to_complete,
}

/*******************************************************************************
+ * Function: check_degradation_change
+ * Description: Check that array hasn't become failed.
+ * Parameters:
+ * info : for sysfs access
+ * sources : source disks descriptors
+ * degraded: previous degradation level
+ * Returns:
+ * degradation level
+ ******************************************************************************/
+int check_degradation_change(struct mdinfo *info,
+ int *sources,
+ int degraded)
+{
+ unsigned long long new_degraded;
+ sysfs_get_ll(info, NULL, "degraded", &new_degraded);
+ if (new_degraded != (unsigned long long)degraded) {
+ /* check each device to ensure it is still working */
+ struct mdinfo *sd;
+ new_degraded = 0;
+ for (sd = info->devs ; sd ; sd = sd->next) {
+ if (sd->disk.state & (1<<MD_DISK_FAULTY))
+ continue;
+ if (sd->disk.state & (1<<MD_DISK_SYNC)) {
+ char sbuf[20];
+ if (sysfs_get_str(info,
+ sd, "state", sbuf, 20) < 0 ||
+ strstr(sbuf, "faulty") ||
+ strstr(sbuf, "in_sync") == NULL) {
+ /* this device is dead */
+ sd->disk.state = (1<<MD_DISK_FAULTY);
+ if (sd->disk.raid_disk >= 0 &&
+ sources[sd->disk.raid_disk] >= 0) {
+ close(sources[
+ sd->disk.raid_disk]);
+ sources[sd->disk.raid_disk] =
+ -1;
+ }
+ new_degraded++;
+ }
+ }
+ }
+ }
+
+ return new_degraded;
+}
+
+/*******************************************************************************
* Function: imsm_manage_reshape
* Description: Function finds array under reshape and it manages reshape
* process. It creates stripes backups (if required) and sets
@@ -8353,6 +8400,7 @@ static int imsm_manage_reshape(
unsigned long long start_src; /* [bytes] */
unsigned long long start; /* [bytes] */
unsigned long long start_buf_shift; /* [bytes] */
+ int degraded = 0;

if (!fds || !offsets || !destfd || !destoffsets || !sra)
goto abort;
@@ -8419,6 +8467,15 @@ static int imsm_manage_reshape(
* __le32_to_cpu(migr_rec->curr_migr_unit);
unsigned long long border;

+ /* Check that array hasn't become failed.
+ */
+ degraded = check_degradation_change(sra, fds, degraded);
+ if (degraded > 1) {
+ dprintf("imsm: Abort reshape due to degradation"
+ " level (%i)\n", degraded);
+ goto abort;
+ }
+
next_step = __le32_to_cpu(migr_rec->blocks_per_unit);

if ((current_position + next_step) > max_position)


[PATCH 10/22] imsm: Clear migration record when no migration in progress

on 02.06.2011 16:49:34 by krzysztof.wojcik

From: Adam Kwolek

When metadata is saved and no general migration is in progress in the
container, clear the migration record in the container.

Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
super-intel.c | 16 ++++++++++++++++
1 files changed, 16 insertions(+), 0 deletions(-)
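
The patch below keeps the migration record in the last 512-byte sector of each member disk and zeroes it when no general migration is active. A minimal sketch of that clearing step follows, assuming the caller already knows the device size in bytes (mdadm obtains it with get_dev_size()). The helper name clear_last_sector is hypothetical, and unlike the patch the sketch checks the result of write().

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define MIGR_REC_SIZE 512	/* record size used by the patch */

/* Hypothetical helper, not part of mdadm: zero the last 512 bytes of
 * an open file descriptor whose size is dsize bytes.
 * Returns 0 on success, -1 on error.
 */
static int clear_last_sector(int fd, unsigned long long dsize)
{
	char zero[MIGR_REC_SIZE];

	if (dsize < MIGR_REC_SIZE)
		return -1;
	memset(zero, 0, sizeof(zero));
	if (lseek(fd, (off_t)(dsize - MIGR_REC_SIZE), SEEK_SET) < 0)
		return -1;
	if (write(fd, zero, sizeof(zero)) != (ssize_t)sizeof(zero))
		return -1;
	return 0;
}
```

Returning an error to the caller is a design choice of the sketch only; the patch itself ignores a failed seek or write and carries on with the next disk.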

diff --git a/super-intel.c b/super-intel.c
index d2393c2..db2f2b9 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -4254,6 +4254,8 @@ static int write_super_imsm_spares(struct intel_super *super, int doclose)
return 0;
}

+static int is_gen_migration(struct imsm_dev *dev);
+
static int write_super_imsm(struct supertype *st, int doclose)
{
struct intel_super *super = st->sb;
@@ -4265,6 +4267,7 @@ static int write_super_imsm(struct supertype *st, int doclose)
int i;
__u32 mpb_size = sizeof(struct imsm_super) - sizeof(struct imsm_disk);
int num_disks = 0;
+ int clear_migration_record = 1;

/* 'generation' is incremented everytime the metadata is written */
generation = __le32_to_cpu(mpb->generation_num);
@@ -4299,6 +4302,8 @@ static int write_super_imsm(struct supertype *st, int doclose)
imsm_copy_dev(dev, dev2);
mpb_size += sizeof_imsm_dev(dev, 0);
}
+ if (is_gen_migration(dev2))
+ clear_migration_record = 0;
}
mpb_size += __le32_to_cpu(mpb->bbm_log_size);
mpb->mpb_size = __cpu_to_le32(mpb_size);
@@ -4307,6 +4312,9 @@ static int write_super_imsm(struct supertype *st, int doclose)
sum = __gen_imsm_checksum(mpb);
mpb->check_sum = __cpu_to_le32(sum);

+ if (clear_migration_record)
+ memset(super->migr_rec_buf, 0, 512);
+
/* write the mpb for disks that compose raid devices */
for (d = super->disks; d ; d = d->next) {
if (d->index < 0)
@@ -4314,6 +4322,14 @@ static int write_super_imsm(struct supertype *st, int doclose)
if (store_imsm_mpb(d->fd, mpb))
fprintf(stderr, "%s: failed for device %d:%d %s\n",
__func__, d->major, d->minor, strerror(errno));
+ if (clear_migration_record) {
+ unsigned long long dsize;
+
+ get_dev_size(d->fd, NULL, &dsize);
+ if (lseek64(d->fd, dsize - 512, SEEK_SET) >= 0) {
+ write(d->fd, super->migr_rec_buf, 512);
+ }
+ }
if (doclose) {
close(d->fd);
d->fd = -1;


[PATCH 11/22] imsm: Add information about migration record to mdadm '-E' option

on 02.06.2011 16:49:42 by krzysztof.wojcik

From: Adam Kwolek

Add the ability to display information from the migration record in the
examine (-E) output.

Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
super-intel.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 53 insertions(+), 0 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index db2f2b9..c40c02a 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -1035,6 +1035,57 @@ static void print_imsm_disk(struct imsm_super *mpb, int index, __u32 reserved)
human_size(sz * 512));
}

+static int is_gen_migration(struct imsm_dev *dev);
+
+void examine_migr_rec_imsm(struct intel_super *super)
+{
+ struct migr_record *migr_rec = super->migr_rec;
+ struct imsm_super *mpb = super->anchor;
+ int i;
+
+ for (i = 0; i < mpb->num_raid_devs; i++) {
+ struct imsm_dev *dev = __get_imsm_dev(mpb, i);
+ if (is_gen_migration(dev) == 0)
+ continue;
+
+ printf("\nMigration Record Information:");
+ if (super->disks->index > 1) {
+ printf(" Empty\n ");
+ printf("Examine one of first two disks in array\n");
+ break;
+ }
+ printf("\n Status : ");
+ if (__le32_to_cpu(migr_rec->rec_status) == UNIT_SRC_NORMAL)
+ printf("Normal\n");
+ else
+ printf("Contains Data\n");
+ printf(" Current Unit : %u\n",
+ __le32_to_cpu(migr_rec->curr_migr_unit));
+ printf(" Family : %u\n",
+ __le32_to_cpu(migr_rec->family_num));
+ printf(" Ascending : %u\n",
+ __le32_to_cpu(migr_rec->ascending_migr));
+ printf(" Blocks Per Unit : %u\n",
+ __le32_to_cpu(migr_rec->blocks_per_unit));
+ printf(" Dest. Depth Per Unit : %u\n",
+ __le32_to_cpu(migr_rec->dest_depth_per_unit));
+ printf(" Checkpoint Area pba : %u\n",
+ __le32_to_cpu(migr_rec->ckpt_area_pba));
+ printf(" First member lba : %u\n",
+ __le32_to_cpu(migr_rec->dest_1st_member_lba));
+ printf(" Total Number of Units : %u\n",
+ __le32_to_cpu(migr_rec->num_migr_units));
+ printf(" Size of volume : %u\n",
+ __le32_to_cpu(migr_rec->post_migr_vol_cap));
+ printf(" Expansion space for LBA64 : %u\n",
+ __le32_to_cpu(migr_rec->post_migr_vol_cap_hi));
+ printf(" Record was read from : %u\n",
+ __le32_to_cpu(migr_rec->ckpt_read_disk_num));
+
+ break;
+ }
+}
+
static void getinfo_super_imsm(struct supertype *st, struct mdinfo *info, char *map);

static void examine_super_imsm(struct supertype *st, char *homehost)
@@ -1114,6 +1165,8 @@ static void examine_super_imsm(struct supertype *st, char *homehost)
printf(" Usable Size : %llu%s\n", (unsigned long long)sz,
human_size(sz * 512));
}
+
+ examine_migr_rec_imsm(super);
}

static void brief_examine_super_imsm(struct supertype *st, int verbose)


[PATCH 12/22] imsm: update blocks_per_migr_unit() to support migration record

on 02.06.2011 16:49:51 by krzysztof.wojcik

From: Adam Kwolek

blocks_per_migr_unit() has to use information from the migration record
in the general migration case. This requires passing an intel_super pointer
to this function, along with some other interface changes.

Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
super-intel.c | 46 ++++++++++++++++++++++++++++++----------------
1 files changed, 30 insertions(+), 16 deletions(-)
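
For general migration the checkpoint position is derived from two migration record fields: curr_migr_unit multiplied by blocks_per_unit. The sketch below illustrates that conversion together with the clamping of the last step against the volume capacity, as done in imsm_manage_reshape(). It is an assumption-laden illustration: struct migr_pos and the three functions are hypothetical names, the fields are taken as already converted to host byte order, and the multiplication is deliberately widened to 64 bits.

```c
#include <stdint.h>

/* Hypothetical, host-endian subset of the migration record fields. */
struct migr_pos {
	uint32_t curr_migr_unit;  /* index of the unit being migrated */
	uint32_t blocks_per_unit; /* size of one migration unit [blocks] */
	uint32_t vol_cap_lo;      /* post_migr_vol_cap */
	uint32_t vol_cap_hi;      /* post_migr_vol_cap_hi */
};

/* Combine the low and high 32-bit words into one 64-bit capacity. */
static uint64_t volume_capacity(const struct migr_pos *r)
{
	return (uint64_t)r->vol_cap_lo | ((uint64_t)r->vol_cap_hi << 32);
}

/* Checkpoint position in blocks; widened before multiplying. */
static uint64_t checkpoint_position(const struct migr_pos *r)
{
	return (uint64_t)r->curr_migr_unit * r->blocks_per_unit;
}

/* Size of the next step in blocks, clamped to the volume capacity. */
static uint64_t next_step_blocks(const struct migr_pos *r)
{
	uint64_t pos = checkpoint_position(r);
	uint64_t cap = volume_capacity(r);

	if (pos >= cap)
		return 0;
	if (pos + r->blocks_per_unit > cap)
		return cap - pos;
	return r->blocks_per_unit;
}
```

For a record with curr_migr_unit = 3, blocks_per_unit = 2048 and a capacity of 7000 blocks, the position is 6144 and the final step shrinks to 856 blocks.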

diff --git a/super-intel.c b/super-intel.c
index c40c02a..d80e530 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -915,9 +915,13 @@ static unsigned long long min_acceptable_spare_size_imsm(struct supertype *st)
}

#ifndef MDASSEMBLE
-static __u64 blocks_per_migr_unit(struct imsm_dev *dev);
+static __u64 blocks_per_migr_unit(struct intel_super *super,
+ struct imsm_dev *dev);

-static void print_imsm_dev(struct imsm_dev *dev, char *uuid, int disk_idx)
+static void print_imsm_dev(struct intel_super *super,
+ struct imsm_dev *dev,
+ char *uuid,
+ int disk_idx)
{
__u64 sz;
int slot, i;
@@ -1008,7 +1012,7 @@ static void print_imsm_dev(struct imsm_dev *dev, char *uuid, int disk_idx)
printf(" <-- %s", map_state_str[map->map_state]);
printf("\n Checkpoint : %u (%llu)",
__le32_to_cpu(dev->vol.curr_migr_unit),
- (unsigned long long)blocks_per_migr_unit(dev));
+ (unsigned long long)blocks_per_migr_unit(super, dev));
}
printf("\n");
printf(" Dirty State : %s\n", dev->vol.dirty ? "dirty" : "clean");
@@ -1138,7 +1142,7 @@ static void examine_super_imsm(struct supertype *st, char *homehost)
info.devs = NULL;
getinfo_super_imsm(st, &info, NULL);
fname_from_uuid(st, &info, nbuf, ':');
- print_imsm_dev(dev, nbuf + 5, super->disks->index);
+ print_imsm_dev(super, dev, nbuf + 5, super->disks->index);
}
for (i = 0; i < mpb->num_disks; i++) {
if (i == super->disks->index)
@@ -1784,7 +1788,8 @@ static __u32 map_migr_block(struct imsm_dev *dev, __u32 block)
}
}

-static __u64 blocks_per_migr_unit(struct imsm_dev *dev)
+static __u64 blocks_per_migr_unit(struct intel_super *super,
+ struct imsm_dev *dev)
{
/* calculate the conversion factor between per member 'blocks'
* (md/{resync,rebuild}_start) and imsm migration units, return
@@ -1794,7 +1799,10 @@ static __u64 blocks_per_migr_unit(struct imsm_dev *dev)
return 0;

switch (migr_type(dev)) {
- case MIGR_GEN_MIGR:
+ case MIGR_GEN_MIGR: {
+ struct migr_record *migr_rec = super->migr_rec;
+ return __le32_to_cpu(migr_rec->blocks_per_unit);
+ }
case MIGR_VERIFY:
case MIGR_REPAIR:
case MIGR_INIT: {
@@ -1992,6 +2000,7 @@ static int write_imsm_migr_rec(struct supertype *st)
static void getinfo_super_imsm_volume(struct supertype *st, struct mdinfo *info, char *dmap)
{
struct intel_super *super = st->sb;
+ struct migr_record *migr_rec = super->migr_rec;
struct imsm_dev *dev = get_imsm_dev(super, super->current_vol);
struct imsm_map *map = get_imsm_map(dev, 0);
struct imsm_map *prev_map = get_imsm_map(dev, 1);
@@ -2106,15 +2115,17 @@ static void getinfo_super_imsm_volume(struct supertype *st, struct mdinfo *info,
switch (migr_type(dev)) {
case MIGR_REPAIR:
case MIGR_INIT: {
- __u64 blocks_per_unit = blocks_per_migr_unit(dev);
+ __u64 blocks_per_unit = blocks_per_migr_unit(super,
+ dev);
__u64 units = __le32_to_cpu(dev->vol.curr_migr_unit);

info->resync_start = blocks_per_unit * units;
break;
}
case MIGR_GEN_MIGR: {
- __u64 blocks_per_unit = blocks_per_migr_unit(dev);
- __u64 units = __le32_to_cpu(dev->vol.curr_migr_unit);
+ __u64 blocks_per_unit = blocks_per_migr_unit(super,
+ dev);
+ __u64 units = __le32_to_cpu(migr_rec->curr_migr_unit);
unsigned long long array_blocks;
int used_disks;

@@ -5256,7 +5267,9 @@ static int is_rebuilding(struct imsm_dev *dev)
return 0;
}

-static void update_recovery_start(struct imsm_dev *dev, struct mdinfo *array)
+static void update_recovery_start(struct intel_super *super,
+ struct imsm_dev *dev,
+ struct mdinfo *array)
{
struct mdinfo *rebuild = NULL;
struct mdinfo *d;
@@ -5283,7 +5296,7 @@ static void update_recovery_start(struct imsm_dev *dev, struct mdinfo *array)
}

units = __le32_to_cpu(dev->vol.curr_migr_unit);
- rebuild->recovery_start = units * blocks_per_migr_unit(dev);
+ rebuild->recovery_start = units * blocks_per_migr_unit(super, dev);
}


@@ -5447,7 +5460,7 @@ static struct mdinfo *container_content_imsm(struct supertype *st, char *subarra
info_d->component_size = __le32_to_cpu(map->blocks_per_member);
}
/* now that the disk list is up-to-date fixup recovery_start */
- update_recovery_start(dev, this);
+ update_recovery_start(super, dev, this);
this->array.spare_disks += spare_disks;
rest = this;
}
@@ -5839,7 +5852,7 @@ static int imsm_set_array_state(struct active_array *a, int consistent)

mark_checkpoint:
/* check if we can update curr_migr_unit from resync_start, recovery_start */
- blocks_per_unit = blocks_per_migr_unit(dev);
+ blocks_per_unit = blocks_per_migr_unit(super, dev);
if (blocks_per_unit) {
__u32 units32;
__u64 units;
@@ -8623,9 +8636,10 @@ static int imsm_manage_reshape(
sysfs_set_num(sra, NULL, "suspend_hi", next_step);

/* wait until reshape finish */
- if (wait_for_reshape_imsm(sra, next_step, ndata) < 0)
- dprintf("wait_for_reshape_imsm returned error,"
- " but we ignore it!\n");
+ if (wait_for_reshape_imsm(sra, next_step, ndata) < 0) {
+ dprintf("wait_for_reshape_imsm returned error!\n");
+ goto abort;
+ }

sra->reshape_progress = next_step;



[PATCH 13/22] Add reshape restart support for external metadata

on 02.06.2011 16:49:59 by krzysztof.wojcik

From: Adam Kwolek

This patch introduces support for restarting the reshape process for external
metadata using metadata-specific data handling methods.
It introduces the recover_backup() function, which restores the array to a
stable state. It is equivalent to the Grow_restart() functionality for native
metadata.

Signed-off-by: Maciej Trela
Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
Assemble.c | 10 ++++++++--
mdadm.h | 3 +++
2 files changed, 11 insertions(+), 2 deletions(-)
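
The dispatch added to Assemble.c prefers an optional metadata-specific hook and falls back to the native path when the hook is absent. The pattern can be sketched in a self-contained way as follows. Everything here is a simplified stand-in: struct metadata_ops is a hypothetical reduction of struct superswitch, and the demo_* callbacks exist only to make the sketch testable.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct superswitch. */
struct metadata_ops {
	int external;                     /* non-zero for external metadata */
	int (*recover_backup)(void *ctx); /* optional hook, may be NULL */
};

typedef int (*restart_fn)(void *ctx);

/* Use the metadata handler's own recovery when it is available,
 * otherwise fall back to the native restart routine.
 */
static int restart_reshape(const struct metadata_ops *ops, void *ctx,
			   restart_fn native_restart)
{
	if (ops->external && ops->recover_backup)
		return ops->recover_backup(ctx);
	return native_restart(ctx);
}

/* Demo callbacks: record which path was taken in *ctx. */
static int demo_recover(void *ctx)
{
	*(int *)ctx = 1;
	return 0;
}

static int demo_native(void *ctx)
{
	*(int *)ctx = 2;
	return 0;
}
```

Checking both the external flag and the hook pointer means external metadata formats that do not implement recover_backup keep using the existing restart path.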

diff --git a/Assemble.c b/Assemble.c
index 8b05829..f085f7c 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -1192,7 +1192,10 @@ int Assemble(struct supertype *st, char *mddev,
fdlist[i] = -1;
}
if (!err) {
- err = Grow_restart(st, content, fdlist, bestcnt,
+ if (st->ss->external && st->ss->recover_backup)
+ err = st->ss->recover_backup(st, content);
+ else
+ err = Grow_restart(st, content, fdlist, bestcnt,
backup_file, verbose > 0);
if (err && invalid_backup) {
if (verbose > 0)
@@ -1573,7 +1576,10 @@ int assemble_container_content(struct supertype *st, int mdfd,
else
fdlist[spare++] = fd;
}
- err = Grow_restart(st, content, fdlist, spare,
+ if (st->ss->external && st->ss->recover_backup)
+ err = st->ss->recover_backup(st, content);
+ else
+ err = Grow_restart(st, content, fdlist, spare,
backup_file, verbose > 0);
while (spare > 0) {
spare--;
diff --git a/mdadm.h b/mdadm.h
index a1ae2b9..50042b0 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -785,6 +785,9 @@ extern struct superswitch {
*/
const char *(*get_disk_controller_domain)(const char *path);

+ /* for external backup area */
+ int (*recover_backup)(struct supertype *st, struct mdinfo *info);
+
int swapuuid; /* true if uuid is bigending rather than hostendian */
int external;
const char *name; /* canonical metadata name */


[PATCH 14/22] imsm: Implement recover_backup_imsm() for imsm metadata

on 02.06.2011 16:50:08 by krzysztof.wojcik

From: Adam Kwolek

Add the ability to restore data backed up in the General Migration Copy Area
in case of an unexpected reshape interruption.
The function restores the data during array assembly, and the reshape then
continues from the next checkpoint.

Signed-off-by: Krzysztof Wojcik
---
super-intel.c | 129 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 129 insertions(+), 0 deletions(-)
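
The core of the recovery is a per-disk copy: one unit is read from the copy area offset and written back to the destination offset on the same device. The sketch below shows only that copy loop on plain file descriptors. The helper names restore_unit and restore_all are hypothetical, the offsets are assumed to be byte offsets computed by the caller, and the metadata handling done by recover_backup_imsm() is left out.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy len bytes inside one fd from offset src to
 * offset dst. Returns 0 on success, -1 on any failure.
 */
static int restore_unit(int fd, off_t src, off_t dst, size_t len)
{
	char *buf = malloc(len);
	int ret = -1;

	if (!buf)
		return -1;
	if (lseek(fd, src, SEEK_SET) < 0)
		goto out;
	if (read(fd, buf, len) != (ssize_t)len)
		goto out;
	if (lseek(fd, dst, SEEK_SET) < 0)
		goto out;
	if (write(fd, buf, len) != (ssize_t)len)
		goto out;
	ret = 0;
out:
	free(buf);
	return ret;
}

/* Apply the same copy to every member; stop at the first failure. */
static int restore_all(const int *fds, int n, off_t src, off_t dst,
		       size_t len)
{
	int i;

	for (i = 0; i < n; i++)
		if (restore_unit(fds[i], src, dst, len))
			return -1;
	return 0;
}
```

Stopping at the first failure matches the patch, which jumps to its abort label as soon as a seek, read or write on any target fails.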

diff --git a/super-intel.c b/super-intel.c
index d80e530..8ebc1e4 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -5299,6 +5299,7 @@ static void update_recovery_start(struct intel_super *super,
rebuild->recovery_start = units * blocks_per_migr_unit(super, dev);
}

+static int recover_backup_imsm(struct supertype *st, struct mdinfo *info);

static struct mdinfo *container_content_imsm(struct supertype *st, char *subarray)
{
@@ -5462,6 +5463,11 @@ static struct mdinfo *container_content_imsm(struct supertype *st, char *subarra
/* now that the disk list is up-to-date fixup recovery_start */
update_recovery_start(super, dev, this);
this->array.spare_disks += spare_disks;
+
+ /* check for reshape */
+ if (this->reshape_active == 1)
+ recover_backup_imsm(st, this);
+
rest = this;
}

@@ -7684,6 +7690,127 @@ int save_checkpoint_imsm(struct supertype *st, struct mdinfo *info, int state)
return 0;
}

+static __u64 blocks_per_migr_unit(struct intel_super *super,
+ struct imsm_dev *dev);
+
+/*******************************************************************************
+ * Function: recover_backup_imsm
+ * Description: Function recovers critical data from the Migration Copy Area
+ * while assembling an array.
+ * Parameters:
+ * super : imsm internal array info
+ * info : general array info
+ * Returns:
+ * 0 : success (or there is no data to recover)
+ * 1 : fail
+ ******************************************************************************/
+int recover_backup_imsm(struct supertype *st, struct mdinfo *info)
+{
+ struct intel_super *super = st->sb;
+ struct migr_record *migr_rec = super->migr_rec;
+ struct imsm_map *map_dest = NULL;
+ struct intel_dev *id = NULL;
+ unsigned long long read_offset;
+ unsigned long long write_offset;
+ unsigned unit_len;
+ int *targets = NULL;
+ int new_disks, i, err;
+ char *buf = NULL;
+ int retval = 1;
+ unsigned long curr_migr_unit = __le32_to_cpu(migr_rec->curr_migr_unit);
+ unsigned long num_migr_units = __le32_to_cpu(migr_rec->num_migr_units);
+ int ascending = __le32_to_cpu(migr_rec->ascending_migr);
+ char buffer[20];
+
+ err = sysfs_get_str(info, NULL, "array_state", (char *)buffer, 20);
+ if (err < 1)
+ return 1;
+
+ /* recover data only during assembly */
+ if (strncmp(buffer, "inactive", 8) != 0)
+ return 0;
+ /* no data to recover */
+ if (__le32_to_cpu(migr_rec->rec_status) == UNIT_SRC_NORMAL)
+ return 0;
+ if (curr_migr_unit >= num_migr_units)
+ return 1;
+
+ /* find device during reshape */
+ for (id = super->devlist; id; id = id->next)
+ if (is_gen_migration(id->dev))
+ break;
+ if (id == NULL)
+ return 1;
+
+ map_dest = get_imsm_map(id->dev, 0);
+ new_disks = map_dest->num_members;
+
+ read_offset = (unsigned long long)
+ __le32_to_cpu(migr_rec->ckpt_area_pba) * 512;
+
+ write_offset = ((unsigned long long)
+ __le32_to_cpu(migr_rec->dest_1st_member_lba) +
+ info->data_offset) * 512;
+
+ unit_len = __le32_to_cpu(migr_rec->dest_depth_per_unit) * 512;
+ if (posix_memalign((void **)&buf, 512, unit_len) != 0)
+ goto abort;
+ targets = malloc(new_disks * sizeof(int));
+ if (!targets)
+ goto abort;
+
+ open_backup_targets(info, new_disks, targets);
+
+ for (i = 0; i < new_disks; i++) {
+ if (lseek64(targets[i], read_offset, SEEK_SET) < 0) {
+ fprintf(stderr,
+ Name ": Cannot seek to block: %s\n",
+ strerror(errno));
+ goto abort;
+ }
+ if (read(targets[i], buf, unit_len) != unit_len) {
+ fprintf(stderr,
+ Name ": Cannot read copy area block: %s\n",
+ strerror(errno));
+ goto abort;
+ }
+ if (lseek64(targets[i], write_offset, SEEK_SET) < 0) {
+ fprintf(stderr,
+ Name ": Cannot seek to block: %s\n",
+ strerror(errno));
+ goto abort;
+ }
+ if (write(targets[i], buf, unit_len) != unit_len) {
+ fprintf(stderr,
+ Name ": Cannot restore block: %s\n",
+ strerror(errno));
+ goto abort;
+ }
+ }
+
+ if (ascending && curr_migr_unit < (num_migr_units-1))
+ curr_migr_unit++;
+
+ migr_rec->curr_migr_unit = __cpu_to_le32(curr_migr_unit);
+ super->migr_rec->rec_status = __cpu_to_le32(UNIT_SRC_NORMAL);
+ if (write_imsm_migr_rec(st) == 0) {
+ __u64 blocks_per_unit = blocks_per_migr_unit(super, id->dev);
+ info->reshape_progress = curr_migr_unit * blocks_per_unit;
+ retval = 0;
+ }
+
+abort:
+ if (targets) {
+ for (i = 0; i < new_disks; i++)
+ if (targets[i])
+ close(targets[i]);
+ free(targets);
+ }
+ if (buf)
+ free(buf);
+ return retval;
+}
+
static char disk_by_path[] = "/dev/disk/by-path/";

static const char *imsm_get_disk_controller_domain(const char *path)
@@ -8701,6 +8828,8 @@ struct superswitch super_imsm = {
.match_metadata_desc = match_metadata_desc_imsm,
.container_content = container_content_imsm,

+ .recover_backup = recover_backup_imsm,
+
.external = 1,
.name = "imsm",



[PATCH 15/22] imsm: Disable checkpoint updating by mdmon for general migration

on 02.06.2011 16:50:16 by krzysztof.wojcik

From: Adam Kwolek

imsm contains two check-pointing mechanisms. The first (per array) is used for
initialization and rebuild; the second (per container) is used for general
migration (reshape). The first is controlled by mdmon, the second by mdadm.
To avoid conflicts, disable mdmon checkpoint updates for general
migration.

Signed-off-by: Adam Kwolek
---
super-intel.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index 8ebc1e4..77d6167 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -5857,6 +5857,12 @@ static int imsm_set_array_state(struct active_array *a, int consistent)
}

mark_checkpoint:
+ /* skip checkpointing for general migration,
+ * it is controlled in mdadm
+ */
+ if (is_gen_migration(dev))
+ goto skip_mark_checkpoint;
+
/* check if we can update curr_migr_unit from resync_start, recovery_start */
blocks_per_unit = blocks_per_migr_unit(super, dev);
if (blocks_per_unit) {
@@ -5878,6 +5884,7 @@ mark_checkpoint:
}
}

+skip_mark_checkpoint:
/* mark dirty / clean */
if (dev->vol.dirty != !consistent) {
dprintf("imsm: mark '%s'\n", consistent ? "clean" : "dirty");


[PATCH 16/22] imsm: Add metadata update type for general migration check-pointing

on 02.06.2011 16:50:24 by krzysztof.wojcik

From: Adam Kwolek

There are 2 places for keeping checkpoint information:
- metadata (per volume information used during volume
initialization and rebuilding).
- migration record (per container information used during
migration/reshape)

During reshape both checkpoints have to contain the same information.
To achieve this, mdadm will send a metadata update with the checkpoint
information.

Note: Checkpoint information consistency is not critical. During general
migration restart, only the information from the migration record is used.

Signed-off-by: Adam Kwolek
---
super-intel.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index 77d6167..2a11879 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -345,7 +345,8 @@ enum imsm_update_type {
update_add_remove_disk,
update_reshape_container_disks,
update_reshape_migration,
- update_takeover
+ update_takeover,
+ update_general_migration_checkpoint
};

struct imsm_update_activate_spare {
@@ -398,6 +399,11 @@ struct imsm_update_reshape_migration {
int new_disks[1]; /* new_raid_disks - old_raid_disks makedev number */
};

+struct imsm_update_general_migration_checkpoint {
+ enum imsm_update_type type;
+ __u32 curr_migr_unit;
+};
+
struct disk_info {
__u8 serial[MAX_RAID_SERIAL_LEN];
};


[PATCH 17/22] imsm: Prepare checkpoint update for general migration

on 02.06.2011 16:50:33 by krzysztof.wojcik

From: Adam Kwolek

mdadm has to prepare a checkpoint information update and send it to mdmon.

Signed-off-by: Adam Kwolek
---
super-intel.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 67 insertions(+), 0 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index 2a11879..fa4e15d 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -1953,6 +1953,51 @@ out:
}

/*******************************************************************************
+ * Function: imsm_create_metadata_checkpoint_update
+ * Description: It creates update for checkpoint change.
+ * Parameters:
+ * super : imsm internal array info
+ * u : pointer to prepared update
+ * Returns:
+ * Update length.
+ * If length is equal to 0, input pointer u contains no update
+ ******************************************************************************/
+static int imsm_create_metadata_checkpoint_update(
+ struct intel_super *super,
+ struct imsm_update_general_migration_checkpoint **u)
+{
+
+ int update_memory_size = 0;
+
+ dprintf("imsm_create_metadata_checkpoint_update(enter)\n");
+
+ if (u == NULL)
+ return 0;
+ *u = NULL;
+
+ /* size of all update data without anchor */
+ update_memory_size =
+ sizeof(struct imsm_update_general_migration_checkpoint);
+
+ *u = calloc(1, update_memory_size);
+ if (*u == NULL) {
+ dprintf("error: cannot get memory for "
+ "imsm_create_metadata_checkpoint_update update\n");
+ return 0;
+ }
+ (*u)->type = update_general_migration_checkpoint;
+ (*u)->curr_migr_unit = __le32_to_cpu(super->migr_rec->curr_migr_unit);
+ dprintf("imsm_create_metadata_checkpoint_update: prepared for %u\n",
+ (*u)->curr_migr_unit);
+
+ return update_memory_size;
+}
+
+
+static void imsm_update_metadata_locally(struct supertype *st,
+ void *buf, int len);
+
+/*******************************************************************************
* Function: write_imsm_migr_rec
* Description: Function writes imsm migration record
* (at the last sector of disk)
@@ -1970,6 +2015,8 @@ static int write_imsm_migr_rec(struct supertype *st)
int fd = -1;
int retval = -1;
struct dl *sd;
+ int len;
+ struct imsm_update_general_migration_checkpoint *u;

for (sd = super->disks ; sd ; sd = sd->next) {
/* write to 2 first slots only */
@@ -1995,6 +2042,26 @@ static int write_imsm_migr_rec(struct supertype *st)
close(fd);
fd = -1;
}
+ /* update checkpoint information in metadata */
+ len = imsm_create_metadata_checkpoint_update(super, &u);
+
+ if (len <= 0) {
+ dprintf("imsm: Cannot prepare update\n");
+ goto out;
+ }
+ /* update metadata locally */
+ imsm_update_metadata_locally(st, u, len);
+ /* and possibly remotely */
+ if (st->update_tail) {
+ append_metadata_update(st, u, len);
+ /* during reshape we do all work inside metadata handler
+ * manage_reshape(), so metadata update has to be triggered
+ * inside it
+ */
+ flush_metadata_updates(st);
+ st->update_tail = &st->updates;
+ } else
+ free(u);

retval = 0;
out:


[PATCH 18/22] imsm: Apply checkpoint metadata update for general migration

am 02.06.2011 16:50:41 von krzysztof.wojcik

From: Adam Kwolek

During general migration, mdmon has to update the checkpoint information
in the metadata according to the metadata update it receives.

Signed-off-by: Adam Kwolek
---
super-intel.c | 22 ++++++++++++++++++++++
1 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index fa4e15d..b6369c6 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -6920,6 +6920,24 @@ static void imsm_process_update(struct supertype *st,
mpb = super->anchor;

switch (type) {
+ case update_general_migration_checkpoint: {
+ struct intel_dev *id;
+ struct imsm_update_general_migration_checkpoint *u =
+ (void *)update->buf;
+
+ dprintf("imsm: process_update() "
+ "for update_general_migration_checkpoint called\n");
+
+ /* find device under general migration */
+ for (id = super->devlist ; id; id = id->next) {
+ if (is_gen_migration(id->dev)) {
+ id->dev->vol.curr_migr_unit =
+ __cpu_to_le32(u->curr_migr_unit);
+ super->updates_pending++;
+ }
+ }
+ break;
+ }
case update_takeover: {
struct imsm_update_takeover *u = (void *)update->buf;
if (apply_takeover_update(u, super, &update->space_list)) {
@@ -7257,6 +7275,10 @@ static void imsm_prepare_update(struct supertype *st,
size_t len = 0;

switch (type) {
+ case update_general_migration_checkpoint:
+ dprintf("imsm: prepare_update() "
+ "for update_general_migration_checkpoint called\n");
+ break;
case update_takeover: {
struct imsm_update_takeover *u = (void *)update->buf;
if (u->direction == R0_TO_R10) {
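
The process_update() hunk in this patch walks the device list and stamps the received checkpoint into every device that is under general migration, counting one pending update per device touched. A rough sketch of that traversal is given below. The dev_node structure and apply_checkpoint() are hypothetical, and the sketch leaves out the little-endian conversion (__cpu_to_le32) that the real code performs before storing the value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device list entry. */
struct dev_node {
	int in_general_migration;	/* stands in for is_gen_migration() */
	uint32_t curr_migr_unit;
	struct dev_node *next;
};

/*
 * Store the checkpoint in every device under general migration.
 * Returns how many devices were updated, mirroring updates_pending++.
 */
int apply_checkpoint(struct dev_node *devlist, uint32_t unit)
{
	int pending = 0;
	struct dev_node *d;

	for (d = devlist; d != NULL; d = d->next) {
		if (d->in_general_migration) {
			d->curr_migr_unit = unit;
			pending++;
		}
	}
	return pending;
}
```

Devices that are not migrating keep their previous value, so a stray update cannot disturb them.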


[PATCH 19/22] FIX: Enable metadata updates for raid0

am 02.06.2011 16:50:49 von krzysztof.wojcik

From: Adam Kwolek

When raid0 is taken over to a degraded raid4, metadata updates have to be
applied via mdmon (raid4 has to be monitored).
This is not possible because the update_tail pointer in the supertype
structure is not initialized.

Signed-off-by: Adam Kwolek
---
Grow.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/Grow.c b/Grow.c
index 11b2214..25be587 100644
--- a/Grow.c
+++ b/Grow.c
@@ -1847,6 +1847,9 @@ static int reshape_array(char *container, int fd, char *devname,
if (!mdmon_running(st->container_dev))
start_mdmon(st->container_dev);
ping_monitor(container);
+ if (mdmon_running(st->container_dev) &&
+ st->update_tail == NULL)
+ st->update_tail = &st->updates;
}
}
/* ->reshape_super might have chosen some spares from the
@@ -2264,6 +2267,8 @@ started:
": %s: could not set level "
"to %s\n", devname, c);
}
+ if (info->new_level == 0)
+ st->update_tail = NULL;
}
out:
if (forked)
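
The assignment st->update_tail = &st->updates in this patch reads like the arming of a tail-pointer queue: while the tail pointer is NULL, updates are applied directly, and once it points at the list head, updates are appended for mdmon to pick up. The sketch below illustrates that idiom only. The structure and function names are hypothetical, and it assumes that appending works the usual way for a tail-pointer list, which this thread does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified update record and owner. */
struct update {
	int id;
	struct update *next;
};

struct update_queue {
	struct update *updates;		/* head of the pending list */
	struct update **update_tail;	/* NULL means "queueing disabled" */
};

/* Arm the queue if it is not armed yet (as done after mdmon starts). */
void enable_queue(struct update_queue *q)
{
	if (q->update_tail == NULL)
		q->update_tail = &q->updates;
}

/* Disarm the queue (as done when the array goes back to raid0). */
void disable_queue(struct update_queue *q)
{
	q->update_tail = NULL;
}

/* Append in O(1); returns 0 if queueing is disabled, 1 otherwise. */
int queue_update(struct update_queue *q, struct update *u)
{
	if (q->update_tail == NULL)
		return 0;
	u->next = NULL;
	*q->update_tail = u;
	q->update_tail = &u->next;
	return 1;
}
```

With this shape, a caller that finds the queue disabled has to apply or free the update itself, which matches the else branch in write_imsm_migr_rec().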


[PATCH 20/22] Do not use backup file for external metadata

am 02.06.2011 16:50:58 von krzysztof.wojcik

From: Adam Kwolek

When the external metadata handler provides both the manage_reshape()
and recover_backup() functions in its superswitch, a backup file is not
required and can be omitted. Metadata-specific mechanisms are used for
backup purposes instead.

Signed-off-by: Adam Kwolek
Signed-off-by: Krzysztof Wojcik
---
Grow.c | 40 ++++++++++++++++++++++------------------
1 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/Grow.c b/Grow.c
index 25be587..8e67be2 100644
--- a/Grow.c
+++ b/Grow.c
@@ -2038,25 +2038,29 @@ started:
if (d < 0) {
goto release;
}
- if (backup_file == NULL) {
- if (reshape.after.data_disks <= reshape.before.data_disks) {
- fprintf(stderr,
- Name ": %s: Cannot grow - need backup-file\n",
- devname);
- goto release;
- } else if (sra->array.spare_disks == 0) {
- fprintf(stderr, Name ": %s: Cannot grow - need a spare or "
- "backup-file to backup critical section\n",
- devname);
- goto release;
- }
- } else {
- if (!reshape_open_backup_file(backup_file, fd, devname,
- (signed)blocks,
- fdlist+d, offsets+d, restart)) {
- goto release;
+ if ((st->ss->manage_reshape == NULL) ||
+ (st->ss->recover_backup == NULL)) {
+ if (backup_file == NULL) {
+ if (reshape.after.data_disks <=
+ reshape.before.data_disks) {
+ fprintf(stderr, Name ": %s: Cannot grow - "
+ "need backup-file\n", devname);
+ goto release;
+ } else if (sra->array.spare_disks == 0) {
+ fprintf(stderr, Name ": %s: Cannot grow - "
+ "need a spare or backup-file to backup "
+ "critical section\n", devname);
+ goto release;
+ }
+ } else {
+ if (!reshape_open_backup_file(backup_file, fd, devname,
+ (signed)blocks,
+ fdlist+d, offsets+d,
+ restart)) {
+ goto release;
+ }
+ d++;
}
- d++;
}

/* lastly, check that the internal stripe cache is
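
The decision this hunk encodes can be summarised as a small pure function: a handler with both manage_reshape() and recover_backup() bypasses the backup-file checks entirely, and only the remaining handlers go through the old logic. The sketch below restates that ordering. The grow_request structure, the enum values and choose_backup() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum backup_decision {
	BACKUP_NATIVE,			/* metadata handler does its own backup */
	BACKUP_FILE,			/* open and use the given backup file */
	BACKUP_NONE,			/* proceed without a backup file */
	BACKUP_ERR_NEED_FILE,		/* "Cannot grow - need backup-file" */
	BACKUP_ERR_NEED_SPARE_OR_FILE	/* "need a spare or backup-file" */
};

/* Hypothetical summary of the inputs the hunk looks at. */
struct grow_request {
	bool has_manage_reshape;
	bool has_recover_backup;
	bool backup_file_given;
	int data_disks_before;
	int data_disks_after;
	int spare_disks;
};

enum backup_decision choose_backup(const struct grow_request *r)
{
	assert(r != NULL);

	/* Both callbacks present: the backup-file checks are skipped. */
	if (r->has_manage_reshape && r->has_recover_backup)
		return BACKUP_NATIVE;
	if (r->backup_file_given)
		return BACKUP_FILE;
	if (r->data_disks_after <= r->data_disks_before)
		return BACKUP_ERR_NEED_FILE;
	if (r->spare_disks == 0)
		return BACKUP_ERR_NEED_SPARE_OR_FILE;
	return BACKUP_NONE;
}
```

One consequence visible in the diff is that a backup file passed on the command line is not opened when the handler supports native backup.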


[PATCH 21/22] imsm: Remove user warning before reshape start

am 02.06.2011 16:51:06 von krzysztof.wojcik

From: Adam Kwolek

imsm arrays now support native imsm check-pointing,
so the user warning is no longer required.

Signed-off-by: Adam Kwolek
---
super-intel.c | 31 -------------------------------
1 files changed, 0 insertions(+), 31 deletions(-)

diff --git a/super-intel.c b/super-intel.c
index b6369c6..f615bb1 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -8411,30 +8411,6 @@ int imsm_takeover(struct supertype *st, struct geo_params *geo)
return 0;
}

-static int warn_user_about_risk(void)
-{
- int rv = 0;
-
- fprintf(stderr,
- "\nThis is an experimental feature. Data on the RAID volume(s) "
- "can be lost!!!\n\n"
- "To continue command execution please make sure that\n"
- "the grow process will not be interrupted. Use safe power\n"
- "supply to avoid unexpected system reboot. Make sure that\n"
- "reshaped container is not assembled automatically during\n"
- "system boot.\n"
- "If reshape is interrupted, assemble array manually\n"
- "using e.g. '-Ac' option and up to date mdadm.conf file.\n"
- "Assembly in scan mode is not possible in such case.\n"
- "Growing container with boot array is not possible.\n"
- "If boot array reshape is interrupted, whole file system\n"
- "can be lost.\n\n");
- rv = ask("Do you want to continue? ");
- fprintf(stderr, "\n");
-
- return rv;
-}
-
static int imsm_reshape_super(struct supertype *st, long long size, int level,
int layout, int chunksize, int raid_disks,
int delta_disks, char *backup, char *dev,
@@ -8468,13 +8444,6 @@ static int imsm_reshape_super(struct supertype *st, long long size, int level,
dprintf("imsm: info: Container operation\n");
int old_raid_disks = 0;

- /* this warning will be removed when imsm checkpointing
- * will be implemented, and restoring from check-point
- * operation will be transparent for reboot process
- */
- if (warn_user_about_risk() == 0)
- return ret_val;
-
if (imsm_reshape_is_allowed_on_container(
st, &geo, &old_raid_disks)) {
struct imsm_update_reshape *u = NULL;


[PATCH 22/22] imsm: Unit Tests - remove backup-file during grow command

am 02.06.2011 16:51:14 von krzysztof.wojcik

From: Adam Kwolek

Update the reshape/migration unit tests so that they do not use a backup file.
Native imsm check-pointing is used (internally) instead.

Signed-off-by: Krzysztof Wojcik
---
tests/imsm-grow-template | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/tests/imsm-grow-template b/tests/imsm-grow-template
index 191f056..8022e3a 100644
--- a/tests/imsm-grow-template
+++ b/tests/imsm-grow-template
@@ -14,10 +14,9 @@ function grow_member() {
local offset=$6
local chunk=$7
local array_size=$((comps * size))
- local backup_imsm=/tmp/backup_imsm

rm -f $backup_imsm
- ( set -ex; mdadm --grow $member --chunk=$chunk --level=$level --backup-file=$backup_imsm )
+ ( set -ex; mdadm --grow $member --chunk=$chunk --level=$level )
local status=$?
if [ $negative_test -ne 0 ]; then
if [ $status -eq 0 ]; then
@@ -83,7 +82,7 @@ if [ $migration_test -ne 0 ]; then
fi
else
rm -f $backup_imsm
- ( set -x; mdadm --grow $container --raid-disks=$num_disks --backup-file=$backup_imsm )
+ ( set -x; mdadm --grow $container --raid-disks=$num_disks )
grow_status=$?
if [ $negative_test -ne 0 ]; then
if [ $grow_status -eq 0 ]; then


Re: [PATCH 06/22] FIX: Initialize reshape structure

am 08.06.2011 08:54:09 von NeilBrown

On Thu, 02 Jun 2011 16:49:00 +0200 Krzysztof Wojcik wrote:

> From: Adam Kwolek
>
> The reshape structure can contain random values.
> As a result, analyse_change() can refuse to start a grow
> without a real cause (e.g. a check of an uninitialized new_chunk).
>
> Signed-off-by: Adam Kwolek
> ---
> Grow.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/Grow.c b/Grow.c
> index 7a8ffdb..11b2214 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -1745,6 +1745,7 @@ static int reshape_array(char *container, int fd, char *devname,
> info->component_size = array_size / array.raid_disks;
> }
>
> + memset(&reshape, 0, sizeof(reshape));
> if (info->reshape_active) {
> int new_level = info->new_level;
> info->new_level = UnSet;


This doesn't make any sense to me.

I cannot see how any random numbers in 'reshape' can cause analyse_change to
do the wrong thing, and there is no "new_chunk" in 'reshape' that could be
"uninitialized".

Please explain.

NeilBrown


Re: [PATCH 07/22] imsm: Add wait_for_reshape_imsm() implementation

am 08.06.2011 09:07:03 von NeilBrown

On Thu, 02 Jun 2011 16:49:08 +0200 Krzysztof Wojcik wrote:

> From: Adam Kwolek
>
> After each checkpoint, mdadm should set the new area to reshape and wait
> until md finishes reshaping it. Function wait_for_reshape_imsm() sets
> the new reshape range and waits for job completion.
>
> Signed-off-by: Adam Kwolek
> Signed-off-by: Krzysztof Wojcik
> ---
> super-intel.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 61 insertions(+), 0 deletions(-)
>
> diff --git a/super-intel.c b/super-intel.c
> index 31fae1e..c395a48 100644
> --- a/super-intel.c
> +++ b/super-intel.c
> @@ -8248,6 +8248,67 @@ exit_imsm_reshape_super:
> return ret_val;
> }
>
> +/*******************************************************************************
> + * Function: wait_for_reshape_imsm
> + * Description: Function writes new sync_max value and waits until
> + * reshape process reaches new position
> + * Parameters:
> + * sra : general array info
> + * to_complete : new sync_max position
> + * ndata : number of disks in new array's layout
> + * Returns:
> + * 0 : success,
> + * 1 : there is no reshape in progress,
> + * -1 : fail
> + ******************************************************************************/
> +int wait_for_reshape_imsm(struct mdinfo *sra, unsigned long long to_complete,
> + int ndata)
> +{
> + int fd = sysfs_get_fd(sra, NULL, "reshape_position");
> + unsigned long long completed;
> +
> + struct timeval timeout;
> +
> + if (fd < 0)
> + return 1;
> +
> + sysfs_fd_get_ll(fd, &completed);
> +
> + if (to_complete == 0) {/* reshape till the end of array */
> + sysfs_set_str(sra, NULL, "sync_max", "max");
> + to_complete = MaxSector;
> + } else {
> + if (completed > to_complete)
> + return -1;
> + if (sysfs_set_num(sra, NULL, "sync_max",
> + to_complete / ndata) != 0) {
> + close(fd);
> + return -1;
> + }
> + }
> +
> + timeout.tv_sec = 0;
> + timeout.tv_usec = 500000;

Having a 1/2 second timeout is wrong. You shouldn't need a timeout at all.
If you do, there is a bug somewhere.

I changed this to 30 seconds.

> + do {
> + char action[20];
> + fd_set rfds;
> + FD_ZERO(&rfds);
> + FD_SET(fd, &rfds);
> + select(fd+1, NULL, NULL, &rfds, &timeout);
> + if (sysfs_fd_get_ll(fd, &completed) < 0) {
> + close(fd);
> + return 1;
> + }
> + if (sysfs_get_str(sra, NULL, "sync_action",
> + action, 20) > 0 &&
> + strncmp(action, "reshape", 7) != 0)
> + continue;

And if 'sync_action' is not 'reshape' any more then something must have
aborted, and just 'continue'ing is wrong. I have changed this to 'break', but
maybe you want to return an error.

NeilBrown


> + } while (completed < to_complete);
> + close(fd);
> + return 0;
> +
> +}
> +
> static int imsm_manage_reshape(
> int afd, struct mdinfo *sra, struct reshape *reshape,
> struct supertype *st, unsigned long stripes,
>
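
The two review comments above (a longer timeout, and not looping forever once sync_action leaves "reshape") can be combined into a wait loop along the following lines. This is a sketch under stated assumptions, not the code that was applied: wait_for_position() and the callback types are hypothetical stand-ins for the sysfs helpers, the fake_* functions exist only so that the sketch can be exercised without an md array, and the early-stop case returns -1 as one possible reading of "maybe you want to return an error". The timeout is re-armed on every pass because select() on Linux may modify it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical callbacks standing in for the sysfs read helpers. */
typedef int (*read_pos_fn)(void *ctx, unsigned long long *pos);
typedef int (*read_action_fn)(void *ctx, char *buf, size_t len);

/* 0: target reached, 1: position unreadable, -1: reshape stopped early */
int wait_for_position(int fd, unsigned long long target, long timeout_sec,
		      read_pos_fn read_pos, read_action_fn read_action,
		      void *ctx)
{
	unsigned long long completed = 0;

	for (;;) {
		char action[20];
		fd_set efds;
		struct timeval timeout;

		/* Re-arm each pass: select() may overwrite the timeout. */
		timeout.tv_sec = timeout_sec;
		timeout.tv_usec = 0;
		FD_ZERO(&efds);
		FD_SET(fd, &efds);
		select(fd + 1, NULL, NULL, &efds, &timeout);

		if (read_pos(ctx, &completed) < 0)
			return 1;
		if (completed >= target)
			return 0;
		/* Not reshaping any more and target not reached: give up. */
		if (read_action(ctx, action, sizeof(action)) > 0 &&
		    strncmp(action, "reshape", 7) != 0)
			return -1;
	}
}

/* Fake array state, for demonstration only. */
struct fake_array {
	unsigned long long pos;
	unsigned long long step;
	const char *action;
};

int fake_read_pos(void *ctx, unsigned long long *pos)
{
	struct fake_array *a = ctx;

	a->pos += a->step;	/* pretend md advanced since the last read */
	*pos = a->pos;
	return 0;
}

int fake_read_action(void *ctx, char *buf, size_t len)
{
	struct fake_array *a = ctx;

	strncpy(buf, a->action, len - 1);
	buf[len - 1] = '\0';
	return (int)strlen(buf);
}
```

A caller that reshapes to the end of the array would need to treat the final pass differently, since sync_action is expected to leave "reshape" once the whole job is done.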

Re: [PATCH 00/22] IMSM checkpointing implementation

am 08.06.2011 09:23:49 von NeilBrown

On Thu, 02 Jun 2011 16:48:08 +0200 Krzysztof Wojcik wrote:

> To secure the reshape process, IMSM uses a special disk area outside the
> metadata to back up the area being reshaped. If the array area being
> reshaped requires a backup, the bunch of array stripes prepared for reshape
> is stored in the Migration Copy Area. If the reshape is interrupted, the
> Option ROM (during restart) or mdadm (during reshape restart, when no reboot
> occurs) restores the Migration Copy Area to the destination array.
> The reshape can then continue from a stable array state.
>
> The following series implements IMSM checkpointing procedure.

I have applied most of these patches - some with some minor fixes. The major
changes I have already mentioned.

I will wait for your responses to those changes, and the other bug fixes you
mentioned before considering a release of 3.2.2, but I would like to make
that release in the next week or two.

Thanks,
NeilBrown

>
> ---
>
> Adam Kwolek, Krzysztof Wojcik (22):
> imsm: Add migration record to intel_super
> Support restore_stripes() from the given buffer
> Define dummy functions to mdmon.c
> imsm: Add support for copy area and backup operations
> imsm: check migration compatibility
> FIX: Initialize reshape structure
> imsm: Add wait_for_reshape_imsm() implementation
> imsm: Implement imsm_manage_reshape(), reshape workhorse
> imsm: Check if array degradation has been changed
> imsm: Clear migration record when no migration in progress
> imsm: Add information about migration record to mdadm '-E' option
> imsm: update blocks_per_migr_unit() to support migration record
> Add reshape restart support for external metadata
> imsm: Implement recover_backup_imsm() for imsm metadata
> imsm: Disable checkpoint updating by mdmon for general migration
> imsm: Add metadata update type for general migration check-pointing
> imsm: Prepare checkpoint update for general migration
> imsm: Apply checkpoint metadata update for general migration
> FIX: Enable metadata updates for raid0
> Do not use backup file for external metadata
> imsm: Remove user warning before reshape start
> imsm: Unit Tests - remove backup-file during grow command
>
>
> Assemble.c | 10
> Grow.c | 50 +-
> mdadm.h | 7
> mdmon.c | 23 +
> restripe.c | 101 +++-
> super-intel.c | 1241 ++++++++++++++++++++++++++++++++++++++++++++--
> tests/imsm-grow-template | 5
> 7 files changed, 1322 insertions(+), 115 deletions(-)
>


RE: [PATCH 00/22] IMSM checkpointing implementation

am 08.06.2011 09:34:35 von krzysztof.wojcik

Thanks Neil!

We will send a set of bug fixes today.

Regards
Krzysztof

> -----Original Message-----
> From: NeilBrown [mailto:neilb@suse.de]
> Sent: Wednesday, June 08, 2011 9:24 AM
> To: Wojcik, Krzysztof
> Cc: linux-raid@vger.kernel.org; Neubauer, Wojciech; Kwolek, Adam;
> Williams, Dan J; Ciechanowski, Ed
> Subject: Re: [PATCH 00/22] IMSM checkpointing implementation
>
> On Thu, 02 Jun 2011 16:48:08 +0200 Krzysztof Wojcik wrote:
>
> > To secure the reshape process, IMSM uses a special disk area outside the
> > metadata to back up the area being reshaped. If the array area being
> > reshaped requires a backup, the bunch of array stripes prepared for
> > reshape is stored in the Migration Copy Area. If the reshape is
> > interrupted, the Option ROM (during restart) or mdadm (during reshape
> > restart, when no reboot occurs) restores the Migration Copy Area to the
> > destination array.
> > The reshape can then continue from a stable array state.
> >
> > The following series implements IMSM checkpointing procedure.
>
> I have applied most of these patches - some with some minor fixes. The major
> changes I have already mentioned.
>
> I will wait for your responses to those changes, and the other bug fixes you
> mentioned before considering a release of 3.2.2, but I would like to make
> that release in the next week or two.
>
> Thanks,
> NeilBrown
>
> >
> > ---
> >
> > Adam Kwolek, Krzysztof Wojcik (22):
> > imsm: Add migration record to intel_super
> > Support restore_stripes() from the given buffer
> > Define dummy functions to mdmon.c
> > imsm: Add support for copy area and backup operations
> > imsm: check migration compatibility
> > FIX: Initialize reshape structure
> > imsm: Add wait_for_reshape_imsm() implementation
> > imsm: Implement imsm_manage_reshape(), reshape workhorse
> > imsm: Check if array degradation has been changed
> > imsm: Clear migration record when no migration in progress
> > imsm: Add information about migration record to mdadm '-E' option
> > imsm: update blocks_per_migr_unit() to support migration record
> > Add reshape restart support for external metadata
> > imsm: Implement recover_backup_imsm() for imsm metadata
> > imsm: Disable checkpoint updating by mdmon for general migration
> > imsm: Add metadata update type for general migration check-pointing
> > imsm: Prepare checkpoint update for general migration
> > imsm: Apply checkpoint metadata update for general migration
> > FIX: Enable metadata updates for raid0
> > Do not use backup file for external metadata
> > imsm: Remove user warning before reshape start
> > imsm: Unit Tests - remove backup-file during grow command
> >
> >
> > Assemble.c | 10
> > Grow.c | 50 +-
> > mdadm.h | 7
> > mdmon.c | 23 +
> > restripe.c | 101 +++-
> > super-intel.c | 1241 ++++++++++++++++++++++++++++++++++++++++++++--
> > tests/imsm-grow-template | 5
> > 7 files changed, 1322 insertions(+), 115 deletions(-)
> >
