[PATCH 0/8] Reshape restart after filesystem pivot

on 27.09.2011 14:04:30 by adam.kwolek

This series of 8 patches implements manual reshape restart/continuation
in the following scenario:

1. Condition:
During system boot, mdadm finds an array under reshape that should be
assembled. This happens while the initramfs is mounted
(before the file system pivot).

2. Scenario:
a) The assembly procedure assembles the array and runs the reshape
continuation procedure (as usual). The reshape continues, but during the
file system pivot mdadm loses its file system context and can no longer
control the reshape process using check-pointing.
To avoid this, the reshape procedure is informed via the '--freeze-reshape'
option that the system is currently in the initrd phase (before the file
system pivot). mdadm then only restores the critical section and prepares
the array for later reshape continuation (it sets the reshape position read
from the metadata checkpoint).
At this point mdadm finishes its work. The array is now fully functional and accessible.
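
The decision described in a) can be sketched as a tiny model. This is only
an illustration: the FREEZE_RESHAPE_* values mirror the defines added in
patch 1/8, while the enum and the function name are hypothetical and do not
exist in mdadm.

```c
/* Values mirror the FREEZE_RESHAPE_* defines introduced in patch 1/8. */
#define FREEZE_RESHAPE_NONE     0
#define FREEZE_RESHAPE_ASSEMBLY 1

/* Hypothetical names: what the assembly path does with a reshaping array. */
enum reshape_action {
	RESHAPE_MONITOR,	/* keep running and checkpoint the reshape */
	RESHAPE_PARK		/* critical section restored, stop here */
};

/* Decide how assembly should treat an array found under reshape. */
static enum reshape_action reshape_action_for(int freeze_reshape)
{
	if (freeze_reshape == FREEZE_RESHAPE_ASSEMBLY)
		return RESHAPE_PARK;
	return RESHAPE_MONITOR;
}
```

In this model an initrd script would end up on the RESHAPE_PARK path, and a
normal assembly after boot on the RESHAPE_MONITOR path.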

0001-Do-not-continue-reshape-during-initrd-phase.patch

b) After system boot, the user can manually invoke reshape continuation.
A new mdadm option, '--continue', was added to the grow command for this
purpose. For reshape continuation mdadm uses parameters read from the
metadata. The command line looks as follows:

mdadm -G --continue device_name [--backup-file=file_name]

where:
device_name : device on which the reshape should be continued,
e.g. /dev/md/container_name or /dev/md/raid_name
backup-file : optional parameter, required when a backup file
was used previously for the reshape
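
As an illustration of the command-line shape above, here is a minimal,
standalone sketch that parses the same arguments with getopt_long(). It is
not the parser used in mdadm.c (which dispatches through its own O() macro);
the struct, the function name and the short codes 'c' and 'b' are
hypothetical. It also assumes GNU getopt_long(), which moves non-option
arguments to the end so that the backup-file option may follow the device.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical container for the options discussed above. */
struct grow_opts {
	int continue_reshape;		/* set when --continue is given */
	const char *backup_file;	/* NULL unless --backup-file=... */
	const char *device;		/* first non-option argument */
};

/* Parse "-G --continue device [--backup-file=file]"; 0 on success. */
static int parse_grow_opts(int argc, char **argv, struct grow_opts *o)
{
	static const struct option longopts[] = {
		{"continue",    no_argument,       NULL, 'c'},
		{"backup-file", required_argument, NULL, 'b'},
		{NULL, 0, NULL, 0}
	};
	int c;

	o->continue_reshape = 0;
	o->backup_file = NULL;
	o->device = NULL;
	optind = 1;	/* start scanning at the first argument */

	while ((c = getopt_long(argc, argv, "G", longopts, NULL)) != -1) {
		switch (c) {
		case 'G': break;			/* grow mode */
		case 'c': o->continue_reshape = 1; break;
		case 'b': o->backup_file = optarg; break;
		default:  return -1;			/* unknown option */
		}
	}
	if (optind < argc)
		o->device = argv[optind];
	/* --continue needs a device to act on */
	return (o->continue_reshape && o->device) ? 0 : -1;
}
```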


For external metadata mdadm takes care of metadata compatibility,
e.g. a container operation can be continued on an array device and vice versa.

0002-Add-continue-option-to-grow-command.patch
0003-Do-not-restart-reshape-if-it-is-started-already.patch
0004-Set-correct-reshape-restart-position.patch
0005-Move-code-to-get_data_disks-function.patch
0006-Verify-reshape-restart-position.patch

3. man update:
The last 2 patches update mdadm's man page for the '--freeze-reshape' and
'--continue' options.

0007-Manual-update-for-continue-option.patch
0008-Manual-update-for-continue-option.patch

Note: This series requires the patch that fixes the core dump thrown by the restore_backup() function:
0001-FIX-restore_backup-throws-core-dump.patch

BR
Adam


---

Adam Kwolek (8):
Manual update for --continue option
Manual update for --continue option
Verify reshape restart position
Move code to get_data_disks() function
Set correct reshape restart position
Do not restart reshape if it is started already
Add continue option to grow command
Do not continue reshape during initrd phase


Assemble.c | 12 ++-
Grow.c | 236 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
Incremental.c | 16 ++--
ReadMe.c | 2
mdadm.8.in | 40 ++++++++++
mdadm.c | 47 +++++++++--
mdadm.h | 25 +++++-
util.c | 10 ++
8 files changed, 341 insertions(+), 47 deletions(-)

--
Signature
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 1/8] Do not continue reshape during initrd phase

on 27.09.2011 14:04:39 by adam.kwolek

Continuing a reshape during the initrd phase causes the file system context
to be lost. This removes the ability to control the reshape using checkpoints.

To avoid this, during the initrd phase assemble has to be executed with the
'--freeze-reshape' option. With this option mdadm only restores the reshape
critical section.

The reshape can be continued later, after the system has fully booted.

Signed-off-by: Adam Kwolek
---

Assemble.c | 12 +++++++-----
Grow.c | 52 ++++++++++++++++++++++++++++++++++++++--------------
Incremental.c | 16 ++++++++++------
ReadMe.c | 1 +
mdadm.c | 33 ++++++++++++++++++++++++---------
mdadm.h | 18 ++++++++++++++----
6 files changed, 94 insertions(+), 38 deletions(-)

diff --git a/Assemble.c b/Assemble.c
index c6aad20..afca38e 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -138,7 +138,7 @@ int Assemble(struct supertype *st, char *mddev,
char *backup_file, int invalid_backup,
int readonly, int runstop,
char *update, char *homehost, int require_homehost,
- int verbose, int force)
+ int verbose, int force, int freeze_reshape)
{
/*
* The task of Assemble is to find a collection of
@@ -697,7 +697,7 @@ int Assemble(struct supertype *st, char *mddev,
int err;
err = assemble_container_content(st, mdfd, content, runstop,
chosen_name, verbose,
- backup_file);
+ backup_file, freeze_reshape);
close(mdfd);
return err;
}
@@ -1344,7 +1344,8 @@ int Assemble(struct supertype *st, char *mddev,
#ifndef MDASSEMBLE
if (content->reshape_active &&
content->delta_disks <= 0)
- rv = Grow_continue(mdfd, st, content, backup_file);
+ rv = Grow_continue(mdfd, st, content,
+ backup_file, freeze_reshape);
else
#endif
rv = ioctl(mdfd, RUN_ARRAY, NULL);
@@ -1511,7 +1512,7 @@ int Assemble(struct supertype *st, char *mddev,
int assemble_container_content(struct supertype *st, int mdfd,
struct mdinfo *content, int runstop,
char *chosen_name, int verbose,
- char *backup_file)
+ char *backup_file, int freeze_reshape)
{
struct mdinfo *dev, *sra;
int working = 0, preexist = 0;
@@ -1560,7 +1561,8 @@ int assemble_container_content(struct supertype *st, int mdfd,
spare, backup_file, verbose) == 1)
return 1;

- err = Grow_continue(mdfd, st, content, backup_file);
+ err = Grow_continue(mdfd, st, content, backup_file,
+ freeze_reshape);
} else switch(content->array.level) {
case LEVEL_LINEAR:
case LEVEL_MULTIPATH:
diff --git a/Grow.c b/Grow.c
index 4a25165..4509488 100644
--- a/Grow.c
+++ b/Grow.c
@@ -696,10 +696,16 @@ static int subarray_set_num(char *container, struct mdinfo *sra, char *name, int
return rc;
}

-int start_reshape(struct mdinfo *sra, int already_running)
+int start_reshape(struct mdinfo *sra, int already_running, int freeze_reshape)
{
int err;
- sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
+
+ /* do not block array as we not continue reshape this time
+ */
+ if (freeze_reshape == FREEZE_RESHAPE_NONE)
+ sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
+ else
+ sysfs_set_num(sra, NULL, "suspend_lo", 0);
err = sysfs_set_num(sra, NULL, "suspend_hi", 0);
err = err ?: sysfs_set_num(sra, NULL, "suspend_lo", 0);
if (!already_running)
@@ -1344,13 +1350,13 @@ static int reshape_array(char *container, int fd, char *devname,
struct supertype *st, struct mdinfo *info,
int force, struct mddev_dev *devlist,
char *backup_file, int quiet, int forked,
- int restart);
+ int restart, int freeze_reshape);
static int reshape_container(char *container, char *devname,
struct supertype *st,
struct mdinfo *info,
int force,
char *backup_file,
- int quiet, int restart);
+ int quiet, int restart, int freeze_reshape);

int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
long long size,
@@ -1761,7 +1767,7 @@ int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
* performed at the level of the container
*/
rv = reshape_container(container, devname, st, &info,
- force, backup_file, quiet, 0);
+ force, backup_file, quiet, 0, 0);
frozen = 0;
} else {
/* get spare devices from external metadata
@@ -1789,7 +1795,7 @@ int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
}
sync_metadata(st);
rv = reshape_array(container, fd, devname, st, &info, force,
- devlist, backup_file, quiet, 0, 0);
+ devlist, backup_file, quiet, 0, 0, 0);
frozen = 0;
}
release:
@@ -1802,7 +1808,7 @@ static int reshape_array(char *container, int fd, char *devname,
struct supertype *st, struct mdinfo *info,
int force, struct mddev_dev *devlist,
char *backup_file, int quiet, int forked,
- int restart)
+ int restart, int freeze_reshape)
{
struct reshape reshape;
int spares_needed;
@@ -2241,7 +2247,7 @@ started:
}
}

- err = start_reshape(sra, restart);
+ err = start_reshape(sra, restart, freeze_reshape);
if (err) {
fprintf(stderr,
Name ": Cannot %s reshape for %s\n",
@@ -2251,6 +2257,15 @@ started:
}
if (restart)
sysfs_set_str(sra, NULL, "array_state", "active");
+ if (freeze_reshape == FREEZE_RESHAPE_ASSEMBLY) {
+ fprintf(stderr, Name ": Reshape has to be continued from"
+ " location %llu when root filesystem will be mounted\n",
+ sra->reshape_progress);
+ free(fdlist);
+ free(offsets);
+ sysfs_free(sra);
+ return 1;
+ }

/* Now we just need to kick off the reshape and watch, while
* handling backups of the data...
@@ -2389,7 +2404,7 @@ int reshape_container(char *container, char *devname,
struct mdinfo *info,
int force,
char *backup_file,
- int quiet, int restart)
+ int quiet, int restart, int freeze_reshape)
{
struct mdinfo *cc = NULL;
int rv = restart;
@@ -2418,7 +2433,9 @@ int reshape_container(char *container, char *devname,
unfreeze(st);
return 1;
default: /* parent */
- printf(Name ": multi-array reshape continues in background\n");
+ if (freeze_reshape == FREEZE_RESHAPE_NONE)
+ printf(Name ": multi-array reshape continues "
+ "in background\n");
return 0;
case 0: /* child */
break;
@@ -2473,8 +2490,15 @@ int reshape_container(char *container, char *devname,

rv = reshape_array(container, fd, adev, st,
content, force, NULL,
- backup_file, quiet, 1, restart);
+ backup_file, quiet, 1, restart,
+ freeze_reshape);
close(fd);
+
+ if (freeze_reshape) {
+ sysfs_free(cc);
+ exit(0);
+ }
+
restart = 0;
if (rv)
break;
@@ -3613,7 +3637,7 @@ int Grow_restart(struct supertype *st, struct mdinfo *info, int *fdlist, int cnt
}

int Grow_continue(int mdfd, struct supertype *st, struct mdinfo *info,
- char *backup_file)
+ char *backup_file, int freeze_reshape)
{
char buf[40];
char *container = NULL;
@@ -3640,9 +3664,9 @@ int Grow_continue(int mdfd, struct supertype *st, struct mdinfo *info,
close(cfd);
return reshape_container(container, NULL,
st, info, 0, backup_file,
- 0, 1);
+ 0, 1, freeze_reshape);
}
}
return reshape_array(container, mdfd, "array", st, info, 1,
- NULL, backup_file, 0, 0, 1);
+ NULL, backup_file, 0, 0, 1, freeze_reshape);
}
diff --git a/Incremental.c b/Incremental.c
index 791ad85..571d45d 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -44,7 +44,8 @@ static int try_spare(char *devname, int *dfdp, struct dev_policy *pol,

static int Incremental_container(struct supertype *st, char *devname,
char *homehost,
- int verbose, int runstop, int autof);
+ int verbose, int runstop, int autof,
+ int freeze_reshape);

static struct mddev_ident *search_mdstat(struct supertype *st,
struct mdinfo *info,
@@ -53,7 +54,7 @@ static struct mddev_ident *search_mdstat(struct supertype *st,

int Incremental(char *devname, int verbose, int runstop,
struct supertype *st, char *homehost, int require_homehost,
- int autof)
+ int autof, int freeze_reshape)
{
/* Add this device to an array, creating the array if necessary
* and starting the array if sensible or - if runstop>0 - if possible.
@@ -140,7 +141,8 @@ int Incremental(char *devname, int verbose, int runstop,
close(dfd);
if (!rv && st->ss->container_content)
return Incremental_container(st, devname, homehost,
- verbose, runstop, autof);
+ verbose, runstop, autof,
+ freeze_reshape);

fprintf(stderr, Name ": %s is not part of an md array.\n",
devname);
@@ -450,7 +452,8 @@ int Incremental(char *devname, int verbose, int runstop,
close(mdfd);
sysfs_free(sra);
rv = Incremental(chosen_name, verbose, runstop,
- NULL, homehost, require_homehost, autof);
+ NULL, homehost, require_homehost, autof,
+ freeze_reshape);
if (rv == 1)
/* Don't fail the whole -I if a subarray didn't
* have enough devices to start yet
@@ -1416,7 +1419,7 @@ static char *container2devname(char *devname)

static int Incremental_container(struct supertype *st, char *devname,
char *homehost, int verbose,
- int runstop, int autof)
+ int runstop, int autof, int freeze_reshape)
{
/* Collect the contents of this container and for each
* array, choose a device name and assemble the array.
@@ -1551,7 +1554,8 @@ static int Incremental_container(struct supertype *st, char *devname,
}

assemble_container_content(st, mdfd, ra, runstop,
- chosen_name, verbose, NULL);
+ chosen_name, verbose, NULL,
+ freeze_reshape);
close(mdfd);
}

diff --git a/ReadMe.c b/ReadMe.c
index b658841..89dd7af 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -153,6 +153,7 @@ struct option long_options[] = {
{"scan", 0, 0, 's'},
{"force", 0, 0, Force},
{"update", 1, 0, 'U'},
+ {"freeze-reshape", 0, 0, FreezeReshape},

/* Management */
{"add", 0, 0, Add},
diff --git a/mdadm.c b/mdadm.c
index 1533510..18ca2ee 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -112,6 +112,8 @@ int main(int argc, char *argv[])

int mdfd = -1;

+ int freeze_reshape = FREEZE_RESHAPE_NONE;
+
srandom(time(0) ^ getpid());

ident.uuid_set=0;
@@ -612,8 +614,12 @@ int main(int argc, char *argv[])
case O(MANAGE,Force): /* add device which is too large */
force=1;
continue;
-
/* now for the Assemble options */
+ case O(ASSEMBLE, FreezeReshape): /* Freeze reshape during
+ * initrd phase */
+ case O(INCREMENTAL, FreezeReshape):
+ freeze_reshape = FREEZE_RESHAPE_ASSEMBLY;
+ continue;
case O(CREATE,'u'): /* uuid of array */
case O(ASSEMBLE,'u'): /* uuid of array */
if (ident.uuid_set) {
@@ -1228,14 +1234,16 @@ int main(int argc, char *argv[])
NULL, backup_file, invalid_backup,
readonly, runstop, update,
homehost, require_homehost,
- verbose-quiet, force);
+ verbose-quiet, force,
+ freeze_reshape);
}
} else if (!scan)
rv = Assemble(ss, devlist->devname, &ident,
devlist->next, backup_file, invalid_backup,
readonly, runstop, update,
homehost, require_homehost,
- verbose-quiet, force);
+ verbose-quiet, force,
+ freeze_reshape);
else if (devs_found>0) {
if (update && devs_found > 1) {
fprintf(stderr, Name ": can only update a single array at a time\n");
@@ -1259,7 +1267,8 @@ int main(int argc, char *argv[])
NULL, backup_file, invalid_backup,
readonly, runstop, update,
homehost, require_homehost,
- verbose-quiet, force);
+ verbose-quiet, force,
+ freeze_reshape);
}
} else {
struct mddev_ident *a, *array_list = conf_get_ident(NULL);
@@ -1300,7 +1309,8 @@ int main(int argc, char *argv[])
NULL, NULL, 0,
readonly, runstop, NULL,
homehost, require_homehost,
- verbose-quiet, force);
+ verbose-quiet, force,
+ freeze_reshape);
if (r == 0) {
a->assembled = 1;
successes++;
@@ -1325,9 +1335,13 @@ int main(int argc, char *argv[])
rv2 = Assemble(ss, NULL,
&ident,
devlist, NULL, 0,
- readonly, runstop, NULL,
- homehost, require_homehost,
- verbose-quiet, force);
+ readonly,
+ runstop, NULL,
+ homehost,
+ require_homehost,
+ verbose-quiet,
+ force,
+ freeze_reshape);
if (rv2==0) {
cnt++;
acnt++;
@@ -1681,7 +1695,8 @@ int main(int argc, char *argv[])
else
rv = Incremental(devlist->devname, verbose-quiet,
runstop, ss, homehost,
- require_homehost, autof);
+ require_homehost, autof,
+ freeze_reshape);
break;
case AUTODETECT:
autodetect();
diff --git a/mdadm.h b/mdadm.h
index 8dd37d9..073deb9 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -313,6 +313,7 @@ enum special_options {
RebuildMapOpt,
InvalidBackup,
UdevRules,
+ FreezeReshape,
};

/* structures read from config file */
@@ -1030,7 +1031,16 @@ extern int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
extern int Grow_restart(struct supertype *st, struct mdinfo *info,
int *fdlist, int cnt, char *backup_file, int verbose);
extern int Grow_continue(int mdfd, struct supertype *st,
- struct mdinfo *info, char *backup_file);
+ struct mdinfo *info, char *backup_file,
+ int freeze_reshape);
+
+/* define stages for freeze assembly feature
+ * FREEZE_RESHAPE_NONE : disabled
+ * FREEZE_RESHAPE_ASSEMBLY : assemby phase
+ */
+#define FREEZE_RESHAPE_NONE 0
+#define FREEZE_RESHAPE_ASSEMBLY 1
+
extern int restore_backup(struct supertype *st,
struct mdinfo *content,
int working_disks,
@@ -1044,7 +1054,7 @@ extern int Assemble(struct supertype *st, char *mddev,
char *backup_file, int invalid_backup,
int readonly, int runstop,
char *update, char *homehost, int require_homehost,
- int verbose, int force);
+ int verbose, int force, int freeze_reshape);

extern int Build(char *mddev, int chunk, int level, int layout,
int raiddisks, struct mddev_dev *devlist, int assume_clean,
@@ -1078,7 +1088,7 @@ extern int WaitClean(char *dev, int sock, int verbose);

extern int Incremental(char *devname, int verbose, int runstop,
struct supertype *st, char *homehost, int require_homehost,
- int autof);
+ int autof, int freeze_reshape);
extern void RebuildMap(void);
extern int IncrementalScan(int verbose);
extern int IncrementalRemove(char *devname, char *path, int verbose);
@@ -1157,7 +1167,7 @@ extern void append_metadata_update(struct supertype *st, void *buf, int len);
extern int assemble_container_content(struct supertype *st, int mdfd,
struct mdinfo *content, int runstop,
char *chosen_name, int verbose,
- char *backup_file);
+ char *backup_file, int freeze_reshape);
extern struct mdinfo *container_choose_spares(struct supertype *st,
unsigned long long min_size,
struct domainlist *domlist,


[PATCH 2/8] Add continue option to grow command

on 27.09.2011 14:04:48 by adam.kwolek

To allow reshape continuation, the '--continue' option is added
to the grow command.
The function executed in the grow-continue case doesn't require
information about reshape geometry. All required information is read
from metadata.
For external metadata, a reshape can be run for a monitored array/container
only. If the array/container is not monitored, mdmon is started for it.
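
For a container, the new function has to find the member array that is
under reshape before it can continue. A cut-down sketch of that search is
given here for orientation only: the struct is a hypothetical stand-in for
mdadm's struct mdinfo, and the is_running field replaces the mdstat lookup
done by the real code.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for mdadm's struct mdinfo. */
struct subarray {
	int reshape_active;	/* non-zero when metadata records a reshape */
	int is_running;		/* stands in for the mdstat lookup */
	struct subarray *next;
};

/* Return the first running subarray that is under reshape, or NULL. */
static struct subarray *find_reshaped(struct subarray *list)
{
	struct subarray *s;

	for (s = list; s; s = s->next) {
		if (!s->reshape_active)
			continue;	/* nothing to continue here */
		if (!s->is_running)
			continue;	/* not assembled, cannot be continued */
		return s;
	}
	return NULL;
}
```

A NULL result corresponds to the "Unable to determine reshaped array" error
path in the patch.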

Signed-off-by: Adam Kwolek
---

Grow.c | 132 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ReadMe.c | 1
mdadm.c | 14 ++++++-
mdadm.h | 6 +++
4 files changed, 151 insertions(+), 2 deletions(-)

diff --git a/Grow.c b/Grow.c
index 4509488..768fc86 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3636,6 +3636,138 @@ int Grow_restart(struct supertype *st, struct mdinfo *info, int *fdlist, int cnt
return 1;
}

+int Grow_continue_command(char *devname, int fd,
+ char *backup_file, int verbose)
+{
+ int ret_val = 0;
+ struct supertype *st = NULL;
+ struct mdinfo *content = NULL;
+ struct mdinfo array;
+ char *subarray = NULL;
+ struct mdinfo *cc = NULL;
+ struct mdstat_ent *mdstat = NULL;
+ char buf[40];
+ int cfd = -1;
+ int fd2 = -1;
+
+ dprintf("Grow continue from command line called for %s\n",
+ devname);
+
+ st = super_by_fd(fd, &subarray);
+ if (!st || !st->ss) {
+ fprintf(stderr,
+ Name ": Unable to determine metadata format for %s\n",
+ devname);
+ return 1;
+ }
+ dprintf("Grow continue is run for ");
+ if (st->ss->external == 0) {
+ dprintf("native array (%s)\n", devname);
+ if (ioctl(fd, GET_ARRAY_INFO, &array) < 0) {
+ fprintf(stderr, Name ": %s is not an active md array -"
+ " aborting\n", devname);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+ content = &array;
+ sysfs_init(content, fd, st->devnum);
+ } else {
+ int container_dev;
+
+ if (subarray) {
+ dprintf("subarray (%s)\n", subarray);
+ container_dev = st->container_dev;
+ cfd = open_dev_excl(st->container_dev);
+ } else {
+ container_dev = st->devnum;
+ close(fd);
+ cfd = open_dev_excl(st->devnum);
+ dprintf("container (%i)\n", container_dev);
+ fd = cfd;
+ }
+ if (cfd < 0) {
+ fprintf(stderr, Name ": Unable to open container "
+ "for %s\n", devname);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+ fmt_devname(buf, container_dev);
+
+ /* find in container array under reshape
+ */
+ ret_val = st->ss->load_container(st, cfd, NULL);
+ if (ret_val) {
+ fprintf(stderr,
+ Name ": Cannot read superblock for %s\n",
+ devname);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+
+ cc = st->ss->container_content(st, NULL);
+ for (content = cc; content ; content = content->next) {
+ char *array;
+
+ if (content->reshape_active == 0)
+ continue;
+
+ array = strchr(content->text_version+1, '/')+1;
+ mdstat = mdstat_by_subdev(array, container_dev);
+ if (!mdstat)
+ continue;
+ break;
+ }
+ if (!content) {
+ fprintf(stderr,
+ Name ": Unable to determine reshaped "
+ "array for %s\n", devname);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+ fd2 = open_dev(mdstat->devnum);
+ if (fd2 < 0) {
+ fprintf(stderr, Name ": cannot open (md%i)\n",
+ mdstat->devnum);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+
+ sysfs_init(content, fd2, mdstat->devnum);
+
+ /* start mdmon in case it is not run
+ */
+ if (!mdmon_running(container_dev))
+ start_mdmon(container_dev);
+ ping_monitor(buf);
+
+ if (mdmon_running(container_dev))
+ st->update_tail = &st->updates;
+ else {
+ fprintf(stderr, Name ": No mdmon found. "
+ "Grow cannot continue.\n");
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+ }
+
+ /* continue reshape
+ */
+ ret_val = Grow_continue(fd, st, content, backup_file,
+ FREEZE_RESHAPE_NONE);
+
+Grow_continue_command_exit:
+ if (fd2 > -1)
+ close(fd2);
+ if (cfd > -1)
+ close(cfd);
+ st->ss->free_super(st);
+ free_mdstat(mdstat);
+ sysfs_free(cc);
+ free(subarray);
+
+ return ret_val;
+}
+
int Grow_continue(int mdfd, struct supertype *st, struct mdinfo *info,
char *backup_file, int freeze_reshape)
{
diff --git a/ReadMe.c b/ReadMe.c
index 89dd7af..25426d3 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -191,6 +191,7 @@ struct option long_options[] = {
{"backup-file", 1,0, BackupFile},
{"invalid-backup",0,0,InvalidBackup},
{"array-size", 1, 0, 'Z'},
+ {"continue", 0, 0, Continue},

/* For Incremental */
{"rebuild-map", 0, 0, RebuildMapOpt},
diff --git a/mdadm.c b/mdadm.c
index 18ca2ee..9fe09ea 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -74,6 +74,7 @@ int main(int argc, char *argv[])
int export = 0;
int assume_clean = 0;
char *symlinks = NULL;
+ int grow_continue = FREEZE_RESHAPE_NONE;
/* autof indicates whether and how to create device node.
* bottom 3 bits are style. Rest (when shifted) are number of parts
* 0 - unset
@@ -995,7 +996,11 @@ int main(int argc, char *argv[])
}
backup_file = optarg;
continue;
-
+ case O(GROW, Continue):
+ /* Continue broken grow
+ */
+ grow_continue = FREEZE_RESHAPE_CONTINUE;
+ continue;
case O(ASSEMBLE, InvalidBackup):
/* Acknowledge that the backupfile is invalid, but ask
* to continue anyway
@@ -1649,7 +1654,12 @@ int main(int argc, char *argv[])
delay = DEFAULT_BITMAP_DELAY;
rv = Grow_addbitmap(devlist->devname, mdfd, bitmap_file,
bitmap_chunk, delay, write_behind, force);
- } else if (size >= 0 || raiddisks != 0 || layout_str != NULL
+ } else if (grow_continue) {
+ rv = Grow_continue_command(devlist->devname,
+ mdfd, backup_file,
+ verbose);
+ break;
+ } else if (size >= 0 || raiddisks != 0 || layout_str != NULL
|| chunk != 0 || level != UnSet) {
rv = Grow_reshape(devlist->devname, mdfd, quiet, backup_file,
size, level, layout_str, chunk, raiddisks,
diff --git a/mdadm.h b/mdadm.h
index 073deb9..8f3e786 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -314,6 +314,7 @@ enum special_options {
InvalidBackup,
UdevRules,
FreezeReshape,
+ Continue,
};

/* structures read from config file */
@@ -1037,9 +1038,11 @@ extern int Grow_continue(int mdfd, struct supertype *st,
/* define stages for freeze assembly feature
* FREEZE_RESHAPE_NONE : disabled
* FREEZE_RESHAPE_ASSEMBLY : assemby phase
+ * FREEZE_RESHAPE_CONTINUE : grow continue phase
*/
#define FREEZE_RESHAPE_NONE 0
#define FREEZE_RESHAPE_ASSEMBLY 1
+#define FREEZE_RESHAPE_CONTINUE 2

extern int restore_backup(struct supertype *st,
struct mdinfo *content,
@@ -1047,6 +1050,8 @@ extern int restore_backup(struct supertype *st,
int spares,
char *backup_file,
int verbose);
+extern int Grow_continue_command(char *devname, int fd,
+ char *backup_file, int verbose);

extern int Assemble(struct supertype *st, char *mddev,
struct mddev_ident *ident,
@@ -1185,6 +1190,7 @@ extern char *human_size(long long bytes);
extern char *human_size_brief(long long bytes);
extern void print_r10_layout(int layout);

+
#define NoMdDev (1<<23)
extern int find_free_devnum(int use_partitions);



[PATCH 3/8] Do not restart reshape if it is started already

on 27.09.2011 14:04:57 by adam.kwolek

When a reshape was invoked during the initrd start-up stage, the array has
already been put into the reshape state, so the read-only state cannot be set
again during reshape continuation. The previously set reshape state has to be
reused during reshape continuation.
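
The check can be illustrated in isolation. The sketch assumes that the
sync_action attribute is read as text ending in a newline, as sysfs
attributes usually are; the function name is hypothetical and the logic is a
simplified reading of the intent, not a copy of the patch.

```c
#include <string.h>

/* Return 1 when array_state may be set to "readonly", or 0 when the
 * array is already reshaping. sync_action is the sysfs text, which
 * normally ends in a newline. */
static int may_set_readonly(const char *sync_action)
{
	size_t len = strcspn(sync_action, "\n");	/* ignore the newline */

	if (len == strlen("reshape") &&
	    strncmp(sync_action, "reshape", len) == 0)
		return 0;	/* reshape already started, keep its state */
	return 1;
}
```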

Signed-off-by: Adam Kwolek
---

Grow.c | 10 ++++++++--
1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/Grow.c b/Grow.c
index 768fc86..d9c2817 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3773,11 +3773,17 @@ int Grow_continue(int mdfd, struct supertype *st, struct mdinfo *info,
{
char buf[40];
char *container = NULL;
- int err;
+ int err = 0;

- err = sysfs_set_str(info, NULL, "array_state", "readonly");
+ /* set read only array state when there is no reshape
+ * in progress already
+ */
+ if ((sysfs_get_str(info, NULL, "sync_action", buf, 40) != 8) &&
+ (strncmp(buf, "reshape", 7) != 0))
+ err = sysfs_set_str(info, NULL, "array_state", "readonly");
if (err)
return err;
+
if (st->ss->external) {
fmt_devname(buf, st->container_dev);
container = buf;


[PATCH 4/8] Set correct reshape restart position

on 27.09.2011 14:05:05 by adam.kwolek

During the initrd stage, when the array is assembled with the '--freeze-reshape'
option, the reshape position has to be set to the value read from the metadata
checkpoint before the reshape is stopped.
This allows later verification of the restart point, and the user will see
reshape information in mdstat instead of the resync that is shown when the
reshape position is set to 0.
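
The arithmetic behind the new sync_min/sync_max value is a single division.
The sketch below assumes, as the patch's division by data_disks suggests,
that reshape_progress is an array-wide position while sync_max is a
per-device one. The function name and the zero guard are additions for the
example only.

```c
/* Convert an array-wide reshape position into the per-device value
 * written to sync_min/sync_max. */
static unsigned long long per_device_position(unsigned long long reshape_progress,
					       int data_disks)
{
	if (data_disks <= 0)
		return 0;	/* guard added for the sketch; not in the patch */
	return reshape_progress / (unsigned long long)data_disks;
}
```

For example, a checkpoint at array position 3072 on an array with 3 data
disks gives a per-device position of 1024.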

Signed-off-by: Adam Kwolek
---

Grow.c | 26 +++++++++++++++-----------
1 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/Grow.c b/Grow.c
index d9c2817..afe4c72 100644
--- a/Grow.c
+++ b/Grow.c
@@ -696,21 +696,24 @@ static int subarray_set_num(char *container, struct mdinfo *sra, char *name, int
return rc;
}

-int start_reshape(struct mdinfo *sra, int already_running, int freeze_reshape)
+int start_reshape(struct mdinfo *sra, int already_running,
+ int freeze_reshape, int data_disks)
{
int err;
+ unsigned long long position_to_set = 0;
+ unsigned long long sync_max_to_set;

/* do not block array as we not continue reshape this time
*/
- if (freeze_reshape == FREEZE_RESHAPE_NONE)
- sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
- else
- sysfs_set_num(sra, NULL, "suspend_lo", 0);
- err = sysfs_set_num(sra, NULL, "suspend_hi", 0);
- err = err ?: sysfs_set_num(sra, NULL, "suspend_lo", 0);
+ if (freeze_reshape != FREEZE_RESHAPE_NONE)
+ position_to_set = sra->reshape_progress;
+ sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
+ err = sysfs_set_num(sra, NULL, "suspend_hi", position_to_set);
+ err = err ?: sysfs_set_num(sra, NULL, "suspend_lo", position_to_set);
+ sync_max_to_set = position_to_set / data_disks;
if (!already_running)
- sysfs_set_num(sra, NULL, "sync_min", 0);
- err = err ?: sysfs_set_num(sra, NULL, "sync_max", 0);
+ sysfs_set_num(sra, NULL, "sync_min", sync_max_to_set);
+ err = err ?: sysfs_set_num(sra, NULL, "sync_max", sync_max_to_set);
if (!already_running)
err = err ?: sysfs_set_str(sra, NULL, "sync_action", "reshape");

@@ -2247,7 +2250,8 @@ started:
}
}

- err = start_reshape(sra, restart, freeze_reshape);
+ err = start_reshape(sra, restart, freeze_reshape,
+ info->array.raid_disks - reshape.parity);
if (err) {
fprintf(stderr,
Name ": Cannot %s reshape for %s\n",
@@ -3753,7 +3757,7 @@ int Grow_continue_command(char *devname, int fd,
/* continue reshape
*/
ret_val = Grow_continue(fd, st, content, backup_file,
- FREEZE_RESHAPE_NONE);
+ FREEZE_RESHAPE_CONTINUE);

Grow_continue_command_exit:
if (fd2 > -1)


[PATCH 5/8] Move code to get_data_disks() function

on 27.09.2011 14:05:13 by adam.kwolek

Move the data-disk calculation out of calc_array_size() into its own
function, get_data_disks(), so that it can be reused.
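
For readers without util.c at hand, a standalone sketch of such a helper
follows. Only the level 0 and level 10 cases are visible in the diff below.
The cases for levels 1, 4, 5 and 6 are assumptions based on the usual parity
counts, and the name data_disks_for() and the zero-divisor guard are
hypothetical additions.

```c
/* Number of disks holding data (not parity or copies) for a RAID level. */
static int data_disks_for(int level, int layout, int raid_disks)
{
	switch (level) {
	case 0:
		return raid_disks;
	case 1:
		return 1;			/* assumed: all disks are mirrors */
	case 4:
	case 5:
		return raid_disks - 1;		/* assumed: one parity disk */
	case 6:
		return raid_disks - 2;		/* assumed: two parity disks */
	case 10: {
		int near = layout & 255;
		int far = (layout >> 8) & 255;

		if (near == 0 || far == 0)
			return 0;		/* guard added for the sketch */
		return raid_disks / near / far;
	}
	default:
		return 0;
	}
}
```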

Signed-off-by: Adam Kwolek
---

mdadm.h | 1 +
util.c | 10 ++++++++--
2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/mdadm.h b/mdadm.h
index 8f3e786..4bbf660 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1165,6 +1165,7 @@ extern unsigned long long get_component_size(int fd);
extern void remove_partitions(int fd);
extern int test_partition(int fd);
extern int test_partition_from_id(dev_t id);
+extern int get_data_disks(int level, int layout, int raid_disks);
extern unsigned long long calc_array_size(int level, int raid_disks, int layout,
int chunksize, unsigned long long devsize);
extern int flush_metadata_updates(struct supertype *st);
diff --git a/util.c b/util.c
index 0ea7e0d..50c98c1 100644
--- a/util.c
+++ b/util.c
@@ -703,6 +703,12 @@ void print_r10_layout(int layout)
unsigned long long calc_array_size(int level, int raid_disks, int layout,
int chunksize, unsigned long long devsize)
{
+ devsize &= ~(unsigned long long)((chunksize>>9)-1);
+ return get_data_disks(level, layout, raid_disks) * devsize;
+}
+
+int get_data_disks(int level, int layout, int raid_disks)
+{
int data_disks = 0;
switch (level) {
case 0: data_disks = raid_disks; break;
@@ -713,8 +719,8 @@ unsigned long long calc_array_size(int level, int raid_disks, int layout,
case 10: data_disks = raid_disks / (layout & 255) / ((layout>>8)&255);
break;
}
- devsize &= ~(unsigned long long)((chunksize>>9)-1);
- return data_disks * devsize;
+
+ return data_disks;
}

#if !defined(MDASSEMBLE) || defined(MDASSEMBLE) && defined(MDASSEMBLE_AUTO)


[PATCH 6/8] Verify reshape restart position

on 27.09.2011 14:05:22 by adam.kwolek

Check that the reshape restart position is the same as the one set in md.
If the positions don't match, the reshape cannot be restarted.
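
The comparison can be tried outside mdadm with a small sketch. It takes the
sync_max text directly instead of reading sysfs, and the function name is
hypothetical. A non-numeric value such as "max" is rejected because no
digits are converted.

```c
#include <errno.h>
#include <stdlib.h>

/* Return 0 when the sync_max text, scaled by the number of data disks,
 * matches the checkpoint position; -1 otherwise. */
static int restart_position_ok(const char *sync_max_text, int data_disks,
			       unsigned long long reshape_progress)
{
	char *ep;
	unsigned long long position;

	errno = 0;
	position = strtoull(sync_max_text, &ep, 0);
	/* reject overflow, empty conversions and trailing garbage */
	if (errno || ep == sync_max_text ||
	    (*ep != '\0' && *ep != '\n' && *ep != ' '))
		return -1;
	position *= (unsigned long long)data_disks;
	return position == reshape_progress ? 0 : -1;
}
```

With 3 data disks, a sync_max of "1024" matches a checkpoint at 3072 and
nothing else.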

Signed-off-by: Adam Kwolek
---

Grow.c | 32 ++++++++++++++++++++++++++++++++
1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/Grow.c b/Grow.c
index afe4c72..3ff249d 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3653,6 +3653,8 @@ int Grow_continue_command(char *devname, int fd,
char buf[40];
int cfd = -1;
int fd2 = -1;
+ char *ep;
+ unsigned long long position;

dprintf("Grow continue from command line called for %s\n",
devname);
@@ -3754,6 +3756,36 @@ int Grow_continue_command(char *devname, int fd,
}
}

+ /* verify that array under reshape is started from
+ * correct position
+ */
+ ret_val = sysfs_get_str(content, NULL, "sync_max", buf, 40);
+ if (ret_val <= 0) {
+ fprintf(stderr, Name
+ ": cannot verify reshape progress for %s (%i)\n",
+ content->sys_name, ret_val);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+ dprintf(Name ": Read sync_max sysfs entry is: %s\n", buf);
+ errno = 0;
+ position = strtoull(buf, &ep, 0);
+ if (errno || ep == buf || (*ep != 0 && *ep != '\n' && *ep != ' ')) {
+ fprintf(stderr, Name ": md is not allowed to finish reshape "
+ "without mdadm assistance.\n");
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+ position *= get_data_disks(map_name(pers, mdstat->level),
+ content->new_layout,
+ content->array.raid_disks);
+ if (position != content->reshape_progress) {
+ fprintf(stderr, Name ": md is not allowed to finish reshape "
+ "without mdadm assistance.\n");
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+
/* continue reshape
*/
ret_val = Grow_continue(fd, st, content, backup_file,


[PATCH 7/8] Manual update for --continue option

on 27.09.2011 14:05:30 by adam.kwolek

Patch adds the following information to the mdadm man page:

--freeze-reshape
Option is intended to be used in start-up scripts during the initrd boot
phase. When an array under reshape is assembled during the initrd phase,
this option stops the reshape after the reshape critical section has been
restored. This happens before the file system pivot operation and avoids
loss of file system context. Losing file system context would cause
the reshape to be broken.

Reshape can be continued later using the --continue option for the grow command.

Signed-off-by: Adam Kwolek
---

mdadm.8.in | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/mdadm.8.in b/mdadm.8.in
index 039f3d4..3e2f99e 100644
--- a/mdadm.8.in
+++ b/mdadm.8.in
@@ -1096,6 +1096,19 @@ option can be used when an array has an internal bitmap which is
corrupt in some way so that assembling the array normally fails. It
will cause any internal bitmap to be ignored.

+.TP
+.BR \-\-freeze\-reshape
+Option is intended to be used in start-up scripts during the initrd boot phase.
+When an array under reshape is assembled during the initrd phase, this option
+stops the reshape after the reshape critical section has been restored. This
+happens before the file system pivot operation and avoids loss of file system
+context. Losing file system context would cause the reshape to be broken.
+
+Reshape can be continued later using the
+.B \-\-continue
+option for the grow command.
+
+
.SH For Manage mode:

.TP


[PATCH 8/8] Manual update for --continue option

am 27.09.2011 14:05:39 von adam.kwolek

Patch adds the following information to the mdadm man page:

--continue
This option is the counterpart of the assembly option --freeze-reshape.
It is needed when a --grow operation was interrupted and was not restarted
automatically because --freeze-reshape was used during array assembly.
Option --continue has to be used together with the -G ( --grow ) command
and the device that it should be executed on. All parameters required for
reshape continuation are read from array metadata. If the initial
--grow command required the --backup-file= option,
the continuation requires exactly the same backup file
to be given as well.

Any other parameter passed together with the --continue option is ignored.

Signed-off-by: Adam Kwolek
---

mdadm.8.in | 27 +++++++++++++++++++++++++++
1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/mdadm.8.in b/mdadm.8.in
index 3e2f99e..6b0af55 100644
--- a/mdadm.8.in
+++ b/mdadm.8.in
@@ -734,6 +734,33 @@ The file must be stored on a separate device, not on the RAID array
being reshaped.

.TP
+.BR \-\-continue
+This option is the counterpart of the assembly
+.B \-\-freeze\-reshape
+option. It is needed when a
+.B \-\-grow
+operation was interrupted and was not restarted automatically because
+.B \-\-freeze\-reshape
+was used during array assembly. Option
+.BR \-\-continue
+has to be used together with the
+.BR \-G
+(
+.BR \-\-grow
+) command and the device that it should be executed on.
+All parameters required for reshape continuation are read from array metadata.
+If the initial
+.BR \-\-grow
+command required the
+.BR \-\-backup\-file=
+option, the continuation requires exactly the same
+backup file to be given as well.
+.IP
+Any other parameter passed together with the
+.BR \-\-continue
+option is ignored.
+
+.TP
.BR \-N ", " \-\-name=
Set a
.B name


Re: [PATCH 1/8] Do not continue reshape during initrd phase

am 03.10.2011 00:19:20 von NeilBrown

On Tue, 27 Sep 2011 14:04:39 +0200 Adam Kwolek wrote:

> During initrd phase continuing reshape will cause file system context
> lost. This blocks ability to control reshape using checkpoints.
>
> To avoid this, during initrd phase assemble has to be executed with
> '--freeze-reshape' option. This causes that mdadm restores reshape
> critical section only.
>
> Reshape can be continued later after system full boot.

Thanks.
I've applied this with a few small changes:

- The constants FREEZE_RESHAPE_NONE and FREEZE_RESHAPE_ASSEMBLY are not
necessary and don't help readability. Just treat "freeze_reshape" as a
boolean, either true or false.

- The change to share_reshape was unnecessary so I removed it.

Thanks,
NeilBrown
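
The first point can be shown on the suspend_lo choice from the quoted patch. This is only a sketch under assumptions: initial_suspend_lo() is a hypothetical helper that does not exist in mdadm, and it merely returns the value that the quoted start_reshape() hunk writes first, with freeze_reshape treated as a plain true/false flag instead of being compared against named constants.

```c
/* Hypothetical helper, not mdadm code: the first value written to
 * "suspend_lo" in the quoted start_reshape() hunk, selected with
 * freeze_reshape used as a boolean.
 */
unsigned long long initial_suspend_lo(int freeze_reshape)
{
	if (freeze_reshape)
		return 0;	/* reshape is not continued now, do not block the array */
	return 0x7FFFFFFFFFFFFFFFULL;	/* normal (non-frozen) case */
}
```

Any non-zero value of the flag selects the frozen branch, so no FREEZE_RESHAPE_* names are needed at the call sites.
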


>
>
> Signed-off-by: Adam Kwolek
> ---
>
> Assemble.c | 12 +++++++-----
> Grow.c | 52 ++++++++++++++++++++++++++++++++++++++--------------
> Incremental.c | 16 ++++++++++------
> ReadMe.c | 1 +
> mdadm.c | 33 ++++++++++++++++++++++++---------
> mdadm.h | 18 ++++++++++++++----
> 6 files changed, 94 insertions(+), 38 deletions(-)
>
> diff --git a/Assemble.c b/Assemble.c
> index c6aad20..afca38e 100644
> --- a/Assemble.c
> +++ b/Assemble.c
> @@ -138,7 +138,7 @@ int Assemble(struct supertype *st, char *mddev,
> char *backup_file, int invalid_backup,
> int readonly, int runstop,
> char *update, char *homehost, int require_homehost,
> - int verbose, int force)
> + int verbose, int force, int freeze_reshape)
> {
> /*
> * The task of Assemble is to find a collection of
> @@ -697,7 +697,7 @@ int Assemble(struct supertype *st, char *mddev,
> int err;
> err = assemble_container_content(st, mdfd, content, runstop,
> chosen_name, verbose,
> - backup_file);
> + backup_file, freeze_reshape);
> close(mdfd);
> return err;
> }
> @@ -1344,7 +1344,8 @@ int Assemble(struct supertype *st, char *mddev,
> #ifndef MDASSEMBLE
> if (content->reshape_active &&
> content->delta_disks <= 0)
> - rv = Grow_continue(mdfd, st, content, backup_file);
> + rv = Grow_continue(mdfd, st, content,
> + backup_file, freeze_reshape);
> else
> #endif
> rv = ioctl(mdfd, RUN_ARRAY, NULL);
> @@ -1511,7 +1512,7 @@ int Assemble(struct supertype *st, char *mddev,
> int assemble_container_content(struct supertype *st, int mdfd,
> struct mdinfo *content, int runstop,
> char *chosen_name, int verbose,
> - char *backup_file)
> + char *backup_file, int freeze_reshape)
> {
> struct mdinfo *dev, *sra;
> int working = 0, preexist = 0;
> @@ -1560,7 +1561,8 @@ int assemble_container_content(struct supertype *st, int mdfd,
> spare, backup_file, verbose) == 1)
> return 1;
>
> - err = Grow_continue(mdfd, st, content, backup_file);
> + err = Grow_continue(mdfd, st, content, backup_file,
> + freeze_reshape);
> } else switch(content->array.level) {
> case LEVEL_LINEAR:
> case LEVEL_MULTIPATH:
> diff --git a/Grow.c b/Grow.c
> index 4a25165..4509488 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -696,10 +696,16 @@ static int subarray_set_num(char *container, struct mdinfo *sra, char *name, int
> return rc;
> }
>
> -int start_reshape(struct mdinfo *sra, int already_running)
> +int start_reshape(struct mdinfo *sra, int already_running, int freeze_reshape)
> {
> int err;
> - sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
> +
> + /* do not block array as we not continue reshape this time
> + */
> + if (freeze_reshape == FREEZE_RESHAPE_NONE)
> + sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
> + else
> + sysfs_set_num(sra, NULL, "suspend_lo", 0);
> err = sysfs_set_num(sra, NULL, "suspend_hi", 0);
> err = err ?: sysfs_set_num(sra, NULL, "suspend_lo", 0);
> if (!already_running)
> @@ -1344,13 +1350,13 @@ static int reshape_array(char *container, int fd, char *devname,
> struct supertype *st, struct mdinfo *info,
> int force, struct mddev_dev *devlist,
> char *backup_file, int quiet, int forked,
> - int restart);
> + int restart, int freeze_reshape);
> static int reshape_container(char *container, char *devname,
> struct supertype *st,
> struct mdinfo *info,
> int force,
> char *backup_file,
> - int quiet, int restart);
> + int quiet, int restart, int freeze_reshape);
>
> int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
> long long size,
> @@ -1761,7 +1767,7 @@ int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
> * performed at the level of the container
> */
> rv = reshape_container(container, devname, st, &info,
> - force, backup_file, quiet, 0);
> + force, backup_file, quiet, 0, 0);
> frozen = 0;
> } else {
> /* get spare devices from external metadata
> @@ -1789,7 +1795,7 @@ int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
> }
> sync_metadata(st);
> rv = reshape_array(container, fd, devname, st, &info, force,
> - devlist, backup_file, quiet, 0, 0);
> + devlist, backup_file, quiet, 0, 0, 0);
> frozen = 0;
> }
> release:
> @@ -1802,7 +1808,7 @@ static int reshape_array(char *container, int fd, char *devname,
> struct supertype *st, struct mdinfo *info,
> int force, struct mddev_dev *devlist,
> char *backup_file, int quiet, int forked,
> - int restart)
> + int restart, int freeze_reshape)
> {
> struct reshape reshape;
> int spares_needed;
> @@ -2241,7 +2247,7 @@ started:
> }
> }
>
> - err = start_reshape(sra, restart);
> + err = start_reshape(sra, restart, freeze_reshape);
> if (err) {
> fprintf(stderr,
> Name ": Cannot %s reshape for %s\n",
> @@ -2251,6 +2257,15 @@ started:
> }
> if (restart)
> sysfs_set_str(sra, NULL, "array_state", "active");
> + if (freeze_reshape == FREEZE_RESHAPE_ASSEMBLY) {
> + free(fdlist);
> + free(offsets);
> + sysfs_free(sra);
> + fprintf(stderr, Name ": Reshape has to be continued from"
> + " location %llu when root fileststem will be mounted\n",
> + sra->reshape_progress);
> + return 1;
> + }
>
> /* Now we just need to kick off the reshape and watch, while
> * handling backups of the data...
> @@ -2389,7 +2404,7 @@ int reshape_container(char *container, char *devname,
> struct mdinfo *info,
> int force,
> char *backup_file,
> - int quiet, int restart)
> + int quiet, int restart, int freeze_reshape)
> {
> struct mdinfo *cc = NULL;
> int rv = restart;
> @@ -2418,7 +2433,9 @@ int reshape_container(char *container, char *devname,
> unfreeze(st);
> return 1;
> default: /* parent */
> - printf(Name ": multi-array reshape continues in background\n");
> + if (freeze_reshape == FREEZE_RESHAPE_NONE)
> + printf(Name ": multi-array reshape continues"
> + "in background\n");
> return 0;
> case 0: /* child */
> break;
> @@ -2473,8 +2490,15 @@ int reshape_container(char *container, char *devname,
>
> rv = reshape_array(container, fd, adev, st,
> content, force, NULL,
> - backup_file, quiet, 1, restart);
> + backup_file, quiet, 1, restart,
> + freeze_reshape);
> close(fd);
> +
> + if (freeze_reshape) {
> + sysfs_free(cc);
> + exit(0);
> + }
> +
> restart = 0;
> if (rv)
> break;
> @@ -3613,7 +3637,7 @@ int Grow_restart(struct supertype *st, struct mdinfo *info, int *fdlist, int cnt
> }
>
> int Grow_continue(int mdfd, struct supertype *st, struct mdinfo *info,
> - char *backup_file)
> + char *backup_file, int freeze_reshape)
> {
> char buf[40];
> char *container = NULL;
> @@ -3640,9 +3664,9 @@ int Grow_continue(int mdfd, struct supertype *st, struct mdinfo *info,
> close(cfd);
> return reshape_container(container, NULL,
> st, info, 0, backup_file,
> - 0, 1);
> + 0, 1, freeze_reshape);
> }
> }
> return reshape_array(container, mdfd, "array", st, info, 1,
> - NULL, backup_file, 0, 0, 1);
> + NULL, backup_file, 0, 0, 1, freeze_reshape);
> }
> diff --git a/Incremental.c b/Incremental.c
> index 791ad85..571d45d 100644
> --- a/Incremental.c
> +++ b/Incremental.c
> @@ -44,7 +44,8 @@ static int try_spare(char *devname, int *dfdp, struct dev_policy *pol,
>
> static int Incremental_container(struct supertype *st, char *devname,
> char *homehost,
> - int verbose, int runstop, int autof);
> + int verbose, int runstop, int autof,
> + int freeze_reshape);
>
> static struct mddev_ident *search_mdstat(struct supertype *st,
> struct mdinfo *info,
> @@ -53,7 +54,7 @@ static struct mddev_ident *search_mdstat(struct supertype *st,
>
> int Incremental(char *devname, int verbose, int runstop,
> struct supertype *st, char *homehost, int require_homehost,
> - int autof)
> + int autof, int freeze_reshape)
> {
> /* Add this device to an array, creating the array if necessary
> * and starting the array if sensible or - if runstop>0 - if possible.
> @@ -140,7 +141,8 @@ int Incremental(char *devname, int verbose, int runstop,
> close(dfd);
> if (!rv && st->ss->container_content)
> return Incremental_container(st, devname, homehost,
> - verbose, runstop, autof);
> + verbose, runstop, autof,
> + freeze_reshape);
>
> fprintf(stderr, Name ": %s is not part of an md array.\n",
> devname);
> @@ -450,7 +452,8 @@ int Incremental(char *devname, int verbose, int runstop,
> close(mdfd);
> sysfs_free(sra);
> rv = Incremental(chosen_name, verbose, runstop,
> - NULL, homehost, require_homehost, autof);
> + NULL, homehost, require_homehost, autof,
> + freeze_reshape);
> if (rv == 1)
> /* Don't fail the whole -I if a subarray didn't
> * have enough devices to start yet
> @@ -1416,7 +1419,7 @@ static char *container2devname(char *devname)
>
> static int Incremental_container(struct supertype *st, char *devname,
> char *homehost, int verbose,
> - int runstop, int autof)
> + int runstop, int autof, int freeze_reshape)
> {
> /* Collect the contents of this container and for each
> * array, choose a device name and assemble the array.
> @@ -1551,7 +1554,8 @@ static int Incremental_container(struct supertype *st, char *devname,
> }
>
> assemble_container_content(st, mdfd, ra, runstop,
> - chosen_name, verbose, NULL);
> + chosen_name, verbose, NULL,
> + freeze_reshape);
> close(mdfd);
> }
>
> diff --git a/ReadMe.c b/ReadMe.c
> index b658841..89dd7af 100644
> --- a/ReadMe.c
> +++ b/ReadMe.c
> @@ -153,6 +153,7 @@ struct option long_options[] = {
> {"scan", 0, 0, 's'},
> {"force", 0, 0, Force},
> {"update", 1, 0, 'U'},
> + {"freeze-reshape", 0, 0, FreezeReshape},
>
> /* Management */
> {"add", 0, 0, Add},
> diff --git a/mdadm.c b/mdadm.c
> index 1533510..18ca2ee 100644
> --- a/mdadm.c
> +++ b/mdadm.c
> @@ -112,6 +112,8 @@ int main(int argc, char *argv[])
>
> int mdfd = -1;
>
> + int freeze_reshape = FREEZE_RESHAPE_NONE;
> +
> srandom(time(0) ^ getpid());
>
> ident.uuid_set=0;
> @@ -612,8 +614,12 @@ int main(int argc, char *argv[])
> case O(MANAGE,Force): /* add device which is too large */
> force=1;
> continue;
> -
> /* now for the Assemble options */
> + case O(ASSEMBLE, FreezeReshape): /* Freeze reshape during
> + * initrd phase */
> + case O(INCREMENTAL, FreezeReshape):
> + freeze_reshape = FREEZE_RESHAPE_ASSEMBLY;
> + continue;
> case O(CREATE,'u'): /* uuid of array */
> case O(ASSEMBLE,'u'): /* uuid of array */
> if (ident.uuid_set) {
> @@ -1228,14 +1234,16 @@ int main(int argc, char *argv[])
> NULL, backup_file, invalid_backup,
> readonly, runstop, update,
> homehost, require_homehost,
> - verbose-quiet, force);
> + verbose-quiet, force,
> + freeze_reshape);
> }
> } else if (!scan)
> rv = Assemble(ss, devlist->devname, &ident,
> devlist->next, backup_file, invalid_backup,
> readonly, runstop, update,
> homehost, require_homehost,
> - verbose-quiet, force);
> + verbose-quiet, force,
> + freeze_reshape);
> else if (devs_found>0) {
> if (update && devs_found > 1) {
> fprintf(stderr, Name ": can only update a single array at a time\n");
> @@ -1259,7 +1267,8 @@ int main(int argc, char *argv[])
> NULL, backup_file, invalid_backup,
> readonly, runstop, update,
> homehost, require_homehost,
> - verbose-quiet, force);
> + verbose-quiet, force,
> + freeze_reshape);
> }
> } else {
> struct mddev_ident *a, *array_list = conf_get_ident(NULL);
> @@ -1300,7 +1309,8 @@ int main(int argc, char *argv[])
> NULL, NULL, 0,
> readonly, runstop, NULL,
> homehost, require_homehost,
> - verbose-quiet, force);
> + verbose-quiet, force,
> + freeze_reshape);
> if (r == 0) {
> a->assembled = 1;
> successes++;
> @@ -1325,9 +1335,13 @@ int main(int argc, char *argv[])
> rv2 = Assemble(ss, NULL,
> &ident,
> devlist, NULL, 0,
> - readonly, runstop, NULL,
> - homehost, require_homehost,
> - verbose-quiet, force);
> + readonly,
> + runstop, NULL,
> + homehost,
> + require_homehost,
> + verbose-quiet,
> + force,
> + freeze_reshape);
> if (rv2==0) {
> cnt++;
> acnt++;
> @@ -1681,7 +1695,8 @@ int main(int argc, char *argv[])
> else
> rv = Incremental(devlist->devname, verbose-quiet,
> runstop, ss, homehost,
> - require_homehost, autof);
> + require_homehost, autof,
> + freeze_reshape);
> break;
> case AUTODETECT:
> autodetect();
> diff --git a/mdadm.h b/mdadm.h
> index 8dd37d9..073deb9 100644
> --- a/mdadm.h
> +++ b/mdadm.h
> @@ -313,6 +313,7 @@ enum special_options {
> RebuildMapOpt,
> InvalidBackup,
> UdevRules,
> + FreezeReshape,
> };
>
> /* structures read from config file */
> @@ -1030,7 +1031,16 @@ extern int Grow_reshape(char *devname, int fd, int quiet, char *backup_file,
> extern int Grow_restart(struct supertype *st, struct mdinfo *info,
> int *fdlist, int cnt, char *backup_file, int verbose);
> extern int Grow_continue(int mdfd, struct supertype *st,
> - struct mdinfo *info, char *backup_file);
> + struct mdinfo *info, char *backup_file,
> + int freeze_reshape);
> +
> +/* define stages for freeze assembly feature
> + * FREEZE_RESHAPE_NONE : disabled
> + * FREEZE_RESHAPE_ASSEMBLY : assemby phase
> + */
> +#define FREEZE_RESHAPE_NONE 0
> +#define FREEZE_RESHAPE_ASSEMBLY 1
> +
> extern int restore_backup(struct supertype *st,
> struct mdinfo *content,
> int working_disks,
> @@ -1044,7 +1054,7 @@ extern int Assemble(struct supertype *st, char *mddev,
> char *backup_file, int invalid_backup,
> int readonly, int runstop,
> char *update, char *homehost, int require_homehost,
> - int verbose, int force);
> + int verbose, int force, int freeze_reshape);
>
> extern int Build(char *mddev, int chunk, int level, int layout,
> int raiddisks, struct mddev_dev *devlist, int assume_clean,
> @@ -1078,7 +1088,7 @@ extern int WaitClean(char *dev, int sock, int verbose);
>
> extern int Incremental(char *devname, int verbose, int runstop,
> struct supertype *st, char *homehost, int require_homehost,
> - int autof);
> + int autof, int freeze_reshape);
> extern void RebuildMap(void);
> extern int IncrementalScan(int verbose);
> extern int IncrementalRemove(char *devname, char *path, int verbose);
> @@ -1157,7 +1167,7 @@ extern void append_metadata_update(struct supertype *st, void *buf, int len);
> extern int assemble_container_content(struct supertype *st, int mdfd,
> struct mdinfo *content, int runstop,
> char *chosen_name, int verbose,
> - char *backup_file);
> + char *backup_file, int freeze_reshape);
> extern struct mdinfo *container_choose_spares(struct supertype *st,
> unsigned long long min_size,
> struct domainlist *domlist,



Re: [PATCH 2/8] Add continue option to grow command

am 03.10.2011 00:27:38 von NeilBrown

On Tue, 27 Sep 2011 14:04:48 +0200 Adam Kwolek wrote:

> To allow for reshape continuation '--continue' option is added
> to grow command.
> Function that will be executed in grow-continue case doesn't require
> information about reshape geometry. All required information are read
> from metadata.
> For external metadata reshape can be run for monitored array/container
> only. In case when array/container is not monitored run mdmon for it.
>
> Signed-off-by: Adam Kwolek

Applied this with just some minor fixes.

Thanks,
NeilBrown


> ---
>
> Grow.c | 132 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> ReadMe.c | 1
> mdadm.c | 14 ++++++-
> mdadm.h | 6 +++
> 4 files changed, 151 insertions(+), 2 deletions(-)
>
> diff --git a/Grow.c b/Grow.c
> index 4509488..768fc86 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -3636,6 +3636,138 @@ int Grow_restart(struct supertype *st, struct mdinfo *info, int *fdlist, int cnt
> return 1;
> }
>
> +int Grow_continue_command(char *devname, int fd,
> + char *backup_file, int verbose)
> +{
> + int ret_val = 0;
> + struct supertype *st = NULL;
> + struct mdinfo *content = NULL;
> + struct mdinfo array;
> + char *subarray = NULL;
> + struct mdinfo *cc = NULL;
> + struct mdstat_ent *mdstat = NULL;
> + char buf[40];
> + int cfd = -1;
> + int fd2 = -1;
> +
> + dprintf("Grow continue from command line called for %s\n",
> + devname);
> +
> + st = super_by_fd(fd, &subarray);
> + if (!st || !st->ss) {
> + fprintf(stderr,
> + Name ": Unable to determine metadata format for %s\n",
> + devname);
> + return 1;
> + }
> + dprintf("Grow continue is run for ");
> + if (st->ss->external == 0) {
> + dprintf("native array (%s)\n", devname);
> + if (ioctl(fd, GET_ARRAY_INFO, &array) < 0) {
> + fprintf(stderr, Name ": %s is not an active md array -"
> + " aborting\n", devname);
> + ret_val = 1;
> + goto Grow_continue_command_exit;
> + }
> + content = &array;
> + sysfs_init(content, fd, st->devnum);
> + } else {
> + int container_dev;
> +
> + if (subarray) {
> + dprintf("subarray (%s)\n", subarray);
> + container_dev = st->container_dev;
> + cfd = open_dev_excl(st->container_dev);
> + } else {
> + container_dev = st->devnum;
> + close(fd);
> + cfd = open_dev_excl(st->devnum);
> + dprintf("container (%i)\n", container_dev);
> + fd = cfd;
> + }
> + if (cfd < 0) {
> + fprintf(stderr, Name ": Unable to open container "
> + "for %s\n", devname);
> + ret_val = 1;
> + goto Grow_continue_command_exit;
> + }
> + fmt_devname(buf, container_dev);
> +
> + /* find in container array under reshape
> + */
> + ret_val = st->ss->load_container(st, cfd, NULL);
> + if (ret_val) {
> + fprintf(stderr,
> + Name ": Cannot read superblock for %s\n",
> + devname);
> + ret_val = 1;
> + goto Grow_continue_command_exit;
> + }
> +
> + cc = st->ss->container_content(st, NULL);
> + for (content = cc; content ; content = content->next) {
> + char *array;
> +
> + if (content->reshape_active == 0)
> + continue;
> +
> + array = strchr(content->text_version+1, '/')+1;
> + mdstat = mdstat_by_subdev(array, container_dev);
> + if (!mdstat)
> + continue;
> + break;
> + }
> + if (!content) {
> + fprintf(stderr,
> + Name ": Unable to determine reshaped "
> + "array for %s\n", devname);
> + ret_val = 1;
> + goto Grow_continue_command_exit;
> + }
> + fd2 = open_dev(mdstat->devnum);
> + if (fd2 < 0) {
> + fprintf(stderr, Name ": cannot open (md%i)\n",
> + mdstat->devnum);
> + ret_val = 1;
> + goto Grow_continue_command_exit;
> + }
> +
> + sysfs_init(content, fd2, mdstat->devnum);
> +
> + /* start mdmon in case it is not run
> + */
> + if (!mdmon_running(container_dev))
> + start_mdmon(container_dev);
> + ping_monitor(buf);
> +
> + if (mdmon_running(container_dev))
> + st->update_tail = &st->updates;
> + else {
> + fprintf(stderr, Name ": No mdmon found. "
> + "Grow cannot continue.\n");
> + ret_val = 1;
> + goto Grow_continue_command_exit;
> + }
> + }
> +
> + /* continue reshape
> + */
> + ret_val = Grow_continue(fd, st, content, backup_file,
> + FREEZE_RESHAPE_NONE);
> +
> +Grow_continue_command_exit:
> + if (fd2 > -1)
> + close(fd2);
> + if (cfd > -1)
> + close(cfd);
> + st->ss->free_super(st);
> + free_mdstat(mdstat);
> + sysfs_free(cc);
> + free(subarray);
> +
> + return ret_val;
> +}
> +
> int Grow_continue(int mdfd, struct supertype *st, struct mdinfo *info,
> char *backup_file, int freeze_reshape)
> {
> diff --git a/ReadMe.c b/ReadMe.c
> index 89dd7af..25426d3 100644
> --- a/ReadMe.c
> +++ b/ReadMe.c
> @@ -191,6 +191,7 @@ struct option long_options[] = {
> {"backup-file", 1,0, BackupFile},
> {"invalid-backup",0,0,InvalidBackup},
> {"array-size", 1, 0, 'Z'},
> + {"continue", 0, 0, Continue},
>
> /* For Incremental */
> {"rebuild-map", 0, 0, RebuildMapOpt},
> diff --git a/mdadm.c b/mdadm.c
> index 18ca2ee..9fe09ea 100644
> --- a/mdadm.c
> +++ b/mdadm.c
> @@ -74,6 +74,7 @@ int main(int argc, char *argv[])
> int export = 0;
> int assume_clean = 0;
> char *symlinks = NULL;
> + int grow_continue = FREEZE_RESHAPE_NONE;
> /* autof indicates whether and how to create device node.
> * bottom 3 bits are style. Rest (when shifted) are number of parts
> * 0 - unset
> @@ -995,7 +996,11 @@ int main(int argc, char *argv[])
> }
> backup_file = optarg;
> continue;
> -
> + case O(GROW, Continue):
> + /* Continuer broken grow
> + */
> + grow_continue = FREEZE_RESHAPE_CONTINUE;
> + continue;
> case O(ASSEMBLE, InvalidBackup):
> /* Acknowledge that the backupfile is invalid, but ask
> * to continue anyway
> @@ -1649,7 +1654,12 @@ int main(int argc, char *argv[])
> delay = DEFAULT_BITMAP_DELAY;
> rv = Grow_addbitmap(devlist->devname, mdfd, bitmap_file,
> bitmap_chunk, delay, write_behind, force);
> - } else if (size >= 0 || raiddisks != 0 || layout_str != NULL
> + } else if (grow_continue) {
> + rv = Grow_continue_command(devlist->devname,
> + mdfd, backup_file,
> + verbose);
> + break;
> + } else if (size >= 0 || raiddisks != 0 || layout_str != NULL
> || chunk != 0 || level != UnSet) {
> rv = Grow_reshape(devlist->devname, mdfd, quiet, backup_file,
> size, level, layout_str, chunk, raiddisks,
> diff --git a/mdadm.h b/mdadm.h
> index 073deb9..8f3e786 100644
> --- a/mdadm.h
> +++ b/mdadm.h
> @@ -314,6 +314,7 @@ enum special_options {
> InvalidBackup,
> UdevRules,
> FreezeReshape,
> + Continue,
> };
>
> /* structures read from config file */
> @@ -1037,9 +1038,11 @@ extern int Grow_continue(int mdfd, struct supertype *st,
> /* define stages for freeze assembly feature
> * FREEZE_RESHAPE_NONE : disabled
> * FREEZE_RESHAPE_ASSEMBLY : assemby phase
> + * FREEZE_RESHAPE_CONTINUE : grow continue phase
> */
> #define FREEZE_RESHAPE_NONE 0
> #define FREEZE_RESHAPE_ASSEMBLY 1
> +#define FREEZE_RESHAPE_CONTINUE 2
>
> extern int restore_backup(struct supertype *st,
> struct mdinfo *content,
> @@ -1047,6 +1050,8 @@ extern int restore_backup(struct supertype *st,
> int spares,
> char *backup_file,
> int verbose);
> +extern int Grow_continue_command(char *devname, int fd,
> + char *backup_file, int verbose);
>
> extern int Assemble(struct supertype *st, char *mddev,
> struct mddev_ident *ident,
> @@ -1185,6 +1190,7 @@ extern char *human_size(long long bytes);
> extern char *human_size_brief(long long bytes);
> extern void print_r10_layout(int layout);
>
> +
> #define NoMdDev (1<<23)
> extern int find_free_devnum(int use_partitions);
>



Re: [PATCH 3/8] Do not restart reshape if it is started already

am 03.10.2011 00:41:58 von NeilBrown

On Tue, 27 Sep 2011 14:04:57 +0200 Adam Kwolek wrote:

> When reshape was invoked during initrd start-up stage array is pushed
> in to reshape state already, so read only state cannot be set again during
> reshape continuation. Set previously reshape state has to be reused
> during reshape continuation.
>
> Signed-off-by: Adam Kwolek
> ---
>
> Grow.c | 10 ++++++++--
> 1 files changed, 8 insertions(+), 2 deletions(-)
>=20
> diff --git a/Grow.c b/Grow.c
> index 768fc86..d9c2817 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -3773,11 +3773,17 @@ int Grow_continue(int mdfd, struct supertype *st, struct mdinfo *info,
> {
> char buf[40];
> char *container = NULL;
> - int err;
> + int err = 0;
>
> - err = sysfs_set_str(info, NULL, "array_state", "readonly");
> + /* set read only array state when there is no reshape
> + * in progress already
> + */
> + if ((sysfs_get_str(info, NULL, "sync_action", buf, 40) != 8) &&
> + (strncmp(buf, "reshape", 7) != 0))
> + err = sysfs_set_str(info, NULL, "array_state", "readonly");
> if (err)
> return err;
> +
> if (st->ss->external) {
> fmt_devname(buf, st->container_dev);
> container =3D buf;


This is wrong.
For a start the '&&' should be '||' or it doesn't make sense.

The reason we are setting readonly here is that the array hasn't been
activated at all, so it isn't possible to freeze anything.
So we set the array to readonly so that no reshape/resync/etc can start, then
activate the array in reshape_array, then freeze, and allow it to be
read-write.

So Grow_continue really assumes that the array hasn't been started yet.
You are using it in a different situation, where it has been started.
For that to be reasonable, you really need to tell Grow_continue that the
array hasn't started, not let it try to figure out for itself.

However I think it would be easier to just copy the code from Grow_continue
into the place where you called it in Grow_continue_command and then remove
the bits that you don't need.
You don't need the readonly setting and you don't need the start_mdmon and I
don't think you need the freeze(), so it becomes:

	if (st->ss->external && info->reshape_active == 2) {
		int cfd = open_dev(st->container_dev);
		if (cfd < 0)
			return 1;
		st->ss->load_container(st, cfd, container);
		close(cfd);
		ret_val = reshape_container(container, NULL,
					    st, info, 0, backup_file,
					    0, 1, freeze_reshape);
	} else
		ret_val = reshape_array(container, mdfd, "array", st, info, 1,
					NULL, backup_file, 0, 0, 1, freeze_reshape);

which is a better result I think.

So I won't apply this patch. Please consider the above and submit a revised
version if you agree.

Thanks,
NeilBrown
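
The '&&' versus '||' point can be checked in isolation. The following is a sketch, not mdadm code: the three helpers are hypothetical, len stands for the value returned by sysfs_get_str() (assumed, as the comparison with 8 in the patch suggests, to be the length of the text read), and buf stands for the content of "sync_action".

```c
#include <string.h>

/* Hypothetical helpers, not mdadm code. "reshape\n" is 8 characters. */

/* True when sync_action reports a reshape in progress. */
int reshape_running(const char *buf, int len)
{
	return len == 8 && strncmp(buf, "reshape", 7) == 0;
}

/* Negation of reshape_running(); by De Morgan's law this needs '||'. */
int no_reshape_or(const char *buf, int len)
{
	return len != 8 || strncmp(buf, "reshape", 7) != 0;
}

/* The condition as written in the patch, with '&&'. */
int no_reshape_and(const char *buf, int len)
{
	return len != 8 && strncmp(buf, "reshape", 7) != 0;
}
```

For "idle\n" both forms agree. For another 8-character value such as "recover\n", only the '||' form reports that no reshape is running, so the '&&' form would skip the readonly setting in that case.
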


Re: [PATCH 4/8] Set correct reshape restart position

am 03.10.2011 00:56:49 von NeilBrown

On Tue, 27 Sep 2011 14:05:05 +0200 Adam Kwolek wrote:

> During initrd stage when, when array is assembled with '--freeze-reshape'
> option and before stopping reshape, reshape position has to be set to read
> from metadata checkpoint.
> This will allow later for restart point verification and user will be able
> to see in mdstat information about reshape process instead resync when
> reshape position is set to 0.
>
> Signed-off-by: Adam Kwolek

Hi,
I think that this patch makes start_reshape rather messy and confusing.
I am tempted to say not to do this at all, and maybe fix the kernel so that
it reports better information.

However it might be reasonable to do something like this in mdadm. If so
I would like it to go in the "if (freeze_reshape)" branch in
reshape_array(), and just explicitly set "sync_max". There should be no need
to set sync_min - is there?
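
A rough sketch of that suggestion, for illustration only: the freeze branch turns the checkpointed reshape position into the value for "sync_max" and writes nothing else. The `set_num()` helper and `struct fake_sysfs` are hypothetical in-memory stand-ins for mdadm's sysfs access, and the division by the number of data disks simply mirrors the quoted patch. None of this is the code that was eventually merged.

```c
#include <string.h>

/* Hypothetical stand-in for sysfs: remembers what was written to each
 * attribute so the sketch can run without a real md device. */
struct fake_sysfs {
    unsigned long long sync_min;
    unsigned long long sync_max;
    int sync_min_written;
    int sync_max_written;
};

/* Mimics the shape of a "write a number to an attribute" helper.
 * Returns 0 on success, -1 for an attribute this sketch does not know. */
int set_num(struct fake_sysfs *fs, const char *name, unsigned long long val)
{
    if (strcmp(name, "sync_max") == 0) {
        fs->sync_max = val;
        fs->sync_max_written = 1;
        return 0;
    }
    if (strcmp(name, "sync_min") == 0) {
        fs->sync_min = val;
        fs->sync_min_written = 1;
        return 0;
    }
    return -1;
}

/* Freeze-branch sketch: the quoted patch divides the checkpoint by the
 * number of data disks before writing sync_max, so this does the same.
 * Only sync_max is touched; sync_min is left alone. */
int freeze_at_checkpoint(struct fake_sysfs *fs,
                         unsigned long long reshape_progress,
                         int data_disks)
{
    if (data_disks <= 0)
        return -1;
    return set_num(fs, "sync_max", reshape_progress / data_disks);
}
```

With a checkpoint of 3072 and three data disks, `freeze_at_checkpoint()` would write 1024 to "sync_max" and leave "sync_min" untouched.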

Thanks,
NeilBrown


> ---
>
> Grow.c | 26 +++++++++++++++-----------
> 1 files changed, 15 insertions(+), 11 deletions(-)
>
> diff --git a/Grow.c b/Grow.c
> index d9c2817..afe4c72 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -696,21 +696,24 @@ static int subarray_set_num(char *container, struct mdinfo *sra, char *name, int
> return rc;
> }
>
> -int start_reshape(struct mdinfo *sra, int already_running, int freeze_reshape)
> +int start_reshape(struct mdinfo *sra, int already_running,
> + int freeze_reshape, int data_disks)
> {
> int err;
> + unsigned long long position_to_set = 0;
> + unsigned long long sync_max_to_set;
>
> /* do not block array as we not continue reshape this time
> */
> - if (freeze_reshape == FREEZE_RESHAPE_NONE)
> - sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
> - else
> - sysfs_set_num(sra, NULL, "suspend_lo", 0);
> - err = sysfs_set_num(sra, NULL, "suspend_hi", 0);
> - err = err ?: sysfs_set_num(sra, NULL, "suspend_lo", 0);
> + if (freeze_reshape != FREEZE_RESHAPE_NONE)
> + position_to_set = sra->reshape_progress;
> + sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
> + err = sysfs_set_num(sra, NULL, "suspend_hi", position_to_set);
> + err = err ?: sysfs_set_num(sra, NULL, "suspend_lo", position_to_set);
> + sync_max_to_set = position_to_set / data_disks;
> if (!already_running)
> - sysfs_set_num(sra, NULL, "sync_min", 0);
> - err = err ?: sysfs_set_num(sra, NULL, "sync_max", 0);
> + sysfs_set_num(sra, NULL, "sync_min", sync_max_to_set);
> + err = err ?: sysfs_set_num(sra, NULL, "sync_max", sync_max_to_set);
> if (!already_running)
> err = err ?: sysfs_set_str(sra, NULL, "sync_action", "reshape");
>
> @@ -2247,7 +2250,8 @@ started:
> }
> }
>
> - err = start_reshape(sra, restart, freeze_reshape);
> + err = start_reshape(sra, restart, freeze_reshape,
> + info->array.raid_disks - reshape.parity);
> if (err) {
> fprintf(stderr,
> Name ": Cannot %s reshape for %s\n",
> @@ -3753,7 +3757,7 @@ int Grow_continue_command(char *devname, int fd,
> /* continue reshape
> */
> ret_val = Grow_continue(fd, st, content, backup_file,
> - FREEZE_RESHAPE_NONE);
> + FREEZE_RESHAPE_CONTINUE);
>
> Grow_continue_command_exit:
> if (fd2 > -1)



Re: [PATCH 5/8] Move code to get_data_disks() function

on 03.10.2011 00:58:29 by NeilBrown

On Tue, 27 Sep 2011 14:05:13 +0200 Adam Kwolek wrote:

> Move code to function for code reuse.
>
> Signed-off-by: Adam Kwolek

Applied, thanks.

NeilBrown



> ---
>
> mdadm.h | 1 +
> util.c | 10 ++++++++--
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/mdadm.h b/mdadm.h
> index 8f3e786..4bbf660 100644
> --- a/mdadm.h
> +++ b/mdadm.h
> @@ -1165,6 +1165,7 @@ extern unsigned long long get_component_size(int fd);
> extern void remove_partitions(int fd);
> extern int test_partition(int fd);
> extern int test_partition_from_id(dev_t id);
> +extern int get_data_disks(int level, int layout, int raid_disks);
> extern unsigned long long calc_array_size(int level, int raid_disks, int layout,
> int chunksize, unsigned long long devsize);
> extern int flush_metadata_updates(struct supertype *st);
> diff --git a/util.c b/util.c
> index 0ea7e0d..50c98c1 100644
> --- a/util.c
> +++ b/util.c
> @@ -703,6 +703,12 @@ void print_r10_layout(int layout)
> unsigned long long calc_array_size(int level, int raid_disks, int layout,
> int chunksize, unsigned long long devsize)
> {
> + devsize &= ~(unsigned long long)((chunksize>>9)-1);
> + return get_data_disks(level, layout, raid_disks) * devsize;
> +}
> +
> +int get_data_disks(int level, int layout, int raid_disks)
> +{
> int data_disks = 0;
> switch (level) {
> case 0: data_disks = raid_disks; break;
> @@ -713,8 +719,8 @@ unsigned long long calc_array_size(int level, int raid_disks, int layout,
> case 10: data_disks = raid_disks / (layout & 255) / ((layout>>8)&255);
> break;
> }
> - devsize &= ~(unsigned long long)((chunksize>>9)-1);
> - return data_disks * devsize;
> +
> + return data_disks;
> }
>
> #if !defined(MDASSEMBLE) || defined(MDASSEMBLE) && defined(MDASSEMBLE_AUTO)
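
For reference, here is a self-contained sketch of what the extracted helper computes. The diff above elides the cases between its two hunks, so the entries for levels 1, 4, 5 and 6 are assumptions based on the usual mirror and parity counts for those RAID levels, not text taken from the patch. The names `get_data_disks_sketch` and `calc_array_size_sketch` are made up for this example, and the zero-divisor guard in the RAID10 case is an addition.

```c
/* Number of member devices that carry data rather than parity or
 * mirror copies. Levels 0 and 10 follow the quoted diff; levels 1, 4,
 * 5 and 6 are assumed from the standard definitions of those levels. */
int get_data_disks_sketch(int level, int layout, int raid_disks)
{
    int data_disks = 0;

    switch (level) {
    case 0:
        data_disks = raid_disks;
        break;
    case 1:                       /* assumption: all disks mirror one */
        data_disks = 1;
        break;
    case 4:
    case 5:                       /* assumption: one parity disk */
        data_disks = raid_disks - 1;
        break;
    case 6:                       /* assumption: two parity disks */
        data_disks = raid_disks - 2;
        break;
    case 10: {
        /* layout holds near copies in bits 0-7, far copies in bits 8-15 */
        int near = layout & 255;
        int far = (layout >> 8) & 255;
        if (near > 0 && far > 0)  /* guard added here; the diff divides directly */
            data_disks = raid_disks / near / far;
        break;
    }
    }
    return data_disks;
}

/* Array size in 512-byte sectors: round the device size down to a whole
 * number of chunks (chunksize is in bytes and assumed to be a power of
 * two), then multiply by the data-disk count. */
unsigned long long calc_array_size_sketch(int level, int raid_disks, int layout,
                                          int chunksize,
                                          unsigned long long devsize)
{
    devsize &= ~(unsigned long long)((chunksize >> 9) - 1);
    return get_data_disks_sketch(level, layout, raid_disks) * devsize;
}
```

Splitting the count out this way lets a caller that only needs the number of data disks, such as the position check in patch 6/8, reuse it without going through the size calculation.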



Re: [PATCH 6/8] Verify reshape restart position

on 03.10.2011 01:06:28 by NeilBrown

On Tue, 27 Sep 2011 14:05:22 +0200 Adam Kwolek wrote:

> Check if reshape restart position is the same as set in md.
> If position doesn't match this means that we cannot restart reshape.
>
> Signed-off-by: Adam Kwolek
> ---
>
> Grow.c | 32 ++++++++++++++++++++++++++++++++
> 1 files changed, 32 insertions(+), 0 deletions(-)
>
> diff --git a/Grow.c b/Grow.c
> index afe4c72..3ff249d 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -3653,6 +3653,8 @@ int Grow_continue_command(char *devname, int fd,
> char buf[40];
> int cfd = -1;
> int fd2 = -1;
> + char *ep;
> + unsigned long long position;
>
> dprintf("Grow continue from command line called for %s\n",
> devname);
> @@ -3754,6 +3756,36 @@ int Grow_continue_command(char *devname, int fd,
> }
> }
>
> + /* verify that array under reshape is started from
> + * correct position
> + */
> + ret_val = sysfs_get_str(content, NULL, "sync_max", buf, 40);
> + if (ret_val <= 0) {
> + fprintf(stderr, Name
> + ": cannot open verify reshape progress for %s (%i)\n",
> + content->sys_name, ret_val);
> + ret_val = 1;
> + goto Grow_continue_command_exit;
> + }
> + dprintf(Name ": Read sync_max sysfs entry is: %s\n", buf);
> + errno = 0;
> + position = strtoull(buf, &ep, 0);
> + if (errno || ep == buf || (*ep != 0 && *ep != '\n' && *ep != ' ')) {
> + fprintf(stderr, Name ": md is not allowed to finish reshape "
> + "wihout mdadm assistance.\n");
> + ret_val = 1;
> + goto Grow_continue_command_exit;
> + }
> + position *= get_data_disks(map_name(pers, mdstat->level),
> + content->new_layout,
> + content->array.raid_disks);
> + if (position != content->reshape_progress) {
> + fprintf(stderr, Name ": md is not allowed to finish reshape "
> + "wihout mdadm assistance.\n");
> + ret_val = 1;
> + goto Grow_continue_command_exit;
> + }
> +
> /* continue reshape
> */
> ret_val = Grow_continue(fd, st, content, backup_file,


Applied with a few small changes.

1/ It is never correct to test errno to see if an error occurred. You must
test something else to see if an error occurred, and then check errno to
see what the error was.
2/ The error message didn't seem helpful. I changed it to:

Fatal error: array reshape was not properly frozen.
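
Point 1/ can be illustrated with a small sketch (this is not the change that was actually committed). The parser below decides whether `strtoull()` failed by looking at the end pointer and the returned value, and only then consults `errno` to confirm an overflow. The function names and return conventions are assumptions made for this example. The second helper repeats the position comparison from the quoted patch.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse the text read from a sysfs attribute such as "sync_max".
 * Returns 0 and stores the value on success, -1 on any failure. */
int parse_position(const char *buf, unsigned long long *out)
{
    char *ep;
    unsigned long long val;

    errno = 0;                      /* so an ERANGE seen below is ours */
    val = strtoull(buf, &ep, 0);

    if (ep == buf)                  /* nothing converted, e.g. "max" */
        return -1;
    if (*ep != '\0' && *ep != '\n' && *ep != ' ')
        return -1;                  /* trailing garbage after the number */
    if (val == ULLONG_MAX && errno == ERANGE)
        return -1;                  /* failure seen first, errno says why */

    *out = val;
    return 0;
}

/* Compare the position reported by md with the checkpoint from the
 * metadata, scaling by the data-disk count as the quoted patch does. */
int position_matches(unsigned long long sync_max, int data_disks,
                     unsigned long long reshape_progress)
{
    return data_disks > 0 &&
           sync_max * (unsigned long long)data_disks == reshape_progress;
}
```

A value such as "max" in the attribute makes `parse_position()` fail at the end-pointer check, before `errno` is ever consulted.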

Thanks,
NeilBrown




Re: [PATCH 8/8] Manual update for --continue option

on 03.10.2011 01:09:48 by NeilBrown

On Tue, 27 Sep 2011 14:05:39 +0200 Adam Kwolek wrote:

> This patch adds the following information to the mdadm man page:
>
> --continue
> This option is the complement of the assembly option --freeze-reshape.
> It is needed when a --grow operation was interrupted and is not restarted
> automatically because --freeze-reshape was used during array assembly.
> The --continue option has to be used together with the -G ( --grow ) command
> and the device that it should be executed on. All parameters required for
> reshape continuation will be read from the array metadata. If the initial
> --grow command required the --backup-file= option to be set, the
> continuation will require exactly the same backup file to be given as well.
>
> Any other parameter passed together with the --continue option will be ignored.
>
> Signed-off-by: Adam Kwolek
> ---
>
> mdadm.8.in | 27 +++++++++++++++++++++++++++
> 1 files changed, 27 insertions(+), 0 deletions(-)
>
> diff --git a/mdadm.8.in b/mdadm.8.in
> index 3e2f99e..6b0af55 100644
> --- a/mdadm.8.in
> +++ b/mdadm.8.in
> @@ -734,6 +734,33 @@ The file must be stored on a separate device, not on the RAID array
> being reshaped.
>
> .TP
> +.BR \-\-continue
> +This option is complementary pair to assembly
> +.B \-\-freeze-reshape
> +option. It is needed when
> +.B \-\-grow
> +operation is interrupted and it is not restarted automatically due to
> +.B \-\-freeze-reshape
> +usage during array assembly. Option
> +.BR \-\-continue
> +has to be used together with
> +.BR \-G
> +, (
> +.BR \-\-grow
> +) command and device that it should be executed on.
> +All parameters required for reshape continuation will be read from array metadata.
> +If initial
> +.BR \-\-grow
> +command had required
> +.BR \-\-backup\-file=
> +option to be set, continuation option will require to have exactly the same
> +backup file pointed to also.
> +.IP
> +Any other parameter passed together with
> +.BR \-\-continue
> +option will be ignored.
> +
> +.TP
> .BR \-N ", " \-\-name=
> Set a
> .B name


Thanks.
Both man-page updates added (with some minor fixes).

NeilBrown

