[PATCH 00/14] Series short description

[PATCH 00/14] Series short description

am 16.09.2011 13:53:46 von adam.kwolek

This patch series (14) implements manual reshape restart/continuation
in the following scenario:

1. Condition:
During system boot mdadm finds array under reshape that should be
assembled. This happens while initramfs is mounted
(before file system pivot)

2. Scenario:
a) Assembly procedure assembles array and runs reshape continuation
procedure (as usual). Reshape procedure continues reshape and during
file system pivot lost file system context and cannot control process
using check-pointing
To avoid this now reshape procedure detects that initramfs is mounted
(before pivot) and reshape critical section is restored and array
is prepared for reshape only. At this point mdadm finishes his work
for this array and array is fully functional/accessible now.

0001-Stop-array-reshape-when-mounted-initramfs-is-detecte.pa tch
0002-Stop-container-reshape-while-using-initramfs.patch
0003-FIX-Do-not-unblock-array-accidentally.patch
0007-FIX-Memory-leak-during-Assembly.patch

b) Afters system boot user manually can invoke reshape continuation.
New mdadm option '--continue' for grow command was added.
This allows for reshape continuation. mdadm for reshape continuation
uses parameters read from metadata. Command line can looks as follows:

mdadm -G --continue device_name [--backup-file=file_name]

where:
device_name : device that reshape should be continued on
e.g. /dev/md/container_name or /dev/md/raid_name
backup-file : optional parameter required when backup-file
was use previously for reshape execution


For external metadata mdadm takes carry for metadata compatibility
e.g. container operation can be continued on array device /and oposit/.

0004-Add-continue-option-to-grow-command.patch
0005-Add-Grow_continue_command.patch
0006-Check-and-run-mdmon.patch
0007-FIX-Memory-leak-during-Assembly.patch
0008-Check-if-reshape-can-be-restarted.patch
0009-Move-restore-backup-code-to-function.patch
0010-Perform-restore-backup-for-reshape-continuation.patch
0011-Add-possibility-to-restart-reshape-for-native-metada.pa tch
0012-Early-reshape-backup-verification.patch
0013-Check-if-md-allows-to-control-reshape.patch

3. man update:
The last patch provides madam's man update for '--continue' grow option
0014-Manual-update-for-continue-option.patch
BR
Adam

---

Adam Kwolek (14):
Manual update for --continue option
Check if md allows to control reshape
Early reshape backup verification
Add possibility to restart reshape for native metadata
Perform restore backup for reshape continuation
Move restore backup code to function
Check if reshape can be restarted
FIX: Memory leak during Assembly
Check and run mdmon
Add Grow_continue_command
Add continue option to grow command
FIX: Do not unblock array accidentally
Stop container reshape while using initramfs
Stop array reshape when mounted initramfs is detected


Assemble.c | 39 +-------
Grow.c | 290 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
ReadMe.c | 1
mdadm.8.in | 24 +++++
mdadm.c | 24 +++++
mdadm.h | 14 +++
sysfs.c | 10 ++
util.c | 59 ++++++++++++
8 files changed, 419 insertions(+), 42 deletions(-)

--
Signature
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 01/14] Stop array reshape when mounted initramfs is detected

am 16.09.2011 13:53:54 von adam.kwolek

During remounting file system from initramfs to real one,
mdadm is not able to continue reshape procedure due to lost
filesystem context.
To avoid this mdadm detects mounted initramfs and performs
critical section restore only. Array is set for later
reshape continuation.

Reshape continuation will be possible to invoking manually,
using '--continue' option.

Signed-off-by: Adam Kwolek
---

Grow.c | 25 +++++++++++++++++++++++--
mdadm.h | 4 ++++
util.c | 22 ++++++++++++++++++++++
3 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/Grow.c b/Grow.c
index 17d14b6..420177c 100644
--- a/Grow.c
+++ b/Grow.c
@@ -635,10 +635,21 @@ static int subarray_set_num(char *container, struct mdinfo *sra, char *name, int
return rc;
}

+/* possible return values:
+ * 0 : success
+ * -1: error
+ * INITRAMFS_IS_IN_USE: continue reshape later
+ */
int start_reshape(struct mdinfo *sra, int already_running)
{
int err;
- sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
+
+ /* do not block array as we not continue reshape this time
+ */
+ if (initramfs_is_in_use() == INITRAMFS_NOT_USED)
+ sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
+ else
+ sysfs_set_num(sra, NULL, "suspend_lo", 0);
err = sysfs_set_num(sra, NULL, "suspend_hi", 0);
err = err ?: sysfs_set_num(sra, NULL, "suspend_lo", 0);
if (!already_running)
@@ -2190,6 +2201,14 @@ started:
}
if (restart)
sysfs_set_str(sra, NULL, "array_state", "active");
+ if (initramfs_is_in_use() == INITRAMFS_IS_IN_USE) {
+ free(fdlist);
+ free(offsets);
+ sysfs_free(sra);
+ fprintf(stderr, Name ": Reshape has to be continued "
+ "when root fileststem will be mounted\n");
+ return 1;
+ }

/* Now we just need to kick off the reshape and watch, while
* handling backups of the data...
@@ -2357,7 +2376,9 @@ int reshape_container(char *container, char *devname,
unfreeze(st);
return 1;
default: /* parent */
- printf(Name ": multi-array reshape continues in background\n");
+ if (initramfs_is_in_use() == INITRAMFS_NOT_USED)
+ printf(Name ": multi-array reshape continues "
+ "in background\n");
return 0;
case 0: /* child */
break;
diff --git a/mdadm.h b/mdadm.h
index d616966..9035e67 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1161,6 +1161,10 @@ extern char *human_size(long long bytes);
extern char *human_size_brief(long long bytes);
extern void print_r10_layout(int layout);

+extern int initramfs_is_in_use(void);
+#define INITRAMFS_IS_IN_USE 1
+#define INITRAMFS_NOT_USED 0
+
#define NoMdDev (1<<23)
extern int find_free_devnum(int use_partitions);

diff --git a/util.c b/util.c
index 0ea7e0d..7186f3f 100644
--- a/util.c
+++ b/util.c
@@ -1768,3 +1768,25 @@ struct mdinfo *container_choose_spares(struct supertype *st,
}
return disks;
}
+
+int initramfs_is_in_use(void)
+{
+ int ret_val;
+ struct stat sts;
+
+ /* /init
+ * common script for many Linux distribution
+ * /sysboot
+ * is specific for RH
+ * /mkinitrd.config
+ * is specific for SLES
+ */
+ if ((stat("/init", &sts) == -1) ||
+ ((stat("/sysroot", &sts) == -1) &&
+ (stat("/mkinitrd.config", &sts) == -1)))
+ ret_val = INITRAMFS_NOT_USED;
+ else
+ ret_val = INITRAMFS_IS_IN_USE;
+
+ return ret_val;
+}

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 02/14] Stop container reshape while using initramfs

am 16.09.2011 13:54:02 von adam.kwolek

When first array in container is set for later reshape mdadm
has exit for remaining container operations. Do not wait for
current reshape finalization as it will be performed manually later.

Signed-off-by: Adam Kwolek
---

Grow.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/Grow.c b/Grow.c
index 420177c..cec32d7 100644
--- a/Grow.c
+++ b/Grow.c
@@ -2435,6 +2435,12 @@ int reshape_container(char *container, char *devname,
content, force, NULL,
backup_file, quiet, 1, restart);
close(fd);
+
+ if (initramfs_is_in_use() == INITRAMFS_IS_IN_USE) {
+ sysfs_free(cc);
+ exit(0);
+ }
+
restart = 0;
if (rv)
break;

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 03/14] FIX: Do not unblock array accidentally

am 16.09.2011 13:54:09 von adam.kwolek

When sysfs_set_array() function is called, it tests if array
can be configured using sysfs. Setting metadata_version entry
can accidentally unblock mdmon when array is under reshape.
To avoid this, blocking character '-' is checked and if is is set,
it is used for array test.

Signed-off-by: Adam Kwolek
---

sysfs.c | 10 ++++++++++
1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/sysfs.c b/sysfs.c
index 2146264..903202d 100644
--- a/sysfs.c
+++ b/sysfs.c
@@ -541,7 +541,17 @@ int sysfs_set_array(struct mdinfo *info, int vers)
ver[0] = 0;
if (info->array.major_version == -1 &&
info->array.minor_version == -2) {
+ char buf[1024];
strcat(strcpy(ver, "external:"), info->text_version);
+ if (sysfs_get_str(info, NULL, "metadata_version",
+ buf, 1024) > 0) {
+ /* Testing string should not affect
+ * monitor blocking functionality
+ * Set blocking character if present in sysfs already
+ */
+ if (buf[9] == '-')
+ ver[9] = '-';
+ }

if ((vers % 100) < 2 ||
sysfs_set_str(info, NULL, "metadata_version",

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 04/14] Add continue option to grow command

am 16.09.2011 13:54:17 von adam.kwolek

To allow for reshape continuation, 'continue' option is added
to grow command. It is not possible to check, if any existing mdadm
instance is going to take ownership on blocked md array or it is
orphaned/interrupted reshape.
To resolve this problem grow command with continue option,
is allowed when current mdadm instance is only one in the system.

Signed-off-by: Adam Kwolek
---

ReadMe.c | 1 +
mdadm.c | 17 ++++++++++++++++-
mdadm.h | 2 ++
util.c | 37 +++++++++++++++++++++++++++++++++++++
4 files changed, 56 insertions(+), 1 deletions(-)

diff --git a/ReadMe.c b/ReadMe.c
index b658841..f0dc0d9 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -190,6 +190,7 @@ struct option long_options[] = {
{"backup-file", 1,0, BackupFile},
{"invalid-backup",0,0,InvalidBackup},
{"array-size", 1, 0, 'Z'},
+ {"continue", 0, 0, Continue},

/* For Incremental */
{"rebuild-map", 0, 0, RebuildMapOpt},
diff --git a/mdadm.c b/mdadm.c
index 4b817ab..f30534a 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -74,6 +74,7 @@ int main(int argc, char *argv[])
int export = 0;
int assume_clean = 0;
char *symlinks = NULL;
+ int grow_continue = 0;
/* autof indicates whether and how to create device node.
* bottom 3 bits are style. Rest (when shifted) are number of parts
* 0 - unset
@@ -988,7 +989,21 @@ int main(int argc, char *argv[])
}
backup_file = optarg;
continue;
-
+ case O(GROW, Continue):
+ /* Continuer broken grow
+ * To be sure that continuation is allowed check
+ * if there is only one (current) mdadm instance
+ */
+ grow_continue = count_mdadms();
+ fprintf(stderr, Name ": %i mdadm instance(s) found\n",
+ grow_continue);
+ if (grow_continue != 1) {
+ fprintf(stderr, Name ": Grow continuation "
+ "requires single mdadm instance "
+ "running.\n");
+ exit(2);
+ }
+ continue;
case O(ASSEMBLE, InvalidBackup):
/* Acknowledge that the backupfile is invalid, but ask
* to continue anyway
diff --git a/mdadm.h b/mdadm.h
index 9035e67..7e3a618 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -313,6 +313,7 @@ enum special_options {
RebuildMapOpt,
InvalidBackup,
UdevRules,
+ Continue,
};

/* structures read from config file */
@@ -1162,6 +1163,7 @@ extern char *human_size_brief(long long bytes);
extern void print_r10_layout(int layout);

extern int initramfs_is_in_use(void);
+extern int count_mdadms(void);
#define INITRAMFS_IS_IN_USE 1
#define INITRAMFS_NOT_USED 0

diff --git a/util.c b/util.c
index 7186f3f..5077c2f 100644
--- a/util.c
+++ b/util.c
@@ -1790,3 +1790,40 @@ int initramfs_is_in_use(void)

return ret_val;
}
+
+int count_mdadms(void)
+{
+ int ret_val = 0;
+ DIR *dirp;
+ struct dirent *d_entry;
+ int fd;
+
+ dirp = opendir("/proc");
+ if (!dirp)
+ return -1;
+
+ while ((d_entry = readdir(dirp)) != NULL) {
+ char p[PATH_MAX];
+
+ if (!strcmp(d_entry->d_name, ".") ||
+ !strcmp(d_entry->d_name, "..") ||
+ !strcmp(d_entry->d_name, "self"))
+ continue;
+ errno = 0;
+ fd = (int)strtol(d_entry->d_name, NULL, 10);
+ if (errno)
+ continue;
+ sprintf(p, "/proc/%s/cmdline", d_entry->d_name);
+ fd = open(p, O_RDONLY);
+ if (fd < 0)
+ continue;
+ if (read(fd, p, PATH_MAX) > 0) {
+ if (strcmp(p, "mdadm") == 0)
+ ret_val++;
+ }
+ close(fd);
+ }
+ closedir(dirp);
+
+ return ret_val;
+}

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 05/14] Add Grow_continue_command

am 16.09.2011 13:54:25 von adam.kwolek

Add function that will be executed in grow-continue case
for external metadata. It checks if any reshape is in progress.
If it finds reshape in progress, Grow_continue() function
is called as during boot time for reshape continuation.

Signed-off-by: Adam Kwolek
---

Grow.c | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ +++
mdadm.c | 7 +++-
mdadm.h | 2 +
3 files changed, 121 insertions(+), 1 deletions(-)

diff --git a/Grow.c b/Grow.c
index cec32d7..2b2cb5a 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3578,6 +3578,119 @@ int Grow_restart(struct supertype *st, struct mdinfo *info, int *fdlist, int cnt
return 1;
}

+int Grow_continue_command(char *devname, int fd,
+ char *backup_file, int verbose)
+{
+ int ret_val = 0;
+ struct supertype *st = NULL;
+ struct mdinfo *content = NULL;
+ char *subarray = NULL;
+ struct mdinfo *cc = NULL;
+ struct mdstat_ent *mdstat = NULL;
+ char container[20];
+ int cfd = -1;
+ int fd2 = -1;
+
+ dprintf("Grow continue from command line called for %s\n",
+ devname);
+
+ st = super_by_fd(fd, &subarray);
+ if (!st) {
+ fprintf(stderr,
+ Name ": Unable to determine metadata format for %s\n",
+ devname);
+ return 1;
+ }
+ dprintf("Grow continue is run for ");
+ if (st->ss->external == 0) {
+ dprintf("native array (%s)\n", devname);
+ /* FIXME : temporary just exit
+ * for native metadata case
+ */
+ st->ss->free_super(st);
+ free(subarray);
+ exit(0);
+ } else {
+ int container_dev;
+
+ if (subarray) {
+ dprintf("subarray (%s)\n", subarray);
+ container_dev = st->container_dev;
+ cfd = open_dev_excl(st->container_dev);
+ } else {
+ container_dev = st->devnum;
+ close(fd);
+ cfd = open_dev_excl(st->devnum);
+ dprintf("container (%i)\n", container_dev);
+ fd = cfd;
+ }
+ if (cfd < 0) {
+ fprintf(stderr, Name ": Unable to open container "
+ "for %s\n", devname);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+ fmt_devname(container, container_dev);
+
+ /* find in container array under reshape
+ */
+ ret_val = st->ss->load_container(st, cfd, NULL);
+ if (ret_val) {
+ fprintf(stderr,
+ Name ": Cannot read superblock for %s\n",
+ devname);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+
+ cc = st->ss->container_content(st, NULL);
+ for (content = cc; content ; content = content->next) {
+ char *array;
+
+ if (content->reshape_active == 0)
+ continue;
+
+ array = strchr(content->text_version+1, '/')+1;
+ mdstat = mdstat_by_subdev(array, container_dev);
+ if (!mdstat)
+ continue;
+ break;
+ }
+ if (!content) {
+ fprintf(stderr,
+ Name ": Unable to determine reshaped "
+ "array for %s\n", devname);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+ fd2 = open_dev(mdstat->devnum);
+ if (fd2 < 0) {
+ fprintf(stderr, Name ": cannot open (md%i)\n",
+ mdstat->devnum);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+
+ sysfs_init(content, fd2, mdstat->devnum);
+ }
+
+ /* continue reshape
+ */
+ ret_val = Grow_continue(fd, st, content, backup_file);
+
+Grow_continue_command_exit:
+ if (fd2 > -1)
+ close(fd2);
+ if (cfd > -1)
+ close(cfd);
+ st->ss->free_super(st);
+ free_mdstat(mdstat);
+ sysfs_free(cc);
+ free(subarray);
+
+ return ret_val;
+}
+
int Grow_continue(int mdfd, struct supertype *st, struct mdinfo *info,
char *backup_file)
{
diff --git a/mdadm.c b/mdadm.c
index f30534a..3bd4ebe 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -1649,7 +1649,12 @@ int main(int argc, char *argv[])
delay = DEFAULT_BITMAP_DELAY;
rv = Grow_addbitmap(devlist->devname, mdfd, bitmap_file,
bitmap_chunk, delay, write_behind, force);
- } else if (size >= 0 || raiddisks != 0 || layout_str != NULL
+ } else if (grow_continue) {
+ rv = Grow_continue_command(devlist->devname,
+ mdfd, backup_file,
+ verbose);
+ break;
+ } else if (size >= 0 || raiddisks != 0 || layout_str != NULL
|| chunk != 0 || level != UnSet) {
rv = Grow_reshape(devlist->devname, mdfd, quiet, backup_file,
size, level, layout_str, chunk, raiddisks,
diff --git a/mdadm.h b/mdadm.h
index 7e3a618..7761db6 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1024,6 +1024,8 @@ extern int Grow_restart(struct supertype *st, struct mdinfo *info,
int *fdlist, int cnt, char *backup_file, int verbose);
extern int Grow_continue(int mdfd, struct supertype *st,
struct mdinfo *info, char *backup_file);
+extern int Grow_continue_command(char *devname, int fd,
+ char *backup_file, int verbose);

extern int Assemble(struct supertype *st, char *mddev,
struct mddev_ident *ident,

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 06/14] Check and run mdmon

am 16.09.2011 13:54:32 von adam.kwolek

Reshape can be run for external metadata for monitored array/container
only. In case when array/container is not monitored run mdmon for it.

Signed-off-by: Adam Kwolek
---

Grow.c | 15 +++++++++++++++
1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/Grow.c b/Grow.c
index 2b2cb5a..4d2c361 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3672,6 +3672,21 @@ int Grow_continue_command(char *devname, int fd,
}

sysfs_init(content, fd2, mdstat->devnum);
+
+ /* start mdmon in case it is not run
+ */
+ if (!mdmon_running(container_dev))
+ start_mdmon(container_dev);
+ ping_monitor(container);
+
+ if (mdmon_running(container_dev))
+ st->update_tail = &st->updates;
+ else {
+ fprintf(stderr, Name ": No mdmon found. "
+ "Grow cannot continue.\n");
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
}

/* continue reshape

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 07/14] FIX: Memory leak during Assembly

am 16.09.2011 13:54:40 von adam.kwolek

For fdlist pointer allocated in assemble_container_content() function,
free() is never called. This patch fixes this memory leak.

Signed-off-by: Adam Kwolek
---

Assemble.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/Assemble.c b/Assemble.c
index 25cfec1..66d2ee4 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -1584,6 +1584,7 @@ int assemble_container_content(struct supertype *st, int mdfd,
if (fdlist[spare] >= 0)
close(fdlist[spare]);
}
+ free(fdlist);
if (err) {
fprintf(stderr, Name ": Failed to restore critical"
" section for reshape - sorry.\n");

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 08/14] Check if reshape can be restarted

am 16.09.2011 13:54:48 von adam.kwolek

When reshape was invoked during initramfs start-up stage
array is pushed in to reshape state already, so read only
state cannot be set again. To verify if reshape continuation should
be allowed sync_action sysfs key is tested and number of existing mdadm
instances is checked again.

Signed-off-by: Adam Kwolek
---

Grow.c | 19 +++++++++++++++++--
1 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/Grow.c b/Grow.c
index 4d2c361..b331287 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3714,8 +3714,23 @@ int Grow_continue(int mdfd, struct supertype *st, struct mdinfo *info,
int err;

err = sysfs_set_str(info, NULL, "array_state", "readonly");
- if (err)
- return err;
+ if (err) {
+ /* check if array is set in to reshape stare,
+ * and current mdadm instance is the only one
+ */
+ if ((sysfs_get_str(info, NULL, "sync_action",
+ buf, 40) == 8) &&
+ (count_mdadms() == 1)) {
+ dprintf("Warning: Grow_continue() cannot set array in"
+ "to read only state [sync_action is %s].",
+ buf);
+ if (strncmp(buf, "reshape", 7))
+ return err;
+ dprintf("Allow to continue.\n");
+ } else
+ return err;
+ }
+
if (st->ss->external) {
fmt_devname(buf, st->container_dev);
container = buf;

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 09/14] Move restore backup code to function

am 16.09.2011 13:54:55 von adam.kwolek

Reshape backup should be able to be restored during reshape continuation
also. To reuse already existing code it is moved to function.

Signed-off-by: Adam Kwolek
---

Assemble.c | 40 +++------------------------------------
Grow.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
mdadm.h | 6 ++++++
3 files changed, 71 insertions(+), 37 deletions(-)

diff --git a/Assemble.c b/Assemble.c
index 66d2ee4..6188fb5 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -1555,44 +1555,10 @@ int assemble_container_content(struct supertype *st, int mdfd,

if (content->reshape_active) {
int spare = content->array.raid_disks + expansion;
- int i;
- int *fdlist = malloc(sizeof(int) *
- (working + expansion
- + content->array.raid_disks));
- for (i=0; i - fdlist[i] = -1;
- for (dev = content->devs; dev; dev = dev->next) {
- char buf[20];
- int fd;
- sprintf(buf, "%d:%d",
- dev->disk.major,
- dev->disk.minor);
- fd = dev_open(buf, O_RDWR);
-
- if (dev->disk.raid_disk >= 0)
- fdlist[dev->disk.raid_disk] = fd;
- else
- fdlist[spare++] = fd;
- }
- if (st->ss->external && st->ss->recover_backup)
- err = st->ss->recover_backup(st, content);
- else
- err = Grow_restart(st, content, fdlist, spare,
- backup_file, verbose > 0);
- while (spare > 0) {
- spare--;
- if (fdlist[spare] >= 0)
- close(fdlist[spare]);
- }
- free(fdlist);
- if (err) {
- fprintf(stderr, Name ": Failed to restore critical"
- " section for reshape - sorry.\n");
- if (!backup_file)
- fprintf(stderr, Name ": Possibly you need"
- " to specify a --backup-file\n");
+ if (restore_backup(st, content,
+ working,
+ spare, backup_file, verbose))
return 1;
- }

err = Grow_continue(mdfd, st, content, backup_file);
} else switch(content->array.level) {
diff --git a/Grow.c b/Grow.c
index b331287..0b96f7a 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3578,6 +3578,68 @@ int Grow_restart(struct supertype *st, struct mdinfo *info, int *fdlist, int cnt
return 1;
}

+int restore_backup(struct supertype *st,
+ struct mdinfo *content,
+ int working_disks,
+ int spares,
+ char *backup_file,
+ int verbose)
+{
+ int i;
+ int *fdlist;
+ struct mdinfo *dev;
+ int err;
+ int disk_count = working_disks + spares;
+ int next_spare = working_disks;
+
+ dprintf("Called restore_backup()\n");
+ fdlist = malloc(sizeof(int) * disk_count);
+ if (fdlist == NULL) {
+ fprintf(stderr,
+ Name ": cannot allocate memory for disk list\n");
+ return 1;
+ }
+ for (i = 0; i < disk_count; i++)
+ fdlist[i] = -1;
+ for (dev = content->devs; dev; dev = dev->next) {
+ char buf[22];
+ int fd;
+ sprintf(buf, "%d:%d",
+ dev->disk.major,
+ dev->disk.minor);
+ fd = dev_open(buf, O_RDWR);
+
+ if (dev->disk.raid_disk >= 0)
+ fdlist[dev->disk.raid_disk] = fd;
+ else
+ fdlist[next_spare++] = fd;
+ }
+
+ if (st->ss->external && st->ss->recover_backup)
+ err = st->ss->recover_backup(st, content);
+ else
+ err = Grow_restart(st, content, fdlist, spares,
+ backup_file, verbose > 0);
+
+ while (disk_count > 0) {
+ disk_count--;
+ if (fdlist[disk_count] >= 0)
+ close(fdlist[disk_count]);
+ }
+ free(fdlist);
+ if (err) {
+ fprintf(stderr, Name ": Failed to restore critical"
+ " section for reshape - sorry.\n");
+ if (!backup_file)
+ fprintf(stderr, Name ": Possibly you need"
+ " to specify a --backup-file\n");
+ return 1;
+ }
+
+ dprintf("restore_backup() returns status OK.\n");
+ return 0;
+}
+
int Grow_continue_command(char *devname, int fd,
char *backup_file, int verbose)
{
diff --git a/mdadm.h b/mdadm.h
index 7761db6..6165979 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1026,6 +1026,12 @@ extern int Grow_continue(int mdfd, struct supertype *st,
struct mdinfo *info, char *backup_file);
extern int Grow_continue_command(char *devname, int fd,
char *backup_file, int verbose);
+extern int restore_backup(struct supertype *st,
+ struct mdinfo *content,
+ int working_disks,
+ int spares,
+ char *backup_file,
+ int verbose);

extern int Assemble(struct supertype *st, char *mddev,
struct mddev_ident *ident,

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 10/14] Perform restore backup for reshape continuation

am 16.09.2011 13:55:03 von adam.kwolek

It can happen that reshape is broken by different reason than reboot.
this means that on reshape continuation start we cannot be sure
that critical section has been restored already.

Restore data from checkpoint before reshape continuation to avoid
data corruption.

Signed-off-by: Adam Kwolek
---

Grow.c | 18 +++++++++++++++++-
1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/Grow.c b/Grow.c
index 0b96f7a..da8b72d 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3652,6 +3652,9 @@ int Grow_continue_command(char *devname, int fd,
char container[20];
int cfd = -1;
int fd2 = -1;
+ int spares;
+ int working_disks;
+ struct mdinfo *dev;

dprintf("Grow continue from command line called for %s\n",
devname);
@@ -3753,7 +3756,20 @@ int Grow_continue_command(char *devname, int fd,

/* continue reshape
*/
- ret_val = Grow_continue(fd, st, content, backup_file);
+ dev = content->devs;
+ spares = 0;
+ working_disks = 0;
+ while (dev) {
+ if (dev->disk.raid_disk == -1)
+ spares++;
+ else
+ working_disks++;
+ dev = dev->next;
+ }
+ ret_val = restore_backup(st, content, working_disks, spares,
+ backup_file, verbose);
+ if (!ret_val)
+ ret_val = Grow_continue(fd, st, content, backup_file);

Grow_continue_command_exit:
if (fd2 > -1)

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 11/14] Add possibility to restart reshape for native metadata

am 16.09.2011 13:55:10 von adam.kwolek

When mdadm services native metadata, check-pointing functions
can lost file system context in the same way as for external metadata.

This patch adds native metadata continuation support.

Signed-off-by: Adam Kwolek
---

Grow.c | 15 +++++++++------
1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/Grow.c b/Grow.c
index da8b72d..fa22382 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3646,6 +3646,7 @@ int Grow_continue_command(char *devname, int fd,
int ret_val = 0;
struct supertype *st = NULL;
struct mdinfo *content = NULL;
+ struct mdinfo array;
char *subarray = NULL;
struct mdinfo *cc = NULL;
struct mdstat_ent *mdstat = NULL;
@@ -3669,12 +3670,14 @@ int Grow_continue_command(char *devname, int fd,
dprintf("Grow continue is run for ");
if (st->ss->external == 0) {
dprintf("native array (%s)\n", devname);
- /* FIXME : temporary just exit
- * for native metadata case
- */
- st->ss->free_super(st);
- free(subarray);
- exit(0);
+ if (ioctl(fd, GET_ARRAY_INFO, &array) < 0) {
+ fprintf(stderr, Name ": %s is not an active md array -"
+ " aborting\n", devname);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+ content = &array;
+ sysfs_init(content, fd, st->devnum);
} else {
int container_dev;


--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 12/14] Early reshape backup verification

am 16.09.2011 13:55:18 von adam.kwolek

Ever reshape needs to backup its critical section. For this purposes
backup file or external metadata mechanisms can be used.

Add verification if backup file or metadata mechanisms can have chance
to restore data form backup.
For native metadata existence of passed (from command line) backup file
is checked. For external metadta manage_reshape() and recover_backup()
are tested for this purposes.

Signed-off-by: Adam Kwolek
---

Grow.c | 12 ++++++++++++
1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/Grow.c b/Grow.c
index fa22382..d54b8a8 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3667,6 +3667,18 @@ int Grow_continue_command(char *devname, int fd,
devname);
return 1;
}
+
+ /* request backup file
+ * when no metadta specific backup action is defined
+ */
+ if ((backup_file == NULL) &&
+ ((st->ss->manage_reshape == NULL) ||
+ (st->ss->recover_backup == NULL))) {
+ fprintf(stderr, Name ": Please provide backup file "
+ "for reshape continuation.\n");
+ return 1;
+ }
+
dprintf("Grow continue is run for ");
if (st->ss->external == 0) {
dprintf("native array (%s)\n", devname);

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 13/14] Check if md allows to control reshape

am 16.09.2011 13:55:25 von adam.kwolek

It can happen that there is no mdadm in memory and 'max' was already
set to sync_max. Such array cannot be put under check pointing control
again.

Verify such situation and refuse to control array in such condition.

Signed-off-by: Adam Kwolek
---

Grow.c | 19 +++++++++++++++++++
1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/Grow.c b/Grow.c
index d54b8a8..04e6679 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3656,6 +3656,7 @@ int Grow_continue_command(char *devname, int fd,
int spares;
int working_disks;
struct mdinfo *dev;
+ char buf[40];

dprintf("Grow continue from command line called for %s\n",
devname);
@@ -3769,6 +3770,24 @@ int Grow_continue_command(char *devname, int fd,
}
}

+ /* verify that array under reshape is stopped
+ */
+ ret_val = sysfs_get_str(content, NULL, "sync_max", buf, 40);
+ if (ret_val <= 0) {
+ fprintf(stderr, Name
+ ": cannot open verify reshape progress for %s (%i)\n",
+ content->sys_name, ret_val);
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+ dprintf(Name ": Read sync_max sysfs entry is: %s\n", buf);
+ if (strncmp(buf, "max", 3) == 0) {
+ fprintf(stderr, Name ": md is not allowed to finish reshape "
+ "wihout mdadm assistance.\n");
+ ret_val = 1;
+ goto Grow_continue_command_exit;
+ }
+
/* continue reshape
*/
dev = content->devs;

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

[PATCH 14/14] Manual update for --continue option

am 16.09.2011 13:55:34 von adam.kwolek

Patch adds to mdadm man the following information:

--continue
This is needed when --grow operation is interrupted and it is not
restarted automatically during array assembly or array cannot
be disassembled and assembled again. Option --continue has to be used
together with -G , ( --grow ) command and device that it should be
executed on. All parameters required for reshape continuation will
be read from array metadata. If initial --grow command had
required --backup-file= option to be set, continuation option will
require to have exactly the same backup file pointed to also.

Any other parameter passed together with --continue option will
be ignored.

Signed-off-by: Adam Kwolek
---

mdadm.8.in | 24 ++++++++++++++++++++++++
1 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/mdadm.8.in b/mdadm.8.in
index e22fde4..e677be4 100644
--- a/mdadm.8.in
+++ b/mdadm.8.in
@@ -733,6 +733,30 @@ The file must be stored on a separate device, not on the RAID array
being reshaped.

.TP
+.BR \-\-continue
+This is needed when
+.B \-\-grow
+operation is interrupted and it is not restarted automatically during array
+assembly or array cannot be disassembled and assembled again. Option
+.BR \-\-continue
+has to be used together with
+.BR \-G
+, (
+.BR \-\-grow
+) command and device that it should be executed on.
+All parameters required for reshape continuation will be read from array metadata.
+If initial
+.BR \-\-grow
+command had required
+.BR \-\-backup\-file=
+option to be set, continuation option will require to have exactly the same
+backup file pointed to also.
+.IP
+Any other parameter passed together with
+.BR \-\-continue
+option will be ignored.
+
+.TP
.BR \-N ", " \-\-name=
Set a
.B name

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 01/14] Stop array reshape when mounted initramfs isdetected

am 19.09.2011 12:42:20 von NeilBrown

--Sig_//GzYkwxMi4Z9eEEqBW5YxkH
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Fri, 16 Sep 2011 13:53:54 +0200 Adam Kwolek wrot=
e:

> During remounting file system from initramfs to real one,
> mdadm is not able to continue reshape procedure due to lost
> filesystem context.
> To avoid this mdadm detects mounted initramfs and performs
> critical section restore only. Array is set for later
> reshape continuation.
>=20
> Reshape continuation will be possible to invoking manually,
> using '--continue' option.
>=20
> Signed-off-by: Adam Kwolek
> ---
>=20
> Grow.c | 25 +++++++++++++++++++++++--
> mdadm.h | 4 ++++
> util.c | 22 ++++++++++++++++++++++
> 3 files changed, 49 insertions(+), 2 deletions(-)
>=20
> diff --git a/Grow.c b/Grow.c
> index 17d14b6..420177c 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -635,10 +635,21 @@ static int subarray_set_num(char *container, struct=
mdinfo *sra, char *name, int
> return rc;
> }
> =20
> +/* possible return values:
> + * 0 : success
> + * -1: error
> + * INITRAMFS_IS_IN_USE: continue reshape later
> + */
> int start_reshape(struct mdinfo *sra, int already_running)
> {
> int err;
> - sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
> +
> + /* do not block array as we not continue reshape this time
> + */
> + if (initramfs_is_in_use() == INITRAMFS_NOT_USED)
> + sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
> + else
> + sysfs_set_num(sra, NULL, "suspend_lo", 0);
> err =3D sysfs_set_num(sra, NULL, "suspend_hi", 0);
> err =3D err ?: sysfs_set_num(sra, NULL, "suspend_lo", 0);
> if (!already_running)
> @@ -2190,6 +2201,14 @@ started:
> }
> if (restart)
> sysfs_set_str(sra, NULL, "array_state", "active");
> + if (initramfs_is_in_use() == INITRAMFS_IS_IN_USE) {
> + free(fdlist);
> + free(offsets);
> + sysfs_free(sra);
> + fprintf(stderr, Name ": Reshape has to be continued "
> + "when root fileststem will be mounted\n");
> + return 1;
> + }
> =20
> /* Now we just need to kick off the reshape and watch, while
> * handling backups of the data...
> @@ -2357,7 +2376,9 @@ int reshape_container(char *container, char *devnam=
e,
> unfreeze(st);
> return 1;
> default: /* parent */
> - printf(Name ": multi-array reshape continues in background\n");
> + if (initramfs_is_in_use() == INITRAMFS_NOT_USED)
> + printf(Name ": multi-array reshape continues "
> + "in background\n");
> return 0;
> case 0: /* child */
> break;
> diff --git a/mdadm.h b/mdadm.h
> index d616966..9035e67 100644
> --- a/mdadm.h
> +++ b/mdadm.h
> @@ -1161,6 +1161,10 @@ extern char *human_size(long long bytes);
> extern char *human_size_brief(long long bytes);
> extern void print_r10_layout(int layout);
> =20
> +extern int initramfs_is_in_use(void);
> +#define INITRAMFS_IS_IN_USE 1
> +#define INITRAMFS_NOT_USED 0
> +
> #define NoMdDev (1<<23)
> extern int find_free_devnum(int use_partitions);
> =20
> diff --git a/util.c b/util.c
> index 0ea7e0d..7186f3f 100644
> --- a/util.c
> +++ b/util.c
> @@ -1768,3 +1768,25 @@ struct mdinfo *container_choose_spares(struct supe=
rtype *st,
> }
> return disks;
> }
> +
> +int initramfs_is_in_use(void)
> +{
> + int ret_val;
> + struct stat sts;
> +
> + /* /init
> + * common script for many Linux distribution
> + * /sysboot
> + * is specific for RH
> + * /mkinitrd.config
> + * is specific for SLES
> + */
> + if ((stat("/init", &sts) == -1) ||
> + ((stat("/sysroot", &sts) == -1) &&
> + (stat("/mkinitrd.config", &sts) == -1)))
> + ret_val =3D INITRAMFS_NOT_USED;
> + else
> + ret_val =3D INITRAMFS_IS_IN_USE;
> +
> + return ret_val;
> +}
>

Hi,
I meant to look through this today but didn't manage to get to it.

However an easy comment is that this initramfs_is_in_use is not acceptable.
You are measuring symptoms and not the cause.

Unfortunately it isn't really easy to measure "Is someone likely to want to
replace the current root filesystem with a different one soon".

In the first instance I think this should be implemented as a command line
option: --freeze-reshape maybe. It is understood with --assemble to mean
that any reshape should not be continued.
It should be possible to get that into the mdadm flags during initrd but not
in the root filesystem.

The second patch looks like it should be combined with the first.

The rest seems OK at a first glance but I'll need to review it more careful=
ly.

Thanks,
NeilBrown

--Sig_//GzYkwxMi4Z9eEEqBW5YxkH
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iD8DBQFOdxyMG5fc6gV+Wb0RAj/nAKDOAOohPvH4FS9NmDkPMbozGEG/5gCe LAvH
mAoCZe8VH/AiCSCi0l0B4Dg=
=BDS0
-----END PGP SIGNATURE-----

--Sig_//GzYkwxMi4Z9eEEqBW5YxkH--
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

RE: [PATCH 01/14] Stop array reshape when mounted initramfs isdetected

am 19.09.2011 13:14:35 von adam.kwolek

> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of NeilBrown
> Sent: Monday, September 19, 2011 12:42 PM
> To: Kwolek, Adam
> Cc: linux-raid@vger.kernel.org; Ciechanowski, Ed; Labun, Marcin
> Subject: Re: [PATCH 01/14] Stop array reshape when mounted initramfs is
> detected
>
> On Fri, 16 Sep 2011 13:53:54 +0200 Adam Kwolek
> wrote:
>
> > During remounting file system from initramfs to real one, mdadm is not
> > able to continue reshape procedure due to lost filesystem context.
> > To avoid this mdadm detects mounted initramfs and performs critical
> > section restore only. Array is set for later reshape continuation.
> >
> > Reshape continuation will be possible to invoking manually, using
> > '--continue' option.
> >
> > Signed-off-by: Adam Kwolek
> > ---
> >
> > Grow.c | 25 +++++++++++++++++++++++--
> > mdadm.h | 4 ++++
> > util.c | 22 ++++++++++++++++++++++
> > 3 files changed, 49 insertions(+), 2 deletions(-)
> >
> > diff --git a/Grow.c b/Grow.c
> > index 17d14b6..420177c 100644
> > --- a/Grow.c
> > +++ b/Grow.c
> > @@ -635,10 +635,21 @@ static int subarray_set_num(char *container,
> struct mdinfo *sra, char *name, int
> > return rc;
> > }
> >
> > +/* possible return values:
> > + * 0 : success
> > + * -1: error
> > + * INITRAMFS_IS_IN_USE: continue reshape later */
> > int start_reshape(struct mdinfo *sra, int already_running) {
> > int err;
> > - sysfs_set_num(sra, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
> > +
> > + /* do not block array as we not continue reshape this time
> > + */
> > + if (initramfs_is_in_use() == INITRAMFS_NOT_USED)
> > + sysfs_set_num(sra, NULL, "suspend_lo",
> 0x7FFFFFFFFFFFFFFFULL);
> > + else
> > + sysfs_set_num(sra, NULL, "suspend_lo", 0);
> > err = sysfs_set_num(sra, NULL, "suspend_hi", 0);
> > err = err ?: sysfs_set_num(sra, NULL, "suspend_lo", 0);
> > if (!already_running)
> > @@ -2190,6 +2201,14 @@ started:
> > }
> > if (restart)
> > sysfs_set_str(sra, NULL, "array_state", "active");
> > + if (initramfs_is_in_use() == INITRAMFS_IS_IN_USE) {
> > + free(fdlist);
> > + free(offsets);
> > + sysfs_free(sra);
> > + fprintf(stderr, Name ": Reshape has to be continued "
> > + "when root fileststem will be mounted\n");
> > + return 1;
> > + }
> >
> > /* Now we just need to kick off the reshape and watch, while
> > * handling backups of the data...
> > @@ -2357,7 +2376,9 @@ int reshape_container(char *container, char
> *devname,
> > unfreeze(st);
> > return 1;
> > default: /* parent */
> > - printf(Name ": multi-array reshape continues in
> background\n");
> > + if (initramfs_is_in_use() == INITRAMFS_NOT_USED)
> > + printf(Name ": multi-array reshape continues "
> > + "in background\n");
> > return 0;
> > case 0: /* child */
> > break;
> > diff --git a/mdadm.h b/mdadm.h
> > index d616966..9035e67 100644
> > --- a/mdadm.h
> > +++ b/mdadm.h
> > @@ -1161,6 +1161,10 @@ extern char *human_size(long long bytes);
> > extern char *human_size_brief(long long bytes); extern void
> > print_r10_layout(int layout);
> >
> > +extern int initramfs_is_in_use(void);
> > +#define INITRAMFS_IS_IN_USE 1
> > +#define INITRAMFS_NOT_USED 0
> > +
> > #define NoMdDev (1<<23)
> > extern int find_free_devnum(int use_partitions);
> >
> > diff --git a/util.c b/util.c
> > index 0ea7e0d..7186f3f 100644
> > --- a/util.c
> > +++ b/util.c
> > @@ -1768,3 +1768,25 @@ struct mdinfo *container_choose_spares(struct
> supertype *st,
> > }
> > return disks;
> > }
> > +
> > +int initramfs_is_in_use(void)
> > +{
> > + int ret_val;
> > + struct stat sts;
> > +
> > + /* /init
> > + * common script for many Linux distribution
> > + * /sysboot
> > + * is specific for RH
> > + * /mkinitrd.config
> > + * is specific for SLES
> > + */
> > + if ((stat("/init", &sts) == -1) ||
> > + ((stat("/sysroot", &sts) == -1) &&
> > + (stat("/mkinitrd.config", &sts) == -1)))
> > + ret_val = INITRAMFS_NOT_USED;
> > + else
> > + ret_val = INITRAMFS_IS_IN_USE;
> > +
> > + return ret_val;
> > +}
> >
>
> Hi,
> I meant to look through this today but didn't manage to get to it.
>
> However an easy comment is that this initramfs_is_in_use is not
> acceptable.
> You are measuring symptoms and not the cause.
>
> Unfortunately it isn't really easy to measure "Is someone likely to want
> to replace the current root filesystem with a different one soon".
>
> In the first instance I think this should be implemented as a command
> line
> option: --freeze-reshape maybe. It is understood with --assemble to
> mean that any reshape should not be continued.
> It should be possible to get that into the mdadm flags during initrd but
> not in the root filesystem.
>
> The second patch looks like it should be combined with the first.
>
> The rest seems OK at a first glance but I'll need to review it more
> carefully.
>
> Thanks,
> NeilBrown

Thank you for the first part of the review.

I was worry about file system detection I've made, and you confirms my doubts.
I like your idea about another option for reshape freeze at the early boot stage.
I'll include it in second approach.

I'm waiting for rest of your review.
When I'll get it, I'll prepare updated fix.

BR
Adam

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 03/14] FIX: Do not unblock array accidentally

am 21.09.2011 04:19:27 von NeilBrown

--Sig_/oI.MeND65t.Kh.3ljUu9qXB
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Fri, 16 Sep 2011 13:54:09 +0200 Adam Kwolek wrot=
e:

> When sysfs_set_array() function is called, it tests if array
> can be configured using sysfs. Setting metadata_version entry
> can accidentally unblock mdmon when array is under reshape.
> To avoid this, blocking character '-' is checked and if is is set,
> it is used for array test.
>=20
> Signed-off-by: Adam Kwolek
> ---
>=20
> sysfs.c | 10 ++++++++++
> 1 files changed, 10 insertions(+), 0 deletions(-)
>=20
> diff --git a/sysfs.c b/sysfs.c
> index 2146264..903202d 100644
> --- a/sysfs.c
> +++ b/sysfs.c
> @@ -541,7 +541,17 @@ int sysfs_set_array(struct mdinfo *info, int vers)
> ver[0] =3D 0;
> if (info->array.major_version == -1 &&
> info->array.minor_version == -2) {
> + char buf[1024];
> strcat(strcpy(ver, "external:"), info->text_version);
> + if (sysfs_get_str(info, NULL, "metadata_version",
> + buf, 1024) > 0) {
> + /* Testing string should not affect
> + * monitor blocking functionality
> + * Set blocking character if present in sysfs already
> + */
> + if (buf[9] == '-')
> + ver[9] =3D '-';
> + }
> =20
> if ((vers % 100) < 2 ||
> sysfs_set_str(info, NULL, "metadata_version",
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

I have applied this patch, though I rewrote the comment a bit.

thanks,
NeilBrown


--Sig_/oI.MeND65t.Kh.3ljUu9qXB
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iD8DBQFOeUmwG5fc6gV+Wb0RAgp/AJ9vHljilApwpV/jEN74EQwLNTw3xgCg yFvj
bdTfOZffFhVQazBgXiWatqE=
=L2B9
-----END PGP SIGNATURE-----

--Sig_/oI.MeND65t.Kh.3ljUu9qXB--
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 04/14] Add continue option to grow command

am 21.09.2011 04:22:07 von NeilBrown

--Sig_/m1K4vtn5EBo0xcDWQ7da6Hj
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Fri, 16 Sep 2011 13:54:17 +0200 Adam Kwolek wrot=
e:

> To allow for reshape continuation, 'continue' option is added
> to grow command. It is not possible to check, if any existing mdadm
> instance is going to take ownership on blocked md array or it is
> orphaned/interrupted reshape.
> To resolve this problem grow command with continue option,
> is allowed when current mdadm instance is only one in the system.
>=20
> Signed-off-by: Adam Kwolek

Checking how many instances of mdadm are running is definitely wrong.
What are you trying to guard against here?

If we do need locking to prevent two processes trying to manage an array at
the same time, I would suggest creating a 'pid' file
- /var/run/mdadm-reshape-mdXXX.pid
or something like that.

NeilBrown


> ---
>=20
> ReadMe.c | 1 +
> mdadm.c | 17 ++++++++++++++++-
> mdadm.h | 2 ++
> util.c | 37 +++++++++++++++++++++++++++++++++++++
> 4 files changed, 56 insertions(+), 1 deletions(-)
>=20
> diff --git a/ReadMe.c b/ReadMe.c
> index b658841..f0dc0d9 100644
> --- a/ReadMe.c
> +++ b/ReadMe.c
> @@ -190,6 +190,7 @@ struct option long_options[] =3D {
> {"backup-file", 1,0, BackupFile},
> {"invalid-backup",0,0,InvalidBackup},
> {"array-size", 1, 0, 'Z'},
> + {"continue", 0, 0, Continue},
> =20
> /* For Incremental */
> {"rebuild-map", 0, 0, RebuildMapOpt},
> diff --git a/mdadm.c b/mdadm.c
> index 4b817ab..f30534a 100644
> --- a/mdadm.c
> +++ b/mdadm.c
> @@ -74,6 +74,7 @@ int main(int argc, char *argv[])
> int export =3D 0;
> int assume_clean =3D 0;
> char *symlinks =3D NULL;
> + int grow_continue =3D 0;
> /* autof indicates whether and how to create device node.
> * bottom 3 bits are style. Rest (when shifted) are number of parts
> * 0 - unset
> @@ -988,7 +989,21 @@ int main(int argc, char *argv[])
> }
> backup_file =3D optarg;
> continue;
> -
> + case O(GROW, Continue):
> + /* Continuer broken grow
> + * To be sure that continuation is allowed check
> + * if there is only one (current) mdadm instance
> + */
> + grow_continue =3D count_mdadms();
> + fprintf(stderr, Name ": %i mdadm instance(s) found\n",
> + grow_continue);
> + if (grow_continue !=3D 1) {
> + fprintf(stderr, Name ": Grow continuation "
> + "requires single mdadm instance "
> + "running.\n");
> + exit(2);
> + }
> + continue;
> case O(ASSEMBLE, InvalidBackup):
> /* Acknowledge that the backupfile is invalid, but ask
> * to continue anyway
> diff --git a/mdadm.h b/mdadm.h
> index 9035e67..7e3a618 100644
> --- a/mdadm.h
> +++ b/mdadm.h
> @@ -313,6 +313,7 @@ enum special_options {
> RebuildMapOpt,
> InvalidBackup,
> UdevRules,
> + Continue,
> };
> =20
> /* structures read from config file */
> @@ -1162,6 +1163,7 @@ extern char *human_size_brief(long long bytes);
> extern void print_r10_layout(int layout);
> =20
> extern int initramfs_is_in_use(void);
> +extern int count_mdadms(void);
> #define INITRAMFS_IS_IN_USE 1
> #define INITRAMFS_NOT_USED 0
> =20
> diff --git a/util.c b/util.c
> index 7186f3f..5077c2f 100644
> --- a/util.c
> +++ b/util.c
> @@ -1790,3 +1790,40 @@ int initramfs_is_in_use(void)
> =20
> return ret_val;
> }
> +
> +int count_mdadms(void)
> +{
> + int ret_val =3D 0;
> + DIR *dirp;
> + struct dirent *d_entry;
> + int fd;
> +
> + dirp =3D opendir("/proc");
> + if (!dirp)
> + return -1;
> +
> + while ((d_entry =3D readdir(dirp)) !=3D NULL) {
> + char p[PATH_MAX];
> +
> + if (!strcmp(d_entry->d_name, ".") ||
> + !strcmp(d_entry->d_name, "..") ||
> + !strcmp(d_entry->d_name, "self"))
> + continue;
> + errno =3D 0;
> + fd =3D (int)strtol(d_entry->d_name, NULL, 10);
> + if (errno)
> + continue;
> + sprintf(p, "/proc/%s/cmdline", d_entry->d_name);
> + fd =3D open(p, O_RDONLY);
> + if (fd < 0)
> + continue;
> + if (read(fd, p, PATH_MAX) > 0) {
> + if (strcmp(p, "mdadm") == 0)
> + ret_val++;
> + }
> + close(fd);
> + }
> + closedir(dirp);
> +
> + return ret_val;
> +}
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html


--Sig_/m1K4vtn5EBo0xcDWQ7da6Hj
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iD8DBQFOeUpPG5fc6gV+Wb0RAgncAJ0eHlgvGZ1DNY5nVwh36C4y+a6WGACg v5DU
hwwyTZZH6aByM7b6nMqU54o=
=a/9F
-----END PGP SIGNATURE-----

--Sig_/m1K4vtn5EBo0xcDWQ7da6Hj--
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 07/14] FIX: Memory leak during Assembly

am 21.09.2011 04:23:10 von NeilBrown

--Sig_/V7SI/DHPlBT0Sxi_3vo70RA
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Fri, 16 Sep 2011 13:54:40 +0200 Adam Kwolek wrot=
e:

> For fdlist pointer allocated in assemble_container_content() function,
> free() is never called. This patch fixes this memory leak.
>=20
> Signed-off-by: Adam Kwolek
> ---
>=20
> Assemble.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>=20
> diff --git a/Assemble.c b/Assemble.c
> index 25cfec1..66d2ee4 100644
> --- a/Assemble.c
> +++ b/Assemble.c
> @@ -1584,6 +1584,7 @@ int assemble_container_content(struct supertype *st=
, int mdfd,
> if (fdlist[spare] >=3D 0)
> close(fdlist[spare]);
> }
> + free(fdlist);
> if (err) {
> fprintf(stderr, Name ": Failed to restore critical"
> " section for reshape - sorry.\n");
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html


I've applied this fix, thanks.

NeilBrown


--Sig_/V7SI/DHPlBT0Sxi_3vo70RA
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iD8DBQFOeUqOG5fc6gV+Wb0RAtYMAKCEL5hmOBNEx8U2ZsgyDowfMf4KbgCf RhTi
o5eKiuZMAUiKRb++n7DhYgM=
=+Hji
-----END PGP SIGNATURE-----

--Sig_/V7SI/DHPlBT0Sxi_3vo70RA--
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 08/14] Check if reshape can be restarted

am 21.09.2011 04:27:34 von NeilBrown

--Sig_/UxayZO/0a8w2QGrtIAC/_HD
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Fri, 16 Sep 2011 13:54:48 +0200 Adam Kwolek wrot=
e:

> When reshape was invoked during initramfs start-up stage
> array is pushed in to reshape state already, so read only
> state cannot be set again. To verify if reshape continuation should
> be allowed sync_action sysfs key is tested and number of existing mdadm
> instances is checked again.
>=20
> Signed-off-by: Adam Kwolek

Again, it is not at all clear to me what you are trying to protect against
here...

The "set_str ... readonly" is going to need to be changed.

In the normal case, this is where the array is activated, but it is activat=
ed
read-only to ensure no reshape automatically starts.

In the --continue case it is already running and reshape is suspended, so
there is no need to set it to readonly when Grow_continue is called from
Grow_continue_command.

NeilBrown


> ---
>=20
> Grow.c | 19 +++++++++++++++++--
> 1 files changed, 17 insertions(+), 2 deletions(-)
>=20
> diff --git a/Grow.c b/Grow.c
> index 4d2c361..b331287 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -3714,8 +3714,23 @@ int Grow_continue(int mdfd, struct supertype *st, =
struct mdinfo *info,
> int err;
> =20
> err =3D sysfs_set_str(info, NULL, "array_state", "readonly");
> - if (err)
> - return err;
> + if (err) {
> + /* check if array is set in to reshape stare,
> + * and current mdadm instance is the only one
> + */
> + if ((sysfs_get_str(info, NULL, "sync_action",
> + buf, 40) == 8) &&
> + (count_mdadms() == 1)) {
> + dprintf("Warning: Grow_continue() cannot set array in"
> + "to read only state [sync_action is %s].",
> + buf);
> + if (strncmp(buf, "reshape", 7))
> + return err;
> + dprintf("Allow to continue.\n");
> + } else
> + return err;
> + }
> +
> if (st->ss->external) {
> fmt_devname(buf, st->container_dev);
> container =3D buf;
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html


--Sig_/UxayZO/0a8w2QGrtIAC/_HD
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iD8DBQFOeUuWG5fc6gV+Wb0RAkvIAKCTAtpQlfgOCgTjqu0SHibE4gMmdQCe IJsf
/RG2uoEr4gn9Yts2DJhwbSs=
=wwAC
-----END PGP SIGNATURE-----

--Sig_/UxayZO/0a8w2QGrtIAC/_HD--
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 09/14] Move restore backup code to function

am 21.09.2011 04:28:42 von NeilBrown

--Sig_/L7eCMGiUC+xX7AXL9R2Rji=
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Fri, 16 Sep 2011 13:54:55 +0200 Adam Kwolek wrot=
e:

> Reshape backup should be able to be restored during reshape continuation
> also. To reuse already existing code it is moved to function.

I have applied this, but I had to fix it up a bit first.
'spare' is not the same as 'spares' - it is closer to 'next_spare'.

Thanks,
NeilBrown

>=20
> Signed-off-by: Adam Kwolek
> ---
>=20
> Assemble.c | 40 +++------------------------------------
> Grow.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++=
++++++
> mdadm.h | 6 ++++++
> 3 files changed, 71 insertions(+), 37 deletions(-)
>=20
> diff --git a/Assemble.c b/Assemble.c
> index 66d2ee4..6188fb5 100644
> --- a/Assemble.c
> +++ b/Assemble.c
> @@ -1555,44 +1555,10 @@ int assemble_container_content(struct supertype *=
st, int mdfd,
> =20
> if (content->reshape_active) {
> int spare =3D content->array.raid_disks + expansion;
> - int i;
> - int *fdlist =3D malloc(sizeof(int) *
> - (working + expansion
> - + content->array.raid_disks));
> - for (i=3D0; i > - fdlist[i] =3D -1;
> - for (dev =3D content->devs; dev; dev =3D dev->next) {
> - char buf[20];
> - int fd;
> - sprintf(buf, "%d:%d",
> - dev->disk.major,
> - dev->disk.minor);
> - fd =3D dev_open(buf, O_RDWR);
> -
> - if (dev->disk.raid_disk >=3D 0)
> - fdlist[dev->disk.raid_disk] =3D fd;
> - else
> - fdlist[spare++] =3D fd;
> - }
> - if (st->ss->external && st->ss->recover_backup)
> - err =3D st->ss->recover_backup(st, content);
> - else
> - err =3D Grow_restart(st, content, fdlist, spare,
> - backup_file, verbose > 0);
> - while (spare > 0) {
> - spare--;
> - if (fdlist[spare] >=3D 0)
> - close(fdlist[spare]);
> - }
> - free(fdlist);
> - if (err) {
> - fprintf(stderr, Name ": Failed to restore critical"
> - " section for reshape - sorry.\n");
> - if (!backup_file)
> - fprintf(stderr, Name ": Possibly you need"
> - " to specify a --backup-file\n");
> + if (restore_backup(st, content,
> + working,
> + spare, backup_file, verbose))
> return 1;
> - }
> =20
> err =3D Grow_continue(mdfd, st, content, backup_file);
> } else switch(content->array.level) {
> diff --git a/Grow.c b/Grow.c
> index b331287..0b96f7a 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -3578,6 +3578,68 @@ int Grow_restart(struct supertype *st, struct mdin=
fo *info, int *fdlist, int cnt
> return 1;
> }
> =20
> +int restore_backup(struct supertype *st,
> + struct mdinfo *content,
> + int working_disks,
> + int spares,
> + char *backup_file,
> + int verbose)
> +{
> + int i;
> + int *fdlist;
> + struct mdinfo *dev;
> + int err;
> + int disk_count =3D working_disks + spares;
> + int next_spare =3D working_disks;
> +
> + dprintf("Called restore_backup()\n");
> + fdlist =3D malloc(sizeof(int) * disk_count);
> + if (fdlist == NULL) {
> + fprintf(stderr,
> + Name ": cannot allocate memory for disk list\n");
> + return 1;
> + }
> + for (i =3D 0; i < disk_count; i++)
> + fdlist[i] =3D -1;
> + for (dev =3D content->devs; dev; dev =3D dev->next) {
> + char buf[22];
> + int fd;
> + sprintf(buf, "%d:%d",
> + dev->disk.major,
> + dev->disk.minor);
> + fd =3D dev_open(buf, O_RDWR);
> +
> + if (dev->disk.raid_disk >=3D 0)
> + fdlist[dev->disk.raid_disk] =3D fd;
> + else
> + fdlist[next_spare++] =3D fd;
> + }
> +
> + if (st->ss->external && st->ss->recover_backup)
> + err =3D st->ss->recover_backup(st, content);
> + else
> + err =3D Grow_restart(st, content, fdlist, spares,
> + backup_file, verbose > 0);
> +
> + while (disk_count > 0) {
> + disk_count--;
> + if (fdlist[disk_count] >=3D 0)
> + close(fdlist[disk_count]);
> + }
> + free(fdlist);
> + if (err) {
> + fprintf(stderr, Name ": Failed to restore critical"
> + " section for reshape - sorry.\n");
> + if (!backup_file)
> + fprintf(stderr, Name ": Possibly you need"
> + " to specify a --backup-file\n");
> + return 1;
> + }
> +
> + dprintf("restore_backup() returns status OK.\n");
> + return 0;
> +}
> +
> int Grow_continue_command(char *devname, int fd,
> char *backup_file, int verbose)
> {
> diff --git a/mdadm.h b/mdadm.h
> index 7761db6..6165979 100644
> --- a/mdadm.h
> +++ b/mdadm.h
> @@ -1026,6 +1026,12 @@ extern int Grow_continue(int mdfd, struct supertyp=
e *st,
> struct mdinfo *info, char *backup_file);
> extern int Grow_continue_command(char *devname, int fd,
> char *backup_file, int verbose);
> +extern int restore_backup(struct supertype *st,
> + struct mdinfo *content,
> + int working_disks,
> + int spares,
> + char *backup_file,
> + int verbose);
> =20
> extern int Assemble(struct supertype *st, char *mddev,
> struct mddev_ident *ident,
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html


--Sig_/L7eCMGiUC+xX7AXL9R2Rji=
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iD8DBQFOeUvaG5fc6gV+Wb0RAiTXAJ4z2PuXBnpRvirq1Mn2VNzs3kQ2hwCg vzqI
VUigMp99V/DUzM0GgsKfels=
=xdLY
-----END PGP SIGNATURE-----

--Sig_/L7eCMGiUC+xX7AXL9R2Rji=--
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 10/14] Perform restore backup for reshape continuation

am 21.09.2011 04:29:56 von NeilBrown

--Sig_/3wlq0EGod=gYV+78QBUBYZy
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Fri, 16 Sep 2011 13:55:03 +0200 Adam Kwolek wrot=
e:

> It can happen that reshape is broken by different reason than reboot.
> this means that on reshape continuation start we cannot be sure
> that critical section has been restored already.

I don't think this is true.

If the array is running, then the critical section must have been restored.
Anything else is simply wrong.

What scenario are you thinking of here?

Thanks,
NeilBrown


>=20
> Restore data from checkpoint before reshape continuation to avoid
> data corruption.
>=20
> Signed-off-by: Adam Kwolek
> ---
>=20
> Grow.c | 18 +++++++++++++++++-
> 1 files changed, 17 insertions(+), 1 deletions(-)
>=20
> diff --git a/Grow.c b/Grow.c
> index 0b96f7a..da8b72d 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -3652,6 +3652,9 @@ int Grow_continue_command(char *devname, int fd,
> char container[20];
> int cfd =3D -1;
> int fd2 =3D -1;
> + int spares;
> + int working_disks;
> + struct mdinfo *dev;
> =20
> dprintf("Grow continue from command line called for %s\n",
> devname);
> @@ -3753,7 +3756,20 @@ int Grow_continue_command(char *devname, int fd,
> =20
> /* continue reshape
> */
> - ret_val =3D Grow_continue(fd, st, content, backup_file);
> + dev =3D content->devs;
> + spares =3D 0;
> + working_disks =3D 0;
> + while (dev) {
> + if (dev->disk.raid_disk == -1)
> + spares++;
> + else
> + working_disks++;
> + dev =3D dev->next;
> + }
> + ret_val =3D restore_backup(st, content, working_disks, spares,
> + backup_file, verbose);
> + if (!ret_val)
> + ret_val =3D Grow_continue(fd, st, content, backup_file);
> =20
> Grow_continue_command_exit:
> if (fd2 > -1)


--Sig_/3wlq0EGod=gYV+78QBUBYZy
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iD8DBQFOeUwkG5fc6gV+Wb0RAs/5AJ9PaKB9XZgIH3RD7JWXJjnBdWPS2gCg i744
sd+KmplS3V24pfmRMgtY7NE=
=apKR
-----END PGP SIGNATURE-----

--Sig_/3wlq0EGod=gYV+78QBUBYZy--
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 12/14] Early reshape backup verification

am 21.09.2011 04:31:20 von NeilBrown

--Sig_/5kYWkRnAE+pHi84MaNmns=l
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Fri, 16 Sep 2011 13:55:18 +0200 Adam Kwolek wrot=
e:

> Ever reshape needs to backup its critical section. For this purposes
> backup file or external metadata mechanisms can be used.

This is not correct.
For internal metadata the critical section can be stored on a spare device,
and this is preferred where possible.

So it is not the case that you must have a backup_file for non-external
data.

NeilBrown


>=20
> Add verification if backup file or metadata mechanisms can have chance
> to restore data form backup.
> For native metadata existence of passed (from command line) backup file
> is checked. For external metadta manage_reshape() and recover_backup()
> are tested for this purposes.
>=20
> Signed-off-by: Adam Kwolek
> ---
>=20
> Grow.c | 12 ++++++++++++
> 1 files changed, 12 insertions(+), 0 deletions(-)
>=20
> diff --git a/Grow.c b/Grow.c
> index fa22382..d54b8a8 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -3667,6 +3667,18 @@ int Grow_continue_command(char *devname, int fd,
> devname);
> return 1;
> }
> +
> + /* request backup file
> + * when no metadta specific backup action is defined
> + */
> + if ((backup_file == NULL) &&
> + ((st->ss->manage_reshape == NULL) ||
> + (st->ss->recover_backup == NULL))) {
> + fprintf(stderr, Name ": Please provide backup file "
> + "for reshape continuation.\n");
> + return 1;
> + }
> +
> dprintf("Grow continue is run for ");
> if (st->ss->external == 0) {
> dprintf("native array (%s)\n", devname);
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html


--Sig_/5kYWkRnAE+pHi84MaNmns=l
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iD8DBQFOeUx4G5fc6gV+Wb0RAoibAKDXFZNX+xHK26m9sW1p+74TZ5yudwCf YUiN
y1N/VA32BbXzc4JzFgpC/pw=
=3B7Q
-----END PGP SIGNATURE-----

--Sig_/5kYWkRnAE+pHi84MaNmns=l--
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 13/14] Check if md allows to control reshape

am 21.09.2011 04:33:38 von NeilBrown

--Sig_/JYduVslgs4+cIEKle=S.1/T
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Fri, 16 Sep 2011 13:55:25 +0200 Adam Kwolek wrot=
e:

> It can happen that there is no mdadm in memory and 'max' was already
> set to sync_max. Such array cannot be put under check pointing control
> again.

Again I don't understand what you are trying to guard against.
If resync_max is 'max', then the kernel is allowed to fun reshape to
completion without any help from mdadm. If mdadm needs to monitor things it
always sets resync_max to a smaller value.

Confused.

NeilBrown


>=20
> Verify such situation and refuse to control array in such condition.
>=20
> Signed-off-by: Adam Kwolek
> ---
>=20
> Grow.c | 19 +++++++++++++++++++
> 1 files changed, 19 insertions(+), 0 deletions(-)
>=20
> diff --git a/Grow.c b/Grow.c
> index d54b8a8..04e6679 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -3656,6 +3656,7 @@ int Grow_continue_command(char *devname, int fd,
> int spares;
> int working_disks;
> struct mdinfo *dev;
> + char buf[40];
> =20
> dprintf("Grow continue from command line called for %s\n",
> devname);
> @@ -3769,6 +3770,24 @@ int Grow_continue_command(char *devname, int fd,
> }
> }
> =20
> + /* verify that array under reshape is stopped
> + */
> + ret_val =3D sysfs_get_str(content, NULL, "sync_max", buf, 40);
> + if (ret_val <=3D 0) {
> + fprintf(stderr, Name
> + ": cannot open verify reshape progress for %s (%i)\n",
> + content->sys_name, ret_val);
> + ret_val =3D 1;
> + goto Grow_continue_command_exit;
> + }
> + dprintf(Name ": Read sync_max sysfs entry is: %s\n", buf);
> + if (strncmp(buf, "max", 3) == 0) {
> + fprintf(stderr, Name ": md is not allowed to finish reshape "
> + "wihout mdadm assistance.\n");
> + ret_val =3D 1;
> + goto Grow_continue_command_exit;
> + }
> +
> /* continue reshape
> */
> dev =3D content->devs;
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html


--Sig_/JYduVslgs4+cIEKle=S.1/T
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iD8DBQFOeU0CG5fc6gV+Wb0RAjv2AKCiyKCiQZHkN07AZoD9fkjKZQZzJQCg uGij
35Drx6evgU/hqVth8mKAskc=
=RnHb
-----END PGP SIGNATURE-----

--Sig_/JYduVslgs4+cIEKle=S.1/T--
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

RE: [PATCH 13/14] Check if md allows to control reshape

am 21.09.2011 09:31:48 von adam.kwolek

> -----Original Message-----
> From: NeilBrown [mailto:neilb@suse.de]
> Sent: Wednesday, September 21, 2011 4:34 AM
> To: Kwolek, Adam
> Cc: linux-raid@vger.kernel.org; Ciechanowski, Ed; Labun, Marcin
> Subject: Re: [PATCH 13/14] Check if md allows to control reshape
>
> On Fri, 16 Sep 2011 13:55:25 +0200 Adam Kwolek
> wrote:
>
> > It can happen that there is no mdadm in memory and 'max' was already
> > set to sync_max. Such array cannot be put under check pointing control
> > again.
>
> Again I don't understand what you are trying to guard against.
> If resync_max is 'max', then the kernel is allowed to fun reshape to
> completion without any help from mdadm. If mdadm needs to monitor
> things it always sets resync_max to a smaller value.
>
> Confused.

In my opinion running reshape continuation on array that is allowed to freely reshape to the end
is race condition (another mdadm or user via sysfs manage this array already).
We can tell nothing about restart point (read from metadata) when md runs with reshape already.
I think we should not allow to attach mdadm to such reshape process.

I think this guard is required.

BR
Adam

>
> NeilBrown
>
>
> >
> > Verify such situation and refuse to control array in such condition.
> >
> > Signed-off-by: Adam Kwolek
> > ---
> >
> > Grow.c | 19 +++++++++++++++++++
> > 1 files changed, 19 insertions(+), 0 deletions(-)
> >
> > diff --git a/Grow.c b/Grow.c
> > index d54b8a8..04e6679 100644
> > --- a/Grow.c
> > +++ b/Grow.c
> > @@ -3656,6 +3656,7 @@ int Grow_continue_command(char *devname,
> int fd,
> > int spares;
> > int working_disks;
> > struct mdinfo *dev;
> > + char buf[40];
> >
> > dprintf("Grow continue from command line called for %s\n",
> > devname);
> > @@ -3769,6 +3770,24 @@ int Grow_continue_command(char *devname,
> int fd,
> > }
> > }
> >
> > + /* verify that array under reshape is stopped
> > + */
> > + ret_val = sysfs_get_str(content, NULL, "sync_max", buf, 40);
> > + if (ret_val <= 0) {
> > + fprintf(stderr, Name
> > + ": cannot open verify reshape progress for %s
> (%i)\n",
> > + content->sys_name, ret_val);
> > + ret_val = 1;
> > + goto Grow_continue_command_exit;
> > + }
> > + dprintf(Name ": Read sync_max sysfs entry is: %s\n", buf);
> > + if (strncmp(buf, "max", 3) == 0) {
> > + fprintf(stderr, Name ": md is not allowed to finish reshape "
> > + "wihout mdadm assistance.\n");
> > + ret_val = 1;
> > + goto Grow_continue_command_exit;
> > + }
> > +
> > /* continue reshape
> > */
> > dev = content->devs;
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-raid"
> > in the body of a message to majordomo@vger.kernel.org More
> majordomo
> > info at http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

RE: [PATCH 12/14] Early reshape backup verification

am 21.09.2011 09:33:15 von adam.kwolek

> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of NeilBrown
> Sent: Wednesday, September 21, 2011 4:31 AM
> To: Kwolek, Adam
> Cc: linux-raid@vger.kernel.org; Ciechanowski, Ed; Labun, Marcin
> Subject: Re: [PATCH 12/14] Early reshape backup verification
>
> On Fri, 16 Sep 2011 13:55:18 +0200 Adam Kwolek
> wrote:
>
> > Ever reshape needs to backup its critical section. For this purposes
> > backup file or external metadata mechanisms can be used.
>
> This is not correct.

You are right.

BR
Adam


> For internal metadata the critical section can be stored on a spare device, and
> this is preferred where possible.
>
> So it is not the case that you must have a backup_file for non-external data.
>
> NeilBrown
>
>
> >
> > Add verification if backup file or metadata mechanisms can have chance
> > to restore data form backup.
> > For native metadata existence of passed (from command line) backup
> > file is checked. For external metadta manage_reshape() and
> > recover_backup() are tested for this purposes.
> >
> > Signed-off-by: Adam Kwolek
> > ---
> >
> > Grow.c | 12 ++++++++++++
> > 1 files changed, 12 insertions(+), 0 deletions(-)
> >
> > diff --git a/Grow.c b/Grow.c
> > index fa22382..d54b8a8 100644
> > --- a/Grow.c
> > +++ b/Grow.c
> > @@ -3667,6 +3667,18 @@ int Grow_continue_command(char *devname,
> int fd,
> > devname);
> > return 1;
> > }
> > +
> > + /* request backup file
> > + * when no metadta specific backup action is defined
> > + */
> > + if ((backup_file == NULL) &&
> > + ((st->ss->manage_reshape == NULL) ||
> > + (st->ss->recover_backup == NULL))) {
> > + fprintf(stderr, Name ": Please provide backup file "
> > + "for reshape continuation.\n");
> > + return 1;
> > + }
> > +
> > dprintf("Grow continue is run for ");
> > if (st->ss->external == 0) {
> > dprintf("native array (%s)\n", devname);
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-raid"
> > in the body of a message to majordomo@vger.kernel.org More
> majordomo
> > info at http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

RE: [PATCH 10/14] Perform restore backup for reshape continuation

am 21.09.2011 09:35:21 von adam.kwolek

> -----Original Message-----
> From: NeilBrown [mailto:neilb@suse.de]
> Sent: Wednesday, September 21, 2011 4:30 AM
> To: Kwolek, Adam
> Cc: linux-raid@vger.kernel.org; Ciechanowski, Ed; Labun, Marcin
> Subject: Re: [PATCH 10/14] Perform restore backup for reshape continuation
>
> On Fri, 16 Sep 2011 13:55:03 +0200 Adam Kwolek
> wrote:
>
> > It can happen that reshape is broken by different reason than reboot.
> > this means that on reshape continuation start we cannot be sure that
> > critical section has been restored already.
>
> I don't think this is true.
>
> If the array is running, then the critical section must have been restored.
> Anything else is simply wrong.
>
> What scenario are you thinking of here?

You are right. If array is assembled backup is restored already.

BR
Adam

>
> Thanks,
> NeilBrown
>
>
> >
> > Restore data from checkpoint before reshape continuation to avoid data
> > corruption.
> >
> > Signed-off-by: Adam Kwolek
> > ---
> >
> > Grow.c | 18 +++++++++++++++++-
> > 1 files changed, 17 insertions(+), 1 deletions(-)
> >
> > diff --git a/Grow.c b/Grow.c
> > index 0b96f7a..da8b72d 100644
> > --- a/Grow.c
> > +++ b/Grow.c
> > @@ -3652,6 +3652,9 @@ int Grow_continue_command(char *devname,
> int fd,
> > char container[20];
> > int cfd = -1;
> > int fd2 = -1;
> > + int spares;
> > + int working_disks;
> > + struct mdinfo *dev;
> >
> > dprintf("Grow continue from command line called for %s\n",
> > devname);
> > @@ -3753,7 +3756,20 @@ int Grow_continue_command(char *devname,
> int
> > fd,
> >
> > /* continue reshape
> > */
> > - ret_val = Grow_continue(fd, st, content, backup_file);
> > + dev = content->devs;
> > + spares = 0;
> > + working_disks = 0;
> > + while (dev) {
> > + if (dev->disk.raid_disk == -1)
> > + spares++;
> > + else
> > + working_disks++;
> > + dev = dev->next;
> > + }
> > + ret_val = restore_backup(st, content, working_disks, spares,
> > + backup_file, verbose);
> > + if (!ret_val)
> > + ret_val = Grow_continue(fd, st, content, backup_file);
> >
> > Grow_continue_command_exit:
> > if (fd2 > -1)

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

RE: [PATCH 08/14] Check if reshape can be restarted

am 21.09.2011 09:45:31 von adam.kwolek

> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of NeilBrown
> Sent: Wednesday, September 21, 2011 4:28 AM
> To: Kwolek, Adam
> Cc: linux-raid@vger.kernel.org; Ciechanowski, Ed; Labun, Marcin
> Subject: Re: [PATCH 08/14] Check if reshape can be restarted
>
> On Fri, 16 Sep 2011 13:54:48 +0200 Adam Kwolek
> wrote:
>
> > When reshape was invoked during initramfs start-up stage array is
> > pushed in to reshape state already, so read only state cannot be set
> > again. To verify if reshape continuation should be allowed sync_action
> > sysfs key is tested and number of existing mdadm instances is checked
> > again.
> >
> > Signed-off-by: Adam Kwolek
>
> Again, it is not at all clear to me what you are trying to protect against here...
>
> The "set_str ... readonly" is going to need to be changed.
>
> In the normal case, this is where the array is activated, but it is activated
> read-only to ensure no reshape automatically starts.
>
> In the --continue case it is already running and reshape is suspended, so
> there is no need to set it to readonly when Grow_continue is called from
> Grow_continue_command.


This is exactly the case.

We have 2 continuation possibilities, when Grow_continue() is called:

1. Assembly when array is set to readonly state and this activates it.

2. grow --continue. Reshape is continued and array is already in reshape state,
It is not possible to set readonly state again, this is why sync_action is checked
For "reshape" to allow continuation for run.
Conjunction with count_mdadms() /the single mdadm instance/ guards any potential race conditions.

BR
Adam

>
> NeilBrown
>
>
> > ---
> >
> > Grow.c | 19 +++++++++++++++++--
> > 1 files changed, 17 insertions(+), 2 deletions(-)
> >
> > diff --git a/Grow.c b/Grow.c
> > index 4d2c361..b331287 100644
> > --- a/Grow.c
> > +++ b/Grow.c
> > @@ -3714,8 +3714,23 @@ int Grow_continue(int mdfd, struct supertype
> *st, struct mdinfo *info,
> > int err;
> >
> > err = sysfs_set_str(info, NULL, "array_state", "readonly");
> > - if (err)
> > - return err;
> > + if (err) {
> > + /* check if array is set in to reshape stare,
> > + * and current mdadm instance is the only one
> > + */
> > + if ((sysfs_get_str(info, NULL, "sync_action",
> > + buf, 40) == 8) &&
> > + (count_mdadms() == 1)) {
> > + dprintf("Warning: Grow_continue() cannot set array
> in"
> > + "to read only state [sync_action is %s].",
> > + buf);
> > + if (strncmp(buf, "reshape", 7))
> > + return err;
> > + dprintf("Allow to continue.\n");
> > + } else
> > + return err;
> > + }
> > +
> > if (st->ss->external) {
> > fmt_devname(buf, st->container_dev);
> > container = buf;
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-raid"
> > in the body of a message to majordomo@vger.kernel.org More
> majordomo
> > info at http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

RE: [PATCH 04/14] Add continue option to grow command

am 21.09.2011 09:47:13 von adam.kwolek

> -----Original Message-----
> From: NeilBrown [mailto:neilb@suse.de]
> Sent: Wednesday, September 21, 2011 4:22 AM
> To: Kwolek, Adam
> Cc: linux-raid@vger.kernel.org; Ciechanowski, Ed; Labun, Marcin
> Subject: Re: [PATCH 04/14] Add continue option to grow command
>
> On Fri, 16 Sep 2011 13:54:17 +0200 Adam Kwolek
> wrote:
>
> > To allow for reshape continuation, 'continue' option is added to grow
> > command. It is not possible to check, if any existing mdadm instance
> > is going to take ownership on blocked md array or it is
> > orphaned/interrupted reshape.
> > To resolve this problem grow command with continue option, is allowed
> > when current mdadm instance is only one in the system.
> >
> > Signed-off-by: Adam Kwolek
>
> Checking how many instances of mdadm are running is definitely wrong.
> What are you trying to guard against here?


Double reshape '--continue' command execution.


>
> If we do need locking to prevent two processes trying to manage an array at
> the same time, I would suggest creating a 'pid' file
> - /var/run/mdadm-reshape-mdXXX.pid
> or something like that.


OK, I'll think about this.

BR
Adam

>
> NeilBrown
>
>
> > ---
> >
> > ReadMe.c | 1 +
> > mdadm.c | 17 ++++++++++++++++-
> > mdadm.h | 2 ++
> > util.c | 37 +++++++++++++++++++++++++++++++++++++
> > 4 files changed, 56 insertions(+), 1 deletions(-)
> >
> > diff --git a/ReadMe.c b/ReadMe.c
> > index b658841..f0dc0d9 100644
> > --- a/ReadMe.c
> > +++ b/ReadMe.c
> > @@ -190,6 +190,7 @@ struct option long_options[] = {
> > {"backup-file", 1,0, BackupFile},
> > {"invalid-backup",0,0,InvalidBackup},
> > {"array-size", 1, 0, 'Z'},
> > + {"continue", 0, 0, Continue},
> >
> > /* For Incremental */
> > {"rebuild-map", 0, 0, RebuildMapOpt}, diff --git a/mdadm.c
> > b/mdadm.c index 4b817ab..f30534a 100644
> > --- a/mdadm.c
> > +++ b/mdadm.c
> > @@ -74,6 +74,7 @@ int main(int argc, char *argv[])
> > int export = 0;
> > int assume_clean = 0;
> > char *symlinks = NULL;
> > + int grow_continue = 0;
> > /* autof indicates whether and how to create device node.
> > * bottom 3 bits are style. Rest (when shifted) are number of parts
> > * 0 - unset
> > @@ -988,7 +989,21 @@ int main(int argc, char *argv[])
> > }
> > backup_file = optarg;
> > continue;
> > -
> > + case O(GROW, Continue):
> > + /* Continuer broken grow
> > + * To be sure that continuation is allowed check
> > + * if there is only one (current) mdadm instance
> > + */
> > + grow_continue = count_mdadms();
> > + fprintf(stderr, Name ": %i mdadm instance(s)
> found\n",
> > + grow_continue);
> > + if (grow_continue != 1) {
> > + fprintf(stderr, Name ": Grow continuation "
> > + "requires single mdadm instance "
> > + "running.\n");
> > + exit(2);
> > + }
> > + continue;
> > case O(ASSEMBLE, InvalidBackup):
> > /* Acknowledge that the backupfile is invalid, but ask
> > * to continue anyway
> > diff --git a/mdadm.h b/mdadm.h
> > index 9035e67..7e3a618 100644
> > --- a/mdadm.h
> > +++ b/mdadm.h
> > @@ -313,6 +313,7 @@ enum special_options {
> > RebuildMapOpt,
> > InvalidBackup,
> > UdevRules,
> > + Continue,
> > };
> >
> > /* structures read from config file */ @@ -1162,6 +1163,7 @@ extern
> > char *human_size_brief(long long bytes); extern void
> > print_r10_layout(int layout);
> >
> > extern int initramfs_is_in_use(void);
> > +extern int count_mdadms(void);
> > #define INITRAMFS_IS_IN_USE 1
> > #define INITRAMFS_NOT_USED 0
> >
> > diff --git a/util.c b/util.c
> > index 7186f3f..5077c2f 100644
> > --- a/util.c
> > +++ b/util.c
> > @@ -1790,3 +1790,40 @@ int initramfs_is_in_use(void)
> >
> > return ret_val;
> > }
> > +
> > +int count_mdadms(void)
> > +{
> > + int ret_val = 0;
> > + DIR *dirp;
> > + struct dirent *d_entry;
> > + int fd;
> > +
> > + dirp = opendir("/proc");
> > + if (!dirp)
> > + return -1;
> > +
> > + while ((d_entry = readdir(dirp)) != NULL) {
> > + char p[PATH_MAX];
> > +
> > + if (!strcmp(d_entry->d_name, ".") ||
> > + !strcmp(d_entry->d_name, "..") ||
> > + !strcmp(d_entry->d_name, "self"))
> > + continue;
> > + errno = 0;
> > + fd = (int)strtol(d_entry->d_name, NULL, 10);
> > + if (errno)
> > + continue;
> > + sprintf(p, "/proc/%s/cmdline", d_entry->d_name);
> > + fd = open(p, O_RDONLY);
> > + if (fd < 0)
> > + continue;
> > + if (read(fd, p, PATH_MAX) > 0) {
> > + if (strcmp(p, "mdadm") == 0)
> > + ret_val++;
> > + }
> > + close(fd);
> > + }
> > + closedir(dirp);
> > +
> > + return ret_val;
> > +}
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-raid"
> > in the body of a message to majordomo@vger.kernel.org More
> majordomo
> > info at http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 13/14] Check if md allows to control reshape

am 21.09.2011 09:51:14 von NeilBrown

--Sig_/A8G+.wEp41ygAn6r..hF715
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Wed, 21 Sep 2011 07:31:48 +0000 "Kwolek, Adam"
wrote:

>=20
>=20
> > -----Original Message-----
> > From: NeilBrown [mailto:neilb@suse.de]
> > Sent: Wednesday, September 21, 2011 4:34 AM
> > To: Kwolek, Adam
> > Cc: linux-raid@vger.kernel.org; Ciechanowski, Ed; Labun, Marcin
> > Subject: Re: [PATCH 13/14] Check if md allows to control reshape
> >=20
> > On Fri, 16 Sep 2011 13:55:25 +0200 Adam Kwolek
> > wrote:
> >=20
> > > It can happen that there is no mdadm in memory and 'max' was already
> > > set to sync_max. Such array cannot be put under check pointing control
> > > again.
> >=20
> > Again I don't understand what you are trying to guard against.
> > If resync_max is 'max', then the kernel is allowed to fun reshape to
> > completion without any help from mdadm. If mdadm needs to monitor
> > things it always sets resync_max to a smaller value.
> >=20
> > Confused.
>=20
> In my opinion running reshape continuation on array that is allowed to fr=
eely reshape to the end
> is race condition (another mdadm or user via sysfs manage this array alre=
ady).
> We can tell nothing about restart point (read from metadata) when md runs=
with reshape already.
> I think we should not allow to attach mdadm to such reshape process.
>=20
> I think this guard is required.
>=20

I see what your point is now. I was just a bit confused by the message that
is printed which seems to say something different.

If we need to guard something here, we need to guard against a lot more than
just sync_max == "max". We need to guard against another mdadm process
monitoring the array (the extra .pid file I mentioned) and also if sync_max
has any value larger than what we would expect from the metadata.

So a test that rejects an attempt to continue reshape on an array where
sync_max is not the correct value makes sense. Just testing for "max"
doesn't.

Thanks,
NeilBrown

--Sig_/A8G+.wEp41ygAn6r..hF715
Content-Type: application/pgp-signature; name=signature.asc
Content-Disposition: attachment; filename=signature.asc

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iD8DBQFOeZdyG5fc6gV+Wb0RAhfFAKC3xdLPF3IA2zFzWEDbJSbFBtIJ0wCg 0kUH
SDwc0kkz0SeudcrPmnBJTvo=
=lnZS
-----END PGP SIGNATURE-----

--Sig_/A8G+.wEp41ygAn6r..hF715--
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html