[Patch 00/17] Autorebuild
on 29.10.2010 16:13:40 by anna.czarnowska
This is an updated series of patches implementing autorebuild functionality in the mdadm monitor, based on the new policy code.
Autorebuild Monitoring application:
The autorebuild monitor is part of the monitor application (mdadm -F). In the current mdadm monitor code the autorebuild feature was based on spare-group assignment in the mdadm.conf file and worked only for native metadata. That code has been retained for compatibility with the old config format.
The new autorebuild implementation also works for external metadata types. It uses the concept of domains in mdadm.conf introduced by Neil Brown.
The monitoring application periodically checks the state of active MD arrays and triggers a rebuild if there are eligible spare disks in other arrays/containers.
Degraded arrays are checked one by one. If there is a spare disk in another array/container that matches the domain of the degraded array, and the domain action allows spare sharing, the spare is moved using the existing Manage_subdevs function. If the addition fails, the spare device is moved back to the original container and the next potential spare is tried. The process is repeated until all arrays have been checked, after which the process sleeps for a configured period.
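For illustration, domains of this kind are declared with POLICY lines in mdadm.conf; a minimal sketch (the domain name, path patterns and metadata type below are only examples) could be:

POLICY domain=dom0 metadata=imsm path=pci-0000:00:1f.2-* action=spare
POLICY domain=dom0 metadata=imsm path=pci-0000:03:00.0-* action=spare

Devices whose ID_PATH matches either pattern fall into the same domain, so a spare attached to one controller may be moved to a degraded array on the other, provided the action allows it.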
A new option, --no-sharing, has been added to Monitor mode to run monitoring only (without moving spares). This is recommended when many monitor instances are to be run on the same set of devices.
Spare sharing is allowed in only one Monitor instance: the one running with the --scan option. The user can still start monitoring functions in multiple instances without the --scan option.
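A possible invocation (the device name and daemon option are shown only as an example) is:

mdadm --monitor --scan --daemonise          # the one instance that may move spares
mdadm --monitor --no-sharing /dev/md127     # additional instances only report events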
The autorebuild built-in assumptions are:
1. spares are shared between arrays of the same metadata type
2. spares are moved only from containers/volumes that are not degraded
3. spares are moved only to containers/volumes lacking a *good* spare (one of sufficient size)
Anna Czarnowska
Przemyslaw Hawrylewicz-Czarnowski
Marcin Labun
0001-added-path-path_id-to-give-the-information-on-the-pa.patch
0002-Update-of-udev-rules-to-support-IMSM-devices.patch
0003-extension-of-IncrementalRemove-to-store-location-pat.patch
0004-Incremental-for-bare-disks-implementation-of-spare-s.patch
0005-Util-get-device-size-from-id.patch
0006-Monitor-set-err-on-arrays-not-in-mdstat.patch
0007-Monitor-spare-group-based-spare-sharing-moved-to-sep.patch
0008-mdadm-added-no-sharing-option-for-Monitor-mode.patch
0009-Monitor-avoid-skipping-checks-on-external-arays.patch
0010-Monitor-include-containers-in-scan-mode.patch
0011-Monitor-link-containers-with-subarrays-in-statelist.patch
0012-imsm-create-mdinfo-list-of-disks-in-a-container-from.patch
0013-Monitor-autorebuild-functionality-added.patch
0014-Monitor-Respect-policy-in-auto-rebuild-in-mdadm-moni.patch
0015-Monitor-more-accurate-size-check-when-looking-for-sp.patch
0016-IMSM-Fix-problem-in-mdmon-monitor-of-using-removed-d.patch
0017-Policy-is-aware-of-metadata-disk-s-controller-domain.patch
Incremental.c | 230 +++++++++++++++---
Makefile | 3 +
Monitor.c | 691 ++++++++++++++++++++++++++++++++++++++++++++--------
ReadMe.c | 4 +
managemon.c | 38 +++
mdadm.c | 29 ++-
mdadm.h | 49 ++++-
policy.c | 134 +++++++++-
super-intel.c | 274 ++++++++++++++++++---
udev-md-raid.rules | 7 +-
util.c | 23 ++
11 files changed, 1290 insertions(+), 192 deletions(-)
Re: [Patch 00/17] Autorebuild
on 17.11.2010 11:22:42 by NeilBrown
On Fri, 29 Oct 2010 15:13:40 +0100
"Czarnowska, Anna" wrote:
> This is updated series of patches forming autorebuild functionality in mdadm monitor based on new policy code.
Hi Anna and all,
I have decided that the best way forward is for me to apply all of your
patches and fix them up on the way. It turned out there were quite a few
changes that I wanted to make that I only discovered while examining the
patches very closely, so this seems to be a very useful exercise.
I haven't finished yet, but you can see my current state - which is up to
about patch 11 in your series - in my devel-3.2 branch.
Probably the most significant change so far is the interpretation of
action=spare-same-slot. I have defined that in a way that is *more*
permissive than action=spare, where you had it less permissive.
The comment for that change set is below.
The new list of policy actions is:
enum policy_action {
act_default,
act_include,
act_re_add,
act_spare, /* This only applies to bare devices */
act_spare_same_slot, /* this allows non-bare devices,
* but only if recent removal */
act_force_spare, /* this allow non-bare devices in any case */
act_err
};
I'll hopefully get the rest of your patches done tomorrow and then I might
start on Adam... and then on the changes that I want to make.
If you can do some testing once enough code is in place, that would be great.
Thanks,
NeilBrown
commit a2191ce8af4aa178d62df759fab47ef4dc8e6f67
Author: NeilBrown
Date: Wed Nov 17 12:46:35 2010 +1100
Add action=spare-same-slot policy.
When "mdadm -I" is given a device with no metadata, mdadm tries to add
it as a 'spare' somewhere based on policy.
This patch changes the behaviour in two ways:
1/ If the device is at a 'path' where a previous device was removed
from an array or container, then we preferentially add the spare to
that array or container.
2/ Previously only 'bare' devices were considered for adding as
spares. Now if action=spare-same-slot is active, we will add
non-bare devices, but *only* if the path was previously in use
for some array, and the device will only be added to that array.
Based on code
From: Przemyslaw Czarnowski
Signed-off-by: Przemyslaw Czarnowski
Signed-off-by: NeilBrown
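For reference, a POLICY line requesting this behaviour would look roughly like the following; the path value here is only an example.

POLICY domain=dom0 path=pci-0000:00:1f.2-ata-3 action=spare-same-slot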
RE: [Patch 00/17] Autorebuild
on 17.11.2010 17:04:37 by Marcin.Labun
> -----Original Message-----
> From: Neil Brown [mailto:neilb@suse.de]
> Sent: Wednesday, November 17, 2010 11:23 AM
> To: Czarnowska, Anna
> Cc: linux-raid@vger.kernel.org; Neubauer, Wojciech; Williams, Dan J;
> Ciechanowski, Ed; Labun, Marcin; Hawrylewicz Czarnowski, Przemyslaw
> Subject: Re: [Patch 00/17] Autorebuild
>
>
> I'll hopefully get the rest of your patches done tomorrow and then I
> might
> start on Adam... and then on the changes that I want to make.
> If you can do some testing once enough code is in place, that would be
> great.
We will start testing hot-plug, and are waiting for the auto-rebuild stuff to be integrated on your devel branch to run the rest of the tests.
Marcin
RE: [Patch 00/17] Autorebuild
on 19.11.2010 00:14:49 by anna.czarnowska
Hi Neil,
I started testing the new code today. Just the Incremental part.
There are a few problems:
1. The cookie file is cleared before it is read, so spare-same-slot can't work. It should just be opened for reading. (Probably a typo.)
2. The container uuid, instead of the subarray uuid, is written to the cookie file, so for ddf it may not be clear which subarray used the slot.
3. Incremental fail does not work for external metadata. Przemek's original patch did fail the disk in the subarrays; now Manage_subdevs tries to fail a disk in a container while a subarray is expected. Do you intend to change Manage_subdevs to take a container?
4. With spare-same-slot, when there is a cookie and the disk has no metadata, we probably shouldn't look at domains; just add it.
This is all for now.
Regards
Anna
Devel 3.2 branch issues
on 19.11.2010 13:43:20 by anna.czarnowska
Hi Neil,
Our validation team have reported problems with assembly on the devel 3.2 branch.
I have verified that currently it is not possible to assemble any array.
Patch: Assemble - avoid including wayward devices
It does not affect native metadata but breaks assembly of external arrays.
Only one disk is assembled for any raid level.
After patch: super_by_fd: return subarray info explicitly
Assembly becomes much slower.
Patch: Assemble: small cleanup of error checking
Breaks assembly for all metadata types. Nothing assembles after it is applied.
These are just early modifications of Assemble.c. The impact of further changes
can't be verified at the moment.
Are you aware of the above issues? This is stopping our further validation.
I also mentioned issues with Incremental in a previous mail.
When are you planning to submit the rest of the modified autorebuild code?
Regards
Anna
Autorebuild, new dynamic udev rules for hot-plugs
on 19.11.2010 16:12:19 by unknown
Hi,
I would like to present another patch for our autorebuild tree (it should be applied on top of the Autorebuild series, as devel-3.2 is not stable yet). Dan Williams proposed reducing the overhead of passing each hot-plugged block device to mdadm by handling only the devices described in the policies in the mdadm.conf file.
See the patch below for details; comments are, as always, very welcome.
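As a rough sketch, with a single matching POLICY line the generated rule file would look something like the following (the ID_PATH value comes from the policy; the one shown here is made up):

# do not edit this file, it will be overwritten on update
SUBSYSTEM!="block", GOTO="md_autorebuild_end"
ENV{ID_FS_TYPE}=="linux_raid_member", GOTO="md_autorebuild_end"
ENV{ID_FS_TYPE}=="isw_raid_member", GOTO="md_autorebuild_end"
ACTION=="add", KERNEL!="md*", ENV{ID_PATH}=="pci-0000:00:1f.2-scsi-0:0:0:0", RUN+="/sbin/mdadm --incremental $env{DEVNAME}", GOTO="md_autorebuild_end"
LABEL="md_autorebuild_end"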
Date: Thu, 18 Nov 2010 01:19:52 +0100
Subject: [PATCH] New dynamic hot-plug udev rules for policies
When introducing policies, new hot-plug rules were added to support
bare disks. mdadm was started for each hot-plugged block device
to determine whether it could be used as a spare or as a replacement
member for a degraded array.
This patch limits the range of devices handled by mdadm to the ones
specified in domains associated with the actions spare-same-port, spare
and spare-force.
In order to enable hot-plug for bare disks one must update the udev rules
with the command
mdadm --activate-domains
After mdadm.conf is changed, "mdadm --activate-domains" must be re-run
in order to bring the system configuration up to date.
All hot-plugged disks containing metadata are still handled by existing
rules.
Note: this patch is just a proposal to minimize the overhead of running mdadm for
each plugged block device. If accepted, it will be incorporated into the
previous implementation of hot-plug for bare disks.
Signed-off-by: Przemyslaw Czarnowski
---
Makefile | 2 +
ReadMe.c | 1 +
mdadm.c | 4 ++
mdadm.h | 9 +++-
policy.c | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++++
udev-md-raid.rules | 5 +-
6 files changed, 154 insertions(+), 4 deletions(-)
diff --git a/Makefile b/Makefile
index 2b88818..eeae8e1 100644
--- a/Makefile
+++ b/Makefile
@@ -73,9 +73,11 @@ MAP_FILE = map
MDMON_DIR = /dev/.mdadm
# place for autoreplace cookies
FAILED_SLOTS_DIR = /dev/.mdadm/failed-slots
+UDEV_RULES_DIR = $(DESTDIR)/lib/udev/rules.d
DIRFLAGS = -DMAP_DIR=\"$(MAP_DIR)\" -DMAP_FILE=\"$(MAP_FILE)\"
DIRFLAGS += -DMDMON_DIR=\"$(MDMON_DIR)\"
DIRFLAGS += -DFAILED_SLOTS_DIR=\"$(FAILED_SLOTS_DIR)\"
+DIRFLAGS += -DUDEV_RULES_DIR=\"$(UDEV_RULES_DIR)\"
CFLAGS = $(CWFLAGS) $(CXFLAGS) -DSendmail=\""$(MAILCMD)"\" $(CONFFILEFLAGS) $(DIRFLAGS)
# The glibc TLS ABI requires applications that call clone(2) to set up
diff --git a/ReadMe.c b/ReadMe.c
index 54a1998..f17b8a1 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -110,6 +110,7 @@ struct option long_options[] = {
{"detail-platform", 0, 0, DetailPlatform},
{"kill-subarray", 1, 0, KillSubarray},
{"update-subarray", 1, 0, UpdateSubarray},
+ {"activate-domains", 0, 0, ActivateDomains},
/* synonyms */
{"monitor", 0, 0, 'F'},
diff --git a/mdadm.c b/mdadm.c
index c9a172a..b5403cf 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -228,6 +228,7 @@ int main(int argc, char *argv[])
}
subarray = optarg;
}
+ case ActivateDomains:
case 'K': if (!mode) newmode = MISC; break;
case NoSharing: newmode = MONITOR; break;
}
@@ -841,6 +842,7 @@ int main(int argc, char *argv[])
case O(MISC, DetailPlatform):
case O(MISC, KillSubarray):
case O(MISC, UpdateSubarray):
+ case O(MISC, ActivateDomains):
if (devmode && devmode != opt &&
(devmode == 'E' || (opt == 'E' && devmode != 'Q'))) {
fprintf(stderr, Name ": --examine/-E cannot be given with ");
@@ -1421,6 +1423,8 @@ int main(int argc, char *argv[])
free_mdstat(ms);
} while (!last && err);
if (err) rv |= 1;
+ } else if (devmode == ActivateDomains) {
+ rv = Activate_Domains();
} else {
fprintf(stderr, Name ": No devices given.\n");
exit(2);
diff --git a/mdadm.h b/mdadm.h
index 9b4a1a8..171aa69 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -101,6 +101,11 @@ extern __off64_t lseek64 __P ((int __fd, __off64_t __offset, int __whence));
#define FAILED_SLOTS_DIR "/dev/.mdadm/failed-slots"
#endif /* FAILED_SLOTS */
+/* UDEV_RULES_DIR is the place, where udev holds its rules */
+#ifndef UDEV_RULES_DIR
+#define UDEV_RULES_DIR "/lib/udev/rules.d"
+#endif /* UDEV_RULES_DIR */
+
#include "md_u.h"
#include "md_p.h"
#include "bitmap.h"
@@ -289,7 +294,8 @@ enum special_options {
KillSubarray,
UpdateSubarray, /* 16 */
IncrementalPath,
- NoSharing
+ NoSharing,
+ ActivateDomains
};
/* structures read from config file */
@@ -971,6 +977,7 @@ extern int CreateBitmap(char *filename, int force, char uuid[16],
unsigned long long array_size,
int major);
extern int ExamineBitmap(char *filename, int brief, struct supertype *st);
+extern int Activate_Domains(void);
extern int bitmap_update_uuid(int fd, int *uuid, int swap);
extern unsigned long bitmap_sectors(struct bitmap_super_s *bsb);
diff --git a/policy.c b/policy.c
index daee52a..e4c53d5 100644
--- a/policy.c
+++ b/policy.c
@@ -753,3 +753,140 @@ void domain_free(struct domainlist *dl)
free(head);
}
}
+
+/* invocation of udev rule file */
+char udev_template_start[] =
+"# do not edit this file, it will be overwritten on update\n"
+"\n"
+"SUBSYSTEM!=\"block\", GOTO=\"md_autorebuild_end\"\n"
+"\n"
+"ENV{ID_FS_TYPE}==\"linux_raid_member\", GOTO=\"md_autorebuild_end\"\n"
+"ENV{ID_FS_TYPE}==\"isw_raid_member\", GOTO=\"md_autorebuild_end\"\n"
+"\n";
+
+/* ending of udev rule file */
+char udev_template_end[] =
+"\n"
+"LABEL=\"md_autorebuild_end\"\n"
+"\n";
+
+/* find rule named rule_type and return its value */
+char *find_rule(struct rule *rule, char *rule_type)
+{
+ while (rule) {
+ if (rule->name == rule_type)
+ return rule->value;
+
+ rule = rule->next;
+ }
+ return NULL;
+}
+
+#define UDEV_RULE_FORMAT \
+"ACTION==\"add\", KERNEL!=\"md*\" ENV{ID_PATH}==\"%s\" " \
+"RUN+=\"/sbin/mdadm --incremental $env{DEVNAME}\", " \
+"GOTO=\"md_autorebuild_end\"\n" \
+
+/* Write rule in the rule file. Use format from UDEV_RULE_FORMAT */
+int write_rule(struct rule *rule, int fd)
+{
+ char line[1024];
+ char *r = find_rule(rule, rule_path);
+ if (!r)
+ return -1;
+
+ snprintf(line, sizeof(line) - 1, UDEV_RULE_FORMAT, r);
+ return write(fd, line, strlen(line));
+}
+
+/* Generate single entry in udev rule basing on POLICY line found in config
+ * file. Take only those with paths, only first occurrence if paths are equal
+ * and if actions supports handling of spares (>=act_spare_same_slot)
+ */
+int generate_entries(int fd)
+{
+ struct pol_rule *loop, *dup;
+ char *loop_value, *dup_value;
+ int duplicate;
+ int written = 0;
+
+ for (loop = config_rules; loop; loop = loop->next) {
+ if (loop->type != rule_policy)
+ continue;
+ duplicate = 0;
+
+ /* only policies with paths and with actions supporting
+ * bare disks are considered */
+ loop_value = find_rule(loop->rule, pol_act);
+ if (!loop_value || map_act(loop_value) < act_spare_same_slot)
+ continue;
+ loop_value = find_rule(loop->rule, rule_path);
+ if (!loop_value)
+ continue;
+ for (dup = config_rules; dup != loop; dup = dup->next) {
+ if (dup->type != rule_policy)
+ continue;
+ dup_value = find_rule(dup->rule, pol_act);
+ if (!dup_value || map_act(dup_value) < act_spare_same_slot)
+ continue;
+ dup_value = find_rule(dup->rule, rule_path);
+ if (!dup_value)
+ continue;
+ if (strcmp(loop_value, dup_value) == 0) {
+ duplicate = 1;
+ break;
+ }
+ }
+
+ /* not a dup or first occurrence */
+ if (!duplicate) {
+ if (write_rule(loop->rule, fd) == -1)
+ return 0;
+ written++;
+ }
+ }
+ return written;
+}
+
+#define AR_UDEV_RULE_FILE UDEV_RULES_DIR "/63-md-raid-autorebuild.rules"
+
+/* Activate_Domains routine creates dynamic udev rules used to handle
+ * hot-plug events for bare devices (and making them spares)
+ */
+int Activate_Domains(void)
+{
+ int fd = 0;
+ int rv;
+
+ fd = creat(AR_UDEV_RULE_FILE,
+ S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
+ if (fd == -1)
+ return 1;
+
+ /* write static invocation */
+ if (write(fd, udev_template_start,
+ sizeof(udev_template_start) - 1) == -1) {
+ close(fd);
+ unlink(AR_UDEV_RULE_FILE);
+ return 1;
+ }
+
+ /* iterate, if none created or error occurred, remove file */
+ rv = generate_entries(fd);
+ if (rv <= 0) {
+ close(fd);
+ unlink(AR_UDEV_RULE_FILE);
+ return rv == -1 ? -1 : 0;
+ }
+
+ /* write ending */
+ if (write(fd, udev_template_end, sizeof(udev_template_end) - 1) == -1) {
+ close(fd);
+ unlink(AR_UDEV_RULE_FILE);
+ return 1;
+ }
+
+ close(fd);
+
+ return 0;
+}
diff --git a/udev-md-raid.rules b/udev-md-raid.rules
index 36dd51e..11057bb 100644
--- a/udev-md-raid.rules
+++ b/udev-md-raid.rules
@@ -3,11 +3,10 @@
SUBSYSTEM!="block", GOTO="md_end"
# handle potential components of arrays
+ENV{ID_FS_TYPE}=="linux_raid_member", ACTION=="add", RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
ENV{ID_FS_TYPE}=="linux_raid_member", ACTION=="remove", RUN+="/sbin/mdadm -If $name --path $env{ID_PATH}"
+ENV{ID_FS_TYPE}=="isw_raid_member", ACTION=="add", RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
ENV{ID_FS_TYPE}=="isw_raid_member", ACTION=="remove", RUN+="/sbin/mdadm -If $name --path $env{ID_PATH}"
-# try incremental for each block device and not md. Most cases should not create
-# unnecessary overhead, as no "heavy" disk operations are performed
-ACTION=="add", KERNEL!="md*", RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
# handle md arrays
ACTION!="add|change", GOTO="md_end"
--
1.7.1
Re: [Patch 00/17] Autorebuild
on 22.11.2010 03:16:44 by NeilBrown
On Thu, 18 Nov 2010 23:14:49 +0000
"Czarnowska, Anna" wrote:
> Hi Neil,
> I started testing new code today. Just the Incremental part.
>
> There are few problems:
> 1. Cookie file is cleared before it is read so spare-same-slot can't work. It should be just open for reading. (probably a typo)
Yes, just a typo. Fixed.
> 2. Container uuid instead of subarray uuid is written in cookie file, so for ddf it may not be clear which subarray used the slot.
This is deliberate. It is really up to the ddf spare-assignment handler (in
super-ddf ... though it isn't written yet) to decide which sub array gets
which part of the new disk. If an admin wants more control they need to do
it at a different level - probably having separate ddf containers in separate
domains.
> 3. Incremental fail does not work for external metadata. Przemek's original patch did fail the disk in subarrays. Now Manage_subdevs tries to fail a disk in container while subarray is expected. Do you intend to change Manage_subdevs to take a container?
Yes... I didn't notice that change in the patch. This is one of the reasons
I like each patch to just make one change.
I have added a new patch which fails all the contained arrays before removing
from the container (though I haven't tested it yet).
> 4. With spare-same-slot when there is a cookie and disk has no metadata then we probably shouldn't look at domains. Just add.
>
I disagree. We must always check domains.
Why do you think we should ignore domains in that case?
Thanks for the testing. I'll push a new devel-3.2 out later today.
NeilBrown
Re: Devel 3.2 branch issues
on 22.11.2010 04:29:53 by NeilBrown
On Fri, 19 Nov 2010 12:43:20 +0000
"Czarnowska, Anna" wrote:
> Hi Neil,
> Our validation team have reported problems with assembly on the devel 3.2 branch.
> I have verified that currently it is not possible to assemble any array.
>
> Patch: Assemble - avoid including wayward devices
> It does not affect native metadata but breaks assembly of external arrays.
> Only one disk is assembled for any raid level.
Thanks - fixed.
>
> After patch: super_by_fd: return subarray info explicitly
> Assembly becomes much slower.
Yep .. I was calling 'strcpy' with a NULL as the source - bad.
Fixed, though a subsequent patch removed the strcpy anyway.
>
> Patch: Assemble: small cleanup of error checking
> Breaks assembly for all metadata types. Nothing assembles after it is applied.
>
I'm not sure this is true, but the test/03* tests of assembly certainly fail.
I've fixed that. Thanks.
> These are just early modifications of Assemble.c. The impact of further changes
> can't be verified at the moment.
> Are you aware of the above issues? This is stopping our further validation.
> I also mentioned issues with Incremental in previous mail.
> When are you planning to submit the rest of modified autorebuild code?
Shortly ..
By the way, some of the changes in the patches you sent have not been
included in any form. They include:
- the getinfo_super_disks method. I couldn't see why you need this. All the
info about the state of the arrays should already be available.
If there is something that you need that we don't have, please explain and
we can see how best to add it back in.
- min_active_disk_size_in_array. I don't think the minimum current size is
really a good guide. I've kept the code for letting the metadata handler
check the size, but anything beyond that should be done with domains I
think.
E.g have a domain '2G-or-greater' which is assigned to all 2G or greater
devices. Then anything smaller will automatically be excluded from arrays
with those devices.
- The remove_from_super method. As Dan pointed out there seems to be
something wrong there so I chose to just leave it out for now. If you
could explain again what is needed, we can find the best way to add that
functionality.
Thanks,
NeilBrown
Re: Autorebuild, new dynamic udev rules for hot-plugs
on 22.11.2010 06:02:01 by NeilBrown
On Fri, 19 Nov 2010 15:12:19 +0000
"Hawrylewicz Czarnowski, Przemyslaw"
wrote:
> Hi,
>
> I would like to present another patch for our autorebuild tree (it should be applied on the top of Autorebuild series, as devel-3.2 is not stable yet). Dan Williams proposed to reduce overhead associated with passing each hot-plugged block device to mdadm, using just the devices described in policies via mdadm.conf file.
> See patch below for details, comments are as always very welcome.
I am in general in favour of this approach.
It has the benefit that it can very easily be not-used if there turn out to
be problems with it.
Four comments.
1/ I wouldn't write a file in /lib/udev/rules.d/
I think it should be written to "/dev/.udev/rules.d/"
which is referred to as the "temporary rules directory"
in the udev documentation.
2/ It would be good to process the type=disk or type=part part of the
policy into the rules file as well.
3/ I'm not very comfortable with hard-coding the name of the
file to be created in the rules.d directory. Maybe usage could be
--activate-domains=63-md-whatever
4/ I don't think it is good to have an incomplete file in rules.d that udev
might accidentally read. We should create the file with a name with a
leading '.' (assuming udev ignores those, I haven't checked) and then
rename it after it has been completely written.
Other than that, it looks pretty good.
Thanks,
NeilBrown
RE: [Patch 00/17] Autorebuild
on 22.11.2010 16:08:27 by anna.czarnowska
> > 4. With spare-same-slot when there is a cookie and disk has no
> metadata then we probably shouldn't look at domains. Just add.
> >
>
> I disagree. We must always check domains.
> Why do you think we should ignore domains in that case?
An easy example: we have an array made of two disks, sda and sdb.
sda is in domain d1, sdb is in domain d2. This is perfectly OK for Create.
Now sdb fails and we put a new disk in its place. It gets domain d2 because it is in the same slot as the old disk.
When the array went degraded and sdb was removed, the domain for the array was reduced to d1 only.
The new disk no longer matches, so it is not added.
I think we should still add it, because we have a file saying that this slot belongs to that array.
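In POLICY terms the example is roughly this (paths invented for illustration):

POLICY domain=d1 path=pci-0000:00:1f.2-ata-1 action=spare-same-slot
POLICY domain=d2 path=pci-0000:00:1f.2-ata-2 action=spare-same-slot

sda sits on the first path and sdb on the second, so once sdb is gone the array's domain collapses to d1, and the replacement in sdb's slot, which only matches d2, is rejected despite the cookie.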
There is also the controller domain, which has a special meaning, but I don't think it is a problem.
If the user originally created the array spanning different controllers, why wouldn't we take a replacement occupying the same slot as the original member?
My conclusion is: we should ignore domains when there is a cookie.
Note that we don't look at domains at all when we add a disk with spare metadata. Here they are indeed very much needed.
When someone creates some arrays, adds some spares and additionally defines domains to keep all related disks together, they may be disappointed to see that after reboot all spares end up in one container anyway. This happens for imsm, and because of this issue a spare that was meant for a different array may be used in the first array from the config instead (all spares will be added there regardless of domains).
Anna
RE: Devel 3.2 branch issues
on 22.11.2010 18:18:06 by Marcin.Labun
> - the getinfo_super_disks method. I couldn't see why you need this.
> All the
> info about the state of the arrays should already be available.
> If there is something that you need that we don't have, please
> explain and
> we can see how best to add it back in.
For external metadata we have added a metadata handler that gets a disk's state (a spare or not a spare) based on the current metadata state on the disk.
ioctl(GET_DISK_INFO) does not return disk state info for containers (it returns 0, so we don't know whether a disk is a spare or a failed disk).
We know that a disk is an array member by checking its state in the array.
Since the Monitor code on devel-3.2 does not call the getinfo_super_disks method, auto-rebuild grabs the first disk in a container and tries to move it to a degraded one (without success); that first disk happens to be an array member with state = 0 (a good spare), like all disks in the container.
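For reference, the query in question is roughly the following (a sketch only, error handling trimmed); against a container fd the returned state is 0 for every slot:

#include <sys/ioctl.h>
#include "md_u.h"	/* mdu_disk_info_t, GET_DISK_INFO */

/* Ask the kernel for the state of slot 'n' of the device open on 'fd'.
 * For a native array the state carries MD_DISK_ACTIVE/MD_DISK_FAULTY/...
 * flags; for a container it comes back as 0 regardless of whether the
 * disk is a member, a spare or failed, which is the problem described
 * above. */
static int slot_state(int fd, int n)
{
	mdu_disk_info_t disk;

	disk.number = n;
	if (ioctl(fd, GET_DISK_INFO, &disk) < 0)
		return -1;
	return disk.state;
}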
There is also a fatal error in pol_add when trying to update policy rules with a NULL spare_group:
@@ -732,7 +738,8 @@ static int move_spare(struct state *from, struct state *to,
continue;
pol = devnum_policy(from->devid[d]);
- pol_add(&pol, pol_domain, from->spare_group, NULL);
+ if (from->spare_group)
+ pol_add(&pol, pol_domain, from->spare_group, NULL);
We will send you our FT integrated with your mdadm test suite in a couple of days.
Marcin
Re: Devel 3.2 branch issues
on 22.11.2010 19:47:31 by dan.j.williams
On 11/22/2010 9:18 AM, Labun, Marcin wrote:
>> - the getinfo_super_disks method. I couldn't see why you need this.
>> All the
>> info about the state of the arrays should already be available.
>> If there is something that you need that we don't have, please
>> explain and
>> we can see how best to add it back in.
>
> For external metadata we have added a metadata handler to get a disk state (a spare or not a spare) based on current metadata state on disk.
> Ioctl(GET_DISK_INFO) does not have a disk state info for containers (returns 0 - so we don't know if it is a spare or a failed disk).
> We know that a disk is an array member based on check its state in the array.
I'm still catching up on the devel-3.2 getinfo/load_super reworks, but I
think this info would probably fit into the new 'map' parameter of
getinfo_super(). Spares can be indicated as 'working' in the map. I.e.
if map returns 1 for a container member and that disk is not currently
in use in a subarray then we can assume it is a spare at the container
level.
Alternatively we could just return 2 in the map to indicate spare, but I
think in the locations we care about we already know that it is not
currently in use in a subarray.
--
Dan
RE: Devel 3.2 branch issues
on 22.11.2010 23:39:00 by anna.czarnowska
> by the way, some of the changes in you of the patches you sent have not
> been
> included in any form. They include:
>
> - the getinfo_super_disks method. I couldn't see why you need this.
> All the
> info about the state of the arrays should already be available.
> If there is something that you need that we don't have, please
> explain and
> we can see how best to add it back in.
Marcin has already answered this but here is my explanation.
The current test devstate[i]==0 is always true for a container, so any device seems a good candidate to move.
To be able to identify members, failed devices and real spares we updated devstate for containers.
To find members we can just check which disks are used in subarrays, but a failed disk is removed from the subarray after a short while, and as soon as that happens we cannot tell the difference between a failed disk and a spare unless we look at the metadata.
> - min_active_disk_size_in_array. I don't think the minimum current
> size is
> really a good guide. I've kept the code for letting the metadata
> handler
> check the size, but anything beyond that should be done with domains
> I
> think.
> E.g have a domain '2G-or-greater' which is assigned to all 2G or
> greater
> devices. Then anything smaller will automatically be excluded from
> arrays
> with those devices.
So if someone doesn't base domains on size they may have a small spare added to an array where it cannot be used.
min_active_disk_size was more than required for an array that didn't occupy the whole disk, but at least it ensured we were not throwing in something that wouldn't help. If we do this, the array will remain degraded but will have a spare, so Monitor may think it does not need more.
For this reason we also checked the case where there is already a spare in the "to" container. If that spare was not suitable (a size check here too) we would still look for a good one.
And now back to assembly. There is still a segmentation fault when we try to assemble a subarray. It occurs when there is any config file and we run "mdadm -As" or "mdadm -Asc /etc/mdadm.conf"; content is NULL when we try to compare the uuid at line 413 in Assemble.c.
We are going to prepare some tests to add to the current suite so it will be easier to verify new patches.
Regards
Anna
RE: Autorebuild, new dynamic udev rules for hot-plugs
on 23.11.2010 00:50:04 by unknown
Hi,
> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Neil Brown
> Sent: Monday, November 22, 2010 6:02 AM
> To: Hawrylewicz Czarnowski, Przemyslaw
> Cc: linux-raid@vger.kernel.org; Neubauer, Wojciech; Williams, Dan J;
> Ciechanowski, Ed; Labun, Marcin; Czarnowska, Anna
> Subject: Re: Autorebuild, new dynamic udev rules for hot-plugs
>
> On Fri, 19 Nov 2010 15:12:19 +0000
> "Hawrylewicz Czarnowski, Przemyslaw"
> wrote:
>
> > Hi,
> >
> > I would like to present another patch for our autorebuild tree (it should
> be applied on the top of Autorebuild series, as devel-3.2 is not stable
> yet). Dan Williams proposed to reduce overhead associated with passing each
> hot-plugged block device to mdadm, using just the devices described in
> policies via mdadm.conf file.
> > See patch below for details, comments are as always very welcome.
>
> I am in general in favour of this approach.
> It has the benefit that it can very easily be not-used if there turn out to
> be problems with it.
>
> Four comments.
>
> 1/ I wouldn't write a file in /lib/udev/rules.d/
> I think it should be written to "/dev/.udev/rules.d/"
> which is referred to as the "temporary rules directory"
> in the udev documentation.
I am not sure that is what we are looking for. Temporary means they disappear after reboot. That is OK as cold-plug does not need support for bare disks (or maybe I am wrong?). But in that case, anyone who wants to use autorebuild should invoke mdadm --activate-domains, for example in /etc/init.d/local.boot or somewhere else. A second idea is to call Activate_Domains() when the monitor is started with autorebuild enabled. Which one? I would prefer to leave it as it was written initially (considering comment #4). Then, if policies are removed from the config, invoking --activate-domains should reset/remove the rules (but see #3).
>
> 2/ I would be good to process the type=disk or type=part part of the
> policy into the rules file as well.
OK
>
> 3/ I'm not very comfortable with hard-coding the name of the
> file to be created in the rules.d directory. Maybe usage could be
> --activate-domains=63-md-whatever
Good idea, but only if we store our rules in /dev/.udev/rules.d. Otherwise it would be difficult to maintain all the generated rules and remove the old ones... I would use a default if none is given by the user, but one can pass any file name.
>
> 4/ I don't think it is good to have an incomplete file in rules.d that udev
> might accidentally read. We should create the file with a name with a
> leading '.' (assuming udev ignores those, I haven't checked) and then
> rename it after it has been completely written.
You're right. In theory, such partial udev rules are excluded when udev can't interpret them properly. I have looked into udev's sources and found that it looks for "*.rules" files. All other file extensions are ignored. Files with leading dots are also omitted. I would prefer to create a .temp file and then rename it to .rules.
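For illustration, a minimal C sketch of the write-then-rename approach described above. The directory, file name and rule text are placeholders, not what any posted patch uses; the point is only that udev never sees a half-written *.rules file, because rename() is atomic within a filesystem.

/* Sketch only: write generated udev rules under a temporary name that udev
 * ignores, then rename() it atomically to the final *.rules name.
 * The directory, file names and rule text below are placeholders. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int write_rules_file(const char *dir, const char *name,
			    const char *rules_text)
{
	char tmp[4096], final[4096];
	FILE *f;

	/* udev only reads "*.rules", so a ".temp" suffix keeps the
	 * half-written file invisible to it. */
	snprintf(tmp, sizeof(tmp), "%s/%s.temp", dir, name);
	snprintf(final, sizeof(final), "%s/%s.rules", dir, name);

	f = fopen(tmp, "w");
	if (!f)
		return -1;
	if (fputs(rules_text, f) == EOF) {
		fclose(f);
		unlink(tmp);
		return -1;
	}
	if (fclose(f) == EOF) {
		unlink(tmp);
		return -1;
	}
	/* rename() is atomic, so udev sees either the old complete file
	 * or the new complete file, never a partial one. */
	if (rename(tmp, final) < 0) {
		unlink(tmp);
		return -1;
	}
	return 0;
}

int main(void)
{
	/* placeholder rule body - the real content would be generated
	 * from the DOMAIN/POLICY lines in mdadm.conf */
	const char *rule = "# generated by mdadm --activate-domains\n";

	return write_rules_file("/dev/.udev/rules.d", "63-md-raid-domains",
				rule) ? EXIT_FAILURE : EXIT_SUCCESS;
}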
>
> Other than that, it looks pretty good.
Great
>
> Thanks,
> NeilBrown
>
> >
> > Date: Thu, 18 Nov 2010 01:19:52 +0100
> > Subject: [PATCH] New dynamic hot-plug udev rules for policies
> >
> > When introducing policies, new hot-plug rules were added to support
> > bare disks. Mdadm was started for each hot plugged block device
> > to determine if it could be used as spare or as a replacement member for
> > degraded array.
> > This patch introduces limitation of range of devices that are handled
> > by mdadm. It limits them to the ones specified in domains associated
> > with the actions: spare-same-port, spare and spare-force.
> > In order to enable hot-plug for bare disks one must update udev rules
> > with command
> >
> > mdadm --activate-domains
> >
> > After mdadm.conf is changed one is obliged to re-run
> > "mdadm --activate-domains" command in order to bring the system
> > configuration up to date.
> > All hot-plugged disks containing metadata are still handled by existing
> > rules.
> >
> > Note: this patch is just a proposition to minimize overhead of using
> mdadm for
> > each plugged block device. If accepted, it will be incorporated in
> > previous implementation of hot-plug for bare disks.
> >
> > Signed-off-by: Przemyslaw Czarnowski
>
> > ---
> > Makefile | 2 +
> > ReadMe.c | 1 +
> > mdadm.c | 4 ++
> > mdadm.h | 9 +++-
> > policy.c | 137
> ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > udev-md-raid.rules | 5 +-
> > 6 files changed, 154 insertions(+), 4 deletions(-)
> >
[cut]
Re: Autorebuild, new dynamic udev rules for hot-plugs
am 23.11.2010 01:11:41 von dan.j.williams
On 11/22/2010 3:50 PM, Hawrylewicz Czarnowski, Przemyslaw wrote:
>> Four comments.
>>
>> 1/ I wouldn't write a file in /lib/udev/rules.d/
>> I think it should be written to "/dev/.udev/rules.d/"
>> which is referred to as the "temporary rules directory"
>> in the udev documentation.
> I am not sure if it is what we are looking for. Temporary means they disappear after reboot. It is OK as cold-plug does not need support for bare disks (or maybe I am wrong?). But in such case, one who wants to use autorebuild should invoke mdadm --activate-domains for example in /etc/init.d/local.boot or somewhere else. Second idea here is to use ActivateDomain() when one starts monitor with autorebuild enabled. Which one? I would prefer to leave it as it was written initially (considering comment #4). Then, if one removes policies from config, invoking --activate-domains should reset/remove rules (but see #3)
The intent was always to have this be something reinitialized at boot.
Putting these in the temporary rule directory also precludes them from
being added to the initramfs where they are not needed / potentially
confusing.
The other intent was to only match the pci paths for the controllers we
cared about. That does not appear to be a part of this patch.
>
>>
>> 2/ I would be good to process the type=disk or type=part part of the
>> policy into the rules file as well.
> OK
>
>>
>> 3/ I'm not very comfortable with hard-coding the name of the
>> file to be created in the rules.d directory. Maybe usage could be
>> --activate-domains=63-md-whatever
> Good idea, but only if we store our rules in /dev/.udev/rules.d. Otherwise it would be difficult to maintain all generated rules and remove the old ones... I would leave default if not given by user, but one can pass any file name.
The issue is that this namespace belongs to the distro, and since they
need to modify initscripts to turn this feature on, we might as well leave
the entirety of the naming responsibility to the user.
>> 4/ I don't think it is good to have an incomplete file in rules.d that udev
>> might accidentally read. We should create the file with a name with a
>> leading '.' (assuming udev ignores those, I haven't checked) and then
>> rename it after it has been completely written.
> You're right. In theory, such partial udev rules are excluded when udev can't interpret them properly. I have looked into udev's sources and found that it looks for "*.rules" files. All other file extensions are ignored. Files with leading dots are also omitted. I would prefer to create .temp file and then rename it into .rules.
There must be an existing convention for this sort of thing; if so,
let's not invent another one.
--
Dan
Re: Devel 3.2 branch issues
am 23.11.2010 01:52:13 von NeilBrown
On Mon, 22 Nov 2010 22:39:00 +0000
"Czarnowska, Anna" wrote:
>
> > by the way, some of the changes in the patches you sent have not
> > been
> > included in any form. They include:
> >
> > - the getinfo_super_disks method. I couldn't see why you need this.
> > All the
> > info about the state of the arrays should already be available.
> > If there is something that you need that we don't have, please
> > explain and
> > we can see how best to add it back in.
>
> Marcin has already answered this but here is my explanation.
> The current test devstate[i]==0 is always true for a container, so any device seems a good candidate to move.
> To be able to identify members, failed devices and real spares we updated devstate for containers.
> To find members we can just check which disks are used in subarrays, but a failed disk is removed from its subarray after a short while, and as soon as that happens we cannot tell a failed disk from a spare unless we look at the metadata.
Thanks. That makes sense. I'll look at the code and see about applying it.
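To illustrate why reading the metadata matters here, a simplified sketch of classifying a container's disks. The structure below is a stand-in, not mdadm's struct mdinfo; in mdadm the list would come from a getinfo_super_disks-style handler and the state from the MD_DISK_* flags, so treat this only as a picture of the classification, not the real interface.

/* Simplified sketch (not mdadm's real structures): walk a list of disks
 * reported by the container's metadata and classify each one. */
#include <stdio.h>

enum disk_kind { DISK_MEMBER, DISK_FAILED, DISK_SPARE };

struct disk_entry {
	int major, minor;
	int raid_disk;		/* slot in a subarray, -1 if none */
	int faulty;		/* metadata marked the disk failed */
	struct disk_entry *next;
};

static enum disk_kind classify(const struct disk_entry *d)
{
	if (d->faulty)
		return DISK_FAILED;	/* never pick as a spare */
	if (d->raid_disk >= 0)
		return DISK_MEMBER;	/* in use by a subarray */
	return DISK_SPARE;		/* bare in the container */
}

int main(void)
{
	struct disk_entry sdb = { 8, 16, -1, 1, NULL };	/* failed, already dropped from its subarray */
	struct disk_entry sdc = { 8, 32, -1, 0, &sdb };	/* genuine spare */
	struct disk_entry sda = { 8,  0,  0, 0, &sdc };	/* active member */
	const char *names[] = { "member", "failed", "spare" };

	for (struct disk_entry *d = &sda; d; d = d->next)
		printf("%d:%d -> %s\n", d->major, d->minor, names[classify(d)]);
	return 0;
}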
>
> > - min_active_disk_size_in_array. I don't think the minimum current
> > size is
> > really a good guide. I've kept the code for letting the metadata
> > handler
> > check the size, but anything beyond that should be done with domains
> > I
> > think.
> > E.g have a domain '2G-or-greater' which is assigned to all 2G or
> > greater
> > devices. Then anything smaller will automatically be excluded from
> > arrays
> > with those devices.
>
> So if someone doesn't base domains on size they may have a small spare added to an array where it cannot be used.
> Min_active_disk_size was more than required for an array that didn't occupy the whole disk but at least it ensured that we are not throwing in something that wouldn't help. If we do this the array will remain degraded but will have spare - so Monitor may think it does not need more.
> For this reason we also checked the case when there was a spare in "to" container. If the spare was not suitable (size check here too) we would still look for a good one.
I don't think it is possible to come up with an automatic way to determine if
a given spare suits a given array that is always correct. There are too many
subtleties.
So I would like to allow the sysadmin to exercise complete control, and have
defaults that make reasonable sense in common cases.
The 'complete control' can be exercised through domain - though I will
probably add some size based rule mechanism to the policy code so devices can
be categorised by size if wanted.
The 'safe default' is probably best left to the metadata handler. So
ultimately all metadata types *should* specify min_acceptable_spare_size,
and we will just make do with that.
Does that sound OK?
>
> And now back to assembly. There is still a segmentation fault when we try to assemble a subarray. Occurs when there is any config file and we run "mdadm -As" or "mdadm -Asc /etc/mdadm.conf". content is NULL when we try to compare uuid in line 413 in Assemble.c.
Yes - patch below should fix this.
> We are going to prepare some tests to add to current suite so it will be easier to verify new patches.
That would be greatly appreciated!!
Thanks,
NeilBrown
commit 87477e6d5e4201bf2bd812f34f8321983310bd99
Author: NeilBrown
Date: Tue Nov 23 11:34:36 2010 +1100
Assemble: get content before testing it.
When checking that a container matches the required uuid,
we need to call 'getinfo_super' before we have a 'content'
to test.
Reported-by: "Czarnowska, Anna"
Signed-off-by: NeilBrown
diff --git a/Assemble.c b/Assemble.c
index 1a1e128..607f2af 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -409,6 +409,11 @@ int Assemble(struct supertype *st, char *mddev,
if (ident->container[0] != '/') {
/* we have a uuid */
int uuid[4];
+
+ content = &info;
+ memset(content, 0, sizeof(*content));
+ tst->ss->getinfo_super(tst, content, NULL);
+
if (!parse_uuid(ident->container, uuid) ||
!same_uuid(content->uuid, uuid, tst->ss->swapuuid)) {
if (report_missmatch)
Re: Autorebuild, new dynamic udev rules for hot-plugs
am 23.11.2010 02:17:25 von NeilBrown
On Mon, 22 Nov 2010 16:11:41 -0800
Dan Williams wrote:
> On 11/22/2010 3:50 PM, Hawrylewicz Czarnowski, Przemyslaw wrote:
> >> Four comments.
> >>
> >> 1/ I wouldn't write a file in /lib/udev/rules.d/
> >> I think it should be written to "/dev/.udev/rules.d/"
> >> which is referred to as the "temporary rules directory"
> >> in the udev documentation.
> > I am not sure if it is what we are looking for. Temporary means they disappear after reboot. It is OK as cold-plug does not need support for bare disks (or maybe I am wrong?). But in such case, one who wants to use autorebuild should invoke mdadm --activate-domains for example in /etc/init.d/local.boot or somewhere else. Second idea here is to use ActivateDomain() when one starts monitor with autorebuild enabled. Which one? I would prefer to leave it as it was written initially (considering comment #4). Then, if one removes policies from config, invoking --activate-domains should reset/remove rules (but see #3)
>
> The intent was always to have this be something reinitialized at boot.
> Putting these in the temporary rule directory also precludes them from
> being added to the initramfs where they are not needed / potentially
> confusing.
>
> The other intent was to only match the pci paths for the controllers we
> cared about. That does not appear to be a part of this patch.
Can you define "we cared about". Don't we care about everything listed in
mdadm.conf??
>
> >
> >>
> >> 2/ I would be good to process the type=disk or type=part part of the
> >> policy into the rules file as well.
> > OK
> >
> >>
> >> 3/ I'm not very comfortable with hard-coding the name of the
> >> file to be created in the rules.d directory. Maybe usage could be
> >> --activate-domains=63-md-whatever
> > Good idea, but only if we store our rules in /dev/.udev/rules.d. Otherwise it would be difficult to maintain all generated rules and remove the old ones... I would leave default if not given by user, but one can pass any file name.
>
> The issue is that this namespace belongs to the distro and since they
> need to modify initscripts to turn this feature on might as well dump
> the entirety of the naming responsibility to the user.
>
> >> 4/ I don't think it is good to have an incomplete file in rules.d that udev
> >> might accidentally read. We should create the file with a name with a
> >> leading '.' (assuming udev ignores those, I haven't checked) and then
> >> rename it after it has been completely written.
> > > You're right. In theory, such partial udev rules are excluded when udev can't interpret them properly. I have looked into udev's sources and found that it looks for "*.rules" files. All other file extensions are ignored. Files with leading dots are also omitted. I would prefer to create .temp file and then rename it into .rules.
>
> There must be an existing convention for this sort of the thing, if so
> let's not invent another one.
We could avoid both these issues by just writing the new rules file to stdout.
Then when the init script gets it wrong, it isn't our fault :-)
But I don't really like that. At least there should be a simple and uniform
way to propagate any mdadm.conf changes into udev.
Maybe the name of the rules file should be given in mdadm.conf, and e.g.
mdadm --check-config
would report any syntax errors, report any inconsistencies with current
arrays, and update the udev file if necessary..
Maybe leave that for 3.2.1, and just support '--activate-domains=filename'
for now.
???
NeilBrown
>
> --
> Dan
Re: [Patch 00/17] Autorebuild
am 23.11.2010 02:34:17 von NeilBrown
On Mon, 22 Nov 2010 15:08:27 +0000
"Czarnowska, Anna" wrote:
> > > 4. With spare-same-slot when there is a cookie and disk has no
> > metadata then we probably shouldn't look at domains. Just add.
> > >
> >
> > I disagree. We must always check domains.
> > Why do you think we should ignore domains in that case?
>
> Easy example is: we have an array made on two disks sda and sdb.
> Sda has domain d1. Sdb has domain d2. This is perfectly ok for Create.
> Now sdb fails and we put a new disk in that place. It gets domain d2 because it is in the same slot as old disk.
> When the array went degraded and sdb was removed, the domain for the array was reduced to d1 only.
> New disk does not match any more so it is not added.
>
> I think we should still add it because we have a file saying that this slot belongs to that array.
> There is also controller domain that has the special meaning but I don't think it is a problem.
> If the user originally created the array spanning different controllers why wouldn't we take a replacement occupying the same slot as original member?
>
> My conclusion is: we should ignore domains when there is a cookie.
Yes, that makes sense. Thanks for the explanation.
So when we find a device with a cookie file that identifies a particular
array, we allow that device to be added to that array without further
reference to domain.
Sounds good. It should go in the man-page somewhere of course.
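A minimal sketch of the rule just agreed on; the helper functions here are hypothetical stand-ins, not mdadm's real API. A bare hot-plugged disk whose cookie names a particular array goes back to that array without a domain check; any other bare disk must pass the domain check.

/* Sketch only - cookie_names_array() and domain_allows() are hypothetical
 * stand-ins for the real spare-same-slot cookie lookup and domain test. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* does a stored cookie say this slot/path belongs to a particular array? */
static bool cookie_names_array(const char *devpath, char *array, size_t len)
{
	(void)devpath; (void)array; (void)len;
	return false;			/* stub for the sketch */
}

/* would the DOMAIN/POLICY configuration accept this disk for the array? */
static bool domain_allows(const char *devpath, const char *array)
{
	(void)devpath; (void)array;
	return true;			/* stub for the sketch */
}

static bool may_add_bare_disk(const char *devpath, const char *array)
{
	char cookie_array[64];

	/* spare-same-slot with a cookie: the cookie identifies the array the
	 * failed member belonged to, so add the replacement there even if the
	 * degraded array's domain has shrunk in the meantime. */
	if (cookie_names_array(devpath, cookie_array, sizeof(cookie_array)))
		return strcmp(cookie_array, array) == 0;

	/* no cookie: the usual domain check decides */
	return domain_allows(devpath, array);
}

int main(void)
{
	printf("add /dev/sdb? %d\n",
	       may_add_bare_disk("/dev/sdb", "/dev/md/imsm0"));
	return 0;
}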
>
> Note that we don't look at domains at all when we add a disk with spare metadata. Here it is indeed very much needed.
> When someone creates some arrays and adds some spares, additionally defines domains to keep all related disks together, then he may be disappointed seeing that after reboot all spares end up in one container anyway. This happens for imsm and because of this issue some spare that was meant for a different array may be used in the first array from config instead (all spares will be added there regardless of domains).
>
Presumably this is required for both -I and -A.
Normally when assembling an array we ignore domains because if two devices
claim to be in an array, then they need to be assembled together no matter
what domains say.
But for truly global spares, the metadata doesn't tell us much, so we only
add such a spare to an array for which the domain says it is OK.
This is a little awkward for -I as if we get a spare first we have no idea
what to do with it.
I think we had an idea once of having a container for global spares. We
could proceed with that, putting spares in that container as they are found,
and maybe have Monitor() move these spares to an active container if one is
found with a domain match. Maybe?
NeilBrown
> Anna
>
Re: Autorebuild, new dynamic udev rules for hot-plugs
am 23.11.2010 06:04:07 von dan.j.williams
On 11/22/2010 5:17 PM, Neil Brown wrote:
> On Mon, 22 Nov 2010 16:11:41 -0800
> Dan Williams wrote:
>
>> On 11/22/2010 3:50 PM, Hawrylewicz Czarnowski, Przemyslaw wrote:
>>>> Four comments.
>>>>
>>>> 1/ I wouldn't write a file in /lib/udev/rules.d/
>>>> I think it should be written to "/dev/.udev/rules.d/"
>>>> which is referred to as the "temporary rules directory"
>>>> in the udev documentation.
>>> I am not sure if it is what we are looking for. Temporary means they disappear after reboot. It is OK as cold-plug does not need support for bare disks (or maybe I am wrong?). But in such case, one who wants to use autorebuild should invoke mdadm --activate-domains for example in /etc/init.d/local.boot or somewhere else. Second idea here is to use ActivateDomain() when one starts monitor with autorebuild enabled. Which one? I would prefer to leave it as it was written initially (considering comment #4). Then, if one removes policies from config, invoking --activate-domains should reset/remove rules (but see #3)
>>
>> The intent was always to have this be something reinitialized at boot.
>> Putting these in the temporary rule directory also precludes them from
>> being added to the initramfs where they are not needed / potentially
>> confusing.
>>
>> The other intent was to only match the pci paths for the controllers we
>> cared about. That does not appear to be a part of this patch.
>
> Can you define "we cared about". Don't we care about everything listed in
> mdadm.conf??
A hot-plug event outside of ahci (in raid mode) or the upcoming isci
driver needs to be ignored, and an error thrown on activate if we can
unambiguously determine that the domain defines firmware-unreachable
devices.
The IMSM_NO_PLATFORM debug environment variable can override this
behavior, or in the ahci case you can run in raid disabled mode. I need
to check if the same raid disabled case holds for isci.
>
> We could avoid both these issues by just writing the new rules file to stdout.
> When when the init script gets it wrong, it isn't our fault :-)
>
> But I don't really like that. At least there should be a simple and uniform
> way to propagate any mdadm.conf changes into udev.
>
> Maybe the name of the rules file should be given in mdadm.conf, and e.g.
> mdadm --check-config
> would report any syntax errors, report any inconsistencies with current
> arrays, and update the udev file if necessary..
>
> Maybe leave that for 3.2.1, and just support '--activate-domains=filename'
> for now.
>
> ???
A more generic mdadm.conf checker sounds like a good idea in general.
--
Dan
Re: Autorebuild, new dynamic udev rules for hot-plugs
am 23.11.2010 06:27:54 von NeilBrown
On Mon, 22 Nov 2010 21:04:07 -0800
Dan Williams wrote:
> On 11/22/2010 5:17 PM, Neil Brown wrote:
> > On Mon, 22 Nov 2010 16:11:41 -0800
> > Dan Williams wrote:
> >
> >> On 11/22/2010 3:50 PM, Hawrylewicz Czarnowski, Przemyslaw wrote:
> >>>> Four comments.
> >>>>
> >>>> 1/ I wouldn't write a file in /lib/udev/rules.d/
> >>>> I think it should be written to "/dev/.udev/rules.d/"
> >>>> which is referred to as the "temporary rules directory"
> >>>> in the udev documentation.
> >>> I am not sure if it is what we are looking for. Temporary means they disappear after reboot. It is OK as cold-plug does not need support for bare disks (or maybe I am wrong?). But in such case, one who wants to use autorebuild should invoke mdadm --activate-domains for example in /etc/init.d/local.boot or somewhere else. Second idea here is to use ActivateDomain() when one starts monitor with autorebuild enabled. Which one? I would prefer to leave it as it was written initially (considering comment #4). Then, if one removes policies from config, invoking --activate-domains should reset/remove rules (but see #3)
> >>
> >> The intent was always to have this be something reinitialized at boot.
> >> Putting these in the temporary rule directory also precludes them from
> >> being added to the initramfs where they are not needed / potentially
> >> confusing.
> >>
> >> The other intent was to only match the pci paths for the controllers we
> >> cared about. That does not appear to be a part of this patch.
> >
> > Can you define "we cared about". Don't we care about everything listed in
> > mdadm.conf??
>
> A hot plug event outside of ahci (in raid mode), or the upcoming isci
> driver needs to be ignored and an error thrown on activate if we can
> unambiguously determine that the domain defines firmware unreachable
> devices.
I can agree that a hot plug event for a non-firmware-reachable device should
not cause that device to be added to an imsm array. But the domain setting
should stop that happening already.
I don't agree (as I *think* you are saying) that hot plug events on such
devices should be completely ignored by mdadm. But as I am very surprised
that you would say that, I suspect I'm misunderstanding.
NeilBrown
>
> The IMSM_NO_PLATFORM debug environment variable can override this
> behavior, or in the ahci case you can run in raid disabled mode. I need
> to check if the same raid disabled case holds for isci.
>
> >
> > We could avoid both these issues by just writing the new rules file to stdout.
> > When when the init script gets it wrong, it isn't our fault :-)
> >
> > But I don't really like that. At least there should be a simple and uniform
> > way to propagate any mdadm.conf changes into udev.
> >
> > Maybe the name of the rules file should be given in mdadm.conf, and e.g.
> > mdadm --check-config
> > would report any syntax errors, report any inconsistencies with current
> > arrays, and update the udev file if necessary..
> >
> > Maybe leave that for 3.2.1, and just support '--activate-domains=filename'
> > for now.
> >
> > ???
>
> A more generic mdadm.conf checker sounds like a good idea in general.
>
> --
> Dan
Re: Autorebuild, new dynamic udev rules for hot-plugs
am 23.11.2010 07:17:10 von dan.j.williams
On 11/22/2010 9:27 PM, Neil Brown wrote:
> On Mon, 22 Nov 2010 21:04:07 -0800
> Dan Williams wrote:
>
>> On 11/22/2010 5:17 PM, Neil Brown wrote:
>>> On Mon, 22 Nov 2010 16:11:41 -0800
>>> Dan Williams wrote:
>>>
>>>> On 11/22/2010 3:50 PM, Hawrylewicz Czarnowski, Przemyslaw wrote:
>>>>>> Four comments.
>>>>>>
>>>>>> 1/ I wouldn't write a file in /lib/udev/rules.d/
>>>>>> I think it should be written to "/dev/.udev/rules.d/"
>>>>>> which is referred to as the "temporary rules directory"
>>>>>> in the udev documentation.
>>>>> I am not sure if it is what we are looking for. Temporary means they disappear after reboot. It is OK as cold-plug does not need support for bare disks (or maybe I am wrong?). But in such case, one who wants to use autorebuild should invoke mdadm --activate-domains for example in /etc/init.d/local.boot or somewhere else. Second idea here is to use ActivateDomain() when one starts monitor with autorebuild enabled. Which one? I would prefer to leave it as it was written initially (considering comment #4). Then, if one removes policies from config, invoking --activate-domains should reset/remove rules (but see #3)
>>>>
>>>> The intent was always to have this be something reinitialized at boot.
>>>> Putting these in the temporary rule directory also precludes them from
>>>> being added to the initramfs where they are not needed / potentially
>>>> confusing.
>>>>
>>>> The other intent was to only match the pci paths for the controllers we
>>>> cared about. That does not appear to be a part of this patch.
>>>
>>> Can you define "we cared about". Don't we care about everything listed in
>>> mdadm.conf??
>>
>> A hot plug event outside of ahci (in raid mode), or the upcoming isci
>> driver needs to be ignored and an error thrown on activate if we can
>> unambiguously determine that the domain defines firmware unreachable
>> devices.
>
>
> I can agree that a hot plug event for a non-firmware-reachable device should
> not cause that device to be added to an imsm array. But the domain setting
> should stop that happening already.
>
> I don't agree (as I *think* you are saying) that hot plug events on such
> devices should be completely ignored by mdadm. But as I am very surprised
> that you would say that, I suspect I'm misunderstanding.
Well, this comes back to the idea that the user need not be burdened
with figuring out the pci-device or sas-domain-topology paths of a raid
controller, especially if that controller changes paths from boot-to-boot.
I see your point that if we define a maximal domain the
controller/firmware constraints can be applied late to handle the
clarification. But if everything is interrogated in this fashion then
what is the point of dynamically limiting the hot plug path set? Do we
need to require explicit definition of a default catch-all domain for
out-of-firmware-bounds devices? Now I think I'm the one that is
misunderstanding :-)
--
Dan
RE: Devel 3.2 branch issues
am 23.11.2010 13:04:37 von anna.czarnowska
> I don't think it is possible to come up with an automatic way to
> determine if
> a given spare suits a given array that is always correct. There are
> too many
> subtleties.
> So I would like to allow the sysadmin to exercise complete control, and
> have
> defaults that make reasonable sense in common cases.
>
> The 'complete control' can be exercised through domain - though I will
> probably add some size based rule mechanism to the policy code so
> devices can
> be categorised by size if wanted.
> The 'safe default' is probably best left to the metadata handler. So
> ultimately all metadata types *should* specify
> min_acceptable_spare_size,
> and we will just make do with that.
>
> Does that sound OK?
Yes.
When all metadata types have min_acceptable_spare_size there will be no need for min_active_disk_size at all.
One more thought: Manage_subdevs checks component_size, so for native metadata it will not allow adding a spare that is too small.
But checking the size in Monitor will prevent unnecessary removal and re-adding.
It would make sense to get Manage_subdevs to check the size properly for external metadata too.
Anna
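A minimal sketch of the size check being discussed. The names min_spare_size_required and dev_size_from_id echo the Monitor patch that appears later in this thread, but the bodies below are simplified stand-ins, not the real implementations.

/* Sketch only: decide whether a candidate spare is large enough for the
 * degraded array before moving it. */
#include <stdbool.h>
#include <stdio.h>

/* smallest device (in 512-byte sectors) that could rebuild the array;
 * stand-in for asking the metadata handler / component size */
static unsigned long long min_spare_size_required(unsigned long long component_sectors)
{
	return component_sectors;
}

/* stand-in for looking up a block device's size from its dev id */
static bool dev_size_from_id(unsigned long long fake_size, unsigned long long *size)
{
	*size = fake_size;
	return true;
}

static bool spare_is_big_enough(unsigned long long component_sectors,
				unsigned long long candidate_sectors)
{
	unsigned long long need = min_spare_size_required(component_sectors);
	unsigned long long have;

	if (need == 0)
		return true;		/* no constraint known */
	if (!dev_size_from_id(candidate_sectors, &have))
		return false;		/* size unknown - do not move the spare */
	return have >= need;		/* a too-small spare only hides the degradation */
}

int main(void)
{
	unsigned long long need = 4194304;	/* 2 GiB array component */

	printf("1 GiB spare usable? %d\n", spare_is_big_enough(need, 2097152));
	printf("4 GiB spare usable? %d\n", spare_is_big_enough(need, 8388608));
	return 0;
}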
RE: Autorebuild, new dynamic udev rules for hot-plugs
am 23.11.2010 18:01:34 von unknown
> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Neil Brown
> Sent: Tuesday, November 23, 2010 2:17 AM
> To: Williams, Dan J
> Cc: Hawrylewicz Czarnowski, Przemyslaw; linux-raid@vger.kernel.org;
> Neubauer, Wojciech; Ciechanowski, Ed; Labun, Marcin; Czarnowska, Anna
> Subject: Re: Autorebuild, new dynamic udev rules for hot-plugs
>
> On Mon, 22 Nov 2010 16:11:41 -0800
> Dan Williams wrote:
>
> > On 11/22/2010 3:50 PM, Hawrylewicz Czarnowski, Przemyslaw wrote:
> > >> Four comments.
> > >>
> > >> 1/ I wouldn't write a file in /lib/udev/rules.d/
> > >> I think it should be written to "/dev/.udev/rules.d/"
> > >> which is referred to as the "temporary rules directory"
> > >> in the udev documentation.
> > > I am not sure if it is what we are looking for. Temporary means they
> disappear after reboot. It is OK as cold-plug does not need support for
> bare disks (or maybe I am wrong?). But in such case, one who wants to use
> autorebuild should invoke mdadm --activate-domains for example in
> /etc/init.d/local.boot or somewhere else. Second idea here is to use
> ActivateDomain() when one starts monitor with autorebuild enabled. Which
> one? I would prefer to leave it as it was written initially (considering
> comment #4). Then, if one removes policies from config, invoking --
> activate-domains should reset/remove rules (but see #3)
> >
> > The intent was always to have this be something reinitialized at boot.
> > Putting these in the temporary rule directory also precludes them from
> > being added to the initramfs where they are not needed / potentially
> > confusing.
> >
> > The other intent was to only match the pci paths for the controllers we
> > cared about. That does not appear to be a part of this patch.
>
> Can you define "we cared about". Don't we care about everything listed in
> mdadm.conf??
>
>
> >
> > >
> > >>
> > >> 2/ I would be good to process the type=disk or type=part part of the
> > >> policy into the rules file as well.
> > > OK
> > >
> > >>
> > >> 3/ I'm not very comfortable with hard-coding the name of the
> > >> file to be created in the rules.d directory. Maybe usage could be
> > >> --activate-domains=63-md-whatever
> > > Good idea, but only if we store our rules in /dev/.udev/rules.d.
> Otherwise it would be difficult to maintain all generated rules and remove
> the old ones... I would leave default if not given by user, but one can
> pass any file name.
> >
> > The issue is that this namespace belongs to the distro and since they
> > need to modify initscripts to turn this feature on might as well dump
> > the entirety of the naming responsibility to the user.
> >
> > >> 4/ I don't think it is good to have an incomplete file in rules.d that
> udev
> > >> might accidentally read. We should create the file with a name
> with a
> > >> leading '.' (assuming udev ignores those, I haven't checked) and
> then
> > >> rename it after it has been completely written.
> > > You're right. In theory, such partial udev rules are excluded when udev
> can't interpret them properly. I have looked into udev's sources and found
> that it looks for "*.rules" files. All other file extensions are ignored.
> Files with leading dots are also omitted. I would prefer to
> create .temp file and then rename it into .rules.
> >
> > There must be an existing convention for this sort of the thing, if so
> > let's not invent another one.
I haven't found anything similar. Just mountall, but it writes a single line in one "shot"... Both options, an extension or a leading dot, will work.
>
> We could avoid both these issues by just writing the new rules file to
> stdout.
> When when the init script gets it wrong, it isn't our fault :-)
I like that idea at this stage. Later on we might develop a better solution (see below)
>
> But I don't really like that. At least there should be a simple and
> uniform
> way to propagate any mdadm.conf changes into udev.
>
> Maybe the name of the rules file should be given in mdadm.conf, and e.g.
> mdadm --check-config
> would report any syntax errors, report any inconsistencies with current
> arrays, and update the udev file if necessary..
>
> Maybe leave that for 3.2.1, and just support '--activate-domains=filename'
> for now.
Let me extend this thought a little. As I mentioned above, I like the idea of writing the rule to stdout. Or if somebody wants to pass a file name, just write the file in the current directory - similar to the way one creates mdadm.conf with mdadm --examine (but with a small improvement :).
But the general problem is to find a "simple and uniform way". Something distro-independent. Rules should be prepared once, right after the config file is finished. Fire and forget :)
We need to handle hot/cold-plug events for action=spare, and hot-plug events for action=spare-same-slot. Since we use the temporary udev rules directory, the rules need to be regenerated on each reboot, and we must pay attention to the moment when we do it so that all cases are handled. What options are there, then?
As a last resort, maybe just a note in the man pages listing the possibilities, leaving the implementation to the user/admin?
>
> ???
>
> NeilBrown
>
>
RE: Devel 3.2 branch issues
am 23.11.2010 18:34:49 von Marcin.Labun
This is a fixed version of the original patch for the problem of imsm using a spare disk
that has been removed from a container. In the previous patch there was a problem of
releasing spare structures too early for spares that had been part of a volume (mdadm -f) - now fixed.
As for Dan's comment,
> Since we do not update the metadata can we just lazily queue an modified
> imsm_delete() update the next time we call activate_spare() and find the spare removed? That way it is just garbage collection without this new infrastructure that gives the appearance we are writing metadata when removing a spare.
In fact, we are not updating metadata when removing a spare. I am reusing the current imsm communication method between mdmon threads, which is already used to do much more than just metadata updates.
In your proposal we would still need to send info to the monitor that a disk shall no longer be used in a container.
Marcin
From 49406f135843a6bc2d2d28a34f8e8647fcced4d0 Mon Sep 17 00:00:00 2001
From: Marcin Labun
Date: Wed, 17 Nov 2010 00:09:02 +0100
Subject: [PATCH] IMSM: Fix problem in mdmon monitor of using a disk removed from an imsm container.
Manager thread shall pass the information to monitor thread (mdmon)
that some devices are removed from container. Otherwise, monitor (mdmon)
might use such devices (spares) to rebuild the array that has gone degraded.
This problem happens for imsm containers, since a list of the container disks
is maintained in the intel_super structure. When an array goes degraded, the list is
searched to find spare disks to start a rebuild.
Without this fix the rebuild could be started on a spare device that was
a member of the container but had been removed from it.
A new super type function handler has been introduced to prepare metadata
format specific information about the removed devices.
int (*remove_from_super)(struct supertype *st, mdu_disk_info_t *dinfo,
int fd);
The message prepared in remove_from_super is later processed
by the process_update handler in the monitor thread.
Signed-off-by: Marcin Labun
---
managemon.c | 38 +++++++++++++
mdadm.h | 7 ++-
super-intel.c | 173 +++++++++++++++++++++++++++++++++++++++++++++++----------
3 files changed, 187 insertions(+), 31 deletions(-)
diff --git a/managemon.c b/managemon.c
index 8915522..93b130a 100644
--- a/managemon.c
+++ b/managemon.c
@@ -297,6 +297,43 @@ static void add_disk_to_container(struct supertype *st, struct mdinfo *sd)
st->update_tail = NULL;
}
+/*
+ * Create and queue update structure about the removed disks.
+ * The update is prepared by super type handler and passed to the monitor
+ * thread.
+ */
+static void remove_disk_from_container(struct supertype *st, struct mdinfo *sd)
+{
+ int dfd;
+ char nm[20];
+ struct metadata_update *update = NULL;
+ mdu_disk_info_t dk = {
+ .number = -1,
+ .major = sd->disk.major,
+ .minor = sd->disk.minor,
+ .raid_disk = -1,
+ .state = 0,
+ };
+ /* nothing to do if super type handler does not support
+ * remove disk primitive
+ */
+ if (!st->ss->remove_from_super)
+ return;
+ dprintf("%s: remove %d:%d to container\n",
+ __func__, sd->disk.major, sd->disk.minor);
+
+ sprintf(nm, "%d:%d", sd->disk.major, sd->disk.minor);
+ dfd = dev_open(nm, O_RDWR);
+ if (dfd < 0)
+ return;
+
+ st->update_tail = &update;
+ st->ss->remove_from_super(st, &dk, dfd);
+ st->ss->write_init_super(st);
+ queue_metadata_update(update);
+ st->update_tail = NULL;
+}
+
static void manage_container(struct mdstat_ent *mdstat,
struct supertype *container)
{
@@ -334,6 +371,7 @@ static void manage_container(struct mdstat_ent *mdstat,
if (!found) {
cd = *cdp;
*cdp = (*cdp)->next;
+ remove_disk_from_container(container, cd);
free(cd);
} else
cdp = &(*cdp)->next;
diff --git a/mdadm.h b/mdadm.h
index 2d1db36..6309a62 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -596,7 +596,12 @@ extern struct superswitch {
* when hot-adding a spare.
*/
int (*add_to_super)(struct supertype *st, mdu_disk_info_t *dinfo,
- int fd, char *devname);
+ int fd, char *devname);
+ /* update the metadata to delete a device,
+ * when hot-removing a spare.
+ */
+ int (*remove_from_super)(struct supertype *st, mdu_disk_info_t *dinfo,
+ int fd);
/* Write metadata to one device when fixing problems or adding
* a new device.
diff --git a/super-intel.c b/super-intel.c
index 9b4ad19..ac168e8 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -233,6 +233,10 @@ struct intel_dev {
unsigned index;
};
+enum action {
+ DISK_REMOVE = 0,
+ DISK_ADD
+};
/* internal representation of IMSM metadata */
struct intel_super {
union {
@@ -258,8 +262,10 @@ struct intel_super {
int extent_cnt;
struct extent *e; /* for determining freespace @ create */
int raiddisk; /* slot to fill in autolayout */
+ enum action action;
} *disks;
- struct dl *add; /* list of disks to add while mdmon active */
+ struct dl *disk_mgmt_list; /* list of disks to add/remove while mdmon
+ active */
struct dl *missing; /* disks removed while we weren't looking */
struct bbm_log *bbm_log;
const char *hba; /* device path of the raid controller for this metadata */
@@ -285,6 +291,7 @@ enum imsm_update_type {
update_kill_array,
update_rename_array,
update_add_disk,
+ update_add_remove_disk
};
struct imsm_update_activate_spare {
@@ -316,7 +323,7 @@ struct imsm_update_rename_array {
int dev_idx;
};
-struct imsm_update_add_disk {
+struct imsm_update_add_remove_disk {
enum imsm_update_type type;
};
@@ -2428,6 +2435,7 @@ static void __free_imsm_disk(struct dl *d)
free(d);
}
+
static void free_imsm_disks(struct intel_super *super)
{
struct dl *d;
@@ -3393,6 +3401,7 @@ static int add_to_super_imsm(struct supertype *st, mdu_disk_info_t *dk,
dd->devname = devname ? strdup(devname) : NULL;
dd->fd = fd;
dd->e = NULL;
+ dd->action = DISK_ADD;
rv = imsm_read_serial(fd, devname, dd->serial);
if (rv) {
fprintf(stderr,
@@ -3412,8 +3421,8 @@ static int add_to_super_imsm(struct supertype *st, mdu_disk_info_t *dk,
dd->disk.scsi_id = __cpu_to_le32(0);
if (st->update_tail) {
- dd->next = super->add;
- super->add = dd;
+ dd->next = super->disk_mgmt_list;
+ super->disk_mgmt_list = dd;
} else {
dd->next = super->disks;
super->disks = dd;
@@ -3422,6 +3431,45 @@ static int add_to_super_imsm(struct supertype *st, mdu_disk_info_t *dk,
return 0;
}
+
+static int remove_from_super_imsm(struct supertype *st, mdu_disk_info_t *dk,
+ int fd)
+{
+ struct intel_super *super = st->sb;
+ struct dl *dd;
+
+ /* remove from super works only in mdmon - for communication
+ * manager - monitor. Check if communication memory buffer
+ * is prepared.
+ */
+ if (!st->update_tail) {
+ fprintf(stderr,
+ Name ": %s shall be used in mdmon context only"
+ "(line %d).\n", __func__, __LINE__);
+ return 1;
+ }
+ dd = malloc(sizeof(*dd));
+ if (!dd) {
+ fprintf(stderr,
+ Name ": malloc failed %s:%d.\n", __func__, __LINE__);
+ return 1;
+ }
+ memset(dd, 0, sizeof(*dd));
+ dd->major = dk->major;
+ dd->minor = dk->minor;
+ dd->index = -1;
+ dd->fd = fd;
+ dd->disk.status = SPARE_DISK;
+ dd->action = DISK_REMOVE;
+
+ if (st->update_tail) {
+ dd->next = super->disk_mgmt_list;
+ super->disk_mgmt_list = dd;
+ }
+
+ return 0;
+}
+
static int store_imsm_mpb(int fd, struct imsm_super *mpb);
static union {
@@ -3574,13 +3622,13 @@ static int create_array(struct supertype *st, int dev_idx)
return 0;
}
-static int _add_disk(struct supertype *st)
+static int mgmt_disk(struct supertype *st)
{
struct intel_super *super = st->sb;
size_t len;
- struct imsm_update_add_disk *u;
+ struct imsm_update_add_remove_disk *u;
- if (!super->add)
+ if (!super->disk_mgmt_list)
return 0;
len = sizeof(*u);
@@ -3591,7 +3639,7 @@ static int _add_disk(struct supertype *st)
return 1;
}
- u->type = update_add_disk;
+ u->type = update_add_remove_disk;
append_metadata_update(st, u, len);
return 0;
@@ -3613,10 +3661,10 @@ static int write_init_super_imsm(struct supertype *st)
/* determine if we are creating a volume or adding a disk */
if (current_vol < 0) {
- /* in the add disk case we are running in mdmon
- * context, so don't close fd's
+ /* in the mgmt (add/remove) disk case we are running
+ * in mdmon context, so don't close fd's
*/
- return _add_disk(st);
+ return mgmt_disk(st);
} else
rv = create_array(st, current_vol);
@@ -4873,10 +4921,9 @@ static int store_imsm_mpb(int fd, struct imsm_super *mpb)
static void imsm_sync_metadata(struct supertype *container)
{
struct intel_super *super = container->sb;
-
+ dprintf("sync metadata: %d\n", super->updates_pending);
if (!super->updates_pending)
return;
-
write_super_imsm(super, 0);
super->updates_pending = 0;
@@ -5165,8 +5212,80 @@ static int disks_overlap(struct intel_super *super, int idx, struct imsm_update_
return 0;
}
+
+static struct dl *get_disk_super(struct intel_super *super, int major, int minor)
+{
+ struct dl *dl = NULL;
+ for (dl = super->disks; dl; dl = dl->next)
+ if ((dl->major == major) && (dl->minor == minor))
+ return dl;
+ return NULL;
+}
+
+static int remove_disk_super(struct intel_super *super, int major, int minor)
+{
+ struct dl *prev = NULL;
+ struct dl *dl;
+
+ prev = NULL;
+ for (dl = super->disks; dl; dl = dl->next) {
+ if ((dl->major == major) && (dl->minor == minor)) {
+ /* remove */
+ if (prev)
+ prev->next = dl->next;
+ else
+ super->disks = dl->next;
+ dl->next = NULL;
+ __free_imsm_disk(dl);
+ dprintf("%s: removed %x:%x\n",
+ __func__, major, minor);
+ break;
+ }
+ prev = dl;
+ }
+ return 0;
+}
+
static void imsm_delete(struct intel_super *super, struct dl **dlp, unsigned index);
+static int add_remove_disk_update(struct intel_super *super)
+{
+ int check_degraded = 0;
+ struct dl *disk = NULL;
+ /* add/remove some spares to/from the metadata/contrainer */
+ while (super->disk_mgmt_list) {
+ struct dl *disk_cfg;
+
+ disk_cfg = super->disk_mgmt_list;
+ super->disk_mgmt_list = disk_cfg->next;
+ disk_cfg->next = NULL;
+
+ if (disk_cfg->action == DISK_ADD) {
+ disk_cfg->next = super->disks;
+ super->disks = disk_cfg;
+ check_degraded = 1;
+ dprintf("%s: added %x:%x\n",
+ __func__, disk_cfg->major,
+ disk_cfg->minor);
+ } else if (disk_cfg->action == DISK_REMOVE) {
+ dprintf("Disk remove action processed: %x.%x\n",
+ disk_cfg->major, disk_cfg->minor);
+ disk = get_disk_super(super,
+ disk_cfg->major,
+ disk_cfg->minor);
+ /* remove spare disks only */
+ if (disk->index == -1) {
+ remove_disk_super(super,
+ disk_cfg->major,
+ disk_cfg->minor);
+ }
+ /* release allocate disk structure */
+ __free_imsm_disk(disk_cfg);
+ }
+ }
+ return check_degraded;
+}
+
static void imsm_process_update(struct supertype *st,
struct metadata_update *update)
{
@@ -5476,31 +5595,24 @@ static void imsm_process_update(struct supertype *st,
super->updates_pending++;
break;
}
- case update_add_disk:
-
+ case update_add_remove_disk: {
/* we may be able to repair some arrays if disks are
- * being added */
- if (super->add) {
+ * being added, check teh status of add_remove_disk
+ * if discs has been added.
+ */
+ if (add_remove_disk_update(super)) {
struct active_array *a;
super->updates_pending++;
- for (a = st->arrays; a; a = a->next)
+ for (a = st->arrays; a; a = a->next)
a->check_degraded = 1;
}
- /* add some spares to the metadata */
- while (super->add) {
- struct dl *al;
-
- al = super->add;
- super->add = al->next;
- al->next = super->disks;
- super->disks = al;
- dprintf("%s: added %x:%x\n",
- __func__, al->major, al->minor);
- }
-
break;
}
+ default:
+ fprintf(stderr, "error: unsuported process update type:"
+ "(type: %d)\n", type);
+ }
}
static void imsm_prepare_update(struct supertype *st,
@@ -5685,6 +5797,7 @@ struct superswitch super_imsm = {
.write_init_super = write_init_super_imsm,
.validate_geometry = validate_geometry_imsm,
.add_to_super = add_to_super_imsm,
+ .remove_from_super = remove_from_super_imsm,
.detail_platform = detail_platform_imsm,
.kill_subarray = kill_subarray_imsm,
.update_subarray = update_subarray_imsm,
--
1.6.4.2
RE: [Patch 00/17] Autorebuild
am 23.11.2010 19:20:27 von Marcin.Labun
> -----Original Message-----
> From: Neil Brown [mailto:neilb@suse.de]
> Sent: Tuesday, November 23, 2010 2:34 AM
> To: Czarnowska, Anna
> Cc: linux-raid@vger.kernel.org; Neubauer, Wojciech; Williams, Dan J;
> Ciechanowski, Ed; Labun, Marcin; Hawrylewicz Czarnowski, Przemyslaw
> Subject: Re: [Patch 00/17] Autorebuild
>
> On Mon, 22 Nov 2010 15:08:27 +0000
> "Czarnowska, Anna" wrote:
>
> > > > 4. With spare-same-slot when there is a cookie and disk has no
> > > metadata then we probably shouldn't look at domains. Just add.
> > > >
> > >
> > > I disagree. We must always check domains.
> > > Why do you think we should ignore domains in that case?
> >
> > Easy example is: we have an array made on two disks sda and sdb.
> > Sda has domain d1. Sdb has domain d2. This is perfectly ok for
> Create.
> > Now sdb fails and we put a new disk in that place. It gets domain d2
> because it is in the same slot as old disk.
> > When the array went degraded and sdb was removed, the domain for the
> array was reduced to d1 only.
> > New disk does not match any more so it is not added.
> >
> > I think we should still add it because we have a file saying that
> this slot belongs to that array.
> > There is also controller domain that has the special meaning but I
> don't think it is a problem.
> > If the user originally created the array spanning different
> controllers why wouldn't we take a replacement occupying the same slot
> as original member?
> >
> > My conclusion is: we should ignore domains when there is a cookie.
>
> Yes, that make sense. Thanks for the explanation.
>
> So when we find a device with a cookie file that identifies a
> particular
> array, we allow that device to be added to that array without further
> reference to domain.
> Sounds good. It should go in the man-page somewhere of course.
>
> >
> > Note that we don't look at domains at all when we add a disk with
> spare metadata. Here it is indeed very much needed.
> > When someone creates some arrays and adds some spares, additionally
> defines domains to keep all related disks together, then he may be
> disappointed seeing that after reboot all spares end up in one
> container anyway. This happens for imsm and because of this issue some
> spare that was meant for a different array may be used in the first
> array from config instead (all spares will be added there regardless of
> domains).
> >
>
> Presumably is required for both -I and -A.
> Normally when assembling an array we ignore domains because if two
> devices
> claim to be in an array, then they need to be assembled together no
> matter
> what domains say.
> But for truly global spares, the metadata doesn't tell us much, so we
> only
> add such a spare to an array for which the domain says it is OK.
>
> This is a little awkward for -I as if we get a spare first we have no
> idea
> what to do with it.
> I think we had an idea once of having a container for global spares.
> We
> could proceed with that, putting spares in that container as they are
> found.
> and maybe have Monitor() move these spares to an active container if
> one is
> found with a domain match. Maybe?
Sounds good. So after Monitor's initial run all spares fall into the right container, and the ones left will have no match.
Maybe it is easier than changing Incremental and Assembly to look for the right container.
We once presented a two-phase assembly that solved that problem (adding spares after all disks are placed in containers).
Marcin
>
> NeilBrown
>
>
>
> > Anna
> >
Re: Devel 3.2 branch issues
am 25.11.2010 09:01:52 von NeilBrown
On Tue, 23 Nov 2010 11:52:13 +1100 Neil Brown wrote:
> On Mon, 22 Nov 2010 22:39:00 +0000
> "Czarnowska, Anna" wrote:
>
> >
> > > by the way, some of the changes in you of the patches you sent have not
> > > been
> > > included in any form. They include:
> > >
> > > - the getinfo_super_disks method. I couldn't see why you need this.
> > > All the
> > > info about the state of the arrays should already be available.
> > > If there is something that you need that we don't have, please
> > > explain and
> > > we can see how best to add it back in.
> >
> > Marcin has already answered this but here is my explanation.
> > Current test devstate[i]==0 is always true for container so any device seems a good candidate to move.
> > To be able to identify members, failed devices and real spares we updated devstate for containers.
> > To find members we can just check which disks are used in subarrays, but a failed disk is removed from subarray after a short while and as soon as it happens we are not able to see a difference between the failed disk and a spare unless we look at metadata.
>
> Thanks. That makes sense. I'll look at the code and see about applying it.
>
OK, I have something, though I haven't tested it.
It uses your getinfo_super_disks and does the following to choose a spare
from an external array. There are a couple of rearrangement patches before
this so it won't apply as-is, but should appear in my devel-3.2 within a few
hours.
NeilBrown
commit 5739e0d007a3eea80f5108d73d444751dbbde1ef
Author: NeilBrown
Date: Thu Nov 25 18:58:27 2010 +1100
Monitor: choose spare correctly for external metadata.
When metadata is managed externally - probably as a container - we
need to examine that metadata to see which devices are spares.
So use the getinfo_super_disks method and use the info returned.
Signed-off-by: NeilBrown
diff --git a/Monitor.c b/Monitor.c
index 5fc18d1..9ba49f2 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -798,6 +798,63 @@ static int choose_spare(struct state *from, struct state *to,
return dev;
}
+static int container_choose_spare(struct state *from, struct state *to,
+ struct domainlist *domlist)
+{
+ /* This is similar to choose_spare, but we cannot trust devstate,
+ * so we need to read the metadata instead
+ */
+
+ struct supertype *st = from->metadata;
+ int fd = open(st->devname, O_RDONLY);
+ int err;
+ struct mdinfo *disks, *d;
+ unsigned long long min_size
+ = min_spare_size_required(to);
+ int dev;
+
+ if (fd < 0)
+ return 0;
+ if (!st->ss->getinfo_super_disks)
+ return 0;
+
+ err = st->ss->load_container(st, fd, NULL);
+ close(fd);
+ if (err)
+ return 0;
+
+ disks = st->ss->getinfo_super_disks(st);
+ st->ss->free_super(st);
+
+ if (!disks)
+ return 0;
+
+ for (d = disks->devs ; d && !dev ; d = d->next) {
+ if (d->disk.state == 0) {
+ struct dev_policy *pol;
+ unsigned long long dev_size;
+ dev = makedev(d->disk.major,d->disk.minor);
+
+ if (min_size &&
+ dev_size_from_id(dev, &dev_size) &&
+ dev_size < min_size)
+ continue;
+
+ pol = devnum_policy(dev);
+ if (from->spare_group)
+ pol_add(&pol, pol_domain,
+ from->spare_group, NULL);
+ if (!domain_test(domlist, pol, to->metadata->ss->name))
+ dev = 0;
+
+ dev_policy_free(pol);
+ }
+ }
+ sysfs_free(disks);
+ return dev;
+}
+
+
static void try_spare_migration(struct state *statelist, struct alert_info *info)
{
struct state *from;
@@ -827,7 +884,11 @@ static void try_spare_migration(struct state *statelist, struct alert_info *info
int devid;
if (!check_donor(from, to, domlist))
continue;
- devid = choose_spare(from, to, domlist);
+ if (from->metadata->ss->external)
+ devid = container_choose_spare(
+ from, to, domlist);
+ else
+ devid = choose_spare(from, to, domlist);
if (devid > 0
&& move_spare(from, to, devid, info))
break;
RE: Devel 3.2 branch issues
am 25.11.2010 11:28:15 von anna.czarnowska
> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Neil Brown
> Sent: Thursday, November 25, 2010 9:02 AM
> To: Czarnowska, Anna
> Cc: linux-raid@vger.kernel.org; Neubauer, Wojciech; Williams, Dan J;
> Ciechanowski, Ed; Labun, Marcin; Hawrylewicz Czarnowski, Przemyslaw
> Subject: Re: Devel 3.2 branch issues
>
> On Tue, 23 Nov 2010 11:52:13 +1100 Neil Brown wrote:
>
> > On Mon, 22 Nov 2010 22:39:00 +0000
> > "Czarnowska, Anna" wrote:
> >
> > >
> > > > by the way, some of the changes in you of the patches you sent
> have not
> > > > been
> > > > included in any form. They include:
> > > >
> > > > - the getinfo_super_disks method. I couldn't see why you need
> this.
> > > > All the
> > > > info about the state of the arrays should already be available.
> > > > If there is something that you need that we don't have, please
> > > > explain and
> > > > we can see how best to add it back in.
> > >
> > > Marcin has already answered this but here is my explanation.
> > > Current test devstate[i]==0 is always true for container so any
> device seems a good candidate to move.
> > > To be able to identify members, failed devices and real spares we
> updated devstate for containers.
> > > To find members we can just check which disks are used in
> subarrays, but a failed disk is removed from subarray after a short
> while and as soon as it happens we are not able to see a difference
> between the failed disk and a spare unless we look at metadata.
> >
> > Thanks. That makes sense. I'll look at the code and see about
> applying it.
> >
>
> OK, I have something, though I haven't tested it.
>
> It uses your getinfo_super_disks and does the following to choose a
> spare
> from an external array. There are a couple of rearrangement patches
> before
> this so it won't apply as-it, but should appear in my devel-3.2 within
> a few
> hours.
>
> NeilBrown
>
Well, this didn't help.
In the set of tests I have just posted even the basic ones fail for imsm.
For native there are still some problems with tests:
5c - spare not moved to degraded array in the same domain. This is a really basic test with 4 arrays instead of 2.
9 - spare moved between different metadata arrays
13 - spare moved despite action=include which doesn't allow migration
Test9 run in scan mode generates a segmentation fault.
I will have a look at this in the debugger and give you more info on the reasons later on.
Anna
RE: Devel 3.2 branch issues
on 26.11.2010 19:23:34 by anna.czarnowska
> Well, this didn't help.
> In the set of tests I have just posted even the basic ones fail for
> imsm.
> For native there are still some problems with tests:
> 5c - spare not moved to degraded array in the same domain. This is
> really basic test with 4 arrays instead of 2.
> 9 - spare moved between different metadata arrays
> 13 - spare moved despite action=include which doesn't allow migration
>
> Test9 run in scan mode generates a segmentation fault.
>
> I will have a look at this in debugger and give you more info on the
> reasons later on.
>
> Anna
After applying yesterday's fixes, test5 and test9 no longer fail for native.
Test6 often fails because Monitor keeps removing and re-adding a spare
that is too small to add to the degraded array. The test fails when we see it removed.
I have prepared a few further fixes to address possible problems with Monitor,
as described in each patch.
Spare migration now also works for imsm (test12 and 13 still fail).
Test12 - two spares are taken when just one is needed.
Test13 - action=include so the spare should not be moved.
Scan mode still needs to be investigated.
Anna
Re: Devel 3.2 branch issues
on 28.11.2010 23:59:42 by NeilBrown
On Fri, 26 Nov 2010 18:23:34 +0000 "Czarnowska, Anna"
wrote:
> > Well, this didn't help.
> > In the set of tests I have just posted even the basic ones fail for
> > imsm.
> > For native there are still some problems with tests:
> > 5c - spare not moved to degraded array in the same domain. This is
> > really basic test with 4 arrays instead of 2.
> > 9 - spare moved between different metadata arrays
> > 13 - spare moved despite action=include which doesn't allow migration
> >
> > Test9 run in scan mode generates a segmentation fault.
> >
> > I will have a look at this in debugger and give you more info on the
> > reasons later on.
> >
> > Anna
>
> After applying yesterday's fixes test5 and test9 don't fail any more for native.
>
> Test6 often fails because Monitor keeps removing and re-adding spare
> that is too small to add to degraded array. Test fails when we see it removed.
>
> I have prepared few further fixes to address possible problems with Monitor
> as described in each patch.
Thanks for these patches and more particularly for all the testing effort,
finding and fixing my bugs!
I have applied them all.
NeilBrown
>
> Spare migration works now also for imsm. (test12 and 13 still fail).
> Test12 - two spares are taken when just one needed.
> Test13 - action=include so spare should not be moved
>
> Scan mode still needs to be investigated.
>
> Anna
RE: [Patch 00/17] Autorebuild
on 09.12.2010 12:40:43 by anna.czarnowska
Back to spares and domains...
> > > Note that we don't look at domains at all when we add a disk with
> > spare metadata. Here it is indeed very much needed.
> > > When someone creates some arrays and adds some spares, additionally
> > defines domains to keep all related disks together, then he may be
> > disappointed seeing that after reboot all spares end up in one
> > container anyway. This happens for imsm and because of this issue
> some
> > spare that was meant for a different array may be used in the first
> > array from config instead (all spares will be added there regardless
> of
> > domains).
> > >
> >
> > Presumably is required for both -I and -A.
> > Normally when assembling an array we ignore domains because if two
> > devices
> > claim to be in an array, then they need to be assembled together no
> > matter
> > what domains say.
> > But for truly global spares, the metadata doesn't tell us much, so we
> > only
> > add such a spare to an array for which the domain says it is OK.
The problem is: we don't know the domain of an array until it is fully assembled
(we know all devices in it).
So in Assembly we can either
- mark the spares and try them later: when choosing devices for an array we
skip spares in the first run, then add a second run that chooses
spares for the array with a domain check against all members found in the first run;
this could be done with uuid_match_any for spares.
or
- put spares in a separate container and let Monitor take them out when they are needed.
Here spares can't match any array.
In both cases we must stop giving all imsm spares the uuid of the first array from config.
> >
> > This is a little awkward for -I as if we get a spare first we have no
> > idea
> > what to do with it.
> > I think we had an idea once of having a container for global spares.
> > We
> > could proceed with that, putting spares in that container as they are
> > found.
This is easily achieved for Incremental by just giving one uuid to all imsm spares.
This uuid cannot be used by any array. Probably just 0:0:0:0 would do.
This solution requires adding a special case to
Assemble or else no spares will be assembled.
If uuid_match_any is used for spares in Incremental then we don't know where to add them, and we exit.
We could take the list of all arrays from mdstat (not config) and try matching domains,
but then the final result will depend on the order of devices, as some arrays may appear later
and some may not have a complete set of devices when we look at the spare.
Probably the simplest solution is to put all imsm spares in a separate container.
Then we don't risk that any of them will be used in an array that is in another domain,
and Monitor can always move them to their domain when needed.
Monitor could also clean out this container when starting, even if there were no degraded arrays.
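As a rough sketch of the single-uuid idea (hypothetical helper, not the actual
super-intel.c code): the metadata handler would report a reserved all-zero uuid
for any disk that is only a spare, so that -I collects bare imsm spares into one
dedicated container instead of attaching them to the first array from config.

	#include <string.h>

	/* Sketch only: spares get the reserved 0:0:0:0 uuid. */
	static void imsm_uuid_for_disk(int is_spare, const int array_uuid[4],
				       int uuid_out[4])
	{
		if (is_spare)
			memset(uuid_out, 0, 4 * sizeof(int));	/* 0:0:0:0 */
		else
			memcpy(uuid_out, array_uuid, 4 * sizeof(int));
	}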
> > and maybe have Monitor() move these spares to an active container if
> > one is
> > found with a domain match. Maybe?
>
> Sounds good. So after Monitor initial run all spares fall in the right
> container, and the ones left will have no match.
> Maybe it is easier than changing Incremental and Assembly to look for
> right container.
Assembly would still require modification.
> > We once presented a two-phase assembly that solved that problem
> > (adding spares after all disks are placed in containers).
This added spares where they should be, but it would also require dealing with spares properly in Incremental.
Or should we try to do the right thing in Assemble but throw all imsm spares into one container in Incremental?
What do you think?
Anna
Re: [Patch 00/17] Autorebuild
on 13.12.2010 01:21:02 by NeilBrown
On Thu, 9 Dec 2010 11:40:43 +0000 "Czarnowska, Anna"
wrote:
> Back to spares and domains...
>
> > > > Note that we don't look at domains at all when we add a disk with
> > > spare metadata. Here it is indeed very much needed.
> > > > When someone creates some arrays and adds some spares, additionally
> > > defines domains to keep all related disks together, then he may be
> > > disappointed seeing that after reboot all spares end up in one
> > > container anyway. This happens for imsm and because of this issue
> > some
> > > spare that was meant for a different array may be used in the first
> > > array from config instead (all spares will be added there regardless
> > of
> > > domains).
> > > >
> > >
> > > Presumably is required for both -I and -A.
> > > Normally when assembling an array we ignore domains because if two
> > > devices
> > > claim to be in an array, then they need to be assembled together no
> > > matter
> > > what domains say.
> > > But for truly global spares, the metadata doesn't tell us much, so we
> > > only
> > > add such a spare to an array for which the domain says it is OK.
>
> The problem is: we don't know the domain of an array until it is fully assembled
> (we know all devices in it).
>
> So in Assembly we can either
> - mark the spares and try them later (when choosing devices for an array we
> skip spares in the first run and add a second run choosing
> spares for an array with domain check against all members found in first run)
> this could be done with uuid_match_any for spares.
> or
> - put spares in separate container and let Monitor take them out when they are needed.
> Here spares can't match any array.
>
> In both cases we must stop giving all imsm spares the uuid of the first array from config.
>
> > >
> > > This is a little awkward for -I as if we get a spare first we have no
> > > idea
> > > what to do with it.
> > > I think we had an idea once of having a container for global spares.
> > > We
> > > could proceed with that, putting spares in that container as they are
> > > found.
>
> This is easily achieved for Incremental by just giving one uuid to all imsm spares.
> This uuid cannot be used by any array. Probably just 0:0:0:0 would do.
> This solution requires adding a special case to
> Assemble or else no spares will be assembled.
>
> If uuid_match_any is used for spares in Incremental then we don't know where to add them and exit.
> We could take the list of all arrays from mdstat (not config) and try matching domains
> but then the final result will depend on the order of devices as some arrays may appear later
> and some may not have complete set of devices when we look at the spare.
>
> Probably the simplest solution is to put all imsm spares in separate container.
> We don't risk then that any of them will be used in an array that is in other domain
> and Monitor can always move them to their domain when needed.
> Monitor could also clean out this container when starting, even if there were no degraded arrays.
>
> > > and maybe have Monitor() move these spares to an active container if
> > > one is
> > > found with a domain match. Maybe?
> >
> > Sounds good. So after Monitor initial run all spares fall in the right
> > container, and the ones left will have no match.
> > Maybe it is easier then changing Incremental and Assembly to look for
> > right container.
>
> Assembly would still require modification.
>
> > Once we have presented two phase assembly that solved that problem
> > (adding spares after all disks are placed in containers).
>
> This added spares where they should be. But would require dealing with spares properly also in Incremental.
> Or should we try to do the right thing in Assemble but throw all imsm spares into one container in Incremental?
> What do you think?
>
> Anna
>
I like the idea of using a uuid of 0:0:0:0 for the container of spares.
There might need to be some subtleties in there to ensure spares with
different metadata stay separate, but I doubt that would be much of a
complication.
I don't know what you mean by trying to "do the right thing in Assemble".
It isn't clear that there is always one correct place to put a spare, though
there could be several suitable places. So you could do "a right thing", but
not necessarily "the right thing" ??
While I like the idea of always having a separate container for all spares, I
see one problem with it.
It should work fine when using --incremental or auto assembly (-As), but if
you explicitly identify an array to start, it would be started without any
spares which might not be what you would expect.
Assemble already has a mechanism to collect all devices that could possibly
be part of an array, and then to reject those that don't fit - typically
because there is another device which claims the same slot but has a newer
event count.
We could use the same mechanism to include any global spare found into an
array being assembled. Then once we are nearly ready to go, we determine the
domain of the array based on the domains of the non-spares and reject any
spares that don't match that domain. I think this is similar to the
two-pass approach that was suggested previously, but fits more cleanly into
the existing infrastructure.
This would mean that any spare would be associated with the first array to be
assembled for which it was compatible. This is a little non-deterministic,
but I don't think that is a problem.
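A minimal sketch of that rejection step, with simplified types and names rather
than the actual Assemble.c code: derive the array's domain from the non-spare
members that were collected, then drop any global spare that falls outside it.

	#include <string.h>

	struct cand {			/* simplified stand-in for a collected device */
		const char *domain;	/* domain derived from the device's path */
		int is_spare;
		int rejected;
		struct cand *next;
	};

	/* Reject global spares whose domain differs from that of the real members. */
	static void reject_foreign_spares(struct cand *list)
	{
		const char *array_domain = NULL;
		struct cand *c;

		for (c = list; c; c = c->next)
			if (!c->is_spare && c->domain) {
				array_domain = c->domain;
				break;
			}

		for (c = list; c; c = c->next)
			if (c->is_spare &&
			    (!array_domain || !c->domain ||
			     strcmp(c->domain, array_domain) != 0))
				c->rejected = 1;
	}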
For --incremental we would still need global spares to go into a separate
container, though when it comes time to start an array, we could at that
point migrate spares from the spare-container to the new array...
I suggest we start out by just having a single container for imsm spares and
make sure that --monitor can move devices out of there as required. Once
that works reliably we can worry about making the unusual cases work better,
and possibly migrating spares more proactively.
NeilBrown
[PATCH] fix: Monitor doesn't return after starting daemon
on 14.12.2010 15:47:27 by anna.czarnowska
From 8b2465b0d314cc93bba5797dbad9fd2813f0a79e Mon Sep 17 00:00:00 2001
From: Anna Czarnowska
Date: Tue, 14 Dec 2010 12:26:33 +0100
Subject: [PATCH] fix: Monitor doesn't return after starting daemon
Cc: linux-raid@vger.kernel.org, Williams, Dan J , Ciechanowski, Ed
Because both parent and child processes continue after make_daemon succeeds.
Signed-off-by: Anna Czarnowska
---
Monitor.c | 10 ++++++----
1 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/Monitor.c b/Monitor.c
index e7f6d03..4ae1d2b 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -152,9 +152,11 @@ int Monitor(struct mddev_dev *devlist,
info.mailfrom = mailfrom;
info.dosyslog = dosyslog;
- if (daemonise)
- if (make_daemon(pidfile))
- return 1;
+ if (daemonise) {
+ int rv = make_daemon(pidfile);
+ if (rv >= 0)
+ return rv;
+ }
if (share)
if (check_one_sharer(scan))
@@ -272,7 +274,7 @@ static int make_daemon(char *pidfile)
dup2(0,1);
dup2(0,2);
setsid();
- return 0;
+ return -1;
}
static int check_one_sharer(int scan)
--
1.7.1
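For reference, a self-contained sketch of the convention the fix relies on (not
the mdadm code itself): the forking helper returns a value >= 0 in the parent,
which the caller simply returns, and -1 in the child, which carries on as the
daemon.

	#include <unistd.h>

	static int make_daemon_sketch(void)
	{
		pid_t pid = fork();

		if (pid > 0)
			return 0;	/* parent: caller returns success */
		if (pid < 0)
			return 1;	/* fork failed: caller returns an error */
		setsid();		/* child becomes the monitoring daemon */
		return -1;		/* child: caller keeps running */
	}

	int main(void)
	{
		int rv = make_daemon_sketch();

		if (rv >= 0)
			return rv;	/* parent (or error path) exits here */
		for (;;)		/* child continues monitoring ... */
			sleep(60);
	}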
Re: [PATCH] fix: Monitor doesn't return after starting daemon
on 14.12.2010 22:58:14 by NeilBrown
On Tue, 14 Dec 2010 14:47:27 +0000 "Czarnowska, Anna"
wrote:
> >From 8b2465b0d314cc93bba5797dbad9fd2813f0a79e Mon Sep 17 00:00:00 2001
> From: Anna Czarnowska
> Date: Tue, 14 Dec 2010 12:26:33 +0100
> Subject: [PATCH] fix: Monitor doesn't return after starting daemon
> Cc: linux-raid@vger.kernel.org, Williams, Dan J , Ciechanowski, Ed
>
> Because both parent and child process continue after make_daemon succeeds.
Applied, thanks.
NeilBrown
>
> Signed-off-by: Anna Czarnowska
> ---
> Monitor.c | 10 ++++++----
> 1 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/Monitor.c b/Monitor.c
> index e7f6d03..4ae1d2b 100644
> --- a/Monitor.c
> +++ b/Monitor.c
> @@ -152,9 +152,11 @@ int Monitor(struct mddev_dev *devlist,
> info.mailfrom = mailfrom;
> info.dosyslog = dosyslog;
>
> - if (daemonise)
> - if (make_daemon(pidfile))
> - return 1;
> + if (daemonise) {
> + int rv = make_daemon(pidfile);
> + if (rv >= 0)
> + return rv;
> + }
>
> if (share)
> if (check_one_sharer(scan))
> @@ -272,7 +274,7 @@ static int make_daemon(char *pidfile)
> dup2(0,1);
> dup2(0,2);
> setsid();
> - return 0;
> + return -1;
> }
>
> static int check_one_sharer(int scan)
RE: Autorebuild, new dynamic udev rules for hot-plugs
on 23.12.2010 16:44:26 by unknown
> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Hawrylewicz Czarnowski, Przemyslaw
> Sent: Tuesday, November 23, 2010 6:02 PM
> To: Neil Brown
> Cc: linux-raid@vger.kernel.org; Neubauer, Wojciech; Ciechanowski, Ed;
> Labun, Marcin; Czarnowska, Anna; Williams, Dan J
> Subject: RE: Autorebuild, new dynamic udev rules for hot-plugs
>
> > -----Original Message-----
> > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> > owner@vger.kernel.org] On Behalf Of Neil Brown
> > Sent: Tuesday, November 23, 2010 2:17 AM
> > To: Williams, Dan J
> > Cc: Hawrylewicz Czarnowski, Przemyslaw; linux-raid@vger.kernel.org;
> > Neubauer, Wojciech; Ciechanowski, Ed; Labun, Marcin; Czarnowska, Anna
> > Subject: Re: Autorebuild, new dynamic udev rules for hot-plugs
> >
> > On Mon, 22 Nov 2010 16:11:41 -0800
> > Dan Williams wrote:
> >
> > > On 11/22/2010 3:50 PM, Hawrylewicz Czarnowski, Przemyslaw wrote:
> > > >> Four comments.
> > > >>
> > > >> 1/ I wouldn't write a file in /lib/udev/rules.d/
> > > >> I think it should be written to "/dev/.udev/rules.d/"
> > > >> which is referred to as the "temporary rules directory"
> > > >> in the udev documentation.
> > > > I am not sure if it is what we are looking for. Temporary means they
> > > > disappear after reboot. It is OK as cold-plug does not need support for
> > > > bare disks (or maybe I am wrong?). But in such case, one who wants to use
> > > > autorebuild should invoke mdadm --activate-domains for example in
> > > > /etc/init.d/local.boot or somewhere else. Second idea here is to use
> > > > ActivateDomain() when one starts monitor with autorebuild enabled. Which
> > > > one? I would prefer to leave it as it was written initially (considering
> > > > comment #4). Then, if one removes policies from config, invoking --
> > > > activate-domains should reset/remove rules (but see #3)
> > >
> > > The intent was always to have this be something reinitialized at boot.
> > > Putting these in the temporary rule directory also precludes them from
> > > being added to the initramfs where they are not needed / potentially
> > > confusing.
> > >
> > > The other intent was to only match the pci paths for the controllers we
> > > cared about. That does not appear to be a part of this patch.
> >
> > Can you define "we cared about". Don't we care about everything listed in
> > mdadm.conf??
> >
> >
> > >
> > > >
> > > >>
> > > >> 2/ It would be good to process the type=disk or type=part part of the
> > > >> policy into the rules file as well.
> > > > OK
> > > >
> > > >>
> > > >> 3/ I'm not very comfortable with hard-coding the name of the
> > > >> file to be created in the rules.d directory. Maybe usage coul=
d
> be
> > > >> --activate-domains=3D63-md-whatever
> > > > Good idea, but only if we store our rules in /dev/.udev/rules.d.
> > Otherwise it would be difficult to maintain all generated rules and
> remove
> > the old ones... I would leave default if not given by user, but one can
> > pass any file name.
> > >
> > > The issue is that this namespace belongs to the distro and since they
> > > need to modify initscripts to turn this feature on might as well dump
> > > the entirety of the naming responsibility to the user.
> > >
> > > >> 4/ I don't think it is good to have an incomplete file in rules.d that
> > > >> udev might accidentally read. We should create the file with a name
> > > >> with a leading '.' (assuming udev ignores those, I haven't checked) and
> > > >> then rename it after it has been completely written.
> > > > You're right. In theory, such partial udev rules are excluded when udev
> > > > can't interpret them properly. I have looked into udev's sources and found
> > > > that it looks for "*.rules" files. All other file extensions are ignored.
> > > > Files with leading dots are also omitted. I would prefer to
> > > > create .temp file and then rename it into .rules.
> > >
> > > There must be an existing convention for this sort of thing, if so
> > > let's not invent another one.
> I haven't found anything similar. Just mountall, but it writes a single line
> in one "shot"... Both options, with extension or with leading dot, will work.
>
> >
> > We could avoid both these issues by just writing the new rules file to
> > stdout.
> > Then when the init script gets it wrong, it isn't our fault :-)
> I like that idea at this stage. Later on we might develop a better solution
> (see below)
>
> >
> > But I don't really like that. At least there should be a simple and
> > uniform
> > way to propagate any mdadm.conf changes into udev.
> >
> > Maybe the name of the rules file should be given in mdadm.conf, and e.g.
> > mdadm --check-config
> > would report any syntax errors, report any inconsistencies with current
> > arrays, and update the udev file if necessary.
> >
> > Maybe leave that for 3.2.1, and just support '--activate-domains=filename'
> > for now.
> Let me extend this thought a little. As I mentioned above I like the idea
> of writing the rule to stdout. Or if somebody wants to pass a file name, just
> write the file in the current directory - similar to the way one creates
> mdadm.conf with mdadm --examine (but with a small improvement :)
Replying to my last post, I would like to present a new patch based on the
above scheme.
I also want to raise the discussion again, as this thread has been dead for a
while...
Please comment.
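To make the attached proposal concrete, an illustrative sketch (simplified
names, not the patch itself) of the core of --activate-domains: every path-based
POLICY line whose action allows grabbing bare disks becomes one udev rule that
hands matching hot-plugged devices to mdadm --incremental.

	#include <stdio.h>

	static const char rule_fmt[] =
		"ACTION==\"add\", KERNEL!=\"md*\", ENV{ID_PATH}==\"%s\", "
		"RUN+=\"/sbin/mdadm --incremental $env{DEVNAME}\"\n";

	struct path_policy {		/* simplified stand-in for a POLICY line */
		const char *path;	/* path= glob from mdadm.conf */
		int allows_bare;	/* action is at least spare-same-slot */
	};

	/* Write one rule per qualifying policy; the caller discards the file
	 * if nothing was written.  Sketch only. */
	static int write_rules(FILE *out, const struct path_policy *pol, int n)
	{
		int i, written = 0;

		for (i = 0; i < n; i++)
			if (pol[i].path && pol[i].allows_bare) {
				fprintf(out, rule_fmt, pol[i].path);
				written++;
			}
		return written;
	}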
> But the general problem is to find a "simple and uniform way". Something
> distro-independent. Rules should be prepared once, right after the config file
> is finished. Fire and forget :)
> We need to handle hot/cold-plug events for action=spare, and hot-plug
> events for action=spare-same-slot. Considering we use the temporary udev rules
> directory, they need to be regenerated each reboot, paying attention to the
> moment when we have to do it so we handle all cases. What options then?
>
> As a last resort, maybe just a note in man pages with possibilities,
> leaving implementation to user/admin?
>
> >
> > ???
> >
> > NeilBrown
> >
> >
[Attachment: 0001-Dynamic-hot-plug-udev-rules-for-policies.patch, 8176 bytes,
base64-encoded. The patch adds an "mdadm --activate-domains[=filename]" command
that generates udev hot-plug rules for bare disks from the path-based domain
policies in mdadm.conf, writing them to stdout or to the given file.]