[AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy

on 01.10.2010 14:36:48 by Marcin.Labun

From f423b226f10cfe3b416c5e0580dde45cd8ca887d Mon Sep 17 00:00:00 2001
From: Marcin Labun
Date: Wed, 29 Sep 2010 06:12:38 +0200
Subject: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy

This is an updated series of patches that implements autorebuild functionality
in the mdadm monitor, based on the new policy code.

Autorebuild Monitoring application:
The autorebuild monitor is part of the monitor application (mdadm -F). In the
current mdadm monitor code, the autorebuild feature is based on spare-group
assignment in the mdadm.conf file and works only for native metadata.
The new autorebuild implementation works for all metadata types. It uses
the concept of domains in mdadm.conf introduced by Neil Brown.
The monitoring application periodically checks the state of active MD arrays
and triggers a rebuild if there are eligible spare disks in other containers.
Degraded arrays are checked one by one. For each array, a potential spare disk
is searched for. If the spare disk matches the domain of the degraded array and
the domain action allows spare sharing, the spare is moved using the existing
Manage_subdevs function. If the addition fails, the spare device is moved back
to its original container and the next potential spare is tried. This is
repeated until all arrays have been checked; the process then sleeps for a
configured period.
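
As a rough illustration of the pass described above, here is a minimal sketch
in C. It is not the code in Monitor.c: the structures, the try_add() stand-in
for the Manage_subdevs call, the size-based failure check and the single
sharing_allowed flag are all hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; these are not mdadm's real structures. */
struct spare {
    const char *domain;          /* domain the disk belongs to */
    int container;               /* container currently holding the disk */
    unsigned long long size;     /* usable size */
};

struct array {
    const char *domain;
    int container;
    bool degraded;
    unsigned long long min_size; /* smallest disk the array can rebuild onto */
};

/* Stand-in for adding the disk (the real code uses Manage_subdevs).
 * Assumption: the only failure modelled here is a disk that is too small. */
static bool try_add(const struct spare *sp, const struct array *arr)
{
    return sp->size >= arr->min_size;
}

/* One monitoring pass; the caller sleeps between passes.
 * sharing_allowed stands for "spare sharing is permitted" in this model.
 * Returns the number of spares moved. */
size_t autorebuild_pass(struct array *arrays, size_t n_arrays,
                        struct spare *spares, size_t n_spares,
                        bool sharing_allowed)
{
    size_t moved = 0;

    assert(arrays != NULL && spares != NULL);
    if (!sharing_allowed)
        return 0;

    for (size_t a = 0; a < n_arrays; a++) {
        struct array *arr = &arrays[a];

        if (!arr->degraded)
            continue;
        for (size_t s = 0; s < n_spares; s++) {
            struct spare *sp = &spares[s];
            int origin = sp->container;

            if (origin == arr->container)
                continue;               /* only spares from other containers */
            if (strcmp(sp->domain, arr->domain) != 0)
                continue;               /* domain must match */

            sp->container = arr->container;   /* move the spare */
            if (try_add(sp, arr)) {
                /* Model only: treat the array as handled once the
                 * rebuild has been triggered. */
                arr->degraded = false;
                moved++;
                break;
            }
            sp->container = origin;     /* addition failed: move it back */
        }
    }
    return moved;
}
```

A caller would invoke autorebuild_pass() once per polling interval and sleep
in between.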

The design of the mdadm monitor requires that only one autorebuild process is
running. Therefore a new option, --no-sharing, has been added to Monitor mode,
and spare sharing is allowed in only one instance of Monitor. The user can still
run the other Monitoring functions in multiple instances.

The built-in autorebuild assumptions are:
1/ spares are shared between arrays of the same metadata type
2/ spares are moved only from containers/volumes that are not degraded
3/ spares are moved to containers/volumes lacking a *good* spare (size)
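
The three assumptions can be read as a single predicate. The sketch below is
only an illustration in C: struct container_state, its fields and
may_move_spare() are hypothetical names, and a "good" spare is assumed to mean
one at least as large as the destination needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of a container/volume; not an mdadm structure. */
struct container_state {
    const char *metadata;               /* metadata type name */
    bool degraded;
    unsigned long long best_spare_size; /* largest spare held, 0 if none */
    unsigned long long needed_size;     /* minimum size of a "good" spare */
};

/* Returns true when a spare may be moved from 'from' to 'to'. */
bool may_move_spare(const struct container_state *from,
                    const struct container_state *to)
{
    assert(from != NULL && to != NULL);

    /* 1/ same metadata type on both sides */
    if (strcmp(from->metadata, to->metadata) != 0)
        return false;
    /* 2/ never take a spare away from a degraded container */
    if (from->degraded)
        return false;
    /* 3/ the destination must lack a spare that is large enough */
    if (to->best_spare_size >= to->needed_size)
        return false;
    return true;
}
```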


0001-Monitor-set-err-on-arrays-not-in-mdstat.patch
0002-Monitor-removed-spare-group-based-spare-sharing-code.patch
0003-mdadm-added-no-sharing-parameter-for-Monitor-mode.patch
0004-Monitor-link-container-volumes-in-statelist.patch
0005-imsm-create-mdinfo-list-of-disks-in-a-container-from.patch
0006-Monitor-autorebuild-funcionality-added.patch
0007-Monitor-Respect-policy-in-auto-rebuild-in-mdadm-moni.patch
0008-Monitor-Helper-functions-added-for-spare_sharing-in-.patch


Monitor.c | 605 +++++++++++++++++++++++++++++++++++++++++++++++----------
ReadMe.c | 2 +
mdadm.c | 8 +-
mdadm.h | 8 +-
super-intel.c | 53 +++++
5 files changed, 565 insertions(+), 111 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy

on 19.10.2010 02:40:27 by NeilBrown

On Fri, 1 Oct 2010 13:36:48 +0100
"Labun, Marcin" wrote:

> From f423b226f10cfe3b416c5e0580dde45cd8ca887d Mon Sep 17 00:00:00 2001
> From: Marcin Labun
> Date: Wed, 29 Sep 2010 06:12:38 +0200
> Subject: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy
>
> This is updated series of patches forming autorebuild functionality in mdadm
> monitor based on new policy code.

Hi Marcin,
thanks for this, and apologies for not replying sooner.
I've had a bit of a look and some of it seems good.
I haven't had a thorough look yet as I am in the middle of doing some fairly
serious refactoring of mdadm (the supertype, and mdinfo structures are going
to be heavily changed and largely merged - some super_switch methods will
disappear (e.g. getinfo_super) and others will appear (load_container)).
Once I have finished that I will review your code more thoroughly and merge
it into the new code base.

One concern I do have is patch 0002 which removes the spare-group based
spare migration. That functionality needs to stay, though obviously the
implementation can change. I imagine the 'spare-group' information would be
added to each member device as a 'domain' name.
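
One way to picture that mapping is the minimal sketch below, in C. It is an
illustration only, not the planned mdadm implementation: struct member_dev,
MAX_DOMAINS and the three helpers are hypothetical names. Each member device
of an array with a spare-group gets the group name as one more domain, and two
devices may exchange spares when they share a domain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DOMAINS 8   /* arbitrary limit for this sketch */

/* Hypothetical per-device record; not an mdadm structure. */
struct member_dev {
    const char *name;
    const char *domains[MAX_DOMAINS];
    size_t n_domains;
};

/* Add a domain name unless already present; false if the table is full. */
bool add_domain(struct member_dev *dev, const char *domain)
{
    assert(dev != NULL && domain != NULL);
    for (size_t i = 0; i < dev->n_domains; i++)
        if (strcmp(dev->domains[i], domain) == 0)
            return true;
    if (dev->n_domains == MAX_DOMAINS)
        return false;
    dev->domains[dev->n_domains++] = domain;
    return true;
}

/* Tag every member of one array with the array's spare-group.
 * spare_group may be NULL when the array has none configured. */
void apply_spare_group(struct member_dev *members, size_t n,
                       const char *spare_group)
{
    if (spare_group == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        add_domain(&members[i], spare_group);
}

/* Two devices may exchange spares if they have a domain in common. */
bool share_domain(const struct member_dev *a, const struct member_dev *b)
{
    for (size_t i = 0; i < a->n_domains; i++)
        for (size_t j = 0; j < b->n_domains; j++)
            if (strcmp(a->domains[i], b->domains[j]) == 0)
                return true;
    return false;
}
```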

Also it is best not to remove functionality and then re-add it in a different
way, but rather to make sure the functionality works after every change and
just gets extended at various points.

Thanks,
NeilBrown




Re: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy

on 19.10.2010 08:54:06 by dan.j.williams

On 10/18/2010 5:40 PM, Neil Brown wrote:
> On Fri, 1 Oct 2010 13:36:48 +0100
> "Labun, Marcin" wrote:
>
>> From f423b226f10cfe3b416c5e0580dde45cd8ca887d Mon Sep 17 00:00:00 2001
>> From: Marcin Labun
>> Date: Wed, 29 Sep 2010 06:12:38 +0200
>> Subject: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy
>>
>> This is updated series of patches forming autorebuild functionality in mdadm
>> monitor based on new policy code.
>
> Hi Marcin,
> thanks for this, and apologies for not replying sooner.
> I've had a bit of a look and some of it seems good.
> I haven't had a thorough look yet as I am in the middle of doing some fairly
> serious refactoring of mdadm (the supertype, and mdinfo structures are going
> to be heavily changed and largely merged - some super_switch methods will
> disappear (e.g. getinfo_super) and others will appear (load_container)).
> Once I have finished that I will review your code more thoroughly and merge
> it into the new code base.
>
> One concern I do have is patch 0002 which removes the spare-group based
> spare migration. That functionality needs to stay, though obviously the
> implementation can change. I imagine the 'spare-group' information would be
> added to each member device as a 'domain' name.
>
> Also it is best not to remove functionality and then re-add it a different
> way, but rather to make sure the functionality works after every change, but
> just gets extended at various points.

Hi Neil,

I made a similar comment on this patch during our internal review. We
also talked about the need for superswitch methods that can be used to
1/ determine which devices in a container are spares versus stale disks
2/ determine the minimum size a bare disk needs to be to join a container.
I'll wait to see if these items will be easier to determine with the new
mdinfo/supertype refactoring.
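
To make the two items concrete, here is a minimal sketch in C of what such
per-metadata hooks could look like. Everything in it is hypothetical: struct
container_ops, its two methods and the example implementation are not existing
superswitch members, and treating a disk as stale when its metadata generation
is older than the container's is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum disk_role { DISK_ACTIVE, DISK_SPARE, DISK_STALE };

/* Hypothetical records; not mdadm structures. */
struct disk {
    unsigned long long size;
    unsigned long long generation;  /* metadata generation seen on the disk */
    bool in_use;                    /* currently a member of a volume */
};

struct container {
    unsigned long long generation;
    unsigned long long min_member_size;
};

/* Hypothetical per-metadata hooks covering items 1/ and 2/. */
struct container_ops {
    enum disk_role (*classify_disk)(const struct container *c,
                                    const struct disk *d);
    unsigned long long (*min_acceptable_size)(const struct container *c);
};

static enum disk_role example_classify(const struct container *c,
                                       const struct disk *d)
{
    if (d->generation < c->generation)
        return DISK_STALE;          /* assumption: older metadata is stale */
    return d->in_use ? DISK_ACTIVE : DISK_SPARE;
}

static unsigned long long example_min_size(const struct container *c)
{
    return c->min_member_size;
}

const struct container_ops example_ops = {
    .classify_disk = example_classify,
    .min_acceptable_size = example_min_size,
};

/* Count the disks the metadata handler reports as genuine spares. */
size_t count_spares(const struct container_ops *ops,
                    const struct container *c,
                    const struct disk *disks, size_t n)
{
    size_t spares = 0;

    assert(ops != NULL && c != NULL);
    for (size_t i = 0; i < n; i++)
        if (ops->classify_disk(c, &disks[i]) == DISK_SPARE)
            spares++;
    return spares;
}

/* Can a bare disk of the given size join the container? */
bool bare_disk_can_join(const struct container_ops *ops,
                        const struct container *c,
                        unsigned long long size)
{
    return size >= ops->min_acceptable_size(c);
}
```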

Other notes:
The --activate-domains option [1], which validates the configuration file and
installs custom/filtered udev rules for the ports we care about, seemed
like a good idea at the time. Now that things are a bit further along,
do you have a better solution in mind, or is this still the approach we
want to take? Przemek currently has a patch to filter all block device
events through mdadm to query the configuration file for domain events,
which seems like overkill, if not a performance problem, for environments
with a large disk count.

We also talked about migration, but I'll put those details in a separate
thread.

Thanks,
Dan

[1]: http://marc.info/?l=linux-raid&m=127001124615043&w=2

RE: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy

on 20.10.2010 17:41:15 by Marcin.Labun

> -----Original Message-----
> From: Neil Brown [mailto:neilb@suse.de]
>
> On Fri, 1 Oct 2010 13:36:48 +0100
> "Labun, Marcin" wrote:
>
> > From f423b226f10cfe3b416c5e0580dde45cd8ca887d Mon Sep 17 00:00:00 2001
> > From: Marcin Labun
> > Date: Wed, 29 Sep 2010 06:12:38 +0200
> > Subject: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy
> >
> > This is updated series of patches forming autorebuild functionality in mdadm
> > monitor based on new policy code.
>
> Hi Marcin,
> thanks for this, and apologies for not replying sooner.
> I've had a bit of a look and some of it seems good.
> I haven't had a thorough look yet as I am in the middle of doing some fairly
> serious refactoring of mdadm (the supertype, and mdinfo structures are going
> to be heavily changed and largely merged - some super_switch methods will
> disappear (e.g. getinfo_super) and others will appear (load_container)).
> Once I have finished that I will review your code more thoroughly and merge
> it into the new code base.
>
> One concern I do have is patch 0002 which removes the spare-group based
> spare migration. That functionality needs to stay, though obviously the
> implementation can change. I imagine the 'spare-group' information would be
> added to each member device as a 'domain' name.
>
> Also it is best not to remove functionality and then re-add it a different
> way, but rather to make sure the functionality works after every change, but
> just gets extended at various points.
>
Hi Neil,
Next week we plan to post another drop that includes spare-group support and
a number of code rework changes.
Thanks,
Marcin
