[PATCH 2 of 9] MD: should_read_superblock
on 24.05.2011 05:06:09 by Jonathan Brassow
Patch name: md-should_read_superblock.patch

Add new function to determine whether MD superblocks should be read.

It used to be sufficient to check if mddev->raid_disks was set to determine
whether to read the superblock or not. However, device-mapper (dm-raid.c)
sets this value before calling md_run(). Thus, we need additional mechanisms
for determining whether to read the superblock. This patch adds the condition
that if rdev->meta_bdev is set, the superblock should be read - something that
only device-mapper does (and only when there are superblocks to be read/used).

Signed-off-by: Jonathan Brassow

Index: linux-2.6/drivers/md/md.c
===================================================================
--- linux-2.6.orig/drivers/md/md.c
+++ linux-2.6/drivers/md/md.c
@@ -4421,6 +4421,20 @@ static void md_safemode_timeout(unsigned
 	md_wakeup_thread(mddev->thread);
 }
 
+static int should_read_super(mddev_t *mddev)
+{
+	mdk_rdev_t *rdev, *tmp;
+
+	if (!mddev->raid_disks)
+		return 1;
+
+	rdev_for_each(rdev, tmp, mddev)
+		if (rdev->meta_bdev)
+			return 1;
+
+	return 0;
+}
+
 static int start_dirty_degraded;
 
 int md_run(mddev_t *mddev)
@@ -4442,7 +4456,7 @@ int md_run(mddev_t *mddev)
 	/*
 	 * Analyze all RAID superblock(s)
 	 */
-	if (!mddev->raid_disks) {
+	if (should_read_super(mddev)) {
 		if (!mddev->persistent)
 			return -EINVAL;
 		analyze_sbs(mddev);
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2 of 9] MD: should_read_superblock
on 25.05.2011 06:01:57 by NeilBrown
On Mon, 23 May 2011 22:06:09 -0500 Jonathan Brassow wrote:
> Patch name: md-should_read_superblock.patch
>
> Add new function to determine whether MD superblocks should be read.
>
> It used to be sufficient to check if mddev->raid_disks was set to determine
> whether to read the superblock or not. However, device-mapper (dm-raid.c)
> sets this value before calling md_run(). Thus, we need additional mechanisms
> for determining whether to read the superblock. This patch adds the condition
> that if rdev->meta_bdev is set, the superblock should be read - something that
> only device-mapper does (and only when there are superblocks to be read/used).
>
> Signed-off-by: Jonathan Brassow

I've been feeling uncomfortable about this and have spent a while trying to
see if my discomfort is at all justified. It seems that maybe it is.

The discomfort is really at analyze_sbs being used for dm arrays. It is
really for arrays where md completely controls the metadata. dm arrays are in
a strange intermediate situation where some metadata is controlled by
user-space (so md is told about some details of the array) and other metadata
is managed by the kernel - so md finds those bits out by itself.

It isn't yet entirely clear to me how to handle the half-way state best.

But the particular problem is that analyze_sbs can call kick_rdev_from_array.
This will call export_rdev, which will call kobject_put(&rdev->kobj), which is
bad because dm-based rdevs do not get their kobj initialised.

So I think analyze_sbs should not be used for dm arrays.
Rather, the code in dm-raid.c which parses the metadata_device info from the
constructor line should call load_super. Then, before md_run is called, it
should do the 'validate_super' step and record any failures.

So the only super_types method that md code would call on a dm-raid array
would be sync_super.

Does that work for you?

Thanks,
NeilBrown
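
As a rough illustration of the ordering proposed above, here is a small userspace model in C. It is only a sketch: every name prefixed with model_ is hypothetical and stands in for kernel code that does not exist in this form, and recording a failure as a "faulty" flag is an assumption about what "record any failures" might look like. The point it shows is that loading and validating happen on the dm-raid side before the array is started, so nothing has to be kicked out through the kobject path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an rdev as dm-raid would see it. */
struct model_rdev {
	bool has_meta_dev;   /* a metadata device was given on the constructor line */
	bool sb_readable;    /* pretend outcome of reading the superblock */
	bool sb_consistent;  /* pretend outcome of validating it */
	bool sb_loaded;      /* set once the superblock has been loaded */
	bool faulty;         /* assumption: a failure is recorded here, not kicked */
};

/* Stand-in for the load_super step: 0 on success, -1 on failure. */
static int model_load_super(struct model_rdev *rdev)
{
	if (!rdev->sb_readable)
		return -1;
	rdev->sb_loaded = true;
	return 0;
}

/* Stand-in for the validate_super step: 0 on success, -1 on failure. */
static int model_validate_super(const struct model_rdev *rdev)
{
	return (rdev->sb_loaded && rdev->sb_consistent) ? 0 : -1;
}

/*
 * Would run in the dm-raid constructor, before the modelled md_run().
 * Returns the number of devices whose superblock could not be used.
 */
static int model_prepare_rdevs(struct model_rdev *rdevs, size_t n)
{
	int failures = 0;

	for (size_t i = 0; i < n; i++) {
		struct model_rdev *rdev = &rdevs[i];

		if (!rdev->has_meta_dev)
			continue;	/* no superblock to read for this device */

		if (model_load_super(rdev) || model_validate_super(rdev)) {
			rdev->faulty = true;	/* record, do not remove */
			failures++;
		}
	}
	return failures;
}
```

In this model the md side never sees a load or validate call for a dm array; it only receives devices that are already marked good or faulty.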
Re: [PATCH 2 of 9] MD: should_read_superblock
on 25.05.2011 16:00:19 by Jonathan Brassow
On May 24, 2011, at 11:01 PM, NeilBrown wrote:
> I've been feeling uncomfortable about this and have spent a while trying to
> see if my discomfort is at all justified. It seems that maybe it is.
>
> The discomfort is really at analyze_sbs being used for dm arrays. It is
> really for arrays where md completely controls the metadata. dm arrays are in
> a strange intermediate situation where some metadata is controlled by
> user-space (so md is told about some details of the array) and other metadata
> is managed by the kernel - so md finds those bits out by itself.
>
> It isn't yet entirely clear to me how to handle the half-way state best.
>
> But the particular problem is that analyze_sbs can call kick_rdev_from_array.
> This will call export_rdev, which will call kobject_put(&rdev->kobj), which is
> bad because dm-based rdevs do not get their kobj initialised.
>
> So I think analyze_sbs should not be used for dm arrays.
> Rather, the code in dm-raid.c which parses the metadata_device info from the
> constructor line should call load_super. Then, before md_run is called, it
> should do the 'validate_super' step and record any failures.
>
> So the only super_types method that md code would call on a dm-raid array
> would be sync_super.
>
> Does that work for you?

That seems sensible. It changes things up a bit though...

1) The load_super and validate_super functions would go into dm-raid.c, but stubs (returning -EINVAL) would remain in md.c in order to fill out the super_types pointers.
2) The device-mapper superblock would have to move to a common place, because it would need to be shared by the super functions in dm-raid.c and sync_super in md.c. I'd rather not put the new superblock in md_p.h... perhaps a new file, dm-raid.h? (You could hide the superblock entirely in dm-raid.c, but you'd have to export a function from dm-raid.c that would be called by sync_super in md.c, necessitating a dm-raid.h again. Is this a better solution?)

 brassow
Re: [PATCH 2 of 9] MD: should_read_superblock
on 26.05.2011 02:32:09 by NeilBrown
On Wed, 25 May 2011 09:00:19 -0500 Jonathan Brassow wrote:
> That seems sensible. It changes things up a bit though...
>
> 1) the load_super and validate_super functions would go into dm-raid.c, but stubs (returning EINVAL) would remain in md.c in order to fill-out the super_types pointers.
> 2) the device-mapper superblock would have to move to a common place because it would need to be shared by the super functions in dm-raid.c and sync_super in md.c. I'd rather not put the new superblock in md_p.h... perhaps a new file, dm-raid.h? (You could hide the superblock entirely in dm-raid.c, but you'd have to export a function from dm-raid.c that would be called by sync_super in md.c - necessitating a dm-raid.h again. Is this a better solution?)
>
> brassow
How about we put a 'sync_super' or possibly a 'struct super_type' pointer in
mddev_t, and use it instead of mddev->major_version for finding operations.
Then all knowledge of the dm metadata can live in dm-raid.c??
NeilBrown
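
One way to picture the 'struct super_type' pointer variant is the userspace model below. It is a sketch under assumptions, not kernel code: the model_ names are hypothetical, the operations table is cut down to a single sync_super member (the real struct super_type has more), and the two setup helpers stand in for whatever md and dm-raid would actually do when creating an array. It shows the operations being found through a pointer stored in the array structure rather than by indexing a table with major_version each time.

```c
#include <stddef.h>

struct model_mddev;

/* Hypothetical cut-down operations table with only the sync step. */
struct model_super_type {
	void (*sync_super)(struct model_mddev *mddev);
};

struct model_mddev {
	const struct model_super_type *super;	/* per-array operations pointer */
	int last_sync;				/* records which handler ran */
};

enum { SYNC_NONE, SYNC_MD_V0, SYNC_MD_V1, SYNC_DM_RAID };

static void model_sync_v0(struct model_mddev *m) { m->last_sync = SYNC_MD_V0; }
static void model_sync_v1(struct model_mddev *m) { m->last_sync = SYNC_MD_V1; }
static void model_sync_dm(struct model_mddev *m) { m->last_sync = SYNC_DM_RAID; }

/* Table owned by the md side, indexed by major version. */
static const struct model_super_type model_super_types[] = {
	{ .sync_super = model_sync_v0 },
	{ .sync_super = model_sync_v1 },
};

/* Table owned by the dm-raid side; md never needs to know its layout. */
static const struct model_super_type model_dm_raid_super = {
	.sync_super = model_sync_dm,
};

/* md side: choose the operations once, from the major version. */
static int model_md_setup(struct model_mddev *m, int major_version)
{
	size_t n = sizeof(model_super_types) / sizeof(model_super_types[0]);

	if (major_version < 0 || (size_t)major_version >= n)
		return -1;
	m->super = &model_super_types[major_version];
	return 0;
}

/* dm-raid side: install its own operations. */
static void model_dm_raid_setup(struct model_mddev *m)
{
	m->super = &model_dm_raid_super;
}

/* Common path: no major_version lookup, just follow the pointer. */
static void model_update_sb(struct model_mddev *m)
{
	if (m->super && m->super->sync_super)
		m->super->sync_super(m);
}
```

With this shape, the superblock format used by dm-raid stays private to the file that defines model_dm_raid_super, which is the property the suggestion is after.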
Re: [PATCH 2 of 9] MD: should_read_superblock
on 26.05.2011 16:50:05 by Jonathan Brassow
On May 25, 2011, at 7:32 PM, NeilBrown wrote:
> How about we put a 'sync_super' or possibly a 'struct super_type' pointer in
> mddev_t, and use it instead of mddev->major_version for finding operations.
> Then all knowledge of the dm metadata can live in dm-raid.c??

I was just thinking that - yes, that sounds good.

I haven't thought about it too deeply yet, so I'm not sure which I like better:
1) just a sync_super pointer in mddev_t
2) a struct super_type pointer in mddev_t

My first impression is just sync_super. After all, the load and validate steps can be done within device-mapper and never need to be called by MD outside analyze_sbs and the routines that add devices, right? Perhaps we would just remove sync_super from super_types, or check for mddev->sync_super before calling super_types[x].sync_super? I'll think more about it.

thanks,
 brassow
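
The "check for mddev->sync_super before calling super_types[x].sync_super" idea could look roughly like the userspace model below. Again this is only a sketch with hypothetical model_ names; the table contents and the return convention are assumptions made for illustration. It shows a per-array sync_super pointer that, when set, takes precedence over the version-indexed table.

```c
#include <stddef.h>

struct model_mddev {
	int major_version;
	/* Hypothetical override; NULL for ordinary md arrays. */
	void (*sync_super)(struct model_mddev *mddev);
	int last_sync;	/* records which handler ran */
};

enum { SYNC_NONE, SYNC_MD_V0, SYNC_MD_V1, SYNC_DM_RAID };

static void model_sync_v0(struct model_mddev *m) { m->last_sync = SYNC_MD_V0; }
static void model_sync_v1(struct model_mddev *m) { m->last_sync = SYNC_MD_V1; }

/* What dm-raid would install in mddev->sync_super in this model. */
static void model_sync_dm(struct model_mddev *m) { m->last_sync = SYNC_DM_RAID; }

/* Version-indexed handlers, standing in for super_types[x].sync_super. */
static void (*const model_sync_table[])(struct model_mddev *) = {
	model_sync_v0,
	model_sync_v1,
};

/* Returns 0 if a handler ran, -1 if none could be found. */
static int model_sync_sbs(struct model_mddev *m)
{
	size_t n = sizeof(model_sync_table) / sizeof(model_sync_table[0]);

	if (m->sync_super) {
		m->sync_super(m);	/* per-array override wins */
		return 0;
	}
	if (m->major_version < 0 || (size_t)m->major_version >= n)
		return -1;
	model_sync_table[m->major_version](m);
	return 0;
}
```

Compared with carrying a whole operations table in the array structure, this keeps the change small, at the cost of a second lookup path that every caller of the sync step has to respect.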