Device state during an incremental assembly

on 20.11.2010 02:52:25 by Wakko Warner

I am not on this list, please always CC me.

Is there any way to determine the array state after incrementally adding a
device? I was hoping to find something in /sys but I was unable to. I was
trying not to parse the output of mdadm for each invocation.

--
Microsoft has beaten Volkswagen's world record. Volkswagen only created 22
million bugs.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: Device state during an incremental assembly

on 21.11.2010 07:19:48 by NeilBrown

On Fri, 19 Nov 2010 20:52:25 -0500
Wakko Warner wrote:

> I am not on this list, please always CC me.
>
> Is there any way to determine the array state after incrementally adding a
> device? I was hoping to find something in /sys but I was unable to. I was
> trying not to parse the output of mdadm for each invocation.
>

Yes.



Note: if you don't find that there is enough detail in this answer, it simply
reflects the level of detail in the question :-)

NeilBrown

Re: Device state during an incremental assembly

on 21.11.2010 15:41:59 by Wakko Warner

> > I am not on this list, please always CC me.

I am now.

> > Is there any way to determine the array state after incrementally adding a
> > device? I was hoping to find something in /sys but I was unable to. I was
> > trying not to parse the output of mdadm for each invocation.

> Yes.

> Note: if you don't find that there is enough detail in this answer, it simply
> reflects the level of detail in the question :-)

I've been playing with incremental assembly on a test system and wanted to
have a way of knowing if the md device was in a state that it could be
started. I'm working with a script that will assemble arrays for me and
possibly query if an array should be started if it hasn't.

For instance:

Let's say that I have a 4-disk array (in my case, the RAID level is really of
no importance, but if we must, assume this is raid5). The array was set up as
/dev/md0.

I run mdadm -I /dev/sda1

Obviously, this is not enough to start the array even in degraded mode. At
this point, is there a way to know if the array could be started and usable
without parsing the output of mdadm?

Next, mdadm -I /dev/sdb1
Still not enough.
Next, mdadm -I /dev/sdc1
At this point, I could start the array in degraded mode. This is what I'd
like to be able to query from beneath /sys/block/md0 or somewhere else.

And of course, mdadm -I /dev/sdd1
makes the array fully active.

P.S. cantor2.suse.de[195.135.220.15] does pipelining even if pipelining
wasn't performed.


Re: Device state during an incremental assembly

on 21.11.2010 21:58:40 by NeilBrown

On Sun, 21 Nov 2010 09:41:59 -0500
Wakko Warner wrote:

> > > I am not on this list, please always CC me.
>
> I am now.
>
> > > Is there any way to determine the array state after incrementally adding a
> > > device? I was hoping to find something in /sys but I was unable to. I was
> > > trying not to parse the output of mdadm for each invocation.
>
> > Yes.
>
> > Note: if you don't find that there is enough detail in this answer, it simply
> > reflects the level of detail in the question :-)
>
> I've been playing with incremental assembly on a test system and wanted to
> have a way of knowing if the md device was in a state that it could be
> started. I'm working with a script that will assemble arrays for me and
> possibly query if an array should be started if it hasn't.
>
> For instance.
>
> Let's say that I have a 4 disk array (In my case, raid level is really of no
> importance, but if we must, assume this is raid5). The array was set up as
> /dev/md0.
>
> I run mdadm -I /dev/sda1
>
> Obviously, this is not enough to start the array even in degraded mode. At
> this point, is there a way to know if the array could be started and usable
> without parsing the output of mdadm?
>
> Next, mdadm -I /dev/sdb1
> Still not enough.
> Next, mdadm -I /dev/sdc1
> At this point, I could start the array in degraded mode. This is what I'd like
> to be able to query from beneath /sys/block/md0 or somewhere else.

Ahh.. In this case the answer is actually "no" - sorry.

The answer is a non-trivial function of:
the raid level
the number of available non-spare devices
(for raid10) which slots the available non-spare devices fill.

All of this information is available in sysfs, and mdadm has code to perform
the computation and then take appropriate action. But there is no simple way
to get this information.
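
As a rough illustration of the computation described above, here is a minimal
sketch in POSIX sh (so it should also run under busybox ash). The function
name md_can_start and its three arguments are hypothetical; how the level, the
slot count and the number of present non-spare devices are gathered is left
open, and raid10 is deliberately reported as undecided because it depends on
which slots are filled. This is not mdadm's own logic, only the usual
redundancy rule per level.

```shell
#!/bin/sh
# md_can_start LEVEL RAID_DISKS PRESENT   (hypothetical helper)
# Exit status: 0 = enough non-spare members to run (possibly degraded),
#              1 = not enough members yet,
#              2 = cannot be decided from a simple count (e.g. raid10).
md_can_start() {
    level=$1
    total=$2
    present=$3
    case "$level" in
        linear|raid0) need=$total ;;            # no redundancy at all
        raid1)        need=1 ;;                 # any single mirror will do
        raid4|raid5)  need=$((total - 1)) ;;    # tolerates one missing member
        raid6)        need=$((total - 2)) ;;    # tolerates two missing members
        *)            return 2 ;;               # raid10 etc.: slot layout matters
    esac
    [ "$present" -ge "$need" ]
}

# The 4-disk raid5 example from this thread:
for n in 1 2 3 4; do
    if md_can_start raid5 4 "$n"; then
        echo "$n of 4 present: could be started"
    else
        echo "$n of 4 present: not enough devices"
    fi
done
```

With the thread's example, the first two calls report "not enough devices" and
the last two report "could be started".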

Why do you want this information? What action will you take depending on the
answer?
If you just want mdadm to assemble as soon as a degraded array is possible,
just use "mdadm -IR" - but I suspect you already know that.


>
> And of course, mdadm -I /dev/sdd1
> makes the array fully active.
>
> P.S. cantor2.suse.de[195.135.220.15] does pipelining even if pipelining
> wasn't performed.

I'm guessing this is a statement about SMTP? This would be why I got a bounce:
  host veg.animx.eu.org[76.7.162.186] said: 554 SMTP synchronization error
  (in reply to MAIL FROM command)
to my original mail.
I guess suse.de is not being conservative in what it sends, and animx is not
being liberal in what it accepts... Do you know what mail system is in use
on animx??

NeilBrown




Re: Device state during an incremental assembly

on 21.11.2010 23:45:22 by Wakko Warner

Neil Brown wrote:
> On Sun, 21 Nov 2010 09:41:59 -0500
> > > Note: if you don't find that there is enough detail in this answer, it
> > > simply
> > > reflects the level of detail in the question :-)
> >
> > Next, mdadm -I /dev/sdb1
> > Still not enough.
> > Next, mdadm -I /dev/sdc1
> > At this point, I could start the array in degraded. This is what I'd like
> > to be able to query from beneath /sys/block/md0 or somewhere else.
>
> Ahh.. In this case the answer is actually "no" - sorry.

Ok, I wasn't sure so I figured I'd ask.

> The answer is a non-trivial function of:
> the raid level
> the number of available non-spare devices
> (for raid10) which slots the available non-spare devices fill.

I've tested only on raid5 at the moment.

> All of this information is available in sysfs, and mdadm has code to perform
> the computation and then take appropriate action. But there is no simple way
> to get this information.

For this environment, I'm limited to what busybox and its ash can do.

> Why do you want this information? What action will you take depending on the
> answer?

I'm building an initramfs for myself that can be thrown at any of my systems
and "just work" (and w/o using modules so that it'll work with most kernels).

I was reading in the kernel md.txt that there is a start degraded
option. I wanted a way to prompt the user (or have a parameter on boot)
that would do this. After reading it, I really wouldn't want to just enable
that since it was for dirty and degraded.

> If you just want mdadm to assemble as soon as a degraded array is possible,
> just use "mdadm -IR" - but I suspect you already know that.

Sort of, but I didn't think of -I and -R together. Another question on
that: if I have /sys/module/md_mod/parameters/start_ro set to 1 and I use
-IR with each device, will a resync happen once all devices show up?

For example:
mdadm -IR /dev/sda1
mdadm -IR /dev/sdb1
mdadm -IR /dev/sdc1
At this point it would be running in degraded (start_ro = 1).
mdadm -IR /dev/sdd1
Does a resync happen here? (assume there's no bitmap)

> > And of course, mdadm -I /dev/sdd1
> > makes the array fully active.
> >
> > P.S. cantor2.suse.de[195.135.220.15] does pipelining even if pipelining
> > wasn't performed.
> I'm guessing this is a statement about SMTP? This would be why I got a bounce
> : host veg.animx.eu.org[76.7.162.186] said: 554 SMTP
> synchronization error (in reply to MAIL FROM command)
> to my original mail.

Yes, w/o pipelining, MAIL command must not be sent without waiting for the
status code from the previous command (or initial connect). I'm actually
surprised that postfix did that.

> I guess suse.de is not being conservative in what it sends, and animx is not
> being liberal in what it accepts... Do you know what mail system is in use
> on animx??

I run both servers. It hit my primary and rDNS wasn't available (temp
fail), so it hit my secondary and rDNS passed, but that system was
configured not to advertise pipelining due to some poorly set-up systems that
had broken pipelining and had SMTP errors at the same time. Both servers
use exim.


Re: Device state during an incremental assembly

on 22.11.2010 00:19:19 by NeilBrown

On Sun, 21 Nov 2010 17:45:22 -0500
Wakko Warner wrote:

> Neil Brown wrote:
> > On Sun, 21 Nov 2010 09:41:59 -0500
> > > > Note: if you don't find that there is enough detail in this answer, it
> > > > simply
> > > > reflects the level of detail in the question :-)
> > >
> > > Next, mdadm -I /dev/sdb1
> > > Still not enough.
> > > Next, mdadm -I /dev/sdc1
> > > At this point, I could start the array in degraded. This is what I'd like
> > > to be able to query from beneath /sys/block/md0 or somewhere else.
> >
> > Ahh.. In this case the answer is actually "no" - sorry.
>
> Ok, I wasn't sure so I figured I'd ask.
>
> > The answer is a non-trivial function of:
> > the raid level
> > the number of available non-spare devices
> > (for raid10) which slots the available non-spare devices fill.
>
> I've tested only on raid5 at the moment.
>
> > All of this information is available in sysfs, and mdadm has code to perform
> > the computation and then take appropriate action. But there is no simple way
> > to get this information.
>
> For this environment, I'm limited to what busybox and its ash can do.
>
> > Why do you want this information? What action will you take depending on the
> > answer?
>
> I'm building an initramfs for myself that can be thrown at any of my systems
> and "just work" (and w/o using modules so that it'll work with most kernels).
>
> I was reading in the kernel md.txt that there is a start degraded
> option. I wanted a way to prompt the user (or have a parameter on boot)
> that would do this. After reading it, I really wouldn't want to just enable
> that since it was for dirty and degraded.

There are two distinct issues here and I'm not sure which one you are
thinking about.

On one hand we can ask whether we should start a degraded array if the array
is dirty. In this case we should certainly wait until every possible device
has been discovered to maximise the chance that the array can be started
non-degraded. This is because a dirty degraded array can contain
undetectable data corruption. But usually we do want to start such an array
because having the data available is more important than a risk that some of
it is corrupted. The reason this requires a kernel option, or an '-f' to
"mdadm -A" is to ensure that the sysadmin knows that there is a small chance
of corruption.

On the other hand, we can ask whether we should start a degraded array if
there are enough devices to do that, though there could still be some more
devices to be found. In this case it depends a bit on how long it takes to
discover devices and how long we are happy to wait. Sometimes you might
want to ask the sysadmin "have all the usb devices been plugged in, or should
I wait for more"...


The first is answered by giving '-f' to mdadm, or not.
The second by giving '-R' to mdadm, or not.
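
For the second question, the choice between "-I" and "-IR" could be driven by
a boot parameter, as suggested earlier in the thread. The sketch below is only
an illustration in POSIX sh: the parameter name md_run_degraded=1 and the
function incremental_flags are made up for this example, and it does not touch
the dirty-degraded ('-f') question at all.

```shell
#!/bin/sh
# incremental_flags "KERNEL COMMAND LINE"   (hypothetical helper)
# Prints "-IR" if the made-up boot parameter md_run_degraded=1 is present,
# otherwise "-I" (wait until all expected members have been found).
incremental_flags() {
    for word in $1; do
        case "$word" in
            md_run_degraded=1)
                echo "-IR"
                return 0
                ;;
        esac
    done
    echo "-I"
}

# Intended use inside an initramfs script (commented out, needs mdadm):
#   flags=$(incremental_flags "$(cat /proc/cmdline)")
#   mdadm "$flags" /dev/sda1

incremental_flags "root=/dev/md0 ro"
incremental_flags "root=/dev/md0 ro md_run_degraded=1"
```

A UI prompt could set the same choice interactively instead of reading it from
the command line.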


>
> > If you just want mdadm to assemble as soon as a degraded array is possible,
> > just use "mdadm -IR" - but I suspect you already know that.
>
> Sort of, but I didn't think of -I and -R together. Another question on
> that. If I have /sys/module/md_mod/parameters/start_ro set to 1, I use
> -IR with each device, will a resync happen once all devices show up?
>
> IE:
> mdadm -IR /dev/sda1
> mdadm -IR /dev/sdb1
> mdadm -IR /dev/sdc1
> At this point it would be running in degraded (start_ro = 1).
> mdadm -IR /dev/sdd1
> Does a resync happen here? (assume there's no bitmap)

It depends.
On a recent kernel, if nothing has written to the array, then a resync won't
be required when /dev/sdd1 is included - it will just change the array from
being degraded to being optimal.
However if there has been any write - and mounting a filesystem often writes
something to the superblock - then a resync (actually a recovery) will happen
at this point. sdd1 will be seen as a new spare to be added and rebuilt.
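
One way a script might tell the two cases apart is to look at the array_state
attribute under /sys/block/mdX/md: while it still reads "readonly" or
"read-auto", no write has been accepted since the array entered that state.
The following POSIX sh sketch assumes exactly that interpretation; the
function names and the SYSFS override variable are hypothetical, and whether a
late member then joins without recovery still depends on the kernel version,
as noted above.

```shell
#!/bin/sh
# md_array_state NAME   (hypothetical helper)
# Prints the array_state of md device NAME, e.g. md0.
# SYSFS is an assumed override so the function can be pointed at a test tree.
md_array_state() {
    cat "${SYSFS:-/sys}/block/$1/md/array_state"
}

# md_untouched STATE   (hypothetical helper)
# Succeeds if STATE is one of the read-only states, i.e. no write has been
# accepted since the array entered that state.
md_untouched() {
    case "$1" in
        readonly|read-auto) return 0 ;;
        *)                  return 1 ;;
    esac
}

# Example (commented out because it needs a real array):
#   if md_untouched "$(md_array_state md0)"; then
#       echo "no writes yet; a late member may join without recovery"
#   else
#       echo "array has been written; expect a recovery onto a late member"
#   fi
```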

NeilBrown



Re: Device state during an incremental assembly

on 22.11.2010 02:39:30 by Wakko Warner

Neil Brown wrote:
> On Sun, 21 Nov 2010 17:45:22 -0500
> Wakko Warner wrote:
>
> > Neil Brown wrote:
> > For this environment, I'm limited to what busybox and its ash can do.
> >
> > > Why do you want this information? What action will you take depending on the
> > > answer?
> >
> > I'm building an initramfs for myself that can be thrown at any of my systems
> > and "just work" (and w/o using modules so that it'll work with most kernels).
> >
> > I was reading in the kernel md.txt that there is a start degraded
> > option. I wanted a way to prompt the user (or have a parameter on boot)
> > that would do this. After reading it, I really wouldn't want to just enable
> > that since it was for dirty and degraded.
>
> There are two distinct issues here and I'm not sure which one you are
> thinking about.

I believe both.

> On one hand we can ask whether we should start a degraded array if the array
> is dirty. In this case we should certainly wait until every possible device
> has been discovered to maximise the chance that the array can be started
> non-degraded. This is because a dirty degraded array can contain
> undetectable data corruption. But usually we do want to start such an array
> because having the data available is more important than a risk that some of
> it is corrupted. The reason this requires a kernel option, or an '-f' to
> "mdadm -A" is to ensure that the sysadmin knows that there is a small chance
> of corruption.

That's why I was adding a similar option to the environment I'm building.
In this case, it would be a UI question instead of a parameter. I see no
reason to automatically start dirty-degraded arrays (at least for what I'm
doing).

If, for instance, the 4-drive array has 3 drives added, it's dirty, and I want
to force it to run for whatever reason, would mdadm -R -f do it? Or would
I have to shut down the inactive array and reassemble it with -A -f? (This
would be a menu option. Also, the start_ro module option would always be 1 in
my case.)

> On the other hand, we can ask whether we should start a degraded array if
> there are enough devices to do that, though there could still be some more
> devices to be found. In this case it depends a bit on how long it takes to
> discover devices and how long we are happy to wait. Sometimes you might
> want to ask the sysadmin "have all the usb devices been plugged in, or should
> I wait for more"...

For this one, same thing: 4-drive array, 3 added. I know I can start it
by echo read-auto > /sys/block/mdX/md/array_state. This environment is
set up to avoid writing until it mounts the root fs.
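
Wrapped up for a script, the echo above might look like the following POSIX sh
sketch. The function name md_start_read_auto and the SYSFS override variable
are hypothetical and exist only so the snippet can be exercised against a
scratch directory; on a real system the write can still be refused by the
kernel, for example when too few members are present.

```shell
#!/bin/sh
# md_start_read_auto NAME   (hypothetical helper)
# Asks the kernel to start md device NAME (e.g. md0) in read-auto mode by
# writing to its array_state attribute. Returns non-zero if the attribute is
# missing or not writable, or if the write itself fails.
md_start_read_auto() {
    state_file="${SYSFS:-/sys}/block/$1/md/array_state"
    [ -w "$state_file" ] || return 1
    echo read-auto > "$state_file"
}

# Example (commented out because it needs a real, assembled-but-inactive array):
#   md_start_read_auto md0 || echo "could not start md0" >&2
```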

My original intention for this was local hard disks, possibly encrypted, but
with this design, USB shouldn't be a problem. For my tests, my goal is to
be able to get to my root with each disk device encrypted (that is, before
mdX), then the md device encrypted, followed by LVM and an LV that is
encrypted. Overkill and very low performance, but it's just testing =)

> The first is answered by giving '-f' to mdadm, or not.
> The second by giving '-R' to mdadm, or not.

Ok. That's good.

> > > If you just want mdadm to assemble as soon as a degraded array is possible,
> > > just use "mdadm -IR" - but I suspect you already know that.
> >
> > Sort of, but I didn't think of -I and -R together. Another question on
> > that. If I have /sys/module/md_mod/parameters/start_ro set to 1, I use
> > -IR with each device, will a resync happen once all devices show up?
> >
> > IE:
> > mdadm -IR /dev/sda1
> > mdadm -IR /dev/sdb1
> > mdadm -IR /dev/sdc1
> > At this point it would be running in degraded (start_ro = 1).
> > mdadm -IR /dev/sdd1
> > Does a resync happen here? (assume there's no bitmap)
>
> It depends.
> On a recent kernel, if nothing had written to the array, then a resync won't
> be required when /dev/sdd1 is included - it will just change the array from
> being degraded to being optimal.

Do you know offhand which kernel that would be?

> However if there has been any write - and mounting a filesystem often writes
> something to the superblock - then a resync (actually a recovery) will happen
> at this point. sdd1 will be seen as a new spare to be added and rebuilt.

I fully understand this part.

I sure do appreciate your responses; they'll help me greatly with this little
project.
