race condition in md creation?

on 27.05.2011 15:20:57 by Stijn Hoop

Hello,

while creating a test suite for internal purposes I ran into a race
condition where a (very small) raid array that was just created cannot
be stopped.

mdadm --create succeeds, but the subsequent mdadm --stop reports
'Device or resource busy'.

Please see the attached script for reproduction purposes, partial output
from a run on my system (Fedora 14, kernel 2.6.35.13-91.fc14.x86_64,
mdadm-3.1.3-0.git20100804.2.fc14.x86_64):


5+0 records in
5+0 records out
5242880 bytes (5.2 MB) copied, 0.0166663 s, 315 MB/s
5+0 records in
5+0 records out
5242880 bytes (5.2 MB) copied, 0.0197533 s, 265 MB/s
mdadm: array /dev/md0 started.
mdadm: failed to stop array /dev/md0: Device or resource busy
Perhaps a running process, mounted filesystem or active volume group?
failed to stop /dev/md0, sleep 1 sec then retrying one more time
mdadm: stopped /dev/md0


I know that this might be an artificial bug, for with real raid arrays
people will not stop their just-created raid systems, but I figured
somebody might be interested to find out what was actually going on. As
I have no kernel expertise (yet! :) and I need to move on, I am only
posting my results...

BTW, I'm posting here only because I failed to google a bug tracker for
linux-raid. If there is one, my apologies, I will gladly create a bug
instead.

HTH,

--Stijn
[attachment: mdadm-bug.sh]

#!/bin/sh
for i in `seq 50`; do
    dd if=/dev/zero of=/tmp/mdloop.0 bs=1M count=5
    dd if=/dev/zero of=/tmp/mdloop.1 bs=1M count=5
    losetup /dev/loop0 /tmp/mdloop.0
    losetup /dev/loop1 /tmp/mdloop.1
    mdadm -C --level=1 --raid-devices=2 --metadata=0.90 /dev/md0 /dev/loop[01]
    if ! mdadm --stop /dev/md0; then
        echo "failed to stop /dev/md0, sleep 1 sec then retrying one more time"
        sleep 1
        mdadm --stop /dev/md0 || exit 1
    fi
    losetup -d /dev/loop1
    losetup -d /dev/loop0
    rm /tmp/mdloop.[01]
done

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: race condition in md creation?

on 27.05.2011 23:24:40 by NeilBrown

On Fri, 27 May 2011 15:20:57 +0200 Stijn Hoop wrote:

> Hello,
>
> while creating a test suite for internal purposes I ran into a race
> condition where a (very small) raid array that was just created cannot
> be stopped.
>
> mdadm --create succeeds, but the subsequent mdadm --stop reports
> 'Device or resource busy'.
>
> Please see the attached script for reproduction purposes, partial output
> from a run on my system (Fedora 14, kernel 2.6.35.13-91.fc14.x86_64,
> mdadm-3.1.3-0.git20100804.2.fc14.x86_64):
>
>
> 5+0 records in
> 5+0 records out
> 5242880 bytes (5.2 MB) copied, 0.0166663 s, 315 MB/s
> 5+0 records in
> 5+0 records out
> 5242880 bytes (5.2 MB) copied, 0.0197533 s, 265 MB/s
> mdadm: array /dev/md0 started.
> mdadm: failed to stop array /dev/md0: Device or resource busy
> Perhaps a running process, mounted filesystem or active volume group?
> failed to stop /dev/md0, sleep 1 sec then retrying one more time
> mdadm: stopped /dev/md0

When a new device appears (such as a new md array), udev springs into
action and examines it to see whether it should do something with it.
While udev (or some tool that it ran) is examining the md array, the array
looks busy, so an attempt to stop it will fail.

My test scripts tend to have
udevadm settle
before
mdadm --stop

for exactly this reason.
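
A minimal sketch of that pattern as a shell helper, for anyone who wants to drop it into a test script. The function name stop_md_array and the 30 second default timeout are assumptions made for illustration; neither comes from this thread or from mdadm itself. It only relies on `udevadm settle` (with its `--timeout` option) and `mdadm --stop`.

```shell
#!/bin/sh
# Sketch: let udev finish handling the "new device" events before
# trying to stop a freshly created md array.

# stop_md_array DEVICE [TIMEOUT_SECONDS]
# Hypothetical helper; the name and the default timeout are assumptions.
stop_md_array() {
    dev=$1
    timeout=${2:-30}

    # Block until the udev event queue is empty or the timeout expires.
    # A timeout is only reported, since the stop may still succeed.
    udevadm settle --timeout="$timeout" ||
        echo "warning: udev queue not empty after ${timeout}s" >&2

    # The exit status of the helper is that of mdadm --stop.
    mdadm --stop "$dev"
}
```

In the reproduction script this would replace the bare `mdadm --stop /dev/md0` call, for example `stop_md_array /dev/md0 || exit 1`. Whether the retry-after-sleep branch is still needed afterwards has not been tested here.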

>
>
> I know that this might be an artificial bug, for with real raid arrays
> people will not stop their just-created raid systems, but I figured
> somebody might be interested to find out what was actually going on. As
> I have no kernel expertise (yet! :) and I need to move on, I am only
> posting my results...
>
> BTW, I'm posting here only because I failed to google a bug tracker for
> linux-raid. If there is one, my apologies, I will gladly create a bug
> instead.
>

This email list *is* the bug tracker (I'm not a big fan of bug trackers
myself).

Thanks for the report,
NeilBrown



> HTH,
>
> --Stijn

Re: race condition in md creation?

on 28.05.2011 13:19:59 by Stijn Hoop

Hi,

On Sat, 28 May 2011 07:24:40 +1000 NeilBrown wrote:
> On Fri, 27 May 2011 15:20:57 +0200 Stijn Hoop wrote:
>
> > Hello,
> >
> > while creating a test suite for internal purposes I ran into a race
> > condition where a (very small) raid array that was just created
> > cannot be stopped.
> >
> > mdadm --create succeeds, but the subsequent mdadm --stop reports
> > 'Device or resource busy'.
> >
> > Please see the attached script for reproduction purposes, partial
> > output from a run on my system (Fedora 14, kernel
> > 2.6.35.13-91.fc14.x86_64, mdadm-3.1.3-0.git20100804.2.fc14.x86_64):
> >
> >
> > 5+0 records in
> > 5+0 records out
> > 5242880 bytes (5.2 MB) copied, 0.0166663 s, 315 MB/s
> > 5+0 records in
> > 5+0 records out
> > 5242880 bytes (5.2 MB) copied, 0.0197533 s, 265 MB/s
> > mdadm: array /dev/md0 started.
> > mdadm: failed to stop array /dev/md0: Device or resource busy
> > Perhaps a running process, mounted filesystem or active volume group?
> > failed to stop /dev/md0, sleep 1 sec then retrying one more time
> > mdadm: stopped /dev/md0
>
> When a new device appears (such as a new md array), udev springs
> into action and examines it to see whether it should do something
> with it. While udev (or some tool that it ran) is examining the md
> array, the array looks busy, so an attempt to stop it will fail.
>
> My test scripts tend to have
> udevadm settle
> before
> mdadm --stop
>
> for exactly this reason.

Ah, that makes perfect sense. Thanks for the explanation!

--Stijn