2.6.32.28 - md resync + pvmove - crash
on 07.05.2011 12:39:07 by Nikola Ciprich
Hi,
first, I'm sorry for crossposting and also CCing stable@; if that's not OK, please let me know.
Anyway, we've experienced a hang on a system running 2.6.32.28.
After upgrading to 2.6.32 and replacing a failed disk, an md resync started. Then, when the technician started pvmove, some deadlock must have occurred, because all disk requests started to hang and the whole system had to be rebooted...
Here's the backtrace:
[ 1229.645028] alg: No test for stdrng (krng)
[ 1229.668172] alg: No test for authenc(hmac(sha1),cbc(des3_ede)) (authenc(hmac(sha1-generic),cbc(des3_ede-generic)))
[ 1531.585167] md: bind
[ 1531.927846] raid1: raid set md2 active with 1 out of 2 mirrors
[ 1531.934613] md2: detected capacity change from 0 to 2000133029888
[ 1549.850444] md1: bitmap file is out of date (0 < 439231) -- forcing full recovery
[ 1549.858719] md1: bitmap file is out of date, doing full recovery
[ 1550.068105] md1: bitmap initialized from disk: read 11/11 pages, set 357576 bits
[ 1550.076054] created bitmap (175 pages) for device md1
[ 1561.449841] md2: unknown partition table
[ 1561.501645] md2: bitmap file is out of date (0 < 4) -- forcing full recovery
[ 1561.509999] md2: bitmap file is out of date, doing full recovery
[ 1562.158515] md2: bitmap initialized from disk: read 15/15 pages, set 476869 bits
[ 1562.167764] created bitmap (233 pages) for device md2
[ 2400.956019] INFO: task kjournald:1038 blocked for more than 120 seconds.
[ 2400.963280] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2400.971356] kjournald D ffff8800016ac400 0 1038 2 0x00000000
[ 2400.978621] ffff88003cc33c60 0000000000000046 ffff88003cc33bd0 ffffffff8119ba6f
[ 2400.986513] 0000000000013780 ffff88003f9746b0 ffff88003f9745f0 ffff88003ea2c5f0
[ 2400.994426] ffff88003f9749a0 ffff88003cc33fd8 ffff88003d65b000 ffff880035600a00
[ 2401.002415] Call Trace:
[ 2401.005024] [
[ 2401.010530] [
[ 2401.016182] [
[ 2401.021643] [
[ 2401.027029] [
[ 2401.032638] [
[ 2401.038177] [
[ 2401.043659] [
[ 2401.050129] [
[ 2401.056143] [
[ 2401.062077] [
0 [jbd]
[ 2401.069693] [
[ 2401.076212] [
[ 2401.082831] [
[ 2401.088708] [
[ 2401.095369] [
[ 2401.101337] [
[ 2401.106354] [
[ 2401.111477] [
[ 2401.116598] [
[ 2401.121893] INFO: task flush-253:2:3168 blocked for more than 120 seconds.
[ 2401.128983] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2401.137114] flush-253:2 D 0000000000000002 0 3168 2 0x00000000
[ 2401.144318] ffff88002c245a40 0000000000000046 ffff880035601600 ffff88002f621840
[ 2401.152248] 0000000000013780 ffff88003ceb9810 ffff88003ceb9750 ffff88003ea2c5f0
[ 2401.160169] ffff88003ceb9b00 ffff88002c245fd8 ffff88002c245a00 ffff880035601600
[ 2401.168048] Call Trace:
[ 2401.170608] [
[ 2401.176303] [
[ 2401.181723] [
[ 2401.186970] [
[ 2401.192991] [
[ 2401.198287] [
[ 2401.203687] [
[ 2401.209687] [
[ 2401.215802] [
[ 2401.221291] [
[ 2401.227327] [
[ 2401.232965] [
[ 2401.239412] [
[ 2401.245715] [
[ 2401.251360] [
[ 2401.257287] [
[ 2401.263317] [
[ 2401.268886] [
[ 2401.274370] [
[ 2401.279947] [
[ 2401.285000] [
[ 2401.290120] [
[ 2401.295247] [
[ 2401.300586] INFO: task reiserfs/0:3204 blocked for more than 120 seconds.
[ 2401.307590] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2401.315682] reiserfs/0 D ffff880016fdad48 0 3204 2 0x00000000
[ 2401.322884] ffff88002f1b1d10 0000000000000046 ffff88000180dda0 ffff88000180dec0
[ 2401.330754] 0000000000013780 ffff88003ea180c0 ffff88003ea18000 ffff88002f43aea0
[ 2401.338683] ffff88003ea183b0 ffff88002f1b1fd8 ffff88002f1b1cd0 ffffffff81048960
[ 2401.346684] Call Trace:
[ 2401.349252] [
[ 2401.354983] [
[ 2401.361480] [
[ 2401.366791] [
I can't 100% rule out a hardware problem, but this system had been running 2.6.27.x rock solid for years until then.
Can somebody see something interesting in those backtraces?
If I can provide further information, I'll be glad to assist...
BR
nik
-- 
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava
tel.: +420 596 603 142
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz
mobil servis: +420 737 238 656
email servis: servis@linuxbox.cz
-------------------------------------
_______________________________________________
stable mailing list
stable@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/stable