md 3.2.1 and xfs kernel panic on Linux 2.6.38

on 12.06.2011 20:50:01 by fibre raid

Hi All,

I am benchmarking md RAID with XFS on a server running Linux 2.6.38
kernel. The server has 24 x HDD's, dual 2.4GHz 6-core CPUs, and 24GB
RAM.

I created an md0 array using RAID 5, 64k chunk, 23 active drives, and
1 hot-spare. I then created a LVM2 volume group from this md0, and
created an LV out of it. The volume was formatted XFS as follows:

/sbin/mkfs.xfs -f -l lazy-count=1 -l size=128m -s size=4096
/dev/mapper/pool1-vol1
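
For completeness, the stack under the filesystem was built roughly as follows. This is reconstructed from the description above; device names and exact options are illustrative, not the commands actually run:

mdadm --create /dev/md0 --level=5 --chunk=64 \
      --raid-devices=23 --spare-devices=1 /dev/sd[b-y]1   # 23 active members + 1 hot-spare
pvcreate /dev/md0
vgcreate pool1 /dev/md0
lvcreate -l 100%FREE -n vol1 pool1                        # becomes /dev/mapper/pool1-vol1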

I then mounted it as follows:

/dev/mapper/pool1-vol1 on /volumes/pool1/vol1 type xfs
(rw,_netdev,noatime,nodiratime,osyncisdsync,nobarrier,logbufs=8,delaylog)
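
That mount output corresponds to an invocation along these lines (illustrative; the real entry may come from fstab, hence _netdev):

mount -t xfs -o _netdev,noatime,nodiratime,osyncisdsync,nobarrier,logbufs=8,delaylog \
      /dev/mapper/pool1-vol1 /volumes/pool1/vol1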

Once md synchronization was complete, I removed one of the active 23
drives. After attempting some IO, the md0 array began to rebuild to
the hot-spare. In a few hours, it was complete and the md0 array was
listed as active and healthy again (though now lacking a hot-spare
obviously).

As a test, I removed one more drive to see what would happen. As
expected, mdadm reported the array as active but degraded, and since
there was no hot-spare available, there was no rebuilding happening.
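
The removals were done along these lines (illustrative; the member device name is hypothetical, and drives can equally be pulled physically):

mdadm /dev/md0 --fail /dev/sdf1 --remove /dev/sdf1   # mark a member failed and remove it
cat /proc/mdstat                                     # watch the rebuild onto the hot-spare
mdadm -D /dev/md0                                    # confirm the array state afterwards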

root@TESTBA16:~# mdadm -D /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Sat Jun 11 19:06:46 2011
Raid Level : raid5
Array Size : 8595566848 (8197.37 GiB 8801.86 GB)
Used Dev Size : 390707584 (372.61 GiB 400.08 GB)
Raid Devices : 23
Total Devices : 24
Persistence : Superblock is persistent

Update Time : Sat Jun 11 23:30:14 2011
State : active, degraded
Active Devices : 22
Working Devices : 22
Failed Devices : 2
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 64K

Name : 00259009BA16:md0 (local to host 00259009BA16)
UUID : fca4c639:41713d3f:20fa02e3:89b64c95
Events : 36

Number Major Minor RaidDevice State
0 65 81 0 active sync /dev/sdv1
1 65 97 1 active sync /dev/sdw1
2 65 113 2 active sync /dev/sdx1
3 65 129 3 active sync /dev/sdy1
4 65 17 4 active sync /dev/sdr1
5 65 33 5 active sync /dev/sds1
6 65 49 6 active sync /dev/sdt1
7 65 65 7 active sync /dev/sdu1
8 8 145 8 active sync /dev/sdj1
9 8 161 9 active sync /dev/sdk1
10 8 177 10 active sync /dev/sdl1
11 8 193 11 active sync /dev/sdm1
12 8 225 12 active sync /dev/sdo1
13 8 241 13 active sync /dev/sdp1
14 65 1 14 active sync /dev/sdq1
23 8 81 15 faulty spare rebuilding
16 8 17 16 active sync /dev/sdb1
17 8 33 17 active sync /dev/sdc1
18 8 49 18 active sync /dev/sdd1
19 8 65 19 active sync /dev/sde1
20 8 97 20 active sync /dev/sdg1
21 8 113 21 active sync /dev/sdh1
22 8 129 22 active sync /dev/sdi1

15 8 209 - faulty spare



What surprised me though is that I was no longer able to run IO on the
md0 device. As a test, I am using fio to generate IO to the XFS
mountpoint /volumes/pool1/vol1. However, IO failed. A few minutes
later, I received the following kernel dumps in /var/log/messages. Any
ideas?
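
The fio job file is not included here; the workload is asynchronous direct I/O roughly along these lines (illustrative, consistent with the io_submit/O_DIRECT paths in the traces below):

fio --name=vol1-test --directory=/volumes/pool1/vol1 \
    --ioengine=libaio --direct=1 --iodepth=16 --numjobs=8 \
    --rw=randrw --bs=64k --size=4g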



Jun 12 11:33:54 TESTBA16 kernel: [59435.936575] fio D
ffff88060c6e1a50 0 30463 1 0x00000000
Jun 12 11:33:54 TESTBA16 kernel: [59435.936578] ffff880609887778
0000000000000086 0000000000000001 0000000000000086
Jun 12 11:33:54 TESTBA16 kernel: [59435.936581] 0000000000011e40
ffff88060c6e16c0 ffff88060c6e1a50 ffff880609887fd8
Jun 12 11:33:54 TESTBA16 kernel: [59435.936583] ffff88060c6e1a58
0000000000011e40 ffff880609886010 0000000000011e40
Jun 12 11:33:54 TESTBA16 kernel: [59435.936586] Call Trace:
Jun 12 11:33:54 TESTBA16 kernel: [59435.936594] []
make_request+0x138/0x3d0 [raid456]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936602] []
? autoremove_wake_function+0x0/0x40
Jun 12 11:33:54 TESTBA16 kernel: [59435.936607] []
md_make_request+0xd5/0x210
Jun 12 11:33:54 TESTBA16 kernel: [59435.936610] []
? _dm_request+0x10f/0x160
Jun 12 11:33:54 TESTBA16 kernel: [59435.936615] []
generic_make_request+0x218/0x5c0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936617] []
? dm_get_live_table+0x4b/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.936620] []
? linear_merge+0x52/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.936622] []
? dm_merge_bvec+0xc1/0x140
Jun 12 11:33:54 TESTBA16 kernel: [59435.936624] []
submit_bio+0x86/0x110
Jun 12 11:33:54 TESTBA16 kernel: [59435.936629] []
dio_bio_submit+0xbc/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936631] []
dio_send_cur_page+0x95/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936633] []
submit_page_section+0x65/0x170
Jun 12 11:33:54 TESTBA16 kernel: [59435.936635] []
do_direct_IO+0x375/0x430
Jun 12 11:33:54 TESTBA16 kernel: [59435.936637] []
direct_io_worker+0x1c4/0x390
Jun 12 11:33:54 TESTBA16 kernel: [59435.936639] []
__blockdev_direct_IO+0x26d/0x2d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936672] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936675] []
? perf_event_task_sched_out+0x33/0x80
Jun 12 11:33:54 TESTBA16 kernel: [59435.936690] []
xfs_vm_direct_IO+0x11f/0x130 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936706] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936710] []
generic_file_aio_read+0x215/0x230
Jun 12 11:33:54 TESTBA16 kernel: [59435.936724] []
? xfs_ilock+0x6c/0xe0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936739] []
? xfs_ilock_demote+0x76/0xb0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936754] []
xfs_file_aio_read+0x161/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936769] []
? xfs_file_aio_read+0x0/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936772] []
aio_rw_vect_retry+0x7c/0x1d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936775] []
aio_run_iocb+0x62/0x190
Jun 12 11:33:54 TESTBA16 kernel: [59435.936777] []
io_submit_one+0x180/0x2f0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936779] []
do_io_submit+0x13c/0x2b0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936781] []
sys_io_submit+0x10/0x20
Jun 12 11:33:54 TESTBA16 kernel: [59435.936784] []
system_call_fastpath+0x16/0x1b
Jun 12 11:33:54 TESTBA16 kernel: [59435.936789] fio D
ffff88060b063110 0 30464 1 0x00000000
Jun 12 11:33:54 TESTBA16 kernel: [59435.936791] ffff8805f70a9778
0000000000000086 0000000000000001 0000000000000001
Jun 12 11:33:54 TESTBA16 kernel: [59435.936793] 0000000000011e40
ffff88060b062d80 ffff88060b063110 ffff8805f70a9fd8
Jun 12 11:33:54 TESTBA16 kernel: [59435.936795] ffff88060b063118
0000000000011e40 ffff8805f70a8010 0000000000011e40
Jun 12 11:33:54 TESTBA16 kernel: [59435.936798] Call Trace:
Jun 12 11:33:54 TESTBA16 kernel: [59435.936800] []
make_request+0x138/0x3d0 [raid456]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936803] []
? autoremove_wake_function+0x0/0x40
Jun 12 11:33:54 TESTBA16 kernel: [59435.936805] []
md_make_request+0xd5/0x210
Jun 12 11:33:54 TESTBA16 kernel: [59435.936807] []
? __map_bio+0xad/0x130
Jun 12 11:33:54 TESTBA16 kernel: [59435.936809] []
? _dm_request+0x10f/0x160
Jun 12 11:33:54 TESTBA16 kernel: [59435.936811] []
generic_make_request+0x218/0x5c0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936813] []
? dm_get_live_table+0x4b/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.936815] []
? linear_merge+0x52/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.936817] []
? dm_merge_bvec+0xc1/0x140
Jun 12 11:33:54 TESTBA16 kernel: [59435.936819] []
submit_bio+0x86/0x110
Jun 12 11:33:54 TESTBA16 kernel: [59435.936821] []
dio_bio_submit+0xbc/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936823] []
dio_send_cur_page+0x95/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936825] []
submit_page_section+0x65/0x170
Jun 12 11:33:54 TESTBA16 kernel: [59435.936827] []
do_direct_IO+0x375/0x430
Jun 12 11:33:54 TESTBA16 kernel: [59435.936829] []
direct_io_worker+0x1c4/0x390
Jun 12 11:33:54 TESTBA16 kernel: [59435.936831] []
__blockdev_direct_IO+0x26d/0x2d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936847] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936849] []
? perf_event_task_sched_out+0x33/0x80
Jun 12 11:33:54 TESTBA16 kernel: [59435.936864] []
xfs_vm_direct_IO+0x11f/0x130 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936879] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936882] []
generic_file_aio_read+0x215/0x230
Jun 12 11:33:54 TESTBA16 kernel: [59435.936896] []
? xfs_ilock+0x6c/0xe0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936910] []
? xfs_ilock_demote+0x76/0xb0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936925] []
xfs_file_aio_read+0x161/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936940] []
? xfs_file_aio_read+0x0/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936942] []
aio_rw_vect_retry+0x7c/0x1d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936944] []
aio_run_iocb+0x62/0x190
Jun 12 11:33:54 TESTBA16 kernel: [59435.936946] []
io_submit_one+0x180/0x2f0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936948] []
do_io_submit+0x13c/0x2b0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936950] []
sys_io_submit+0x10/0x20
Jun 12 11:33:54 TESTBA16 kernel: [59435.936953] []
system_call_fastpath+0x16/0x1b
Jun 12 11:33:54 TESTBA16 kernel: [59435.936957] fio D
ffff88060c679a50 0 30465 1 0x00000000
Jun 12 11:33:54 TESTBA16 kernel: [59435.936959] ffff8805ad67f778
0000000000000086 0000000000000001 0000000000000086
Jun 12 11:33:54 TESTBA16 kernel: [59435.936961] 0000000000011e40
ffff88060c6796c0 ffff88060c679a50 ffff8805ad67ffd8
Jun 12 11:33:54 TESTBA16 kernel: [59435.936964] ffff88060c679a58
0000000000011e40 ffff8805ad67e010 0000000000011e40
Jun 12 11:33:54 TESTBA16 kernel: [59435.936966] Call Trace:
Jun 12 11:33:54 TESTBA16 kernel: [59435.936968] []
make_request+0x138/0x3d0 [raid456]
Jun 12 11:33:54 TESTBA16 kernel: [59435.936971] []
? autoremove_wake_function+0x0/0x40
Jun 12 11:33:54 TESTBA16 kernel: [59435.936973] []
md_make_request+0xd5/0x210
Jun 12 11:33:54 TESTBA16 kernel: [59435.936975] []
? __map_bio+0xad/0x130
Jun 12 11:33:54 TESTBA16 kernel: [59435.936977] []
? _dm_request+0x10f/0x160
Jun 12 11:33:54 TESTBA16 kernel: [59435.936979] []
generic_make_request+0x218/0x5c0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936982] []
? __slab_alloc+0xb0/0x2c0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936984] []
? dm_get_live_table+0x4b/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.936986] []
? linear_merge+0x52/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.936988] []
? dm_merge_bvec+0xc1/0x140
Jun 12 11:33:54 TESTBA16 kernel: [59435.936990] []
submit_bio+0x86/0x110
Jun 12 11:33:54 TESTBA16 kernel: [59435.936992] []
dio_bio_submit+0xbc/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936994] []
dio_send_cur_page+0x95/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.936996] []
submit_page_section+0x65/0x170
Jun 12 11:33:54 TESTBA16 kernel: [59435.936998] []
do_direct_IO+0x375/0x430
Jun 12 11:33:54 TESTBA16 kernel: [59435.937000] []
direct_io_worker+0x1c4/0x390
Jun 12 11:33:54 TESTBA16 kernel: [59435.937003] []
__blockdev_direct_IO+0x26d/0x2d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937018] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937020] []
? perf_event_task_sched_out+0x33/0x80
Jun 12 11:33:54 TESTBA16 kernel: [59435.937035] []
xfs_vm_direct_IO+0x11f/0x130 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937050] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937053] []
generic_file_aio_read+0x215/0x230
Jun 12 11:33:54 TESTBA16 kernel: [59435.937067] []
? xfs_ilock+0x6c/0xe0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937081] []
? xfs_ilock_demote+0x76/0xb0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937096] []
xfs_file_aio_read+0x161/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937111] []
? xfs_file_aio_read+0x0/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937113] []
aio_rw_vect_retry+0x7c/0x1d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937115] []
aio_run_iocb+0x62/0x190
Jun 12 11:33:54 TESTBA16 kernel: [59435.937117] []
io_submit_one+0x180/0x2f0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937119] []
do_io_submit+0x13c/0x2b0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937121] []
sys_io_submit+0x10/0x20
Jun 12 11:33:54 TESTBA16 kernel: [59435.937124] []
system_call_fastpath+0x16/0x1b
Jun 12 11:33:54 TESTBA16 kernel: [59435.937128] fio D
ffff88060c525e90 0 30466 1 0x00000000
Jun 12 11:33:54 TESTBA16 kernel: [59435.937130] ffff8806099ab778
0000000000000082 ffff8806099ab748 0000000000000000
Jun 12 11:33:54 TESTBA16 kernel: [59435.937132] 0000000000011e40
ffff88060c525b00 ffff88060c525e90 ffff8806099abfd8
Jun 12 11:33:54 TESTBA16 kernel: [59435.937134] ffff88060c525e98
0000000000011e40 ffff8806099aa010 0000000000011e40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937136] Call Trace:
Jun 12 11:33:54 TESTBA16 kernel: [59435.937139] []
make_request+0x138/0x3d0 [raid456]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937142] []
? autoremove_wake_function+0x0/0x40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937144] []
md_make_request+0xd5/0x210
Jun 12 11:33:54 TESTBA16 kernel: [59435.937146] []
? _dm_request+0x10f/0x160
Jun 12 11:33:54 TESTBA16 kernel: [59435.937148] []
generic_make_request+0x218/0x5c0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937150] []
? dm_get_live_table+0x4b/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.937152] []
? linear_merge+0x52/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.937154] []
? dm_merge_bvec+0xc1/0x140
Jun 12 11:33:54 TESTBA16 kernel: [59435.937156] []
submit_bio+0x86/0x110
Jun 12 11:33:54 TESTBA16 kernel: [59435.937158] []
dio_bio_submit+0xbc/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937160] []
dio_send_cur_page+0x95/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937162] []
submit_page_section+0x65/0x170
Jun 12 11:33:54 TESTBA16 kernel: [59435.937164] []
do_direct_IO+0x375/0x430
Jun 12 11:33:54 TESTBA16 kernel: [59435.937166] []
direct_io_worker+0x1c4/0x390
Jun 12 11:33:54 TESTBA16 kernel: [59435.937168] []
__blockdev_direct_IO+0x26d/0x2d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937184] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937186] []
? perf_event_task_sched_out+0x33/0x80
Jun 12 11:33:54 TESTBA16 kernel: [59435.937201] []
xfs_vm_direct_IO+0x11f/0x130 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937216] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937219] []
generic_file_aio_read+0x215/0x230
Jun 12 11:33:54 TESTBA16 kernel: [59435.937233] []
? xfs_ilock+0x6c/0xe0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937247] []
? xfs_ilock_demote+0x76/0xb0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937262] []
xfs_file_aio_read+0x161/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937277] []
? xfs_file_aio_read+0x0/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937279] []
aio_rw_vect_retry+0x7c/0x1d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937281] []
aio_run_iocb+0x62/0x190
Jun 12 11:33:54 TESTBA16 kernel: [59435.937283] []
io_submit_one+0x180/0x2f0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937285] []
do_io_submit+0x13c/0x2b0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937287] []
sys_io_submit+0x10/0x20
Jun 12 11:33:54 TESTBA16 kernel: [59435.937290] []
system_call_fastpath+0x16/0x1b
Jun 12 11:33:54 TESTBA16 kernel: [59435.937294] fio D
ffff88060b488390 0 30467 1 0x00000000
Jun 12 11:33:54 TESTBA16 kernel: [59435.937296] ffff8805af683778
0000000000000086 0000000000000001 0000000000000086
Jun 12 11:33:54 TESTBA16 kernel: [59435.937298] 0000000000011e40
ffff88060b488000 ffff88060b488390 ffff8805af683fd8
Jun 12 11:33:54 TESTBA16 kernel: [59435.937300] ffff88060b488398
0000000000011e40 ffff8805af682010 0000000000011e40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937303] Call Trace:
Jun 12 11:33:54 TESTBA16 kernel: [59435.937305] []
make_request+0x138/0x3d0 [raid456]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937308] []
? autoremove_wake_function+0x0/0x40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937310] []
md_make_request+0xd5/0x210
Jun 12 11:33:54 TESTBA16 kernel: [59435.937312] []
? _dm_request+0x10f/0x160
Jun 12 11:33:54 TESTBA16 kernel: [59435.937314] []
generic_make_request+0x218/0x5c0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937316] []
? dm_get_live_table+0x4b/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.937318] []
? linear_merge+0x52/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.937320] []
? dm_merge_bvec+0xc1/0x140
Jun 12 11:33:54 TESTBA16 kernel: [59435.937322] []
submit_bio+0x86/0x110
Jun 12 11:33:54 TESTBA16 kernel: [59435.937324] []
dio_bio_submit+0xbc/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937326] []
dio_send_cur_page+0x95/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937328] []
submit_page_section+0x65/0x170
Jun 12 11:33:54 TESTBA16 kernel: [59435.937330] []
do_direct_IO+0x375/0x430
Jun 12 11:33:54 TESTBA16 kernel: [59435.937332] []
direct_io_worker+0x1c4/0x390
Jun 12 11:33:54 TESTBA16 kernel: [59435.937334] []
__blockdev_direct_IO+0x26d/0x2d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937350] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937352] []
? perf_event_task_sched_out+0x33/0x80
Jun 12 11:33:54 TESTBA16 kernel: [59435.937367] []
xfs_vm_direct_IO+0x11f/0x130 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937382] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937385] []
generic_file_aio_read+0x215/0x230
Jun 12 11:33:54 TESTBA16 kernel: [59435.937399] []
? xfs_ilock+0x6c/0xe0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937413] []
? xfs_ilock_demote+0x76/0xb0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937428] []
xfs_file_aio_read+0x161/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937443] []
? xfs_file_aio_read+0x0/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937445] []
aio_rw_vect_retry+0x7c/0x1d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937447] []
aio_run_iocb+0x62/0x190
Jun 12 11:33:54 TESTBA16 kernel: [59435.937449] []
io_submit_one+0x180/0x2f0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937451] []
do_io_submit+0x13c/0x2b0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937454] []
sys_io_submit+0x10/0x20
Jun 12 11:33:54 TESTBA16 kernel: [59435.937456] []
system_call_fastpath+0x16/0x1b
Jun 12 11:33:54 TESTBA16 kernel: [59435.937460] fio D
ffff880608b61a50 0 30468 1 0x00000000
Jun 12 11:33:54 TESTBA16 kernel: [59435.937462] ffff8805f77c1778
0000000000000082 0000000000000001 0000000000000000
Jun 12 11:33:54 TESTBA16 kernel: [59435.937464] 0000000000011e40
ffff880608b616c0 ffff880608b61a50 ffff8805f77c1fd8
Jun 12 11:33:54 TESTBA16 kernel: [59435.937466] ffff880608b61a58
0000000000011e40 ffff8805f77c0010 0000000000011e40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937469] Call Trace:
Jun 12 11:33:54 TESTBA16 kernel: [59435.937471] []
make_request+0x138/0x3d0 [raid456]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937474] []
? autoremove_wake_function+0x0/0x40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937476] []
md_make_request+0xd5/0x210
Jun 12 11:33:54 TESTBA16 kernel: [59435.937478] []
? __map_bio+0xad/0x130
Jun 12 11:33:54 TESTBA16 kernel: [59435.937480] []
? _dm_request+0x10f/0x160
Jun 12 11:33:54 TESTBA16 kernel: [59435.937482] []
generic_make_request+0x218/0x5c0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937484] []
? dm_get_live_table+0x4b/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.937486] []
? linear_merge+0x52/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.937488] []
? dm_merge_bvec+0xc1/0x140
Jun 12 11:33:54 TESTBA16 kernel: [59435.937490] []
submit_bio+0x86/0x110
Jun 12 11:33:54 TESTBA16 kernel: [59435.937492] []
dio_bio_submit+0xbc/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937494] []
dio_send_cur_page+0x95/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937496] []
submit_page_section+0x65/0x170
Jun 12 11:33:54 TESTBA16 kernel: [59435.937498] []
do_direct_IO+0x375/0x430
Jun 12 11:33:54 TESTBA16 kernel: [59435.937501] []
? find_busiest_group+0x65/0x340
Jun 12 11:33:54 TESTBA16 kernel: [59435.937503] []
direct_io_worker+0x1c4/0x390
Jun 12 11:33:54 TESTBA16 kernel: [59435.937506] []
__blockdev_direct_IO+0x26d/0x2d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937521] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937523] []
? perf_event_task_sched_out+0x33/0x80
Jun 12 11:33:54 TESTBA16 kernel: [59435.937538] []
xfs_vm_direct_IO+0x11f/0x130 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937553] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937556] []
generic_file_aio_read+0x215/0x230
Jun 12 11:33:54 TESTBA16 kernel: [59435.937570] []
? xfs_ilock+0x6c/0xe0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937584] []
? xfs_ilock_demote+0x76/0xb0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937599] []
xfs_file_aio_read+0x161/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937615] []
? xfs_file_aio_read+0x0/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937617] []
aio_rw_vect_retry+0x7c/0x1d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937618] []
aio_run_iocb+0x62/0x190
Jun 12 11:33:54 TESTBA16 kernel: [59435.937620] []
io_submit_one+0x180/0x2f0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937622] []
do_io_submit+0x13c/0x2b0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937625] []
sys_io_submit+0x10/0x20
Jun 12 11:33:54 TESTBA16 kernel: [59435.937627] []
system_call_fastpath+0x16/0x1b
Jun 12 11:33:54 TESTBA16 kernel: [59435.937631] fio D
ffff88060b1cc7d0 0 30469 1 0x00000000
Jun 12 11:33:54 TESTBA16 kernel: [59435.937634] ffff88060ba3f778
0000000000000086 0000000000000001 00000000ffffffff
Jun 12 11:33:54 TESTBA16 kernel: [59435.937636] 0000000000011e40
ffff88060b1cc440 ffff88060b1cc7d0 ffff88060ba3ffd8
Jun 12 11:33:54 TESTBA16 kernel: [59435.937638] ffff88060b1cc7d8
0000000000011e40 ffff88060ba3e010 0000000000011e40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937640] Call Trace:
Jun 12 11:33:54 TESTBA16 kernel: [59435.937643] []
make_request+0x138/0x3d0 [raid456]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937645] []
? autoremove_wake_function+0x0/0x40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937648] []
md_make_request+0xd5/0x210
Jun 12 11:33:54 TESTBA16 kernel: [59435.937650] []
? __make_request+0x41e/0x4b0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937652] []
? _dm_request+0x10f/0x160
Jun 12 11:33:54 TESTBA16 kernel: [59435.937654] []
generic_make_request+0x218/0x5c0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937656] []
? dm_get_live_table+0x4b/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.937658] []
? linear_merge+0x52/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.937660] []
? dm_merge_bvec+0xc1/0x140
Jun 12 11:33:54 TESTBA16 kernel: [59435.937662] []
submit_bio+0x86/0x110
Jun 12 11:33:54 TESTBA16 kernel: [59435.937664] []
dio_bio_submit+0xbc/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937666] []
dio_send_cur_page+0x95/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937668] []
submit_page_section+0x65/0x170
Jun 12 11:33:54 TESTBA16 kernel: [59435.937670] []
do_direct_IO+0x375/0x430
Jun 12 11:33:54 TESTBA16 kernel: [59435.937672] []
direct_io_worker+0x1c4/0x390
Jun 12 11:33:54 TESTBA16 kernel: [59435.937674] []
__blockdev_direct_IO+0x26d/0x2d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937690] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937692] []
? perf_event_task_sched_out+0x33/0x80
Jun 12 11:33:54 TESTBA16 kernel: [59435.937707] []
xfs_vm_direct_IO+0x11f/0x130 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937722] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937725] []
generic_file_aio_read+0x215/0x230
Jun 12 11:33:54 TESTBA16 kernel: [59435.937739] []
? xfs_ilock+0x6c/0xe0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937753] []
? xfs_ilock_demote+0x76/0xb0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937768] []
xfs_file_aio_read+0x161/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937784] []
? xfs_file_aio_read+0x0/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937786] []
aio_rw_vect_retry+0x7c/0x1d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937788] []
aio_run_iocb+0x62/0x190
Jun 12 11:33:54 TESTBA16 kernel: [59435.937789] []
io_submit_one+0x180/0x2f0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937792] []
do_io_submit+0x13c/0x2b0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937794] []
sys_io_submit+0x10/0x20
Jun 12 11:33:54 TESTBA16 kernel: [59435.937796] []
system_call_fastpath+0x16/0x1b
Jun 12 11:33:54 TESTBA16 kernel: [59435.937800] fio D
ffff88060afe8390 0 30470 1 0x00000000
Jun 12 11:33:54 TESTBA16 kernel: [59435.937802] ffff8805f74c5778
0000000000000082 0000000000000001 00000000ffffffff
Jun 12 11:33:54 TESTBA16 kernel: [59435.937805] 0000000000011e40
ffff88060afe8000 ffff88060afe8390 ffff8805f74c5fd8
Jun 12 11:33:54 TESTBA16 kernel: [59435.937807] ffff88060afe8398
0000000000011e40 ffff8805f74c4010 0000000000011e40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937809] Call Trace:
Jun 12 11:33:54 TESTBA16 kernel: [59435.937812] []
make_request+0x138/0x3d0 [raid456]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937814] []
? autoremove_wake_function+0x0/0x40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937816] []
md_make_request+0xd5/0x210
Jun 12 11:33:54 TESTBA16 kernel: [59435.937818] []
? __make_request+0x110/0x4b0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937820] []
? _dm_request+0x10f/0x160
Jun 12 11:33:54 TESTBA16 kernel: [59435.937822] []
generic_make_request+0x218/0x5c0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937824] []
? __slab_alloc+0xb0/0x2c0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937826] []
? dm_get_live_table+0x4b/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.937828] []
? linear_merge+0x52/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.937830] []
? dm_merge_bvec+0xc1/0x140
Jun 12 11:33:54 TESTBA16 kernel: [59435.937832] []
submit_bio+0x86/0x110
Jun 12 11:33:54 TESTBA16 kernel: [59435.937834] []
dio_bio_submit+0xbc/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937836] []
dio_send_cur_page+0x95/0xc0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937838] []
submit_page_section+0x65/0x170
Jun 12 11:33:54 TESTBA16 kernel: [59435.937840] []
do_direct_IO+0x375/0x430
Jun 12 11:33:54 TESTBA16 kernel: [59435.937842] []
direct_io_worker+0x1c4/0x390
Jun 12 11:33:54 TESTBA16 kernel: [59435.937845] []
__blockdev_direct_IO+0x26d/0x2d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937860] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937862] []
? perf_event_task_sched_out+0x33/0x80
Jun 12 11:33:54 TESTBA16 kernel: [59435.937877] []
xfs_vm_direct_IO+0x11f/0x130 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937893] []
? xfs_get_blocks_direct+0x0/0x20 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937895] []
generic_file_aio_read+0x215/0x230
Jun 12 11:33:54 TESTBA16 kernel: [59435.937909] []
? xfs_ilock+0x6c/0xe0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937924] []
? xfs_ilock_demote+0x76/0xb0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937939] []
xfs_file_aio_read+0x161/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937954] []
? xfs_file_aio_read+0x0/0x2d0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937956] []
aio_rw_vect_retry+0x7c/0x1d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937958] []
aio_run_iocb+0x62/0x190
Jun 12 11:33:54 TESTBA16 kernel: [59435.937960] []
io_submit_one+0x180/0x2f0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937962] []
do_io_submit+0x13c/0x2b0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937964] []
sys_io_submit+0x10/0x20
Jun 12 11:33:54 TESTBA16 kernel: [59435.937966] []
system_call_fastpath+0x16/0x1b
Jun 12 11:33:54 TESTBA16 kernel: [59435.937970] fio D
ffff88060c4d3110 0 30764 1 0x00000004
Jun 12 11:33:54 TESTBA16 kernel: [59435.937973] ffff880609307898
0000000000000082 ffffffffffffffff 0000000000000000
Jun 12 11:33:54 TESTBA16 kernel: [59435.937975] 0000000000011e40
ffff88060c4d2d80 ffff88060c4d3110 ffff880609307fd8
Jun 12 11:33:54 TESTBA16 kernel: [59435.937977] ffff88060c4d3118
0000000000011e40 ffff880609306010 0000000000011e40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937979] Call Trace:
Jun 12 11:33:54 TESTBA16 kernel: [59435.937981] []
md_write_start+0xb5/0x1e0
Jun 12 11:33:54 TESTBA16 kernel: [59435.937984] []
? autoremove_wake_function+0x0/0x40
Jun 12 11:33:54 TESTBA16 kernel: [59435.937987] []
make_request+0x41/0x3d0 [raid456]
Jun 12 11:33:54 TESTBA16 kernel: [59435.937989] []
md_make_request+0xd5/0x210
Jun 12 11:33:54 TESTBA16 kernel: [59435.938004] []
? xlog_write+0x489/0x500 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.938006] []
? _dm_request+0x10f/0x160
Jun 12 11:33:54 TESTBA16 kernel: [59435.938008] []
generic_make_request+0x218/0x5c0
Jun 12 11:33:54 TESTBA16 kernel: [59435.938010] []
? linear_merge+0x52/0x60
Jun 12 11:33:54 TESTBA16 kernel: [59435.938012] []
? dm_merge_bvec+0xc1/0x140
Jun 12 11:33:54 TESTBA16 kernel: [59435.938014] []
submit_bio+0x86/0x110
Jun 12 11:33:54 TESTBA16 kernel: [59435.938030] []
_xfs_buf_ioapply+0x192/0x2f0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.938044] []
? xlog_bdstrat+0x2a/0x60 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.938060] []
xfs_buf_iorequest+0x51/0xd0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.938074] []
xlog_bdstrat+0x2a/0x60 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.938089] []
xlog_sync+0x248/0x3f0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.938104] []
xlog_state_release_iclog+0x9b/0xd0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.938119] []
_xfs_log_force_lsn+0x1eb/0x290 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.938135] []
_xfs_trans_commit+0x29d/0x2b0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.938150] []
xfs_change_file_space+0x1e5/0x2e0 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.938154] []
? do_filp_open+0x181/0x770
Jun 12 11:33:54 TESTBA16 kernel: [59435.938170] []
xfs_file_fallocate+0xde/0x150 [xfs]
Jun 12 11:33:54 TESTBA16 kernel: [59435.938172] []
? putname+0x35/0x50
Jun 12 11:33:54 TESTBA16 kernel: [59435.938174] []
? do_unlinkat+0x5e/0x1d0
Jun 12 11:33:54 TESTBA16 kernel: [59435.938180] []
? apparmor_file_permission+0x18/0x20
Jun 12 11:33:54 TESTBA16 kernel: [59435.938183] []
do_fallocate+0x113/0x120
Jun 12 11:33:54 TESTBA16 kernel: [59435.938185] []
sys_fallocate+0x4e/0x80
Jun 12 11:33:54 TESTBA16 kernel: [59435.938187] []
system_call_fastpath+0x16/0x1b



The errors seem to be a combination of XFS and md related messages.
Any insight into this issue would be greatly appreciated. Thanks!


-TG

Re: md 3.2.1 and xfs kernel panic on Linux 2.6.38

on 16.06.2011 03:55:31 by NeilBrown

On Sun, 12 Jun 2011 11:50:01 -0700 "fibreraid@gmail.com"
wrote:

> Hi All,
>
> I am benchmarking md RAID with XFS on a server running Linux 2.6.38
> kernel. The server has 24 x HDD's, dual 2.4GHz 6-core CPUs, and 24GB
> RAM.
>
> I created an md0 array using RAID 5, 64k chunk, 23 active drives, and
> 1 hot-spare. I then created a LVM2 volume group from this md0, and
> created an LV out of it. The volume was formatted XFS as follows:
>
> /sbin/mkfs.xfs -f -l lazy-count=1 -l size=128m -s size=4096
> /dev/mapper/pool1-vol1
>
> I then mounted it as follows:
>
> /dev/mapper/pool1-vol1 on /volumes/pool1/vol1 type xfs
> (rw,_netdev,noatime,nodiratime,osyncisdsync,nobarrier,logbufs=8,delaylog)
>
> Once md synchronization was complete, I removed one of the active 23
> drives. After attempting some IO, the md0 array began to rebuild to
> the hot-spare. In a few hours, it was complete and the md0 array was
> listed as active and healthy again (though now lacking a hot-spare
> obviously).
>
> As a test, I removed one more drive to see what would happen. As
> expected, mdadm reported the array as active but degraded, and since
> there was no hot-spare available, there was no rebuilding happening.
>
...
>
> What surprised me though is that I was no longer able to run IO on the
> md0 device. As a test, I am using fio to generate IO to the XFS
> mountpoint /volumes/pool1/vol1. However, IO failed. A few minutes
> later, I received the following kernel dumps in /var/log/messages. Any
> ideas?
>
>
>
> Jun 12 11:33:54 TESTBA16 kernel: [59435.936575] fio D
> ffff88060c6e1a50 0 30463 1 0x00000000
> Jun 12 11:33:54 TESTBA16 kernel: [59435.936578] ffff880609887778
> 0000000000000086 0000000000000001 0000000000000086
> Jun 12 11:33:54 TESTBA16 kernel: [59435.936581] 0000000000011e40
> ffff88060c6e16c0 ffff88060c6e1a50 ffff880609887fd8
> Jun 12 11:33:54 TESTBA16 kernel: [59435.936583] ffff88060c6e1a58
> 0000000000011e40 ffff880609886010 0000000000011e40
> Jun 12 11:33:54 TESTBA16 kernel: [59435.936586] Call Trace:
> Jun 12 11:33:54 TESTBA16 kernel: [59435.936594] []
> make_request+0x138/0x3d0 [raid456]

>
> The errors seem to be a combination of XFS and md related messages.
> Any insight into this issue would be greatly appreciated. Thanks!
>

Very peculiar!

It appears that make_request in raid5.c is entering schedule() in an
uninterruptible wait.
There are 4 places where make_request calls schedule.
2 can only happen if the array is being reshaped (e.g. 5 drives to 6 drives)
but that does not appear to be happening.
1 causes an interruptible wait, so it cannot be that one.

That just leaves the one on line 4105.
This requires either that the stripe is being reshaped (which we already
decided isn't happening) or that md/raid5 has received overlapping requests.

i.e. while one request (either read or write) was pending, another request
(either read or write, not necessarily the same) arrives for a range of
sectors which overlaps the previous request.

When this happens (which it shouldn't because it would be dumb for a
filesystem to do that, but you never know) md/raid5 will wait for the first
request to be completely handled before letting the second proceed.
So we should be waiting here for at most a small fraction of a second.
Clearly we are waiting longer than that...
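
A couple of quick checks from the hung state would confirm both points, i.e.
that no reshape is running and where the fio tasks are actually sleeping
(illustrative, using md0 and a PID taken from the traces above):

cat /sys/block/md0/md/sync_action    # should read "idle" if no reshape/resync is active
cat /proc/mdstat                     # a reshape would show up here too
echo w > /proc/sysrq-trigger         # re-dump the blocked-task stacks to the kernel log
cat /proc/30463/stack                # kernel stack of one of the stuck fio processes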

So this cannot possibly happen (as is so often the case when debugging :-)

Hmmm... maybe we are missing the wakeup call. I can find where we wake-up
anyone waiting for an overlapping read request to complete, but I cannot find
where we wake-up someone waiting for when an overlapping write request
completes. That should probably go in handle_stripe_clean_event.

Do you have the system still hanging in this state? If not, can you get it
back into this state easily?
If so, you can force a wakeup with the magic incantation:

cat /sys/block/mdXX/md/suspend_lo > /sys/block/mdXX/md/suspend_lo

(with 'XX' suitably substituted).

If that makes a difference, then I know I am on the right track

Thanks,
NeilBrown

Re: md 3.2.1 and xfs kernel panic on Linux 2.6.38

on 16.06.2011 04:26:52 by NeilBrown

On Thu, 16 Jun 2011 11:55:31 +1000 NeilBrown wrote:


> Hmmm... maybe we are missing the wakeup call. I can find where we wake-up
> anyone waiting for an overlapping read request to complete, but I cannot find
> where we wake-up someone waiting for when an overlapping write request
> completes. That should probably go in handle_stripe_clean_event.

On second thoughts, I can find where it is handled correctly for an
overlapping write request.

So I'm now stumped again.

I tried looking at a disassembly listing of make_request to confirm which
'schedule' call it was, but when I disassemble make_request it is about 1700
bytes long, whereas your trace shows:

make_request+0x138/0x3d0

That says it is 0x3d0 long, or about 900 bytes. That is odd.

Can you get a disassembly to show me?
e.g.
gdb raid456.ko
disassemble make_request
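
If gdb is awkward on that box, objdump gives much the same listing (module path may vary by distro):

objdump -d /lib/modules/$(uname -r)/kernel/drivers/md/raid456.ko | \
    sed -n '/<make_request>:/,/^$/p'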

Thanks,
NeilBrown


>
> Do you have the system still hanging in this state? If not, can you get it
> back into this state easily?
> If so, you can force a wakeup with the magic incantation:
>
> cat /sys/block/mdXX/md/suspend_lo > /sys/block/mdXX/md/suspend_lo
>
> (with 'XX' suitably substituted).
>
> If that makes a difference, then I know I am on the right track
>
> Thanks,
> NeilBrown

Re: md 3.2.1 and xfs kernel panic on Linux 2.6.38

on 16.06.2011 07:50:19 by Stan Hoeppner

On 6/15/2011 8:55 PM, NeilBrown wrote:
> On Sun, 12 Jun 2011 11:50:01 -0700 "fibreraid@gmail.com"
> wrote:
>
>> Hi All,

Hi guys.

I was racking my brain trying to figure out why this thread wasn't
hitting the XFS list, and finally figured it out. Palm to forehead.

It's 'xfs@oss.sgi.com', not 'linux-xfs@vger.kernel.org'.

>> I am benchmarking md RAID with XFS on a server running Linux 2.6.38
>> kernel. The server has 24 x HDD's, dual 2.4GHz 6-core CPUs, and 24GB
>> RAM.

What HBA(s)/RAID card(s)? BBWC enabled?

>> I created an md0 array using RAID 5, 64k chunk, 23 active drives, and
>> 1 hot-spare. I then created a LVM2 volume group from this md0, and
>> created an LV out of it. The volume was formatted XFS as follows:
>>
>> /sbin/mkfs.xfs -f -l lazy-count=1 -l size=128m -s size=4096
>> /dev/mapper/pool1-vol1

With 22 stripe spindles you should have at least specified '-d sw=22' in
mkfs.xfs. This would give better performance, though it should have
nothing to do with the panic.
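
i.e. something along these lines, keeping the log and sector-size options (illustrative):

mkfs.xfs -f -l lazy-count=1 -l size=128m -s size=4096 \
    -d su=64k,sw=22 /dev/mapper/pool1-vol1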

>> I then mounted it as follows:
>>
>> /dev/mapper/pool1-vol1 on /volumes/pool1/vol1 type xfs
>> (rw,_netdev,noatime,nodiratime,osyncisdsync,nobarrier,logbufs=8,delaylog)

I'm wondering if specifying nobarrier might have something to do with
the OP's issue. Does the system panic when using only

defaults,delaylog ?
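
e.g. re-test after remounting without nobarrier (illustrative):

umount /volumes/pool1/vol1
mount -o defaults,delaylog /dev/mapper/pool1-vol1 /volumes/pool1/vol1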

>> Once md synchronization was complete, I removed one of the active 23
>> drives. After attempting some IO, the md0 array began to rebuild to
>> the hot-spare. In a few hours, it was complete and the md0 array was
>> listed as active and healthy again (though now lacking a hot-spare
>> obviously).
>>
>> As a test, I removed one more drive to see what would happen. As
>> expected, mdadm reported the array as active but degraded, and since
>> there was no hot-spare available, there was no rebuilding happening.
>>
> ....
>>
>> What surprised me though is that I was no longer able to run IO on the
>> md0 device. As a test, I am using fio to generate IO to the XFS
>> mountpoint /volumes/pool1/vol1. However, IO failed. A few minutes
>> later, I received the following kernel dumps in /var/log/messages. Any
>> ideas?

What happens when you test with something other than FIO? How about
simply touching a file or creating a directory?

>>
>>
>> Jun 12 11:33:54 TESTBA16 kernel: [59435.936575] fio D
>> ffff88060c6e1a50 0 30463 1 0x00000000
>> Jun 12 11:33:54 TESTBA16 kernel: [59435.936578] ffff880609887778
>> 0000000000000086 0000000000000001 0000000000000086
>> Jun 12 11:33:54 TESTBA16 kernel: [59435.936581] 0000000000011e40
>> ffff88060c6e16c0 ffff88060c6e1a50 ffff880609887fd8
>> Jun 12 11:33:54 TESTBA16 kernel: [59435.936583] ffff88060c6e1a58
>> 0000000000011e40 ffff880609886010 0000000000011e40
>> Jun 12 11:33:54 TESTBA16 kernel: [59435.936586] Call Trace:
>> Jun 12 11:33:54 TESTBA16 kernel: [59435.936594] []
>> make_request+0x138/0x3d0 [raid456]
>
>>
>> The errors seem to be a combination of XFS and md related messages.
>> Any insight into this issue would be greatly appreciated. Thanks!
>>
>
> Very peculiar!
>
> It appears that make_request in raid5.c is entering schedule() in an
> uninterruptible wait.
> There are 4 places where make_request calls schedule.
> 2 can only happen if the array is being reshaped (e.g. 5 drives to 6 drives)
> but that does not appear to be happening.
> 1 causes an interruptible wait, so it cannot be that one.
>
> That just leaves the one on line 4105.
> This requires either that the stripe is being reshaped (which we already
> decided isn't happening) or that md/raid5 has received overlapping requests.
>
> i.e. while one request (either read or write) was pending, another request
> (either read or write, not necessarily the same) arrives for a range of
> sectors which overlaps the previous request.
>
> When this happens (which it shouldn't because it would be dumb for a
> filesystem to do that, but you never know) md/raid5 will wait for the first
> request to be completely handled before letting the second proceed.
> So we should be waiting here for at most a small fraction of a second.
> Clearly we are waiting longer than that...

With nobarrier set, I'm wondering if XFS is issuing overlapping writes
to the same sector on the log device. Maybe the drives aren't
responding quickly enough, causing the excess wait.

> So this cannot possibly happen (as is so often the case when debugging :-)
>
> Hmmm... maybe we are missing the wakeup call. I can find where we wake-up
> anyone waiting for an overlapping read request to complete, but I cannot find
> where we wake-up someone waiting for when an overlapping write request
> completes. That should probably go in handle_stripe_clean_event.

I'm beginning to think this is a case of non-enterprise drives (no TLER,
etc.) being used with a cache-less HBA and without write barriers. This
would definitely be a recipe for disaster from a data loss standpoint,
though I'm not sure it should cause a kernel panic.

> Do you have the system still hanging in this state? If not, can you get it
> back into this state easily?
> If so, you can force a wakeup with the magic incantation:
>
> cat /sys/block/mdXX/md/suspend_lo > /sys/block/mdXX/md/suspend_lo
>
> (with 'XX' suitably substituted).
>
> If that makes a difference, then I know I am on the right track

Is there any downside to introducing such a wake-up for writers?

--
Stan