#1: 2 drive RAID10 rebuild issue

Posted on 2011-10-14 05:06:45 by Brad Campbell

G'day all,

My main OS drives are a pair of 1TB WD SATA units in a RAID-10 far=2 (f2) layout.

The current configuration is as follows:

root@srv:~# uname -a
Linux srv 3.1.0-rc9 #1 SMP Wed Oct 5 17:35:49 WST 2011 x86_64 GNU/Linux
root@srv:~# mdadm --version
mdadm - v3.2.1 - 28th March 2011
root@srv:~# mdadm --detail /dev/md2
/dev/md2:
Version : 1.2
Creation Time : Sun May 8 14:02:40 2011
Raid Level : raid10
Array Size : 976247808 (931.02 GiB 999.68 GB)
Used Dev Size : 976247808 (931.02 GiB 999.68 GB)
Raid Devices : 2
Total Devices : 1
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Fri Oct 14 10:53:23 2011
State : active, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0

Layout : far=2
Chunk Size : 512K

Name : sysresccd:2
UUID : 6df98448:8cfbee7e:acdf3947:f282c441
Events : 317419

Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 226 1 active sync /dev/sdo2

root@srv:~# mdadm --examine /dev/sdo2
/dev/sdo2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 6df98448:8cfbee7e:acdf3947:f282c441
Name : sysresccd:2
Creation Time : Sun May 8 14:02:40 2011
Raid Level : raid10
Raid Devices : 2

Avail Dev Size : 1952497072 (931.02 GiB 999.68 GB)
Array Size : 1952495616 (931.02 GiB 999.68 GB)
Used Dev Size : 1952495616 (931.02 GiB 999.68 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 0f132a57:e1c95358:904c3195:4c3f9af8

Internal Bitmap : 2 sectors from superblock
Update Time : Fri Oct 14 10:53:53 2011
Checksum : 91576962 - correct
Events : 317431

Layout : far=2
Chunk Size : 512K

Device Role : Active device 1
Array State : .A ('A' == active, '.' == missing)

root@srv:~# mdadm --examine /dev/sdp2
mdadm: No md superblock detected on /dev/sdp2.

I accidentally unplugged sdp a while ago. Yesterday I plugged it back in and tried to re-add
/dev/sdp2 to /dev/md2. /dev/sdp2 was initially added as a spare, so I removed it and zeroed the
superblock before retrying the add. sd[op]1 are both components of /dev/md1, a RAID1, and that all
worked OK.

<snip from bootup dmesg> (root is on md2p1)

[ 4.464763] md: md2 stopped.
[ 4.465318] md: bind<sdo2>
[ 4.465992] md/raid10:md2: not clean -- starting background reconstruction
[ 4.466026] md/raid10:md2: active with 1 out of 2 devices
[ 4.466236] created bitmap (8 pages) for device md2
[ 4.466464] md2: bitmap initialized from disk: read 1/1 pages, set 308 of 14897 bits
[ 4.478694] md2: detected capacity change from 0 to 999677755392
[ 4.489859] md2: p1 p2 p3
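
As a sanity check on the figures above (a small sketch, not part of the original output): mdadm --detail appears to report sizes in KiB, while --examine on a v1.2 superblock reports 512-byte sectors, so the two Array Size values and the capacity in dmesg should agree. The function names are made up for illustration, and the 64 MiB bitmap chunk is an assumption inferred from the bit count, since the bitmap chunk size is not shown anywhere in the output.

```shell
# Hypothetical helpers; all input numbers are copied from the output above.
sectors_to_kib() { echo $(( $1 / 2 )); }      # 512-byte sectors -> KiB
kib_to_bytes()   { echo $(( $1 * 1024 )); }   # KiB -> bytes
# Bits needed to cover size_kib ($1) with chunks of chunk_kib ($2), rounded up.
bitmap_bits()    { echo $(( ($1 + $2 - 1) / $2 )); }

sectors_to_kib 1952495616      # prints 976247808, the --detail Array Size
kib_to_bytes 976247808         # prints 999677755392, as in the dmesg capacity line
bitmap_bits 976247808 65536    # prints 14897 if the bitmap chunk is 64 MiB (assumed)
```

If the 64 MiB assumption holds, the 308 set bits would correspond to roughly 19 GiB marked dirty.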

When I add /dev/sdp2 to /dev/md2, the following occurs:

Oct 14 10:05:51 srv kernel: [ 266.534562] md: bind<sdp2>
Oct 14 10:05:51 srv kernel: [ 266.559686] RAID10 conf printout:
Oct 14 10:05:51 srv kernel: [ 266.559694] --- wd:1 rd:2
Oct 14 10:05:51 srv kernel: [ 266.559701] disk 1, wo:1, o:1, dev:sdp2
Oct 14 10:05:51 srv kernel: [ 266.559717] ------------[ cut here ]------------
Oct 14 10:05:51 srv kernel: [ 266.559772] WARNING: at fs/sysfs/dir.c:455 sysfs_add_one+0xb9/0xf0()
Oct 14 10:05:51 srv kernel: [ 266.559816] Hardware name: To Be Filled By O.E.M.
Oct 14 10:05:51 srv kernel: [ 266.559858] sysfs: cannot create duplicate filename '/devices/virtual/block/md2/md/rd1'
Oct 14 10:05:51 srv kernel: [ 266.559905] Modules linked in: iptable_filter ip_tables x_tables nfs ppp_generic slhc cls_u32 sch_htb deflate zlib_deflate des_generic cbc ecb crypto_blkcipher sha1_generic md5 hmac crypto_hash cryptomgr aead crypto_algapi af_key fuse w83627ehf hwmon_vid vhost_net powernow_k8 mperf kvm_amd kvm pl2303 usbserial xhci_hcd i2c_piix4 k10temp ohci_hcd ehci_hcd r8169 usbcore ahci libahci sata_mv megaraid_sas [last unloaded: scsi_wait_scan]
Oct 14 10:05:51 srv kernel: [ 266.561427] Pid: 1468, comm: md2_raid10 Not tainted 3.1.0-rc9 #1
Oct 14 10:05:51 srv kernel: [ 266.561469] Call Trace:
Oct 14 10:05:51 srv kernel: [ 266.561516] [<ffffffff81034dcb>] ? warn_slowpath_common+0x7b/0xc0
Oct 14 10:05:51 srv kernel: [ 266.561562] [<ffffffff81034ec5>] ? warn_slowpath_fmt+0x45/0x50
Oct 14 10:05:51 srv kernel: [ 266.561617] [<ffffffff8111aea9>] ? sysfs_add_one+0xb9/0xf0
Oct 14 10:05:51 srv kernel: [ 266.561662] [<ffffffff8111bf53>] ? sysfs_do_create_link+0x143/0x210
Oct 14 10:05:51 srv kernel: [ 266.561709] [<ffffffff811dd1d3>] ? sprintf+0x43/0x50
Oct 14 10:05:51 srv kernel: [ 266.561755] [<ffffffff812f24c9>] ? md_check_recovery+0x549/0x6a0
Oct 14 10:05:51 srv kernel: [ 266.561801] [<ffffffff812db397>] ? raid10d+0x27/0xb50
Oct 14 10:05:51 srv kernel: [ 266.561846] [<ffffffff81041043>] ? lock_timer_base+0x33/0x70
Oct 14 10:05:51 srv kernel: [ 266.561890] [<ffffffff810410ec>] ? try_to_del_timer_sync+0x6c/0x90
Oct 14 10:05:51 srv kernel: [ 266.561935] [<ffffffff8104113a>] ? del_timer_sync+0x2a/0x50
Oct 14 10:05:51 srv kernel: [ 266.561981] [<ffffffff813e9440>] ? schedule_timeout+0x160/0x230
Oct 14 10:05:51 srv kernel: [ 266.562025] [<ffffffff810411f0>] ? del_timer+0x90/0x90
Oct 14 10:05:51 srv kernel: [ 266.562071] [<ffffffff812efa4f>] ? md_thread+0x10f/0x140
Oct 14 10:05:51 srv kernel: [ 266.562117] [<ffffffff81050120>] ? wake_up_bit+0x40/0x40
Oct 14 10:05:51 srv kernel: [ 266.562162] [<ffffffff812ef940>] ? md_register_thread+0x100/0x100
Oct 14 10:05:51 srv kernel: [ 266.562208] [<ffffffff812ef940>] ? md_register_thread+0x100/0x100
Oct 14 10:05:51 srv kernel: [ 266.562580] [<ffffffff8104fcc6>] ? kthread+0x96/0xa0
Oct 14 10:05:51 srv kernel: [ 266.562625] [<ffffffff813ec6b4>] ? kernel_thread_helper+0x4/0x10
Oct 14 10:05:51 srv kernel: [ 266.562671] [<ffffffff8104fc30>] ? kthread_worker_fn+0x120/0x120
Oct 14 10:05:51 srv kernel: [ 266.562715] [<ffffffff813ec6b0>] ? gs_change+0xb/0xb
Oct 14 10:05:51 srv kernel: [ 266.562757] ---[ end trace c02313193e85d8a8 ]---
Oct 14 10:05:51 srv kernel: [ 266.562879] md: recovery of RAID array md2
Oct 14 10:05:51 srv kernel: [ 266.562927] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Oct 14 10:05:51 srv kernel: [ 266.562971] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Oct 14 10:05:51 srv kernel: [ 266.563062] md: using 128k window, over a total of 976247808k.
Oct 14 10:05:51 srv kernel: [ 266.563253] md/raid10:md2: insufficient working devices for recovery.
Oct 14 10:05:51 srv kernel: [ 266.563306] md: md2: recovery done.
Oct 14 10:05:51 srv kernel: [ 266.609662] RAID10 conf printout:
Oct 14 10:05:51 srv kernel: [ 266.609669] --- wd:1 rd:2
Oct 14 10:05:51 srv kernel: [ 266.609675] disk 1, wo:1, o:1, dev:sdp2
Oct 14 10:05:51 srv kernel: [ 266.750052] RAID10 conf printout:
Oct 14 10:05:51 srv kernel: [ 266.750056] --- wd:1 rd:2
Oct 14 10:05:52 srv kernel: [ 267.757749] Buffer I/O error on device md2p1, logical block 5230645
Oct 14 10:05:52 srv kernel: [ 267.757808] EXT4-fs warning (device md2p1): ext4_end_bio:258: I/O error writing to inode 923126 (offset 282624 size 4096 starting block 5230901)
Oct 14 10:05:52 srv kernel: [ 267.757907] Buffer I/O error on device md2p1, logical block 1620503
Oct 14 10:05:52 srv kernel: [ 267.757952] Buffer I/O error on device md2p1, logical block 1620504
Oct 14 10:05:52 srv kernel: [ 267.757997] EXT4-fs warning (device md2p1): ext4_end_bio:258: I/O error writing to inode 425274 (offset 0 size 8192 starting block 1620759)
Oct 14 10:05:52 srv kernel: [ 267.758067] Buffer I/O error on device md2p1, logical block 2917504
Oct 14 10:05:52 srv kernel: [ 267.758114] EXT4-fs warning (device md2p1): ext4_end_bio:258: I/O error writing to inode 1052016 (offset 0 size 4096 starting block 2917760)
Oct 14 10:05:52 srv kernel: [ 267.758180] Buffer I/O error on device md2p1, logical block 2917529
Oct 14 10:05:52 srv kernel: [ 267.758225] Buffer I/O error on device md2p1, logical block 2917530
Oct 14 10:05:52 srv kernel: [ 267.758270] EXT4-fs warning (device md2p1): ext4_end_bio:258: I/O error writing to inode 1052016 (offset 102400 size 8192 starting block 2917785)
Oct 14 10:05:56 srv kernel: [ 271.151176] Buffer I/O error on device md2p2, logical block 4352449
Oct 14 10:05:56 srv kernel: [ 271.151226] lost page write due to I/O error on md2p2
Oct 14 10:05:56 srv kernel: [ 271.151322] JBD2: Detected IO errors while flushing file data on md2p2-8
Oct 14 10:05:56 srv kernel: [ 271.151370] Aborting journal on device md2p2-8.
Oct 14 10:05:56 srv kernel: [ 271.151417] Buffer I/O error on device md2p2, logical block 5275648
Oct 14 10:05:56 srv kernel: [ 271.151459] lost page write due to I/O error on md2p2
Oct 14 10:05:56 srv kernel: [ 271.151503] JBD2: I/O error detected when updating journal superblock for md2p2-8.
Oct 14 10:05:57 srv kernel: [ 272.774195] Buffer I/O error on device md2p2, logical block 5767612
Oct 14 10:05:57 srv kernel: [ 272.774246] lost page write due to I/O error on md2p2
Oct 14 10:05:57 srv kernel: [ 272.774303] Buffer I/O error on device md2p2, logical block 5770220
Oct 14 10:05:57 srv kernel: [ 272.774346] lost page write due to I/O error on md2p2
Oct 14 10:05:57 srv kernel: [ 272.774392] Buffer I/O error on device md2p2, logical block 9439050
Oct 14 10:05:57 srv kernel: [ 272.774436] lost page write due to I/O error on md2p2
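
For anyone skimming a log like the one above, a filter along these lines might help separate the md driver's own messages (including the "insufficient working devices for recovery" line) from the call trace and the ext4/JBD2 fallout. It is only a sketch: md_summary is a made-up name, and the pattern is based solely on the message prefixes visible in this log.

```shell
# Hypothetical helper: keep only lines logged with an "md: " or "md/raidN:" prefix.
# Reads a kernel log on stdin and writes the matching lines to stdout.
md_summary() {
    grep -E '(md: |md/raid[0-9]+:)'
}

# Example on two lines taken from the log above; only the second is kept.
printf '%s\n' \
    'Oct 14 10:05:51 srv kernel: [ 266.561469] Call Trace:' \
    'Oct 14 10:05:51 srv kernel: [ 266.563253] md/raid10:md2: insufficient working devices for recovery.' \
    | md_summary
```

On a live system the same function could be fed from dmesg, e.g. `dmesg | md_summary`.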

I've repeated this three times now, each time zeroing the superblock on /dev/sdp2 and trying an add.
I get the same result every time, requiring a belt of the big red button.

I'm just using:
mdadm --add /dev/md2 /dev/sdp2
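
Spelled out, the full sequence described above (remove the spare, zero its superblock, add it again) would look roughly like the sketch below. This is a restatement of the reproduction steps, not a fix; on this kernel the final step is what triggered the failure. The wrapper names run and readd_member and the DRY_RUN variable are invented for illustration, and by default the script only prints the commands instead of executing them.

```shell
# Hypothetical wrapper: print the command when DRY_RUN is 1 (the default), else run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Re-add sequence from the post: $1 is the array, $2 is the member partition.
readd_member() {
    run mdadm "$1" --remove "$2"        # drop the device that came back as a spare
    run mdadm --zero-superblock "$2"    # wipe its md metadata
    run mdadm --add "$1" "$2"           # add it back, which should start recovery
}

readd_member /dev/md2 /dev/sdp2
```

Running it with DRY_RUN=0 would execute the commands for real, which needs root and, given the report above, should only be tried on a system that can tolerate a hang.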

Have I done something particularly wrong?

This is neither urgent nor critical, as the system is happily spinning on one drive and I have
pedantic backups of everything.

Regards,
Brad
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
