Problems after reshaping of Raid5 array
on 29.11.2010 18:24:18 by Michele Bonera
Hi all.
I'm a bit panicked... and I really need some help to solve this
(if possible).
I have a storage server on my LAN where I save everything
for safekeeping (sigh).
The system consists of a 32 GB SSD containing the OS
plus four WD EADS 1 TB hard disks in RAID5 holding all my data.
The disks are seen by the system as sdb1, sdc1, sdd1, sde1
Yesterday evening I added another WD drive, this time an EARS
(512-byte sectors): I created a partition on it, respecting the
alignment, then added it to the array and ran a grow command:
mdadm --add /dev/md6 /dev/sdb1 (after adding it, the new drive appeared as sdb)
mdadm --grow /dev/md6 --raid-devices=5
The reshape started... and ran until today. Or rather, until the system
hung and I had to sync and remount read-only with the SysRq keys.

After rebooting, the reshape restarted, but the disk became sdb,
not sdb1, in the RAID array, and the file system became unreadable.
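While it runs, the reshape can be watched in /proc/mdstat. As a sketch (the status line below is an invented sample, not output from this machine), the completion percentage can be pulled out like this:

```shell
# Reshape progress appears in /proc/mdstat; on a live system you would run
# `cat /proc/mdstat`. The status line below is a made-up sample, since the
# real numbers differ per system.
line='[=====>...........]  reshape = 31.4% (360000000/1146961280) finish=412.0min speed=31000K/sec'
pct=$(printf '%s\n' "$line" | sed -n 's/.*reshape = \([0-9.]*\)%.*/\1/p')
echo "reshape progress: ${pct}%"
```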
Any ideas of what happened?
Thanks a lot for any suggestion you can give me.
I've attached the mdadm -E and dumpe2fs outputs.
[Attachment: mdadm-e.tgz — gzipped mdadm -E output, not reproduced here.]
dumpe2fs output:

root@mizar:~# dumpe2fs /dev/md6
dumpe2fs 1.41.11 (14-Mar-2010)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          11e60234-97a2-47a0-8e52-8fe0c86aa667
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags:         signed_directory_hash
Default mount options:    (none)
Filesystem state:         clean with errors
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              161202176
Block count:              644794752
Reserved block count:     32239737
Free blocks:              42614406
Free inodes:              120363507
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      870
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
RAID stride:              32
RAID stripe width:        96
Filesystem created:       Sun Nov  1 09:22:22 2009
Last mount time:          Mon Nov 29 07:55:37 2010
Last write time:          Mon Nov 29 16:46:00 2010
Mount count:              1
Maximum mount count:      31
Last checked:             Sun Nov 28 23:04:27 2010
Check interval:           15552000 (6 months)
Next check after:         Sat May 28 00:04:27 2011
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      4b69fffd-5dd1-425e-8ebd-bd29940c0722
Journal backup:           inode blocks
Journal superblock magic number invalid!
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Problems after reshaping of Raid5 array
on 29.11.2010 20:12:32 by Jan Ceuleers
On 29/11/10 18:24, Michele Bonera wrote:
> After rebooting, the reshape restarted, but the disk became sdb,
> not sdb1, in the RAID array, and the file system became unreadable.
>
> Any ideas of what happened?
Michele,
I've had similar problems, which I resolved by changing the partition
type from fd (Linux RAID autodetect) to 83 (Linux), ensuring that the
initrd is able to assemble the RAID.
Worth a shot.
HTH, Jan
Re: Problems after reshaping of Raid5 array
on 29.11.2010 21:04:39 by Michele Bonera
On Mon, 29/11/2010 at 20.12 +0100, Jan Ceuleers wrote:
> I've had similar problems, which I resolved by changing the partition
> type from fd (Linux RAID autodetect) to 83 (Linux), ensuring that the
> initrd is able to assemble the RAID.
> Worth a shot.
> HTH, Jan
Thanks for the reply, Jan, but the problem is not at the RAID level;
or rather, the RAID has just finished reshaping and now has 5/5 of its
components and is active:
root@mizar:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0]
[raid1] [raid10]
md6 : active raid5 sdf1[2] sdb1[0] sda1[1] sdc[4] sde1[3]
3438905344 blocks level 5, 128k chunk, algorithm 2 [5/5] [UUUUU]
The problem is that I added a device partition (sdc1), but after the
crash the array has sdc (the whole device) as one of its components. The
partition table on the disk was wiped out.

The result is that the filesystem on it (/dev/md6) is unreadable and
fsck.ext3 can't fix it.
Bye
--
Michele Bonera
linux user group brescia
Re: Problems after reshaping of Raid5 array
on 29.11.2010 22:45:01 by NeilBrown
On Mon, 29 Nov 2010 18:24:18 +0100 Michele Bonera wrote:
> Hi all.
>
> I'm a bit panicked... and I really need some help to solve this
> (if possible).
>
> I have a storage server in my LAN where I save everything
> for security (sigh).
>
> The system consists of a 32 GB SSD containing the OS
> plus four WD EADS 1 TB hard disks in RAID5 holding all my data.
> The disks are seen by the system as sdb1, sdc1, sdd1, sde1
>
> Yesterday evening I added another WD, this time an EARS
> (512 byte sectors): I created a partition on it, respecting the
> alignment and then I added it to the array and performed a
> grow command
>
> mdadm --add /dev/md6 /dev/sdb1 (after adding it, the new drive appeared as sdb)
> mdadm --grow /dev/md6 --raid-devices=5
>
> The reshape started... and ran until today. Or rather, until the system
> hung and I had to sync and remount read-only with the SysRq keys.
>
> After rebooting, the reshape restarted, but the disk became sdb,
> not sdb1, in the RAID array, and the file system became unreadable.
>
> Any ideas of what happened?
Yes.
I think you can fix it by simply failing and removing sdc.
Then md/raid5 will recover that data using the parity block, and that should
be correct.
It appears that the partition you created on the new device started at a
multiple of 64K. When this happens, the superblock at the end of the
partition also looks valid when seen at the end of the whole device.
Somehow mdadm got confused and chose the whole device (sdc) instead of the
partition (sdc1).
I am surprised at this because since mdadm-2.5.1, mdadm will refuse to
assemble an array if it sees two devices that appear to have the same
superblock. Could you possibly be using something that old??
So when the reshape started, it was writing data for the 5th device to sdc1.
Then after you restarted, it was writing data for the 5th device to sdc, which
is the same drive of course, but at a different offset. So everything that was
written before the crash will look wrong.
So the thing to do is to stop md from reading from sdc at all, as that device
is clearly corrupt. So just fail and remove it. Then add it back in again.
If you do re-partition, try to make sure sdc1 does not start at a multiple of
64K (128 sectors).
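Neil's steps, sketched with the device names from this thread (verify them against your own system; the mdadm lines are destructive, so they are shown as comments). The runnable part just checks that a candidate start sector avoids the 64K multiples:

```shell
# Recovery sketch (device names taken from this thread -- double-check yours):
#   mdadm /dev/md6 --fail /dev/sdc --remove /dev/sdc   # drop the corrupt member
#   ...repartition the drive, then...
#   mdadm /dev/md6 --add /dev/sdc1                     # let RAID5 rebuild from parity
# 64K is 128 sectors of 512 bytes, so avoid start sectors divisible by 128:
start=63   # example candidate start sector
if [ $((start % 128)) -ne 0 ]; then
    echo "start sector $start is not a 64K multiple: OK"
else
    echo "start sector $start is a 64K multiple: pick another"
fi
```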
NeilBrown
Re: Problems after reshaping of Raid5 array
on 30.11.2010 08:23:00 by Michele Bonera
On Tue, 30/11/2010 at 08.45 +1100, Neil Brown wrote:
> > Yesterday evening I added another WD, this time an EARS
> > (512 byte sectors): I created a partition on it, respecting the
Just to be precise: it has 4K sectors (my mistake).
> Yes.
> I think you can fix it by simply failing and removing sdc
> Then md/raid5 will recover that data using the parity block, and that should
> be correct.
> It appears that the partition you created on the new device started at a
> multiple of 64K. When this happens, the superblock at the end of the
> partition also looks valid when seen at the end of the whole device.
> Somehow mdadm got confused and chose the whole device (sdc) instead of the
> partition (sdc1).
I did it and it worked! Thanks a lot Neil!!!
> I am surprised at this because since mdadm-2.5.1, mdadm will refuse to
> assemble an array if it sees two devices that appear to have the same
> superblock. Could you possibly be using something that old??
The distribution is Ubuntu Server 10.04,
kernel 2.6.32-26-generic-pae, mdadm 2.6.7.1
Again many thanks Neil, you saved me! :)
Cheers,
--
Michele Bonera
linux user group brescia
Re: Problems after reshaping of Raid5 array
on 30.11.2010 18:00:54 by Jan Ceuleers
On 29/11/10 21:04, Michele Bonera wrote:
> The problem is that I added a device partition (sdc1), but after the
> crash the array has sdc (the whole device) as one of its components. The
> partition table on the disk was wiped out.
>
> The result is that the filesystem on it (/dev/md6) is unreadable and
> fsck.ext3 can't fix it.
Right. This is consistent with what I've seen as well.
I can't help you repair this filesystem, but my suggestion remains to
change the partition types of the RAID members. This is because I've
noticed that the in-kernel RAID detection and assembly code sometimes
gets it wrong (leading to the kinds of symptoms you've reported),
whereas mdadm run from the initrd does not.
Re: Problems after reshaping of Raid5 array
on 30.11.2010 20:45:41 by Jan Ceuleers
On 30/11/10 08:23, Michele Bonera wrote:
>> It appears that the partition you created on the new device started at a
>> multiple of 64K. When this happens, the superblock at the end of the
>> partition also looks valid when seen at the end of the whole device.
>> Somehow mdadm got confused and chose the whole device (sdc) instead of the
>> partition (sdc1).
>
> I did it and it worked! Thanks a lot Neil!!!
Michele,
Does it survive a reboot?
Jan
Re: Problems after reshaping of Raid5 array
on 03.12.2010 08:03:21 by Michele Bonera
On Tue, 30/11/2010 at 20.45 +0100, Jan Ceuleers wrote:
> On 30/11/10 08:23, Michele Bonera wrote:
> >> It appears that the partition you created on the new device started at a
> >> multiple of 64K. When this happens, the superblock at the end of the
> >> partition also looks valid when seen at the end of the whole device.
> >> Somehow mdadm got confused and chose the whole device (sdc) instead of the
> >> partition (sdc1).
> >
> > I did it and it worked! Thanks a lot Neil!!!
> Michele,
> Does it survive a reboot?
> Jan
Yes. Now it's working perfectly...
Bye
--
Michele Bonera
linux user group brescia