data corruption after rebuild
on 19.07.2011 15:55:35 by Pavel Herrmann
Hi,
I have a big problem with mdadm. I removed a drive from my RAID6, and after
replacing it the array started an online resync. I accidentally pushed the
computer and it shut down (the power cord moved), and after booting it again
the online resync continued.

The problem is that the rebuilt array is corrupted. Most of the data is fine,
but every few MB there is an error (which doesn't look like it was caused by a
crash), effectively invalidating all the data on the array (about 7TB, mainly
HD video).

I do monthly scans, so the redundancy syndromes should have been up to date.
The array is made of 8 disks; the setup is ext4 on LVM on mdraid.

Is there any way to fix this, or at least any ideas about what happened?
Re: data corruption after rebuild
on 19.07.2011 17:12:40 by Roman Mamedov
Hello,
On Tue, 19 Jul 2011 15:55:35 +0200
Pavel Herrmann wrote:
> The problem is that the rebuilt array is corrupted. Most of the data is fine,
> but every few MB there is an error (which doesn't look like it was caused by
> a crash), effectively invalidating all the data on the array (about 7TB,
> mainly HD video)
Which model of SATA controller/HBA do you use?
Kernel version, mdadm version?
Anything unusual in SMART reports of any of the drives (e.g. a nonzero UDMA CRC Error count)?
>
> I do monthly scans, so the redundancy syndromes should have been up to date.
> The array is made of 8 disks; the setup is ext4 on LVM on mdraid.
Did you notice any nonzero mismatch_cnt during those scans?
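(For reference, that per-array counter lives in sysfs and can be read at any
time; a minimal sketch in Python, assuming the array is md0:)

    # Read md's mismatch counter (units: 512-byte sectors found inconsistent
    # during the last check/repair pass).
    from pathlib import Path

    def mismatch_cnt(md="md0"):
        return int(Path(f"/sys/block/{md}/md/mismatch_cnt").read_text())

    print(mismatch_cnt("md0"))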
--
With respect,
Roman
Re: data corruption after rebuild
on 19.07.2011 18:18:56 by Pavel Herrmann
On Tuesday 19 of July 2011 21:12:40 Roman Mamedov wrote:
> Hello,
>
> On Tue, 19 Jul 2011 15:55:35 +0200
>
> Pavel Herrmann wrote:
> > The problem is that the rebuilt array is corrupted. Most of the data is
> > fine, but every few MB there is an error (which doesn't look like it was
> > caused by a crash), effectively invalidating all the data on the array
> > (about 7TB, mainly HD video)
>
> Which model of SATA controller/HBA do you use?
4 drives on ahci (ICH10R), 4 drives on sata_mv (adaptec 1430SA)
> Kernel version, mdadm version?
2.6.33-gentoo-r1, mdadm - v3.1.5 - 23rd March 2011
> Anything unusual in SMART reports of any of the drives (e.g. a nonzero UDMA
> CRC Error count)?
One current-pending-sector on the drive that was removed, and one on another drive.
>
> > I do monthly scans, so the redundancy syndromes should have been up to
> > date. The array is made of 8 disks; the setup is ext4 on LVM on mdraid.
>
> Did you notice any nonzero mismatch_cnt during those scans?
Where would I find this?
The syslog for the last scan shows just:
Jul 2 08:40:01 Bloomfield kernel: [83795.157876] md: data-check of RAID array md0
Jul 2 08:40:01 Bloomfield mdadm[2613]: RebuildStarted event detected on md device /dev/md0
Jul 2 09:46:41 Bloomfield mdadm[2613]: Rebuild21 event detected on md device /dev/md0
Jul 2 10:53:21 Bloomfield mdadm[2613]: Rebuild42 event detected on md device /dev/md0
Jul 2 12:00:02 Bloomfield mdadm[2613]: Rebuild61 event detected on md device /dev/md0
Jul 2 13:40:02 Bloomfield mdadm[2613]: Rebuild82 event detected on md device /dev/md0
Jul 2 16:02:46 Bloomfield kernel: [110348.161984] md: md0: data-check done.
Jul 2 16:02:47 Bloomfield mdadm[2613]: RebuildFinished event detected on md device /dev/md0, component device mismatches found: 72
I presume the nonzero "mismatches found" is a bad thing?
Just to mention, all files were fine two days ago (I keep md5sums of all files to check for bit rot).
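(A minimal sketch in Python of that kind of md5 check; the checksum file name
"checksums.md5" is made up for illustration, and the expected format is the
usual md5sum output, "<hex digest>  <path>" per line:)

    # Re-hash every file listed in an md5sum-style checksum file and report
    # any file whose current digest no longer matches the stored one.
    import hashlib

    def verify(listfile="checksums.md5"):
        bad = []
        with open(listfile) as fh:
            for line in fh:
                if not line.strip():
                    continue
                stored, path = line.rstrip("\n").split(None, 1)
                h = hashlib.md5()
                with open(path, "rb") as f:
                    for block in iter(lambda: f.read(1 << 20), b""):
                        h.update(block)
                if h.hexdigest() != stored:
                    bad.append(path)
        return bad

    for path in verify():
        print("CORRUPTED:", path)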
Re: data corruption after rebuild
on 19.07.2011 18:35:26 by Pavel Herrmann
On Tuesday 19 of July 2011 21:12:40 Roman Mamedov wrote:
> > I do monthly scans, so the redundancy syndromes should have been up to
> > date. The array is made of 8 disks; the setup is ext4 on LVM on mdraid.
>
> Did you notice any nonzero mismatch_cnt during those scans?
Oh crap, the rebuild finished with:
Jul 19 09:41:24 Bloomfield mdadm[3996]: RebuildFinished event detected on md device /dev/md0, component device mismatches found: 3184
This is really bad, I presume?
Re: data corruption after rebuild
on 19.07.2011 18:48:19 by Roman Mamedov
On Tue, 19 Jul 2011 18:35:26 +0200
Pavel Herrmann wrote:
> On Tuesday 19 of July 2011 21:12:40 Roman Mamedov wrote:
> > > I do monthly scans, so the redundancy syndromes should have been up to
> > > date. The array is made of 8 disks; the setup is ext4 on LVM on mdraid.
> >
> > Did you notice any nonzero mismatch_cnt during those scans?
>
> Oh crap, the rebuild finished with:
>
> Jul 19 09:41:24 Bloomfield mdadm[3996]: RebuildFinished event detected on md device /dev/md0, component device mismatches found: 3184
>
> This is really bad, I presume?
Well, this basically tells you what you already know - a part of the data you
have was corrupted. In this case I think it's 3184 512-byte sectors, which is
about 1.6MB total.
How it got there and how to prevent that from happening in the future - that's
a whole different question.
--
With respect,
Roman
Re: data corruption after rebuild
on 19.07.2011 19:05:39 by Pavel Herrmann
On Tuesday 19 of July 2011 22:48:19 Roman Mamedov wrote:
> On Tue, 19 Jul 2011 18:35:26 +0200
> Well, this basically tells you what you already know - a part of the data
> you have was corrupted. In this case I think it's 3184 512-byte sectors,
> which is about 1.6MB total.
The number seems too low to me. I have about 2000 video files larger than 1GB
on that array, and every one I tried is corrupted badly enough to produce
almost non-stop visual artifacts in hi-def video.
> How it got there and how to prevent that from
> happening in the future - that's a whole different question.
Would ZFS in raidz2 mode be much better than RAID6+ext4? I understand it's not
the topic of this list, but a file-level checksummed rebuild looks like a nice
feature.
Re: data corruption after rebuild
on 19.07.2011 19:38:56 by Roman Mamedov
On Tue, 19 Jul 2011 18:18:56 +0200
Pavel Herrmann wrote:
> On Tuesday 19 of July 2011 21:12:40 Roman Mamedov wrote:
> > Hello,
> >
> > On Tue, 19 Jul 2011 15:55:35 +0200
> >
> > Pavel Herrmann wrote:
> > > The problem is that the rebuilt array is corrupted. Most of the data is
> > > fine, but every few MB there is an error (which doesn't look like it was
> > > caused by a crash), effectively invalidating all the data on the array
> > > (about 7TB, mainly HD video)
> >
> > Which model of SATA controller/HBA do you use?
>
> 4 drives on ahci (ICH10R), 4 drives on sata_mv (adaptec 1430SA)
OK, one more guess - any Samsung drives, F4EG?
http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks
--
With respect,
Roman
Re: data corruption after rebuild
on 19.07.2011 19:44:24 by Pavel Herrmann
On Tuesday 19 of July 2011 23:38:56 Roman Mamedov wrote:
> On Tue, 19 Jul 2011 18:18:56 +0200
>
> Pavel Herrmann wrote:
> > On Tuesday 19 of July 2011 21:12:40 Roman Mamedov wrote:
> > > Which model of SATA controller/HBA do you use?
> >
> > 4 drives on ahci (ICH10R), 4 drives on sata_mv (adaptec 1430SA)
>
> OK, one more guess - any Samsung drives, F4EG?
> http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks
Nope, all disks are WDC_WD20EARS-00MVWB0;
4 were bought about 10 months ago, the other 4 about 6 months ago.
Re: data corruption after rebuild
on 19.07.2011 20:12:28 by Roman Mamedov
On Tue, 19 Jul 2011 19:05:39 +0200
Pavel Herrmann wrote:
> > How it got there and how to prevent that from
> > happening in the future - that's a whole different question.
>
> Would ZFS in raidz2 mode be much better than RAID6+ext4? I understand it's
> not the topic of this list, but a file-level checksummed rebuild looks like
> a nice feature.
Personally I prefer not to bother with ZFS; it brings way too many
complications into software choice, and I just want to use my favorite
GNU/Linux distro and not Solaris. I'd also rather not trust 12 TB of data to a
third-party kernel module or a FUSE driver which are barely tested and have an
uncertain future. I'd put more hope in BTRFS RAID5, but that one is a long way
from becoming a viable option too.
Regarding mdadm+RAID6, AFAIK it currently does not try to heal itself from
silent corruption inside a single chunk, even though that should be possible
with RAID6. On a repair, if the data chunks are readable with no I/O error,
they are considered to be the golden standard, and all parity chunks are simply
recalculated from the data and overwritten (also incrementing mismatch_cnt if
they changed). So maybe implementing a more advanced repair feature could give
protection against silent corruption not much weaker than what is offered by
per-file checksumming RAID implementations.
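(To illustrate what such a more advanced repair could do - this is not what
md's current repair pass does: with both P and Q intact, the index of a single
silently corrupted data chunk can be computed. A rough sketch in Python of the
underlying math, assuming the usual RAID6 conventions - generator g = 2 over
GF(2^8) with the 0x11d polynomial:)

    # GF(2^8) multiply with the RAID6 polynomial x^8 + x^4 + x^3 + x^2 + 1.
    def gf_mul(a, b):
        r = 0
        for _ in range(8):
            if b & 1:
                r ^= a
            b >>= 1
            a <<= 1
            if a & 0x100:
                a ^= 0x11d
        return r

    # Log/antilog tables for the generator 2.
    EXP = [0] * 512
    LOG = [0] * 256
    x = 1
    for i in range(255):
        EXP[i], LOG[x] = x, i
        x = gf_mul(x, 2)
    for i in range(255, 512):
        EXP[i] = EXP[i - 255]

    def syndromes(data):
        # P (plain XOR) and Q (weighted by g^i) over equal-length data chunks.
        p, q = bytearray(len(data[0])), bytearray(len(data[0]))
        for i, chunk in enumerate(data):
            for j, byte in enumerate(chunk):
                p[j] ^= byte
                q[j] ^= gf_mul(EXP[i], byte)
        return p, q

    def find_bad_data_disk(data, p_stored, q_stored):
        # Return the index of the single data chunk that disagrees with both
        # P and Q, or None if the stripe is consistent or the case is ambiguous.
        p_calc, q_calc = syndromes(data)
        suspects = set()
        for j in range(len(p_stored)):
            dp = p_stored[j] ^ p_calc[j]
            dq = q_stored[j] ^ q_calc[j]
            if dp == 0 and dq == 0:
                continue                  # this byte position is consistent
            if dp == 0 or dq == 0:
                return None               # P or Q itself is bad, not the data
            # dq = g^z * dp  =>  z = log(dq) - log(dp)  (mod 255)
            z = (LOG[dq] - LOG[dp]) % 255
            if z >= len(data):
                return None
            suspects.add(z)
        return suspects.pop() if len(suspects) == 1 else None

    # Tiny self-test: 4 data chunks, corrupt chunk 2, locate it.
    data = [bytes([d] * 8) for d in (0x11, 0x22, 0x33, 0x44)]
    p, q = syndromes(data)
    broken = list(data)
    broken[2] = bytes(b ^ 0x5A for b in broken[2])
    print(find_bad_data_disk(broken, p, q))   # -> 2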
--
With respect,
Roman
Re: data corruption after rebuild
on 20.07.2011 08:24:31 by NeilBrown
On Tue, 19 Jul 2011 15:55:35 +0200 Pavel Herrmann wrote:
> Hi,
>
> I have a big problem with mdadm. I removed a drive from my RAID6, and after
> replacing it the array started an online resync. I accidentally pushed the
> computer and it shut down (the power cord moved), and after booting it again
> the online resync continued.
>
> The problem is that the rebuilt array is corrupted. Most of the data is fine,
> but every few MB there is an error (which doesn't look like it was caused by
> a crash), effectively invalidating all the data on the array (about 7TB,
> mainly HD video).
>
> I do monthly scans, so the redundancy syndromes should have been up to date.
> The array is made of 8 disks; the setup is ext4 on LVM on mdraid.
>
> Is there any way to fix this, or at least any ideas about what happened?
My suggestion would be to remove the drive you recently added and then see if
the data is still corrupted. It may not help but is probably worth a try.
There was a bug prior to 2.6.32 where RAID6 could sometimes write the wrong
data when recovering to a spare. It would only happen if you were accessing
that data at the same time as it was recovering it, and if you were unlucky.
However, you are running a newer kernel so that shouldn't affect you, but you
never know.
BTW the monthly scans that you do are primarily for finding sleeping bad
blocks - blocks that you cannot read. They do check for inconsistencies in
the parity, but they only report them, they don't correct them. This is
because automatically correcting can cause more problems than it solves.
When the monthly check reported inconsistencies you "should" have confirmed
that all the drives seem to be functioning correctly and then run a 'repair'
pass to fix the parity blocks up.
As you didn't, that bad parity would have created bad data when you recovered.
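(For the archives: both passes are started by writing to the array's
sync_action attribute in sysfs - 'check' only counts mismatches, 'repair'
rewrites parity to match the data. A minimal sketch in Python, assuming the
array is md0 and root privileges:)

    # Start a scrub on an md array by writing to its sync_action attribute.
    from pathlib import Path

    def start_scrub(md="md0", action="check"):
        assert action in ("check", "repair")
        Path(f"/sys/block/{md}/md/sync_action").write_text(action + "\n")

    start_scrub("md0", "repair")
    # When it finishes, the result shows up in
    # /sys/block/md0/md/mismatch_cnt (and in the mdadm monitor log).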
NeilBrown
Re: data corruption after rebuild
on 20.07.2011 10:20:55 by Pavel Herrmann
On Wednesday 20 of July 2011 16:24:31 you wrote:
> My suggestion would be to remove the drive you recently added and then see
> if the data is still corrupted. It may not help but is probably worth a
> try.
Tried that, it did not help - probably because the finished rebuild had
already "repaired" all the parity data.
> There was a bug prior to 2.6.32 where RAID6 could sometimes write the wrong
> data when recovering to a spare. It would only happen if you were accessing
> that data at the same time as it was recovering it, and if you were unlucky.
I was accessing some data (read-mostly), but now the largest undamaged file on
the filesystem is just under 3MB - that looks a bit suspicious, as the stripe
width is 6x 512K = 3M.
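(The arithmetic behind that suspicion, assuming the 512K chunk size implied
above:)

    # Stripe width of an 8-disk RAID6: 2 of the 8 chunks per stripe are parity.
    disks, parity, chunk_kib = 8, 2, 512
    stripe_kib = (disks - parity) * chunk_kib
    print(stripe_kib, "KiB =", stripe_kib // 1024, "MiB per full data stripe")
    # -> 3072 KiB = 3 MiB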
> BTW the monthly scans that you do are primarily for finding sleeping bad
> blocks - blocks that you cannot read. They do check for inconsistencies in
> the parity, but they only report them, they don't correct them. This is
> because
> automatically correcting can cause more problems than it solves.
>
> When the monthly check reported inconsistencies you "should" have confirmed
> that all the drives seem to be functioning correctly and then run a 'repair'
> pass to fix the parity blocks up.
>
> As you didn't, that bad parity would have created bad data when you
> recovered.
See the last part: at this point I would be perfectly OK with 72 damaged
blocks, as per the last scan (or even a few hundred, for that matter).
PS: forgot to include the mailing list