dm-crypt over raid6 unreadable after crash

on 06.07.2011 18:12:28 by Louis-David Mitterrand

Hi,

After a hardware crash I can no longer open a dm-crypt partition located
directly over a md-raid6 partition. I get this error:

root@grml ~ # cryptsetup isLuks /dev/md1
Device /dev/md1 is not a valid LUKS device

It seems the LUKS header has been shifted a few bytes forward, but looks
otherwise fine to specialists on the dm-crypt mailing list. Normally the
"LUKS" signature should be at 0x00000000

Is there some way that the md layer could have shifted its contents?

Is there a way to shift it back in place?

Thanks,


Here is a hexdump of /dev/md1 done with "hd /dev/md1 | head -n 40"

00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00100000 4c 55 4b 53 ba be 00 01 61 65 73 00 00 00 00 00 |LUKS....aes.....|
00100010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100020 00 00 00 00 00 00 00 00 63 62 63 2d 65 73 73 69 |........cbc-essi|
00100030 76 3a 73 68 61 32 35 36 00 00 00 00 00 00 00 00 |v:sha256........|
00100040 00 00 00 00 00 00 00 00 73 68 61 31 00 00 00 00 |........sha1....|
00100050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100060 00 00 00 00 00 00 00 00 00 00 18 00 00 00 00 20 |............... |
00100070 d3 67 34 b9 7b 55 fe 0d e7 cc 5d df 6e ae f7 1f |.g4.{U....].n...|
00100080 96 78 43 c5 d2 55 58 4b a9 2a 73 8b 28 45 6b 8c |.xC..UXK.*s.(Ek.|
00100090 cd 26 45 61 6b f4 5a 58 6d 8e 1b 85 c2 60 4f 7c |.&Eak.ZXm....`O||
001000a0 00 39 48 89 00 00 6d dd 66 37 38 31 66 62 34 30 |.9H...m.f781fb40|
001000b0 2d 37 65 34 31 2d 34 30 63 62 2d 62 65 66 66 2d |-7e41-40cb-beff-|
001000c0 61 39 66 37 61 65 61 63 64 35 63 36 00 00 00 00 |a9f7aeacd5c6....|
001000d0 00 ac 71 f3 00 01 b8 3f f7 9d c2 ef e0 07 78 c3 |..q....?......x.|
001000e0 f6 90 73 57 39 be 8e 79 59 44 a1 23 35 11 6f 6a |..sW9..yYD.#5.oj|
001000f0 6c b0 36 fc 7d 9b 15 0f 00 00 00 08 00 00 0f a0 |l.6.}...........|
00100100 00 00 de ad 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100120 00 00 00 00 00 00 00 00 00 00 01 08 00 00 0f a0 |................|
00100130 00 00 de ad 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100150 00 00 00 00 00 00 00 00 00 00 02 08 00 00 0f a0 |................|
00100160 00 00 de ad 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100180 00 00 00 00 00 00 00 00 00 00 03 08 00 00 0f a0 |................|
00100190 00 00 de ad 00 00 00 00 00 00 00 00 00 00 00 00 |................|
001001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
001001b0 00 00 00 00 00 00 00 00 00 00 04 08 00 00 0f a0 |................|
001001c0 00 00 de ad 00 00 00 00 00 00 00 00 00 00 00 00 |................|
001001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
001001e0 00 00 00 00 00 00 00 00 00 00 05 08 00 00 0f a0 |................|
001001f0 00 00 de ad 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100210 00 00 00 00 00 00 00 00 00 00 06 08 00 00 0f a0 |................|
00100220 00 00 de ad 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100230 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00100240 00 00 00 00 00 00 00 00 00 00 07 08 00 00 0f a0 |................|
00100250 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|


Re: dm-crypt over raid6 unreadable after crash

on 06.07.2011 19:00:43 by Phil Turmel

Hi Louis-David,

On 07/06/2011 12:12 PM, Louis-David Mitterrand wrote:
> Hi,
>
> After a hardware crash I can no longer open a dm-crypt partition located
> directly over a md-raid6 partition. I get this error:
>
> root@grml ~ # cryptsetup isLuks /dev/md1
> Device /dev/md1 is not a valid LUKS device
>
> It seems the LUKS header has been shifted a few bytes forward, but looks
> otherwise fine to specialists on the dm-crypt mailing list. Normally the
> "LUKS" signature should be at 0x00000000
>
> Is there some way that the md layer could have shifted its contents?
>
> Is there a way to shift it back in place?
>
> Thanks,
>
>
> Here is a hexdump of /dev/md1 done with "hd /dev/md1 | head -n 40"
>
> 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> *
> 00100000 4c 55 4b 53 ba be 00 01 61 65 73 00 00 00 00 00 |LUKS....aes.....|
> 00100010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> 00100020 00 00 00 00 00 00 00 00 63 62 63 2d 65 73 73 69 |........cbc-essi|
> 00100030 76 3a 73 68 61 32 35 36 00 00 00 00 00 00 00 00 |v:sha256........|

[...]

The offset is precisely 1MB. This is the default data offset for metadata types 1.1 and 1.2 (nowadays). Metadata types 0.90 and 1.0 have a zero offset (the metadata is at the end.)
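
If you want to double-check that it really is a clean 1MB shift, a read-only
loop device at that offset is a harmless test. This is only an illustrative
sketch (the loop device name is whatever losetup hands back on your system):

losetup -r -o $((1024*1024)) -f --show /dev/md1   # prints e.g. /dev/loop1
cryptsetup isLuks /dev/loop1 && echo "LUKS header intact at +1MiB"
losetup -d /dev/loop1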

You don't say what your recovery efforts were, but I'd guess you did a "mdadm --create" somewhere in there, and didn't match the original parameters. Or you used an older version of mdadm than was used originally, and therefore got different defaults.

Another possibility is that the original array was set up on a 1MB aligned partition, and the array is now using the whole device. This can happen with v0.90 metadata. If so, the original partition table is obviously zeroed out now.

Please share more information about what you've done so far. Also show us the output of "mdadm -D /dev/md1" and then "mdadm -E /dev/xxx" for each of its components.

The output of "lsdrv"[1] would also be useful for visualizing your setup.

Regards,

Phil

[1] http://github.com/pturmel/lsdrv


Re: dm-crypt over raid6 unreadable after crash

on 07.07.2011 11:05:40 by Louis-David Mitterrand

On Wed, Jul 06, 2011 at 01:00:43PM -0400, Phil Turmel wrote:
> > After a hardware crash I can no longer open a dm-crypt partition located
> > directly over a md-raid6 partition. I get this error:
> >
> > root@grml ~ # cryptsetup isLuks /dev/md1
> > Device /dev/md1 is not a valid LUKS device
> >
> > It seems the LUKS header has been shifted a few bytes forward, but looks
> > otherwise fine to specialists on the dm-crypt mailing list. Normally the
> > "LUKS" signature should be at 0x00000000
> >
> > Is there some way that the md layer could have shifted its contents?
> >
> > Here is a hexdump of /dev/md1 done with "hd /dev/md1 | head -n 40"
> >
> > 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > *
> > 00100000 4c 55 4b 53 ba be 00 01 61 65 73 00 00 00 00 00 |LUKS....aes.....|
> > 00100010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > 00100020 00 00 00 00 00 00 00 00 63 62 63 2d 65 73 73 69 |........cbc-essi|
> > 00100030 76 3a 73 68 61 32 35 36 00 00 00 00 00 00 00 00 |v:sha256........|

Hi Phil,

> The offset is precisely 1MB. This is the default data offset for
> metadata types 1.1 and 1.2 (nowadays). Metadata types 0.90 and 1.0
> have a zero offset (the metadata is at the end.)
>
> You don't say what your recovery efforts were, but I'd guess you did a
> "mdadm --create" somewhere in there, and didn't match the original
> parameters. Or you used an older version of mdadm than was used
> originally, and therefore got different defaults.

No I did a mdadm-startall with a grml livecd.

> Another possibility is that the original array was set up on a 1MB
> aligned partition, and the array is now using the whole device. This
> can happen with v0.90 metadata. If so, the original partition table
> is obviously zeroed out now.
>
> Please share more information about what you've done so far. Also

Nothing apart from assembling the array and failing to decrypt it with
cryptsetup.

> show us the output of "mdadm -D /dev/md1"

/dev/md1:
Version : 1.2
Creation Time : Wed Oct 20 21:40:40 2010
Raid Level : raid6
Array Size : 841863168 (802.86 GiB 862.07 GB)
Used Dev Size : 140310528 (133.81 GiB 143.68 GB)
Raid Devices : 8
Total Devices : 8
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Thu Jul 7 09:44:49 2011
State : active
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 512K

Name : grml:1 (local to host grml)
UUID : 1434a46a:f2b751cd:8604803c:b545de8c
Events : 8292

Number Major Minor RaidDevice State
0 8 130 0 active sync /dev/sdi2
1 8 50 1 active sync /dev/sdd2
2 8 34 2 active sync /dev/sdc2
3 8 82 3 active sync /dev/sdf2
4 8 66 4 active sync /dev/sde2
5 8 146 5 active sync /dev/sdj2
8 8 114 6 active sync /dev/sdh2
7 8 98 7 active sync /dev/sdg2

> and then "mdadm -E /dev/xxx" for each of its components.


/dev/sdc2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 1434a46a:f2b751cd:8604803c:b545de8c
Name : grml:1 (local to host grml)
Creation Time : Wed Oct 20 21:40:40 2010
Raid Level : raid6
Raid Devices : 8

Avail Dev Size : 280621372 (133.81 GiB 143.68 GB)
Array Size : 1683726336 (802.86 GiB 862.07 GB)
Used Dev Size : 280621056 (133.81 GiB 143.68 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 0a10d6c3:8a6f1948:f1a546a4:32f10094

Internal Bitmap : 2 sectors from superblock
Update Time : Thu Jul 7 11:00:42 2011
Checksum : 16c1099b - correct
Events : 8292

Layout : left-symmetric
Chunk Size : 512K

Device Role : Active device 2
Array State : AAAAAAAA ('A' == active, '.' == missing)
/dev/sdd2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 1434a46a:f2b751cd:8604803c:b545de8c
Name : grml:1 (local to host grml)
Creation Time : Wed Oct 20 21:40:40 2010
Raid Level : raid6
Raid Devices : 8

Avail Dev Size : 280621372 (133.81 GiB 143.68 GB)
Array Size : 1683726336 (802.86 GiB 862.07 GB)
Used Dev Size : 280621056 (133.81 GiB 143.68 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 5f8c46cb:614354cf:dd91f7c2:f1260b2e

Internal Bitmap : 2 sectors from superblock
Update Time : Thu Jul 7 11:00:42 2011
Checksum : 5e277b71 - correct
Events : 8292

Layout : left-symmetric
Chunk Size : 512K

Device Role : Active device 1
Array State : AAAAAAAA ('A' == active, '.' == missing)
/dev/sde2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 1434a46a:f2b751cd:8604803c:b545de8c
Name : grml:1 (local to host grml)
Creation Time : Wed Oct 20 21:40:40 2010
Raid Level : raid6
Raid Devices : 8

Avail Dev Size : 280621372 (133.81 GiB 143.68 GB)
Array Size : 1683726336 (802.86 GiB 862.07 GB)
Used Dev Size : 280621056 (133.81 GiB 143.68 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : ab27b114:ea95aa0a:9adb310b:c456ee56

Internal Bitmap : 2 sectors from superblock
Update Time : Thu Jul 7 11:00:42 2011
Checksum : 5405e2d7 - correct
Events : 8292

Layout : left-symmetric
Chunk Size : 512K

Device Role : Active device 4
Array State : AAAAAAAA ('A' == active, '.' == missing)
/dev/sdf2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 1434a46a:f2b751cd:8604803c:b545de8c
Name : grml:1 (local to host grml)
Creation Time : Wed Oct 20 21:40:40 2010
Raid Level : raid6
Raid Devices : 8

Avail Dev Size : 280621372 (133.81 GiB 143.68 GB)
Array Size : 1683726336 (802.86 GiB 862.07 GB)
Used Dev Size : 280621056 (133.81 GiB 143.68 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 31bc8c85:7b754501:ea0b713e:2714810a

Internal Bitmap : 2 sectors from superblock
Update Time : Thu Jul 7 11:00:42 2011
Checksum : a24e44a0 - correct
Events : 8292

Layout : left-symmetric
Chunk Size : 512K

Device Role : Active device 3
Array State : AAAAAAAA ('A' == active, '.' == missing)
/dev/sdg2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 1434a46a:f2b751cd:8604803c:b545de8c
Name : grml:1 (local to host grml)
Creation Time : Wed Oct 20 21:40:40 2010
Raid Level : raid6
Raid Devices : 8

Avail Dev Size : 280621372 (133.81 GiB 143.68 GB)
Array Size : 1683726336 (802.86 GiB 862.07 GB)
Used Dev Size : 280621056 (133.81 GiB 143.68 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 4db23afd:d4422390:e39d701e:7223cc9e

Internal Bitmap : 2 sectors from superblock
Update Time : Thu Jul 7 11:00:42 2011
Checksum : 1d24a95f - correct
Events : 8292

Layout : left-symmetric
Chunk Size : 512K

Device Role : Active device 7
Array State : AAAAAAAA ('A' == active, '.' == missing)
/dev/sdh2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 1434a46a:f2b751cd:8604803c:b545de8c
Name : grml:1 (local to host grml)
Creation Time : Wed Oct 20 21:40:40 2010
Raid Level : raid6
Raid Devices : 8

Avail Dev Size : 280621372 (133.81 GiB 143.68 GB)
Array Size : 1683726336 (802.86 GiB 862.07 GB)
Used Dev Size : 280621056 (133.81 GiB 143.68 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 1f220bf2:1c86fc2b:0e99f2d2:8283497c

Internal Bitmap : 2 sectors from superblock
Update Time : Thu Jul 7 11:00:42 2011
Checksum : 3fcdb7b5 - correct
Events : 8292

Layout : left-symmetric
Chunk Size : 512K

Device Role : Active device 6
Array State : AAAAAAAA ('A' == active, '.' == missing)
/dev/sdi2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 1434a46a:f2b751cd:8604803c:b545de8c
Name : grml:1 (local to host grml)
Creation Time : Wed Oct 20 21:40:40 2010
Raid Level : raid6
Raid Devices : 8

Avail Dev Size : 280621372 (133.81 GiB 143.68 GB)
Array Size : 1683726336 (802.86 GiB 862.07 GB)
Used Dev Size : 280621056 (133.81 GiB 143.68 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 6acb87a4:3ac53237:1f5fff58:3611a0b0

Internal Bitmap : 2 sectors from superblock
Update Time : Thu Jul 7 11:00:42 2011
Checksum : b7e3f3da - correct
Events : 8292

Layout : left-symmetric
Chunk Size : 512K

Device Role : Active device 0
Array State : AAAAAAAA ('A' == active, '.' == missing)
/dev/sdj2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 1434a46a:f2b751cd:8604803c:b545de8c
Name : grml:1 (local to host grml)
Creation Time : Wed Oct 20 21:40:40 2010
Raid Level : raid6
Raid Devices : 8

Avail Dev Size : 280621372 (133.81 GiB 143.68 GB)
Array Size : 1683726336 (802.86 GiB 862.07 GB)
Used Dev Size : 280621056 (133.81 GiB 143.68 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 1515f5a7:b0c78638:8ca1d918:f1fa47d7

Internal Bitmap : 2 sectors from superblock
Update Time : Thu Jul 7 11:00:42 2011
Checksum : a3276c28 - correct
Events : 8292

Layout : left-symmetric
Chunk Size : 512K

Device Role : Active device 5
Array State : AAAAAAAA ('A' == active, '.' == missing)

> The output of "lsdrv"[1] would also be useful for visualizing your setup.

PCI [ata_piix] 00:1f.2 IDE interface: Intel Corporation 82801IB (ICH9) 2 port SATA IDE Controller (rev 02)
├─scsi 0:0:0:0 HL-DT-ST DVD+-RW GH50N {K1LA7D41849}
│  └─sr0: [11:0] Partitioned (dos) 224.00m 'grml64-medium_2011.05'
│     └─Mounted as /dev/sr0 @ /live/image
└─scsi 1:x:x:x [Empty]
PCI [mpt2sas] 02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 02)
├─scsi 2:0:0:0 ATA WDC WD1002FAEX-0 {WD-WCATR1851552}
│  └─sdc: [8:32] Partitioned (dos) 931.51g
│     ├─sdc1: [8:33] MD raid1 (2/8) 250.98m md0 clean in_sync {2871f814-ceb7-6a88-d8b7-8f6599226e41}
│     │  └─md0: [9:0] Partitioned (dos) 250.88m {d1d876e9-6905-4940-bf55-7cdb4b64484f}
│     ├─sdc2: [8:34] MD raid6 (2/8) 133.81g md1 clean in_sync 'grml:1' {1434a46a-f2b7-51cd-8604-803cb545de8c}
│     │  └─md1: [9:1] Empty/Unknown 802.86g
│     └─sdc3: [8:35] MD raid6 (0/8) 797.36g md2 clean in_sync 'zenon:2' {5c037ba3-ca4b-f7b9-eb8f-b01608e1fd3b}
│        └─md2: [9:2] (crypto_LUKS) 4.67t {1d30a244-9d40-48e8-925a-1d6c93a45474}
│           └─dm-0: [253:0] (xfs) 4.67t {3cad63a0-a586-43e0-bf89-5be9066c884f}
│              └─Mounted as /dev/mapper/cmd2 @ /backup
├─scsi 2:0:1:0 ATA WDC WD1002FAEX-0 {WD-WCATR2968402}
│  └─sdd: [8:48] Partitioned (dos) 931.51g
│     ├─sdd1: [8:49] MD raid1 (3/8) 250.98m md0 clean in_sync {2871f814-ceb7-6a88-d8b7-8f6599226e41}
│     ├─sdd2: [8:50] MD raid6 (1/8) 133.81g md1 clean in_sync 'grml:1' {1434a46a-f2b7-51cd-8604-803cb545de8c}
│     └─sdd3: [8:51] MD raid6 (5/8) 797.36g md2 clean in_sync 'zenon:2' {5c037ba3-ca4b-f7b9-eb8f-b01608e1fd3b}
├─scsi 2:0:2:0 ATA WDC WD1002FAEX-0 {WD-WCATR1851573}
│  └─sde: [8:64] Partitioned (dos) 931.51g
│     ├─sde1: [8:65] MD raid1 (7/8) 250.98m md0 clean in_sync {2871f814-ceb7-6a88-d8b7-8f6599226e41}
│     ├─sde2: [8:66] MD raid6 (4/8) 133.81g md1 clean in_sync 'grml:1' {1434a46a-f2b7-51cd-8604-803cb545de8c}
│     └─sde3: [8:67] MD raid6 (6/8) 797.36g md2 clean in_sync 'zenon:2' {5c037ba3-ca4b-f7b9-eb8f-b01608e1fd3b}
├─scsi 2:0:3:0 ATA WDC WD1002FAEX-0 {WD-WCATR3005506}
│  └─sdf: [8:80] Partitioned (dos) 931.51g
│     ├─sdf1: [8:81] MD raid1 (0/8) 250.98m md0 clean in_sync {2871f814-ceb7-6a88-d8b7-8f6599226e41}
│     ├─sdf2: [8:82] MD raid6 (3/8) 133.81g md1 clean in_sync 'grml:1' {1434a46a-f2b7-51cd-8604-803cb545de8c}
│     └─sdf3: [8:83] MD raid6 (4/8) 797.36g md2 clean in_sync 'zenon:2' {5c037ba3-ca4b-f7b9-eb8f-b01608e1fd3b}
├─scsi 2:0:4:0 ATA WDC WD1002FAEX-0 {WD-WCATR3007070}
│  └─sdg: [8:96] Partitioned (dos) 931.51g
│     ├─sdg1: [8:97] MD raid1 (6/8) 250.98m md0 clean in_sync {2871f814-ceb7-6a88-d8b7-8f6599226e41}
│     ├─sdg2: [8:98] MD raid6 (7/8) 133.81g md1 clean in_sync 'grml:1' {1434a46a-f2b7-51cd-8604-803cb545de8c}
│     └─sdg3: [8:99] MD raid6 (3/8) 797.36g md2 clean in_sync 'zenon:2' {5c037ba3-ca4b-f7b9-eb8f-b01608e1fd3b}
├─scsi 2:0:5:0 ATA WDC WD1002FAEX-0 {WD-WCATR3004862}
│  └─sdh: [8:112] Partitioned (dos) 931.51g
│     ├─sdh1: [8:113] MD raid1 (4/8) 250.98m md0 clean in_sync {2871f814-ceb7-6a88-d8b7-8f6599226e41}
│     ├─sdh2: [8:114] MD raid6 (6/8) 133.81g md1 clean in_sync 'grml:1' {1434a46a-f2b7-51cd-8604-803cb545de8c}
│     └─sdh3: [8:115] MD raid6 (1/8) 797.36g md2 clean in_sync 'zenon:2' {5c037ba3-ca4b-f7b9-eb8f-b01608e1fd3b}
├─scsi 2:0:6:0 ATA WDC WD1002FAEX-0 {WD-WCATR2969087}
│  └─sdi: [8:128] Partitioned (dos) 931.51g
│     ├─sdi1: [8:129] MD raid1 (1/8) 250.98m md0 clean in_sync {2871f814-ceb7-6a88-d8b7-8f6599226e41}
│     ├─sdi2: [8:130] MD raid6 (0/8) 133.81g md1 clean in_sync 'grml:1' {1434a46a-f2b7-51cd-8604-803cb545de8c}
│     └─sdi3: [8:131] MD raid6 (7/8) 797.36g md2 clean in_sync 'zenon:2' {5c037ba3-ca4b-f7b9-eb8f-b01608e1fd3b}
├─scsi 2:0:7:0 ATA WDC WD1002FAEX-0 {WD-WCATR2984361}
│  └─sdj: [8:144] Partitioned (dos) 931.51g
│     ├─sdj1: [8:145] MD raid1 (5/8) 250.98m md0 clean in_sync {2871f814-ceb7-6a88-d8b7-8f6599226e41}
│     ├─sdj2: [8:146] MD raid6 (5/8) 133.81g md1 clean in_sync 'grml:1' {1434a46a-f2b7-51cd-8604-803cb545de8c}
│     └─sdj3: [8:147] MD raid6 (2/8) 797.36g md2 clean in_sync 'zenon:2' {5c037ba3-ca4b-f7b9-eb8f-b01608e1fd3b}
└─scsi 2:x:x:x [Empty]
USB [usb-storage] Bus 002 Device 004: ID 0624:0249 Avocent Corp. {20080519}
└─scsi 3:0:0:0 iDRAC LCDRIVE
   └─sda: [8:0] Empty/Unknown 0.00k
USB [usb-storage] Bus 002 Device 004: ID 0624:0249 Avocent Corp. {20080519}
├─scsi 4:0:0:0 iDRAC Virtual CD
│  └─sr1: [11:1] Empty/Unknown 1.00g
└─scsi 4:0:0:1 iDRAC Virtual Floppy
   └─sdb: [8:16] Empty/Unknown 0.00k
Other Block Devices
└─loop0: [7:0] (squashfs) 199.44m
   └─Mounted as /dev/loop0 @ /grml64-medium.squashfs

> Regards,

Thanks,

Re: dm-crypt over raid6 unreadable after crash

on 07.07.2011 14:41:36 by Phil Turmel

On 07/07/2011 05:05 AM, Louis-David Mitterrand wrote:
> On Wed, Jul 06, 2011 at 01:00:43PM -0400, Phil Turmel wrote:
>>> After a hardware crash I can no longer open a dm-crypt partition located
>>> directly over a md-raid6 partition. I get this error:
>>>
>>> root@grml ~ # cryptsetup isLuks /dev/md1
>>> Device /dev/md1 is not a valid LUKS device
>>>
>>> It seems the LUKS header has been shifted a few bytes forward, but looks
>>> otherwise fine to specialists on the dm-crypt mailing list. Normally the
>>> "LUKS" signature should be at 0x00000000
>>>
>>> Is there some way that the md layer could have shifted its contents?
>>>
>>> Here is a hexdump of /dev/md1 done with "hd /dev/md1 | head -n 40"
>>>
>>> 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
>>> *
>>> 00100000 4c 55 4b 53 ba be 00 01 61 65 73 00 00 00 00 00 |LUKS....aes.....|
>>> 00100010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
>>> 00100020 00 00 00 00 00 00 00 00 63 62 63 2d 65 73 73 69 |........cbc-essi|
>>> 00100030 76 3a 73 68 61 32 35 36 00 00 00 00 00 00 00 00 |v:sha256........|
>
> Hi Phil,
>
>> The offset is precisely 1MB. This is the default data offset for
>> metadata types 1.1 and 1.2 (nowadays). Metadata types 0.90 and 1.0
>> have a zero offset (the metadata is at the end.)
>>
>> You don't say what your recovery efforts were, but I'd guess you did a
>> "mdadm --create" somewhere in there, and didn't match the original
>> parameters. Or you used an older version of mdadm than was used
>> originally, and therefore got different defaults.
>
> No I did a mdadm-startall with a grml livecd.

Very interesting.

>> Another possibility is that the original array was set up on a 1MB
>> aligned partition, and the array is now using the whole device. This
>> can happen with v0.90 metadata. If so, the original partition table
>> is obviously zeroed out now.
>>
>> Please share more information about what you've done so far. Also
>
> Nothing apart from assembling the array and failing to decrypt it with
> cryptsetup.

OK. And it's a v1.2 superblock, so mdadm very likely got it right.

>> show us the output of "mdadm -D /dev/md1"
>
> /dev/md1:
> Version : 1.2
> Creation Time : Wed Oct 20 21:40:40 2010
> Raid Level : raid6
> Array Size : 841863168 (802.86 GiB 862.07 GB)
> Used Dev Size : 140310528 (133.81 GiB 143.68 GB)
> Raid Devices : 8
> Total Devices : 8
> Persistence : Superblock is persistent
>
> Intent Bitmap : Internal
>
> Update Time : Thu Jul 7 09:44:49 2011
> State : active
> Active Devices : 8
> Working Devices : 8
> Failed Devices : 0
> Spare Devices : 0
>
> Layout : left-symmetric
> Chunk Size : 512K
>
> Name : grml:1 (local to host grml)
> UUID : 1434a46a:f2b751cd:8604803c:b545de8c
> Events : 8292
>
> Number Major Minor RaidDevice State
> 0 8 130 0 active sync /dev/sdi2
> 1 8 50 1 active sync /dev/sdd2
> 2 8 34 2 active sync /dev/sdc2
> 3 8 82 3 active sync /dev/sdf2
> 4 8 66 4 active sync /dev/sde2
> 5 8 146 5 active sync /dev/sdj2
> 8 8 114 6 active sync /dev/sdh2
> 7 8 98 7 active sync /dev/sdg2
>
>> and then "mdadm -E /dev/xxx" for each of its components.
>
>
> /dev/sdc2:
> Magic : a92b4efc
> Version : 1.2
> Feature Map : 0x1
> Array UUID : 1434a46a:f2b751cd:8604803c:b545de8c
> Name : grml:1 (local to host grml)
> Creation Time : Wed Oct 20 21:40:40 2010
> Raid Level : raid6
> Raid Devices : 8
>
> Avail Dev Size : 280621372 (133.81 GiB 143.68 GB)
> Array Size : 1683726336 (802.86 GiB 862.07 GB)
> Used Dev Size : 280621056 (133.81 GiB 143.68 GB)
> Data Offset : 2048 sectors
> Super Offset : 8 sectors
> State : clean
> Device UUID : 0a10d6c3:8a6f1948:f1a546a4:32f10094
>
> Internal Bitmap : 2 sectors from superblock
> Update Time : Thu Jul 7 11:00:42 2011
> Checksum : 16c1099b - correct
> Events : 8292
>
> Layout : left-symmetric
> Chunk Size : 512K
>
> Device Role : Active device 2
> Array State : AAAAAAAA ('A' == active, '.' == missing)

[...]

No oddities in the above. Both of my speculations are wrong.

>> The output of "lsdrv"[1] would also be useful for visualizing your setup.
>
> PCI [ata_piix] 00:1f.2 IDE interface: Intel Corporation 82801IB (ICH9) 2 port SATA IDE Controller (rev 02)
> ├─scsi 0:0:0:0 HL-DT-ST DVD+-RW GH50N {K1LA7D41849}
> │  └─sr0: [11:0] Partitioned (dos) 224.00m 'grml64-medium_2011.05'
> │     └─Mounted as /dev/sr0 @ /live/image
> └─scsi 1:x:x:x [Empty]
> PCI [mpt2sas] 02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 02)
> ├─scsi 2:0:0:0 ATA WDC WD1002FAEX-0 {WD-WCATR1851552}
> │  └─sdc: [8:32] Partitioned (dos) 931.51g
> │     ├─sdc1: [8:33] MD raid1 (2/8) 250.98m md0 clean in_sync {2871f814-ceb7-6a88-d8b7-8f6599226e41}
> │     │  └─md0: [9:0] Partitioned (dos) 250.88m {d1d876e9-6905-4940-bf55-7cdb4b64484f}
> │     ├─sdc2: [8:34] MD raid6 (2/8) 133.81g md1 clean in_sync 'grml:1' {1434a46a-f2b7-51cd-8604-803cb545de8c}
> │     │  └─md1: [9:1] Empty/Unknown 802.86g
> │     └─sdc3: [8:35] MD raid6 (0/8) 797.36g md2 clean in_sync 'zenon:2' {5c037ba3-ca4b-f7b9-eb8f-b01608e1fd3b}
> │        └─md2: [9:2] (crypto_LUKS) 4.67t {1d30a244-9d40-48e8-925a-1d6c93a45474}
> │           └─dm-0: [253:0] (xfs) 4.67t {3cad63a0-a586-43e0-bf89-5be9066c884f}
> │              └─Mounted as /dev/mapper/cmd2 @ /backup

So, cryptsetup saw and properly handled /dev/md2.

[...]

Well. /dev/md1 is assembled correctly as far as I can tell. That makes
me wonder what else might be in play. First, it would be good to know
if the luks data is truly intact at the 1MB offset. As a test, please
add a linear device mapper layer that skips the 1MB. Like so:

echo 0 1683724288 linear /dev/md1 2048 | dmsetup create mdtest

Depending on grml's udev setup, this may prompt you to unlock
/dev/mapper/mdtest. Otherwise, use cryptsetup to test it, and then
unlock it. Do *NOT* mount yet. Run "fsck -n" to see if it is intact.
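
For reference, the length in that table is just arithmetic on the mdadm -E
figures above -- a sketch of where the number comes from, nothing more:

# array size is 1683726336 sectors; skip the first 1MiB = 2048 sectors
echo $(( 1683726336 - 2048 ))   # -> 1683724288, the length used in the table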

I also wonder if the md device itself was partitioned, maybe with EFI GPT,
and the grml liveCD doesn't support it? (Long shot.) Please show
"zcat /proc/config.gz |grep PART"

I'm running out of ideas.

Phil

Re: dm-crypt over raid6 unreadable after crash

on 07.07.2011 15:09:18 by Louis-David Mitterrand

On Thu, Jul 07, 2011 at 08:41:36AM -0400, Phil Turmel wrote:
>
> So, cryptsetup saw and properly handled /dev/md2.
>
> [...]
>
> Well. /dev/md1 is assembled correctly as far as I can tell. That
> makes me wonder what else might be in play. First, it would be good to
> know if the luks data is truly intact at the 1MB offset. As a test,
> please add a linear device mapper layer that skips the 1MB. Like so:
>
> echo 0 1683724288 linear /dev/md1 2048 | dmsetup create mdtest

That worked fine.

I then successfully unlocked mdtest with:

cryptsetup luksOpen /dev/mapper/mdtest cmd1

> Depending on grml's udev setup, this may prompt you to unlock
> /dev/mapper/mdtest. Otherwise, use cryptsetup to test it, and then
> unlock it. Do *NOT* mount yet. Run "fsck -n" to see if it is intact.

root@grml /dev/mapper # xfs_check /dev/mapper/cmd1
xfs_check: /dev/mapper/cmd1 is not a valid XFS filesystem (unexpected SB magic number 0xd0f1b462)
xfs_check: size check failed
xfs_check: WARNING - filesystem uses v1 dirs,limited functionality provided.
xfs_check: read failed: Invalid argument
xfs_check: data size check failed
cache_node_purge: refcount was 1, not zero (node=0x1835a30)
xfs_check: cannot read root inode (22)
cache_node_purge: refcount was 1, not zero (node=0x1835c80)
xfs_check: cannot read realtime bitmap inode (22)
xfs_check: size check failed
xfs_check: WARNING - filesystem uses v1 dirs,limited functionality provided.
xfs_check: read failed: Invalid argument
xfs_check: data size check failed
bad superblock magic number d0f1b462, giving up

> I also wonder if the md device itself was partitioned, maybe with EFI
> GPT, and the grml liveCD doesn't support it? (Long shot.) Please
> show "zcat /proc/config.gz |grep PART"

root@grml /dev/mapper # zcat /proc/config.gz |grep PART
CONFIG_PM_STD_PARTITION=""
CONFIG_MTD_PARTITIONS=y
CONFIG_MTD_REDBOOT_PARTS=m
# CONFIG_MTD_REDBOOT_PARTS_UNALLOCATED is not set
# CONFIG_MTD_REDBOOT_PARTS_READONLY is not set
CONFIG_MTD_AR7_PARTS=m
CONFIG_PARTITION_ADVANCED=y
CONFIG_ACORN_PARTITION=y
# CONFIG_ACORN_PARTITION_CUMANA is not set
# CONFIG_ACORN_PARTITION_EESOX is not set
CONFIG_ACORN_PARTITION_ICS=y
# CONFIG_ACORN_PARTITION_ADFS is not set
# CONFIG_ACORN_PARTITION_POWERTEC is not set
# CONFIG_ACORN_PARTITION_RISCIX is not set
CONFIG_OSF_PARTITION=y
CONFIG_AMIGA_PARTITION=y
CONFIG_ATARI_PARTITION=y
CONFIG_MAC_PARTITION=y
CONFIG_MSDOS_PARTITION=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_LDM_PARTITION=y
CONFIG_SGI_PARTITION=y
CONFIG_ULTRIX_PARTITION=y
CONFIG_SUN_PARTITION=y
CONFIG_KARMA_PARTITION=y
CONFIG_EFI_PARTITION=y
CONFIG_SYSV68_PARTITION=y

> I'm running out of ideas.

Well thanks a lot for trying at least. At least now I understand how
'linear' can be used.

Re: dm-crypt over raid6 unreadable after crash

on 07.07.2011 15:32:38 by Phil Turmel

On 07/07/2011 09:09 AM, Louis-David Mitterrand wrote:
> On Thu, Jul 07, 2011 at 08:41:36AM -0400, Phil Turmel wrote:
>>
>> So, cryptsetup saw and properly handled /dev/md2.
>>
>> [...]
>>
>> Well. /dev/md1 is assembled correctly as far as I can tell. That
>> makes me wonder what else might be in play. First, it would be good to
>> know if the luks data is truly intact at the 1MB offset. As a test,
>> please add a linear device mapper layer that skips the 1MB. Like so:
>>
>> echo 0 1683724288 linear /dev/md1 2048 | dmsetup create mdtest
>
> That worked fine.
>
> I then successfully unlocked mdtest with:
>
> cryptsetup luksOpen /dev/mapper/mdtest cmd1
>
>> Depending on grml's udev setup, this may prompt you to unlock
>> /dev/mapper/mdtest. Otherwise, use cryptsetup to test it, and then
>> unlock it. Do *NOT* mount yet. Run "fsck -n" to see if it is intact.
>
> root@grml /dev/mapper # xfs_check /dev/mapper/cmd1
> xfs_check: /dev/mapper/cmd1 is not a valid XFS filesystem (unexpected SB magic number 0xd0f1b462)
> xfs_check: size check failed

This is important. When I computed the sector count for the linear mapping, I just took 2048 off the end. You may want to select a sector count that aligns the endpoint.

> xfs_check: WARNING - filesystem uses v1 dirs,limited functionality provided.
> xfs_check: read failed: Invalid argument
> xfs_check: data size check failed
> cache_node_purge: refcount was 1, not zero (node=0x1835a30)
> xfs_check: cannot read root inode (22)
> cache_node_purge: refcount was 1, not zero (node=0x1835c80)
> xfs_check: cannot read realtime bitmap inode (22)
> xfs_check: size check failed
> xfs_check: WARNING - filesystem uses v1 dirs,limited functionality provided.
> xfs_check: read failed: Invalid argument
> xfs_check: data size check failed
> bad superblock magic number d0f1b462, giving up

If the md array is assembled with devices out of order, the initialization vectors for all the sectors are likely to be wrong, with no way to examine the contents to try to work out the device order. Of course, I don't know of any way mdadm can screw up the device order with v1+ metadata.

>> I also wonder if the md device itself was partitioned, maybe with EFI
>> GPT, and the grml liveCD doesn't support it? (Long shot.) Please
>> show "zcat /proc/config.gz |grep PART"
>
> root@grml /dev/mapper # zcat /proc/config.gz |grep PART
> CONFIG_PM_STD_PARTITION=""
> CONFIG_MTD_PARTITIONS=y
> CONFIG_MTD_REDBOOT_PARTS=m
> # CONFIG_MTD_REDBOOT_PARTS_UNALLOCATED is not set
> # CONFIG_MTD_REDBOOT_PARTS_READONLY is not set
> CONFIG_MTD_AR7_PARTS=m
> CONFIG_PARTITION_ADVANCED=y
> CONFIG_ACORN_PARTITION=y
> # CONFIG_ACORN_PARTITION_CUMANA is not set
> # CONFIG_ACORN_PARTITION_EESOX is not set
> CONFIG_ACORN_PARTITION_ICS=y
> # CONFIG_ACORN_PARTITION_ADFS is not set
> # CONFIG_ACORN_PARTITION_POWERTEC is not set
> # CONFIG_ACORN_PARTITION_RISCIX is not set
> CONFIG_OSF_PARTITION=y
> CONFIG_AMIGA_PARTITION=y
> CONFIG_ATARI_PARTITION=y
> CONFIG_MAC_PARTITION=y
> CONFIG_MSDOS_PARTITION=y
> CONFIG_MINIX_SUBPARTITION=y
> CONFIG_SOLARIS_X86_PARTITION=y
> CONFIG_LDM_PARTITION=y
> CONFIG_SGI_PARTITION=y
> CONFIG_ULTRIX_PARTITION=y
> CONFIG_SUN_PARTITION=y
> CONFIG_KARMA_PARTITION=y
> CONFIG_EFI_PARTITION=y
^^^
So that's not it.

> CONFIG_SYSV68_PARTITION=y
>
>> I'm running out of ideas.
>
> Well thanks a lot for trying at least. At least now I understand how
> 'linear' can be used.

Sorry I couldn't help more.

Phil

Re: dm-crypt over raid6 unreadable after crash

on 08.07.2011 07:40:45 by Luca Berra

On Thu, Jul 07, 2011 at 09:32:38AM -0400, Phil Turmel wrote:
>On 07/07/2011 09:09 AM, Louis-David Mitterrand wrote:
>> On Thu, Jul 07, 2011 at 08:41:36AM -0400, Phil Turmel wrote:
....
>>> echo 0 1683724288 linear /dev/md1 2048 | dmsetup create mdtest
>>
.....
>>
>> root@grml /dev/mapper # xfs_check /dev/mapper/cmd1
>> xfs_check: /dev/mapper/cmd1 is not a valid XFS filesystem (unexpected SB magic number 0xd0f1b462)
>> xfs_check: size check failed
>
>This is important. When I computed the sector count for the linear mapping, I just took 2048 off the end. You may want to select a sector count that aligns the endpoint.
but the xfs sb should be at sector 0

>If the md array is assembled with devices out of order, the initialization vectors for all the sectors are likely to be wrong, with no way to examine the contents to try to work out the device order. Of course, I don't know of any way mdadm can screw up the device order with v1+ metadata.
>
>>> I also wonder if the md device itself was partitioned, maybe with EFI
>>> GPT, and the grml liveCD doesn't support it? (Long shot.) Please
>>> show "zcat /proc/config.gz |grep PART"
if the md itself were partitioned i would expect to see a partition
table somewhere between the start and the luks header, but from the
posted od it is 1M of zeroes.

also from the lsdrv i see another array named zenon, while this is named
grml (i suppose the hostname of your server is the former), this makes
me wonder...
i do not have a debian, but from what i found of the man-page the script
should be harmless, but.
the luks header starts at 1M, which is a multiple of the 512K chunk
size, which could be a symptom of mis-ordered devices.
did you keep any log when you ran the mdadm-startall?
did md1 resync?
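
purely as an illustration of that suspicion -- a quick scan of the first
few chunk-aligned offsets for the LUKS magic would show whether the header
simply landed on a different chunk boundary (device and chunk size are
just this array's values, sketch only):

# 512K chunk = 524288 bytes; probe the first 8 MiB at 512K steps
for off in $(seq 0 524288 8388608); do
    sig=$(dd if=/dev/md1 bs=1 skip=$off count=4 2>/dev/null)
    [ "$sig" = "LUKS" ] && echo "LUKS magic at byte offset $off"
done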

L.


--
Luca Berra -- bluca@comedia.it

Re: dm-crypt over raid6 unreadable after crash

on 10.07.2011 14:03:50 by Louis-David Mitterrand

On Fri, Jul 08, 2011 at 07:40:45AM +0200, Luca Berra wrote:
> >This is important. When I computed the sector count for the linear mapping, I just took 2048 off the end. You may want to select a sector count that aligns the endpoint.
> but the xfs sb should be at sector 0

So what should I change to the dmsetup command to make it work?

Re: dm-crypt over raid6 unreadable after crash

on 10.07.2011 14:28:27 by Phil Turmel

On 07/10/2011 08:03 AM, Louis-David Mitterrand wrote:
> On Fri, Jul 08, 2011 at 07:40:45AM +0200, Luca Berra wrote:
>>> This is important. When I computed the sector count for the linear mapping, I just took 2048 off the end. You may want to select a sector count that aligns the endpoint.
>> but the xfs sb should be at sector 0
>
> So what should I change to the dmsetup command to make it work?

He means (not to put words in Luca's mouth...) that the precise endpoint of the mapped volume shouldn't matter to fsck.xfs. Changing device order in the raid is probably your only hope of recovery. The dmsetup exercise was a blind alley.

Luca also pointed out that the problem array is named "grml" which means that it was created with grml, not your original system (zenon). That suggests that "mdadm --create" was used under grml, and the member devices were specified in an order differing from the original install. If that "mdadm --create" didn't include the "--assume-clean" option, then the parity blocks are almost certainly recomputed, and your data destroyed. Otherwise, you can try "mdadm --create --assume-clean" with other combinations of device order to try to find the "right" one.

I recommend trying "mdadm --create --assume-clean" with the devices in the same order as shown by lsdrv for the zenon array.
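
A rough sketch of what that might look like (treat it as illustrative only:
the parameters must match what mdadm -E reported -- metadata 1.2, chunk 512K,
left-symmetric -- and the member order below is simply the zenon md2 role
order from lsdrv, transposed onto the *2 partitions):

mdadm --stop /dev/md1
mdadm --create /dev/md1 --assume-clean --metadata=1.2 \
      --level=6 --raid-devices=8 --chunk=512 --layout=left-symmetric \
      /dev/sdc2 /dev/sdh2 /dev/sdj2 /dev/sdg2 /dev/sdf2 /dev/sdd2 /dev/sde2 /dev/sdi2

Afterwards, verify with "mdadm -E /dev/sdc2" that the Data Offset still comes
out at 2048 sectors, and test read-only with "cryptsetup isLuks /dev/md1"
before writing anything to the array.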

Phil

Re: dm-crypt over raid6 unreadable after crash

on 11.07.2011 01:44:21 by Stan Hoeppner

On 7/10/2011 7:28 AM, Phil Turmel wrote:
> On 07/10/2011 08:03 AM, Louis-David Mitterrand wrote:
>> On Fri, Jul 08, 2011 at 07:40:45AM +0200, Luca Berra wrote:
>>>> This is important. When I computed the sector count for the linear mapping, I just took 2048 off the end. You may want to select a sector count that aligns the endpoint.
>>> but the xfs sb should be at sector 0
>>
>> So what should I change to the dmsetup command to make it work?
>
> He means (not to put words in Luca's mouth...) that the precise endpoint of the mapped volume shouldn't matter to fsck.xfs. Changing device order in the raid is probably your only hope of recovery. The dmsetup exercise was a blind alley.
>
> Luca also pointed out that the problem array is named "grml" which means that it was created with grml, not your original system (zenon). That suggests that "mdadm --create" was used under grml, and the member devices were specified in an order differing from the original install. If that "mdadm --create" didn't include the "--assume-clean" option, then the parity blocks are almost certainly recomputed, and your data destroyed. Otherwise, you can try "mdadm --create --assume-clean" with other combinations of device order to try to find the "right" one.
>
> I recommend trying "mdadm --create --assume-clean" with the devices in the same order as shown by lsdrv for the zenon array.

Moral of this story: Using live CDs to troubleshoot Linux RAID arrays,
LVM volumes, dmcrypt, etc, is tricky at best, fraught with disaster at
worst. Always have a "system rescue" CD/DVD built from the particular
machine in question at the ready. Most distros have a facility for
building such a per machine boot/rescue CD. Such media will have the
same kernel and module versions, the system's mdadm configuration, and
all other system specific stuff needed to properly troubleshoot and repair.
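
One cheap habit that complements such rescue media (just a suggestion, using
the usual Debian paths and the device names from this thread) is keeping a
copy of the array geometry somewhere off the machine, so a forced re-create
can reproduce the original parameters exactly:

mdadm --detail --scan >> /etc/mdadm/mdadm.conf       # record array UUIDs/names
mdadm -E /dev/sd[c-j]2 > /root/md1-superblocks.txt   # stash a copy elsewhere too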

If such rescue media had been used, the OP would likely have experienced
far less grief, and might already have had this problem fixed.

--
Stan

Re: dm-crypt over raid6 unreadable after crash

on 11.07.2011 09:54:16 by Louis-David Mitterrand

On Sun, Jul 10, 2011 at 08:28:27AM -0400, Phil Turmel wrote:
> On 07/10/2011 08:03 AM, Louis-David Mitterrand wrote:
> > On Fri, Jul 08, 2011 at 07:40:45AM +0200, Luca Berra wrote:
> >>> This is important. When I computed the sector count for the linear mapping, I just took 2048 off the end. You may want to select a sector count that aligns the endpoint.
> >> but the xfs sb should be at sector 0
> >
> > So what should I change to the dmsetup command to make it work?
>
> He means (not to put words in Luca's mouth...) that the precise
> endpoint of the mapped volume shouldn't matter to fsck.xfs. Changing
> device order in the raid is probably your only hope of recovery. The
> dmsetup exercise was a blind alley.
>
> Luca also pointed out that the problem array is named "grml" which
> means that it was created with grml, not your original system (zenon).
> That suggests that "mdadm --create" was used under grml, and the
> member devices were specified in an order differing from the original
> install. If that "mdadm --create" didn't include the "--assume-clean"
> option, then the parity blocks are almost certainly recomputed, and
> your data destroyed. Otherwise, you can try "mdadm --create
> --assume-clean" with other combinations of device order to try to find
> the "right" one.
>
> I recommend trying "mdadm --create --assume-clean" with the devices in
> the same order as shown by lsdrv for the zenon array.

That clinched the deal!

By reordering the md1 array according to the working 'zenon' md2 array I
was able to unlock it with cryptsetup. However the xfs filesystem was
too damaged: an xfs_repair sent everything in lost+found.

That lsdrv tool is really useful, I'll keep it on my box.

Thanks,

Re: dm-crypt over raid6 unreadable after crash

on 11.07.2011 09:57:30 by Louis-David Mitterrand

On Sun, Jul 10, 2011 at 06:44:21PM -0500, Stan Hoeppner wrote:
>
> Moral of this story: Using live CDs to troubleshoot Linux RAID arrays,
> LVM volumes, dmcrypt, etc, is tricky at best, fraught with disaster at
> worst. Always have a "system rescue" CD/DVD built from the particular
> machine in question at the ready. Most distros have a facility for
> building such a per machine boot/rescue CD. Such media will have the
> same kernel and module versions, the system's mdadm configuration, and
> all other system specific stuff needed to properly troubleshoot and repair.
>
> If such rescue media had been used the OP would have likely experienced
> far less grief, and might already had this problem fixed.

A live CD like grml can properly auto-start a md array. My problem
came from doing a --create instead of an --assemble at some point.
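
For the record, the non-destructive way to bring an existing array up from a
rescue environment is an assemble, not a create -- --create rewrites the
superblocks, so it should only ever be a last resort, and then only with
--assume-clean:

mdadm --assemble --scan                    # use the existing superblocks as found
mdadm --assemble /dev/md1 /dev/sd[c-j]2    # or name the members explicitly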

Re: dm-crypt over raid6 unreadable after crash

on 11.07.2011 13:49:47 by Stan Hoeppner

On 7/11/2011 2:57 AM, Louis-David Mitterrand wrote:
> On Sun, Jul 10, 2011 at 06:44:21PM -0500, Stan Hoeppner wrote:
>>
>> Moral of this story: Using live CDs to troubleshoot Linux RAID arrays,
>> LVM volumes, dmcrypt, etc, is tricky at best, fraught with disaster at
>> worst. Always have a "system rescue" CD/DVD built from the particular
>> machine in question at the ready. Most distros have a facility for
>> building such a per machine boot/rescue CD. Such media will have the
>> same kernel and module versions, the system's mdadm configuration, and
>> all other system specific stuff needed to properly troubleshoot and repair.
>>
>> If such rescue media had been used the OP would have likely experienced
>> far less grief, and might already had this problem fixed.
>
> A live CD like grml can properly auto-start a md array. My problem
> came from doing a --create instead of an --assemble at some point.

Why did it not auto-start your md array in this case, then?

--
Stan