Confusion with setting up new RAID6 with mdadm

on 14.11.2010 16:36:38 by Zoltan Szecsei

Hi,
I hope this is the correct list to address this on - I've done a lot of
typing for nothing, if not :-)

I have done days of research, including reading
https://raid.wiki.kernel.org/index.php/Linux_Raid, but all I am doing is
getting confused in the detail.

My goal is to set up an 8*2TB SiI3132-based RAID6 on Ubuntu 10.04 LTS,
with LVM and ext4.
The setup will mostly hold thousands of 400MB image files, and they will
not be accessed regularly - they mostly just need to be online. The
entire space on all 8 drives can be used, and I want 1 massive
filesystem when I finally mount this RAID device. No boot, root or swap.

I have gone quite far with the help of the local Linux group, but after
I had completed the 27-hour mdadm --create run, further tidbits were
thrown at me, and I am trying to get an opinion on whether it is worth
scrapping this effort and starting again.



Please can someone provide clarity on:

*If I have to reformat the drives and redo mdadm --create, other than
mdadm stop, how can I get rid of all the /dev/md* etc. so that when I
restart this exercise, the original bad RAID does not interfere with
this new attempt?



*Partition alignment?
Is this relevant for modern HDs? (I'm using 5900rpm Seagate 2TB drives.)
None of the mdadm guides I've googled or received speak about how to
correctly format the drives before running mdadm --create.
All the benchmarks & performance tests I've found do not bother to say
whether they have aligned the partitions on the HD.

*What is the correct fdisk or parted method to get rid of the DOS & GPT
flags and create a correctly aligned partition, and should this be a
0xda partition (& then I use metadata 1.2 for mdadm)?


*Chunk size:
After reading MANY different opinions, I'm guessing staying at the
default chunk size is optimal? Anyone want to add to this argument?


*After partitioning the 8 drives, is this the correct sequence?
mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64
--level=raid6 --raid-devices=8 /dev/sd[abcdefgh]1
mdadm --detail --scan >> /etc/mdadm.conf
mdadm --assemble /dev/md0 /dev/sd[abcdefg]1


*After this, do I mkfs ext4 first, or LVM first?

*What stride and stripe values should I use?


If you've read this far: Wow! - big thanks.
If you're going to venture some help or affirmation - BIGGER thanks! :-)

Kind regards to all,
Zoltan



This is where I am, but I'd like to get it right, so am happy to delete
& restart if the current state is not fixable:

**************************************************

root@gs0:/home/geograph# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md_d0 : active raid6 sde1[4] sdg1[6] sdh1[7] sdc1[2] sda1[0] sdb1[1] sdd1[3] sdf1[5]
      11721071616 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

unused devices: <none>
root@gs0:/home/geograph#
**************************************************

root@gs0:/home/geograph# mdadm -E /dev/md_d0
mdadm: No md superblock detected on /dev/md_d0.

**************************************************
root@gs0:/dev# ls -la /dev/md*
brw-rw---- 1 root disk 254, 0 2010-11-13 16:41 /dev/md_d0
lrwxrwxrwx 1 root root 7 2010-11-13 16:41 /dev/md_d0p1 -> md/d0p1
lrwxrwxrwx 1 root root 7 2010-11-13 16:41 /dev/md_d0p2 -> md/d0p2
lrwxrwxrwx 1 root root 7 2010-11-13 16:41 /dev/md_d0p3 -> md/d0p3
lrwxrwxrwx 1 root root 7 2010-11-13 16:41 /dev/md_d0p4 -> md/d0p4

/dev/md:
total 0
drwxrwx--- 2 root disk 140 2010-11-13 16:41 .
drwxr-xr-x 18 root root 4520 2010-11-14 11:42 ..
brw------- 1 root root 254, 0 2010-11-13 16:41 d0
brw------- 1 root root 254, 1 2010-11-13 16:41 d0p1
brw------- 1 root root 254, 2 2010-11-13 16:41 d0p2
brw------- 1 root root 254, 3 2010-11-13 16:41 d0p3
brw------- 1 root root 254, 4 2010-11-13 16:41 d0p4
root@gs0:/dev#

***********************************************
root@gs0:/home/geograph# fdisk -lu

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
256 heads, 63 sectors/track, 242251 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               63  3907024127  1953512032+  fd  Linux raid autodetect

(All 8 disks are as above)
************************************************

root@gs0:/home/geograph# parted /dev/sde
GNU Parted 2.2
Using /dev/sde
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Warning: /dev/sde contains GPT signatures, indicating that it has a GPT
table. However, it does not have a valid fake msdos partition table, as
it should. Perhaps it was corrupted -- possibly by a program that doesn't
understand GPT partition tables. Or perhaps you deleted the GPT table,
and are now using an msdos partition table. Is this a GPT partition table?
Yes/No? yes
Model: ATA ST32000542AS (scsi)
Disk /dev/sde: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End    Size   File system  Name                          Flags
 1      17.4kB  134MB  134MB               Microsoft reserved partition  msftres

(parted)

****************************************************




-- 

=============================================
Zoltan Szecsei PrGISc [PGP0031]
Geograph (Pty) Ltd.
P.O. Box 7, Muizenberg 7950, South Africa.

65 Main Road, Muizenberg 7945
Western Cape, South Africa.

34° 6'16.35"S 18°28'5.62"E

Tel: +27-21-7884897 Mobile: +27-83-6004028
Fax: +27-86-6115323 www.geograph.co.za
=============================================





Re: Confusion with setting up new RAID6 with mdadm

on 14.11.2010 17:48:53 by Mikael Abrahamsson

On Sun, 14 Nov 2010, Zoltan Szecsei wrote:

> *If I have to reformat the drives and redo mdadm --create, other than mdadm
> stop, how can I get rid of all the /dev/md* etc etc so that when I restart
> this exercise, the original bad RAID does not interfere with this new
> attempt?

Look into "--zero-superblock" for all drives.
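
For example (a sketch; it assumes the old array is /dev/md_d0 and its
members are /dev/sd[a-h]1 - adjust to your setup, this erases md metadata):

    mdadm --stop /dev/md_d0                  # stop the old array first
    mdadm --zero-superblock /dev/sd[a-h]1    # wipe md metadata from each member

After that the old array can no longer be auto-assembled.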

> *Partition alignment?
> Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
> None of the mdadm helps I've googled or received speak about how to correctly
> format the drives before running mdadm --create.
> All the benchmarks & performance tests I've found, do not bother to say
> whether they have aligned the partitions on the HD

My recommendation is to not use partitions at all; just use the whole
device (/dev/sdX).

> *What is the correct fdisk or parted method get rid of the DOS & GPT flags,
> and create a correctly aligned partition, and should this be a 0xda partiton
> (& then I use metatdata 1.2 for mdadm)?

I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the first
megabyte of the drive to get rid of the partition table (you get rid of
the v1.2 metadata at the same time actually). Then you know for sure
you're correctly aligned as well as md is 4k aligned.
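
If it helps, the same dd can be run across all eight drives in one go (a
sketch; double-check the device list first, this is destructive):

    for d in /dev/sd[a-h]; do dd if=/dev/zero of=$d bs=1024000 count=1; done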

> *Chunk size:
> After reading MANY different opinions, I'm guessing staying at the default
> chunk size is optimal? Anyone want to add to this argument?

Default should be fine.

> *After this, do I mkfs ext4 first, or LVM first?

LVM first, if you want to use LVM. Filesystems live in LVs in the LVM concept.

--
Mikael Abrahamsson email: swmike@swm.pp.se

Re: Confusion with setting up new RAID6 with mdadm

on 14.11.2010 20:50:36 by Luca Berra

On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
> *If I have to reformat the drives and redo mdadm --create, other than mdadm
> stop, how can I get rid of all the /dev/md* etc etc so that when I restart
> this exercise, the original bad RAID does not interfere with this new
> attempt?

mdadm -Ss
mdadm --zero-superblock on each partition
>
>
> *Partition alignment?
> Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
For modern HDDs with 4k sectors it is.
New fdisk and/or parted should already know how to align.
In any case, since you want to use the whole space for RAID, why create
partitions at all? md works nicely without them.

> *Chunk size:
> After reading MANY different opinions, I'm guessing staying at the default
> chunk size is optimal? Anyone want to add to this argument?
I believe the default in newer mdadm is fine.

>
> *After partitioning the 8 drives, is this the correct sequence?
> mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64 --level=raid6
Why do you state the chunk size here? I thought you wanted to stay with
the default.
> --raid-devices=8 /dev/sd[abcdefgh]1
> mdadm --detail --scan >> /etc/mdadm.conf
> mdadm --assemble /dev/md0 /dev/sd[abcdefg]1
It should already be assembled after create, and once you have appended the
info to mdadm.conf you just need mdadm --assemble /dev/md0 or mdadm
--assemble --scan.
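
Roughly like this (a sketch; on some distributions the config file lives at
/etc/mdadm/mdadm.conf instead of /etc/mdadm.conf):

    mdadm --detail --scan >> /etc/mdadm.conf   # record the array in the config
    mdadm --assemble --scan                    # assemble everything the config lists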

> *After this, do I mkfs ext4 first, or LVM first?
If you want to use LVM it would be LVM first, but... do you want to?
There is no point if the aim is allocating the whole space to a single
filesystem.

> *What stride and stripe values should I use?
A new toolstack should already find the correct stripe/stride for you.

One more note: for such a big array I would suggest creating a bitmap, so
in case of an unclean shutdown you do not have to wait 27 hours for it to
resync. An internal bitmap will do.

L.

--
Luca Berra -- bluca@comedia.it

Re: Confusion with setting up new RAID6 with mdadm

on 14.11.2010 23:13:26 by NeilBrown

On Sun, 14 Nov 2010 17:36:38 +0200
Zoltan Szecsei wrote:

> Hi,
> I hope this is the correct list to address this on - I've done a lot of
> typing for nothing, if not :-)
>
> I have done days of research, including reading
> https://raid.wiki.kernel.org/index.php/Linux_Raid, but all I am doing is
> getting confused in the detail.
>
> My goal is to set up an 8*2TB SiI3132 based RAID6 on Ubuntu 10.04LTS,
> with LVM and ext4.
> The setup will mostly hold thousands of 400MB image files, and they will
> not be accessed regularly - they mostly just need to be online. The
> entire space on all 8 drives can be used, and I want 1 massive
> filesystem, when I finally mount this RAID device. No boot, root or swap.
>
> I have gone quite far with the help of the local linux group, but after
> I had completed the 27 hour mdadm --create run, further tidbits were
> thrown at me, and I am trying to get an opinion on if it is worth
> scrapping this effort, and starting again.
>
>
>
> Please can someone provide clarity on:
>
> *If I have to reformat the drives and redo mdadm --create, other than
> mdadm stop, how can I get rid of all the /dev/md* etc etc so that when I
> restart this exercise, the original bad RAID does not interfere with
> this new attempt?
>
>
>
> *Partition alignment?
> Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
> None of the mdadm helps I've googled or received speak about how to
> correctly format the drives before running mdadm --create.
> All the benchmarks & performance tests I've found, do not bother to say
> whether they have aligned the partitions on the HD
>
> *What is the correct fdisk or parted method get rid of the DOS & GPT
> flags, and create a correctly aligned partition, and should this be a
> 0xda partiton (& then I use metatdata 1.2 for mdadm)?
>
>
> *Chunk size:
> After reading MANY different opinions, I'm guessing staying at the
> default chunk size is optimal? Anyone want to add to this argument?

Depending on which version of mdadm you are using, the default chunk size
will be 64K or 512K. I would recommend using 512K even if you have an older
mdadm. 64K appears to be too small for modern hardware, particularly if you
are storing large files.

For raid6 with the current implementation it is safe to use "--assume-clean"
to avoid the long recovery time. It is certainly safe to use that if you
want to build a test array, do some performance measurement, and then scrap
it and try again. If some time later you want to be sure that the array is
entirely in sync you can
echo repair > /sys/block/md0/md/sync_action
and wait a while.
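
To watch that repair pass and see whether anything was actually out of sync
(a sketch; mismatch_cnt is updated as the pass runs):

    cat /proc/mdstat                        # shows repair progress
    cat /sys/block/md0/md/mismatch_cnt      # non-zero means blocks were found out of sync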

I agree with what Mikael and Luca suggested - particularly the suggestion
of "--bitmap internal". You really want that.


NeilBrown



Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 06:30:10 by Roman Mamedov


On Mon, 15 Nov 2010 09:13:26 +1100
Neil Brown wrote:

> Depending on which version of mdadm you are using, the default chunk size
> will be 64K or 512K. I would recommend using 512K even if you have an older
> mdadm. 64K appears to be too small for modern hardware, particularly if you
> are storing large files.

According to some benchmarks I found, 64 or 128K still provide the sweet-spot
performance on RAID5 and RAID6, especially on writes.
http://louwrentius.blogspot.com/2010/05/raid-level-and-chunk-size-benchmarks.html
http://blog.jamponi.net/2008/07/raid56-and-10-benchmarks-on-26255_10.html#raid-5-performance
http://alephnull.com/benchmarks/sata2009/chunksize.html

> I agree with what Mikael and Luca suggested - particularly the suggestion
> of "--bitmap internal". You really want that.

It will also help to increase the --bitmap-chunk value to reduce its
performance impact, I suggest using 131072 or more.
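
For instance, at creation time (a sketch; --bitmap-chunk only makes sense
together with a bitmap):

    mdadm --create /dev/md0 --level=6 --raid-devices=8 \
          --bitmap=internal --bitmap-chunk=131072 /dev/sd[a-h]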

-- 
With respect,
Roman


Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 07:52:32 by Zoltan Szecsei

On 2010-11-14 21:50, Luca Berra wrote:
> On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
>>
>> *Partition alignment?
>> Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
> for modern hdds with 4k sectors it is
> new fdisk and/or parted should already know how to align
fdisk reports 512b sectors:
root@gs0:/home/geograph# fdisk -lu

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
256 heads, 63 sectors/track, 242251 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               63  3907024127  1953512032+  fd  Linux raid autodetect

(All 8 disks are as above)


> In any case, since you want to use the whole space for RAID, why create
> partitions at all? md works nicely without them.
OK
OK

>
>> *Chunk size:
>> After reading MANY different opinions, I'm guessing staying at the
>> default chunk size is optimal? Anyone want to add to this argument?
> I believe the default in newer mdadm is fine
>
>>
>> *After partitioning the 8 drives, is this the correct sequence?
>> mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64
>> --level=raid6
> Why do you state the chunk size here? I thought you wanted to stay with
> the default.
because on my mdadm, 64 is the default - and I was just reinforcing
that for the reader.
root@gs0:/home/geograph# mdadm -V
mdadm - v2.6.7.1 - 15th October 2008
root@gs0:/home/geograph#
>> *After this, do I mkfs ext4 first, or LVM first?
> If you want to use LVM it would be LVM first, but... do you want to?
> There is no point if the aim is allocating the whole space to a single
> filesystem.
Because I might want to join this array to another one at a later stage
- I would then have 2 boxes each with 8 drives, and each with a SiI3132
card on the same motherboard.
I might use the second box to mirror the first, or to extend it - not
sure of my needs yet.


>
>> *What stride and stripe values should I use?
> new toolstack should already find the correct stripe/stride for you
How would I check?

root@gs0:/home/geograph# mkfs.ext4 -V
mke2fs 1.41.11 (14-Mar-2010)
Using EXT2FS Library version 1.41.11
root@gs0:/home/geograph#
>
> One more note: for such a big array I would suggest creating a bitmap,
> so in case of an unclean shutdown you do not have to wait 27 hours for
> it to resync. An internal bitmap will do.
>
Nice tip - I'll look into it - Thanks,
Zoltan







Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 07:58:12 by Zoltan Szecsei

On 2010-11-15 00:13, Neil Brown wrote:
>
> Depending on which version of mdadm you are using, the default chunk size
> will be 64K or 512K. I would recommend using 512K even if you have an older
> mdadm. 64K appears to be too small for modern hardware, particularly if you
> are storing large files.
>
root@gs0:/home/geograph# mdadm -V
mdadm - v2.6.7.1 - 15th October 2008
root@gs0:/home/geograph#

This was what apt-get install got for me, from Ubuntu 10.04 64bit Desktop.
Should I download & compile a newer one?
(Where from? - haven't found the mdadm developer page yet)


> For raid6 with the current implementation it is safe to use "--assume-clean"
Is my above version "current" enough?
> to avoid the long recovery time. It is certainly safe to use that if you
> want to build a test array, do some performance measurement, and then scrap
> it and try again. If some time later you want to be sure that the array is
> entirely in sync you can
> echo repair > /sys/block/md0/md/sync_action
> and wait a while.
>
> I agree with what Mikael and Luca suggested - particularly the suggestion
> of "--bitmap internal". You really want that.
>
> NeilBrown

Regards & thanks,
Zoltan







Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 08:41:25 by Luca Berra

On Mon, Nov 15, 2010 at 08:52:32AM +0200, Zoltan Szecsei wrote:
> On 2010-11-14 21:50, Luca Berra wrote:
>> On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
>>>
>>> *Partition alignment?
>>> Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
>> for modern hdds with 4k sectors it is
>> new fdisk and/or parted should already know how to align
> fdisk reports 512b sectors:
> root@gs0:/home/geograph# fdisk -lu

I believe your fdisk does not support getting geometry from blkid, but I
am not an Ubuntu user.
You could try checking with something like 'strings /sbin/fdisk | grep io_size',
but since we are going without partitions you can ignore all this.

BTW, to check the sector size on a disk, on a fairly recent kernel you can
check the files under /sys/block/*/queue:
hw_sector_size
minimum_io_size
optimal_io_size

except for disks that lie about their sector size, but this is a
different story.

>>> *After partitioning the 8 drives, is this the correct sequence?
>>> mdadm --create /dev/md0 --metadata=1.2 --auto=md --chunk=64 --level=raid6
>> Why do you state the chunk size here? I thought you wanted to stay with
>> the default.
> because on my mdadm, 64 is the default - and I was just reinforcing that
> for the reader.
> root@gs0:/home/geograph# mdadm -V
> mdadm - v2.6.7.1 - 15th October 2008
This mdadm release is a tad old (about two years); it will work, but
some things may be different from the current 3.1.x.

> root@gs0:/home/geograph#
>>> *After this, do I mkfs ext4 first, or LVM first?
>> if you want to use lvm it would be lvm first, but... do you want to?
>> there is no point if the aim is allocating the whole space to a single
>> filesystem.
> Because I might want to join this array to another one at a later stage - I
> would then have 2 boxes each with 8 drives, and each with a SiI3132 card on
> the same motherboard.
> I might use the second box to mirror the first, or to extend it - not sure
> of my needs yet.
OK, then you need to align LVM as well.
Check if you have these parameters in /etc/lvm/lvm.conf:
md_chunk_alignment = 1
data_alignment_detection = 1

If you don't have those at all, check if LVM supports them:
strings /sbin/lvm | grep io_size

If not, you have to align manually, using the --dataalignment option to
pvcreate; align to a full stripe (chunk_size * 6, see below).
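
For example, with a 512K chunk and 6 data disks the full stripe is 3072K, so
a sketch of the manual alignment (assuming the array is /dev/md0) would be:

    pvcreate --dataalignment 3072k /dev/md0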


>>> *What stride and stripe values should I use?
stride = chunk_size / fs block size
stripe-width = stride * num_data_disks
num_data_disks in your case is 6 (8 total disks - 2 parity disks)

On a fairly recent kernel:
/sys/block/md?/queue/minimum_io_size would be the chunk_size of the array
/sys/block/md?/queue/optimal_io_size would be the stripe size

This should be exported on LVM devices also:
/sys/block/dm-*/queue/...

so you can check with the data in /sys/block at each step which is the
value to feed into the tools.
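
Worked through for this array (512K chunk, 4K ext4 blocks, 6 data disks):
stride = 512/4 = 128 and stripe-width = 128 * 6 = 768, so a sketch of the
mkfs call (assuming the filesystem goes straight onto /dev/md0) would be:

    mkfs.ext4 -b 4096 -E stride=128,stripe-width=768 /dev/md0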

>> new toolstack should already find the correct stripe/stride for you
> How would I check ?
strings /sbin/lvm|grep io_size

--
Luca Berra -- bluca@comedia.it

Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 08:43:03 by Mikael Abrahamsson

On Mon, 15 Nov 2010, Zoltan Szecsei wrote:

> (Where from? - haven't found the mdadm developer page yet)



--
Mikael Abrahamsson email: swmike@swm.pp.se

Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 10:18:40 by NeilBrown

On Mon, 15 Nov 2010 08:43:03 +0100 (CET)
Mikael Abrahamsson wrote:

> On Mon, 15 Nov 2010, Zoltan Szecsei wrote:
>
> > (Where from? - haven't found the mdadm developer page yet)
>
>
>

aka http://neil.brown.name/blog/mdadm

NeilBrown

Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 12:06:30 by Zoltan Szecsei

On 2010-11-15 09:41, Luca Berra wrote:
> btw to check sector size on a disk, on a fairly recent kernel you can
> check the files under /sys/block/*/queue,
> hw_sector_size
512
> minimum_io_size
512
> optimal_io_size
0
>
> except for disks that lie about their sector size, but this is a
> different story.
>> root@gs0:/home/geograph# mdadm -V
>> mdadm - v2.6.7.1 - 15th October 2008
> this mdadm release is a tad old (about two years), it will work, but
> some things may be different than current 3.1.x
>
Just downloaded the tarball for 3.1.4 and will have a crack at compiling it.


Thanks !

Z







Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 13:27:45 by Zoltan Szecsei

On 2010-11-14 18:48, Mikael Abrahamsson wrote:
>
>> *What is the correct fdisk or parted method to get rid of the DOS & GPT
>> flags, and create a correctly aligned partition, and should this be a
>> 0xda partition (& then I use metadata 1.2 for mdadm)?
>
> I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the
> first megabyte of the drive to get rid of the partition table (you get
> rid of the v1.2 metadata at the same time, actually). Then you know for
> sure that you're correctly aligned and that md is 4k aligned.
I did this on all 8 drives (/dev/sd[a-h]):
root@gs0:/etc# dd if=/dev/zero of=/dev/sdb bs=1024000 count=1
1+0 records in
1+0 records out
1024000 bytes (1.0 MB) copied, 0.0103209 s, 99.2 MB/s
root@gs0:/etc# dd if=/dev/zero of=/dev/sdc bs=1024000 count=1

But the GPT id has not disappeared. I am going to use these drives
unpartitioned, so is this a problem?

Thanks,
Zoltan

root@gs0:/etc# fdisk -lu

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sda doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdc'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdc doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdd'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdd doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sde'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sde doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdf'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdf doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdg'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sdg: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdg doesn't contain a valid partition table

WARNING: GPT (GUID Partition Table) detected on '/dev/sdh'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sdh: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdh doesn't contain a valid partition table

Disk /dev/sdi: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e30c7

   Device Boot      Start         End      Blocks   Id  System
/dev/sdi1   *        2048      391167      194560   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sdi2          393214   976771071   488188929    5  Extended
/dev/sdi5          393216    98047999    48827392   83  Linux
/dev/sdi6        98050048   110047231     5998592   82  Linux swap / Solaris
/dev/sdi7       110049280   976771071   433360896   83  Linux
root@gs0:/etc#









Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 13:47:55 by Michal Soltys

On 15.11.2010 13:27, Zoltan Szecsei wrote:
> On 2010-11-14 18:48, Mikael Abrahamsson wrote:
>>
>>> *What is the correct fdisk or parted method get rid of the DOS & GPT
>>> flags, and create a correctly aligned partition, and should this be a
>>> 0xda partiton (& then I use metatdata 1.2 for mdadm)?
>>
>> I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the
>> first megabyte of the drive to get rid of the partition table (you get
>> rid of the v1.2 metadata at the same time actually). Then you know for
>> sure you're correctly aligned as well as md is 4k aligned.
> I did this on all 8 drives (/dev/sd[a-h])
> root@gs0:/etc# dd if=/dev/zero of=/dev/sdb bs=1024000 count=1
> 1+0 records in
> 1+0 records out
> 1024000 bytes (1.0 MB) copied, 0.0103209 s, 99.2 MB/s
> root@gs0:/etc# dd if=/dev/zero of=/dev/sdc bs=1024000 count=1
>
> But the GPT id has not disappeared.

You might want to do blockdev --rereadpt /dev/sd[a-h] to make sure the
kernel registers the new situation (or do the same with sfdisk -R).

Also, GPT stores a backup partition table + GPT header at the end of the
disk. The kernel might be clever enough to rely on it if you destroy the
data at the beginning of the disk.
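
One way to wipe that backup copy too is to zero the last few sectors of each
disk (a sketch; GPT keeps its backup header in the last sector, with the
backup entries in the 32 sectors just before it):

    SECTORS=$(blockdev --getsz /dev/sdb)    # size in 512-byte sectors
    dd if=/dev/zero of=/dev/sdb bs=512 seek=$((SECTORS - 33)) count=33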


Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 14:23:13 by Zoltan Szecsei

On 2010-11-15 14:47, Michal Soltys wrote:
> On 15.11.2010 13:27, Zoltan Szecsei wrote:
>> On 2010-11-14 18:48, Mikael Abrahamsson wrote:
>>>
>>>> *What is the correct fdisk or parted method to get rid of the DOS & GPT
>>>> flags, and create a correctly aligned partition, and should this be a
>>>> 0xda partition (& then I use metadata 1.2 for mdadm)?
>>>
>>> I just "dd if=/dev/zero of=/dev/sdX bs=1024000 count=1" to zero the
>>> first megabyte of the drive to get rid of the partition table (you get
>>> rid of the v1.2 metadata at the same time, actually). Then you know for
>>> sure that you're correctly aligned and that md is 4k aligned.
>> I did this on all 8 drives (/dev/sd[a-h])
>> root@gs0:/etc# dd if=/dev/zero of=/dev/sdb bs=1024000 count=1
>> 1+0 records in
>> 1+0 records out
>> 1024000 bytes (1.0 MB) copied, 0.0103209 s, 99.2 MB/s
>> root@gs0:/etc# dd if=/dev/zero of=/dev/sdc bs=1024000 count=1
>>
>> But the GPT id has not disappeared.
>
> You might want to do blockdev --rereadpt /dev/sd[a-h] to make sure
> the kernel registers the new situation (or do the same with sfdisk -R)
>
> Also, GPT stores a backup partition table + GPT header at the end of the
> disk. The kernel might be clever enough to rely on it if you destroy the
> data at the beginning of the disk.
>

OK, just done this on all 8 drives:
root@gs0:/sys/block# dd if=/dev/zero of=/dev/sdb bs=512 seek=3907029166
dd: writing `/dev/sdb': No space left on device
3+0 records in
2+0 records out
1024 bytes (1.0 kB) copied, 0.000913281 s, 1.1 MB/s
root@gs0:/sys/block#


fdisk -lu produces the results below - so presumably the drives are now
clean & ready for mdadm?

BTW: I've just downloaded and compiled the latest mdadm too:
root@gs0:/sys/block# mdadm -V
mdadm - v3.1.4 - 31st August 2010
root@gs0:/sys/block#


Thanks for your (collective) helps...
Zoltan



root@gs0:/sys/block# fdisk -lu

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sda doesn't contain a valid partition table

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sde doesn't contain a valid partition table

Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/sdg: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdg doesn't contain a valid partition table

Disk /dev/sdh: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdh doesn't contain a valid partition table

Disk /dev/sdi: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e30c7

   Device Boot      Start         End      Blocks   Id  System
/dev/sdi1   *        2048      391167      194560   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sdi2          393214   976771071   488188929    5  Extended
/dev/sdi5          393216    98047999    48827392   83  Linux
/dev/sdi6        98050048   110047231     5998592   82  Linux swap / Solaris
/dev/sdi7       110049280   976771071   433360896   83  Linux
root@gs0:/sys/block#












Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 19:01:48 by Zoltan Szecsei

Hi,
One last quick question:

Neil Brown wrote:
> Depending on which version of mdadm you are using, the default chunk size
> will be 64K or 512K. I would recommend using 512K even if you have an older
> mdadm. 64K appears to be too small for modern hardware, particularly if you
> are storing large files.
>
> For raid6 with the current implementation it is safe to use "--assume-clean"
> to avoid the long recovery time. It is certainly safe to use that if you
> want to build a test array, do some performance measurement, and then scrap
> it and try again. If some time later you want to be sure that the array is
> entirely in sync you can
> echo repair > /sys/block/md0/md/sync_action
> and wait a while.
****************************************************
I have compiled the following mdadm on my Ubuntu 64 bit 10.04 Desktop
system:
root@gs0:/home/geograph# uname -a
Linux gs0 2.6.32-25-generic #45-Ubuntu SMP Sat Oct 16 19:52:42 UTC 2010
x86_64 GNU/Linux
root@gs0:/home/geograph# mdadm -V
mdadm - v3.1.4 - 31st August 2010
root@gs0:/home/geograph#

****************************************************
I have deleted the partitions on all 8 drives, and done a mdadm -Ss

root@gs0:/home/geograph# fdisk -lu

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sda doesn't contain a valid partition table

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes

******************************************************
Based on the above "assume-clean" comment, plus all the help you guys
have offered, I have just run:
mdadm --create /dev/md0 --metadata=1.2 --auto=md --assume-clean
--bitmap=internal --bitmap-chunk=131072 --chunk=512 --level=6
--raid-devices=8 /dev/sd[abcdefgh]

It took a nano-second to complete!

The man pages for assume-clean say that "the array pre-existed". Surely,
as I have erased the HDs and now have no partitions on them, this is
not true?
Do I need to re-run the above mdadm command, or is it safe to proceed
with LVM then mkfs ext4?

Thanks for all,
Zoltan

******************************************************
root@gs0:/home/geograph# mdadm -E /dev/md0
mdadm: No md superblock detected on /dev/md0.



root@gs0:/home/geograph# ls -la /dev/md*
brw-rw---- 1 root disk 9, 0 2010-11-15 19:53 /dev/md0
/dev/md:
total 0
drwxr-xr-x 2 root root 60 2010-11-15 19:53 .
drwxr-xr-x 19 root root 4260 2010-11-15 19:53 ..
lrwxrwxrwx 1 root root 6 2010-11-15 19:53 0 -> ../md0


root@gs0:/home/geograph# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sdc[2] sdf[5] sdh[7] sdd[3] sdb[1] sdg[6] sda[0] sde[4]
      11721077760 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
      bitmap: 0/8 pages [0KB], 131072KB chunk

unused devices: <none>




*******************************************************

















Re: Confusion with setting up new RAID6 with mdadm

on 15.11.2010 20:53:25 by NeilBrown

On Mon, 15 Nov 2010 20:01:48 +0200
Zoltan Szecsei wrote:

> Hi,
> One last quick question:
>
> Neil Brown wrote:
> > Depending on which version of mdadm you are using, the default chunk size
> > will be 64K or 512K. I would recommend using 512K even if you have an older
> > mdadm. 64K appears to be too small for modern hardware, particularly if you
> > are storing large files.
> >
> > For raid6 with the current implementation it is safe to use "--assume-clean"
> > to avoid the long recovery time. It is certainly safe to use that if you
> > want to build a test array, do some performance measurement, and then scrap
> > it and try again. If some time later you want to be sure that the array is
> > entirely in sync you can
> > echo repair> /sys/block/md0/md/sync_action
> > and wait a while.
> >
> ****************************************************
> I have compiled the following mdadm on my Ubuntu 64 bit 10.04 Desktop
> system:
> root@gs0:/home/geograph# uname -a
> Linux gs0 2.6.32-25-generic #45-Ubuntu SMP Sat Oct 16 19:52:42 UTC 2010
> x86_64 GNU/Linux
> root@gs0:/home/geograph# mdadm -V
> mdadm - v3.1.4 - 31st August 2010
> root@gs0:/home/geograph#
>
> ****************************************************
> I have deleted the partitions on all 8 drives, and done a mdadm -Ss
>
> root@gs0:/home/geograph# fdisk -lu
>
> Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
> 255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x00000000
>
> Disk /dev/sda doesn't contain a valid partition table
>
> Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
>
> ******************************************************
> Based on the above "assume-clean" comment, plus all the help you guys
> have offered, I have just run:
> mdadm --create /dev/md0 --metadata=1.2 --auto=md --assume-clean
> --bitmap=internal --bitmap-chunk=131072 --chunk=512 --level=6
> --raid-devices=8 /dev/sd[abcdefgh]
>
> It took a nano-second to complete!
>
> The man-pages for assume-clean say that "the array pre-existed". Surely
> as I have erased the HDs, and now have no partitions on them, this is
> not true?
> Do I need to re-run the above mdadm command, or is it safe to proceed
> with LVM then mkfs ext4?

It is safe to proceed.

The situation is that the two parity blocks are probably not correct on most
(or even any) stripes. But you have no live data on them to protect, so it
doesn't really matter.

With the current implementation of RAID6, every time you write, the correct
parity blocks are computed and written. So any live data that is written
will be accompanied by correct parity blocks to protect it.

This does *not* apply to RAID5 as it sometimes uses the old parity block to
compute the new parity block. If the old was wrong, the new will be wrong
too.

It is conceivable that one day we might change the raid6 code to perform
similar updates if it ever turns out to be faster to do it that way, but it
seems unlikely at the moment.

NeilBrown


>
> Thanks for all,
> Zoltan
>
> ******************************************************
> root@gs0:/home/geograph# mdadm -E /dev/md0
> mdadm: No md superblock detected on /dev/md0.
>
>
>
> root@gs0:/home/geograph# ls -la /dev/md*
> brw-rw---- 1 root disk 9, 0 2010-11-15 19:53 /dev/md0
> /dev/md:
> total 0
> drwxr-xr-x 2 root root 60 2010-11-15 19:53 .
> drwxr-xr-x 19 root root 4260 2010-11-15 19:53 ..
> lrwxrwxrwx 1 root root 6 2010-11-15 19:53 0 -> ../md0
>
>
> root@gs0:/home/geograph# cat /proc/mdstat
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> [raid4] [raid10]
> md0 : active raid6 sdc[2] sdf[5] sdh[7] sdd[3] sdb[1] sdg[6] sda[0] sde[4]
> 11721077760 blocks super 1.2 level 6, 512k chunk, algorithm 2
> [8/8] [UUUUUUUU]
> bitmap: 0/8 pages [0KB], 131072KB chunk
>
> unused devices:
>
>
>
>
> *******************************************************
>
>
>
>
>
>
>
>
>
>
>
>


Re: Confusion with setting up new RAID6 with mdadm

on 16.11.2010 07:48:59 by Zoltan Szecsei

On 2010-11-15 21:53, Neil Brown wrote:
> On Mon, 15 Nov 2010 20:01:48 +0200
> Zoltan Szecsei wrote:
>
>
>> Hi,
>> One last quick question:
>> ****************************************************
>> I have compiled the following mdadm on my Ubuntu 64 bit 10.04 Desktop
>> system:
>> root@gs0:/home/geograph# uname -a
>> Linux gs0 2.6.32-25-generic #45-Ubuntu SMP Sat Oct 16 19:52:42 UTC 2010
>> x86_64 GNU/Linux
>> root@gs0:/home/geograph# mdadm -V
>> mdadm - v3.1.4 - 31st August 2010
>> root@gs0:/home/geograph#
>>
>> ****************************************************
>> I have deleted the partitions on all 8 drives, and done a mdadm -Ss
>>
>> root@gs0:/home/geograph# fdisk -lu
>>
>> Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
>> 255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
>> Units = sectors of 1 * 512 = 512 bytes
>> Sector size (logical/physical): 512 bytes / 512 bytes
>> I/O size (minimum/optimal): 512 bytes / 512 bytes
>> Disk identifier: 0x00000000
>>
>> Disk /dev/sda doesn't contain a valid partition table
>>
>> Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
>>
>> ******************************************************
>> Based on the above "assume-clean" comment, plus all the help you guys
>> have offered, I have just run:
>> mdadm --create /dev/md0 --metadata=1.2 --auto=md --assume-clean
>> --bitmap=internal --bitmap-chunk=131072 --chunk=512 --level=6
>> --raid-devices=8 /dev/sd[abcdefgh]
>>
>> It took a nano-second to complete!
>>
>> The man pages for assume-clean say that "the array pre-existed". Surely,
>> as I have erased the HDs and now have no partitions on them, this is
>> not true?
>> Do I need to re-run the above mdadm command, or is it safe to proceed
>> with LVM then mkfs ext4?
> It is safe to proceed.

Too cool (A for away at last :-) )
Neil: Big thanks to you and the others on this list for all the patience
& help you guys have given.
Kind regards,
Zoltan
> The situation is that the two parity blocks are probably not correct on most
> (or even any) stripes. But you have no live data on them to protect, so it
> doesn't really matter.
>
> With the current implementation of RAID6, every time you write, the correct
> parity blocks are computed and written. So any live data that is written
> will be accompanied by correct parity blocks to protect it.
>
> This does *not* apply to RAID5 as it sometimes uses the old parity block to
> compute the new parity block. If the old was wrong, the new will be wrong
> too.
>
> It is conceivable that one day we might change the raid6 code to perform
> similar updates if it ever turns out to be faster to do it that way, but it
> seems unlikely at the moment.
>
> NeilBrown









Re: Confusion with setting up new RAID6 with mdadm

on 22.07.2011 03:08:34 by Tanguy Herrmann

Luca Berra <bluca@comedia.it> writes:

>
> On Sun, Nov 14, 2010 at 05:36:38PM +0200, Zoltan Szecsei wrote:
> > *If I have to reformat the drives and redo mdadm --create, other than mdadm
> > stop, how can I get rid of all the /dev/md* etc etc so that when I restart
> > this exercise, the original bad RAID does not interfere with this new
> > attempt?
>
> mdadm -Ss
> mdadm --zero-superblock on each partition
> >
> >
> > *Partition alignment?
> > Is this relevant for modern HDs (I'm using 5900rpm Seagate 2TB drives)
> for modern hdds with 4k sectors it is
> new fdisk and/or parted should already know how to align
> in any case, since you want to use the whole space for raid, why create
> partitions at all, md works nicely without

Hello,
First, thank you for the interesting topic (it fits my questions ^^) and
for all this community's participation in it!

I've read somewhere (sorry, I can't recall where) that the RAID could still be
unaligned when using the whole disk, and that we therefore had to create an
aligned partition (using fdisk -u, then creating a partition beginning at LBA
64 at least, and spanning a length that is a multiple of 8 sectors).

Was it totally wrong?

Tanguy





Re: Confusion with setting up new RAID6 with mdadm

on 22.07.2011 07:17:25 by Mikael Abrahamsson

On Fri, 22 Jul 2011, Tanguy Herrmann wrote:

> I've read somewhere (sorry I can't remind it) that the raid still could be
> unaligned by using the whole disk, and so we had to create a partition aligned
> (by using fdisk -u, then creating a partition beginning at LBA 64 at least, and
> that would span on a length multiple of 8.
>
> Was it totally wrong ?

No, but I don't see how using the whole disk can end up not 4k aligned
when doing it your way would be. The only way this would end up unaligned
is if the offset jumper on the WDxxEARS drives was set, and then you'd be
misaligned regardless of method.
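
For completeness, if one did want an aligned partition anyway, a sketch with
parted would be (a 1MiB start keeps it 4k-aligned on any current drive):

    parted -a optimal /dev/sdX mklabel gpt
    parted -a optimal /dev/sdX mkpart primary 1MiB 100%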

--
Mikael Abrahamsson email: swmike@swm.pp.se