Considering a complete rework of RAID on my home compute server

on 29.12.2010 20:54:15 by Mark Knecht

Hi,
I set up RAID using mdadm for the first time 8-9 months ago. As a
complete newbie, a class I still consider myself part of, I made a
number of compromises at the time. I am now considering reworking the
setup and am looking for some guidance.

The system has 5 500GB WD RAID Edition drives. They are all the
same model, purchased at the same time, so they carry the usual
correlated-failure risk that comes with buying identical drives in one
batch. The system runs MySQL in Linux and then (typically) runs 3 or 4
VirtualBox VMs running different versions of Windows.

1) The Linux/MySQL stuff is a 3 drive RAID1. (sda/sdb/sdc)

2) The Windows VMs run on a 2 drive RAID0 (sdd/sde)

3) There is a second RAID1 (sda/sdb/sdc) used for backups of the
RAID0. The RAID0 is backed up to RAID1 nightly. If the RAID0 fails
then I lose 1 day's work.

The RAID1 is backed up to another machine every week or two. This is a
home network, not a business but I do depend on the data to be there
if there's some hardware failure. With a 3 drive RAID1 I figure I'm
not in huge danger unless a power supply failure takes out all of the
drives.

I will copy smartctl -a data for all drives at the end of this email
in case it might influence any suggestions.


OK - the problems I have with this arrangement are:

1) I used the older v0.9 metadata.

2) The RAIDs are assembled by the kernel automatically. I do not use
an initrd. (Because I don't know how / never have.)

3) I think with 5 disks I could get better performance than I
currently get, with similar or better safety, using maybe RAID5 or
RAID6.
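
To double-check items 1 and 2, something like the following shows
what metadata version each array is actually using (the md device and
partition names here are only examples, not my exact layout):

c2stable ~ # cat /proc/mdstat
c2stable ~ # mdadm --detail /dev/md0 | grep -i version
c2stable ~ # mdadm --examine /dev/sda3 | grep -i version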

Overall, as I see it, I can suffer no disk loss on the RAID0, and
can handle a 2 disk loss on the RAID1. (Is that correct?) I'm thinking
that with a 5-drive RAID6 I might well get better performance than
either of the current RAIDs and (from reading) more protection during
a rebuild if one of my drives goes bad.

All said, I'm leaning toward a RAID6 supporting everything (MySQL
and VMs), probably about 100GB to start with, and then would hopefully
scale the size up using the rest of the drives after data is copied
over to the new RAID6. I'd build the RAID6, copy everything from the
current system, ensure the new RAID boots, etc., then eventually blow
away the old partitions and resize the RAID6 larger.
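
The rough shape of the commands I have in mind is sketched below. The
device names, filesystem and sizes are illustrative only, not a tested
recipe, and the final --grow assumes the underlying partitions have
already been enlarged (or replaced) to their full size:

c2stable ~ # mdadm --create /dev/md10 --level=6 --raid-devices=5 --metadata=1.2 \
                 /dev/sda5 /dev/sdb5 /dev/sdc5 /dev/sdd5 /dev/sde5
c2stable ~ # mkfs.ext4 /dev/md10
(copy the data over, make sure the new array boots, retire the old arrays)
c2stable ~ # mdadm --grow /dev/md10 --size=max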

Does this make sense? What am I missing, or what should I be thinking about?

I have no problem buying maybe 1 new drive now as a spare. The
chassis is filled at this time and there's no way to run what I think
is considered a hot spare.

Thanks in advance,
Mark







c2stable ~ # smartctl -a /dev/sda
smartctl 5.40 2010-10-16 r3189 [x86_64-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Western Digital RE3 Serial ATA family
Device Model: WDC WD5002ABYS-02B1B0
Serial Number: WD-WCASYA846988
Firmware Version: 02.03B03
User Capacity: 500,107,862,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Wed Dec 29 11:31:35 2010 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   238   235   021    Pre-fail  Always       -       1100
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       308
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1363
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       306
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       15
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       292
194 Temperature_Celsius     0x0022   099   089   000    Old_age   Always       -       48
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged



c2stable ~ # smartctl -a /dev/sdb


SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   237   236   021    Pre-fail  Always       -       1108
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       308
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1362
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       306
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       16
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       291
194 Temperature_Celsius     0x0022   098   090   000    Old_age   Always       -       49
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged




c2stable ~ # smartctl -a /dev/sdc


SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   238   236   021    Pre-fail  Always       -       1100
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       308
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1362
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       306
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       15
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       292
194 Temperature_Celsius     0x0022   102   091   000    Old_age   Always       -       45
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged



c2stable ~ # smartctl -a /dev/sdd


SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   238   235   021    Pre-fail  Always       -       1100
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       296
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1299
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       294
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       29
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       266
194 Temperature_Celsius     0x0022   098   090   000    Old_age   Always       -       49
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged



c2stable ~ # smartctl -a /dev/sde


SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   237   236   021    Pre-fail  Always       -       1108
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       297
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1300
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       295
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       17
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       279
194 Temperature_Celsius     0x0022   103   094   000    Old_age   Always       -       44
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

RE: Considering a complete rework of RAID on my home compute server

on 06.01.2011 01:03:47 by Leslie Rhorer

> 1) The Linux/MySQL stuff is a 3 drive RAID1. (sda/sdb/sdc)

Well, RAID1 certainly offers the most robust solution, especially
with more than 1 mirror.

> 2) The Windows VMs run on a 2 drive RAID0 (sdd/sde)

Given the relatively small drive size (500G), I think I would
probably replace that with a single drive, if you want to keep the overall
architecture.

> 3) There is a second RAID1 (sda/sdb/sdc) used for backups of the
> RAID0. The RAID0 is backed up to RAID1 nightly. If the RAID0 fails
> then I lose 1 day's work.

Well, first of all, unless you meant to say sda1/sdb1/sdc1, etc,
then this can't be different from the #1 RAID array above. Assuming you are
actually using partitions, then I don't really see the value of two separate
arrays. Why not just one RAID1 array?

Also, I personally would reverse the backup strategy. I would put
the Windows VM on the (single) main RAID array and back up the data to a
single 1T hard drive.

> OK - the problems I have with this arrangement are:
>
> 1) I used the older v0.9 metadata.

This may be necessary if you are booting from the array. The
limitations of the 0.90 superblock may never impact you. That's a small
system with only a few drives.

> 2) The RAIDs are assembled by the kernel automatically. I do not use
> an initrd. (Because I don't know how/newer have)

How is that a problem? An initrd, or lack thereof, won't prevent
you from disabling the automatic assembly of one or more arrays, unless once
again you boot from the array in question. Most modern distros default to
using an initrd. What distro are you using?

> 3) I think with 5 disks I could get better performance than I
> currently get , with similar or better safety using maybe RAID5 or
> RAID6.

No, RAID1 is as safe as it gets. RAID0 allows for better
performance, but if you make the RAID0 into your backup solution, the
performance won't matter much.

Is performance a really big issue? Are you having problems with bad
performance?

> Overall, as I see it, I can suffer no disk loss on the RAID0, and
> can handle a 2 disk loss on the RAID1. (Is that correct?) I'm thinking
> that with a 5-drive RAID6 I might well get better performance than
> either of the current RAIDs and (from reading) more protection during
> a rebuild if one of my drives goes bad.

A single RAID5 or RAID6 solution is certainly simpler, and there is
significant value there. Read performance should be enhanced, but write
performance will be impacted. You've also lost your backup solution in this
scenario, though, so you will need to come up with something.

> All said, I'm leaning toward RAID6 support everything (MySQL and
> VMs), probably about 100GB to start with, and then would hopefully
> scale the size up using the rest of the drives after data is copied
> over to the new RAID6. I'd build RAID6, copy everything from the
> current system, ensure the new RAID boots, etc, then eventually blow
> away the old partitions and resize the RAID 6 larger.
>
> Does this make sense? What am I missing or should be thinking about.

I would worry that you are blowing away your backup.

> I have no problem buying maybe 1 new drive now as a spare. The
> chassis is filled at this time and there's no way to run what I think
> is considered a hot spare.

You might consider an external enclosure. Enclosures for up to 5
drives are quite economical.


Re: Considering a complete rework of RAID on my home compute server

on 06.01.2011 01:47:54 by Roman Mamedov

On Wed, 5 Jan 2011 18:03:47 -0600
"Leslie Rhorer" wrote:

> RAID1 certainly offers the most robust solution, especially
> with more than 1 mirror.

> RAID1 is as safe as it gets

Are you sure about that? mdadm's handling of corrupt data on RAID1
devices is pretty simplistic: it has no per-block checksums anywhere,
and it does not do 'voting' on RAID1 with more than 2 devices. So if a
block of data is returned differently by some of the component devices,
it basically has no way of knowing which one has the 'correct' data.
From what I understand, RAID5 and especially RAID6 give much better
protection in this situation.
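
As far as I know, the closest thing md itself offers is a periodic
scrub, which at least counts mismatches even though it cannot say which
copy is right ('md0' below is just a placeholder for whichever array
you check):

echo check > /sys/block/md0/md/sync_action
cat /sys/block/md0/md/mismatch_cnt
# 'repair' rewrites mismatched blocks, but on RAID1 it simply copies
# one mirror over the others without knowing which copy was correct
echo repair > /sys/block/md0/md/sync_action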



--
With respect,
Roman


Re: Considering a complete rework of RAID on my home compute server

on 06.01.2011 03:05:31 by Roberto Spadim

Could we implement a more flexible RAID1, maybe with checksums? A wrong
checksum = page failed or page with errors. The page size should be
chosen for good performance, for example:
mirror 1 = 4096 page size
mirror 2 = 8192 page size
mirror 3 = 512 page size

A good value for the RAID page size is 8192 (it is a multiple of both
4096 and 512). The checksum size should fit evenly into the page size;
for example, with 1 byte of checksum per 512 bytes and a page of 8192
bytes, the checksums for 8192 pages fit in only one page...

What is the `new` RAID1-with-checksum idea?
Considering an 8192 page size, with 3 mirrors:
errors are detected per page, not per mirror
pages keep the filesystem fast (OK, a little less than without RAID)
low disk use for checksums

What we need, for example:
a RAID of 8192001 bytes
page size = 8192  <- given at mdadm --create
checksum size per page = crc32? 4 bytes  <- given at mdadm --create???
total pages = floor(size / page size) = floor(8192001 / 8192) = 1000
(~1000.000122, we will lose 1 byte...)
checksums per page = floor(page size / checksum size) = floor(8192 / 4) = 2048
total checksum pages = ceil(total pages / checksums per page) = ceil(1000 / 2048) = 1
(0.48828125, so we will have a lot of checksum space without use)
total data pages = total pages - checksum pages = 1000 - 1 = 999
total size for filesystem = total data pages * page size = 999 * 8192 = 8183808 bytes
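
As a quick sanity check of those numbers, a small shell sketch; it only
reproduces the arithmetic of the example above, it is not anything
mdadm actually does today:

size=8192001; page=8192; csum=4
total_pages=$(( size / page ))                                        # 1000
csums_per_page=$(( page / csum ))                                     # 2048
csum_pages=$(( (total_pages + csums_per_page - 1) / csums_per_page )) # ceil -> 1
data_pages=$(( total_pages - csum_pages ))                            # 999
echo $(( data_pages * page ))                                         # 8183808 bytes for the filesystem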

Should we use more information? What about knowing which drive is the
newest? For example, if we remove disk 1 while disks 2 and 3 stay
online, writes to 2 and 3 make 1 stale... should we use per-disk
last-write-time information? Maybe a page just for that metadata? It
could help us check which disks are currently up to date; a checksum
should be included with this value, for example 4096 bytes plus this
page value, or one page for checksums and one page for the
last-write-time value. The idea is to help know which copy is newest;
reading that page at startup would let us resync pages on each disk.

Ideas:
* "it does not do 'voting' on RAID1 with more than 2 devices" -
  this could be done with a per-page last-write time (RAID5 or RAID6?)
* "obviously it does not have per-block checksums anywhere" -
  add a per-block checksum (RAID5 or RAID6?)


Got it? Any ideas?
For example, imagine we have ten 1TB disks and we want a 1TB 'raid'
disk. The best option today is RAID1 with a mirror on every disk, which
gives very fast reads if we can select the right read algorithm:
closest head position, fastest read time, round robin, or page number
modulo the number of mirrors (for example, with 10 disks a read of page
1 goes to disk 1, a read of page 12 goes to disk 2, pages 23, 3, 13, 43
go to disk 3; 'page number' mod 'mirrors in raid' = disk to read).


A quick summary; reading about OpenBSD we could have:

write algorithm (which disk should be written? RAID0 with striping, for example)
read algorithm (which disk should be read? RAID1 with good disks could
read by closest head position, fastest read time, round robin, etc.)
stripe algorithm (RAID0, RAID0 with striping)
mirror algorithm (RAID1)
checksum algorithm (none = RAID1, crc per disk ~ RAID5/6, crc page per
mirror = RAID1 with checksum)
correction algorithm (?? any ideas)
sync algorithm (per page / per disk ??)
start disk algorithm (per page? per disk? last write time? incremental
write number?)
checksum/correction location (on each disk = more secure, or on an
external disk / file = less secure)

An mdadm with all these options could make a very flexible RAID
solution... I don't believe we could get more flexible than this, any
ideas?? A lot of the work is already done today... just remap it; OK,
we have more things to do... anyone want a new project? md2? like v4l2?



2011/1/5 Roman Mamedov :
> On Wed, 5 Jan 2011 18:03:47 -0600
> "Leslie Rhorer" wrote:
>
>>       RAID1 certainly offers the most robust solution, especially
>> with more than 1 mirror.
>
>>       RAID1 is as safe as it gets
>
> Are you sure about that? Considering that mdadm's handling of corrupt data on
> RAID1 devices is pretty simplistic (obviously it does not have per-block
> checksums anywhere, it does not do 'voting' on RAID1 with more than 2
> devices), it basically has no way of knowing if a block of data is returned
> differently by some of the component devices, which one has the 'correct'
> data. From what I understand, RAID5 and especially RAID6 give a much better
> protection in this situation.
>
> --
> With respect,
> Roman



--
Roberto Spadim
Spadim Technology / SPAEmpresarial

RE: Considering a complete rework of RAID on my home compute server

on 06.01.2011 09:10:48 by Leslie Rhorer

> On Wed, 5 Jan 2011 18:03:47 -0600
> "Leslie Rhorer" wrote:
>
> > RAID1 certainly offers the most robust solution, especially
> > with more than 1 mirror.
>
> > RAID1 is as safe as it gets
>
> Are you sure about that?

Well, yeah.

> Considering that mdadm's handling of corrupt data
> on
> RAID1 devices is pretty simplistic (obviously it does not have per-block
> checksums anywhere, it does not do 'voting' on RAID1 with more than 2
> devices), it basically has no way of knowing if a block of data is
> returned

Well, I can't answer that very well. Some of the other folks who
are more familiar with the mechanics of the situation will have to comment,
but each single member of a RAID1 array holds the entire contents of the
data set. A 3-mirror (3N) array has 3 complete sets of data on it, and piecing
together a damaged data set from 3 complete copies is going to be more likely
to produce an intact data set than from 1 + 1/N data sets.

> differently by some of the component devices, which one has the 'correct'
> data. From what I understand, RAID5 and especially RAID6 give a much
> better
> protection in this situation.

It's certainly not been my experience. That said, I do run RAID6
arrays, along with RAID1 arrays. YMMV, of course.


Re: Considering a complete rework of RAID on my home compute server

on 06.01.2011 11:45:04 by Roman Mamedov

On Thu, 6 Jan 2011 00:05:31 -0200
Roberto Spadim wrote:

> could we implement a more flexible raid1? maybe with checksum?

> what's the `new` raid1 with checksum idea?

There is BTRFS RAID1 which already does checksum verification:
https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
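
For example (the device names are placeholders), a two-device btrfs
RAID1 mirrors both data and metadata and verifies checksums on read:

mkfs.btrfs -m raid1 -d raid1 /dev/sdX /dev/sdY
mount /dev/sdX /mnt
# with new enough btrfs tools, a scrub re-reads everything and checks
# it against the stored checksums
btrfs scrub start /mnt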

--
With respect,
Roman


Re: Considering a complete rework of RAID on my home compute server

on 06.01.2011 19:17:28 by Mark Knecht

On Wed, Jan 5, 2011 at 4:03 PM, Leslie Rhorer wrote:
>> 1) The Linux/MySQL stuff is a 3 drive RAID1. (sda/sdb/sdc)
>
>        Well, RAID1 certainly offers the most robust solution, especially
> with more than 1 mirror.
>

>
>> 3) There is a second RAID1 (sda/sdb/sdc) used for backups of the
>> RAID0. The RAID0 is backed up to RAID1 nightly. If the RAID0 fails
>> then I lose 1 day's work.
>
>        Well, first of all, unless you meant to say sda1/sdb1/sdc1, etc,
> then this can't be different from the #1 RAID array above.  Assuming you are
> actually using partitions, then I don't really see the value of two separate
> arrays.  Why not just one RAID1 array?
>

Sorry I wasn't very clear.

In #1 the RAID1 is my main Linux system on sda3/sdb3/sdc3.
In #3 the RAID1 is purely for backups on sda6/sdb6/sdc6 and is used
only for the backup of the RAID0 data. It is not normally mounted
except when doing backups. I wanted the extra protection that, if
something went wrong with the basic Linux box, the backup partition
would not normally be mounted and the data would hopefully be a bit safer.


>        Also, I personally would reverse the backup strategy.  I would put
> the Windows VM on the (single) main RAID array and back up the data to a
> single 1T hard drive.
>

I do this today backing up the existing RAID1 partitions to an
external eSATA 1TB drive.

>>    OK - the problems I have with this arrangement are:
>>
>> 1) I used the older v0.9 metadata.
>
>        This may be necessary if you are booting from the array.  The
> limitations of the 0.90 superblock may never impact you.  That's a small
> system with only a few drives.
>
>> 2) The RAIDs are assembled by the kernel automatically. I do not use
>> an initrd. (Because I don't know how/never have)
>
>        How is that a problem?  An initrd, or lack thereof, won't prevent
> you from disabling the automatic assembly of one or more arrays, unless once
> again you boot from the array in question.  Most modern distros default to
> using an initrd.  What distro are you using?
>

The machines are Gentoo and an initrd/initramfs is up to the builder.
The new RAID6/superblock-1.2 boot uses one. The RAID1/superblock-0.9
does not.

>> 3) I think with 5 disks I could get better performance than I
>> currently get, with similar or better safety using maybe RAID5 or
>> RAID6.
>
>        No, RAID1 is as safe as it gets.  RAID0 allows for better
> performance, but if you make the RAID0 into your backup solution, the
> performance won't matter much.
>

If I'm wrong about RAID6 please correct me as this understanding is
why I chose it.

1) A 5-drive RAID6 can survive losing 2 disks and still return good data.

2) A 5-drive RAID6 reads data nearly as fast as a 3-drive RAID0.

If those two aren't true then my choice of RAID6 doesn't improve my
system as I hoped.

3) My current 3-drive RAID1 can lose 2 disks and still return good
data, making #1 equivalent to #3.

4) #2 would be faster than my current 2-drive RAID0 and wouldn't have
the risk of a single drive loss.

If #3 & #4 aren't correct then maybe RAID6 isn't buying me anything.

>        Is performance a really big issue?  Are you having problems with bad
> performance?
>

I think I am. On the current RAID0 side I'm running 4-5 Win XP VMs
doing number crunching. Each is sitting in a 20GB virtual disk which
is just files in VMWare or Virtualbox. Sometimes I run into moderate
periods of time (5-30 seconds) with disk activity lights flashing,
apparent loss of interactivity on the machine (mouse & keyboard not
responding quickly in Linux) even when the RAID1/Linux side isn't
doing anything. No cron jobs or anything like that running, just the
VMs sucking up CPU and disk. Most of the number crunching is reading
larger amounts of data, using the CPU and then writing some smallish
files out.

>>    Overall, as I see it, I can suffer no disk loss on the RAID0, and
>> can handle a 2 disk loss on the RAID1. (Is that correct?) I'm thinking
>> that with a 5-drive RAID6 I might well get better performance than
>> either of the current RAIDs and (from reading) more protection during
>> a rebuild if one of my drives goes bad.
>
>        A single RAID5 or RAID6 solution is certainly simpler, and there is
> significant value there.  Read performance should be enhanced, but write
> performance will be impacted.  You've also lost your backup solution in this
> scenario, though, so you will need to come up with something.
>

I have a local eSATA in my office and then a second machine in the
house with a 2-drive RAID1. I use them both for backups currently.


>
>        You might consider an external enclosure.  Enclosures for up to 5
> drives are quite economical.

I will give it some thought.

Thanks!

- Mark

Re: Considering a complete rework of RAID on my home compute server

on 07.01.2011 00:57:37 by John Robinson

On 06/01/2011 18:17, Mark Knecht wrote:
> On Wed, Jan 5, 2011 at 4:03 PM, Leslie Rhorer wrote:
[...]
>>> 3) I think with 5 disks I could get better performance than I
>>> currently get , with similar or better safety using maybe RAID5 or
>>> RAID6.
>>
>> No, RAID1 is as safe as it gets. RAID0 allows for better
>> performance, but if you make the RAID0 into your backup solution, the
>> performance won't matter much.
>
> If I'm wrong about RAID6 please correct me as this understanding is
> why I chose it.
>
> 1) A 5-drive RAID6 can survive losing 2 disks and still return good data.

Yes, but then everything's still spread across three discs, with no
redundancy, which will be less reliable than just one disc. Of course
you ought to be replacing a disc as soon as it fails, then the remaining
redundancy should get you through the rebuild.

> 2) A 5-drive RAID6 reads data as nearly fast as a 3-drive RAID0.

Potentially nearly as fast as a 5-drive RAID0 actually, because data is
striped all across all 5 discs.

> If those two aren't true then my choice of RAID6 doesn't improve my
> system as I hoped.
>
> 3) My current 3-drive RAID1 can lose 2 disks and still return good
> data making #1 equivalent to #3

Sorry, I'm a bit confused by what these #s are.

> 4) #2 would be faster than my current 2-drive RAID0 and wouldn't have
> the risk of a single drive loss.
>
> If #3& #4 aren't correct then maybe RAID6 isn't buying me anything.

Running your VMs on a 5-drive RAID-6 instead of a 2-drive RAID-0 will
give you better read performance, streaming or random, maybe similar
streaming write performance if you're lucky, but generally worse write
performance, maybe much worse, no better than a single disc.

If you want to keep a backup partition that you can take offline, you
could make your RAID-6 partitioned, or run LVM over the top of it.
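
For instance (array and volume names below are made up), LVM on top of
the md device keeps the backup area as its own logical volume that can
stay unmounted most of the time:

pvcreate /dev/md0
vgcreate vg0 /dev/md0
lvcreate -L 200G -n vms vg0
lvcreate -L 100G -n backup vg0
# mount /dev/vg0/backup only while backups are running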

Cheers,

John.


RE: Considering a complete rework of RAID on my home compute server

on 10.01.2011 01:10:31 by Leslie Rhorer

> On Wed, Jan 5, 2011 at 4:03 PM, Leslie Rhorer wrote:
> >> 1) The Linux/MySQL stuff is a 3 drive RAID1. (sda/sdb/sdc)
> >
> >        Well, RAID1 certainly offers the most robust solution, especially
> > with more than 1 mirror.
> >
>
> >
> >> 3) There is a second RAID1 (sda/sdb/sdc) used for backups of the
> >> RAID0. The RAID0 is backed up to RAID1 nightly. If the RAID0 fails
> >> then I lose 1 day's work.
> >
> >        Well, first of all, unless you meant to say sda1/sdb1/sdc1, etc,
> > then this can't be different from the #1 RAID array above.  Assuming you are
> > actually using partitions, then I don't really see the value of two separate
> > arrays.  Why not just one RAID1 array?
> >
>
> Sorry I wasn't very clear.
>
> In #1 the RAID1 is my main Linux system on sda3/sdb3/sdc3.
> In #3 the RAID1 is purely for backups on sda6/sdb6/sdc6 and is used
> only for the backup of the RAID0 data. It is not normally mounted
> except when doing backups. I wanted the extra protection that if
> something went wrong with the basic Linux box the backup partition
> would not normally be mounted and the data hopefully a bit safer.
>
>
> >        Also, I personally would reverse the backup strategy.  I would put
> > the Windows VM on the (single) main RAID array and back up the data to a
> > single 1T hard drive.
> >
>
> I do this today backing up the existing RAID1 partitions to an
> external eSATA 1TB drive.

What I meant was, you currently have a 2 disk RAID0. Why not buy a
larger disk and move the 2 drives currently tied up in the RAID0 to a
5 disk RAID6 array with no partitions? The function currently provided
by the 2nd RAID1 can be taken over by the single drive, and the
function of the RAID0 can be taken over by a directory on the main
array. This makes better functional use of the space and may provide
performance benefits, not to mention being much simpler. I'm a fan of
having a separate, small disk for booting. My servers both have a pair
of small drives, each partitioned into three sections. Each pair of
partitions is in turn assembled into a RAID array, for a total of 3
mounts:

md1 - a tiny /boot
md2 - a small /
md3 - swap.
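
Concretely, the sort of thing I mean, though the drive and partition
names below are only an example and not my exact layout (older boot
loaders generally want the 0.90 superblock on the /boot array):

mdadm --create /dev/md1 --level=1 --raid-devices=2 --metadata=0.90 /dev/sdf1 /dev/sdg1   # /boot
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdf2 /dev/sdg2                   # /
mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sdf3 /dev/sdg3                   # swap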

> >>    OK - the problems I have with this arrangement are:
> >>
> >> 1) I used the older v0.9 metadata.
> >
> >        This may be necessary if you are booting from the array.  The
> > limitations of the 0.90 superblock may never impact you.  That's a small
> > system with only a few drives.
> >
> >> 2) The RAIDs are assembled by the kernel automatically. I do not use
> >> an initrd. (Because I don't know how/never have)
> >
> >        How is that a problem?  An initrd, or lack thereof, won't prevent
> > you from disabling the automatic assembly of one or more arrays, unless once
> > again you boot from the array in question.  Most modern distros default to
> > using an initrd.  What distro are you using?
> >
>
> The machines are Gentoo and an initrd/initramfs is up to the builder.
> The new RAID6/superblock-1.2 boot uses one. The RAID1/superblock-0.9
> does not.

OK, well, either way, any array can have auto-assembly disabled. If
you go with a single RAID6 plus a simple drive, it won't be required,
though.

> >> 3) I think with 5 disks I could get better performance than I
> >> currently get, with similar or better safety using maybe RAID5 or
> >> RAID6.
> >
> >        No, RAID1 is as safe as it gets.  RAID0 allows for better
> > performance, but if you make the RAID0 into your backup solution, the
> > performance won't matter much.
> >
>
> If I'm wrong about RAID6 please correct me as this understanding is
> why I chose it.
>
> 1) A 5-drive RAID6 can survive losing 2 disks and still return good data.

Yes.

> 2) A 5-drive RAID6 reads data as nearly fast as a 3-drive RAID0.

Or faster. Right now your 3 drive arrays are RAID1, though, aren't
they? RAID6 should read *MUCH* faster than RAID1

> If those two aren't true then my choice of RAID6 doesn't improve my
> system as I hoped.
>
> 3) My current 3-drive RAID1 can lose 2 disks and still return good
> data making #1 equivalent to #3

Well, only in terms of survivability and only for the configuration
you mention. An N drive RAID1 array can lose N-1 drives. Any RAID6 array
can lose 2 drives.

> 4) #2 would be faster than my current 2-drive RAID0 and wouldn't have
> the risk of a single drive loss.

Yes.


Re: Considering a complete rework of RAID on my home compute server

on 10.01.2011 02:45:18 by Mark Knecht

On Sun, Jan 9, 2011 at 4:10 PM, Leslie Rhorer wrote:

>
>        What I meant was, you currently have a 2 disk RAID0.  Why not buy a
> larger disk and move the 2 drives currently tied up in the RAID0 to a 5 disk
> RAID6 array with no partitions?  The function currently provided by the 2nd
> RAID1 can be taken over by the single drive, and the function of the RAID0
> can be taken over by a directory on the main array.  This makes better
> functional use of the space and may provide performance benefits, not to
> mention being much simpler.  I'm a fan of having a separate, small disk for
> booting.  My servers both have a pair of small drives each partitioned into
> three sections.  Each of the partition pairs are in turn assembled into a
> RAID array for a total of 3 mounts:
>
> md1 - a tiny /boot
> md2 - a small /
> md3 - swap.
>

I believe that is essentially what I'm doing:

- 2 drive RAID0 using 500GB drives is 1TB
- 3 drive RAID1 using 500GB drives is 500GB

- 5 drive RAID6 using 500GB drives is 1.5TB

The RAID0 is moving to the RAID6
The RAID1 is moving to the RAID6

I'll purchase 1 additional 500GB drive to sit cold in my office in
case I drop a drive in the RAID6.

Sounds like it should work out well.

I'll investigate using a different drive for booting.

Thanks,
Mark