Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 28.11.2010 16:30:56 by hansBKK
- - - - - - My abject apologies to all for improper addressing in my
previous messages (thanks to all those who set me straight :)
Hope you're all still willing to consider my request for feedback.
Start with a bit of context:
- SAN/NAS (call it FILER-A) hosting say a dozen TB and servicing a few
dozen client machines and servers, mostly virtual hosts. Another,
larger (FILER-B - still just tens of TB) host's drives are used for
storing backup sets, via not only Amanda, but also filesystems
comprising gazillions of hard-linked archive sets created by (eg)
rdiff-backup, rsnapshot and BackupPC. We're on a very limited budget,
therefore no tape storage for backups.
- I plan to run LVM over RAID (likely RAID1 or RAID10) for IMO an
ideal combination of fault tolerance, performance and flexibility.
- I am not at this point overly concerned about performance issues -
reliability/redundancy and ease of recovery are my main priorities.
Problem:
For off-site data rotation, the hard-linked filesystems on FILER-B
require full filesystem cloning with block-level tools rather than
file-level copying or sync'ing. My current plan is to swap out disks
mirrored via RAID, marking them as "failed" and then rebuilding using
the (re-initialized) incoming rotation set.
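(To make that concrete, roughly the mdadm sequence I have in mind -
device names are made up, and this is only a sketch:)

    # retire the outgoing rotation disk from the mirror
    mdadm /dev/md0 --fail /dev/sdc1
    mdadm /dev/md0 --remove /dev/sdc1
    # physically swap disks, then rebuild onto the incoming one
    mdadm --zero-superblock /dev/sdc1   # re-initialize the incoming disk
    mdadm /dev/md0 --add /dev/sdc1      # kicks off a full resync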
HOWEVER - the use of LVM (and possibly RAID10) adds complexity to the
filesystems, which makes disaster recovery from the detached disk sets
much more difficult than regular partitions on physical disks.
Theoretical solution:
Use RAID1 on the "top layer" to mirror the data stored in an LVM (set
of) disk(s) on the one hand (call it TopRAID1) to ***regular
partitions*** on actual physical disks on the other (call this the
TopRAID2 side).
(ASCII art best viewed with a monospaced font)
"TopRAID1" side
______________________________________
| LVM VG |
| _____ _____________ __________ |
| | LV1 | | LV2 | | LV3 | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| |_____| |_____________| |__________| |
|____v___________v______________v______|
v v v
v v v
RAID1 RAID1 RAID1
v v v
__v__ ______v______ _____v____
| HD1 | | HD2 | | HD3 |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
|_____| |_____________| |__________|
"TopRAID2" side
The mirroring at the top level would be set up between the individual
LVs on the TopRAID1 side and regular filesystem partitions (no RAID or
LVM) on the TopRAID2 side. In the event of a massive host failure, the
filesystems on the TopRAID2 side could be easily mounted for data
recovery and/or service resumption on another machine, and the
TopRAID1 disk set rebuilt from scratch and then re-mirrored from the
TopRAID2 disks.
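(As a sketch - device names hypothetical; an LV is just a block device,
so I believe md will accept it as a mirror member, and using a
superblock format that lives at the end of the device should leave the
plain partition directly mountable once detached:)

    # mirror LV1 (TopRAID1 side) against a plain partition (TopRAID2 side)
    mdadm --create /dev/md10 --level=1 --raid-devices=2 \
          --metadata=1.0 /dev/vg0/lv1 /dev/sdb1
    mkfs.ext3 /dev/md10   # the filesystem goes on the md device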
One design goal would be to not allow any LV to get so large that it
won't fit on a single physical disk on the TopRAID2 side. If this is
not possible, then the corresponding TopRAID2 side would need to
comprise a multiple disk set, perhaps striped by RAID0 - not as
straightforward to recover as single disks, but at least without the
LVM layer.
Remember, the main purpose of this arrangement is so that the disks in
the TopRAID2 set can be rotated out for offsite storage. Ideally this
would be done by using an extra identical set (TopRAID2a and
TopRAID2b) to minimize the time windows when the live data is running
on TopRAID1 only.
Note that on the TopRAID1 side the LVM layers could be running on top
of another set of RAID disks (call it the BottomRAID), again either
RAID1 or perhaps RAID10 mirroring at the lowest layer. This disk set
could be allowed to grow in both size and complexity, with an
expectation that in the event of massive failure I won't even attempt
to rebuild/recover it, just tear it down and set it up again from
scratch, then mirror the data back from TopRAID2.
At this point this is all idle speculation on my part, and although I
think the technology makes it possible, I don't know whether it is a
practical scheme.
An enhancement of this idea would be to implement the "TopRAID" with a
full-server mirror using DRBD and Heartbeat, perhaps eliminating the
need for intra-server disk mirroring. In this case the TopRAID1 server
would have the flexible disk space allocation of LVM, while the
TopRAID2 server's disks would all be just regular partitions (no LVM),
again, easily swapped out for offsite rotation.
Any feedback on these ideas would be most appreciated.
RE: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 28.11.2010 19:34:02 by Leslie Rhorer
> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of hansbkk@gmail.com
> Sent: Sunday, November 28, 2010 9:31 AM
> To: linux-raid@vger.kernel.org
> Subject: Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
>
> - - - - - - My abject apologies to all for improper addressing in my
> previous messages (thanks to all those who set me straight :)
>
> Hope you're all still willing to consider my request for feedback.
> Start with a bit of context:
>
> - SAN/NAS (call it FILER-A) hosting say a dozen TB and servicing a few
> dozen client machines and servers, mostly virtual hosts. Another,
> larger (FILER-B - still just tens of TB) host's drives are used for
> storing backup sets, via not only Amanda, but also filesystems
> comprising gazillions of hard-linked archive sets created by (eg)
> rdiff-backup, rsnapshot and BackupPC. We're on a very limited budget,
> therefore no tape storage for backups.
>
> - I plan to run LVM over RAID (likely RAID1 or RAID10) for IMO an
> ideal combination of fault tolerance, performance and flexibility.
>
> - I am not at this point overly concerned about performance issues -
> reliability/redundancy and ease of recovery are my main priorities.
In that case, I'm not sure about your desire for the additional
complexity. Someone else suggested RAID6, which from an operational
standpoint is much simpler. The main reason, it seems to me, for the more
complex topologies would be enhancing performance.
> Problem:
>
> For off-site data rotation, the hard-linked filesystems on FILER-B
> require full filesystem cloning with block-level tools rather than
> file-level copying or sync'ing.
Yeah. That's a bit of a sticky problem. RAID1 or LVM certainly
fills the bill in terms of cloning. It's the "data rotation" that throws a
monkey wrench into the gears.
> My current plan is to swap out disks
> mirrored via RAID, marking them as "failed" and then rebuilding using
> the (re-initialized) incoming rotation set.
That sounds to me like a situation fraught with potential problems,
especially failures. It seems to me swapping out disks in this fashion -
especially if there is any sort of striping - is merely going to wind up
producing a bunch of incompatible backup disks.
> HOWEVER - the use of LVM (and possibly RAID10) adds complexity to the
> filesystems, which makes disaster recovery from the detached disk sets
> much more difficult than regular partitions on physical disks.
>
>
> Theoretical solution:
>
> Use RAID1 on the "top layer" to mirror the data stored in an LVM (set
> of) disk(s) on the one hand (call it TopRAID1) to ***regular
> partitions*** on actual physical disks on the other (call this the
> TopRAID2 side).
>
>
> (ASCII art best viewed with a monospaced font)
>
> "TopRAID1" side
> ______________________________________
> | LVM VG |
> | _____ _____________ __________ |
> | | LV1 | | LV2 | | LV3 | |
> | | | | | | | |
> | | | | | | | |
> | | | | | | | |
> | | | | | | | |
> | | | | | | | |
> | |_____| |_____________| |__________| |
> |____v___________v______________v______|
> v v v
> v v v
> RAID1 RAID1 RAID1
> v v v
> __v__ ______v______ _____v____
> | HD1 | | HD2 | | HD3 |
> | | | | | |
> | | | | | |
> | | | | | |
> | | | | | |
> | | | | | |
> |_____| |_____________| |__________|
>
> "TopRAID2" side
>
> The mirroring at the top level would be set up between the individual
> LVs on the TopRAID1 side and regular filesystem partitions (no RAID or
> LVM) on the TopRAID2 side. In the event of a massive host failure, the
> filesystems on the TopRAID2 side could be easily mounted for data
> recovery and/or service resumption on another machine, and the
> TopRAID1 disk set rebuilt from scratch and then re-mirrored from the
> TopRAID2 disks.
This will certainly work. Let me ask you this: What advantage do
you seek from using LVM on side 1? Growth? Re-sizing?
> One design goal would be to not allow any LV to get so large that it
> won't fit on a single physical disk on the TopRAID2 side. If this is
> not possible, then the corresponding TopRAID2 side would need to
> comprise a multiple disk set, perhaps striped by RAID0 - not as
> straightforward to recover as single disks, but at least without the
> LVM layer.
To my mind, a solution which assumes the data payload will not
exceed a certain physical size is poorly considered unless the server hosts
only a very specific application whose data extents are specifically limited
by the application itself. The beauty, as it were, of LVM is that it can
easily accommodate changing space needs within a pool of available storage
space. If there are strict limits on the payload size, then I'm not certain
how LVM would offer advantages, especially if you are disallowing striping
the volumes.
> Remember, the main purpose of this arrangement is so that the disks in
> the TopRAID2 set can be rotated out for offsite storage. Ideally this
> would be done by using an extra identical set (TopRAID2a and
> TopRAID2b) to minimize the time windows when the live data is running
> on TopRAID1 only.
What you are proposing can certainly be done, but it sounds awfully
frail, to me. I think a better solution would be a RAID1 mirror of
every volume - each volume being a RAID or LVM of some number of disks, with
no disk being host to more than one volume on either side. Then, rather
than failing an element, you can fail the entire volume set, stop the volume
and take all the disks offline at once. Not only that, but if I were you, I
would make it a 3 volume mirror, keeping at least 2 volumes online at one
time. Given your need to rotate the entire volume set, I don't believe
RAID10 will be ideal for what you want, but if not a relatively simple
matrix of RAID0 and RAID1 should work. I'm not all that familiar with
RAID10, though, so one of the more expert members of the list may want to
chime in.
> Note that on the TopRAID1 side the LVM layers could be running on top
> of another set of RAID disks (call it the BottomRAID), again either
> RAID1 or perhaps RAID10 mirroring at the lowest layer. This disk set
> could be allowed to grow in both size and complexity, with an
> expectation that in the event of massive failure I won't even attempt
> to rebuild/recover it, just tear it down and set it up again from
> scratch, then mirror the data back from TopRAID2.
>
> At this point this is all idle speculation on my part, and although I
> think the technology makes it possible, I don't know whether it is a
> practical scheme.
It sounds practical to me in terms of setting it up, but not in
terms of being reliable, given the way you intend to use it.
> An enhancement of this idea would be to implement the "TopRAID" with a
> full-server mirror using DRBD and Heartbeat, perhaps eliminating the
> need for intra-server disk mirroring. In this case the TopRAID1 server
> would have the flexible disk space allocation of LVM, while the
> TopRAID2 server's disks would all be just regular partitions (no LVM),
> again, easily swapped out for offsite rotation.
>
> Any feedback on these ideas would be most appreciated.
Re: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 29.11.2010 12:01:11 by hansBKK
On Mon, Nov 29, 2010 at 1:34 AM, Leslie Rhorer wrote:
>> - I am not at this point overly concerned about performance issues -
>> reliability/redundancy and ease of recovery are my main priorities.
>
> In that case, I'm not sure about your desire for the additional
> complexity. Someone else suggested RAID6, which from an operational
> standpoint is much simpler. The main reason, it seems to me, for the more
> complex topologies would be enhancing performance.
I can see how RAID6 is simpler than RAID10, but compared to RAID1?
>> The mirroring at the top level would be set up between the individual
>> LVs on the TopRAID1 side and regular filesystem partitions (no RAID or
>> LVM) on the TopRAID2 side. In the event of a massive host failure, the
>> filesystems on the TopRAID2 side could be easily mounted for data
>> recovery and/or service resumption on another machine, and the
>> TopRAID1 disk set rebuilt from scratch and then re-mirrored from the
>> TopRAID2 disks.
>
> This will certainly work. Let me ask you this: What advantage do
> you seek from using LVM on side 1? Growth? Re-sizing?
It's a filer, servicing an ever-changing mix of dozens of clients
(from a storage POV, both clients and servers), possibly using iSCSI,
definitely samba and nfs. Creating new shares and/or iSCSI targets or
re-configuring existing ones can't interfere with the operations of
the other hosts. I can't see doing this without LVM myself. I'm likely
to implement this with Openfiler, and in fact I believe its
install/configuration routines force the use of LVM for exactly this
reason.
>> One design goal would be to not allow any LV to get so large that it
>> won't fit on a single physical disk on the TopRAID2 side. If this is
>> not possible, then the corresponding TopRAID2 side would need to
>> comprise a multiple disk set, perhaps striped by RAID0 - not as
>> straightforward to recover as single disks, but at least without the
>> LVM layer.
>
> To my mind, a solution which assumes the data payload will not
> exceed a certain physical size is poorly considered unless the server hosts
> only a very specific application whose data extents are specifically limited
> by the application itself. The beauty, as it were, of LVM is that it can
> easily accommodate changing space needs within a pool of available storage
> space. If there are strict limits on the payload size, then I'm not certain
> how LVM would offer advantages, especially if you are disallowing striping
> the volumes.
Sorry I wasn't more clear. I'm using LVM on the TopRAID1 side for
exactly this reason. The TopRAID2 side is simply to enable cloning a
given LV to a straightforward partition for easy disaster recovery.
If and when a given LV needs to be resized, I would break the RAID1 to
the TopRAID2 beforehand and manually prepare an appropriate new
"static" target on the TopRAID2 side and then re-mirror.
> What you are proposing can certainly be done, but it sounds awfully
> frail, to me. I think a better solution would be a RAID1 mirror of
> every volume - each volume being a RAID or LVM of some number of disks, with
> no disk being host to more than one volume on either side. Then, rather
> than failing an element, you can fail the entire volume set, stop the volume
> and take all the disks offline at once. Not only that, but if I were you, I
> would make it a 3 volume mirror, keeping at least 2 volumes online at one
> time. Given your need to rotate the entire volume set, I don't believe
> RAID10 will be ideal for what you want, but if not a relatively simple
> matrix of RAID0 and RAID1 should work. I'm not all that familiar with
> RAID10, though, so one of the more expert members of the list may want to
> chime in.
>
> It sounds practical to me in terms of setting it up, but not in
> terms of being reliable, given the way you intend to use it.
Yes, I see that I shouldn't rely on the TopRAID2 side for reliability,
and now plan to make the TopRAID1 side reliable in and of itself, and
of course am also taking frequent backups.
This now allows me to consider the TopRAID2 side as "expendable" only
for the purpose of cloning a given LV to a straightforward partition,
and I can test and experiment as I like.
It would certainly be more conventional to just use a partition
cloning tool like Acronis or BESR to make a copy of an LVM snapshot
into a file; I was hoping I could use mdadm instead to allow me to take
normal partitions offsite ready to mount, rather than filesystems
needing a restore procedure.
Re: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 29.11.2010 16:29:16 by Keld Simonsen
On Mon, Nov 29, 2010 at 06:01:11PM +0700, hansbkk@gmail.com wrote:
> On Mon, Nov 29, 2010 at 1:34 AM, Leslie Rhorer wrote:
> >> - I am not at this point overly concerned about performance issues -
> >> reliability/redundancy and ease of recovery are my main priorities.
> >
> > In that case, I'm not sure about your desire for the additional
> > complexity. Someone else suggested RAID6, which from an operational
> > standpoint is much simpler. The main reason, it seems to me, for the more
> > complex topologies would be enhancing performance.
>
> I can see how RAID6 is simpler than RAID10, but compared to RAID1?

Hmm, does not compute by me. RAID1 and RAID10 are the same in complexity,
RAID10 is just a modern RAID1, and should actually have been called
RAID1.
best regards
keld
Re: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 29.11.2010 17:00:19 by hansBKK
2010/11/29 Keld Jørn Simonsen:
>> I can see how RAID6 is simpler than RAID10, but compared to RAID1?
>
> Hmm, does not compute by me. RAID1 and RAID10 are the same in complexity,
> RAID10 is just a modern RAID1, and should actually have been called
> RAID1.
My understanding is that if I use RAID10 on a single pair of disks
then that is literally the same as RAID1. These to me are very simple
in that I can take either one of the pair and mount it on any normal
machine and get at the data without doing anything special.
However, if I have my six disks configured as a single RAID10 array, I
believe this is no longer true - the data (at least for larger files)
has been distributed over all six disks, correct?
Now compare putting LVM on top of this array, compared to three RAID1
pairs on the one hand and a RAID6 array on the other (third) hand :)
If I were trying to recover the data using the latest version of a
LiveCD - say Fedora or Knoppix, which would be easier?
I'm not trying to score any points, it's a genuine question.
Re: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 30.11.2010 01:42:57 by NeilBrown
On Mon, 29 Nov 2010 23:00:19 +0700 hansbkk@gmail.com wrote:
> 2010/11/29 Keld Jørn Simonsen:
> >> I can see how RAID6 is simpler than RAID10, but compared to RAID1?
> >
> > Hmm, does not compute by me. RAID1 and RAID10 are the same in complexity,
> > RAID10 is just a modern RAID1, and should actually have been called
> > RAID1.
>
> My understanding is that if I use RAID10 on a single pair of disks
> then that is literally the same as RAID1. These to me are very simple
> in that I can take either one of the pair and mount it on any normal
> machine and get at the data without doing anything special.
>
> However, if I have my six disks configured as a single RAID10 array, I
> believe this is no longer true - the data (at least for larger files)
> has been distributed over all six disks, correct?
>
> Now compare putting LVM on top of this array, compared to three RAID1
> pairs on the one hand and a RAID6 array on the other (third) hand :)
>
> If I were trying to recover the data using the latest version of a
> LiveCD - say Fedora or Knoppix, which would be easier?
>
> I'm not trying to score any points, it's a genuine question.
If you are comparing recovering after some sort of problem with
a RAID10 over 6 devices compared with LVM over 2 2-device RAID1s, then the
former is certainly easier. This is simply because there are fewer layers of
complexity where something could go wrong.

In both cases, your data will be spread across multiple disks, and any one
disk or even any two disks would be of no use to you.
NeilBrown
Re: [linux-lvm] Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 30.11.2010 06:20:46 by hansBKK
On Tue, Nov 30, 2010 at 1:57 AM, Nataraj wrote:
> Your proposed solution is a bit confusing to understand,
Thanks for taking the time to give me feedback! And sorry I couldn't
make it clearer - the diagram becomes much more legible when viewed
with a monospaced font, so it's duplicated below.
The key is that I want to RAID1 from a given **LV** within the
TopRAID1 set to regular partitions on physical disks on the TopRAID2
side.
> however raid1 works for doing backups in the manner that you describe.
> I use it myself and I
> have, over time read about others doing so as well. Be sure to create your
> volumes with --bitmap=internal, that way when you swap in a drive, it won't
> need to replicate the entire drive, only the part that is changed.
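(If I follow you, something along these lines - device names made up,
and this is a sketch from my reading rather than tested commands:)

    mdadm --create /dev/md10 --level=1 --raid-devices=2 \
          --bitmap=internal /dev/vg0/lv1 /dev/sdb1
    # later, after rotating a previously-used member back in:
    mdadm /dev/md10 --re-add /dev/sdb1   # bitmap => only changed blocks resync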
TopRAID1's LVM is likely to be running over a RAID6 set, so I'm not
depending on the TopRAID mirroring for reliability, just using it for
the above volume cloning.
When TopRAID1 is running with the TopRAID2 side marked faulty or
missing, will there be a performance hit or other negative
consequences?
If so, would it be possible/better for the host, in normal operations,
to mount the underlying LV directly rather than the degraded top-level
RAID1?
Obviously I'm asking the whole list here, not just Nataraj :)
> If you're not going to manage the drives yourself, you will need an operations
> staff that has a pretty good understanding of how raid works and/or possibly
> write a robust set of scripts to manage the drives, ensure that the correct
> volume is mounted, etc.
Just me, and I'm sure my understanding will grow over time :)
We're not keeping archive sets of the drives themselves, but rather
managing the archives created by BackupPC/rsnapshot/rdiff-backup and
Amanda, keeping them "live" in the disk sets discussed here, each in
its own LV(s), with that data constantly being refreshed onto the
rotating drives through the RAID re-mirroring process.
> Also, I don't personally feel that disks are a
> suitable medium for long-term data archival, so if that is really your
> purpose, as opposed to a quick way to recover from a disk failure, then you
> might consider doing periodic tape or optical media backups as well.
Remember we're talking about tens of terabytes here! I do also make
occasional DVD runs of key data sets + 20% PAR2s, but even then, I'm
not counting on these lasting more than 3 years, maybe 5 if I use
well-recommended media (checking the media ID). If we could afford
Verbatim "archive quality" I'd probably invest in a tape unit.
I just keep in mind Mr Keynes' "In the long run, we're all dead."
Thanks to all for your help!
----------------------------------------------
Repeat of conceptual diagram, best viewed with a monospaced font
"TopRAID1" side - LV's within a VG running on a RAID6 set
______________________________________
| LVM VG |
| _____ _____________ __________ |
| | LV1 | | LV2 | | LV3 | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| |_____| |_____________| |__________| |
|____v___________v______________v______|
v v v
v v v
RAID1 RAID1 RAID1
v v v
__v__ ______v______ _____v____
| HD1 | | HD2 | | HD3 |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
|_____| |_____________| |__________|
"TopRAID2" side - regular partitions on physical disks
Re: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 30.11.2010 06:35:56 by hansBKK
On Tue, Nov 30, 2010 at 7:42 AM, Neil Brown wrote:
> If you are comparing recovering after some sort of problem with
> a RAID10 over 6 devices compared with LVM over 2 2-device RAID1s, then the
> former is certainly easier. This is simply because there are fewer layers of
> complexity where something could go wrong.
>
> In both cases, your data will be spread across multiple disks, and any one
> disk or even any two disks would be of no use to you.
Thanks Neil.
Still true with LVM on top of the 6-drive set in either case?
Scenario being
All the drives are together and OK (generic SATA2, cleanly
disconnected) - but everything else is gone
Not practical to rebuild the whole set of hosts, just want to get at key data
Mount the disks on a new machine, boot from SystemRescueCD or Knoppix
and copy the key data off.
And between RAID6 and RAID10?
> On Mon, 29 Nov 2010 23:00:19 +0700 hansbkk@gmail.com wrote:
>
>> 2010/11/29 Keld Jørn Simonsen:
>> >> I can see how RAID6 is simpler than RAID10, but compared to RAID1?
>> >
>> > Hmm, does not compute by me. RAID1 and RAID10 are the same in complexity,
>> > RAID10 is just a modern RAID1, and should actually have been called
>> > RAID1.
>>
>> My understanding is that if I use RAID10 on a single pair of disks
>> then that is literally the same as RAID1. These to me are very simple
>> in that I can take either one of the pair and mount it on any normal
>> machine and get at the data without doing anything special.
>>
>> However, if I have my six disks configured as a single RAID10 array, I
>> believe this is no longer true - the data (at least for larger files)
>> has been distributed over all six disks, correct?
>>
>> Now compare putting LVM on top of this array, compared to three RAID1
>> pairs on the one hand and a RAID6 array on the other (third) hand :)
>>
>> If I were trying to recover the data using the latest version of a
>> LiveCD - say Fedora or Knoppix, which would be easier?
>>
>> I'm not trying to score any points, it's a genuine question.
Re: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 30.11.2010 07:47:01 by NeilBrown
On Tue, 30 Nov 2010 12:35:56 +0700 hansbkk@gmail.com wrote:
> On Tue, Nov 30, 2010 at 7:42 AM, Neil Brown wrote:
> > If you are comparing recovering after some sort of problem with
> > a RAID10 over 6 devices compared with LVM over 2 2-device RAID1s, then the
> > former is certainly easier. This is simply because there are fewer layers of
> > complexity where something could go wrong.
> >
> > In both cases, your data will be spread across multiple disks, and any one
> > disk or even any two disks would be of no use to you.
>
> Thanks Neil.
>
> Still true with LVM on top of the 6-drive set in either case?
Yes. LVM will spread the data around in a different way, but it is still not
possible to recover anything reliably without all of the data.
>
> Scenario being
> All the drives are together and OK (generic SATA2, cleanly
> disconnected) - but everything else is gone
> Not practical to rebuild the whole set of hosts, just want to get at key data
> Mount the disks on a new machine, boot from SystemRescueCD or Knoppix
> and copy the key data off.
There should be no difficulty doing that in either case. But there is more
room for things to go wrong if you use LVM+MD than if you just use MD.
So certainly use LVM if you need any of its features, but otherwise don't.
>
> And between RAID6 and RAID10?
I think this has already been answered.
RAID10 tends to be faster, but with 5 or more devices, RAID6 makes more space
available.
RAID6 can survive any 2 devices failing. RAID10 over 6 devices can sometimes
survive 3 failures, and sometimes not survive 2.
NeilBrown
>
> > On Mon, 29 Nov 2010 23:00:19 +0700 hansbkk@gmail.com wrote:
> >
> >> 2010/11/29 Keld Jørn Simonsen:
> >> >> I can see how RAID6 is simpler than RAID10, but compared to RAID1?
> >> >
> >> > Hmm, does not compute by me. RAID1 and RAID10 are the same in complexity,
> >> > RAID10 is just a modern RAID1, and should actually have been called
> >> > RAID1.
> >>
> >> My understanding is that if I use RAID10 on a single pair of disks
> >> then that is literally the same as RAID1. These to me are very simple
> >> in that I can take either one of the pair and mount it on any normal
> >> machine and get at the data without doing anything special.
> >>
> >> However, if I have my six disks configured as a single RAID10 array, I
> >> believe this is no longer true - the data (at least for larger files)
> >> has been distributed over all six disks, correct?
> >>
> >> Now compare putting LVM on top of this array, compared to three RAID1
> >> pairs on the one hand and a RAID6 array on the other (third) hand :)
> >>
> >> If I were trying to recover the data using the latest version of a
> >> LiveCD - say Fedora or Knoppix, which would be easier?
> >>
> >> I'm not trying to score any points, it's a genuine question.
Re: [linux-lvm] Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 30.11.2010 08:34:05 by hansBKK
On Tue, Nov 30, 2010 at 2:14 PM, Nataraj wrote:
>> TopRAID1's LVM is likely to be running over a RAID6 set, so I'm not
>> depending on the TopRAID mirroring for reliability, just using it for
>> the above volume cloning.
>
> Your raid 1 backups won't mirror any snapshots of your LV's unless you
> specifically setup mirroring of the snapshots after they exist.
Ah, getting clearer to me, I was thinking I'd be mirroring the LV
itself, but you're right, taking a snapshot and mirroring that is a
much better idea.
So here's a summary of steps, please confirm:
- create a snapshot of a given volume
- create a new RAID1 mdN between that and a physical partition (blank?)
- let that get sync'd up
- break the RAID (fail the partition?), remove the drive
- delete the snapshot
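(Or in command form, roughly - all names hypothetical, and this is just
my understanding written down, not a tested recipe:)

    lvcreate -s -L 10G -n lv1snap /dev/vg0/lv1      # snapshot; COW space to taste
    mdadm --create /dev/md10 --level=1 --raid-devices=2 \
          /dev/vg0/lv1snap /dev/sdb1
    # (my understanding: the first member's content wins the initial sync)
    # wait for the resync to finish - watch /proc/mdstat - then:
    mdadm /dev/md10 --fail /dev/sdb1
    mdadm /dev/md10 --remove /dev/sdb1              # sdb1 now carries the clone
    mdadm --stop /dev/md10
    lvremove /dev/vg0/lv1snap                       # drop the snapshot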
The below is less clear to me, especially if the above is correct:
>> If so, would it be possible/better for the host, in normal operations,
>> to mount the underlying LV directly rather than the degraded top-level
>> RAID1?
>
> No, you want to have mdadm assemble the raid volume, even if in degraded
> mode with only one drive, and then access the LV on top of the md device.
> Even if you were able to mount the LV and bypass raid, that would be
> pointless because you would not update the bitmap and superblock and the
> integrity of the raid set would be lost.
During normal operations - when I'm not in the process of taking a
RAID-backup of my LV snapshot - it seems to me that the "Top-RAID" mdN
doesn't even exist, right? It's set up to mirror between a snapshot and
a regular partition, neither of which exists during these normal
operations.
Therefore during normal operations the host *is* mounting the LV
directly, not via the "Top-RAID" mdN.
I wasn't talking about accessing the "Bottom-RAID" which creates the
underlying PV - this is transparent to LVM anyway, right?
Thanks again for your help (or in advance to anyone else who would
like to chime in)
Re: [linux-lvm] Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 30.11.2010 14:13:00 by Phil Turmel
Hi Hans, (?)
On 11/30/2010 02:34 AM, hansbkk@gmail.com wrote:
> On Tue, Nov 30, 2010 at 2:14 PM, Nataraj wrote:
>>> TopRAID1's LVM is likely to be running over a RAID6 set , so I'm not
>>> depending on the TopRAID mirroring for reliability, just using it for
>>> the above volume cloning.
>>
>> Your raid 1 backups won't mirror any snapshots of your LV's unless you
>> specifically setup mirroring of the snapshots after they exist.
>
> Ah, getting clearer to me, I was thinking I'd be mirroring the LV
> itself, but you're right, taking a snapshot and mirroring that is a
> much better idea.
I think you are making this overly complex, insisting on a RAID1 operation to back up from one filer to the other. Consider having each disk on filer #2 configured as a single LVM PV/VG, so it can stand alone in a rotation. Then try the alternate below.
> So here's a summary of steps, please confirm:
> - create a snapshot of a given volume
Here's where you are over-complicating things:
> - create a new RAID1 mdN between that and a physical partition (blank?)
> - let that get sync'd up
> - break the RAID (fail the partition?), remove the drive
As an alternate, with simpler recovery semantics:
Create matching LV on non-RAID PV/VG on filer #2
dd + netcat + dd or other technique to dup the snapshot on filer #1 to filer #2
> - delete the snapshot
Now, you have a single disk in your backup set that can be mounted on either filer, and either copied back into service, or in an emergency, used directly (live) in filer #1.
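(A sketch of that plumbing - hostnames, device names, and the exact nc
flags are illustrative, since netcat flavors differ:)

    # on filer #2 (receiver):
    nc -l -p 7000 | dd of=/dev/vg_solo/lv1_copy bs=1M
    # on filer #1 (sender):
    dd if=/dev/vg0/lv1snap bs=1M | nc filer2 7000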
This approach also gives you the *option* to implement the backup transfer with file system conversions, compression, free space removal, or any other administrative adjustments you need. A RAID mirror can only duplicate the raw block device.
HTH,
Phil
Re: [linux-lvm] Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 30.11.2010 16:39:16 by hansBKK
On Tue, Nov 30, 2010 at 8:13 PM, Phil Turmel wrote:
> I think you are making this overly complex, insisting on a RAID1 operation to back up from one filer to the other. Consider having each disk on filer #2 configured as a single LVM PV/VG, so it can stand alone in a rotation. Then try the alternate below.
> As an alternate, with simpler recovery semantics:
> Create matching LV on non-RAID PV/VG on filer #2
> dd + netcat + dd or other technique to dup the snapshot on filer #1 to filer #2
>
>> - delete the snapshot
Thanks Phil, and yes I do tend to complicate things during a
learning/planning stage, then when I get to implementation reality
tends to force simplification :)
Perhaps I miscommunicated, but your suggestion misses a couple of factors:
- this process is (so far) only targeting the backup data hosted on Filer-B
- the backup-mirroring I'm considering is all within that one
server, not across a network connection
- I can't have "each disk configured as a single VG", since the
whole point of my using LVM is to get as much flexibility as possible
for dozens of hosts to share my dozen-terabyte disk space, when I only
have a half-dozen disks.
- my goal is for the snapshot-copy to end up in a regular partition
on a physical disk, without any RAID/LVM layers standing between the
data and easy recovery
However your main point is perfectly valid - dd basically does the
same block-level data transfer as mdadm RAID1 - which is a must for the
gazillions-of-hardlinks backup filesystems (as opposed to file-level
tools like rsync).
So adapting your suggestion to fit (my perception of) my needs:
- create an LV snapshot
- mount a plain partition on a physical hard disk (preferably on a
separate controller?)
- dd the data from the LV snapshot over to the partition
- delete the snapshot
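(i.e., roughly - names made up:)

    lvcreate -s -L 10G -n lv1snap /dev/vg0/lv1
    dd if=/dev/vg0/lv1snap of=/dev/sdb1 bs=1M   # sdb1 at least as large as the LV
    lvremove /dev/vg0/lv1snap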
> A RAID mirror can only duplicate the raw block device.
Isn't that all that dd is doing as well?
My perception is that software like mdadm, designed as it is to maximize
data reliability, would handle the cloning more reliably than dd -
isn't there more error-checking going on during a RAID1 re-mirroring?
Your main (and quite valid) point is that a user-space tool designed
to do the job is probably more appropriate than putting RAID1 to use
in a way beyond what was originally intended.
So I guess my question becomes:
What is the best tool to block-level clone an LV snapshot to a regular
disk partition?
- "best" = as close to 100% reliably as possible, speed isn't nearly
as important
Would a COTS cloning package (something like Acronis TrueImage) have
data reliability advantages (like mdadm's) over dd?
Re: [linux-lvm] Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 30.11.2010 17:56:05 by Phil Turmel
On 11/30/2010 10:39 AM, hansbkk@gmail.com wrote:
> On Tue, Nov 30, 2010 at 8:13 PM, Phil Turmel wrote:
>> I think you are making this overly complex, insisting on a RAID1 operation to back up from one filer to the other. Consider having each disk on filer #2 configured as a single LVM PV/VG, so it can stand alone in a rotation. Then try the alternate below.
>
>
>
>> As an alternate, with simpler recovery semantics:
>> Create matching LV on non-RAID PV/VG on filer #2
>> dd + netcat + dd or other technique to dup the snapshot on filer #1 to filer #2
>>
>>> - delete the snapshot
>
> Thanks Phil, and yes I do tend to complicate things during a
> learning/planning stage, then when I get to implementation reality
> tends to force simplification :)
>
> Perhaps I miscommunicated, but your suggestion misses a couple of factors:
> - this process is (so far) only targeting the backup data hosted on Filer-B
> - the backup-mirroring I'm considering is all within that one
> server, not across a network connection
OK.
> - I can't have "each disk configured as a single VG", since the
> whole point of my using LVM is to get as much flexibility as possible
> for dozens of hosts to share my dozen-terabyte disk space, when I only
> have a half-dozen disks.
I meant only the disks that will be rotated offline/offsite should be
set up as "solo" Volume Groups.
> - my goal is for the snapshot-copy to end up in a regular partition
> on a physical disk, without any RAID/LVM layers standing between the
> data and easy recovery
This wasn't clear, and is a key factor. You can wipe the superblock from an mdraid mirror member, and if the SB was at the end of the partition, be left with a usable partition (with a little wasted space at the end). But this doesn't help you, because the partition you want to clone is an LV inside LVM. lvconvert can add a mirror to a partition on the fly, and split it off at an appropriate time, but only if the mirror is also inside LVM. lvconvert can do some neat stuff, but it's all inside LVM.
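(For reference, the all-inside-LVM flavor looks roughly like this - PV
and LV names hypothetical, and check your LVM2 version for
--splitmirrors support:)

    pvcreate /dev/sdb1 && vgextend vg0 /dev/sdb1   # rotation disk joins the VG
    lvconvert -m1 vg0/lv1 /dev/sdb1                # add a mirror leg on that PV
    # once in sync, split the leg off as a stand-alone LV:
    lvconvert --splitmirrors 1 --name lv1_backup vg0/lv1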
> However your main point is perfectly valid - dd basically does the
> same block-level data transfer as mdadm RAID1 - which is a must for the
> gazillions-of-hardlinks backup filesystems (as opposed to file-level
> tools like rsync).
(Actually, rsync and tar are both hardlink-aware, at least the versions I use.)
> So adapting your suggestion to fit (my perception of) my needs:
>
> - create an LV snapshot
> - mount a plain partition on a physical hard disk (preferably on a
> separate controller?)
> - dd the data from the LV snapshot over to the partition
> - delete the snapshot
Yep, this is basically what I recommended.
>> A RAID mirror can only duplicate the raw block device.
>
> Isn't that all that dd is doing as well?
The raid operation carries a lot of metadata to track what is in sync vs. what's not, because it is expected to work on live filesystems. If you are duping a snapshot, the snapshot is static, and dd will give you the same result, without the metadata overhead.
> My perception is that software like mdadm, designed as it is to maximize
> data reliability, would handle the cloning more reliably than dd -
> isn't there more error-checking going on during a RAID1 re-mirroring?
> Your main (and quite valid) point is that a user-space tool designed
> to do the job is probably more appropriate than putting RAID1 to use
> in a way beyond what was originally intended.
Well, both mdraid and dd will choke on a write error on the target.
> So I guess my question becomes:
>
> What is the best tool to block-level clone an LV snapshot to a regular
> disk partition?
>
> - "best" = as close to 100% reliably as possible, speed isn't nearly
> as important
I would use dd.
> Would a COTS cloning package (something like Acronis TrueImage) have
> data reliability advantages (like mdadm's) over dd?
Not really. But they may offer various forms of compression/sparsification/error detection if you wish to store the final backups as files. Of course, if you do that, you might as well use tar+gzip+md5sum.
I'm guessing that you'll be scripting all of this, in which case I'd recommend sticking to some combination of lvcreate -s, lvconvert, dd, and possibly tar+gzip.
You want your dismountable disks to be accessible stand-alone, but I don't see why that would preclude setting them up so each is a unique LVM VG.
HTH,
Phil
Re: [linux-lvm] Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 01.12.2010 05:45:30 by hansBKK
On Tue, Nov 30, 2010 at 11:56 PM, Phil Turmel wrote:

> (Actually, rsync and tar are both hardlink-aware, at least the versions I use.)
My backup filesystems contain so many hardlinks (millions, constantly
growing) that file-level tools choke - this really must be done at the
block device level - see my previous post for more detail.
It's also now clear to me that rsync is the tool to use for this for
all the other LVs without such problematic filesystems, as I know the
tool and trust its error-checking routines.
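(For those I expect something along these lines - paths made up; -H is
rsync's hardlink-preserving flag:)

    rsync -aH --delete /srv/lv2/ /mnt/rotation_disk/lv2/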
>> So adapting your suggestion to fit (my perception of) my needs:
>>
>>   - create an LV snapshot
>>   - mount a plain partition on a physical hard disk (preferably on a
>> separate controller?)
>>   - dd the data from the LV snapshot over to the partition
>>   - delete the snapshot
>
> Yep, this is basically what I recommended.

>> So I guess my question becomes:
>>
>> What is the best tool to block-level clone an LV snapshot to a regular
>> disk partition?
>>
>>   - "best" = as close to 100% reliably as possible, speed isn't nearly
>> as important
>
> I would use dd.
OK, that's clear, thanks.
>> Would a COTS cloning package (something like Acronis TrueImage) have
>> data reliability advantages (like mdadm's) over dd?
>
> Not really.  But they may offer various forms of compression/sparsification/error detection if you wish to store the final backups as files.  Of course, if you do that, you might as well use tar+gzip+md5sum.
No, I'm talking about partition-to-partition cloning operations, which
some of these do support. The error detection is critical, and why I
was looking at mdraid in the first place.
> You want your dismountable disks to be accessible stand-alone, but I don't see why that would preclude setting them up so each is a unique LVM VG.
It doesn't preclude it, but it's a layer of complexity during the data
recovery process I'm trying to avoid.
The ultimate goal is a plain partition on a plain disk that can be
directly mounted on a SATA2 host via a normal recovery/LiveCD by a
user that's never heard of RAID or LVM.
To summarize your feedback:
- mdraid's sync error-checking routines don't add value over dd to
ensure accurate cloning of a static partition; its metadata is just
useless overhead in this case.
- dd is reliable enough
One last question (and I do realize it's now OT for here, so I won't
be hurt if it's ignored :)
Does dd already do some sort of "verify after copy"? I will likely
investigate the available COTS partition cloning tools as well.
Thanks for all your help, not least in helping me to clarify my own goals
Re: [linux-lvm] Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 01.12.2010 13:50:29 by Phil Turmel
On 11/30/2010 11:45 PM, hansbkk@gmail.com wrote:
> On Tue, Nov 30, 2010 at 11:56 PM, Phil Turmel wrote:
>
>> (Actually, rsync and tar are both hardlink-aware, at least the versions I use.)
>
> My backup filesystems contain so many hardlinks (millions, constantly
> growing) that file-level tools choke - this really must be done at the
> block device level - see my previous post for more detail.
Ah -- I did miss that detail.
> It's also now clear to me that rsync is the tool to use for this for
> all the other LVs without such problematic filesystems, as I know the
> tool and trust its error-checking routines.
Indeed. I push my own critical systems offsite with rsync+ssh.
[snip /]
>> I would use dd.
>
> OK, that's clear, thanks.
>
>
>> You want your dismountable disks to be accessible stand-alone, but I don't see why that would preclude setting them up so each is a unique LVM VG.
>
> It doesn't preclude it, but it's a layer of complexity during the data
> recovery process I'm trying to avoid.
>
> The ultimate goal is a plain partition on a plain disk that can be
> directly mounted on a SATA2 host via a normal recovery/LiveCD by a
> user that's never heard of RAID or LVM.
Ah -- not *you*. And you wouldn't be mixing customers on a disk, I presume?
> To summarize your feedback:
>
> - mdraid's sync error-checking routines don't add value over dd to
> ensure accurate cloning of a static partition; its metadata is just
> useless overhead in this case.
Right.
> - dd is reliable enough
I guess if your filer lacks ECC RAM, you could have a bit flip between reading and writing that would be missed. It's really an end-to-end hardware integrity issue at this level. But an undetected read error between the platter and RAM will be propagated by every block-level software tool out there, including software raid. btrfs can do data-checksumming, but that's at the FS level.
> One last question (and I do realize it's now OT for here, so I won't
> be hurt if it's ignored :)
>
> Does dd already do some sort of "verify after copy"? I will likely
> investigate the available COTS partition cloning tools as well.
Not natively, but it's fairly easy to pipe a dd reader through md5sum to a dd writer, then follow up with a dd read + md5sum of the copied partition (taking care to read precisely the same number of sectors).
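(Sketch with hypothetical names; bash process substitution handles the
tee through md5sum:)

    SECTORS=$(blockdev --getsz /dev/vg0/lv1snap)   # size in 512-byte sectors
    dd if=/dev/vg0/lv1snap bs=512 count=$SECTORS \
        | tee >(md5sum > /tmp/source.md5) \
        | dd of=/dev/sdb1 bs=512
    # verify: read back exactly the same number of sectors
    dd if=/dev/sdb1 bs=512 count=$SECTORS | md5sum
    cat /tmp/source.md5   # the two sums should match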
The various flavors of ddrescue might have something like this.. didn't check.
> Thanks for all your help, not least in helping me to clarify my own goals
You're welcome.
Phil
Re: [linux-lvm] Q: LVM over RAID, or plain disks? A:"Yes" = best of both worlds?
On 01.12.2010 20:47:45 by hansBKK
On Wed, Dec 1, 2010 at 7:50 PM, Phil Turmel wrote:
>> Does dd already do some sort of "verify after copy"? I will likely
>> investigate the available COTS partition cloning tools as well.
>
> Not natively, but it's fairly easy to pipe a dd reader through md5sum to a dd writer, then follow up with a dd read + md5sum of the copied partition (taking care to read precisely the same number of sectors).
>
> The various flavors of ddrescue might have something like this..  didn't check.
Sorry it's a bit OT, but for the sake of future googlers thought I'd
point to this tool I found:
http://dc3dd.sourceforge.net/
http://www.forensicswiki.org/wiki/Dc3dd
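From a skim of those pages, usage appears to be along these lines
(syntax unverified - check the man page):

    dc3dd if=/dev/vg0/lv1snap of=/dev/sdb1 hash=md5 log=/tmp/clone.log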