error injection

on 28.09.2011 21:48:37 by Jojy Varghese

Hi,
I am trying to dynamically add error injection to my virtual
disk (LVM) for testing and debugging purposes. I saw the "faulty" personality
module in the kernel and was wondering if there is any documentation
on its usage. I am not looking to set up a RAID, just a simple mapped
device. The basic use case is that I need to be able to dynamically
add/remove error sectors and also have granular error
configuration, such as read errors, read+write errors, etc.

thanks in advance
Jojy
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: error injection

on 29.09.2011 01:11:28 by NeilBrown

On Wed, 28 Sep 2011 12:48:37 -0700 Jojy Varghese wrote:

> Hi
> I am trying to dynamically add error injection to my virtual
> disk(LVM) for testing+ debugging purpose. I saw "faulty" personality
> module in the kernel and was wondering if there was any documentation
> on its usage. I am not looking to set up a RAID but a simple mapped
> device. So the basic use case is that I need to be able to dynamically
> add/remove error sectors and also be able to have granular error
> configuration like read error, read+write error etc.
>
> thanks in advance
> Jojy

The 'faulty' md personality is described briefly in the 'md.4' man page, which
is included in the mdadm distribution.
I've included the relevant part below.

Configuring the type of faults is described in mdadm.8 under the '-p,
--layout=' section. You can adjust the settings using mdadm --grow.
So:
  mdadm -B /dev/md0 -l faulty -n1 /dev/sda

will build a 'faulty' device which provides access to /dev/sda, but
introduces faults. Initially no faults will be introduced.

  mdadm -G /dev/md0 --layout=rt400

will tell md0 to generate a read error every 400 requests, but not to
remember the error - rt == read-transient.
  --layout=rp400
will create a persistent error every 400 reads; subsequent reads of the same
block will produce the same error. At most 50 persistent errors can be
recorded.
  mdadm -G /dev/md0 --layout=clear
will stop producing new errors.
  mdadm -G /dev/md0 --layout=flush
will forget all persistent errors.


from md.4:

FAULTY
       The FAULTY md module is provided for testing purposes. A faulty array
       has exactly one component device and is normally assembled without a
       superblock, so the md array created provides direct access to all of
       the data in the component device.

       The FAULTY module may be requested to simulate faults to allow testing
       of other md levels or of filesystems. Faults can be chosen to trigger
       on read requests or write requests, and can be transient (a subsequent
       read/write at the address will probably succeed) or persistent
       (subsequent read/write of the same address will fail). Further, read
       faults can be "fixable" meaning that they persist until a write request
       at the same address.

       Fault types can be requested with a period. In this case, the fault
       will recur repeatedly after the given number of requests of the
       relevant type. For example if persistent read faults have a period of
       100, then every 100th read request would generate a fault, and the
       faulty sector would be recorded so that subsequent reads on that sector
       would also fail.

       There is a limit to the number of faulty sectors that are remembered.
       Faults generated after this limit is exhausted are treated as
       transient.

       The list of faulty sectors can be flushed, and the active list of
       failure modes can be cleared.
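
The behaviour described in this excerpt can be pictured with a small toy model. This is only a sketch in Python, not the kernel code: the class name FaultyReadModel, its methods, and the choice that reads of an already-recorded sector do not count towards the period are all assumptions made for illustration. The default limit of 50 remembered sectors is the figure quoted earlier in this message.

```python
class FaultyReadModel:
    """Toy model of periodic persistent read faults (NOT the md kernel code)."""

    def __init__(self, period, max_remembered=50):
        self.period = period                  # fault on every period-th read
        self.max_remembered = max_remembered  # limit on remembered sectors
        self.countdown = period
        self.bad_sectors = set()

    def read(self, sector):
        """Return True if the read succeeds, False if it faults."""
        if sector in self.bad_sectors:
            # Persistent fault recorded earlier: the same sector fails again.
            # Assumption: such reads do not count towards the period.
            return False
        self.countdown -= 1
        if self.countdown > 0:
            return True
        # The period has elapsed: inject a fault and restart the count.
        self.countdown = self.period
        if len(self.bad_sectors) < self.max_remembered:
            self.bad_sectors.add(sector)      # remembered, so it persists
        # Once the limit is reached the fault is only transient.
        return False

    def flush(self):
        """Forget all remembered faulty sectors."""
        self.bad_sectors.clear()
```

With period=100, the first 99 reads succeed, the 100th fails, and re-reading that same sector keeps failing until flush() is called.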


from mdadm.8:

              When setting the failure mode for level faulty, the options are:
              write-transient, wt, read-transient, rt, write-persistent, wp,
              read-persistent, rp, write-all, read-fixable, rf, clear, flush,
              none.

              Each failure mode can be followed by a number, which is used as
              a period between fault generation. Without a number, the fault
              is generated once on the first relevant request. With a number,
              the fault will be generated after that many requests, and will
              continue to be generated every time the period elapses.

              Multiple failure modes can be current simultaneously by using
              the --grow option to set subsequent failure modes.

              "clear" or "none" will remove any pending or periodic failure
              modes, and "flush" will clear any persistent faults.
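
To make the "mode name followed by an optional number" syntax concrete, here is a minimal Python sketch that splits a layout string such as rt400 into a mode and a period. The helper name parse_layout and its return shape are hypothetical, and this is not how mdadm itself parses the option; the mode names and abbreviations are the ones listed in the excerpt above.

```python
import re

# Abbreviations and mode names as listed in the mdadm.8 excerpt.
ABBREVIATIONS = {
    "wt": "write-transient",
    "rt": "read-transient",
    "wp": "write-persistent",
    "rp": "read-persistent",
    "rf": "read-fixable",
}
MODES = set(ABBREVIATIONS.values()) | {"write-all", "clear", "flush", "none"}


def parse_layout(layout):
    """Split e.g. 'rt400' into ('read-transient', 400).

    The period is None when no number follows the mode name.
    """
    match = re.fullmatch(r"([a-z-]+)(\d*)", layout)
    if not match:
        raise ValueError(f"cannot parse layout {layout!r}")
    name, digits = match.groups()
    mode = ABBREVIATIONS.get(name, name)   # expand wt/rt/wp/rp/rf
    if mode not in MODES:
        raise ValueError(f"unknown failure mode {name!r}")
    return mode, (int(digits) if digits else None)
```

For example, parse_layout("rt400") gives ("read-transient", 400), and parse_layout("clear") gives ("clear", None).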



NeilBrown



Re: error injection

on 29.09.2011 02:59:49 by Jojy Varghese

Thanks Neil. I tried setting my sda7 partition to generate write
errors every 40 bytes (writing 1 byte at a time). I did:

1. Create an array with:
mdadm -C /dev/md/me0 -l faulty -n1 /dev/sda7

After this step I can see /dev/md127, and when I run mdadm -D /dev/md127, I get:

/dev/md127:
        Version : 1.2
  Creation Time : Wed Sep 28 17:35:50 2011
     Raid Level : faulty
     Array Size : 969410424 (924.50 GiB 992.68 GB)
   Raid Devices : 1
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Wed Sep 28 17:35:50 2011
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : eng-dev16.lab.local:me0  (local to host eng-dev16.lab.local)
           UUID : 96f4be10:312f9574:f40107aa:d9f278ba
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8        7        0      active sync   /dev/sda7


2. Set the write fault mode with:

mdadm -G /dev/md/me0 --layout=wp40

After this, when I write more than 40 bytes to /dev/md127, I don't get any
I/O errors. I am sure I am doing something wrong here.


Any help is much appreciated.

Thanks
Jojy


Re: error injection

on 29.09.2011 03:08:36 by NeilBrown

On Wed, 28 Sep 2011 17:59:49 -0700 Jojy Varghese wrote:

> Thanks Neil. I tried setting my sda7 partition to generate write
> errors every 40 bytes(writing 1 byte at a time). I did :

md doesn't see byte writes. It sees sectors or more - usually whole pages or
groups of pages.

>
> 1. Create a array with:
> mdadm -C /dev/md/me0 -l faulty -n1 /dev/sda7

-C will write a superblock to /dev/sda7 which you don't really want. It
doesn't hurt, but I always used -B (--build) to avoid any metadata.

>
> After this step I can see /dev/md127 and when i do a mdadm -D /dev/md127, i get:
>
> /dev/md127:
>         Version : 1.2
>   Creation Time : Wed Sep 28 17:35:50 2011
>      Raid Level : faulty
>      Array Size : 969410424 (924.50 GiB 992.68 GB)
>    Raid Devices : 1
>   Total Devices : 1
>     Persistence : Superblock is persistent
>
>     Update Time : Wed Sep 28 17:35:50 2011
>           State : clean
>  Active Devices : 1
> Working Devices : 1
>  Failed Devices : 0
>   Spare Devices : 0
>
>            Name : eng-dev16.lab.local:me0  (local to host eng-dev16.lab.local)
>            UUID : 96f4be10:312f9574:f40107aa:d9f278ba
>          Events : 0
>
>     Number   Major   Minor   RaidDevice State
>        0       8        7        0      active sync   /dev/sda7
>
>
> 2. Set write fault level with:
>
> mdadm -G /dev/md/me0 --layout=wp40
>
>
>
> After this when i write > 40 bytes into /dev/md127, i dont get any
> I/O errors. I am sure i am doing something wrong here.

When you write to /dev/md127 it will just go into the page cache and
eventually be flushed to the device in one write.
Use O_DIRECT or O_SYNC and it will be flushed out more quickly, but always
write at least 512 bytes at a time.
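
As a rough illustration of that advice, the following Python sketch writes whole 512-byte sectors through a descriptor opened with O_SYNC, so each write is pushed to the device instead of sitting in the page cache, and reports which writes returned an I/O error. The function name write_sectors_sync is hypothetical. O_DIRECT is left out because it additionally needs aligned buffers. Pointing the function at a real md device (for example /dev/md127 from this thread) needs root and overwrites data on it.

```python
import os

SECTOR = 512  # write whole sectors, never single bytes


def write_sectors_sync(path, count, sector_size=SECTOR):
    """Write `count` sectors with O_SYNC; return indices of failed writes."""
    failed = []
    # O_SYNC makes each write reach the device before the call returns,
    # so an injected write fault surfaces as an OSError on that write.
    fd = os.open(path, os.O_WRONLY | os.O_SYNC)
    try:
        for i in range(count):
            buf = bytes([i % 256]) * sector_size
            try:
                os.pwrite(fd, buf, i * sector_size)
            except OSError:
                failed.append(i)
    finally:
        os.close(fd)
    return failed
```

On an ordinary file the returned list should be empty. On a faulty device with a write mode set, the indices of the failed sector writes would be expected to show up in it.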

NeilBrown



>
>
> Any help is much appreciated.
>=20
> Thanks
> Jojy


Re: error injection

on 29.09.2011 04:06:17 by Jojy Varghese

Thanks Neil. Also, is there any way to find which fault blocks are currently set?


Re: error injection

on 29.09.2011 04:12:26 by NeilBrown

On Wed, 28 Sep 2011 19:06:17 -0700 Jojy Varghese wrote:

> Thanks Neil. Also, is there any way to find the current fault blocks being set?
>

No. All you can get is what is shown in "/proc/mdstat".

It wouldn't be too hard to add something to /proc/mdstat or /sys/.... to show
that information.
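
Since /proc/mdstat is the only place to look, a small helper can pull out just the lines for one array. This is a sketch under assumptions: the function names read_mdstat and mdstat_stanza are hypothetical, and the layout assumed here (a "mdN : ..." header line followed by indented continuation lines) is the general shape of /proc/mdstat, not a statement about which details the faulty personality reports there.

```python
def read_mdstat(path="/proc/mdstat"):
    """Return the full text of /proc/mdstat."""
    with open(path) as handle:
        return handle.read()


def mdstat_stanza(text, device):
    """Return the lines of mdstat `text` describing `device`, e.g. 'md127'.

    A stanza is taken to be the 'mdN : ...' header line plus the indented
    lines that follow it, up to the first blank or unindented line.
    """
    stanza = []
    for line in text.splitlines():
        if line.startswith(device + " :"):
            stanza = [line]
        elif stanza and line.startswith((" ", "\t")) and line.strip():
            stanza.append(line)
        elif stanza:
            break
    return stanza
```

Calling mdstat_stanza(read_mdstat(), "md127") on the test machine would return whatever the kernel prints for that array, or an empty list if the device is not present.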

NeilBrown

