hi amit --
In a message dated 3/2/2007 4:54:57 P.M. Eastern Standard Time,
amit.hetawal@gmail.com writes:
> hello,
> Thanks for your previous reponses.
> Now this time i am using the right syntax for matching, for the string like:
>
> $temp= "XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB"
>
> I need to write a regex for filterin out the string between.
>
> AAA
> BBB
> CCC
>
> so in the above case i should have the output as:
>
> AAAZZZZZBBB
> BBBSSSSSSCCC
> CCCGGGGBBB
> BBBVVVVVBBB
> meaning all combinations of start and end for AAA BBB CCC.
>
> I have the regex for one of them but how do i do it simultaneously for
> all 3 of them.
>
> $temp='XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB';
>
> @t = ($temp =~/(AAA)(.*?)(BBB)/g);
> foreach (@t)
> {
>
> print $_;
>
> }
>
> Am not able to figure out how will go about when just after the match
> i need to match for
> BBBSSSSCCC.
>
> Any suggestions
>
> Thanks
try this:
C:\@Work\Perl>perl -we "use strict; my $dna= 'XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBBAAA';
my $starter = qr(AAA | BBB | CCC)x; my $stopper = qr(AAA | BBB | CCC)x; my @seq;
while ($dna =~ / ($starter) (.*?) ($stopper) /gx) { push @seq, qq($1$2$3); pos($dna) = $-[3]; };
print qq({$_} \n) for @seq"
{AAAZZZZBBB}
{BBBSSSSCCC}
{CCCGGGGBBB}
{BBBVVVVVBBB}
{BBBAAA}
the trick is resetting the search position in the body of the while loop: pos($dna) = $-[3]
moves it back to the offset where the stopper began, so the stopper of one match can serve as
the starter of the next. as far as i know, there is no way to do this purely from within a regex.
i defined two separate patterns for starting and stopping a subsequence even though the actual
groups are identical; it may serve if the groups ever differ.
note that the above also captures the ``empty'' test case i added at the end (BBBAAA, with
nothing between the two tags). if you do not want this,
try this instead (the (.*?) becomes (.+?)):
C:\@Work\Perl>perl -we "use strict; my $dna= 'XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBBAAA';
my $starter = qr(AAA | BBB | CCC)x; my $stopper = qr(AAA | BBB | CCC)x; my @seq;
while ($dna =~ / ($starter) (.+?) ($stopper) /gx) { push @seq, qq($1$2$3); pos($dna) = $-[3]; };
print qq({$_} \n) for @seq"
{AAAZZZZBBB}
{BBBSSSSCCC}
{CCCGGGGBBB}
{BBBVVVVVBBB}
hth -- bill walters
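For anyone who would rather have this as a script than a cmd.exe one-liner, here is a minimal sketch of the same pos() reset. The sub name tagged_segments and its argument order are made up for illustration; the loop body is the one from the post, and the input is the string from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): returns every segment that
# starts with one tag and ends at the next tag, letting the closing tag
# of one segment open the following one.
sub tagged_segments {
    my ($dna, $starter, $stopper) = @_;
    my @seq;
    while ($dna =~ / ($starter) (.*?) ($stopper) /gx) {
        push @seq, "$1$2$3";
        # $-[3] is the offset where group 3 (the stopper) began; moving
        # pos() back there makes the next search start at the stopper.
        pos($dna) = $-[3];
    }
    return @seq;
}

my $tag = qr/AAA|BBB|CCC/;
print "{$_}\n"
    for tagged_segments('XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB', $tag, $tag);
```

Passing two different patterns as $starter and $stopper is where keeping them separate would pay off.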
_______________________________________________
ActivePerl mailing list
ActivePerl@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
Re: Another regex
on 03.03.2007 01:54:26 by Williamawalters
hi again, amit --
i took another look at my original post and came up with a version that's
slightly more concise: the whole subsequence is captured as $1, the stopper is
nested inside it as group 2, and the reset uses $-[2]:
C:\@Work\Perl>perl -we "use strict; my $dna= 'XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBBAAA';
my $starter = qr(AAA | BBB | CCC)x; my $stopper = qr(AAA | BBB | CCC)x; my @seq;
while ($dna =~ / ($starter .*? ($stopper)) /gx) { push @seq, $1; pos($dna) = $-[2]; }
print qq({$_} \n) for @seq"
{AAAZZZZBBB}
{BBBSSSSCCC}
{CCCGGGGBBB}
{BBBVVVVVBBB}
{BBBAAA}
br -- bill
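In case @- is unfamiliar: after a successful match it holds start offsets, with $-[0] for the whole match and $-[n] for capture group n. Here is a small sketch of what the concise pattern sees on its first match. The helper name first_segment is made up for illustration, and the input is the string from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): returns the first segment,
# its start offset, and the start offset of its closing tag.
sub first_segment {
    my ($dna, $tag) = @_;
    # Group 1 is the whole segment; group 2 is only the closing tag.
    return unless $dna =~ / ($tag .*? ($tag)) /x;
    # @- holds start offsets for the last successful match:
    # $-[1] for group 1, $-[2] for group 2.
    return ($1, $-[1], $-[2]);
}

my ($segment, $segment_start, $closing_start) =
    first_segment('XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB', qr/AAA|BBB|CCC/);
printf "%s starts at %d, closing tag starts at %d\n",
    $segment, $segment_start, $closing_start;
```

The second offset is the value the loop assigns to pos($dna), which is why the next search begins on the closing tag rather than after it.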
Re: Another regex
on 03.03.2007 02:10:19 by Sam Dela Cruz
Hi Amit,
Here's my solution:
my $dna = 'XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB';
my @tags = ('AAA',
            'BBB',
            'CCC',
           );
my $tags = join "|", @tags;
my $tag_pattern = qr($tags);
print $_,"\n" foreach ( $dna =~ /(?=($tag_pattern.*?$tag_pattern))/g );
Output is:
AAAZZZZBBB
BBBSSSSCCC
CCCGGGGBBB
BBBVVVVVBBB
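This should work for whatever tags you specify in the @tags list; the lookahead (?=...) is
zero-width, so the match is retried at every position and the end tag of one segment can
start the next.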
Regards,
Sam Dela Cruz
__________________________________________________________
Business Applications, Application Developer
AMEC Operations Management - North America
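One caveat on the lookahead approach: join "|" builds a regex, so a tag containing a metacharacter such as + or | would be read as regex syntax. Assuming the tags are meant as literal strings, a hedged variation escapes them with quotemeta first. The sub name overlapping_segments is made up for illustration; the lookahead itself is the one from the post.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): same lookahead as in the
# post, but every tag is treated as literal text.
sub overlapping_segments {
    my ($dna, @tags) = @_;
    # quotemeta escapes regex metacharacters in each tag.
    my $alternation = join '|', map { quotemeta($_) } @tags;
    my $tag_pattern = qr/$alternation/;
    # The lookahead consumes nothing, so the engine retries at every
    # offset and the closing tag of one segment can open the next.
    return $dna =~ /(?=($tag_pattern.*?$tag_pattern))/g;
}

print "$_\n" for overlapping_segments(
    'XXXXAAAZZZZBBBSSSSCCCGGGGBBBVVVVVBBB', 'AAA', 'BBB', 'CCC');
```

Call it in list context, as above, since a /g match in scalar context only reports whether there was a match.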
Re: Another regex
on 05.03.2007 19:38:48 by Andy_Bach
> [ is this DNA related?]
Searching CPAN for DNA (or Genetics) turns up a whole bunch of stuff for it,
and, via the Perl Journal, I recall a major genome mapping project
having been completed *only* on the power of Perl. So if you're doing
something ... I dunno, lab or genome related, you may want to look for the
wheels already out there for Perl and DNA.
a
Andy Bach
Systems Mangler
Internet: andy_bach@wiwb.uscourts.gov
VOICE: (608) 261-5738 FAX 264-5932
"Procrastination is like putting lots and lots of commas in the sentence
of your life."
Ze Frank
http://lifehacker.com/software/procrastination/ze-frank-on-procrastination-235859.php
Re: Another regex
on 07.03.2007 16:43:10 by n.haigh
amit hetawal wrote:
> yes it is a DNA sequence i need to find.
>
> But still not getting how.. should i go about.
>
> Can you advise something
>
> Thanks
>
>
> On 3/2/07, Deane.Rothenmaier@walgreens.com
> wrote:
>
>> If those letters were different, I'd think you were working on a chunk of
>> DNA... P-))
>>
>> Deane Rothenmaier
>> Programmer/Analyst
>> Walgreens Corp.
>> 847-914-5150
>>
>> "On two occasions I have been asked [by members of Parliament], 'Pray, Mr.
>> Babbage, if you put into the machine wrong figures, will the right answers
>> come out?' I am not able rightly to apprehend the kind of confusion of ideas
>> that could provoke such a question." -- Charles Babbage
>>
Have you ever thought about using BioPerl?
I know this may not be the place to discuss it, but could you explain
what you are trying to do in real/biological terms? I'm a
bioinformatician, so it shouldn't scare me!!
Nathan