selecting text between two patterns
on 02.09.2007 17:54:58 by ginger.m.griffin
Hey... I'm trying to extract text between two patterns. Ideally,
if I had two patterns:
PATTERN1
PATTERN2
I'd like to get back the text that is between the first occurrence of
those patterns, not including the patterns themselves.
For instance, if I had the text:
this is one line
this is PATTERN1 another line
hello
this is somethingPATTERN2 else
this is the last line
I would want this result
another line
hello
this is something
Or if I had this text:
this PATTERN1is one big line. But it isPATTERN2 not very long.
I would want this result:
is one big line. But it is
Thanks!
Re: selecting text between two patterns
on 02.09.2007 19:47:26 by cfajohnson
On 2007-09-02, gin_g wrote:
> Hey... I'm trying to extract text between two patterns. Ideally,
> if I had two patterns:
> PATTERN1
> PATTERN2
> I'd like to get back the text that is between the first occurrence of
> those patterns, not including the patterns
> For instance, if I had the text:
>
> this is one line
> this is PATTERN1 another line
> hello
> this is somethingPATTERN2 else
> this is the last line
>
> I would want this result
>
> another line
> hello
> this is something
>
> Or if I had this text:
> this PATTERN1is one big line. But it isPATTERN2 not very long.
>
> I would want this result:
> is one big line. But it is
This may not work with very large files:
file=$( cat "$FILENAME" )
temp=${file#*PATTERN1}
printf "%s\n" "${temp%%PATTERN2*}"
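One thing the three lines above do not guard against is a missing pattern: if PATTERN1 is absent, ${file#*PATTERN1} leaves the string unchanged, so text that was never between the two patterns gets printed. A minimal sketch of a guarded version, assuming plain POSIX sh; the function name "between" and its argument order are made up for illustration, and both patterns are treated as literal strings rather than globs:

```shell
# between TEXT START END
# Print the text between the first START and the first END that follows it.
# Returns 1 (and prints nothing) if either marker is missing.
between() {
    b_text=$1 b_start=$2 b_end=$3
    case $b_text in
        *"$b_start"*) ;;                 # START present: carry on
        *) return 1 ;;                   # START missing: nothing to extract
    esac
    b_rest=${b_text#*"$b_start"}         # drop everything through the first START
    case $b_rest in
        *"$b_end"*) ;;                   # END present after START
        *) return 1 ;;
    esac
    b_result=${b_rest%%"$b_end"*}        # drop everything from the first END on
    printf '%s\n' "$b_result"
}
```

It could be called as: between "$( cat "$FILENAME" )" PATTERN1 PATTERN2. Like the original, it holds the whole file in a variable, so the caveat about very large files still applies.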
--
Chris F.A. Johnson, author
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
===== My code in this post, if any, assumes the POSIX locale
===== and is released under the GNU General Public Licence
Re: selecting text between two patterns
on 03.09.2007 00:46:38 by Dummy
gin_g wrote:
> Hey... I'm trying to extract text between two patterns. Ideally,
> if I had two patterns:
> PATTERN1
> PATTERN2
> I'd like to get back the text that is between the first occurrence of
> those patterns, not including the patterns
> For instance, if I had the text:
>
> this is one line
> this is PATTERN1 another line
> hello
> this is somethingPATTERN2 else
> this is the last line
>
> I would want this result
>
> another line
> hello
> this is something
$ echo "this is one line
this is PATTERN1 another line
hello
this is somethingPATTERN2 else
this is the last line
" | perl -lne'print if s/.*?PATTERN1// .. s/(.*?)PATTERN2.*/$1/'
another line
hello
this is something
> Or if I had this text:
> this PATTERN1is one big line. But it isPATTERN2 not very long.
>
> I would want this result:
> is one big line. But it is
$ echo "this PATTERN1is one big line. But it isPATTERN2 not very long.
" | perl -lne'print if s/.*?PATTERN1// .. s/(.*?)PATTERN2.*/$1/'
is one big line. But it is
John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
Re: selecting text between two patterns
on 03.09.2007 01:46:22 by William James
On Sep 2, 10:54 am, gin_g wrote:
> Hey... I'm trying to extract text between two patterns. Ideally,
> if I had two patterns:
> PATTERN1
> PATTERN2
> I'd like to get back the text that is between the first occurrence of
> those patterns, not including the patterns
> For instance, if I had the text:
>
> this is one line
> this is PATTERN1 another line
> hello
> this is somethingPATTERN2 else
> this is the last line
>
> I would want this result
>
> another line
> hello
> this is something
>
> Or if I had this text:
> this PATTERN1is one big line. But it isPATTERN2 not very long.
>
> I would want this result:
> is one big line. But it is
>
> Thanks!
ruby -e 'puts gets(nil)[ /PATTERN1(.*)PATTERN2/m,1]' my_file
The first occurrence of PATTERN1 is used, and
the last of PATTERN2.
==== input ====
one
twoPATTERN1
three PATTERN1
four
fivePATTERN2
sixPATTERN2
seven
==== output ====
three PATTERN1
four
fivePATTERN2
six
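The first/last distinction above has a direct counterpart in shell parameter expansion, which may make it easier to see. A small sketch, assuming POSIX sh and the same sample input held in a variable (the variable names are only for illustration): "%" strips the shortest matching suffix, so it cuts at the last PATTERN2, while "%%" strips the longest, so it cuts at the first.

```shell
# Same input as in the post above, held in a shell variable.
text='one
twoPATTERN1
three PATTERN1
four
fivePATTERN2
sixPATTERN2
seven'

rest=${text#*PATTERN1}     # '#'  removes the shortest prefix: through the FIRST PATTERN1
greedy=${rest%PATTERN2*}   # '%'  removes the shortest suffix: from the LAST PATTERN2 on
lazy=${rest%%PATTERN2*}    # '%%' removes the longest suffix:  from the FIRST PATTERN2 on

printf '%s\n' "--- first PATTERN1 to last PATTERN2 ---" "$greedy"
printf '%s\n' "--- first PATTERN1 to first PATTERN2 ---" "$lazy"
```

In the Ruby one-liner, the equivalent switch would be the non-greedy (.*?) in place of (.*).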
Re: selecting text between two patterns
on 03.09.2007 03:28:31 by Ed Morton
gin_g wrote:
> Hey... I'm trying to extract text between two patterns. Ideally,
> if I had two patterns:
> PATTERN1
> PATTERN2
> I'd like to get back the text that is between the first occurrence of
> those patterns, not including the patterns
> For instance, if I had the text:
>
> this is one line
> this is PATTERN1 another line
> hello
> this is somethingPATTERN2 else
> this is the last line
>
> I would want this result
>
> another line
> hello
> this is something
>
> Or if I had this text:
> this PATTERN1is one big line. But it isPATTERN2 not very long.
>
> I would want this result:
> is one big line. But it is
>
>
> Thanks!
>
With an awk that supports REs as Record Separators (e.g. gawk), it might
be just:
awk -v RS="PATTERN1|PATTERN2" 'NR==2' file
provided that PATTERN2 cannot occur before the first PATTERN1, and that
PATTERN1 cannot occur again before PATTERN2, in your real input.
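Where regex record separators are not available, or where PATTERN2 might show up before PATTERN1, the same extraction can be sketched with only index() and substr(), which any POSIX awk should have. This is an alternative under stated assumptions, not the one-liner above: the function name between_awk is hypothetical, the patterns are matched as literal strings, and the whole input is read into memory first.

```shell
# between_awk FILE
# Print the text between the first PATTERN1 and the first PATTERN2 after it.
# Exits with status 1 if either marker is missing.
between_awk() {
    awk -v p1='PATTERN1' -v p2='PATTERN2' '
        { buf = buf $0 "\n" }                   # slurp the whole input
        END {
            i = index(buf, p1)                  # position of the first PATTERN1
            if (i == 0) exit 1
            rest = substr(buf, i + length(p1))  # everything after it
            j = index(rest, p2)                 # first PATTERN2 in the remainder
            if (j == 0) exit 1
            print substr(rest, 1, j - 1)
        }
    ' "$1"
}
```

Usage would be: between_awk file. Note that awk processes backslash escapes in -v assignments, so patterns containing backslashes would need extra care.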
Regards,
Ed.
Re: selecting text between two patterns
on 06.09.2007 07:46:35 by onkelheinz
"Ed Morton" wrote in message
news:eoCdnXjGT6jc_kbbnZ2dnUVZ_oGjnZ2d@comcast.com...
>
> awk -v RS="PATTERN1|PATTERN2" 'NR==2' file
>
> depending on whether or not PATTERN1 or PATTERN2 can occur before PATTERN1
> in your real input.
>
> Regards,
>
> Ed.
And what is the meaning of 'NR==2' in that awk expression?
Regards,
Heinz
Re: selecting text between two patterns
on 06.09.2007 10:46:50 by Ed Morton
Heinz Müller wrote:
> "Ed Morton" wrote in message
> news:eoCdnXjGT6jc_kbbnZ2dnUVZ_oGjnZ2d@comcast.com...
>
>>awk -v RS="PATTERN1|PATTERN2" 'NR==2' file
>>
>>depending on whether or not PATTERN1 or PATTERN2 can occur before PATTERN1
>>in your real input.
>>
>>Regards,
>>
>>Ed.
>
>
> And what is the meaning of 'NR==2' in that awk expression?
It's a test for the second record in the file, so if you have a file like:
abc
PATTERN1
def
ghi
PATTERN2
klm
then since I'm specifying that the Record Separator (RS) is the RE
"PATTERN1 or PATTERN2" the first record will be
abc
and the second will be:
def
ghi
and the third will be:
klm
So, by testing for the Number of Records (NR) equal to 2, I'm selecting
that second record. Since I don't specify any action to take when that
condition (NR==2) is true, awk uses the default action which is to just
print the record for which that condition is true.
Ed.
Re: selecting text between two patterns
on 06.09.2007 21:56:18 by onkelheinz
"Ed Morton" wrote in message
news:_YudnXwzhO_nI0LbnZ2dnUVZ_oSnnZ2d@comcast.com...
>
> It's a test for the second record in the file, so if you have a file like:
>
> abc
> PATTERN1
> def
> ghi
> PATTERN2
> klm
>
> then since I'm specifying that the Record Separator (RS) is the RE
> "PATTERN1 or PATTERN2" the first record will be
>
> abc
>
> and the second will be:
>
> def
> ghi
>
> and the third will be:
>
> klm
>
> So, by testing for the Number of Records (NR) equal to 2, I'm selecting
> that second record. Since I don't specify any action to take when that
> condition (NR==2) is true, awk uses the default action which is to just
> print the record for which that condition is true.
>
> Ed.
Thank you for the detailed explanation!!
Re: selecting text between two patterns
on 07.09.2007 22:10:05 by William Park
gin_g wrote:
> Hey... I'm trying to extract text between two patterns. Ideally,
> if I had two patterns:
> PATTERN1
> PATTERN2
> I'd like to get back the text that is between the first occurrence of
> those patterns, not including the patterns
> For instance, if I had the text:
>
> this is one line
> this is PATTERN1 another line
> hello
> this is somethingPATTERN2 else
> this is the last line
>
> I would want this result
>
> another line
> hello
> this is something
>
> Or if I had this text:
> this PATTERN1is one big line. But it isPATTERN2 not very long.
>
> I would want this result:
> is one big line. But it is
If you're brave, you can try my Bash extension:
http://freshmeat.net/projects/bashdiff/
http://home.eol.ca/~parkw/index.html#strinterval
strinterval [-r] string begin end [submatch]
Extract substring, delimited by non-overlapping BEGIN and END patterns. By
default, the patterns are simple strings, but regex(7) patterns can be used
with the -r option. Returns success (0) if patterns are found, or failure (1)
if patterns are not found. When patterns are found, there are 5 segments
to consider:
string --> submatch=( prefix BEGIN middle END suffix )
All 5 segments are returned in array variable SUBMATCH (if given) which is
always flushed first.
Eg.
a='this PATTERN1is one big line. But it isPATTERN2 not very long'
strinterval "$a" "PATTERN1" "PATTERN2" submatch
declare -p submatch
==> submatch='([0]="this " [1]="PATTERN1" [2]="is one big line. But it is"
[3]="PATTERN2" [4]=" not very long")'
So, you want
echo "${submatch[2]}"
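For anyone without the extension, the same five-way split can be approximated with standard parameter expansion. This is only a rough sketch assuming POSIX sh and literal (non-regex) patterns; the function name split_interval and the variables prefix, middle and suffix are hypothetical and are not part of BashDiff:

```shell
# split_interval STRING BEGIN END
# On success, set the globals prefix, middle and suffix; BEGIN and END
# themselves are the caller's arguments. Returns 1 if END does not follow BEGIN.
split_interval() {
    case $1 in
        *"$2"*"$3"*) ;;        # BEGIN is followed by END somewhere later
        *) return 1 ;;
    esac
    prefix=${1%%"$2"*}         # text before the first BEGIN
    middle=${1#*"$2"}          # text after the first BEGIN ...
    suffix=${middle#*"$3"}     # ... whose tail after the first END is the suffix
    middle=${middle%%"$3"*}    # and whose head before that END is the middle
}

a='this PATTERN1is one big line. But it isPATTERN2 not very long'
if split_interval "$a" PATTERN1 PATTERN2; then
    printf '[%s]\n' "$prefix" PATTERN1 "$middle" PATTERN2 "$suffix"
fi
```

The wanted text is then in "$middle", which plays the role of ${submatch[2]} above.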
--
William Park, Toronto, Canada
BashDiff: Super Bash shell
http://freshmeat.net/projects/bashdiff/