Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 30.10.2007 22:31:08 by dba user
Dear Group,

I have a file with the content in the following format:

Junk...
Junk...

Heading P01
column1 column2 multiline text
CA1001 10 This is a multiline
text spanning two lines

CA1005 12 This is a multiline
text spanning three
lines

CA1008 11 This is a single line text

Heading P02
column1 column2
CA2001 10
CA2003 11
CA2005 12

Heading P03
Junk..
Junk..

I would like to list all the values under "Heading P01" for the same
column1 in a single line

CA1001 10 This is a multiline text spanning two lines
CA1005 12 This is a multiline text spanning three lines
CA1008 11 This is a single line text

Note: The column1 values will always have "CA" as the starting
character.

Appreciate your help in finding a solution using awk or perl or
sed ...

Thank you!!!!
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 30.10.2007 22:54:04 by Cyrus Kriticos
da. Ram wrote:
>
> I have a file with the content in the following format:
>
> Junk...
> Junk...
>
> Heading P01
> column1 column2 multiline text
> CA1001 10 This is a multiline
> text spanning two lines
>
> CA1005 12 This is a multiline
> text spanning three
> lines
>
> CA1008 11 This is a single line text
>
> Heading P02
> column1 column2
> CA2001 10
> CA2003 11
> CA2005 12
>
> Heading P03
> Junk..
> Junk..
>
> I would like to list all the values under "Heading P01" for the same
> column1 in a single line
>
> CA1001 10 This is a multiline text spanning two lines
> CA1005 12 This is a multiline text spanning three lines
> CA1008 11 This is a single line text
>
> Note: The column1 values will always have "CA" as the starting
> character.
>
> Appreciate your help in finding a solution using awk or perl or
> sed ...
[GNU sed]
Something like this?
$ sed -n "/^CA/{:X;N;s/\n//;/^$/bX;p}" file.txt
CA1001 10 This is a multiline text spanning two lines
CA1005 12 This is a multiline text spanning three
CA1008 11 This is a single line text
CA2001 10CA2003 11
CA2005 12
--
Best regards | Be nice to America or they'll bring democracy to
Cyrus | your country.
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 30.10.2007 23:19:12 by Michael Tosch
awk '/^Heading P01/{x=1} /^Heading P02/{x=0} x==0{next}
/^CA/,/^$/{printf "%s",$0}/^$/{print}' file
--
Michael Tosch @ hp : com
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 30.10.2007 23:59:05 by dba user
Thanks so much for the neat solution. Would it be possible to add the
heading ID to the combined line?
I tried the following, but the heading gets added not just at
the beginning but before every segment of the broken line.
I am trying to figure out a way to get the heading ID added once per
combined line.
awk '/^Heading P01/{x=1;p=$2} /^Heading P02/{x=0} x==0{next}
/^CA/,/^$/{printf " %s %s",p,$0}/^$/{print}' file
P01 CA1001 10 This is a multiline P01 text spanning two lines P01
P01 CA1005 12 This is a multiline P01 text spanning three P01 lines P01
P01 CA1008 11 This is a single line text P01
Desired output
P01 CA1001 10 This is a multiline text spanning two lines
P01 CA1005 12 This is a multiline text spanning three lines
P01 CA1008 11 This is a single line text
BTW, what does the "print" at the end of the command do?
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 31.10.2007 00:42:35 by Michael Tosch
awk '/^Heading P01/{x=1;p=$2} /^Heading P02/{x=0} x==0{next}
/^CA/{printf " %s ",p} /^CA/,/^$/{printf "%s",$0} /^$/{print}' file
The print at the end prints a newline character.
(More precisely: it prints the current line with a newline, but the
current line is empty).
printf "%s" prints without a newline.
--
Michael Tosch @ hp : com
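A minimal sketch of the same idea, spelled out as a commented script for readers who find the one-liner dense. It is not the command posted above: it buffers each record in a variable instead of using a range pattern, and it puts a single space between joined lines, whereas the one-liner relies on whatever whitespace the input lines already carry. The function name `join_p01` and the awk variable names are made up for illustration, and it assumes records are separated by blank lines, as in the sample. It uses an awk function, so it needs nawk, gawk, or another POSIX awk.

```shell
# Hypothetical helper: join the multi-line "CA" records found under
# "Heading P01" and prefix each joined record with the heading ID.
join_p01() {
  awk '
    # Print the buffered record, if any, and empty the buffer.
    function flush() { if (rec != "") print rec; rec = "" }

    /^Heading /  { flush(); keep = ($2 == "P01"); id = $2; next }
    !keep        { next }                            # outside the wanted section
    /^CA/        { flush(); rec = id " " $0; next }  # a new record starts here
    /^[ \t]*$/   { flush(); next }                   # a blank line ends the record
    rec != ""    { rec = rec " " $0 }                # continuation line
    END          { flush() }
  ' "$@"
}
```

Called as `join_p01 file`, it prints one line per record; a line such as the column header, which appears before any `CA` line, is skipped because the buffer is still empty at that point.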
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 31.10.2007 01:19:45 by dba user
Thanks so much! The solution works great.
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 31.10.2007 02:39:18 by krahnj
$ echo "Junk...
Junk...

Heading P01
column1 column2 multiline text
CA1001 10 This is a multiline
text spanning two lines

CA1005 12 This is a multiline
text spanning three
lines

CA1008 11 This is a single line text

Heading P02
column1 column2
CA2001 10
CA2003 11
CA2005 12

Heading P03
Junk..
Junk..
" | perl -ln00e's/^.+\n(?=CA)//s,y/\n//d,print,if/Heading P01/../Heading P02/and!/Heading P02/'
CA1001 10 This is a multiline text spanning two lines
CA1005 12 This is a multiline text spanning three lines
CA1008 11 This is a single line text
John
--
use Perl;
program
fulfillment
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 31.10.2007 14:53:42 by Miguel Lobos
I'm looking to do something similar, but I'm not yet up to snuff enough on
awk or perl to figure it out on my own.
Anyway, I'm looking to match a particular string in a line, then grab
this line and the next 5 after it so I can parse out some other parameters.
Here's a sample of what I need to pull out of the file (matching on
'GHLR665', then pulling this line plus the next 5):
1193616363 XXXXXX46D00 CM GHLR665 OCT28 18:59:54 7090 INFO Table GHLRVLR Resource Limitation
1193616363 Operation: Update Location
1193616363 VLR number: 551178313920
1193616363 Description: Table GHLRVLR is about to reach its 6000 Maximum Capacity.
1193616363 Space Left: 0 (6000)
1193616363 Action: Use QVLRACT in HLRADMIN to identify the inactive VLRs.
Every line in this log file starts with a 10-digit number (e.g.
1193616363), which may or may not be the same value on each line.
The lines before and after the block I'm trying to capture and write into a
single line / record contain just the 10-digit number,
followed by some whitespace and a newline (UNIX style, I think).
Any suggestions would be very much appreciated!
Mike
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 31.10.2007 15:34:25 by Ed Morton
The general mechanism to pull out N lines starting at a pattern is:
awk '/pattern/{c=N}c&&c--' file
so, if you want the line containing GHLR665 plus the subsequent 5 lines, you'd do:
awk '/GHLR665/{c=6}c&&c--' file
If you want to look for your pattern in a specific field (e.g. it's in the 4th
field in your sample input) then you'd do:
awk '$4 ~ /GHLR665/{c=6}c&&c--' file
or
awk '$4 == "GHLR665"{c=6}c&&c--' file
if you want an exact string comparison rather than an RE comparison.
Regards,
Ed.
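Since the stated goal was one output record per event, the countdown idiom above can be extended to join the captured lines rather than print them one by one. What follows is only a rough sketch under that assumption: the function name `grab_joined`, its argument order, and the awk variable names are invented for illustration, and the test is written as `c > 0` instead of `c&&c--`.

```shell
# Hypothetical helper: grab_joined PATTERN N [FILE...]
# Prints each line matching PATTERN together with the N-1 lines that
# follow it, joined into a single space-separated output line.
grab_joined() {
  pat=$1 n=$2
  shift 2
  awk -v pat="$pat" -v n="$n" '
    $0 ~ pat { c = n }                  # (re)start the countdown on a match
    c > 0 {
      printf "%s%s", (started ? " " : ""), $0   # no newline yet
      started = 1
      if (--c == 0) { print ""; started = 0 }   # last line of the group
    }
    END { if (started) print "" }       # input ended in the middle of a group
  ' "$@"
}
```

For the log sample in this thread that would be `grab_joined GHLR665 6 file`. A second match that arrives while a countdown is still running simply restarts the count and keeps appending to the same output line.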
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 31.10.2007 16:46:08 by Miguel Lobos
Ed,
Thanks! I'm running into some silly syntax errors, but thanks for the
explanation of the logic, that should get me going the right
direction.
Thanks Again,
Mike
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 31.10.2007 18:28:54 by Michael Tosch
Miguel Lobos wrote:
> Ed,
>
> Thanks! I'm running into some silly syntax errors, but thanks for the
> explanation of the logic, that should get me going the right
> direction.
>
> Thanks Again,
>
> Mike
>
old awks need
awk '/pattern/{c=N} c>0&&c-->0' file
or
awk '/pattern/{c=N} c>0{c--;print}' file
--
Michael Tosch @ hp : com
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 01.11.2007 02:54:27 by Miguel Lobos
Michael,
Thanks, you saved me some time and a little banging my head against
the cubicle wall. Apparently the version of awk in Solaris 10 is an
'old' awk -- the last of your examples is the one that did the trick.
Regards and Thank You Again,
Mike
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 01.11.2007 07:59:21 by Ed Morton
On 10/31/2007 8:54 PM, Miguel Lobos wrote:
> Michael,
>
> Thanks, you saved me some time and a little banging my head against
> the cubicle wall. Apparently the version of awk in Solaris 10 is an
> 'old' awk -- the last of you examples is the one that did the trick.
>
> Regards and Thank You Again,
>
> Mike
>
Absolutely do not do anything to accommodate old, broken awk on Solaris. Use GNU
awk (gawk), New awk (nawk), or /usr/xpg4/bin/awk instead.
By the way, this is netnews, not a web forum, so you should leave enough context
in each post so it stands alone.
Ed
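One way to follow this advice without touching anything under /usr/bin is to let a script pick its awk at run time. The snippet below is a sketch only: `pick_awk` is a made-up name, and the search order (gawk, then nawk, then /usr/xpg4/bin/awk, then plain awk) is an assumption based on the implementations named above.

```shell
# Hypothetical helper: print the path of the first awk that can be found,
# preferring the implementations recommended above over the default awk.
pick_awk() {
  for candidate in gawk nawk /usr/xpg4/bin/awk awk; do
    if command -v "$candidate" >/dev/null 2>&1; then
      command -v "$candidate"
      return 0
    fi
  done
  return 1    # no awk found at all
}
```

A script could then run `AWK=$(pick_awk) || exit 1` once and call `"$AWK" '/GHLR665/{c=6}c&&c--' file` afterwards. On a system that has only the old awk, the fallback still returns it, so the one-liner would have to be written in the older style there.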
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 02.11.2007 01:07:18 by Miguel Lobos
On Nov 1, 2:59 am, Ed Morton wrote:
> On 10/31/2007 8:54 PM, Miguel Lobos wrote:
>
> > Michael,
>
> > Thanks, you saved me some time and a little banging my head against
> > the cubicle wall. Apparently the version of awk in Solaris 10 is an
> > 'old' awk -- the last of you examples is the one that did the trick.
>
> > Regards and Thank You Again,
>
> > Mike
>
> Absolutely do not do anything to accomodate old, broken awk on Solaris. Use GNU
> awk (gawk), New awk (nawk), or /usr/xpg4/bin/awk instead.
>
> By the way, this is netnews not a web forum so you should leave enough context
> in each post so it stands alone.
>
> Ed
Ed,
Thank you again for the advice, and all points taken! Now that I've
managed to finish my report, I'll work on getting a more modern awk on
my Ultra 45.
Mike
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 02.11.2007 15:41:47 by Michael Tosch
cd /usr/bin
ls -li awk nawk oawk
shows that awk is linked to oawk.
rm awk
ln nawk awk
and it will be linked to nawk.
This is the path that AT&T had prepared
but Sun has never dared to take.
We all should open service cases with Sun and urge them for an RFE.
--
Michael Tosch @ hp : com
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 04.11.2007 16:26:23 by Miguel Lobos
Michael,
Excellent! I'll be updating my system on Monday morning, though I've
considered going and grabbing gawk off of sunfreeware.com. Just to
have something to fall back on, I'm probably going to rename the
original awk rather than remove it. It's not that I'm
afraid, but I want a safety net in case something else I was doing
with the original awk breaks, until I get time to figure out how to
make it work with nawk or gawk.
Thanks again for all the wonderful suggestions, and helping me get on
the right track with this.
Regards,
Mike
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 05.11.2007 09:28:20 by Michael Tosch
Hmm, oawk should be backup enough.
But it may be wise to symlink awk to nawk, so an awk patch would replace
the awk symlink but not spoil nawk.
--
Michael Tosch @ hp : com
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 05.11.2007 20:16:55 by dba user
Thanks for all your help. I have an additional requirement now. Is it
possible to print only the 1st and last (text field) columns in
addition to the heading ID?
P01 CA1001 This is a multiline text spanning two lines
P01 CA1005 This is a multiline text spanning three lines P01
P01 CA1008 This is a single line text P01
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 06.11.2007 15:57:50 by Janis Papanagnou
da. Ram wrote:
[snip]
>
> Thanks for all your help. I have an additional requirement now. Is it
> possible to print only the 1st and last (text field) columns in
> addition to the heading ID.
awk '{print $1, $NF}'
will print the first and last field.
>
> P01 CA1001 This is a multiline text spanning two lines
> P01 CA1005 This is a multiline text spanning three lines P01
> P01 CA1008 This is a single line text P01
>
But how is the last field defined? From your example you seem to want
multiple fields that you want to extract. Is the larger space a
delimiter? Do the last fields start from a certain column number? In
the latter case use
awk '{print $1,substr($0,54)}'
Janis
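To make the difference between the two suggestions concrete: `$NF` is only the last whitespace-separated word, while `substr($0, n)` returns everything from character position n to the end of the line. Below is a small sketch of the fixed-position variant with the start column passed in as a parameter. The name `first_and_text` is hypothetical, and the column must be measured on the real data; the 54 used above was only an example.

```shell
# Hypothetical helper: first_and_text COL [FILE...]
# Prints the first field, then everything from character position COL
# (counting from 1) to the end of the line.
first_and_text() {
  col=$1
  shift
  awk -v col="$col" '{ print $1, substr($0, col) }' "$@"
}
```

For a line like `CA1001 10 This is text`, where the text starts at position 11, `first_and_text 11` prints `CA1001 This is text`, whereas `awk '{print $1, $NF}'` would print only `CA1001 text`.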
Re: Combine multiple line segment into one, when certain pattern is found - awk/sed/perl
on 08.11.2007 20:58:39 by dba user
Sorry, I missed the post earlier. Thanks for the suggestion and I will
try the substr option.
The last field is in fact a large text with spaces and tabs. It
always starts at a certain position and can span multiple lines
until the next record is found. The initial challenge was to get the
multiple lines combined into one.
Kind regards