Extracting Abbreviations In A File
on 28.09.2007 14:02:19 by derya.susman
Hi,
I want to extract abbreviations in a file. An abbreviation may consist
of capital letters and digits. How can I accomplish this? Since grep
returns lines, it does not help much. I could not make use of sed
either.
Thanks in advance.
Re: Extracting Abbreviations In A File
on 28.09.2007 14:05:34 by dozzie
On 28.09.2007, D. Susman wrote:
> I want to extract abbreviations in a file. An abbreviation may consist
> of capital letters and digits. How can I accomplish this? Since grep
> returns lines, it does not help much.
GNU grep has "-o" option.
> I could not make use of sed
> either.
Perl? AWK?
--
Secunia non olet.
Stanislaw Klekot
Re: Extracting Abbreviations In A File
on 28.09.2007 14:14:00 by derya.susman
On Sep 28, 3:05 pm, "Stachu 'Dozzie' K." wrote:
> On 28.09.2007, D. Susman wrote:
>
> > I want to extract abbreviations in a file. An abbreviation may consist
> > of capital letters and digits. How can I accomplish this? Since grep
> > returns lines, it does not help much.
>
> GNU grep has "-o" option.
>
> > I could not make use of sed
> > either.
>
> Perl? AWK?
>
> --
> Secunia non olet.
> Stanislaw Klekot
I am using the standard grep. Is there a way for plain grep to return
words?
Solutions based on awk would also be appreciated.
Re: Extracting Abbreviations In A File
on 28.09.2007 14:34:02 by William James
On Sep 28, 7:02 am, "D. Susman" wrote:
> Hi,
>
> I want to extract abbreviations in a file. An abbreviation may consist
> of capital letters and digits. How can I accomplish this? Since grep
> returns lines, it does not help much. I could not make use of sed
> either.
>
> Thanks in advance.
Assuming only 1 abbreviation in a line:
awk 'match($0,/[A-Z][A-Z0-9]+/){
print substr($0,RSTART,RLENGTH)}' myfile
Re: Extracting Abbreviations In A File
on 28.09.2007 14:50:11 by William James
On Sep 28, 7:02 am, "D. Susman" wrote:
> Hi,
>
> I want to extract abbreviations in a file. An abbreviation may consist
> of capital letters and digits. How can I accomplish this? Since grep
> returns lines, it does not help much. I could not make use of sed
> either.
>
> Thanks in advance.
For any number of abbreviations in a line (gawk only, since RT is a gawk extension):
gawk 'BEGIN{RS="[A-Z][A-Z0-9]+"}RT!=""{print RT}' file
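If gawk is not available, the same idea can probably be done in plain awk by calling match() in a loop and chopping off the part of the line already scanned. This is only a sketch: the function name extract_abbrevs is made up for the example, and LC_ALL=C is an assumption added so that [A-Z] means ASCII capitals only.

```shell
# extract_abbrevs (hypothetical name): print every run matching
# [A-Z][A-Z0-9]+ on its own line, any number of matches per input line.
# Uses only match(), substr(), RSTART and RLENGTH, so no gawk extensions.
extract_abbrevs() {
  LC_ALL=C awk '{
    line = $0
    # Keep matching until no abbreviation is left in the remainder.
    while (match(line, /[A-Z][A-Z0-9]+/)) {
      print substr(line, RSTART, RLENGTH)
      # Drop everything up to and including the match just printed.
      line = substr(line, RSTART + RLENGTH)
    }
  }' "$@"
}

# Demo on a sample line (the \t is a literal tab).
printf 'IBM adasd \tCMMI UML RUP A3 xxxSQLyyy\n' | extract_abbrevs
```

On that sample line it prints IBM, CMMI, UML, RUP, A3 and SQL, one per line. SQL is included because, like the gawk one-liner, this does not require any delimiters around a match.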
Re: Extracting Abbreviations In A File
on 28.09.2007 14:57:41 by Tiago Peczenyj
My two cents:
$ cat -A filename # ^I is a tab
IBM adasd ^ICMMI UML RUP A3 xxxSQLyyy$
$ awk -v RS='[ \t]' '/\<[A-Z][A-Z0-9]+\>/' filename
IBM
CMMI
UML
RUP
A3
$ grep -oE '\b[A-Z][A-Z0-9]+\b' filename
IBM
CMMI
UML
RUP
A3
Best Regards
Tiago
On Sep 28, 9:02 am, "D. Susman" wrote:
> Hi,
>
> I want to extract abbreviations in a file. An abbreviation may consist
> of capital letters and digits. How can I accomplish this? Since grep
> returns lines, it does not help much. I could not make use of sed
> either.
>
> Thanks in advance.
Re: Extracting Abbreviations In A File
on 28.09.2007 15:49:54 by Maxwell Lol
"D. Susman" writes:
> Hi,
>
> I want to extract abbreviations in a file. An abbreviation may consist
> of capital letters and digits. How can I accomplish this? Since grep
> returns lines, it does not help much. I could not make use of sed
> either.
>
> Thanks in advance.
Here's another option
tr -dcs 'A-Z 0-9\n' ' '
Note that it will also output single digits. You may need to use grep to add
extra conditions, e.g.
tr -dcs 'A-Z 0-9\n' ' '
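The grep stage is not visible in the command as posted, so what follows is only a guess at what such an extra condition might look like, not the poster's original pipeline. The function name abbrevs_tr and the LC_ALL=C settings are assumptions made for the sketch.

```shell
# abbrevs_tr (hypothetical name): read text on stdin, print candidate
# abbreviations one per line.
abbrevs_tr() {
  # 1. Delete everything except A-Z, space, 0-9 and newline, and
  #    squeeze runs of spaces (the tr command from the post).
  # 2. Put each remaining token on its own line.
  # 3. Keep only tokens that start with a capital letter and are at
  #    least two characters long, which drops lone digits and letters.
  LC_ALL=C tr -dcs 'A-Z 0-9\n' ' ' |
    tr ' ' '\n' |
    LC_ALL=C grep -E '^[A-Z][A-Z0-9]+$'
}

# Demo: the lone "7" and "X" are filtered out by the grep stage.
printf 'IBM adasd \tCMMI UML 7 RUP A3 X\n' | abbrevs_tr
```

This prints IBM, CMMI, UML, RUP and A3. Because the first tr deletes lowercase letters instead of splitting on them, capitals from inside one mixed-case word get glued together (aBcD becomes BD), so this approach is rough on ordinary prose.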
Re: Extracting Abbreviations In A File
on 28.09.2007 15:54:52 by Ed Morton
D. Susman wrote:
> Hi,
>
> I want to extract abbreviations in a file. An abbreviation may consist
> of capital letters and digits. How can I accomplish this? Since grep
> returns lines, it does not help much. I could not make use of sed
> either.
>
> Thanks in advance.
>
In the text:
abcDEFghi
is "DEF" an abbreviation? If not, why not (i.e. what delimiters are
required around an "abbreviation")?
Ed.
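To make the difference concrete, here is a small comparison, assuming a grep that supports -o (GNU grep does; it is not in the POSIX specification). The function names loose_match and whole_word_match are invented for the example.

```shell
# loose_match (hypothetical name): any run of [A-Z][A-Z0-9]+,
# even in the middle of a longer word.
loose_match() {
  LC_ALL=C grep -oE '[A-Z][A-Z0-9]+'
}

# whole_word_match (hypothetical name): the same pattern, but -w only
# accepts matches that are not adjacent to letters, digits or underscores.
whole_word_match() {
  LC_ALL=C grep -owE '[A-Z][A-Z0-9]+'
}

sample='abcDEFghi IBM A3'
echo "without delimiters:"
printf '%s\n' "$sample" | loose_match
echo "as whole words:"
printf '%s\n' "$sample" | whole_word_match
```

Without delimiters the sample yields DEF, IBM and A3; as whole words it yields only IBM and A3. Which of the two is right depends on the answer to the question above.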
Re: Extracting Abbreviations In A File
on 29.09.2007 19:11:07 by dozzie
On 28.09.2007, D. Susman wrote:
> On Sep 28, 3:05 pm, "Stachu 'Dozzie' K." wrote:
>> On 28.09.2007, D. Susman wrote:
>>
>> > I want to extract abbreviations in a file. An abbreviation may consist
>> > of capital letters and digits. How can I accomplish this? Since grep
>> > returns lines, it does not help much.
>>
>> GNU grep has "-o" option.
>>
>> > I could not make use of sed
>> > either.
>>
>> Perl? AWK?
>>
>> --
>> Secunia non olet.
>> Stanislaw Klekot
>
> I am using the standard grep. Is there a way for plain grep to return
> words?
I don't think so. I don't even know what "plain grep" is, since systems
implementing SUSv3 add their own functionality on top of the grep
specification.
--
Secunia non olet.
Stanislaw Klekot
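A possible workaround when grep has no -o: since grep can only return whole lines, put every word on a line of its own first and then anchor the pattern at both ends. This is a sketch that assumes words are separated by spaces or tabs only; the function name abbrev_words is made up, and LC_ALL=C is an assumption so that [A-Z] means ASCII capitals.

```shell
# abbrev_words (hypothetical name): print whitespace-delimited words
# that consist of a capital letter followed by capitals or digits.
abbrev_words() {
  # Turn spaces and tabs into newlines, squeezing repeats so that
  # no empty lines are produced, then keep only fully matching lines.
  LC_ALL=C tr -s ' \t' '\n\n' |
    LC_ALL=C grep -E '^[A-Z][A-Z0-9]+$'
}

# Demo on a sample line (the \t is a literal tab).
printf 'IBM adasd \tCMMI UML RUP A3 xxxSQLyyy\n' | abbrev_words
```

This prints IBM, CMMI, UML, RUP and A3, and skips xxxSQLyyy because the whole word has to match. A word with punctuation attached, such as "IBM,", would be skipped as well unless the tr step is extended to split on punctuation too.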