how to find lines where the second column is a number and bigger than 50000

on 12.04.2008 09:21:38 by robertchen117

I have a file, like this:

host1 52233
host2 failed scan
host3 2333

I want to get all lines whose second column is a number bigger
than 50000; please let me know how to do it.

If I use the following command, it also gets the "failed" hosts.
cat host_file | nawk '{if($2 >=50000){print $1, $2}}'

So how do I check whether the column is a number? Please help me.

Re: how to find lines where the second column is a number and bigger than 50000

on 12.04.2008 10:42:16 by Janis Papanagnou

robertchen117@gmail.com wrote:
> I have a file, like this:
>
> host1 52233
> host2 failed scan
> host3 2333
>
> I want to get all lines whose second column is a number bigger
> than 50000; please let me know how to do it.
>
> If I use the following command, it also gets the "failed" hosts.
> cat host_file | nawk '{if($2 >=50000){print $1, $2}}'
>
> So how do I check whether the column is a number? Please help me.

Note that cat is unnecessary here, and the explicit if is unnecessary
in awk in such cases. To fix the failing line, make sure that awk does
a numeric comparison (instead of a lexical one) by converting the
field explicitly to a number.

awk '0 + $2 >= 50000 {print $1, $2}' hostfile
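
For anyone wondering why the original command also prints the "failed"
host: in awk, a field that does not look like a number is compared as a
string, and the string "failed" sorts after "50000". A minimal sketch of
the difference, using the sample data from the question (the sample
function and the variable names are only for illustration):

```shell
# Sample data from the question (hypothetical helper for this sketch).
sample() {
    printf '%s\n' 'host1 52233' 'host2 failed scan' 'host3 2333'
}

# Plain $2: "failed" is not numeric, so awk falls back to a string
# comparison against "50000", and "f" sorts after "5".
string_cmp=$(sample | awk '$2 >= 50000 { print $1, $2 }')

# 0 + $2 forces a numeric comparison; "failed" converts to 0.
numeric_cmp=$(sample | awk '0 + $2 >= 50000 { print $1, $2 }')

printf 'plain field:\n%s\n\nforced numeric:\n%s\n' "$string_cmp" "$numeric_cmp"
```

The first result should contain host1 and host2, the second only host1.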


Janis

Re: how to find lines where the second column is a number and bigger than 50000

on 12.04.2008 10:44:54 by Janis Papanagnou

Janis Papanagnou wrote:
> robertchen117@gmail.com wrote:
>
>> I have a file, like this:
>>
>> host1 52233
>> host2 failed scan
>> host3 2333
>>
>> I want to get all lines whose second column is a number bigger
>> than 50000; please let me know how to do it.
>>
>> If I use the following command, it also gets the "failed" hosts.
>> cat host_file | nawk '{if($2 >=50000){print $1, $2}}'
>>
>> So how do I check whether the column is a number? Please help me.
>
>
> Note that cat is unnecessary here, and the explicit if is unnecessary
> in awk in such cases. To fix the failing line, make sure that awk does
> a numeric comparison (instead of a lexical one) by converting the
> field explicitly to a number.

Oops; ignore my last sentence. :-/

>
> awk '0 + $2 >= 50000 {print $1, $2}' hostfile
>
>
> Janis

Re: how to find lines where the second column is a number and bigger than 50000

on 12.04.2008 10:57:48 by PK

robertchen117@gmail.com wrote:

> I have a file, like this:
>
> host1 52233
> host2 failed scan
> host3 2333
>
> I want to get all lines whose second column is a number bigger
> than 50000; please let me know how to do it.
>
> If I use the following command, it also gets the "failed" hosts.
> cat host_file | nawk '{if($2 >=50000){print $1, $2}}'
>
> So how do I check whether the column is a number? Please help me.

awk '($2~/[[:digit:]]+/) && ($2 >=50000)' host_file

or

awk '($2!~/failed/) && ($2 >=50000)' host_file

--
All the commands are tested with bash and GNU tools, so they may use
nonstandard features. I try to mention when something is nonstandard (if
I'm aware of that), but I may miss something. Corrections are welcome.

Re: how to find lines where the second column is a number and bigger than 50000

on 12.04.2008 12:02:15 by PK

Janis Papanagnou wrote:

> awk '0 + $2 >= 50000 {print $1, $2}' hostfile

This is what I thought at first too, but if the file is

host1 52233
host2 51000failed scan
host3 2333

awk prints the second line too. I guess that's because the standard says
that string->number conversions should behave as in atof().
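
The prefix conversion can be checked directly. A small sketch, assuming
any POSIX-style awk (the glued function and the variable names are made
up for illustration; the data is the modified file above):

```shell
# Modified sample data with a number glued to a word (hypothetical helper).
glued() {
    printf '%s\n' 'host1 52233' 'host2 51000failed scan' 'host3 2333'
}

# Only the leading numeric prefix is converted, as atof() would do.
prefix_value=$(awk 'BEGIN { print 0 + "51000failed" }')
# A string with no numeric prefix converts to 0.
word_value=$(awk 'BEGIN { print 0 + "failed" }')

# So the forced numeric test accepts the glued line as well.
matched=$(glued | awk '0 + $2 >= 50000 { print $1, $2 }')

printf '%s %s\n%s\n' "$prefix_value" "$word_value" "$matched"
```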


Re: how to find lines where the second column is a number and bigger than 50000

on 12.04.2008 12:04:14 by PK

pk wrote:

> awk '($2~/[[:digit:]]+/) && ($2 >=50000)' host_file

This should actually be

awk '($2~/^[[:digit:]]+$/) && ($2 >=50000)' host_file
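
A quick sketch of why the anchors matter, using the file with the glued
field from earlier in the thread. It writes [0-9] instead of
[[:digit:]] only on the assumption that some older awk implementations
lack POSIX character classes; the glued function and the variable names
are hypothetical:

```shell
# Sample data with a number glued to a word (hypothetical helper).
glued() {
    printf '%s\n' 'host1 52233' 'host2 51000failed scan' 'host3 2333'
}

# Unanchored: "51000failed" contains digits, so the regex matches, and
# the comparison is then done as a string ("51..." sorts after "50...").
unanchored=$(glued | awk '($2 ~ /[0-9]+/) && ($2 >= 50000)')

# Anchored: the whole field must consist of digits only.
anchored=$(glued | awk '($2 ~ /^[0-9]+$/) && ($2 >= 50000)')

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$unanchored" "$anchored"
```

With no action block, awk prints the whole matching line, so the
unanchored version also prints the host2 line.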


Re: how to find lines where the second column is a number and bigger than 50000

on 12.04.2008 12:38:28 by Janis Papanagnou

pk wrote:
> Janis Papanagnou wrote:
>
>> awk '0 + $2 >= 50000 {print $1, $2}' hostfile
>
> This is what I thought at first too, but if the file is
>
> host1 52233
> host2 51000failed scan
> host3 2333

Good point. It is still open whether that malformed line should match
or not; if there's just a delimiter missing between '51000' and
'failed', it might be intentional to extract that line, but then (e.g.
if you post-process the output) it might be necessary to strip the
string part from the two glued data fields. As mentioned in another
thread, if the example data leaves room for many possibilities, I'd
propose solutions and wait for any necessary clarification.
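
If the glued line is meant to be extracted, one way to strip the string
part is to print the converted number instead of the raw field. A sketch
under that assumption (the glued function and the variable name are
hypothetical):

```shell
# Sample data with a number glued to a word (hypothetical helper).
glued() {
    printf '%s\n' 'host1 52233' 'host2 51000failed scan' 'host3 2333'
}

# Printing 0 + $2 emits only the numeric prefix, so "51000failed"
# comes out as 51000.
stripped=$(glued | awk '0 + $2 >= 50000 { print $1, 0 + $2 }')

printf '%s\n' "$stripped"
```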

>
> awk prints the second line too. I guess that's because the standard says
> that string->number conversions should behave as in atof().
>

Janis