Re: uniq without sort <-------------- GURU NEEDED

on 28.01.2008 01:09:46 by Dan Mercer

wrote in message news:5462c3ef-cb53-40d8-8a96-bbf624408300@v4g2000hsf.googlegroups.com...
cat input|awk '!_[$0]++' <---- I am interested in understanding


UUOC (useless use of cat). awk can read the file itself:

awk '!_[$0]++' input

awk operates on each line of input with a succession of

pattern { action }

statements. The pattern is a test that evaluates either to 0 (false)
or non-zero (true). Only when the pattern evaluates to true is the action
performed. If a statement has no action, the default action is performed.
That action is "print", which is shorthand for "print $0".
Empty strings and unset values evaluate to 0. So,

_[$0]++

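accesses a hash element whose key is the contents of the current line.
(The array is named "_", which is a legal awk identifier.)
If no such element exists, one is instantiated with an uninitialized value,
which acts as '' in string context and 0 in numeric context.
So the first time a line is seen, the expression evaluates to 0. Since the ++ is
postfix, the element is incremented to 1 only after its old value has been taken
as the value of the expression. That value is then negated by the !, so the
pattern is nonzero and therefore true, and the line is printed. On every later
occurrence of the same line the element is already nonzero, the negation yields 0,
which is false, and the line is skipped. So only the first occurrence of each
line is ever printed.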
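
To make the steps above concrete, here is a rough sketch that spells the
one-liner out as an explicit action and checks that both forms print the
same thing. The sample data and the names "sample", "short", "long", "seen"
and "count" are only illustrative; they are not from the original post.

```shell
# Hypothetical sample data: the lines "b" and "a" each appear twice.
sample=$(printf 'b\na\nb\nc\na')

# The one-liner as posted, reading from a pipe instead of a file.
short=$(printf '%s\n' "$sample" | awk '!_[$0]++')

# The same logic written out; "seen" stands in for the array "_".
long=$(printf '%s\n' "$sample" | awk '{
    count = seen[$0]       # an unset element reads as 0 in numeric context
    seen[$0] = count + 1   # the increment that postfix ++ performs afterwards
    if (count == 0)        # same test as !count: true only on first sighting
        print $0           # the default action when none is given
}')

printf '%s\n' "$short"     # prints b, a, c (one per line, input order kept)
```

The long form is only there for reading; in practice the one-liner does
the same job.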

Dan Mercer


this and other one liners. The multiliners are easy to follow, so plz
dont worry about them.


On Jan 26, 6:11 pm, gnuist...@gmail.com wrote:
> I wish one of you had explained this magic of $0 with awk, or perl and
> dissected your solution.
>
> cheers
>
> On Jan 25, 12:57 pm, jpd wrote:
>
>
>
> > Begin
> > On Fri, 25 Jan 2008 21:38:04 +0100 (CET),
>
> > Andrew Smallshaw wrote:
> > > On 2008-01-25, gnuist...@gmail.com wrote:
> > >> I want uniq without sorting the initial order.
>
> > > This works fine on this NetBSD system, though I make no claims as
> > > to portability:
>
> > > cat input|nl -b a -w 10 -n rz|sort -t "" -u -k 1.12|sort|colrm 1 16
>
> > I can offer something with less processes in the pipe:
>
> > cat input|awk '!_[$0]++'
>
> > The next question then would be: which is faster, more efficient, etc.?
>
> > --
> > j p d (at) d s b (dot) t u d e l f t (dot) n l .
> > This message was originally posted on Usenet in plain text.
> > Any other representation, additions, or changes do not have my
> > consent and may be a violation of international copyright law.