Re: uniq without sort <-------------- GURU NEEDED
on 25.01.2008 11:35:55 by William James
On Jan 25, 1:50 am, Thomas Troeger wrote:
> > This is a tough problem, and needs a guru.
>
> This problem is not tough.
>
> > So it is TRIVIAL with sort.
>
> > I want uniq without sorting the initial order.
>
> > The algorithm is this. For every line, look above if there is another
> > line like it. If so, then ignore it. If not, then output it. I am
> > sure, I can spend some time to write this in C. But what is the
> > solution using shell ? This way I can get an output that preserves the
> > order of first occurrence. It is needed in many problems.
>
> > Thanks to the star who can help
> > gnuist
>
> I'm not a star, but this will do the job:
>
> cat somefile | awk '{ if (!h[$0]) { print $0; h[$0]=1 } }' > unique_lines
Why are you using "cat"? Can't you guess that
awk can read a file? Reading a file does not require
magical powers.
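A minimal sketch of the same awk program reading the file itself,
with no "cat". The temporary directory and the sample contents are
made up for illustration; "somefile" and "unique_lines" are the names
used in the quoted command.

```shell
# Hypothetical sample data in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' b a b c a > "$tmpdir/somefile"

# Same program as in the quoted post, but awk opens the file itself.
# h[$0] remembers every line already printed, so only the first
# occurrence of each line is written, in the original order.
awk '{ if (!h[$0]) { print $0; h[$0]=1 } }' "$tmpdir/somefile" > "$tmpdir/unique_lines"

cat "$tmpdir/unique_lines"
```

With the sample input above, the lines b, a, c should come out in that
order.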
Re: uniq without sort <-------------- GURU NEEDED
on 25.01.2008 11:43:41 by Stephane CHAZELAS
On Fri, 25 Jan 2008 02:35:55 -0800 (PST), William James wrote:
[...]
>> I'm not a star, but this will do the job:
>>
>> cat somefile | awk '{ if (!h[$0]) { print $0; h[$0]=1 } }' > unique_lines
>
> Why are you using "cat"? Can't you guess that
> awk can read a file? Reading a file does not require
> magical powers.
Note that awk will read from the pipe the same way as it would
read from a file.
So the question is not so much whether awk can read a file as
whether it can open a file for reading.
Even though it can, there are some pitfalls associated with
letting awk open a file.
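Typically
awk '{...}' "$file"
may fail if $file contains a "=" sign, because awk treats an
operand of the form var=value as a variable assignment rather
than as a file name.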
So, it's a good idea to let the *shell* open the file for
reading instead:
awk '{...}' < "$file"
(the drawback of that approach is that awk's FILENAME variable
is not filled in with the name of the file).
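A small sketch of that pitfall, assuming a made-up file name "a=b.txt"
in a scratch directory. The name has to be relative for the problem to
show up, since an operand starting with "/" cannot be mistaken for a
variable assignment.

```shell
# Hypothetical file whose name looks like an awk assignment.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
file='a=b.txt'
printf '%s\n' one two one > "$file"

# awk takes the operand as the assignment a="b.txt". No file operand
# is left, so it reads standard input instead (here /dev/null) and
# prints nothing.
as_operand=$(awk '{ print }' "$file" < /dev/null)

# Letting the shell open the file avoids the problem entirely.
via_redirect=$(awk '{ print }' < "$file")

printf 'as operand:   [%s]\n' "$as_operand"
printf 'via redirect: [%s]\n' "$via_redirect"
```

Without the /dev/null redirection, the first command would sit waiting
for input on the terminal, which is how the failure usually shows up
in practice.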
In any case using "cat" here doesn't make sense as cat is the
command to concatenate files.
--
Stephane