Re: uniq without sort <-------------- GURU NEEDED

on 29.01.2008 22:04:12 by Dann Corbit

On Jan 24, 6:45 pm, gnuist...@gmail.com wrote:
> This is a tough problem, and needs a guru.
>
> I know it is very easy to find uniq or non-uniq lines if you scramble
> all of them and sort them. It's trivial:
>
> echo -e "a\nc\nd\nb\nc\nd" | sort | uniq
>
> $ echo -e "a\nc\nd\nb\nc\nd"
> a
> c
> d
> b
> c
> d
>
> $ echo -e "a\nc\nd\nb\nc\nd"|sort|uniq
> a
> b
> c
> d
>
> So it is TRIVIAL with sort.
>
> I want uniq without sorting the initial order.
>
> The algorithm is this. For every line, look above if there is another
> line like it. If so, then ignore it. If not, then output it. I am
> sure, I can spend some time to write this in C. But what is the
> solution using shell ? This way I can get an output that preserves the
> order of first occurrence. It is needed in many problems.

If you put the lines in a hash table, you can check almost instantly
whether a line has been seen before. If you find the entry in the
table, don't print it; otherwise add it to the table and print it.
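
For what it's worth, here is a minimal sketch of that hash table idea
in shell, assuming awk is available. awk's associative arrays behave
like a hash table keyed by the line text. The function name
uniq_unsorted is made up for illustration and is not from the
original post.

```shell
# Hypothetical helper (name chosen for this sketch only).
# seen[] is an awk associative array, i.e. a hash table keyed by the line.
# seen[$0]++ yields 0 the first time a line appears, so !seen[$0]++ is
# true only for first occurrences, and awk prints exactly those lines.
uniq_unsorted() {
    awk '!seen[$0]++'
}

# Same input as in the quoted question; prints a, c, d, b (one per line).
printf 'a\nc\nd\nb\nc\nd\n' | uniq_unsorted
```

This keeps the order of first occurrence, at the cost of holding every
distinct line in memory.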

Now, you may be able to do it in bash or csh, but why put the square
peg through the round hole?

For that matter, what is wrong with just piping the output to uniq?

P.S.
There is no C question that I can detect, so why crosspost to
news:comp.lang.c?