Avoid DISTINCT In Queries!

Avoid DISTINCT In Queries!

am 25.09.2005 15:57:37 von Arpan

The article at http://www.aspfaq.com/show.asp?id=2424 lists some tips
to make ASP pages more efficient. One of the tips under the sub-topic
"General" under the topic "Database-specific" says:

Avoid DISTINCT unless it is absolutely necessary. You can use GROUP BY,
self-JOINS or sub-queries in many cases.

Why is the author saying not to use DISTINCT in SQL queries? What are
the disadvantages of using DISTINCT?

Thanks,

Arpan

Re: Avoid DISTINCT In Queries!

am 26.09.2005 19:00:27 von Chris Hohmann

"Arpan" wrote in message
news:1127656657.237087.203650@g47g2000cwa.googlegroups.com.. .
> The article at http://www.aspfaq.com/show.asp?id=2424 lists some tips
> to make ASP pages more efficient. One of the tips under the sub-topic
> "General" under the topic "Database-specific" says:
>
> Avoid DISTINCT unless it is absolutely necessary. You can use GROUP BY,
> self-JOINS or sub-queries in many cases.
>
> Why is the author saying not to use DISTINCT in SQL queries? What are
> the disadvantages of using DISTINCT?

DISTINCT requires the intermediary result set to be sorted so that
duplicates can be removed. Sorting is an expensive operation. Thus, SQL
statements using DISTINCT often do not perform as well as their GROUP BY,
self-join, or subquery equivalents. Here's a thread I found when I Googled
"avoid distinct sql":

http://groups.google.com/group/microsoft.public.sqlserver.pr ogramming/browse_thread/thread/8a8204ce96473caa/96d2ddd47277 16dd

Re: Avoid DISTINCT In Queries!

am 27.09.2005 08:58:17 von Arpan

That's indeed a very good reason to avoid DISTINCT in queries but why
is sorting an expensive operation? Is it because the optimizer has to
go to each & every record & then arrange the records as per the sorting
critera specified?

Thanks Chris,

Regards,

Arpan

Re: Avoid DISTINCT In Queries!

am 29.09.2005 06:27:28 von mdkersey

Arpan wrote:
> The article at http://www.aspfaq.com/show.asp?id=2424 lists some tips
> to make ASP pages more efficient. One of the tips under the sub-topic
> "General" under the topic "Database-specific" says:
>
> Avoid DISTINCT unless it is absolutely necessary. You can use GROUP BY,
> self-JOINS or sub-queries in many cases.
>
> Why is the author saying not to use DISTINCT in SQL queries? What are
> the disadvantages of using DISTINCT?

IIRC in SQL Server 7 and prior, both GROUP BY and DISTINCT were
implemented by first sorting the result set.

With the advent of SQL Server 2000 the query optimizer can optionally do
a GROUP BY or DISTINCT by hashing into "buckets" (each bucket holds
records with matching GROUP BY fields [or all fields in the case of
DISTINCT]).

I mention the above because I don't think sorting is necessarily the
culprit when comparing DISTINCT with GROUP BY.

Now GROUP BY requires only that each record be aggregated with other
records that match on the GROUP BY fields. But DISTINCT may require
comparing all fields of the record [since if all fields match then the
record is a duplicate]. So DISTINCT may require more computation but
possibly less memory than GROUP BY.