Optimization Suggestion: GROUP / ORDER on a connecting field in JOIN

Optimization Suggestion: GROUP / ORDER on a connecting field in JOIN

am 14.09.2004 18:51:02 von Lorderon

Hi...

There might be an unnecessary temporary file used when using GROUP or ORDER
on a connecting field.
I'll give an example:

Let's say I got 3 tables, with their primary keys:
1. tbl_1 -> id1
2. tbl_2 -> id2+another_id2
3. tbl_3 -> id3+another_id3

Let's say running the next query (runs without a temporary table):

SELECT *
FROM tbl_1
INNER JOIN tbl_2 ON tbl_1.id1=tbl_2.id2
INNER JOIN tbl_3 ON tbl_2.id2=tbl_3.id3 AND
tbl_2.another_id2=tbl_3.another_id3
WHERE {where_condition};

And say that MySQL optimizes this query going first on the tables in this
order: tbl_2->tbl_1->tbl_3.
Now, I add GROUP BY on tbl_1.id_1, like this:
SELECT *
FROM tbl_1
INNER JOIN tbl_2 ON tbl_1.id1=tbl_2.id2
INNER JOIN tbl_3 ON tbl_2.id2=tbl_3.id3 AND
tbl_2.another_id2=tbl_3.another_id3
WHERE {where_condition}
GROUP BY tbl_1.id1;

If MySQL decides to go over the tables in the same order as before
(tbl_2->tbl_1->tbl_3), then it uses a temporary table to group by tbl_1.id1,
and the query takes longer to be executed (especially when LIMIT is also
used).
If I use GROUP BY tbl_2.id_2, then it doesn't use a temporary table, since
it goes first over tbl_2.
But... In this case:
GROUP BY tbl_1.id1 = GROUP BY tbl_2.id2 = GROUP BY tbl_3.id3
since those fields are defined to be equal in the JOIN conditions... So why
using a temporary table if I could not hit the exact connecting field of the
first table MySQL goes over ???
(MySQL can decide to go over the tables in different sequences according to
the query itself and the tables datas, which means that I can not always hit
the exact connecting field of the first table MySQL goes over).


Any comments would be accepted gratefully...

-thanks, Lorderon.



--
MySQL Bugs Mailing List
For list archives: http://lists.mysql.com/bugs
To unsubscribe: http://lists.mysql.com/bugs?unsub=gcdmb-bugs@m.gmane.org

Re: Optimization Suggestion: GROUP / ORDER on a connecting field in JOIN

am 25.09.2004 09:34:04 von Sergei Golubchik

Hi!

On Sep 14, Lorderon wrote:
> Hi...
>
> There might be an unnecessary temporary file used when using GROUP or ORDER
> on a connecting field.
> I'll give an example:
>
> Let's say I got 3 tables, with their primary keys:
> 1. tbl_1 -> id1
> 2. tbl_2 -> id2+another_id2
> 3. tbl_3 -> id3+another_id3

....

> Now, I add GROUP BY on tbl_1.id_1, like this:
> SELECT *
> FROM tbl_1
> INNER JOIN tbl_2 ON tbl_1.id1=tbl_2.id2
> INNER JOIN tbl_3 ON tbl_2.id2=tbl_3.id3 AND
> tbl_2.another_id2=tbl_3.another_id3
> WHERE {where_condition}
> GROUP BY tbl_1.id1;
>
> If MySQL decides to go over the tables in the same order as before
> (tbl_2->tbl_1->tbl_3), then it uses a temporary table to group by tbl_1.id1,
> and the query takes longer to be executed (especially when LIMIT is also
> used).
> If I use GROUP BY tbl_2.id_2, then it doesn't use a temporary table, since
> it goes first over tbl_2.
> But... In this case:
> GROUP BY tbl_1.id1 = GROUP BY tbl_2.id2 = GROUP BY tbl_3.id3
> since those fields are defined to be equal in the JOIN conditions... So why
> using a temporary table if I could not hit the exact connecting field of the
> first table MySQL goes over ???

You're right.
It's the optimization MySQL should be able to do.
It was not easy to do up until recently, but now he have reworked the
code that deals with these equality conditions so it should be easy to
do this optimization. And it should be done relatively soon.

(by the way, what you've reported isn't a bug, so it should've been sent
to internals@ not to bugs@ :)

Regards,
Sergei

--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, Senior Software Developer
/_/ /_/\_, /___/\___\_\___/ Osnabrueck, Germany
<___/ www.mysql.com

--
MySQL Bugs Mailing List
For list archives: http://lists.mysql.com/bugs
To unsubscribe: http://lists.mysql.com/bugs?unsub=gcdmb-bugs@m.gmane.org