Slow queries when using left join

Slow queries when using left join

am 19.03.2010 09:14:27 von olav.morkrid

Dear MySQL forum.

I have performance problems when using "left join x" combined with
"where x.y is null", in particularily when combining three tables this
way.

Please contact me by e-mail if you are familiar with these issues and
know how to eliminate slow queries.

I would really appreciate your help.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Slow queries when using left join

am 19.03.2010 12:14:03 von Johan De Meersman

--0016e647601eedff68048225714e
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

2010/3/19 Olav M=F8rkrid

> Dear MySQL forum.
>
> I have performance problems when using "left join x" combined with
> "where x.y is null", in particularily when combining three tables this
> way.
>

With a left join, particularly when you're using *is (not) null*, you can't
use index selecting on your right table. That is, you're bound to do a
tablescan on what is essentially the cartesian product of your tables.

Every additional table only compounds the problem. 100x100 is 10.000., but
100x100x100 is 1.000.000.

Avoid left joins whenever possible - in some cases it's quicker to split ou=
t
the complex query and implement it in code with loops - not always, though,
you'll have to apply some elbow grease to find out the optimal solution.

The most recent example of this, was a hierarchical lookup query in Drupal'=
s
taxonomy module: the hierarchy table was left-joined to itself five times.
Execution time on an unloaded machine was 0.54 seconds. By doing individual
lookups in a code loop until I got to the top level, I replaced that query
with a maximum of five (and usually less) 0.00 second ones over an existing
covering index.


Another thing - and maybe one you should look at first, is wether you can
add more selective where-clauses for you base table. That doesn't always
stop at the actual data you want, either. Another example from here: for a
radiostation, there was a multiple left-join query to display the last 20
played songs on the homepage. However, the playlist table keeps growing, so
I got the website people to agree that it's pretty unlikely that songs from
yesterday end up in those 20: we added an index on the playdate and selecte=
d
on that. Boom, execution time down from 0.35 to 0.01. In addition, killing
off old playlist items would've been very beneficial, but this was not an
option due to business requirements. Shame, I love to delete people's data
:-D


And, of course, check if you have indexes on the major parts of your where
clause. Selectivity brings speed.


I seem to have the order of obviousness in this mail wrong, though. Please
read it from bottom to top :-)


--=20
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel

--0016e647601eedff68048225714e--