Slow queries when using left join
am 19.03.2010 09:14:27 von olav.morkrid
Dear MySQL forum.
I have performance problems when using "left join x" combined with
"where x.y is null", in particularily when combining three tables this
way.
Please contact me by e-mail if you are familiar with these issues and
know how to eliminate slow queries.
I would really appreciate your help.
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
Re: Slow queries when using left join
am 19.03.2010 12:14:03 von Johan De Meersman
--0016e647601eedff68048225714e
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
2010/3/19 Olav M=F8rkrid
> Dear MySQL forum.
>
> I have performance problems when using "left join x" combined with
> "where x.y is null", in particularily when combining three tables this
> way.
>
With a left join, particularly when you're using *is (not) null*, you can't
use index selecting on your right table. That is, you're bound to do a
tablescan on what is essentially the cartesian product of your tables.
Every additional table only compounds the problem. 100x100 is 10.000., but
100x100x100 is 1.000.000.
Avoid left joins whenever possible - in some cases it's quicker to split ou=
t
the complex query and implement it in code with loops - not always, though,
you'll have to apply some elbow grease to find out the optimal solution.
The most recent example of this, was a hierarchical lookup query in Drupal'=
s
taxonomy module: the hierarchy table was left-joined to itself five times.
Execution time on an unloaded machine was 0.54 seconds. By doing individual
lookups in a code loop until I got to the top level, I replaced that query
with a maximum of five (and usually less) 0.00 second ones over an existing
covering index.
Another thing - and maybe one you should look at first, is wether you can
add more selective where-clauses for you base table. That doesn't always
stop at the actual data you want, either. Another example from here: for a
radiostation, there was a multiple left-join query to display the last 20
played songs on the homepage. However, the playlist table keeps growing, so
I got the website people to agree that it's pretty unlikely that songs from
yesterday end up in those 20: we added an index on the playdate and selecte=
d
on that. Boom, execution time down from 0.35 to 0.01. In addition, killing
off old playlist items would've been very beneficial, but this was not an
option due to business requirements. Shame, I love to delete people's data
:-D
And, of course, check if you have indexes on the major parts of your where
clause. Selectivity brings speed.
I seem to have the order of obviousness in this mail wrong, though. Please
read it from bottom to top :-)
--=20
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel
--0016e647601eedff68048225714e--