Subquery performance slow to non-existent...

Subquery performance slow to non-existent...

am 19.01.2010 20:54:46 von Nick Torenvliet

--001485f454508354c5047d89d79b
Content-Type: text/plain; charset=ISO-8859-1

Hey all...

I am having a problem with sub-queries that I cannot trouble shoot.

I run query a:
select symbol from names where market like 'NYMEX' and name like 'natural
gas {%';

and get 168 names that I manually insert into query b:
select * from endOfDayData where endOfDayData.market like 'NYMEX' and
endOfDayData.symbol IN
("NGF09","NGF10","NGF11","NGF12","NGF13","NGF14","NGF15","NG F16","NGF17","NGF18","NGF19","NGF20",
"NGF21","NGF22","NGG09","NGG10","NGG11","NGG12","NGG13","NGG 14","NGG15","NGG16","NGG17","NGG18","NGG19","NGG20","NGG21", "NGG22","NGH09","NGH10","NGH11","NGH12","NGH13","NGH14","NGH 15","NGH16",
"NGH17","NGH18","NGH19","NGH20","NGH21","NGH22","NGJ09","NGJ 10","NGJ11","NGJ12","NGJ13","NGJ14","NGJ15","NGJ16","NGJ17", "NGJ18","NGJ19","NGJ20","NGJ21","NGJ22","NGK09","NGK10","NGK 11","NGK12",
"NGK13","NGK14","NGK15","NGK16","NGK17","NGK18","NGK19","NGK 20","NGK21","NGK22","NGM09","NGM10","NGM11","NGM12","NGM13", "NGM14","NGM15","NGM16","NGM17","NGM18","NGM19","NGM20","NGM 21","NGM22",
"NGN09","NGN10","NGN11","NGN12","NGN13","NGN14","NGN15","NGN 16","NGN17","NGN18","NGN19","NGN20","NGN21","NGN22","NGQ09", "NGQ10","NGQ11","NGQ12","NGQ13","NGQ14","NGQ15","NGQ16","NGQ 17","NGQ18",
"NGQ19","NGQ20","NGQ21","NGQ22","NGU09","NGU10","NGU11","NGU 12","NGU13","NGU14","NGU15","NGU16","NGU17","NGU18","NGU19", "NGU20","NGU21","NGU22","NGV09","NGV10","NGV11","NGV12","NGV 13","NGV14",
"NGV15","NGV16","NGV17","NGV18","NGV19","NGV20","NGV21","NGV 22","NGX09","NGX10","NGX11","NGX12","NGX13","NGX14","NGX15", "NGX16","NGX17","NGX18","NGX19","NGX20","NGX21","NGX22","NGZ 09","NGZ10",
"NGZ11","NGZ12","NGZ13","NGZ14","NGZ15","NGZ16","NGZ17","NGZ 18","NGZ19","NGZ20","NGZ21","NGZ22");

Running query b gives me a result set as follows:

| 2010-01-15 | NYMEX | NGZ22 | 8.9620 | 8.9680 | 8.9620 | 8.9680
| 0 |
+------------+--------+--------+---------+---------+-------- -+---------+--------+
86765 rows in set (4.46 sec)

I then because I want to generalize query b I continue by creating query c
as follows:
mysql> select * from endOfDayData where endOfDayData.market like 'NYMEX'
and endOfDayData.symbol IN (select names.symbol from names where
names.market like 'NYMEX' and names.name like 'natural gas {%');

Query c seems to have good syntax as neither the command line mysql
interface nor the gui spit it back but it literally takes forever to run;
I've waited at least twenty minutes and not got anything back. I'm running
Ubuntu 9.10 on an intel core i7 with 4GB RAM and 12GB swap... the process
monitor doesn't even flinch so I'm not thinking hardware here... why is the
sub-query running so slow?

Thanks for you help!!

Nick

--001485f454508354c5047d89d79b--

Re: Subquery performance slow to non-existent...

am 19.01.2010 21:06:30 von Dan Nelson

In the last episode (Jan 19), Nick Torenvliet said:
> Hey all...
>
> I am having a problem with sub-queries that I cannot trouble shoot.
>
> I run query a:
> select symbol from names where market like 'NYMEX' and name like 'natural
> gas {%';
>
> and get 168 names that I manually insert into query b:
> select * from endOfDayData where endOfDayData.market like 'NYMEX' and
> endOfDayData.symbol IN
> ("NGF09","NGF10","NGF11","NGF12","NGF13","NGF14","NGF15","NG F16","NGF17","NGF18","NGF19","NGF20",
> "NGF21","NGF22","NGG09","NGG10","NGG11","NGG12","NGG13","NGG 14","NGG15","NGG16","NGG17","NGG18","NGG19","NGG20","NGG21", "NGG22","NGH09","NGH10","NGH11","NGH12","NGH13","NGH14","NGH 15","NGH16",
> "NGH17","NGH18","NGH19","NGH20","NGH21","NGH22","NGJ09","NGJ 10","NGJ11","NGJ12","NGJ13","NGJ14","NGJ15","NGJ16","NGJ17", "NGJ18","NGJ19","NGJ20","NGJ21","NGJ22","NGK09","NGK10","NGK 11","NGK12",
> "NGK13","NGK14","NGK15","NGK16","NGK17","NGK18","NGK19","NGK 20","NGK21","NGK22","NGM09","NGM10","NGM11","NGM12","NGM13", "NGM14","NGM15","NGM16","NGM17","NGM18","NGM19","NGM20","NGM 21","NGM22",
> "NGN09","NGN10","NGN11","NGN12","NGN13","NGN14","NGN15","NGN 16","NGN17","NGN18","NGN19","NGN20","NGN21","NGN22","NGQ09", "NGQ10","NGQ11","NGQ12","NGQ13","NGQ14","NGQ15","NGQ16","NGQ 17","NGQ18",
> "NGQ19","NGQ20","NGQ21","NGQ22","NGU09","NGU10","NGU11","NGU 12","NGU13","NGU14","NGU15","NGU16","NGU17","NGU18","NGU19", "NGU20","NGU21","NGU22","NGV09","NGV10","NGV11","NGV12","NGV 13","NGV14",
> "NGV15","NGV16","NGV17","NGV18","NGV19","NGV20","NGV21","NGV 22","NGX09","NGX10","NGX11","NGX12","NGX13","NGX14","NGX15", "NGX16","NGX17","NGX18","NGX19","NGX20","NGX21","NGX22","NGZ 09","NGZ10",
> "NGZ11","NGZ12","NGZ13","NGZ14","NGZ15","NGZ16","NGZ17","NGZ 18","NGZ19","NGZ20","NGZ21","NGZ22");
>
> Running query b gives me a result set as follows:
>
> | 2010-01-15 | NYMEX | NGZ22 | 8.9620 | 8.9680 | 8.9620 | 8.9680
> | 0 |
> +------------+--------+--------+---------+---------+-------- -+---------+--------+
> 86765 rows in set (4.46 sec)
>
> I then because I want to generalize query b I continue by creating query c
> as follows:
> mysql> select * from endOfDayData where endOfDayData.market like 'NYMEX'
> and endOfDayData.symbol IN (select names.symbol from names where
> names.market like 'NYMEX' and names.name like 'natural gas {%');
>
> Query c seems to have good syntax as neither the command line mysql
> interface nor the gui spit it back but it literally takes forever to run;
> I've waited at least twenty minutes and not got anything back. I'm running
> Ubuntu 9.10 on an intel core i7 with 4GB RAM and 12GB swap... the process
> monitor doesn't even flinch so I'm not thinking hardware here... why is the
> sub-query running so slow?

MySQL's subquery optimizer is pretty bad. I bet if you explain that query,
mysql thinks the subquery is dependant. That means that it will run the
subquery for each row of the outer query, even though it's obviously not
going to change from row to row. Your best bet for now is to do what you're
currently doing with queries A and B.

--
Dan Nelson
dnelson@allantgroup.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Subquery performance slow to non-existent...

am 19.01.2010 21:42:28 von Shawn Green

Nick Torenvliet wrote:
> Hey all...
>
> I am having a problem with sub-queries that I cannot trouble shoot.
>
> I run query a:
> select symbol from names where market like 'NYMEX' and name like 'natural
> gas {%';
>
> and get 168 names that I manually insert into query b:
> select * from endOfDayData where endOfDayData.market like 'NYMEX' and
> endOfDayData.symbol IN
> (... big list clipped ...);
>
> Running query b gives me a result set as follows:
>
> | 2010-01-15 | NYMEX | NGZ22 | 8.9620 | 8.9680 | 8.9620 | 8.9680
> | 0 |
> +------------+--------+--------+---------+---------+-------- -+---------+--------+
> 86765 rows in set (4.46 sec)
>
> I then because I want to generalize query b I continue by creating query c
> as follows:
> mysql> select * from endOfDayData where endOfDayData.market like 'NYMEX'
> and endOfDayData.symbol IN (select names.symbol from names where
> names.market like 'NYMEX' and names.name like 'natural gas {%');
>
> Query c seems to have good syntax as neither the command line mysql
> interface nor the gui spit it back but it literally takes forever to run;
> I've waited at least twenty minutes and not got anything back. I'm running
> Ubuntu 9.10 on an intel core i7 with 4GB RAM and 12GB swap... the process
> monitor doesn't even flinch so I'm not thinking hardware here... why is the
> sub-query running so slow?
>
> Thanks for you help!!

Subqueries are also not indexed. Even if this is an independent
subquery, the optimizer will still need to scan the results of each
subquery for every line of the outer query.

Try rewriting this as a joinl

SELECT eod.*
from endOfDayData eod
INNER JOIN names n
on n.symbol = eod.symbol
and n.market like 'NYMEX'
and names.name like 'natural gas {%';

Try that and see what a difference it makes.

--
Shawn Green, MySQL Senior Support Engineer
Sun Microsystems, Inc.
Office: Blountville, TN



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Subquery performance slow to non-existent...

am 19.01.2010 22:22:09 von Peter Brawley

--------------060702050301000609080208
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Nick,

>select * from endOfDayData where endOfDayData.market like 'NYMEX'
>and endOfDayData.symbol IN (select names.symbol from names where
>names.market like 'NYMEX' and names.name like 'natural gas {%');
>
>Query c seems to have good syntax as neither the command line mysql
>interface nor the gui spit it back but it literally takes forever to run;

IN(SELECT...) is slow in MySQL; for alternatives see "The unbearable slowness of IN()" at http://www.artfulsoftware.com/infotree/queries.php.

Try something like this?

select *
from endOfDayData d
join names n using (symbol)
where d.market = 'NYMEX' and 'natural gas {' = substr( n.name,1,13);

PB

-----

Nick Torenvliet wrote:
> Hey all...
>
> I am having a problem with sub-queries that I cannot trouble shoot.
>
> I run query a:
> select symbol from names where market like 'NYMEX' and name like 'natural
> gas {%';
>
> and get 168 names that I manually insert into query b:
> select * from endOfDayData where endOfDayData.market like 'NYMEX' and
> endOfDayData.symbol IN
> ("NGF09","NGF10","NGF11","NGF12","NGF13","NGF14","NGF15","NG F16","NGF17","NGF18","NGF19","NGF20",
> "NGF21","NGF22","NGG09","NGG10","NGG11","NGG12","NGG13","NGG 14","NGG15","NGG16","NGG17","NGG18","NGG19","NGG20","NGG21", "NGG22","NGH09","NGH10","NGH11","NGH12","NGH13","NGH14","NGH 15","NGH16",
> "NGH17","NGH18","NGH19","NGH20","NGH21","NGH22","NGJ09","NGJ 10","NGJ11","NGJ12","NGJ13","NGJ14","NGJ15","NGJ16","NGJ17", "NGJ18","NGJ19","NGJ20","NGJ21","NGJ22","NGK09","NGK10","NGK 11","NGK12",
> "NGK13","NGK14","NGK15","NGK16","NGK17","NGK18","NGK19","NGK 20","NGK21","NGK22","NGM09","NGM10","NGM11","NGM12","NGM13", "NGM14","NGM15","NGM16","NGM17","NGM18","NGM19","NGM20","NGM 21","NGM22",
> "NGN09","NGN10","NGN11","NGN12","NGN13","NGN14","NGN15","NGN 16","NGN17","NGN18","NGN19","NGN20","NGN21","NGN22","NGQ09", "NGQ10","NGQ11","NGQ12","NGQ13","NGQ14","NGQ15","NGQ16","NGQ 17","NGQ18",
> "NGQ19","NGQ20","NGQ21","NGQ22","NGU09","NGU10","NGU11","NGU 12","NGU13","NGU14","NGU15","NGU16","NGU17","NGU18","NGU19", "NGU20","NGU21","NGU22","NGV09","NGV10","NGV11","NGV12","NGV 13","NGV14",
> "NGV15","NGV16","NGV17","NGV18","NGV19","NGV20","NGV21","NGV 22","NGX09","NGX10","NGX11","NGX12","NGX13","NGX14","NGX15", "NGX16","NGX17","NGX18","NGX19","NGX20","NGX21","NGX22","NGZ 09","NGZ10",
> "NGZ11","NGZ12","NGZ13","NGZ14","NGZ15","NGZ16","NGZ17","NGZ 18","NGZ19","NGZ20","NGZ21","NGZ22");
>
> Running query b gives me a result set as follows:
>
> | 2010-01-15 | NYMEX | NGZ22 | 8.9620 | 8.9680 | 8.9620 | 8.9680
> | 0 |
> +------------+--------+--------+---------+---------+-------- -+---------+--------+
> 86765 rows in set (4.46 sec)
>
> I then because I want to generalize query b I continue by creating query c
> as follows:
> mysql> select * from endOfDayData where endOfDayData.market like 'NYMEX'
> and endOfDayData.symbol IN (select names.symbol from names where
> names.market like 'NYMEX' and names.name like 'natural gas {%');
>
> Query c seems to have good syntax as neither the command line mysql
> interface nor the gui spit it back but it literally takes forever to run;
> I've waited at least twenty minutes and not got anything back. I'm running
> Ubuntu 9.10 on an intel core i7 with 4GB RAM and 12GB swap... the process
> monitor doesn't even flinch so I'm not thinking hardware here... why is the
> sub-query running so slow?
>
> Thanks for you help!!
>
> Nick
>
>
> ------------------------------------------------------------ ------------
>
>
> No virus found in this incoming message.
> Checked by AVG - www.avg.com
> Version: 8.5.432 / Virus Database: 270.14.150/2632 - Release Date: 01/19/10 07:34:00
>
>

--------------060702050301000609080208--