optimizer bug in the index used by mysql/Innodb in the search

optimizer bug in the index used by mysql/Innodb in the search

am 03.03.2003 09:53:56 von rafarife

Description:

Hello Peter,

I have sent to ftp://support.mysql.com/pub/mysql/secret the table definition
and data (Giros.txt) in a compressed file named Giros.zip so you can
invetigate it.


You can see the following queries:


SELECT TIPO,DOC,NRE
FROM GIROS
WHERE (GIROS.TIPO='R' AND GIROS.DOC='ZA03003996' AND GIROS.NRE<'01/01')
OR (GIROS.TIPO='R' AND GIROS.DOC<'ZA03003996') OR GIROS.TIPO<'R'
ORDER BY GIROS.TIPO DESC, GIROS.DOC DESC,GIROS.NRE DESC LIMIT 1

InnoDB
Time: 0.20 secs.
Explain:
table type possible_keys key key_len ref rows Extra
GIROS range PRIMARY,TipoFeVCod TipoFeVCod 1 NULL 9417 Using where; Using index; Using filesort

MYISAM
Time:0.02 secs
Explain:
table type possible_keys key key_len ref rows Extra
GIROS range PRIMARY,TipoFeVCod PRIMARY 16 NULL 19472 Using where; Using index
---

SELECT TIPO,DOC,NRE
FROM GIROS
WHERE CONCAT(TIPO,DOC,NRE)<='RZA0300399601/01'
ORDER BY TIPO DESC,DOC DESC,NRE DESC LIMIT 1

INNODB/MyIsam Time: 0.02 secs.
INNODB/MyIsam Explain:
table type possible_keys key key_len ref rows Extra
GIROS index NULL PRIMARY 16 NULL 19516 Using where; Using index

Both MyIsam and InnoDB return the record in the same time and use the same index PRIMARY.
This query is the same as the above query.

---

Also, in this query (I deleted the third condition of the where clause TIPO<'R')

SELECT TIPO,DOC,NRE
FROM GIROS
WHERE (TIPO='R' AND DOC='ZA03003996' AND NRE<'01/01')
OR (TIPO='R' AND DOC<'ZA03003996')
ORDER BY TIPO DESC, DOC DESC,NRE DESC limit 1

Both MyIsam and InnoDB use the index PRIMARY.

InnoDB
Time:0.03 secs.
Explain:
table type possible_keys key key_len ref rows Extra
GIROS range PRIMARY,TipoFeVCod PRIMARY 1 Const 1933 Using where; Using index

MyIsam
Time: 0.02 secs.
table type possible_keys key key_len ref rows Extra
GIROS range PRIMARY,TipoFeVCod PRIMARY 16 Null 3910 Using where; Using index

I don´t understand why if I add the third condition: OR TIPO<'R' InnoDB isn´t still
using the PRIMARY INDEX.

---

Finally,

SELECT TIPO,DOC,NRE
FROM GIROS
WHERE (GIROS.TIPO='R' AND GIROS.DOC='ZA03003996' AND GIROS.NRE<'01/01')
OR (GIROS.TIPO='R' AND GIROS.DOC<'ZA03003996') OR GIROS.TIPO<'R'
ORDER BY GIROS.TIPO DESC, GIROS.DOC DESC,GIROS.NRE DESC

There is no LIMIT.

Table type MyIsam:
Returned records: 19486 in 0.59 secs.
Explain:
table type possible_keys key key_len ref rows Extra
GIROS range PRIMARY,TipoFeVCod PRIMARY 16 NULL 19472 Using where; Using index

Table type InnoDb:
Returned records: 19486 in 1.18 secs.
Explain:
table type possible_keys key key_len ref rows Extra
GIROS range PRIMARY,TipoFeVCod TipoFeVCod 1 NULL 9417 Using where; Using index; Using filesort

With InnoDB, the optimizer believes it must examine 9417 records, but actually it returns 19486 records,
the same records as MyIsam but twice slower, so I think the optimizer is okey in MyIsam and wrong in
InnoDB.


Thanks in advance,
Rafa

How-To-Repeat:
Select ... from giros ...

Fix:
-

Synopsis:optimizer bug in the index used by mysql/Innodb in the search

Submitter-Id:
Originator: Rafa
Organization: Pecomark
MySQL support: none
Severity: non-critical
Priority: medium
Category: mysqld-max-nt
Class: sw-bug
Release: mysqld 4.0.11 Gamma(InnoDB)

Exectutable: mysqld-max-nt
Environment: Pentium III-MMX, 500 MHZ, 540 MB
System: Windows 2000
Compiler: -
Architecture: i



____________________________________________________________ ______
The NEW Netscape 7.0 browser is now available. Upgrade now! http://channels.netscape.com/ns/browsers/download.jsp

Get your own FREE, personal Netscape Mail account today at http://webmail.netscape.com/

------------------------------------------------------------ ---------
Before posting, please check:
http://www.mysql.com/manual.php (the manual)
http://lists.mysql.com/ (the list archive)

To request this thread, e-mail
To unsubscribe, e-mail
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: optimizer bug in the index used by mysql/Innodb in the search

am 04.03.2003 10:41:38 von Peter Zaitsev

On Mon, 2003-03-03 at 11:53, rafarife@netscape.net wrote:
> Description:
>
> Hello Peter,
>
> I have sent to ftp://support.mysql.com/pub/mysql/secret the table definition
> and data (Giros.txt) in a compressed file named Giros.zip so you can
> invetigate it.

Thank you it is really helpful.

>
>
> You can see the following queries:
>
>
> SELECT TIPO,DOC,NRE
> FROM GIROS
> WHERE (GIROS.TIPO='R' AND GIROS.DOC='ZA03003996' AND GIROS.NRE<'01/01')
> OR (GIROS.TIPO='R' AND GIROS.DOC<'ZA03003996') OR GIROS.TIPO<'R'
> ORDER BY GIROS.TIPO DESC, GIROS.DOC DESC,GIROS.NRE DESC LIMIT 1
>
> InnoDB
> Time: 0.20 secs.
> Explain:
> table type possible_keys key key_len ref rows Extra
> GIROS range PRIMARY,TipoFeVCod TipoFeVCod 1 NULL 9417 Using where; Using index; Using filesort
>
> MYISAM
> Time:0.02 secs
> Explain:
> table type possible_keys key key_len ref rows Extra
> GIROS range PRIMARY,TipoFeVCod PRIMARY 16 NULL 19472 Using where; Using index
> ---

This looks like minor optimizer bug in Innodb.
Actually for this query for retrieving data you can use only GIROS.TIPO
keypart, due to OR GIROS.TIPO
TIPO is prefix for two keys:

PRIMARY KEY (`Tipo`,`Doc`,`NRe`),
KEY `TipoFeVCod` (`Tipo`,`FeV`,`Cod`),


Innodb selects second one as it thinks it is likely to estimated to have
better selectivity.

Thus this is wrong as both keys have the same selectivity for their
first key part by definition and PRIMARY KEY is always faster to scan
in Innodb, so it shall be given preference in such case.


The second issue is - innodb for some reason does not boost primary key
preference even if it can be used to retrieve rows in sorted order.
I would leave for Heikki to comment why it happens.

The workaround is just to help Innodb to select proper index:

mysql> explain SELECT TIPO,DOC,NRE
-> FROM GIROS FORCE INDEX(PRIMARY)
-> WHERE (GIROS.TIPO='R' AND GIROS.DOC='ZA03003996' AND
GIROS.NRE<'01/01')
-> OR (GIROS.TIPO='R' AND GIROS.DOC<'ZA03003996') OR
GIROS.TIPO<'R'
-> ORDER BY GIROS.TIPO DESC, GIROS.DOC DESC,GIROS.NRE DESC LIMIT
1;
+-------+-------+---------------+---------+---------+------+ -------+--------------------------+
| table | type | possible_keys | key | key_len | ref | rows |
Extra |
+-------+-------+---------------+---------+---------+------+ -------+--------------------------+
| GIROS | range | PRIMARY | PRIMARY | 16 | NULL | 16622 |
Using where; Using index |
+-------+-------+---------------+---------+---------+------+ -------+--------------------------+
1 row in set (0.00 sec)

Notice FORCE INDEX optimizer hint.
As you might see number of estimated rows matched is almost 2 times
large in this case but it is just inaccurate estimations of course.



>
> SELECT TIPO,DOC,NRE
> FROM GIROS
> WHERE CONCAT(TIPO,DOC,NRE)<='RZA0300399601/01'
> ORDER BY TIPO DESC,DOC DESC,NRE DESC LIMIT 1
>
> INNODB/MyIsam Time: 0.02 secs.
> INNODB/MyIsam Explain:
> table type possible_keys key key_len ref rows Extra
> GIROS index NULL PRIMARY 16 NULL 19516 Using where; Using index
>
> Both MyIsam and InnoDB return the record in the same time and use the same index PRIMARY.
> This query is the same as the above query.

Right. In this case there is no index which can be used to optimize
where clause, only to optimize ORDER BY with LIMIT.

>
> ---
>
> Also, in this query (I deleted the third condition of the where clause TIPO<'R')
>
> SELECT TIPO,DOC,NRE
> FROM GIROS
> WHERE (TIPO='R' AND DOC='ZA03003996' AND NRE<'01/01')
> OR (TIPO='R' AND DOC<'ZA03003996')
> ORDER BY TIPO DESC, DOC DESC,NRE DESC limit 1
>
> Both MyIsam and InnoDB use the index PRIMARY.

>
> InnoDB
> Time:0.03 secs.
> Explain:
> table type possible_keys key key_len ref rows Extra
> GIROS range PRIMARY,TipoFeVCod PRIMARY 1 Const 1933 Using where; Using index
>
> MyIsam
> Time: 0.02 secs.
> table type possible_keys key key_len ref rows Extra
> GIROS range PRIMARY,TipoFeVCod PRIMARY 16 Null 3910 Using where; Using index


The strange about this case is - why Innodb uses only first part of the
key, while TIPO,DOC shall be possible at least ?


Still performance is good enough as you quickly spot needed single row
while scanning the index.

The difference from the first query is different range which means
different stats for the optimizer.






--
MySQL 2003 Users Conf. -> http://www.mysql.com/events/uc2003/
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Peter Zaitsev
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, Full-Time Developer
/_/ /_/\_, /___/\___\_\___/ Moscow, Russia
<___/ www.mysql.com


------------------------------------------------------------ ---------
Before posting, please check:
http://www.mysql.com/manual.php (the manual)
http://lists.mysql.com/ (the list archive)

To request this thread, e-mail bugs-thread13891@lists.mysql.com
To unsubscribe, e-mail