Index selection problem
am 21.07.2009 11:52:45 von Morten
Hi, I have a table "orders" with the columns
item_id INT FK items(id)
customer_id INT FK customers(id)
status_id TINYINT -- Between 1 and 4 always
ordered_at DATETIME
delivered_at DATETIME
There are indexes:
index_a: (item_id, customer_id, status_id)
index_b: (item_id, status_id, ordered_at, delivered_at)
Given this query:
SELECT * FROM orders WHERE item_id = 9602 AND customer_id = 5531
AND status_id IN (1,2)
Then the key chosen is index_b. Same happens if I use (status_id = 1
OR status_id = 2). If I only check against one status_id, then the
"correct" index_a gets picked with ref const,const,const.
I'm not even doing a range scan on status_id and even if I were, it's
the last column in index_a. Since ordered_at and delivered_at are both
dates then index_b will have a very high selectivity. In reality,
index_b may make little sense, but I still don't understand why MySQL
would ever pick that when 3 columns in the query can use the covering
index_a
Can anyone give me some input on how to make sense of this?
Thanks,
Morten
select count(*) from orders where item_id = 9602 -> 4534 records
select count(*) from orders where item_id = 9602 and status_id IN
(1,2) -> 4181 records
select count(*) from orders where item_id = 9602 and customer_id =
5531 -> 1226 records
select count(*) from orders where item_id = 9602 and customer_id =
5531 and status_id IN (1,2) -> 1174 records
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
Re: Index selection problem
am 21.07.2009 15:27:41 von Johnny Withers
--0016364ed9ae0f7953046f37389e
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
MySQL is unable to use your index when you use IN and/or OR on yoru column.
If the query is slow, you should switch to a union:
SELECT * FROM orders WHERE item_id = 9602 AND customer_id = 5531 AND
status_id =1
UNION
SELECT * FROM orders WHERE item_id = 9602 AND customer_id = 5531 AND
status_id =2
On Tue, Jul 21, 2009 at 4:52 AM, Morten wrote:
>
> Hi, I have a table "orders" with the columns
>
> item_id INT FK items(id)
> customer_id INT FK customers(id)
> status_id TINYINT -- Between 1 and 4 always
> ordered_at DATETIME
> delivered_at DATETIME
>
> There are indexes:
>
> index_a: (item_id, customer_id, status_id)
> index_b: (item_id, status_id, ordered_at, delivered_at)
>
> Given this query:
>
> SELECT * FROM orders WHERE item_id = 9602 AND customer_id = 5531 AND
> status_id IN (1,2)
>
> Then the key chosen is index_b. Same happens if I use (status_id = 1 OR
> status_id = 2). If I only check against one status_id, then the "correct"
> index_a gets picked with ref const,const,const.
>
> I'm not even doing a range scan on status_id and even if I were, it's the
> last column in index_a. Since ordered_at and delivered_at are both dates
> then index_b will have a very high selectivity. In reality, index_b may make
> little sense, but I still don't understand why MySQL would ever pick that
> when 3 columns in the query can use the covering index_a
>
> Can anyone give me some input on how to make sense of this?
>
> Thanks,
>
> Morten
>
> select count(*) from orders where item_id = 9602 -> 4534 records
> select count(*) from orders where item_id = 9602 and status_id IN (1,2) ->
> 4181 records
> select count(*) from orders where item_id = 9602 and customer_id = 5531 ->
> 1226 records
> select count(*) from orders where item_id = 9602 and customer_id = 5531 and
> status_id IN (1,2) -> 1174 records
>
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql?unsub=johnny@pixelated.net
>
>
--
-----------------------------
Johnny Withers
601.209.4985
johnny@pixelated.net
--0016364ed9ae0f7953046f37389e--
Re: Index selection problem
am 21.07.2009 15:54:26 von Brent Baisley
Try doing a "SHOW INDEX FROM orders" and look at the cardinality
column. These are the stats MySQL uses to determine which index to
use. Sometimes they aren't always update properly and you may need to
run ANALYZE on the table.
But, you can also tell MySQL to use the index you want.
SELECT * FROM orders USE INDEX (index_a) WHERE ...
Brent Baisley
On Tue, Jul 21, 2009 at 5:52 AM, Morten wrote:
>
> Hi, I have a table "orders" with the columns
>
> =A0item_id INT FK items(id)
> =A0customer_id INT FK customers(id)
> =A0status_id TINYINT -- Between 1 and 4 always
> =A0ordered_at DATETIME
> =A0delivered_at DATETIME
>
> There are indexes:
>
> =A0index_a: (item_id, customer_id, status_id)
> =A0index_b: (item_id, status_id, ordered_at, delivered_at)
>
> Given this query:
>
> =A0SELECT * FROM orders WHERE item_id =3D 9602 AND customer_id =3D 5531 A=
ND
> status_id IN (1,2)
>
> Then the key chosen is index_b. Same happens if I use (status_id =3D 1 OR
> status_id =3D 2). If I only check against one status_id, then the "correc=
t"
> index_a gets picked with ref const,const,const.
>
> I'm not even doing a range scan on status_id and even if I were, it's the
> last column in index_a. Since ordered_at and delivered_at are both dates
> then index_b will have a very high selectivity. In reality, index_b may m=
ake
> little sense, but I still don't understand why MySQL would ever pick that
> when 3 columns in the query can use the covering index_a
>
> Can anyone give me some input on how to make sense of this?
>
> Thanks,
>
> Morten
>
> select count(*) from orders where item_id =3D 9602 -> 4534 records
> select count(*) from orders where item_id =3D 9602 and status_id IN (1,2)=
->
> 4181 records
> select count(*) from orders where item_id =3D 9602 and customer_id =3D 55=
31 ->
> 1226 records
> select count(*) from orders where item_id =3D 9602 and customer_id =3D 55=
31 and
> status_id IN (1,2) -> 1174 records
>
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: =A0 =A0http://lists.mysql.com/mysql?unsub=3Dbrenttech@gma=
il.com
>
>
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg
Re: Index selection problem
am 21.07.2009 19:42:35 von Morten Primdahl
On Jul 21, 2009, at 3:27 PM, Johnny Withers wrote:
> MySQL is unable to use your index when you use IN and/or OR on yoru =20=
> column.
Is this really true?
I'm reading "High Performance MySQL 2nd ed." these days and =20
specifically got the impression that using IN will allow usage of the =20=
index. The below quote is from the book, and the "multiple equality =20
condition" refers to an IN (...) expression.
"... we draw a distinction between ranges of values and multiple =20
equality conditions.The second query is a multiple equality condition, =20=
in our terminology. We=92re not just being picky: these two kinds of =20
index accesses perform differently. The range condition makes MySQL =20
ignore any further columns in the index, but the multiple equality =20
condition doesn=92t have that limitation."
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg
Re: Index selection problem
am 21.07.2009 19:43:52 von Morten Primdahl
The other index does have a way higher cardinality, but the query is
for 3 columns all of which are in the first index. I guess this is
just one of the situations where MySQL makes a wrong assessment.
On Jul 21, 2009, at 3:54 PM, Brent Baisley wrote:
> Try doing a "SHOW INDEX FROM orders" and look at the cardinality
> column. These are the stats MySQL uses to determine which index to
> use. Sometimes they aren't always update properly and you may need to
> run ANALYZE on the table.
>
> But, you can also tell MySQL to use the index you want.
> SELECT * FROM orders USE INDEX (index_a) WHERE ...
>
> Brent Baisley
>
> On Tue, Jul 21, 2009 at 5:52 AM, Morten wrote:
>>
>> Hi, I have a table "orders" with the columns
>>
>> item_id INT FK items(id)
>> customer_id INT FK customers(id)
>> status_id TINYINT -- Between 1 and 4 always
>> ordered_at DATETIME
>> delivered_at DATETIME
>>
>> There are indexes:
>>
>> index_a: (item_id, customer_id, status_id)
>> index_b: (item_id, status_id, ordered_at, delivered_at)
>>
>> Given this query:
>>
>> SELECT * FROM orders WHERE item_id = 9602 AND customer_id = 5531 AND
>> status_id IN (1,2)
>>
>> Then the key chosen is index_b. Same happens if I use (status_id =
>> 1 OR
>> status_id = 2). If I only check against one status_id, then the
>> "correct"
>> index_a gets picked with ref const,const,const.
>>
>> I'm not even doing a range scan on status_id and even if I were,
>> it's the
>> last column in index_a. Since ordered_at and delivered_at are both
>> dates
>> then index_b will have a very high selectivity. In reality, index_b
>> may make
>> little sense, but I still don't understand why MySQL would ever
>> pick that
>> when 3 columns in the query can use the covering index_a
>>
>> Can anyone give me some input on how to make sense of this?
>>
>> Thanks,
>>
>> Morten
>>
>> select count(*) from orders where item_id = 9602 -> 4534 records
>> select count(*) from orders where item_id = 9602 and status_id IN
>> (1,2) ->
>> 4181 records
>> select count(*) from orders where item_id = 9602 and customer_id =
>> 5531 ->
>> 1226 records
>> select count(*) from orders where item_id = 9602 and customer_id =
>> 5531 and
>> status_id IN (1,2) -> 1174 records
>>
>>
>>
>> --
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe: http://lists.mysql.com/mysql?unsub=brenttech@gmail.com
>>
>>
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
Re: Index selection problem
am 21.07.2009 20:16:16 von John Daisley
--=-/bf2GBHt79Ez4QFcxPlY
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
On Tue, 2009-07-21 at 19:42 +0200, Morten Primdahl wrote:
> On Jul 21, 2009, at 3:27 PM, Johnny Withers wrote:
>=20
> > MySQL is unable to use your index when you use IN and/or OR on yoru =20
> > column.
>=20
> Is this really true?
No its not true! Try running OPTIMIZE TABLE on the affected table, then
run the query again and see if the other index is used!
=20
>=20
> I'm reading "High Performance MySQL 2nd ed." these days and =20
> specifically got the impression that using IN will allow usage of the =20
> index. The below quote is from the book, and the "multiple equality =20
> condition" refers to an IN (...) expression.
>=20
> "... we draw a distinction between ranges of values and multiple =20
> equality conditions.The second query is a multiple equality condition, =
=20
> in our terminology. Weâ=99re not just being picky: these two kinds=
of =20
> index accesses perform differently. The range condition makes MySQL =20
> ignore any further columns in the index, but the multiple equality =20
> condition doesnâ=99t have that limitation."
>=20
>=20
>=20
>=20
>=20
John Daisley
Email: john.daisley@butterflysystems.co.uk
Mobile: +44 (0)7812 451238
MySQL Certified Database Administrator (CMDBA)
MySQL Certified Developer (CMDEV)
MySQL Certified Associate (CMA)
Comptia A+ Certified Professional IT Technician
-------
Life's journey is not to arrive at the grave safely in a well preserved
body, but rather to slide in sideways, thoroughly used up, totally worn
out and screaming "Wow! what a ride!"
--=-/bf2GBHt79Ez4QFcxPlY--
Re: Index selection problem
am 22.07.2009 14:57:27 von Johnny Withers
Maybe I'm wrong :)
On Tuesday, July 21, 2009, John Daisley w=
rote:
>
>
> On Tue, 2009-07-21 at 19:42 +0200, Morten Primdahl wrote:
>
>> On Jul 21, 2009, at 3:27 PM, Johnny Withers wrote:
>>
>> > MySQL is unable to use your index when you use IN and/or OR on yoru
>> > column.
>>
>> Is this really true?
>
>
> No its not true! Try running OPTIMIZE TABLE on the affected table, then
> run the query again and see if the other index is used!
>
>
>
>
>>
>> I'm reading "High Performance MySQL 2nd ed." these days and
>> specifically got the impression that using IN will allow usage of the
>> index. The below quote is from the book, and the "multiple equality
>> condition" refers to an IN (...) expression.
>>
>> "... we draw a distinction between ranges of values and multiple
>> equality conditions.The second query is a multiple equality condition,
>> in our terminology. We=92re not just being picky: these two kinds of
>> index accesses perform differently. The range condition makes MySQL
>> ignore any further columns in the index, but the multiple equality
>> condition doesn=92t have that limitation."
>>
>>
>>
>>
>>
>
> John Daisley
> Email: john.daisley@butterflysystems.co.uk
> Mobile: +44 (0)7812 451238
>
> MySQL Certified Database Administrator (CMDBA)
> MySQL Certified Developer (CMDEV)
> MySQL Certified Associate (CMA)
> Comptia A+ Certified Professional IT Technician
>
> -------
>
> Life's journey is not to arrive at the grave safely in a well preserved
> body, but rather to slide in sideways, thoroughly used up, totally worn
> out and screaming "Wow! what a ride!"
>
--=20
-----------------------------
Johnny Withers
601.209.4985
johnny@pixelated.net
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg