Index question

Index question

am 11.10.2011 09:41:06 von Alex Schaft

--------------020003000403040609060704
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

If you have a table with columns A & B, and might do a where on A or B,
or an order by A, B, would single column indexes on A and B suffice or
would performance on the order by query be improved by an index on A,B?

Thanks


--------------020003000403040609060704
Content-Type: text/plain; charset=us-ascii


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
--------------020003000403040609060704--

Re: Index question

am 11.10.2011 10:36:32 von Johan De Meersman

----- Original Message -----
> From: "Alex Schaft"
>
> If you have a table with columns A & B, and might do a where on A or
> B, or an order by A, B, would single column indexes on A and B suffice
> or would performance on the order by query be improved by an index on
> A,B?

Depends on usage :-)

key (a, b) is good for "where a=.." or "where a=.. and b=.."
key (b, a) is good for "where b=.." or "where b=.. and a=.."
(note that the sequence of a and b in the where clause is not important)
key (a), key (b) is good for "where a=.." or "where b=.." but will only use one index for "where a=.. and b=..".

I think work is ongoing on having the parser use multiple indices, but I'm not sure where that's at.


--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Index question

am 11.10.2011 13:30:23 von Tompkins Neil

Just to clarify having key indexes of (a,b) or (b,a) have no difference ?

On 11 Oct 2011, at 09:36, Johan De Meersman wrote:

> ----- Original Message -----
>> From: "Alex Schaft"
>>=20
>> If you have a table with columns A & B, and might do a where on A or
>> B, or an order by A, B, would single column indexes on A and B suffice
>> or would performance on the order by query be improved by an index on
>> A,B?
>=20
> Depends on usage :-)
>=20
> key (a, b) is good for "where a=3D.." or "where a=3D.. and b=3D.."
> key (b, a) is good for "where b=3D.." or "where b=3D.. and a=3D.."
> (note that the sequence of a and b in the where clause is not important)
> key (a), key (b) is good for "where a=3D.." or "where b=3D.." but will onl=
y use one index for "where a=3D.. and b=3D..".
>=20
> I think work is ongoing on having the parser use multiple indices, but I'm=
not sure where that's at.
>=20
>=20
> --=20
> Bier met grenadyn
> Is als mosterd by den wyn
> Sy die't drinkt, is eene kwezel
> Hy die't drinkt, is ras een ezel
>=20
> --=20
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dneil.tompkins@goog=
lemail.com
>=20

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg

Re: Index question

am 11.10.2011 13:42:15 von Arthur Fuller

--20cf3074b3dc33ca5604af046627
Content-Type: text/plain; charset=ISO-8859-1

The difference is based on the relative frequency of queries where a is
important vs. where b is important. Either way, what will happen is that the
index scan will isolate the first item mentioned, then scan the result set
to isolate the second term.

e.g.

SELECT * FROM someTable WERE a = someValue and b = someOtherValue

Step one isolates the matching "a" values.
Step two walks through that resultset and examines each value of b.

Since the values of b are already in memory, all it has to do is examine the
index keys to find the matches of b. No additional disk read is required.

Arthur

--20cf3074b3dc33ca5604af046627--

Re: Index question

am 11.10.2011 14:22:49 von Rik Wasmus

> Just to clarify having key indexes of (a,b) or (b,a) have no difference ?

They DO.

See it as lookup table which starts with 'a' in the first case, and 'b' in the
second one. Looking for anything that matches 'b' for an index (a,b) requires
a full scan as you don't know 'a', likewise searching for 'a' in an index
(b,a) requires a full scan. See it as looking through a phonebook trying to
locate someone by first- rather then lastname. It's in there, just not easily
accessible.

However, if you have an index on (a,b) and DO know which 'a' you want
('Smith'), looking for 'Smith, John' is faster with an index (a,b) then with
only an index on (a).

Johan was trying to explain this distinction:

- index (a,b) is good for searches on ONLY a or BOTH a & b, but bad for ONLY
b
- index (b,a) is good for searches on ONLY b or BOTH a & b, but bad for ONLY
a
- index (a) & index (b) is good for searches on ONLY b or ONLY a, and is
suboptimal for searching for BOTH a,b (although, faster then no index, but the
query optimizer has to choose which index to use, can't use both).
--
Rik Wasmus

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Index question

am 11.10.2011 14:30:43 von Alex Schaft

--------------080703000206070208050305
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 2011/10/11 02:22 PM, Rik Wasmus wrote:
>> Just to clarify having key indexes of (a,b) or (b,a) have no difference ?
> They DO.
>
> See it as lookup table which starts with 'a' in the first case, and 'b' in the
> second one. Looking for anything that matches 'b' for an index (a,b) requires
> a full scan as you don't know 'a', likewise searching for 'a' in an index
> (b,a) requires a full scan. See it as looking through a phonebook trying to
> locate someone by first- rather then lastname. It's in there, just not easily
> accessible.
>
> However, if you have an index on (a,b) and DO know which 'a' you want
> ('Smith'), looking for 'Smith, John' is faster with an index (a,b) then with
> only an index on (a).
>
> Johan was trying to explain this distinction:
>
> - index (a,b) is good for searches on ONLY a or BOTH a& b, but bad for ONLY
> b
> - index (b,a) is good for searches on ONLY b or BOTH a& b, but bad for ONLY
> a
> - index (a)& index (b) is good for searches on ONLY b or ONLY a, and is
> suboptimal for searching for BOTH a,b (although, faster then no index, but the
> query optimizer has to choose which index to use, can't use both).
Next question. If you have the two separate indexes and then do two
queries, one for a and one for b. If you then get a list of unique id's
of both, would it be faster to create an intersection yourself rather
than have the server do the legwork?


--------------080703000206070208050305
Content-Type: text/plain; charset=us-ascii


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
--------------080703000206070208050305--

Re: Index question

am 11.10.2011 14:35:08 von Alex Schaft

--------------040903020305060506090302
Content-Type: multipart/alternative;
boundary="------------050505020608050302080902"


--------------050505020608050302080902
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 2011/10/11 02:30 PM, Alex Schaft wrote:
> On 2011/10/11 02:22 PM, Rik Wasmus wrote:
>>> Just to clarify having key indexes of (a,b) or (b,a) have no
>>> difference ?
>> They DO.
>>
>> See it as lookup table which starts with 'a' in the first case, and
>> 'b' in the
>> second one. Looking for anything that matches 'b' for an index (a,b)
>> requires
>> a full scan as you don't know 'a', likewise searching for 'a' in an
>> index
>> (b,a) requires a full scan. See it as looking through a phonebook
>> trying to
>> locate someone by first- rather then lastname. It's in there, just
>> not easily
>> accessible.
>>
>> However, if you have an index on (a,b) and DO know which 'a' you want
>> ('Smith'), looking for 'Smith, John' is faster with an index (a,b)
>> then with
>> only an index on (a).
>>
>> Johan was trying to explain this distinction:
>>
>> - index (a,b) is good for searches on ONLY a or BOTH a& b, but bad
>> for ONLY
>> b
>> - index (b,a) is good for searches on ONLY b or BOTH a& b, but bad
>> for ONLY
>> a
>> - index (a)& index (b) is good for searches on ONLY b or ONLY a,
>> and is
>> suboptimal for searching for BOTH a,b (although, faster then no
>> index, but the
>> query optimizer has to choose which index to use, can't use both).
> Next question. If you have the two separate indexes and then do two
> queries, one for a and one for b. If you then get a list of unique
> id's of both, would it be faster to create an intersection yourself
> rather than have the server do the legwork?
>
>
>
Then there's index merge optimizations too I suppose

--------------050505020608050302080902
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit



http-equiv="Content-Type">


On 2011/10/11 02:30 PM, Alex Schaft wrote:

type="cite">On 2011/10/11 02:22 PM, Rik Wasmus wrote:



Just to clarify having key indexes of
(a,b) or (b,a) have no difference ?



They DO.




See it as lookup table which starts with 'a' in the first case,
and 'b'  in the


second one. Looking for anything that matches 'b' for an index
(a,b) requires


a full scan as you don't know 'a', likewise searching for 'a' in
an index


(b,a) requires a full scan. See it as looking through a
phonebook trying to


locate someone by first- rather then lastname. It's in there,
just not easily


accessible.




However, if you have an index on (a,b) and DO know which 'a' you
want


('Smith'), looking for 'Smith, John' is faster with an index
(a,b) then with


only an index on (a).




Johan was trying to explain this distinction:




- index (a,b) is good for searches on ONLY a  or BOTH a&  b,
but bad for ONLY


b


- index (b,a) is good for searches on ONLY b  or BOTH a&  b,
but bad for ONLY


a


- index (a)&  index (b) is good for searches on ONLY b  or
ONLY a, and is


suboptimal for searching for BOTH a,b (although, faster then no
index, but the


query optimizer has to choose which index to use, can't use
both).



Next question. If you have the two separate indexes and then do
two queries, one for a and one for b. If you then get a list of
unique id's of both, would it be faster to create an intersection
yourself rather than have the server do the legwork?












Then there's index merge optimizations too I suppose



--------------050505020608050302080902--


--------------040903020305060506090302
Content-Type: text/plain; charset=us-ascii


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
--------------040903020305060506090302--

Re: Index question

am 11.10.2011 14:40:05 von Rik Wasmus

> Next question. If you have the two separate indexes and then do two
> queries, one for a and one for b. If you then get a list of unique id's
> of both, would it be faster to create an intersection yourself rather
> than have the server do the legwork?

If you only have 2 unrelated indexes on a & b, it depends on the data, the
distribution of values, etc. No single answer here, test with your data and
you'll have the results.

If you need it often, I'd go for the combined index & let MySQL do the work,
which is probably fastest.
--
Rik Wasmus

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Index question

am 11.10.2011 15:00:25 von Tompkins Neil

In this instance would you create four indexes key(a) key(b) key(a,b) key (b=
,a) ? Or is the decision based on the query response time ?

On 11 Oct 2011, at 13:40, Rik Wasmus wrote:

>> Next question. If you have the two separate indexes and then do two
>> queries, one for a and one for b. If you then get a list of unique id's
>> of both, would it be faster to create an intersection yourself rather
>> than have the server do the legwork?
>=20
> If you only have 2 unrelated indexes on a & b, it depends on the data, the=
=20
> distribution of values, etc. No single answer here, test with your data an=
d=20
> you'll have the results.
>=20
> If you need it often, I'd go for the combined index & let MySQL do the wor=
k,=20
> which is probably fastest.=20
> --=20
> Rik Wasmus
>=20
> --=20
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dneil.tompkins@goog=
lemail.com
>=20

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg

Re: Index question

am 11.10.2011 15:08:52 von Rik Wasmus

> In this instance would you create four indexes key(a) key(b) key(a,b) key
> (b,a) ? Or is the decision based on the query response time ?

Depends on the data and usage, but probably I'd go for a index(a,b) &
index(b,a) if reads heavily outnumber writes. As index(a) is covered by
index(a,b), and index(b) by index(b,a), we don't need to add those, which
saves time on modifications.
--
Rik Wasmus

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Index question

am 11.10.2011 15:51:49 von Johan De Meersman

----- Original Message -----
> From: "Rik Wasmus"
>
> Depends on the data and usage, but probably I'd go for a index(a,b) &
> index(b,a) if reads heavily outnumber writes. As index(a) is covered
> by index(a,b), and index(b) by index(b,a), we don't need to add those,
> which saves time on modifications.

I'm trying to think of a scenario where index(a) would be beneficial in the presence of index(a,b). If both are available, and all else being equal, it's likely that the parser will pick the simplest index; but I can't see it having a major impact.

Any full prefix of a combined index may be used; so afaik a separate index on any full prefix is a waste of diskspace and cycles.

The net conclusion, Neil, is that you actually have to know what you're doing :-) Take the time to read the online documentation on mysql.com, it's pretty good.


--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Index question

am 11.10.2011 20:38:13 von Tompkins Neil

--20cf307f356ec9d79504af0a35b3
Content-Type: text/plain; charset=ISO-8859-1

Thanks for the information. One final question in what ways should we use
EXPLAIN EXTENDED statement to help improve our query performance.

On Tue, Oct 11, 2011 at 2:51 PM, Johan De Meersman wrote:

> ----- Original Message -----
> > From: "Rik Wasmus"
> >
> > Depends on the data and usage, but probably I'd go for a index(a,b) &
> > index(b,a) if reads heavily outnumber writes. As index(a) is covered
> > by index(a,b), and index(b) by index(b,a), we don't need to add those,
> > which saves time on modifications.
>
> I'm trying to think of a scenario where index(a) would be beneficial in the
> presence of index(a,b). If both are available, and all else being equal,
> it's likely that the parser will pick the simplest index; but I can't see it
> having a major impact.
>
> Any full prefix of a combined index may be used; so afaik a separate index
> on any full prefix is a waste of diskspace and cycles.
>
> The net conclusion, Neil, is that you actually have to know what you're
> doing :-) Take the time to read the online documentation on mysql.com,
> it's pretty good.
>
>
> --
> Bier met grenadyn
> Is als mosterd by den wyn
> Sy die't drinkt, is eene kwezel
> Hy die't drinkt, is ras een ezel
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:
> http://lists.mysql.com/mysql?unsub=neil.tompkins@googlemail. com
>
>

--20cf307f356ec9d79504af0a35b3--

Re: Index question

am 12.10.2011 10:05:37 von Johan De Meersman

--=_1e372d9a-14e1-4180-9d34-6095ff06077e
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

----- Original Message -----

> From: "Tompkins Neil"

> Thanks for the information. One final question in what ways should we
> use EXPLAIN EXTENDED statement to help improve our query
> performance.

Explain is your friend. You should listen to it :-)

It gives a nice idea of how the database interprets your query, so you can see where the bottlenecks are, for example what bits don't use indices or cause file sorts.

The exact interpretation of it is an art, though, and there are many subtleties you only get by experience and reading documentation. It is not something that is quickly explained; it requires a good knowledge of how a database works on the inside.

--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel

--=_1e372d9a-14e1-4180-9d34-6095ff06077e--