Does InnoDB ever not cluster data by primary key?

Does InnoDB ever not cluster data by primary key?

am 31.07.2009 04:00:35 von Kyong Kim

We have a multi-column primary key with an auto-increment column as
the 3rd column in the primary key in InnoDB.
Is there a requirement to have the auto-increment column as the
leftmost column in the primary key in order for InnoDB to cluster by
the multi-column primary key?
I don't believe this to be the case but there has been some discussion
on this topic.
I haven't been able to find any definitive answers.
Judging by the query profiling results and explain output, we are
seeing the benefits of clustering by primary key.
If you have any insight on this matter, it would be much appreciated.
Kyong

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org

Re: Does InnoDB ever not cluster data by primary key?

am 31.07.2009 06:25:29 von Kyong Kim

Michael,
Yeah. We're trying to maximize the benefits of clustering and had to
sacrifice on the length of the primary key.
And we got fairly good results from query profiling using maatkit.
One thing that shocked me was the overhead of random inserts primary
key updates.
It's definitely a tradeoff.
We're reasonably certain that we'll see a lot of ordered bulk inserts.

It ran counter to the results that we were seeing so I had to verify
that InnoDB always clusters by primary key regardless of the position
of the auto increment column in the primary key.
Kyong

On Thu, Jul 30, 2009 at 7:08 PM, Michael Dykman wrote:
> InnoDb storage format is (always) a b-tree based on the primary key,
> so the simple answer is: no, InnoDB never clusters by anything other
> than a primary key. =A0The size of that key can have significant impact
> on performance though, so be careful with the multi-icolumn primary
> key. =A0Assuming your primary key remains constant over the lifetime of
> the record. I don't think it matters much where you put the
> auto-increment key.
>
> =A0- michael
>
>
> On Thu, Jul 30, 2009 at 10:00 PM, Kyong Kim wrote:
>> We have a multi-column primary key with an auto-increment column as
>> the 3rd column in the primary key in InnoDB.
>> Is there a requirement to have the auto-increment column as the
>> leftmost column in the primary key in order for InnoDB to cluster by
>> the multi-column primary key?
>> I don't believe this to be the case but there has been some discussion
>> on this topic.
>> I haven't been able to find any definitive answers.
>> Judging by the query profiling results and explain output, =A0we are
>> seeing the benefits of clustering by primary key.
>> If you have any insight on this matter, it would be much appreciated.
>> Kyong
>>
>> --
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe: =A0 =A0http://lists.mysql.com/mysql?unsub=3Dmdykman@gmai=
l.com
>>
>>
>
>
>
> --
> =A0- michael dykman
> =A0- mdykman@gmail.com
>
> Don=92t worry about people stealing your ideas. If they=92re any good,
> you=92ll have to ram them down their throats!
>
> =A0 Howard Aiken
>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg

Re: Does InnoDB ever not cluster data by primary key?

am 31.07.2009 06:44:46 von Michael Dykman

With your auto-increment in the right-most position, it seems to me
that it would tend to make your inserts non-sequential (assuming the
fields to the left are not sequential) causing inserts to occur all
over the tree. With the auto-increment as the first field in the key,
the inserts would be going to the same place in the tree allowing it
to build out nicely.

I have definitely found that sequential inserts perform much better
than random ones.

- md

On Fri, Jul 31, 2009 at 12:25 AM, Kyong Kim wrote:
> Michael,
> Yeah. We're trying to maximize the benefits of clustering and had to
> sacrifice on the length of the primary key.
> And we got fairly good results from query profiling using maatkit.
> One thing that shocked me was the overhead of random inserts primary
> key updates.
> It's definitely a tradeoff.
> We're reasonably certain that we'll see a lot of ordered bulk inserts.
>
> It ran counter to the results that we were seeing so I had to verify
> that InnoDB always clusters by primary key regardless of the position
> of the auto increment column in the primary key.
> Kyong
>
> On Thu, Jul 30, 2009 at 7:08 PM, Michael Dykman wrote:
>> InnoDb storage format is (always) a b-tree based on the primary key,
>> so the simple answer is: no, InnoDB never clusters by anything other
>> than a primary key. =A0The size of that key can have significant impact
>> on performance though, so be careful with the multi-icolumn primary
>> key. =A0Assuming your primary key remains constant over the lifetime of
>> the record. I don't think it matters much where you put the
>> auto-increment key.
>>
>> =A0- michael
>>
>>
>> On Thu, Jul 30, 2009 at 10:00 PM, Kyong Kim wrote:
>>> We have a multi-column primary key with an auto-increment column as
>>> the 3rd column in the primary key in InnoDB.
>>> Is there a requirement to have the auto-increment column as the
>>> leftmost column in the primary key in order for InnoDB to cluster by
>>> the multi-column primary key?
>>> I don't believe this to be the case but there has been some discussion
>>> on this topic.
>>> I haven't been able to find any definitive answers.
>>> Judging by the query profiling results and explain output, =A0we are
>>> seeing the benefits of clustering by primary key.
>>> If you have any insight on this matter, it would be much appreciated.
>>> Kyong
>>>
>>> --
>>> MySQL General Mailing List
>>> For list archives: http://lists.mysql.com/mysql
>>> To unsubscribe: =A0 =A0http://lists.mysql.com/mysql?unsub=3Dmdykman@gma=
il.com
>>>
>>>
>>
>>
>>
>> --
>> =A0- michael dykman
>> =A0- mdykman@gmail.com
>>
>> Don=92t worry about people stealing your ideas. If they=92re any good,
>> you=92ll have to ram them down their throats!
>>
>> =A0 Howard Aiken
>>
>



--=20
- michael dykman
- mdykman@gmail.com

Don=92t worry about people stealing your ideas. If they=92re any good,
you=92ll have to ram them down their throats!

Howard Aiken

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg

Re: Does InnoDB ever not cluster data by primary key?

am 01.08.2009 01:07:17 von Kyong Kim

Michael,
We're counting on batch inserts of constant 2 leftmost columns of the
primary key.
We would be selecting within constant values for the leftmost columns as we=
ll.

For example, our primary key is

country_id, city_id, auto_inc, ...

We'll always be looking for data from within the same country and city.
Kyong


On Thu, Jul 30, 2009 at 9:44 PM, Michael Dykman wrote:
> With your auto-increment in the right-most position, it seems to me
> that it would tend to make your inserts non-sequential (assuming the
> fields to the left are not sequential) causing inserts to occur all
> over the tree. =A0With the auto-increment as the first field in the key,
> the inserts would be going to the same place in the tree allowing it
> to build out nicely.
>
> I have definitely found that sequential inserts perform much better
> than random ones.
>
> =A0- md
>
> On Fri, Jul 31, 2009 at 12:25 AM, Kyong Kim wrote:
>> Michael,
>> Yeah. We're trying to maximize the benefits of clustering and had to
>> sacrifice on the length of the primary key.
>> And we got fairly good results from query profiling using maatkit.
>> One thing that shocked me was the overhead of random inserts primary
>> key updates.
>> It's definitely a tradeoff.
>> We're reasonably certain that we'll see a lot of ordered bulk inserts.
>>
>> It ran counter to the results that we were seeing so I had to verify
>> that InnoDB always clusters by primary key regardless of the position
>> of the auto increment column in the primary key.
>> Kyong
>>
>> On Thu, Jul 30, 2009 at 7:08 PM, Michael Dykman wrote=
:
>>> InnoDb storage format is (always) a b-tree based on the primary key,
>>> so the simple answer is: no, InnoDB never clusters by anything other
>>> than a primary key. =A0The size of that key can have significant impact
>>> on performance though, so be careful with the multi-icolumn primary
>>> key. =A0Assuming your primary key remains constant over the lifetime of
>>> the record. I don't think it matters much where you put the
>>> auto-increment key.
>>>
>>> =A0- michael
>>>
>>>
>>> On Thu, Jul 30, 2009 at 10:00 PM, Kyong Kim wrote:
>>>> We have a multi-column primary key with an auto-increment column as
>>>> the 3rd column in the primary key in InnoDB.
>>>> Is there a requirement to have the auto-increment column as the
>>>> leftmost column in the primary key in order for InnoDB to cluster by
>>>> the multi-column primary key?
>>>> I don't believe this to be the case but there has been some discussion
>>>> on this topic.
>>>> I haven't been able to find any definitive answers.
>>>> Judging by the query profiling results and explain output, =A0we are
>>>> seeing the benefits of clustering by primary key.
>>>> If you have any insight on this matter, it would be much appreciated.
>>>> Kyong
>>>>
>>>> --
>>>> MySQL General Mailing List
>>>> For list archives: http://lists.mysql.com/mysql
>>>> To unsubscribe: =A0 =A0http://lists.mysql.com/mysql?unsub=3Dmdykman@gm=
ail.com
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> =A0- michael dykman
>>> =A0- mdykman@gmail.com
>>>
>>> Don=92t worry about people stealing your ideas. If they=92re any good,
>>> you=92ll have to ram them down their throats!
>>>
>>> =A0 Howard Aiken
>>>
>>
>
>
>
> --
> =A0- michael dykman
> =A0- mdykman@gmail.com
>
> Don=92t worry about people stealing your ideas. If they=92re any good,
> you=92ll have to ram them down their throats!
>
> =A0 Howard Aiken
>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg