mySQL underperforming - trying to identify bottleneck

on 02.11.2006 13:09:45 by hazel

Currently we have a database with a main table containing 3 million
records - we want to increase that to 10 million, but that's not a
possibility at the moment.
Nearly all 3 million records are deleted and replaced every day - all
through the day - currently we're handling this by having 2 sets of
tables - 1 for inserting, 1 for searching.

A block of records (10k to 1 million, distinguished by a client
identifier field) is deleted and replaced on the 'alt' set of tables:
records are inserted from CSV files using LOAD DATA INFILE (the CSV
file is created by loading XML or CSV files in proprietary client
formats, then validating and rewriting the data in our format).
To facilitate faster search times, summary tables are updated from the
latest update - i.e. INSERT INTO summarytable SELECT fields FROM
alttable JOIN supportingtables WHERE clientID = $clientID.
Then we LOAD INDEX INTO CACHE for all the relevant tables (key_buffer
is set to 512MB).
Then we switch a flag in an info table to tell the searches to start
pulling from these updated tables, and then we repeat the process on
the table that was previously the search table.
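
A minimal sketch of one refresh cycle as described above, assuming
MyISAM tables; all names here (tblOffersAlt, tblSummaryFullAlt,
tblSupporting, tblInfo, the columns and the file path) are illustrative,
not the real schema:

    -- 1. Clear this client's block on the offline ('alt') set
    DELETE FROM tblOffersAlt WHERE fldClientID = 17;

    -- 2. Bulk-load the validated CSV produced from the client's feed
    LOAD DATA INFILE '/data/feeds/client17.csv'
      INTO TABLE tblOffersAlt
      FIELDS TERMINATED BY ',' ENCLOSED BY '"'
      LINES TERMINATED BY '\n';

    -- 3. Rebuild the summary rows for that client
    DELETE FROM tblSummaryFullAlt WHERE fldClientID = 17;
    INSERT INTO tblSummaryFullAlt
        (fldClientID, fldResort, fldCountry, fldBoardBasis, fldPrice)
    SELECT o.fldClientID, o.fldResort, s.fldCountry, o.fldBoardBasis,
           MIN(o.fldPrice)
    FROM tblOffersAlt o
    JOIN tblSupporting s ON s.fldAccomRef = o.fldAccomRef
    WHERE o.fldClientID = 17
    GROUP BY o.fldClientID, o.fldResort, s.fldCountry, o.fldBoardBasis;

    -- 4. Pre-warm the MyISAM key cache (key_buffer_size = 512M)
    LOAD INDEX INTO CACHE tblOffersAlt, tblSummaryFullAlt;

    -- 5. Flip the flag so searches read from the freshly loaded set
    UPDATE tblInfo SET fldActiveSet = 'alt';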

During this time even simple queries can end up in the slow query log,
and I can't figure out why.

This query benchmarks at approx 0.25s:

SELECT fldResort AS dest_name, fldResort AS ap_destname,
       fldDestinationAPC, MIN(fldPrice) AS price, fldCountry AS country,
       fldBoardBasis, fldFlyTime, SUM(fldOfferCount) AS offercount
FROM tblSummaryFull
WHERE fldStatus = 0
  AND fldDepartureDate >= '2006-12-27' AND fldDepartureDate <= '2007-01-02'
  AND fldDuration >= 7 AND fldDuration <= 7
  AND tblSummaryFull.fldSearchTypes LIKE '%all%'
GROUP BY dest_name, fldBoardBasis
ORDER BY price

It's using where, temporary and filesort, with a key length of 3, and
examined 23k rows. The log reads:
Query_time: 11 Lock_time: 0 Rows_sent: 267 Rows_examined: 23889
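
A key length of 3 could correspond to a single 3-byte column (a DATE,
for example), so the optimizer may be using only a narrow index here; a
composite index over the equality and range columns might help, though
the leading-wildcard LIKE '%all%' can never use an index. A hedged
sketch (the index name and column order are assumptions, not the
original schema):

    -- Equality column first, then the range columns
    ALTER TABLE tblSummaryFull
      ADD INDEX idx_status_date_dur (fldStatus, fldDepartureDate, fldDuration);

    -- Re-check what the optimizer picks afterwards
    EXPLAIN SELECT fldResort, MIN(fldPrice)
    FROM tblSummaryFull
    WHERE fldStatus = 0
      AND fldDepartureDate BETWEEN '2006-12-27' AND '2007-01-02'
      AND fldDuration = 7;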

But even the most basic queries are being affected:

SELECT * FROM tblResortInfo WHERE fldClientID=17 AND fldAccomRef='3883'

Benchmarked at 0.02s (there are 0 results for this query).
From the log: # Query_time: 11 Lock_time: 0 Rows_sent: 0 Rows_examined: 1
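
With Rows_examined at 1, the 11 seconds are going somewhere other than
row scanning. One thing worth ruling out during the load window is
table-lock contention - if these are MyISAM tables (the key_buffer
mention suggests so), LOAD DATA and big DELETEs take table-level locks
and can stall readers. These are standard MySQL commands:

    -- Snapshot every connection; look for threads in the 'Locked' state
    SHOW FULL PROCESSLIST;

    -- How often queries had to wait for a table lock vs. got one at once
    SHOW STATUS LIKE 'Table_locks_waited';
    SHOW STATUS LIKE 'Table_locks_immediate';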

The site is at very low traffic atm (around 3k visitors per day).

I'm doing everything I can to improve performance and query speeds
before next summer (when we're aiming for around 30k per day), but I
can't seem to do anything about this, and if queries won't run at their
optimal speed then all this work has been for nothing.

It's probably worth noting that our CPU usage is barely at 50% - ditto
with RAM.

Re: mySQL underperforming - trying to identify bottleneck

on 02.11.2006 14:15:01 by Jeff North

On 2 Nov 2006 04:09:45 -0800, in mailing.database.mysql "NancyJ"

<1162469385.761666.260490@e3g2000cwe.googlegroups.com> wrote:

>| Currently we have a database with a main table containing 3 million
>| records - we want to increase that to 10 million, but that's not a
>| possibility at the moment.
>| Nearly all 3 million records are deleted and replaced every day - all

Why?

>| through the day - currently we're handling this by having 2 sets of
>| tables - 1 for inserting, 1 for searching.
>|
>| A block of records (10k to 1 million, distinguished by a client
>| identifier field) is deleted and replaced on the 'alt' set of tables:
>| records are inserted from CSV files using LOAD DATA INFILE (the CSV
>| file is created by loading XML or CSV files in proprietary client
>| formats, then validating and rewriting the data in our format).
>| To facilitate faster search times, summary tables are updated from the
>| latest update - i.e. INSERT INTO summarytable SELECT fields FROM
>| alttable JOIN supportingtables WHERE clientID = $clientID.
>| Then we LOAD INDEX INTO CACHE for all the relevant tables (key_buffer
>| is set to 512MB).
>| Then we switch a flag in an info table to tell the searches to start
>| pulling from these updated tables, and then we repeat the process on
>| the table that was previously the search table.
>|
>| During this time even simple queries can end up in the slow query log,
>| and I can't figure out why.

What indices have you set for the table(s)?

---------------------------------------------------------------
jnorthau@yourpantsyahoo.com.au : Remove your pants to reply
---------------------------------------------------------------

Re: mySQL underperforming - trying to identify bottleneck

on 02.11.2006 17:41:30 by hazel

Jeff North wrote:

> On 2 Nov 2006 04:09:45 -0800, in mailing.database.mysql "NancyJ"
>
> <1162469385.761666.260490@e3g2000cwe.googlegroups.com> wrote:
>
> >| Currently we have a database with a main table containing 3 million
> >| records - we want to increase that to 10 million, but that's not a
> >| possibility at the moment.
> >| Nearly all 3 million records are deleted and replaced every day - all
>
> Why?
Because they change every day - we have around 30 data suppliers, and
every day they supply us with a new file - sometimes they want to add
to their current dataset, sometimes they want to replace it with a
whole new data set.
>
> >| [snip: table-swap process description]
>
> What indices have you set for the table(s)?
>
We have nearly 100 tables - it would take all day to list every index.
Under good conditions all our uncached queries are fast - I'm trying to
find the cause of simple queries, ones that aren't locked or limited by
CPU or memory, running 1000 times slower than they should.


Re: mySQL underperforming - trying to identify bottleneck

on 02.11.2006 21:32:20 by larko

NancyJ wrote:
> [snip: original post quoted in full]

it shouldn't really matter why a dba deletes or adds tables or index
fields. the server should be able to handle this and then some if you
have the right configuration.

having said that, turn on slow query logging on your server and start
looking at what is causing the bottlenecks through the mysqldumpslow
command. it will give you a somewhat aggregated tally of what is going
on with all of your queries. using the slow query dump results, start
creating indexes on the guilty parties. can't get any simpler than
that, ey?

you should set the long_query_time parameter (it's a threshold
parameter) in the my.cnf file. usually that is set at 5 seconds or
whatever you feel is the right number.
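
a minimal sketch of both pieces, assuming a 2006-era (4.1/5.0) server
and an illustrative log path:

    # my.cnf excerpt - option names as in MySQL 4.1/5.0
    [mysqld]
    log-slow-queries = /var/log/mysql/mysql-slow.log
    long_query_time  = 5    # seconds; anything slower gets logged

and then summarize the log (sort by time, show the top 10 patterns):

    mysqldumpslow -s t -t 10 /var/log/mysql/mysql-slow.log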

Re: mySQL underperforming - trying to identify bottleneck

on 02.11.2006 21:45:28 by Jeff North

On 2 Nov 2006 08:41:30 -0800, in mailing.database.mysql "NancyJ"

<1162485690.095216.92120@m73g2000cwd.googlegroups.com> wrote:

>|
>| Jeff North wrote:
>|
>| > On 2 Nov 2006 04:09:45 -0800, in mailing.database.mysql "NancyJ"
>| >
>| > <1162469385.761666.260490@e3g2000cwe.googlegroups.com> wrote:
>| >
>| > >| Currently we have a database with a main table containing 3 million
>| > >| records - we want to increase that to 10 million, but that's not a
>| > >| possibility at the moment.
>| > >| Nearly all 3 million records are deleted and replaced every day - all
>| >
>| > Why?
>| Because they change every day - we have around 30 data suppliers, and
>| every day they supply us with a new file - sometimes they want to add
>| to their current dataset, sometimes they want to replace it with a
>| whole new data set.

Just wanting clarification of why it was necessary to delete the
records :-)

What method are you using to delete the records: DELETE FROM or
TRUNCATE TABLE?
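
For a full replace, TRUNCATE TABLE is effectively a drop-and-recreate
and is much faster than deleting row by row; DELETE with a WHERE clause
is the only option for per-client blocks, but on MyISAM it leaves holes
in the data file. A sketch (tblOffersAlt is an illustrative name, not
your schema):

    -- full replace: fast, empties the whole table, no WHERE allowed
    TRUNCATE TABLE tblOffersAlt;

    -- per-client block: row by row, leaves gaps until the table is rebuilt
    DELETE FROM tblOffersAlt WHERE fldClientID = 17;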

>| > >| [snip: table-swap process description]
>| >
>| > What indices have you set for the table(s)?
>| >
>| We have nearly 100 tables - it would take all day to list every index.
>| Under good conditions all our uncached queries are fast - I'm trying to
>| find the cause of simple queries, ones that aren't locked or limited by
>| CPU or memory, running 1000 times slower than they should.

This may not be a database issue. If you are deleting and recreating
tables/files, the actual data may be fragmented on the hard drive. Have
you tried defragging your drive?
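
On the MySQL side, the analogous step after heavy DELETE/LOAD churn is
OPTIMIZE TABLE, which defragments a MyISAM data file and sorts its
index pages (note it takes a table lock while it runs; the table name
is illustrative, as above):

    -- reclaim space left by deleted rows and rebuild indexes
    OPTIMIZE TABLE tblOffersAlt;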

---------------------------------------------------------------
jnorthau@yourpantsyahoo.com.au : Remove your pants to reply
---------------------------------------------------------------