How To Duplicate Number of Hits from Prod Sever to NEW QA server?
am 27.08.2010 16:19:21 von Nunzio Daveri
--0-1038652357-1282918761=:34046
Content-Type: text/plain; charset=us-ascii
Hello, I have been asked to "replay" the traffic load we have on one of our
5.1.4X servers against a new 5.5 test server we are getting ready to put into
production as a stand alone. My question is that I have 6GB of mysqld.log BUT
how to I figure out how MUCH to play back at a given time and how fast / number
of clients etc...? The log only tells me this is all the data that has come to
the production server but it doesn't tell me how fast, how many connections, how
many users at a time, how many inserts, updates or delets per second etc...
How do I "replay" the log against the 5.5 server so as to "duplicate" real time
traffic and not just replay the logs? Is there a tool or a shell script? I
know there are built in "benchmarking" tools but I am trying to tell mgmt that
5.1.4x was lets say 60% percentage busy (cpu/mem/io) with traffic hitting it on
Monday, the SAME amount of traffic on 5.5 is only 48% busy.
Any help or advise is greatly appreciated please.
Thanks in advance.
Nunzio
--0-1038652357-1282918761=:34046--
RE: How To Duplicate Number of Hits from Prod Sever to NEW QA server?
am 27.08.2010 19:08:05 von Jerry Schwartz
>-----Original Message-----
>From: Nunzio Daveri [mailto:nunziodaveri@yahoo.com]
>Sent: Friday, August 27, 2010 10:19 AM
>To: mysql@lists.mysql.com
>Subject: How To Duplicate Number of Hits from Prod Sever to NEW QA server?
>
>Hello, I have been asked to "replay" the traffic load we have on one of our
>5.1.4X servers against a new 5.5 test server we are getting ready to put into
>production as a stand alone. My question is that I have 6GB of mysqld.log
>BUT
>how to I figure out how MUCH to play back at a given time and how fast /
>number
>of clients etc...? The log only tells me this is all the data that has come
>to
>the production server but it doesn't tell me how fast, how many connections,
>how
>many users at a time, how many inserts, updates or delets per second etc...
>
>How do I "replay" the log against the 5.5 server so as to "duplicate" real
>time
>traffic and not just replay the logs? Is there a tool or a shell script? I
>know there are built in "benchmarking" tools but I am trying to tell mgmt
>that
>5.1.4x was lets say 60% percentage busy (cpu/mem/io) with traffic hitting it
>on
>Monday, the SAME amount of traffic on 5.5 is only 48% busy.
>
>Any help or advise is greatly appreciated please.
>
[JS] Unfortunately, the short answer is that benchmarking an interactive
system is difficult.
Would your management be satisfied if you just used the query log to beat the
living daylights out of the two systems? That way you could say that you
eliminated all of the confounding factors and truly measured the database
servers' actual throughput.
If they won't swallow that, then you'll have to read the rest of my reply.
- IF THE APPLICATION IS WEB BASED, you should be able to log the web server's
traffic and use that to determine all of the information you need. Just replay
the server log, using the times in there to determine the transaction arrival
rates. If that won't do it, for some reason, then read on.
- Ideally, you should have a second system (a PC would have the horsepower) to
drive the transactions into your MySQL server. That is the easy part.
- As you said, you need to simulate the transaction arrival rate. Since you
don't really know when the individual transactions arrive, you need to use
some random delays; but what to use?
- Where are the transactions coming from? An end-user program? A web server?
Those two cases could have somewhat different characteristics.
- A web server handles one transaction at a time per thread, I'd guess; so
from the perspective of MySQL the number of threads would be equivalent to the
number of users.
- An end-user program that talks directly to MySQL would present one
transaction at a time per user.
- How clustered are the transactions for an individual user (human or web
server thread)? If the human at the input side (who ultimately initiates the
transactions) is processing an order, there might be a customer lookup; a
product lookup; and an actual order entry. Then there might be a time lapse
between orders, depending upon the workflow.
- Depending upon the application, one transaction might request multiple
queries. If the transactions are different, you'd have to know a rough
proportion within the mixture. That probably isn't hard to guesstimate.
- Because the humans are presumably not in lock-step, your best choice for
simulating the transaction arrival rate might be a Poisson distribution. While
the transactions (and consequent queries) executed by an individual human
might not fit a Poisson distribution, when you combine all of the users it
should be a fair approximation.
- If the web server is a bottleneck then the arrival rate of the queries is
the maximum rate at which the web server can service transactions, at least at
times which things are busy.
- One last tidbit: in any network system, there is always a bottleneck.
As an alternative to a physical benchmark there are simulation tools; but you
need to address these issues either way.
Now, some practical pointers about presenting your results:
- Use lots of graphs. They'll draw attention away from the tables of numbers.
- Use 3D graphs. They look whizzy, and at the same time they make it harder to
eyeball exact numbers.
- If you wind up sitting across the table from someone who brought colored
pencils and a ruler, prepare for the death of 1000 cuts.
Now you know why I got out of the performance consulting business.
Good luck.
Regards,
Jerry Schwartz
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032
860.674.8796 / FAX: 860.674.8341
E-mail: jerry@gii.co.jp
Web site: www.the-infoshop.com
>Thanks in advance.
>
>Nunzio
>
>
>
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org