Here"s an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (

Here"s an Idea for Re-Syncing Master and Slave During Production Hours without Interrupting Users (

am 04.12.2009 12:34:49 von Eric.Robinson

=20
Let's face it, sometimes the master and slave get out of sync, even when
'show slave status' and 'show master status' indicate that all is well.
And sometimes it is not feasible to wait until after production hours to
resync them. We've been working on a method to do an emergency
hot-resync during production hours with little or no user downtime. What
do you guys think of this approach? It's only for Linux, though...

1. Shut down the slave and remove its replication logs (master.info and
*relay* files).

2. Do an initial rsync of the master to the slave. Using rsync's
bit-differential algorithm, this quickly copies most of the changed data
and can be safely be done against a live database. This initial rsync is
done before the next step to minimize the time during which the tables
will be read-locked.

3. Do a 'flush tables with read lock;reset master' on the master server.
At this point, user apps may freeze briefly during inserts or updates.=20

4. Do a second rsync, which goes very fast because very little data has
changed between steps 2 and 3.=20

5. Unlock the master tables.

6. Restart the slave.

When you're done, you have a 100% binary duplicate of the master
database on the slave, with no worries that some queries got missed
somewhere. The master was never stopped and users were not severely
impacted. (Mileage may vary, of course.)=20

We've tried this a few times and it has seemed to work well in most
cases. We had once case where the slave SQL thread did not want to
restart afterwards and we had to do the whole thing again, only we
stopped the master the second time. Not yet sure what that was all
about, but I think it may have been a race issue of some kind. We're
still exploring it.

Anyway, comments would be appreciated.

--
Eric Robinson


Disclaimer - December 4, 2009=20
This email and any files transmitted with it are confidential and =
intended solely for mysql@lists.mysql.com. If you are not the named =
addressee you should not disseminate, distribute, copy or alter this =
email. Any views or opinions presented in this email are solely those of =
the author and might not represent those of . Warning: Although has =
taken reasonable precautions to ensure no viruses are present in this =
email, the company cannot accept responsibility for any loss or damage =
arising from the use of this email or attachments.=20
This disclaimer was added by Policy Patrol: http://www.policypatrol.com/

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg

Re: Here"s an Idea for Re-Syncing Master and Slave During ProductionHours without Interrupting User

am 04.12.2009 15:27:18 von Tom Worster

i have two questions. (1) innodb? (2) why delete slave logs when you can
restart the slave with --skip-slave and then use CHANGE MASTER TO?

tom

On 12/4/09 6:34 AM, "Robinson, Eric" wrote:

>
> Let's face it, sometimes the master and slave get out of sync, even when
> 'show slave status' and 'show master status' indicate that all is well.
> And sometimes it is not feasible to wait until after production hours to
> resync them. We've been working on a method to do an emergency
> hot-resync during production hours with little or no user downtime. What
> do you guys think of this approach? It's only for Linux, though...
>
> 1. Shut down the slave and remove its replication logs (master.info and
> *relay* files).
>
> 2. Do an initial rsync of the master to the slave. Using rsync's
> bit-differential algorithm, this quickly copies most of the changed data
> and can be safely be done against a live database. This initial rsync is
> done before the next step to minimize the time during which the tables
> will be read-locked.
>
> 3. Do a 'flush tables with read lock;reset master' on the master server.
> At this point, user apps may freeze briefly during inserts or updates.
>
> 4. Do a second rsync, which goes very fast because very little data has
> changed between steps 2 and 3.
>
> 5. Unlock the master tables.
>
> 6. Restart the slave.
>
> When you're done, you have a 100% binary duplicate of the master
> database on the slave, with no worries that some queries got missed
> somewhere. The master was never stopped and users were not severely
> impacted. (Mileage may vary, of course.)
>
> We've tried this a few times and it has seemed to work well in most
> cases. We had once case where the slave SQL thread did not want to
> restart afterwards and we had to do the whole thing again, only we
> stopped the master the second time. Not yet sure what that was all
> about, but I think it may have been a race issue of some kind. We're
> still exploring it.
>
> Anyway, comments would be appreciated.
>
> --
> Eric Robinson



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org