In some previous posts I have written about replicating Business Intelligence information to Local Area Networks satellites.
Now I had my replication scheme up and running for some weeks, and it surpasses all my expectations. The replication runs through like a clyster. In the last post I feared the network speed would be to slow, but it seems like the network guys have cranked up the speed.
This Rsync statistics is from last nights replication:
Number of files: 294
Number of files transferred: 140
Total file size: 726588950 bytes
Total transferred file size: 641270064 bytes
Literal data: 553799963 bytes
Matched data: 87470101 bytes
File list size: 3898
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 70309506
Total bytes received: 781434
sent 70309506 bytes received 781434 bytes 437482.71 bytes/sec
total size is 726588950 speedup is 10.22
I’m not sure what all figures means but it’s fast. The replication is done twice plus the replication of an empty database (also done twice), normally this procedure takes less than three minutes wall clock time.
I run this replication against a live MySQL/ISAM. I just flush the tables and then rsync. I know there is no ongoing activities against this database during the replication. But that actually doesn’t matter that much, you can run a similar scheme against a busy database, if the activity is low run rsync until no data is transferred or run twice then lock the database and run a final rsync replication, but in that case I would prefer a proper backup scheme. I replicate this way because i know there in no activities against the database, and my rsync procedure is fast and simple.
But there is still a problem, a colleague run this procedure manually. And this is certainly not the way it should be. But I just haven’t had time to set up an automatic procedure yet. This procedure is a bit complicated, as it involves taking down the target database system and run the replicate from the source system and this should be controlled by a third server who knows when time is due for replication. Why not run the replication process from the target system? Yes, that is undeniably simpler, but I like to have the process under the supervision of the controlling server who knows when time is due.
This is how I will set it up. From the controlling server I will issue commands via ssh to the source and target systems:
1 Stop MySQL
2 Flush tables
3 Start Rsync
4 Check the database - MySqL upgrade script
5 Start the database.
I have promised my colleague to replace his manual labor with a shell script next week :)
If you have a simpler/better way of replicate please comment this post.