Posts Tagged ‘MySQL’

bidirectional syncing of MySQL tables with percona pt-table-sync

What need to be done:

For an current project in the organization I work (INF) I want try the bidirectional syncing of a MySQL database. We will run the same application in different offices. Between the servers we have a VPN connections. We don’t need live syncing – once a day should be enough.


The possibilities that I found for syncing are these:

MySQL own replication.
seems to be hard to implement and nobody recommends a master-master replication. But maybe its not recommended to have a read-write access to master-master replication with any solution. But that doesn’t matter – I need a solution.

SQLyog Job Agent (SJA)
from what is part of SQLyog Enterprise and SQLyog Ultimate
This is the solution suggested by the developer of our application
The Linux version should be free
I cannot find the download for Linux, maybe its included in the 30 days trial version.
And the bidirectional sync seems just to be two times a one-way-sync. So first we sync from serverA to serverB and then back.
So this doesn’t help us much. If somebody updates serverB but we syncing first from serverA to serverB, we will overwrite the change again.

pt-table-sync made by percona
the bidirectional syncing is marked as experimental but till now it seems to be the best way – so I will test it first.

user permissions

the sync user needs quite much permissions. I tried to give as less as possible and run several times into problems. pt-table-sync tells you every time it has not enough privileges. I will have to find out what are the minimal privileges are and then (maybe :-} ) update this post. Now I ended up with SUPER and REPLICATION CLIENT as global permissions and SELECT, INSERT, UPDATE, DELETE, REFERENCES, LOCK TABLES for the table I like to sync.

primary key conflict:

In the MySQL default configuration the AUTO_INCREMENT fields are just incremented by 1 if a new row is added to the table. If two servers add a row in a table where the primary key is set to AUTO_INCREMENT (what happens really very often), we would have a conflict that is not resolvable because we have the same PRIMARY KEY double.
So we set two variables in the MySQL configuration file (my.cnf)
auto-increment-increment = 100
auto-increment-offset = 1

auto-increment-increment is on every server set to 100 and auto-increment-offset is set different on every server (1,2,3 etc.)
So now new AUTO_INCREMENT values will incremented by 100 and every server has an own offset, so new rows will get on server1 the ids 101, 201, 301 and on server2 102, 202, 302


Our database structure has a timestamp column in every table, what is very useful so we can use the pt-table-sync options: „–conflict-column timestamp –conflict-comparison newest“ for conflicts.
So in case an item is updated on both servers, the one with the newest timestamp will win.
Just in the case there is a change on the same item, saved in the same seconds we would have an unresolvable conflict, but that is not very likely.
By default pt-table-sync just creates a warning when running in a unresolvable conflict but with the option „–conflict-error die“ it stops the execution in this case.

tables with combined primary keys:

The documentation says that one of the limitations of the bidirectional sync is:
* requires that the table(s) are chunkable with the Chunk algorithm
I didn’t understand that limitation first, but now I think it means that we need a single-column-primary-key. The one way syncing works fine with combined primary key tables, but for bidirectional syncing of tables with combined primary keys I got the message that „No plugin can sync <database.table>„.
This forum post gave me the tip.

We have some tables that have combined primary keys and we need to sync them too.
So I changed the PRIMARY KEY indexes to UNIQUE and add an „id“ column and made it PRIMARY KEY.
This should not affect the application.

not found changes:

In every table our database has a column called „deleted“. This column can be set to 0 or 1. When I changed this value pt-table-sync didn’t recognize the change. I think this must be because of the checksum function, by default it is CRC32. When I set it to MD5 it works fine ( –function MD5 ). SHA1 is probably even more save but also slower.

automatic script:

I’ve wrote a small perl script that has contains a list of tables that need to be sync and the timestamp column name. If the timestamp column names would be the same everywhere we maybe would not need the list – but I cannot change the database design. The list makes also sure that we update / insert data in the correct order, so we don’t run into foreign key problems.
So the script runs through the list of tables and sync them with pt-table-sync.
Conflicts (changes between two sync on different servers) will be recognized with the help of the „change_log“ table. In a conflict the later change wins but both parties gets an email with the information about the conflict.

The script collects some information and prints them. It’s designed to run as cron job and send the syncing information by email to the admin, that’s why the short information is in the head and the verbose at the bottom.
You can find the script here.

But don’t use it as an out-of-the-box solution. It changes your database, so use it just if you understand what it does! I don’t give any guaranty it works. It might destroy your database, server, office. So be careful!

Additionally the script uses unison to sync files on the filesystem

process of the sync:

  1. every server runs a MySQL backup locally at a specific time e.g. every day at 6.00PM
  2. a little bit later e.g. at 6:30PM the main server in the central office starts the sync script
    1. The sync script first sets his own and the remote software to „Maintenance Mode“ (a feature we have in our software, to prevent user to use it while the admin works :-] )
      For this matter I use the preExecutionSQLcommand in the script.
    2. then it runs trough every server the first time and syncs the database and the files
    3. after syncing every server once the script runs a second time to make sure every server gets also the changes from later sync servers.  If I sync server 1,2,3,4 the first time I need to sync 1,2,3 again to make sure the changes from 4 are everywhere.
    4. if the synchronization works the server disable the „Maintenance Mode“ again (postExecutionSQLcommand). If the synchronization fails the „Maintenance Mode“ will be still enabled, so the administrator would have the chance to fix the problem before somebody else have the chance to edit data and make the problem worse.
    5. the results are send by email to the administrators

simulate slow and bad connection:

I’m testing this solution between two virtual machines and for simulating bad and slow connection we can use the Linux „tc“ tool:

For bad or slow connections it might be helpful to increase the net_read_timeout and net_write_timeout settings of MySQL.

Kategorien:English Schlagwörter: , ,

Weltweiter Unterschriftenaktion gegen Übername von MySQL durch Oracle

Kategorien:Deutsch Schlagwörter:

MySQL durch Indexe beschleunigen

Ein Kunde hatte erhebliche Probleme mit der Performance seiner Webanwendung. Die Vermutung war dass es an der MySQL Datenbank liegt, was ein Blick in die Prozessliste auch bestätigte.

Um herrauszufinden welche SQL-Anfragen lange dauern empfiehlt es sich „log_slow_queries“ bei MySQL zu aktivieren. In der entsprechenden Log-Datei sieht man dann einige Analysen zu den langsamen Querys. Die „schuldigen“ Anfragen in diesem Fall dauerten bis zu 1min (2×3,2GHz CPU) und überprüften über 39000 Zeilen über verschiedene Tabellen.
Die Lösung brachten Indexe auf die häufig verwendete Spalten in den WHERE und ORDER Klauseln. Die Geschwindigkeit verbesserte sich dadurch um den Faktor 50.

Um zu sehen welche Indexe verwendet werden kann man vor einen SQL Statement ein „EXPLAIN“ setzen.

mysql> explain select * from jos_facileforms_subrecords WHERE id=80;
| id | select_type | table                      | type  | possible_keys | key     | key_len | ref   | rows | Extra |
|  1 | SIMPLE      | jos_facileforms_subrecords | const | PRIMARY       | PRIMARY | 4       | const |    1 |       |

„id“ ist PRIMARY KEY und deswegen auch automatische in Index.

mysql> explain select * from jos_facileforms_subrecords WHERE record=80;
| id | select_type | table                      | type | possible_keys | key  | key_len | ref  | rows | Extra       |
|  1 | SIMPLE      | jos_facileforms_subrecords | ALL  | NULL          | NULL | NULL    | NULL | 2353 | Using where |

„record“ ist kein Index. Das ändern wir mit:

mysql> ALTER TABLE `jos_facileforms_subrecords` ADD INDEX ( `record` );

Der nächste Aufruf zeigt uns dass ein index verwendet wird:

mysql> explain select * from jos_facileforms_subrecords WHERE record=80;
| id | select_type | table                      | type | possible_keys | key    | key_len | ref   | rows | Extra |
|  1 | SIMPLE      | jos_facileforms_subrecords | ref  | record        | record | 4       | const |    7 |       |

sehr gute weiterführende Infos gibt es z.B. hier:  Optimierung von MySQL-Abfragen: Verwendung des Index

Kategorien:Deutsch Schlagwörter: , ,
%d Bloggern gefällt das: