At work we’re using MySQL (both 5.0 and 5.1 at this time) to provide data that’s regularly updated. We have two physical sites, with each site having one master and three slaves behind it, like so:
_____ _____ | S1M |----------------------------| S2M | |------|_____|-----| |-----|_____|------| ___|__ ___|__ __|___ ___|__ __|___ __|___ | S1S1 | | S1S2 | | S1S3 | | S2S1 | | S2S2 | | S2S3 | |______| |______| |______| |______| |______| |______|
Multi-Master? Then the Masters are also Slaves!
It’s okay if you have a “duh” moment with that concept. The “master-slave” relationship is very easy to understand; but if you have two systems that will be receiving updates in some way, but also be in sync, then both need to be masters and slaves of each other. There’s a gotcha though that I stumbled over when I was working on the slaves to a particular master, and hopefully I’ll be able to help you avoid that issue. Also, it helps if you’re talking the same version of MySQL here, as there is a minor gotcha with moving between 5.0 and 5.1: Avoid trying to go “back” from 5.1 to 5.0, but going forwards will require an additional command. Also, avoid using InnoDB tables; it’s easier to use MyISAM.
First, basic configuration:
[mysqld] datadir=/var/lib/mysql socket=/var/lib/mysql/mysql.sock user=mysql auto-increment-increment = 2 auto-increment-offset = 2 # should be 1 on the other master! server-id=(unique ID on all servers involved) log-bin=mysqld-bin expire_logs_days=1 # required for master-slave-slave updates to propagate log-slave-updates [mysqld_safe] log-error=/var/log/mysqld.log pid-file=/var/run/mysqld/mysqld.pid
Some notes: log-bin=mysqld-bin is required on the masters for replication to work. auto-increment-increment=2 just tells a master to skip a number when auto-incrementing INSERT statements. To avoid collisions and other pain, you should, on one master, use auto-increment-offset=1 so it INSERTs only odd IDs, and on the other set it to “2” so it uses the evens. Without this, a master may attempt to insert a row and give it an ID that the other master has already used.
Next, make sure that the server-id is specified, and between all masters and slaves, that it is unique.
The thing I tripped over was the log-slave-updates. I set up the second master, but the first was set up by a more intelligent individual … so I missed it when setting up the second one. When you finally have all your masters and slaves set up, if one of the masters doesn’t have this setting, you’ll find that your slaves under that master will not see updates/inserts from the other master. In our case this was a bit of a disaster and required redoing the three slaves under the misconfigured master after the configuration was corrected.
If you intend to use SSL, you’ll want to include the appropriate commands, such as:
sl-ca=/etc/pki/tls/certs/ca-bundle.crt ssl-cert=/etc/pki/tls/certs/(FQDN).crt ssl-key=/etc/pki/tls/private/(FQDN).key ssl-cipher=DHE-RSA-AES256-SHA
So, let’s assume you have a master, and that it already has data. Setting up its slaves and the other master are mostly identical:
- Ensure no changes are occurring.
- Create the slave account:
GRANT REPLICATION SLAVE ON *.* TO '(replication account name)'@(slave FQDN or IP) IDENTIFIED BY '(replication password)';
- You can include “REQUIRE SSL” at the end if you are going to use it. Also, you’ll need to “FLUSH PRIVILEGES;”.
- Run the command (and observe the following output):
mysql> show master status; +-------------------+----------+--------------+------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | +-------------------+----------+--------------+------------------+ | mysqld-bin.000007 | 98 | (db),mysql | | +-------------------+----------+--------------+------------------+ 1 row in set (0.00 sec)
You’ll need the file name and position for a later command on the slaves.
- Next, copy your data over. There are several ways to do this: tar, scp, detar; rsync; or mysqldump, copy dump, import. If you go for the direct files, be aware that with MyISAM databases, if you copy the raw MySQL files, you only need to worry about everything in the DB directory (for us it’s in /var/lib/mysql/[db]). Oh, and assuming full master-slave, copy the mysql database too. If you use InnoDB tables, your best bet may be to shut down the master, copy the entire /var/lib/mysql directory to the (shut down) slave, and then restart mysql.
- You can restart your master at this point, and also restart slave(s). On a given slave:
mysql> change master to master_host='(master server FQDN or IP)', master_user='(replication account name)', master_password='(replication password)', master_log_file='(filename from show master status; on master)', master_log_pos=(position from pshow master status; on master), MASTER_SSL=1, MASTER_SSL_CA='/etc/pki/tls/certs/ca-bundle.crt';
The SSL options are only required if you are going to use SSL and have the certs set up.
mysql> start slave; Query OK, 0 rows affected (0.00 sec) mysql> show slave status;
For that last command, you’re looking for the Slave_IO_State column to show “Waiting for master to send event”. If it does not, take a look at your mysql log file.
Of course, that’s great for the slaves, but what about the other master? Simple: treat it as a slave to start, then create the replication account on the second master per #2 above. Assuming still no changes going on, do #3 on the second master. Then, on the first master, make it a slave per #5 above.
Creating the second set of slaves is essentially the same as above. The big key is to ensure that no changes are occurring on either master at steps 3 & 4.
So, how do I know if my masters and slaves are truly in sync?
Actually, take a look at the CHECKSUM TABLE MySQL command. I hacked together (meaning the code works perfectly fine but I don’t consider it worthy enough to include here) a Nagios check that uses this command on a master and slave. Note that if the check runs at just the wrong time, there’s a possibility of the checksums being different. If you’re building your tables from scratch, I would strongly recommend that you create them with CHECKSUM=1 as an option to speed up the query.
Here’s a script I wrote to test that:
#!/bin/sh /bin/rm -f /usr/local/var/mysql/(db-to-check).sql /usr/bin/mysqldump --skip-extended-insert --database (db-to-check) \ -u root --password=(DANGER! PW IN PLAIN TEXT!) | \ grep INSERT > /usr/local/var/mysql/(db-to-check).sql /usr/bin/md5sum /usr/local/var/mysql/(db-to-check).sql “–skip-extended-insert” puts each row on it’s own INSERT line, which makes life easier if you run into issues and want to diff or sdiff a couple of files and compare them. The “grep INSERT” is used as there can be comments or other data in the raw mysqldump of a database that will never match up between a master and a slave … but at the very least, the collection of INSERT lines should match. Run this script on a master and a slave at roughly the same time (assuming no update or insert has occurred between the two executions) and they should spit out the same MD5 signature. If they don’t … well, that’s how I figured out I was missing updates on slaves from the “other” master.
Please note: If you have a MySQL 5.0 master and copy the data to a MySQL 5.1 slave, after copying your db and mysql data to the slave and restarting MySQL on the slave, run:
mysql_fix_privilege_tables --user=root --password=(your root pass) --verbose
It will dump a lot of info that’s probably not that important. 5.1 includes additional info tables that are not there in 5.0. Failure to run this will cause error messages somewhere along the line (probably at the “CHANGE MASTER” stage).