MySQL DBA Tips and Tricks: Semi Synchronous replication

This is a blogpost by Aaron http://blog.9minutesnooze.com/performance-mysql-replication-high-latency/

I have seen a few posts on DBA.SE (where I answer a lot of questions) recommending the use of semi-synchronous replication in MySQL 5.5 over a WAN as a way to improve the reliability of replication. My gut reaction was that this is a very bad idea with even a tiny write load, but I wanted to test it out to confirm. Please note that I do not mean to disparage the author of those posts, a user whom I have great respect for.

What is semi-synchronous replication?

The short version is that one slave has to acknowledge receipt of the binary log event before the query returns. The slave doesn’t have to execute it before returning control so it’s still an asynchronous commit. The net effect is that the slave should only miss a maximum of 1 event if the master has a catastrophic failure. It does not improve the reliability of replication itself or prevent data drift.

What about performance, though? Semi-synchronous replication causes the client to block until a slave has acknowledged that it has received the event. On a LAN with sub-millisecond latencies, this should not present much of a problem. But what if there is 85ms of latency between the master and the slave, as is the case between Virginia and California? My hypothesis is that, with 85ms of latency, it is impossible to get better than 11 write queries (INSERT/UPDATE/DELETE) per second – 1000ms / 85ms = 11.7.

Let’s test that out.

I spun up identical m1.small instances in EC2′s us-east-1 and us-west-1 regions using the latest Ubuntu 12.04 LTS AMI from Alestic and installed the latest Percona Server 5.5 from their apt repository.

gpg --keyserver hkp://keys.gnupg.net --recv-keys 1C4CBDCDCD2EFD2A

gpg -a --export CD2EFD2A | apt-key add -

echo deb http://repo.percona.com/apt precise main > /etc/apt/sources.list.d/percona.list

apt-get update

apt-get install percona-server-server-5.5 libmysqlclient-dev

Although mostly irrelevant to the benchmarks, my.cnf is configured thusly (basically identical to support-files/my-huge.cnf shipped with the distribution):

[mysqld]

port = 3306

socket = /var/run/mysql/mysql.sock

skip-external-locking

key_buffer_size = 384M

max_allowed_packet = 1M

table_open_cache = 512

sort_buffer_size = 2M

read_buffer_size = 2M

read_rnd_buffer_size = 8M

myisam_sort_buffer_size = 64M

thread_cache_size = 8

query_cache_size = 32M

thread_concurrency = 8

server-id = 1

log-bin=mysql-bin

binlog_format=mixed

innodb_data_home_dir = /var/lib/mysql

innodb_data_file_path = ibdata1:2000M;ibdata2:10M:autoextend

innodb_log_group_home_dir = /var/lib/mysql

innodb_buffer_pool_size = 384M

innodb_additional_mem_pool_size = 20M

innodb_log_file_size = 100M

innodb_log_buffer_size = 8M

innodb_flush_log_at_trx_commit = 0

innodb_lock_wait_timeout = 50

Then, I wrote a very simple Ruby script to perform 10k inserts into a table, using the sequel gem, which I love.

apt-get install rubygems

gem install sequel mysql2 --no-rdoc --no-ri

#!/usr/bin/env ruby

# insertperf.rb

require 'logger'

require 'rubygems'

require 'sequel'

logger = Logger.new(STDOUT)

localdb = "inserttest"

db = Sequel.connect( :database => localdb,

:adapter => 'mysql2',

:user => 'root',

:logger => logger )

db["DROP DATABASE IF EXISTS #{localdb}"].all

db["CREATE DATABASE #{localdb}"].all

db["CREATE TABLE IF NOT EXISTS #{localdb}.foo (

id int unsigned AUTO_INCREMENT PRIMARY KEY,

text VARCHAR(8)

) ENGINE=InnoDB"].all

n = 10000

t1 = Time.new

n.times do

value = (0...8).map{65.+(rand(25)).chr}.join

db["INSERT INTO #{localdb}.foo (text) VALUES (?)", value].insert

end

t2 = Time.new

elapsed = t2-t1

logger.info "Elapsed: #{elapsed} seconds. #{n/elapsed} qps"

With MySQL configured, let’s knock out a few INSERTs into the us-east-1 database, which has no slaves:

# w/ no slaves

...

INFO -- : (0.000179s) INSERT INTO test.foo (text) VALUES ('FKGDLOWD')

...

INFO -- : Elapsed: 9.37364 seconds. 1066.82142689499 qps

My control is roughly 1000 inserts/sec with each query taking less than .2ms.

Then, I set up a traditional, asynchronous MySQL slave on the server in us-west-1 and ran the test again:

# w/ traditional replication

...

INFO -- : (0.000237s) INSERT INTO test.foo (text) VALUES ('CVGAMLXA')

...

INFO -- : Elapsed: 10.601943 seconds. 943.223331798709 qps

Somewhat inexplicably, the performance was slightly worse with the slave attached, but not by much. ~950 inserts/sec

Next is semi-synchronous replication. First, I tested the latency between us-east-1 and us-west-1.

# ping -c 1 184.72.189.235

PING 184.72.189.235 (184.72.189.235) 56(84) bytes of data.

64 bytes from 184.72.189.235: icmp_req=1 ttl=52 time=85.5 ms

Latency between us-east-1 and us-west-1 is 85ms, so I still predict 11 inserts/sec at most, which means my script will take 15 minutes instead of 10 seconds:

I set up semi-synchronous replication like this:

master> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';

master> SET GLOBAL rpl_semi_sync_master_enabled = 1;

slave> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';

slave> SET GLOBAL rpl_semi_sync_slave_enabled = 1;

slave> STOP SLAVE; START SLAVE;

I started the script and, as predicted, each insert was taking approximately 85ms. There was a screaming 2 month old in the next room so in the interest of brevity, I reduced the count from 10k to 1k. That should take about 90s:

# w/ semi-sync replication

...

INFO -- : (0.086301s) INSERT INTO test.foo (text) VALUES ('JKIJTUDO')

...

INFO -- : Elapsed: 86.889529 seconds. 11.5088666207409 qps

Just as I suspected – 11 inserts/sec.

In conclusion, the speed of light is a bitch, so don’t enable semi-synchronous replication over wide area or high latency networks.

Hi All

I checked the presentation of Data charmer and Sheeri. It's awasome.

What’s New in MySQL 5.5 and 5.6 Replication

1. MySQL 5.5 supports Semi Synchronous replication

2. MySQL 5.6 supports delayed replication

3. Server UUID ( New: Added to ensure Global transaction ID )

4. Crash-Safe Slave

5. Multi-threaded slave (Most Awaited features)

6. Global transaction identifiers

MySQL 5.5 – Semi synchronous replication

It increases reliability of slaves by making sure that the changes of master are committed on at least one of the slaves before the write query is returned.

To use it, you will require to install the plug-in rpl_semi_sync_master on the master and rpl_semi_sync_slave on the slave. Then you can configure the master/slave to use the plug-in. You need to restart both the master and slave to activate the plug-in and configurations.

Once you restart you will see variables like rpl_semi_sync and you can see whether it’s enabled, what the timeout, trace level, etc are.

Rpl_semi_sync_master_no_trx (how many transactions didn’t go to the slaves) and Rpl_semi_sync_master_yes_trx (was it successfully sent) are status variables or counters on the master.

MySQL 5.6 – Delayed replication

It is part of the MySQL 5.6 that ships with MySQL, No need to install plug-ins. You need to execute below command after stopping slave.

CHANGE MASTER TO MASTER_DELAY=#seconds;

Show slave status shows you two parameters in output SQL_Delay and SQL_Remaining_Delay. Where SQL_Delay show total defined delay and SQL_remaining_delay shows the remaining delay.

UUID – Each master has a UUID in addition to the server_id. This is important for global transaction IDs, which makes sure that even if you change server_id’s the global transaction ID is still associated with the same server. So you can finally identify the each transaction on a server.

More crash-safe replication: Replication in MySQL 5.6 is crash-safe. It will make sure a statement is going to execute more than one time.

Execute below command
SHOW VARIABLES LIKE ‘%info%’ ;

Apart from these variables one more variables use is called slave_parallel_workers.

These variables help ensure that if the slave crashes, the relay log and master.info files are kept in sync and don’t try to retrieve or apply a statement more than once.

The crash-safe features are only for innodb, but MySQL ships the following tables with MyISAM engines.

1. slave_master_info

2. slave_relay_log_info

3. slave_worker_info

So when you start working with it, change engine of these tables to innodb using command

ALTER TABLE …ENGINE=INNODB

When you do SHOW SLAVE STATUS you’ll see Master_info_file: mysql.slave_master_info

When you will check these tables , you will get a lot of information. Table slave_master_info is most, but not all, of the same info in SHOW SLAVE STATUS.

Table slave_worker_info has 1 line for each worker you have set.

Multi-threaded slave aka parallel replication

1. It requires MySQL 5.6 in both master and slave (can’t do 5.5 master, 5.6 slave) to actually do parallel replication.

2. Parallel replication with a MySQL 5.5 master will slow down replication – so you can do parallel replication but it’s 3x slower (right now).

3. Data gets parallelized by the schema – so if you have 2 queries in the same schema, they can’t be done in parallel, but 2 queries in 2 different schemas, they can be done in parallel. If you only have one schema, don’t bother using parallel replication.

To use it configure the global dynamic parameter:

SET GLOBAL slave_parallel_workers=10; (or 3, or whatever, default is 0).

How do you choose the # of parallel workers to choose? MySQL is smart enough to figure out how to use the parallel workers around the schemas you have. In tests, parallel replication was 3-4x faster than regular replication.

Global transaction ID – If a master fails, all the data from the master cannot be accessed. So you have to figure out which slave is the most advanced one (in terms of time ), promote that slave to the master, and figure out which transactions the other slaves were missing. But it’s hard to know in the relay log, because it can be a different file/position in each slave. So the global transaction id uses a unique number, so you don’t have to worry about the relay log filename/position. To use the global transaction id, all machines in the cluster (master and slaves) should have following parameters in their configuration files.

log-bin
log-slave-updates
gtid-mode=ON
disable-gtd-unsafe-statemnets –tricky

Why tricky? This feature does not work on non-transactional tables, like MyISAM. If you try to remove the anonymous user from the mysql.user table and you have this set, it’s not safe so it doesn’t work and you get a master error. GRANT, REVOKE, DROP USER will work, but DELETE, INSERT, etc don’t work.

CREATE TABLE…SELECT does not work with this feature, so don’t turn it on if you use that.

After changing the variables in the mysql config file, restart the servers to pick up the changes.

You then see the

@@SESSION.GTID_NEXT=’44556A96-8417-11E1-9589-2BD5ACDD51FD:1′
After the : is the transaction ID number (1 in this case), everything before it is the server’s UUID. Using this information you can more easily find which transactions in the binary log you need.

There are 2 more lines in SHOW SLAVE STATUS\G:
Retrieved_Gtid_Set:1-100
Executed_Gtid_Set:1-100

The show slave output will be like below now.

MySQL DBA Tips and Tricks

Friday, July 27, 2012

Performance of MySQL Semi-Synchronous Replication

Friday, May 18, 2012

What’s New in MySQL 5.5 and 5.6

Hi All

I checked the presentation of Data charmer and Sheeri. It's awasome.

What’s New in MySQL 5.5 and 5.6 Replication

Blog Archive

Followers