Friday, July 27, 2012

Performance of MySQL Semi-Synchronous Replication

This is a blog post by Aaron: http://blog.9minutesnooze.com/performance-mysql-replication-high-latency/

I have seen a few posts on DBA.SE (where I answer a lot of questions) recommending the use of semi-synchronous replication in MySQL 5.5 over a WAN as a way to improve the reliability of replication. My gut reaction was that this is a very bad idea with even a tiny write load, but I wanted to test it out to confirm. Please note that I do not mean to disparage the author of those posts, a user for whom I have great respect.

What is semi-synchronous replication?

The short version is that one slave has to acknowledge receipt of the binary log event before the query returns. The slave doesn’t have to execute it before returning control so it’s still an asynchronous commit. The net effect is that the slave should only miss a maximum of 1 event if the master has a catastrophic failure. It does not improve the reliability of replication itself or prevent data drift.

What about performance, though? Semi-synchronous replication causes the client to block until a slave has acknowledged that it has received the event. On a LAN with sub-millisecond latencies, this should not present much of a problem. But what if there is 85ms of latency between the master and the slave, as is the case between Virginia and California? My hypothesis is that, with 85ms of latency, it is impossible to get better than about 11 write queries (INSERT/UPDATE/DELETE) per second (1000ms / 85ms = 11.7).
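As a sanity check on that arithmetic, here is a minimal Ruby sketch; the only input is the 85ms round-trip time measured later in the post:

```ruby
# Back-of-the-envelope ceiling on single-threaded write throughput when
# every commit blocks for one network round trip to the slave.
rtt_ms  = 85.0                 # measured us-east-1 <-> us-west-1 round trip
max_qps = 1000.0 / rtt_ms      # at most one acknowledged commit per RTT
puts format("max writes/sec: %.1f", max_qps)   # prints "max writes/sec: 11.8"
```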

Let’s test that out.

I spun up identical m1.small instances in EC2's us-east-1 and us-west-1 regions using the latest Ubuntu 12.04 LTS AMI from Alestic and installed the latest Percona Server 5.5 from their apt repository.

gpg --keyserver hkp://keys.gnupg.net --recv-keys 1C4CBDCDCD2EFD2A

gpg -a --export CD2EFD2A | apt-key add -

echo deb http://repo.percona.com/apt precise main > /etc/apt/sources.list.d/percona.list

apt-get update

apt-get install percona-server-server-5.5 libmysqlclient-dev

Although mostly irrelevant to the benchmarks, my.cnf is configured thusly (basically identical to support-files/my-huge.cnf shipped with the distribution):

[mysqld]

port          = 3306

socket          = /var/run/mysql/mysql.sock

skip-external-locking

key_buffer_size = 384M

max_allowed_packet = 1M

table_open_cache = 512

sort_buffer_size = 2M

read_buffer_size = 2M

read_rnd_buffer_size = 8M

myisam_sort_buffer_size = 64M

thread_cache_size = 8

query_cache_size = 32M

thread_concurrency = 8



server-id     = 1

log-bin=mysql-bin

binlog_format=mixed



innodb_data_home_dir = /var/lib/mysql

innodb_data_file_path = ibdata1:2000M;ibdata2:10M:autoextend

innodb_log_group_home_dir = /var/lib/mysql

innodb_buffer_pool_size = 384M

innodb_additional_mem_pool_size = 20M

innodb_log_file_size = 100M

innodb_log_buffer_size = 8M

innodb_flush_log_at_trx_commit = 0

innodb_lock_wait_timeout = 50

Then, I wrote a very simple Ruby script to perform 10k inserts into a table, using the sequel gem, which I love.

apt-get install rubygems

gem install sequel mysql2 --no-rdoc --no-ri



#!/usr/bin/env ruby

# insertperf.rb



require 'logger'

require 'rubygems'

require 'sequel'



logger = Logger.new(STDOUT)

localdb = "inserttest"



db = Sequel.connect( :database => localdb,

                     :adapter  => 'mysql2',

                     :user     => 'root',

                     :logger   => logger )



db["DROP DATABASE IF EXISTS #{localdb}"].all

db["CREATE DATABASE #{localdb}"].all

db["CREATE TABLE IF NOT EXISTS #{localdb}.foo (

  id int unsigned AUTO_INCREMENT PRIMARY KEY,

  text VARCHAR(8)

) ENGINE=InnoDB"].all



n = 10000

t1 = Time.new

n.times do

  value = (0...8).map{65.+(rand(25)).chr}.join

  db["INSERT INTO #{localdb}.foo (text) VALUES (?)", value].insert

end

t2 = Time.new

elapsed = t2-t1

logger.info "Elapsed: #{elapsed} seconds. #{n/elapsed} qps"

With MySQL configured, let’s knock out a few INSERTs into the us-east-1 database, which has no slaves:

# w/ no slaves

...

INFO -- : (0.000179s) INSERT INTO test.foo (text) VALUES ('FKGDLOWD')

...

INFO -- : Elapsed: 9.37364 seconds. 1066.82142689499 qps

My control is roughly 1000 inserts/sec with each query taking less than .2ms.

Then, I set up a traditional, asynchronous MySQL slave on the server in us-west-1 and ran the test again:

# w/ traditional replication

...

INFO -- : (0.000237s) INSERT INTO test.foo (text) VALUES ('CVGAMLXA')

...

INFO -- : Elapsed: 10.601943 seconds. 943.223331798709 qps

Somewhat inexplicably, the performance was slightly worse with the slave attached, but not by much: ~950 inserts/sec.

Next is semi-synchronous replication. First, I tested the latency between us-east-1 and us-west-1.

# ping -c 1 184.72.189.235

PING 184.72.189.235 (184.72.189.235) 56(84) bytes of data.

64 bytes from 184.72.189.235: icmp_req=1 ttl=52 time=85.5 ms

Latency between us-east-1 and us-west-1 is 85ms, so I still predict 11 inserts/sec at most, which means my script would take 15 minutes instead of 10 seconds.

I set up semi-synchronous replication like this:

master> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';

master> SET GLOBAL rpl_semi_sync_master_enabled = 1;

slave> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';

slave> SET GLOBAL rpl_semi_sync_slave_enabled = 1;

slave> STOP SLAVE; START SLAVE;
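One caveat worth checking after this setup: if the master stops receiving acknowledgments from any semi-sync slave, it silently falls back to asynchronous replication after rpl_semi_sync_master_timeout (10 seconds by default in 5.5). You can verify that semi-sync is actually in effect with:

```sql
-- On the master: should report ON while a semi-sync slave is connected
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_status';
-- On the slave:
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_slave_status';
```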

I started the script and, as predicted, each insert was taking approximately 85ms. There was a screaming two-month-old in the next room, so in the interest of brevity I reduced the count from 10k to 1k. That should take about 90s:

# w/ semi-sync replication

...

INFO -- : (0.086301s) INSERT INTO test.foo (text) VALUES ('JKIJTUDO')

...

INFO -- : Elapsed: 86.889529 seconds. 11.5088666207409 qps

Just as I suspected – 11 inserts/sec.

In conclusion, the speed of light is a bitch, so don't enable semi-synchronous replication over wide-area or high-latency networks.

Friday, July 20, 2012

Useful UNIX Command For Oracle DBA

How to kill all similar processes with single command (in this case opmn)

ps -ef | grep opmn | grep -v grep | awk '{print $2}' | xargs -i kill -9 {}

Locating Files under a particular directory

find . -print |grep -i test.sql

Using AWK in UNIX

To remove a specific column of output from a UNIX command – for example to determine the UNIX process Ids for all Oracle processes on server (second column)

ps -ef |grep -i oracle |awk '{ print $2 }'

Changing the standard prompt for Oracle Users

Edit the .profile for the oracle user

PS1="`hostname`*$ORACLE_SID:$PWD>"

Display top 10 CPU consumers using the ps command

/usr/ucb/ps auxgw | head -11

Show number of active Oracle dedicated connection users for a particular ORACLE_SID

ps -ef | grep $ORACLE_SID|grep -v grep|grep -v ora_|wc -l

Display the number of CPUs in Solaris

psrinfo -v | grep "Status of processor"|wc -l

Display the number of CPUs in AIX

lsdev -C | grep Process|wc -l

Display RAM Memory size on Solaris

prtconf |grep -i mem

Display RAM memory size on AIX

First determine name of memory device

lsdev -C |grep mem

then assuming the name of the memory device is ‘mem0’

lsattr -El mem0

Swap space allocation and usage

Solaris : swap -s or swap -l

Aix : lsps -a

Total number of semaphores held by all instances on server

ipcs -as | awk '{sum += $9} END {print sum}'

View allocated RAM memory segments

ipcs -pmb

Manually deallocate shared memory segments

ipcrm -m <shmid>

Show mount points for a disk in AIX

lspv -l hdisk13

Display amount of occupied space (in KB) for a file or collection of files in a directory or sub-directory

du -ks * | sort -n | tail

Display total file space in a directory

du -ks .

Cleanup any unwanted trace files more than seven days old

find . -name "*.trc" -mtime +7 -exec rm {} \;

Locate Oracle files that contain certain strings

find . -print | xargs grep rollback

Locate recently created UNIX files (in the past one day)

find . -mtime -1 -print

Finding large files on the server (more than 100MB in size; find's -size unit is a 512-byte block, so 100MB is 204800 blocks)

find . -size +204800 -print

Crontab :

To submit a task every Tuesday (day 2) at 2:45PM (crontab fields are minute, hour, day-of-month, month, day-of-week)

45 14 * * 2 /opt/oracle/scripts/tr_listener.sh > /dev/null 2>&1

To submit a task to run every 15 minutes on weekdays (days 1-5)

0,15,30,45 * * * 1-5 /opt/oracle/scripts/tr_listener.sh > /dev/null 2>&1

To submit a task to run every hour at 15 minutes past the hour on weekends (days 6 and 0)

15 * * * 0,6 /opt/oracle/scripts/tr_listener.sh > /dev/null 2>&1


How to scale inserts in MySQL with Innodb Engine

Usually we think/expect that inserts with a large number of threads will be faster, but this is not always true (due to MySQL limitations). The bottleneck might be the CPU, the IO controller, or the OS, and MySQL cannot always fully use the available cores/CPUs in a server; e.g. MySQL 5.0 (InnoDB) is effectively limited to 4 cores. We were working on scaling inserts: we have two data processing clusters, each of which uses 50 threads, so 100 threads in total inserting data into a MySQL database (version 5.0.51). The issue: inserts were delayed by minutes and the backlog continued to grow. After examining the InnoDB monitor status, we found a long list of transactions waiting for the AUTO-INC lock. For example:


---TRANSACTION 0 685590465, ACTIVE 10 sec, process no 8457, OS thread id 1169045824 setting auto-inc lock
------- TRX HAS BEEN WAITING 9 SEC FOR THIS LOCK TO BE GRANTED:
TABLE LOCK table `alertxxxx/alertdata` trx id 0 685590474 lock mode AUTO-INC waiting

Why AUTO-INC lock?

When accessing the auto-increment counter, the InnoDB engine uses a special table-level AUTO-INC lock that it keeps until the end of the current SQL statement, not the end of the transaction. This effectively serializes all inserts into the same table. With single-row inserts it is normally not too bad, but it can prevent scalability with multiple threads inserting (Bug #16979). However, we can get better throughput (inserts per second) with a smaller number of threads. After initially dropping the number of threads on both clusters to 20 sessions each, the problem almost disappeared; when we further reduced them to 10 sessions each, the problem disappeared entirely!

Beginning with MySQL 5.1.22, a new locking model was introduced for handling InnoDB auto-increment. There is a good article that talks about this on the MySQL Site and Blog.
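For reference, the new model is controlled by the innodb_autoinc_lock_mode startup variable (a 5.1.22+ option, so it does not help the 5.0.51 servers above); a sketch of the setting:

```ini
[mysqld]
# 0 = traditional: table-level AUTO-INC lock held for the whole statement
# 1 = consecutive: the default; simple inserts use a lightweight mutex
# 2 = interleaved: no AUTO-INC lock at all, best insert concurrency, but
#     only safe with row-based or mixed binary logging
innodb_autoinc_lock_mode = 1
```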

Similarly, if you want to achieve fast insert performance, it can be worthwhile to load files instead of running the inserts one by one: it is 3 to 4 times faster. If the goal is to load a huge amount of data that is already known at that time, it is probably the best option.
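As a sketch of the bulk-load approach (the file path, table, and column names here are hypothetical):

```sql
-- Load a whole CSV file in one statement instead of row-by-row INSERTs
LOAD DATA INFILE '/tmp/alertdata.csv'
INTO TABLE alertdata
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(created_at, severity, message);
```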

Optimization: we can't use file loading for all projects; some need to insert data row by row. For those, we can optimize the following variables and configuration.

1. Optimize database structure - Design your tables to minimize their space on disk. This can result in huge improvements by reducing the amount of data written to and read from disk. Smaller tables normally require less main memory while their contents are being actively processed during query execution, and any space reduction in table data also results in smaller indexes that can be processed faster. Ensure columns have the right data types and require the least amount of storage; try to use the most precise type in all cases. For example, if an integer column is used for values in the range from 1 to 99999, MEDIUMINT UNSIGNED is the best type. For more information about optimizing database structure, see the MySQL Site. If you store large strings (TEXT) or BLOBs, compression may help.
Use the PROCEDURE ANALYSE() function to check the optimal data type for each column of a table. But before changing a data type, consider your future plans, so the change does not create a problem in the near future.
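The PROCEDURE ANALYSE() check mentioned above can be run like this (the table name is hypothetical); each output row suggests an optimal type for one column based on the values actually stored:

```sql
SELECT * FROM alertdata PROCEDURE ANALYSE();
```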



2. Innodb flush method - e.g. O_DIRECT, if used, can help avoid double buffering between the InnoDB buffer pool and the operating system's filesystem cache. The MySQL reference manual explains this at the MySQL Site.

[Warning]
           O_DIRECT serializes writes on ext3. However, the impact can be lowered by using innodb_file_per_table.

3. Innodb thread concurrency - Keep the value of this variable low (the default of 8 is OK); the correct value depends on the environment and workload. This option variable is explained at the MySQL Site.

4. Innodb buffer pool - InnoDB maintains a buffer pool for caching data and indexes in memory. Making the pool larger can improve performance by reducing the amount of disk I/O needed; here is the detail about it.

5. Innodb log file size - The larger the value, the less checkpoint flush activity is needed in the buffer pool, saving disk I/O. But larger log files also mean that recovery is slower in case of a crash. As of MySQL 5.5, recovery times and the whole log-flushing algorithm have been greatly improved, so in 5.5 you generally want larger log files.

6. Filesystem selection and disk issues - 'xfs' is known to perform very well with MySQL. Also, writing the redo logs, binary logs, and data files to different physical disks is good practice, with a bigger gain than server configuration tuning. RAID 10 is recommended for best performance; more detail about disk issues can be found at the MySQL Site.
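Pulling points 2-5 together, a hypothetical my.cnf fragment might look like the following; every value here is an example and must be tuned for the actual workload and hardware:

```ini
[mysqld]
innodb_flush_method       = O_DIRECT  # avoid double buffering (point 2)
innodb_file_per_table     = 1         # lessens O_DIRECT serialization on ext3
innodb_thread_concurrency = 8         # keep low; the default is usually fine (point 3)
innodb_buffer_pool_size   = 2G        # large enough for hot data and indexes (point 4)
innodb_log_file_size      = 256M      # larger = less checkpoint flushing (point 5)
```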

Thursday, July 19, 2012

Replicate a Database with different Name in MySQL

It happened to me for the first time, so I was very excited about it, until I noticed that this option has been with us from nearly the very beginning (i.e. MySQL 4). We needed to replicate a database under a different name. I was thinking it would be a big task :) but finally got the easy solution, and it is working now. :)

The Situation
Master Server contains db1
Slave Server contains db1 and db2
Slave Server needs db1 from the Master to replicate to db2

The Solution
On my.cnf on the slave server:
--replicate-rewrite-db=from_name->to_name

Thus, this situation would dictate the following:
--replicate-rewrite-db=db1->db2
Easy!

In replication there may be scenarios like the ones below.
Part 1
What if the Master Server contained db1 and db2; and
the Slave needs only db1 from the Master to replicate to db2 on the Slave?

Part 2
We don't want to replicate both db1 and db2 to this slave, although we want to replicate both databases to another slave - this means that the filter for databases should reside on my.cnf on the slave...

Assuming that we are using replicate-rewrite-db=db1->db2:
Using replicate-do-db on the my.cnf file on the slave, which database would you use with that command?

db1 or db2?  It only takes a moment in time to find out... :)