Optimizing PXC Xtrabackup State Snapshot Transfer

State Snapshot Transfer (SST) at a glance

PXC uses a protocol called State Snapshot Transfer to provision a node joining an existing cluster with all the data it needs to synchronize.  This is analogous to cloning a slave in asynchronous replication:  you take a full backup of one node and copy it to the new one, while tracking the replication position of the backup.

PXC automates this process using scriptable SST methods.  The most common of these methods is the xtrabackup-v2 method which is the default in PXC 5.6.  Xtrabackup generally is more favored over other SST methods because it is non-blocking on the Donor node (the node contributing the backup).

The basic flow of this method is:

  • The Joiner:
    • joins the cluster
    • Learns it needs a full SST and clobbers its local datadir (the SST will replace it)
    • prepares for a state transfer by opening a socat listener on port 4444 (by default)
    • The socat pipes the incoming files into the datadir/.sst directory (roughly as sketched after this list)
  • The Donor:
    • is picked by the cluster (could be configured or be based on WAN segments)
    • starts a streaming Xtrabackup and pipes the output of that via socat to the Joiner on port 4444.
    • Upon finishing its backup, sends an indication of this along with the final Galera GTID of the backup to the Joiner
  • The Joiner:
    • Records all changes from the GTID of the Donor’s backup forward in its gcache (and overflow pages; this is limited by available disk space)
    • runs the --apply-log phase of Xtrabackup on the received backup
    • Moves the datadir/.sst directory contents into the datadir
    • Starts mysqld
    • Applies all the transactions it needs (Joining and Joined states just like IST does it)
    • Moves to the ‘Synced’ state and is done.
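
Under the hood, the xtrabackup-v2 script essentially wires a streaming backup on the Donor into a listener on the Joiner.  A rough, simplified sketch of the two sides (assuming the default port 4444, xbstream streaming, and a placeholder joiner_ip; the real script also handles authentication, optional encryption/compression, and the GTID handoff):

# On the Joiner (simplified): listen on 4444 and unpack the stream into datadir/.sst
socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x -C /var/lib/mysql/.sst/

# On the Donor (simplified): stream the backup and push it to the Joiner
innobackupex --stream=xbstream /tmp | socat -u stdio TCP:joiner_ip:4444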

There are a lot of moving pieces here, and nothing is really tuned by default.  On larger clusters, SST can be quite scary because it may take hours or even days.  Any failure can mean starting over from the beginning.

This blog will concentrate on some ways to make a good dent in the time SST can take.  Many of these methods are trade-offs and may not apply to your situation.  Further, there may be other ways I haven’t thought of to speed things up, so please share what you’ve found that works!

The Environment

I am testing SST on a PXC 5.6.24 cluster in AWS.  The nodes are c3.4xlarge and the datadirs are RAID-0 over the two ephemeral SSD drives in that instance type.  These instances are all in the same region.

My simulated application is using only node1 in the cluster and is sysbench OLTP with 200 tables with 1M rows each.  This comes out to just under 50G of data.  The test application runs on a separate server with 32 threads.

The PXC cluster itself is tuned to best practices for InnoDB and Galera performance.

Baseline

In my first test the cluster is a single member (receiving workload) and I am  joining node2.  This configuration is untuned for SST.  I measured the time from when mysqld started on node2 until it entered the Synced state (i.e., fully caught up).  In the log, it looked like this:

150724 15:59:24 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
... lots of other output ...
2015-07-24 16:48:39 31084 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 4647341)

Doing some math on the above, we find that the SST took roughly 49 minutes to complete.

--use-memory

One of the first things I noticed was that the --apply-log step on the Joiner was very slow.  Anyone who uses Xtrabackup a lot will know that --apply-log will be a lot faster if you give it some extra RAM to use while making the backup consistent via the --use-memory option.  We can set this in our my.cnf like this:

[sst]
inno-apply-opts="--use-memory=20G"

The [sst] section is a special one understood only by the xtrabackup-v2 script.  inno-apply-opts allows me to specify arguments to innobackupex when it runs.

Note that this change only applies to the Joiner (i.e., you don’t have to put it on all your nodes and restart them to take advantage of it).

This change immediately makes a huge improvement to our above scenario (node2 joining node1 under load) and the SST now takes just over 30 minutes.

wsrep_slave_threads

Another slow part of getting to Synced is how long it takes to apply transactions up to realtime after the backup is restored and in place on the Joiner.  We can improve this throughput by increasing the number of apply threads on the Joiner to make better use of the CPU.  Prior to this, wsrep_slave_threads was set to 1, but if I increase it to 32 (there are 16 cores on this instance type), my SST now takes 25m 32s.
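
For reference, a minimal sketch of how this can be set in my.cnf (32 is just the value from my test; start from something in the neighborhood of your core count and measure):

[mysqld]
wsrep_slave_threads = 32

Since the variable is dynamic, you can also experiment at runtime with SET GLOBAL wsrep_slave_threads = 32; and adjust it without a restart.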

Compression

xtrabackup-v2 supports adding a compression process into the datastream.  On the Donor it compresses and on the Joiner it decompresses.  This allows you to trade CPU for transfer speed.  If your bottleneck turns out to be network transport and you have spare CPU, this can help a lot.

Further, I can use pigz instead of gzip to get parallel compression, but in principle any compression utility can work as long as it can compress and decompress standard input to standard output.  I install the ‘pigz’ package on all my nodes and change my my.cnf like this:

[sst]
inno-apply-opts="--use-memory=20G"
compressor="pigz"
decompressor="pigz -d"

Both the Joiner and the Donor must have the respective decompressor and compressor settings or the SST will fail with a vague error message (not actually having pigz installed will do the same thing).
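
A quick sanity check before relying on this (assuming pigz is in the PATH) is to confirm the round trip works on both the Donor and the Joiner:

$ echo "sst test" | pigz | pigz -d
sst test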

By adding compression, my SST is down to 21 minutes, but there’s a catch.  My application performance starts to take a serious nose-dive during this test.  Pigz is consuming most of the CPU on my Donor, which is also my primary application node.  This may or may not hurt your application workload in the same way, but it emphasizes the importance of understanding (and measuring) the performance impact SST has on your Donor nodes.

Dedicated donor

To alleviate the problem with the application, I now leave node2 up and spin up node3.  Since I’m expecting node2 to normally not be receiving application traffic directly, I can configure node3 to prefer node2 as its donor like this:

[mysqld]
...
wsrep_sst_donor = node2,

When node3 starts, this setting instructs the cluster that node3 is the preferred donor, but if that’s not available, pick something else (that’s what the trailing comma means).
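
You can also list more than one preferred donor.  For example, something like the following (node names are just illustrative) tells the cluster to try node2 first, then node3, and finally fall back to any other node because of the trailing comma:

[mysqld]
wsrep_sst_donor = node2,node3,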

Donor nodes are permitted to fall behind in replication apply as needed without sending flow control.  Sending application traffic to such a node may see an increase in the amount of stale data as well as certification failures for writes (not to mention the performance issues we saw above with node1).  Since node2 is not getting application traffic, moving into the Donor state and doing an expensive SST with pigz compression should be relatively safe for the rest of the cluster (in this case, node1).

Even if you don’t have a dedicated donor, if you use a load balancer of some kind in front of your cluster, you may elect to consider Donor nodes as failing their health checks so application traffic is diverted during any state transfer.
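
One common way to wire this up (a sketch, assuming the clustercheck script that ships with PXC is exposed via xinetd on port 9200; addresses are placeholders) is an HAProxy HTTP check, since clustercheck returns a 503 when a node is not Synced, which includes the Donor/Desynced state unless you explicitly allow it:

# haproxy.cfg (sketch)
backend pxc_backend
    option httpchk GET /
    server node1 10.0.0.1:3306 check port 9200
    server node2 10.0.0.2:3306 check port 9200
    server node3 10.0.0.3:3306 check port 9200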

When I brought up node3, with node2 as the donor, the SST time dropped to 18m 33s.

Conclusion

Each of these tunings helped the SST speed, though the later adjustments arguably had less of a direct impact.  Depending on your workload, database size, network and CPU available, your mileage may of course vary, and your tunings should vary accordingly.  Also realize that you may actually want to limit (and not increase) the speed of state transfers in some cases to avoid other problems.  For example, I’ve seen several clusters become unstable during SST, and the only explanation was the amount of network bandwidth consumed by the state transfer crowding out the actual Galera communication between the nodes.  Be sure to consider the overall state of production when tuning your SSTs.

Percona XtraDB Cluster: Quorum and Availability of the cluster

Percona XtraDB Cluster (PXC) has become a popular option to provide high availability for MySQL servers. However, many people still have a hard time understanding what will happen to the cluster when one or several nodes leave the cluster (gracefully or ungracefully). This is what we will clarify in this post.

Nodes leaving gracefully

Let’s assume we have a 3-node cluster and all nodes have an equal weight, which is the default.

What happens if Node1 is gracefully stopped (service mysql stop)? When shutting down, Node1 will instruct the other nodes that it is leaving the cluster. We now have a 2-node cluster and the remaining members have 2/2 = 100% of the votes. The cluster keeps running normally.

What happens now if Node2 is gracefully stopped? Same thing, Node3 knows that Node2 is no longer part of the cluster. Node3 then has 1/1 = 100% of the votes and the 1-node cluster can keep on running.

In these scenarios, there is no need for a quorum vote as the remaining node(s) always know what happened to the nodes that are leaving the cluster.

Nodes becoming unreachable

On the same 3-node cluster with all 3 nodes running, what happens now if Node1 crashes?

This time Node2 and Node3 must run a quorum vote to determine whether it is safe to continue: they have 2/3 of the votes, 2/3 is > 50%, so the remaining 2 nodes have quorum and they keep on working normally.

Note that the quorum vote does not happen immediately when Node2 and Node3 lose contact with Node1. It only happens after the ‘suspect timeout’ (evs.suspect_timeout), which is 5 seconds by default. Why? It allows the cluster to be resilient to short network failures, which can be quite useful when operating the cluster over a WAN. The tradeoff is that if a node crashes, writes are stalled during the suspect timeout.
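
If needed, this timeout can be tuned through the provider options, for instance (the value here is only an illustration; remember that all provider options you set must share the single semicolon-separated wsrep_provider_options string):

[mysqld]
wsrep_provider_options = "evs.suspect_timeout=PT15S"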

Now what happens if Node2 also crashes?

Again a quorum vote must be performed. This time Node3 has only 1/2 of the votes: this is not > 50% of the votes. Node3 doesn’t have quorum, so it stops processing reads and writes.

If you look at the wsrep_cluster_status status variable on the remaining node, it will show NON_PRIMARY. This indicates that the node is not part of the Primary Component.
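
You can check this on any node with:

mysql> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';

A node that is part of the Primary Component reports Primary here, while the isolated node in this scenario reports NON_PRIMARY.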

Why does the remaining node stop processing queries?

This is a question I often hear: after all, MySQL is up and running on Node3 so why is it prevented from running any query? The point is that Node3 has no way to know what happened to Node2:

  • Did it crash? In this case, it is safe for the remaining node to keep on running queries.
  • Or is there a network partition between the two nodes? In this case, it is dangerous to process queries because Node2 might also process other queries that will not be replicated because of the broken network link: the result will be two divergent datasets. This is a split-brain situation, and it is a serious issue as it may be impossible to later merge the two datasets. For instance if the same row has been changed in both nodes, which row has the correct value?

Quorum votes are not held because it’s fun, but only because the remaining nodes have to talk together to see if they can safely proceed. And remember that one of the goals of Galera is to provide strong data consistency, so any time the cluster does not know whether it is safe to proceed, it takes a conservative approach and it stops processing queries.

In such a scenario, the status of Node3 will be set to NON_PRIMARY and a manual intervention is needed to re-bootstrap the cluster from this node by running:

SET GLOBAL wsrep_provider_options='pc.bootstrap=YES';

A side question: it is now clear why writes should be forbidden in this scenario, but what about reads? Couldn’t we allow them?

Actually this is possible starting with PXC 5.6.24-25.11 with the wsrep_dirty_reads setting.
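
On a version that includes it, enabling dirty reads on the isolated node is a simple toggle (a sketch; check the documentation of your exact version for the variable’s scope and caveats):

SET GLOBAL wsrep_dirty_reads = ON;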

Conclusion

Split-brain is one of the worst enemies of a Galera cluster. Quorum votes will take place every time one or several nodes suddenly become unreachable and are meant to protect data consistency. The tradeoff is that it can hurt availability, because in some situations a manual intervention is necessary to instruct the remaining nodes that they can accept executing queries.

Q&A: High availability when using MySQL in the cloud

Last week I hosted a webinar on using MySQL in the cloud for High Availability (HA) alongside 451 Research analyst Jason Stamper. You can watch the recording and also download the slides (free) here. Just click the “Register” button at the end of that page.

We had several excellent questions and we didn’t have time to get to several of them in the allotted time. I’m posting them here along with the answers. Feel free to ask follow-up questions in the comments below.


Q: Can the TokuDB engine be used in a PXC environment?

A: No, TokuDB cannot currently be used in a PXC environment; the only supported engine in Percona XtraDB Cluster 5.6 is InnoDB.

Q: With Galera replication (PXC), is the load automatically balanced across the nodes?

A: No, you need to implement your own load balancing and HA layer between your clients and the Percona XtraDB Cluster server.  Examples mentioned in the webinar include HAProxy and F5 BigIP.

Q: What’s the best version of Percona XtraDB Cluster regarding InnoDB performance?

A: In general for best performance you should be using the latest release of Percona XtraDB Cluster 5.6, which is currently 5.6.24, released on June 3rd, 2015.

Q: Can I redirect my writes in Percona XtraDB Cluster to multiple nodes using HAProxy? While testing with SysBench I can see writes go only to the first node in PXC while reads go to multiple nodes.

A: Yes, you can configure HAProxy to distribute both reads and writes across all of your nodes in a Percona XtraDB Cluster environment. Perhaps SysBench created only one database connection for all writes, and so HAProxy kept those confined to only one host. You may want to experiment with parallel_prepare.lua.

Q: What’s the optimal HA for small datasets (db is less than 10gb)?

A: The optimal HA deployment for small datasets would be dependent on your level of recovery required (tolerance for loss of transactions) and the time that you can be in an unavailable state (seconds, minutes, hours?). Unfortunately there isn’t a single answer to your question. However, if you are interested in further discussion on this point, Percona would be happy to coordinate a time to speak. Please feel free to contact me directly and we can continue the conversation at [email protected].

Q: Is there a concept of local master vs. remote master with PXC?

A: No, there is no concept of local vs. remote master. All nodes in a Percona XtraDB Cluster can be classified as masters, regardless of their proximity to the clients.

Q: Are there any concerns when considering AWS RDS or AURORA DB for MySQL HA in the Cloud?

A: Regarding AWS RDS, yes, this is a good option for MySQL HA in the Cloud. I unfortunately haven’t worked with Aurora DB that much yet, so I don’t have an opinion on its suitability for HA in the Cloud.

Q: We tried out PXC a while back and it used to lock everything whenever any DDL was done. Has that changed?

A: We would have to look at the specifics of your environment, however, there have been numerous improvements in the 1½ years of development since Percona XtraDB Cluster went Generally Available (GA) on January 30th, 2014 in version 5.6.15.

Q: Is using the arbitrator a must?

A: No, the arbitrator role (provided by the garbd daemon) is generally only used when operating in a minimal environment of two data-bearing nodes, where you need a third node for quorum but don’t want to store the data a third time.
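
For reference, garbd only needs to know the cluster name and how to reach the other nodes; a minimal invocation looks roughly like this (group name and addresses are placeholders):

garbd --group=my_cluster_name --address="gcomm://node1:4567,node2:4567" --daemon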

Q: Can we do a cluster across different zones?

A: Yes you can. However, be aware that the latency incurred for all cluster certification operations will be impacted by the round-trip time between nodes.

Q: Does PXC also support the MyISAM database?

A: No, Percona XtraDB Cluster does not support any storage engine other than InnoDB as of PXC 5.6.

Q: How do load balancers affect the throughput in a Galera-based setup given that the write would be limited by the slowest node?

A: Load balancers will introduce some measure of additional latency in the form of CPU time in the load balancer layer as it evaluates its own ruleset, and also in network time due to the additional hop via the load balancer. Otherwise there should be no perceptible difference in the write throughput of a Percona XtraDB Cluster with and without a load balancer as it relates to the “slowest node” factor.

Q: Have you used MaxScale yet? If so, what are your thoughts?

A: Unfortunately I haven’t used MaxScale however Yves Trudeau, Percona Principal Architect, has recently written about MaxScale in this blog post.

Q: How do you configure timeout and maintain persistent connection to HAProxy?

A: I would encourage you to refer to the HAProxy Documentation.

Percona XtraDB Cluster (PXC): How many nodes do you need?

A question I often hear when customers want to set up a production PXC cluster is: “How many nodes should we use?”

Three nodes is the most common deployment, but when are more nodes needed? They also ask: “Do we always need to use an odd number of nodes?”

This is what we’ll clarify in this post.

This is all about quorum

I explained in a previous post that a quorum vote is held each time one node becomes unreachable. With this vote, the remaining nodes will estimate whether it is safe to keep on serving queries. If quorum is not reached, all remaining nodes will set themselves in a state where they cannot process any query (even reads).

To get the right size for your cluster, the only question you need to answer is: how many nodes can simultaneously fail while leaving the cluster operational?

  • If the answer is 1 node, then you need 3 nodes: when 1 node fails, the two remaining nodes have quorum.
  • If the answer is 2 nodes, then you need 5 nodes.
  • If the answer is 3 nodes, then you need 7 nodes.
  • And so on and so forth.

Remember that group communication is not free, so the more nodes in the cluster, the more expensive group communication will be. That’s why it would be a bad idea to have a cluster with 15 nodes for instance. In general we recommend that you talk to us if you think you need more than 10 nodes.

What about an even number of nodes?

The recommendation above always specifies an odd number of nodes, so is there anything bad about an even number of nodes? Let’s take a 4-node cluster and see what happens when nodes fail:

  • If 1 node fails, 3 nodes are remaining: they have quorum.
  • If 2 nodes fail, 2 nodes are remaining: they no longer have quorum (remember 50% is NOT quorum).

Conclusion: availability of a 4-node cluster is no better than the availability of a 3-node cluster, so why bother with a 4th node?

The next question is: is a 4-node cluster less available than a 3-node cluster? Many people think so, specifically after reading this sentence from the manual:

Clusters that have an even number of nodes risk split-brain conditions.

Many people read this as “as soon as one node fails, this is a split-brain condition and the whole cluster stops working”. This is not correct! In a 4-node cluster, you can lose 1 node without any problem, exactly like in a 3-node cluster. This is no better, but also no worse.

By the way, the manual is not wrong! The sentence makes sense in its context.

There could actually be reasons why you might want to have an even number of nodes, but we will come back to that later in this post.

Quorum with multiple data centers

To provide more availability, spreading nodes in several datacenters is a common practice: if power fails in one DC, nodes are available elsewhere. The typical implementation is 3 nodes in 2 DCs:

[Diagram: 3 nodes spread across 2 data centers]

Notice that while this setup can handle any single node failure, it cannot handle every single-DC failure: if we lose DC1, 2 nodes leave the cluster and the remaining node does not have quorum. You can try with 4, 5 or any number of nodes, and it will be easy to convince yourself that in all cases, losing one DC can make the whole cluster stop operating.

If you want to be resilient to a single DC failure, you must have 3 DCs, for instance like this:

[Diagram: 3 nodes spread across 3 data centers]

Other considerations

Sometimes other factors will make you choose a higher number of nodes. For instance, look at these requirements:

  • All traffic is directed to a single node.
  • The application should be able to fail over to another node in the same datacenter if possible.
  • The cluster must keep operating even if one datacenter fails.

The following architecture is an option (and yes, it has an even number of nodes!):

[Diagram: a 6-node architecture meeting these requirements]

Conclusion

Regarding availability, it is easy to estimate the number of nodes you need for your PXC cluster. But node failures are not the only aspect to consider: resilience to a datacenter failure can, for instance, influence the number of nodes you will be using.

Percona XtraDB Cluster/ Galera with Percona Monitoring Plugins

The Percona Monitoring Plugins (PMP) provide some free tools to make it easier to monitor PXC/Galera nodes.  Monitoring broadly falls into two categories: alerting and historical graphing, and the plugins support Nagios and Cacti, respectively, for those purposes.

Graphing

An update to the PMP this summer (thanks to our Remote DBA team for supporting this!) added a Galera-specific host template that includes a variety of Galera-related stats, including:

  • Replication traffic and transaction counts and average trx size
  • Inbound and outbound (Send and Recv) queue sizes
  • Parallelization efficiency
  • Write conflicts (Local Cert Failures and Brute Force Aborts)
  • Cluster size
  • Flow control

You can see examples and descriptions of all the graphs in the manual.

Alerting

There is not a Galera-specific Nagios plugin in the PMP yet, but there is a check, pmp-check-mysql-status, that can check pretty much any status variable you like.  We can easily adapt it to check some key action-worthy Galera stats, but I hadn’t worked out the details until a customer requested it recently.

Checking for a Primary Cluster

Technically this is a cluster or cluster-partition state for whatever part of the cluster the queried node is a part of.  However, any single node could be disconnected from the rest of the cluster, so checking this on each node should be fine.  We can verify this with this check:

$ /usr/lib64/nagios/plugins/pmp-check-mysql-status -x wsrep_cluster_status -C == -T str -c non-Primary
OK wsrep_cluster_status (str) = Primary | wsrep_cluster_status=Primary;;non-Primary;0;

Local node state

We also want to verify the given node is ‘Synced’ into the cluster and not in some other state:

/usr/lib64/nagios/plugins/pmp-check-mysql-status -x wsrep_local_state_comment -C '!=' -T str -w Synced
OK wsrep_local_state_comment (str) = Synced | wsrep_local_state_comment=Synced;;Synced;0;

Note that we only warn when the state is not Synced; this is because it is perfectly valid for a node to be in the Donor/Desynced state.  This warning can alert us to a node in a less-than-ideal state without screaming about it, but you could certainly go critical instead.

Verify the Cluster Size

This is a bit of a sanity check, but we want to know how many nodes are in the cluster and either warn if we’re down a single node or go critical if we’re down more.  For a three node cluster, your check might look like this:

# /usr/lib64/nagios/plugins/pmp-check-mysql-status -x wsrep_cluster_size -C '<=' -w 2 -c 1
OK wsrep_cluster_size = 3 | wsrep_cluster_size=3;2;1;0;

This is OK when we have 3 nodes, warns at 2 nodes and goes critical at 1 node (when we have no redundancy left).   You could certainly adjust thresholds differently depending on your normative cluster size.   This check is likely meaningless unless we’re also in a Primary cluster, so you could set a service dependency on the Primary Cluster check here.
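
In Nagios, such a dependency could be declared with something like the following sketch (host and service names are assumptions based on how you named the checks above):

define servicedependency {
    host_name                       node1
    service_description             PXC Primary Cluster
    dependent_host_name             node1
    dependent_service_description   PXC Cluster Size
    notification_failure_criteria   w,u,c
}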

Check for Flow Control

Flow control is really something to keep an eye on in your cluster. We can monitor the recent state of flow control like this:

/usr/lib64/nagios/plugins/pmp-check-mysql-status -x wsrep_flow_control_paused -w 0.1 -c 0.9
OK wsrep_flow_control_paused = 0.000000 | wsrep_flow_control_paused=0.000000;0.1;0.9;0;

This warns when FC exceeds 10% and goes critical after 90%.  This may need some fine tuning, but I believe it’s a general principle that some small amount of FC might be normal, but you want to know when it starts to get more excessive.

Conclusion

Alerting with Nagios and Graphing with Cacti tend to work best with per-host checks and graphs, but there are aspects of a PXC cluster that you may want to monitor from a cluster-wide perspective.  However, most of the things that can “go wrong” are easily detectable with per-host checks and you can get by without needing a custom script that is Galera-aware.

I’d also always recommend what I call a “service check” that connects through your VIP or load balancer to ensure that MySQL is available (regardless of underlying cluster state) and can do a query.  As long as that works (proving there is at least 1 Primary cluster node), you can likely sleep through any other cluster event.  :)

Percona XtraDB Cluster: Setting up a simple cluster

Percona XtraDB Cluster (PXC) is different enough from async replication that it can be a bit of a puzzle how to do things the Galera way.  This post will attempt to illustrate the basics of setting up a 2-node PXC cluster from scratch.

Requirements

Two servers (could be VMs) that can talk to each other.  I’m using CentOS for this post.  Here’s a dirt-simple Vagrant setup: https://github.com/jayjanssen/two_centos_nodes to make this easy (on Virtualbox).

These servers are talking over the 192.168.70.0/24 internal network for our example.

[email protected]~/Src $ git clone https://github.com/jayjanssen/two_centos_nodes.git
[email protected]~/Src $ cd two_centos_nodes
[email protected]~/Src/two_centos_nodes $ vagrant up
 Bringing machine 'node1' up with 'virtualbox' provider...
 Bringing machine 'node2' up with 'virtualbox' provider...
 [node1] Importing base box 'centos-6_4-64_percona'...
 [node1] Matching MAC address for NAT networking...
 [node1] Setting the name of the VM...
 [node1] Clearing any previously set forwarded ports...
 [node1] Creating shared folders metadata...
 [node1] Clearing any previously set network interfaces...
 [node1] Preparing network interfaces based on configuration...
 [node1] Forwarding ports...
 [node1] -- 22 => 2222 (adapter 1)
 [node1] Booting VM...
 [node1] Waiting for machine to boot. This may take a few minutes...
 [node1] Machine booted and ready!
 [node1] Setting hostname...
 [node1] Configuring and enabling network interfaces...
 [node1] Mounting shared folders...
 [node1] -- /vagrant
 [node2] Importing base box 'centos-6_4-64_percona'...
 [node2] Matching MAC address for NAT networking...
 [node2] Setting the name of the VM...
 [node2] Clearing any previously set forwarded ports...
 [node2] Fixed port collision for 22 => 2222. Now on port 2200.
 [node2] Creating shared folders metadata...
 [node2] Clearing any previously set network interfaces...
 [node2] Preparing network interfaces based on configuration...
 [node2] Forwarding ports...
 [node2] -- 22 => 2200 (adapter 1)
 [node2] Booting VM...
 [node2] Waiting for machine to boot. This may take a few minutes...
 [node2] Machine booted and ready!
 [node2] Setting hostname...
 [node2] Configuring and enabling network interfaces...
 [node2] Mounting shared folders...
 [node2] -- /vagrant

Install the software

These steps should be repeated on both nodes:

[email protected]~/Src/two_centos_nodes $ vagrant ssh node1
Last login: Tue Sep 10 14:15:50 2013 from 10.0.2.2
[[email protected] ~]# yum localinstall http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm
Loaded plugins: downloadonly, fastestmirror, priorities
Setting up Local Package Process
percona-release-0.0-1.x86_64.rpm | 6.1 kB 00:00
Examining /var/tmp/yum-root-t61o64/percona-release-0.0-1.x86_64.rpm: percona-release-0.0-1.x86_64
Marking /var/tmp/yum-root-t61o64/percona-release-0.0-1.x86_64.rpm to be installed
Determining fastest mirrors
epel/metalink | 15 kB 00:00
* base: mirror.atlanticmetro.net
* epel: mirror.seas.harvard.edu
* extras: centos.mirror.netriplex.com
* updates: mirror.team-cymru.org
base | 3.7 kB 00:00
base/primary_db | 4.4 MB 00:01
epel | 4.2 kB 00:00
epel/primary_db | 5.5 MB 00:04
extras | 3.4 kB 00:00
extras/primary_db | 18 kB 00:00
updates | 3.4 kB 00:00
updates/primary_db | 4.4 MB 00:03
Resolving Dependencies
--> Running transaction check
---> Package percona-release.x86_64 0:0.0-1 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================================================================
Package Arch Version Repository Size
================================================================================================================================
Installing:
percona-release x86_64 0.0-1 /percona-release-0.0-1.x86_64 3.6 k
Transaction Summary
================================================================================================================================
Install 1 Package(s)
Total size: 3.6 k
Installed size: 3.6 k
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Installing : percona-release-0.0-1.x86_64 1/1
Verifying : percona-release-0.0-1.x86_64 1/1
Installed:
percona-release.x86_64 0:0.0-1
Complete!
[[email protected] ~]# yum install -y Percona-XtraDB-Cluster-server
Loaded plugins: downloadonly, fastestmirror, priorities
Loading mirror speeds from cached hostfile
* base: mirror.atlanticmetro.net
* epel: mirror.seas.harvard.edu
* extras: centos.mirror.netriplex.com
* updates: mirror.team-cymru.org
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package Percona-XtraDB-Cluster-server.x86_64 1:5.5.31-23.7.5.438.rhel6 will be installed
--> Processing Dependency: xtrabackup >= 1.9.0 for package: 1:Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64
--> Processing Dependency: Percona-XtraDB-Cluster-client for package: 1:Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64
--> Processing Dependency: libaio.so.1(LIBAIO_0.4)(64bit) for package: 1:Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64
--> Processing Dependency: rsync for package: 1:Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64
--> Processing Dependency: Percona-XtraDB-Cluster-galera for package: 1:Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64
--> Processing Dependency: nc for package: 1:Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64
--> Processing Dependency: Percona-XtraDB-Cluster-shared for package: 1:Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64
--> Processing Dependency: libaio.so.1(LIBAIO_0.1)(64bit) for package: 1:Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64
--> Processing Dependency: libaio.so.1()(64bit) for package: 1:Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64
--> Running transaction check
---> Package Percona-XtraDB-Cluster-client.x86_64 1:5.5.31-23.7.5.438.rhel6 will be installed
---> Package Percona-XtraDB-Cluster-galera.x86_64 0:2.6-1.152.rhel6 will be installed
---> Package Percona-XtraDB-Cluster-shared.x86_64 1:5.5.31-23.7.5.438.rhel6 will be obsoleting
---> Package libaio.x86_64 0:0.3.107-10.el6 will be installed
---> Package mysql-libs.x86_64 0:5.1.69-1.el6_4 will be obsoleted
---> Package nc.x86_64 0:1.84-22.el6 will be installed
---> Package percona-xtrabackup.x86_64 0:2.1.4-656.rhel6 will be installed
--> Processing Dependency: perl(DBD::mysql) for package: percona-xtrabackup-2.1.4-656.rhel6.x86_64
--> Processing Dependency: perl(Time::HiRes) for package: percona-xtrabackup-2.1.4-656.rhel6.x86_64
---> Package rsync.x86_64 0:3.0.6-9.el6 will be installed
--> Running transaction check
---> Package perl-DBD-MySQL.x86_64 0:4.013-3.el6 will be installed
--> Processing Dependency: perl(DBI::Const::GetInfoType) for package: perl-DBD-MySQL-4.013-3.el6.x86_64
--> Processing Dependency: perl(DBI) for package: perl-DBD-MySQL-4.013-3.el6.x86_64
--> Processing Dependency: libmysqlclient.so.16(libmysqlclient_16)(64bit) for package: perl-DBD-MySQL-4.013-3.el6.x86_64
--> Processing Dependency: libmysqlclient.so.16()(64bit) for package: perl-DBD-MySQL-4.013-3.el6.x86_64
---> Package perl-Time-HiRes.x86_64 4:1.9721-131.el6_4 will be installed
--> Running transaction check
---> Package Percona-Server-shared-compat.x86_64 0:5.5.33-rel31.1.566.rhel6 will be obsoleting
---> Package perl-DBI.x86_64 0:1.609-4.el6 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================================================================
Package Arch Version Repository Size
================================================================================================================================
Installing:
Percona-Server-shared-compat x86_64 5.5.33-rel31.1.566.rhel6 percona 3.4 M
replacing mysql-libs.x86_64 5.1.69-1.el6_4
Percona-XtraDB-Cluster-server x86_64 1:5.5.31-23.7.5.438.rhel6 percona 15 M
Percona-XtraDB-Cluster-shared x86_64 1:5.5.31-23.7.5.438.rhel6 percona 648 k
replacing mysql-libs.x86_64 5.1.69-1.el6_4
Installing for dependencies:
Percona-XtraDB-Cluster-client x86_64 1:5.5.31-23.7.5.438.rhel6 percona 6.3 M
Percona-XtraDB-Cluster-galera x86_64 2.6-1.152.rhel6 percona 1.1 M
libaio x86_64 0.3.107-10.el6 base 21 k
nc x86_64 1.84-22.el6 base 57 k
percona-xtrabackup x86_64 2.1.4-656.rhel6 percona 6.8 M
perl-DBD-MySQL x86_64 4.013-3.el6 base 134 k
perl-DBI x86_64 1.609-4.el6 base 705 k
perl-Time-HiRes x86_64 4:1.9721-131.el6_4 updates 47 k
rsync x86_64 3.0.6-9.el6 base 334 k
Transaction Summary
================================================================================================================================
Install 12 Package(s)
Total download size: 35 M
Downloading Packages:
(1/12): Percona-Server-shared-compat-5.5.33-rel31.1.566.rhel6.x86_64.rpm | 3.4 MB 00:04
(2/12): Percona-XtraDB-Cluster-client-5.5.31-23.7.5.438.rhel6.x86_64.rpm | 6.3 MB 00:03
(3/12): Percona-XtraDB-Cluster-galera-2.6-1.152.rhel6.x86_64.rpm | 1.1 MB 00:00
(4/12): Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64.rpm | 15 MB 00:04
(5/12): Percona-XtraDB-Cluster-shared-5.5.31-23.7.5.438.rhel6.x86_64.rpm | 648 kB 00:00
(6/12): libaio-0.3.107-10.el6.x86_64.rpm | 21 kB 00:00
(7/12): nc-1.84-22.el6.x86_64.rpm | 57 kB 00:00
(8/12): percona-xtrabackup-2.1.4-656.rhel6.x86_64.rpm | 6.8 MB 00:03
(9/12): perl-DBD-MySQL-4.013-3.el6.x86_64.rpm | 134 kB 00:00
(10/12): perl-DBI-1.609-4.el6.x86_64.rpm | 705 kB 00:00
(11/12): perl-Time-HiRes-1.9721-131.el6_4.x86_64.rpm | 47 kB 00:00
(12/12): rsync-3.0.6-9.el6.x86_64.rpm | 334 kB 00:00
--------------------------------------------------------------------------------------------------------------------------------
Total 1.0 MB/s | 35 MB 00:34
warning: rpmts_HdrFromFdno: Header V4 DSA/SHA1 Signature, key ID cd2efd2a: NOKEY
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-percona
Importing GPG key 0xCD2EFD2A:
Userid : Percona MySQL Development Team <[email protected]>
Package: percona-release-0.0-1.x86_64 (@/percona-release-0.0-1.x86_64)
From : /etc/pki/rpm-gpg/RPM-GPG-KEY-percona
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Installing : libaio-0.3.107-10.el6.x86_64 1/13
Installing : 1:Percona-XtraDB-Cluster-shared-5.5.31-23.7.5.438.rhel6.x86_64 2/13
Installing : 1:Percona-XtraDB-Cluster-client-5.5.31-23.7.5.438.rhel6.x86_64 3/13
Installing : Percona-XtraDB-Cluster-galera-2.6-1.152.rhel6.x86_64 4/13
Installing : nc-1.84-22.el6.x86_64 5/13
Installing : 4:perl-Time-HiRes-1.9721-131.el6_4.x86_64 6/13
Installing : perl-DBI-1.609-4.el6.x86_64 7/13
Installing : rsync-3.0.6-9.el6.x86_64 8/13
Installing : Percona-Server-shared-compat-5.5.33-rel31.1.566.rhel6.x86_64 9/13
Installing : perl-DBD-MySQL-4.013-3.el6.x86_64 10/13
Installing : percona-xtrabackup-2.1.4-656.rhel6.x86_64 11/13
Installing : 1:Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64 12/13
ls: cannot access /var/lib/mysql/*.err: No such file or directory
ls: cannot access /var/lib/mysql/*.err: No such file or directory
130917 14:43:16 [Note] WSREP: Read nil XID from storage engines, skipping position init
130917 14:43:16 [Note] WSREP: wsrep_load(): loading provider library 'none'
130917 14:43:16 [Note] WSREP: Service disconnected.
130917 14:43:17 [Note] WSREP: Some threads may fail to exit.
130917 14:43:17 [Note] WSREP: Read nil XID from storage engines, skipping position init
130917 14:43:17 [Note] WSREP: wsrep_load(): loading provider library 'none'
130917 14:43:18 [Note] WSREP: Service disconnected.
130917 14:43:19 [Note] WSREP: Some threads may fail to exit.
PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:
/usr/bin/mysqladmin -u root password 'new-password'
/usr/bin/mysqladmin -u root -h node1 password 'new-password'
Alternatively you can run:
/usr/bin/mysql_secure_installation
which will also give you the option of removing the test
databases and anonymous user created by default. This is
strongly recommended for production servers.
See the manual for more instructions.
Please report any problems with the /usr/bin/mysqlbug script!
Percona recommends that all production deployments be protected with a support
contract (http://www.percona.com/mysql-suppport/) to ensure the highest uptime,
be eligible for hot fixes, and boost your team's productivity.
/var/tmp/rpm-tmp.rVkEi4: line 95: x0: command not found
Percona XtraDB Cluster is distributed with several useful UDFs from Percona Toolkit.
Run the following commands to create these functions:
mysql -e "CREATE FUNCTION fnv1a_64 RETURNS INTEGER SONAME 'libfnv1a_udf.so'"
mysql -e "CREATE FUNCTION fnv_64 RETURNS INTEGER SONAME 'libfnv_udf.so'"
mysql -e "CREATE FUNCTION murmur_hash RETURNS INTEGER SONAME 'libmurmur_udf.so'"
See http://code.google.com/p/maatkit/source/browse/trunk/udf for more details
Erasing : mysql-libs-5.1.69-1.el6_4.x86_64 13/13
Verifying : 1:Percona-XtraDB-Cluster-server-5.5.31-23.7.5.438.rhel6.x86_64 1/13
Verifying : 1:Percona-XtraDB-Cluster-client-5.5.31-23.7.5.438.rhel6.x86_64 2/13
Verifying : perl-DBD-MySQL-4.013-3.el6.x86_64 3/13
Verifying : Percona-Server-shared-compat-5.5.33-rel31.1.566.rhel6.x86_64 4/13
Verifying : rsync-3.0.6-9.el6.x86_64 5/13
Verifying : perl-DBI-1.609-4.el6.x86_64 6/13
Verifying : percona-xtrabackup-2.1.4-656.rhel6.x86_64 7/13
Verifying : 1:Percona-XtraDB-Cluster-shared-5.5.31-23.7.5.438.rhel6.x86_64 8/13
Verifying : 4:perl-Time-HiRes-1.9721-131.el6_4.x86_64 9/13
Verifying : nc-1.84-22.el6.x86_64 10/13
Verifying : libaio-0.3.107-10.el6.x86_64 11/13
Verifying : Percona-XtraDB-Cluster-galera-2.6-1.152.rhel6.x86_64 12/13
Verifying : mysql-libs-5.1.69-1.el6_4.x86_64 13/13
Installed:
Percona-Server-shared-compat.x86_64 0:5.5.33-rel31.1.566.rhel6 Percona-XtraDB-Cluster-server.x86_64 1:5.5.31-23.7.5.438.rhel6
Percona-XtraDB-Cluster-shared.x86_64 1:5.5.31-23.7.5.438.rhel6
Dependency Installed:
Percona-XtraDB-Cluster-client.x86_64 1:5.5.31-23.7.5.438.rhel6 Percona-XtraDB-Cluster-galera.x86_64 0:2.6-1.152.rhel6
libaio.x86_64 0:0.3.107-10.el6 nc.x86_64 0:1.84-22.el6
percona-xtrabackup.x86_64 0:2.1.4-656.rhel6 perl-DBD-MySQL.x86_64 0:4.013-3.el6
perl-DBI.x86_64 0:1.609-4.el6 perl-Time-HiRes.x86_64 4:1.9721-131.el6_4
rsync.x86_64 0:3.0.6-9.el6
Replaced:
mysql-libs.x86_64 0:5.1.69-1.el6_4
Complete!

Disable IPtables and SElinux

It is possible to run PXC with these enabled, but for simplicity here we just disable them (on both nodes!):

[[email protected] ~]# echo 0 > /selinux/enforce
[[email protected] ~]# service iptables stop
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Unloading modules:                               [  OK  ]
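
If you would rather keep iptables running, the ports PXC needs between nodes are 3306 (MySQL clients), 4567 (Galera group communication), 4568 (IST) and 4444 (SST).  A rough example of opening them for the internal network used here:

# iptables -I INPUT -s 192.168.70.0/24 -p tcp -m multiport --dports 3306,4444,4567,4568 -j ACCEPT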

Configure the cluster nodes

Create a my.cnf file on each node and put this into it:

[mysqld]
datadir = /var/lib/mysql
binlog_format = ROW
wsrep_cluster_name = twonode
wsrep_cluster_address = gcomm://192.168.70.2,192.168.70.3
wsrep_node_address = 192.168.70.2
wsrep_provider = /usr/lib64/libgalera_smm.so
wsrep_sst_method = xtrabackup
wsrep_sst_auth = sst:secret
innodb_locks_unsafe_for_binlog = 1
innodb_autoinc_lock_mode = 2

Note that wsrep_node_address should be set to the proper address on each node.  We only need this because in this environment we are not using the default NIC.
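
For example, on node2 the only line that changes in the config above would be (node2 is 192.168.70.3 on the internal network):

wsrep_node_address = 192.168.70.3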

Bootstrap node1

Bootstrapping is simply starting up the first node in the cluster.  Any data on this node is taken as the source of truth for the other nodes.

[[email protected] ~]# service mysql bootstrap-pxc
Bootstrapping PXC (Percona XtraDB Cluster)Starting MySQL (Percona XtraDB Cluster).. SUCCESS!
[[email protected] mysql]# mysql -e "show global status like 'wsrep%'"
+----------------------------+--------------------------------------+
| Variable_name              | Value                                |
+----------------------------+--------------------------------------+
| wsrep_local_state_uuid     | 43ac4bea-1fa8-11e3-8496-070bd0e5c133 |
| wsrep_protocol_version     | 4                                    |
| wsrep_last_committed       | 0                                    |
| wsrep_replicated           | 0                                    |
| wsrep_replicated_bytes     | 0                                    |
| wsrep_received             | 2                                    |
| wsrep_received_bytes       | 133                                  |
| wsrep_local_commits        | 0                                    |
| wsrep_local_cert_failures  | 0                                    |
| wsrep_local_bf_aborts      | 0                                    |
| wsrep_local_replays        | 0                                    |
| wsrep_local_send_queue     | 0                                    |
| wsrep_local_send_queue_avg | 0.000000                             |
| wsrep_local_recv_queue     | 0                                    |
| wsrep_local_recv_queue_avg | 0.000000                             |
| wsrep_flow_control_paused  | 0.000000                             |
| wsrep_flow_control_sent    | 0                                    |
| wsrep_flow_control_recv    | 0                                    |
| wsrep_cert_deps_distance   | 0.000000                             |
| wsrep_apply_oooe           | 0.000000                             |
| wsrep_apply_oool           | 0.000000                             |
| wsrep_apply_window         | 0.000000                             |
| wsrep_commit_oooe          | 0.000000                             |
| wsrep_commit_oool          | 0.000000                             |
| wsrep_commit_window        | 0.000000                             |
| wsrep_local_state          | 4                                    |
| wsrep_local_state_comment  | Synced                               |
| wsrep_cert_index_size      | 0                                    |
| wsrep_causal_reads         | 0                                    |
| wsrep_incoming_addresses   | 192.168.70.2:3306                    |
| wsrep_cluster_conf_id      | 1                                    |
| wsrep_cluster_size         | 1                                    |
| wsrep_cluster_state_uuid   | 43ac4bea-1fa8-11e3-8496-070bd0e5c133 |
| wsrep_cluster_status       | Primary                              |
| wsrep_connected            | ON                                   |
| wsrep_local_index          | 0                                    |
| wsrep_provider_name        | Galera                               |
| wsrep_provider_vendor      | Codership Oy <[email protected]>    |
| wsrep_provider_version     | 2.6(r152)                            |
| wsrep_ready                | ON                                   |
+----------------------------+--------------------------------------+

We can see the cluster is Primary, the size is 1, and our local state is Synced.  This is a one node cluster!

Prep for SST

SST is how new nodes (post-bootstrap) get a copy of data when joining the cluster.  It is in essence (and reality) a full backup.  We specified Xtrabackup as our SST method along with a username/password (sst:secret).  We need to set up a GRANT on node1 so we can run Xtrabackup against it to SST node2:

[[email protected] ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.5.31 Percona XtraDB Cluster (GPL), wsrep_23.7.5.r3880
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sst'@'localhost' IDENTIFIED BY 'secret';
Query OK, 0 rows affected (0.00 sec)

You should not need to re-issue this GRANT when adding more nodes to the cluster.

Start node2

Assuming you’ve installed the software and my.cnf on node2, then it should be ready to start up:

[[email protected] ~]# service mysql start
Starting MySQL (Percona XtraDB Cluster).....SST in progress, setting sleep higher. SUCCESS!

If we check the status of the cluster again:

[[email protected] mysql]# mysql -e "show global status like 'wsrep%'"
+----------------------------+--------------------------------------+
| Variable_name              | Value                                |
+----------------------------+--------------------------------------+
| wsrep_local_state_uuid     | 43ac4bea-1fa8-11e3-8496-070bd0e5c133 |
| wsrep_protocol_version     | 4                                    |
| wsrep_last_committed       | 0                                    |
| wsrep_replicated           | 0                                    |
| wsrep_replicated_bytes     | 0                                    |
| wsrep_received             | 6                                    |
| wsrep_received_bytes       | 393                                  |
| wsrep_local_commits        | 0                                    |
| wsrep_local_cert_failures  | 0                                    |
| wsrep_local_bf_aborts      | 0                                    |
| wsrep_local_replays        | 0                                    |
| wsrep_local_send_queue     | 0                                    |
| wsrep_local_send_queue_avg | 0.000000                             |
| wsrep_local_recv_queue     | 0                                    |
| wsrep_local_recv_queue_avg | 0.000000                             |
| wsrep_flow_control_paused  | 0.000000                             |
| wsrep_flow_control_sent    | 0                                    |
| wsrep_flow_control_recv    | 0                                    |
| wsrep_cert_deps_distance   | 0.000000                             |
| wsrep_apply_oooe           | 0.000000                             |
| wsrep_apply_oool           | 0.000000                             |
| wsrep_apply_window         | 0.000000                             |
| wsrep_commit_oooe          | 0.000000                             |
| wsrep_commit_oool          | 0.000000                             |
| wsrep_commit_window        | 0.000000                             |
| wsrep_local_state          | 4                                    |
| wsrep_local_state_comment  | Synced                               |
| wsrep_cert_index_size      | 0                                    |
| wsrep_causal_reads         | 0                                    |
| wsrep_incoming_addresses   | 192.168.70.3:3306,192.168.70.2:3306  |
| wsrep_cluster_conf_id      | 2                                    |
| wsrep_cluster_size         | 2                                    |
| wsrep_cluster_state_uuid   | 43ac4bea-1fa8-11e3-8496-070bd0e5c133 |
| wsrep_cluster_status       | Primary                              |
| wsrep_connected            | ON                                   |
| wsrep_local_index          | 1                                    |
| wsrep_provider_name        | Galera                               |
| wsrep_provider_vendor      | Codership Oy <[email protected]>    |
| wsrep_provider_version     | 2.6(r152)                            |
| wsrep_ready                | ON                                   |
+----------------------------+--------------------------------------+

We can see that there are now 2 nodes in the cluster!

The network connection is established over the default Galera port of 4567:

[[email protected] ~]# netstat -ant
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State
tcp        0      0 0.0.0.0:3306                0.0.0.0:*                   LISTEN
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN
tcp        0      0 0.0.0.0:4567                0.0.0.0:*                   LISTEN
tcp        0      0 10.0.2.15:22                10.0.2.2:61129              ESTABLISHED
tcp        0    108 192.168.70.2:4567           192.168.70.3:50170          ESTABLISHED
tcp        0      0 :::22                       :::*                        LISTEN

Summary

In these steps we:

  • Installed PXC server package and dependencies
  • Did the bare-minimum configuration to get it started
  • Bootstrapped the first node
  • Prepared for SST
  • Started the second node (SST was copied by netcat over port 4444)
  • Confirmed both nodes were in the cluster

The setup can certainly be more involved than this, but this gives a simple illustration of what it takes to get things rolling.
