ProxySQL Native Support for Percona XtraDB Cluster (PXC)

galera proxy content image

ProxySQL in its versions up to 1.x did not natively support Percona XtraDB Cluster (PXC). Instead, it relied on the flexibility offered by the scheduler. This approach allowed users to implement their own preferred way to manage the ProxySQL behaviour in relation to the Galera events.

From version 2.0 we can use native ProxySQL support for PXC.. The mechanism to activate native support is very similar to the one already in place for group replication.

In brief it is based on the table [runtime_]mysql_galera_hostgroups and the information needed is mostly the same:

  • writer_hostgroup: the hostgroup ID that refers to the WRITER
  • backup_writer_hostgroup: the hostgoup ID referring to the Hostgorup that will contain the candidate servers
  • reader_hostgroup: The reader Hostgroup ID, containing the list of servers that need to be taken in consideration
  • offline_hostgroup: The Hostgroup ID that will eventually contain the writer that will be put OFFLINE
  • active: True[1]/False[0] if this configuration needs to be used or not
  • max_writers: This will contain the MAX number of writers you want to have at the same time. In a sane setup this should be always 1, but if you want to have multiple writers, you can define it up to the number of nodes.
  • writer_is_also_reader: If true [1] the Writer will NOT be removed from the reader HG
  • max_transactions_behind: The number of wsrep_local_recv_queue after which the node will be set OFFLINE. This must be carefully set, observing the node behaviour.
  • comment: I suggest to put some meaningful notes to identify what is what.

Given the above let us see what we need to do in order to have a working galera native solution.
I will have three Servers:

192.168.1.205 (Node1)
  192.168.1.21  (Node2)
  192.168.1.231 (node3)

As set of Hostgroup, I will have:

Writer  HG-> 100
Reader  HG-> 101
BackupW HG-> 102
offHG   HG-> 9101

To set it up

Servers first:

INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.205',101,3306,1000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.21',101,3306,1000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.231',101,3306,1000);

Then the galera settings:

insert into mysql_galera_hostgroups (writer_hostgroup,backup_writer_hostgroup,reader_hostgroup, offline_hostgroup,active,max_writers,writer_is_also_reader,max_transactions_behind) values (100,102,101,9101,0,1,1,16);

As usual if we want to have R/W split we need to define the rules for it:

insert into mysql_query_rules (rule_id,proxy_port,schemaname,username,destination_hostgroup,active,retries,match_digest,apply) values(1040,6033,'windmills','app_test',100,1,3,'^SELECT.*FOR UPDATE',1);
insert into mysql_query_rules (rule_id,proxy_port,schemaname,username,destination_hostgroup,active,retries,match_digest,apply) values(1041,6033,'windmills','app_test',101,1,3,'^SELECT.*@@',1);
save mysql query rules to disk;
load mysql query rules to run;

Then another important variable… the server version, please do yourself a good service ad NEVER use the default.

update global_variables set variable_value='5.7.0' where variable_name='mysql-server_version';
LOAD MYSQL VARIABLES TO RUNTIME;SAVE MYSQL VARIABLES TO DISK;

Finally activate the whole thing:

save mysql servers to disk;
load mysql servers to runtime;

One thing to note before we go ahead. In the list of servers I had:

  1. Filled only the READER HG
  2. Used the same weight

This because of the election mechanism ProxySQL will use to identify the writer, and the (many) problems that may be attached to it.

For now let us go ahead and see what happens when I load this information to runtime.

Before running the above commands:

+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host      | srv_port | status  | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+

After:

+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host      | srv_port | status  | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| 1000   | 100       | 192.168.1.231 | 3306     | ONLINE  | 0        | 0        | 0	 | 0	   | 0           | 0	   | 0                 | 0               | 0               | 501        |
| 1000   | 101       | 192.168.1.231 | 3306     | ONLINE  | 0        | 0        | 0	 | 0	   | 0           | 0	   | 0                 | 0               | 0               | 501        |
| 1000   | 101       | 192.168.1.21  | 3306     | ONLINE  | 0        | 0        | 0	 | 0	   | 0           | 0	   | 0                 | 0               | 0               | 546        |
| 1000   | 101       | 192.168.1.205 | 3306     | ONLINE  | 0        | 0        | 0	 | 0	   | 0           | 0	   | 0                 | 0               | 0               | 467        |
| 1000   | 102       | 192.168.1.21  | 3306     | ONLINE  | 0        | 0        | 0	 | 0	   | 0           | 0	   | 0                 | 0               | 0               | 546        |
| 1000   | 102       | 192.168.1.205 | 3306     | ONLINE  | 0        | 0        | 0	 | 0	   | 0           | 0	   | 0                 | 0               | 0               | 467        |
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
mysql> select * from runtime_mysql_galera_hostgroups G
*************************** 1. row ***************************
       writer_hostgroup: 100
backup_writer_hostgroup: 102
       reader_hostgroup: 101
      offline_hostgroup: 9101
                active: 0  <----------- note this
            max_writers: 1
  writer_is_also_reader: 1
max_transactions_behind: 16
                comment: NULL
1 row in set (0.01 sec)

As we can see, ProxySQL had taken the nodes from my READER group and distribute them adding node 1 in the writer and node 2 as backup_writer.

But – there is a but – wasn’t my rule set with Active=0? Indeed it was, and I assume this is a bug (#Issue  1902).

The other thing we should note is that ProxySQL had elected as writer node 3 (192.168.1.231).
As I said before what should we do IF we want to have a specific node as preferred writer?

We need to modify its weight. So say we want to have node 1 (192.168.1.205) as writer we will need something like this:

INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.205',101,3306,10000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.21',101,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.231',101,3306,100);

Doing that will give us :

+--------+-----------+---------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host      | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+---------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| 10000  | 100       | 192.168.1.205 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 2209       |
| 100    | 101       | 192.168.1.231 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 546        |
| 100    | 101       | 192.168.1.21  | 3306     | ONLINE | 0        | 0        | 0      | 0	  | 0           | 0	  | 0                 | 0               | 0               | 508        |
| 10000  | 101       | 192.168.1.205 | 3306     | ONLINE | 0        | 0        | 0      | 0	  | 0           | 0	  | 0                 | 0               | 0               | 2209       |
| 100    | 102       | 192.168.1.231 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 546        |
| 100    | 102       | 192.168.1.21  | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 508        |
+--------+-----------+---------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+

If you noticed, given we had set the WEIGHT in node 1 higher, this node will become also the most utilized for reads.
We probably do not want that, so let us modify the reader weight.

update mysql_servers set weight=10 where hostgroup_id=101 and hostname='192.168.1.205';

At this point if we trigger the failover, with set global wsrep_reject_queries=all; on node 1.
ProxySQL will take action and will elect another node as writer:

+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host      | srv_port | status  | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| 100    | 100       | 192.168.1.231 | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 562        |
| 100    | 101       | 192.168.1.231 | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 562        |
| 100    | 101       | 192.168.1.21  | 3306     | ONLINE  | 0        | 0        | 0      | 0	      | 0           | 0	      | 0                 | 0               | 0               | 588        |
| 100    | 102       | 192.168.1.21  | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 588        |
| 10000  | 9101      | 192.168.1.205 | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 468        |
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+

Node 3 (192.168.1.231) is the new writer and node 1 is in the special group for OFFLINE.
Let see now what will happen IF we put back node 1.

+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host      | srv_port | status  | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| 10000  | 100       | 192.168.1.205 | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 449        |
| 100    | 101       | 192.168.1.231 | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 532        |
| 100    | 101       | 192.168.1.21  | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 569        |
| 10000  | 101       | 192.168.1.205 | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 449        |
| 100    | 102       | 192.168.1.231 | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 532        |
| 100    | 102       | 192.168.1.21  | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 569        |
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+

Ooops the READER has come back with the HIGHEST value and as such it will be the most used node, once more. To fix it, we need to re-run the update as before.

But there is a way to avoid this? In short the answer is NO!
This, in my opinion, is BAD and is worth a feature request, because this can really put a node on the knees.

Now this is not the only problem. There is another point that is probably worth discussion, which is the fact ProxySQL is currently doing FAILOVER/FAILBACK.

Failover, is obviously something we want to have, but failback is another discussion. The point is, once the failover is complete and the cluster has redistributed the incoming requests, doing a failback is an impacting operation that can be a disruptive one too.

If all nodes are treated as equal, there is no real way to prevent it, while if YOU set a node to be the main writer, something can be done, let us see what and how.
Say we have:

INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.205',101,3306,1000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.21',101,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.231',101,3306,100);
+--------+-----------+---------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host      | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+---------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| 1000   | 100       | 192.168.1.205 | 3306     | ONLINE | 0        | 0        | 0      | 0	  | 0           | 0	  | 0                 | 0               | 0               | 470        |
| 100    | 101       | 192.168.1.231 | 3306     | ONLINE | 0        | 0        | 0      | 0	  | 0           | 0	  | 0                 | 0               | 0               | 558        |
| 100    | 101       | 192.168.1.21  | 3306     | ONLINE | 0        | 0        | 0      | 0	  | 0           | 0	  | 0                 | 0               | 0               | 613        |
| 10     | 101       | 192.168.1.205 | 3306     | ONLINE | 0        | 0        | 0      | 0	  | 0           | 0	  | 0                 | 0               | 0               | 470        |
| 100    | 102       | 192.168.1.231 | 3306     | ONLINE | 0        | 0        | 0      | 0	  | 0           | 0	  | 0                 | 0               | 0               | 558        |
| 100    | 102       | 192.168.1.21  | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 613        |
+--------+-----------+---------------+----------+--------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+

Let us put the node down
set global wsrep_reject_queries=all;

And check:

+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host      | srv_port | status  | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| 100    | 100       | 192.168.1.231 | 3306     | ONLINE  | 0        | 0        | 0      | 0	      | 0           | 0	      | 0                 | 0               | 0               | 519        |
| 100    | 101       | 192.168.1.231 | 3306     | ONLINE  | 0        | 0        | 0      | 0	      | 0           | 0	      | 0                 | 0               | 0               | 519        |
| 100    | 101       | 192.168.1.21  | 3306     | ONLINE  | 0        | 0        | 0      | 0	      | 0           | 0	      | 0                 | 0               | 0               | 506        |
| 100    | 102       | 192.168.1.21  | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 506        |
| 1000   | 9101      | 192.168.1.205 | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 527        |
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+

We can now manipulate the weight in the special OFFLINE group and see what happen:

update mysql_servers set weight=10 where hostgroup_id=9101 and hostname='192.168.1.205'

Then I put the node up again:
set global wsrep_reject_queries=none;

+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| weight | hostgroup | srv_host      | srv_port | status  | ConnUsed | ConnFree | ConnOK | ConnERR | MaxConnUsed | Queries | Queries_GTID_sync | Bytes_data_sent | Bytes_data_recv | Latency_us |
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+
| 100    | 100       | 192.168.1.231 | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 537        |
| 100    | 101       | 192.168.1.231 | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 537        |
| 100    | 101       | 192.168.1.21  | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 573        |
| 10     | 101       | 192.168.1.205 | 3306     | ONLINE  | 0        | 0        | 0      | 0	   | 0           | 0	   | 0                 | 0               | 0               | 458	|
| 100    | 102       | 192.168.1.21  | 3306     | ONLINE  | 0        | 0        | 0      | 0	   | 0           | 0	   | 0                 | 0               | 0               | 573	|
| 10     | 102       | 192.168.1.205 | 3306     | ONLINE  | 0        | 0        | 0      | 0       | 0           | 0       | 0                 | 0               | 0               | 458        |
+--------+-----------+---------------+----------+---------+----------+----------+--------+---------+-------------+---------+-------------------+-----------------+-----------------+------------+

That’s it, the node is back but with no service interruption.

At this point we can decide if make this node reader like the others, or wait and plan a proper time of the day when we can put it back as writer, while, in the meanwhile it has a bit of load to warm its bufferpool.

The other point – and important information – is what is ProxySQL is currently checking on Galera? From reading the code Proxy will trap the following:

  • read_only
  • wsrep_local_recv_queue
  • wsrep_desync
  • wsrep_reject_queries
  • wsrep_sst_donor_rejects_queries
  • primary_partition

Plus the standard sanity checks on the node.

Finally to monitor the whole situation we can use this:

mysql> select * from mysql_server_galera_log order by time_start_us desc limit 10;
+---------------+------+------------------+-----------------+-------------------+-----------+------------------------+-------------------+--------------+----------------------+---------------------------------+-------+
| hostname      | port | time_start_us    | success_time_us | primary_partition | read_only | wsrep_local_recv_queue | wsrep_local_state | wsrep_desync | wsrep_reject_queries | wsrep_sst_donor_rejects_queries | error |
+---------------+------+------------------+-----------------+-------------------+-----------+------------------------+-------------------+--------------+----------------------+---------------------------------+-------+
| 192.168.1.231 | 3306 | 1549982591661779 | 2884            | YES               | NO        | 0                      | 4                 | NO           | NO                   | NO                              | NULL  |
| 192.168.1.21  | 3306 | 1549982591659644 | 2778            | YES               | NO        | 0                      | 4                 | NO           | NO                   | NO                              | NULL  |
| 192.168.1.205 | 3306 | 1549982591658728 | 2794            | YES               | NO        | 0                      | 4                 | NO           | YES                  | NO                              | NULL  |
| 192.168.1.231 | 3306 | 1549982586669233 | 2827            | YES               | NO        | 0                      | 4                 | NO           | NO                   | NO                              | NULL  |
| 192.168.1.21  | 3306 | 1549982586663458 | 5100            | YES               | NO        | 0                      | 4                 | NO           | NO                   | NO                              | NULL  |
| 192.168.1.205 | 3306 | 1549982586658973 | 4132            | YES               | NO        | 0                      | 4                 | NO           | YES                  | NO                              | NULL  |
| 192.168.1.231 | 3306 | 1549982581665317 | 3084            | YES               | NO        | 0                      | 4                 | NO           | NO                   | NO                              | NULL  |
| 192.168.1.21  | 3306 | 1549982581661261 | 3129            | YES               | NO        | 0                      | 4                 | NO           | NO                   | NO                              | NULL  |
| 192.168.1.205 | 3306 | 1549982581658242 | 2786            | YES               | NO        | 0                      | 4                 | NO           | NO                   | NO                              | NULL  |
| 192.168.1.231 | 3306 | 1549982576661349 | 2982            | YES               | NO        | 0                      | 4                 | NO           | NO                   | NO                              | NULL  |
+---------------+------+------------------+-----------------+-------------------+-----------+------------------------+-------------------+--------------+----------------------+---------------------------------+-------+

As you can see above the log table keeps track of what is changed. In this case, it reports that node 1 has wsrep_reject_queries activated, and it will continue to log this until we set wsrep_reject_queries=none.

Conclusions

ProxySQL galera native integration is a useful feature to manage any Galera implementation, no matter whether it’s Percona PXC, MariaDB cluster or MySQL/Galera.

The generic approach is obviously a good thing, still it may miss some specific extension like we have in PXC with the performance_schema pxc_cluster_view table.

I’ve already objected about the failover/failback, and I am here again to remind you: whenever you do a controlled failover REMEMBER to change the weight to prevent an immediate failback.

This is obviously not possible in the case of a real failover, and, for instance, a simple temporary eviction will cause two downtimes instead only one. Some environments are fine with that others not so.

Personally I think there should be a FLAG in the configuration, such that we can decide if failback should be executed or not.

 

Percona Monitoring and Management (PMM) 1.17.1 Is Now Available

Percona Monitoring and Management 1.17.0

Percona Monitoring and Management

Percona Monitoring and Management (PMM) is a free and open-source platform for managing and monitoring MySQL®, MongoDB®, and PostgreSQL performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL®, MongoDB®, and PostgreSQL® servers to ensure that your data works as efficiently as possible.

In this release, we are introducing support for detection of our upcoming PMM 2.0 release in order to avoid potential version conflicts in the future, as PMM 1.x will not be compatible with PMM 2.x.

Another improvement in this release is we have updated the Tooltips for Dashboard MySQL Query Response Time by providing a description of what the graphs display, along with links to related documentation resources. An example of Tooltips in action:

PMM 1.17.1 release provides fixes for CVE-2018-16492 and CVE-2018-16487 vulnerabilities, related to Node.js modules. The authentication system used in PMM is not susceptible to the attacks described in these CVE reports. PMM does not use client-side data objects to control user-access.

In release 1.17.1 we have included two improvements and fixed nine bugs.

Improvements

  • PMM-1339: Improve tooltips for MySQL Query Response Time dashboard
  • PMM-3477: Add Ubuntu 18.10 support

Fixed Bugs

  • PMM-3471: Fix global status metric names in mysqld_exporter for MySQL 8.0 compatibility
  • PMM-3400: Duplicate column in the Query Analytics dashboard Explain section
  • PMM-3353: postgres_exporter does not work with PostgreSQL 11
  • PMM-3188: Duplicate data on Amazon RDS / Aurora MySQL Metrics dashboard
  • PMM-2615: Fix wrong formatting in log which appears if pmm-qan-agent process fails to start
  • PMM-2592: MySQL Replication Dashboard shows error with multi-source replication
  • PMM-2327: Member State Uptime and Max Member Ping time charts on the MongoDB ReplSet dashboard return an error
  • PMM-955: Fix format of User Time and CPU Time Graphs on MySQL User Statistics dashboard
  • PMM-3522: CVE-2018-16492 and CVE-2018-16487

Help us improve our software quality by reporting any Percona Monitoring and Management bugs you encounter using our bug tracking system.

MySQL 8 is not always faster than MySQL 5.7

mysql 8 slower than mysql 5.7 sysbench

MySQL 8.0.15 performs worse in sysbench oltp_read_write than MySQL 5.7.25

Initially I was testing group replication performance and was puzzled why MySQL 8.0.15 performs consistently worse than MySQL 5.7.25.

It appears that a single server instance is affected by a performance degradation.

My testing setup

mysql 8 slower than mysql 5.7 sysbenchHardware details:
Bare metal server provided by packet.net, instance size: c2.medium.x86
24 Physical Cores @ 2.2 GHz
(1 X AMD EPYC 7401P)
Memory: 64 GB of ECC RAM

Storage : INTEL® SSD DC S4500, 480GB

This is a server grade SATA SSD.

Benchmark

sysbench oltp_read_write --report-interval=1 --time=1800 --threads=24 --tables=10 --table-size=10000000 --mysql-user=root --mysql-socket=/tmp/mysql.sock run

In the following summary I used these combinations:

  • innodb_flush_log_at_trx_commit=0 or 1
  • Binlog: off or on
  • sync_binlog=1000 or sync_binlog=1

The summary table, the number are transactions per second (tps – the more the better)

+-------------------------------------------+--------------+--------------+-------+
| case                                      | MySQL 5.7.25 | MySQL 8.0.15 | ratio |
+-------------------------------------------+--------------+--------------+-------+
| trx_commit=0, binlog=off                  | 11402 tps    | 9840(*)      | 1.16  |
+-------------------------------------------+--------------+--------------+-------+
| trx_commit=1, binlog=off                  | 8375         | 7974         | 1.05  |
+-------------------------------------------+--------------+--------------+-------+
| trx_commit=0, binlog=on, sync_binlog=1000 | 10862        | 8871         | 1.22  |
+-------------------------------------------+--------------+--------------+-------+
| trx_commit=0, binlog=on, sync_binlog=1    | 7238         | 6459         | 1.12  |
+-------------------------------------------+--------------+--------------+-------+
| trx_commit=1, binlog=on, sync_binlog=1    | 5970         | 5043         | 1.18  |
+-------------------------------------------+--------------+--------------+-------+

Summary: MySQL 8.0.15 is persistently worse than MySQL 5.7.25.

In the worst case with

trx_commit=0

  and

sync_binlog=1000

 , it is worse by 22%, which is huge.

I was looking to use these settings for group replication testing, but these settings, when used with MySQL 8.0.15, provide much worse results than I had with MySQL 5.7.25

(*)  in the case of trx_commit=0, binlog=off, MySQL 5.7.25 performance is very stable, and practically stays at the 11400 tps level. MySQL 8.0.15 varies a lot from 8758 tps to 10299 tps in 1 second resolution measurements

Update:

To clarify some comments, I’ve used latin1 CHARSET in this benchmark for both MySQL 5.7 and MySQL 8.0

Appendix:

[mysqld]
datadir= /mnt/data/mysql
socket=/tmp/mysql.sock
ssl=0
#innodb-encrypt-tables=ON
character_set_server=latin1
collation_server=latin1_swedish_ci
skip-character-set-client-handshake
#skip-log-bin
log-error=error.log
log_bin = binlog
relay_log=relay
sync_binlog=1000
binlog_format = ROW
binlog_row_image=MINIMAL
server-id=1
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
# Recommended in standard MySQL setup
# general
 table_open_cache = 200000
 table_open_cache_instances=64
 back_log=3500
 max_connections=4000
# files
 innodb_file_per_table
 innodb_log_file_size=15G
 innodb_log_files_in_group=2
 innodb_open_files=4000
# buffers
 innodb_buffer_pool_size= 40G
 innodb_buffer_pool_instances=8
 innodb_log_buffer_size=64M
# tune
 innodb_doublewrite= 1
 innodb_thread_concurrency=0
 innodb_flush_log_at_trx_commit= 0
 innodb_flush_method=O_DIRECT_NO_FSYNC
 innodb_max_dirty_pages_pct=90
 innodb_max_dirty_pages_pct_lwm=10
 innodb_lru_scan_depth=2048
 innodb_page_cleaners=4
 join_buffer_size=256K
 sort_buffer_size=256K
 innodb_use_native_aio=1
 innodb_stats_persistent = 1
 #innodb_spin_wait_delay=96
# perf special
 innodb_adaptive_flushing = 1
 innodb_flush_neighbors = 0
 innodb_read_io_threads = 16
 innodb_write_io_threads = 16
 innodb_io_capacity=1500
 innodb_io_capacity_max=2500
 innodb_purge_threads=4
 innodb_adaptive_hash_index=0
max_prepared_stmt_count=1000000


Photo by Suzy Hazelwood from Pexels

 

Measuring Percona Server for MySQL On-Disk Decryption Overhead

benchmark heavy IO percona server for mysql 8 encryption

Percona Server for MySQL 8.0 comes with enterprise grade total data encryption features. However, there is always the question of how much overhead – or performance penalty – comes with the data decryption. As we saw in my networking performance post, SSL under high concurrency might be problematic. Is this the case for data decryption?

To measure any overhead, I will start with a simplified read-only workload, where data gets decrypted during read IO.

MySQL decryption schematic

During query execution, the data in memory is already decrypted so there is no additional processing time. The decryption happens only for blocks that require a read from storage.

For the benchmark I will use the following workload:

sysbench oltp_read_only --mysql-ssl=off --tables=20 --table-size=10000000 --threads=$i --time=300 --report-interval=1 --rand-type=uniform run

The datasize for this workload is about 50GB, so I will use

innodb_buffer_pool_size = 5GB

  to emulate a heavy disk read IO during the benchmark. In the second run, I will use

innodb_buffer_pool_size = 60GB

  so all data is kept in memory and there are NO disk read IO operations.

I will only use table-level encryption at this time (ie: no encryption for binary log, system tablespace, redo-  and undo- logs).

The server I am using has AES hardware CPU acceleration. Read more at https://en.wikipedia.org/wiki/AES_instruction_set

Benchmark N1, heavy read IO

benchmark heavy IO percona server for mysql 8 encryption

Threadsencrypted storageno encryptionencryption overhead
1389.11423.471.09
41531.481673.21.09
165583.0460551.08
328250.618479.611.03
648558.68574.431.00
968571.558577.91.00
1288570.58580.681.00
2568576.3485851.00
5128573.158573.731.00
10248570.28562.821.00
20488422.248286.650.98

Benchmark N2, data in memory, no read IO

benchmark data in memory percona server for mysql 8 encryption

ThreadsEncryptionNo encryption
1578.91567.65
42289.132275.12
168304.18584.06
3213324.0213513.39
6420007.2219821.48
9619613.8219587.56
12819254.6819307.82
25618694.0518693.93
51218431.9718372.13
102418571.4318453.69
204818509.7318332.59

Observations

For a high number of threads, there is no measurable difference between encrypted and unencrypted storage. This is because a lot of CPU resources are spent in contention and waits, so the relative time spend in decryption is negligible.

However, we can see some performance penalty for a low number of threads: up to 9% penalty for hardware decryption. When data fully fits into memory, there is no measurable difference between encrypted and unencrypted storage.

So if you have hardware support then you should see little impact when using storage encryption with MySQL. The easiest way to check if you have support for this is to look at CPU flags and search for ‘aes’ string:

> lscpu | grep aes Flags: ... tsc_deadline_timer aes xsave avx f16c ...

Percona Live 2019 First Sneak Peek!

Percona Live 2019We know you’ve been really looking forward to a glimpse of what to expect at Percona Live Austin, so here is the first sneak peek of the agenda!

Our conference committee has been reviewing hundreds of talks over the last few weeks and is delighted to present some initial talks.

  • New features in MySQL 8.0 Replication by Luís Soares, Oracle OSS
  • Shaping the Future of Privacy & Data Protection by Cristina DeLisle, XWiki SAS
  • Galera Cluster New Features by Seppo Jaakola, Codership
  • MySQL Security and Standardization at PayPal by Stacy Yuan &  Yashada Jadha, PayPal
  • Mailchimp Scale: a MySQL Perspective by John Scott, Mailchimp
  • The State of Databases in 2019 by Dinesh Joshi, Apache Cassandra

PingCAP will be sponsoring the TiDB track and have a day of really exciting content to share! Liu Tang, Chief Engineer at PingCAP, will be presenting: Using Chaos Engineering to Build a Reliable TiDB. Keep your eye out for more coming soon!

We could not put on this conference without the support of our sponsors. By being a sponsor at Percona Live it gives companies the opportunity to showcase their products and services, interact with the community for invaluable face time, meet with users or customers and showcase their recruitment opportunities.

It’s with great pleasure to announce the first round of sponsors for Percona Live!

Diamond Sponsors

continuent

 

VividCortex

 

Silver Sponsors

pingcapmysql

If you’d like to find out more about being a sponsor, download the prospectus here
 
Stay tuned for more updates on the conference agenda! 

MySQL Challenge: 100k Connections

thread pools MySQL 100k connections

In this post, I want to explore a way to establish 100,000 connections to MySQL. Not just idle connections, but executing queries.

100,000 connections. Is that really needed for MySQL, you may ask? Although it may seem excessive, I have seen a lot of different setups in customer deployments. Some deploy an application connection pool, with 100 application servers and 1,000 connections in each pool. Some applications use a “re-connect and repeat if the query is too slow” technique, which is a terrible practice. It can lead to a snowball effect, and could establish thousands of connections to MySQL in a matter of seconds.

So now I want to set an overachieving goal and see if we can achieve it.

Setup

For this I will use the following hardware:

Bare metal server provided by packet.net, instance size: c2.medium.x86
Physical Cores @ 2.2 GHz
(1 X AMD EPYC 7401P)
Memory: 64 GB of ECC RAM
Storage : INTEL® SSD DC S4500, 480GB

This is a server grade SATA SSD.

I will use five of these boxes, for the reason explained below. One box for the MySQL server and four boxes for client connections.

For the server I will use Percona  Server for MySQL 8.0.13-4 with the thread pool plugin. The plugin will be required to support the thousands of connections.

Initial server setup

Network settings (Ansible format):

- { name: 'net.core.somaxconn', value: 32768 }
- { name: 'net.core.rmem_max', value: 134217728 }
- { name: 'net.core.wmem_max', value: 134217728 }
- { name: 'net.ipv4.tcp_rmem', value: '4096 87380 134217728' }
- { name: 'net.ipv4.tcp_wmem', value: '4096 87380 134217728' }
- { name: 'net.core.netdev_max_backlog', value: 300000 }
- { name: 'net.ipv4.tcp_moderate_rcvbuf', value: 1 }
- { name: 'net.ipv4.tcp_no_metrics_save', value: 1 }
- { name: 'net.ipv4.tcp_congestion_control', value: 'htcp' }
- { name: 'net.ipv4.tcp_mtu_probing', value: 1 }
- { name: 'net.ipv4.tcp_timestamps', value: 0 }
- { name: 'net.ipv4.tcp_sack', value: 0 }
- { name: 'net.ipv4.tcp_syncookies', value: 1 }
- { name: 'net.ipv4.tcp_max_syn_backlog', value: 4096 }
- { name: 'net.ipv4.tcp_mem', value: '50576   64768 98152' }
- { name: 'net.ipv4.ip_local_port_range', value: '4000 65000' }
- { name: 'net.ipv4.netdev_max_backlog', value: 2500 }
- { name: 'net.ipv4.tcp_tw_reuse', value: 1 }
- { name: 'net.ipv4.tcp_fin_timeout', value: 5 }

These are the typical settings recommended for 10Gb networks and high concurrent workloads.

Limits settings for systemd:

[Service]
LimitNOFILE=1000000
LimitNPROC=500000

And the relevant setting for MySQL in my.cnf:

back_log=3500
max_connections=110000

For the client I will use sysbench version 0.5 and not 1.0.x, for the reasons explained below.

The workload is

sysbench --test=sysbench/tests/db/select.lua --mysql-host=139.178.82.47 --mysql-user=sbtest --mysql-password=sbtest --oltp-tables-count=10 --report-interval=1 --num-threads=10000 --max-time=300 --max-requests=0 --oltp-table-size=10000000 --rand-type=uniform --rand-init=on run

Step 1. 10,000 connections

This one is very easy, as there is not much to do to achieve this. We can do this with only one client. But you may face the following error on the client side:

FATAL: error 2004: Can't create TCP/IP socket (24)

This is caused by the open file limit, which is also a limit of TCP/IP sockets. This can be fixed by setting  

ulimit -n 100000

  on the client.

The performance we observe:

[  26s] threads: 10000, tps: 0.00, reads: 33367.48, writes: 0.00, response time: 3681.42ms (95%), errors: 0.00, reconnects:  0.00
[  27s] threads: 10000, tps: 0.00, reads: 33289.74, writes: 0.00, response time: 3690.25ms (95%), errors: 0.00, reconnects:  0.00

Step 2. 25,000 connections

With 25,000 connections, we hit an error on MySQL side:

Can't create a new thread (errno 11); if you are not out of available memory, you can consult the manual for a possible OS-dependent bug

If you try to lookup information on this error you might find the following article:  https://www.percona.com/blog/2013/02/04/cant_create_thread_errno_11/

But it does not help in our case, as we have all limits set high enough:

cat /proc/`pidof mysqld`/limits
Limit                     Soft Limit Hard Limit           Units
Max cpu time              unlimited  unlimited            seconds
Max file size             unlimited  unlimited            bytes
Max data size             unlimited  unlimited            bytes
Max stack size            8388608    unlimited            bytes
Max core file size        0          unlimited            bytes
Max resident set          unlimited  unlimited            bytes
Max processes             500000     500000               processes
Max open files            1000000    1000000              files
Max locked memory         16777216   16777216             bytes
Max address space         unlimited  unlimited            bytes
Max file locks            unlimited  unlimited            locks
Max pending signals       255051     255051               signals
Max msgqueue size         819200     819200               bytes
Max nice priority         0          0
Max realtime priority     0          0
Max realtime timeout      unlimited unlimited            us

This is where we start using the thread pool feature:  https://www.percona.com/doc/percona-server/8.0/performance/threadpool.html

Add:

thread_handling=pool-of-threads

to the my.cnf and restart Percona Server

The results:

[   7s] threads: 25000, tps: 0.00, reads: 33332.57, writes: 0.00, response time: 974.56ms (95%), errors: 0.00, reconnects:  0.00
[   8s] threads: 25000, tps: 0.00, reads: 33187.01, writes: 0.00, response time: 979.24ms (95%), errors: 0.00, reconnects:  0.00

We have the same throughput, but actually the 95% response time has improved (thanks to the thread pool) from 3690 ms to 979 ms.

Step 3. 50,000 connections

This is where we encountered the biggest challenge. At first, trying to get 50,000 connections in sysbench we hit the following error:

FATAL: error 2003: Can't connect to MySQL server on '139.178.82.47' (99)

Error (99) is cryptic and it means: Cannot assign requested address.

It comes from the limit of ports an application can open. By default on my system it is

cat /proc/sys/net/ipv4/ip_local_port_range : 32768   60999

This says there are only 28,231 available ports — 60999 minus 32768 — or the limit of TCP connections you can establish from or to the given IP address.

You can extend this using a wider range, on both the client and the server:

echo 4000 65000 > /proc/sys/net/ipv4/ip_local_port_range

This will give us 61,000 connections, but this is very close to the limit for one IP address (maximal port is 65535). The key takeaway from here is that if we want more connections we need to allocate more IP addresses for MySQL server. In order to achieve 100,000 connections, I will use two IP addresses on the server running MySQL.

After sorting out the port ranges, we hit the following problem with sysbench:

sysbench 0.5:  multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 50000
FATAL: pthread_create() for thread #32352 failed. errno = 12 (Cannot allocate memory)

In this case, it’s a problem with sysbench memory allocation (namely lua subsystem). Sysbench can allocate memory for only 32,351 connections. This is a problem which is even more severe in sysbench 1.0.x.

Sysbench 1.0.x limitation

Sysbench 1.0.x uses a different Lua JIT, which hits memory problems even with 4000 connections, so it is impossible to go over 4000 connection in sysbench 1.0.x

So it seems we hit a limit with sysbench sooner than with Percona Server. In order to use more connections, we need to use multiple sysbench clients, and if 32,351 connections is the limit for sysbench, we have to use at least four sysbench clients to get up to 100,000 connections.

For 50,000 connections I will use 2 servers (each running separate sysbench), each running 25,000 threads from sysbench.

The results for each sysbench looks like:

[  29s] threads: 25000, tps: 0.00, reads: 16794.09, writes: 0.00, response time: 1799.63ms (95%), errors: 0.00, reconnects:  0.00
[  30s] threads: 25000, tps: 0.00, reads: 16491.03, writes: 0.00, response time: 1800.70ms (95%), errors: 0.00, reconnects:  0.00

So we have about the same throughput (16794*2 = 33588 tps in total), however the 95% response time doubled. This is to be expected as we are using twice as many connections compared to the 25,000 connections benchmark.

Step 3. 75,000 connections

To achieve 75,000 connections we will use three servers with sysbench, each running 25,000 threads.

The results for each sysbench:

[ 157s] threads: 25000, tps: 0.00, reads: 11633.87, writes: 0.00, response time: 2651.76ms (95%), errors: 0.00, reconnects:  0.00
[ 158s] threads: 25000, tps: 0.00, reads: 10783.09, writes: 0.00, response time: 2601.44ms (95%), errors: 0.00, reconnects:  0.00

Step 4. 100,000 connections

There is nothing eventful to achieve75k and 100k connections. We just spin up an additional server and start sysbench. For 100,000 connections we need four servers for sysbench, each shows:

[ 101s] threads: 25000, tps: 0.00, reads: 8033.83, writes: 0.00, response time: 3320.21ms (95%), errors: 0.00, reconnects:  0.00
[ 102s] threads: 25000, tps: 0.00, reads: 8065.02, writes: 0.00, response time: 3405.77ms (95%), errors: 0.00, reconnects:  0.00

So we have the same throughput (8065*4=32260 tps in total) with 3405ms 95% response time.

A very important takeaway from this: with 100k connections and using a thread pool, the 95% response time is even better than for 10k connections without a thread pool. The thread pool allows Percona Server to manage resources more efficiently and provides better response times.

Conclusions

100k connections is quite achievable for MySQL, and I am sure we could go even further. There are three components to achieve this:

  • Thread pool in Percona Server
  • Proper tuning of network limits
  • Using multiple IP addresses on the server box (one IP address per approximately 60k connections)

Appendix: full my.cnf

[mysqld]
datadir {{ mysqldir }}
ssl=0
skip-log-bin
log-error=error.log
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
character_set_server=latin1
collation_server=latin1_swedish_ci
skip-character-set-client-handshake
innodb_undo_log_truncate=off
# general
table_open_cache = 200000
table_open_cache_instances=64
back_log=3500
max_connections=110000
# files
innodb_file_per_table
innodb_log_file_size=15G
innodb_log_files_in_group=2
innodb_open_files=4000
# buffers
innodb_buffer_pool_size= 40G
innodb_buffer_pool_instances=8
innodb_log_buffer_size=64M
# tune
innodb_doublewrite= 1
innodb_thread_concurrency=0
innodb_flush_log_at_trx_commit= 0
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10
innodb_lru_scan_depth=2048
innodb_page_cleaners=4
join_buffer_size=256K
sort_buffer_size=256K
innodb_use_native_aio=1
innodb_stats_persistent = 1
#innodb_spin_wait_delay=96
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_read_io_threads = 16
innodb_write_io_threads = 16
innodb_io_capacity=1500
innodb_io_capacity_max=2500
innodb_purge_threads=4
innodb_adaptive_hash_index=0
max_prepared_stmt_count=1000000
innodb_monitor_enable = '%'
performance_schema = ON

Percona XtraBackup Now Supports Dump of InnoDB Buffer Pool

percona-xtra-backup buffer pool restore

percona-xtra-backup buffer pool restoreInnoDB keeps hot data in memory on its buffer named InnoDB Buffer Pool. For a long time, when a MySQL instance needed to bounce, this hot cached data was lost and the instance required a warm-up period to perform as well as it did before the service restart.

That is not the case anymore. Newer versions of MySQL/MariaDB allow users to save the state of this buffer by dumping tablespace ID’s and page ID’s to a file on disk that will be loaded automatically on startup, making the newly started server buffer pool as it was prior the restart.

Details about the MySQL implementation can be found at https://dev.mysql.com/doc/refman/5.7/en/innodb-preload-buffer-pool.html

With that in mind, Percona XtraBackup versions 2.4.13 can now instruct MySQL to dump the content of buffer pool while taking a backup. This means you can restore the backup on a new server and make MySQL perform just like the other instance in terms of InnoDB Buffer Pool data.

How it works

The buffer pool dump happens at the beginning of backup if --dump-innodb-buffer-pool is set.

The user can choose to change the default innodb_buffer_pool_dump_pct. If --dump-innodb-buffer-pool-pct is set, it stores the current MySQL innodb_buffer_pool_dump_pct value, then it changes it to the desired percentage. After the end of the backup, original values is restored back.

The actual file copy happens at the end of the backup.

Percona XtraDB Cluster

A very good use case is PXC/Galera. When a node initiates SST, we would like the joiner to have a copy of InnoDB Buffer Pool from the donor. We can configure PXC nodes to do that:

[xtrabackup]
dump-innodb-buffer-pool
dump-innodb-buffer-pool-pct=100

Here is an example of a PXC node that just received SST:

Before PXB-1548:

[root@marcelo-altmann-pxb-pxc-3 ~]# systemctl stop mysql && rm -rf /var/lib/mysql/* && systemctl start mysql && mysql -psekret -e "SHOW ENGINE INNODB STATUSG" | grep 'Database pages'
mysql: [Warning] Using a password on the command line interface can be insecure.
Database pages 311

Joiner started with a cold buffer pool.

After adding dump-innodb-buffer-pool and dump-innodb-buffer-pool-pct=100 to my.cnf :

[root@marcelo-altmann-pxb-pxc-3 ~]# systemctl stop mysql && rm -rf /var/lib/mysql/* && systemctl start mysql && mysql -psekret -e "SHOW ENGINE INNODB STATUSG" | grep 'Database pages'
mysql: [Warning] Using a password on the command line interface can be insecure.
Database pages 30970

Joiner started with a copy of the buffer pool from the donor, which will reduce the joiner warm-up period.

Conclusion

The new version of Percona XtraBackup can help to minimize the time a newly restored backup will take to perform like source server


Photo by Jametlene Reskp on Unsplash

Charset and Collation Settings Impact on MySQL Performance

MySQL 8.0 utf8mb4

Following my post MySQL 8 is not always faster than MySQL 5.7, this time I decided to test very simple read-only CPU intensive workloads, when all data fits memory. In this workload there is NO IO operations, only memory and CPU operations.

My Testing Setup

Environment specification

  • Release | Ubuntu 18.04 LTS (bionic)
  • Kernel | 4.15.0-20-generic
  • Processors | physical = 2, cores = 28, virtual = 56, hyperthreading = yes
  • Models | 56xIntel(R) Xeon(R) Gold 5120 CPU @ 2.20GHz<
  • Memory Total | 376.6G
  • Provider | packet.net x2.xlarge.x86 instance

I will test two workloads, sysbench oltp_read_only and oltp_point_select varying amount of threads

sysbench oltp_read_only --mysql-ssl=off --report-interval=1 --time=300 --threads=$i --tables=10 --table-size=10000000 --mysql-user=root run

sysbench oltp_point_select --mysql-ssl=off --report-interval=1 --time=300 --threads=$i --tables=10 --table-size=10000000 --mysql-user=root run

The results for OLTP read-only (latin1 character set):

MySQL 5.7.25MySQL 8.0.15
threadsthroughputthroughputthroughput ratio
11241.181114.41.11
44578.184106.691.11
1615763.6414303.541.10
2421384.5719472.891.10
3225081.1722897.041.10
4832363.2729600.261.09
6439629.0935585.881.11
12838448.2334718.421.11
25636306.4432798.121.11

The results for point_select (latin1 character set):

point selectMySQL 5.7.25MySQL 8.0.15
threadsthroughputthroughputthroughput ratio
131672.5228344.251.12
4110650.798296.461.13
16390165.41347026.491.12
24534454.55474024.561.13
32620402.74554524.731.12
48806367.3718350.871.12
641120586.03972366.591.15
1281108638.47960015.171.15
2561038166.63891470.111.16

We can see that in the OLTP read-only workload, MySQL 8.0.15 is slower by 10%, and for the point_select workload MySQL 8.0.15 is slower by 12-16%.

Although the difference is not necessarily significant, this is enough to reveal that MySQL 8.0.15 does not perform as well as MySQL 5.7.25 in the variety of workloads that I am testing.

However, it appears that the dynamic of the results will change if we use the utf8mb4 character set instead of latin1.

Let’s compare MySQL 5.7.25 latin1 vs utf8mb4, as utf8mb4 is now default CHARSET in MySQL 8.0

But before we do that let’s take look also at COLLATION.

MySQL 5.7.25 uses a default collation utf8mb4_general_ci, However, I read that to use proper sorting and comparison for Eastern European languages, you may want to use the utf8mb4_unicode_ci collation. For MySQL 8.0.5 the default collation is

So let’s compare each version latin1 vs utf8mb4 (with default collation). First 5.7:

Threadsutf8mb4_general_cilatin1latin1 ratio
42957.994578.181.55
2413792.5521384.571.55
6424516.9939629.091.62
12823977.0738448.231.60

So here we can see that utf8mb4 in MySQL 5.7 is really much slower than latin1 (by 55-60%)

And the same for MySQL 8.0.15

MySQL 8.0 defaultcollations

Threadsutf8mb4_0900_ai_ci (default)latin1latin1 ratio
43968.884106.691.03
2418446.1919472.891.06
6432776.3535585.881.09
12831301.7534718.421.11

For MySQL 8.0 the hit from utf8mb4 is much lower (up to 11%)

Now let’s compare all collations for utf8mb4

For MySQL 5.7

MySQL 5.7 utf8mb4

utf8mb4_general_ci (default)utf8mb4_binutf8mb4_unicode_ciutf8mb4_unicode_520_ci
42957.993328.82157.611942.78
2413792.5515857.299989.969095.17
6424516.9928125.1616207.2614768.64
12823977.0727410.9415970.614560.6

If you plan to use utf8mb4_unicode_ci, you will get an even further performance hit (comparing to utf8mb4_general_ci )

And for MySQL 8.0.15

MySQL 8.0 utf8mb4

utf8mb4_general_ciutf8mb4_binutf8mb4_unicode_ciutf8mb4_0900_ai_ci (default)
43461.83628.013363.73968.88
2416327.4517136.1615740.8318446.19
6428960.6230390.2927242.7232776.35
12827967.2529256.8926489.8331301.75

So now let’s compare MySQL 8.0 vs MySQL 5.7 in utf8mb4 with default collations:

mysql 8 and 5.7 default collation

MySQL 8.0 utf8mb4_0900_ai_ciMySQL 5.7 utf8mb4_general_ciMySQL 8.0 ratio
43968.882957.991.34
2418446.1913792.551.34
6432776.3524516.991.34
12831301.7523977.071.31

So there we are. In this case, MySQL 8.0 is actually better than MySQL 5.7 by 34%

Conclusions

There are several observations to make:

  • MySQL 5.7 outperforms MySQL 8.0 in latin1 charset
  • MySQL 8.0 outperforms MySQL 5.7 by a wide margin if we use utf8mb4 charset
  • Be aware that utf8mb4  is now default MySQL 8.0, while MySQL 5.7 has latin1 by default
  • When running comparison between MySQL 8.0 vs MySQL 5.7 be aware what charset you are using, as it may affect the comparison a lot.

MySQL 8.0 Bug 94394, Fixed!

MySQL optimizer bugs

MySQL optimizer bugs

Last week I came across a bug in MySQL 8.0, which meant that the absence of mysql.user leads to auto-apply of –skip-grant-tables (#94394) would leave MySQL running in an undesirable state. My colleague Sveta Smirnova blogged about the issue and it also caught the interest of Valeriy Kravchuk in Fun with Bugs #80 – On MySQL Bug Reports I am Subscribed to, Part XVI. Thanks for the extra visibility!

Credit is now due to Oracle for the quick response, as it was fixed in less than one week (including a weekend):

Fixed in 8.0.16.

Previously, if the grant tables were corrupted, the MySQL server
wrote a message to the error log but continued as if the
–skip-grant-tables option had been specified. This resulted in the
server operating in an unexpected state unless –skip-grant-tables
had in fact been specified. Now, the server stops after writing a
message to the error log unless started with –skip-grant-tables.
(Starting the server with that option enables you to connect to
perform diagnostic operations.)

I think that this particular bug reflects some of the nice things about the MySQL community (and Open Source in general); anyone can find and report a bug, or make a feature request, to one of the software vendors (MySQL, Percona, or MariaDB) and try to improve the software. Sometimes bugs hang around for a while, either because they are hard to fix, viewed as lower in priority (despite the reporter’s opinion), or perhaps the bug does not have enough public visibility. Then a member of the community notices the bug and takes an interest and soon there is more interest. If you are lucky the bug gets fixed quickly! You can of course also provide a fix for the bug yourself, which may speed up the process with a little luck.

If you have not yet reported a bug, or want to find if you are reporting them in the right sort of way then you can take a look at How to create a useful MySQL bug report…and make sure it’s properly processed by Valeriy from FOSDEM 2019.

? ? ? You can help to find more!

Percona XtraDB Cluster 5.7.25-31.35 Is Now Available

Percona XtraDB Cluster 5.7

Percona XtraDB Cluster 5.7Percona is glad to announce the release of Percona XtraDB Cluster 5.7.25-31.35 on February 28, 2018. Binaries are available from the downloads section or from our software repositories.

This release of Percona XtraDB Cluster includes the support of Ubuntu 18.10 (Cosmic Cuttlefish). Percona XtraDB Cluster 5.7.25-31.35 is now the current release, based on the following:

All Percona software is open-source and free.

Bugs Fixed

  • PXC-2346mysqld could crash when executing mysqldump --single-transaction while the binary log is disabled. This problem was also reported in PXC-1711PXC-2371PXC-2419.
  • PXC-2388: In some cases, DROP FUNCTION function_name was not replicated.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!