Charset and Collation Settings Impact on MySQL Performance

MySQL 8.0 utf8mb4

Following my post MySQL 8 is not always faster than MySQL 5.7, this time I decided to test very simple read-only CPU intensive workloads, when all data fits memory. In this workload there is NO IO operations, only memory and CPU operations.

My Testing Setup

Environment specification

  • Release | Ubuntu 18.04 LTS (bionic)
  • Kernel | 4.15.0-20-generic
  • Processors | physical = 2, cores = 28, virtual = 56, hyperthreading = yes
  • Models | 56xIntel(R) Xeon(R) Gold 5120 CPU @ 2.20GHz<
  • Memory Total | 376.6G
  • Provider | packet.net x2.xlarge.x86 instance

I will test two workloads, sysbench oltp_read_only and oltp_point_select varying amount of threads

sysbench oltp_read_only --mysql-ssl=off --report-interval=1 --time=300 --threads=$i --tables=10 --table-size=10000000 --mysql-user=root run

sysbench oltp_point_select --mysql-ssl=off --report-interval=1 --time=300 --threads=$i --tables=10 --table-size=10000000 --mysql-user=root run

The results for OLTP read-only (latin1 character set):

MySQL 5.7.25MySQL 8.0.15
threadsthroughputthroughputthroughput ratio
11241.181114.41.11
44578.184106.691.11
1615763.6414303.541.10
2421384.5719472.891.10
3225081.1722897.041.10
4832363.2729600.261.09
6439629.0935585.881.11
12838448.2334718.421.11
25636306.4432798.121.11

The results for point_select (latin1 character set):

point selectMySQL 5.7.25MySQL 8.0.15
threadsthroughputthroughputthroughput ratio
131672.5228344.251.12
4110650.798296.461.13
16390165.41347026.491.12
24534454.55474024.561.13
32620402.74554524.731.12
48806367.3718350.871.12
641120586.03972366.591.15
1281108638.47960015.171.15
2561038166.63891470.111.16

We can see that in the OLTP read-only workload, MySQL 8.0.15 is slower by 10%, and for the point_select workload MySQL 8.0.15 is slower by 12-16%.

Although the difference is not necessarily significant, this is enough to reveal that MySQL 8.0.15 does not perform as well as MySQL 5.7.25 in the variety of workloads that I am testing.

However, it appears that the dynamic of the results will change if we use the utf8mb4 character set instead of latin1.

Let’s compare MySQL 5.7.25 latin1 vs utf8mb4, as utf8mb4 is now default CHARSET in MySQL 8.0

But before we do that let’s take look also at COLLATION.

MySQL 5.7.25 uses a default collation utf8mb4_general_ci, However, I read that to use proper sorting and comparison for Eastern European languages, you may want to use the utf8mb4_unicode_ci collation. For MySQL 8.0.5 the default collation is

So let’s compare each version latin1 vs utf8mb4 (with default collation). First 5.7:

Threadsutf8mb4_general_cilatin1latin1 ratio
42957.994578.181.55
2413792.5521384.571.55
6424516.9939629.091.62
12823977.0738448.231.60

So here we can see that utf8mb4 in MySQL 5.7 is really much slower than latin1 (by 55-60%)

And the same for MySQL 8.0.15

MySQL 8.0 defaultcollations

Threadsutf8mb4_0900_ai_ci (default)latin1latin1 ratio
43968.884106.691.03
2418446.1919472.891.06
6432776.3535585.881.09
12831301.7534718.421.11

For MySQL 8.0 the hit from utf8mb4 is much lower (up to 11%)

Now let’s compare all collations for utf8mb4

For MySQL 5.7

MySQL 5.7 utf8mb4

utf8mb4_general_ci (default)utf8mb4_binutf8mb4_unicode_ciutf8mb4_unicode_520_ci
42957.993328.82157.611942.78
2413792.5515857.299989.969095.17
6424516.9928125.1616207.2614768.64
12823977.0727410.9415970.614560.6

If you plan to use utf8mb4_unicode_ci, you will get an even further performance hit (comparing to utf8mb4_general_ci )

And for MySQL 8.0.15

MySQL 8.0 utf8mb4

utf8mb4_general_ciutf8mb4_binutf8mb4_unicode_ciutf8mb4_0900_ai_ci (default)
43461.83628.013363.73968.88
2416327.4517136.1615740.8318446.19
6428960.6230390.2927242.7232776.35
12827967.2529256.8926489.8331301.75

So now let’s compare MySQL 8.0 vs MySQL 5.7 in utf8mb4 with default collations:

mysql 8 and 5.7 default collation

MySQL 8.0 utf8mb4_0900_ai_ciMySQL 5.7 utf8mb4_general_ciMySQL 8.0 ratio
43968.882957.991.34
2418446.1913792.551.34
6432776.3524516.991.34
12831301.7523977.071.31

So there we are. In this case, MySQL 8.0 is actually better than MySQL 5.7 by 34%

Conclusions

There are several observations to make:

  • MySQL 5.7 outperforms MySQL 8.0 in latin1 charset
  • MySQL 8.0 outperforms MySQL 5.7 by a wide margin if we use utf8mb4 charset
  • Be aware that utf8mb4  is now default MySQL 8.0, while MySQL 5.7 has latin1 by default
  • When running comparison between MySQL 8.0 vs MySQL 5.7 be aware what charset you are using, as it may affect the comparison a lot.