Update on the InnoDB double-write buffer and EXT4 transactions

In a post written a few months ago, I found that using EXT4 transactions with the “data=journal” mount option improves write performance significantly, by 55%, without putting data at risk. Many people commented on the post, mentioning that they were not able to reproduce the results, so I decided to investigate further to find out why my results were different.

So, I ran sysbench benchmarks on a few servers and determined when the InnoDB double-write buffer limitations appear and when they don’t. I also made sure some of my colleagues were able to reproduce the results. Basically, in order to reproduce the results you need the following conditions:

  • Spinning disk (no SSD)
  • Enough CPU power
  • A dataset that fits in the InnoDB buffer pool
  • A continuous high write load with many ops waiting for disk
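
The first condition is easy to check from the shell. The sketch below is an illustration rather than part of the original benchmark setup: it reads the Linux sysfs "rotational" flag, and both the helper name disk_kind and the device name sda are assumptions. Behind a RAID controller or inside a virtual machine, the flag may not reflect the real hardware.

```shell
#!/bin/sh
# disk_kind FILE: classify a block device from its sysfs "rotational" flag.
# FILE is expected to contain 1 (rotational) or 0 (non-rotational).
disk_kind() {
    case "$(cat "$1" 2>/dev/null)" in
        1) echo "spinning" ;;
        0) echo "ssd" ;;
        *) echo "unknown" ;;
    esac
}

# "sda" is a placeholder; use the device that holds the MySQL datadir.
disk_kind /sys/block/sda/queue/rotational
```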

Using the InnoDB double-write buffer on an SSD makes the issue much harder to see, which is a good thing performance-wise. That comes from the fact that the latency of each write operation is much lower. That makes sense: the double-write buffer is an area of 128 pages on disk that is used by the write threads. When a write thread needs to write a batch of dirty pages to disk, it first writes them sequentially to free slots in the double-write buffer in a single I/O operation, and then it spends time writing the pages to their actual locations on disk, typically using one I/O operation per page. Once done writing, it releases the double-write buffer slots it was holding and another thread can do its work. The presence of a RAID controller with a write cache certainly helps, at least until the write cache is full. I didn’t test with a RAID controller, but I suspect its write cache will only delay the appearance of the symptoms: if the write load is sustained over a long period of time, the issue with the InnoDB double-write buffer will still appear.

So, to recapitulate, on a spinning disk, a write thread needs to hold a lock on some of the double-write buffer slots for at least a few milliseconds per page it needs to write, while on an SSD the slots are released very quickly because of the low latency of the storage. To actually stress the InnoDB double-write buffer on an SSD, one must push many more writes.
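
To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes the simplified model described above (one I/O for the sequential write into the double-write buffer, then one I/O per page) and uses illustrative latencies of 5 ms per I/O for a spinning disk and 0.1 ms for an SSD. These latencies and the helper name slot_hold_ms are assumptions, not measurements from the benchmarks.

```shell
#!/bin/sh
# slot_hold_ms PAGES LATENCY_US
# Approximate time, in whole milliseconds, that a write thread holds its
# double-write buffer slots: 1 I/O for the sequential double-write plus
# 1 I/O per page written to its final location.
slot_hold_ms() {
    pages=$1
    latency_us=$2
    echo $(( (pages + 1) * latency_us / 1000 ))
}

echo "spinning, 64 pages: $(slot_hold_ms 64 5000) ms"   # assumed 5 ms per I/O
echo "ssd, 64 pages:      $(slot_hold_ms 64 100) ms"    # assumed 0.1 ms per I/O
```

With these assumed latencies, a batch of 64 pages keeps its slots busy for roughly a third of a second on the spinning disk, versus a few milliseconds on the SSD.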

That leads us to the second point, the amount of CPU resources available. At first, one of my colleagues tried to reproduce the results on a small EC2 instance and failed. It turned out that, by default, the sysbench oltp.lua script does quite a lot of reads, and those reads saturate the CPU, throttling the writes. By lowering the number of reads in the script, he was then able to reproduce the results.

For my benchmarks, I used the following command:

sysbench --num-threads=16 --mysql-socket=/var/lib/mysql/mysql.sock \
  --mysql-database=sbtest --mysql-user=root \
  --test=/usr/share/doc/sysbench/tests/db/oltp.lua --oltp-table-size=50000000 \
  --oltp-test-mode=complex --mysql-engine=innodb --db-driver=mysql \
  --report-interval=60 --max-requests=0 --max-time=3600 run

Both servers used were bare-metal boxes with 12 physical cores (24 HT). With fewer CPU resources, I suggest adding the following parameters:

--oltp-point-selects=1
--oltp-range-size=1
--oltp-index-updates=10

That way, the CPU is not wasted on reads and enough writes are generated. Remember, we are not running a generic benchmark; we are just stressing the InnoDB double-write buffer.
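
Putting the two sets of options together, a small wrapper can switch between the default and the write-heavy variant. This is only a sketch: the function name sysbench_args and the "write-heavy" keyword are made up for illustration, while the options themselves are exactly the ones listed above.

```shell
#!/bin/sh
# sysbench_args [write-heavy]
# Print the sysbench options used in this post, one per line. With the
# "write-heavy" argument, append the options that reduce the read load.
sysbench_args() {
    cat <<'EOF'
--num-threads=16
--mysql-socket=/var/lib/mysql/mysql.sock
--mysql-database=sbtest
--mysql-user=root
--test=/usr/share/doc/sysbench/tests/db/oltp.lua
--oltp-table-size=50000000
--oltp-test-mode=complex
--mysql-engine=innodb
--db-driver=mysql
--report-interval=60
--max-requests=0
--max-time=3600
EOF
    if [ "${1:-}" = "write-heavy" ]; then
        printf '%s\n' --oltp-point-selects=1 --oltp-range-size=1 --oltp-index-updates=10
    fi
}

# Print the full command for a CPU-constrained server; remove "echo" to run it.
echo sysbench $(sysbench_args write-heavy) run
```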

In order to make sure something else isn’t involved, I verified the following:

  • Server independence: tried on 2 physical servers and one EC2 instance, with CentOS 6 and Ubuntu 14.04
  • MySQL provider: tried on MySQL Community and Percona Server
  • MySQL version: tried on 5.5.37 and 5.6.23 (Percona Server)
  • InnoDB log file size: varied from 32MB to 512MB
  • Number of InnoDB write threads: 1, 2, 4, 8, 16, 32
  • Use of Linux native asynchronous I/O
  • Spinning and SSD storage

So, with all those verifications done, I can maintain that if you are using a server with spinning disks and a high write load, using EXT4 transactions instead of the InnoDB double-write buffer yields an increase in throughput of more than 50%. In an upcoming post, I’ll show how performance stability is affected by the InnoDB double-write buffer under a high write load.
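
Since running without the double-write buffer relies on the filesystem journaling the data, it is worth checking the mount options before disabling it. The sketch below scans a mounts table in the /proc/mounts format; the helper name uses_data_journal and the example mount point /var/lib/mysql are assumptions for illustration.

```shell
#!/bin/sh
# uses_data_journal MOUNTPOINT [MOUNTS_FILE]
# Succeed if MOUNTPOINT is an ext4 mount whose options include data=journal.
# MOUNTS_FILE defaults to /proc/mounts
# (fields: device mountpoint fstype options ...).
uses_data_journal() {
    awk -v mp="$1" '
        $2 == mp && $3 == "ext4" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "data=journal") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}

# /var/lib/mysql is a placeholder for the mount point holding the datadir.
if uses_data_journal /var/lib/mysql; then
    echo "data=journal is active"
else
    echo "data=journal not found for this mount point"
fi
```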

Appendix: the relevant part of the my.cnf

innodb_buffer_pool_size = 12G
innodb_write_io_threads = 8 # or any of {1,2,4,8,16,32}
innodb_read_io_threads = 8
innodb_flush_log_at_trx_commit = 0 # must be 0 or 2 to really stress the double write buffer
innodb_log_file_size = 512M # or 32M, 64M
innodb_log_files_in_group = 2
innodb_file_per_table
innodb_flush_method=O_DIRECT # or O_DSYNC
innodb_buffer_pool_restore_at_startup=300 # on 5.5.x, important to warm up the buffer pool
#innodb_buffer_pool_load_at_startup=ON # on 5.6, important to warm up the buffer pool
#innodb_buffer_pool_dump_at_shutdown=ON # on 5.6, important to warm up the buffer pool
skip-innodb_doublewrite # or commented out to keep the double-write buffer enabled
innodb_flush_neighbor_pages=none # or area for spinning disks

The post Update on the InnoDB double-write buffer and EXT4 transactions appeared first on MySQL Performance Blog.
