Startup penetrates the mind of Glassholes
Think you couldn’t look any sillier than a Glasshole? Prepare to be proven wrong by a British startup that asks you to add a forehead-dimpling biosensor to the dorky spectacle that is Google headgear.…
Think you couldn’t look any sillier than a Glasshole? Prepare to be proven wrong by a British startup that asks you to add a forehead-dimpling biosensor to the dorky spectacle that is Google headgear.…
UPDATE A Microsoft employee has created an “unauthorised” Windows Mobile app to access a catch-up TV service in Australia.…
Google has tweaked Chrome Remote Desktop to let it access Linux machines.…
Google is already rifling through bug reports on its new product after an unpleasantly short amount of time: just two days after fandroids performed the great unboxening. Android Wear owners are now reporting that the devices won’t install paid apps.…
Second Look News has started to circulate about the next version of Windows 8.1 – namely update 2 – and even “Windows 9”.…
CA Technologies has sold its arcserve data protection business to Marlin Equity Partners for an undisclosed sum.…
Updated Local services review site Yelp has formally submitted a complaint against Google’s search business to Brussels’ competition officials.…
Worstall on Wednesday There has been much consternation in Middle America over Facebook’s decision to ban from its advertising network a picture taken by a doting mother of a little girl pulling the woman’s daughter’s swimming costume down.…
Earlier this month I wrote about vmstat iowait cpu numbers and some of the comments I got were advertising the use of util% as reported by the iostat tool instead. I find this number even more useless for MySQL performance tuning and capacity planning.
Now let me start by saying this is a really tricky and deceptive number. Many DBAs who report instances of their systems having a very busy IO subsystem said the util% in vmstat was above 99% and therefore they believe this number is a good indicator of an overloaded IO subsystem.
Indeed – when your IO subsystem is busy, up to its full capacity, the utilization should be very close to 100%. However, it is perfectly possible for the IO subsystem and MySQL with it to have plenty more capacity than when utilization is showing 100% – as I will show in an example.
Before that though lets see what the iostat manual page has to say on this topic – from this main page we can read:
%util
Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100% for devices serving requests serially. But for devices serving requests in parallel, such as RAID arrays and modern SSDs, this number does not reflect their performance limits.
Which says right here that the number is useless for pretty much any production database server that is likely to be running RAID, Flash/SSD, SAN or cloud storage (such as EBS) capable of handling multiple requests in parallel.
Let’s look at the following illustration. I will run sysbench on a system with a rather slow storage data size larger than buffer pool and uniform distribution to put pressure on the IO subsystem. I will use a read-only benchmark here as it keeps things more simple…
sysbench –num-threads=1 –max-requests=0 –max-time=6000000 –report-interval=10 –test=oltp –oltp-read-only=on –db-driver=mysql –oltp-table-size=100000000 –oltp-dist-type=uniform –init-rng=on –mysql-user=root –mysql-password= run
I’m seeing some 9 transactions per second, while disk utilization from iostat is at nearly 96%:
[ 80s] threads: 1, tps: 9.30, reads/s: 130.20, writes/s: 0.00 response time: 171.82ms (95%)
[ 90s] threads: 1, tps: 9.20, reads/s: 128.80, writes/s: 0.00 response time: 157.72ms (95%)
[ 100s] threads: 1, tps: 9.00, reads/s: 126.00, writes/s: 0.00 response time: 215.38ms (95%)
[ 110s] threads: 1, tps: 9.30, reads/s: 130.20, writes/s: 0.00 response time: 141.39ms (95%)Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
dm-0 0.00 0.00 127.90 0.70 4070.40 28.00 31.87 1.01 7.83 7.52 96.68
This makes a lot of sense – with read single thread read workload the drive should be only used getting data needed by the query, which will not be 100% as there is some extra time needed to process the query on the MySQL side as well as passing the result set back to sysbench.
So 96% utilization; 9 transactions per second, this is a close to full-system capacity with less than 5% of device time to spare, right?
Let’s run a benchmark with more concurrency – 4 threads at the time; we’ll see…
[ 110s] threads: 4, tps: 21.10, reads/s: 295.40, writes/s: 0.00 response time: 312.09ms (95%)
[ 120s] threads: 4, tps: 22.00, reads/s: 308.00, writes/s: 0.00 response time: 297.05ms (95%)
[ 130s] threads: 4, tps: 22.40, reads/s: 313.60, writes/s: 0.00 response time: 335.34ms (95%)Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
dm-0 0.00 0.00 295.40 0.90 9372.80 35.20 31.75 4.06 13.69 3.38 100.01
So we’re seeing 100% utilization now, but what is interesting – we’re able to reclaim much more than less than 5% which was left if we look at utilization – throughput of the system increased about 2.5x
Finally let’s do the test with 64 threads – this is more concurrency than exists at storage level which is conventional hard drives in RAID on this system…
[ 70s] threads: 64, tps: 42.90, reads/s: 600.60, writes/s: 0.00 response time: 2516.43ms (95%)
[ 80s] threads: 64, tps: 42.40, reads/s: 593.60, writes/s: 0.00 response time: 2613.15ms (95%)
[ 90s] threads: 64, tps: 44.80, reads/s: 627.20, writes/s: 0.00 response time: 2404.49ms (95%)Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
dm-0 0.00 0.00 601.20 0.80 19065.60 33.60 31.73 65.98 108.72 1.66 100.00
In this case we’re getting 4.5x of throughput compared to single thread and 100% utilization. We’re also getting almost double throughput of the run with 4 thread where 100% utilization was reported. This makes sense – there are 4 drives which can work in parallel and with many outstanding requests they are able to optimize their seeks better hence giving a bit more than 4x.
So what have we so ? The system which was 96% capacity but which could have driven still to provide 4.5x throughput – so it had plenty of extra IO capacity. More powerful storage might have significantly more ability to run requests in parallel so it is quite possible to see 10x or more room after utilization% starts to be reported close to 100%
So if utilization% is not very helpful what can we use to understand our database IO capacity better ? First lets look at the performance reported from those sysbench runs. If we look at 95% response time you can see 1 thread and 4 threads had relatively close 95% time growing just from 150ms to 250-300ms. This is the number I really like to look at- if system is able to respond to the queries with response time not significantly higher than it has with concurrency of 1 it is not overloaded. I like using 3x as multiplier – ie when 95% spikes to be more than 3x of the single concurrency the system might be getting to the overload.
With 64 threads the 95% response time is 15-20x of the one we see with single thread so it is surely overloaded.
Do we have anything reported by iostat which we can use in a similar way? It turns out we do! Check out the “await” column which tells us how much the requester had to wait for the IO request to be serviced. With single concurrency it is 7.8ms which is what this drives can do for random IO and is as good as it gets. With 4 threads it is 13.7ms – less than double of best possible, so also good enough… with concurrency of 64 it is however 108ms which is over 10x of what this device could produce with no waiting and which is surely sign of overload.
A couple words of caution. First, do not look at svctm which is not designed with parallel processing in mind. You can see in our case it actually gets better with high concurrency while really database had to wait a lot longer for requests submitted. Second, iostat mixes together reads and writes in single statistics which specifically for databases and especially on RAID with BBU might be like mixing apples and oranges together – writes should go to writeback cache and be acknowledged essentially instantly while reads only complete when actual data can be delivered. The tool pt-diskstats from Percona Tookit breaks them apart and so can be much more for storage tuning for database workload. Some of the recent operating systems also ship with sysstat/iostat which breaks out await to r_await and w_await which is much more useful.
Final note – I used a read-only workload on purpose – when it comes to writes things can be even more complicated – MySQL buffer pool can be more efficient with more intensive writes plus group commit might be able to commit a lot of transactions with single disk write. Still, the same base logic will apply.
Summary: The take away is simple – util% only shows if a device has at least one operation to deal with or is completely busy, which does not reflect actual utilization for a majority of modern IO subsystems. So you may have a lot of storage IO capacity left even when utilization is close to 100%.
The post Why %util number from iostat is meaningless for MySQL capacity planning appeared first on MySQL Performance Blog.
My previous post was an introduction to the TokuDB storage engine and aimed at explaining the basics of its design and how it differentiates from InnoDB/XtraDB. This post is all about motivating you to give it a try and have a look for yourself. Percona Server is not officially supporting TokuDB as of today, though the guys in the development team are working hard on this and the first GA release of Percona Server with TokuDB is looming on the horizon. However, there’s a beta version available now. For the installation tests in this post I’ve used the latest version of Percona Server alongside the accompanying TokuDB complement, which was published last week.
One of the tools Percona Support Engineers really love is Giuseppe Maxia’s MySQL Sandbox. It allows us to setup a sandbox running a MySQL instance of our choice and makes executing multiple ones for testing purposes very easily. Whenever a customer reaches us with a problem happening on a particular version of MySQL or Percona Server that we can reproduce, we quickly spin off a new sandbox and test it ourselves, so it’s very handy. I’ll use one here to explore this beta version of Percona Server with TokuDB but if you prefer you can install it the regular way using a package from our apt experimental or yum testing repositories.
We start by downloading the tarballs from here: TokuDB’s plugin has been packaged in its own tarball, so there are two to download. Once you get them let’s decompress both and create a unified working directory, compressing it again to create a single tarball we’ll use as source to create our sandbox:
[nando@test01 ~]# tar zxvf Percona-Server-5.6.17-rel66.0-608.Linux.x86_64.tar.gz [nando@test01 ~]# tar zxvf Percona-Server-5.6.17-rel66.0-608.TokuDB.Linux.x86_64.tar.gz [nando@test01 ~]# tar cfa Percona-Server-5.6.17-rel66.0-608.Linux.x86_64.tar.gz Percona-Server-5.6.17-rel66.0-608.Linux.x86_64/
Before going ahead, verify if you have transparent huge pages enabled as TokuDB won’t run if it is set. See this documentation page for explanation on what this is and how to disable it on Ubuntu. In my CentOS test server it was defined in a slightly different place and I’ve done the following to temporarily disable it:
[nando@test01]# echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled [nando@test01]# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
We’re now ready to create our sandbox. The following command should be enough (I’ve chosen to run Percona Server on port 5617, you can use any other available one):
[nando@test01 ~]# make_sandbox Percona-Server-5.6.17-rel66.0-608.Linux.x86_64.tar.gz -- --sandbox_directory=tokudb --sandbox_port=5617
If the creation process goes well you will see something like the following at the end:
.... sandbox server started Your sandbox server was installed in $HOME/sandboxes/tokudb
You should now be able to access the MySQL console on the sandbox with the default credentials; if you cannot, verify the log-in $HOME/sandboxes/tokudb/data/msandbox.err:
[nando@test01 ~]# mysql --socket=/tmp/mysql_sandbox5617.sock -umsandbox -pmsandbox
Alternatively, you can make use of the “use” script located inside the sandbox directory, which employs the same credentials (configured in the client section of the configuration file my.sandbox.cnf):
[nando@test01 ~]# cd sandboxes/tokudb/ [nando@test01 tokudb]# ./use
First thing to check is if TokuDB is being listed as an available storage engine:
mysql> SHOW ENGINES; +--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+ | Engine | Support | Comment | Transactions | XA | Savepoints | +--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+ | (...) | (...) | (...) | (...) | (...)| (...) | | TokuDB | YES | Tokutek TokuDB Storage Engine with Fractal Tree(tm) Technology | YES | YES | YES | | InnoDB | DEFAULT | Percona-XtraDB, Supports transactions, row-level locking, and foreign keys | YES | YES | YES | | (...) | (...) | (...) | NO | (...)| (...) | +--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
If that’s not the case, you may need to load the plugins manually – I had to do so in my sandbox; you may not need if you’re installing it from a package in a fresh setup:
mysql> INSTALL PLUGIN tokudb SONAME 'ha_tokudb.so';
TokuDB should now figure in the list of supported ENGINES but you still need to activate the related plugins:
mysql> INSTALL PLUGIN tokudb_file_map SONAME 'ha_tokudb.so'; mysql> INSTALL PLUGIN tokudb_fractal_tree_info SONAME 'ha_tokudb.so'; mysql> INSTALL PLUGIN tokudb_fractal_tree_block_map SONAME 'ha_tokudb.so'; mysql> INSTALL PLUGIN tokudb_trx SONAME 'ha_tokudb.so'; mysql> INSTALL PLUGIN tokudb_locks SONAME 'ha_tokudb.so'; mysql> INSTALL PLUGIN tokudb_lock_waits SONAME 'ha_tokudb.so';
Please note the INSTALL PLUGIN action results in permanent changes and thus is required only once. No modifications to MySQL’s configuration file are required to have those plugins load in subsequent server restarts.
Now you should see not only the main TokuDB plugin but also the add-ons to the INFORMATION SCHEMA:
mysql> SHOW PLUGINS; +-------------------------------+----------+--------------------+--------------+---------+ | Name | Status | Type | Library | License | +-------------------------------+----------+--------------------+--------------+---------+ | (...) | (...) | (...) | (...) | (...) | | TokuDB | ACTIVE | STORAGE ENGINE | ha_tokudb.so | GPL | | TokuDB_trx | ACTIVE | INFORMATION SCHEMA | ha_tokudb.so | GPL | | TokuDB_locks | ACTIVE | INFORMATION SCHEMA | ha_tokudb.so | GPL | | TokuDB_lock_waits | ACTIVE | INFORMATION SCHEMA | ha_tokudb.so | GPL | | TokuDB_file_map | ACTIVE | INFORMATION SCHEMA | ha_tokudb.so | GPL | | TokuDB_fractal_tree_info | ACTIVE | INFORMATION SCHEMA | ha_tokudb.so | GPL | | TokuDB_fractal_tree_block_map | ACTIVE | INFORMATION SCHEMA | ha_tokudb.so | GPL | +-------------------------------+----------+--------------------+--------------+---------+
We are now ready to create our first TokuDB table – the only different thing to do here is to specify TokuDB as the storage engine to use:
mysql> CREATE TABLE test.Numbers (id INT PRIMARY KEY, number VARCHAR(20)) ENGINE=TokuDB;
Note some unfamiliar files lying in the datadir; the details surrounding those is certainly good material for future posts:
[nando@test01]# ls ~/sandboxes/tokudb/data auto.cnf _test_Numbers_main_3_2_19.tokudb ibdata1 _test_Numbers_status_3_1_19.tokudb ib_logfile0 tokudb.directory ib_logfile1 tokudb.environment log000000000000.tokulog25 __tokudb_lock_dont_delete_me_data msandbox.err __tokudb_lock_dont_delete_me_environment mysql __tokudb_lock_dont_delete_me_logs mysql_sandbox5617.pid __tokudb_lock_dont_delete_me_recovery performance_schema __tokudb_lock_dont_delete_me_temp tc.log tokudb.rollback test
As noted by Vadim long ago, “Tuning of TokuDB is much easier than InnoDB, there’re only a few parameters to change, and actually out-of-box things running pretty well“:
mysql> show variables like 'tokudb_%'; +---------------------------------+------------------+ | Variable_name | Value | +---------------------------------+------------------+ | tokudb_alter_print_error | OFF | | tokudb_analyze_time | 5 | | tokudb_block_size | 4194304 | | tokudb_cache_size | 522651648 | | tokudb_check_jemalloc | 1 | | tokudb_checkpoint_lock | OFF | | tokudb_checkpoint_on_flush_logs | OFF | | tokudb_checkpointing_period | 60 | | tokudb_cleaner_iterations | 5 | | tokudb_cleaner_period | 1 | | tokudb_commit_sync | ON | | tokudb_create_index_online | ON | | tokudb_data_dir | | | tokudb_debug | 0 | | tokudb_directio | OFF | | tokudb_disable_hot_alter | OFF | | tokudb_disable_prefetching | OFF | | tokudb_disable_slow_alter | OFF | | tokudb_disable_slow_update | OFF | | tokudb_disable_slow_upsert | OFF | | tokudb_empty_scan | rl | | tokudb_fs_reserve_percent | 5 | | tokudb_fsync_log_period | 0 | | tokudb_hide_default_row_format | ON | | tokudb_init_flags | 11403457 | | tokudb_killed_time | 4000 | | tokudb_last_lock_timeout | | | tokudb_load_save_space | ON | | tokudb_loader_memory_size | 100000000 | | tokudb_lock_timeout | 4000 | | tokudb_lock_timeout_debug | 1 | | tokudb_log_dir | | | tokudb_max_lock_memory | 65331456 | | tokudb_pk_insert_mode | 1 | | tokudb_prelock_empty | ON | | tokudb_read_block_size | 65536 | | tokudb_read_buf_size | 131072 | | tokudb_read_status_frequency | 10000 | | tokudb_row_format | tokudb_zlib | | tokudb_tmp_dir | | | tokudb_version | tokudb-7.1.7-rc7 | | tokudb_write_status_frequency | 1000 | +---------------------------------+------------------+ 42 rows in set (0.00 sec)
The most important of the tokudb_ variables is arguably tokudb_cache_size. The test server where I ran those tests (test01) have a little less than 1G of memory and as you can see above TokuDB is “reserving” half (50%) of them to itself. That’s the default behavior but, of course, you can change it. And you must do it if you are also going to have InnoDB tables on your server – you should not overcommit memory between InnoDB and TokuDB engines. Shlomi Noach wrote a good post explaining the main TokuDB-specific variables and what they do. It’s definitely a worth read.
I hope you have fun testing Percona Server with TokuDB! If you run into any problems worth reporting, please let us know.
The post Percona Server with TokuDB (beta): Installation, configuration appeared first on MySQL Performance Blog.