This Week in Data with Colin Charles 28: Percona Live, MongoDB Transactions and Spectre/Meltdown Rumble On

Colin Charles

Colin CharlesJoin Percona Chief Evangelist Colin Charles as he covers happenings, gives pointers and provides musings on the open source database community.

In case you missed last week’s column, don’t forget to read the fairly lengthy FOSDEM MySQL & Friends DevRoom summary.

From a Percona Live Santa Clara 2018 standpoint, beyond the tutorials getting picked and scheduled, the talks have also been picked and scheduled (so you were very likely getting acceptance emails from the Hubb.me system by Tuesday). The rejections have not gone out yet but will follow soon. I expect the schedule to go live either today (end of week) or early next week. Cheapest tickets end March 4, so don’t wait to register!

Amazon Relational Database Service has had a lot of improvements in 2017, and the excellent summary from Jeff Barr is worth a read: Amazon Relational Database Service – Looking Back at 2017. Plenty of improvements for the MySQL, MariaDB Server, PostgreSQL and Aurora worlds.

Spectre/Meltdown and its impact are still being discovered. You need to read Brendan Gregg’s amazing post: KPTI/KAISER Meltdown Initial Performance Regressions. And if you visit Percona Live, you’ll see an amazing keynote from him too! Are you still using MyISAM? MyISAM and KPTI – Performance Implications From The Meltdown Fix suggests switching to Aria or InnoDB.

Probably the biggest news this week though? Transactions are coming to MongoDB 4.0. From the site, “MongoDB 4.0 will add support for multi-document transactions, making it the only database to combine the speed, flexibility, and power of the document model with ACID guarantees. Through snapshot isolation, transactions will provide a globally consistent view of data, and enforce all-or-nothing execution to maintain data integrity.”. You want to read the blog post, MongoDB Drops ACID (the title works if you’re an English native speaker, but maybe not quite if you aren’t). The summary diagram was a highlight for me because you can see the building blocks, plus future plans for MongoDB 4.2.

Releases

Link List

Upcoming appearances

  • SCALE16x – Pasadena, California, USA – March 8-11 2018
  • FOSSASIA 2018 – Singapore – March 22-25 2018

Feedback

I look forward to feedback/tips via e-mail at colin.charles@percona.com or on Twitter @bytebot.

Percona Server 5.6.28-76.1 is now available

Percona Server 5.6.28-76.1Percona is glad to announce the release of Percona Server 5.6.28-76.1 on January 12, 2016. Download the latest version from the Percona web site or from the Percona Software Repositories.

Based on MySQL 5.6.28, including all the bug fixes in it, Percona Server 5.6.28-76.1 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – and this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release can be found in the 5.6.28-76.1 milestone on Launchpad.

Bugs Fixed:

  • Clustering secondary index could not be created on a partitioned TokuDB table. Bug fixed #1527730 (DB-720).
  • When enabled, super-read-only option could break statement-based replication while executing a multi-table update statement on a slave. Bug fixed #1441259.
  • Running OPTIMIZE TABLE or ALTER TABLE without the ENGINE clause would silently change table engine if enforce_storage_engine variable was active. This could also result in system tables being changed to incompatible storage engines, breaking server operation. Bug fixed #1488055.
  • Setting the innodb_sched_priority_purge variable (available only in debug builds) while purge threads were stopped would cause a server crash. Bug fixed #1368552.
  • Small buffer pool size could cause XtraDB buffer flush thread to spin at 100% CPU. Bug fixed #1433432.
  • Enabling TokuDB with ps_tokudb_admin script inside the Docker container would cause an error due to insufficient privileges even when running as root. In order for this script to be used inside Docker containers this error has be been changed to a warning that a check is impossible. Bug fixed #1520890.
  • InnoDB status will start printing negative values for spin rounds per wait, if the wait number, even though being accounted as a signed 64-bit integer, will not fit into a signed 32-bit integer. Bug fixed #1527160 (upstream #79703).

Other bugs fixed: #1384595 (upstream #74579), #1384658 (upstream #74619), #1471141 (upstream #77705), #1179451, #1524763 and #1530102.

Release notes for Percona Server 5.6.28-76.1 are available in the online documentation. Please report any bugs on the launchpad bug tracker .

ObjectRocket’s David Murphy talks about MongoDB, Percona Live Amsterdam

Say hello to David Murphy, lead DBA and MongoDB Master at ObjectRocket (a Rackspace company). David works on sharding, tool building, very large-scale issues and high-performance MongoDB architecture. Prior to ObjectRocket he was a MySQL/NoSQL architect at Electronic Arts. David enjoys large-scale operational tool building, high performance OS and database tuning. He is also a core code contributor to MongoDB. He’ll be speaking next month at Percona Live Amsterdam, which runs Sept. 21-13. Enter promo code “BlogInterview” at registration to save €20!


Tom: David, your 3-hour tutorial is titled “Mongo Sharding from the trench: A Veterans field guide.” Did your experience in working with vast amounts of data at Rackspace give you a unique perspective, in view, that now puts you into a position to help people just getting started? Can you give a couple examples?

David: I think this has been something organically I grew into from the days of supporting Cpanel type MySQL instances to today. I have worked for a few verticals from hosts to advertising to gaming, finally entering into the platform service. The others give me a host of knowledge around how customer need systems to work, and then the number and range of workloads we see at Rackspace re-enforces this.

ObjectRocket's David Murphy talks MongoDB & Percona Live Amsterdam

ObjectRocket’s David Murphy

Many times the unique perspective comes with the scale such as someone calling up a single node to the multi-terabyte range. When they go to “shard” they can find the process that is normally very light and unnoticeable to most Mongo sharding can severally lock the metadata for an extended time. In other cases, the “balancer” might not be able to keep up with the amount of working being asked of it.

Toward the smaller end of the spectrum, having seen so many workloads from big to small. I can see similar thought processes and trends. When this happens having worked with some many of these workloads, and honestly having learned along the evolution of mongo helps me explain to clients the good, bad, and the hairy. Many times discussions come down to people not using connection pooling, non-indexed sorting, or complex operators such as $in, $nin, and more. In these cases, I can talk to people about the balance of using these concepts and when they will become bigger issues for them. My goal is to give them the enough knowledge to help determine when it is correct to use development resource to fix and issue, and when it’s manageable and that development could be better spent elsewhere.

 

Tom: The title of your tutorial also sounds like the perfect title of a book. Do you have any for one?

David: What an excellent question! I have thought about this. However, I think the goal of a book if I can find the time to do it. A working title might be “Mongo from the trenches: Surviving the minefield to get ahead”. I think the book might be broken into three sections:  “When should you use or not user Mongo”,  “Schema and Operatorators in the NoSQL world”, “Sharding”. I would do this as this could be a great mini book on its own the community really could use a level of depth similar to the MySQL 5.0 certification guides.  I liked these books as it helped someone understand all the bits of what to consider with your schema design and how it affects the application as much as the database hosts. Then in the second half more administration geared it took those same schema and design choices to help you manage them with confidence.

In the end, Mongo is a good product that works well for most people as it matures we need more and discussion. On topics such as what should you monitor, how you should predict issues, and how valuable are regular audits. Especially in an ecosystem where it’s easy to spin something up, launch it, and move on to the next project.

 

Tom: When and why would you recommend using MongoDB instead of MySQL?

David: I am glad I mentioned this is worthy of a book already, as it is such a complex topic and one that gets me very excited.

I feel there is a bit or misinformation on both sides of this field. Many in the MySQL camp of experts know when someone says they can’t get more than 1000 TPS via MySQL. 9 out of 10 times and design, not a technology issue,  the Mongo crowd love this and due to inherit sharding nature of Mongo they can sidestep these types of issues. Conversely in the Mongo camp you will hear how bad the  SQL standard is, however, omitting transactions for a moment, the same types of operations exist in MySQL and Mongo.  There are some interesting powers in the Mongo aggregation. However, SQL is more powerful and just as complex as some map reduce jobs and aggregations I have written.

As to your question, MySQL will always win in regards to repeatable reads to the database in a transaction. There is some talk of limited transactions in Mongo. However, these will likely not become global and cluster wide anytime soon if ever.  I don’t trust floats in Mongo for financials; it’s not that Mongo doesn’t do them but rather JavaScript type floats are present. Sometimes you need to store data as a 64-bit integer and do math in the app to make it a high precision float. MySQL, on the other hand, has excellent support for precision.

Another area is simply looking at the history of Mongo and MySQL.  Mongo until WiredTiger and  RocksDB were very similar to MyISAM from a locking behavior and support perspective. With the advent of the new storage system, we will-will see major leaps forward in types of flows you will want in Mongo. With the writer lock issue is gone, and locking between the systems is becoming more and more similar making deciding which much harder.

The news is not all use. However, subdocuments and array support in Mongo is amazing there are so many things  I can do in Mongo that even in bitwise SET/ENUM operators I could not do. So if you need that type of system, or you want to create a semi denormalize for of a view in the database. Mongo can do this with ease and on the fly. MySQL, on the other hand, would take careful planning and need whole tables updated.  In this regard I feel more people could use Mongo and is ability to have a versioned document schema allowing more incremental changes to documents. With new code  releases, allowing the application to read old version and “upgrade” them to the latest form. Removing a whole flurry of maintenance related pains that RDBMs have to the frustration of developers who just want to launch the new product.

The last thing I would want to say here is you need not choose, why not use both. Mongo can be very powerful for keeping a semi denormalized version of the data that is nimble to allow fast application or system updates and features. Leaving MySQL for a very specific workload that need the precision are simple are not expected to have schema changes.  I am a huge fan of keeping the transactional portions in MySQL, and the rest in Mongo. Allowing you to scale quickly up and down the build of your data needs, and more slowly change the parts that need to be 100% consistent all of the time with no room for eventual consistency.

 

Tom: What another session(s) are you most looking forward to besides your own at Percona Live Amsterdam?

David: There are a few that are near and dear to me.

Turtles all the way down: tuning Linux for database workloads” looks like a great one. It is one view I have always had, and DBA’s should be DBA’s,  SysAdmins, and Storage people rolled into one. That way they can understand the impacts of the application down to the blocks the database reads.

TokuDB internals” is another one. I have used TokuDB in MySQL and Mongo to some degree but as it has never had in-depth documentation. A topic like that is a great way to fill any gaps for experienced and new people alike.

Database Reliability Engineering” looks like a great talk from a great speaker.

As an InnoDB geek, I like the idea around “Understanding InnoDB locks: case studies.”

I see a huge amount of potential for MaxScale if anyone else is curious, “Anatomy of a Proxy Server: MaxScale Internals” should be good for R/W splits and split writing type cases.

Finally, one of my favorite people is Charity as she always is so energetic and can get to the heart of the matter. If you are not going to “Upgrade your database: without losing your data, your perf or your mind” you are missing out!

 

Tom: Thanks for speaking with me, David! Is there anything else you’d like to add: either about Rackspace or Percona Live Amsterdam?

David: In regards to Rackspace, I urge everyone to check out the Data Services group.  We handle everything from Redis to Hadoop with a goal of augmenting your groups or providing experts to help keep your uptime as high as possible. With options for dedicated hosts to platform type services, there is something that helps everyone. Rackspace is not just a cloud company but a real support company that provides amazing hardware to use, or support for other hardware location that is growing rapidly.

With Percona Amsterdam, everyone should come the group of speakers is simply amazing, I for one am excited by so many topics because they are all so compelling. Outside of that you will it hard find another a gathering of database experts with multiple technologies under their belt and who truly believe in the move to picking the right technology for the right use case.

The post ObjectRocket’s David Murphy talks about MongoDB, Percona Live Amsterdam appeared first on MySQL Performance Blog.

Reference architecture for a write-intensive MySQL deployment

We designed Percona Cloud Tools (both hardware and software setup) to handle a very high-intensive MySQL write workload. For example, we already observe inserts of 1bln+ datapoints per day. So I wanted to share what kind of hardware we use to achieve this result.

Let me describe what we use, and later I will explain why.

Server:

  • Chassis: Supermicro SC825TQ-R740LPB 2U Rackmount Chassis
  • Motherboard: Supermicro X9DRI-F dual socket
  • CPU: Dual Intel Xeon Ivy Bridge E5-2643v2 (6x 3.5Ghz cores, 12x HT cores, 25M L3)
  • Memory: 256GB (16x 16GB 256-bit quad-channel) ECC registered DDR3-1600
  • Raid: LSI MegaRAID 9260-4i 4-port 6G/s hardware RAID controller, 512M buffer
  • MainStorage: PCIe SSD HGST FlashMAX II 4.8TB
  • Secondary Storage (OS, logs): RAID 1 over 2x 3TB hard drives

Software:

When selecting hardware for your application, you need to look at many aspects – typically you’re looking for a solution for which you already have experience in working with and has also proved to be the most efficient option. For us it has been as follows:

Cloud vs Bare Metal
We have experience having hardware hosted at the data center as well as cash for upfront investments in hardware so we decided to go for physical self-hosted hardware instead of the cloud. Going this route also gave us maximum flexibility in choosing a hardware setup that was the most optimal for our application rather than selecting one of the stock options.

Scale Up vs Scale Out
We have designed a system from scratch to be able to utilize multiple servers through sharding – so our main concern is choosing the most optimal configuration for the server and provisioning servers as needed. In addition to raw performance we also need to consider power usage and overhead of managing many servers which typically makes having slightly more high-end hardware worth it.

Resource Usage
Every application uses resources in different ways so an optimal configuration will be different depending on your application. Yet all applications use the same resources you need to consider. Typically you want to plan for all of your resources to be substantially used – providing some margin for spikes and maintenance.

CPU

  • Our application processes a lot of data and uses the TokuDB storage engine which uses a lot of CPU for compression, so we needed powerful CPUs.
  • Many MySQL functions are not parallel, think executing single query or Alter table so we’re going for CPU with faster cores rather than larger amount of cores. The resulting configuration with 2 sockets giving 12 cores and 24 threads is good enough for our workloads.
  • Lower end CPUs such as Xeon E3 have very attractive price/performance but only support 32GB of memory which was not enough for our application.

Memory

  • For database boxes memory is mainly used as a cache, so depending on your application you may be better off investing in memory or storage for optimal performance. Check out this blog post for more details.
  • Accessing data in memory is much faster than even on the fastest flash storage so it is still important.
    For our workload having recent data in memory is very important so we get as much “cheap” memory as we can populating all 16 slots with 16GB dimms which have attractive cost per GB at this point.

Storage
There are multiple uses for the storage so there are many variables to consider

  • Bandwidth
    • We need to be able access data on the storage device quickly and with stable response time. HGST FlashMax II has been able to meet these very demanding needs.
  • Endurance
    • When using flash storage you need to worry about endurance – how much beating with writes flash storage can handle before it wears out. Some low cost MLC SSDs would wear out in the time frame of weeks if being written with maximum speed. HGST FlashMax II has endurance rating of 10 Petabytes written (for a random workload) – 30 Petabytes written (for a sequential workload)
    • We also use TokuDB storage engine which significantly reduces amount of writes compared to Innodb.
  • Durability
    • Does the storage provide true durability with data guaranteed to be persisted when write is acknowledged at the operating system level when power goes down or is loss possible?
      We do not want to risk database corruption in case of power failure so we were looking for storage solution which guarantees durability.
      HGST FlashMax II guarantees durability which has been confirmed by our stress tests.
  • Size
    • To scale application storage demands you need to scale both number of IO operations storage can handle and storage size. For flash storage it is often the size which becomes limiting factor.
      HGST FlashMax II 4.8 TB capacity is best available on the market which allows us to go “All Flash” and achieve very quick data access to all our data set.
  • Secondary Storage
    • Not every application need requires flash storage properties.
    • We have secondary storage with conventional drives for operating system and logs.
      Sequential read/write pattern works well with low cost conventional drives and also allow us to increase flash life time, having it handling less writes.
    • We’re using RAID with BBU for secondary storage to be able to have fully durable binary logs without paying high performance penalty.

Why PCIe SSD over SATA SSD?
There are arguments that SATA SSD provides just a good enough performance for MySQL and there is no need for PCIe. While these arguments are valid in one dimension, there are several more to consider.

First, like I said PCIe SSD still provides a best absolute response time and it is an important factor for an end user experience in SaaS systems like Percona Cloud Tools.
Second, consider maintenance operations like backup, ALTER TABLES or slave setups. While these operations are boring and do not get as much attention as a response time or throughput in benchmarks, it is still operations that DBAs performs basically daily, and it is very important to finish a backup or ALTER TABLE in a predictable time, especially on 3-4TB datasize range. And this is where PCIe SSD performs much better than SATA SSDs. For SATA SSD, especially bigger size, write endurance is another point of concern.

Why TokuDB engine?
The TokuDB engine is the best when it comes to insert operations to a huge dataset, and few more factors makes it a no-brainer:

  • TokuDB compression is a huge win. I estimate into this storage ( FlashMAX II 4.8TB) we will fit about 20-30TB of raw data.
  • TokuDB is SSD friendly, as it performs much less data writes per INSERT operation than InnoDB, which greatly extends SSD (which is, well, expensive to say the least) lifetime.

The post Reference architecture for a write-intensive MySQL deployment appeared first on MySQL Performance Blog.

Why TokuDB hates Transparent HugePages

If you try to install the TokuDB storage engine on a modern Linux distribution it might fail with following error message:

2014-07-17 19:02:55 13865 [ERROR] TokuDB will not run with transparent huge pages enabled.
2014-07-17 19:02:55 13865 [ERROR] Please disable them to continue.
2014-07-17 19:02:55 13865 [ERROR] (echo never > /sys/kernel/mm/transparent_hugepage/enabled)

You might be curious why TokuDB refuses to start with Transparent HugePages. Are they not a good thing… allowing smaller kernel page tables and less TLB misses when accessing data in the buffer pool? I was curious, so I asked Tim Callaghan this very question.

This problem originates with TokuDB using jemalloc memory allocator, which uses a particular trick to deal with memory fragmentation. The classical problem with memory allocators is fragmentation – if you allocated a say 2MB chunk from the operating system (typically using mmap),  as the process runs it is likely some of that 2MB memory block will become free but not all of it, hence it can’t be given back to operating system completely. jemalloc uses a clever trick being able to give back portions of memory allocated in such a way through madvise(…, MADV_DONTNEED) call.

Now what happens when you use transparent huge pages? In this case the operating system (and CPU, really) works with pages of a much larger size which only can be unmapped from the address space in its entirety – which does not work when smaller objects are freed which produce smaller free “holes.”

As a result, without being able to free memory efficiently the amount of allocated memory may grow unbound until the process starts to swap out – and in the end being killed by “out of memory” killer at least under some workloads. This is not a behavior you want to see from the database server. As such requiring to disable huge pages is a better choice.

Having said that this is pretty crude requirement/solution – disabling huge pages on complete operating system image to make one application work while others might be negatively impacted. I hope with a future jemalloc version/kernel releases there will be solution where jemalloc simply prevents huge pages usage for its allocations.

Using jemalloc and its approach to remove pages from resident space also makes TokuDB a lot different than typical MySQL instances running Innodb from the process space. With Innodb VSZ and RSS are often close. In fact we often monitor VSZ to ensure it is not excessively large to avoid danger of process starting to swap actively or be killed with OOM killer. TokuDB however often can look like this

[root@smt1 mysql]# ps aux | grep mysqld
mysql 14604 21.8 50.6 12922416 4083016 pts/0 Sl Jul17 1453:27 /usr/sbin/mysqld –basedir=/usr –datadir=/var/lib/mysql –plugin-dir=/usr/lib64/mysql/plugin –user=mysql –log-error=/var/lib/mysql/smt1.pz.percona.com.err –pid-file=/var/lib/mysql/smt1.pz.percona.com.pid
root 28937 0.0 0.0 103244 852 pts/2 S+ 10:38 0:00 grep mysqld

In this case TokuDB is run with defaults on 8GB system – it takes approximately 50% of memory in terms of RSS size, however the VSZ of the process is over 12GB – this is a lot more than memory available.

This is completely fine for TokuDB. If I would not have Transparent HugePages disabled, though, my RSS would be a lot closer to VSZ causing intense swapping or even process killed by OOM killer.

In addition to explaining this to me, Tim Callaghan was also kind enough to share some links on this issue from other companies such as Oracle, NuoDB , Splunk, SAP, SAP(2), which provide more background information on this topic.

The post Why TokuDB hates Transparent HugePages appeared first on MySQL Performance Blog.

Percona Server with TokuDB (beta): Installation, configuration

My previous post was an introduction to the TokuDB storage engine and aimed at explaining the basics of its design and how it differentiates from InnoDB/XtraDB. This post is all about motivating you to give it a try and have a look for yourself. Percona Server is not officially supporting TokuDB as of today, though the guys in the development team are working hard on this and the first GA release of Percona Server with TokuDB is looming on the horizon. However, there’s a beta version available now. For the installation tests in this post I’ve used the latest version of Percona Server alongside the accompanying TokuDB complement, which was published last week.

Installing Percona Server with TokuDB on a sandbox

One of the tools Percona Support Engineers really love is Giuseppe Maxia’s MySQL Sandbox. It allows us to setup a sandbox running a MySQL instance of our choice and makes executing multiple ones for testing purposes very easily. Whenever a customer reaches us with a problem happening on a particular version of MySQL or Percona Server that we can reproduce, we quickly spin off a new sandbox and test it ourselves, so it’s very handy. I’ll use one here to explore this beta version of Percona Server with TokuDB but if you prefer you can install it the regular way using a package from our apt experimental or yum testing repositories.

We start by downloading the tarballs from here: TokuDB’s plugin has been packaged in its own tarball, so there are two to download. Once you get them let’s decompress both and create a unified working directory, compressing it again to create a single tarball we’ll use as source to create our sandbox:

[nando@test01 ~]# tar zxvf Percona-Server-5.6.17-rel66.0-608.Linux.x86_64.tar.gz
[nando@test01 ~]# tar zxvf Percona-Server-5.6.17-rel66.0-608.TokuDB.Linux.x86_64.tar.gz
[nando@test01 ~]# tar cfa Percona-Server-5.6.17-rel66.0-608.Linux.x86_64.tar.gz Percona-Server-5.6.17-rel66.0-608.Linux.x86_64/

Before going ahead, verify if you have transparent huge pages enabled as TokuDB won’t run if it is set. See this documentation page for explanation on what this is and how to disable it on Ubuntu. In my CentOS test server it was defined in a slightly different place and I’ve done the following to temporarily disable it:

[nando@test01]# echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
[nando@test01]# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

We’re now ready to create our sandbox. The following command should be enough (I’ve chosen to run Percona Server on port 5617, you can use any other available one):

[nando@test01 ~]# make_sandbox Percona-Server-5.6.17-rel66.0-608.Linux.x86_64.tar.gz -- --sandbox_directory=tokudb --sandbox_port=5617

If the creation process goes well you will see something like the following at the end:

.... sandbox server started
Your sandbox server was installed in $HOME/sandboxes/tokudb

You should now be able to access the MySQL console on the sandbox with the default credentials; if you cannot, verify the log-in $HOME/sandboxes/tokudb/data/msandbox.err:

[nando@test01 ~]# mysql --socket=/tmp/mysql_sandbox5617.sock -umsandbox -pmsandbox

Alternatively, you can make use of the “use” script located inside the sandbox directory, which employs the same credentials (configured in the client section of the configuration file my.sandbox.cnf):

[nando@test01 ~]# cd sandboxes/tokudb/
[nando@test01 tokudb]# ./use

First thing to check is if TokuDB is being listed as an available storage engine:

mysql> SHOW ENGINES;
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                    | Transactions | XA   | Savepoints |
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
| (...)              | (...)   | (...)                                                                      | (...)        | (...)| (...)      |
| TokuDB             | YES     | Tokutek TokuDB Storage Engine with Fractal Tree(tm) Technology             | YES          | YES  | YES        |
| InnoDB             | DEFAULT | Percona-XtraDB, Supports transactions, row-level locking, and foreign keys | YES          | YES  | YES        |
| (...)          | (...)       | (...)                                                                      | NO           | (...)| (...)      |
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+

If that’s not the case, you may need to load the plugins manually – I had to do so in my sandbox; you may not need if you’re installing it from a package in a fresh setup:

mysql> INSTALL PLUGIN tokudb SONAME 'ha_tokudb.so';

TokuDB should now figure in the list of supported ENGINES but you still need to activate the related plugins:

mysql> INSTALL PLUGIN tokudb_file_map SONAME 'ha_tokudb.so';
mysql> INSTALL PLUGIN tokudb_fractal_tree_info SONAME 'ha_tokudb.so';
mysql> INSTALL PLUGIN tokudb_fractal_tree_block_map SONAME 'ha_tokudb.so';
mysql> INSTALL PLUGIN tokudb_trx SONAME 'ha_tokudb.so';
mysql> INSTALL PLUGIN tokudb_locks SONAME 'ha_tokudb.so';
mysql> INSTALL PLUGIN tokudb_lock_waits SONAME 'ha_tokudb.so';

Please note the INSTALL PLUGIN action results in permanent changes and thus is required only once. No modifications to MySQL’s configuration file are required to have those plugins load in subsequent server restarts.

Now you should see not only the main TokuDB plugin but also the add-ons to the INFORMATION SCHEMA:

mysql> SHOW PLUGINS;
+-------------------------------+----------+--------------------+--------------+---------+
| Name                          | Status   | Type               | Library      | License |
+-------------------------------+----------+--------------------+--------------+---------+
| (...)                         | (...)    | (...)              | (...)        | (...)   |
| TokuDB                        | ACTIVE   | STORAGE ENGINE     | ha_tokudb.so | GPL     |
| TokuDB_trx                    | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_locks                  | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_lock_waits             | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_file_map               | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_fractal_tree_info      | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_fractal_tree_block_map | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
+-------------------------------+----------+--------------------+--------------+---------+

We are now ready to create our first TokuDB table – the only different thing to do here is to specify TokuDB as the storage engine to use:

mysql> CREATE TABLE test.Numbers (id INT PRIMARY KEY, number VARCHAR(20)) ENGINE=TokuDB;

Note some unfamiliar files lying in the datadir; the details surrounding those is certainly good material for future posts:

[nando@test01]# ls ~/sandboxes/tokudb/data
auto.cnf                   _test_Numbers_main_3_2_19.tokudb
ibdata1                    _test_Numbers_status_3_1_19.tokudb
ib_logfile0                tokudb.directory
ib_logfile1                tokudb.environment
log000000000000.tokulog25  __tokudb_lock_dont_delete_me_data
msandbox.err               __tokudb_lock_dont_delete_me_environment
mysql                      __tokudb_lock_dont_delete_me_logs
mysql_sandbox5617.pid      __tokudb_lock_dont_delete_me_recovery
performance_schema         __tokudb_lock_dont_delete_me_temp
tc.log                     tokudb.rollback
test

Configuration: what’s really important

As noted by Vadim long ago, “Tuning of TokuDB is much easier than InnoDB, there’re only a few parameters to change, and actually out-of-box things running pretty well“:

mysql> show variables like 'tokudb_%';
+---------------------------------+------------------+
| Variable_name                   | Value            |
+---------------------------------+------------------+
| tokudb_alter_print_error        | OFF              |
| tokudb_analyze_time             | 5                |
| tokudb_block_size               | 4194304          |
| tokudb_cache_size               | 522651648        |
| tokudb_check_jemalloc           | 1                |
| tokudb_checkpoint_lock          | OFF              |
| tokudb_checkpoint_on_flush_logs | OFF              |
| tokudb_checkpointing_period     | 60               |
| tokudb_cleaner_iterations       | 5                |
| tokudb_cleaner_period           | 1                |
| tokudb_commit_sync              | ON               |
| tokudb_create_index_online      | ON               |
| tokudb_data_dir                 |                  |
| tokudb_debug                    | 0                |
| tokudb_directio                 | OFF              |
| tokudb_disable_hot_alter        | OFF              |
| tokudb_disable_prefetching      | OFF              |
| tokudb_disable_slow_alter       | OFF              |
| tokudb_disable_slow_update      | OFF              |
| tokudb_disable_slow_upsert      | OFF              |
| tokudb_empty_scan               | rl               |
| tokudb_fs_reserve_percent       | 5                |
| tokudb_fsync_log_period         | 0                |
| tokudb_hide_default_row_format  | ON               |
| tokudb_init_flags               | 11403457         |
| tokudb_killed_time              | 4000             |
| tokudb_last_lock_timeout        |                  |
| tokudb_load_save_space          | ON               |
| tokudb_loader_memory_size       | 100000000        |
| tokudb_lock_timeout             | 4000             |
| tokudb_lock_timeout_debug       | 1                |
| tokudb_log_dir                  |                  |
| tokudb_max_lock_memory          | 65331456         |
| tokudb_pk_insert_mode           | 1                |
| tokudb_prelock_empty            | ON               |
| tokudb_read_block_size          | 65536            |
| tokudb_read_buf_size            | 131072           |
| tokudb_read_status_frequency    | 10000            |
| tokudb_row_format               | tokudb_zlib      |
| tokudb_tmp_dir                  |                  |
| tokudb_version                  | tokudb-7.1.7-rc7 |
| tokudb_write_status_frequency   | 1000             |
+---------------------------------+------------------+
42 rows in set (0.00 sec)

The most important of the tokudb_ variables is arguably tokudb_cache_size. The test server where I ran those tests (test01) have a little less than 1G of memory and as you can see above TokuDB is “reserving” half (50%) of them to itself. That’s the default behavior but, of course, you can change it. And you must do it if you are also going to have InnoDB tables on your server – you should not overcommit memory between InnoDB and TokuDB engines. Shlomi Noach wrote a good post explaining the main TokuDB-specific variables and what they do. It’s definitely a worth read.

I hope you have fun testing Percona Server with TokuDB! If you run into any problems worth reporting, please let us know.

The post Percona Server with TokuDB (beta): Installation, configuration appeared first on MySQL Performance Blog.

Percona Server 5.6.19-67.0 with TokuDB (GA) now available

Percona ServerPercona is glad to announce the release of Percona Server 5.6.19-67.0 on July 1, 2014. Download the latest version from the Percona web site or from the Percona Software Repositories.

Based on MySQL 5.6.19, including all the bug fixes in it, Percona Server 5.6.19-67.0 is the current GA release in the Percona Server 5.6 series. All of Percona’s software is open-source and free. Complete details of this release can be found in the 5.6.19-67.0 milestone on Launchpad.

New Features:

  • Percona has merged a contributed patch by Kostja Osipov implementing the Multiple user level locks per connection feature. This feature fixes the upstream bugs: #1118 and #67806.
  • TokuDB storage engine support is now considered general availability (GA) quality. The TokuDB storage engine from Tokutek improves scalability and the operational efficiency of MySQL with faster performance and increased compression. It is available as a separate package and can be installed along with the Percona Server by following the instructions in the release documentation.
  • Percona Server now supports the MTR --valgrind option for a server that is either statically or dynamically linked with jemalloc.

Bugs Fixed:

  • The libperconaserverclient18.1 package was missing the library files. Bug fixed #1329911.
  • Percona Server introduced a regression in 5.6.17-66.0 when support for TokuDB storage engine was initially introduced. This regression caused spurious “wrong table structure” errors for PERFORMANCE_SCHEMA tables. Bug fixed #1329772.
  • Race condition in group commit code could lead to a race condition in PFS instrumentation code resulting in a server crash. Bug fixed #1309026 (upstream #72681).

Other bugs fixed: #1326348 and #1167486.

NOTE: There was no Percona Server 5.6.18 release because there was no MySQL Community Server 5.6.18 release. That version number was used for a MySQL Enterprise Edition release to address the OpenSSL “Heartbleed” issue.

Release notes for Percona Server 5.6.19-67.0 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

The post Percona Server 5.6.19-67.0 with TokuDB (GA) now available appeared first on MySQL Performance Blog.

Percona Server with TokuDB: Packing 15TB into local SSDs

Two weeks ago we released an Alpha release of Percona Server with TokuDB. Right now I am on a final stage of evaluation of TokuDB for using in our project Percona Cloud Tools and it looks promising.

What is the most attractive in TokuDB? For me it is compression, but not just compression: TokuDB provides great performance over compressed data.

In my synthetic tests I saw a compression ratio of 10:1 (TokuDB LZMA to InnoDB uncompressed), in the real production data it is less, 6:1, but still impressive.

In our servers we have 4 x SSD Crucial M500 960GB combined in RAID5, which give 2877.0 GB of usable space. With TokuDB we should be able to pack around 15TB of raw data. Of course we can try InnoDB compression, but the best we can get is 2x compression without sacrificing performance.

And of course TokuDB is transaction, fully ACID-compliant with automatic crash-recovery storage engine.

This all makes TokuDB a very attractive choice for handling terabytes of data (or as it popular to say nowadays “BigData”).

One of first operational questions we have is how to handle backups.
For backups we use LVM partitions and the mylvmbackup tool. Unfortunately Percona XtraBackup is not able to handle TokuDB tables (and probably won’t be able anytime soon). The other choice is to use TokuDB Hot back-up, available by Tokutek Enterprise Subscription. I did not test it myself, so I can’t provide any feedback.

And of course there are things which I do not fully like in TokuDB:

  • No Foreign Keys support. It is not a big issue for us, but I know for some users this is a showstopper.
  • Time-based checkpoints. You may not notice a direct effect from this, but we clearly see it in our benchmarks. Every 60 sec (default timeperiod between checkpoints) we see a drop in throughput during write-intensive benchmarks. It is very similar to drops in InnoDB we tried to solve (and are still trying), for example see Adaptive flushing in MySQL 5.6. My advice to the Tokutek team would be to also look into a fuzzy check-pointing, instead of time-based.
  • All TokuDB files are stored in a single directory, sometime with mangled filenames. This especially becomes bad in sharding or multi-tenant environments when tens of thousands of files are in the same directory

Well, I guess for now, we will take these limitations as TokuDB specific and will deal with them.

Next week we plan on a Beta release of Percona Server with TokuDB, so stay tuned!

The post Percona Server with TokuDB: Packing 15TB into local SSDs appeared first on MySQL Performance Blog.

New MySQL features, related technologies at Percona Live London

Call for papers: Percona Live LondonThe upcoming Percona Live London conference, November 11-12, features quite a number of talks about the latest MySQL features and related technologies. There will be a lots of talks about the new MySQL 5.6 features:

  • Opening keynote highlights MySQL 5.6 new features.
  • New InnoDB Compression talk will cover the new compression algorithm, implemented by Facebook and included in MySQL 5.6.
  • New MySQL Replication features, including multi-treaded slave applier, Global Transaction Ids which can help for automatic failover and lots of performance optimizations and much).

Altho MySQL 5.6 is a very important milestone there are much more interesting technologies going on around MySQL. Here are some of the talks, which look pretty interesting (at least for me):

NoSQL World

Hadoop

Hadoop is a relatively new topic at MySQL conferences, however, it gains more and more traction, especially after MySQL applier for Hadoop (alpha version) release. Danil  Zburivsky will be talking about building a data warehouse with Hadoop and MySQL. I personally have a strong interest in Hadoop and recently  did a webinar about this topic. Hadoop concept is very different from MySQL, but there are a lots of real use cases where Hadoop will fit best.

MongoDB

MongoDB is a another interesting technology. There will be full MongoDB tutorial by Stephane Combaudon as well as MongoDB for MySQL Guru talk by Robert Hodges (Continuent) and Tim Callaghan (Tokutek)

New MySQL Cluster features.

MySQL Cluster 7.3 (based on  a mainline MySQL Server 5.6 release + NDBCluster storage engine) was recently released. Johan Andersson will cover some new MySQL Cluster 7.3 features in his MySQL Cluster Performance Tuning talk, including foreign key constrains (Foreign key constrains were the “showstopper” for many customers), memcached integration, etc. I knew Johan from the early MySQL Ab days and he always was (and now is) “the MySQL Cluster guy”, so I’m sure he will show some new MySQL cluster magic.

Other Storage Engines

TokuDB features the fractal tree and compression. Vadim blogged about using TokuDB  for storing timeseries data and it looks promising. Tim Callaghan of Tokutek will talk about Fractal Tree Indexes.

MariaDB contains the CONNECT engine (to join data between Oracle and Cassandra for example) and SPIDER storage engine (for automatic “sharding”). Colin Charles from Monty Program Ab will talk about new MariaDB features

Percona Live London is approaching fast so be sure to register today!

The post New MySQL features, related technologies at Percona Live London appeared first on MySQL Performance Blog.