Spaniard claims WWII WAR HERO pigeon code crack. Explain please

Come on, give it up and let Speckled Jim rest in peace

A 22-year-old Spaniard claims that he’s cracked a previously unsolved WWII coded message. Others have claimed this before and there’s nothing particularly solid to back up the latest effort, but let’s have a look at it anyway.…

The language of compression

Leif Walsh will talk about the language of compression at Percona Live Amsterdam

Storage. Everyone needs it. Whether your data is in MySQL, a NoSQL, or somewhere in the cloud, with ever-growing data volumes – along with the need for SSDs to cut latency and replication to provide insurance – an organization’s storage footprint is an important place to look for savings. That’s where compression comes in (squeeze!) to save disk space.

Two Sigma software engineer Leif Walsh speaks the language of compression. Fluently. In fact, he’ll be speaking on that exact subject September 22 during the Percona Live conference in Amsterdam.

I asked him about his talk, and about Amsterdam, the other day. Here’s what he had to say.

* * *

Tom: Hi Leif, how will your talk help IT decision-makers cut through the marketing mumbo-jumbo on what’s important to focus on and what is not?
Leif: My talk will have three lessons aimed at those making storage decisions for their company:

  1. What are the key factors to consider when evaluating storage options, and how can they affect your bottom line?  This is not only how storage tech influences your hardware, operations, and management costs, but also how it can facilitate new development initiatives and cut time-to-market for your products.
  2. How should you read benchmarks and marketing materials about storage technology?  You’ll learn what to look for in promotional material, and how to think critically about whether that material is applicable to your business needs.
  3. What’s the most effective way to communicate with storage vendors about your application’s requirements?  A lot of time can be spent in the early stages of a relationship in finding a common language for users and vendors to have meaningful discussions about users’ needs and vendors’ capacity to meet those needs.  With the tools you’ll learn in my talk, you’ll be able to accelerate quickly to the high-bandwidth conversations you need to have in order to make the right decision, and consequently, you’ll be empowered to evaluate more choices to find the best one faster.

Tom: In addition to IT decision-makers, who else should attend your session and what will they take away afterward?
Leif: My talk is primarily about the language that everyone in the storage community should be using to communicate. Therefore, storage vendors should attend to get ideas for how to express their benchmarks and their system’s properties more effectively, and application developers and operations people will learn strategies for getting better support and for making a convincing case to the decision-makers in their own company.

Tom: Which session(s) are you most looking forward to besides your own?
Leif: Sam Kottler is a good friend and an intensely experienced systems engineer with a dynamic and boisterous personality, so I can’t wait to hear more about his experiences with Linux tuning.

As one of the original developers of TokuMX, I’ll absolutely have to check out Stephane’s talk about it, but I promise not to heckle. Charity Majors is always hilarious and has great experiences and insights to share, so I’ll definitely check out her talk too.

* * *

Catch Leif’s talk at Percona Live in Amsterdam September 21-23. Enter the promo code “BlogInterview” at registration and save €20! Register now!

The post The language of compression appeared first on MySQL Performance Blog.

MySQL is crashing: a support engineer’s point of view

In MySQL QA Episode #12, “MySQL is Crashing, now what?,” Roel demonstrated how to collect crash-related information that will help Percona discover what the issue is that you are experiencing, and fix it.

As a Support Engineer I (Sveta) am very happy to see this post, but as someone who understands writing better than recording, I’d like to have the same information in textual form. We discussed it and decided to do a joint blog post. Hence, this post :)

If you haven’t seen the video yet, or you do not have any experience with gdb, core files and crashes, I highly recommend watching it first.

Once you have an idea of why crashes happen, what to do after it happens in your environment, and how to open a Support issue and/or a bug report, you’re ready for the next step: which information do you need to provide? Note that the more complete and comprehensive information you provide, the quicker the evaluation and potential fix process will go – it’s a win-win situation!

First, we need the MySQL error log file. If possible, please send us the full error log file. Users often send only the part they think is relevant, but the error log file can contain other information recorded before the crash happened: for example, records about table corruption, lack of disk space, issues with the InnoDB dictionary, etc.

If your error log is quite large, please note that it will compress very well with a standard compression tool like gzip. If for some reason you cannot send the full error log file, please send all lines written after the words “mysqld: ready for connections” (as seen the last time before the actual crash) until the end of the error log file (alternatively, you can search for lines starting with the word “Version:”). If you use scripts (or mysqld_safe) which automatically restart MySQL Server in case of disaster, make sure you search for the server start that precedes the crash, not the one that follows the automatic restart.

An example which includes an automatic restart as mentioned above:

2015-08-03 14:24:03 9911 [Note] /home/sveta/SharedData/Downloads/5.6.25/bin/mysqld: ready for connections.
Version: '5.6.25-73.1-log'  socket: '/tmp/mysql_sandbox21690.sock'  port: 21690  Percona Server (GPL), Release 73.1, Revision 07b797f
2015-08-03 14:24:25 7f5b193f9700 InnoDB: Buffer pool(s) load completed at 150803 14:24:25
11:25:12 UTC - mysqld got signal 4 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/
key_buffer_size=268435456
read_buffer_size=131072
max_used_connections=1
max_threads=216
thread_count=1
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 348059 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(my_print_stacktrace+0x2e)[0x8dd38e]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(handle_fatal_signal+0x491)[0x6a5dc1]
/lib64/libpthread.so.0(+0xf890)[0x7f5c58ac8890]
/lib64/libc.so.6(__poll+0x2d)[0x7f5c570fbc5d]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(_Z26handle_connections_socketsv+0x1c2)[0x5f64c2]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(_Z11mysqld_mainiPPc+0x1b5d)[0x5fd87d]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f5c57040b05]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld[0x5f10fd]
You may download the Percona Server operations manual by visiting
http://www.percona.com/software/percona-server/. You may find information
in the manual which will help you identify the cause of the crash.
150803 14:25:12 mysqld_safe Number of processes running now: 0
150803 14:25:12 mysqld_safe mysqld restarted
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld: /lib64/libssl.so.1.0.0: no version information available (required by /home/sveta/SharedData/Downloads/5.6.25/bin/mysqld)
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld: /lib64/libcrypto.so.1.0.0: no version information available (required by /home/sveta/SharedData/Downloads/5.6.25/bin/mysqld)
2015-08-03 14:25:12 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2015-08-03 14:25:12 0 [Note] /home/sveta/SharedData/Downloads/5.6.25/bin/mysqld (mysqld 5.6.25-73.1-log) starting as process 10038 ...
2015-08-03 14:25:12 10038 [Warning] Buffered warning: Changed limits: max_open_files: 1024 (requested 50005)
2015-08-03 14:25:12 10038 [Warning] Buffered warning: Changed limits: max_connections: 214 (requested 10000)
2015-08-03 14:25:12 10038 [Warning] Buffered warning: Changed limits: table_open_cache: 400 (requested 4096)
2015-08-03 14:25:12 10038 [Note] Plugin 'FEDERATED' is disabled.
2015-08-03 14:25:12 10038 [Note] InnoDB: Using atomics to ref count buffer pool pages
2015-08-03 14:25:12 10038 [Note] InnoDB: The InnoDB memory heap is disabled
2015-08-03 14:25:12 10038 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2015-08-03 14:25:12 10038 [Note] InnoDB: Memory barrier is not used
2015-08-03 14:25:12 10038 [Note] InnoDB: Compressed tables use zlib 1.2.3
2015-08-03 14:25:12 10038 [Note] InnoDB: Using Linux native AIO
2015-08-03 14:25:12 10038 [Note] InnoDB: Using CPU crc32 instructions
2015-08-03 14:25:12 10038 [Note] InnoDB: Initializing buffer pool, size = 4.0G
2015-08-03 14:25:13 10038 [Note] InnoDB: Completed initialization of buffer pool
2015-08-03 14:25:13 10038 [Note] InnoDB: Highest supported file format is Barracuda.
2015-08-03 14:25:13 10038 [Note] InnoDB: The log sequence numbers 514865622 and 514865622 in ibdata files do not match the log sequence number 514865742 in the ib_logfiles!
2015-08-03 14:25:13 10038 [Note] InnoDB: Database was not shutdown normally!
2015-08-03 14:25:13 10038 [Note] InnoDB: Starting crash recovery.
2015-08-03 14:25:13 10038 [Note] InnoDB: Reading tablespace information from the .ibd files...
2015-08-03 14:25:14 10038 [Note] InnoDB: Restoring possible half-written data pages
2015-08-03 14:25:14 10038 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Last MySQL binlog file position 0 150866, file name mysql-bin.000006
2015-08-03 14:25:16 10038 [Note] InnoDB: 128 rollback segment(s) are active.
2015-08-03 14:25:16 10038 [Note] InnoDB: Waiting for purge to start
2015-08-03 14:25:16 10038 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.25-rel73.1 started; log sequence number 514865742
2015-08-03 14:25:16 7f67ceff9700 InnoDB: Loading buffer pool(s) from .//ib_buffer_pool
2015-08-03 14:25:16 10038 [Note] Recovering after a crash using mysql-bin
2015-08-03 14:25:16 10038 [Note] Starting crash recovery...
2015-08-03 14:25:16 10038 [Note] Crash recovery finished.
2015-08-03 14:25:17 10038 [Note] RSA private key file not found: /home/sveta/sandboxes/rsandbox_Percona-Server-5_6_25/master/data//private_key.pem. Some authentication plugins will not work.
2015-08-03 14:25:17 10038 [Note] RSA public key file not found: /home/sveta/sandboxes/rsandbox_Percona-Server-5_6_25/master/data//public_key.pem. Some authentication plugins will not work.
2015-08-03 14:25:17 10038 [Note] Server hostname (bind-address): '127.0.0.1'; port: 21690
2015-08-03 14:25:17 10038 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
2015-08-03 14:25:17 10038 [Note] Server socket created on IP: '127.0.0.1'.
2015-08-03 14:25:17 10038 [Warning] 'proxies_priv' entry '@ root@thinkie' ignored in --skip-name-resolve mode.
2015-08-03 14:25:17 10038 [Note] Event Scheduler: Loaded 0 events
2015-08-03 14:25:17 10038 [Note] /home/sveta/SharedData/Downloads/5.6.25/bin/mysqld: ready for connections.
Version: '5.6.25-73.1-log'  socket: '/tmp/mysql_sandbox21690.sock'  port: 21690  Percona Server (GPL), Release 73.1, Revision 07b797f
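
The excerpt rule described above (everything from the last “ready for connections” line before the crash until the end of the file) can be sketched in a few lines of Python. This is only an illustration, not a Percona tool: the function name crash_excerpt is hypothetical, and it assumes the crash line contains the text “mysqld got signal” as in the example log.

```python
import re
from typing import List

READY_MARKER = "ready for connections"
# Assumption: the crash is reported as "mysqld got signal N", as in the example log.
CRASH_MARKER = re.compile(r"mysqld got signal \d+")


def crash_excerpt(lines: List[str]) -> List[str]:
    """Return the lines from the last server start before the last crash to the end of the log."""
    # Find the last crash report in the log.
    crash_idx = None
    for i, line in enumerate(lines):
        if CRASH_MARKER.search(line):
            crash_idx = i
    if crash_idx is None:
        return []  # no crash found, nothing to extract

    # Walk backwards from the crash to the server start that precedes it.
    # If no start line is found, fall back to the beginning of the log.
    start = 0
    for i in range(crash_idx, -1, -1):
        if READY_MARKER in lines[i]:
            start = i
            break
    return lines[start:]


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], errors="replace") as log:
        sys.stdout.writelines(crash_excerpt(log.readlines()))
```

Because the search starts at the crash line and walks backwards, the “ready for connections” line written after an automatic restart is never picked as the starting point. Sending the full, unaltered log is still preferable whenever possible.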

Usually the error log file contains the actual query which caused the crash. If it does not and you know the query (for example, if your application logs errors / query problems), please send us this query too. Additionally, if possible, include the CREATE TABLE statements for any tables mentioned in the query. Working with the query is also the first thing you can do to resolve the issue yourself: try to run it (on a non-production/test server which is as close a copy of your production server as possible) to check whether it crashes MySQL Server consistently. If so, you can create a temporary workaround by avoiding this kind of query in your application.

If you have doubts as to which query caused the crash, but have the general query log turned ON, you can use the parse_general_log.pl utility from percona-qa to create a potential test case. Simply execute:

$ sudo yum install bzr
$ cd ~
$ bzr branch lp:percona-qa
$ cp /path_that_contains_your_general_log/your_log_file.sql ~
$ ~/percona-qa/parse_general_log.pl -i./your_log_file.sql -o./output.sql

And subsequently execute output.sql against mysqld on a non-production test server to see if a crash is produced. Alternatively, you may mail us the output.sql file (provided your company privacy etc. policies allow for this). If you want to try and reduce the testcase further, please see QA Episode #7 on reducing testcases.

The next thing we need is a backtrace. You usually have a simple backtrace in the error log, directly after the crash. An example (extracted from an error log) of what this looks like:

stack_bottom = 0 thread_stack 0x40000
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(my_print_stacktrace+0x2e)[0x8dd38e]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(handle_fatal_signal+0x491)[0x6a5dc1]
/lib64/libpthread.so.0(+0xf890)[0x7f5c58ac8890]
/lib64/libc.so.6(__poll+0x2d)[0x7f5c570fbc5d]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(_Z26handle_connections_socketsv+0x1c2)[0x5f64c2]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(_Z11mysqld_mainiPPc+0x1b5d)[0x5fd87d]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f5c57040b05]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld[0x5f10fd]

Note that the above backtrace is mangled. You can send us the file like this (we can demangle it). However, if you want to work with it yourself more comfortably, you can demangle it with the help of the c++filt utility:

sveta@linux-85fm:~/sandboxes/rsandbox_Percona-Server-5_6_25> cat master/data/msandbox.err | c++filt
...
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(my_print_stacktrace+0x2e)[0x8dd38e]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(handle_fatal_signal+0x491)[0x6a5dc1]
/lib64/libpthread.so.0(+0xf890)[0x7f5c58ac8890]
/lib64/libc.so.6(__poll+0x2d)[0x7f5c570fbc5d]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(handle_connections_sockets()+0x1c2)[0x5f64c2]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld(mysqld_main(int, char**)+0x1b5d)[0x5fd87d]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f5c57040b05]
/home/sveta/SharedData/Downloads/5.6.25/bin/mysqld[0x5f10fd]
...

Now the backtrace looks much nicer. When sending error log reports to us, however, please do not use c++filt on them before sending. We have a list of known bugs, and to scan the known bugs list we need to receive the error log unaltered.

You can also turn core files ON. Core files are memory dump files, created when a process crashes. They are very helpful for debugging, because they contain not only the backtrace of the crashing thread, but backtraces of all threads, and much of what was in memory at the time the crash occurred.

Sidenote: Please note it is always a good idea to have the debuginfo package installed (for example Percona-Server-56-debuginfo.x86_64 from the Percona Repository). This package provides the debugging symbols for Percona Server (there are similar packages for other distributions) and ensures that stack traces are more readable and contain more information. It is important to have the right package version, as symbols are different for each version of mysqld. If you have installed Percona Server from our repository, you can simply install the debuginfo package: the version will be correct, and the package will be auto-updated when Percona Server is updated.

By default the MySQL server does not create core files. To let it do so, you can follow instructions from the “GDB Cheat sheet” (page 2 under header ‘Core Files Cheat Sheet’). In short:

  • Add the option core-file under the [mysqld] section of your configuration file
  • Tune your OS options, so it allows mysqld to create core files as described in the cheat sheet
    echo "core.%p.%e.%s" > /proc/sys/kernel/core_pattern
    ulimit -c unlimited
    sudo sysctl -w fs.suid_dumpable=2
    Note: some systems use ‘kernel’ instead of ‘fs’: use kernel.suid_dumpable=2 instead if you get key or file warnings/errors.
  • Restart the MySQL server
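
Before waiting for the next crash, it can be worth checking that the OS settings from the list above are actually in effect. The following is a minimal sketch under stated assumptions: the helper names core_dump_problems and read_settings are hypothetical, the /proc paths are the Linux locations behind the sysctl keys mentioned above, and the core size limit it reads is the one of the Python process (that is, of the shell you run it from), which is not necessarily the limit mysqld runs with.

```python
import resource
from pathlib import Path
from typing import List, Optional, Tuple


def core_dump_problems(core_soft_limit: int,
                       suid_dumpable: Optional[int],
                       core_pattern: str) -> List[str]:
    """Compare the given settings with the ones recommended in the list above."""
    problems = []
    if core_soft_limit == 0:
        problems.append("core size limit is 0: run 'ulimit -c unlimited'")
    if suid_dumpable != 2:
        problems.append("suid_dumpable is not 2: set fs.suid_dumpable=2 "
                        "(or kernel.suid_dumpable=2 on some systems)")
    if not core_pattern.strip():
        problems.append("core_pattern is empty: write a pattern such as core.%p.%e.%s")
    return problems


def read_settings() -> Tuple[int, Optional[int], str]:
    """Read the current values on Linux (limit of *this* process, not of mysqld)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    suid = None
    # Some systems expose the key under 'kernel' instead of 'fs'.
    for name in ("/proc/sys/fs/suid_dumpable", "/proc/sys/kernel/suid_dumpable"):
        path = Path(name)
        if path.exists():
            suid = int(path.read_text().strip())
            break
    pattern_file = Path("/proc/sys/kernel/core_pattern")
    pattern = pattern_file.read_text().strip() if pattern_file.exists() else ""
    return soft, suid, pattern


if __name__ == "__main__":
    for problem in core_dump_problems(*read_settings()) or ["settings look fine"]:
        print(problem)
```

The check is deliberately split in two, so the comparison logic can be reused with values collected in any other way (for example from the mysqld process itself).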

Besides the core file which is generated by the MySQL server, you can also set up the operating system to dump a core file. These are two different core files (for a single crash of the mysqld binary), and the amount of information contained within may differ. The procedure above shows how to set up the one for the MySQL Server alone.

If you would like your operating system to dump a core file as well, please see the MySQL QA Episode #12 video. Also, please note that changing the ulimit and fs.suid_dumpable settings may alter the security of your system. Please read more about these options online before using them or leaving them permanently on a production system.

Once a core file is generated, you can use the GDB utility to debug the core file (also called a ‘coredump’). GDB allows you to better resolve backtraces (also called ‘stack traces’ or ‘stacks’), for example by taking a backtrace of all threads instead of only the crashing thread. This is of course better than the single backtrace available in the error log file. To use GDB, you need to first start it:

gdb /path_to_mysqld /path_to_core

The core file is usually in your data directory (for coredumps produced by mysqld as a result of using the --core-file option in your my.cnf file), or sometimes in the same directory as the crashing binary (for coredumps produced by the OS), though you can specify an alternate fixed location for OS coredumps as described in the cheat sheet. Note that OS-generated dumps are sometimes written with very restrictive permissions, so you may have to use chown/chmod/sudo to access them.

Once you’re in GDB, and all looks fine, run the commands bt (backtrace) and thread apply all bt (backtrace for all threads) to get the stack traces. The output of bt should more or less match the backtrace seen in the error log, but sometimes it may differ.

For us, ideally you would run the following commands in GDB (as seen in the cheat sheet):

set trace-commands on
set pagination off
set print pretty on
set print array on
set print array-indexes on
set print elements 4096
set logging file gdb_standard.txt
set logging on
thread apply all bt
set logging off
set logging file gdb_full.txt
set logging on
thread apply all bt full

After you run these commands and have exited (quit) GDB, please send us the ./gdb_standard.txt and ./gdb_full.txt files.
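
If you need to collect these two files on several servers, the GDB session can also be run non-interactively. Below is a hedged sketch, not part of the cheat sheet: it writes the command list above to a temporary script and passes it to gdb with the -batch and -x options. The function names are hypothetical, the paths are the same placeholders used earlier, and the trailing “set logging off” is an addition made here so that the second log is closed explicitly.

```python
import subprocess
import tempfile
from typing import List

# The commands from the list above; the final "set logging off" is added in this sketch.
GDB_COMMANDS = """\
set trace-commands on
set pagination off
set print pretty on
set print array on
set print array-indexes on
set print elements 4096
set logging file gdb_standard.txt
set logging on
thread apply all bt
set logging off
set logging file gdb_full.txt
set logging on
thread apply all bt full
set logging off
"""


def gdb_batch_argv(mysqld_path: str, core_path: str, script_path: str) -> List[str]:
    """Build the argument list: run the script in batch mode against binary + core."""
    return ["gdb", "-batch", "-x", script_path, mysqld_path, core_path]


def collect_backtraces(mysqld_path: str, core_path: str) -> int:
    """Write the command script, run gdb, and return gdb's exit status."""
    with tempfile.NamedTemporaryFile("w", suffix=".gdb", delete=False) as script:
        script.write(GDB_COMMANDS)
    argv = gdb_batch_argv(mysqld_path, core_path, script.name)
    return subprocess.run(argv, check=False).returncode


if __name__ == "__main__":
    # Placeholders, as in the text: replace with your real paths.
    collect_backtraces("/path_to_mysqld", "/path_to_core")
```

The gdb_standard.txt and gdb_full.txt files end up in the directory the script is started from, exactly as in the interactive session.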

Finally, we would be happy to receive the actual core file from you. In terms of security and privacy, please note that a core file often contains fragments, or sections, or even the full memory of your server.

However, a core file without mysqld is useless, so please include the mysqld binary together with the core file. If you use our compiled binaries, you can instead specify the exact package and file name which you downloaded, but if you use a self-compiled version of the server, the mysqld binary is required for us to resolve the backtrace and other necessary information (like variables) from your core file. Generally speaking, it’s easier just to send mysqld along.

Also, it would be really helpful if you sent us the library files which are dynamically linked with the mysqld you use. You can get them with a tool called ldd_files.sh from the percona-qa suite. Just create a temporary directory, copy your mysqld binary to it and run the tool on it:

sveta@thinkie:~/tmp> wget http://bazaar.launchpad.net/~percona-core/percona-qa/trunk/download/head:/ldd_files.sh-20150713030145-8xdk0llrd3skfsan-1/ldd_files.sh
sveta@thinkie:~/tmp> mkdir tmp
sveta@thinkie:~/tmp> cd tmp/
sveta@thinkie:~/tmp/tmp> cp /home/sveta/SharedData/Downloads/5.6.25/bin/mysqld . # Copy of your mysqld
sveta@thinkie:~/tmp/tmp> ../ldd_files.sh mysqld # Run the tool on it
cp: cannot stat ‘./mysqld: /lib64/libssl.so.1.0.0: no version information available’: No such file or directory # Ignore
cp: cannot stat ‘./mysqld: /lib64/libcrypto.so.1.0.0: no version information available’: No such file or directory # Ignore
sveta@thinkie:~/tmp/tmp> ls
ld-linux-x86-64.so.2 libaio.so.1 libcrypto.so.1.0.0 libcrypt.so.1 libc.so.6 libdl.so.2 libgcc_s.so.1 libm.so.6 libpthread.so.0 librt.so.1 libssl.so.1.0.0 libstdc++.so.6 libz.so.1 mysqld # Files to supply in combination with mysqld

These library files are needed in case some of the frames from the stack trace are system calls, so that our developers can resolve/check those frames as well.

Summary

If you hit a crash, please send us (in order of preference, but even better ‘all of these’):

  • The error log file (please send it unaltered – i.e. before c++filt was executed – which allows us to scan for known bugs)
  • The crashing query (from your application logs and/or extracted from the core file – ref the query extraction blog post)
    • Please include the matching CREATE TABLE statements
  • A resolved backtrace (and/or preferably the ./gdb_standard.txt and ./gdb_full.txt files)
  • The core file together with the mysqld binary and preferably the ldd files

Thank you!

Featured Talk: The Future of Replication is Today: New Features in Practice

In recent years, MySQL 5.6, MySQL 5.7 and MariaDB 10 have all successfully implemented new replication features. For many DBAs, the “old way” of replicating data is comfortable, so adopting these new features seems like a momentous leap rather than a simple step. But perhaps it isn’t that complicated…

Giuseppe Maxia, a Quality Assurance Architect at VMware and loyal member of the Percona Live Conference Committee, will be presenting “The Future of Replication is Today: New Features in Practice” at the Percona Live Data Performance Conference this September in Amsterdam.
Percona’s Community Manager, Tom Diederich had an opportunity to catch up with Giuseppe last week and get an in-depth look at some of the items Giuseppe will be covering in his talk in addition to getting his take on some of the hot sessions to hit while at the conference.  This is how it went:

(Hint: Read to the end to find a special discount code)

Tom: Your talk is titled, “The Future of Replication is today: new features in practice.” What are the top 3 areas in which replication options have improved in MySQL 5.6, MySQL 5.7, and MariaDB 10?
Giuseppe: Replication has been stagnant for over 10 years. Before MySQL 5.6, the only important change in the technology was the introduction of row-based replication in 2008. After that, we had to wait until 2013 to see global transaction identifiers in MySQL 5.6, followed by the same feature, with a different implementation, in MariaDB 10 in 2014. GTID has been complemented, in both flavors, with crash-safe replication tables, a feature that guarantees a reliable resume of replication after a server failure. There is also the parallel applier, a minor feature that has been implemented in both MySQL 5.6 and MariaDB and improved in the latest versions, although it seems to lack proper support for monitoring. The last feature, introduced in MySQL 5.7 and MariaDB 10, is multi-source replication, i.e. the ability to replicate from multiple masters to a single slave. In both editions, the implementation is quite simple, and not so different from what DBAs are used to doing for regular replication.
Tom: For DBAs, how difficult will it be to make the change from the “old way” of replicating data — to stop using the same comfortable features that have been around for several years — and put into practice some of the latest features?
Giuseppe: The adoption of new features can be deceptively simple. For example, GTID in MariaDB comes out of the box and its adoption could be as easy as running a backup followed by a restore, but it can produce unpleasant results if you try to combine this feature with multi-source replication without planning ahead. That said, the transition could be simpler than its counterpart in MySQL.
MySQL 5.6 and 5.7 require some reconfiguration to run GTID, and users can face unpleasant failures due to the complexity of the rules applying to this feature. They will need to read the manual thoroughly and test the deployment extensively before trusting an upgrade in production.
For multi-source replication, the difficulties are, in my experience, hidden in the users’ expectations. When speaking about multi-source (or multi-master, as it is commonly referred to), many users have the mistaken expectation that they can easily insert anything in multiple masters as if they were doing it in a single server. However, the nature of asynchronous replication and the current implementation of multi-source topologies do not handle conflicts, and this fact will probably surprise and anger the early adopters.
Tom: What is still missing in replication technology? How can MySQL improve?
Giuseppe: There are two areas where the current implementation is lacking. The first one is monitoring data: while new features have been added to replication, not enough effort has been made to cover the monitoring needs. The current way of monitoring replication is hard-wired around the original replication feature, and little has been done to give the users a deeper view of what is going on. With the latest releases at our disposal, we can run parallel replication using multiple masters, and yet we have very little visibility on what goes on inside the dozens of threads that the new features can unleash inside a single slave. It’s like driving an F1 racing car with the dashboard of a Ford Model T. MySQL 5.7 has moved a few steps in that direction, with the new replication tables in performance_schema, but it is still a drop in the ocean compared to what we need.
The second area where replication is still too tied to its past is heterogeneous replication. While relational databases are still dominating the front-end of the web economy, its back-end is largely being run by different structures, such as Hadoop, MongoDB and Cassandra. Moving data back and forth between the relational storage and its growing siblings has become an urgent need. There have been a few sparks of change in this direction, but nothing that can qualify as promising.
Tom: Which other session(s) are you most looking forward to besides your own?
Giuseppe: I am always interested in the sessions that explain and discuss new features. I am most interested in the talks by Oracle engineers, who have been piling up many features in recent years, and I am sure they have something more up their sleeve that will appear at the conference. I also eagerly attend sessions about complementary tools, which are usually highly educational and often give me more ideas.

Want to read more on the topic? Visit Giuseppe’s blog:

 MySQL Replication Monitoring 101

The Percona Live Data Performance Conference is the premier event for the rich and diverse MySQL, NoSQL and data in the cloud ecosystems in Europe. It is the place to be for the open source community as well as businesses that thrive in the MySQL, NoSQL, cloud, big data and IoT (Internet of Things) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

This year’s conference will feature one day of tutorials and two days of keynote talks and breakout sessions related to MySQL, NoSQL and Data in the Cloud. Attendees will get briefed on the hottest topics, learn about building and maintaining high-performing deployments and hear from top industry leaders.

The Percona Live Europe Data Performance Conference will be September 21-23 at the Mövenpick Hotel Amsterdam City Centre.

Register using code “FeaturedTalk” and save 20 euros off of registration!

Hope to see you in Amsterdam!

How much could you benefit from MySQL 5.6 parallel replication?

I have heard this question quite often: “At busy times, our replicas start lagging quite frequently. We are using N schemas, so what performance boost could we expect from MySQL 5.6 parallel replication?” Here is a quick way to give you a rough estimate of the potential benefit.

General idea

In MySQL 5.6, parallelism is added at the schema level. So in theory, if you have N schemas and if you use N parallel threads, replication could be up to N times faster. This assumes at least 2 things:

  • Replication throughput scales linearly with the number of parallel threads.
  • Writes are evenly distributed across schemas.

Both assumptions are of course not realistic. But it is easy to know the distribution of writes, and that can already give you an idea about how much you could benefit from parallel replication.

Writes are stored in binary logs but it is much easier to work with the slow query log, so we can enable full slow query logging for some time with long_query_time = 0 and then use pt-query-digest to analyze the resulting log file.

An example

I have a test server with 3 schemas, and I’ve run some sysbench load on it to get a decent slow query log file. Once done, I can run this command:

pt-query-digest --filter '$event->{arg} !~ m/^select|^set|^commit|^show|^admin|^rollback|^begin/i' --group-by db --report-format profile slow_query.log > digest.out

and here is the result I get:

# Profile
# Rank Query ID Response time  Calls  R/Call V/M   Item
# ==== ======== ============== ====== ====== ===== ====
#    1 0x       791.6195 52.1% 100028 0.0079  0.70 db3
#    2 0x       525.1231 34.5% 100022 0.0053  0.68 db1
#    3 0x       203.4649 13.4% 100000 0.0020  0.64 db2

In a perfect world, with 3 parallel threads and each schema handling 33% of the total write workload, I could expect a 3x performance improvement.

However, the report shows that the 3 replication threads will only work simultaneously about 25% of the time in the best case (13.4/52.1 ≈ 0.26): the least busy schema, db2, runs out of work when the busiest one, db3, is only a quarter of the way through. We can also expect 2 replication threads to work simultaneously for some part of the workload, but let’s ignore that for clarity.

It means that instead of the theoretical 200% performance improvement (3 parallel threads 100% of the time), we can hardly expect more than a 50% performance improvement (3 parallel threads 25% of the time). And the reality is that the benefit will be much lower than that.
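
The back-of-the-envelope calculation above can be written down as a small Python sketch. It follows the same simplification as the text (only the time during which all threads are busy is counted, and throughput is assumed to scale linearly), so treat the result as a rough figure rather than a prediction. The function name and the input format are assumptions made for this illustration: the input is the response-time percentage per schema, copied by hand from the pt-query-digest profile.

```python
from typing import Dict, Tuple


def parallel_replication_estimate(write_share: Dict[str, float]) -> Tuple[float, float, float]:
    """Estimate the benefit of schema-level parallel replication.

    write_share maps each schema to its share of the write response time
    (the percentage column of the pt-query-digest profile).
    Returns (overlap, ideal_gain_pct, estimated_gain_pct).
    """
    threads = len(write_share)
    busiest = max(write_share.values())
    least_busy = min(write_share.values())

    # Fraction of the time during which all threads have work to do.
    overlap = least_busy / busiest
    # Gain with perfectly even writes: N threads means N times faster, i.e. +(N-1)*100%.
    ideal_gain_pct = (threads - 1) * 100.0
    # Same simplification as in the text: ignore periods with only some threads busy.
    estimated_gain_pct = ideal_gain_pct * overlap
    return overlap, ideal_gain_pct, estimated_gain_pct


if __name__ == "__main__":
    profile = {"db3": 52.1, "db1": 34.5, "db2": 13.4}
    overlap, ideal, estimate = parallel_replication_estimate(profile)
    print(f"all threads busy {overlap:.0%} of the time")
    print(f"ideal gain {ideal:.0f}%, rough estimate {estimate:.0f}%")
```

With the figures from the profile above, the overlap comes out at about 26% and the rough estimate at about 51%, in line with the “hardly more than 50%” stated in the text.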

Conclusion

Parallel replication in MySQL 5.6 is a great step forward. However, don’t expect too much if your writes are not evenly distributed across all your schemas. The pt-query-digest trick I shared can give you a rough idea of whether your workload is a good fit for multi-threaded slaves in 5.6.

I’m expecting much better results for 5.7, partly because parallelism is handled differently, but also because you can tune how efficient parallel replication will be by adjusting the binlog group commit settings.

The post How much could you benefit from MySQL 5.6 parallel replication? appeared first on MySQL Performance Blog.

Optimizing PXC Xtrabackup State Snapshot Transfer

State Snapshot Transfer (SST) at a glance

PXC uses a protocol called State Snapshot Transfer to provision a node joining an existing cluster with all the data it needs to synchronize.  This is analogous to cloning a slave in asynchronous replication:  you take a full backup of one node and copy it to the new one, while tracking the replication position of the backup.

PXC automates this process using scriptable SST methods.  The most common of these is the xtrabackup-v2 method, which is the default in PXC 5.6.  Xtrabackup is generally favored over other SST methods because it is non-blocking on the Donor node (the node contributing the backup).

The basic flow of this method is:

  • The Joiner:
    • joins the cluster
    • Learns it needs a full SST and clobbers its local datadir (the SST will replace it)
    • prepares for a state transfer by opening a socat on port 4444 (by default)
    • The socat pipes the incoming files into the datadir/.sst directory
  • The Donor:
    • is picked by the cluster (could be configured or be based on WAN segments)
    • starts a streaming Xtrabackup and pipes the output of that via socat to the Joiner on port 4444.
    • Upon finishing its backup, signals completion and sends the final Galera GTID of the backup to the Joiner
  • The Joiner:
    • Records all changes from the Donor’s backup’s GTID forward in its gcache (and overflow pages, this is limited by available disk space)
    • runs the --apply-log phase of Xtrabackup on the backup it received from the Donor
    • Moves the datadir/.sst directory contents into the datadir
    • Starts mysqld
    • Applies all the transactions it needs (Joining and Joined states just like IST does it)
    • Moves to the ‘Synced’ state and is done.

There are a lot of moving pieces here, and nothing is really tuned by default.  On larger clusters, SST can be quite scary because it may take hours or even days.  Any failure can mean starting over from scratch.

This blog will concentrate on some ways to make a good dent in the time SST can take.  Many of these methods are trade-offs and may not apply to your situation.  Further, there may be other ways to speed things up that I haven’t thought of, so please share what you’ve found that works!

The Environment

I am testing SST on a PXC 5.6.24 cluster in AWS.  The nodes are c3.4xlarge and the datadirs are RAID-0 over the two ephemeral SSD drives in that instance type.  These instances are all in the same region.

My simulated application is using only node1 in the cluster and is sysbench OLTP with 200 tables with 1M rows each.  This comes out to just under 50G of data.  The test application runs on a separate server with 32 threads.

The PXC cluster itself is tuned to best practices for InnoDB and Galera performance.

Baseline

In my first test the cluster is a single member (receiving workload) and I am joining node2.  This configuration is untuned for SST.  I measured the time from when mysqld started on node2 until it entered the Synced state (i.e., fully caught up).  In the log, it looked like this:

150724 15:59:24 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
... lots of other output ...
2015-07-24 16:48:39 31084 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 4647341)

Doing some math on the above (15:59:24 to 16:48:39), we find that the SST took just over 49 minutes to complete.
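The elapsed time can be checked directly from the two timestamps in the log excerpt. The sketch below is a hypothetical helper in JavaScript (Node.js), not part of PXC. It assumes the two-digit year in the mysqld_safe line belongs to the 2000s and that both lines use the same time zone.

```javascript
// Parse the two timestamp formats seen in the log excerpt into milliseconds.
// Both are treated as UTC, since only the difference between them matters.
function parseLogTime(line) {
  // mysqld_safe style: "150724 15:59:24" (YYMMDD HH:MM:SS)
  let m = /^(\d{2})(\d{2})(\d{2}) (\d{2}):(\d{2}):(\d{2})/.exec(line);
  if (m) {
    const [yy, mo, dd, hh, mi, ss] = m.slice(1).map(Number);
    return Date.UTC(2000 + yy, mo - 1, dd, hh, mi, ss); // assumption: 20YY
  }
  // mysqld style: "2015-07-24 16:48:39" (YYYY-MM-DD HH:MM:SS)
  m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})/.exec(line);
  if (m) {
    const [yyyy, mo, dd, hh, mi, ss] = m.slice(1).map(Number);
    return Date.UTC(yyyy, mo - 1, dd, hh, mi, ss);
  }
  throw new Error("Unrecognized timestamp: " + line);
}

// Whole seconds between two log lines.
function elapsedSeconds(startLine, endLine) {
  return (parseLogTime(endLine) - parseLogTime(startLine)) / 1000;
}

// Render a number of seconds as "Xm Ys".
function formatDuration(seconds) {
  return Math.floor(seconds / 60) + "m " + (seconds % 60) + "s";
}

const secs = elapsedSeconds(
  "150724 15:59:24 mysqld_safe Starting mysqld daemon",
  "2015-07-24 16:48:39 31084 [Note] WSREP: Shifting JOINED -> SYNCED"
);
console.log(formatDuration(secs)); // prints "49m 15s"
```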

--use-memory

One of the first things I noticed was that the --apply-log step on the Joiner was very slow.  Anyone who uses Xtrabackup a lot will know that --apply-log runs much faster if you give it extra RAM to use while making the backup consistent, via the --use-memory option.  We can set this in our my.cnf like this:

[sst]
inno-apply-opts="--use-memory=20G"

The [sst] section is a special one understood only by the xtrabackup-v2 script.  inno-apply-opts allows me to specify arguments to innobackupex when it runs.

Note that this change only applies to the Joiner (i.e., you don’t have to put it on all your nodes and restart them to take advantage of it).

This change immediately makes a huge improvement to our above scenario (node2 joining node1 under load) and the SST now takes just over 30 minutes.

wsrep_slave_threads

Another slow part of getting to Synced is how long it takes to apply transactions up to realtime after the backup is restored and in place on the Joiner.  We can improve this throughput by increasing the number of apply threads on the Joiner to make better use of the CPU.  Until now wsrep_slave_threads was set to 1; if I increase it to 32 (there are 16 cores on this instance type), my SST takes 25m 32s.

Compression

xtrabackup-v2 supports adding a compression process into the datastream.  On the Donor it compresses and on the Joiner it decompresses.  This allows you to trade CPU for transfer speed.  If your bottleneck turns out to be network transport and you have spare CPU, this can help a lot.

Further, I can use pigz instead of gzip to get parallel compression, but in theory any compression utility can work as long as it can compress and decompress standard input to standard output.  I install the ‘pigz’ package on all my nodes and change my my.cnf like this:

[sst]
inno-apply-opts="--use-memory=20G"
compressor="pigz"
decompressor="pigz -d"

Both the Joiner and the Donor must have the respective decompressor and compressor settings or the SST will fail with a vague error message (not actually having pigz installed will do the same thing).

By adding compression, my SST is down to 21 minutes, but there’s a catch.  My application performance starts to take a serious nose-dive during this test.  Pigz is consuming most of the CPU on my Donor, which is also my primary application node.  This may or may not hurt your application workload in the same way, but it emphasizes the importance of understanding (and measuring) the performance impact SST has on your Donor nodes.

Dedicated donor

To alleviate the problem with the application, I now leave node2 up and spin up node3.  Since I’m expecting node2 to normally not be receiving application traffic directly, I can configure node3 to prefer node2 as its donor like this:

[mysqld]
...
wsrep_sst_donor = node2,

When node3 starts, this setting tells the cluster that node2 is the preferred donor, but if node2 is not available, it should pick something else (that’s what the trailing comma means).
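The effect of the trailing comma can be illustrated with a small sketch. This is JavaScript (Node.js) written purely for illustration: parseDonorSetting and pickDonor are hypothetical names, and the real donor selection inside Galera takes more into account (for example WAN segments and node state) than this simplified model.

```javascript
// Split a wsrep_sst_donor value into preferred donors and a fallback flag.
// A trailing comma ("node2,") means: fall back to any other node if needed.
function parseDonorSetting(value) {
  const parts = value.split(",").map((p) => p.trim());
  const allowFallback = parts.length > 1 && parts[parts.length - 1] === "";
  const preferred = parts.filter((p) => p !== "");
  return { preferred, allowFallback };
}

// Simplified model of donor choice given the nodes currently available.
function pickDonor(setting, availableNodes) {
  const { preferred, allowFallback } = parseDonorSetting(setting);
  const match = preferred.find((name) => availableNodes.includes(name));
  if (match) return match;
  // No preferred donor is up: only proceed if the setting allows a fallback.
  return allowFallback && availableNodes.length > 0 ? availableNodes[0] : null;
}

console.log(pickDonor("node2,", ["node1", "node2"])); // prints "node2"
console.log(pickDonor("node2,", ["node1"]));          // prints "node1"
console.log(pickDonor("node2", ["node1"]));           // prints null
```

In this model, dropping the trailing comma means the Joiner gets no donor at all when node2 is down, which is why the comma is worth keeping unless you really want the SST to wait for that specific node.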

Donor nodes are permitted to fall behind in replication apply as needed without sending flow control.  Sending application traffic to such a node may see an increase in the amount of stale data as well as certification failures for writes (not to mention the performance issues we saw above with node1).  Since node2 is not getting application traffic, moving into the Donor state and doing an expensive SST with pigz compression should be relatively safe for the rest of the cluster (in this case, node1).

Even if you don’t have a dedicated donor, if you use a load balancer of some kind in front of your cluster, you may elect to consider Donor nodes as failing their health checks so application traffic is diverted during any state transfer.
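A health check along those lines could be sketched as follows. This is a minimal, hypothetical example in JavaScript (Node.js): it assumes the check has already read the wsrep_local_state status variable from the node (for example with SHOW STATUS LIKE 'wsrep_local_state'), and the function name and the HTTP-style return codes are illustrative choices, not part of PXC.

```javascript
// Galera node states as reported by the wsrep_local_state status variable.
const WSREP_STATES = {
  1: "Joining",
  2: "Donor/Desynced",
  3: "Joined",
  4: "Synced",
};

// Hypothetical health check: only a Synced node should receive traffic,
// so a node acting as Donor (state 2) is reported as unhealthy.
function healthCheck(wsrepLocalState) {
  const state = Number(wsrepLocalState);
  const name = WSREP_STATES[state] || "Unknown";
  const healthy = state === 4;
  return { httpStatus: healthy ? 200 : 503, state: name };
}

console.log(healthCheck("4")); // { httpStatus: 200, state: 'Synced' }
console.log(healthCheck("2")); // { httpStatus: 503, state: 'Donor/Desynced' }
```

Returning 503 for anything other than Synced also takes Joining and Joined nodes out of rotation, which is usually what you want while a node is still catching up.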

When I brought up node3, with node2 as the donor, the SST time dropped to 18m 33s.

Conclusion

Each of these tunings helped the SST speed, though the later adjustments may have had less of a direct impact.  Depending on your workload, database size, network and CPU available, your mileage may of course vary.  Your tunings should vary accordingly, but also realize you may actually want to limit (and not increase) the speed of state transfers in some cases to avoid other problems. For example, I’ve seen several clusters become unstable during SST, and the only explanation I found was that the network bandwidth consumed by the state transfer was starving the actual Galera communication between the nodes. Be sure to consider the overall state of production when tuning your SSTs.

The post Optimizing PXC Xtrabackup State Snapshot Transfer appeared first on MySQL Performance Blog.

Find unused indexes on MongoDB and TokuMX

Finding and removing unused indexes is a pretty common technique to improve overall performance of relational databases. Fewer indexes means faster inserts and updates, and also less disk space used. The usual way to do it is to log all queries’ execution plans and then get a list of those indexes that are not used. The same theory applies to MongoDB and TokuMX, so in this blog post I’m going to explain how to find them.

Profiling in MongoDB

To understand what profiling is, you only need to think about MySQL’s slow query log: it is basically the same idea. It can be enabled with the following command:

db.setProfilingLevel(level, slowms)

There are three different levels:

0: No profiling enabled.
1: Only those queries slower than “slowms” are profiled.
2: All queries are profiled, similar to long_query_time=0.

Once it is enabled you can use db.system.profile.find().pretty() to read it. You would need to scan through all the profile entries and find those indexes that are never used. To make things easier there is a JavaScript program that will find the unused indexes after reading all the profile information. Unfortunately, it only works with MongoDB 2.x.

The JavaScript is hosted in this GitHub project: https://github.com/wfreeman/indexalizer. You just need to start the mongo shell with indexStats.js and run the db.indexStats() command. This is a sample output:

scanning profile {ns:"test.col"} with 2 records... this could take a while.
{
	"query" : {
		"b" : 1
	},
	"count" : 1,
	"index" : "",
	"cursor" : "BtreeCursor b_1",
	"millis" : 0,
	"nscanned" : 1,
	"n" : 1,
	"scanAndOrder" : false
}
{
	"query" : {
		"b" : 2
	},
	"count" : 1,
	"index" : "",
	"cursor" : "BtreeCursor b_1",
	"millis" : 0,
	"nscanned" : 1,
	"n" : 1,
	"scanAndOrder" : false
}
checking for unused indexes in: col
this index is not being used:
"_id_"
this index is not being used:
"a_1"

 

So “a_1” is not used and could be dropped. We can ignore “_id_” because that one is needed :)
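The idea behind that kind of script can be sketched in a few lines. This is not the indexalizer code, only a simplified illustration in JavaScript (Node.js): findUnusedIndexes is a hypothetical name, and it assumes MongoDB 2.x style profile entries where an index scan is recorded as "BtreeCursor <index name>" in the cursor field, as in the sample output above.

```javascript
// Return the index names that never appear in any profiled cursor.
function findUnusedIndexes(indexNames, profileEntries) {
  var used = {};
  profileEntries.forEach(function (entry) {
    // e.g. "BtreeCursor b_1" -> "b_1"; collection scans do not match.
    var m = /^BtreeCursor (\S+)/.exec(entry.cursor || "");
    if (m) { used[m[1]] = true; }
  });
  return indexNames.filter(function (name) { return !used[name]; });
}

// Sample data modeled on the output above: both queries used b_1.
var profile = [
  { query: { b: 1 }, cursor: "BtreeCursor b_1" },
  { query: { b: 2 }, cursor: "BtreeCursor b_1" }
];
console.log(findUnusedIndexes(["_id_", "a_1", "b_1"], profile));
// prints [ '_id_', 'a_1' ]
```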

There is a problem with profiling: it affects performance, so you can only run it for a few hours, usually during off-peak periods. That means there is a chance that not every query your application can issue will be executed during that window. What alternative does TokuMX provide?

Finding unused indexes in TokuMX

Good news for all of us. TokuMX doesn’t require you to enable profiling. Index usage statistics are stored as part of every query execution and you can access them with a simple db.collection.stats() command. Let me show you an example:

> db.col.stats()
[...]
{
"name" : "a_1",
"count" : 5,
"size" : 140,
"avgObjSize" : 28,
"storageSize" : 16896,
"pageSize" : 4194304,
"readPageSize" : 65536,
"fanout" : 16,
"compression" : "zlib",
"queries" : 0,
"nscanned" : 0,
"nscannedObjects" : 0,
"inserts" : 0,
"deletes" : 0
},
{
"name" : "b_1",
"count" : 5,
"size" : 140,
"avgObjSize" : 28,
"storageSize" : 16896,
"pageSize" : 4194304,
"readPageSize" : 65536,
"fanout" : 16,
"compression" : "zlib",
"queries" : 2,
"nscanned" : 2,
"nscannedObjects" : 2,
"inserts" : 0,
"deletes" : 0
}
],
"ok" : 1
}

 

There are our statistics, without profiling enabled. queries is the number of times that index has been used in a query execution. b_1 has been used twice and a_1 has never been used. You can use this small JavaScript snippet I’ve written to scan all collections inside the current database:

// Print every index in the current database whose "queries" counter is still 0.
db.forEachCollectionName(function (cname) {
	var output = db.runCommand({collstats : cname });
	print("Checking " + output.ns + "...");
	output.indexDetails.forEach(function(findUnused) { if (findUnused.queries == 0) { print( "Unused index: " + findUnused.name ); }})
});

 

An example using the same data:

> db.forEachCollectionName(function (cname) {
... output = db.runCommand({collstats : cname });
... print("Checking " + output.ns + "...")
... output.indexDetails.forEach(function(findUnused) { if (findUnused.queries == 0) { print( "Unused index: " + findUnused.name ); }})
...
... });
Checking test.system.indexes...
Checking test.col...
Unused index: a_1
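If you want to post-process the statistics outside the shell, the same check can be written as a plain function over a collstats-style document. This is a hedged sketch in JavaScript (Node.js): unusedFromCollStats and the ignore list are hypothetical, and the sample object only contains the fields the check needs, with values taken from the db.col.stats() output above.

```javascript
// Return names of indexes whose "queries" counter is zero, skipping any
// index named in ignoreList (by default the mandatory _id_ index).
function unusedFromCollStats(stats, ignoreList) {
  var ignore = ignoreList || ["_id_"];
  return (stats.indexDetails || [])
    .filter(function (ix) {
      // Number() also covers counters that arrive as numeric strings.
      return Number(ix.queries) === 0 && ignore.indexOf(ix.name) === -1;
    })
    .map(function (ix) { return ix.name; });
}

// Trimmed-down sample based on the statistics shown earlier.
var sampleStats = {
  ns: "test.col",
  indexDetails: [
    { name: "a_1", queries: 0 },
    { name: "b_1", queries: 2 }
  ]
};
console.log(unusedFromCollStats(sampleStats)); // prints [ 'a_1' ]
```

Skipping _id_ by default avoids flagging an index that cannot be dropped anyway.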

 

Conclusion

Finding unused indexes is a regular task that every DBA should do. In MongoDB you have to use profiling while in TokuMX nothing needs to be enabled because it will gather information by default without impacting service performance.

The post Find unused indexes on MongoDB and TokuMX appeared first on MySQL Performance Blog.