OpenSSL heartbleed CVE-2014-0160 – Data leaks make my heart bleed

The heartbleed bug was introduced in OpenSSL 1.0.1 and is present in

  • 1.0.1
  • 1.0.1a
  • 1.0.1b
  • 1.0.1c
  • 1.0.1d
  • 1.0.1e
  • 1.0.1f

The bug is not present in 1.0.1g, nor in the 1.0.0 or 0.9.8 branches of OpenSSL. Some sources report that 1.0.2-beta is also affected at the time of writing; however, it is a beta product, and I would strongly recommend against using beta-quality releases of something as fundamentally important as OpenSSL in production.

The bug itself is within the heartbeat extension of OpenSSL (RFC 6520). It allows an attacker to leak memory in chunks of up to 64k. This is not to say the data being leaked is limited to 64k: the attacker can continually abuse the bug to leak more data until they are satisfied with what has been recovered.

At worst the attacker can retrieve the private keys, which means the attacker then holds the keys to decrypt the encrypted data. As such, the only way to be 100% certain of protection against this bug is to first update OpenSSL (>= 1.0.1g) and then revoke and regenerate keys and certificates. Expect to see a tirade of revocations and re-issuing of CA certs and the like in the coming days.

So how does this affect you as a MySQL user?

Taking Percona Server as an example: it is linked against OpenSSL, meaning that if you want to use TLS for your client connections and/or your replication connections, you’re going to need OpenSSL installed.

You can find your version easily via your package manager for example:

  • rpm -q openssl
  • dpkg-query -W openssl
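
A quick cross-check is to ask the library itself, though on distributions that backport fixes the bare version string alone won’t tell you whether the patch is applied:

openssl version
# example output: "OpenSSL 1.0.1e 11 Feb 2013" -- on RPM-based systems,
# "rpm -q --changelog openssl" will show whether the fix was backported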

If you’re running a vulnerable installation of OpenSSL, an update will be required:

  • update OpenSSL (>= 1.0.1g)
  1. 1.0.1e-2+deb7u5 is reported as patched on Debian
  2. 1.0.1e-16.el6_5.7 is reported as patched on Red Hat and CentOS
  3. 1.0.1e-37.66 changelogs note this has been patched on the Amazon Linux AMI
  • shut down mysqld
  • regenerate the keys and certs used by MySQL for TLS connections, revoking the old certs if possible (see the sketch below)
  • start mysqld
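
For reference, here is a minimal sketch of regenerating a self-signed CA and server certificate for MySQL TLS. The file names and validity period are illustrative; if your certificates come from a public CA, you would submit the new CSR to them instead:

# new CA key and certificate
openssl genrsa 2048 > ca-key.pem
openssl req -new -x509 -nodes -days 365 -key ca-key.pem -out ca-cert.pem
# new server key and certificate signing request
openssl req -newkey rsa:2048 -days 365 -nodes -keyout server-key.pem -out server-req.pem
# strip the passphrase so mysqld can read the key
openssl rsa -in server-key.pem -out server-key.pem
# sign the server certificate with the new CA
openssl x509 -req -in server-req.pem -days 365 -CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 -out server-cert.pem

Then point ssl-ca, ssl-cert and ssl-key at the new files in my.cnf and restart mysqld.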

You can read more about the heartbleed bug at heartbleed.com, in the Red Hat Bugzilla, the MITRE CVE filing, and the Ubuntu Security Notice.

Heartbleed: Separating FAQ From FUD

If you’ve been following this blog (my colleague, David Busby, posted about it yesterday) or any tech news outlet in the past few days, you’ve probably seen some mention of the “Heartbleed” vulnerability in certain versions of the OpenSSL library.

So what is ‘Heartbleed’, really?

In short, Heartbleed is an information-leak issue. An attacker can exploit this bug to retrieve the contents of a server’s memory without any need for local access. According to the researchers that discovered it, this can be done without leaving any trace of compromise on the system. In other words, if you’re vulnerable, they can steal your keys and you won’t even notice that they’ve gone missing. I use the word “keys” literally here; by being able to access the contents of the impacted service’s memory, the attacker is able to retrieve, among other things, private encryption keys for SSL/TLS-based services, which means that the attacker would be able to decrypt the communications, impersonate other users (see here, for example, for a session hijacking attack based on this bug), and generally gain access to data which is otherwise believed to be secure. This is a big deal. It isn’t often that bugs have their own dedicated websites and domain names, but this one does: http://www.heartbleed.com

Why is it such a big deal?

One, because it has apparently been in existence since at least 2012. Two, because SSL encryption is widespread across the Internet. And three, because there’s no way to know if your keys have been compromised. The best detection that currently exists is a set of Snort rules, but if you’re not using Snort or some other IDS, then you’re basically in the dark.

What kinds of services are impacted?

Any software that uses the SSL/TLS features of a vulnerable version of OpenSSL. This means Apache, NGINX, Percona Server, MariaDB, the commercial version of MySQL 5.6.6+, Dovecot, Postfix, SSL/TLS VPN software (OpenVPN, for example), instant-messaging clients, and many more. Also, software packages that bundle their own vulnerable version of SSL rather than relying on the system’s version, which might be patched. In other words, it’s probably easier to explain what isn’t affected.

What’s NOT impacted?

SSH does not use SSL/TLS, so you’re OK there. If you downloaded a binary installation of MySQL community from Oracle, you’re also OK, because the community builds use yaSSL, which is not known to be vulnerable to this bug. Obviously, any service which doesn’t use SSL/TLS is not going to be vulnerable, either, since the salient code paths aren’t going to be executed. So, for example, if you don’t use SSL for your MySQL connections, then this bug isn’t going to affect your database server, although it probably still impacts you in other ways (e.g., your web servers).

What about Amazon cloud services?

According to Amazon’s security bulletin on the issue, all vulnerable services have been patched, but they still recommend that you rotate your SSL certificates.

Do I need to upgrade Percona Server, MySQL, NGINX, Apache, or other server software?

No, not unless you built any of these yourself and statically linked them with a vulnerable version of OpenSSL. This is not common. 99% of affected users can fix this issue by upgrading their OpenSSL libraries and cycling their keys/certificates.

What about client-level tools, like Percona Toolkit or XtraBackup?

Again, no. The client sides of Percona Toolkit, Percona XtraBackup, and Percona Cloud Tools (PCT) are not impacted. Moreover, the PCT website has already been patched. Encrypted backups are still secure.

There are some conflicting reports out there about exactly how much information leakage this bug allows. What’s the deal?

Some of the news articles and blogs claim that with this bug, any piece of the server’s memory can be accessed. Others have stated that the disclosure is limited to memory space owned by processes using a vulnerable version of OpenSSL. As far as we are aware, and as reported in CERT’s Vulnerability Notes Database, the impact of the bug is the latter; i.e., it is NOT possible for an attacker to use this exploit to retrieve arbitrary bits of your server’s memory, only bits of memory from your vulnerable services. That said, your vulnerable services could still leak information that attackers could exploit in other ways.

How do I know if I’m affected?

You can test your web server at http://filippo.io/Heartbleed/ – enter your site’s URL and see what it says. If you have Go installed, you can also get a command-line tool from GitHub and test from the privacy of your own workstation. There’s also a Python implementation. You can also check the version of OpenSSL that’s installed on your servers. If you’re running OpenSSL 1.0.1 through 1.0.1f or 1.0.2-beta, you’re vulnerable. Side note: some distributions, such as RHEL/CentOS, have patched the issue without actually updating the OpenSSL version number; on RPM-based systems you can view the changelog to see if it’s been fixed, for example:

rpm -q --changelog openssl | head -2
* Mon Apr 07 2014 Tomáš Mráz <tmraz@redhat.com> 1.0.1e-16.7
- fix CVE-2014-0160 - information disclosure in TLS heartbeat extension

Also, note that versions of OpenSSL prior to 1.0.1 are not known to be vulnerable. If you’re still unsure, we would be happy to assist you in determining the extent of the issue in your environment and taking any required corrective action. Just give us a call.

How can I know if I’ve been compromised?

If you’ve already been compromised, you won’t. However, if you use Snort as an IDS, you can use some rules developed by Fox-IT to detect and deter new attacks; other IDS/IPS vendors may have similar rule updates available. Without some sort of IDS in place, however, attacks can and will go unnoticed.

Are there any exploits for this currently out there?

Currently, yes, there are some proofs-of-concept floating around out there, although before this week, that was uncertain. But given that this is likely a 2-year-old bug, I would be quite surprised if someone, somewhere (you came to my talk at Percona Live last week, didn’t you? Remember what I said about assuming that you’re already owned?) didn’t have a solid, viable exploit.

So, then, what should I do?

Ubuntu, RedHat/CentOS, Amazon, and Fedora have already released patched versions of the OpenSSL library. Upgrade now. Do it now, as in right now. If you’ve compiled your own version of OpenSSL from source, upgrade to 1.0.1g or rebuild your existing source with the -DOPENSSL_NO_HEARTBEATS flag.
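
If you do rebuild OpenSSL from source, a rough sketch follows; the install prefix is illustrative, and the -DOPENSSL_NO_HEARTBEATS flag is simply passed through to the compiler:

./config --prefix=/usr/local/ssl -DOPENSSL_NO_HEARTBEATS
make depend && make
sudo make install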

Once that’s been done, stop any certificate-using services, regenerate the private keys for any services that use SSL (Apache, MySQL, whatever), and then obtain a new SSL certificate from whatever certificate authority you’re using. Once the new certificates are installed, restart your services. You can also, of course, do the private key regeneration and certificate cycling on a separate machine, and only bring the service down long enough to install the new keys and certificates. Yes, you will need to restart MySQL. Or Apache. Or whatever. Full stop, full start, no SIGHUP (or equivalent).

Unfortunately, that’s still not all. Keeping in mind the nature of this bug, you should also change / reset any user passwords and/or user sessions that were in effect prior to patching your system and recycling your keys. See, for example, the session hijacking exploit referenced above. Also note that Google services were, prior to their patching of the bug, vulnerable to this issue, which means that it’s entirely possible that your Gmail login (or any other Google-related login) could have been compromised.

Can I get away with just upgrading OpenSSL?

NO. At a bare minimum, you will need to restart your services, but in order to really be sure you’ve fixed the problem, you also need to cycle your keys and certificates (and revoke your old ones, if possible). This is the time-consuming part, but since you have no way of knowing whether or not someone has previously compromised your private keys (and you can bet that now that the bug is public, there will be a lot of would-be miscreants out there looking for servers to abuse), the only safe thing to do is cycle them. You wouldn’t leave your locks unchanged after being burgled, would you?

Also note that once you do upgrade OpenSSL, you can get a quick list of the services that need to be restarted by running the following:

sudo lsof | grep ssl | grep DEL

Where can I get help and/or more information?

In addition to the assorted links already mentioned, you can read up on the nuts and bolts of this bug, or various news articles such as this or this. There are a lot of articles out there right now, most of which are characterizing this as a major issue. It is. Test your vulnerability, upgrade your OpenSSL and rotate your private keys and certificates, and then change your passwords.

Tools and tips for analysis of MySQL’s Slow Query Log

MySQL has a nice feature, the slow query log, which allows you to log all queries that exceed a predefined amount of time to execute. Peter Zaitsev first wrote about this back in 2006 – there have been a few other posts here on the MySQL Performance Blog since then (check this and this, too) but I wanted to revisit his original subject in today’s post.

Query optimization is essential for good database server performance, and usually DBAs need to ensure the best possible performance for all queries. In MySQL, the usual approach is to generate a query log for all running queries within a specific time period and then run a query analysis tool to identify the bad queries. Percona Toolkit’s pt-query-digest is one of the most powerful tools for SQL analysis, because it can generate a very comprehensive report that spots problematic queries very efficiently. It works equally well with Oracle MySQL server. This post will focus mainly on pt-query-digest.

The slow query log is great at spotting really slow queries that are good candidates for optimization. Beginning with MySQL 5.1.21, the minimum value for long_query_time is 0, and the value can be specified to a resolution of microseconds. In Percona Server, additional statistics may be written to the slow query log; you can find the full details here. For our clients, we often need to identify the queries that impact an application the most, and these are not always the slowest ones – queries that run more frequently with a lower execution time per call can put more load on a server than queries that run less frequently. We of course want to get rid of really slow queries, but to really optimize application throughput we also need to investigate the queries that generate most of the load. Further, if you enable the log_queries_not_using_indexes option, MySQL will also log queries doing full table scans. That doesn’t always mean a query is slow, because in some situations the optimizer deliberately chooses a full table scan over any available index, for example when returning all records from a small table.

Our usual recommendation is to generate the slow log with long_query_time=0. This records all the traffic, but it is I/O intensive and can eat up disk space very quickly depending on your workload, so run with long_query_time=0 only for a specific period of time and then revert to logging only very slow queries. Percona Server has a nice option, log_slow_rate_limit, that limits the rate of logging. Filtering the slow query log is very helpful too in some cases: if we know the main performance issue is table scans, we can log only queries doing full table scans, or if we see I/O is the bottleneck, we can collect queries doing full scans and queries creating on-disk temporary tables. Again, this is only possible in Percona Server, with the log_slow_filter option (see the sketch below). Alternatively, you may want to collect everything in the slow query log and then filter with pt-query-digest; depending on your I/O capacity you might prefer one way or the other, and collecting everything lets us investigate other queries too if needed. Finally, use pt-query-digest to generate an aggregate report over the slow query log, which highlights the problematic part very efficiently. pt-query-digest itself can drive server load high, so our usual recommendation is to move the slow query log to a staging/dev server and run pt-query-digest there to generate the report.
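
A minimal sketch of those two Percona Server options; the values are illustrative, and the set of accepted log_slow_filter keywords depends on your Percona Server version:

-- log roughly 1 in 100 sessions
mysql> SET GLOBAL log_slow_rate_limit = 100;
-- log only queries doing full table scans or creating on-disk temporary tables
mysql> SET GLOBAL log_slow_filter = 'full_scan,tmp_table_on_disk';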

Note: changing the long_query_time value normally affects only newly created connections; existing sessions keep logging queries against the old threshold. Percona Server has a feature that changes the variable’s effective scope to global instead of session: enabling slow_query_log_use_global_control makes already-connected sessions honor the new long_query_time threshold too. You can read more about this patch here.

I am not going to show a detailed pt-query-digest report and explain each part of it here, because that is already well covered by my colleague Ovais Tariq in this post. Instead, I will show you some other aspects of the pt-query-digest tool.

Let me show you code snippets that enable the slow query log for a specific time period, with long_query_time=0 and log_slow_verbosity set to ‘full’. log_slow_verbosity is a Percona Server variable that logs extra statistics such as information on the query cache, filesort, temporary tables, InnoDB statistics, etc. Once you are done collecting logs, revert long_query_time to its previous value, and finally run pt-query-digest on the log to generate the report. Note: run the code below in the same MySQL session.

-- Save previous settings
mysql> SELECT @@global.log_slow_verbosity INTO @__log_slow_verbosity;
mysql> SELECT @@global.long_query_time INTO @__long_query_time;
mysql> SELECT @@global.slow_query_log INTO @__slow_query_log;
mysql> SELECT @@global.log_slow_slave_statements INTO @__log_slow_slave_statements;
-- Keep this in safe place, we'll need to run pt-query-digest
mysql> SELECT NOW() AS "Time Since";
-- Set values to enable query collection
mysql> SET GLOBAL slow_query_log_use_global_control='log_slow_verbosity,long_query_time';
mysql> SET GLOBAL log_slow_verbosity='full';
mysql> SET GLOBAL slow_query_log=1;
mysql> SET GLOBAL long_query_time=0;
mysql> SET GLOBAL log_slow_slave_statements=1;
-- Verify settings are OK
mysql> SELECT @@global.long_query_time, @@global.slow_query_log, @@global.log_slow_verbosity;
-- wait for 30 - 60 minutes
-- Keep this one too, also for pt-query-digest
mysql> SELECT NOW() AS "Time Until";
-- Revert to previous values
mysql> SET GLOBAL slow_query_log=@__slow_query_log;
mysql> SET GLOBAL long_query_time=@__long_query_time;
mysql> SET GLOBAL log_slow_verbosity=@__log_slow_verbosity; -- if percona server
mysql> SET GLOBAL log_slow_slave_statements=@__log_slow_slave_statements;
-- Verify settings are back to previous values
mysql> SELECT @@global.long_query_time, @@global.slow_query_log, @@global.log_slow_verbosity, @@global.slow_query_log_file;
-- Then with pt-query-digest run like (replace values for time-since, time-until and log name)
$ pt-query-digest --since='<time-since>' --until='<time-until>' --limit=100% /path/to/slow_query_log_file.log > /path/to/report.out
-- If you're not using Percona Server then you need to remove all references to log_slow_verbosity, slow_query_log_use_global_control and log_slow_slave_statements (the latter prior to MySQL 5.6).

My colleague Bill Karwin wrote a bash script that does almost the same as the code above. You can find the script to collect slow logs here. This script doesn’t hold a connection to the database session while you wait for logs to accumulate, and it sets all the variables back to the state they were in before. For full documentation, view this.

Further, you can also get EXPLAIN output into the report from the pt-query-digest tool. For that you need to use the --explain parameter, as follows.

$ pt-query-digest --explain u=<user>,p=<password>,h=<hostname> /path/to/slow.log > /path/to/report.out

The EXPLAIN output in the query report gives you the query execution plan, i.e., an indication of how that particular query is going to be executed. Note that if you run pt-query-digest over the slow query log on a server other than the one where the log originated (e.g., staging/dev, as I mentioned above), you may see a different execution path for the query in the report, a lower number of rows examined, etc., because staging/dev servers usually have a different data distribution, different MySQL versions, or different indexes. EXPLAIN also adds overhead, as queries need to be prepared on the server to generate the intended query execution path; for this reason, you may want to run pt-query-digest with --explain on a production replica instead.

It’s worth mentioning that logging queries with log_slow_verbosity in Percona Server is really handy, as it shows lots of additional statistics; it is especially helpful in situations where the EXPLAIN plan reports a different execution path than the one actually taken when the query is executed. On that particular topic, you may want to check this nice post.

pt-query-digest also supports filters. You can read more about them here. Let me show you an example. The following command will discard everything apart from insert/update/delete queries in the pt-query-digest output report.

$ pt-query-digest --filter '$event->{arg} =~ m/^(insert|update|delete)/i' --since='<time-since>' --until='<time-until>' --limit=100% /path/to/slow_query_log_file.log > /path/to/report.out

If you’re looking for GUI tools for pt-query-digest, then I would recommend reading this nice blog post from my colleague Roman. Further, our CEO Peter Zaitsev also wrote a post recently where he compares performance_schema and the slow query log. Check here for details.

In related news, Percona recently announced Percona Cloud Tools, the next generation of tools for MySQL. It runs a client-side agent (pt-agent) which runs pt-query-digest on the server at intervals and uploads the aggregated data to the Percona Cloud Tools API, which processes it further. Query Analytics is one tool from Percona Cloud Tools that provides advanced query metrics and nice visualizations. You may be interested to learn more about it here, and it’s also worth viewing this related webinar about Percona Cloud Tools from our CTO Vadim Tkachenko.

Conclusion:
pt-query-digest from Percona Toolkit is a versatile (and free) tool for slow query log analysis. It provides good insight into every individual query, especially in Percona Server with log_slow_verbosity enabled, e.g., queries logged with microsecond precision and information about each query’s execution plan. On top of that, Percona Cloud Tools includes Query Analytics, which gives you good visuals of query performance and also provides a view of historical data.

How to log slow queries on Slave in MySQL 5.0 with pt-query-digest

Working as a Percona Support Engineer, every day we see lots of issues related to MySQL replication. One very common issue is slave lag. There are many reasons for slave lag, but one common one is that queries take more time on the slave than on the master. How do you check for and log those long-running queries? In MySQL 5.1 the log-slow-slave-statements variable was introduced, which you can enable on the slave to log slow queries. But what if you want to log slow queries on a slave in earlier versions, like MySQL 5.0? There is a good solution/workaround: pt-query-digest. How? Let’s take a look…

If you want to log all queries that are running on the slave (including those run by the SQL thread), you can use pt-query-digest with the --processlist and --print (in pt-query-digest 2.1.9) or --output (in pt-query-digest 2.2.7) options and log all queries to a specific file. I have tested this in my local environment and it works.

You can start pt-query-digest on the slave like this:

nil@Dell:~$ /percona-toolkit-2.1.9/bin/pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --print --no-report
OR
nil@Dell:~$ /percona-toolkit-2.2.7/bin/pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --no-report --output=slowlog

Run some long running queries on Master,

nil@Dell:~$ mysql -umsandbox -p --socket=/tmp/mysql_sandbox34497.sock
Enter password:
mysql> use percona
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> delete from test limit 5000000;
Query OK, 5000000 rows affected (1 min 54.33 sec)
mysql> delete from test limit 5000000;
Query OK, 5000000 rows affected (1 min 56.42 sec)

mysql>

and you’ll see output like this on the slave:

nil@Dell:~/Downloads/percona-toolkit-2.1.9/bin$ ./pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --print --no-report
# Time: 2014-03-18T12:10:57
# User@Host: system user[system user] @ []
# Query_time: 114.000000 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
use percona;
delete from test limit 5000000;
nil@Dell:~/Downloads/percona-toolkit-2.2.7/bin$ pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --no-report --output=slowlog
# Time: 2014-03-18T12:21:05
# User@Host: system user[system user] @ []
# Query_time: 117.000000 Lock_time: 0.000000 Rows_sent: 0 Rows_examined: 0
use percona;
delete from test limit 5000000;

You can also run pt-query-digest in the background, like a daemon, sending its output to a specific file such as slow.log for later review:

/percona-toolkit-2.1.9/bin/pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --print --no-report > slow.log 2>&1

OR

/percona-toolkit-2.2.7/bin/pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --no-report --output=slowlog > slow.log 2>&1

Here, the default output will look just like the slow query log. If we have master-master replication, where every master is also a slave, and we want to log only the statements executed by the SQL thread, the --filter option can be used like this:

pt-query-digest --filter '$event->{user} eq "system user"' --no-report --output=slowlog

Since pt-query-digest --processlist polls 10 times per second (the --interval option), it’s not reliable for collecting complete query logs, because quick queries could fall between the polling intervals. And in any case, it won’t measure query time with precision any better than 1/10th of a second. But if the goal is to identify queries that are very long-running, it should be adequate.
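
If you do need to catch shorter queries, you can poll more aggressively at the cost of extra load on the server; a sketch, reusing the sandbox socket from the examples above:

pt-query-digest --processlist u=msandbox,p=msandbox,S=/tmp/mysql_sandbox34498.sock --interval 0.01 --no-report --output=slowlog > slow.log 2>&1
# --interval is in seconds; 0.01 polls the processlist ~100 times/second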

MySQL on Windows: A survival guide for Linux-based DBAs

Next week, on Nov. 6, I will be delivering a webinar about running MySQL on Windows, with a strong focus on Linux-based sysadmins and DBAs – and how not to go crazy in the process.

An interesting (and challenging!) part of working for Percona is that you never know what kind of setup a customer will have, and even though MySQL is still strongly tied to the LAMP stack and therefore Linux, more people are running it on Windows these days.

As someone who last used Windows on a daily basis in 1999, my first reaction was usually not nice. But over time, I have learned to embrace these kinds of cases as an opportunity to get out of my comfort zone and learn. I still personally prefer a Unix derivative, but I have gathered some knowledge that I think can be useful to others who arrive at MySQL on Windows from a Linux/Unix background. And who knows? Perhaps I can even give some insight to longtime Windows admins who are newcomers to the world of MySQL!

If you’re interested, register here to attend. The webinar starts at 10 a.m. PST on Nov. 6. It will also be recorded, and the replay will be available via that same link.

MySQL 5.6 Configuration Optimization Webinar, Sept. 25

This Wednesday, in our next webinar, I’ll share how to configure a better-performing MySQL 5.6 server. You’ll learn a practical approach to generating a sensible configuration file that sets what is needed and omits what is not.

Why dedicate an entire webinar to the new configuration settings within MySQL 5.6? Mainly because the default configuration files that come with MySQL 5.6 are not designed for high volume production use, and I’ve seen many MySQL incidents caused by poor configuration. Hopefully my advice will save you the headache of tweaking the variables within MySQL’s configuration files in order to work within your organization’s unique business environment.

And while I’ll be doing most of the talking, I do look forward to your questions during the webinar, titled “MySQL 5.6 Configuration Optimization.” It will begin at 10 a.m. Pacific time and run for about an hour. So please tune in on Wednesday, Sept. 25 at 10 a.m. You can register here to reserve your spot.

I welcome your questions both during the webinar and in the comments section below. See you Wednesday!

How to reclaim space in InnoDB when innodb_file_per_table is ON

When innodb_file_per_table is OFF, all data is stored in the shared ibdata files. If you drop some tables or delete some data, there is no way to reclaim that unused disk space other than the dump/reload method.

When innodb_file_per_table is ON, each table stores its data and indexes in its own tablespace file. However, the shared tablespace (ibdata1) can still grow; you can find more information about why it grows and what the solutions are here:

http://www.mysqlperformanceblog.com/2013/08/20/why-is-the-ibdata1-file-continuously-growing-in-mysql/

Following that recent blog post from Miguel Angel Nieto, “Why is the ibdata1 file continuously growing in MySQL?“, and since this is a very common question for Percona Support, this post covers how to reclaim space when we are using innodb_file_per_table. I will also show you how to do it without causing performance or availability problems, with the help of our Percona Toolkit.

When you remove rows, they are just marked as deleted on disk. The space remains consumed by the InnoDB files and can be re-used later when you insert/update more rows, but the files themselves will never shrink. This is a very old MySQL bug: http://bugs.mysql.com/bug.php?id=1341

But if you are using innodb_file_per_table, then you can reclaim the space by running OPTIMIZE TABLE on that table. OPTIMIZE TABLE creates a new identical, empty table and then copies the data row by row from the old table to the new one. In this process a new .ibd tablespace is created and the space is reclaimed.

mysql> select count(*) From test;
+----------+
| count(*) |
+----------+
| 3145728 |
+----------+
root@nil:/var/lib/mysql/mydb# ll -h
...
-rw-rw---- 1 mysql mysql 168M Sep 5 11:52 test.ibd
mysql> delete from test limit 2000000;
mysql> select count(*) From test;
+----------+
| count(*) |
+----------+
| 1145728 |
+----------+
root@nil:/var/lib/mysql/mydb# ll -h
...
-rw-rw---- 1 mysql mysql 168M Sep 5 11:52 test.ibd

You can see that even after deleting 2M records, the test.ibd size remained 168M.

mysql> optimize table test;
+-----------+----------+----------+-------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-----------+----------+----------+-------------------------------------------------------------------+
| mydb.test | optimize | note | Table does not support optimize, doing recreate + analyze instead |
| mydb.test | optimize | status | OK |
+-----------+----------+----------+-------------------------------------------------------------------+
root@nil:/var/lib/mysql/mydb# ll -h
...
-rw-rw---- 1 mysql mysql 68M Sep 5 12:47 test.ibd

After OPTIMIZE, you will be able to reclaim the space. As you can see, test.ibd file size is decreased from 168M to 68M.

I would like to mention here that during this process the table will be locked for writes, which can affect performance on large tables. If you don’t want to lock the table, you can use one of Percona’s best utilities, pt-online-schema-change, which can ALTER without locking tables: run an ALTER TABLE with ENGINE=InnoDB, which will re-create the table and reclaim the space.

(It’s always recommended to use the latest version of the pt-* utilities.)

mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
| 2991456 |
+----------+
mysql> delete from test limit 2000000;
mysql> select count(*) from test;
+----------+
| count(*) |
+----------+
| 991456 |
+----------+
root@nil:/var/lib/mysql/mydb# ll -h
...
-rw-rw---- 1 mysql mysql 157M Sep 6 11:52 test.ibd
nilnandan@nil:~$ pt-online-schema-change --alter "ENGINE=InnoDB" D=mydb,t=test --execute
Operation, tries, wait:
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `mydb`.`test`...
Creating new table...
Created new table mydb._test_new OK.
Altering new table...
Altered `mydb`.`_test_new` OK.
2013-09-06T15:37:46 Creating triggers...
2013-09-06T15:37:47 Created triggers OK.
2013-09-06T15:37:47 Copying approximately 991800 rows...
Copying `mydb`.`test`: 96% 00:01 remain
2013-09-06T15:38:17 Copied rows OK.
2013-09-06T15:38:17 Swapping tables...
2013-09-06T15:38:18 Swapped original and new tables OK.
2013-09-06T15:38:18 Dropping old table...
2013-09-06T15:38:18 Dropped old table `mydb`.`_test_old` OK.
2013-09-06T15:38:18 Dropping triggers...
2013-09-06T15:38:18 Dropped triggers OK.
Successfully altered `mydb`.`test`.
root@nil:/var/lib/mysql/mydb# ll -h
...
-rw-rw---- 1 mysql mysql 56M Sep 6 15:38 test.ibd

Same here: you can see that the test.ibd file size decreased from 157M to 56M.

NOTE: Please make sure that you have ample free space before you run pt-online-schema-change, because it will create a temporary table that is roughly the size of the original table (see the sketch below). Run it on the primary node and the changes will be replicated to your slaves.
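
A rough pre-flight check, assuming the datadir layout from the examples above (paths are illustrative):

# size of the table about to be rebuilt
ls -lh /var/lib/mysql/mydb/test.ibd
# free space on the volume holding the datadir; it should comfortably exceed the .ibd size
df -h /var/lib/mysql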

Percona University at Washington, D.C. – Sept. 12

Following our events earlier this year in Raleigh, Montevideo, Buenos Aires, Toronto and Portland, we bring Percona University to Washington, D.C. on September 12.

This free one-day technical education event will cover many MySQL ecosystem technologies, including: Percona Server, MariaDB, MySQL 5.6, Percona XtraDB Cluster, Tungsten Replicator, Percona XtraBackup, MySQL Cluster and Percona Toolkit. We also will learn about Apache Hadoop and how it can be most successfully used with MySQL environments.

Matt Yonkovit, VP of Percona MySQL Consulting, will give his great talk on the “5 minute DBA,” drawing from a wealth of practical DBA tips for developers and system administrators – professionals who must step up to the plate on occasion and perform DBA tasks.

Ryan Huddleston, director of Percona Remote DBA, will talk about “MySQL Backups in the Real World,” sharing best practices you’ll want to follow if you never want to lose your data again – as well as being able to recover from various data incidents as quickly as possible.

Our special guest for this event is Baron Schwartz, co-founder and CEO of VividCortex, who will be speaking about using MySQL with Go. The Go programming language is quickly gaining popularity, and we’re excited to bring the most up-to-date information about using Go with the MySQL database.

If you will be in the Washington, D.C. area on September 12, I’d love to see you at this event – space is limited though, so please register now.  And if you’re interested in Percona University events and would like to bring one to your city, please fill out this form to let us know – especially if you can help us by hosting or promoting the event.

MySQL Security: Armoring Your Dolphin

My colleague and teammate Ernie Souhrada will be presenting a webinar on Wednesday, August 21, 2013 at 10 a.m. PDT titled “MySQL Security: Armoring Your Dolphin.”

This is a popular topic, with news breaking routinely that yet another Internet company has leaked private data of one form or another. Ernie’s webinar will be a great overview of securing MySQL from top to bottom, including changes related to security in the 5.6 release.

Topics to be covered include:

  • Basic security concepts
  • Security above the MySQL layer (network, hardware, OS, etc.)
  • Tips for application design
  • A more secure MySQL configuration
  • Security-related changes in MySQL 5.6

Attendees will leave this presentation knowing where to start when identifying vulnerabilities in their systems.

Be sure to register for the webinar in advance!

InnoDB Full-text Search in MySQL 5.6: Part 3, Performance

This is part 3 of a 3-part series covering the new InnoDB full-text search features in MySQL 5.6. To catch up on the previous parts, see part 1 or part 2.

Some of you may recall a few months ago that I promised a third part in my InnoDB full-text search (FTS) series, in which I’d actually take a look at the performance of InnoDB FTS in MySQL 5.6 versus traditional MyISAM FTS. I hadn’t planned on quite such a gap between part 2 and part 3, but as they say, better late than never. Recall that we have been working with two data sets, one which I call SEO (8000-keyword-stuffed web pages) and the other which I call DIR (800K directory records), and we are comparing MyISAM FTS in MySQL 5.5.30 versus InnoDB FTS in MySQL 5.6.10.

For reference, although this is not really what I would call a benchmark run, the platform I’m using here is a Core i7-2600 3.4GHz, 32GiB of RAM, and 2 Samsung 256GB 830 SSDs in RAID-0. The OS is CentOS 6.4, and the filesystem is XFS with dm-crypt/LUKS. All MySQL settings are their respective defaults, except for innodb_ft_min_token_size, which is set to 4 (instead of the default of 3) to match MyISAM’s default ft_min_word_len.

Also, recall that the table definition for the DIR data set is:

CREATE TABLE dir_test (
  id INT UNSIGNED NOT NULL PRIMARY KEY,
  full_name VARCHAR(100),
  details TEXT
);

The table definition for the SEO data set is:

CREATE TABLE seo_test (
 id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
 title VARCHAR(255),
 body MEDIUMTEXT
);

Table Load / Index Creation

First, let’s try loading data and creating our FT indexes in one pass – i.e., we’ll create the FT indexes as part of the original table definition itself. In particular, this means adding “FULLTEXT KEY (full_name, details)” to our DIR tables and adding “FULLTEXT KEY (title, body)” to the SEO tables. We’ll then drop these tables, drop our file cache, restart MySQL, and try the same process in two passes: first we’ll load the table, and then we’ll do an ALTER to add the FT indexes. All times in seconds.
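
Concretely, the two variants look like this for the DIR table (a sketch; the SEO table is analogous, with FULLTEXT KEY (title, body)):

-- one-pass: the FT index is part of the original table definition
CREATE TABLE dir_test (
  id INT UNSIGNED NOT NULL PRIMARY KEY,
  full_name VARCHAR(100),
  details TEXT,
  FULLTEXT KEY (full_name, details)
);
-- two-pass: create and load the bare table first, then add the index
ALTER TABLE dir_test ADD FULLTEXT KEY (full_name, details);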

Engine   Data Set   One-pass (load)   Two-pass (load, alter)
MyISAM   SEO        3.91              3.96  (0.76, 3.20)
InnoDB   SEO        3.777             7.32  (1.53, 5.79)
MyISAM   DIR        43.159            44.93 (6.99, 37.94)
InnoDB   DIR        330.76            56.99 (12.70, 44.29)

Interesting. For MyISAM, we might say that it really doesn’t make too much difference which way you proceed, as the numbers from the one-pass load and the two-pass load are within a few percent of each other, but for InnoDB, we have mixed behavior. With the smaller SEO data set, it makes more sense to do it in a one-pass process, but with the larger DIR data set, the two-pass load is much faster.

Recall that when adding the first FT index to an InnoDB table, the table itself has to be rebuilt to add the FTS_DOC_ID column, so I suspect that the size of the table when it gets rebuilt has a lot to do with the performance difference on the smaller data set. The SEO data set fits completely into the buffer pool, the DIR data set does not. That also suggests that it’s worth comparing the time required to add a second FT index (this time we will just index each table’s TEXT/MEDIUMTEXT field). While we’re at it, let’s look at the time required to drop the second FT index as well. Again, all times in seconds.

Engine   Data Set   FT Index Create Time   FT Index Drop Time
MyISAM   SEO        6.34                   3.17
InnoDB   SEO        3.26                   0.01
MyISAM   DIR        74.96                  37.82
InnoDB   DIR        24.59                  0.01

InnoDB wins this second test all around. I’d attribute InnoDB’s win here partially to not having to rebuild the whole table for the second (and subsequent) indexes, but also to the fact that at least some of the InnoDB data was already in the buffer pool from when the first FT index was created. Also, we know that InnoDB generally drops indexes extremely quickly, whereas MyISAM requires a rebuild of the .MYI file, so InnoDB’s win on the drop test isn’t surprising.

Query Performance

Recall the queries that were used in the previous post from this series:

1. SELECT id, title, MATCH(title, body) AGAINST ('arizona business records'
   IN NATURAL LANGUAGE MODE) AS score FROM seo_test_{myisam,innodb} ORDER BY 3
   DESC LIMIT 5;
2. SELECT id, title, MATCH(title, body) AGAINST ('corporation commission forms'
   IN NATURAL LANGUAGE MODE) AS score FROM seo_test_{myisam,innodb} ORDER BY 3 DESC
   LIMIT 5;
3. SELECT id, full_name, MATCH(full_name, details) AGAINST ('+james +peterson +arizona'
   IN BOOLEAN MODE) AS score FROM dir_test_{myisam,innodb} ORDER BY 3 DESC LIMIT 5;
4. SELECT id, full_name, MATCH(full_name, details) AGAINST ('+james +peterson arizona'
   IN BOOLEAN MODE) AS score FROM dir_test_{myisam,innodb} ORDER BY 3 DESC LIMIT 5;
5. SELECT id, full_name, MATCH(full_name, details) AGAINST ('"Thomas B Smith"'
   IN BOOLEAN MODE) AS score FROM dir_test_{myisam,innodb} ORDER BY 3 DESC LIMIT 1;

The queries were run consecutively from top to bottom, a total of 10 times each. Here are the results in tabular format:

Query #   Engine   Min. Execution Time   Avg. Execution Time   Max. Execution Time
1         MyISAM   0.007953              0.008102              0.008409
1         InnoDB   0.014986              0.015331              0.016243
2         MyISAM   0.001815              0.001893              0.001998
2         InnoDB   0.001987              0.002077              0.002156
3         MyISAM   0.000748              0.000817              0.000871
3         InnoDB   0.670110              0.676540              0.684837
4         MyISAM   0.001199              0.001283              0.001372
4         InnoDB   0.055479              0.056256              0.060985
5         MyISAM   0.008471              0.008597              0.008817
5         InnoDB   0.624305              0.630959              0.641415

Not a lot of variance in execution times for a given query, so that’s good, but InnoDB is always coming back slower than MyISAM. In general, I’m not that surprised that MyISAM tends to be faster; this is a simple single-threaded, read-only test, so none of the areas where InnoDB shines (e.g., concurrent read/write access) are being exercised here, but I am quite surprised by queries #3 and #5, where InnoDB is just getting smoked.

I ran both versions of query 5 with profiling enabled, and for the most part, the time spent in each query state was identical between the InnoDB and MyISAM versions of the query, with one exception.

InnoDB: | Creating sort index | 0.626529 |
MyISAM: | Creating sort index | 0.014588 |

That’s where the bulk of the execution time is. According to the docs, this thread state means that the thread is processing a SELECT which required an internal temporary table. Ok, sure, that makes sense, but it doesn’t really explain why InnoDB is taking so much longer, and here’s where things get a bit interesting. If you recall part 2 in this series, query 5 actually returned 0 results when run against InnoDB with the default configuration because of the middle initial “B”, and I had to set innodb_ft_min_token_size to 1 in order to get results back. For the sake of completeness, I did that again here, then restarted the server and recreated my FT index. The results? Execution time dropped by 50% and ‘Creating sort index’ didn’t even appear in the query profile:

mysql [localhost] {msandbox} (test): SELECT id, full_name, MATCH(full_name, details) AGAINST
('"Thomas B Smith"' IN BOOLEAN MODE) AS score FROM dir_test_innodb ORDER BY 3 DESC LIMIT 1;
+-------+----------------+-------------------+
| id    | full_name      | score             |
+-------+----------------+-------------------+
| 62633 | Thomas B Smith | 32.89915466308594 |
+-------+----------------+-------------------+
1 row in set (0.31 sec)
mysql [localhost] {msandbox} (test): show profile;
+-------------------------+----------+
| Status                  | Duration |
+-------------------------+----------+
| starting                | 0.000090 |
| checking permissions    | 0.000007 |
| Opening tables          | 0.000017 |
| init                    | 0.000034 |
| System lock             | 0.000012 |
| optimizing              | 0.000008 |
| statistics              | 0.000027 |
| preparing               | 0.000012 |
| FULLTEXT initialization | 0.304933 |
| executing               | 0.000008 |
| Sending data            | 0.000684 |
| end                     | 0.000006 |
| query end               | 0.000006 |
| closing tables          | 0.000011 |
| freeing items           | 0.000019 |
| cleaning up             | 0.000003 |
+-------------------------+----------+

Hm. It’s still slower than MyISAM by quite a bit, but much faster than before. The reason it’s faster is because it found an exact match and I only asked for one row, but if I change LIMIT 1 to LIMIT 2 (or limit N>1), then ‘Creating sort index’ returns to the tune of roughly 0.5 to 0.6 seconds, and ‘FULLTEXT initialization’ remains at 0.3 seconds. So this answers another lingering question: there is a significant performance impact to using a lower innodb_ft_min_token_size (ifmts), and it can work for you or against you, depending upon your queries and how many rows you’re searching for. The time spent in “Creating sort index” doesn’t vary too much (maybe 0.05s) between ifmts=1 and ifmts=4, but the time spent in FULLTEXT initialization with ifmts=4 was typically only a few milliseconds, as opposed to the 300ms seen here.
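
For reference, a sketch of making that change, assuming the DIR table from this series. innodb_ft_min_token_size is not dynamic, so it goes in the config file and requires a restart, after which the FT index must be rebuilt to pick up the new tokenization; the index name "ft_idx" here is hypothetical, so check SHOW CREATE TABLE for the real one:

# my.cnf
[mysqld]
innodb_ft_min_token_size = 1

-- after restarting mysqld, rebuild the FT index:
ALTER TABLE dir_test_innodb DROP INDEX ft_idx;
ALTER TABLE dir_test_innodb ADD FULLTEXT INDEX ft_idx (full_name, details);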

Finally, I tried experimenting with different buffer pool sizes, temporary table sizes, per-thread buffer sizes, and I also tried changing from Antelope (ROW_FORMAT=COMPACT) to Barracuda (ROW_FORMAT=DYNAMIC) and switching character sets from utf8 to latin1, but none of these made any difference. The only thing which seemed to provide a bit of a performance improvement was upgrading to 5.6.12. The execution times for the InnoDB FTS queries under 5.6.12 were about 5-10 percent faster than with 5.6.10, and query #2 actually performed a bit better under InnoDB than MyISAM (average execution time 0.00075 seconds faster), but other than that, MyISAM still wins on raw SELECT performance.

Three blog posts later, then, what’s my overall take on InnoDB FTS in MySQL 5.6? I don’t think it’s great, but it’s serviceable. The performance for BOOLEAN MODE queries definitely leaves something to be desired, but I think InnoDB FTS fills a need for those people who want the features and capabilities of InnoDB but can’t modify their existing applications or who just don’t have enough FTS traffic to justify building out a Sphinx/Solr/Lucene-based solution.
