Thank you for attending my July 22 webinar, “Advanced Query Tuning in MySQL 5.6 and 5.7” (my slides and a replay are available here). As promised, here is the list of questions with my answers. Thank you for the great questions.
Q: Here is the explain example:
mysql> explain extended select id, site_id from test_index_id where site_id=1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: test_index_id
type: ref
possible_keys: key_site_id
key: key_site_id
key_len: 5
ref: const
rows: 1
filtered: 100.00
Extra: Using where; Using index
Why is key_site_id a covering index for this query, given that a) we are selecting “id” and b) key_site_id only contains site_id?
Because the table is InnoDB, every secondary key also contains the primary key columns (“id” here). The secondary index therefore holds everything needed to satisfy the above query, so key_site_id acts as a covering index, which is what “Using index” in the Extra column indicates.
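To see the effect, here is a minimal sketch. The table definition is an assumption, since the original schema was not shown: a nullable INT site_id is consistent with key_len: 5 in the output above, and the name column is hypothetical.

```sql
-- Assumed schema: only id, site_id and key_site_id appear in the question.
CREATE TABLE test_index_id (
  id      INT NOT NULL AUTO_INCREMENT,
  site_id INT DEFAULT NULL,
  name    VARCHAR(100) DEFAULT NULL,  -- hypothetical column, not in the index
  PRIMARY KEY (id),
  KEY key_site_id (site_id)
) ENGINE=InnoDB;

-- Both selected columns live in key_site_id (site_id plus the appended id),
-- so the index alone can answer the query.
EXPLAIN SELECT id, site_id FROM test_index_id WHERE site_id = 1;

-- name is stored only in the clustered index, so this query has to
-- look up the full row and the index no longer covers it.
EXPLAIN SELECT id, site_id, name FROM test_index_id WHERE site_id = 1;
```

The first EXPLAIN should report “Using index” in Extra, while the second should not.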
Q: Applications change over time. Do you suggest doing a periodic analysis of the indexes that are being used and dropping the ones that are not? If yes, any suggestions on how to tackle that?
Yes, that is a good idea. It can usually be done easily with Percona Toolkit or with Performance Schema in MySQL 5.6:
- Enable the slow query log and log every query, then use the pt-index-usage tool
- Or use the following query (as suggested in a FromDual blog post):
SELECT object_schema, object_name, index_name
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
AND count_star = 0
ORDER BY object_schema, object_name;
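For the slow query log option in the list above, logging every query can be sketched with the standard server variables shown here. Treat it as a starting point: logging everything adds overhead on a busy server, so it is usually done for a limited time window.

```sql
-- Turn the slow query log on at runtime.
SET GLOBAL slow_query_log = ON;

-- A threshold of 0 seconds logs every query.
-- The global value only applies to connections opened after the change.
SET GLOBAL long_query_time = 0;

-- Find the log file to feed to pt-index-usage.
SHOW GLOBAL VARIABLES LIKE 'slow_query_log_file';
```

Keep in mind that the Performance Schema counters used in the query above start from zero at server startup, so count_star = 0 is only meaningful after the server has run a representative workload.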
Q: If duplicate indexes are found on 5.6/5.7, will they cause a performance impact on the db while querying?
Duplicate keys can have a negative impact on selects:
- MySQL can get confused and choose the wrong index
- Total index size grows, which can leave MySQL without enough RAM to keep the indexes cached
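As a rough sketch of how exact duplicates could be located, the query below groups indexes by their ordered column list using information_schema.statistics. It is a simplification: it ignores prefix lengths and index types, and it does not catch indexes that are merely a left prefix of another index. The pt-duplicate-key-checker tool from Percona Toolkit performs a more thorough check.

```sql
-- List indexes on the same table that cover exactly the same columns
-- in the same order.
SELECT table_schema,
       table_name,
       cols,
       GROUP_CONCAT(index_name) AS duplicate_indexes
FROM (
  SELECT table_schema,
         table_name,
         index_name,
         GROUP_CONCAT(column_name ORDER BY seq_in_index) AS cols
  FROM information_schema.statistics
  WHERE table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')
  GROUP BY table_schema, table_name, index_name
) idx
GROUP BY table_schema, table_name, cols
HAVING COUNT(*) > 1;
```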
Q: What is the suggested method to measure performance on queries (other than the slow query log) so as to know where to create indexes?
The slow query log is the most common method. In MySQL 5.6 you can also use Performance Schema, in particular the events_statements_summary_by_digest table.
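A minimal sketch of such a Performance Schema query follows; the choice of columns and the LIMIT are only a suggestion. Timer columns are reported in picoseconds, hence the division.

```sql
-- Top 10 normalized statements by total execution time.
SELECT schema_name,
       digest_text,
       count_star                       AS executions,
       ROUND(sum_timer_wait / 1e12, 3)  AS total_latency_sec,
       sum_rows_examined,
       sum_rows_sent,
       sum_no_index_used
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 10;
```

Statements with a high ratio of sum_rows_examined to sum_rows_sent, or a non-zero sum_no_index_used, are usually the first candidates for a new index.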
Q: I’m not sure if this was covered in the webinar but… are there any best-practices for fulltext indexes?
That was not covered in this webinar; however, I’ve given a number of presentations on full-text indexes. For example: Creating Geo Enabled Applications with MySQL 5.6
Q: What would be the limit on index size or on the number of indexes you can define per table?
There are no limits on index size on disk; however, for performance it is good to have the active indexes fit in RAM.
InnoDB has a number of index limitations, e.g. a table can contain a maximum of 64 secondary indexes.
Q: If a table has two columns you would like to sum, can you have that sum indexed as a calculated index? To add to that, can that calculated index have “case when”?
Yes. Just to clarify, this is a feature of MySQL 5.7 only (not released yet).
It is documented now:
CREATE TABLE triangle (
sidea DOUBLE,
sideb DOUBLE,
sidec DOUBLE AS (SQRT(sidea * sidea + sideb * sideb))
);
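Applied to the question, an indexed sum and an indexed “case when” expression might look like the sketch below. The table and column names are hypothetical, and the syntax assumes the generated-column support as it eventually shipped in MySQL 5.7 (STORED columns are used because they can be indexed like regular columns), so details may differ in pre-release builds.

```sql
-- Hypothetical table: total and billable_total are computed by the server.
CREATE TABLE order_totals (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  price  DECIMAL(10,2) NOT NULL,
  tax    DECIMAL(10,2) NOT NULL,
  status VARCHAR(10)   NOT NULL,
  -- Sum of two columns, materialized so it can be indexed.
  total DECIMAL(11,2) AS (price + tax) STORED,
  -- Same idea with a CASE WHEN expression.
  billable_total DECIMAL(11,2)
    AS (CASE WHEN status = 'paid' THEN price + tax ELSE 0 END) STORED,
  KEY idx_total (total),
  KEY idx_billable_total (billable_total)
) ENGINE=InnoDB;

-- Filtering on the generated column lets the optimizer consider idx_total.
EXPLAIN SELECT id FROM order_totals WHERE total > 100;
```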
Q: I have noticed that you created indexes on columns like DayOfTheWeek with very low cardinality. Isn’t that normally a bad practice?
Yes, you are right! Unless you are running queries like “select count(*) from … where DayOfTheWeek = 7”, those indexes may not be very useful.
Q: I saw an article saying that if you don’t specify a primary key upfront, MySQL / InnoDB creates a hidden one in the background. Is it different from a real primary key if most of the fields used in the WHERE clause are not in the primary / semi-primary key? And is there a way to identify the tables with hidden primary key indexes?
The “hidden” primary key is 6 bytes wide and is also appended (duplicated) to all secondary keys. You can instead create an auto_increment INT primary key, which is smaller at 4 bytes (INT UNSIGNED is enough if you do not plan to store more than about 4 billion rows). In addition, you cannot use the hidden primary key in your queries.
The following query (against information_schema) can be used to find all InnoDB tables that have neither a primary key nor a unique index on NOT NULL columns (these get the “hidden” primary key):
SELECT tables.table_schema, tables.table_name, tables.table_rows
FROM information_schema.tables
LEFT JOIN (
  SELECT table_schema, table_name
  FROM information_schema.statistics
  GROUP BY table_schema, table_name, index_name
  HAVING
    SUM(
      CASE WHEN non_unique = 0 AND nullable != 'YES' THEN 1 ELSE 0 END
    ) = COUNT(*)
) puks
  ON tables.table_schema = puks.table_schema AND tables.table_name = puks.table_name
WHERE puks.table_name IS NULL
  AND tables.table_type = 'BASE TABLE' AND engine='InnoDB';
You may also use the mysql.innodb_index_stats table to find tables with the hidden primary key; they show up with the index name GEN_CLUST_INDEX. Example:
mysql> select * from mysql.innodb_index_stats;
+---------------+------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |
+---------------+------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| test | t1 | GEN_CLUST_INDEX | 2015-08-08 20:48:23 | n_diff_pfx01 | 96 | 1 | DB_ROW_ID |
| test | t1 | GEN_CLUST_INDEX | 2015-08-08 20:48:23 | n_leaf_pages | 1 | NULL | Number of leaf pages in the index |
| test | t1 | GEN_CLUST_INDEX | 2015-08-08 20:48:23 | size | 1 | NULL | Number of pages in the index |
+---------------+------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
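To list only the affected tables instead of scanning the full output, the same table can be filtered on that index name. This sketch assumes persistent statistics are enabled for the tables in question (the default in MySQL 5.6); tables without persistent statistics will not appear.

```sql
-- Tables whose clustered index is the auto-generated GEN_CLUST_INDEX.
SELECT DISTINCT database_name, table_name
FROM mysql.innodb_index_stats
WHERE index_name = 'GEN_CLUST_INDEX'
ORDER BY database_name, table_name;
```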
Q: You are using alter table to create an index, but how does MySQL sort the data when creating the index? Doesn’t it use a temp table for that?
That is a very good question: the behavior of “alter table … add index” has changed over time. As documented in Overview of Online DDL:
Historically, many DDL operations on InnoDB tables were expensive. Many ALTER TABLE operations worked by creating a new, empty table defined with the requested table options and indexes, then copying the existing rows to the new table one-by-one, updating the indexes as the rows were inserted. After all rows from the original table were copied, the old table was dropped and the copy was renamed with the name of the original table.
MySQL 5.5, and MySQL 5.1 with the InnoDB Plugin, optimized CREATE INDEX and DROP INDEX to avoid the table-copying behavior. That feature was known as Fast Index Creation
When MySQL uses the “Fast Index Creation” operation, it creates a set of temporary files in MySQL’s tmpdir:
To add a secondary index to an existing table, InnoDB scans the table, and sorts the rows using memory buffers and temporary files in order by the values of the secondary index key columns. The B-tree is then built in key-value order, which is more efficient than inserting rows into an index in random order.
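In MySQL 5.6 the in-place behavior can be requested explicitly, as in the sketch below. The table, column and index names are hypothetical. If the server cannot satisfy the requested ALGORITHM or LOCK level, the statement fails with an error instead of silently falling back to a table copy.

```sql
-- Check where the temporary sort files will be written;
-- this location needs enough free space for the index being built.
SHOW VARIABLES LIKE 'tmpdir';

-- Build the secondary index in place while allowing concurrent reads and writes.
-- orders, customer_id and idx_customer_id are hypothetical names.
ALTER TABLE orders
  ADD INDEX idx_customer_id (customer_id),
  ALGORITHM=INPLACE, LOCK=NONE;
```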
Q: How does InnoDB deadlock handling in 5.7 compare to the 5.6 version? Is that based on parameter setup?
A discussion of InnoDB deadlocks is outside the scope of this presentation. Valerii Kravchuk and Nilnandan Joshi gave an excellent talk at Percona Live 2015 (slides available): Understanding Innodb Locks and Deadlocks
Q: What is the performance impact of generating a virtual column for a table with 66 million records and creating an index on it? And how would you go about it? Do you have any suggestions on how to reorganize indexes on the physical disk?
As MySQL 5.7 is not released yet, the behavior of virtual columns may change. The main question here is whether these will be online operations: a) adding a virtual column (as this is only a metadata change, it should be a very light operation anyway), and b) adding an index on that virtual column. In the labs release it was not online; however, this can change.
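One way to find out on a given build is to request the in-place algorithm explicitly on a copy of the table, as sketched below. The table and column names are hypothetical, and the generated-column syntax is an assumption based on how the feature eventually shipped in MySQL 5.7. If an operation is not supported in place, the ALTER should fail immediately with an error rather than start a long table copy.

```sql
-- Step a) add the virtual column; expected to be a metadata-only change.
-- big_table, price, tax and total are hypothetical names.
ALTER TABLE big_table
  ADD COLUMN total DECIMAL(11,2) AS (price + tax) VIRTUAL,
  ALGORITHM=INPLACE;

-- Step b) index the virtual column, asking for no blocking of concurrent DML.
ALTER TABLE big_table
  ADD INDEX idx_total (total),
  ALGORITHM=INPLACE, LOCK=NONE;
```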
Thank you again for attending.
The post Advanced Query Tuning in MySQL 5.6 and MySQL 5.7 Webinar: Q&A appeared first on MySQL Performance Blog.