L
L
Lev Lybin2015-12-22 17:36:30
MySQL
Lev Lybin, 2015-12-22 17:36:30

MySQL, query optimizer, why does one query on different users use different indexes?

This user has more data and the request is processed quickly:

MariaDB [80Product_12]> explain SELECT * FROM actions WHERE actions.user_id = '11111' AND actions.type = 'BonusDist' ORDER BY actions.created_at DESC  LIMIT 3;
+------+-------------+---------+-------------+------------------------------------+------------------------------------+---------+------+-------+----------------------------------------------------------------------------------+
| id   | select_type | table   | type        | possible_keys                      | key                                | key_len | ref  | rows  | Extra                                                                            |
+------+-------------+---------+-------------+------------------------------------+------------------------------------+---------+------+-------+----------------------------------------------------------------------------------+
|    1 | SIMPLE      | actions | index_merge | ix_actions_type,ix_actions_user_id | ix_actions_user_id,ix_actions_type | 98,152  | NULL | 18127 | Using intersect(ix_actions_user_id,ix_actions_type); Using where; Using filesort |
+------+-------------+---------+-------------+------------------------------------+------------------------------------+---------+------+-------+----------------------------------------------------------------------------------+
1 row in set (0.00 sec)

This user has little data, the same request works for several minutes, it can be seen that it does not use ix_actions_user_id:
MariaDB [80Product_12]> explain SELECT * FROM actions WHERE actions.user_id = '222222' AND actions.type = 'BonusDist' ORDER BY actions.created_at DESC  LIMIT 3;
+------+-------------+---------+-------+------------------------------------+-----------------------+---------+------+------+-------------+
| id   | select_type | table   | type  | possible_keys                      | key                   | key_len | ref  | rows | Extra       |
+------+-------------+---------+-------+------------------------------------+-----------------------+---------+------+------+-------------+
|    1 | SIMPLE      | actions | index | ix_actions_type,ix_actions_user_id | ix_actions_created_at | 10      | NULL | 1013 | Using where |
+------+-------------+---------+-------+------------------------------------+-----------------------+---------+------+------+-------------+
1 row in set (0.00 sec)

MariaDB [80Product_12]> SHOW INDEX from actions;
+---------+------------+------------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table   | Non_unique | Key_name                     | Seq_in_index | Column_name       | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+------------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| actions |          0 | PRIMARY                      |            1 | id                | A         |    15565863 |     NULL | NULL   |      | BTREE      |         |               |
| actions |          1 | ix_actions_type              |            1 | type              | A         |        3800 |     NULL | NULL   |      | BTREE      |         |               |
| actions |          1 | ix_actions_created_at        |            1 | created_at        | A         |    15565863 |     NULL | NULL   |      | BTREE      |         |               |
| actions |          1 | ix_actions_aggregated_at     |            1 | aggregated_at     | A         |     7782931 |     NULL | NULL   | YES  | BTREE      |         |               |
| actions |          1 | ix_actions_aff_aggregated_at |            1 | aff_aggregated_at | A         |      178917 |     NULL | NULL   | YES  | BTREE      |         |               |
| actions |          1 | ix_actions_user_id           |            1 | user_id           | A         |     3113172 |     NULL | NULL   |      | BTREE      |         |               |
+---------+------------+------------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
6 rows in set (0.00 sec)

MariaDB [80Product_12]> SELECT @@version;
+-------------------------------+
| @@version                     |
+-------------------------------+
| 10.0.21-MariaDB-1~precise-log |
+-------------------------------+
1 row in set (0.00 sec)

Why such behaviour? There is speculation that he thinks he can fetch this data faster as he does, but due to the lack of RAM on the shard, it is slower. Why doesn't it always use the user_id index?

Answer the question

In order to leave comments, you need to log in

2 answer(s)
L
Lev Lybin, 2015-12-25
@lybin


First, the index in the second case is still used (ix_actions_created_at). The absence of "Using index" in the Extra column only means that when reading from the ix_actions_created_at index, you have to make an additional query to get the values ​​of the remaining columns.
Secondly, in general, the second plan looks more optimal: fewer lines to process, and, most importantly, the absence of a filesort, because when fetching data from the ix_actions_created_at index, the order specified in the ORDER BY is obtained automatically.
I don't have enough data to answer the question "why is the second one slower then?". If the data does not fit in memory, then additional queries to get the values ​​of all columns will indeed take a lot of time. But in the first case, the server does the same (because there is no "Using index"). Is it the same configuration? What is the performance in the first case if we add FORCE INDEX(ix_actions_created_at)? How will performance change in the second case if we add FORCE INDEX (ix_actions_user_id,ix_actions_type)
It would be interesting to look at the output of EXPLAIN FORMAT=JSON (available in MySQL 5.6+, MariaDB 10.1) or ANALYZE FORMAT=JSON. Unfortunately, you have MariaDB 10.0, which does not have either.
But the question, as I understand it, is not even that, but where to turn (with this and similar questions) and where to look. You can write to Moscow MySQL User Group: groups.google.com/forum/#!forum/moscow-mysql-user-group
You can write to MariaDB ( https://launchpad.net/~maria-discuss). If you think you've found a bug, you should file it in MariaDB JIRA: mariadb.atlassian.net/secure/Dashboard.jspa
Well, StackOverflow can be helpful too :)
Where to look: I would try the new optimizer debug features in MySQL 5.6/5.7 and MariaDB 10.1. They can be installed at least for test purposes. Without them, the optimizer's problems can only be solved by trial and error, i.e. test hypotheses by changing the configuration and query text one change at a time.
The machine on which the request is executed is one, i.e. queries on one database with one configuration. There is a problem with RAM.
I add FORCE INDEX (ix_actions_created_at) to the first request, which was fast, it starts to think for a very long time:
MariaDB [80Product_12]> explain SELECT * FROM actions FORCE INDEX (ix_actions_created_at) WHERE actions.user_id = '11111' AND actions.type = 'BonusDist' ORDER BY actions.created_at DESC  LIMIT 3;
+------+-------------+---------+-------+---------------+-----------------------+---------+------+------+-------------+
| id   | select_type | table   | type  | possible_keys | key                   | key_len | ref  | rows | Extra       |
+------+-------------+---------+-------+---------------+-----------------------+---------+------+------+-------------+
|    1 | SIMPLE      | actions | index | NULL          | ix_actions_created_at | 10      | NULL |    3 | Using where |
+------+-------------+---------+-------+---------------+-----------------------+---------+------+------+-------------+

If we add FORCE INDEX (ix_actions_user_id, ix_actions_type) to the second request, then performance increases and becomes somewhere on an equal footing, like the first request, without FORCE INDEX (ix_actions_created_at):
MariaDB [80Product_12]> explain SELECT * FROM actions FORCE INDEX (ix_actions_user_id,ix_actions_type) WHERE actions.user_id = '222222' AND actions.type = 'BonusDist' ORDER BY actions.created_at DESC  LIMIT 3;
+------+-------------+---------+------+------------------------------------+--------------------+---------+-------+-------+----------------------------------------------------+
| id   | select_type | table   | type | possible_keys                      | key                | key_len | ref   | rows  | Extra                                              |
+------+-------------+---------+------+------------------------------------+--------------------+---------+-------+-------+----------------------------------------------------+
|    1 | SIMPLE      | actions | ref  | ix_actions_type,ix_actions_user_id | ix_actions_user_id | 98      | const | 46114 | Using index condition; Using where; Using filesort |
+------+-------------+---------+------+------------------------------------+--------------------+---------+-------+-------+----------------------------------------------------+
1 row in set (0.00 sec)

I just noticed now that the first request does not have data matching the condition, the second one has data, it probably plays a role too.
I think that's the point. Because Since the active dataset does not fit in memory, it is cheaper to select a lot of extra rows (i.e. all that satisfy the WHERE clause), then sort by created_id and return the first three, than to scan created_at completely, but avoid sorting.
Unfortunately, MySQL's optimizer doesn't take into account the difference in access between cached and non-cached data, so it can sometimes make wrong decisions. Though it in one way or another concerns all DBMS (hints are thought up for this purpose).
I suspect that the composite index on (user_id, type, created_at) can change the situation.
Alexei

W
wol_fi, 2015-12-22
@wol_fi

Each possible query execution plan is weighted by the optimizer. It is likely that these queries have different weights, which is why different indexes are used. You can try assigning indexes manually via FORCE INDEX

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question