MySQL
mr_creo, 2015-08-03 17:50:10

Why does MySQL only use part of the index?

There is a table that helps to process jobs (more than 300 million records):

CREATE TABLE `test` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `job_id` int(10) unsigned NOT NULL,
 `lock` mediumint(6) unsigned DEFAULT '0',
 `time` timestamp NULL DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY `job_id` (`job_id`),
 KEY `lock` (`lock`),
 KEY `time` (`time`),
 KEY `lock_time` (`lock`,`time`)
) ENGINE=MyISAM;

The `lock` field is set to 0 (waiting), then to the process ID (while fetching), then to 1 (while executing), and finally back to 0. The
`time` field is set to the time the job was last executed.
The jobs have to be processed in a continuous loop, so ORDER BY `time` ASC LIMIT n came to mind.
Here 123456 is the process ID:
UPDATE `test` SET `lock` = 123456 WHERE `lock` = 0 ORDER BY `time` LIMIT 1000
SELECT * FROM `test` WHERE `lock` = 123456
UPDATE `test` SET `lock` = 1 WHERE `lock` = 123456

An index was created for the UPDATE query: KEY `lock_time` (`lock`,`time`)
But for some reason EXPLAIN shows that only part of the `lock_time` index is used, namely `lock`:
[Screenshot of the EXPLAIN output; image not available]
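For readers who want to reproduce the check, here is a minimal sketch of how both plans could be inspected. It assumes MySQL 5.6.3 or later, where EXPLAIN also accepts UPDATE statements. The byte counts in the comments are assumptions derived from the column definitions above (nullable MEDIUMINT, nullable TIMESTAMP without fractional seconds), not values taken from the original screenshot.

```sql
-- Plan for the fast SELECT.
EXPLAIN SELECT * FROM `test` WHERE `lock` = 0 ORDER BY `time` LIMIT 1000;

-- Plan for the slow UPDATE (EXPLAIN UPDATE needs MySQL 5.6.3 or later).
EXPLAIN UPDATE `test` SET `lock` = 123456 WHERE `lock` = 0 ORDER BY `time` LIMIT 1000;

-- Reading the result:
--   key_len counts only the index parts used to locate rows (the WHERE part).
--   `lock`  MEDIUMINT, nullable: 3 bytes + 1 NULL flag = 4
--   `time`  TIMESTAMP, nullable: 4 bytes + 1 NULL flag = 5
--   key_len = 4 therefore means that only `lock` is used for the lookup,
--   which is what WHERE `lock` = 0 would be expected to produce.
--   Whether `time` is also used for ordering is visible in the Extra column:
--   if "Using filesort" is absent, rows are read in index order.
```

If the two plans differ in the chosen key or in the Extra column, that difference is a reasonable first place to look for the gap between 0.0048 and 286 seconds.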
Although the SELECT query is relatively fast (0.0048 sec.):
SELECT SQL_NO_CACHE * FROM `test` WHERE `lock` = 0 ORDER BY `time` LIMIT 1000

But a similar UPDATE query takes an unacceptably long time (286.0228 sec.):
UPDATE `test` SET `lock` = 123456 WHERE `lock` = 0 ORDER BY `time` LIMIT 1000

I'm waiting for your advice!

5 answer(s)
mr_creo, 2015-08-03
@mr_creo

It turned out to be very simple:
1000 rows affected. (The query took 0.0736 seconds)
Thank you all!

Anton B, 2015-08-03
@bigton

But a similar UPDATE query takes an unacceptably long time (286.0228 sec.)
UPDATE `test` SET `lock` = 123456 WHERE `lock` = 0 ORDER BY `time` LIMIT 1000

Maybe it is because there is an index on `lock`, and the time is spent rebuilding it?

Stanislav Makarov, 2015-08-03
@Nipheris

I re-read it four times and still don't understand why you need the intermediate step when the field goes from the process ID back to zero. If you first do this:
then why not set the final value right away?
If you say that several processes can execute a job, and you can't simply zero out all jobs that have 1, then I would say you need a proper table structure and should separate "lock" and "process_id". Although it is of course unclear why you need the transition 0 -> process_id -> 1 -> 0 at all: use 0 -> process_id -> 0, and you will have straightforward queries:

UPDATE `test` SET `process_id` = 123456 WHERE `process_id` = 0 ORDER BY `time` LIMIT 1000
UPDATE `test` SET `process_id` = 0 WHERE `process_id` = 123456

Rsa97, 2015-08-03
@Rsa97

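You can do it like this:

SET @id = 0;
-- The parentheses matter: := has the lowest precedence, so without them
-- @id would receive the result of (`id` AND `lock` = 0), not the row id.
UPDATE `test` SET `lock` = 1 WHERE (@id := `id`) AND `lock` = 0 ORDER BY `time` LIMIT 1;
SELECT @id;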

Within a single connection the variable persists between queries, and the atomicity of the UPDATE lets you pick the row id and lock the row at the same time.
After processing, it is enough to reset `lock` and, if processing succeeded, set `time` to the completion time. The job then moves to the end of the queue. If `time` is not changed, the job stays at the head of the queue.
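The release step described above could look roughly like this. It is a sketch, not part of the original answer: it assumes the same connection is still open, so that `@id` still holds the id of the row claimed by the UPDATE, and it uses NOW() as the completion time.

```sql
-- Successful processing: release the lock and record the completion time,
-- which moves the job to the end of the queue ordered by `time`.
UPDATE `test` SET `lock` = 0, `time` = NOW() WHERE `id` = @id;

-- Failed processing: release the lock only; `time` is left unchanged,
-- so the job stays at the head of the queue and is picked up again.
UPDATE `test` SET `lock` = 0 WHERE `id` = @id;
```

Only one of the two statements would be run per job, depending on the outcome. Both address the row through the primary key, so they do not depend on the `lock_time` index.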

Alexander Melekhovets, 2015-08-03
@Blast

Have you tried ORDER BY `lock` ASC, `time` ASC? The MySQL optimizer doesn't always choose well; this may be one of those cases.
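Applied to the query from the question, the suggestion would read as follows. This is only a sketch of the idea; whether it actually changes the plan on a given server would have to be checked with EXPLAIN.

```sql
-- Listing both columns of KEY `lock_time` (`lock`,`time`) in the ORDER BY
-- makes the requested order match the index definition exactly.
UPDATE `test`
SET `lock` = 123456
WHERE `lock` = 0
ORDER BY `lock` ASC, `time` ASC
LIMIT 1000;
```

Because `lock` is fixed to 0 by the WHERE clause, the extra sort column does not change which rows are selected.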
