How to speed up a query with a WHERE NOT IN condition?
Good afternoon. I can't understand why this query takes so long to run.
There is a table of 10,000 records with the following structure:
id ; computer_name ; hash
1 ; a ; 11
2 ; b ; 22
3 ; b ; 22
4 ; c ; 33
5 ; c ; 34
The task is to remove duplicate rows from the table, where duplicates are determined by the computer_name and hash fields together.
In the example above, rows 2 and 3 are duplicates: they have the same values in both fields (computer_name and hash).
Rows 4 and 5 are not duplicates, because their hash values differ.
After the deletion, only the row with the maximum id should remain from each group of duplicates. In other words, the result should be:
1 ; a ; 11
3 ; b ; 22
4 ; c ; 33
5 ; c ; 34
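For anyone who wants to reproduce the sample, here is a minimal sketch of the setup. The table name `computers`, the column types, and the alias `keep_id` are assumptions for illustration: the question only calls the table `table` and does not give its definition.

```sql
-- Hypothetical table name and column types, guessed from the sample rows.
CREATE TABLE computers (
    id            INTEGER PRIMARY KEY,
    computer_name VARCHAR(255) NOT NULL,
    hash          INTEGER NOT NULL
);

INSERT INTO computers (id, computer_name, hash) VALUES
    (1, 'a', 11),
    (2, 'b', 22),
    (3, 'b', 22),
    (4, 'c', 33),
    (5, 'c', 34);

-- Rows that should survive the cleanup:
-- the highest id within each (computer_name, hash) group.
SELECT MAX(id) AS keep_id, computer_name, hash
FROM computers
GROUP BY computer_name, hash
ORDER BY keep_id;
-- keep_id values: 1, 3, 4, 5 (id 2 is the only row to delete)
```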
I wrote this query:
DELETE FROM table
WHERE id NOT IN (SELECT MAX(id)
                 FROM table
                 GROUP BY computer_name, hash);

SELECT * FROM table
WHERE id NOT IN (SELECT MAX(id)
                 FROM table
                 GROUP BY computer_name, hash);
NOT IN can be very slow for a task like this. Try replacing it with NOT EXISTS,
or with a LEFT JOIN that keeps only the rows without a match:
SELECT * FROM table t1
LEFT JOIN (SELECT MAX(id) AS id
           FROM table
           GROUP BY computer_name, hash) t2
       ON t1.id = t2.id
WHERE t2.id IS NULL
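The NOT EXISTS variant mentioned above is not spelled out, so here is a minimal sketch of one way to write it as a DELETE. The table name `computers`, the alias `kept`, the index name, and the column types are assumptions for illustration (the question only calls the table `table`). Whether it is actually faster depends on the database system and its version, so it is worth comparing execution plans on a copy of the data first.

```sql
-- Hypothetical table and sample data, matching the rows in the question.
CREATE TABLE computers (
    id            INTEGER PRIMARY KEY,
    computer_name VARCHAR(255) NOT NULL,
    hash          INTEGER NOT NULL
);

INSERT INTO computers (id, computer_name, hash) VALUES
    (1, 'a', 11),
    (2, 'b', 22),
    (3, 'b', 22),
    (4, 'c', 33),
    (5, 'c', 34);

-- Assumption: a composite index on the grouping columns may help the
-- GROUP BY / MAX(id) step on larger tables.
CREATE INDEX idx_computers_name_hash ON computers (computer_name, hash, id);

-- Delete every row whose id is not the maximum id of its
-- (computer_name, hash) group. The ids to keep are computed in a
-- derived table, and NOT EXISTS checks each row against it.
DELETE FROM computers
WHERE NOT EXISTS (
    SELECT 1
    FROM (SELECT MAX(id) AS id
          FROM computers
          GROUP BY computer_name, hash) AS kept
    WHERE kept.id = computers.id
);

-- Check the result.
SELECT id, computer_name, hash
FROM computers
ORDER BY id;
-- remaining ids: 1, 3, 4, 5
```

Unlike NOT IN, NOT EXISTS does not change its meaning when the subquery returns NULL values, which is one reason it is often suggested as the replacement.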