How to remove duplicates from a MySQL table with 29 million rows?

MySQL

doublench21, 2016-04-13 19:34:23

Which query can I use so that the system doesn't grind to a halt during such an operation?

3 answers

Dmitry Entelis, 2016-04-13
@doublench21

If you can stop writes and changes to the table, the fastest method is the following scheme.
1. Create a new table table2 with the same structure as the original table1.
2. Copy the distinct rows into it:

insert into table2 (select distinct ... from table1);

If the structure is complex, you can use group by instead of distinct.
3. Get the original table1 out of the way (drop it or rename it).
4. Rename table2 to table1.
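
The copy-and-swap scheme above can be sketched as a small runnable script. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in for a MySQL connection, the two-column schema (email, name) is hypothetical, and the old table is renamed to table1_old as a backup rather than dropped. On MySQL the equivalent statements would likely be CREATE TABLE table2 LIKE table1 and RENAME TABLE table1 TO table1_old, table2 TO table1.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory table1 with duplicate rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (email TEXT, name TEXT)")
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?)",
        [("a@x.org", "Ann"), ("a@x.org", "Ann"), ("b@x.org", "Bob"),
         ("b@x.org", "Bob"), ("c@x.org", "Cy")],
    )
    conn.commit()
    return conn


def dedupe_by_copy(conn):
    """Copy distinct rows into table2, then swap table2 in place of table1."""
    # New table with the same structure as the original.
    # (MySQL stand-in: CREATE TABLE table2 LIKE table1)
    conn.execute("CREATE TABLE table2 (email TEXT, name TEXT)")
    # Copy every distinct row exactly once.
    conn.execute("INSERT INTO table2 SELECT DISTINCT email, name FROM table1")
    # Keep the old table as a backup, then give the new one the original name.
    # (MySQL stand-in: RENAME TABLE table1 TO table1_old, table2 TO table1)
    conn.execute("ALTER TABLE table1 RENAME TO table1_old")
    conn.execute("ALTER TABLE table2 RENAME TO table1")
    conn.commit()


if __name__ == "__main__":
    conn = make_demo_db()
    dedupe_by_copy(conn)
    print(conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0])
```

Renaming instead of dropping costs disk space for a while, but it leaves a way back if the deduplicated copy turns out to be wrong.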

Artyom Karetnikov, 2016-04-13
@art_karetnikov

If I were you, I wouldn't delete anything from the table. Instead, I would simply create a new one that has no duplicates. It will be much cheaper in terms of resource consumption. And nothing stops you from doing it in batches:
1. Select the first hundred thousand rows and remember the id of the last one.
2. Pause.
3. Select the next hundred thousand rows, starting after that last id. Go back to step 2.
That's all.
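
A rough sketch of that batch loop, again with Python's built-in sqlite3 standing in for MySQL. Several things here are assumptions and not part of the answer: the schema (id, email, name) is hypothetical, a UNIQUE constraint on table2 decides what counts as a duplicate, and duplicates are skipped with INSERT OR IGNORE (SQLite syntax; MySQL spells it INSERT IGNORE). The function name and its parameters are made up for illustration.

```python
import sqlite3
import time


def copy_in_batches(conn, batch_size=100_000, pause_seconds=1.0):
    """Copy table1 into table2 in id-ordered batches, skipping duplicate rows."""
    # Assumption: the UNIQUE constraint defines what a duplicate is.
    conn.execute(
        "CREATE TABLE table2 ("
        "id INTEGER PRIMARY KEY, email TEXT, name TEXT, UNIQUE (email, name))"
    )
    last_id = 0
    while True:
        # Take the next batch, starting after the last id already processed.
        rows = conn.execute(
            "SELECT id, email, name FROM table1 WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        # Rows that violate the UNIQUE constraint are silently skipped.
        conn.executemany(
            "INSERT OR IGNORE INTO table2 (id, email, name) VALUES (?, ?, ?)", rows
        )
        conn.commit()
        # Remember the last id read from table1, even if that row was skipped.
        last_id = rows[-1][0]
        # Pause so other queries get a chance to run between batches.
        time.sleep(pause_seconds)
    return last_id


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
    conn.executemany(
        "INSERT INTO table1 (email, name) VALUES (?, ?)",
        [("a@x.org", "Ann"), ("a@x.org", "Ann"), ("b@x.org", "Bob")],
    )
    conn.commit()
    copy_in_batches(conn, batch_size=2, pause_seconds=0)
    print(conn.execute("SELECT id, email, name FROM table2").fetchall())
```

The id to resume from is taken from the batch read out of table1, not from table2, so a batch consisting entirely of duplicates still moves the loop forward.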

Chvalov, 2016-04-13
@Chvalov

Have a look - stackoverflow.com/questions/4685173/delete-all-dup...
stackoverflow.com/questions/3311903/remove-duplica...
stackoverflow.com/questions/14046355/how-do-i-dele...