MySQL
Ivan Panteleev, 2013-06-28 13:21:16

Copying large amounts of data from one table to another

There are two tables, pos and temp_pos. The first contains a little over 350 million records, the second about 3 million. I need to transfer the data from one table to the other as efficiently as possible. Here are the points I'm interested in:
1. Which engine is better suited for this task: InnoDB or MyISAM?
2. Should I disable key checks and re-enable them afterwards? Does that speed up the process or slow it down?

Now I do it like this:

ALTER TABLE pos DISABLE KEYS;
INSERT INTO pos (id, position, url_id, dataset_id) SELECT id, position, url_id, dataset_id from temp_pos;
ALTER TABLE pos ENABLE KEYS;

The table currently uses MyISAM, and the transfer takes anywhere from 40 minutes to 2.5 hours. I want to do it as efficiently as possible.


6 answers
Vitaly, 2013-06-28
@xytop

For inserts, MyISAM should be faster than InnoDB (www.opennet.ru/tips/info/1958.shtml).
Try this method:

SET FOREIGN_KEY_CHECKS = 0;
SET UNIQUE_CHECKS = 0;
SET SESSION tx_isolation='READ-UNCOMMITTED';
SET sql_log_bin = 0;
ALTER TABLE pos DISABLE KEYS;

INSERT INTO pos (id, position, url_id, dataset_id) SELECT id, position, url_id, dataset_id from temp_pos;

ALTER TABLE pos ENABLE KEYS;
SET sql_log_bin = 1;
SET UNIQUE_CHECKS = 1;
SET FOREIGN_KEY_CHECKS = 1;
SET SESSION tx_isolation='REPEATABLE-READ';

KEKSOV, 2013-06-28
@KEKSOV

There is a lot of useful information on this topic on the web, and a Google search for mysql speedup insert select can help.
If you use InnoDB, add SET autocommit=0; before the insert and COMMIT; after it.
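A minimal sketch of what that could look like, reusing the table and column names from the question; everything else is an assumption about your setup:

SET autocommit = 0;

INSERT INTO pos (id, position, url_id, dataset_id)
SELECT id, position, url_id, dataset_id FROM temp_pos;

-- make the whole insert durable and visible in one go
COMMIT;

SET autocommit = 1;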

edogs, 2013-06-28
@edogs

Do you have access to the file system?
If yes, then:
a) You can copy the table files (.MYD, .frm, .MYI) after stopping the database.
b) You can use mysqlhotcopy (it does almost the same as point a, only smarter and without stopping the entire database).
P.S. If you need to add data from one table to another, the above is suitable for "ejecting" data from the working table into a temporary one, so that the working table keeps serving queries while the long copy via INSERT INTO is in progress.

wartur, 2013-06-28
@wartur

In InnoDB, you can wrap the whole write in a single transaction; the log operations are then batched and the speed increases. Disabling foreign keys will also speed things up, but only do that if you are confident in what you are doing.
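A rough sketch of that approach, again assuming the column list from the question; the foreign-key part only matters if pos actually has foreign keys defined:

SET FOREIGN_KEY_CHECKS = 0;
START TRANSACTION;

INSERT INTO pos (id, position, url_id, dataset_id)
SELECT id, position, url_id, dataset_id FROM temp_pos;

-- one commit for the whole batch instead of one per row
COMMIT;
SET FOREIGN_KEY_CHECKS = 1;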

Vampiro, 2013-06-28
@Vampiro

The first table contains about 3 million records, the second has a little over 350 million records.

It seems to me that it would be more profitable to rename the tables, swapping them, as in the sketch below. Inserting 3 million rows is easier than 350 million. If you constantly need to shovel 350 million records from table to table, there is probably something wrong with the architecture, and you should think about triggers or some caching layer on the way from the data to MySQL. Memcached, it seems, can now act as a simple database, but I don't have the details here; you need to look at the application to decide where to put it.
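A sketch of the rename trick, assuming the table you want to keep is the one you would otherwise be copying from; the pos_old name is just a placeholder:

-- swap both names in one atomic statement, so readers never see a missing pos
RENAME TABLE pos TO pos_old, temp_pos TO pos;
-- after verifying the new pos, drop the old copy
DROP TABLE pos_old;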

balloon, 2014-07-10
@balloon

We have similar tasks with a comparable amount of data. SELECT + LOAD on InnoDB takes minutes (SSD + ~120 GB RAM). We do not disable keys or autocommit: IMHO, with atomic queries there is no point, since the indexes are updated only after all the data has been written anyway. We use LOAD ... IGNORE to skip duplicates. Another advantage of SELECT + LOAD over INSERT ... SELECT is that nothing is locked for the duration of the SELECT.
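A hedged sketch of that SELECT + LOAD path; the file path is an assumption and must point somewhere the server is allowed to write (see the secure_file_priv setting):

-- dump the source table to a flat file on the server
SELECT id, position, url_id, dataset_id
FROM temp_pos
INTO OUTFILE '/tmp/temp_pos.csv';

-- load it back; IGNORE skips rows that collide with existing unique keys
LOAD DATA INFILE '/tmp/temp_pos.csv'
IGNORE INTO TABLE pos
(id, position, url_id, dataset_id);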
