MySQL
winbackgo, 2012-08-06 12:20:32

Synchronization: Deleting data

There are two tables, T1 and T2, in different databases on different servers. Data from T1 is synchronized into T2. The synchronization process is simple: the timestamp of the most recent record is read from T2, rows updated after that timestamp are fetched from T1, and those rows are then inserted into or updated in T2.
How would you remove from T2 the rows that no longer exist in T1? There are several million records, and the operation must run regularly. The tables share a common ID field.
The simplest option is to iterate over all the records and work out which IDs are missing, but how can this be done with minimal overhead?
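For illustration, here is a minimal Python sketch of that "iterate over everything" option, done in ID chunks so that neither side has to hold several million IDs in memory at once. The names `stale_ids`, `make_fetcher` and the chunk size are hypothetical; in a real setup each fetcher would presumably run something like `SELECT ID FROM ... WHERE ID >= low AND ID < high` against its own server.

```python
from typing import Callable, Iterator, List

# A fetcher returns the IDs that fall in the half-open range [low, high).
FetchIds = Callable[[int, int], List[int]]


def stale_ids(fetch_t1: FetchIds, fetch_t2: FetchIds,
              max_id: int, chunk: int = 10_000) -> Iterator[int]:
    """Yield IDs that are present in T2 but no longer present in T1."""
    for low in range(1, max_id + 1, chunk):
        high = low + chunk
        in_t1 = set(fetch_t1(low, high))   # one read-only query against T1
        for row_id in fetch_t2(low, high):  # one query against T2
            if row_id not in in_t1:
                yield row_id


def make_fetcher(ids: List[int]) -> FetchIds:
    """In-memory stand-in for a real query, used only for this demo."""
    def fetch(low: int, high: int) -> List[int]:
        return [i for i in ids if low <= i < high]
    return fetch


t1_ids = [1, 2, 4, 7, 9]              # rows that still exist in T1
t2_ids = [1, 2, 3, 4, 5, 7, 8, 9]     # rows currently in T2
to_delete = list(stale_ids(make_fetcher(t1_ids), make_fetcher(t2_ids),
                           max_id=9, chunk=4))
# to_delete == [3, 5, 8]
```

Each chunk costs one indexed range query per server, so the total cost should grow roughly linearly with the ID range rather than with the square of the row count.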

UPD: The table structures are different, and the data is used in two completely separate projects, so replication will not work here. Think of the server hosting T2 as a parser that can only read from T1. T1 knows nothing about T2 at all.

UPD2:
So far I've only come up with the following query:

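-- Walk T1 in ID order and report each gap between consecutive IDs.
-- Note: this relies on MySQL evaluating the select expressions left to right,
-- which is not guaranteed when user variables are assigned inside a SELECT.
SET @counter := 0;
SELECT
    ID,
    IF(@counter + 1 != ID, CONCAT(@counter + 1, '-', ID - 1), NULL) AS missing,
    @counter := ID
FROM T1
ORDER BY ID;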
Now I'm trying to work out how to extract only the non-NULL rows. With write access I could create a temporary table, but I only have read rights. Another drawback is that the resulting missing value has to be parsed back into a range.
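One way around both drawbacks might be to do the gap detection on the client instead: read the IDs with a plain `SELECT ID FROM T1 ORDER BY ID` and compute the ranges while streaming the result. A minimal sketch, assuming the IDs arrive strictly ascending and start from 1 (the function name `missing_ranges` is made up for this example):

```python
from typing import Iterable, Iterator, Tuple


def missing_ranges(sorted_ids: Iterable[int],
                   start: int = 1) -> Iterator[Tuple[int, int]]:
    """Yield (first, last) for every run of IDs absent from sorted_ids.

    Mirrors the @counter query: only gaps before an existing ID are
    reported, so nothing is emitted after the highest ID.
    """
    expected = start
    for row_id in sorted_ids:
        if row_id > expected:
            yield (expected, row_id - 1)
        expected = row_id + 1


gaps = list(missing_ranges([1, 2, 5, 6, 10]))
# gaps == [(3, 4), (7, 9)]
```

The ranges come back as integer pairs, so there is nothing to parse, and each pair could then drive a `DELETE FROM T2 WHERE ID BETWEEN first AND last` on the T2 side.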

5 answers
gaelpa, 2012-08-06
@gaelpa

Create a trigger on T1 that writes the ID of every deleted or modified record to a separate table, T1del.
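A rough sketch of the delete half of this idea, using Python's built-in `sqlite3` purely as a stand-in for the two MySQL servers. The trigger name `t1_log_delete` and the `payload` / `parsed` columns are invented for the demo, and it assumes whoever owns T1 is willing to add the trigger and the T1del table there.

```python
import sqlite3

# Two separate in-memory databases play the role of the two servers.
t1_db = sqlite3.connect(":memory:")
t2_db = sqlite3.connect(":memory:")

t1_db.executescript("""
    CREATE TABLE T1 (id INTEGER PRIMARY KEY, payload TEXT);
    CREATE TABLE T1del (id INTEGER PRIMARY KEY);
    -- Log the ID of every row deleted from T1.
    CREATE TRIGGER t1_log_delete AFTER DELETE ON T1
    FOR EACH ROW BEGIN
        INSERT INTO T1del (id) VALUES (OLD.id);
    END;
    INSERT INTO T1 (id, payload) VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")
t2_db.executescript("""
    CREATE TABLE T2 (id INTEGER PRIMARY KEY, parsed TEXT);
    INSERT INTO T2 (id, parsed) VALUES (1, 'A'), (2, 'B'), (3, 'C');
""")

# A row disappears from T1; the trigger records its ID in T1del.
t1_db.execute("DELETE FROM T1 WHERE id = 2")

# The T2 side only reads T1del, then removes the same IDs locally.
deleted_ids = [row[0] for row in t1_db.execute("SELECT id FROM T1del ORDER BY id")]
t2_db.executemany("DELETE FROM T2 WHERE id = ?", [(i,) for i in deleted_ids])
t2_db.commit()

remaining = [row[0] for row in t2_db.execute("SELECT id FROM T2 ORDER BY id")]
# remaining == [1, 3]
```

With this in place the T2 side never has to scan T1 itself; it only needs to read the (presumably much smaller) T1del table on each run.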

Fortop, 2012-08-06
@Fortop

Why not set up replication?

frasl, 2012-08-06
@frasl

A slightly naive option is to check only the delta, that is:
delete from t2 where record_time >= sync_time and id not in (select id from t1 where record_time >= sync_time)
If synchronizations are fairly frequent (so the deltas are small), this works well.

dbmaster, 2012-08-06
@dbmaster

Besides the options already proposed, there is one more: add a deleted flag to the T1 table.
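A minimal sketch of how such a flag could plug into the existing sync, assuming that setting the flag also updates the row's modification time so the usual "changed since last sync" query picks it up. The tuple layout and the name `apply_changes` are hypothetical, and a plain dict stands in for T2.

```python
from typing import Dict, Iterable, Tuple

# One changed row as read from T1: (id, payload, deleted flag).
Change = Tuple[int, str, bool]


def apply_changes(t2_rows: Dict[int, str], changes: Iterable[Change]) -> int:
    """Apply one batch of T1 changes to T2; return how many rows were removed."""
    removed = 0
    for row_id, payload, is_deleted in changes:
        if is_deleted:
            # Flagged in T1: drop the row from T2 if it is still there.
            if t2_rows.pop(row_id, None) is not None:
                removed += 1
        else:
            # Ordinary insert or update.
            t2_rows[row_id] = payload
    return removed


t2_rows = {1: "a", 2: "b", 3: "c"}
changes = [(2, "b", True), (3, "c2", False), (4, "d", False)]
removed = apply_changes(t2_rows, changes)
# t2_rows == {1: "a", 3: "c2", 4: "d"}, removed == 1
```

The appeal of this variant is that deletions travel through the same incremental path as inserts and updates, so no full comparison of the two tables is needed.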

spacediver, 2012-08-08
@spacediver

"how to do this with minimal overhead?"
Try the straightforward solution first and measure the actual overhead. The execution time, set against how often the job needs to run, may turn out to be acceptable right away.
It is also possible that after one or two runs the server will "warm up" and start processing these operations faster.
