How to avoid duplication when loading data into the database?

SQL

beduin01, 2015-01-03 00:51:22

There is a table in a database, and there is data that I parse from a website. The data on the site changes little. New rows may appear; they are appended after the old ones. However, the site never shows more than 30 rows (older ones are removed).
Question: how do I add only the new values to the database? Should I first do a SELECT from the database and then compare the result with the values I want to write?
Or is there a better way?
And a second question: what if some of the old values on the site received a small update, while the database still holds the old version? Is it easier to delete all 30 of those records from the database each time and write them again?
Let me explain with an example. Measurements are posted on the site every day, one new row per day. However, sometimes yesterday's measurements are corrected retroactively.
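
For reference, the "SELECT first, then compare" idea from the question could look roughly like the sketch below. Everything concrete in it is an assumption made for illustration: the table name `measurements`, the columns `day` and `value`, the helper `sync_by_compare`, and the use of Python's built-in `sqlite3` as a stand-in for whatever database is actually used.

```python
import sqlite3

def sync_by_compare(conn, scraped):
    """Insert days not yet stored; update days whose value changed (hypothetical helper)."""
    if not scraped:
        return
    days = [day for day, _ in scraped]
    placeholders = ",".join("?" * len(days))
    # One SELECT fetches what is already stored for the scraped days (at most ~30 rows).
    stored = dict(conn.execute(
        f"SELECT day, value FROM measurements WHERE day IN ({placeholders})", days))
    for day, value in scraped:
        if day not in stored:
            conn.execute("INSERT INTO measurements (day, value) VALUES (?, ?)", (day, value))
        elif stored[day] != value:
            # The site corrected an old measurement retroactively.
            conn.execute("UPDATE measurements SET value = ? WHERE day = ?", (value, day))
    conn.commit()

# Assumed schema: one row per day, keyed by the date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (day TEXT PRIMARY KEY, value REAL)")

sync_by_compare(conn, [("2015-01-01", 10.0), ("2015-01-02", 11.0)])
# Next run: 2015-01-02 was corrected and 2015-01-03 is new.
sync_by_compare(conn, [("2015-01-02", 11.5), ("2015-01-03", 12.0)])
rows = conn.execute("SELECT day, value FROM measurements ORDER BY day").fetchall()
```

Rows that have already dropped off the site stay in the table untouched, which is the difference from deleting and rewriting everything on each run.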

2 answers

Armenian Radio, 2015-01-03
@gbg

Besides the option you described, there is another: create a trigger in the database that fires before a row is inserted, and do all the work of finding a duplicate and correcting it inside that trigger.
This will be slightly faster, since the data does not have to travel back and forth between the database and the application.
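
A minimal sketch of such a trigger, written in SQLite's dialect and driven from Python's built-in `sqlite3` only so that it can be run as-is. The table `measurements`, its columns `day` and `value`, the trigger name, and the helper `load` are assumptions for illustration. Trigger syntax and restrictions differ between database engines, so this would need adapting elsewhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE measurements (day TEXT PRIMARY KEY, value REAL);

-- Runs before each inserted row, but only when that day is already stored.
CREATE TRIGGER measurements_dedupe
BEFORE INSERT ON measurements
WHEN EXISTS (SELECT 1 FROM measurements WHERE day = NEW.day)
BEGIN
    -- Correct the stored row with the incoming value...
    UPDATE measurements SET value = NEW.value WHERE day = NEW.day;
    -- ...then skip the duplicate insert while keeping the update.
    SELECT RAISE(IGNORE);
END;
""")

def load(conn, scraped):
    """The application just inserts every scraped row; the trigger does the rest."""
    for row in scraped:
        conn.execute("INSERT INTO measurements (day, value) VALUES (?, ?)", row)
    conn.commit()

load(conn, [("2015-01-01", 10.0), ("2015-01-02", 11.0)])
# Next run: 2015-01-02 was corrected and 2015-01-03 is new.
load(conn, [("2015-01-02", 11.5), ("2015-01-03", 12.0)])
rows = conn.execute("SELECT day, value FROM measurements ORDER BY day").fetchall()
```

With this in place the application never needs to read the table before writing to it.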

hrum, 2015-01-03
@hrum

MySQL? If the primary (or unique) index on the table is defined correctly, a simple REPLACE can be used instead of INSERT/UPDATE.
It's ugly :) and a bit slow (because MySQL actually performs two operations instead of one), but it's simple, and old data gets updated automatically. And for a batch of 30 rows the performance cost is not critical.
dev.mysql.com/doc/refman/5.5/en/replace.html
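
A small sketch of the REPLACE approach. To keep it runnable without a server it uses Python's built-in `sqlite3`, which also accepts `REPLACE INTO`; in MySQL the statement has the same shape, but a MySQL driver would typically use `%s` placeholders instead of `?`. The table `measurements`, the columns `day` and `value`, and the helper `load_with_replace` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# REPLACE relies on the primary (or unique) key to recognise an existing row.
conn.execute("CREATE TABLE measurements (day TEXT PRIMARY KEY, value REAL)")

def load_with_replace(conn, scraped):
    """Write every scraped row; a row with an existing key is replaced."""
    conn.executemany(
        "REPLACE INTO measurements (day, value) VALUES (?, ?)", scraped)
    conn.commit()

load_with_replace(conn, [("2015-01-01", 10.0), ("2015-01-02", 11.0)])
# Next run: 2015-01-02 was corrected and 2015-01-03 is new.
load_with_replace(conn, [("2015-01-02", 11.5), ("2015-01-03", 12.0)])
rows = conn.execute("SELECT day, value FROM measurements ORDER BY day").fetchall()
```

Because the old row is deleted and a fresh one inserted, any column not listed in the statement falls back to its default, so this fits best when every column is written on each load.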