MySQL
Anton, 2016-09-30 11:42:48

Should data be deleted from a database?

Hello. I have noticed more than once that people do not actually delete data from the database, but instead add a del field in which they mark a record as deleted. Why do they do this, and what is the best approach?


7 answer(s)
Rsa97, 2016-09-30
@Rsa97

You can delete records only if they are guaranteed not to be needed in the future and are not referenced by other records.
Also, if the table is large and has many indexes, it is faster to mark the record as deleted and leave the actual deletion, and the index rebuild that comes with it, to periodic maintenance.
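
A minimal sketch of the "mark now, purge later" idea, using Python's built-in sqlite3 as a stand-in for MySQL. The orders table, its columns, and the function names are hypothetical and only illustrate the pattern:

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " item TEXT NOT NULL,"
    " del INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO orders (item) VALUES (?)",
    [("book",), ("pen",), ("lamp",)],
)

def soft_delete(conn, order_id):
    """Mark a row as deleted with a single-row UPDATE."""
    conn.execute("UPDATE orders SET del = 1 WHERE id = ?", (order_id,))

def live_orders(conn):
    """Everyday queries have to filter out the marked rows."""
    rows = conn.execute("SELECT item FROM orders WHERE del = 0 ORDER BY id")
    return [r[0] for r in rows]

def purge(conn):
    """Maintenance job: physically remove marked rows, return how many."""
    cur = conn.execute("DELETE FROM orders WHERE del = 1")
    conn.commit()
    return cur.rowcount

soft_delete(conn, 2)
remaining = live_orders(conn)
purged = purge(conn)
total = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The purge function is the part you would schedule for a quiet period; everything else only ever issues a cheap UPDATE.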

romy4, 2016-09-30
@romy4

If the data is not fixed-width (numbers and strings of the same length), deleting rows leaves the database heavily fragmented. You end up with little data in a huge database file. To fix that you have to optimize the tables, which is very expensive and even unacceptable on heavily loaded databases. That is why a del field becomes the solution, along with tricks such as partitioning.

Andrew, 2017-10-04
@sAndreas

My position, which has helped me more than once: if the size allows, it is better to keep everything. I even recorded in a separate table who deleted the data (usually the user_id of whoever changed the data is written automatically, but once a row was deleted there was no record left of who exactly deleted it). A couple of times this let me call out particularly sly people who thought that if the data was deleted, nobody would know who had deleted it.
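
Roughly what such a "who deleted it" table could look like, sketched with Python's sqlite3 in place of MySQL. The documents and deletion_log tables, their columns, and delete_document are all made-up names, and marking the row rather than removing it is my assumption based on "keep everything":

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a data table with a del flag plus a separate log table.
conn.executescript("""
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    del INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE deletion_log (
    id INTEGER PRIMARY KEY,
    table_name TEXT NOT NULL,
    row_id INTEGER NOT NULL,
    deleted_by INTEGER NOT NULL,
    deleted_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO documents (id, title) VALUES (1, 'contract'), (2, 'invoice');
""")

def delete_document(conn, doc_id, user_id):
    """Mark the row as deleted and record who did it, in one transaction."""
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        cur = conn.execute(
            "UPDATE documents SET del = 1 WHERE id = ? AND del = 0",
            (doc_id,),
        )
        if cur.rowcount:
            conn.execute(
                "INSERT INTO deletion_log (table_name, row_id, deleted_by)"
                " VALUES ('documents', ?, ?)",
                (doc_id, user_id),
            )
    return cur.rowcount == 1

ok = delete_document(conn, 2, user_id=42)
missing = delete_document(conn, 99, user_id=42)
log = conn.execute(
    "SELECT table_name, row_id, deleted_by FROM deletion_log"
).fetchall()
```

Doing both statements in one transaction matters here: otherwise a row could end up deleted with no log entry, which is exactly the gap the log is meant to close.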

Vyacheslav Belyaev, 2016-10-06
@TroL929

I recommend adding a del field and marking records with it. Many services work this way, for example social networks, where you can delete a post and then restore it.
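
A small sketch of the delete-and-restore behaviour, again with Python's sqlite3 standing in for MySQL. The posts table and the helper names are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one post with a del flag that defaults to "not deleted".
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " body TEXT NOT NULL,"
    " del INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO posts (id, body) VALUES (1, 'hello wall')")

def set_deleted(conn, post_id, deleted):
    """Set or clear the del flag; returns True if the post exists."""
    with conn:
        cur = conn.execute(
            "UPDATE posts SET del = ? WHERE id = ?",
            (1 if deleted else 0, post_id),
        )
    return cur.rowcount == 1

def visible_posts(conn):
    """What other users see: only posts that are not marked as deleted."""
    rows = conn.execute("SELECT body FROM posts WHERE del = 0 ORDER BY id")
    return [r[0] for r in rows]

set_deleted(conn, 1, True)       # user clicks "delete"
after_delete = visible_posts(conn)
set_deleted(conn, 1, False)      # user clicks "restore"
after_restore = visible_posts(conn)
```

Restore is just the same UPDATE with the flag cleared, which is why this option is so cheap to offer.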

Michael, 2016-09-30
@mtyurin

Well, at the very least, if data is never deleted, the database will eventually stop fitting into memory :) So the question here is only one of size and time.
Or the database's computing capacity may become insufficient, in which case the question is about the size, quantity, and quality of resources.
But data growth per node can be slowed significantly by splitting the original database across several nodes: the growth rate on each node is the overall rate divided by the number of nodes.

Maxim Timofeev, 2016-10-01
@webinar

The answer to the question can be divided into 2 parts:
1. Will this record be needed in the future? For example, there is no point in deleting a user, because they may come back.
2. Can it affect SEO or site performance? For example, deleting a page loses its SEO value; instead you can remove the links to it from the site while leaving it reachable by direct link. Some things also make sense to keep for statistics.
But keep in mind that most of the time it is just laziness. For example, deleting a forum category requires deleting all the forums inside it and probably a lot of related data, and it is easier not to delete anything than to spell out all the dependencies.
So whether to delete or not depends on the specific case. Sometimes you have to delete, sometimes you have to keep. And the problem of a growing database can be solved by archiving.
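
One possible shape for the archiving step, sketched with Python's sqlite3 instead of MySQL. The posts and posts_archive tables, the created_at column, and the cutoff date are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a live table and an archive table with the same columns.
conn.executescript("""
CREATE TABLE posts (
    id INTEGER PRIMARY KEY, body TEXT NOT NULL, created_at TEXT NOT NULL
);
CREATE TABLE posts_archive (
    id INTEGER PRIMARY KEY, body TEXT NOT NULL, created_at TEXT NOT NULL
);
INSERT INTO posts (id, body, created_at) VALUES
    (1, 'old post', '2014-01-15'),
    (2, 'older post', '2013-06-01'),
    (3, 'recent post', '2016-09-01');
""")

def archive_before(conn, cutoff):
    """Move rows older than cutoff (ISO date string) into the archive table."""
    # Copy and delete run in one transaction, so a failure leaves both tables
    # unchanged.
    with conn:
        conn.execute(
            "INSERT INTO posts_archive"
            " SELECT id, body, created_at FROM posts WHERE created_at < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM posts WHERE created_at < ?", (cutoff,))
    return cur.rowcount

moved = archive_before(conn, "2016-01-01")
live = [r[0] for r in conn.execute("SELECT id FROM posts ORDER BY id")]
archived = [r[0] for r in conn.execute("SELECT id FROM posts_archive ORDER BY id")]
```

The live table stays small, and nothing is actually lost: the old rows are still there if someone needs them for statistics.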

Miron2, 2017-08-26
@Miron2

Delete vs. a "deleted" status depends on the context. If the row is not needed, then of course delete it.
Usually there is an online data table where a row is deleted inside a transaction, while rows with the status "deleted" and the time of the operation accumulate in accumulator tables. The write to the accumulator happens either in a trigger or in a batch job run from the command line.
Sometimes data is first received into the accumulator to maximize the write speed for individual rows or small groups of rows. A batch job then runs and processes a large volume of data at once, to make the most of the DBMS's capabilities, and as a result rows are deleted from or added to the final table, which serves highly processed data promptly. This is how you have to build the architecture when the DBMS is expensive, for example Azure Parallel Data Warehouse, and its peak performance for the user requires flat, fully denormalized rows selected by a single standard key. This complexity is justified when the cost of storage and CPU matters less than the need for simultaneous access to huge amounts of data.
For the latter case, imagine a factory that first sends the DBMS the row "faucet No. 1, assign to pipe No. 2 ...". An hour later, on the test stand, the faucet bursts during testing. The faucet is replaced, and "delete the entry for faucet No. 1" is sent to the database. In this case the "deleted" status carries a meaning akin to a command.
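
A rough sketch of the trigger variant, written with Python's sqlite3 rather than any particular warehouse product. The parts_online and parts_accumulator tables, their columns, and the trigger name are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: an online table plus an accumulator that keeps the
# status and the time of each delete.
conn.executescript("""
CREATE TABLE parts_online (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, pipe_no INTEGER NOT NULL
);
CREATE TABLE parts_accumulator (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    part_id INTEGER NOT NULL,
    name TEXT NOT NULL,
    status TEXT NOT NULL,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- On every physical delete, write a 'deleted' row into the accumulator.
CREATE TRIGGER parts_online_ad AFTER DELETE ON parts_online
BEGIN
    INSERT INTO parts_accumulator (part_id, name, status)
    VALUES (OLD.id, OLD.name, 'deleted');
END;
""")

with conn:
    # The part is registered against pipe 2, then removed after a failed test.
    conn.execute(
        "INSERT INTO parts_online (id, name, pipe_no) VALUES (1, 'faucet no. 1', 2)"
    )
    conn.execute("DELETE FROM parts_online WHERE id = 1")

online_count = conn.execute("SELECT COUNT(*) FROM parts_online").fetchone()[0]
history = conn.execute(
    "SELECT part_id, name, status FROM parts_accumulator"
).fetchall()
```

The online table ends up empty, while the accumulator still holds the "deleted" event for a later batch job to pick up.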
