SQL
Godless, 2013-08-19 15:27:11

Once again about logging changes and versioning data in the database

Good day.
I apologize for yet another question on this theme.
The topic is well worn, and the usual approaches are known (one, two).

At work we have been tasked with putting together software for our branch offices. I won't go into details; the essence is roughly this:
There are a couple dozen directories (reference tables). Some of them are hierarchical. There are aggregator tables (hereinafter XXX) in which data will be accumulated.

The specifics of the subject area are such that the directories can change quite often. Moreover, changes can be either critical (meaning they must be recorded in the history) or not (not recorded). And the data in XXX must remain accurate as of the date it was entered. This is solved by versioning the data in the directories, with the rows in XXX referencing a specific version. There is one more nuance: again because of the domain, the set of fields in the directories changes fairly often (4-6 times a year), and in XXX too.

We cannot decide how best to store all of this. I would like it to be convenient to code against, to perform well (the database will fill up pretty quickly), and to leave users room for extra functionality. I don't want to wedge an intermediate layer for processing history and logging into the application; this is for the branches, whereas in the software we develop for ourselves such a layer is already an option, since the amount of data there is an order of magnitude larger. Maybe we want too much? Then help us cross out the excess...

Users will be authenticated against a table in the database (sometimes several people work under one AD account). Permissions for the various sections are stored in a table as well.

For now we are looking at a solution like this (a rough sketch follows below):
Table 1 per directory: the current, up-to-date information.
Table 2 per directory: the versioned data (here you can also record who made the last change and when), referenced from XXX.
Plus triggers, or an intermediate layer in the application between the software and the database.
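A minimal T-SQL sketch of how such a two-table-per-directory layout might look. All names here (dbo.Department, dbo.Department_History, dbo.XXX_Fact) are invented for illustration; this is an assumption about the intended design, not the actual schema.

-- Current, up-to-date rows of one hypothetical directory ("Department")
CREATE TABLE dbo.Department (
    DepartmentId int IDENTITY(1,1) PRIMARY KEY,
    Code         nvarchar(20)  NOT NULL,
    Name         nvarchar(200) NOT NULL,
    ParentId     int NULL REFERENCES dbo.Department(DepartmentId)   -- hierarchy
);

-- Versioned copy: one row per saved version, referenced from the aggregator tables
CREATE TABLE dbo.Department_History (
    DepartmentVersionId int IDENTITY(1,1) PRIMARY KEY,
    DepartmentId        int NOT NULL REFERENCES dbo.Department(DepartmentId),
    Code                nvarchar(20)  NOT NULL,
    Name                nvarchar(200) NOT NULL,
    ParentId            int NULL,
    ChangedBy           nvarchar(128) NOT NULL,                     -- who changed it
    ChangedAt           datetime2     NOT NULL DEFAULT SYSUTCDATETIME()
);

-- Aggregator table (XXX) references the exact version that was current at entry time
CREATE TABLE dbo.XXX_Fact (
    FactId              int IDENTITY(1,1) PRIMARY KEY,
    DepartmentVersionId int NOT NULL
        REFERENCES dbo.Department_History(DepartmentVersionId),
    EnteredAt           datetime2     NOT NULL DEFAULT SYSUTCDATETIME(),
    Amount              decimal(18,2) NOT NULL
);
GO

-- One of the two options mentioned above: a trigger that snapshots every insert/update
-- of the directory into the history table.
-- NB: with a shared AD login, the application-level user would have to be passed in
-- explicitly (e.g. via CONTEXT_INFO); SUSER_SNAME() is used here only for brevity.
CREATE TRIGGER dbo.trg_Department_Version
ON dbo.Department
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.Department_History (DepartmentId, Code, Name, ParentId, ChangedBy)
    SELECT i.DepartmentId, i.Code, i.Name, i.ParentId, SUSER_SNAME()
    FROM inserted AS i;
END;
GO

Deciding whether a particular change is "critical" enough to produce a new version would still have to live either in the trigger's logic or in the application layer.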

Please share your experience. Who does it how? Is this really the only right way in such a changing environment?

PS: the DB will be MS SQL (Express edition in the branches).


3 answer(s)
Pavel Nazarov, 2013-08-19
@Godless

If you want to store a changing directory, then store it! Approximately as in the first link:
Directory (hierarchical) -> Directory elements (unique element code, reference to the directory) -> Element values over time, i.e. versions in your terminology (start_date, end_date, actual_value, reference_to_element, plus audit fields to taste).
The last entity, the actual values, is the one you change however you like, IMHO. If you want, overwrite only the value (which sometimes amounts to a "backdated" change); if you want, start a new record with its own dates.
With this scheme a query can always retrieve the actual value as of any point in time.
If it is inconvenient to embed such conditions in many queries, you can think about denormalization, i.e. storing a reference to the specific value of the element directly in the aggregator table.
It is not entirely clear why you need the separate table of current, up-to-date information; what is the benefit? If the load is heavy, it is easier to cache the current values in memory, IMHO; there is hardly an infinite number of directories.
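A rough T-SQL illustration of this scheme. All table and column names are hypothetical, chosen only to make the three levels and the point-in-time lookup concrete.

-- Directory (hierarchical) -> directory elements -> time-sliced element values
CREATE TABLE dbo.Directory (
    DirectoryId int IDENTITY(1,1) PRIMARY KEY,
    Name        nvarchar(200) NOT NULL,
    ParentId    int NULL REFERENCES dbo.Directory(DirectoryId)    -- hierarchy
);

CREATE TABLE dbo.DirectoryElement (
    ElementId   int IDENTITY(1,1) PRIMARY KEY,
    ElementCode nvarchar(50) NOT NULL UNIQUE,                     -- unique element code
    DirectoryId int NOT NULL REFERENCES dbo.Directory(DirectoryId)
);

CREATE TABLE dbo.ElementValue (
    ValueId    int IDENTITY(1,1) PRIMARY KEY,
    ElementId  int NOT NULL REFERENCES dbo.DirectoryElement(ElementId),
    ValidFrom  date NOT NULL,
    ValidTo    date NULL,                                          -- NULL = still in effect
    Value      nvarchar(max) NOT NULL,
    ChangedBy  nvarchar(128) NULL                                  -- audit "to taste"
);

-- The actual value of an element as of an arbitrary date:
DECLARE @ElementId int = 1, @AsOf date = '2013-08-19';
SELECT v.Value
FROM dbo.ElementValue AS v
WHERE v.ElementId = @ElementId
  AND v.ValidFrom <= @AsOf
  AND (v.ValidTo IS NULL OR v.ValidTo >= @AsOf);

The denormalization mentioned above would then simply mean XXX storing the ValueId it needs, instead of repeating the date condition in every query.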

noonesshadow, 2013-08-19
@noonesshadow

Most likely, this wheel has already been invented.

vlivyur, 2013-09-30
@vlivyur

I have implemented this in two ways.
1. All changes in one table, one row per attribute (directories usually change 1-3 attributes at a time, less often all of them). Stored: when, who, which attribute, and the value before the change (as varchar, which the client converted back to the proper type; for references to documents/directories it displayed both the current value of the referenced object and its value at the time of the change). In addition, the original table had two fields: who created the row and when (users were mostly interested in that information; the history was needed only for post-mortems). The entire database structure was described by metadata: directories, documents, and their physical representation. So whenever the physical structure changed, a trigger was regenerated that wrote the pre-change values into the history table. There was no physical deletion: just a "deleted" flag and a "show deleted" button. The solution was simple, but it got harder when the history of a whole document was required: it was just as fast for the header, but for dependent tables you could only open the object and review it line by line. Plus a certain inconvenience for the user: what is stored is who changed it, when, and the value "before the change", not "after the change".
2. A slightly modified version: the current row is stored in the data table, plus who changed it last and when. When the row is changed, the history table receives who, when, which object has now become the previous version, and the entire object as it was at the moment of the change (the whole table row as XML). Everything is again described by metadata: the attributes that matter are simply flagged "keep history", and the client works out the rest: it generates a query that serializes the current data to XML, saves it into the history table, and then applies the change (also updating two columns: who changed it and when). I expected this to be harder than the first option (I had not worked with XML, nor much with SQL), but it turned out more compact, and it was easier to implement history for documents with dependent tables (everything is shown in one window). Reference values are expanded and stored as their current display value, so it is hard to recover exactly which object the document referred to back then (and to trace that object's history). Hence the storage hunger (changing one bit field in a table with 30 fields saves all 30 fields, with all the dependencies joined in), and XML was not invented for speed either. (A rough sketch of both options follows below.)
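A rough T-SQL sketch of both approaches, reusing the hypothetical dbo.Department directory table from the sketch in the question above. The dbo.ChangeLog and dbo.RowHistory names and columns are invented for illustration and are not the implementation described in this answer.

-- Option 1: one history row per changed attribute, storing the value *before* the change
CREATE TABLE dbo.ChangeLog (
    LogId      int IDENTITY(1,1) PRIMARY KEY,
    TableName  sysname       NOT NULL,
    RowId      int           NOT NULL,
    ColumnName sysname       NOT NULL,
    OldValue   nvarchar(max) NULL,           -- stored as text, decoded back by the client
    ChangedBy  nvarchar(128) NOT NULL,
    ChangedAt  datetime2     NOT NULL DEFAULT SYSUTCDATETIME()
);
GO

-- A trigger of the kind that would be (re)generated from the metadata,
-- with one INSERT per audited column:
CREATE TRIGGER dbo.trg_Department_ChangeLog
ON dbo.Department
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.ChangeLog (TableName, RowId, ColumnName, OldValue, ChangedBy)
    SELECT N'Department', d.DepartmentId, N'Name', d.Name, SUSER_SNAME()
    FROM deleted AS d
    JOIN inserted AS i ON i.DepartmentId = d.DepartmentId
    WHERE ISNULL(d.Name, N'') <> ISNULL(i.Name, N'');
    -- ...repeat for every other column flagged "keep history" in the metadata
END;
GO

-- Option 2: snapshot the whole previous row as XML before applying the change
CREATE TABLE dbo.RowHistory (
    HistoryId int IDENTITY(1,1) PRIMARY KEY,
    TableName sysname       NOT NULL,
    RowId     int           NOT NULL,
    Snapshot  xml           NOT NULL,
    ChangedBy nvarchar(128) NOT NULL,
    ChangedAt datetime2     NOT NULL DEFAULT SYSUTCDATETIME()
);
GO

-- Executed by the client right before it issues the UPDATE:
DECLARE @DepartmentId int = 1;
INSERT INTO dbo.RowHistory (TableName, RowId, Snapshot, ChangedBy)
SELECT N'Department', @DepartmentId,
       (SELECT d.*
        FROM dbo.Department AS d
        WHERE d.DepartmentId = @DepartmentId
        FOR XML RAW('Department'), TYPE, ELEMENTS),
       SUSER_SNAME();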
In both cases, depending on how carefully you implement it, one of the changes can get lost: the creation, the last change, or the deletion.
Personally, I liked the first option: it is more flexible and simpler (except for the query that extracts historical data, which turns into a very tall stack of joins), but it proved not very convenient for users to view per-attribute changes; they wanted to see the whole row (it is awkward when a different attribute changes with each edit and you have to keep the current values in your head), although the culprits were easier to find. With the second option you have to think about how to present the mess in dependent tables (all the rows end up in one heap and it is hard to tell which history row belongs to which data row) and how to highlight changes compared with the previous version (though if you consider storing only the changed attributes instead of the whole row, and spend serious time on the implementation, a lot of these problems seem to disappear by themselves).
There is a very good article about other ways of storing changes.
