Database design
Kasbolat Kumakhov, 2018-11-12 23:43:20

How to properly version input data?

The system consists of several modules. Its main task is to receive a set of XML files, validate them, and load them into the database.
The sequence is:

  • Receive the XML
  • Validate it against the XSD
  • Check each node with a set of scripts
  • Serialize it and load it into the database

If any of these steps returns an error, we abort the process.
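
The sequence above, including stopping at the first error, could be sketched roughly as follows. This is only an illustration: `PipelineError`, `no_empty_leaf`, `process` and the in-memory `store` are hypothetical names, the list stands in for the real database, and XSD validation is left as a comment because the Python standard library has no XSD validator.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List


class PipelineError(Exception):
    """Raised by any step to abort processing of the current file."""


def parse(xml_text: str) -> ET.Element:
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise PipelineError(f"malformed XML: {exc}") from exc


def check_nodes(root: ET.Element, checks: List[Callable[[ET.Element], None]]) -> None:
    # Run every check script against every node; a check fails by raising.
    for node in root.iter():
        for check in checks:
            check(node)


def no_empty_leaf(node: ET.Element) -> None:
    # Hypothetical check script: leaf nodes must carry text.
    if len(node) == 0 and not (node.text or "").strip():
        raise PipelineError(f"empty node: {node.tag}")


def process(xml_text: str, checks, store: list) -> bool:
    """Return True if the file was loaded, False if any step aborted it."""
    try:
        root = parse(xml_text)
        # XSD validation would run here (omitted: needs a third-party library).
        check_nodes(root, checks)
        record = {child.tag: child.text for child in root}
        store.append(record)  # stand-in for the database insert
        return True
    except PipelineError:
        return False
```

Nothing is written to `store` unless every earlier step has passed, which mirrors the "abort on any error" rule.
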
Everything works, and everything is fine.
But now the input data needs to change while maintaining backward compatibility.
For this purpose, the XML has a version tag.
At first glance the situation is ordinary: add a couple of new schemas, scripts, and a serialization handler.
What complicates it is that there are hundreds of scripts (and there will be more), and new versions will be released regularly.
What are the best practices on this topic?
The database should always hold only the latest version of the schema, but all data from all time. There is no need to store previous schemas; the existing data must be adapted each time the schema changes.
Compatibility is needed only between the new and the previous version of the files. Older versions do not have to be carried along.
Changes can be sweeping, and the scripts can differ greatly between versions.
What is the best way to implement this?
The main sticking point for me is the scripts. I couldn't come up with anything better than building version support into each script.
But this complicates things a lot.
Ideally, I would like something like branches in Git: one branch is the current version, and the other is the new one. Depending on the version of the file, the scripts are selected from the matching branch.
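
One way to approximate "branches" without touching the inside of every script is a registry keyed by file version, so that a script unchanged between versions is registered for both and a new script only for the new one. The sketch below is an assumption about how this could look, not a description of the existing system: the `check` decorator, `checks_for`, `run_checks` and both example checks are hypothetical, and records are plain dicts for brevity.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# version string -> check functions that apply to files of that version
_CHECKS: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)


def check(*versions: str):
    """Register a check script for one or more file versions."""
    def decorator(func):
        for version in versions:
            _CHECKS[version].append(func)
        return func
    return decorator


@check("1.0", "2.0")  # unchanged between versions: registered for both
def field_3_present(record: dict) -> None:
    if "FIELD_3" not in record:
        raise ValueError("FIELD_3 is required")


@check("2.0")  # exists only in the new "branch"
def field_4_present(record: dict) -> None:
    if "FIELD_4" not in record:
        raise ValueError("FIELD_4 is required")


def checks_for(version: str) -> List[Callable[[dict], None]]:
    if version not in _CHECKS:
        raise KeyError(f"unsupported version: {version}")
    return list(_CHECKS[version])


def run_checks(version: str, record: dict) -> None:
    # Select the scripts by the file's version, then run them in order.
    for func in checks_for(version):
        func(record)
```

Since only two versions have to be supported at a time, dropping an old version amounts to removing its label from the decorators and deleting the scripts left with no version.
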
The same goes for the serializer, although it will also need some kind of adapter that produces the new version from the current one.
In general, I would like to know how others solve this kind of problem.
UPDATE_1:
Example:
XML_VER_1.0:
  • FIELD_1: 10
  • FIELD_2: 20
  • FIELD_3: 30

XML_VER_2.0:
  • FIELD_3: 30
  • FIELD_4: 40

Suppose two such XML files arrive. The result should be written to the database like this:
First entry:
  • FIELD_1: 10
  • FIELD_2: 20
  • FIELD_3: 30
  • FIELD_4: 0

Second entry:
  • FIELD_1: 0
  • FIELD_2: 0
  • FIELD_3: 30
  • FIELD_4: 40
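
The mapping in this example can be expressed as a merge of each file's fields over the defaults of the latest schema. This is a minimal sketch under stated assumptions: `LATEST_COLUMNS` and `to_row` are hypothetical names, the fields are assumed to be parsed into a dict already, and 0 is used as the default only because the example does so.

```python
# Columns of the latest database schema, each with the default used
# when a file version does not carry that field.
LATEST_COLUMNS = {"FIELD_1": 0, "FIELD_2": 0, "FIELD_3": 0, "FIELD_4": 0}


def to_row(fields: dict) -> dict:
    """Fill in defaults so every row matches the latest schema."""
    unknown = set(fields) - set(LATEST_COLUMNS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # Values from the file override the defaults.
    return {**LATEST_COLUMNS, **fields}


first = to_row({"FIELD_1": 10, "FIELD_2": 20, "FIELD_3": 30})
second = to_row({"FIELD_3": 30, "FIELD_4": 40})
```

Here `first` and `second` correspond to the first and second entries listed above.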

1 answer(s)
d-stream, 2018-11-13
@d-stream

What if you move the version-specific transformation to the front, before everything else?
That is, transform the previous version into the current one as a separate step, and then follow the current path. Naturally, this is acceptable only if there are unambiguous transformations from version to version (possibly with hints such as default values for new fields).
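
A rough sketch of such a "before" step, assuming the version is stored in a `version` attribute on the root element (the question only says there is a version tag, so the exact place is a guess). `CURRENT_VERSION`, `UPGRADERS`, `upgrade_1_0_to_2_0` and `to_current` are hypothetical names, and the default of 0 for `FIELD_4` is taken from the example in the question.

```python
import xml.etree.ElementTree as ET

CURRENT_VERSION = "2.0"


def upgrade_1_0_to_2_0(root: ET.Element) -> ET.Element:
    # Hypothetical rule: FIELD_4 is new in 2.0 and defaults to 0.
    if root.find("FIELD_4") is None:
        ET.SubElement(root, "FIELD_4").text = "0"
    root.set("version", "2.0")
    return root


# source version -> function that upgrades a document by one version
UPGRADERS = {"1.0": upgrade_1_0_to_2_0}


def to_current(root: ET.Element) -> ET.Element:
    """Upgrade a document step by step until it has the current version."""
    version = root.get("version")
    while version != CURRENT_VERSION:
        if version not in UPGRADERS:
            raise ValueError(f"no upgrade path from version {version!r}")
        root = UPGRADERS[version](root)
        version = root.get("version")
    return root
```

After `to_current`, only the current XSD, scripts, and serializer have to run, so each release adds one upgrader and, with the two-version rule, retires the oldest one.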
