Database versioning?

V

Vladimir Kuts2019-05-28 12:24:59

Database design

Vladimir Kuts, 2019-05-28 12:24:59

There is a base with several tables on the server.
There is a basis on a certain client.
On the server, the data changes from time to time. Let's admit - base structure does not change. Let's say that after each change we increment the data version.
Now a certain client accesses the database through the API, indicating its version, in response we must send some json to the API, in which we indicate which records need to be added / changed / deleted to synchronize data between the version on the server and the version of the client, spending as much as fewer resources are possible and by minimizing the number of requests and data sent from the server, since clients are mobile ...
It would seem that the task is typical - but I don’t really find data on the implementation on the network.
So far I have added a version to the table, and with each change I add data with the version. But this crutch does not seem to be particularly effective. Let's say there are about a million records in the table, and a couple of changes are made in the next version - firstly, you have to store a duplicate of the data, and secondly, you have to shovel millions of records for changes between versions.
Where can you find information on this topic? Perhaps you can somehow use other free database engines for this business? Perhaps somewhere there is a ready-made implementation?

Reply

Answer the question

In order to leave comments, you need to log in

5 answer(s)

V

Vasily, 2019-05-28
@VasyaM13221

1) the client stores the date of the last replication with the server.
2) the server updates the row's changetAt column each time the row is updated.
3) when connecting, the client sends the date of its last replication with the server to the server.
4) the server makes a selection by date and sends the row id to the client
5) the client looks at what id he needs and makes a request to update them

P

Philipp, 2019-05-29
@zoonman

If you have 500k entries in the table, then there is absolutely no point in downloading it all to your phone, except for one situation - offline mode. In this case, an actual impression is simply made, archived and sent to the client. Downloading 100 MB from your phone is not a problem now. With the Internet, it is much easier to make requests through the API.
If this option does not suit you, then everything is done very simply - the data is never deleted on the server, but a sign of their deletion is stored. In addition to this sign, the sign of the last update of the record is stored.
The client stores special meta-data with information about the last update (timestamp).
When connecting, the client requests an update with the last update time and local time (seen the wrong clock set a million times), the server responds with updated data. To save traffic, data comes in a compressed format, and remote records are transmitted as a separate fragment with minimal information identifying the row (fields: primary key or unique index). This provides a minimum amount of traffic, but only works in one direction.
To work in both directions, you need a full log of operations and a conflict resolution system. In general, the solution is so-so, for databases working with a large out of sync or a high frequency of updates, it is not solved ( OT will not save) without human intervention.
For regular databases in collaborative mode, set up master-to-master replication with transactions. For mobile applications, use thin client mode.

P

Peter, 2019-05-28
@petermzg

Write down not the version, but the time when the state of the record changed.
And it will be enough to get all the changes that have occurred after the specified time.

S

Stanislav Makarov, 2019-05-28
@Nipheris

If you need direct versioning - see SQL:2011 and its implementations for the RDBMS you need.
If you still need replication, and between disparate databases and process participants, look at www.symmetricds.org

What are some examples of using database replication?
Remote offices replicated to a central office
Cross platform database replication between different databases
Replication between on-premise databases and cloud databases
Consolidation of multiple databases into a data warehouse
Regional database replication to improve access times for local users
High availability of a database using a primary and secondary instance
As for conflicts and "gita for tabular records" - everything is really not easy here, I've taken on writing a dissertation on this topic ..

V

Vladimir, 2019-05-28
@MechanID

Have you learned native replication mechanisms in mysql (easier there) and then in postgresql? do they not suit you? if so, please describe why?