Crash, 2019-08-23 13:16:10
Parsing

How do you properly organize work with a large volume of intensively updated data?

The essence of the problem: there is a data set of several hundred thousand rows (tens of megabytes when placed in a MySQL table). This data has to be fetched from a third-party source several times a minute, reformatted, with some analytics done on the fly, and then stored locally - the cached data must be available at any time.
I tried doing it the classical way, saving to a MySQL table. As expected, this works very inefficiently - the insert takes far too long, even when it is split into several stages. I cannot finish saving the data before the next call to the update script arrives (the interval between calls is a few seconds), so script calls start to "catch up" with each other. Of course, I could add protection against parallel calls, but then the data would be stale (data freshness is critically important). See the sketch below for what I mean by such protection.
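For illustration, something like this is what I mean by protection against parallel calls: an already-running update simply makes the next invocation skip its turn instead of queueing behind it. The lock file path and refresh_cache() are placeholder names, not from my actual script:

```python
# Minimal sketch: skip a run if the previous one is still in progress,
# instead of letting invocations queue up behind each other.
import fcntl
import sys

LOCK_PATH = "/tmp/feed_update.lock"  # placeholder lock file path


def refresh_cache() -> None:
    """Placeholder for the actual fetch / reformat / analyse / store logic."""
    ...


def main() -> None:
    with open(LOCK_PATH, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fails immediately if another run holds it.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            print("previous update still running, skipping this invocation")
            sys.exit(0)
        refresh_cache()


if __name__ == "__main__":
    main()
```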
For now I have settled on a solution with Redis, keeping the data in RAM. In general it works, but is this approach really the best solution to the problem? RAM is not free and not infinite either. Please share your experience if you have dealt with this.
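Roughly, the Redis variant looks like this - a sketch assuming redis-py, with key names and helper functions made up for illustration. The fresh snapshot is built under a temporary key and then renamed over the live one, so readers always see a complete, current copy:

```python
# Minimal sketch of the Redis variant: build the new snapshot under a temporary
# key, then RENAME it over the live key so readers never see a half-written state.
import json

import redis

SNAPSHOT_KEY = "feed:current"   # illustrative key names
TMP_KEY = "feed:building"


def fetch_rows() -> list[dict]:
    # placeholder: call the third-party source here
    return []


def transform(rows: list[dict]) -> list[dict]:
    # placeholder: reformatting / on-the-fly analytics
    return rows


def refresh_cache(r: redis.Redis) -> None:
    rows = transform(fetch_rows())
    pipe = r.pipeline()
    pipe.set(TMP_KEY, json.dumps(rows))
    pipe.rename(TMP_KEY, SNAPSHOT_KEY)  # atomic swap: old data stays readable until now
    pipe.execute()


if __name__ == "__main__":
    refresh_cache(redis.Redis(host="localhost", port=6379))
```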

1 answer
Evgeny Koryakin, 2019-08-23
@zettend

Current SSD speeds will let you process hundreds of megabytes per second. Even a multi-threaded Node JS setup on a Raspberry Pi would cope just fine.
What you have described will run comfortably on any one-dollar hosting plan.
