Anton Bormotov, 2014-10-30 08:43:05
MySQL

The input data is stored in MySQL. How do I store and calculate derived data on the fly when one of the tables is updated?

The question is:
1) The input data is stored in MySQL, in the following tables:
Results [result_id, post_id, user_id, points_bet, points_got]
Posts [post_id, category_id, text]
Categories [category_id, category_name]
Users [user_id, overall_ratio]
The overall_ratio value for each user is calculated as follows:

overall_ratio  = SELECT sum(points_got) / sum(points_bet) FROM Results WHERE (user_id = user->id)

2) We need to calculate the same ratio for each user in each category:
category_id_ratio  = SELECT sum(r.points_got) / sum(r.points_bet) FROM Results r JOIN Posts p ON p.post_id = r.post_id WHERE (r.user_id = user->id) AND (p.category_id = category->id)
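
For illustration only, the two formulas can be written as runnable queries. This is a minimal sketch under stated assumptions: an in-memory SQLite database stands in for MySQL, the sample rows are made up, and the function names `overall_ratio` and `category_ratio` are hypothetical. Because Results has no category_id column in the schema above, the per-category query reaches the category through Posts.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL. The SELECT statements
# below use only syntax that both databases accept. Sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Posts   (post_id INTEGER PRIMARY KEY, category_id INTEGER, text TEXT);
CREATE TABLE Results (result_id INTEGER PRIMARY KEY, post_id INTEGER,
                      user_id INTEGER, points_bet REAL, points_got REAL);
INSERT INTO Posts VALUES (1, 10, 'football post'), (2, 20, 'hockey post');
INSERT INTO Results (post_id, user_id, points_bet, points_got) VALUES
    (1, 1, 10, 12), (1, 1, 10, 12), (2, 1, 20, 10);
""")


def overall_ratio(conn, user_id):
    """sum(points_got) / sum(points_bet) over all results of one user."""
    row = conn.execute(
        "SELECT SUM(points_got) / SUM(points_bet) FROM Results WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    return row[0]  # None when the user has no results


def category_ratio(conn, user_id, category_id):
    """The same ratio restricted to one category, joined through Posts."""
    row = conn.execute(
        "SELECT SUM(r.points_got) / SUM(r.points_bet) "
        "FROM Results r JOIN Posts p ON p.post_id = r.post_id "
        "WHERE r.user_id = ? AND p.category_id = ?",
        (user_id, category_id),
    ).fetchone()
    return row[0]


print(overall_ratio(conn, 1), category_ratio(conn, 1, 10))
```

Both functions scan every matching row of Results on each call, which is the cost the rest of the question is about.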

The main value, overall_ratio (the ratio of the sums over all of a user's results), is stored in the overall_ratio field of the Users table. Every time we add a result to the Results table, we recalculate this value to include the new row, ignoring the category, that is, over all of the user's results.
Calculating this value is manageable: it is only 1 query per result (add 100 results, recalculate overall_ratio for each affected user). We think we can live with that.
But what if the ratio also has to be calculated for each user in each category?
Possible problems that may arise in the future:
There will be a large number of results for each user in each category, and we would have to recalculate sum(points_got) / sum(points_bet) from scratch every time, for example to display users with a neighboring ratio in each category.
For example, in the football category, the user has a ratio of 1.2.
We need to display users from the same category whose ratio is the same or close to it (slightly above or below). But we do not store this data.
Let's call this data intermediate.
Question: how do we correctly calculate and store this intermediate data, that is, the ratio for each user in each category, on the fly when rows are added to the Results table?
We could simply query the database every time, but with a large number of results, categories and users, performance may be poor.
How can this be optimized?
Should key-value stores, such as NoSQL databases or caches, be used for this kind of problem?
Thank you very much for your attention and comments.

1 answer(s)
Dmitry Entelis, 2014-10-30
@DmitriyEntelis

0) The main question is: how many values do you have in Query?
1) Why aggregate over the whole table at all?
When you add a record, you know the user_id and the category, so you can fully recalculate just that one specific row.
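
A minimal sketch of point 1, under stated assumptions: SQLite stands in for MySQL, and `UserCategoryRatio` and `add_result` are hypothetical names that do not appear in the question. After each insert, only the row for the affected (user, category) pair is recomputed.

```python
import sqlite3

# Assumptions: SQLite (in memory) stands in for MySQL; UserCategoryRatio is a
# hypothetical table that is not part of the original schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Posts   (post_id INTEGER PRIMARY KEY, category_id INTEGER, text TEXT);
CREATE TABLE Results (result_id INTEGER PRIMARY KEY, post_id INTEGER,
                      user_id INTEGER, points_bet REAL, points_got REAL);
CREATE TABLE UserCategoryRatio (user_id INTEGER, category_id INTEGER, ratio REAL,
                                PRIMARY KEY (user_id, category_id));
INSERT INTO Posts VALUES (1, 10, 'football post'), (2, 20, 'hockey post');
""")


def add_result(conn, post_id, user_id, points_bet, points_got):
    """Insert one result, then recompute only the affected (user, category) row."""
    with conn:  # one transaction for the insert and the recalculation
        conn.execute(
            "INSERT INTO Results (post_id, user_id, points_bet, points_got) "
            "VALUES (?, ?, ?, ?)",
            (post_id, user_id, points_bet, points_got),
        )
        category_id = conn.execute(
            "SELECT category_id FROM Posts WHERE post_id = ?", (post_id,)
        ).fetchone()[0]
        ratio = conn.execute(
            "SELECT SUM(r.points_got) / SUM(r.points_bet) "
            "FROM Results r JOIN Posts p ON p.post_id = r.post_id "
            "WHERE r.user_id = ? AND p.category_id = ?",
            (user_id, category_id),
        ).fetchone()[0]
        # INSERT OR REPLACE is SQLite syntax; MySQL offers REPLACE INTO or
        # INSERT ... ON DUPLICATE KEY UPDATE for the same purpose.
        conn.execute(
            "INSERT OR REPLACE INTO UserCategoryRatio VALUES (?, ?, ?)",
            (user_id, category_id, ratio),
        )
    return ratio


add_result(conn, 1, 1, 10, 12)
add_result(conn, 1, 1, 10, 6)
add_result(conn, 2, 1, 20, 10)
```

The recalculation still reads all of that user's results in that category, so its cost grows with the user's history there rather than with the size of the whole table.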
2) Given your formula, there is no need to recalculate the row from scratch; you can update it incrementally.
This can be done either automatically by a trigger or in the application logic itself.
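
One possible reading of point 2, as a sketch rather than the answer's exact method: since the ratio is a quotient of two sums, it is enough to keep the two running sums per (user, category) and add each new result to them. The table `UserCategoryStats` and the functions `apply_result` and `neighbours` are hypothetical, and SQLite stands in for MySQL. The update-then-insert pair is kept portable here; in MySQL a single `INSERT ... ON DUPLICATE KEY UPDATE` would be the atomic form, and the same UPDATE could serve as the body of an AFTER INSERT trigger on Results.

```python
import sqlite3

# Assumptions: SQLite (in memory) stands in for MySQL; UserCategoryStats is a
# hypothetical aggregate table holding running sums per (user, category).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserCategoryStats (user_id INTEGER, category_id INTEGER,
                                sum_bet REAL NOT NULL, sum_got REAL NOT NULL,
                                PRIMARY KEY (user_id, category_id));
""")


def apply_result(conn, user_id, category_id, points_bet, points_got):
    """Add one result to the running sums without rereading Results."""
    with conn:
        cur = conn.execute(
            "UPDATE UserCategoryStats "
            "SET sum_bet = sum_bet + ?, sum_got = sum_got + ? "
            "WHERE user_id = ? AND category_id = ?",
            (points_bet, points_got, user_id, category_id),
        )
        if cur.rowcount == 0:  # first result for this (user, category) pair
            conn.execute(
                "INSERT INTO UserCategoryStats VALUES (?, ?, ?, ?)",
                (user_id, category_id, points_bet, points_got),
            )


def neighbours(conn, user_id, category_id, delta):
    """Other users in the category whose ratio is within delta of this user's."""
    return conn.execute(
        "SELECT o.user_id, o.sum_got / o.sum_bet AS ratio "
        "FROM UserCategoryStats me JOIN UserCategoryStats o "
        "  ON o.category_id = me.category_id AND o.user_id <> me.user_id "
        "WHERE me.user_id = ? AND me.category_id = ? "
        "  AND me.sum_bet > 0 AND o.sum_bet > 0 "
        "  AND ABS(o.sum_got / o.sum_bet - me.sum_got / me.sum_bet) <= ? "
        "ORDER BY ratio",
        (user_id, category_id, delta),
    ).fetchall()


apply_result(conn, 1, 10, 10, 12)   # user 1, category 10: ratio 1.2
apply_result(conn, 2, 10, 20, 25)   # user 2, category 10: ratio 1.25
apply_result(conn, 3, 10, 10, 20)   # user 3, category 10: ratio 2.0
apply_result(conn, 4, 20, 10, 12)   # user 4 is in a different category
print(neighbours(conn, 1, 10, 0.1))
```

With the sums stored, adding a result touches one row, and the neighbor lookup reads only the aggregate table instead of Results.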
In principle, if the Query is large, you can move it to something like Redis.
I re-read the question. Do you specifically want to stop storing the non-aggregated data, i.e. get rid of Results?
Unless the count runs into hundreds of millions of records, I would still keep it, because if anything goes wrong, there would otherwise be no way to recalculate the values.
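
To illustrate why keeping Results is useful, here is a hedged sketch of rebuilding the aggregates from the raw rows. The table `UserCategoryStats` and the function `rebuild_stats` are hypothetical, the sample data is made up, and SQLite stands in for MySQL; the `INSERT ... SELECT ... GROUP BY` statement itself uses syntax both databases accept.

```python
import sqlite3

# Assumptions: SQLite (in memory) stands in for MySQL; UserCategoryStats is a
# hypothetical aggregate table; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Posts   (post_id INTEGER PRIMARY KEY, category_id INTEGER, text TEXT);
CREATE TABLE Results (result_id INTEGER PRIMARY KEY, post_id INTEGER,
                      user_id INTEGER, points_bet REAL, points_got REAL);
CREATE TABLE UserCategoryStats (user_id INTEGER, category_id INTEGER,
                                sum_bet REAL NOT NULL, sum_got REAL NOT NULL,
                                PRIMARY KEY (user_id, category_id));
INSERT INTO Posts VALUES (1, 10, 'football post'), (2, 20, 'hockey post');
INSERT INTO Results (post_id, user_id, points_bet, points_got) VALUES
    (1, 1, 10, 12), (1, 1, 10, 6), (2, 1, 20, 10), (1, 2, 5, 5);
-- A deliberately wrong aggregate row, as if the counters had drifted.
INSERT INTO UserCategoryStats VALUES (1, 10, 999, 999);
""")


def rebuild_stats(conn):
    """Throw away the aggregates and recompute them from Results."""
    with conn:  # delete and refill in a single transaction
        conn.execute("DELETE FROM UserCategoryStats")
        conn.execute(
            "INSERT INTO UserCategoryStats (user_id, category_id, sum_bet, sum_got) "
            "SELECT r.user_id, p.category_id, SUM(r.points_bet), SUM(r.points_got) "
            "FROM Results r JOIN Posts p ON p.post_id = r.post_id "
            "GROUP BY r.user_id, p.category_id"
        )


rebuild_stats(conn)
```

Without the raw rows, a drifted counter like the one seeded above could not be corrected.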
