Adam_Ether, 2013-03-17 06:15:39

Aggregation of statistics in SQL storage?

Suggest an implementation for the following task.
Say there is a system that collects statistics rounded to the minute, stored as pairs: date, count.
All of this lives in SQL storage (PostgreSQL or MySQL).
The question is how to aggregate these statistics into:
1) statistics by hour,
2) statistics by day,
subject to these requirements:
- process only new data, i.e. incrementally;
- provide maximum performance;
- consume as little RAM and CPU as possible.
One solution is to create views in the database and remember the last processed date, with a not-so-elegant merge of conflicting aggregate rows.
I also thought about exporting the statistics to MongoDB, running map-reduce there, and then loading the results back into SQL storage. But that means operating two different databases at once, plus the resource overhead of copying the data back and forth, which doesn't look like the right fit either.
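The view-plus-watermark idea can be done entirely in SQL without MongoDB: remember the last processed minute, aggregate only rows newer than it, and upsert the partial sums into the hourly table. Here is a minimal sketch of that incremental roll-up, using SQLite via Python's `sqlite3` for a self-contained demo (the table names `stats_minute`, `stats_hour`, `watermark` are illustrative; SQLite 3.24+ is assumed for `ON CONFLICT ... DO UPDATE`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stats_minute (ts TEXT PRIMARY KEY, count INTEGER);
CREATE TABLE stats_hour   (ts TEXT PRIMARY KEY, count INTEGER);
CREATE TABLE watermark    (name TEXT PRIMARY KEY, last_ts TEXT);
""")
conn.executemany("INSERT INTO stats_minute VALUES (?, ?)", [
    ("2013-03-17 06:15:00", 3),
    ("2013-03-17 06:16:00", 5),
    ("2013-03-17 07:01:00", 2),
])
conn.execute("INSERT INTO watermark VALUES ('hour', '')")  # '' sorts before any timestamp

def rollup_hour(conn):
    # Aggregate only minutes newer than the last processed timestamp.
    (last,) = conn.execute(
        "SELECT last_ts FROM watermark WHERE name = 'hour'").fetchone()
    rows = conn.execute("""
        SELECT strftime('%Y-%m-%d %H:00:00', ts) AS hour, SUM(count)
        FROM stats_minute WHERE ts > ? GROUP BY hour
    """, (last,)).fetchall()
    for hour, total in rows:
        # Upsert: a partially rolled-up hour may receive more minutes later.
        conn.execute("""
            INSERT INTO stats_hour VALUES (?, ?)
            ON CONFLICT(ts) DO UPDATE SET count = count + excluded.count
        """, (hour, total))
    # Advance the watermark so the next run skips already-processed rows.
    conn.execute("""
        UPDATE watermark SET last_ts = (SELECT MAX(ts) FROM stats_minute)
        WHERE name = 'hour'
    """)
    conn.commit()

rollup_hour(conn)
print(conn.execute("SELECT ts, count FROM stats_hour ORDER BY ts").fetchall())
# [('2013-03-17 06:00:00', 8), ('2013-03-17 07:00:00', 2)]
```

The daily roll-up is the same pattern with a `'%Y-%m-%d'` bucket, and re-running the function with no new rows is a no-op, so it can sit safely in a cron job.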

2 answer(s)
Stdit, 2013-03-17

It's not a given that the map-reduce option would be any faster. Denormalization will help here. To avoid aggregating on every statistics request, and the CPU and memory load such requests cause, it is logical to increment counters in dedicated tables as each new record arrives. To do this, you can put a trigger on the insert that increments the appropriate counters (the current hour's counter, the current day's counter). Of course, each form of statistical query will need its own counter.
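The trigger approach described above can be sketched as follows, again using SQLite via Python's `sqlite3` so the example is self-contained (table names are illustrative; PostgreSQL and MySQL triggers would use their own procedural syntax). `INSERT OR IGNORE` followed by `UPDATE` is used in the trigger body because it works on any SQLite version:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stats_minute (ts TEXT, count INTEGER);
CREATE TABLE stats_hour   (ts TEXT PRIMARY KEY, count INTEGER DEFAULT 0);

-- On every minute-level insert, bump the matching hourly counter.
CREATE TRIGGER bump_hour AFTER INSERT ON stats_minute
BEGIN
    INSERT OR IGNORE INTO stats_hour (ts, count)
    VALUES (strftime('%Y-%m-%d %H:00:00', NEW.ts), 0);
    UPDATE stats_hour
    SET count = count + NEW.count
    WHERE ts = strftime('%Y-%m-%d %H:00:00', NEW.ts);
END;
""")

conn.execute("INSERT INTO stats_minute VALUES ('2013-03-17 06:15:00', 3)")
conn.execute("INSERT INTO stats_minute VALUES ('2013-03-17 06:16:00', 5)")
print(conn.execute("SELECT ts, count FROM stats_hour").fetchall())
# [('2013-03-17 06:00:00', 8)]
```

A second trigger with a `'%Y-%m-%d'` bucket would maintain the daily table the same way; the cost is one extra upsert per insert, but reads become a simple indexed lookup.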

egorinsk, 2013-03-17

Increment the counters explicitly in the application. In MySQL this can be done with a single query per table, for example:
INSERT INTO hourly_table (hour, x, y) VALUES (?, ?, ?) ON DUPLICATE KEY UPDATE x = x + VALUES(x), y = y + VALUES(y);
INSERT INTO daily_table (day, x, y) VALUES (?, ?, ?) ON DUPLICATE KEY UPDATE x = x + VALUES(x), y = y + VALUES(y);
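This application-side upsert pattern can be sketched end to end; the demo below uses SQLite's equivalent `ON CONFLICT ... DO UPDATE` syntax through Python's `sqlite3` (SQLite 3.24+; table and column names like `hourly_table`, `x`, `y` follow the answer above and are illustrative). Both roll-ups run in one transaction so the tables can't drift apart:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hourly_table (hour TEXT PRIMARY KEY, x INTEGER, y INTEGER);
CREATE TABLE daily_table  (day  TEXT PRIMARY KEY, x INTEGER, y INTEGER);
""")

def record(conn, ts, x, y):
    """Upsert one event into both roll-up tables atomically."""
    hour = ts[:13] + ":00:00"   # '2013-03-17 06:15:00' -> '2013-03-17 06:00:00'
    day = ts[:10]               # -> '2013-03-17'
    with conn:  # commits on success, rolls back on error
        conn.execute("""INSERT INTO hourly_table VALUES (?, ?, ?)
                        ON CONFLICT(hour) DO UPDATE
                        SET x = x + excluded.x, y = y + excluded.y""",
                     (hour, x, y))
        conn.execute("""INSERT INTO daily_table VALUES (?, ?, ?)
                        ON CONFLICT(day) DO UPDATE
                        SET x = x + excluded.x, y = y + excluded.y""",
                     (day, x, y))

record(conn, "2013-03-17 06:15:00", 3, 1)
record(conn, "2013-03-17 06:40:00", 2, 4)
print(conn.execute("SELECT * FROM hourly_table").fetchall())
# [('2013-03-17 06:00:00', 5, 5)]
```

In PostgreSQL the same queries would use `INSERT ... ON CONFLICT (hour) DO UPDATE SET x = hourly_table.x + EXCLUDED.x`; the design choice is identical either way: two cheap indexed upserts per event instead of a scan at read time.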
