Answer the question
In order to leave comments, you need to log in
How expensive is the GROUP BY() + SUM() operation?
Let's say there is a table:
statistics
date (int), user_id(int), param1(int), param2(int), param3(int) ...
uniq_user_id_date
fk_user_id
$sql_date_format = '%Y-%m';
// $sql_date_format = '%Y-%m-%u';
// $sql_date_format = '%Y-%m-%d %H:00:00';
// ...
$statistics_obj = DB::select()
->select(
['date', 'date'],
[DB::expr('DATE_FORMAT(FROM_UNIXTIME(`date`), :date_format)', [
':date_format' => $sql_date_format,]), 'date_formatted',],
[DB::expr('SUM(`param1`)'), 'param1']
)
->from(ORM::factory('Statistic')->table_name())
->where('user_id', '=', $user->id)
->and_where('date', 'BETWEEN', [$this->_start, $this->_end])
->group_by('date_formatted')
->as_object()
->execute();
Answer the question
In order to leave comments, you need to log in
It doesn't matter how many records are in the table at all. You have a condition by user and date and an index by this pair, respectively, the grouping will be only within the date range and by this user.
The grouping is done by a formatted date, MariaDB is unlikely to be able to figure out that ordering by date will help, so it will most likely group this as an unordered set. I think if you group by original date before grouping can be faster - you can format later.
As an alternative, index the formatted date
Populate the table with data that is approximately similar in distribution and look at the query plan.
What does slow mean in your mind?
What kind of iron?
What indexes?
And guess what - generate random data and check.
"slowly" is a relative term. it’s impossible for production under load, but for reports it will work once a week
Sooner or later, the selections become too fat for building reports/graphs/whatever on the fly.
Maybe it's better to immediately choose something more suitable for this
https://habr.com/company/oleg-bunin/blog/329062/
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question