S
S
Soslan Khloev2018-10-25 16:22:01
SQL query optimization
Soslan Khloev, 2018-10-25 16:22:01

SQL query optimization. What can be done?

Good afternoon!
There is a standard CDR asterisk table with several additional fields. The table has about 250,000 entries.
Request:

SELECT DISTINCT(dstnum) AS t2dstnum, MAX(calldate) OVER (PARTITION BY dstnum) AS t2lastcall FROM cdr WHERE route = "out"

runs for about 13 minutes.
EXPLAIN output:
+------+-------------+-------+------+---------------+-------+---------+-------+--------+----------+----------------------------------------+
| id   | select_type | table | type | possible_keys | key   | key_len | ref   | rows   | filtered | Extra                                  |
+------+-------------+-------+------+---------------+-------+---------+-------+--------+----------+----------------------------------------+
|    1 | SIMPLE      | cdr   | ref  | route,r_and_d | route | 51      | const | 140379 |   100.00 | Using index condition; Using temporary |
+------+-------------+-------+------+---------------+-------+---------+-------+--------+----------+----------------------------------------+

SHOW PROFILE output:
+-------------------------+------------+
| Status                  | Duration   |
+-------------------------+------------+
| Starting                | 0.000170   |
| Checking permissions    | 0.000020   |
| Opening tables          | 0.000050   |
| After opening tables    | 0.000017   |
| System lock             | 0.000015   |
| Table lock              | 0.000019   |
| Init                    | 0.000091   |
| Optimizing              | 0.000036   |
| Statistics              | 0.000214   |
| Preparing               | 0.000046   |
| Creating tmp table      | 0.000082   |
| Executing               | 0.000013   |
| Sending data            | 0.948192   |
| Converting HEAP to Aria | 0.073438   |
| Sending data            | 8.220165   |
| Removing duplicates     | 755.840777 |
| Removing tmp table      | 0.000293   |
| Removing duplicates     | 0.000036   |
| End of update loop      | 0.000011   |
| Query end               | 0.000005   |
| Commit                  | 0.000007   |
| Closing tables          | 0.000006   |
| Unlocking tables        | 0.000004   |
| Closing tables          | 0.000013   |
| Starting cleanup        | 0.000004   |
| Freeing items           | 0.000014   |
| Updating status         | 0.000037   |
| Reset for next command  | 0.000005   |
+-------------------------+------------+

The problem is clear where, but how to optimize did not understand. Ask for help from more experienced and advanced SQL writers. Can it be optimized at all?

Answer the question

In order to leave comments, you need to log in

2 answer(s)
V
Vitsliputsli, 2019-02-27
@hloe0xff

need like this:

SELECT 
        dstnum AS t2dstnum, 
        MAX(calldate) AS t2lastcall 
    FROM cdr 
    WHERE route = "out" 
    GROUP BY dstnum

analytics is not needed here, only an aggregate function with grouping. As a result, when passing the table, we do not collect duplicates, but take only the necessary rows. Otherwise, a temporary table will be created with many unnecessary duplicates, which will take DISTINCT time to remove. Judging by the work of the query, in PostgreSQL the optimizer itself guesses what needs to be done exactly like this, in MySQL it must be set explicitly.

C
chupasaurus, 2015-08-25
@chupasaurus

No way.
NAT is needed: either static NAT if there is a free external IP and you don’t feel sorry for it, or PAT on the necessary ports (at least 25)

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question