Answer the question
In order to leave comments, you need to log in
How to optimize sql query (selection from 1.5M rows)?
I select data between two dates with more than a million rows with this query:
SELECT
time AT TIME ZONE 'UTC' AT TIME ZONE 'MSK' AS time,
type,
CASE
WHEN (duration > (extract(epoch from (time - '2015-08-29 12:36:50'))) )
THEN extract(epoch from (time - '2015-08-29 12:36:50'))
ELSE duration
END AS trim_duration
FROM clamps
WHERE device_id = 27 AND time BETWEEN '2015-08-29 12:36:50' AND '2016-03-15 12:36:50'
ORDER BY time ASC
CREATE TABLE clamps
(
id serial NOT NULL,
"time" timestamp without time zone,
duration numeric,
type character varying(255),
device_id integer,
packet_id integer,
dur_float double precision DEFAULT 0,
created_at timestamp without time zone DEFAULT now(),
updated_at timestamp without time zone DEFAULT now(),
CONSTRAINT clamps_pkey PRIMARY KEY (id)
)
CREATE UNIQUE INDEX index_clamps_on_device_id_time
ON clamps
USING btree (device_id, "time");
Sort (cost=149213.28..150770.01 rows=622695 width=21) (actual time=4263.385..4713.566 rows=1469514 loops=1)
Sort Key: "time"
Sort Method: external merge Disk: 71816kB
-> Bitmap Heap Scan on clamps (cost=17535.79..89284.59 rows=622695 width=21) (actual time=190.593..2692.161 rows=1469514 loops=1)
Recheck Cond: ((device_id = 27) AND ("time" >= '2015-08-29 12:36:50'::timestamp without time zone) AND ("time" <= '2016-03-15 12:36:50'::timestamp without time zone))
Heap Blocks: exact=17528
-> Bitmap Index Scan on index_clamps_on_device_id_time (cost=0.00..17380.12 rows=622695 width=0) (actual time=186.667..186.667 rows=1469514 loops=1)
Index Cond: ((device_id = 27) AND ("time" >= '2015-08-29 12:36:50'::timestamp without time zone) AND ("time" <= '2016-03-15 12:36:50'::timestamp without time zone))
Planning time: 0.291 ms
Execution time: 5414.691 ms
Answer the question
In order to leave comments, you need to log in
When analyzing the result, the result is not transmitted to the client, the time of IO and transmission over the network is not taken into account (and you have 1.5 lines there, as you yourself said, besides, there can be relatively long lines).
And what do you want to achieve by sorting a huge array? For further analysis, sorting is usually not needed, and for visual analysis, a smaller sample is needed. Remove sorting - save a lot of time. This is a very expensive operation. (Although if you set ORDER BY device_id, time then sorting may go away due to the fact that the index itself stores values in a sorted form)
And this is
CASE
WHEN (duration > (extract(epoch from (time - '2015-08-29 12:36:50'))) )
THEN extract(epoch from (time - '2015-08-29 12:36:50'))
ELSE duration
END AS trim_duration
The index is correct, but it looks like it's not being used, try to hardcode the index in the query, you can also try reindexing.
Is there enough memory for the index?
WHEN (duration > (extract(epoch from (time - '2015-08-29 12:36:50'))) )
THEN extract(epoch from (time - '2015-08-29 12:36:50'))
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question