Answer the question
In order to leave comments, you need to log in
Why is DISTINCT ON so slow?
Plate 931,263 entries.
Request
select distinct on (o.group_id) o.id group_id from offers o limit 10;
# \d+ offers;
Table "public.offers"
Column | Type | Modifiers | Storage | Stats target | Description
--------------+-------------------------+-----------------------------------------------------+----------+--------------+-------------
id | integer | not null default nextval('offers_id_seq'::regclass) | plain | |
name | character varying(400) | not null | extended | |
group_id | integer | | plain | |
Indexes:
"offers_pkey" PRIMARY KEY, btree (id)
"offers_group_id_e0c51f8a" btree (group_id)
# explain analyze select distinct on (o.group_id) o.id group_id from offers o limit 10;
-[ RECORD 1 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN | Limit (cost=0.42..59572.55 rows=10 width=8) (actual time=0.089..566.879 rows=1 loops=1)
-[ RECORD 2 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN | -> Unique (cost=0.42..1191442.91 rows=200 width=8) (actual time=0.087..566.876 rows=1 loops=1)
-[ RECORD 3 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN | -> Index Scan using offers_group_id_e0c51f8a on offers o (cost=0.42..1189221.41 rows=888599 width=8) (actual time=0.085..529.775 rows=931263 loops=1)
-[ RECORD 4 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN | Planning time: 0.137 ms
-[ RECORD 5 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN | Execution time: 566.925 ms
Answer the question
In order to leave comments, you need to log in
You can speed it up with such a nice recursive cte: https://wiki.postgresql.org/wiki/Loose_indexscan
Natively pg does not yet know how to loose indexscan. So distinct reads all elements of the tree instead of looking for the next larger element.
Use grouping by field that requires uniqueness
In my case, this is about 6.5 times faster.
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question