A
A
altair862019-11-22 14:51:19
PostgreSQL
altair86, 2019-11-22 14:51:19

Why doesn't PostgreSQL use an index?

We take a request:

SELECT *
   FROM t
     LEFT JOIN tu ON t.tid= tu.tid
WHERE t.OrderStatusID IN (1,5)

We get this result:
"Hash Right Join  (cost=4549.10..9309.76 rows=2262 width=460) (actual time=44.610..44.729 rows=165 loops=1)"
"  Hash Cond: (tu.tid = t.tid)"
"  ->  Seq Scan on tu  (cost=0.00..4359.65 rows=152765 width=47) (actual time=0.008..21.188 rows=152887 loops=1)"
"  ->  Hash  (cost=4520.83..4520.83 rows=2262 width=413) (actual time=1.232..1.232 rows=165 loops=1)"
"        Buckets: 4096  Batches: 1  Memory Usage: 65kB"
"        ->  Index Scan using t_orderstatusid on t  (cost=0.56..4520.83 rows=2262 width=413) (actual time=0.035..1.142 rows=165 loops=1)"
"              Index Cond: (orderstatusid = ANY ('{1,5}'::integer[]))"
"Planning time: 0.722 ms"
"Execution time: 44.834 ms"

We see that the table tu passes Seq Scan for the number of rows 152 887. What does the index have on the tid field.
We take the filter t.OrderStatusID IN (1,5) and take the IDs we are looking for from there. The same required 165 records:
And we get that this time the index works on tu! At what for 1 ms in difference from the previous request in 48 ms. How so?
SELECT *
   FROM t
     LEFT JOIN tu ON t.tid= tu.tid
WHERE t.tid IN (34377272,34376992,34377506,34377537,34377335,34377330,34377397,34377023,34376898,34377431,34376981,34377208,34377599,34377368,34377384,34377521,34377538,34377490,34377063,34373826,34377533,34376547,34377037,34377622,34377247,34377571,34377392,34376742,34377598,34377568,34377344,34377617,34377366,34377509,34377573,34377469,34377614,34377618,34376881,34377608,34377017,34377595,34377548,34377611,34377457,34377374,34376918,34377619,34377363,34377402,34377383,34377422,34377350,34377503,34377594,34377302,34377518,34377612,34377527,34377373,34377255,34377500,34377544,34377505,34377248,34377583,34377375,34376735,34377492,34377466,34376782,34377604,34376196,34377439,34377364,34377620,34377561,34377334,34377615,34376558,34376960,34376671,34376987,34377096,34377449,34377601,34377470,34375651,34377460,34377473,34377086,34377053,34377523,34377600,34375538,34377338,34377456,34375690,34376215,34377545,34377553,34377530,34377625,34376287,34375912,34376511,34377411,34377565,34376383,34377520,34377297,34376739,34377307,34377589,34377443,34377591,34377596,34376399,34377415,34377555,34377626,34377508,34377362,34376891,34377026,34377139,34377562,34377560,34377485,34377581,34377462,34377607,34377325,34377361,34377576,34377420,34376860,34377585,34377524,34377501,34377531,34377163,34377517,34377624,34376396,34377582,34377286,34377603,34376915,34375891,34377605,34377623,34375056,34377312,34377621,34376481,34376083,34376673,34377434,34376852,34377478,34377453,34377349,34377602,34377510,34377464,34377494,34377586,34376715,34377569,34376780,34377551,34376514,34377606,34377483)

"Nested Loop Left Join  (cost=0.98..2196.44 rows=175 width=460) (actual time=0.090..1.755 rows=175 loops=1)"
"  ->  Index Scan using t_tid on t (cost=0.56..851.88 rows=175 width=413) (actual time=0.060..1.135 rows=175 loops=1)"
"        Index Cond: (tid= ANY ('{34377272,34376992,34377506,34377537,34377335,34377330,34377397,34377023,34376898,34377431,34376981,34377208,34377599,34377368,34377384,34377521,34377538,34377490,34377063,34373826,34377533,34376547,34377037,34377622,34377247,34377571,34377392,34376742,34377598,34377568,34377344,34377617,34377366,34377509,34377573,34377469,34377614,34377618,34376881,34377608,34377017,34377595,34377548,34377611,34377457,34377374,34376918,34377619,34377363,34377402,34377383,34377422,34377350,34377503,34377594,34377302,34377518,34377612,34377527,34377373,34377255,34377500,34377544,34377505,34377248,34377583,34377375,34376735,34377492,34377466,34376782,34377604,34376196,34377439,34377364,34377620,34377561,34377334,34377615,34376558,34376960,34376671,34376987,34377096,34377449,34377601,34377470,34375651,34377460,34377473,34377086,34377053,34377523,34377600,34375538,34377338,34377456,34375690,34376215,34377545,34377553,34377530,34377625,34376287,34375912,34376511,34377411,34377565,34376383,34377520,34377297,34376739,34377307,34377589,34377443,34377591,34377596,34376399,34377415,34377555,34377626,34377508,34377362,34376891,34377026,34377139,34377562,34377560,34377485,34377581,34377462,34377607,34377325,34377361,34377576,34377420,34376860,34377585,34377524,34377501,34377531,34377163,34377517,34377624,34376396,34377582,34377286,34377603,34376915,34375891,34377605,34377623,34375056,34377312,34377621,34376481,34376083,34376673,34377434,34376852,34377478,34377453,34377349,34377602,34377510,34377464,34377494,34377586,34376715,34377569,34376780,34377551,34376514,34377606,34377483}'::integer[]))"
"  ->  Index Scan using unique_tu_tid on tu  (cost=0.42..7.68 rows=1 width=47) (actual time=0.002..0.002 rows=0 loops=175)"
"        Index Cond: (t.tid= tid)"
"Planning time: 0.652 ms"
"Execution time: 1.831 ms"

Answer the question

In order to leave comments, you need to log in

3 answer(s)
A
Andrey Grigoriev, 2019-11-22
@eigrad

The database did not have time to accumulate statistics that there are few records left by the OrderStatus filter, it cannot evaluate the profit from using the index, and on such a small number of records it prefers to do a full scan in order to immediately raise all the data into a hash by tid. Most likely analyze will help.

G
globalmac, 2019-11-22
@globalmac

I think this article will help you - https://www.datadoghq.com/blog/100x-faster-postgre...
Instead of:
Try:

SELECT * FROM t LEFT JOIN tu ON t.tid= tu.tid WHERE t.tid = ANY (VALUES (34377272,34376992,34377506,...))

V
Vitsliputsli, 2019-11-22
@Vitsliputsli

Build a composite index on both fields.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question