L
L
Leo_Eldorado2020-02-03 13:10:38
PostgreSQL
Leo_Eldorado, 2020-02-03 13:10:38

Why is an IN query in PostgreSQL DBMS much slower than a similar OR query?

Hello!
There is the following request:

SELECT * from table 
WHERE column IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17) 
ORDER BY ID DESC LIMIT 500 OFFSET 0

It is executed on a fairly large table of 25 million records and its execution time is approximately 5-7 minutes. But it is worth replacing IN with OR as the execution time drops to 1 second. The question is: why is this happening and should this method be used to optimize the query? In explain it is clear that the query plans are different and in one article I saw advice to rewrite OR to IN on the contrary, so I doubt that OR will be the best choice. Moreover, to speed up this query, I also tried another approach, which is to create a separate, specialized table, to which I first execute a subquery, after which I combine the results of this query with the main table using INNER JOIN. This approach also gives good results, accelerating data acquisition up to 1 second.

Answer the question

In order to leave comments, you need to log in

2 answer(s)
M
Melkij, 2020-02-03
@Leo_Eldorado

ORDER BY ID

hm.
Show explain (analyze,buffers) queries. And show query plans with "ORDER BY ID + 0". And further
select * from pg_stats where tablename = 'table' and attname = 'column'

I'm going to assume that your in is not slow at all, but rather breaks the scheduler's evaluation, but in this particular case it turns out to be an unexpected boon.

L
Lazy @BojackHorseman, 2020-02-03
SQL

because the execution plans are different. compare them and see the answer

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question