PostgreSQL
Ler Den, 2018-04-30 15:42:17

What is the best way to speed up a SELECT with pagination?

I should say right away that I'm not strong on the nuances; I know SQL only in general terms.
Pagination is currently done the standard way, by remembering the last id. Roughly like this:

SELECT *
FROM recording
WHERE recording.id > 0 AND recording.artist_id = '269608'
ORDER BY recording.id
LIMIT 10

The query takes 5-10 seconds. The query plan is here: https://explain.depesz.com/s/a8Xl
As you can see, the index scan takes most of the time, and I don't understand why.
If I remove the ORDER BY, things look much better: https://explain.depesz.com/s/WpTp
I also noticed that if the artist ID is not 269608 but something smaller, say 500, the query runs very quickly, and the larger the ID, the longer the query takes. Is that normal? It looks as if all the IDs are scanned in order until the needed one is reached.
In general, how do I get by without ORDER BY if I need both speed (first of all) and filtering of the data?
The table contains 18 million records, but I don't think that is a particularly heavy load for a DBMS (?)
UPD
I will describe the task in full.
I need to get tracks by an artist's ID. The tracks are in the recording table, the artists are in the artist table. But the tables are not linked directly, only through two others: artist_credit_name and artist_credit.
The relationship is: artist.id <--> artist_credit_name.artist, artist_credit_name.artist_credit <--> artist_credit.id <--> recording.artist_credit
The whole schema is here
The complete request looks like this:
SELECT recording.id AS "recordingId", recording.name AS "trackName", artist.name AS "artistName"
FROM artist
INNER JOIN artist_credit_name ON artist.id = artist_credit_name.artist
INNER JOIN artist_credit ON artist_credit_name.artist_credit = artist_credit.id
INNER JOIN recording ON artist_credit.id = recording.artist_credit
WHERE artist.id = $(artistId) AND recording.id > $(index)
ORDER BY recording.id LIMIT $(limit)

It turns out that one `artist.id` may correspond to several `artist_credit.id` values. So I tried rewriting the query to first select all `artist_credit.id` for the given artist and then select the tracks with `WHERE IN` (see the sketch below); that gave roughly a 30% speedup (though that may be measurement error), but the result is still not what I need.
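A rough sketch of that rewrite (this is the shape of the idea, not my exact query; placeholders are the same as above, and artist.name is dropped for brevity):

-- Pre-select the artist_credit ids for the artist, then filter
-- recording by them instead of joining all four tables.
SELECT recording.id   AS "recordingId",
       recording.name AS "trackName"
FROM recording
WHERE recording.artist_credit IN (
        SELECT artist_credit_name.artist_credit
        FROM artist_credit_name
        WHERE artist_credit_name.artist = $(artistId)
      )
  AND recording.id > $(index)
ORDER BY recording.id
LIMIT $(limit);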
Indexes on tables:
recording: id (PK),
artist_credit: id (PK),
artist_credit_name: id (PK), artist(FK),
artist: id (PK)
Can I add an index on the `recording.artist_credit` column? I don't know whether it's even possible to add indexes on foreign key columns.
UPD#2 Added an index on `recording.artist_credit`; the query is fast now.
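For reference, that index is a one-liner (the index name here is just illustrative):

-- Index on the FK column used in the join/IN filter;
-- PostgreSQL does not create indexes on foreign key columns automatically.
CREATE INDEX recording_artist_credit_idx ON recording (artist_credit);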


2 answer(s)
Anton Shamanov, 2018-05-08
@SilenceOfWinter

The table contains 18 million records, but I don't think that is a particularly heavy load for a DBMS (?)

It depends on the server. Are there any indexes besides the PK? Have you tried adding one, for example a unique index on id + artist_id?
What is the check "recording.id > 0" for? Don't you use an auto increment?
And why sort by id if the values are written in that order anyway thanks to the auto increment?
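A sketch of the kind of composite index this suggests, adapted to the asker's actual schema where the filter column on recording is artist_credit rather than artist_id (index name illustrative):

-- Equality-filtered column first, pagination key second, so
-- WHERE artist_credit = ... AND id > ... ORDER BY id can be served by one index scan.
CREATE INDEX recording_artist_credit_id_idx ON recording (artist_credit, id);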

dimarick, 2018-05-19
@dimarick

Your problem is that Postgres first sorts EVERYTHING and only then applies the LIMIT.
Try adding something like AND recording.id < $(index + wow), where wow is the limit plus the maximum conceivable "hole" size in the sequence of primary keys.
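Applied to the query from the question, the suggestion would look roughly like this (wow is just an illustrative constant, e.g. the limit plus an estimated gap size):

-- Bound the scanned id range from both sides so only a narrow
-- window of recording rows has to be examined.
SELECT recording.id AS "recordingId", recording.name AS "trackName", artist.name AS "artistName"
FROM artist
INNER JOIN artist_credit_name ON artist.id = artist_credit_name.artist
INNER JOIN artist_credit ON artist_credit_name.artist_credit = artist_credit.id
INNER JOIN recording ON artist_credit.id = recording.artist_credit
WHERE artist.id = $(artistId)
  AND recording.id > $(index)
  AND recording.id < $(index + wow)  -- wow ~ limit + max expected gap in ids
ORDER BY recording.id
LIMIT $(limit);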
