PostgreSQL
wawa, 2018-06-22 17:38:14

How to speed up pagination in Django?

The table holds ~3.5 million records, and the server is underpowered.
I'm using Django's standard Paginator.
As I understand it, it first runs a COUNT(*) ... query and then the QuerySet that was passed in.
At first everything was fine, but once the QuerySet became more complex (an annotation was added), the paginator's COUNT(*) ... query slowed down badly (> 10 sec). The QuerySet itself executes instantly (it uses an index). That is, the problem is the paginator's COUNT(*) ... query, which even contains a ... GROUP BY ... on the primary key; in short, it is unusable.
One option I have in mind:
Do the offset / limit myself and skip the total count. That is, the end of pagination is not known in advance, and to detect the last page I fetch one row more than the page size.
Still, as a beginner I would like to know how such problems are usually solved. Are there ready-made solutions (perhaps even in Django itself)? I can reinvent the wheel, but that would be silly if a (de facto) standard exists for this situation.
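The "fetch one row more than the page size" idea from the question can be sketched roughly as follows. This is only an illustration, not a ready-made Django feature: `fetch_page` is a hypothetical helper name, and it works on anything sliceable. With a Django QuerySet the slice would become LIMIT / OFFSET in SQL, and no COUNT(*) is ever issued.

```python
from typing import Any, List, Tuple


def fetch_page(items: Any, page_number: int, per_page: int) -> Tuple[List[Any], bool]:
    """Return (rows, has_next) without counting the full result set.

    `items` is any sliceable object (a list here; a QuerySet is assumed
    to behave the same way, since slicing it maps to LIMIT/OFFSET).
    """
    if page_number < 1 or per_page < 1:
        raise ValueError("page_number and per_page must be >= 1")

    offset = (page_number - 1) * per_page
    # Ask for one extra row: if it comes back, another page exists.
    window = list(items[offset:offset + per_page + 1])
    has_next = len(window) > per_page
    # Drop the extra row before handing the page to the caller.
    return window[:per_page], has_next


if __name__ == "__main__":
    data = list(range(25))  # stand-in for a result set of 25 rows
    rows, has_next = fetch_page(data, page_number=3, per_page=10)
    print(rows, has_next)
```

The trade-off is that the total number of pages is unknown, so the UI can only offer "previous / next" links rather than a numbered page list.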


1 answer(s)
wawa, 2018-06-24
@wawa

The mistake turned out to be mine.
Initially, as an optimization, I decided to cap the maximum offset: users do not need to page deep into the results, and refining the search filter is far more sensible. As far as I can tell this is a common technique (Habr, for example).
Since Django's Paginator has no way to cap the offset, I did the following:

qs = MyModel.objects.filter(...)
objects = qs[:MAX_PAGE * PER_PAGE]  #2
paginator = Paginator(objects, PER_PAGE)
page = paginator.page(...)

Yes, this approach always pulls more records from the database than a single page needs, but MAX_PAGE * PER_PAGE is small (~500) and the overhead is unnoticeable. I may still write my own paginator later.
I expected line 2 to query the database and return a list of objects, which is then passed to the paginator. The paginator then needs to know how many objects there are: expecting a QuerySet, it first tries to call .count() and, if that is not possible, falls back to .__len__(). I was relying on the second case, and I was wrong.
It turns out that line 2 returns not a list but another QuerySet. I knew QuerySets are lazy, but I assumed laziness did not apply to slicing. It does: slicing an unevaluated QuerySet yields another unevaluated QuerySet (unless a step is used).
Either way, the object produced in line 2 has a count() method, so the paginator happily calls it and triggers a database query. That query is monstrous, too (it uses GROUP BY for some reason).
That is what was slowing things down. The fix:
qs = MyModel.objects.filter(...)
objects = list(qs[:MAX_PAGE * PER_PAGE])  #2 !!!
paginator = Paginator(objects, PER_PAGE)
page = paginator.page(...)
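Why wrapping the slice in list() helps can be modelled without Django. The sketch below is a simplified stand-in, not Django's actual source: `LazyRows` is a hypothetical class that imitates a lazy QuerySet, and `paginator_count` imitates the "try .count(), otherwise len()" behaviour described above.

```python
import inspect


class LazyRows:
    """Hypothetical stand-in for a lazy QuerySet (not a Django class)."""

    def __init__(self, rows, stats=None):
        self._rows = rows
        # Shared counter so slices report "queries" to the same place.
        self.stats = stats if stats is not None else {"count_queries": 0}

    def count(self):
        # In Django this would be a separate COUNT(*) query.
        self.stats["count_queries"] += 1
        return len(self._rows)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slicing stays lazy: the result still has .count().
            return LazyRows(self._rows[key], self.stats)
        return self._rows[key]

    def __iter__(self):
        return iter(self._rows)

    def __len__(self):
        return len(self._rows)


def paginator_count(object_list):
    """Simplified model: prefer a custom .count(), else fall back to len()."""
    count = getattr(object_list, "count", None)
    # list.count(x) is a builtin that needs an argument, so skip builtins.
    if callable(count) and not inspect.isbuiltin(count):
        return count()
    return len(object_list)


if __name__ == "__main__":
    qs = LazyRows(list(range(1000)))
    sliced = qs[:500]
    print(paginator_count(sliced), qs.stats)        # goes through .count()
    print(paginator_count(list(sliced)), qs.stats)  # plain len(), no new query
```

Once the slice is materialized into a list, the paginator only sees a plain sequence and measures it with len(), so the expensive count query never runs.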

Thank you kindly as always!
