Is it possible in PostgreSQL to create an index for COUNT queries?


PostgreSQL

devalone, 2019-05-09 23:01:11

The goal is to make queries like

`SELECT COUNT(*) FROM table WHERE column > value AND column < value2`

In particular, I want to build graphs of the distribution of records by creation date and other fields. Right now such queries on a table of 60 million records take an incredibly long time (up to an hour), because Postgres walks through every index entry that satisfies the condition.
In theory, nothing prevents storing record counts in the index itself, for example in a binary tree; then a query with `WHERE timestamp > N AND timestamp < M` would take O(log n) time, which is very good.
Is there any way to make such queries fast, even if the result is not perfectly accurate?


3 answer(s)

SanSYS, 2019-05-10
@SanSYS

As already noted, it is best to aggregate this in advance, in the background.
But triggers that write to a summary table are not a great fit for this, IMHO, because each time you have to add write-side code instead of just writing the new queries everyone is familiar with.
For PostgreSQL there is the PipelineDB extension; you can read how to use it here: https://habr.com/en/post/432512/. It looks like exactly what you need.
In general, if the data grows incrementally, it may be worth considering a BRIN index: https://habr.com/en/company/postgrespro/blog/346460/
It takes up very little space and works quite well, but read the documentation carefully: there are caveats related to how the data is stored and read.
Or another option: try columnar databases ;)
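
To make the BRIN suggestion concrete, here is a minimal sketch. The table `events` and column `created_at` are hypothetical names standing in for the 60-million-row table from the question, and `pages_per_range = 32` is only an illustrative value, not a recommendation.

```sql
-- Assumes a hypothetical table events(created_at timestamptz, ...) whose rows
-- are appended roughly in created_at order (BRIN relies on that correlation).
CREATE INDEX events_created_at_brin
    ON events USING brin (created_at)
    WITH (pages_per_range = 32);  -- smaller ranges: more precise, larger index

-- Check whether the planner actually uses the index for a range count.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM events
WHERE created_at >= '2019-01-01'
  AND created_at <  '2019-02-01';
```

A BRIN index only narrows the scan down to the matching block ranges; the rows in those blocks are still read and counted, so the gain depends on how well the physical row order follows `created_at`.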

rPman, 2019-05-10
@rPman

One common solution, when the database sees far fewer changes than read requests, is to collect the necessary data with triggers into a separate table and run the queries against that table instead.
P.S. Indexes are already being used there; the only other thing is to try `count(<indexed column used in WHERE>)` instead of `count(*)`.
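
The trigger approach could look roughly like the sketch below. All names (`events`, `events_daily_counts`, `events_count_trg`) are hypothetical, and the one-row-per-day granularity is an assumption chosen for a creation-date histogram.

```sql
-- Hypothetical source table; substitute your own table and column names.
CREATE TABLE events (
    id         bigserial PRIMARY KEY,
    created_at timestamptz NOT NULL
);

-- Summary table: one row per day with the number of events created that day.
CREATE TABLE events_daily_counts (
    day date   PRIMARY KEY,
    cnt bigint NOT NULL
);

CREATE FUNCTION events_count_trg() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- The timestamptz -> date cast uses the session TimeZone setting.
        INSERT INTO events_daily_counts (day, cnt)
        VALUES (NEW.created_at::date, 1)
        ON CONFLICT (day) DO UPDATE
            SET cnt = events_daily_counts.cnt + 1;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE events_daily_counts
        SET cnt = cnt - 1
        WHERE day = OLD.created_at::date;
    END IF;
    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$;

-- Updates that change created_at are not handled in this sketch.
CREATE TRIGGER events_count
AFTER INSERT OR DELETE ON events
FOR EACH ROW EXECUTE PROCEDURE events_count_trg();

-- A range count now reads a few hundred summary rows instead of millions.
SELECT sum(cnt)
FROM events_daily_counts
WHERE day >= DATE '2019-01-01'
  AND day <  DATE '2019-02-01';
```

The trade-off is that every insert also updates one summary row, so under heavy concurrent writes for the same day that row can become a point of contention.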

TheRonCronix, 2019-05-15
@TheRonCronix

1. As correctly pointed out above, you need aggregates. Aggregates can be partial. If your count has no DISTINCT, the measure is additive, so summing over partial aggregates should cause no problems.
2. Since your table grows this much, it needs to be partitioned, most likely by date.
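
Both points can be sketched together as follows. This assumes PostgreSQL 10 or later (declarative partitioning); the names `events`, `events_2019_05` and `events_daily` are hypothetical, and monthly partitions are just one possible choice.

```sql
-- Point 2: a table partitioned by range on the creation date.
CREATE TABLE events (
    id         bigint      NOT NULL,
    created_at timestamptz NOT NULL,
    payload    text
) PARTITION BY RANGE (created_at);

-- One partition per month; the upper bound is exclusive.
CREATE TABLE events_2019_05 PARTITION OF events
    FOR VALUES FROM ('2019-05-01') TO ('2019-06-01');

-- Point 1: partial aggregates, one row per day.
CREATE MATERIALIZED VIEW events_daily AS
SELECT date_trunc('day', created_at) AS day,
       count(*)                      AS cnt
FROM events
GROUP BY 1;

-- count(*) without DISTINCT is additive, so the total for any range of whole
-- days is simply the sum of the daily partial counts.
SELECT sum(cnt)
FROM events_daily
WHERE day >= '2019-05-01'
  AND day <  '2019-05-15';
```

The materialized view has to be refreshed (`REFRESH MATERIALIZED VIEW events_daily;`) to pick up new rows, so the counts are only as fresh as the last refresh, which fits the question's tolerance for results that are not perfectly accurate.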