How can you optimize a query with COUNT and SUM in PostgreSQL?

N

nurzhannogerbek2019-07-15 12:25:11

PostgreSQL

nurzhannogerbek, 2019-07-15 12:25:11

Hello comrades! Please help me figure it out. I use a PostgreSQL
database in my project :

PostgreSQL 11.0 (Debian 11.0-1.pgdg90+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

The database has a table with the following structure:

| SURVEY_ID (uuid)                     | EMPLOYEE (varchar)     | STATUS (bool) | SEND (bool) | TREE_ORGANIZATION_ID (varchar) |
|--------------------------------------|------------------------|---------------|-------------|--------------------------------|
| bc90de33-62f9-4c6f-a7a6-6a76abb28b65 | [email protected]    | true          | true        | \27623\27734\28304             |
| bc90de33-62f9-4c6f-a7a6-6a76abb28b65 | [email protected]   | true          | true        | \27623\30556\30453             |
| bc90de33-62f9-4c6f-a7a6-6a76abb28b65 | [email protected] | true          | true        | \27623\30558\30353\30354\28233 |

Created an index for this table as follows:

CREATE INDEX INDEX_SURVEYS_EMPLOYEES_RELATIONSHIP ON SURVEYS_EMPLOYEES_RELATIONSHIP(SURVEY_ID, EMPLOYEE, STATUS, SEND, TREE_ORGANIZATION_ID);

I am making this kind of request. There is a task to accelerate/optimize the given request.

SELECT
  NULLIF(COUNT(*), 0) AS TOTAL_RESPONDENTS,
  SUM(CASE WHEN SURVEYS_EMPLOYEES_RELATIONSHIP.SEND = TRUE AND SURVEYS_EMPLOYEES_RELATIONSHIP.STATUS = TRUE THEN 1 ELSE 0 END) AS PAST_RESPONDENTS
FROM 
  SURVEYS_EMPLOYEES_RELATIONSHIP
WHERE
  SURVEYS_EMPLOYEES_RELATIONSHIP.SURVEY_ID = 'bc90de33-62f9-4c6f-a7a6-6a76abb28b65'
AND 
  SURVEYS_EMPLOYEES_RELATIONSHIP.TREE_ORGANIZATION_ID LIKE CONCAT('%', '\', '27623', '%')

The EXPLAIN command for this query produces the following plan:

There are very few records in the table at the moment. It is assumed that in the future every week the table will be replenished with one million new records. What do you think are the bottlenecks in this request. How can it be optimized?

Reply

Answer the question

In order to leave comments, you need to log in

2 answer(s)

M

Melkij, 2019-07-15
@melkij

PostgreSQL 11.0 (Debian 11.0-1.pgdg90+2)

Thank you for your support in the first couple of weeks after release, but what's the point of doing this in 2019? Almost the only answer you can expect to a bug report: check first on the current version.
What for?
count on MVCC base. For OLAP - ok, let it rustle.
For something faster - screw in pre-aggregated slices.
see ltree
Although judging by LIKE CONCAT('%', '\', '27623', '%') - rather even an array.

A

Alexander Aksentiev, 2019-07-15
@Sanasol

https://tatiyants.com/pev/#/plans/new