Anatoly Sidorov, 2017-04-25 12:14:47

Why is PostgreSQL so slow?

Good afternoon!
There are a lot of glowing reviews of Postgres on the Internet: everything is fast and great, tables with billions of rows, and so on.
In reality, we currently have a table with 1.5 million statistics records, and a typical count(*) query takes about 0.7 seconds (700 ms).

[SQL] EXPLAIN ANALYSE select count(*) from table

Aggregate  (cost=174099.68..174099.69 rows=1 width=8) (actual time=787.417..787.418 rows=1 loops=1)
  ->  Seq Scan on table (cost=0.00..170511.54 rows=1435254 width=0) (actual time=0.444..637.771 rows=1435107 loops=1)
Planning time: 0.110 ms
Execution time: 787.479 ms

[SQL] EXPLAIN ANALYSE select count(*) from table where user_id=114
Aggregate  (cost=166624.17..166624.18 rows=1 width=8) (actual time=482.791..482.792 rows=1 loops=1)
  ->  Bitmap Heap Scan on table (cost=2919.96..166234.64 rows=155811 width=0) (actual time=46.828..463.465 rows=156944 loops=1)
        Recheck Cond: (user_id = 114)
        Rows Removed by Index Recheck: 153189
        Heap Blocks: exact=39222 lossy=26507
        ->  Bitmap Index Scan on idx_user_id  (cost=0.00..2881.01 rows=155811 width=0) (actual time=36.766..36.766 rows=156944 loops=1)
              Index Cond: (user_id = 114)
Planning time: 0.242 ms
Execution time: 483.520 ms

And this is just a simple row count; we also need aggregates over individual columns, joins, distincts, and much more.
What are we missing? We just need to be able to quickly filter a large amount of data by a given date and user. Partitioning only makes things worse, probably because the amount of data is too small.
UPD. The simplest example: a table with a million rows of (UUID, user_id, date). 95% of SELECT queries filter by user_id plus date BETWEEN (start, end). What would help in this case? Right now there is a btree index on (dt, user_id).
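For concreteness, the typical query shape looks roughly like this (the table and column names are illustrative, pieced together from the plans above):

[SQL] SELECT * FROM table WHERE user_id = 114 AND dt BETWEEN '2017-01-01' AND '2017-04-25';
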
Thanks in advance.


6 answers
Artem Klimenko, 2017-04-25
@aklim007

Regarding the slow COUNT over the entire table, others have already answered you, but the second query should normally run almost instantly, provided Postgres is configured correctly.
Are you by any chance running on the default settings (they are deliberately tiny, so that Postgres starts even on a calculator)?
If so, I recommend postgresql.leopard.in.ua; a new edition was released there recently.
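As a rough illustration, the settings people usually raise first look something like this (the values assume a dedicated server with ~8 GB of RAM and are only a starting point, not recommendations):

[SQL] ALTER SYSTEM SET shared_buffers = '2GB';        -- ~25% of RAM; takes effect only after a restart
      ALTER SYSTEM SET effective_cache_size = '6GB';  -- planner hint only, allocates no memory
      ALTER SYSTEM SET work_mem = '64MB';             -- per sort/hash node, so keep it modest
      SELECT pg_reload_conf();                        -- applies the reloadable settings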

Alexander Aksentiev, 2017-04-25
@Sanasol

https://wiki.postgresql.org/wiki/Slow_Counting
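The gist of that page: because of MVCC, count(*) has to actually visit the rows, but if an estimate is good enough, the planner's statistics answer instantly. A minimal sketch (assuming the table really is named "table"; accuracy depends on how recently ANALYZE or autovacuum ran):

[SQL] SELECT reltuples::bigint AS approximate_count FROM pg_class WHERE relname = 'table';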

terrier, 2017-04-25
@terrier


Recheck Cond: (user_id = 114)
Rows Removed by Index Recheck: 153189
Heap Blocks: exact=39222 lossy=26507
The recheck of that very condition really does filter out a significant number of rows, and the lossy heap blocks mean the bitmap did not fit in memory. Long story short: you do not have enough work_mem, increase it.
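For example (64MB here is an illustrative value; size it to your RAM and expected number of connections):

[SQL] SET work_mem = '64MB';  -- session-level, no restart needed
      EXPLAIN ANALYSE select count(*) from table where user_id=114;

If "lossy" disappears from the Heap Blocks line, the bitmap now fits in memory.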

thyratr0n, 2017-04-27
@thyratr0n

The first query does not use an index. It looks like the table has no PRIMARY KEY.
The second query is also unclear. It looks as if user_id is part of a composite index such as (user_type, user_id), and since the leading column is not used in the query, the query is slow.
We need the table's DDL; without it, this is all reading tea leaves.
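For reference, the DDL is easy to pull (assuming psql; substitute the real table name):

[SQL] \d+ table   -- in psql: columns, indexes, and their definitions

Or from the shell: pg_dump --schema-only --table=table dbname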

Andrey Shishkin, 2017-05-04
@compiler

"Why is Postgresql so slow?"
You just don't know how to cook it.

Max, 2017-04-25
@MaxDukov

Try EXPLAIN (ANALYSE, BUFFERS) ...
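That is, something like the following (using the problem query from the question):

[SQL] EXPLAIN (ANALYSE, BUFFERS) select count(*) from table where user_id=114;

In the output, the "Buffers: shared hit=N read=M" line under each node shows how much data came from cache versus disk: "hit" blocks were already in shared_buffers, "read" blocks came from the OS or disk.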
