How to correctly calculate the approximate number of rows through EXPLAIN?

PostgreSQL

6pirule, 2017-06-15 11:24:27

I have a large table and need an approximate count of the rows that match a WHERE clause.
If more than 9612 rows match, EXPLAIN returns a correct approximate row count.
If fewer than 9612 rows match, for example 5000 (or even 5), EXPLAIN still returns rows=9612.
What am I doing wrong?
EXPLAIN SELECT COUNT(*)
FROM mat_view
WHERE gen_doc @@ plainto_tsquery('english', 'any example');

1 answer(s)

Melkij, 2017-06-15
@6pirule

Learn exactly how the planner statistics work. For example, see https://habrahabr.ru/company/pgdayrussia/blog/329542/
There is default_statistics_target (100 by default), which sets how many of the most frequent values are tracked in most_common_vals and most_common_freqs. What can be said about all the other values (the number of distinct values is n_distinct)? Only that they are spread somehow over the remaining (100% - sum(most_common_freqs)) of the rows. The statistics contain no values other than those in most_common_vals, so the planner cannot tell whether the value you are looking for occurs even once, let alone how such values are distributed. It simply assumes the distribution is roughly uniform.
Accordingly, you can adjust the statistics target (default_statistics_target) for a specific column or index.
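
A rough sketch of what that could look like for the query in the question. It assumes mat_view is a materialized view (for a plain table, use ALTER TABLE instead) and that gen_doc is a tsvector column. The target of 1000 is only an illustrative value, not a recommendation. For tsvector columns the per-lexeme statistics are exposed in most_common_elems and most_common_elem_freqs rather than most_common_vals, so the query selects both.

```sql
-- Inspect what the planner currently knows about the column.
-- mat_view and gen_doc are the names from the question.
SELECT attname,
       n_distinct,
       most_common_vals,
       most_common_freqs,
       most_common_elems,       -- per-lexeme stats for tsvector columns
       most_common_elem_freqs
FROM pg_stats
WHERE tablename = 'mat_view'
  AND attname = 'gen_doc';

-- Raise the statistics target for this one column only.
-- 1000 is an assumed, illustrative value.
ALTER MATERIALIZED VIEW mat_view
    ALTER COLUMN gen_doc SET STATISTICS 1000;

-- The new target only takes effect once statistics are collected again.
ANALYZE mat_view;

-- Re-check the estimate.
EXPLAIN SELECT COUNT(*)
FROM mat_view
WHERE gen_doc @@ plainto_tsquery('english', 'any example');
```

A larger target makes ANALYZE slower and the statistics bigger. It also only lowers the threshold below which the planner has to guess. Values rarer than anything tracked are still estimated, so the result stays approximate.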