PostgreSQL
Roman Kitaev, 2015-12-10 18:05:05

How to force Postgres to use indexes?

Hello. I have this table:

CREATE TABLE public.core_entry (
  id INTEGER PRIMARY KEY NOT NULL DEFAULT nextval('core_entry_id_seq'::regclass),
  keyword CHARACTER VARYING(100) NOT NULL,
  created TIMESTAMP WITH TIME ZONE NOT NULL
);
CREATE INDEX core_entry_e2fa5388 ON core_entry USING BTREE (created);
CREATE INDEX keyword_gist_idx ON core_entry USING GIST (keyword);

The GiST index works very well, but the problem is that it is not used for exact matches, i.e. for the query
SELECT keyword FROM core_entry WHERE keyword='something';

a Seq Scan is performed.
Yes, this can be worked around by writing
SELECT keyword FROM core_entry WHERE keyword LIKE 'something';

The result is the same, but the index is used.
The main (most frequent) query looks like this:
SELECT keyword, count(keyword) as count FROM core_entry WHERE keyword LIKE '%something%' GROUP BY keyword;

In this case, the query with count() runs as a Seq Scan (apparently because it checks for equality).
So I tried adding a regular index:
CREATE INDEX keyword_idx ON core_entry USING BTREE (keyword);

And... it got even worse. Now Postgres handles this query without trouble:
SELECT keyword FROM core_entry WHERE keyword='something';

But LIKE '%something%' is now executed as a sequential scan.
Question:
How can I change the query so that count(keyword) is computed through the GiST index? Preferably the BTREE index would be dropped, because 10 million records per week go into the database.
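
A way to check what the planner is actually doing, sketched under two assumptions that the post does not confirm: the GiST index was built with the trigram operator class gist_trgm_ops from the pg_trgm extension (the CREATE INDEX above shows no operator class), and disabling sequential scans is used only as a session-level diagnostic, not as a production setting.

```sql
-- Assumption: the trigram operator classes come from the pg_trgm extension.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Show the plan that is really chosen for the main query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT keyword, count(keyword) AS count
FROM core_entry
WHERE keyword LIKE '%something%'
GROUP BY keyword;

-- Diagnostic only: make sequential scans look very expensive in this session,
-- then compare the plan and timing with the run above.
SET enable_seqscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT keyword, count(keyword) AS count
FROM core_entry
WHERE keyword LIKE '%something%'
GROUP BY keyword;

-- Restore the default planner behaviour.
RESET enable_seqscan;
```

If the second plan still shows a Seq Scan, no existing index can serve the predicate. If it switches to an index scan, the planner had simply estimated the sequential scan as cheaper. The scan type is decided by the WHERE clause, not by count(). A B-tree index cannot serve a pattern with a leading wildcard, and trigram indexes only gained support for the = operator in PostgreSQL 14, which would explain the behaviour described above on older versions.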

2 answers
Kirill, 2015-12-10
@deliro

Make two tables, something like:

create table core_keywords(
    keyword_id serial primary key,
    keyword    varchar(100)
);
create unique index u_idx_keyword on core_keywords(lower(keyword));
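-- gin_trgm_ops is provided by the pg_trgm extension, which must be installed first
create extension if not exists pg_trgm;
create index t_idx_keywords on core_keywords using gin (lower(keyword) gin_trgm_ops);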

create table core_keywords_entry(
    keyword_id int not null references core_keywords,
    created_at timestamp with time zone not null default CURRENT_TIMESTAMP,
    primary key (keyword_id, created_at)
);

Then everything becomes simpler, something like:
select 
    e.keyword_id, 
    count(*) 
from core_keywords_entry as e
join core_keywords as k using(keyword_id)
where lower(k.keyword) like  '%something%'
group by 1
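
A minimal sketch of how the two tables above might be filled and queried so that the result contains the keyword text rather than the id. It assumes PostgreSQL 9.5 or later for ON CONFLICT, and 'Something' is only a placeholder value; neither is stated in the answer.

```sql
-- Register the keyword once; the unique index on lower(keyword)
-- makes a repeated insert a no-op (assumes PostgreSQL 9.5+).
INSERT INTO core_keywords (keyword)
VALUES ('Something')
ON CONFLICT ((lower(keyword))) DO NOTHING;

-- Record one occurrence; created_at falls back to its default.
INSERT INTO core_keywords_entry (keyword_id)
SELECT keyword_id
FROM core_keywords
WHERE lower(keyword) = lower('Something');

-- Count occurrences per keyword; the substring filter touches only
-- the small core_keywords table, where the trigram index can be used.
SELECT k.keyword, count(*) AS count
FROM core_keywords_entry AS e
JOIN core_keywords AS k USING (keyword_id)
WHERE lower(k.keyword) LIKE '%something%'
GROUP BY k.keyword;
```

One thing to watch with this schema: CURRENT_TIMESTAMP returns the transaction start time, so two entries for the same keyword inserted in one transaction would collide on the primary key (keyword_id, created_at).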

lega, 2015-12-10
@lega

What if keyword is stored as an array (or as a tsvector)? Then the search for exact matches should work, which means the grouping will (possibly) work too. In MongoDB, a "tag cloud" built this way works fine.
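
A rough sketch of the tsvector variant, assuming the original core_entry table and the built-in 'simple' text search configuration; the index name keyword_tsv_idx is made up for illustration.

```sql
-- Expression index over the keyword as a tsvector (hypothetical index name).
CREATE INDEX keyword_tsv_idx
    ON core_entry
    USING GIN (to_tsvector('simple', keyword));

-- The WHERE expression must match the indexed expression exactly
-- for the planner to consider the index.
SELECT keyword, count(*) AS count
FROM core_entry
WHERE to_tsvector('simple', keyword) @@ plainto_tsquery('simple', 'something')
GROUP BY keyword;
```

Note that full-text search matches whole words, so this is not equivalent to LIKE '%something%': a keyword containing 'somethingelse' would not be found.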
