K
K
Koteezy2018-12-28 19:09:40
PostgreSQL
Koteezy, 2018-12-28 19:09:40

Duplicate values ​​when querying with generate_series and left join?

Hello everyone, I can’t understand why the number of items in the request is considered incorrect, tell me, what could be the jamb?

select 
  dates.d as day,
  count(v.*) as views_count,
  count(i.*) as items_count
from (
    select d from generate_series('2018-12-01'::date, '2018-12-30', '1 day' ) as d
) as dates
left join views as v on v.created_at::date = dates.d
left join items as i on i.created_at::date = dates.d
group by day order by day desc;

You can see it in action at dbfiddle

Answer the question

In order to leave comments, you need to log in

2 answer(s)
K
Koteezy, 2018-12-28
@Koteezy

There was an answer on Stackoverflow and it fixed the situation.

select 
    dates.d as day,
    coalesce(v.count, 0) as views_count,
    coalesce(i.count, 0) as items_count
from generate_series('2018-12-26'::date, '2018-12-30', '1 day' ) as dates(d)
left join (
    select created_at::date as day, count(*)
    from views 
    group by day
    ) as v on v.day = dates.d
left join (
    select created_at::date as day, count(*)
    from items 
    group by day
    ) as i on i.day = dates.d
order by day desc

M
Melkij, 2018-12-29
@melkij

The query in the question does a cross join views on items and of course the resulting data will be perfectly valid but meaningless.
I don't think it's a good idea to iterate over potentially large views and items without traversing known search conditions by dates, even for an analytical query.

select d as day,
(select count(*) from views where created_at >= d and created_at < (d + interval '1 day')) as views_count,
(select count(*) from items where created_at >= d and created_at < (d + interval '1 day')) as items_count
from generate_series('2018-12-26'::date, '2018-12-30', '1 day' ) as dates(d)
order by day desc

Or in a similar way
select d as day, views_count, items_count
from generate_series('2018-12-26'::date, '2018-12-30', '1 day' ) as dates(d)
left join lateral (select count(*) as views_count from views where created_at >= d and created_at < (d + interval '1 day'))  as v on true
left join lateral (select count(*) as views_count from items where created_at >= d and created_at < (d + interval '1 day'))  as i on true
order by day desc

And see what the planner thinks about it

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question