How to aggregate over a list of IDs in pandas, including the missing ones?

Python

Sergey Sokolov, 2018-03-09 20:10:51

There is a list of IDs of interest and a bunch of rows with numbers for some, but not all, of these IDs.
I would like to get the sum of the numbers for each ID, and zeros for the IDs that have no rows. Something like a LEFT OUTER JOIN in SQL.
For example, data:

id  x
------
1   10
1   12
2   11
4   21

The IDs of interest are (1, 2, 3, 4). Expected result:

id  x
------
1   22
2   11
3   0
4   21

If there were no missing IDs, I would do this: df.groupby('id').agg('sum')
But that simply doesn't output rows for IDs that have no entries.
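
A minimal reproduction of the behaviour described above, assuming the sample data is loaded into a DataFrame named df (the name sums is only for illustration):

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"id": [1, 1, 2, 4], "x": [10, 12, 11, 21]})

# Plain groupby: only the IDs that actually occur in df get a row
sums = df.groupby("id").agg("sum")
print(sums)  # there is no row for id 3
```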

1 answer(s)

Sergey Sokolov, 2018-03-09
@sergiks

It was suggested on SO: use reindex()

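my_ids = [1, 2, 3, 4]  # for example
# The parentheses let the method chain span several lines
result = (
    df.groupby('id').sum()
      .reindex(my_ids, fill_value=0)   # add missing IDs, filled with 0
      .reset_index()                   # turn 'id' back into a column
)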
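
For completeness, here is a self-contained sketch of the reindex approach on the sample data from the question. The second variant, a left merge, is not from the SO suggestion; it is added only because it mirrors the LEFT OUTER JOIN analogy. The names result and joined are illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"id": [1, 1, 2, 4], "x": [10, 12, 11, 21]})
my_ids = [1, 2, 3, 4]  # IDs of interest

# Variant 1: aggregate, then reindex onto the full ID list
result = (
    df.groupby("id")["x"].sum()
      .reindex(my_ids, fill_value=0)  # missing IDs get 0
      .reset_index()
)

# Variant 2 (an alternative, not from the answer): left merge, like LEFT OUTER JOIN
joined = (
    pd.DataFrame({"id": my_ids})
      .merge(df.groupby("id", as_index=False)["x"].sum(), on="id", how="left")
      .fillna({"x": 0})       # IDs without a match come back as NaN
      .astype({"x": int})     # NaN forced the column to float, so cast back
)

print(result)
print(joined)
```

The reindex variant keeps the integer dtype because the gaps are filled right away, while the merge variant goes through NaN and needs the cast back to int.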