Python
Sergey Sokolov, 2018-03-09 20:10:51

How to aggregate over a list of IDs in pandas, including the missing ones?

There is a list of IDs of interest and a set of rows with numbers for some, but not all, of these IDs.
I would like to get the sum of the numbers for each ID, and zero for the IDs that have no rows. Something like a LEFT OUTER JOIN in SQL.
For example, the data:

id  x
------
1   10
1   12
2   11
4   21

The IDs of interest are (1, 2, 3, 4). Expected result:
1  22
2  11
3  0
4  21

Without the zeros, I would do this: df.groupby('id').agg('sum')
But this simply doesn't output rows for IDs that have no entries.

1 answer(s)
Sergey Sokolov, 2018-03-09
@sergiks

Suggested on SO: use reindex()

my_ids = [1, 2, 3, 4]  # for example
# Parentheses are required so the chained calls can span several lines
result = (
    df.groupby('id').sum()
      .reindex(my_ids, fill_value=0)
      .reset_index()
)
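
For anyone who wants to try it, here is a minimal self-contained sketch built on the sample data from the question. The variable names (`sums`, `result`, `alt`) are my own, and selecting the `x` column explicitly is an assumption about which column should be summed. The second part is not the approach from the answer: it is an alternative through a left merge, which is probably the closest analogue of the LEFT OUTER JOIN mentioned in the question.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"id": [1, 1, 2, 4], "x": [10, 12, 11, 21]})
my_ids = [1, 2, 3, 4]  # IDs of interest, including one with no rows (3)

# Sum per ID; IDs without rows are absent from this Series
sums = df.groupby("id")["x"].sum()

# reindex adds the missing IDs and fills their sums with 0
result = sums.reindex(my_ids, fill_value=0).reset_index()
print(result)

# Alternative (not from the answer): left merge against the list of IDs.
# Missing IDs come back as NaN, so fill them and restore the integer dtype.
alt = (
    pd.DataFrame({"id": my_ids})
      .merge(sums.reset_index(), on="id", how="left")
      .fillna({"x": 0})
      .astype({"x": "int64"})
)
print(alt)
```

Both variants should give the same table. reindex is shorter when the list of IDs is just a Python list, while the merge may be more convenient if the IDs already live in another DataFrame.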
