How to concisely write a set of aggregate functions with conditions in pandas?

I

ideological2021-01-28 18:48:51

Python

ideological, 2021-01-28 18:48:51

Hey!

Sample data:

df = pandas.DataFrame()
df['группы'] = ['a','a','a','b','b','c','c','c','c','d','d']
df['ящиков'] = [1,11,3,11,5,6,11,8,9,10,11]
df['ситуация'] = [0,1,0,1,1,0,1,0,1,0,1]
df

As usual, this is solved in SQL:

SELECT
"группы",
COUNT(DISTINCT CASE WHEN "ситуация" = 1 THEN "ящиков" END) AS "уникальных_ящиков_при_ситуации_1",
SUM(CASE WHEN "ситуация" = 0 THEN "ящиков" END) AS "сумма_ящиков_при_ситуации_0",
..
..
GROUP BY "группы"

How to concisely write a lot of aggregate functions in pandas with different conditions on very different columns? In SQL it is written very simply.
Do not write stupidly, let's say 10 df filtrations, then 10 groupby and 10 merge :). Is there any simple convenient option?

Reply

Answer the question

In order to leave comments, you need to log in

1 answer(s)

A

Alan Gibizov, 2021-01-28
@phaggi

Maybe something like this? I'm quite fluent in SQL and I'm not sure what it will give out in the end ...

import pandas

df = pandas.DataFrame()
df['группы'] = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'c', 'd', 'd']
df['ящиков'] = [1, 11, 3, 11, 5, 6, 11, 8, 9, 10, 11]
df['ситуация'] = [0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
print(df[df['ситуация'] == 1].groupby('группы').count())
print(df[df['ситуация'] == 0].groupby('группы').sum())