How to properly use pandas indexes?

Python

Rudolf Nemov, 2020-11-20 16:47:26

Hello, I am trying to solve this problem. It is quite interesting.

The function should take a pandas.DataFrame as input, as in the image, and return the winner of the election.

The task also suggests using the methods .idxmax(), .sort_index() and .groupby(), but
I have been unable to solve it for several hours...

the value from the electors column

As far as I understand, I need to:
1. group by state
2. sort by candidate name
3. find the winner in each state
4. then somehow get the number of electors multiplied by "whether this candidate won", following all-or-nothing logic
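
The four steps above can be sketched roughly as follows. The original image is not available, so the column layout (a state column, an electors column, then one vote column per candidate) and all names and numbers are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data: names and numbers are invented for this sketch.
df = pd.DataFrame({
    "state": ["A", "B", "C", "D"],
    "electors": [10, 3, 5, 4],
    "Adams": [100, 20, 70, 10],
    "Brown": [90, 50, 70, 30],
    "Clark": [40, 10, 60, 20],
})

# Steps 1-2: one row per state, candidate columns in alphabetical order.
votes = df.set_index("state").drop(columns="electors").sort_index(axis=1)

# Step 3: the winner in each state is the column label with the most votes.
# On a tie idxmax returns the first label, which is why the sort matters.
state_winner = votes.idxmax(axis=1)

# Step 4: winner-takes-all, so each state's electors go to its winner.
electors = df.set_index("state")["electors"]
totals = electors.groupby(state_winner).sum()

print(totals.idxmax(), totals.max())  # prints: Adams 15
```

Both Series share the state index, so groupby can align them without any manual bookkeeping. State C is a tie here and goes to Adams only because of the alphabetical sort.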

Here is what my code looks like now:

def winner_votes(df_in):
    df = df_in.groupby(['state', 'electors'], sort=False).first()
    df.sort_index(axis=1, inplace=True)
    df['winner'] = df.idxmax(axis=1)
    return df

The output is the following dataframe:

I don't understand how to add a "score" column to this dataframe. I would like to put something like df[ df.index == df['winner'] ] * df['electors'] in it, but of course this approach doesn't work.
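
One possible way to get the "electors times whether this candidate won" idea working is sketched below. It assumes the frame has a (state, electors) MultiIndex, as the groupby above would produce. The candidate names and numbers are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe described above.
index = pd.MultiIndex.from_tuples(
    [("A", 10), ("B", 3), ("C", 5)], names=["state", "electors"]
)
df = pd.DataFrame({"Adams": [100, 20, 80], "Brown": [90, 50, 70]}, index=index)
df["winner"] = df[["Adams", "Brown"]].idxmax(axis=1)

# 'electors' is an index level, not a column, so df['electors'] would fail.
electors = df.index.get_level_values("electors").to_numpy()

# 1 where the candidate won the state, 0 otherwise.
candidates = ["Adams", "Brown"]
won = pd.DataFrame({name: df["winner"] == name for name in candidates}).astype(int)

# Multiply row-wise: the winner gets all electors, everyone else gets 0.
score = won.mul(electors, axis=0)
totals = score.sum()

print(totals.idxmax(), totals.max())  # prints: Adams 15
```

The comparison is made against each candidate name rather than against df.index, because the index holds states and electors, not candidates.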

3 answers

Rudolf Nemov, 2020-11-20
@rudieduddie

I found the solution with the help of more experienced colleagues :)

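def winner_votes(df):
    # Index by electors so each state's elector count stays with its votes.
    votes = df.drop(columns=['state']).set_index('electors')
    # Candidates become rows, sorted by name; idxmax picks each state's winner.
    votes = votes.T.sort_index().idxmax().reset_index()
    # Sum electors per winning candidate (column 0) and keep the largest total.
    votes = votes.groupby(0)['electors'].sum().nlargest(1)
    return (votes.index[0], votes.values[0])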

dmshar, 2020-11-20
@dmshar

I did not understand much from the question, but the "winner" in the last table shown is easy to find: group by "winner" and calculate the sum of "electors" in each group. Then choose the group (that is, the candidate) with the largest sum.
If this does not answer your question, please clarify and rephrase it as a specific question, so that it is clear what exactly you are stuck on.
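
A minimal sketch of this suggestion, assuming the table from the question has state and electors as index levels (the data below is hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for the asker's table with a 'winner' column.
index = pd.MultiIndex.from_tuples(
    [("A", 10), ("B", 3), ("C", 5)], names=["state", "electors"]
)
table = pd.DataFrame({"Adams": [100, 20, 80], "Brown": [90, 50, 70]}, index=index)
table["winner"] = table.idxmax(axis=1)

# 'electors' is an index level here, so turn it back into a column first.
flat = table.reset_index()

# Group by winner, sum the electors, and pick the largest group.
per_candidate = flat.groupby("winner")["electors"].sum()

print(per_candidate.idxmax(), per_candidate.max())  # prints: Adams 15
```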

o5a, 2020-11-20
@o5a

Group by the name of the column that holds the maximum of the three candidate columns, sum the electors in each group, and take idxmax of the result. Roughly like this:

df.groupby(df.iloc[:, 2:5].idxmax(axis=1))['electors'].sum().idxmax()