
This is my current data frame:

sports_gpa  music_gpa Activity Sport
2            3         nan       nan
0            2         nan       nan
3            3.5       nan       nan
2             1        nan       nan

I have the following condition:

If 'sports_gpa' is greater than 0 and 'music_gpa' is greater than 'sports_gpa', fill the 'Activity' column with the 'sports_gpa' value and fill the 'Sport' column with the string 'basketball'.

Expected output:

sports_gpa  music_gpa Activity Sport
2            3         2       basketball
0            2         nan       nan
3            3.5       3        basketball 
2            1         nan      nan

To do this, I would use the following statement:

df['Activity'], df['Sport'] = np.where(((df['sports_gpa'] > 0) & (df['music_gpa'] > df['sports_gpa'])), (df['sports_gpa'], 'basketball'), (df['Activity'], df['Sport']))

This, of course, raises an error saying that the operands could not be broadcast together with the given shapes.

To fix this, I could add a helper column to the data frame:

df.loc[:, 'str'] = 'basketball'
df['Activity'], df['Sport'] = np.where(((df['sports_gpa'] > 0) & (df['music_gpa'] > df['sports_gpa'])), (df['sports_gpa'], df['str']), (df['Activity'], df['Sport']))

This gives me my expected output.
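
A possible explanation of why the helper column avoids the error, sketched with plain NumPy arrays instead of the original frame: each tuple now stacks into a (2, 4) array, and the length-4 condition broadcasts against that shape. The array names below are illustrative only, and `np.broadcast_shapes` assumes NumPy 1.20 or later.

```python
import numpy as np

# Stand-ins for the four-row columns (illustrative names, not from the question).
cond = np.array([True, False, True, False])            # shape (4,)
sports = np.array([2, 0, 3, 2], dtype=object)
activity = np.array([np.nan] * 4, dtype=object)
sport = np.array([np.nan] * 4, dtype=object)

# A full-length array of the label, playing the role of the 'str' helper column.
label = np.full(4, "basketball", dtype=object)

x = np.array([sports, label], dtype=object)            # shape (2, 4)
y = np.array([activity, sport], dtype=object)          # shape (2, 4)

# The (4,) condition broadcasts against both (2, 4) operands.
print(np.broadcast_shapes(cond.shape, x.shape, y.shape))

# The (2, 4) result unpacks into one row per target column.
new_activity, new_sport = np.where(cond, x, y)
print(new_activity)
print(new_sport)
```

In this sketch, the `np.full` array stands in for the 'str' column, so nothing has to be added to the data frame.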

I am wondering if there is a way to fix this error without having to create a new column in order to add the str value 'basketball' to the 'Sport' column in the np.where statement.

  • please show an example of your dataframe and your expected output Commented Nov 4, 2019 at 22:31
  • Fixed my question. Thanks. Commented Nov 4, 2019 at 22:43
  • Thanks, I have added an answer for this. Commented Nov 4, 2019 at 23:13
  • please check my answer:) Commented Nov 23, 2019 at 22:32

2 Answers


Use np.where + Series.fillna:

where = df['sports_gpa'].ne(0) & (df['sports_gpa'] < df['music_gpa'])
df['Activity'], df['Sport'] = np.where(where, (df['sports_gpa'], df['Sport'].fillna('basketball')), (df['Activity'], df['Sport']))

You can also use Series.where + Series.mask:

df['Activity'] = df['sports_gpa'].where(where)
df['Sport'] = df['Sport'].mask(where, 'basketball')
print(df)

   sports_gpa  music_gpa  Activity       Sport
0           2        3.0       2.0  basketball
1           0        2.0       NaN         NaN
2           3        3.5       3.0  basketball
3           2        1.0       NaN         NaN
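
As a further option that is not part of the answer above, the same condition can be applied with boolean-mask assignment through `.loc`, which sets each column separately and needs no helper column. This is a minimal sketch: the frame is rebuilt from the question's sample values, the name `mask` is illustrative, and 'Sport' is assumed to be an object-dtype column so that it can hold strings.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; 'Sport' is created as object dtype
# (an assumption) so that string values can be stored in it.
df = pd.DataFrame({
    "sports_gpa": [2, 0, 3, 2],
    "music_gpa": [3.0, 2.0, 3.5, 1.0],
    "Activity": np.nan,
    "Sport": pd.Series([np.nan] * 4, dtype=object),
})

# Rows where sports_gpa is positive and music_gpa exceeds it.
mask = (df["sports_gpa"] > 0) & (df["music_gpa"] > df["sports_gpa"])

# Assign only on the matching rows; all other rows keep their NaN values.
df.loc[mask, "Activity"] = df.loc[mask, "sports_gpa"]
df.loc[mask, "Sport"] = "basketball"

print(df)
```

Assigning column by column sidesteps the shape question entirely, because a scalar such as 'basketball' is simply written to every selected row.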

Just figured out I could do:

   df['Activity'], df['Sport'] = np.where(((df['sports_gpa'] > 0) & (df['music_gpa'] > df['sports_gpa'])), (df['sports_gpa'],df['Sport'].astype(str).replace({"nan": "basketball"})), (df['Activity'], df['Sport']))
