where statements with groupby in python dataframes

Question

train[['Pclass', 'Age']].groupby(['Pclass'], as_index=False).median().sort_values(by='Pclass', ascending=True)

This line does the grouping, computes the summary statistic (the median), and sorts the result by a column ('Pclass' in this case).

How can I add a where clause to this? The filter I have in mind would do something like train[train.Survived==1].

Any thoughts on how this can be achieved? I am using the classic "Titanic" dataset.

unutbu · Accepted Answer · 2017-06-14 11:30:48Z

Replace train[['Pclass', 'Age']] with a .loc selection that filters the rows and picks the columns in one step:

train.loc[train['Survived'] == 1, ['Pclass', 'Age']]

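For example, using seaborn's copy of the Titanic dataset (its column names are lowercase: 'survived', 'pclass', 'age'):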

import pandas as pd
import seaborn as sns
train = sns.load_dataset("titanic")

print(train.loc[train['survived'] == 1, ['pclass', 'age']]
           .groupby(['pclass'], as_index=False)
           .median()
           .sort_values(by='pclass', ascending=True))

prints

   pclass   age
0       1  35.0
1       2  28.0
2       3  22.0
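
As a further sketch of the same idea, the snippet below uses a small hand-made DataFrame with the capitalized column names from the question. The values are made up for illustration and are not taken from the Titanic data. It puts the .loc form next to DataFrame.query, which reads closer to a SQL WHERE clause; both should give the same result.

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic columns used in the question.
train = pd.DataFrame({
    "Pclass":   [1, 1, 1, 2, 2, 3, 3, 3],
    "Survived": [1, 1, 0, 1, 0, 1, 1, 0],
    "Age":      [30.0, 40.0, 50.0, 20.0, 60.0, 10.0, 14.0, 70.0],
})

# Option 1: boolean mask with .loc (filters rows and selects columns in one step).
via_loc = (train.loc[train["Survived"] == 1, ["Pclass", "Age"]]
                .groupby("Pclass", as_index=False)
                .median()
                .sort_values(by="Pclass", ascending=True))

# Option 2: DataFrame.query expresses the same row filter as a string expression.
via_query = (train.query("Survived == 1")[["Pclass", "Age"]]
                  .groupby("Pclass", as_index=False)
                  .median()
                  .sort_values(by="Pclass", ascending=True))

print(via_loc)
print(via_loc.equals(via_query))
```

With this toy data the survivors' median ages come out as 35.0, 20.0 and 12.0 for classes 1, 2 and 3. In both versions the filter runs before the groupby, so rows with Survived == 0 never reach the median.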
