train[['Pclass', 'Age']].groupby(['Pclass'], as_index=False).median().sort_values(by='Pclass', ascending=True)

This is the part where I group the data, compute a summary statistic (the median), and sort the result by a column ('Pclass' in this case).

How can I use a where clause along with this? I want to restrict the rows first, with a condition similar to train[train.Survived==1].

Any thoughts on how this can be achieved? I am using the classic "Titanic" dataset.

1 Answer

Filter the rows before grouping: change train[['Pclass', 'Age']] to

train.loc[train['Survived'] == 1, ['Pclass', 'Age']]

For example, using seaborn's copy of the dataset (its column names are lowercase):

import pandas as pd
import seaborn as sns
train = sns.load_dataset("titanic")

print(train.loc[train['survived'] == 1, ['pclass', 'age']]
           .groupby(['pclass'], as_index=False)
           .median()
           .sort_values(by='pclass', ascending=True))

prints

   pclass   age
0       1  35.0
1       2  28.0
2       3  22.0
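
If you want to try the pattern without downloading the dataset, here is a minimal sketch on a small made-up frame. The rows below are hypothetical values I invented for illustration, not real Titanic records. It also shows DataFrame.query as an alternative way to write the same filter; both approaches select the rows first and then group.

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic columns used above.
train = pd.DataFrame({
    "Pclass":   [1, 1, 1, 2, 2, 3, 3, 3],
    "Age":      [38.0, 54.0, 30.0, 14.0, 40.0, 22.0, 26.0, 4.0],
    "Survived": [1, 0, 1, 1, 0, 0, 1, 1],
})

# Boolean mask picks the rows (the "where"), the list picks the columns.
via_loc = (train.loc[train["Survived"] == 1, ["Pclass", "Age"]]
                .groupby("Pclass", as_index=False)
                .median()
                .sort_values(by="Pclass", ascending=True))

# Same filter written with DataFrame.query, then the same grouping.
via_query = (train.query("Survived == 1")[["Pclass", "Age"]]
                  .groupby("Pclass", as_index=False)
                  .median()
                  .sort_values(by="Pclass", ascending=True))

print(via_loc)
```

With this toy data the median ages of the survivors come out as 34.0, 14.0 and 15.0 for classes 1, 2 and 3, and both variants give the same result.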