Python pandas generate a table for multiple output variables

Question

I have accident data that includes the year of the accident, the degree of injury, and the age of the injured person. This is an example of the DataFrame:

import pandas as pd

df = pd.DataFrame({'Year': ['2010', '2010','2010','2010','2010','2011','2011','2011','2011'], 
                        'Degree_injury': ['no_injury', 'death', 'first_aid', 'minor_injury','disability','disability', 'disability', 'death','first_aid'],
                        'Age': [50,31,40,20,45,29,60,18,48]})

print(df)

I want a table, grouped by year, with three output variables counted only for people younger than 40: the number of disabilities, the number of deaths, and the number of minor injuries.

The output should be like this:

I generated the three variables (num_disability, num_death, num_minor_injury) when the age is < 40 as shown below.

# Boolean masks: one per injury degree, restricted to people under 40
disability_filt = (df['Degree_injury'] == 'disability') & (df['Age'] < 40)
num_disability = df[disability_filt].groupby('Year')['Degree_injury'].count()
death_filt = (df['Degree_injury'] == 'death') & (df['Age'] < 40)
num_death = df[death_filt].groupby('Year')['Degree_injury'].count()
minor_injury_filt = (df['Degree_injury'] == 'minor_injury') & (df['Age'] < 40)
num_minor_injury = df[minor_injury_filt].groupby('Year')['Degree_injury'].count()

How can I combine these variables into one table like the one illustrated above?

Thank you in advance,
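For reference, the most direct way to combine three per-year count Series like the ones above is probably `pd.concat` along the columns. The following is only a minimal sketch under a few assumptions: `count_by_year` is a hypothetical helper introduced here for illustration (it is not part of the question), and missing year/category combinations are assumed to mean a count of zero.

```python
import pandas as pd

df = pd.DataFrame({
    'Year': ['2010', '2010', '2010', '2010', '2010', '2011', '2011', '2011', '2011'],
    'Degree_injury': ['no_injury', 'death', 'first_aid', 'minor_injury', 'disability',
                      'disability', 'disability', 'death', 'first_aid'],
    'Age': [50, 31, 40, 20, 45, 29, 60, 18, 48],
})

# Keep only people younger than 40 once, instead of repeating the age filter
under_40 = df[df['Age'] < 40]


def count_by_year(degree):
    """Hypothetical helper: rows per Year for a single injury degree."""
    return under_40[under_40['Degree_injury'] == degree].groupby('Year').size()


# The dict keys become the column names; years missing from a Series become NaN,
# which is assumed here to mean "no such cases" and is replaced by 0
combined = (
    pd.concat(
        {
            'num_disability': count_by_year('disability'),
            'num_death': count_by_year('death'),
            'num_minor_injury': count_by_year('minor_injury'),
        },
        axis=1,
    )
    .fillna(0)
    .astype(int)
    .sort_index()
)
print(combined)
```

This keeps the variable names from the question as column headers, at the cost of one filter and one groupby per category.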

Corralien · Accepted Answer · 2021-11-20 20:22:21Z


Use pivot_table after filtering your rows according to your condition:

out = df[df['Age'].lt(40)].pivot_table(index='Year', columns='Degree_injury', 
                                       values='Age', aggfunc='count', fill_value=0)
print(out)

# Output:
Degree_injury  death  disability  minor_injury
Year                                          
2010               1           0             1
2011               1           1             0
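In the sample data, the age filter happens to leave only the three categories of interest. With real data, other categories (for example `first_aid`) could also appear as columns. A hedged sketch of one way to pin the columns down, using `pd.crosstab` instead of `pivot_table` and a list named `wanted` that is an assumption introduced here:

```python
import pandas as pd

df = pd.DataFrame({
    'Year': ['2010', '2010', '2010', '2010', '2010', '2011', '2011', '2011', '2011'],
    'Degree_injury': ['no_injury', 'death', 'first_aid', 'minor_injury', 'disability',
                      'disability', 'disability', 'death', 'first_aid'],
    'Age': [50, 31, 40, 20, 45, 29, 60, 18, 48],
})

# Assumed list of categories to report, in the desired column order
wanted = ['disability', 'death', 'minor_injury']

# Filter on both age and category, then cross-tabulate Year against Degree_injury
sub = df[df['Age'].lt(40) & df['Degree_injury'].isin(wanted)]
out = (
    pd.crosstab(sub['Year'], sub['Degree_injury'])
    # Guarantee all wanted columns exist, even if a category never occurs
    .reindex(columns=wanted, fill_value=0)
)
print(out)
```

The `reindex` step is what makes the column set stable: a category with no matching rows still shows up as a column of zeros.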


mnist · Accepted Answer · 2021-11-20 20:24:14Z


# prep data
df2 = (
    df.loc[df.Age < 40]
    .groupby("Year").Degree_injury.value_counts()
    .rename("Count")       # name the counts explicitly; the default name differs across pandas versions
    .to_frame()
    .reset_index(level=0)  # move Year into a column; Degree_injury stays as the index
)
df2['Degree_injury'] = df2.index
df2
#                   Year    Count   Degree_injury
# death             2010    1       death
# minor_injury      2010    1       minor_injury
# death             2011    1       death
# disability        2011    1       disability

# pivot result
df2.pivot(index='Year', columns='Degree_injury', values='Count')
#       death   disability  minor_injury
# Year          
# 2010  1.0     NaN         1.0
# 2011  1.0     1.0         NaN
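The `NaN` entries above mark year/category pairs with no rows, which also turns the counts into floats. If zeros and integer counts are preferred, one possible variant (a sketch, not the answerer's code) is to count with `groupby(...).size()` and reshape with `unstack(fill_value=0)`:

```python
import pandas as pd

df = pd.DataFrame({
    'Year': ['2010', '2010', '2010', '2010', '2010', '2011', '2011', '2011', '2011'],
    'Degree_injury': ['no_injury', 'death', 'first_aid', 'minor_injury', 'disability',
                      'disability', 'disability', 'death', 'first_aid'],
    'Age': [50, 31, 40, 20, 45, 29, 60, 18, 48],
})

counts = (
    df.loc[df['Age'] < 40]
    .groupby(['Year', 'Degree_injury'])
    .size()                 # one count per (Year, Degree_injury) pair
    .unstack(fill_value=0)  # Degree_injury becomes the columns; missing pairs become 0
)
print(counts)
```

Because the missing pairs are filled during the reshape, no separate `fillna` or type conversion should be needed.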


1 Comment

Eng_GR Over a year ago

Thank you so much @mnist for your feedback.
