
I have the following program:

    import pandas as pd

    df = pd.DataFrame({'student': ['a'] * 4 + ['b'] * 6,
                       'semester': [1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
                       'passed_exam': [True, False] * 5})

    print(df)
      passed_exam  semester student
    0        True         1       a
    1       False         1       a
    2        True         2       a
    3       False         2       a
    4        True         1       b
    5       False         1       b
    6        True         2       b
    7       False         2       b
    8        True         2       b
    9       False         2       b

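    # Count rows per (student, semester, passed_exam), then pivot passed_exam into columns.
    # The parentheses are needed so the chained calls can span several lines.
    table = (df.groupby(["student", "semester", "passed_exam"])
               .size()
               .unstack(fill_value=0)
               .rename_axis(None, axis=1)
               .reset_index())
    print(table)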
      student  semester  False  True
    0       a         1      1     1
    1       a         2      1     1
    2       b         1      1     1
    3       b         2      2     2

I want to add a new column to the second DataFrame (table) that holds the total number of rows each student has in df. Something like this:

      student  semester  False  True  Total_St
    0       a         1      1     1         4
    1       a         2      1     1         4
    2       b         1      1     1         6
    3       b         2      2     2         6

Any ideas?

Thank you in advance!

2 Answers


Since table has two rows per student, one approach is to compute each student's row count from the original df and map it onto table by student:

    table['total_st'] = table['student'].map(df.groupby('student').size())

    passed_exam student  semester  False  True  total_st
    0                 a         1      1     1         4
    1                 a         2      1     1         4
    2                 b         1      1     1         6
    3                 b         2      2     2         6
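
For reference, the mapping approach can be put together as a self-contained sketch. It rebuilds df and table as in the question; the intermediate name rows_per_student is only illustrative, and the column name Total_St is taken from the desired output in the question rather than from this answer.

```python
import pandas as pd

df = pd.DataFrame({'student': ['a'] * 4 + ['b'] * 6,
                   'semester': [1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
                   'passed_exam': [True, False] * 5})

table = (df.groupby(['student', 'semester', 'passed_exam'])
           .size()
           .unstack(fill_value=0)
           .rename_axis(None, axis=1)
           .reset_index())

# Series indexed by student, holding the number of rows per student in df
rows_per_student = df.groupby('student').size()

# map() looks up each value of table['student'] in the index of that Series
table['Total_St'] = table['student'].map(rows_per_student)
print(table)
```

Because map() matches on the values of the student column rather than on row position, every row belonging to the same student should receive the same count.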

4 Comments

Thanks!! I get the result in the first row for each student, but the other rows return NaN values.
Can you provide the case on which you tested?
Solved it! Thanks again!!
Also I used this code to create table: table = df.groupby(["student","semester","passed_exam"]).size().unstack().reset_index().

Group by 'student', use size to count the rows per student, then merge the result with table:

    # reset_index(name=...) gives the count column a readable name instead of the default 0
    table.merge(df.groupby('student').size().reset_index(name='Total_St'), on='student')
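
A minimal self-contained sketch of the merge approach, assuming df and table are built as in the question. Naming the count column Total_St through reset_index(name=...) and passing how='left' are choices made here for illustration (the left join is meant to keep every row of table); neither is part of the one-liner above. The names counts and result are likewise illustrative.

```python
import pandas as pd

df = pd.DataFrame({'student': ['a'] * 4 + ['b'] * 6,
                   'semester': [1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
                   'passed_exam': [True, False] * 5})

table = (df.groupby(['student', 'semester', 'passed_exam'])
           .size()
           .unstack(fill_value=0)
           .rename_axis(None, axis=1)
           .reset_index())

# Two-column frame: one row per student with that student's row count in df
counts = df.groupby('student').size().reset_index(name='Total_St')

# merge() returns a new DataFrame; table itself is not modified
result = table.merge(counts, on='student', how='left')
print(result)
```

Unlike the map version, merge does not assign in place, so the result has to be bound to a name (or assigned back to table) to be kept.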
