In the DataFrame below (data), the column 'group' contains three groups: 'A', 'AB' and 'C'. Each of the other columns is assigned to a specific group by its suffix: var1_A relates to group A, and so forth.
import pandas as pd

data = pd.DataFrame({'group':['A', 'AB', 'A', 'AB', 'AB', 'C', 'C', 'A', 'A', 'AB'],
'var1_A':['pass', 'fail', 'pass','fail', 'pass']*2,
'var2_A':['pass', 'pass', 'pass','fail', 'pass']*2,
'var1_AB':['pass', 'pass', 'pass','fail', 'pass']*2,
'var2_AB':['pass', 'pass', 'fail','fail', 'pass']*2,
'var1_C':['pass', 'pass', 'pass','fail', 'pass']*2,
'var2_C': ['fail', 'fail', 'fail','fail', 'pass']*2
})
For each row, I want to count the number of times 'pass' occurs, but only in the columns that belong to that row's group: for rows in group A, only the columns with the suffix _A should be counted, and so on. I want the result in a new column. This almost does the job:
data['new_col'] = data[data['group']=='A'][['var1_A', 'var2_A']].isin(['pass']).sum(axis=1)
data['new_col'] = data[data['group']=='AB'][['var1_AB', 'var2_AB']].isin(['pass']).sum(axis=1)
data['new_col'] = data[data['group']=='C'][['var1_C', 'var2_C']].isin(['pass']).sum(axis=1)
However, I want the results for all groups in the same column, and here each assignment overwrites the previous one. Is this perhaps possible with a groupby and transform? I got stuck figuring it out.
Target dataframe:
pd.DataFrame({'group':['A', 'AB', 'A', 'AB', 'AB', 'C', 'C', 'A', 'A', 'AB'],
'var1_A':['pass', 'fail', 'pass','fail', 'pass']*2,
'var2_A':['pass', 'pass', 'pass','fail', 'pass']*2,
'var1_AB':['pass', 'pass', 'pass','fail', 'pass']*2,
'var2_AB':['pass', 'pass', 'fail','fail', 'pass']*2,
'var1_C':['pass', 'pass', 'pass','fail', 'pass']*2,
'var2_C': ['fail', 'fail', 'fail','fail', 'pass']*2,
'result':[2,2,2,0,2,1,1,2,0,2]
})
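For reference, the target above can be reproduced with a plain loop over the group labels. This is only a minimal sketch of one possible approach, not necessarily the most idiomatic one. It assumes that every column belonging to a group ends in '_' followed by the group label, and the names `cols` and `mask` are introduced here purely for illustration.

```python
import pandas as pd

data = pd.DataFrame({'group': ['A', 'AB', 'A', 'AB', 'AB', 'C', 'C', 'A', 'A', 'AB'],
                     'var1_A': ['pass', 'fail', 'pass', 'fail', 'pass'] * 2,
                     'var2_A': ['pass', 'pass', 'pass', 'fail', 'pass'] * 2,
                     'var1_AB': ['pass', 'pass', 'pass', 'fail', 'pass'] * 2,
                     'var2_AB': ['pass', 'pass', 'fail', 'fail', 'pass'] * 2,
                     'var1_C': ['pass', 'pass', 'pass', 'fail', 'pass'] * 2,
                     'var2_C': ['fail', 'fail', 'fail', 'fail', 'pass'] * 2})

# Start with a zero-filled column so every row has a value.
data['result'] = 0

for name in data['group'].unique():
    # Assumption: a column belongs to a group if it ends with '_<group label>'.
    # endswith('_A') does not match 'var1_AB', so 'A' and 'AB' stay separate.
    cols = [c for c in data.columns if c.endswith('_' + name)]
    # Rows that belong to the current group.
    mask = data['group'] == name
    # Count 'pass' across the group's columns and write only into those rows,
    # so results from earlier groups are not overwritten.
    data.loc[mask, 'result'] = data.loc[mask, cols].eq('pass').sum(axis=1)

print(data[['group', 'result']])
```

Writing through `data.loc[mask, 'result']` is what keeps all groups in one column: each pass of the loop touches only the rows of its own group.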