P T1 T2 T3
0 1 2 3
1 1 2 0
2 3 1 2
3 1 0 2

In the pandas DataFrame df above, I want to sum the T columns depending on the value of column 'P':

if df['P'] == 0: 0
if df['P'] == 1: T1 (=1)
if df['P'] == 2: T1+T2 (=3+1=4)
if df['P'] == 3: T1+T2+T3 (=1+0+2=3)

In other words, I want to sum T1 through TN when df['P'] == N. How can I implement this in Python?

1 Answer


EDIT:

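To sum values according to the P column, select each group of columns with DataFrame.filter, then build a mask by broadcasting np.arange (over the number of filtered columns) against the P values. Pass this mask to DataFrame.where so that values outside the mask are replaced by 0, and finally sum per row: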

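import numpy as np
import pandas as pd

np.random.seed(20)

# Sample data: column P plus three groups T1..T3, U1..U3, V1..V3
cols = [f'{x}{i + 1}' for x in ['T', 'U', 'V'] for i in range(3)]
df = pd.DataFrame(np.random.randint(4, size=(10, 10)), columns=['P'] + cols)

# P as a column vector so it broadcasts against the column positions
arrP = df['P'].to_numpy()[:, None]

for prefix in ['T', 'U', 'V']:
    # Columns of the current group, e.g. T1, T2, T3
    df1 = df.filter(regex=rf'^{prefix}')
    # Keep the first P columns of each row, replace the rest with 0, sum per row
    df[f'{prefix}_SUM'] = df1.where(np.arange(len(df1.columns)) < arrP, 0).sum(axis=1)
print(df)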
   P  T1  T2  T3  U1  U2  U3  V1  V2  V3  T_SUM  U_SUM  V_SUM
0  3   2   3   3   0   2   1   0   3   2      8      3      5
1  3   2   0   2   0   1   2   2   3   3      4      3      8
2  0   1   2   2   2   0   1   1   3   1      0      0      0
3  3   2   2   2   1   3   2   1   3   2      6      6      6
4  3   1   1   3   1   2   2   0   2   3      5      5      5
5  2   3   2   3   1   1   1   0   3   0      5      2      3
6  2   3   2   3   3   3   2   1   1   2      5      6      2
7  3   2   0   2   1   1   2   2   2   3      4      4      7
8  2   2   1   0   2   2   0   3   3   0      3      4      6
9  2   2   3   2   2   3   2   2   1   1      5      5      3
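
For a quick check, here is the same masking idea applied to the small frame from the question. This is only a sketch: the names t_cols, arr_p and mask are chosen here for illustration and are not part of the code above.

```python
import numpy as np
import pandas as pd

# The sample frame from the question.
df = pd.DataFrame({
    'P':  [0, 1, 2, 3],
    'T1': [1, 1, 3, 1],
    'T2': [2, 2, 1, 0],
    'T3': [3, 0, 2, 2],
})

t_cols = df.filter(regex=r'^T')            # T1, T2, T3
arr_p = df['P'].to_numpy()[:, None]        # shape (4, 1)

# mask[i, j] is True when the 0-based column position j is below P in row i,
# i.e. when T(j+1) is one of the first P columns of that row.
mask = np.arange(t_cols.shape[1]) < arr_p  # shape (4, 3)
print(mask)

# Values outside the mask become 0, so the row sum only covers T1..TP.
df['T_SUM'] = t_cols.where(mask, 0).sum(axis=1)
print(df['T_SUM'].tolist())                # [0, 1, 4, 3]
```

The result matches the values expected in the question (0, 1, 4, 3).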

5 Comments

Thank you for your quick solution. Actually, the dataframe has many columns: ... P ... T1, T2, T3, ..., T10, U1, U2, ..., U12, V1, V2, ..., V8. I want to create columns T_SUM, U_SUM and V_SUM, each containing the sum of its column group based on the value of P (as in my question). Would you please update your solution?
@YimJune - answer was edited with some sample data
Thank you jezrael. I'd better look into the tril and filter functions. But in your sample df, T_SUM (and also U_SUM, V_SUM) doesn't seem to be calculated on the basis of P (e.g. in the 4th row, P=2 but T_SUM=9; it should be 5). And FYI, P in my actual dataframe is not sorted (P = 1, 1, 0, 3, 2, 4, 4, ... like this). So I wonder whether I can use the np.tril function for my actual dataframe.
@YimJune - Answer was edited.
Perfect! My problem has been solved. Thanks for your great help. :)
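
Regarding the unsorted P and the groups of different widths raised in the comments: the mask is built row by row from P, so the order of the P values should not matter. Below is a minimal sketch with made-up data. The helper prefix_sum_by_p is a hypothetical name, and it assumes the columns of each group appear in numeric order in the frame (T1, T2, ...), because DataFrame.filter keeps the existing column order.

```python
import numpy as np
import pandas as pd

def prefix_sum_by_p(df, prefix, p_col='P'):
    """Per row, sum the first df[p_col] columns named <prefix><number>."""
    # The \d+$ part keeps helper columns such as T_SUM out of the group.
    group = df.filter(regex=rf'^{prefix}\d+$')
    mask = np.arange(group.shape[1]) < df[p_col].to_numpy()[:, None]
    return group.where(mask, 0).sum(axis=1)

# Made-up data: P is unsorted, T has 3 columns, U has 4.
df = pd.DataFrame({
    'P':  [1, 1, 0, 3, 2, 4],
    'T1': [1, 2, 3, 4, 5, 6],
    'T2': [1, 1, 1, 1, 1, 1],
    'T3': [2, 2, 2, 2, 2, 2],
    'U1': [1, 0, 1, 0, 1, 0],
    'U2': [5, 5, 5, 5, 5, 5],
    'U3': [0, 1, 0, 1, 0, 1],
    'U4': [3, 3, 3, 3, 3, 3],
})

for prefix in ['T', 'U']:
    df[f'{prefix}_SUM'] = prefix_sum_by_p(df, prefix)

print(df[['P', 'T_SUM', 'U_SUM']])
```

When P is larger than the number of columns in a group (P=4 with only T1..T3 here), the mask is True for the whole row, so all columns of that group are summed.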
