1

My aim is to put values in column D based on columns A to C. I want to go through each of the columns A, B and C and add 1 to column D if the value is greater than 20.

import pandas as pd
data={'A':[5,2,25,4],"B":[15,22,100,24], "C":[4, 100, 0, 19], "D" : [0,0,0,0]}
df= pd.DataFrame(data)

for x in range(0, len(df.columns)):
    if df.iloc[:, x]  > 20:
        df["D"] = df["D"] +1
    else:
        df["D"] = df["D"]
df

So, based on the sample code, Column D should looks like:

D
0
2
2
1

However, I keep getting this error - ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). I have gone through other answers here (Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()) which indicate that it is a multiple operator issue but I am not using these here.

What am I doing wrong?

2 Answers 2

2

What is wrong is that you can't use if Series > value in a vectorial way. if expects a single boolean value and df.iloc[:, x] > 20 returns a Series of booleans.

In your case use:

df['D'] = df.drop(columns='D').gt(20).sum(1)

NB. If column D is initially null or inexistent, skip the .drop(columns='D').

Output:

    A    B    C  D
0   5   15    4  0
1   2   22  100  2
2  25  100    0  2
3   4   24   19  1
Sign up to request clarification or add additional context in comments.

Comments

1

Another possible solution:

df.D = (
  df.loc[:, ~ df.columns.str.contains('D')]
  .apply(lambda x: sum(x > 20), axis = 1))

Output:

    A    B    C  D
0   5   15    4  0
1   2   22  100  2
2  25  100    0  2
3   4   24   19  1

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.