28

I have a simple DataFrame:

import numpy as np
import pandas as pd
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('abc'))

For example:

          a         b         c
0 -0.813530 -1.291862  1.330320
1 -1.066475  0.624504  1.690770
2  1.330330 -0.675750 -1.123389
3  0.400109 -1.224936 -1.704173

I then want to create a column “d” that contains the value from “c” if c is positive, and otherwise the value from “b”.

I am trying:

frame['d']=frame.apply(lambda x: frame['c'] if frame['c']>0 else frame['b'],axis=0)

But I get:

ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index a')

I tried searching for a solution but did not succeed. Any tips, please?

2
  • The lambda takes an argument x, but x is never used in its logic. Commented May 25, 2016 at 16:40
  • frame['c']>0 produces a boolean Series for the whole column c, and the if then tries to use the truth value of that entire Series. x['c']>0 instead compares the value in the specific cell to 0 and returns a single boolean. Commented May 25, 2016 at 16:42
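
To make the point in these comments concrete, here is a minimal sketch. The three-row frame and the names mask, raised and first_is_positive are made up for illustration and are not from the question.

```python
import pandas as pd

# Small illustrative frame (values are assumptions, not the asker's data).
frame = pd.DataFrame({"b": [-1.0, 0.5, -0.7], "c": [1.3, -1.1, 0.2]})

# Comparing a whole column yields one boolean per row, not a single boolean.
mask = frame["c"] > 0

# Using that Series in an `if` asks for a single truth value and fails.
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True

# Comparing a single cell yields one boolean, which `if` accepts.
first_is_positive = frame.loc[0, "c"] > 0
```

Here raised ends up True, which is the same ValueError the question reports, while the single-cell comparison works as expected.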

3 Answers

55

Is this what you want?

In [300]: frame[['b','c']].apply(lambda x: x['c'] if x['c']>0 else x['b'], axis=1)
Out[300]:
0   -1.099891
1    0.582815
2    0.901591
3    0.900856
dtype: float64

1 Comment

The axis=1 at the end is important: it makes apply pass each row to the lambda. Without it, apply passes each column, and x['c'] raises a KeyError.
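
A short sketch of the difference this comment describes, assuming a small made-up frame. The helper pick is a hypothetical name that mirrors the lambda in the answer.

```python
import pandas as pd

# Illustrative data (an assumption, not the asker's random frame).
frame = pd.DataFrame({"b": [-1.0, 0.5, -0.7], "c": [1.3, -1.1, 0.2]})

def pick(x):
    # x is one row when axis=1, so x["c"] and x["b"] are scalars.
    return x["c"] if x["c"] > 0 else x["b"]

# axis=1: the function is called once per row.
row_wise = frame[["b", "c"]].apply(pick, axis=1)

# axis=0: the function is called once per column; a column Series is
# indexed by row labels (0, 1, 2), so looking up "c" fails.
try:
    frame[["b", "c"]].apply(pick, axis=0)
    key_error = False
except KeyError:
    key_error = True
```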
8

Solution

Use a vectorized approach:

frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)

Explanation

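This is derived from the sum of

(frame.c > 0) * frame.c  # frame.c if c is positive

plus

(frame.c <= 0) * frame.b  # frame.b if c is not positive

However, as a 0/1 multiplier (and for non-missing values of c),

(frame.c <= 0)

is equivalent to

1 - (frame.c > 0)

Note the parentheses: 1 - frame.c > 0 would be parsed as (1 - frame.c) > 0, which is a different condition.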

and when combined you get

frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)
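
As an alternative to the arithmetic form, the same selection can be written with numpy.where or Series.where. This is a different technique from the one in this answer, sketched on a small made-up frame; the names d_alt and d_arith are illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not the asker's random frame).
frame = pd.DataFrame({"b": [-1.0, 0.5, -0.7], "c": [1.3, -1.1, 0.2]})

# np.where(condition, value_if_true, value_if_false), evaluated element-wise.
frame["d"] = np.where(frame["c"] > 0, frame["c"], frame["b"])

# Series.where keeps values where the condition holds and takes the
# replacement (here column b) everywhere else.
d_alt = frame["c"].where(frame["c"] > 0, frame["b"])

# The arithmetic form from this answer, for comparison.
d_arith = frame.b + (frame.c > 0) * (frame.c - frame.b)
```

Both where variants copy values directly rather than computing them, so they avoid the floating-point round-off that adding and subtracting b can introduce in the arithmetic form.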


3

I faced a similar problem, and this is how I create a new column based on conditions on other columns:

df["col3"] = df[["col1", "col2"]].apply(
    lambda x: "return this if first statement is true"
    if (x.col1 == "value1" and x.col2 == "value2")
    else "return this if the statement right below this line is true"
    if (x.col1 == "value1" and x.col2 != "value2")
    else "return this if the below is true"
    if (x.col1 != "value1" and x.col2 == "value2")
    else "return this because none of the above statements were true",
    axis=1
)
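
For several mutually exclusive conditions like these, numpy.select is a vectorized alternative to the chained conditional expression. The sketch below is a different technique from the one in this answer; the sample rows and the result labels are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data covering all four cases (an assumption).
df = pd.DataFrame({
    "col1": ["value1", "value1", "other", "other"],
    "col2": ["value2", "x", "value2", "x"],
})

is_v1 = df["col1"] == "value1"
is_v2 = df["col2"] == "value2"

# Conditions are checked in order; the first one that holds picks the choice.
conditions = [is_v1 & is_v2, is_v1 & ~is_v2, ~is_v1 & is_v2]
choices = ["both match", "only col1 matches", "only col2 matches"]

# `default` is used for rows where no condition holds.
df["col3"] = np.select(conditions, choices, default="neither matches")
```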
