I have a dataframe

import pandas as pd
import numpy as np

data = pd.DataFrame({"col1": [0, 1, 1, 1, 1, 0],
                     "col2": [False, True, False, False, True, False]
                     })

data

I'm trying to create a column col3 that is 1 where col1 == 1 and col2 is True, and 0 otherwise.

Using np.where:

data.assign(col3=np.where(data["col1"]==1 & data["col2"], 1, 0))
   col1   col2  col3
0     0  False     1
1     1   True     1
2     1  False     0
3     1  False     0
4     1   True     1
5     0  False     1

For the first row (index 0), col1 == 0 and col2 is False, but I'm getting col3 as 1.

What am I missing?

The desired output:

   col1   col2  col3
0     0  False     0
1     1   True     1
2     1  False     0
3     1  False     0
4     1   True     1
5     0  False     0

1 Answer

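You are missing parentheses. & has higher precedence than ==, so data["col1"]==1 & data["col2"] is evaluated as data["col1"] == (1 & data["col2"]). Wrap the comparison in parentheses: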

data.assign(col3=np.where((data["col1"]==1) & data["col2"], 1, 0))
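
To make the precedence issue concrete, here is a small sketch that rebuilds the question's dataframe and compares the two groupings. The variable names (unparenthesized, explicit_grouping, intended) are only illustrative and not part of the original code.

```python
import pandas as pd

data = pd.DataFrame({"col1": [0, 1, 1, 1, 1, 0],
                     "col2": [False, True, False, False, True, False]})

# Without parentheses, Python binds & first: col1 == (1 & col2)
unparenthesized = data["col1"] == 1 & data["col2"]
explicit_grouping = data["col1"] == (1 & data["col2"])

# What was actually intended: (col1 == 1) & col2
intended = (data["col1"] == 1) & data["col2"]

# The two spellings above are the same expression
print(unparenthesized.equals(explicit_grouping))

# The intended mask is True only where col1 is 1 and col2 is True
print(intended.tolist())
```

The first print should confirm that the unparenthesized form and the explicitly grouped form are the same expression, which is why rows 0 and 5 (where col1 is 0 and col2 is False) ended up as 1.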

One way to avoid the precedence problem is to use the eq method, which needs no extra parentheses:

data.assign(col3=np.where(data["col1"].eq(1) & data["col2"], 1, 0))

You can also replace numpy.where with astype, since the boolean mask converts directly to 0 and 1:

data.assign(col3=((data["col1"]==1) & data["col2"]).astype(int))

Output:

   col1   col2  col3
0     0  False     0
1     1   True     1
2     1  False     0
3     1  False     0
4     1   True     1
5     0  False     0
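
As a quick sanity check, the following sketch builds the mask once and confirms that the np.where and astype variants give the same col3 values. The names mask, with_where and with_astype are illustrative assumptions, not from the original post.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"col1": [0, 1, 1, 1, 1, 0],
                     "col2": [False, True, False, False, True, False]})

# Build the boolean mask once; eq avoids the precedence pitfall
mask = data["col1"].eq(1) & data["col2"]

# Two ways to turn the mask into a 0/1 column
with_where = data.assign(col3=np.where(mask, 1, 0))
with_astype = data.assign(col3=mask.astype(int))

# Compare values only, since the integer dtype can differ by platform
print((with_where["col3"] == with_astype["col3"]).all())
print(with_astype["col3"].tolist())
```

Computing the mask in its own variable also keeps the condition readable if more terms are added later.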