How to create a dummy variable if missing values are included?

I have the following data and I want to create a dummy variable based on several conditions. My problem is that the missing values are automatically converted to 0, but I want to keep them as missing values.

import pandas as pd
import numpy as np

mydata = {'x' : [10, 50, np.nan, 32, 47, np.nan, 20, 5, 100, 62], 
          'y' : [10, 1, 5,  np.nan, 47, np.nan, 8, 5, 100, 3]}
df = pd.DataFrame(mydata)

df["z"] = ((df["x"] >= 50) & (df["y"] <= 20)).astype(int)

print(df)

1 Answer

When you build the boolean mask, you are comparing NaN values with integers. Ordered comparisons such as >= and <= against NaN always evaluate to False, so wherever df['x'] is np.nan the mask df['x'] >= 50 is False, which becomes 0 when you convert it to an integer. To keep the missing values, create a second boolean mask that is True for every row containing np.nan in the columns ['x', 'y'], and then assign np.nan to z in those rows.
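To make the NaN behaviour concrete, here is a minimal sketch on a small throwaway Series (the values and the names s, mask and as_int are made up for illustration and are not part of the question's data):

```python
import numpy as np
import pandas as pd

# Hypothetical three-element Series with one missing value
s = pd.Series([10.0, np.nan, 62.0])

# The ordered comparison against NaN evaluates to False
mask = s >= 50
print(mask.tolist())

# Casting to int turns that False into 0, which hides the missing value
as_int = mask.astype(int)
print(as_int.tolist())
```

The middle element ends up as 0 in as_int, indistinguishable from a row that genuinely failed the condition, which is why the missing rows have to be restored in a separate step.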

Code:

import pandas as pd
import numpy as np

mydata = {'x' : [10, 50, np.nan, 32, 47, np.nan, 20, 5, 100, 62], 
          'y' : [10, 1, 5,  np.nan, 47, np.nan, 8, 5, 100, 3]}
df = pd.DataFrame(mydata)

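# Build the dummy as float so the column can hold NaN
df["z"] = ((df["x"] >= 50) & (df["y"] <= 20)).astype(float)
# Restore missing values wherever x or y is missing
df.loc[df[["x", "y"]].isna().any(axis=1), "z"] = np.nan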

Output:

    x       y       z
0   10.0    10.0    0.0
1   50.0    1.0     1.0
2   NaN     5.0     NaN
3   32.0    NaN     NaN
4   47.0    47.0    0.0
5   NaN     NaN     NaN
6   20.0    8.0     0.0
7   5.0     5.0     0.0
8   100.0   100.0   0.0
9   62.0    3.0     1.0

Alternatively, if you want a one-liner, you can use nested np.where calls:

df["z"] = np.where(
    df[["x", "y"]].isna().any(axis=1),  # missing in x or y -> keep NaN
    np.nan,
    np.where((df["x"] >= 50) & (df["y"] <= 20), 1, 0),
)
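If you would rather have z stay an integer column instead of becoming float, one possible variation (not part of the original answer) is pandas' nullable "Int64" dtype combined with Series.mask. This is only a sketch; the names condition and has_missing are introduced here for readability:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": [10, 50, np.nan, 32, 47, np.nan, 20, 5, 100, 62],
    "y": [10, 1, 5, np.nan, 47, np.nan, 8, 5, 100, 3],
})

# True where both conditions hold; NaN comparisons come out as False
condition = (df["x"] >= 50) & (df["y"] <= 20)
# True where x or y is missing
has_missing = df[["x", "y"]].isna().any(axis=1)

# "Int64" (capital I) is the nullable integer dtype; mask() replaces
# the flagged rows with <NA> and leaves the rest as 0/1 integers
df["z"] = condition.astype("Int64").mask(has_missing)
print(df)
```

The trade-off is that missing entries are then pd.NA rather than np.nan, which some downstream libraries may handle differently.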

1 Comment

You can also use: df.loc[df['x'].isnull(), 'z'] = np.nan; df.loc[df['y'].isnull(), 'z'] = np.nan
