Given the following data frame:

import numpy as np
import pandas as pd

# Two-row example: COL1 has a missing value in the second row
df = pd.DataFrame({'COL1': ['A', np.nan],
                   'COL2': ['A', 'A']})
df

    COL1    COL2
0   A       A
1   NaN     A

How can I set a cell in COL2 to NaN (that is, make it null) whenever the corresponding cell in COL1 is null (NaN)? In the example above, that means the second cell of COL2.

Desired Result:

    COL1    COL2
0   A       A
1   NaN     NaN

Note: I'm looking for a systematic solution that will work across n rows of COL1 and COL2.

Thanks in advance!

2 Answers

You could do this with boolean indexing: select the rows where COL1 is NaN and assign NaN to COL2 in those rows:

import numpy as np
import pandas as pd

# Same pattern as the question, repeated to 200,000 rows
df = pd.DataFrame({'COL1': ['A', np.nan]*100000,
                   'COL2': ['A', 'A']*100000})

df.loc[df.COL1.isnull(), 'COL2'] = np.nan
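
To confirm that this produces the desired result, here is a minimal sketch on the two-row frame from the question. The variable name `mask` is only illustrative and is not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'COL1': ['A', np.nan],
                   'COL2': ['A', 'A']})

# Boolean mask: True in every row where COL1 is missing
mask = df['COL1'].isnull()

# Assign NaN to COL2 only in the masked rows (modifies df in place)
df.loc[mask, 'COL2'] = np.nan

print(df)
```

Because the mask is computed over the whole column, the same two lines should work for any number of rows.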

I used a larger dataframe so that we can compare timings:

%timeit df.loc[df.COL1.isnull(), 'COL2'] = np.nan
100 loops, best of 3: 5.36 ms per loop

Compare that with the np.where approach from the other answer, which also works well:

%timeit df['COL2'] = np.where(pd.isnull(df['COL1']), np.nan, df['COL2'])
100 loops, best of 3: 10.9 ms per loop
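
The `%timeit` lines above only run inside IPython or Jupyter. As a rough sketch for a plain Python script, the standard `timeit` module can make a similar comparison. The helper names `make_frame`, `with_loc`, and `with_np_where` are hypothetical, and the timings will differ by machine and pandas version, so no expected numbers are shown.

```python
import timeit

import numpy as np
import pandas as pd


def make_frame():
    # Same 200,000-row frame as in the answer
    return pd.DataFrame({'COL1': ['A', np.nan] * 100000,
                         'COL2': ['A', 'A'] * 100000})


def with_loc(df):
    # Boolean-indexing assignment (modifies df in place)
    df.loc[df['COL1'].isnull(), 'COL2'] = np.nan
    return df


def with_np_where(df):
    # Rebuilds the whole COL2 column with np.where
    df['COL2'] = np.where(pd.isnull(df['COL1']), np.nan, df['COL2'])
    return df


# Check that both approaches null out the same rows
df_loc = with_loc(make_frame())
df_where = with_np_where(make_frame())
same_nulls = df_loc['COL2'].isnull().equals(df_where['COL2'].isnull())

# Time 10 repetitions of each approach on an existing frame
t_loc = timeit.timeit(lambda: with_loc(df_loc), number=10)
t_where = timeit.timeit(lambda: with_np_where(df_where), number=10)

print(f"same result: {same_nulls}")
print(f".loc:     {t_loc / 10 * 1000:.2f} ms per run")
print(f"np.where: {t_where / 10 * 1000:.2f} ms per run")
```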

This works:

# Assumes numpy is imported as np and pandas as pd
df['COL2'] = np.where(pd.isnull(df['COL1']), np.nan, df['COL2'])

Is there a preferable way?
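
One further option, not taken from either answer here, is `Series.where`, which keeps values where a condition holds and fills NaN elsewhere by default. A minimal sketch on the frame from the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'COL1': ['A', np.nan],
                   'COL2': ['A', 'A']})

# Keep COL2 where COL1 is present; rows where COL1 is null become NaN
df['COL2'] = df['COL2'].where(df['COL1'].notnull())

print(df)
```

This avoids the round trip through a NumPy array that `np.where` involves, though whether it is preferable is a matter of taste.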
