
I'm playing around with pandas and I'm trying to fill the NaN values in some columns with 0 (while leaving the other columns untouched).

Here's what I'm trying:

variablesToCovertToZero = ['column1', 'column2'] #just a list of columns
print('before ', df.isna().sum().sum()) #show me how many nulls
# df = df.update(df[variablesToCovertToZero].fillna(0, inplace=True)) #try 1, didn't work
df[variablesToCovertToZero].fillna(0, inplace=True) #try 2, also didn't work
print('after ', df.isna().sum().sum())

Results when I run it:

before  11056930
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py:4259: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  **kwargs
after  11056930

The before and after counts are the same! I'm also getting a warning. In the past this warning wasn't a problem, but I thought I'd include it in case it's related.

Any suggestions on what I'm doing wrong? I just want to use fillna on a specific list of columns.

  • Have you seen this post? Commented Jul 5, 2020 at 16:57
  • @BalajiAmbresh I did, but I wasn't sure whether it was connected or just a warning. Is the warning what causes the NaNs not to be filled? Commented Jul 5, 2020 at 17:00
  • @Lostsoul I think the problem is using inplace=True on a subset of the dataframe. If you do df[variablesToCovertToZero] = df[variablesToCovertToZero].fillna(0) without inplace, it works well. Otherwise, if you want to fill NaNs in only some columns and still use inplace, you can do df.fillna({col:0 for col in variablesToCovertToZero }, inplace=True) Commented Jul 5, 2020 at 17:21
  • @Ben.T Worked like a charm. Can you post that as an answer so I can accept it? Commented Jul 5, 2020 at 17:27

1 Answer

The problem is using inplace=True on a subset of the dataframe. df[variablesToCovertToZero] returns a copy, so fillna modifies that copy rather than df; this is what raises the warning and leaves the NaNs unfilled. If you do:

df[variablesToCovertToZero] = df[variablesToCovertToZero].fillna(0)

without inplace, it works well. Otherwise, if you want to fill NaNs in only some columns and still use inplace, you can pass a dictionary mapping each column to the value you want to fill it with:

df.fillna({col:0 for col in variablesToCovertToZero }, inplace=True)
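
To see both approaches side by side, here is a minimal sketch on a small made-up frame. The data and the extra column3 are assumptions for illustration only; the list name mirrors the one in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only column1 and column2 should be filled.
df = pd.DataFrame({
    "column1": [1.0, np.nan, 3.0],
    "column2": [np.nan, 5.0, np.nan],
    "column3": [np.nan, 8.0, 9.0],
})
variablesToCovertToZero = ["column1", "column2"]

# Approach 1: fill the subset (a copy) and assign the result back.
filled_by_assignment = df.copy()
filled_by_assignment[variablesToCovertToZero] = (
    filled_by_assignment[variablesToCovertToZero].fillna(0)
)

# Approach 2: call fillna on the whole frame with a column -> value dict,
# so inplace=True acts on the frame itself and not on a temporary copy.
filled_by_dict = df.copy()
filled_by_dict.fillna({col: 0 for col in variablesToCovertToZero}, inplace=True)

# NaN counts per column after filling; column3 is left untouched.
print(filled_by_assignment.isna().sum())
print(filled_by_dict.isna().sum())
```

Both variants should leave the NaN in column3 alone while zero-filling the listed columns; which one to prefer is mostly a matter of taste.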