
Some columns in my data set have missing values that are represented as None (NoneType, not a string). Other missing values are represented as 'N/A' or 'No'. I want to handle these missing values with the statement below.

df.loc[df.col1.isin('None', 'Yes', 'No'), col1] = 'N/A'

My problem is that None is a value, not a string, so I cannot match it with 'None'. I have read somewhere that the None value can be converted to the string 'None'.

Can anyone give me a clue on how to go about it?

Note 1:

For clarity, if I run the code below:

df.col1.unique()

I get this output:

array([None, 'No', 'Yes'], dtype=object)

Note 2:

I know I can handle missing or None values with isnull(), but in this case I need to use the .isin() method.

Sample dataframe:

import pandas as pd
f = {'name': ['john', 'tom', None, 'rock', 'dick'], 'DoB': [None, '01/02/2012', '11/22/2014', '11/22/2014', '09/25/2016'], 'Address': ['NY', 'NJ', 'PA', 'NY', None]}
df1 = pd.DataFrame(data=f)

When you run the code below, you will see None as a value.

df1.Address.unique()
output: array(['NY', 'NJ', 'PA', None], dtype=object)

I want the None to be displayed as the string 'None'.

  • Can you give an input dataframe and your expected output? I tried answering your question, but am not sure what you actually need. Commented Apr 26, 2018 at 13:35
  • Updated with sample data frame at the bottom of my post. Commented Apr 26, 2018 at 13:49
  • Could you provide more context on why you want to do this? The export methods (e.g., df.to_csv) have an na_rep argument that can change all of the null/missing data to any string you want. Commented Apr 26, 2018 at 14:08

2 Answers

1

There is a difference between a null/None and the string 'None'. So you can change your original statement to

df.loc[df.col1.isin([None, 'Yes', 'No']), 'col1'] = 'N/A'

That is, remove the quotes around None, pass the values to isin as a list, and quote the column label 'col1'.

Alternatively, you can first select the rows where the column is null/None and set them to the string 'None'. After that, a filter that looks for the string 'None' will match them.

df.loc[df["col1"].isnull(), "col1"] = 'None'
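The second approach above (convert missing values to the string 'None' first, then filter on strings) could be sketched roughly as follows. The frame and the extra 'Maybe' value are made up for illustration, and fillna is used here as one possible way to do the conversion; it is not necessarily what the answer had in mind.

```python
import pandas as pd

# Hypothetical sample data modelled on the col1 values shown in the question.
df = pd.DataFrame({"col1": [None, "No", "Yes", "Maybe"]})

# Step 1: turn real missing values into the literal string 'None'.
df["col1"] = df["col1"].fillna("None")

# Step 2: a purely string-based isin() filter now matches every target row.
df.loc[df["col1"].isin(["None", "Yes", "No"]), "col1"] = "N/A"

result = df["col1"].tolist()
print(result)
```

Only the rows whose value is in the list are overwritten, so the illustrative 'Maybe' row is left as it is.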

0

Create an example df:

df = pd.DataFrame({"A": [None, 'Yes', 'No', 1, 3, 5]})

which looks like:

      A
0  None
1   Yes
2    No
3     1
4     3
5     5

Replace the string 'None' with the value None, and pass the values to be replaced as a list (isin expects a list-like argument):

df.loc[df.A.isin([None, 'Yes', 'No']), 'A'] = 'N/A'

after which df looks like:

     A
0  N/A
1  N/A
2  N/A
3    1
4    3
5    5
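For the last part of the question (having None show up as the string 'None' in unique()), a minimal sketch using the question's own sample frame might look like this. It assumes that filling the missing entries with fillna is acceptable for the use case; the variable names address and values are introduced here only for illustration.

```python
import pandas as pd

# Sample frame copied from the question.
f = {
    "name": ["john", "tom", None, "rock", "dick"],
    "DoB": [None, "01/02/2012", "11/22/2014", "11/22/2014", "09/25/2016"],
    "Address": ["NY", "NJ", "PA", "NY", None],
}
df1 = pd.DataFrame(data=f)

# Replace missing entries in one column with the string 'None'.
# This returns a new Series; df1 itself is left unchanged.
address = df1["Address"].fillna("None")

# unique() keeps values in order of first appearance.
values = address.unique().tolist()
print(values)
```

Assigning the result back with df1["Address"] = address would make the change permanent, at the cost of no longer being able to detect those rows with isnull().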
