I have a data frame in which all missing values are denoted with ?. I need the count of ? in each column.

A method that I tried was:

mydata.replace('?','')
mydata.isnull().sum()

that returns:

A1     0
A2     0
A3     0
A4     0
A5     0
A6     0
...
A16    0
dtype: int64

which should not be the case because there are ? in the CSV file that I got my data from.

1 Answer

Compare all values with ? and count the occurrences by summing the True values in each column:

out = (mydata == '?').sum()

Equivalently, with the eq method:

out = mydata.eq('?').sum()
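To make the comparison approach concrete, here is a minimal sketch on a toy frame. The column names and values are made up for illustration and are not the asker's data; both forms should give the same per-column counts.

```python
import pandas as pd

# Hypothetical toy data; names and values are assumptions for illustration only.
mydata = pd.DataFrame({
    "A1": ["a", "?", "b", "?"],
    "A2": ["1", "2", "?", "4"],
    "A3": ["x", "y", "z", "w"],
})

# Element-wise comparison yields a boolean frame; sum() counts True per column.
counts_operator = (mydata == "?").sum()
counts_method = mydata.eq("?").sum()

print(counts_operator)
```

Note that this only matches cells that are exactly the string ?, so values with surrounding whitespace such as ' ?' would not be counted.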

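Your attempt returns zeros for two reasons: replace returns a new DataFrame instead of modifying mydata in place, and an empty string is not treated as missing by isnull. Replace ? with NaN instead and chain the calls together: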

import numpy as np
out = mydata.replace('?', np.nan).isnull().sum()
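The difference between the original attempt and the chained version can be checked on a small frame. This is only a sketch with made-up data; the variable names unchanged, empty_strings and fixed are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the asker's file.
mydata = pd.DataFrame({"A1": ["a", "?", "?"], "A2": ["1", "2", "?"]})

# replace() returns a new frame; the result is discarded here,
# so mydata itself still contains the '?' strings.
mydata.replace("?", "")
unchanged = mydata.isnull().sum()

# Even when the result is used, an empty string is not a missing value.
empty_strings = mydata.replace("?", "").isnull().sum()

# Replacing with NaN and chaining isnull().sum() counts the former '?' cells.
fixed = mydata.replace("?", np.nan).isnull().sum()

print(fixed)
```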

It is also possible to convert ? to missing values while reading the file, using the na_values='?' parameter of read_csv:

mydata = pd.read_csv(file, na_values='?')

out = mydata.isnull().sum()
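As a self-contained illustration of the na_values approach, the sketch below reads from an in-memory string instead of a real file. The CSV content is invented for the example.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = "A1,A2,A3\na,1,x\n?,2,y\nb,?,z\n"

# na_values='?' tells read_csv to parse every '?' cell as NaN.
mydata = pd.read_csv(io.StringIO(csv_text), na_values="?")

# With '?' already converted, isnull().sum() gives the per-column count.
out = mydata.isnull().sum()

print(out)
```

One side effect of handling this at read time is that a column containing only numbers and ? is parsed as numeric rather than as strings.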