0
main_df[main_df.isnull()].count()

result:

number_project           0
average_montly_hours     0
time_spend_company       0
Work_accident            0
left                     0
promotion_last_5years    0
department               0
salary                   0
satisfaction_level       0
last_evaluation          0
dtype: int64

However, when I used the any() method, it reported null values in some of my columns:

main_df.isnull().any()

results:

number_project           False
average_montly_hours     False
time_spend_company       False
Work_accident            False
left                     False
promotion_last_5years    False
department               False
salary                   False
satisfaction_level        True
last_evaluation           True
dtype: bool

Why do the two checks disagree?

By the way, I also tried sum(), and the result was 0.0 as well. Then I ran:

main_df[main_df['employee_id'] == 3794]

The result is:

18  3794    2   160 3   1   1   1   sales   low NaN NaN

However, when I filtered by column name:

main_df[main_df['satisfaction_level'] == np.nan]

There is no output at all!

3
  • Change main_df[main_df['satisfaction_level'] == np.nan] to main_df[main_df['satisfaction_level'].isna()]. Anything == np.nan always returns False. Even `np.nan == np.nan` returns False, so it is not a usable check. Commented Aug 4, 2020 at 21:08
  • You also need to verify that you don't have string 'NaN' vs a true np.nan. Commented Aug 4, 2020 at 21:10
  • Oh, got it. But what is wrong with main_df[main_df.isnull()].count() if I want to check the whole DataFrame? Commented Aug 4, 2020 at 21:19
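
To illustrate the first two comments, here is a minimal sketch on made-up data (the Series `scores` and `as_text` and their values are assumptions, not the asker's data). It shows that `== np.nan` never matches, that `isna()` does, and that a literal string 'NaN' is not treated as a missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical float column with one real missing value
scores = pd.Series([0.38, np.nan, 0.80])

eq_mask = scores == np.nan   # NaN never compares equal, not even to itself
na_mask = scores.isna()      # the supported way to find missing values

# Hypothetical text column holding the literal string 'NaN'
as_text = pd.Series(["0.38", "NaN", "0.80"])
text_na = as_text.isna()     # a string is a value, so nothing is missing here

print(eq_mask.tolist())
print(na_mask.tolist())
print(text_na.tolist())
```

If `isna()` finds nothing in a column that visibly contains "NaN", checking the column's dtype is a reasonable next step.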

2 Answers

3

You can try:

main_df.isna().sum()

info() will also tell you whether there are NA values, because it lists the non-null count for each column:

main_df.info()
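
As for why main_df[main_df.isnull()].count() reports 0 everywhere: indexing a DataFrame with a boolean DataFrame keeps the cells where the mask is True and replaces every other cell with NaN. Here the kept cells are exactly the ones that were already NaN, so the result is all NaN, and count() only counts non-null cells. Calling sum() on that same all-NaN result gives 0.0, which is probably the 0.0 seen in the question. A small sketch on made-up data (the frame `df` and its values are assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a single missing value
df = pd.DataFrame({
    "number_project": [2, 5, 7],
    "satisfaction_level": [0.38, np.nan, 0.11],
})

masked = df[df.isnull()]     # same as df.where(df.isnull()): every cell ends up NaN
wrong = masked.count()       # count() counts non-null cells, so 0 per column
right = df.isnull().sum()    # True counts as 1, so this is the number of missing cells

print(masked)
print(wrong)
print(right)
```

Summing the boolean mask itself, rather than indexing with it, is what gives the per-column number of missing values.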

1 Comment

Yeah, I also tried sum(), and the result was 0.0 for every column: employee_id 0.0 number_project 0.0 average_montly_hours 0.0 time_spend_company 0.0 Work_accident 0.0 left 0.0 promotion_last_5years 0.0 department 0.0 salary 0.0 satisfaction_level 0.0 last_evaluation 0.0
0

print(main_df.isnull().sum()[main_df.isnull().sum()>0])
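
To unpack the one-liner: main_df.isnull().sum() gives the number of missing cells per column, and the boolean filter keeps only the columns where that number is positive. A sketch on made-up data that computes the counts once instead of twice (the frame `df` and the names `null_counts` and `with_nulls` are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one clean column and two with missing values
df = pd.DataFrame({
    "salary": ["low", "medium", "high"],
    "satisfaction_level": [0.38, np.nan, 0.11],
    "last_evaluation": [np.nan, np.nan, 0.88],
})

null_counts = df.isnull().sum()            # missing cells per column
with_nulls = null_counts[null_counts > 0]  # keep only columns that have any

print(with_nulls)
```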

2 Comments

Please avoid code-only answers and provide an explanation.
Answer needs supporting information: your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct.
