How to calculate the mean of a pandas DataFrame with NaN values

Question

I have a DataFrame that looks as follows (the address key is the index):

address  date1  date2  date3  date4  date5  date6  date7
<email>  NaN    NaN    NaN    1      NaN    NaN    NaN

I want to calculate the mean across a row, but when I use DataFrame.mean(axis=1), I get NaN (in the above example, I want a mean of 1). I get NaN even when I use DataFrame.mean(axis=1, skipna=True, numeric_only=True). How can I get the correct mean for the rows in this DataFrame?

EdChum · Accepted Answer · 2016-06-22 15:33:54Z


Despite appearances, your dtypes are not numeric, hence the NaN values. You need to cast the type using astype:

df['date4'] = df['date4'].astype(int)

After that it will work. Depending on how you loaded or created this data, this is something you should correct at that stage if possible, rather than as a post-processing step.
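As a rough sketch of the conversion step, the snippet below builds a small frame whose values are stored as objects and converts every column before taking the row mean. The sample data and the index label "address_1" are made up for illustration. It uses pd.to_numeric with errors="coerce" rather than astype(int), which is a swapped-in alternative and not the call from the answer above: an integer dtype cannot hold NaN, so astype(int) fails on columns that still contain missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the question: one row, values stored as objects.
df = pd.DataFrame(
    {
        "date1": [np.nan],
        "date2": [np.nan],
        "date3": [np.nan],
        "date4": ["1"],  # the number arrived as a string
        "date5": [np.nan],
    },
    index=["address_1"],
    dtype=object,
)

# Convert each column to a numeric dtype; unparseable values become NaN.
numeric_df = df.apply(pd.to_numeric, errors="coerce")

# NaN values are skipped by default (skipna=True), so only the 1 contributes.
row_means = numeric_df.mean(axis=1)
print(row_means)
```

If the data comes from a file, doing the same conversion at load time keeps the fix in one place, as the answer suggests.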

You can confirm what the dtypes are by looking at the output from df.info(). You can also filter out non-numeric columns using select_dtypes: df.select_dtypes(include=[np.number]) selects just the numeric columns.
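A minimal sketch of that check, assuming a made-up frame with one float column, one text column, and one integer column (the column contents and index labels are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with mixed column types.
df = pd.DataFrame(
    {
        "date1": [np.nan, 2.0],  # float column with a missing value
        "date2": ["1", "x"],     # text column, not numeric
        "date3": [3, 6],         # integer column
    },
    index=["a", "b"],
)

# Inspect the dtypes; df.info() prints the same information with more detail.
print(df.dtypes)

# Keep only the numeric columns, then average across each row.
numeric_cols = df.select_dtypes(include=[np.number])
row_means = numeric_cols.mean(axis=1)
print(row_means)
```

Note that select_dtypes silently drops the text column here, so values such as "1" in date2 do not contribute to the mean unless they are converted first.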

