I have a DataFrame that looks as follows (the address key is the index):

address  date1  date2  date3  date4  date5  date6  date7
<email>  NaN    NaN    NaN    1      NaN    NaN    NaN

I want to calculate the mean across a row, but when I use DataFrame.mean(axis=1), I get NaN (in the above example, I want a mean of 1). I get NaN even when I use DataFrame.mean(axis=1, skipna=True, numeric_only=True). How can I get the correct mean for the rows in this DataFrame?

1 Answer

Despite appearances, your dtypes are not numeric, hence the NaN values. You need to cast the columns to a numeric type using astype:

df['date4'] = df['date4'].astype(float)  # float rather than int, since an int column cannot hold NaN

After that, the mean will work. Depending on how you loaded or created this data, it is better to correct the dtypes at that stage rather than as a post-processing step, if possible.

You can confirm what the dtypes are by looking at the output of df.info(). You can also filter out non-numeric columns using select_dtypes: df.select_dtypes(include=[np.number]) selects just the numeric columns.
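As a rough end-to-end sketch of the above: the frame below is a hypothetical reconstruction of the one in the question, assuming the values were stored as object dtype (for example, the 1 read in as the string "1"), and the address user@example.com is a placeholder. It uses pd.to_numeric with errors="coerce" as one way to convert every column at once instead of calling astype per column; whether that suits your data depends on how it was loaded.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's frame: one row, seven
# object-dtype columns. The address below is a placeholder.
df = pd.DataFrame(
    {f"date{i}": [np.nan] for i in range(1, 8)},
    index=pd.Index(["user@example.com"], name="address"),
    dtype=object,
)
df.loc["user@example.com", "date4"] = "1"  # assumed: the value is stored as text

# Inspect the dtypes: every column is object, so none counts as numeric.
print(df.dtypes)
n_numeric_before = df.select_dtypes(include=[np.number]).shape[1]

# Convert each column to a numeric dtype; values that cannot be parsed become NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Row-wise mean; NaN values are skipped by default (skipna=True).
row_means = numeric.mean(axis=1)
print(row_means)
```

With the columns converted, the row mean for the example row is computed from the single non-missing value.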
