I have two arrays (A and B) whose elements are either numeric values or NaNs.

To calculate the average, I add the two arrays and divide by two.

A:

array([         nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan, 109.93013333, 121.27613333,
       131.6136    , 142.32926667, 148.2544    , 156.32266667,
       160.3568    , 164.39093333, 168.6772    , 165.2734    ,
       165.77766667, 163.0042    , 164.8952    , 157.83546667,
       145.48093333, 162.89614286, 163.13026667, 151.53213333,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan])

B:

array([         nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
       127.39813333, 141.14986667, 152.5664    , 160.99906667,
       169.04253333, 173.45346667, 179.29146667, 179.55093333,
       180.1996    , 178.51306667, 182.40506667, 173.06426667,
       158.27466667, 163.0748    , 140.76066667, 120.00333333,
        82.5104    ,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan])

avg:

array([         nan,          nan,          nan,          nan,
                    nan,          nan,          nan,          nan,
                    nan,          nan,          nan,          nan,
           129.50586667, 141.73956667, 150.4104    , 158.66086667,
           164.69966667, 168.9222    , 173.98433333, 172.41216667,
           172.98863333, 170.75863333, 173.65013333, 165.44986667,
           151.8778    , 162.98547143, 151.94546667, 135.76773333,
                    nan,          nan,          nan,          nan,
                    nan,          nan,          nan,          nan,
                    nan,          nan,          nan,          nan,
                    nan,          nan,          nan,          nan,
                    nan])

Apparently, the average is only calculated at indices where both arrays hold non-NaN values.
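A minimal sketch of why this happens, using small stand-in arrays rather than the data above: any arithmetic with NaN yields NaN, so `(A + B) / 2` is only a number where both inputs are numbers. The values here are made up for illustration.

```python
import numpy as np

# Small stand-in arrays (illustrative values, not the data from the question).
A = np.array([np.nan, 10.0, 20.0, np.nan])
B = np.array([np.nan, np.nan, 40.0, 30.0])

# A NaN in either operand propagates through the sum and the division.
avg = (A + B) / 2

# Mask of positions where both arrays hold a real value.
both_valid = ~np.isnan(A) & ~np.isnan(B)

print(avg)         # only index 2 is a number
print(both_valid)  # True only at index 2
```

The positions where `avg` is a number are exactly those where `both_valid` is True.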

But how can I also take into account values that are present in only A or only B?

1 Answer

You have two options:

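  1. Use numpy.nan_to_num. This converts each np.nan to zero before summing, so a position holding nan and 20 gives (0 + 20)/2 = 10.
  2. Use numpy.nanmean((A, B), axis=0). This skips np.nan values and averages only the values that are present, so a position holding nan and 20 gives 20/1 = 20. (With this approach NumPy emits a RuntimeWarning at positions where both values are nan, i.e. (nan + nan)/2; the result there is nan.)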
# 1
>>> (np.nan_to_num(A)+np.nan_to_num(B))/2
array([  0.        ,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,  54.96506667,  60.63806666,
       129.50586666, 141.73956667, 150.4104    , 158.66086667,
       164.69966666, 168.9222    , 173.98433334, 172.41216667,
       172.98863334, 170.75863334, 173.65013333, 165.44986667,
       151.8778    , 162.98547143, 151.94546667, 135.76773333,
        41.2552    ,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,   0.        ,   0.        ,
         0.        ])

# 2
>>> np.nanmean((A,B), axis=0)
array([         nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan, 109.93013333, 121.27613333,
       129.50586666, 141.73956667, 150.4104    , 158.66086667,
       164.69966666, 168.9222    , 173.98433334, 172.41216667,
       172.98863334, 170.75863334, 173.65013333, 165.44986667,
       151.8778    , 162.98547143, 151.94546667, 135.76773333,
        82.5104    ,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan,          nan,          nan,          nan,
                nan])
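If the warning from the second option is unwanted, one possible way to silence it is Python's standard `warnings` module. The sketch below uses small stand-in arrays (not the data from the question) and compares both options side by side; whether suppressing the warning is appropriate depends on your use case.

```python
import warnings
import numpy as np

# Small stand-in arrays (illustrative values, not the data from the question).
A = np.array([np.nan, 10.0, 20.0, np.nan])
B = np.array([np.nan, np.nan, 40.0, 30.0])

# Option 1: NaN is treated as 0, so lone values are halved.
zero_filled = (np.nan_to_num(A) + np.nan_to_num(B)) / 2

# Option 2: NaN is skipped, so lone values are kept as they are.
# nanmean warns ("Mean of empty slice") where both inputs are NaN;
# the warning is ignored only inside this block.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    nan_skipping = np.nanmean((A, B), axis=0)

print(zero_filled)   # lone values are halved, all-NaN positions become 0
print(nan_skipping)  # lone values are kept, all-NaN positions stay NaN
```

Note that the first option cannot distinguish "no data" from a genuine 0, while the second keeps NaN where neither array has a value.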

3 Comments

@Karl, it is not an error, only a warning, because at some positions you have (nan + nan)/2.
Sorry, I had a comment regarding error propagation. I calculate the following: avg_err = np.sqrt((A_err)**2 + (B_err)**2)/2. With nan_to_num, this would give only half the error wherever just one of the two values is present: avg_err = np.sqrt(np.nan_to_num(A_err)**2 + np.nan_to_num(B_err)**2)/2
@Karl, no problem, I'll edit the answer and add more details.
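Regarding the error propagation raised in the comments, one possible approach (a sketch, not something given in the answer) is to divide by the number of values actually present instead of a fixed 2. This assumes that A_err and B_err are NaN at the same positions as A and B, and that the two uncertainties are independent. The arrays below are made-up stand-ins.

```python
import numpy as np

# Stand-in uncertainties (illustrative values), NaN where the value is missing.
A_err = np.array([np.nan, 2.0, 3.0, np.nan])
B_err = np.array([np.nan, np.nan, 4.0, 6.0])

errs = np.stack((A_err, B_err))

# Number of non-NaN entries per position: 0, 1, or 2.
n_valid = np.sum(~np.isnan(errs), axis=0)

# Sum of squared errors, ignoring NaN (all-NaN positions sum to 0).
sq_sum = np.nansum(errs**2, axis=0)

# sqrt(sum of squares) / n where n > 0; positions with no data stay NaN.
avg_err = np.full(n_valid.shape, np.nan)
np.divide(np.sqrt(sq_sum), n_valid, out=avg_err, where=n_valid > 0)

print(avg_err)  # [nan 2.  2.5 6. ]
```

Where both values are present this reduces to np.sqrt(A_err**2 + B_err**2)/2, and where only one is present the error of that single value is kept unchanged, matching what np.nanmean does to the values themselves.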
