I'm trying to create a frequency-of-occurrence map from a 3D array with dimensions (time, lat, lon). I should end up with a 2D lat/lon array of frequencies. The code below outlines my approach. I run into problems at step d, where I convert the inverted boolean array mask to numerical values. I accidentally found a way to do it (np.mean), but I don't know why it works. I can't see why np.mean turned the booleans into floats but then didn't actually calculate the mean along the requested axis. I had to apply np.mean again to get the desired result. I feel there must be a right way to convert a boolean array to floats or integers. Also, if you can think of a better way to accomplish the task, fire away. My numpy mojo is weak and this was the only approach I could come up with.

import numpy as np

# test 3D array in time, lat, lon; values are percents
# real array is size=(30,721,1440)

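# randint's upper bound is exclusive, so 101 keeps 100 as a possible value
# (np.random.random_integers is deprecated)
a = np.random.randint(0, 101, size=(3,4,5))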
print(a)

# Exclude all data outside the interval 0 - 20 (first quintile)
# Repeat for 21-40, 41-60, 61-80, 81-100

b = np.ma.masked_outside(a, 0, 20)
print("\n\nMasked array:  ")
print(b)

# Because mask is false where data within quintile, need to invert

c = [~b.mask] 
print("\n\nInverted mask:  ")
print(c)

# Accidental way to turn True/False to 1./0., but that's what I want

d = np.mean(c, axis = 0)  
print("\n\nWhy does this work? How should I be doing it?")
print(d)

# This is the mean I want.  Gives desired end result

e = np.mean(d, axis = 0)
print("\n\nFrequency Map")
print(e)

How do I convert the boolean values in my (inverted) array mask to numerical (1 and 0)?

1 Answer

It "works" because your c isn't what you think it is:

>>> c
[array([[[False, False, False, False, False],
        [False, False, False, False,  True],
        [False, False, False, False, False],
        [False, False, False, False, False]],

       [[False, False, False, False, False],
        [False, False, False, False,  True],
        [False, False, False,  True, False],
        [False, False, False, False,  True]],

       [[False, False, False, False, False],
        [False, False, False, False, False],
        [False,  True, False, False, False],
        [ True, False,  True,  True, False]]], dtype=bool)]
>>> type(c)
<type 'list'>

It's not an array, it's a list containing an array. So when you take

d = np.mean(c, axis = 0)  

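you're taking the mean of a list of one element. np.mean first converts that list to an array of shape (1, 3, 4, 5), and averaging over axis 0 collapses the length-1 axis, so you simply get the original array back (but converted to float, because that's what mean does, and float(True) == 1.0).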
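To see this concretely, here is a small sketch. It uses a hand-written boolean array as a stand-in for your inverted mask (the names mask, wrapped and as_array are mine, not from your code) and checks the shapes before and after the mean:

```python
import numpy as np

# Hand-written stand-in for the inverted mask, shape (time, lat, lon) = (3, 2, 2)
mask = np.array([[[True, False], [False, False]],
                 [[True, True], [False, False]],
                 [[False, True], [False, True]]])

wrapped = [mask]                 # a list holding one array, like c in the question
as_array = np.asarray(wrapped)   # roughly what np.mean works with internally
print(as_array.shape)            # (1, 3, 2, 2)

d = np.mean(wrapped, axis=0)     # averages over the length-1 axis only
print(d.shape, d.dtype)          # (3, 2, 2) float64
print(np.array_equal(d, mask.astype(float)))  # True
```

So the first np.mean only strips the extra leading axis and changes the dtype; the time axis is untouched until the second call.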

Instead, lose the unneeded brackets:

>>> c = ~b.mask
>>> output = c.mean(axis=0)
>>> output
array([[ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.66666667],
       [ 0.        ,  0.33333333,  0.        ,  0.33333333,  0.        ],
       [ 0.33333333,  0.        ,  0.33333333,  0.33333333,  0.33333333]])
>>> np.allclose(output, e)
True
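Since you asked about other ways to do the whole task: one possible sketch, assuming plain comparisons are acceptable in place of the masked array, builds the boolean array directly and averages it over time for each quintile. The names bounds and freq are made up for illustration, and np.random.default_rng assumes a reasonably recent NumPy (1.17 or later).

```python
import numpy as np

# Test data in (time, lat, lon); integers() has an exclusive upper bound
rng = np.random.default_rng(0)
a = rng.integers(0, 101, size=(3, 4, 5))

# Inclusive interval limits for the five quintiles
bounds = [(0, 20), (21, 40), (41, 60), (61, 80), (81, 100)]

# For each quintile: True where the value is inside, then mean over time
freq = np.stack([((a >= lo) & (a <= hi)).mean(axis=0) for lo, hi in bounds])

print(freq.shape)   # (5, 4, 5): one lat/lon frequency map per quintile
print(freq[0])      # frequency map for the 0-20 interval
```

Because every value falls in exactly one interval, the five maps add up to 1 at each lat/lon point, which makes a handy sanity check.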

BTW, the canonical way to convert from bool to float or int is using astype, e.g. c.astype(float) or c.astype(int), but to be honest sometimes I'm lazy and simply write c + 0.0 or c + 0. You didn't hear that from me, though.
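A quick sketch of those conversions on a tiny hand-written boolean array (the variable names are only for illustration):

```python
import numpy as np

c = np.array([[True, False], [False, True]])

as_float = c.astype(float)   # explicit conversion to floating point
as_int = c.astype(int)       # explicit conversion to integers
lazy_float = c + 0.0         # arithmetic promotes bool to float
lazy_int = c + 0             # arithmetic promotes bool to int

print(as_float.dtype)                       # float64
print(np.array_equal(as_float, lazy_float)) # True
print(np.array_equal(as_int, lazy_int))     # True
```

astype states the intent outright, which is why it is usually the better choice in code other people will read.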
