3

Let's say I have an array of 100 random numbers called random_array. I need to create a new array that stores the average of each consecutive group of x numbers in random_array.

So if x = 7, my code should find the average of the first 7 numbers and store it in the new array, then do the same for the next 7, then the next 7, and so on.

I currently have this, but I'm wondering how I can vectorize it or use some built-in Python method:

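import numpy as np

random_array = np.random.randint(100, size=(100, 1))
count = 0
total = 0
new_array = []
for item in random_array:
    # add every item to the running total before checking the group size
    total = total + item
    count = count + 1
    if count == 7:
        # a full group of 7 has been summed: store its mean and reset
        new_array.append(total / 7)
        count = 0
        total = 0
print(new_array)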
1
  • So the elements at indices 98 and 99 (the last two elements) are dropped, since they can't form a group of 7? Commented Jan 23, 2017 at 21:11

4 Answers

4

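Here's the standard trick: pad the array with NaN to a multiple of the chunk size, reshape it so that each chunk is a row, and take the NaN-aware mean of each row.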

import numpy as np

def down_sample(x, f=7):
    # pad to a multiple of f, so we can reshape
    # use nan for padding, so we needn't worry about the denominator in
    # the last chunk
    xp = np.r_[x, np.full(-len(x) % f, np.nan)]
    # reshape, so each chunk gets its own row, and then take the mean
    return np.nanmean(xp.reshape(-1, f), axis=-1)
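To see what the padding does, here is a small usage sketch. The 10-element input and the chunk size of 4 are made up for illustration and are not from the question; the function is repeated so the snippet runs on its own, and it assumes a 1-D input.

```python
import numpy as np

def down_sample(x, f=7):
    # same idea as above: pad with NaN to a multiple of f, reshape, nanmean
    xp = np.r_[x, np.full(-len(x) % f, np.nan)]
    return np.nanmean(xp.reshape(-1, f), axis=-1)

x = np.arange(10)            # illustrative input: 0, 1, ..., 9
means = down_sample(x, f=4)  # chunks: [0 1 2 3], [4 5 6 7], [8 9 nan nan]
print(means)                 # the last mean only covers the two real values
```

Because np.nanmean ignores the NaN padding, the last chunk is averaged over its real elements only, rather than being dropped.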

2

Here's an approach with ID-based summing/averaging using np.bincount -

ids = np.arange(len(random_array))//7
out = np.bincount(ids,random_array)/np.bincount(ids)
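As a rough illustration of why this works: np.bincount(ids) counts how many elements fall into each group, and np.bincount(ids, weights) sums the weights per group, so their ratio is the per-group mean. The small array and the group size of 3 below are assumptions chosen for readability; note that np.bincount expects 1-D input.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8])  # illustrative 1-D input
ids = np.arange(len(a)) // 3             # group labels: [0 0 0 1 1 1 2 2]
sums = np.bincount(ids, a)               # per-group sums
counts = np.bincount(ids)                # per-group sizes
out = sums / counts                      # per-group means
print(out)
```

The last group holds only two elements, and it is divided by its own count, so no padding or truncation is needed.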

Sample run -

In [140]: random_array
Out[140]: 
array([89, 66, 29, 25, 36, 25, 30, 58, 64, 19, 25, 63, 76, 74, 44, 73, 94,
       88, 83, 88, 17, 91, 69, 65, 32, 73, 91, 20, 20, 14, 52, 65, 21, 58,
       14, 30, 26, 82, 61, 87, 24, 67, 83, 93, 57, 30, 81, 48, 84, 83, 59,
       19, 95, 55, 86, 57, 59, 77, 92, 44, 40, 29, 37, 42, 33, 89, 37, 57,
       18, 17, 85, 47, 19, 95, 96, 40, 13, 64, 18, 79, 95, 26, 31, 70, 35,
       65, 52, 93, 46, 63, 86, 77, 87, 48, 88, 62, 68, 82, 49, 86])

In [141]: ids = np.arange(len(random_array))//7

In [142]: np.bincount(ids,random_array)/np.bincount(ids)
Out[142]: 
array([ 42.85714286,  54.14285714,  69.57142857,  63.        ,
        34.85714286,  53.85714286,  68.        ,  64.85714286,
        54.        ,  41.85714286,  56.42857143,  54.71428571,
        62.85714286,  73.14285714,  67.5       ])

In [143]: random_array[:7].mean()    # Verify output[0]
Out[143]: 42.857142857142854

In [144]: random_array[7:14].mean()  # Verify output[1]
Out[144]: 54.142857142857146

In [145]: random_array[98:].mean()   # Verify output[-1]
Out[145]: 67.5

For performance, we can replace np.bincount(ids,random_array), which computes the per-group sums, with an alternative using np.add.reduceat -

np.add.reduceat(random_array,range(0,len(random_array),7))
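The reduceat call only yields the per-group sums, so it still has to be divided by the group sizes. A minimal sketch of the full computation, assuming a 1-D input; the small array, the variable names, and the group size of 3 are illustrative:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8])       # illustrative 1-D input
n = 3
starts = np.arange(0, len(a), n)              # start index of each group
sums = np.add.reduceat(a, starts)             # sums of a[0:3], a[3:6], a[6:]
counts = np.diff(np.append(starts, len(a)))   # size of each group
out = sums / counts                           # per-group means
print(out)
```

Here the counts are derived from the start indices, which is one way to avoid building the ids array at all.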


0

You could do this:

import numpy as np

random_array = np.random.randint(100, size=(100, 1))
n = 7
new_vector = []
# step through the array n elements at a time; the final slice may be shorter
for start in range(0, len(random_array), n):
    new_vector.append(random_array[start:start + n].mean())

It returns a vector of means. The last entry is the mean of whatever was left over (the last group doesn't necessarily have n elements).

Hope it helps.


0

You could do this:

res = np.average(np.reshape(random_array, (-1, 7)), axis=1)

... provided that the input array's size is a multiple of 7. If this is not guaranteed, you could chop off the remainder first (note that resize modifies the array in place):

random_array.resize(random_array.size // 7 * 7)
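One caveat: ndarray.resize works in place and can raise a ValueError when the array is referenced elsewhere, for example by a view. A sketch of a non-mutating alternative is to slice off the remainder instead. It assumes a 1-D input, and the names n, usable, and res are illustrative:

```python
import numpy as np

random_array = np.random.randint(100, size=100)  # 1-D input (an assumption)
n = 7
usable = random_array.size // n * n               # largest multiple of n that fits
# slicing leaves random_array untouched; the trailing remainder is ignored
res = random_array[:usable].reshape(-1, n).mean(axis=1)
print(res.shape)
```

As with the resize approach, the leftover elements (here the last two) do not contribute to any average.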
