I have an array of varying size, and I would like to average each consecutive group of n numbers to build a new array from the results.

I have come up with two different ways, but each has its problems, and I am not sure whether either is the best way to solve this:

  1. Using numpy.array_split() function:

    import numpy as np
    no_splits = 3 #Or any number user defines
    no_items = int(np.random.random(1)*100) # To get a variable number of items
    pre_array = np.random.random(no_items)
    mean_array = np.mean(np.array_split(pre_array,no_splits)) 
    #This is efficient but gives an error if len(pre_array)%no_splits != 0
    
  2. enumerate(pre_array) alternative:

    mean_array = [np.mean(pre_array[i-no_splits+1:i]) for i, x in enumerate(pre_array) if i%no_splits == 0 and i != 0] 
    

This is fine but clips the last values if len(pre_array)%no_splits != 0. Ideally, I would create a last value that is the mean of the remaining ones whilst keeping the code compact.

Each of these works for my purposes, but I am not sure whether they are the most efficient for larger arrays.

Thank you in advance!

  • Do you mean you want to average 0 1 2 and 1 2 3 ... or 0 1 2 and 3 4 5 ... from 0 1 2 3 4...? Commented Aug 30, 2013 at 17:39
  • Hi @Ophion I need the latter, your posted answer if I understood and tried it correctly uses a moving window average method with length of the array being the same as the original array. I would want the length to equal len(new_array) = original_array/no_splits rounded up. Commented Aug 31, 2013 at 9:49
  • I have updated my answer, please let me know if this is what you are looking for. Commented Aug 31, 2013 at 14:10
  • Yes it's great thanks a lot! I was hoping for a one-liner but making a new function works as well. Thanks for help! Commented Sep 1, 2013 at 13:20
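For readers who also want the compact form mentioned in the last comment, here is a minimal sketch of a list-comprehension variant. It assumes `n` is a positive integer; the name `chunk_means` is hypothetical and does not appear in the original question or answers. Slicing past the end of the array simply returns the shorter remainder, so the final mean covers whatever items are left.

```python
import numpy as np


def chunk_means(values, n):
    """Mean of each consecutive block of n items; the last block may be shorter."""
    values = np.asarray(values, dtype=float)
    # Step through the array n items at a time; the final slice is
    # automatically truncated when len(values) is not a multiple of n.
    return np.array([values[i:i + n].mean() for i in range(0, len(values), n)])


result = chunk_means([0, 1, 2, 3, 4, 5, 6], 3)
print(result)
```

This loops in Python once per block, so it is likely to be slower than a reshape-based approach on very large arrays, but it needs no special handling for the remainder.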

2 Answers

4

Use uniform_filter:

>>> import numpy as np
>>> from scipy import ndimage

>>> a = np.arange(5, dtype=np.double)
>>> ndimage.uniform_filter(a, size=3)
array([ 0.33333333,  1.        ,  2.        ,  3.        ,  3.66666667])

# What this is actually doing (the default mode='reflect' repeats the edge value)
>>> np.mean([0,0,1]) #ind0
0.33333333333333331
>>> np.mean([0,1,2]) #ind1
1.0
>>> np.mean([1,2,3]) #ind2
2.0

Can be used with any size window.

>>> ndimage.uniform_filter(a, size=5)
array([ 0.8,  1.2,  2. ,  2.8,  3.2])

The caveat here is that the result takes the dtype of the input array, so an integer array yields integer averages. Convert to a float dtype first if you need fractional means.
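The dtype caveat can be checked with a short sketch. It assumes SciPy is installed; the variable names are illustrative only. The same data is filtered once as integers and once as floats so the two results can be compared side by side.

```python
import numpy as np
from scipy import ndimage

a_int = np.arange(5)                  # integer dtype
a_float = a_int.astype(np.float64)    # same values, float dtype

# By default the output array is created with the dtype of the input.
int_result = ndimage.uniform_filter(a_int, size=3)
float_result = ndimage.uniform_filter(a_float, size=3)

print(int_result.dtype, int_result)
print(float_result.dtype, float_result)
```

Passing `output=np.float64` to `uniform_filter` is another way to request a float result without converting the input first.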


Group by three then take mean:

def stride_mean(arr,stride):
    extra = arr.shape[0]%stride
    if extra==0:
        return np.mean(arr.reshape(-1,stride),axis=1)
    else:
        toslice = arr.shape[0]-extra
        first = np.mean(arr[:toslice].reshape(-1,stride),axis=1)
        rest = np.mean(arr[toslice:])
        return np.hstack((first,rest))

print(pre_array)
[ 0.50712539  0.75062019  0.78681352  0.35659332]

print(stride_mean(pre_array,3))
[ 0.6815197   0.35659332]
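If the branch on the remainder feels heavy, one possible alternative is `np.add.reduceat`, which sums each block in a single call. This is only a sketch, not part of the answer above: the name `reduceat_means` is hypothetical, and it assumes a one-dimensional input and a positive integer block size.

```python
import numpy as np


def reduceat_means(arr, n):
    """Block means via np.add.reduceat; the last block may be shorter than n."""
    arr = np.asarray(arr, dtype=float)
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Start index of every block: 0, n, 2n, ...
    starts = np.arange(0, arr.shape[0], n)
    # Sum of arr[starts[k]:starts[k+1]]; the final block runs to the end.
    sums = np.add.reduceat(arr, starts)
    # Number of items in each block, including the shorter trailing one.
    counts = np.diff(np.append(starts, arr.shape[0]))
    return sums / counts


print(reduceat_means([0.50712539, 0.75062019, 0.78681352, 0.35659332], 3))
```

Dividing the block sums by the block sizes gives the same means as the reshape approach, with the trailing partial block handled by the same code path.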

1 Comment

The second part seems to be what the OP is asking for.
1
no_splits = 3
no_items = 100
a = np.random.rand(no_items)

no_bins = no_splits + no_items % no_splits
b = np.empty((no_bins,), dtype=a.dtype)
endpoint = no_items//no_splits

b[:no_splits] = np.mean(a[:endpoint*no_splits].reshape(-1, endpoint),
                       axis=-1)
b[no_splits:] = np.mean(a[endpoint*no_splits:])
>>> b
array([ 0.49898723,  0.49457975,  0.45601632,  0.5316093 ])
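Note that this answer treats `no_splits` as the number of bins rather than the size of each bin. If that is the intended reading, a shorter sketch is to take the mean of each piece returned by `np.array_split`, which accepts lengths that do not divide evenly. The name `split_means` is hypothetical.

```python
import numpy as np


def split_means(arr, no_splits):
    """Mean of each of no_splits nearly equal pieces of arr."""
    # array_split allows uneven pieces, so no remainder handling is needed.
    chunks = np.array_split(np.asarray(arr, dtype=float), no_splits)
    # Average each piece separately; np.mean on the ragged list would fail.
    return np.array([chunk.mean() for chunk in chunks])


print(split_means(np.arange(7), 3))
```

With 7 items and 3 splits the pieces have lengths 3, 2 and 2, so this spreads the remainder across the leading pieces instead of creating an extra bin. If `no_splits` exceeds the array length, some pieces are empty and their mean is NaN.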

1 Comment

I am not sure this gives me what I wanted - if there are 100 items len(b) = 34 with the last point being average of one item. In array = [0, 1, 2, 3, 4, 5, 6] I want new array to be [mean[0 1 2], mean [3 4 5], mean[6]]. Thanks for helping out!
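The behaviour described in this comment can be reproduced with a small sketch that pads the array with NaN up to a multiple of the block size and then uses `np.nanmean`. The name `nan_padded_means` is hypothetical, and the sketch assumes the input itself contains no NaN values, since those would be ignored as well.

```python
import numpy as np


def nan_padded_means(arr, n):
    """Block means of size n; the trailing partial block is averaged on its own."""
    arr = np.asarray(arr, dtype=float)
    # Number of NaN fillers needed to reach a multiple of n (0 if already even).
    pad = (-arr.shape[0]) % n
    padded = np.pad(arr, (0, pad), mode="constant", constant_values=np.nan)
    # Each row is one block; nanmean skips the NaN fillers in the last row.
    return np.nanmean(padded.reshape(-1, n), axis=1)


print(nan_padded_means([0, 1, 2, 3, 4, 5, 6], 3))
```

For the 7-item example the rows are [0, 1, 2], [3, 4, 5] and [6, nan, nan], so the last value is the mean of the single remaining item.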
