11

I have a numpy array with floats.

What I would like (if it does not already exist) is a function that returns a new array containing the average of every x points in the given array, like subsampling (the opposite of interpolation?).

E.g. sub_sample(numpy.array([1, 2, 3, 4, 5, 6]), 2) gives [1.5, 3.5, 5.5]

Leftovers can be dropped, e.g. sub_sample(numpy.array([1, 2, 3, 4, 5]), 2) gives [1.5, 3.5]

Thanks in advance.

1

3 Answers

32

Using NumPy routines you could try something like

import numpy

x = numpy.array([1, 2, 3, 4, 5, 6])

numpy.mean(x.reshape(-1, 2), 1) # Returns array([ 1.5,  3.5,  5.5])

and just replace the 2 in the reshape call with the number of items you want to average over.
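To see why this works, here is a minimal sketch of the two steps taken separately. The variable names pairs and means are only for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])

# reshape(-1, 2) lays the data out as rows of 2 items;
# the -1 lets NumPy infer the number of rows (3 here).
pairs = x.reshape(-1, 2)
# pairs is [[1, 2], [3, 4], [5, 6]]

# Averaging along axis 1 collapses each row to its mean.
means = pairs.mean(axis=1)
print(means)  # [1.5 3.5 5.5]
```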

Edit: This assumes that n divides into the length of x. You'll need to include some checks if you are going to turn this into a general function. Perhaps something like this:

def average(arr, n):
    # Trim the array to the largest length divisible by n, dropping leftovers.
    end = n * (len(arr) // n)
    return numpy.mean(arr[:end].reshape(-1, n), 1)

This function in action:

>>> x = numpy.array([1, 2, 3, 4, 5, 6])
>>> average(x, 2)
array([ 1.5,  3.5,  5.5])

>>> x = numpy.array([1, 2, 3, 4, 5, 6, 7])
>>> average(x, 2)
array([ 1.5,  3.5,  5.5])
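
The same trim-and-reshape idea can probably be extended to a single axis of a multidimensional array. The following is only a sketch under that assumption: average_along_axis is a hypothetical helper name, not a NumPy function, and it drops leftovers along the chosen axis just like average does.

```python
import numpy as np

def average_along_axis(arr, n, axis=0):
    """Average every n consecutive entries of arr along `axis`, dropping leftovers.

    Hypothetical helper for illustration; not part of NumPy.
    """
    arr = np.asarray(arr)
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Move the target axis to the front so it can be trimmed and split easily.
    moved = np.moveaxis(arr, axis, 0)
    groups = moved.shape[0] // n
    trimmed = moved[:groups * n]
    # Split the leading axis into (groups, n) and average over the n entries.
    blocks = trimmed.reshape(groups, n, *moved.shape[1:])
    # Put the downsampled axis back where it came from.
    return np.moveaxis(blocks.mean(axis=1), 0, axis)

a = np.arange(32, dtype=float).reshape(8, 4)
print(average_along_axis(a, 2).shape)          # (4, 4)
print(average_along_axis(a, 2, axis=1).shape)  # (8, 2)
```

For a 1-D input this behaves like average above, e.g. average_along_axis([1, 2, 3, 4, 5, 6, 7], 2) gives the three means 1.5, 3.5 and 5.5.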

5 Comments

This one works fine, except when the window size (2 in the example above) does not evenly divide the length of the array, but I can make sure it does. Thanks!
Thanks, yes, that was exactly what I was thinking about too.
Is there an easy way to generalize this to downsampling a single axis of a multidimensional array, e.g. averaging an array of shape [8, 4] down to [4, 4]?
Could you provide a solution where I could enter a floating-point downsampling rate, e.g. 2.7?
@maniac If you have a question that isn't answered here, please post a new question, rather than commenting on an existing answer.
3
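def subsample(data, sample_size):
    # Group the data into tuples of sample_size items; zip drops an incomplete trailing group.
    samples = list(zip(*[iter(data)] * sample_size))
    # Average each group (a list, so it prints directly under Python 3).
    return [sum(x) / float(len(x)) for x in samples]

l = [1, 2, 3, 4, 5, 6]

print(subsample(l, 2))
print(subsample(l, 3))
print(subsample(l, 5))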

Gives:

[1.5, 3.5, 5.5]
[2.0, 5.0]
[3.0]

1 Comment

Thank you, I will try it. However, I hope there is a NumPy function, because those tend to be around 10 times faster than most similar Python functions.
-1

This is also a one-line solution that works:

downsampled_a = [a[i:n+i].mean() for i in range(0,size(a),n)]

"a" is the vector with your data and "n" is your sampling step.

PS: from numpy import *
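
Note that the comprehension above also averages a trailing partial chunk when n does not divide the length of a. If leftovers should be dropped instead, one possible variant (a sketch, assuming a is a 1-D NumPy array and using an explicit import rather than the star import) is:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5], dtype=float)
n = 2

# Stop the range early so that only complete chunks of n items are averaged.
downsampled_a = [float(a[i:i + n].mean()) for i in range(0, len(a) - n + 1, n)]
print(downsampled_a)  # [1.5, 3.5]
```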

3 Comments

It returns [1.5, 3.5, 5.0], not [1.5, 3.5] as desired by the OP. Also, use np.size() instead of importing everything from numpy.
The above one-liner returns exactly what was asked: [1.5, 3.5, 5.5], not [1.5, 3.5, 5.0]. The leftover can of course be removed (see the examples in the original question).
numpy.size() can be avoided; len() is enough... (^_^)/*
