I'm trying to write a function that can randomly sample a numpy.ndarray that has floating point numbers while preserving the distribution of the numbers in the array. I have this function for now:

import random
from collections import Counter

def sample(A, N):
    population = np.zeros(sum(A))
    counter = 0
    for i, x in enumerate(A):
        for j in range(x):
            population[counter] = i
            counter += 1

    sampling = population[np.random.choice(0, len(population), N)]
    return np.histogram(sampling, bins = np.arange(len(A)+1))[0]

I would like the function to work something like this (this example doesn't account for the distribution):

a = np.array([1.94, 5.68, 2.77, 7.39, 2.51])
new_a = sample(a,3)

new_a
array([1.94, 2.77, 7.39])

However, when I apply the function to an array like this I'm getting:

TypeError                                 Traceback (most recent call last)
<ipython-input-74-07e3aa976da4> in <module>
----> 1 sample(a, 3)

<ipython-input-63-2d69398e2a22> in sample(A, N)
      3 
      4 def sample(A, N):
----> 5     population = np.zeros(sum(A))
      6     counter = 0
      7     for i, x in enumerate(A):

TypeError: 'numpy.float64' object cannot be interpreted as an integer

Any help with modifying this function, or creating one that works for this, would be really appreciated!

  • The argument for np.zeros is supposed to be a shape: an integer, or a tuple of integers, e.g. np.zeros((2, 3)). Your a/A is an array of floats, so the sum will also be a float. Commented Jun 13, 2019 at 1:12
  • So how do I fix this? I tried changing the argument to np.zeros(sum(A.shape)), but I still get the same error. Commented Jun 13, 2019 at 16:11

1 Answer

In [67]: a = np.array([1.94, 5.68, 2.77, 7.39, 2.51])                                                  
In [68]: np.zeros(sum(a))                                                                              
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-68-263779bc977b> in <module>
----> 1 np.zeros(sum(a))

TypeError: 'numpy.float64' object cannot be interpreted as an integer

Calling sum on the shape does not produce this error:

In [69]: np.zeros(sum(a.shape))                                                                        
Out[69]: array([0., 0., 0., 0., 0.])

But you shouldn't need to use sum:

In [70]: a.shape                                                                                       
Out[70]: (5,)
In [71]: np.zeros(a.shape)                                                                             
Out[71]: array([0., 0., 0., 0., 0.])

In fact, if a is 2d and you want a 1d array with the same number of items, you want the product of the shape, not the sum.
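To illustrate the sum versus product point, here is a small sketch with a made-up 2d array `b` (not from the question). `np.prod(b.shape)` and `b.size` both give the item count, while `sum(b.shape)` does not:

```python
import numpy as np

# Hypothetical 2d array with 2 * 3 = 6 items
b = np.arange(6, dtype=float).reshape(2, 3)

flat_sum = np.zeros(sum(b.shape))       # 2 + 3 = 5 items, one too few
flat_prod = np.zeros(np.prod(b.shape))  # 2 * 3 = 6 items
flat_size = np.zeros(b.size)            # b.size is the same item count

print(flat_sum.shape, flat_prod.shape, flat_size.shape)
```

For a 1d array like `a`, the sum and the product of the shape happen to coincide, which is why `sum(a.shape)` appears to work.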

But do you want to return an array exactly the same size as A? I thought you were trying to downsize.
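If downsizing is the goal, one possible approach is to draw N elements uniformly at random without replacement. This is only a sketch under that assumption, not a fix of the original function. The name `sample_values` and the `seed` parameter are made up for illustration. A uniform draw keeps the distribution of values only in expectation, so a small sample can still look different from the full array.

```python
import numpy as np

def sample_values(a, n, seed=None):
    """Return n distinct elements of the 1d array a, in their original order.

    Hypothetical helper: assumes "sampling" means a uniform draw
    without replacement.
    """
    rng = np.random.default_rng(seed)
    # Pick n distinct positions, then sort them so the output keeps
    # the order the values had in a
    idx = np.sort(rng.choice(len(a), size=n, replace=False))
    return a[idx]

a = np.array([1.94, 5.68, 2.77, 7.39, 2.51])
new_a = sample_values(a, 3, seed=0)
print(new_a)  # three of the five values, in their original order
```

Here no array of zeros is needed at all, since indexing `a` with the chosen positions builds the result directly.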
