
I have a numpy array as follows:

array = np.random.randint(6, size=(50, 400))

Each value in this array is the cluster that the entry belongs to; each row is a sample and each column is a feature. I would like to create an array with 5 columns holding the frequency of each cluster within each sample (that is, within each row of this matrix).

However, I want the frequency calculation to ignore 0, so that the frequencies of the remaining values (1-5) add up to 1 in each row.

Essentially, what I want is an array in which each column is a cluster (1-5 in this case) and each row still corresponds to a single sample.

How can this be done?

Edit:

small input:

input = np.random.randint(6, size=(2, 5))

array([[0, 4, 2, 3, 0],
       [5, 5, 2, 5, 3]])

output:

1     2     3     4     5

0    .33   .33   .33    0
0    .2    .2     0    .6

Here 1-5 are the column names, and the two rows below them are the desired output as a numpy array.

  • Can you show an example small input and desired output? Commented May 28, 2018 at 13:37
  • When you say "5 dimensional array" do you mean an array of shape (5,)? Commented May 28, 2018 at 13:41
  • I just added an example input and output. I hope that helps. Commented May 28, 2018 at 13:46

1 Answer


This is a simple application of bincount. Does this do what you want?

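import numpy as np

def freqs(x):
    # Count occurrences of 0..5 in one row, then drop the count for cluster 0.
    counts = np.bincount(x, minlength=6)[1:]
    # Normalize so the frequencies of clusters 1-5 sum to 1.
    return counts / counts.sum()

frequencies = np.apply_along_axis(freqs, axis=1, arr=array)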
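To see how this behaves on the small input from the question, here is a minimal sketch. The variable names `small` and `result` are only illustrative and do not come from the question.

```python
import numpy as np

def freqs(x):
    # Count occurrences of 0..5, then drop the count for cluster 0.
    counts = np.bincount(x, minlength=6)[1:]
    return counts / counts.sum()

# The small example input from the question (hypothetical variable name).
small = np.array([[0, 4, 2, 3, 0],
                  [5, 5, 2, 5, 3]])

# One row of five frequencies per sample.
result = np.apply_along_axis(freqs, axis=1, arr=small)
print(result.round(2))
```

The rows of `result` should match the desired output shown in the question. Note that a row containing only zeros would make `counts.sum()` equal to 0, so the division would produce NaN for that row; whether that matters depends on your data.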

If you were wondering about the speed implications of apply_along_axis, this alternative, which uses broadcasting to compare every entry against every cluster label, is marginally slower in my tests:

values = np.arange(1, 6)  # cluster labels 1-5; 0 is left out (assumed definition)
counts = (array[:, :, None] == values[None, None, :]).sum(axis=1)
frequencies2 = counts/counts.sum(axis=1)[:, None]
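As a rough check that the two approaches agree, the sketch below runs both on a seeded random array. It assumes `values` holds the cluster labels 1-5; the seed and the use of `np.random.default_rng` are illustrative choices, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
array = rng.integers(6, size=(50, 400))  # values in 0..5, as in the question

def freqs(x):
    # Per-row counts of clusters 1-5, normalized to sum to 1.
    counts = np.bincount(x, minlength=6)[1:]
    return counts / counts.sum()

frequencies = np.apply_along_axis(freqs, axis=1, arr=array)

# Broadcasting version: compare each entry with each label, count per row.
values = np.arange(1, 6)  # assumed: cluster labels 1-5, with 0 excluded
counts = (array[:, :, None] == values[None, None, :]).sum(axis=1)
frequencies2 = counts / counts.sum(axis=1)[:, None]

same = np.allclose(frequencies, frequencies2)
print(frequencies.shape, same)
```

The broadcasting version builds a temporary boolean array of shape (samples, features, clusters), so its memory use grows with the number of cluster labels.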

2 Comments

shouldn't it be axis=1 and no .T?
@filippo indeed. Thanks.
