Splitting values in an array 'logarithmically' / based on another array

Question

I have a 2D array, where each element is a Fourier transform. I'd like to split each transform 'logarithmically'. For example, let's take a single one of those arrays and call it a:

import numpy as np

a = np.arange(0, 512)

# I want to split a into 'bins' defined by b, below:
b = np.array([0] + [10 * 2**i for i in range(6)]) # [0, 10, 20, 40, 80, 160, 320]

What I'm looking to do is something like np.split, except that I would like to split the values into 'bins' defined by array b: all values of a in [0, 10) go in one bin, all values in [10, 20) in another, and so on.

I could do this in some sort of convoluted for loop:

split_arr = []
for i in range(1, len(b)):
    fbin = []
    for amp in a:
        if (amp >= b[i-1]) and (amp < b[i]):
            fbin.append(amp)
    split_arr.append(fbin)

I have many arrays to split, and also this is ugly (just my opinion). Is there a better way?

Ehsan · Accepted Answer · 2020-08-11 01:04:32Z

5

Here is how you can do it, using np.split:

np.split(a, np.searchsorted(a,b))

If your array a is not sorted, sort it before the above command:

a = np.sort(a)

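np.searchsorted returns the indices at which the values of b would have to be inserted into the sorted array a to keep it sorted. In other words, it finds the positions where you want to split your array. np.split returns one more piece than there are split points, so any values of a at or above the last entry of b end up in a final extra piece. If you do not want the empty array at the beginning, simply remove 0 from b.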
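As a rough illustration of the explanation above, here is a minimal sketch using the a from the question and the bin edges with the leading 0 dropped. The names edges, idx and bins are only illustrative, not part of the original answer.

```python
import numpy as np

a = np.arange(0, 512)                              # already sorted
edges = np.array([10 * 2**i for i in range(6)])    # [10, 20, 40, 80, 160, 320]

# Index in `a` at which each edge would be inserted to keep `a` sorted.
idx = np.searchsorted(a, edges)

# np.split cuts `a` at those indices, giving len(idx) + 1 pieces.
bins = np.split(a, idx)

for piece in bins:
    # first value, last value, and size of each piece
    print(piece[0], piece[-1], len(piece))

# With the leading 0 kept, the first split point is index 0,
# so the first piece comes out empty.
with_zero = np.split(a, np.searchsorted(a, np.r_[0, edges]))
print(len(with_zero[0]))
```

Because a here is simply 0..511, each split index equals the edge value itself. The last piece holds everything from 320 upward.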

edited Aug 11, 2020 at 1:04

answered Aug 11, 2020 at 0:50

Ehsan


4 Comments

rocksNwaves Over a year ago

Just timed this. Blazing fast and so concise. I'm glad I waited a few minutes to check out answers. I'm looking up np.searchsorted in the docs now, as I'd like to understand it more. Thank you.

Ehsan Over a year ago

@rocksNwaves You are welcome. I added another line for more explanation. Hope it helps. Feel free to accept the answer if it solves your problem. Thank you.

Julien Over a year ago

So this probably assumes a is sorted in the first place... which is why it's so fast. If a is not sorted then you need to factor in the cost of sorting it. Still probably the most efficient method, especially for big arrays.

Ehsan Over a year ago

@Julien Yes, thank you for the note. Added to the post.

Julien · Answer · 2020-08-11 00:35:12Z

1

First, you can reduce the 'ugliness' by using a list comprehension:

split_arr = [[amp for amp in a if (amp >= b[i-1]) and (amp < b[i])] for i in range(1, len(b))]

Then you can apply the same logic using NumPy's fast vectorized boolean masking (which has the bonus of looking even cleaner):

split_arr = [a[(a >= b[i-1]) & (a < b[i])] for i in range(1, len(b))]

Comparison:

%timeit [[amp for amp in a if (amp >= b[i-1]) and (amp < b[i])] for i in range(1, len(b))]
1.29 ms ± 109 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit [a[(a >= b[i-1]) & (a < b[i])] for i in range(1, len(b))]
35.9 µs ± 4.52 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
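The masking approach does not need a to be sorted, whereas the np.split approach does. The sketch below checks that the two agree on which values land in each bin. It assumes a hypothetical shuffled input and an arbitrary seed, and it iterates over pairs of edges with zip, which is equivalent to the index-based loop above.

```python
import numpy as np

rng = np.random.default_rng(0)     # arbitrary seed, only for reproducibility
a = rng.permutation(512)           # hypothetical unsorted input: 0..511 shuffled
b = np.array([0] + [10 * 2**i for i in range(6)])  # [0, 10, 20, 40, 80, 160, 320]

# Boolean masks: work on unsorted data and keep the original element order.
masked = [a[(a >= lo) & (a < hi)] for lo, hi in zip(b[:-1], b[1:])]

# Sort-then-split: drop the (empty) piece before b[0] and the piece
# holding values at or above b[-1], which the masks never select.
a_sorted = np.sort(a)
split = np.split(a_sorted, np.searchsorted(a_sorted, b))[1:-1]

# Each bin holds the same values either way; only the order differs.
same = all(np.array_equal(np.sort(m), s) for m, s in zip(masked, split))
print(same)
```

Note that with these edges, values of 320 and above fall outside every bin in the masking approach, while np.split returns them as an extra final piece.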

answered Aug 11, 2020 at 0:35

Julien


1 Comment

Julien Over a year ago

I would really love to know the reason for the downvote...
