Fold out a numpy array along a new dimension using values as index

Question

I have an [m,m] numpy array with elements in {0, 1, 2, ..., 24}, and I want to separate each number along a third dimension to get an [m,m,24] array.

A simple example: a [5,5] array with elements in {0, 1, 2, 3}.

Now I need to get a `[5,5,3]` array:

[[0 0 1 0 0
  0 0 0 0 1
  0 0 0 1 0
  0 0 1 0 0
  1 0 0 0 1]
 [0 0 0 0 0
  2 0 0 0 0
  0 2 0 0 0
  0 0 0 0 0
  0 0 2 0 0]
 [0 0 0 0 0
  0 0 3 0 0
  0 0 3 0 0
  0 0 0 0 0
  0 0 0 0 0]]

Currently I have a simple method, but it is computationally very expensive, and I need to do this operation frequently.

img = np.expand_dims(img, axis=2)
for i in range(24):
    img_norm[..., i] = (img[..., 0] == (i + np.ones(shape=img[..., 0].shape)))

For 64 arrays of size [224,224] with elements in {0, 1, 2, ..., 24}, the code above takes about 5 s.

Is there a faster way to do it?

What happened to the 0s in your example? You only seem to match for 1, 2 and 3.
– 9769953, Commented Jan 10, 2019 at 10:26
Not sure why this question is getting downvotes. It's just fine. Want a test dataset? Make one in a one-liner! See answers so far for inspiration.
– Jean-François Corbett, Commented Jan 10, 2019 at 10:37
@Jean-FrançoisCorbett I don't see that stated in the question: it explicitly mentions elements {0, 1, 2, 3} in the example, and similar for the actual data. It'd be good if the OP clarifies that.
– 9769953, Commented Jan 10, 2019 at 10:43

Jean-François Corbett · Accepted Answer · 2019-01-10 10:33:57Z

3

The following is pretty speedy for me:

import numpy as np
max_num = 3
img = np.array([
    [0,0,1,0,0],
    [2,0,3,0,1],
    [0,2,3,1,0],
    [0,0,1,0,0],
    [1,0,2,0,1],
    ])

# One layer per value on the first axis, matching the indexing below
img_norm = np.zeros((max_num,) + img.shape)
for idx in range(1, max_num + 1):
    img_norm[idx-1,:,:]=idx*(img == idx)

Testing it with a random array of your specified size:

max_num = 24
img = np.int64((max_num+1)*np.random.rand(224, 224)) # Random array

img_norm = np.zeros((max_num,) + img.shape)
for idx in range(1, max_num + 1):
    img_norm[idx-1,:,:]=img*(img == idx)

Hardly takes any time at all on my machine.

def getnorm_acdr(img):
    max_num = np.max(img)
    img_norm = np.zeros([max_num, *img.shape])
    for idx in range(1, max_num + 1):
        img_norm[idx-1,:,:]=img*(img == idx)
    return img_norm

img = np.int64((max_num+1)*np.random.rand(224, 224))

%timeit getnorm_acdr(img)

Gives:

11.9 ms ± 536 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
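The per-value loop above can also be replaced by one broadcast comparison. The following is only a sketch, not part of the original answer: the function name `fold_out_broadcast` and its `max_num` parameter are made up for illustration. It assumes an integer array with values in 0..max_num and keeps the same layout as above (one layer per value on the first axis, with the value itself as the entry).

```python
import numpy as np

def fold_out_broadcast(img, max_num):
    """Return an array of shape (max_num, *img.shape); layer k-1 holds k where img == k."""
    # Hypothetical helper: labels has shape (max_num, 1, 1) so it broadcasts against img.
    labels = np.arange(1, max_num + 1).reshape(-1, 1, 1)
    # One comparison covers every pixel and every label; multiplying by the
    # label turns the boolean mask into the label value itself.
    return labels * (img == labels)

img = np.array([
    [0, 0, 1, 0, 0],
    [2, 0, 3, 0, 1],
    [0, 2, 3, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 2, 0, 1],
])
out = fold_out_broadcast(img, 3)
print(out.shape)  # (3, 5, 5)
```

Since each pixel lands in at most one layer, summing the layers along the first axis gives back the input, which is a convenient sanity check. Whether this beats the loop depends on array size and memory, so it is worth timing on your own data.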

edited Jan 10, 2019 at 10:33

Jean-François Corbett


answered Jan 10, 2019 at 10:14

acdr


Jean-François Corbett · Accepted Answer · 2019-01-10 13:02:35Z

1

Definitely more elegant: use np.ndenumerate().

for (i,j), val in np.ndenumerate(img):
    img_norm[val-1,i,j] = val

Looks like this should be faster than yours because it is O(N^2) rather than O(N^3). Let's try it out on an array with size and content as you describe:

def getnorm_ndenumerate(img):
    img_norm = np.zeros([np.max(img), *img.shape])
    for (i,j), val in np.ndenumerate(img):
        img_norm[val-1,i,j] = val  
    return img_norm

b = np.int64(25*np.random.rand(224, 224)) 

%timeit getnorm_ndenumerate(b)

gives

47.8 ms ± 1.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

It is indeed faster than yours. But the elegance comes at a price, because it is slower than acdr's method.
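The same "use the value as an index" idea can probably be kept without the Python-level loop by using integer array indexing. This is a sketch under assumptions, not something measured here: `fold_out_scatter` and `max_num` are hypothetical names, and the input is assumed to be an integer array with values in 0..max_num.

```python
import numpy as np

def fold_out_scatter(img, max_num):
    """Vectorized counterpart of the ndenumerate loop: a single indexed assignment."""
    img_norm = np.zeros((max_num,) + img.shape, dtype=img.dtype)
    rows, cols = np.nonzero(img)             # positions of all non-zero values
    vals = img[rows, cols]                   # the values at those positions
    img_norm[vals - 1, rows, cols] = vals    # layer index is value - 1
    return img_norm

img = np.array([
    [0, 0, 1, 0, 0],
    [2, 0, 3, 0, 1],
    [0, 2, 3, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 2, 0, 1],
])
out = fold_out_scatter(img, 3)
print(out[1])  # the layer for value 2
```

Skipping the zeros with `np.nonzero` also avoids the `val - 1 == -1` index that the loop version hits for background pixels (harmless there, since it only writes a 0 into the last layer).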

edited Jan 10, 2019 at 13:02

answered Jan 10, 2019 at 10:16

Jean-François Corbett


Lee David · Accepted Answer · 2019-01-10 13:01:46Z

I made a mistake: in the output array, all the non-zero entries should be 1. Sorry for my silly mistake.

Thanks for all your help. I tested the three methods above: the code from Jean-François Corbett, the code from acdr + Jean-François Corbett, and mine. It turns out that the method from acdr + Jean-François Corbett is the fastest.

Here is my testing code

def test_time():
    def func1(img, max_num):
        w, h = img.shape
        img_norm = np.zeros([w, h, max_num], np.float32)
        for (i, j), val in np.ndenumerate(img):
            # img_norm[i, j, val - 1] = val
            img_norm[i, j, val - 1] = 0 if val == 0 else 1
        return img_norm

    def func2(img, max_num):
        w, h = img.shape
        img_norm = np.zeros([w, h, max_num], np.float32)
        for idx in range(1, max_num + 1):
            # img_norm[:, :, idx - 1] = idx*(img == idx)
            img_norm[:, :, idx - 1] = (img == idx)
        return img_norm

    def func3(img, max_num):
        w, h = img.shape
        img_norm = np.zeros([w, h, max_num], np.float32)
        for idx in range(max_num):
            # img_norm[:, :, idx] = (idx+1) * (img[:, :, 0] == (idx + np.ones(shape=img[:, :, 0].shape)))
            img_norm[:, :, idx] = (img == (idx + np.ones(shape=img.shape)))
        return img_norm

    import cv2
    img_tmp = cv2.imread('dat.png', cv2.IMREAD_UNCHANGED)
    img_tmp = np.asarray(img_tmp, np.int64)  # np.int was removed in NumPy 1.24

    # img_tmp = np.array([
    #     [0, 0, 1, 0, 0],
    #     [2, 0, 3, 0, 1],
    #     [0, 2, 3, 1, 0],
    #     [0, 0, 1, 0, 0],
    #     [1, 0, 2, 0, 1],
    # ])

    img_bkp = np.array(img_tmp, copy=True)
    print(img_bkp.shape)
    import time
    cnt = 100
    maxnum = 24
    start_time = time.time()
    for i in range(cnt):
        _ = func1(img_tmp, maxnum)
    print('1 total time =', time.time() - start_time)

    start_time = time.time()
    for i in range(cnt):
        _ = func2(img_tmp, maxnum)
    print('2 total time =', time.time() - start_time)

    start_time = time.time()
    for i in range(cnt):
        _ = func3(img_tmp, maxnum)
    print('3 total time =', time.time() - start_time)

    print((img_tmp == img_bkp).all())
    img1 = func1(img_tmp, maxnum)
    img2 = func2(img_tmp, maxnum)
    img3 = func3(img_tmp, maxnum)
    print(img1.shape, img2.shape, img3.shape)
    print((img1 == img2).all())
    print((img2 == img3).all())
    print((img1 == img3).all())
    # print(type(img1[0, 0, 0]), type(img2[0, 0, 0]), type(img3[0, 0, 0]))
    # print('img1\n', img1[:, :, 2])
    # print('img3\n', img3[:, :, 2])

The output is

    (224, 224)
    1 total time = 4.738261938095093
    2 total time = 0.7725710868835449
    3 total time = 1.5980615615844727
    True
    (224, 224, 24) (224, 224, 24) (224, 224, 24)
    True
    True
    True
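The comparison above depends on a local `dat.png`. As a rough way to repeat it without that file, here is a sketch that uses random labels instead and adds a loop-free variant. The names `one_hot_loop` and `one_hot_broadcast` are made up for illustration, the random array is only a stand-in for the real image, and no timings are claimed.

```python
import timeit
import numpy as np

def one_hot_loop(img, max_num):
    """Loop over the values, as in func2 above (entries are 1, not the value)."""
    img_norm = np.zeros(img.shape + (max_num,), np.float32)
    for idx in range(1, max_num + 1):
        img_norm[:, :, idx - 1] = (img == idx)
    return img_norm

def one_hot_broadcast(img, max_num):
    """Loop-free variant: compare each pixel with all values 1..max_num at once."""
    labels = np.arange(1, max_num + 1)        # shape (max_num,)
    return (img[..., None] == labels).astype(np.float32)

rng = np.random.default_rng(0)                # synthetic stand-in for dat.png
img = rng.integers(0, 25, size=(224, 224))    # values 0..24, 0 is background

a = one_hot_loop(img, 24)
b = one_hot_broadcast(img, 24)
print(a.shape, np.array_equal(a, b))

for fn in (one_hot_loop, one_hot_broadcast):
    t = timeit.timeit(lambda: fn(img, 24), number=20)
    print(fn.__name__, "total time =", t)
```

The equality check matters more than the timings: it confirms both functions produce the same [224,224,24] array before any speed comparison is made.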

If there is any problem, please post it in comments.

Thanks for all your kind help!
