Is there a faster way than a for loop to label a matrix (3D array) in Python?

Question

I wrote code for labeling a matrix (3D array) in Python. The idea of the code is:

Check each 2 by 2 by 2 sub-matrix of a 3D array (of whatever size I want).

If the sub-matrix has 1, 2, and 3 among its elements, all elements in that sub-matrix are changed to "max unique number + 1" of the matrix.

import numpy as np

def label_A(input_field):
    labeling_A = np.copy(input_field)
    labeling_test = np.zeros((input_field.shape))
    for i in range(0,input_field.shape[0]-1):
        for j in range(0,input_field.shape[1]-1):
            for k in range(0,input_field.shape[2]-1):
                test_unit = input_field[i:i+2,j:j+2,k:k+2]
                if set(np.unique(test_unit).astype(int)) >= set((1,2,3)):
                    labeling_test[i:i+2,j:j+2,k:k+2] = np.max(input_field)+1
                    labeling_A[labeling_test == np.max(input_field)+1] = np.max(input_field)+1
    return labeling_A

Here is a simple example with a 3D matrix.

example = np.random.randint(0, 10, size=(10, 10, 10))
label_example = label_A(example)
label_example

As far as I can tell, the code itself has no problem and it works. However, I am curious whether there is a faster way to do the same thing.

No need to format the example; can you just make it copy/paste-able?
– Jello, Commented Mar 7, 2019 at 20:42

Peter Mølgaard Pallesen (edited by marc_s) · Accepted Answer · 2019-03-16 10:36:44Z

This implementation returns the suggested result and handles a (140,140,140) sized tensor in 1.8 seconds.

import time

import numpy as np
from scipy.signal import convolve

def strange_convolve(mat, f_shape, _value_set, replace_value):
    # Note: mat is modified in place and also returned.
    _filter = np.ones(tuple(s*2-1 for s in f_shape))
    replace_mat = np.ones(mat.shape)
    for value in _value_set:
        # Count how often value occurs in the neighborhood of each cell.
        value_counts = convolve((mat == value), _filter, mode='same')
        replace_mat *= (value_counts > 0)
    mat[replace_mat == 1] = replace_value
    return mat

# label_A is the function from the question and must be defined first.
example = np.random.randint(0, 8, size=(10, 10, 10))
print('same_output validation is ' + str((strange_convolve(example, (2, 2, 2), (1, 2, 3), 4) == label_A(example)).min()))

example = np.random.randint(0, 10, size=(140, 140, 140))
timer = time.time()
strange_convolve(example, (2, 2, 2), (1, 2, 3), 4)
print(time.time() - timer)

1.8871610164642334

edited Mar 16, 2019 at 10:36

marc_s


answered Mar 7, 2019 at 21:27

Peter Mølgaard Pallesen


1 Comment

GeorgeLPerkins Over a year ago

I've never used convolve, so I don't fully understand it, but I will say that this is definitely faster. However, there appears to be some issue with your current code. First: label_A is not included in your example. Second: s*2-1 creates a filter of 3x3x3. s*2-2 will give you the 2x2x2 block that the OP is looking for. Third: I haven't identified why, but using your setup even with the change, I still get some different results. I believe your method is definitely the way to go (and shows me I need to study and use scipy). Just a few issues to return the correct results.
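To illustrate the 3x3x3 versus 2x2x2 point from the comment, here is a minimal sketch (not code from either answer). It builds a small array in which no 2x2x2 block contains 1, 2 and 3 together, while a 3x3x3 window does. The helper name windows_with_all is hypothetical, and sliding_window_view assumes NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# 1, 2 and 3 sit on the main diagonal, so the 1 and the 3 are two cells apart.
a = np.zeros((3, 3, 3), dtype=int)
a[0, 0, 0], a[1, 1, 1], a[2, 2, 2] = 1, 2, 3

def windows_with_all(arr, shape, values=(1, 2, 3)):
    """True at each window origin whose window contains every value (hypothetical helper)."""
    win = sliding_window_view(arr, shape)              # origins + window axes
    flat = win.reshape(win.shape[:arr.ndim] + (-1,))   # flatten each window
    return np.all([(flat == v).any(axis=-1) for v in values], axis=0)

print(windows_with_all(a, (2, 2, 2)).any())  # False: no 2x2x2 block holds 1, 2 and 3
print(windows_with_all(a, (3, 3, 3)).any())  # True: the 3x3x3 window does
```

A method that inspects a 3x3x3 neighborhood would therefore relabel the center cell of this array, whereas label_A would leave the array unchanged.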

GeorgeLPerkins · Accepted Answer · 2019-03-11 19:29:19Z

First, there are a couple of issues with your code that are easy to resolve and will speed it up.
Every time a block matches, you recalculate np.max(input_field)+1 three times.
The larger your matrix becomes, the more noticeable the impact. Note the difference between tests A and B.
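The hoisting described above can be sketched as follows. This is only an illustration under assumptions: the name label_loop_direct is hypothetical, the input is assumed to be an integer array, and besides computing the maximum once it writes straight into the output block instead of going through a second array.

```python
import numpy as np

def label_loop_direct(input_field):
    """Loop version with loop-invariant work hoisted (hypothetical name, integer input assumed)."""
    out = np.copy(input_field)
    new_label = np.max(input_field) + 1   # computed once, not per matching block
    wanted = {1, 2, 3}
    ni, nj, nk = input_field.shape
    for i in range(ni - 1):
        for j in range(nj - 1):
            for k in range(nk - 1):
                # The test always reads the unmodified input, as in the question.
                block = input_field[i:i+2, j:j+2, k:k+2]
                if wanted.issubset(block.ravel().tolist()):
                    out[i:i+2, j:j+2, k:k+2] = new_label
    return out
```

Because the check reads only input_field, writing the new label directly into out should mark the same cells as the original function while avoiding a full-array comparison for every matching block.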

I tried running tests with the convolve example above, and while it was fast, the results were never the same as those of the other tests (which in the setup below should have been identical). I believe it's looking for 1, 2, and 3 in a 3x3x3 block.

Label A with size of 10 --- 0:00.015628
Label B with size of 10 --- 0:00.015621
Label F with size of 10 --- 0:00.015628

Label A with size of 50 --- 0:15.984662
Label B with size of 50 --- 0:10.093478
Label F with size of 50 --- 0:02.265621

Label A with size of 80 --- 4:02.564660
Label B with size of 80 --- 2:29.439298
Label F with size of 80 --- 0:09.437868

------ Edited ------ The convolve method is definitely faster, though I believe there is some issue with the code as given by Peter.

Label A with size of 10 : 00.013985
[[ 2 10 10 10 10  4  9  0  8  7]
 [ 9 10 10 10 10  0  9  8  5  9]
 [ 3  8  4  0  9  4  2  8  7  1]
 [ 4  7  6 10 10  4  8  8  5  4]] 

Label B with size of 10 : 00.014002
[[ 2 10 10 10 10  4  9  0  8  7]
 [ 9 10 10 10 10  0  9  8  5  9]
 [ 3  8  4  0  9  4  2  8  7  1]
 [ 4  7  6 10 10  4  8  8  5  4]] 

Label Flat with size of 10 : 00.020001
[[ 2 10 10 10 10  4  9  0  8  7]
 [ 9 10 10 10 10  0  9  8  5  9]
 [ 3  8  4  0  9  4  2  8  7  1]
 [ 4  7  6 10 10  4  8  8  5  4]] 

Label Convolve with size of 10 : 00.083996
[[ 2  2 10  8  4 10  9  0  8  7]
 [ 9 10  0  4  7 10  9 10 10  9]
 [ 3  8  4  0  9  4  2 10  7 10]
 [ 4  7 10  5  0  4  8 10  5  4]]

The OP wanted all elements of the 2x2x2 matrix set to the higher value.
Note that convolve in its present setup changes some isolated single elements rather than whole 2x2x2 blocks.

Below is my code:

import numpy as np
from scipy.signal import convolve
from datetime import datetime as dt  # pandas.datetime was removed in pandas 2.0

def label_A(input_field):
    labeling_A = np.copy(input_field)
    labeling_test = np.zeros((input_field.shape))
    for i in range(0,input_field.shape[0]-1):
        for j in range(0,input_field.shape[1]-1):
            for k in range(0,input_field.shape[2]-1):
                test_unit = input_field[i:i+2,j:j+2,k:k+2]
                if set(np.unique(test_unit).astype(int)) >= set((1,2,3)):
                    labeling_test[i:i+2,j:j+2,k:k+2] = np.max(input_field)+1
                    labeling_A[labeling_test == np.max(input_field)+1] = np.max(input_field)+1
    return labeling_A


def label_B(input_field):
    labeling_B = np.copy(input_field)
    labeling_test = np.zeros((input_field.shape))
    input_max = np.max(input_field)+1
    for i in range(0,input_field.shape[0]-1):
        for j in range(0,input_field.shape[1]-1):
            for k in range(0,input_field.shape[2]-1):
                test_unit = input_field[i:i+2,j:j+2,k:k+2]
                if set(np.unique(test_unit).astype(int)) >= set((1,2,3)):
                    labeling_test[i:i+2,j:j+2,k:k+2] = input_max
                    labeling_B[labeling_test == input_max] = input_max
    return labeling_B


def label_Convolve(input_field):
    _filter =np.ones([2,2,2])
    replace_mat = np.ones(input_field.shape)
    input_max = np.max(input_field)+1
    for value in (1,2,3):
        value_counts = convolve((input_field==value),_filter,mode='same')
        replace_mat*=(value_counts>0)
    input_field[replace_mat==1] = input_max
    return input_field


def flat_mat(matrix):
    flat = matrix.flatten()
    dest_mat = np.copy(flat)
    mat_width = matrix.shape[0]
    mat_length = matrix.shape[1]
    mat_depth = matrix.shape[2]
    input_max = np.max(matrix)+1

    block = 0
    for w in range(mat_width*(mat_length)*(mat_depth-1)):
        if (w+1)%mat_width != 0:
            if (block+1)%mat_length == 0:
                pass
            else:
                set1 = flat[w:w+2]
                set2 = flat[w+mat_width:w+2+mat_width]
                set3 = flat[w+(mat_width*mat_length):w+(mat_width*mat_length)+2]
                set4 = flat[w+(mat_width*mat_length)+mat_width:w+(mat_width*mat_length)+mat_width+2]
                fullblock = np.array([set1, set2, set3, set4])
                blockset = np.unique(fullblock)
                if set(blockset) >= set((1,2,3)):
                    dest_mat[w:w+2] = input_max
                    dest_mat[w+mat_width:w+2+mat_width] = input_max
                    dest_mat[w+(mat_width*mat_length):w+(mat_width*mat_length)+2] = input_max
                    dest_mat[w+(mat_width*mat_length)+mat_width:w+(mat_width*mat_length)+mat_width+2] = input_max
        else:
            block += 1
    return_mat = dest_mat.reshape(mat_width, mat_length, mat_depth)
    return(return_mat)


def speedtest(matrix,matrixsize):
    starttime = dt.now()
    label_A_example = label_A(matrix)
    print(f'Label A with size of {matrixsize} : {dt.now() - starttime}')
    print(label_A_example[0][0:4], '\n')

    starttime = dt.now()
    label_B_example = label_B(matrix)
    print(f'Label B with size of {matrixsize} : {dt.now() - starttime}')
    print(label_B_example[0][0:4], '\n')

    starttime = dt.now()
    label_Inline_example = flat_mat(matrix)
    print(f'Label Flat with size of {matrixsize} : {dt.now() - starttime}')
    print(label_Inline_example[0][0:4], '\n')

    starttime = dt.now()
    label_Convolve_example = label_Convolve(matrix)
    print(f'Label Convolve with size of {matrixsize} : {dt.now() - starttime}')
    print(label_Convolve_example[0][0:4], '\n')

tests = 1 #each test will boost matrix size by 10
matrixsize = 10
for i in range(tests):
    example = np.random.randint(0, 10, size=(matrixsize, matrixsize, matrixsize))
    speedtest(example,matrixsize)
    matrixsize += 10
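One possible way to get a vectorized version that reproduces the 2x2x2 block pattern of label_A is sketched below. It is not the convolve code from either answer: it uses plain NumPy slicing, first finding every block origin whose block contains all of 1, 2 and 3, then spreading each hit back over the cells of that block. The name label_blocks and its parameters are hypothetical, and the sketch assumes every axis has at least size cells.

```python
import itertools
import numpy as np

def label_blocks(field, values=(1, 2, 3), size=2):
    """Relabel every cell of each size**ndim block containing all values (hypothetical sketch)."""
    field = np.asarray(field)
    new_label = field.max() + 1
    win = tuple(n - size + 1 for n in field.shape)   # block origins per axis
    offsets = list(itertools.product(range(size), repeat=field.ndim))

    def shifted(arr, off):
        # View holding, for every block origin, the cell at offset off inside that block.
        return arr[tuple(slice(o, o + w) for o, w in zip(off, win))]

    # hit[origin] is True when the block at that origin contains every value.
    hit = np.ones(win, dtype=bool)
    for v in values:
        mask = field == v
        present = np.zeros(win, dtype=bool)
        for off in offsets:
            present |= shifted(mask, off)
        hit &= present

    # Spread each hit over all cells of its block.
    to_label = np.zeros(field.shape, dtype=bool)
    for off in offsets:
        view = shifted(to_label, off)   # basic slicing returns a view
        view |= hit

    out = field.copy()                  # the input is left unmodified
    out[to_label] = new_label
    return out
```

On a cubic test array, label_blocks(example) can be compared against label_A(example) with np.array_equal to check that both mark the same cells.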
