First, looping over a 3D matrix with plain Python loops is a very, very bad idea. To loop over a large 3D matrix you are better off going down to Cython or C/C++/Fortran and creating a Python extension. However, for this particular case, scipy already contains an implementation of the median filter for n-dimensional arrays:
>>> from scipy.ndimage import median_filter
>>> median_filter(my_large_3d_array, size)
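A minimal runnable sketch (the volume here is random stand-in data; note that the second argument, size, is the full width of the filtering window in each dimension, not a radius):

import numpy as np
from scipy.ndimage import median_filter

vol = np.random.rand(64, 64, 64)       # stand-in 3D volume
smoothed = median_filter(vol, size=3)  # 3x3x3 median window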
In short, there is no faster way of iterating through voxels in pure Python (numpy iterators may help a bit, but they won't improve performance considerably). If you need to perform more complicated 3D operations in Python, you should consider writing the loopy part in Cython or, alternatively, using a chunking library such as Dask, which implements parallel operations over chunks of arrays (see the sketch below).
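For the chunking route, here is a sketch of what that could look like with Dask (the shapes, chunk sizes and boundary mode are made-up choices); map_overlap gives every chunk the one-voxel halo that a 3x3x3 median window needs:

import dask.array as da
from scipy.ndimage import median_filter

vol = da.random.random((512, 512, 512), chunks=(128, 128, 128))
# depth=1 adds a 1-voxel overlap between chunks, enough for size=3
filtered = vol.map_overlap(median_filter, depth=1, boundary="reflect", size=3)
result = filtered.compute()  # chunks are processed in parallel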
The problem with Python is that for loops are extremely slow, especially when they are nested and run over large arrays. There is thus no standard Pythonic way of iterating efficiently over arrays. The usual way of getting speed-ups is through vectorized operations and numpy tricks, but those are very problem-specific and there is no generic recipe; you will learn a lot of numpy tricks here on SO.
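Just to make "vectorized" concrete, here is a toy comparison (the array is stand-in data); both versions compute the same sum of squares, but the second one runs its loop in compiled C inside numpy:

import numpy as np

A = np.random.randn(1000, 1000)

# loopy version: every element goes through the interpreter
total = 0.0
for x in A.flat:
    total += x * x

# vectorized version: a single call, the loop happens in C
total = np.sum(A * A)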
As a generic approach, if you really need to iterate over arrays, you can write your code in Cython. Cython is a C-like extension for Python: you write code in Python syntax but declare variable types (as in C, with int or double). That code is then automatically translated to C, compiled, and can be called from Python. A quick example:
Example Python loopy function:
import numpy as np

def iter_A(A):
    B = np.empty(A.shape, dtype=np.float64)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            B[i, j] = A[i, j] * 2
    return B
I know the above code is somewhat redundant and could simply be written as B = A * 2, but its purpose is just to illustrate that Python loops are extremely slow.
Cython version of the function:
import numpy as np
cimport numpy as np

def iter_A_cy(double[:, ::1] A):
    # typed memoryview over a C-contiguous 2D array of doubles
    cdef Py_ssize_t H = A.shape[0], W = A.shape[1]
    cdef double[:, ::1] B = np.empty((H, W), dtype=np.float64)
    cdef Py_ssize_t i, j
    # with typed indices and memoryviews, these compile to plain C loops
    for i in range(H):
        for j in range(W):
            B[i, j] = A[i, j] * 2
    return np.asarray(B)
Test speeds of both implementations:
>>> import numpy as np
>>> A = np.random.randn(1000, 1000)
>>> %timeit iter_A(A)
1 loop, best of 3: 399 ms per loop
>>> %timeit iter_A_cy(A)
100 loops, best of 3: 2.11 ms per loop
NOTE: you cannot run the Cython function as it is. You need to put it in a separate .pyx file and compile it first (or use the %%cython magic in an IPython/Jupyter notebook); a minimal build script is sketched below.
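If you go the separate-file route, a build script could look like this (the .pyx file name is an assumption; the numpy include directory is needed because of the cimport numpy line):

# setup.py -- minimal build script for the Cython example
from setuptools import setup
from Cython.Build import cythonize
import numpy as np

setup(
    ext_modules=cythonize("iter_a_cy.pyx"),  # assumed file name
    include_dirs=[np.get_include()],
)

Then compile with python setup.py build_ext --inplace and import the function with from iter_a_cy import iter_A_cy.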
This shows that the raw Python version took about 400 ms to iterate over the whole array, while the Cython version took only about 2 ms (roughly a 200x speedup).