def colorize(im, h, s, l_adjust):
    result = Image.new('RGBA', im.size)
    pixin = np.copy(im)
    pixout = np.array(result)

    # ---- loop start ----

    for y in range(pixout.shape[1]):
        for x in range(pixout.shape[0]):
            lum = currentRGB(pixin[x, y][0], pixin[x, y][1], pixin[x, y][2])
            r, g, b = colorsys.hls_to_rgb(h, lum, s)
            r, g, b = int(r * 255.99), int(g * 255.99), int(b * 255.99)
            pixout[x, y] = (r, g, b, 255)

    # ---- loop end ----

    return pixout  # the filled array; `result` is still the blank image

I am trying to compute a per-pixel HSL value for each frame of an input video, but it takes about 1.5 s per frame and I want to bring that down to 0.3 s or less. Is there a faster way to do this without the two nested loops? I am looking for something like a LUT (lookup table), vectorization, or a NumPy shortcut. Thanks.

OR

Part 2

If I inline the custom currentRGB() function into the loops, it looks like this:

def colorize(im, h, s, l_adjust):
    result = Image.new('RGBA', im.size)
    pixin = np.copy(im)
    pixout = np.array(result)
    for y in range(pixout.shape[1]):
        for x in range(pixout.shape[0]):
            currentR, currentG, currentB = pixin[x, y][0]/255 , pixin[x, y][1]/255, pixin[x, y][2]/255
            #luminance
            lum = (currentR * 0.2126) + (currentG * 0.7152) + (currentB * 0.0722)
            if l_adjust > 0:
                lum = lum * (1 - l_adjust)
                lum = lum + (1.0 - (1.0 - l_adjust))
            else:
                lum = lum * (l_adjust + 1)
            l = lum
            r, g, b = colorsys.hls_to_rgb(h, l, s)
            r, g, b = int(r * 255.99), int(g * 255.99), int(b * 255.99)
            pixout[x, y] = (r, g, b, 255)
    return pixout
  • There are many libraries that implement this conversion: OpenCV, DIPlib, scikit-image, … Don't reinvent the wheel, especially in Python where it is so easy to write really slow wheels. :) Commented Jul 14, 2021 at 13:12
  • @CrisLuengo Thanks for your reply, but because I have a custom function named currentRGB() inside the loop, I can't apply a conversion like cv2.cvtColor(im, cv2.COLOR_RGB2HLS). I would appreciate any further suggestions. Commented Jul 14, 2021 at 13:24
  • I don’t know what that function does, nor the other functions you call in this loop. So there is no way for me to help in speeding up this loop. See minimal reproducible example. Commented Jul 14, 2021 at 13:28
  • This is one of the biggest issues: if the function comes from an external module, you cannot vectorize it or JIT it, and it will be slow because function calls are expensive in CPython, especially when executed in 2 nested loops. I think you need to find another function to do that, write it yourself, or find a way to pass NumPy arrays to this function. Commented Jul 14, 2021 at 13:28
  • I have added part 2 to my question, where I inlined the function into the loop. Any ideas or suggestions now? It takes about 1.5 s, which is far too long for processing my video frame by frame. Commented Jul 14, 2021 at 13:39

1 Answer

You can use Numba to drastically speed the computation up. Here is the implementation:

import numba as nb
import numpy as np
from PIL import Image

@nb.njit('float32(float32,float32,float32)')
def hue_to_rgb(p, q, t):
    if t < 0: t += 1
    if t > 1: t -= 1
    if t < 1./6: return p + (q - p) * 6 * t
    if t < 1./2: return q
    if t < 2./3: return p + (q - p) * (2./3 - t) * 6
    return p

@nb.njit('UniTuple(uint8,3)(float32,float32,float32)')
def hls_to_rgb(h, l, s):
    if s == 0:
        # achromatic
        r = g = b = l
    else:
        q = l * (1 + s) if l < 0.5 else l + s - l * s
        p = 2 * l - q
        r = hue_to_rgb(p, q, h + 1./3)
        g = hue_to_rgb(p, q, h)
        b = hue_to_rgb(p, q, h - 1./3)

    return (int(r * 255.99), int(g * 255.99), int(b * 255.99))

@nb.njit('void(uint8[:,:,::1],uint8[:,:,::1],float32,float32,float32)', parallel=True)
def colorize_numba(pixin, pixout, h, s, l_adjust):
    for x in nb.prange(pixout.shape[0]):
        for y in range(pixout.shape[1]):
            currentR, currentG, currentB = pixin[x, y, 0]/255 , pixin[x, y, 1]/255, pixin[x, y, 2]/255
            #luminance
            lum = (currentR * 0.2126) + (currentG * 0.7152) + (currentB * 0.0722)
            if l_adjust > 0:
                lum = lum * (1 - l_adjust)
                lum = lum + (1.0 - (1.0 - l_adjust))
            else:
                lum = lum * (l_adjust + 1)
            l = lum
            r, g, b = hls_to_rgb(h, l, s)
            pixout[x, y, 0] = r
            pixout[x, y, 1] = g
            pixout[x, y, 2] = b
            pixout[x, y, 3] = 255

def colorize(im, h, s, l_adjust):
    result = Image.new('RGBA', im.size)
    pixin = np.copy(im)
    pixout = np.array(result)
    colorize_numba(pixin, pixout, h, s, l_adjust)
    return pixout

This optimized parallel implementation is about 2000 times faster than the original code on my 6-core machine (on 800x600 images). The hls_to_rgb implementation comes from this post. Note that the signature strings in the @nb.njit decorators are not mandatory, but they let Numba compile each function eagerly, when it is defined, instead of at the first call. For more information about the types, please read the Numba documentation.
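
Since the question also asks about a NumPy shortcut, here is a minimal vectorized sketch for comparison. It is not part of the Numba solution and I have not benchmarked it. It relies on h and s being constant for the whole frame, so only the luminance varies per pixel and the piecewise branches can be chosen once per channel. The names colorize_numpy and _hue_to_rgb are made up for this example, and pixin is assumed to be a uint8 array with at least 3 channels.

```python
import numpy as np

def _hue_to_rgb(p, q, t):
    # Same piecewise formula as hue_to_rgb above.
    # p and q are arrays; t is a plain scalar, so the branch is picked once.
    if t < 0:
        t += 1
    if t > 1:
        t -= 1
    if t < 1 / 6:
        return p + (q - p) * 6 * t
    if t < 1 / 2:
        return q
    if t < 2 / 3:
        return p + (q - p) * (2 / 3 - t) * 6
    return p

def colorize_numpy(pixin, h, s, l_adjust):
    # Scale RGB to [0, 1] and compute the luminance of every pixel at once.
    rgb = pixin[..., :3].astype(np.float32) / 255
    lum = rgb @ np.array([0.2126, 0.7152, 0.0722], dtype=np.float32)

    # Same lightness adjustment as in the loop version.
    if l_adjust > 0:
        lum = lum * (1 - l_adjust) + l_adjust
    else:
        lum = lum * (l_adjust + 1)

    # HLS -> RGB; with s == 0 this reduces to r = g = b = lum.
    q = np.where(lum < 0.5, lum * (1 + s), lum + s - lum * s)
    p = 2 * lum - q

    out = np.empty(pixin.shape[:2] + (4,), dtype=np.uint8)
    for channel, t in enumerate((h + 1 / 3, h, h - 1 / 3)):
        value = _hue_to_rgb(p, q, t) * 255.99
        out[..., channel] = np.clip(value, 0, 255).astype(np.uint8)
    out[..., 3] = 255  # opaque alpha
    return out
```

Because the math runs in float32, individual channel values may differ by 1 from the colorsys-based loop. Usage would be something like pixout = colorize_numpy(np.asarray(im), h, s, l_adjust).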


3 Comments

This answer is a gem! Thanks. I tried Numba with CUDA before, but the return value was an issue there. One small question: does @cuda.jit perform better than @nb.njit? I was following this tutorial.
If it performs better, should I use it here?
Using CUDA may help, but it is hard to tell. The image needs to be sent to GPU memory, processed, and then transferred back to CPU memory. Data transfers are often rather slow, which limits the speedup. The biggest problem with GPUs is that they are very different from CPUs. So while the current code could run on a GPU, it will likely not be very efficient because of warp divergence and uncoalesced memory accesses. Still, it may be faster in the end, so the best approach is to try. Note that if you want to target GPUs, I think hls_to_rgb is the function to optimize first.
