
I am new to computer vision and trying to learn. I have an image of letters, and I applied Otsu binarization to it so that all actual content in the image becomes the same color (white, 255, in my case). Now I want to segment the image into letters. (Example image of the binarized letters.)

Now I want to loop through this image and extract every single character (a sub-image of each character) into a separate numpy array or a separate image, so that it can be passed to the model I built. Could you please advise how to achieve this, or is there an algorithm for it?

I thought of looping over the pixels, but that seems time-consuming.

  • What exactly do you mean by "get every single character"? Do you mean getting the sub-image that contains each character, or getting the actual character it represents? These are wildly different tasks, so some clarification would help. Commented Jul 17, 2018 at 5:14
  • Getting the sub-image that contains each character. Commented Jul 17, 2018 at 6:32

1 Answer


The following solution first segments the image into lines, then each line into words, and finally each word into characters.

This is the full code:

import cv2
import numpy as np

# Load the binarized image as grayscale
image = cv2.imread("stach.png", 0)
cv2.imshow('orig', image)

# Dilation with a wide kernel merges each text line into one blob
kernel = np.ones((5, 100), np.uint8)
img_dilation = cv2.dilate(image, kernel, iterations=1)
cv2.imshow('dilated', img_dilation)
cv2.waitKey(0)

# Find line contours ([-2:] works for both the OpenCV 3.x and 4.x return signatures)
ctrs, hier = cv2.findContours(img_dilation.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]

# Sort line contours top to bottom (by the y coordinate of the bounding box)
sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[1])

for i, ctr in enumerate(sorted_ctrs):

    # Get bounding box
    x, y, w, h = cv2.boundingRect(ctr)

    # Getting ROI
    roi = image[y:y+h, x:x+w]

    # Show line ROI
    cv2.imshow('segment no:' + str(i), roi)
    cv2.waitKey(0)

    # Upscale the line, then threshold it (inverted) for word segmentation
    im = cv2.resize(roi, None, fx=4, fy=4, interpolation=cv2.INTER_CUBIC)
    ret_1, thresh_1 = cv2.threshold(im, 127, 255, cv2.THRESH_BINARY_INV)
    cv2.imshow('Threshold_1', thresh_1)
    cv2.waitKey(0)
    cv2.bitwise_not(thresh_1, thresh_1)

    # A narrower kernel merges characters within a word but not across words
    kernel = np.ones((5, 30), np.uint8)
    words = cv2.dilate(thresh_1, kernel, iterations=1)
    cv2.imshow('words', words)
    cv2.waitKey(0)


    # Find word contours ([-2:] works for both OpenCV 3.x and 4.x)
    ctrs_1, hier = cv2.findContours(words, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]

    # Sort word contours left to right (by x)
    sorted_ctrs_1 = sorted(ctrs_1, key=lambda ctr: cv2.boundingRect(ctr)[0])

    for j, ctr_1 in enumerate(sorted_ctrs_1):

        # Get bounding box
        x_1, y_1, w_1, h_1 = cv2.boundingRect(ctr_1)

        # Getting ROI
        roi_1 = thresh_1[y_1:y_1+h_1, x_1:x_1+w_1]

        # Show word ROI
        cv2.imshow('Line no: ' + str(i) + " word no : " + str(j), roi_1)
        cv2.waitKey(0)

        # A tall, thin kernel joins vertically separated parts of a
        # character (e.g. the dot of an 'i') into one blob
        kernel = np.ones((10, 1), np.uint8)
        joined = cv2.dilate(roi_1, kernel, iterations=1)
        cv2.imshow('joined', joined)
        cv2.waitKey(0)

        # Find character contours and sort them left to right (by x)
        ctrs_2, hier = cv2.findContours(joined, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
        sorted_ctrs_2 = sorted(ctrs_2, key=lambda ctr: cv2.boundingRect(ctr)[0])



        for k, ctr_2 in enumerate(sorted_ctrs_2):
            # Get bounding box
            x_2, y_2, w_2, h_2 = cv2.boundingRect(ctr_2)

            # Getting ROI
            roi_2 = roi_1[y_2:y_2 + h_2, x_2:x_2 + w_2]

            # Show character ROI
            cv2.imshow('Line no: ' + str(i) + ' word no : ' + str(j) + ' char no: ' + str(k), roi_2)
            cv2.waitKey(0)

First, line segmentation is done, using the following code:

kernel = np.ones((5,100), np.uint8)
img_dilation = cv2.dilate(image, kernel, iterations=1)

A 5x100 kernel is used to separate the lines in the image.

The result looks as follows:

After that, contours are extracted from the dilated image, and the contour coordinates are applied to the original image, so the lines of the image are extracted. An example line looks as follows:

Then another kernel is applied to each of these lines to extract the words, using the same method that was used to extract the lines.

kernel = np.ones((5, 30), np.uint8)
words = cv2.dilate(thresh_1, kernel, iterations=1)

After extracting the words, the characters are extracted from each word using the following code:

for k, ctr_2 in enumerate(sorted_ctrs_2):
    # Get bounding box
    x_2, y_2, w_2, h_2 = cv2.boundingRect(ctr_2)

    # Getting ROI
    roi_2 = roi_1[y_2:y_2 + h_2, x_2:x_2 + w_2]

I hope the method is clear. You can adapt the full code to your requirements.
