I am trying to use OpenCV and Pytesseract to loop over the white numbers at the bottom of this image (or similar images) and record each number.

1

While I have the logic for determining the ROI correct, Tesseract often gives bad or no results when reading the numbers from it. Sometimes the read is correct, but most of the time one of the following happens:

  • It only gets one digit (usually the last digit). For example, the column that has 115 at the bottom would read in as 5.
  • It doesn't read anything at all.

This is what I have right now:

import logging
import math

import cv2
import pytesseract

# Pre-process image
# local_file_path: path to the source image (defined elsewhere).
img = cv2.imread(local_file_path)
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Inverted Otsu threshold: the white numbers become black text on a white background.
thresh = cv2.threshold(gray_image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Get basic details about the image, so we can loop over
# each column to check the measure numbers.
h, w, c = img.shape
columns = math.ceil(w / 255)  # Figure out the number of columns in the image.

# Loop over each ROI and get the text.
for column_num in range(columns):
    roi_x_1 = 45 + (column_num * 255)
    roi_y_1 = h - 35
    roi_x_2 = 80 + (column_num * 255)
    roi_y_2 = h
    roi = thresh[roi_y_1:roi_y_2, roi_x_1:roi_x_2]
    data = pytesseract.image_to_string(roi, config='--psm 6 --oem 3 -c tessedit_char_whitelist=0123456789').strip()
    if not data.isdigit():
        # int('') raises ValueError, so skip columns where nothing usable was read.
        logging.warning(f"No number read in column {column_num + 1}: {data!r}")
        continue
    measure_number = int(data)
    logging.warning(f"Number found in column {column_num + 1}: {measure_number}")
    # SO Note: Not saving the measure number just yet - trying to get correct reads first.

I know that the logic for determining the ROI is correct, since I've manually checked the coordinates - and since I'm getting correct numbers sometimes (if Tesseract somehow reads the number correctly). This leads me to believe that something is wrong with my pre-processing of the image before I hand it off to Tesseract.
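
For reference, the ROI arithmetic from the loop above can be pulled out into a small standalone helper, which makes it possible to check the coordinates without running OCR. This is only a sketch: the name `roi_boxes` and its parameter names are made up for illustration, the defaults (255 px column width, 45 px x-offset, 35 px box size) are the constants from my code, and skipping a box that would run past the right edge is an added assumption, not something my loop does.

```python
import math


def roi_boxes(width, height, column_width=255, x_offset=45, box_size=35):
    """Return (x1, y1, x2, y2) boxes for the number at the bottom of each column."""
    boxes = []
    for column_num in range(math.ceil(width / column_width)):
        x1 = x_offset + column_num * column_width
        x2 = x1 + box_size
        if x2 > width:
            # Assumption: a partial last column would give a truncated or
            # empty crop, so it is skipped here.
            continue
        boxes.append((x1, height - box_size, x2, height))
    return boxes


# Dimensions of the image in this question: 5588 x 1557.
for box in roi_boxes(5588, 1557)[:3]:
    print(box)
```

Each tuple can be used directly as `thresh[y1:y2, x1:x2]` to reproduce the crop for that column.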

I've tried multiple types of pre-processing and different PSM values, but unfortunately I've come up short. I've also tried resizing the image to 4x its original size.

What combination of OpenCV preprocessing/Tesseract configuration could I use that will help me correctly read the numbers? (I'm also open to using a different implementation, if there's something better than OpenCV preprocessing/Tesseract).

  • purely a tesseract issue. it's got trouble handling tiny text. you probably saw that in your research. Commented Jun 9, 2024 at 18:41
  • this site will resize uploaded images if they exceed some limits. was your source image 5588 by 1557 pixels? that's the dimension of the image currently in the question. if your source isn't that size, you might wanna generate a composite of the number crops, and upload that instead. I'm asking because this is the crops I get: i.sstatic.net/tdu0eoyf.png Commented Jun 9, 2024 at 20:11
  • @ChristophRackwitz First off, I appreciate the edit to my original post! Still kind of unfamiliar with posting on here and how to make questions "good". Yes, the source is 5588 x 1557, which was sourced here: sdvxindex.com/s/0993/5 Based off of what you showed me for the crops, I may have to adjust how I'm crawling across each ROI. I'll look into that and get back to you. Thank you so much! Commented Jun 9, 2024 at 21:02
