I am trying to use OpenCV and Pytesseract to loop over the white numbers at the bottom of this image (or similar images) and record each number.
While my logic for determining the ROI is correct, Tesseract often gives bad or no results when reading the numbers from it. Sometimes the read is correct, but most of the time one of two things happens:
- It only gets one digit (usually the last digit). For example, the column that has 115 at the bottom would read in as 5.
- It doesn't read anything at all.
This is what I have right now:
import logging
import math

import cv2
import pytesseract

# Pre-process image
img = cv2.imread(local_file_path)
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray_image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Get basic details about the image, so we can loop over
# each column to check the measure numbers.
h, w, c = img.shape
columns = math.ceil(w / 255)  # Figure out the number of columns in the image.

# Loop over each ROI and get the text.
for column_num in range(columns):
    roi_x_1 = 45 + (column_num * 255)
    roi_y_1 = h - 35
    roi_x_2 = 80 + (column_num * 255)
    roi_y_2 = h
    roi = thresh[roi_y_1:roi_y_2, roi_x_1:roi_x_2]
    data = pytesseract.image_to_string(roi, config='--psm 6 --oem 3 -c tessedit_char_whitelist=0123456789')
    text = data.strip()
    if not text:
        # int('') would raise ValueError, so report the empty read and move on.
        logging.warning(f"No number found in column {column_num + 1}")
        continue
    measure_number = int(text)
    logging.warning(f"Number found in column {column_num + 1}: {measure_number}")
    # SO Note: Not saving the measure number just yet - trying to get correct reads first.
I know the ROI logic is correct: I've manually checked the coordinates, and Tesseract does sometimes read the number correctly. This leads me to believe that something is wrong with how I pre-process the image before handing it off to Tesseract.
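To make that coordinate check easy to repeat, the ROI arithmetic can be pulled out and tested without any OCR. This is only a sketch: the names `column_count` and `roi_bounds` are hypothetical, and the constants (255 px per column, a 45 to 80 px horizontal window, the bottom 35 px of the image) are simply the values from my code above. The 1020x600 image size in the usage loop is made up for illustration.

```python
import math

# Layout constants taken from the snippet in the question.
COLUMN_WIDTH = 255  # horizontal distance between columns, in pixels
ROI_X_START = 45    # left edge of the number, relative to the column start
ROI_X_END = 80      # right edge of the number, relative to the column start
ROI_HEIGHT = 35     # the numbers sit in the bottom 35 pixels


def column_count(image_width):
    """Number of columns, counting a partial last column as a full one."""
    return math.ceil(image_width / COLUMN_WIDTH)


def roi_bounds(column_num, image_height):
    """Return (x1, y1, x2, y2) for the number under a zero-based column."""
    x1 = ROI_X_START + column_num * COLUMN_WIDTH
    x2 = ROI_X_END + column_num * COLUMN_WIDTH
    y1 = image_height - ROI_HEIGHT
    y2 = image_height
    return x1, y1, x2, y2


# Hypothetical image size, only to show the resulting boxes.
for col in range(column_count(1020)):
    print(col + 1, roi_bounds(col, 600))
```

Each tuple can be compared by eye against the pixel coordinates of the number in an image viewer.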
I've tried multiple types of pre-processing and different PSM values, but unfortunately I've come up empty-handed. I've also tried resizing the image to 4x its original size.
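For anyone who wants to reproduce the PSM experiments, the config strings can be generated along these lines. This is a sketch rather than my exact code: `build_config` is a hypothetical helper, and the PSM values listed are just examples of Tesseract page segmentation modes (6 = uniform block of text, 7 = single text line, 8 = single word, 13 = raw line), not a record of every value I tried.

```python
def build_config(psm, oem=3, whitelist="0123456789"):
    """Build a Tesseract config string restricted to the given characters."""
    return f"--psm {psm} --oem {oem} -c tessedit_char_whitelist={whitelist}"


# Example page segmentation modes (an assumption, see the note above).
CANDIDATE_PSMS = (6, 7, 8, 13)

configs = [build_config(psm) for psm in CANDIDATE_PSMS]

for config in configs:
    # Each string would be passed as the config argument, e.g.
    # pytesseract.image_to_string(roi, config=config)
    print(config)
```

Running every config against the same ROI makes it easier to see whether the failures depend on the segmentation mode or on the pre-processing.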
What combination of OpenCV preprocessing and Tesseract configuration would help me read the numbers correctly? (I'm also open to a different implementation, if there's something better than OpenCV preprocessing plus Tesseract.)
