How to recognize single characters from an image using Tesseract?

Question

This is the original image:

This is the processed image:

I'm trying to automate a mini-game in which characters appear on the screen. I did some light research and managed to process the image into what you can see above, but the recognition doesn't seem to work correctly. This code returns a single character, 'Q'. Is there any way to recognize each character?

I'm using Tesseract version 5.4.0.

Thanks in advance

My code:

import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:\Users\***\AppData\Local\Programs\Tesseract-OCR\tesseract.exe'

# Load the screenshot directly as a grayscale image
gray = cv2.imread('ocrtest.png', cv2.IMREAD_GRAYSCALE)

# Adaptive mean threshold: 11x11 neighbourhood, constant C = 2
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)

# Show the processed image until a key is pressed
cv2.imshow('test', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()

# OCR restricted to digits and uppercase letters
data = pytesseract.image_to_string(thresh, lang='eng', config='-c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ --psm 6 --oem 3')
print(data)

I tried all the simple thresholding methods and Otsu's binarization from this doc. They resulted in poor image quality and didn't work either. I settled on adaptive thresholding because it looks best in my opinion, but I'm not sure, since I don't really know how it works.

I tried every other psm option, but they also didn't work, so I settled on 6 because it at least gives me something. Weirdly, I thought that 11 would be best according to this description, but it returned nothing.

11 Sparse text. Find as much text as possible in no particular order.
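A small helper can make it easier to compare page segmentation modes side by side. This is only a sketch: `build_tesseract_config` and `try_psm_modes` are hypothetical names, and the mode descriptions are paraphrased from Tesseract's `--help-psm` listing. Mode 10 (single character) is likely only useful once the image has been cropped to one glyph.

```python
# Page segmentation modes that seem relevant here (descriptions paraphrased)
PSM_CANDIDATES = {
    6: "single uniform block of text",
    7: "single text line",
    8: "single word",
    10: "single character",
    11: "sparse text",
}


def build_tesseract_config(psm, whitelist=None, oem=3):
    """Assemble the config string passed to pytesseract (hypothetical helper)."""
    parts = [f"--oem {oem}", f"--psm {psm}"]
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def try_psm_modes(image, whitelist=None):
    """Run OCR once per candidate mode and return {psm: recognised text}."""
    import pytesseract  # imported here so the config helper works without it

    return {
        psm: pytesseract.image_to_string(
            image, config=build_tesseract_config(psm, whitelist)
        ).strip()
        for psm in PSM_CANDIDATES
    }
```

Calling `try_psm_modes(thresh, "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")` would return one result per mode, assuming `thresh` is the preprocessed image from the code above.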

For every contour, define a new subimage and run pytesseract there. Should be easy.
– Tino D, commented Jul 19, 2024 at 7:34

Mark Setchell · Accepted Answer · 2024-07-19 09:53:37Z

Your preprocessing code is doing a pretty poor job of isolating the letters. They are well separated in the Red channel, so maybe more like this:

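import cv2 as cv

# Load image (OpenCV stores channels in BGR order)
im = cv.imread('letters.png')

# Operate on Red channel, which is index 2 in BGR
red = im[..., 2]

# Pixels with high red values (the letters) become black, the rest white
_, thresh = cv.threshold(red, 180, 255, cv.THRESH_BINARY_INV)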

If the original image is not fully representative and the backgrounds may be differently coloured, you could convert to HSV mode and look for the white, unsaturated letters (in the "Saturation" channel) instead of segmenting on the Red channel. That is more like this:

import cv2 as cv

# Load image
im = cv.imread('letters.png')

# Convert to HSV colourspace and select "Saturation" channel
hsv = cv.cvtColor(im, cv.COLOR_BGR2HSV)
s = hsv[..., 1]

# Find unsaturated pixels - i.e. white/black or uncoloured
# (they come out black, saturated pixels come out white)
_, thresh = cv.threshold(s, 40, 255, cv.THRESH_BINARY)

edited Jul 19, 2024 at 9:53

answered Jul 19, 2024 at 8:56

5 Comments

Lrx Over a year ago

Impressive result, but isn't it too specific to this image? What if the background isn't as clear as in the example?

Mark Setchell Over a year ago

@Lrx I am hoping (expecting) the OP has posted a properly representative image - else all bets are off.

Mark Setchell Over a year ago

@Lrx If the background colours vary from blues and greens, we can always convert to HSV mode and threshold on the Saturation channel to find the unsaturated, white letters.

Mark Setchell Over a year ago

@Lrx I added some ideas to my answer to deal with differently coloured backgrounds

Lrx Over a year ago

Nice, this looks promising.
