This is the original image: original image

This is the processed image: processed image

I'm trying to automate a mini-game in which characters appear on the screen. I did some light research and managed to process the image into what you can see above, but the OCR doesn't seem to work correctly: this code returns a single character, 'Q'. Is there any way to get it to read all the characters?

I'm using version 5.4.0

Thanks in advance

My code:

import pytesseract
import cv2
import numpy as np

pytesseract.pytesseract.tesseract_cmd = r'C:\Users\***\AppData\Local\Programs\Tesseract-OCR\tesseract.exe'

# Load as grayscale; adaptiveThreshold needs a single-channel 8-bit image
gray = cv2.imread('ocrtest.png', cv2.IMREAD_GRAYSCALE)
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)

cv2.imshow('test', thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()

data = pytesseract.image_to_string(thresh, lang='eng', config='-c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ --psm 6 --oem 3')
print(data)

I tried all the simple thresholding methods and Otsu's binarization from this doc. They resulted in poor image quality and basically didn't work either. I settled on adaptive thresholding because it looks best in my opinion, but I'm not sure, since I don't really know how it works.
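For context, mean adaptive thresholding compares each pixel with the average of its own neighbourhood instead of one global cutoff. The following is a rough NumPy sketch of what a call like `cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)` computes. The function name `adaptive_mean_threshold` is made up for illustration, and the result is not guaranteed to match OpenCV bit for bit (border handling and rounding may differ).

```python
import numpy as np


def adaptive_mean_threshold(gray, block_size=11, c=2, max_value=255):
    """Illustrative re-implementation of mean adaptive thresholding (THRESH_BINARY)."""
    pad = block_size // 2
    # Repeat edge pixels so every pixel has a full block_size x block_size window
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    out = np.zeros(gray.shape, dtype=np.uint8)
    height, width = gray.shape
    for y in range(height):
        for x in range(width):
            local_mean = padded[y:y + block_size, x:x + block_size].mean()
            # A pixel turns white only if it is brighter than (local mean - c)
            if gray[y, x] > local_mean - c:
                out[y, x] = max_value
    return out


# Synthetic test image (made-up values): flat grey with one dark pixel
spot = np.full((15, 15), 100, dtype=np.uint8)
spot[7, 7] = 0
result = adaptive_mean_threshold(spot)
```

On a flat region every pixel is close to its local mean, so it passes the test and turns white; only pixels clearly darker than their surroundings turn black. This is why the method tends to keep edges and local texture rather than filled shapes.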

I tried every other psm option, but they didn't work either, so I settled on 6 because it at least gives me something. Weirdly, I thought that 11 would be best according to this description, but it returned nothing.

11 Sparse text. Find as much text as possible in no particular order.

2 Comments
  • For every contour, define a new subimage and run pytesseract on it. Should be easy. (Commented Jul 19, 2024 at 7:34)
  • Did you try my answer? How did you get on with it? (Commented Jul 23, 2024 at 15:11)
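The per-contour idea from the first comment could look roughly like the sketch below. It assumes the input is a binary image with black characters on a white background, and that the characters sit in a single row so they can be read left to right. The helper names `reading_order` and `ocr_each_contour`, and the `min_area` and `margin` defaults, are hypothetical choices for illustration, not values tuned for the image in the question.

```python
def reading_order(boxes):
    """Sort (x, y, w, h) boxes left to right (assumes one row of characters)."""
    return sorted(boxes, key=lambda box: box[0])


def ocr_each_contour(thresh, min_area=50, margin=4):
    """Run Tesseract once per character-sized contour and return the results."""
    # Imported here so reading_order stays usable without OpenCV/Tesseract installed
    import cv2 as cv
    import pytesseract

    # findContours looks for white shapes on black, so invert the black-on-white input
    inverted = cv.bitwise_not(thresh)
    # [-2] picks the contour list in both OpenCV 3.x and 4.x return formats
    contours = cv.findContours(inverted, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)[-2]

    boxes = [cv.boundingRect(contour) for contour in contours]
    boxes = [box for box in boxes if box[2] * box[3] >= min_area]  # drop small specks

    # --psm 10 tells Tesseract to treat each crop as a single character
    config = "--psm 10 -c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    chars = []
    for x, y, w, h in reading_order(boxes):
        # Keep a small white margin around the glyph; slicing clips at the image edge
        crop = thresh[max(y - margin, 0):y + h + margin, max(x - margin, 0):x + w + margin]
        chars.append(pytesseract.image_to_string(crop, config=config).strip())
    return chars
```

On Windows, `pytesseract.pytesseract.tesseract_cmd` still has to point at the Tesseract executable, as in the question's code, before `ocr_each_contour` is called.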

1 Answer

Your preprocessing code is doing a pretty poor job of isolating the letters. They are well separated in the Red channel, so maybe more like this:

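import cv2 as cv

# Load image (OpenCV returns the channels in BGR order)
im = cv.imread('letters.png')

# Operate on Red channel (index 2 in BGR)
red = im[..., 2]

# Bright red values (the letters) become black, everything else white
_, thresh = cv.threshold(red, 180, 255, cv.THRESH_BINARY_INV)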
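To see what the channel indexing and the inverted threshold do, here is a tiny NumPy-only sketch on a two-pixel synthetic image. The pixel values are made up for illustration, and `np.where` stands in for OpenCV's `THRESH_BINARY_INV` rule (above the cutoff becomes 0, everything else becomes the maximum value).

```python
import numpy as np

# Synthetic BGR image: one white pixel (a "letter") and one blue pixel (background)
im = np.array([[[255, 255, 255], [200, 80, 30]]], dtype=np.uint8)

# OpenCV stores channels as B, G, R, so index 2 is the red channel
red = im[..., 2]

# Inverted binary threshold at 180: bright red values -> 0, the rest -> 255
thresh = np.where(red > 180, 0, 255).astype(np.uint8)

print(red.tolist())     # [[255, 30]]
print(thresh.tolist())  # [[0, 255]]
```

The white pixel has a high red value and ends up black, while the blue background pixel has little red and ends up white, giving dark text on a light background.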


If the original image is not properly representative, and the real images may have differently coloured backgrounds, you could convert to HSV mode and look for the white, unsaturated letters (in the "Saturation" channel) instead of segmenting on the Red channel. That is more like this:

import cv2 as cv

# Load image
im = cv.imread('letters.png')

# Convert to HSV colourspace and select "Saturation" channel
hsv = cv.cvtColor(im, cv.COLOR_BGR2HSV)
s = hsv[..., 1]

# Find unsaturated pixels - i.e. white/black or uncoloured
# Saturated (coloured) pixels become white, so the unsaturated letters stay black
_, thresh = cv.threshold(s, 40, 255, cv.THRESH_BINARY)
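As a quick check of why the Saturation channel separates white letters from coloured backgrounds, the sketch below computes saturation for a few colours with Python's standard `colorsys` module, scaled to the 0-255 range OpenCV uses for 8-bit images. The helper name `saturation_8bit` and the sample colours are made up for illustration.

```python
import colorsys


def saturation_8bit(r, g, b):
    """Saturation of an 8-bit RGB colour, scaled to 0-255 like OpenCV's S channel."""
    _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(s * 255)


# Sample colours (hypothetical, not measured from the question's image)
samples = {
    "white letter": (255, 255, 255),
    "grey": (128, 128, 128),
    "blue background": (30, 80, 200),
    "green background": (40, 160, 60),
}

for name, rgb in samples.items():
    s = saturation_8bit(*rgb)
    # Same cutoff as the threshold above: s > 40 counts as "coloured"
    print(f"{name}: S={s}, coloured={s > 40}")
```

White and grey have zero saturation whatever the hue of the background, so a cutoff such as 40 keeps them apart from blues and greens alike.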

5 Comments

Impressive result, but isn't it too specific to this image? What if the background isn't as clear as in the example?
@Lrx I am hoping (expecting) the OP has posted a properly representative image - else all bets are off.
@Lrx If the background colours vary from blues and greens, we can always convert to HSV mode and threshold on the Saturation channel to find the unsaturated, white letters.
@Lrx I added some ideas to my answer to deal with differently coloured backgrounds
Nice, this looks promising.
