Pytesseract image to text problem in Python

Question

Please check the following image:

I am using the following code to extract text from the image.

import cv2
import pytesseract

img = cv2.imread("img.png")
txt = pytesseract.image_to_string(img)

But the result differs from the text in the image. I get:

+BuFl

But it should be:

+Bu#L

I don't know what the problem is. I am pretty new to Pytesseract.

Can anyone help me sort this out?

Thank you very much.

Ahmet · Accepted Answer · 2022-01-08 20:57:29Z


One way to solve this is to apply Otsu thresholding before running OCR.

Unlike simple global thresholding, where you pick the threshold value by hand, Otsu's method determines the threshold automatically from the image histogram.
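To illustrate what "determines the threshold automatically" means, here is a minimal pure-Python sketch of the idea behind Otsu's method: try every candidate threshold and keep the one that maximizes the between-class variance of the dark and bright pixel groups. The function name `otsu_threshold` and the sample pixel values are made up for illustration; this is not OpenCV's implementation, and in practice you would just use `cv2.threshold` with `cv2.THRESH_OTSU`.

```python
def otsu_threshold(pixels):
    """Return a threshold t (0..255) chosen by Otsu's criterion.

    `pixels` is an iterable of 8-bit gray values. Values <= t form the
    dark class, values > t form the bright class.
    """
    # Build a 256-bin histogram of the gray values.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = sum(hist)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class
    sum0 = 0  # sum of gray values in the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break     # bright class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor of 1 / total**2).
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical sample: dark "ink" around 30-40, bright "paper" around 200-220.
sample = [30] * 60 + [40] * 40 + [200] * 50 + [220] * 50
t = otsu_threshold(sample)
binary = [255 if p > t else 0 for p in sample]
```

For the sample above, the chosen threshold falls between the dark and bright groups, so every dark pixel maps to 0 and every bright pixel to 255 without you having to guess a cutoff.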

Applying Otsu's threshold before passing the image to Tesseract:

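import cv2
import pytesseract

img = cv2.imread("Tqom8.png")  # Load the image
img = cv2.resize(img, (0, 0), fx=0.5, fy=0.5)  # Downscale to half size
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Convert to grayscale
# With THRESH_OTSU the threshold argument (0) is ignored and computed automatically
thr = cv2.threshold(gray, 0, 128, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
# Pass the thresholded image to Tesseract; --psm 6 assumes a single uniform block of text
txt = pytesseract.image_to_string(thr, config='--psm 6')
print(pytesseract.__version__)
print(txt)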

Result:

0.3.8
+Bu#L

Also make sure to read the Tesseract documentation page "Improving the quality of the output".


1 Comment

JamesHorab Over a year ago

Thanks, it works, but not for all images.
