
I have successfully integrated Tesseract into my Android app, and it reads whatever image I capture, but with very low accuracy. Most of the time I do not get the correct text after capturing, because some text around the region of interest is also getting captured.

All I want is to read all the text from a rectangular area, accurately, without capturing the edges of the rectangle. I have done some research and posted on Stack Overflow about this twice, but still have not gotten a satisfactory result!

These are the two posts that I made:

https://stackoverflow.com/questions/16663504/extract-text-from-a-captured-image?noredirect=1#comment23973954_16663504

Extracting information from captured image in android

I am not sure whether to go ahead with Tesseract or use OpenCV.
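Since the goal is to read only the text inside the guide rectangle, one thing to try before switching libraries is to crop the captured bitmap down to the rectangle, inset by a few pixels so the drawn border never reaches the OCR engine (on Android this would be `Bitmap.createBitmap` with the rectangle's coordinates). A toy sketch in plain Python, with the image as a nested list of pixels purely for illustration:

```python
def crop_roi(image, left, top, width, height, inset=2):
    """Crop a rectangular region of interest from a row-major pixel grid,
    shrinking it by `inset` pixels on every side so the rectangle's drawn
    border is excluded from what gets passed to the OCR engine."""
    l, t = left + inset, top + inset
    r, b = left + width - inset, top + height - inset
    return [row[l:r] for row in image[t:b]]

# Toy 6x6 "image": border pixels are 1 (the guide rectangle), interior is 0.
img = [[1 if x in (0, 5) or y in (0, 5) else 0 for x in range(6)]
       for y in range(6)]
roi = crop_roi(img, 0, 0, 6, 6, inset=1)
print(roi)  # interior only: no border 1s survive the inset
```

The inset also gives some tolerance against slight camera movement between framing and capture.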

  • If the answers were unsatisfactory, try putting up a bounty. If you go the openCV route, make sure you configure it for the camera you'll be using. Commented Jun 21, 2013 at 14:28
  • With Tesseract, I have a kind of rectangular area, so the user will place the text to be captured within that rectangle. But when capturing the image, if you move slightly, the result you get is complete garbage. I think Tesseract is not helping me. Could you please provide me some sample code? Commented Jun 21, 2013 at 14:33
  • Haven't played with OpenCV since my student days, so no, not really... but looking at your other question, lottery tickets might not be the best thing to try out with. Try blank white paper with a big, black, bold typeface and work from there... Lighting, camera internals, focus - they all get in the way of OCR. Commented Jun 21, 2013 at 14:42
  • Well, I tried that as well; if the text is on a white background then it reads fine. But when I applied it to a lottery ticket, it gives me garbage values most of the time. I also tried various lighting conditions; even with good lighting, Tesseract gives me poor results when the lottery ticket is processed. What should I do? Commented Jun 21, 2013 at 14:50
  • Curse the gods, how dare the lottery people try to make forging/OCRing tickets hard! So, before OCRing the lottery ticket, you need to clean it using a ... RasterizerFilter? In any case, try to filter out the holograms/funny background, use high contrast, etc., and try to pass a filtered input to OCR, rather than trying to make a read-anything OCR. Commented Jun 21, 2013 at 14:52
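The cleanup the commenters describe (grayscale, then force high contrast) can be sketched in a few lines. A pure-Python toy version, where the BT.601 luminance weights and the fixed threshold of 128 are my own assumptions, not anything from this thread:

```python
def to_grayscale(rgb_pixels):
    """Luminance conversion using ITU-R BT.601 weights (an assumption;
    any reasonable weighting works for this purpose)."""
    return [[int(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_pixels]

def binarize(gray, threshold=128):
    """High-contrast binarization: everything darker than the threshold
    becomes black (0), everything else white (255)."""
    return [[0 if p < threshold else 255 for p in row] for row in gray]

row = [(250, 250, 250), (20, 20, 20)]   # light background, dark glyph
bw = binarize(to_grayscale([row]))
print(bw)  # [[255, 0]]
```

In practice a fixed global threshold fails under uneven lighting; an adaptive threshold (as in OpenCV's `adaptiveThreshold`) is the usual next step, but the principle is the same: feed Tesseract clean black-on-white input rather than the raw photo.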

2 Answers


Considering the many links and answers from others, I think it's worth taking a step back and noting that there are actually two fundamental steps in optical character recognition (OCR):

  • Text Detection: This is the title and focus of your question, and it is concerned with localizing regions in an image that contain text.
  • Text Recognition: This is where the actual recognition happens, where the localized image regions from detection get segmented character-by-character and classified. This is also where tools like Tesseract come into play.

Now, there are also two general settings in which OCR is applied:

  • Controlled: These are images taken from a scanner or similar in nature, where the target is a document and things like perspective, scale, font, orientation, background consistency, etc. are pretty docile.
  • Uncontrolled/Scene: These are the more natural and in-the-wild photos, e.g. those taken from a camera, where you are trying to recognize a street sign, shop name, etc.

Tesseract as-is is most applicable to the "controlled" setting. And in general, but for scene OCR especially, "re-training" Tesseract will not directly improve detection, but may improve recognition.

If you are looking to improve scene text detection, see this work; and if you are looking at improving scene text recognition, see this work. Since you asked about detection, the detection reference uses maximally stable extremal regions (MSER), which has a plethora of implementation resources, e.g. see here.
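MSER itself takes some work to implement, but the detection idea it serves — group pixels into coherent regions and keep their bounding boxes as text candidates — can be illustrated with a much cruder connected-component pass. The sketch below is a stand-in, not MSER, and `min_pixels` is just an arbitrary noise cutoff I chose; real detectors go on to filter the boxes by aspect ratio, stroke width, and so on:

```python
from collections import deque

def connected_regions(binary, min_pixels=2):
    """Return bounding boxes (min_x, min_y, max_x, max_y) of dark
    connected components in a binary image (0 = ink, 255 = background),
    found by breadth-first flood fill over 4-connected neighbors."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 0 and not seen[y][x]:
                queue = deque([(y, x)])
                seen[y][x] = True
                ys, xs = [], []
                while queue:
                    cy, cx = queue.popleft()
                    ys.append(cy)
                    xs.append(cx)
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx),
                                   (cy, cx + 1), (cy, cx - 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny][nx] == 0 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(xs) >= min_pixels:
                    boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes

img = [
    [255,   0,   0, 255, 255],
    [255,   0,   0, 255, 255],
    [255, 255, 255,   0,   0],
]
print(connected_regions(img))  # [(1, 0, 2, 1), (3, 2, 4, 2)]
```

Each surviving box is a candidate text region you would crop out and hand to the recognition step.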

There's also a text detection project specifically for Android:
https://github.com/dreamdragon/text-detection

As many have noted, keep in mind that recognition is still an open research challenge.


1 Comment

Thanks a lot for the time that you spent answering this question. You have provided so much important information. I think I will be able to figure out a way. Thanks again!

The solution to improving the OCR output is to

  • either use more training data to train it better

  • filter its input using a linear filter (grayscaling, contrast enhancement, blurring)
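As a concrete instance of the linear-filter bullet, here is a 3x3 box blur: each output pixel is the unweighted mean of its neighborhood, which smooths out sensor noise and hologram speckle before binarization. A toy sketch in plain Python (nested lists stand in for the image; edges simply use the smaller neighborhood that fits):

```python
def box_blur(gray):
    """3x3 box blur: replace each pixel with the integer mean of the
    pixels in its 3x3 neighborhood, clipped at the image borders."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [gray[cy][cx]
                    for cy in range(max(0, y - 1), min(h, y + 2))
                    for cx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) // len(vals)
    return out

# A single bright speck of noise in a dark field gets averaged away.
img = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
print(box_blur(img)[1][1])  # 10 — the speck's 90 spread over 9 pixels
```

A Gaussian kernel is the more common choice in practice, but any such convolution is a linear filter in the sense the answer means.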

In the chat we posted a number of links describing filtering techniques used in OCR, but no sample code was posted.

Some of the links posted were

Improving input for OCR

How to train Tesseract

Text enhancement using asymmetric filters <-- this paper is easily found on Google and should be read in full, as it quite clearly illustrates and demonstrates the steps needed before OCR-processing the image.

OCR Classification

2 Comments

Linear classifier / OCR classification. That's the one I was trying to remember.
