
I am doing character recognition on ID cards in OpenCV 2.x using C++. I tried Tesseract OCR, but it did not give me better results than my own trained neural network. However, I am still struggling with character segmentation. Is it possible to get the character or word positions from the Tesseract OCR API in C++ and then use my own neural network for prediction? Any suggestions?

Simply put, I need the bounding box of each character in the ID using Tesseract.

  • Can you post a sample image of the text? Commented Jan 28, 2016 at 9:06
  • Actually, no. Those are IDs; they can't be shared! Commented Jan 28, 2016 at 9:25
  • stackoverflow.com/questions/12278982/… — check this; it might help if you want to detect the text location with Tesseract. Commented Jan 28, 2016 at 10:04

2 Answers


You can retrieve the position of each word using the run function of cv::text::OCRTesseract:

virtual void cv::text::OCRTesseract::run(Mat& image, 
                                         std::string& output_text, 
                                         std::vector<Rect>* component_rects = NULL, 
                                         std::vector<std::string>* component_texts = NULL, 
                                         std::vector<float>* component_confidences = NULL, 
                                         int component_level = 0);  

where component_rects receives a list of Rects for the individual text elements found, and component_level = OCR_LEVEL_WORD makes it return one box per word.
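A minimal sketch of how this call might be used (assumes OpenCV 3.x with the opencv_contrib text module and a local Tesseract installation; the input filename is hypothetical, and this will not compile against the asker's OpenCV 2.x):

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/text.hpp>
#include <iostream>
#include <string>
#include <vector>

int main() {
    cv::Mat image = cv::imread("id_card.png");  // hypothetical input image

    // Create the Tesseract-backed OCR engine from the text module.
    cv::Ptr<cv::text::OCRTesseract> ocr = cv::text::OCRTesseract::create();

    std::string text;
    std::vector<cv::Rect> boxes;
    std::vector<std::string> words;
    std::vector<float> confidences;

    // OCR_LEVEL_WORD asks run() to fill one Rect per recognized word.
    ocr->run(image, text, &boxes, &words, &confidences,
             cv::text::OCR_LEVEL_WORD);

    for (size_t i = 0; i < boxes.size(); ++i)
        std::cout << words[i] << " at " << boxes[i]
                  << " (conf " << confidences[i] << ")\n";
    return 0;
}
```

The boxes in component_rects can then be cropped out of the image and fed to your own classifier instead of trusting Tesseract's recognized text.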


4 Comments

  • Sorry for not mentioning it. I am not using OpenCV 3.x, I am using opencv-2.9.x, and I am quite sure there is no OCRTesseract in my version of OpenCV.
  • OK, that's information you should have included; just update your question to make it clearer. I did this for a project some time ago. I'll update my answer tomorrow if you haven't found a solution yet. @SuJit
  • Thanks! I am still looking for a solution.
  • Please have a look at the way OpenCV uses Tesseract internally. You'll see at line 207 that you just need to use ResultIterator. Sorry, but I can't write a working example right now.
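The ResultIterator route mentioned in the last comment can be used against the Tesseract C++ API directly, independent of the OpenCV version. A rough sketch (assumes Tesseract 3.x headers, Leptonica, and an installed tessdata; the filename is hypothetical; RIL_SYMBOL iterates individual characters, which is what the question asks for):

```cpp
#include <tesseract/baseapi.h>
#include <tesseract/resultiterator.h>
#include <leptonica/allheaders.h>
#include <cstdio>

int main() {
    tesseract::TessBaseAPI api;
    if (api.Init(NULL, "eng")) return 1;   // default tessdata path, English

    Pix* image = pixRead("id_card.png");   // hypothetical input image
    api.SetImage(image);
    api.Recognize(NULL);

    // RIL_SYMBOL walks individual characters; use RIL_WORD for words.
    tesseract::ResultIterator* it = api.GetIterator();
    if (it != NULL) {
        do {
            int x1, y1, x2, y2;
            it->BoundingBox(tesseract::RIL_SYMBOL, &x1, &y1, &x2, &y2);
            char* symbol = it->GetUTF8Text(tesseract::RIL_SYMBOL);
            if (symbol != NULL) {
                printf("%s: (%d,%d)-(%d,%d)\n", symbol, x1, y1, x2, y2);
                delete[] symbol;
            }
        } while (it->Next(tesseract::RIL_SYMBOL));
        delete it;
    }

    api.End();
    pixDestroy(&image);
    return 0;
}
```

Since this talks to Tesseract rather than to cv::text, the per-character boxes can be used for segmentation even while sticking with OpenCV 2.x for everything else.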

Use the hOCR output format for your results instead of plain text. With the command-line tool the config name is hocr (e.g. tesseract image.png out hocr). Embedded in the hOCR result (which is XHTML, but easy to convert to JSON) are the page coordinates of every word.

Shameless plug: you can also try my online OCR service, http://OCRestful.com, which offers OCR as a service via a RESTful API. Just POST your document and get back JSON with word coordinates and per-word confidence scores. There's a permanently free tier as well as paid plans.

You mentioned your data is sensitive: you can force the document to auto-delete after it is OCR-processed, and you control the lifetime of the extracted OCR data.

Matt

