
I am doing character recognition on ID cards in OpenCV 2.x using C++. I tried Tesseract OCR, but it did not give me better results than my own trained neural network. However, I am still struggling with character segmentation. Is it possible to get the character or word positions from the Tesseract OCR API in C++ and then use my own neural network for prediction? Any suggestions?

Simply put, I need the bounding box of each character in the ID using Tesseract.

  • Can you post a sample image of the text? Commented Jan 28, 2016 at 9:06
  • Actually, no. Those are IDs; they can't be shared! Commented Jan 28, 2016 at 9:25
  • stackoverflow.com/questions/12278982/… — check this; it might help if you want to detect the text location with Tesseract. Commented Jan 28, 2016 at 10:04

2 Answers


You can retrieve the position of each word using the run function of cv::text::OCRTesseract:

virtual void cv::text::OCRTesseract::run(Mat& image, 
                                         std::string& output_text, 
                                         std::vector<Rect>* component_rects = NULL, 
                                         std::vector<std::string>* component_texts = NULL, 
                                         std::vector<float>* component_confidences = NULL, 
                                         int component_level = 0);  

where component_rects receives a list of Rects for the individual text elements found, and component_level = OCR_LEVEL_WORD makes it return one box per word.
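A minimal sketch of how this call might be used (assumes OpenCV 3.x with the opencv_contrib text module and a local Tesseract installation; the input filename is hypothetical, and this will not compile against the asker's OpenCV 2.x):

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/text.hpp>
#include <iostream>
#include <string>
#include <vector>

int main() {
    cv::Mat image = cv::imread("id_card.png");  // hypothetical input image

    // Create the Tesseract-backed OCR engine from the text module.
    cv::Ptr<cv::text::OCRTesseract> ocr = cv::text::OCRTesseract::create();

    std::string text;
    std::vector<cv::Rect> boxes;
    std::vector<std::string> words;
    std::vector<float> confidences;

    // OCR_LEVEL_WORD asks run() to fill one Rect per recognized word.
    ocr->run(image, text, &boxes, &words, &confidences,
             cv::text::OCR_LEVEL_WORD);

    for (size_t i = 0; i < boxes.size(); ++i)
        std::cout << words[i] << " at " << boxes[i]
                  << " (conf " << confidences[i] << ")\n";
    return 0;
}
```

The boxes in component_rects can then be cropped out of the image and fed to your own classifier instead of trusting Tesseract's recognized text.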


4 Comments

  • Sorry for not mentioning it. I am not using OpenCV 3.x, I am using opencv-2.9.x, and I am quite sure there is no OCRTesseract in my version of OpenCV.
  • OK, that's information you should have included; just update your question to make it clearer. I did this for a project some time ago. I'll update my answer tomorrow if you haven't found a solution yet. @SuJit
  • Thanks! I am still looking for a solution.
  • Please have a look at the way OpenCV uses Tesseract internally. You'll see at line 207 that you just need to use ResultIterator. Sorry, but I can't write a working example right now.
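The ResultIterator route mentioned in the last comment can be used against the Tesseract C++ API directly, independent of the OpenCV version. A rough sketch (assumes Tesseract 3.x headers, Leptonica, and an installed tessdata; the filename is hypothetical; RIL_SYMBOL iterates individual characters, which is what the question asks for):

```cpp
#include <tesseract/baseapi.h>
#include <tesseract/resultiterator.h>
#include <leptonica/allheaders.h>
#include <cstdio>

int main() {
    tesseract::TessBaseAPI api;
    if (api.Init(NULL, "eng")) return 1;   // default tessdata path, English

    Pix* image = pixRead("id_card.png");   // hypothetical input image
    api.SetImage(image);
    api.Recognize(NULL);

    // RIL_SYMBOL walks individual characters; use RIL_WORD for words.
    tesseract::ResultIterator* it = api.GetIterator();
    if (it != NULL) {
        do {
            int x1, y1, x2, y2;
            it->BoundingBox(tesseract::RIL_SYMBOL, &x1, &y1, &x2, &y2);
            char* symbol = it->GetUTF8Text(tesseract::RIL_SYMBOL);
            if (symbol != NULL) {
                printf("%s: (%d,%d)-(%d,%d)\n", symbol, x1, y1, x2, y2);
                delete[] symbol;
            }
        } while (it->Next(tesseract::RIL_SYMBOL));
        delete it;
    }

    api.End();
    pixDestroy(&image);
    return 0;
}
```

Since this talks to Tesseract rather than to cv::text, the per-character boxes can be used for segmentation even while sticking with OpenCV 2.x for everything else.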

Use the hOCR output format for your results instead of plain text. With the command-line tool the config name is hocr (e.g. tesseract image.png out hocr). Embedded in the hOCR result (which is XHTML, but easy to convert to JSON) are the page coordinates of every word.

Shameless plug: you can also try my online OCR service, http://OCRestful.com, which offers OCR as a service via a RESTful API. Just POST your document and get back JSON with word coordinates and per-word confidence scores. There's a permanently free tier as well as paid plans.

You mentioned your data is sensitive: you can force the document to auto-delete after it is OCR-processed, and you control the lifetime of the extracted OCR data.

Matt

