I can't read a form accurately using node-tesseract. Only the printed text on the form is recognized and returned correctly; the handwritten text comes back as a string of special characters.

My code is:

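var tesseract = require('node-tesseract');

var options = {
    l: 'deu',   // recognition language: German
    psm: 6,     // page segmentation mode 6: assume a single uniform block of text
    env: {
        maxBuffer: 4096 * 4096
    }
};

// Run OCR on the scanned form and print the recognized text.
tesseract.process('./server/images/form.jpg', options, function (err, text) {
    if (err) {
        return console.log("An error occurred: ", err);
    }
    console.log("Recognized text:");
    console.log(text);
});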

Input: OWNER Brian Dude

Output: OW_NER ägga ] )ggé;= ‘

Here, OWNER is a text field on the form.

2 Answers

  1. Take a look at the following papers. Both are examples that use the Tesseract training process for handwriting recognition:

     Tesseract Training for Handwritten Digit Recognition

     Training Tesseract for Roman Font Handwriting

  2. Check out the official Tesseract Training page.

  3. The following link walks you through the training process; it helped me a lot: https://web.archive.org/web/20170820212334/http://www.resolveradiologic.com:80/blog/2013/01/15/training-tesseract

  4. Use a third-party GUI for Tesseract training; it will make your life much easier. I recommend tesseract4java and jTessBoxEditor (both work on OS X).
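
To give a rough idea of what the training process involves, here is a minimal sketch that only builds the command lines for the classic box-file workflow of Tesseract 3.x (Tesseract 4 and later train differently). The language code `hnd`, the font name `hand`, and the helper `trainingCommands` are hypothetical names picked for illustration; check the official training page for the exact steps that match your Tesseract version.

```javascript
// Sketch only: builds command strings for the Tesseract 3.x box-file
// training workflow. It does not run anything.
// 'lang', 'font' and 'expNum' are hypothetical values chosen by you; the
// training image is assumed to be named <lang>.<font>.exp<expNum>.tif.
function trainingCommands(lang, font, expNum) {
  var base = lang + '.' + font + '.exp' + expNum; // e.g. hnd.hand.exp0
  return [
    // 1. Generate a box file, then correct it by hand (e.g. in jTessBoxEditor).
    'tesseract ' + base + '.tif ' + base + ' batch.nochop makebox',
    // 2. Produce the .tr feature file from the image and the corrected box file.
    'tesseract ' + base + '.tif ' + base + ' box.train',
    // 3. Extract the character set.
    'unicharset_extractor ' + base + '.box',
    // 4. Shape and cluster training (font_properties is a file you write yourself).
    'mftraining -F font_properties -U unicharset -O ' + lang + '.unicharset ' + base + '.tr',
    'cntraining ' + base + '.tr',
    // 5. After prefixing the generated files with "<lang>.", bundle them.
    'combine_tessdata ' + lang + '.'
  ];
}

// Print the commands for a hypothetical handwriting model named "hnd".
trainingCommands('hnd', 'hand', 0).forEach(function (cmd) {
  console.log(cmd);
});
```

The GUI tools mentioned above mostly wrap these same steps, so the list is useful mainly for understanding what they do behind the scenes.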

You can train Tesseract to recognize your handwritten text. See here.
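
Once a trained model exists, using it from node-tesseract could look roughly like the sketch below. It assumes a hypothetical `hnd.traineddata` file has been copied into Tesseract's tessdata directory, and that node-tesseract passes the `l` option through to the `-l` flag unchanged. The helper names `buildOptions` and `recognize` are made up for this example.

```javascript
// Sketch only: 'hnd' is a hypothetical custom language; it requires a
// hnd.traineddata file in the tessdata directory.
function buildOptions(languages) {
  return {
    // The Tesseract CLI accepts several languages joined with '+',
    // e.g. 'deu+hnd' for the printed German text plus the handwriting model.
    l: languages.join('+'),
    psm: 6
  };
}

function recognize(imagePath, languages, callback) {
  // Loaded lazily so that buildOptions can be used without the module installed.
  var tesseract = require('node-tesseract');
  tesseract.process(imagePath, buildOptions(languages), callback);
}

// Usage (not run here):
// recognize('./server/images/form.jpg', ['deu', 'hnd'], function (err, text) {
//     if (err) { return console.log("An error occurred: ", err); }
//     console.log(text);
// });
```

Keeping `deu` in the list means the printed labels are still matched against the stock German model, while the handwritten entries can be matched against the custom one.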
