I am working on an iOS Swift project that takes OCR data and then searches the text for key phrases. The OCR output looks like this:
INGREDIENTS WATER, BROWN SUGAR, RED RIPE
TOMATO CONCENTRATE, APPLE CIDERVINEGAR
W01CESTERSHlWSMJCE(WATERW4EGAR CORN
SYRUP, SALT, MOLASSE, SPICE, NATURAL FLAVOR
GARLIC POWDER, CARAMEL COLOR, ANCHOVIES
CFlSril,TAMARiN0), MOLASSES, LEMON JUICE,
ONION, HONEY, MODIFIED TAVIOCA STARCH,
When I search the string for "corn syrup", nothing is found. Searching for "corn" and "syrup" does produce positive results.
I have also tried
tesseract.recognizedText.stringByTrimmingCharactersInSet(NSCharacterSet.whitespaceAndNewlineCharacterSet())
to no avail.
Any thoughts on how to format this text for searching so that "corn syrup" can be identified? The catch is that only the exact phrase is useful; after all, corn, corn starch, maple syrup, etc. are all potential ingredients.
Thanks.
OK, here is the solution that worked:
textView.text = tesseract.recognizedText.stringByReplacingOccurrencesOfString("\n", withString: " ", options: NSStringCompareOptions.LiteralSearch, range: nil)
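I thought the initial code was accomplishing the same task, but stringByTrimmingCharactersInSet only removes characters from the start and end of the string. The newline between "CORN" and "SYRUP" sits in the middle of the text, so it was left in place and the phrase never matched.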
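For anyone landing here later, below is a minimal sketch of the same idea in current Swift syntax, in case it helps. It goes a little further than the one-liner above: it collapses every run of whitespace (not just "\n") into a single space and compares case-insensitively, since the OCR text is upper case and the search phrase is not. The helper names normalizeWhitespace and containsPhrase are my own, not part of Tesseract or Foundation, and the sample string is just two lines copied from the output above.

```swift
import Foundation

// Hypothetical helper: collapses every run of spaces, tabs and newlines
// into a single space and drops leading/trailing whitespace.
func normalizeWhitespace(_ text: String) -> String {
    return text
        .components(separatedBy: .whitespacesAndNewlines)
        .filter { !$0.isEmpty }
        .joined(separator: " ")
}

// Hypothetical helper: case-insensitive search for a phrase in OCR text.
// Both sides are normalized so a line break inside the phrase does not matter.
func containsPhrase(_ phrase: String, in ocrText: String) -> Bool {
    let haystack = normalizeWhitespace(ocrText)
    let needle = normalizeWhitespace(phrase)
    return haystack.range(of: needle, options: .caseInsensitive) != nil
}

// Two lines from the OCR output, joined by the newline that caused the problem.
let sample = "W01CESTERSHlWSMJCE(WATERW4EGAR CORN\nSYRUP, SALT, MOLASSE, SPICE, NATURAL FLAVOR"

// Trimming alone leaves the interior newline in place, so the phrase is not found.
let trimmedOnly = sample.trimmingCharacters(in: .whitespacesAndNewlines)
print(trimmedOnly.contains("CORN SYRUP"))      // false

// After normalizing, the phrase that spans the line break is found.
print(containsPhrase("corn syrup", in: sample)) // true
```

Note that this is still a plain substring match, so it does not check word boundaries, and it will not recover phrases that the OCR itself has garbled (for example "TAVIOCA" for "TAPIOCA").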
