In Objective-C, I want to check whether a text is a proper English sentence or word. I don't need a grammar check. For example, texts like "I didn't go!", ""Hi" is a word", "hello world", "a 5 digit number", "the % is high!" and "[email protected]" should pass, but texts like "@/-5%;l:" should NOT pass. The text may contain the digits 0-9, the letters a-z and A-Z, and the characters -/:;()$&\"'!?,._

I tried:

NSString *regex1 = @"^[\w:;()'\"\s-]*";
NSPredicate *streamTest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex1];
return [streamTest evaluateWithObject:candidate];

But it doesn't achieve what I want. Any ideas?

  • This really isn't the kind of thing regular expressions are designed for. There is no set pattern for "A proper English word". What if someone entered in a bunch of gibberish like "ehweogu ewgfbweaoufewb"? Or a short but valid word like "a" or "hi". The only way to do this is to check each space-separated word against a dictionary of valid words. Commented Jun 24, 2013 at 9:24
  • I think you first have to get clear on what you want. Would you accept a sentence like "ASCII contains all of the following characters:@/-5%;l:"? Commented Jun 24, 2013 at 9:25
  • I guess you may be able to do something with this: (iOS5 only) developer.apple.com/library/ios/#documentation/cocoa/reference/… - it tags "sentences" (strings) with info about each "word" (i.e. noun, adjective, etc) and certain other grammatical traits. You could check that each word (or a certain % of words) matches some sort of grammatical structure? Commented Jun 24, 2013 at 9:35

1 Answer

I agree with @borrrden that this is a difficult task for a regex, but one thing you do need to fix is the escaping: every backslash meant for the regex engine has to be escaped with another backslash (\). Like this:

NSString *regex1 = @"^[\\w:;()'\"\\s-]*";

The reasoning behind this is that you want the regex engine to "see" the backslash, but the compiler, which processes the NSString literal first, also uses backslashes to escape certain characters. "w" and "s" are not among those characters, so in practice \w and \s are just translated into w and s, respectively, and the regex engine never sees a backslash at all.

A double backslash in a literal string serves to get a single backslash into the compiled string.
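To see the effect outside of NSPredicate, here is a minimal sketch in plain C. It assumes only that an NSString literal follows the same backslash-escape rules as a C string literal, which is why the pattern is written without the @ prefix. The names escaped_class, pattern, count_backslashes and check_literals are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Source text "\\w" compiles to two characters: a backslash and 'w'. */
const char escaped_class[] = "\\w";

/* The pattern from the answer, written as a plain C literal. */
const char pattern[] = "^[\\w:;()'\"\\s-]*";

/* Hypothetical helper: counts the backslashes left after compilation. */
size_t count_backslashes(const char *s) {
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (*s == '\\') {
            ++n;
        }
    }
    return n;
}

/* Checks that the regex engine would receive \w and \s intact. */
void check_literals(void) {
    assert(strlen(escaped_class) == 2);
    assert(escaped_class[0] == '\\');
    assert(escaped_class[1] == 'w');
    /* One backslash each for \w and \s; \" contributes only a quote. */
    assert(count_backslashes(pattern) == 2);
}
```

Calling check_literals() from a test or from main should pass silently. The \" in the middle of the pattern is an escape the compiler does understand, so it becomes a plain double quote and contributes no backslash.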
