1

I'm trying to match a string that meets the following criteria:

  • starts with [A-Z]
  • contains [a-zA-Z- '\u00E0-\u00EF] (Latin-1 Supplement - Match Unicode Block Range)
  • any other character is forbidden
  • does not end with [- '] or have [- '] next to one another.
  • has at least 2 characters

I've been trying the following:

new RegExp(/^[A-Z](?!.*[- ']$).*[a-zA-Z- '\u00E0-\u00EF]$/);

My problem isn't understanding what regular expressions do, but knowing whether mine is correct. It's very easy to write a regex that looks like it should work but misses a few cases.

Any help would be much appreciated.

Edit

Valid string : Marie-Noëlle Tranchant, Jean-François Copé...

4
  • can you give an example of a valid string? Commented Aug 17, 2011 at 1:01
  • edited question - added valid string. Commented Aug 17, 2011 at 1:06
  • "does not have [- '] next to one another" all 9 possibilities here or just the three of the same character doubled up? Commented Aug 17, 2011 at 1:06
  • @jswolf19 do not have : 'space''space' , -- or ''. Commented Aug 17, 2011 at 1:07

4 Answers

3
/^[A-Z](?:[- ']?[a-zA-Z\u00E0-\u00EF])+$/

Below is a proof that this meets your criteria. Changing the non-capturing group (?:...) to a capturing group (...) makes it two characters shorter, and I believe that is then the shortest regexp that meets your criteria.

starts with [A-Z]

because of the ^[A-Z].

contains [a-zA-Z- '\u00E0-\u00EF] (Latin-1 Supplement - Match Unicode Block Range) any other character is forbidden

because every character of the string must match a character class that contains only those characters.

does not end with [- '] or have [- '] next to one another.

because [- '] is restricted to zero or one occurrence per following occurrence of [a-zA-Z\u00E0-\u00EF]

has at least 2 characters

because the [A-Z] matches at least one character and the + after the (?:...) group requires another one.
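
One way to check the proof above is to run the pattern against a few inputs. This is only a sketch: the names `strictName` and `samples` are made up for the example, and apart from the two valid names from the question the inputs are illustrative. Note that this pattern also rejects two different separators in a row (for example a hyphen followed by a space), which is stricter than the rule clarified in the question's comments.

```javascript
// Pattern from this answer: a capital letter, then one or more letters,
// each optionally preceded by a single separator (hyphen, space, apostrophe).
const strictName = /^[A-Z](?:[- ']?[a-zA-Z\u00E0-\u00EF])+$/;

// Illustrative inputs; only the first two come from the question.
const samples = [
  "Marie-Noëlle Tranchant",
  "Jean-François Copé",
  "Marie--Noëlle",  // doubled hyphen
  "Marie-",         // trailing separator
  "Ma5rie",         // forbidden character
  "M",              // too short
  "Marie- Noëlle",  // two different separators in a row
];

for (const s of samples) {
  console.log(JSON.stringify(s), strictName.test(s));
}
```

Only the first two inputs should be accepted.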


1

I don't think your regexp will do what you want. It should accept any string that starts with [A-Z] and ends with [a-zA-Z\u00E0-\u00EF] (with any characters in between, including ones you don't want to accept), although I can't say for sure since I don't know how the unescaped '-' is handled...

I think you want something more like this:

new RegExp(/^[A-Z](?:(?!--|''|  )[a-zA-Z\- '\u00E0-\u00EF])*[a-zA-Z\u00E0-\u00EF]$/);
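
To make the difference concrete, the two patterns can be compared side by side. This is a rough sketch; the names `original`, `proposed` and `inputs` are invented for the example. It also checks the point left open above: in a JavaScript character class, a hyphen that follows a completed range such as A-Z is read as a literal hyphen.

```javascript
// Pattern from the question: `.*` lets any character through in the middle.
const original = /^[A-Z](?!.*[- ']$).*[a-zA-Z- '\u00E0-\u00EF]$/;

// Pattern proposed in this answer: every character is checked against the
// allowed set, and the lookahead blocks a separator followed by the same one.
const proposed = /^[A-Z](?:(?!--|''|  )[a-zA-Z\- '\u00E0-\u00EF])*[a-zA-Z\u00E0-\u00EF]$/;

// Illustrative inputs.
const inputs = [
  "Marie-Noëlle Tranchant",
  "Ma5rie-No1ëlle Tra4nc6hant", // digits in the middle
  "Marie--Noëlle",              // doubled hyphen
  "Marie- Noëlle",              // hyphen then space (two different separators)
];

for (const s of inputs) {
  console.log(JSON.stringify(s), {
    original: original.test(s),
    proposed: proposed.test(s),
  });
}

// The unescaped hyphen after the A-Z range behaves as a literal hyphen.
console.log(/[a-zA-Z- ']/.test("-"));
```

Unlike a pattern that allows at most one separator between letters, `proposed` accepts a hyphen followed by a space, which matches the clarification that only the same separator doubled is forbidden.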


1

A very basic way to test a regex is to take a literal string, e.g. "blah this is text", and call the .match method on it. You can open a JS console (Ctrl + Shift + J in Chrome) and run it directly to see what it returns.

"Marie-Noëlle Tranchant".match(/^[A-Z][-a-zA-Z '\u00E0-\u00EF]*[^- ']$/);

2 Comments

My problem is the few cases that get missed: Ma5rie-No1ëlle Tra4nc6hant and Ma%rie-Noëlle Tranch#ant both match.
how about /^[A-Z][-a-zA-Z '\u00E0-\u00EF]*[^- ']$/?
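
The pattern discussed in this answer and its comments can be tried out in the console as sketched below. The names `candidate`, `hit` and `miss` are made up for the example. `match` returns an array on success and `null` on failure, while `test` gives a plain boolean. One thing the sketch brings out: the final `[^- ']` is a negated class, so it accepts any last character that is not a separator, and nothing in the pattern rules out doubled separators.

```javascript
const candidate = /^[A-Z][-a-zA-Z '\u00E0-\u00EF]*[^- ']$/;

// match() returns an array when the pattern matches, null otherwise.
const hit = "Marie-Noëlle Tranchant".match(candidate);
const miss = "Marie-".match(candidate);
console.log(hit !== null, miss === null);

// test() returns a boolean, which is convenient for checking several inputs.
console.log(candidate.test("Ma5rie"));        // digit in the middle
console.log(candidate.test("Marie5"));        // digit as the last character
console.log(candidate.test("Marie--Noëlle")); // doubled hyphen
```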
1

Edit - redo
After revisiting this thread, I noticed these comments:

"does not have [- '] next to one another" all 9 possibilities here or just the three of the same character doubled up? – jswolf19
@jswolf19 do not have : 'space''space' , -- or ''. – Stack 101

In light of this, you have to go with what @jswolf19 did.

His regex could probably be simplified a little more:

pcre:
/^[A-Z](?:([\- '])(?!$|\1)|[a-zA-Z\x{E0}-\x{EF}])+$/

js:
/^[A-Z](?:([\- '])(?!$|\1)|[a-zA-Z\u00E0-\u00EF])+$/

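expanded JavaScript (annotated for reading only; JavaScript has no free-spacing mode, so this layout cannot be pasted as-is):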
^                     # start of string
   [A-Z]                     # single A-Z char
   (?:                       # non-capture group
       ([\- '])                   # capture group 1, single char from: [- ']
       (?! $ | \1 )               # not the end of string nor the
                                  #   char captured in group 1 (backreference)
     |                          # OR,
       [a-zA-Z\u00E0-\u00EF]      # a single char from: [a-zA-Z\u00E0-\u00EF]
   )+                        # end non-capture group, do 1 or more times
$                     # end of string
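
Since JavaScript has no free-spacing mode, one way to keep the annotations next to the pattern is to assemble it from commented string pieces. This is just a sketch of that idea, not something from the answer itself; `parts` and `composed` are hypothetical names. Backslashes are doubled because the pieces are string literals.

```javascript
// Each piece mirrors one line of the annotated pattern above.
const parts = [
  "^",                        // start of string
  "[A-Z]",                    // single A-Z char
  "(?:",                      // non-capture group
  "([\\- '])",                //   capture group 1: one of hyphen, space, apostrophe
  "(?!$|\\1)",                //   not followed by end of string or the same char
  "|",                        //   OR
  "[a-zA-Z\\u00E0-\\u00EF]",  //   a single allowed letter
  ")+",                       // end group, one or more times
  "$",                        // end of string
];
const composed = new RegExp(parts.join(""));

// Illustrative inputs.
for (const s of ["Marie-Noëlle Tranchant", "Marie--Noëlle", "Marie- Noëlle", "Marie-", "M"]) {
  console.log(JSON.stringify(s), composed.test(s));
}
```

With this pattern a separator may be followed by a different separator, but not by itself and not by the end of the string.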

Please test answers before you mark them as correct. Others may visit this thread
in the future.

