6

I am trying to wrap some words in HTML tags using regular expressions. I am almost there:

This is my regexp

/((apple|banana|cherry|orange)\b\s?)+/gi

and this is my replacement:

<em>$&</em>

which works perfectly for my example text:

Apple Banana apple cherry, Cherry orange and Oranges Apple, Banana

the result being:

<em>Apple Banana apple cherry</em>, <em>Cherry orange </em>and Oranges <em>Apple</em>, <em>Banana</em>

I could be pragmatic and live with this but I would reaaaaaally like to have it perfect and not include the space after the final match.

i.e. my perfect result would be (see the tag shifted left after "Cherry orange"):

<em>Apple Banana apple cherry</em>, <em>Cherry orange</em> and Oranges <em>Apple</em>, <em>Banana</em>
3
  • Is "Oranges" not supposed to be enclosed in <em>s? Commented Nov 29, 2009 at 4:49
  • and why not <em>$&</em> (note the slash)? Commented Nov 29, 2009 at 8:17
  • @BipedalShark: that's correct, I only want full and specific words. @nalply: my bad, of course </em>, I just corrected it Commented Nov 29, 2009 at 13:49

3 Answers 3

4

When this was written, JavaScript didn't support lookbehind (it was added later, in ES2018). This is a shame, as we could have done:

// doesn't work in JavaScript engines without lookbehind support:
/((apple|banana|cherry|orange)\b\s?)+(?<!\s)/gi

What we can do, however, is move the optional whitespace to the beginning of each word and add a negative lookahead, so the match cannot start with whitespace:

/(?!\s)(\s?\b(apple|banana|cherry|orange)\b)+/gi

A slight difference from your code is that I also added \b to the beginning of the pattern, so it won't match the apple in Snapple.
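As a quick check, the whitespace-first pattern can be run against the example text from the question. This is only a minimal sketch; the helper name `wrapFruits` and the Snapple test sentence are made up for illustration.

```javascript
// Hypothetical helper (the name is not from the answer): wraps runs of the
// listed words in <em> tags using the whitespace-first pattern.
function wrapFruits(text) {
  // (?!\s) forbids a match from starting on whitespace, so the optional \s?
  // can only sit between two matched words, never at either edge.
  const pattern = /(?!\s)(\s?\b(apple|banana|cherry|orange)\b)+/gi;
  return text.replace(pattern, "<em>$&</em>");
}

const input = "Apple Banana apple cherry, Cherry orange and Oranges Apple, Banana";
console.log(wrapFruits(input));
// <em>Apple Banana apple cherry</em>, <em>Cherry orange</em> and Oranges <em>Apple</em>, <em>Banana</em>

// The leading \b keeps "apple" inside "Snapple" from matching.
console.log(wrapFruits("Snapple and apple"));
// Snapple and <em>apple</em>
```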


4 Comments

JavaScript does not have issues with lookbehind, it doesn't support it. Perhaps this is what you meant, but "has issues" suggests (to me) that it does support it, but the implementation contains bugs.
I know, that is what I meant. That's a big issue. I agree, Bart, I'll change it to avoid confusion.
the \b didn't seem to be necessary in my original example. Awesome thanks!! This was my first question on StackOverflow. I am absolutely amazed I got a perfect answer this quick!
No problem, Thanks! Welcome to StackOverflow. BTW, it's a very good question, and well formatted. Probably the reason it was up voted that much.
2

You could pass a function as the replacement argument:

function(x){var w=x.replace(/\s+$/,"");return "<em>"+w+"</em>"+x.slice(w.length);} instead of <em>$&</em>

and strip the trailing space inside that function, putting it back after the closing tag.

"Apple Banana apple cherry, Cherry orange and Oranges Apple, Banana".replace(
/((?:apple|banana|cherry|orange)\b\s?)+/gi,
function(x){
   // w is the match without its trailing whitespace
   var w = x.replace(/\s+$/,"");
   // close the tag first, then re-append the stripped whitespace
   return "<em>"+w+"</em>"+x.slice(w.length);
})

<em>Apple Banana apple cherry</em>, <em>Cherry orange</em> and Oranges <em>Apple</em>, <em>Banana</em>
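For readers who haven't seen a function used as the replacement before: `String.prototype.replace` calls it once per match and uses the return value as the replacement text. A small sketch of the arguments it receives follows; the sample string and the `seen` array are made up for illustration.

```javascript
// Records each match and its position to show what the callback is given.
const seen = [];

const result = "apple pie, banana split".replace(
  /(apple|banana)/gi,
  // Arguments: the full match, then each capture group, then the match offset.
  function (match, group1, offset) {
    seen.push(match + "@" + offset);
    return "<em>" + group1 + "</em>";
  }
);

console.log(result); // <em>apple</em> pie, <em>banana</em> split
console.log(seen);   // [ 'apple@0', 'banana@11' ]
```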

2 Comments

Could you please elaborate how come the second argument for replace is a function? I can't find any documentation about this behavior.
developer.mozilla.org/en/Core_JavaScript_1.5_Reference/… - but attention, it is not supported by all browsers (IE5, for example)
0

You could also solve this with a lookahead at the end of the pattern that requires every match to be followed by whitespace, a comma, or the end of the string. A match that ends in a space followed by a letter is then rejected, so the regex backtracks and gives up the trailing space, which is exactly what the problematic example needs.

Altered matching pattern:

/((apple|banana|cherry|orange)\b\s?)+(?=\s|,|$)/gi
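A minimal sketch of this pattern applied to the question's text; the helper name `wrapWithLookahead` and the second test sentence are assumptions added for illustration. The second call shows a side effect of listing the allowed followers explicitly: a word followed by any other punctuation, such as a period, is not wrapped at all.

```javascript
// Hypothetical helper (the name is not from the answer): same replacement as
// in the question, but with the trailing lookahead added to the pattern.
function wrapWithLookahead(text) {
  // (?=\s|,|$) only accepts a match that is followed by whitespace, a comma,
  // or the end of the string, so a match cannot end on a space before a letter.
  const pattern = /((apple|banana|cherry|orange)\b\s?)+(?=\s|,|$)/gi;
  return text.replace(pattern, "<em>$&</em>");
}

const input = "Apple Banana apple cherry, Cherry orange and Oranges Apple, Banana";
console.log(wrapWithLookahead(input));
// <em>Apple Banana apple cherry</em>, <em>Cherry orange</em> and Oranges <em>Apple</em>, <em>Banana</em>

// "." is not one of the allowed followers, so nothing is wrapped here.
console.log(wrapWithLookahead("I like apple."));
// I like apple.
```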
