I am trying to match the different forms of "and" in HTML text. There are three of them:

and, &, &amp;

I'm using the following code:

(&amp;|&|\band\b)

The problem with the above pattern is that it also matches the & at the start of other HTML entities,

e.g. &copy; &euro;

I've also tried the following, but it does not match an & character at the start or end of a line of text.

(\s&\s|&amp;|\band\b)

  • Is it true that a lone "&" is always preceded and followed by a space? Commented Aug 1, 2012 at 22:00

3 Answers

How about

(&amp;)|&(?!\w)|\band\b

Matches and, &, &amp;

Does not match &copy; &euro;

The middle alternative matches an ampersand that is not followed by a word character ([A-Za-z0-9_]).
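As a rough illustration, the pattern can be tried out in Python. This is only a sketch: it assumes the first alternative is the HTML entity `&amp;`, and `AND_PATTERN` and `find_ands` are names made up for the example.

```python
import re

# Pattern from the answer, assuming the first alternative is the "&amp;" entity.
# The negative lookahead (?!\w) rejects an "&" that begins another entity.
AND_PATTERN = re.compile(r"(&amp;)|&(?!\w)|\band\b")


def find_ands(text):
    """Return every matched form of "and" in text, in order of appearance."""
    return [m.group(0) for m in AND_PATTERN.finditer(text)]


print(find_ands("salt and pepper &amp; R & D"))  # the three wanted forms
print(find_ands("&copy; 2012 &euro;5"))          # other entities are skipped
print(find_ands("& at start, at end &"))         # line start and end both match
```

Because a lookahead consumes no characters, the bare `&` also matches at the very end of the input. One side effect worth knowing: an `&` written directly against a word, as in `R&D`, is not matched either.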

(&amp;|&|\band\b)

is a good start. To narrow the result set, you have to specify when not to match. There is no magic delimiter that tells a regex what you want. So the question is: how can you tell the '&' you want to accept from those you do not?

Maybe you want to accept every '&' that does not start a word? Then:

(&[^a-zA-Z]|&amp;|\band\b)
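One thing to watch with a character class here: `[^a-zA-Z]` has to consume a character, so the match includes whatever follows the `&`, and a `&` at the very end of the input cannot match at all. The Python sketch below compares it with a lookahead variant. It assumes the second alternative is the `&amp;` entity; `CLASS_PATTERN`, `LOOKAHEAD_PATTERN`, and `matches` are hypothetical names, and the lookahead form is a suggested alternative, not part of the answer.

```python
import re

# Pattern as given, assuming the second alternative is the "&amp;" entity.
CLASS_PATTERN = re.compile(r"(&[^a-zA-Z]|&amp;|\band\b)")

# Suggested variant: a negative lookahead checks the next character
# without consuming it, so it also succeeds at the end of the input.
LOOKAHEAD_PATTERN = re.compile(r"(&(?![a-zA-Z])|&amp;|\band\b)")


def matches(pattern, text):
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


print(matches(CLASS_PATTERN, "a & b"))      # the match includes the trailing space
print(matches(CLASS_PATTERN, "end &"))      # nothing: no character left to consume
print(matches(LOOKAHEAD_PATTERN, "a & b"))  # just the ampersand
print(matches(LOOKAHEAD_PATTERN, "end &"))  # the trailing ampersand is found
```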

Try this regex:

$regex = '/\b((\&(amp;)?)|(and))\b/i';
