The strings look like hyperlinks, such as http://something. This is what I need:

  1. A link should be converted only if it does not start with the character ". I mean only that character: if there is no character at all before the link (it is at the very start of the string), it must still be checked.
  2. The "something" part means that any character may be used (it is a link, after all) except whitespace, which marks the end of the link. I know the RFC permits it, but this is the only way I know to mark where the link ends.
  3. These strings were previously filtered with htmlentities($str, ENT_QUOTES, "UTF-8"), which is why any kind of character may appear. Is that secure, or do I risk XSS or broken HTML?
  4. The replacement can occur multiple times, not just once, and must be case-insensitive.

This is my current regex:

preg_replace('#\b[^"](((http|https|ftp)://).+)#', '<a class="lforum" href="$1">$1</a>', $str);

But it only checks strings that START with ", and I want the opposite. Any help with this question would be appreciated. Thanks!

  • 1 - you will need to split the words into individual strings. Then the regex would be [^"].* Commented May 9, 2011 at 11:27
  • [^"] should work, but if the word is the first word it doesn't (because there is no character before it). Commented May 9, 2011 at 11:30

2 Answers

For both of your cases you'll want lookbehind assertions.

  1. \b(?<!")(\w)\b - negative lookbehind to match only if not preceded by "
  2. (?<=ThisShouldBePresent://)(.*) - positive lookbehind to match only if preceded by your string.
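
As a rough sketch of how the negative lookbehind could be applied to the link replacement from the question: the function name linkify is made up for illustration, and the pattern assumes the "whitespace ends the link" rule and the three schemes (http, https, ftp) given in the question.

```php
<?php
// linkify() is a hypothetical helper name, not something from the question.
function linkify(string $text): string
{
    // (?<!")            the URL must not be directly preceded by a double quote;
    //                   this also succeeds at the very start of the string
    // \b                start at a word boundary
    // (?:https?|ftp):// one of the schemes listed in the question
    // \S+               everything up to the next whitespace (assumed end marker)
    // i                 case-insensitive
    $pattern = '#(?<!")\b((?:https?|ftp)://\S+)#i';

    // preg_replace() replaces every match, so multiple links are handled.
    return preg_replace($pattern, '<a class="lforum" href="$1">$1</a>', $text);
}

echo linkify('see http://a.com and "http://b.com"'), "\n";
// see <a class="lforum" href="http://a.com">http://a.com</a> and "http://b.com"
```

One caveat: if the text has already gone through htmlentities() with ENT_QUOTES, a literal " shows up as &quot; in the string, so the lookbehind would probably need to be (?<!&quot;) instead.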

1 Comment

I edited the question with some further information (because this in fact doesn't work so well). Please let me know, and thanks for your time.

  1. Something like this: preg_match('/\b[^"]/',$input_string);

    This looks for a word-break (\b), followed by any character other than a double quote ([^"]).

  2. Something like this: preg_match('~(((ThisShouldBePresent)://).+)~', $input_string);

    I've assumed the brackets you specified in the question (and the plus sign) were intended as part of the regex rather than characters to search for.

    I've also taken @ThiefMaster's advice and changed the delimiter to ~ to avoid having to escape the //.
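
To illustrate the point raised in the comments about the first word: a character class such as [^"] has to consume one real character, so it cannot match when the link sits at the very start of the string, whereas a lookbehind only inspects what comes before. The two patterns and the sample strings below are illustrative variants, not the exact ones from this answer.

```php
<?php
// Illustrative patterns; variable names are made up for this sketch.
$consuming  = '~[^"]((?:https?|ftp)://\S+)~i';   // [^"] must match an actual character
$lookbehind = '~(?<!")((?:https?|ftp)://\S+)~i'; // (?<!") consumes nothing

$atStart = 'http://example.com';
$inText  = 'see http://example.com';

var_dump(preg_match($consuming, $atStart));  // int(0): no character before the link
var_dump(preg_match($lookbehind, $atStart)); // int(1)

// When the character class does match, the preceding character becomes
// part of the overall match ($m[0]); only group 1 holds the bare URL.
preg_match($consuming, $inText, $m);
var_dump($m[0]); // string(19) " http://example.com"
var_dump($m[1]); // string(18) "http://example.com"
```

This is why a replacement built on [^"] would also need to put the consumed character back, while the lookbehind version leaves it untouched.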

3 Comments

You might want to use a different delimiter for (2) so you don't have an escaping hell - # would do the job fine for example.
@Thief - but I like having to escape my slashes; it makes regex syntax even more obtuse! yay! ;-) (but yes, I have edited it to use an alternative character)
I tried your example, but it doesn't work (maybe we misunderstood each other). I edited my question with a more concise explanation ;)
