I am trying to process messages in a small, private ticketing system that automatically parses URLs into clickable links without messing up any HTML that may be posted. Up until now, the function that parses URLs has worked well; however, one or two users of the system want to be able to post embedded images rather than attach them.
This is the existing code that converts strings into clickable URLs. Please note that I have limited knowledge of regex and have relied on some assistance from others to build this:
$text = preg_replace(
array(
'/(^|\s|>)(www.[^<> \n\r]+)/iex',
'/(^|\s|>)([_A-Za-z0-9-]+(\\.[A-Za-z]{2,3})?\\.[A-Za-z]{2,4}\\/[^<> \n\r]+)/iex',
'/(?(?=<a[^>]*>.+<\/a>)(?:<a[^>]*>.+<\/a>)|([^="\']?)((?:https?):\/\/([^<> \n\r]+)))/iex'
),
array(
"stripslashes((strlen('\\2')>0?'\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a> \\3':'\\0'))",
"stripslashes((strlen('\\2')>0?'\\1<a href=\"http://\\2\" target=\"_blank\">\\2</a> \\4':'\\0'))",
"stripslashes((strlen('\\2')>0?'\\1<a href=\"\\2\" target=\"_blank\">\\3</a> ':'\\0'))",
), $text);
return $text;
How would I go about modifying an existing function, such as the one above, to exclude matches wrapped in HTML tags such as `<img`, without hurting its functionality?
Example:
`<img src="https://example.com/image.jpg">`
turns into
`<img src="<a href="https://example.com/image.jpg" target="_blank">example.com/image.jpg</a>">`
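To make the desired behaviour concrete, here is a minimal sketch of one possible approach, not a drop-in replacement for the function above. It assumes PHP 5.3 or later and only handles `http://` and `https://` URLs (the `www.` and bare-domain rules are left out). The function name `linkify_outside_tags` is hypothetical. The idea is to split the text into tags and plain text with `preg_split`, then linkify only the plain-text pieces.

```php
<?php
// Hypothetical helper (not part of the existing code): linkify URLs only
// in text that sits outside HTML tags.
function linkify_outside_tags($text)
{
    // Split into tags and plain text. PREG_SPLIT_DELIM_CAPTURE keeps the
    // matched tags in the result. Existing <a>...</a> elements are captured
    // whole so that their contents are skipped as well.
    $parts = preg_split(
        '/(<a\b[^>]*>.*?<\/a>|<[^>]+>)/is',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    foreach ($parts as $i => $part) {
        // Odd indexes hold the captured delimiters (tags); leave them alone.
        if ($i % 2 === 1) {
            continue;
        }
        // Even indexes hold plain text; wrap each http(s) URL in a link.
        $parts[$i] = preg_replace_callback(
            '/\bhttps?:\/\/[^<>\s"\']+/i',
            function ($m) {
                $url = $m[0];
                // Link label is the URL without its scheme.
                $label = preg_replace('/^https?:\/\//i', '', $url);
                return '<a href="' . $url . '" target="_blank">' . $label . '</a>';
            },
            $part
        );
    }

    return implode('', $parts);
}
```

With this sketch, `linkify_outside_tags('<img src="https://example.com/image.jpg">')` should return the `img` tag unchanged, while a URL in the surrounding text is still turned into a link.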
I have done some searching before posting, the most popular hits I am turning up are;
Obviously the common trend is "This is the wrong way to do it", which is true. While I agree, I also want to keep the function quite light. The system is used privately within the organisation, and we only wish to process img tags and URLs automatically using this. Everything else is left plain: no lists, code tags, quotes, etc.
I greatly appreciate your assistance here.
Summary: How do I modify an existing set of regular expression rules to exclude matches found within an img or other HTML tag in a block of text?