
I have some <img> tag src values that need to have their path removed.

Unfortunately, my HTML is invalid, so I cannot use a DOM parser and must resort to regex.

My current attempt is:

src=(\'|")\/root\/images\/([^\/]*)\/([^(\'|"]*)

to turn this:

lots of other html
<img src="/root/images/ANY MORE PATH HERE/file.jpg">
more html

into this:

lots of other html
<img src="file.jpg">
more html

The above works only when I use capture group 3 alone and there is exactly one directory beyond /root/images, but I don't know how many subdirectories a given file path will have.

Any suggestions?

3 Answers


This uses preg_replace():

<?php
// Greedy match: removes everything from the first slash to the last slash on a line.
$foo = '/\/.+\//';
$test = '<img src="/root/images/ANY MORE PATH HERE/file.jpg">';
echo preg_replace($foo, '', $test);
?>

2 Comments

With a slight variation this led me to the right answer, thanks.
This only works when the entire HTML string is a single image tag. That is not what the OP is dealing with.
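
To make the objection in the last comment concrete, here is a minimal sketch that runs the same greedy pattern over a string holding two image tags. The two-tag input is a made-up example, not the OP's actual markup.

```php
<?php
// Same pattern as in the answer above.
$pattern = '/\/.+\//';

// Hypothetical input: two image tags and closing tags on a single line.
$html = '<p><img src="/root/images/a/one.jpg"></p><p><img src="/root/images/b/two.jpg"></p>';

// ".+" is greedy, so the match runs from the first slash to the last slash
// on the line, which here is the slash inside the final "</p>".
$result = preg_replace($pattern, '', $html);

echo $result; // prints: <p><img src="p>
```

Everything between the first and last slash is lost, including the second tag, which is why the pattern needs to be anchored to the src attribute.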

It seems to me that you can match zero or more non-quote characters followed by a slash, repeated as many times as possible, and replace that substring with an empty string. This always leaves you with a src value that consists purely of the filename at the end of the path.

Code:

$html = <<<HTML
lots of other html
<img src="/root/images/ANY MORE PATH HERE/file.jpg">
more html
HTML;

echo preg_replace('~ src=[\'"]\K(?:[^\'"]*/)*~', '', $html);

Output:

lots of other html
<img src="file.jpg">
more html

Pattern: ~ src=['"]\K(?:[^'"]*/)*~

Pattern Breakdown:

~          #pattern delimiter (deliberately not slash -- to avoid escaping)
 src=      #match a space followed literally by "src="
['"]       #match either single quote or double quote
\K         #restart the fullstring match (effectively forget previously matched characters)
(?:        #start of non-capturing group
  [^'"]*   #match zero or more non-single and non-double quote characters
  /        #match a forward slash
)          #end of non-capturing group
*          #allow zero or more occurrences of the non-capturing group
~          #pattern delimiter
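
The pattern above strips the path from every src value. If only paths under /root/images/ should be touched, as in the question, one possible variation is to require that prefix right after \K. The sketch below assumes a three-tag sample input of my own, with mixed quote styles and one path outside /root/images/.

```php
<?php
// Assumed sample input, not taken from the question.
$html = <<<HTML
<img src="/root/images/a/b/c/file.jpg">
<img src='/root/images/logo.png'>
<img src="/other/path/keep.gif">
HTML;

// Same technique as above, but the literal /root/images/ prefix is required,
// so src values that start elsewhere are left untouched.
$result = preg_replace('~ src=[\'"]\K/root/images/(?:[^\'"]*/)*~', '', $html);

echo $result;
// <img src="file.jpg">
// <img src='logo.png'>
// <img src="/other/path/keep.gif">
```

Because the repeated group may match zero times, a file sitting directly in /root/images/ is handled the same way as one nested several directories deep.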

I think this is a simple solution using explode:

$src = "/root/images/ANY MORE PATH HERE/file.jpg";
$part = explode("/", $src);
$imageName = $part[count($part) - 1]; // the last element of the array is the file name

3 Comments

<img src="/root/images/ANY MORE PATH HERE/file.jpg"> is just a small part of the string
As long as the image name is at the end, this code will work. Print it out and give it a try.
This only works when the entire string is a single src attribute value. That is not what the OP is dealing with.
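
One way the explode() idea could still be applied to a larger HTML string is to let preg_replace_callback() find each src value and split only that value. This is a rough sketch rather than the answerer's method. The sample string and the variable names are assumptions, and it presumes the src values contain no quote characters.

```php
<?php
// Assumed sample input with markup around the image tag.
$html = 'lots of other html <img src="/root/images/ANY MORE PATH HERE/file.jpg"> more html';

$result = preg_replace_callback(
    // Group 1: the opening quote; group 2: the attribute value; \1: the same quote again.
    '~ src=([\'"])([^\'"]*)\1~',
    function (array $m) {
        // Split the value on slashes and keep only the last segment.
        $parts = explode('/', $m[2]);
        return ' src=' . $m[1] . end($parts) . $m[1];
    },
    $html
);

echo $result; // prints: lots of other html <img src="file.jpg"> more html
```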
