0

I want to remove the following - title=\"huluId-581956\" - from a string so that:

<a title=\"huluId-581956\" href="somelink">My Link</a>

becomes

<a href="somelink">My Link</a>

I'm basically looking to take the title attribute out. I refined my expression on regexpal and put it into preg_replace as follows:

$myString ='<a title=\"huluId-581956\" href="somelink">My Link</a>';

$myString = preg_replace('/(title=\\)("huluId-)[0-9]+\\(")/', '', $myString);
$myString = preg_replace('/(title=\\)("huluId-)[0-9]+(\\")/', '', $myString);

But although on regexpal I have no problem selecting the title attribute, when I place the expression into preg_replace it does NOT work.

Any help would be greatly appreciated as I have no idea why this would be so.

Thank you!

2
  • Well, the variable you're wanting to replace from is $html, but you put the contents in $myString. If not that, maybe too many \ . And perhaps you could use an XML parser to pull the attribute out, just in case your <a> is less well behaved in the future? Commented Jan 16, 2014 at 3:17
  • Sorry, that was a mistake I put when shortening the thing for posting. I corrected it now. It obviously should be $myString. As for the XML parse could you please explain further - I have no experience in that dept. Thanks Commented Jan 16, 2014 at 3:25

3 Answers

2

Simply use this instead:

$myString = preg_replace('/\s+title=\\\\"[^"]+"/', '', $myString);
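The number of backslashes matters because the pattern passes through two layers: the PHP single-quoted string literal first, then the regex engine. Here is a minimal sketch of the difference, using the string from the question; the variable names $twoSlashes and $fourSlashes are made up for illustration:

```php
<?php
// Single quotes keep the backslashes, so the subject really contains \" sequences.
$myString = '<a title=\"huluId-581956\" href="somelink">My Link</a>';

// '\\' in a single-quoted literal is ONE backslash, so PCRE receives \" here,
// which matches a bare quotation mark and never accounts for the backslash in the subject.
$twoSlashes = preg_replace('/\s+title=\\"[^"]+"/', '', $myString);

// '\\\\' reaches PCRE as \\, which matches one literal backslash.
$fourSlashes = preg_replace('/\s+title=\\\\"[^"]+"/', '', $myString);

var_dump($twoSlashes === $myString); // no match, so the string comes back unchanged
echo $fourSlashes, PHP_EOL;          // <a href="somelink">My Link</a>
```

The leading \s+ also consumes the space before title, so no double space is left behind in the tag.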

Also, since I don't know the context you're using this in, consider using a DOM parser, because regex is not the appropriate tool for parsing HTML. A DOM parser such as PHP Simple HTML DOM Parser can do this easily.

3 Comments

@mikevoermans, corrected, you must use 3 slashes to escape the 4th xD
Does not work. $myString = '<a title=\"huluId-581956\" href=\"hulu.com\" rel=\"nofollow\">asdfasdf</a>'; $clean_html = preg_replace('/\s+title=\\"[^"]+"/', '', $myString); echo $clean_html;
Thank you very much. Yep, that did it. Aha - I see... Thanks I will try. I get this html as a long string stored in $myString
0

The backslashes are what break the regex. Strip them out first and the pattern becomes much simpler.

$myString ='<a title=\"huluId-581956\" href="somelink">My Link</a>';
$myString = stripslashes($myString);
$myString = preg_replace('/title="huluId-[0-9]+" /', '', $myString);
echo $myString;
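One thing worth checking before taking this route: stripslashes() works on the whole string, not only on the title attribute. Here is a small sketch, assuming a longer input like the one quoted in the comment on the first answer; the names $stripped and $clean are illustrative:

```php
<?php
// Assumed input: every attribute in the string is backslash-escaped.
$myString = '<a title=\"huluId-581956\" href=\"hulu.com\" rel=\"nofollow\">asdfasdf</a>';

// stripslashes() removes the backslash in front of every quote in the string.
$stripped = stripslashes($myString);

// With the backslashes gone, the plain pattern matches the title attribute.
$clean = preg_replace('/title="huluId-[0-9]+" /', '', $stripped);

echo $clean, PHP_EOL; // <a href="hulu.com" rel="nofollow">asdfasdf</a>
```

The href and rel attributes lose their backslashes as well, so this only fits when the rest of the string is allowed to change.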

1 Comment

Hi - this is part of a much longer string and I cannot mess with it, except of course for deleting the title attribute. Yes, if the backslashes were not there it would be much simpler, but as such, they must be. Unless you know of a way of stripping the backslashes ONLY for that section - the title tag.
0

Given that the backslash always follows "title=", a simpler regex will do:

/title=\\"(.)*?"/ 

It matches everything after 'title=\"', and the "?" makes the quantifier lazy, so the match ends at the first quotation mark that follows.

The code:

$myString ='<a title=\"huluId-581956\" href="somelink">My Link</a>';

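// Inside a single-quoted PHP string, \\\\ reaches the regex engine as \\, i.e. one literal backslash.
$myString = preg_replace('/title=\\\\"(.)*?"/', '', $myString);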
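The effect of the "?" can be checked directly. This is a minimal sketch, assuming the pattern is written with four backslashes inside the single-quoted PHP literal so that the regex engine receives an escaped (literal) backslash; $lazy and $greedy are illustrative names:

```php
<?php
$myString = '<a title=\"huluId-581956\" href="somelink">My Link</a>';

// Lazy: (.)*? stops at the first quotation mark after the opening one.
$lazy = preg_replace('/title=\\\\"(.)*?"/', '', $myString);

// Greedy: (.)* runs on to the last quotation mark on the line.
$greedy = preg_replace('/title=\\\\"(.)*"/', '', $myString);

echo $lazy, PHP_EOL;   // <a  href="somelink">My Link</a>
echo $greedy, PHP_EOL; // <a >My Link</a>
```

The lazy version leaves the space that preceded title, which is why two spaces follow "<a" in its output. Prefixing the pattern with \s+ would remove that space too.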
