In a piece of text, I would like to replace certain div tags with li tags. Not all of them, only well-defined ones: in this case, those whose id begins with "tab-". I need a simple way to do this with PHP functions. From this text:

<div id="tab-141285" class="my-class">                         
  <div class="my-subclass">              
    <div>                     
     Lorem ipsum dolor sit amet consectetuer                 
    </div>             
  </div>                 
</div>                   
<div id="tab-85429"  class="my-class">                                  
  <div class="my-subclass">              
    <div>                      
    Lorem ipsum dolor sit amet consectetuer                  
    </div>             
  </div>                 
</div>

I want to get this text:

<li id="tab-141285" class="my-class">                          
  <div class="my-subclass">              
    <div>                     
     Lorem ipsum dolor sit amet consectetuer                 
    </div>             
  </div>                 
</li>                
<li id="tab-85429"  class="my-class">                                   
  <div class="my-subclass">              
    <div>                      
    Lorem ipsum dolor sit amet consectetuer                  
    </div>             
  </div>                 
</li>

Can you advise me?

Thank you

2
  • Where's the code you've tried? I see nothing related to preg_replace. Commented Oct 10, 2014 at 6:20
  • I would advise against a primitive substring replacement approach. Though it is possible, it carries the risk that things break the moment the markup is not written exactly as expected. Instead, you should take a look at a PHP-based DOM parser and manipulator. Commented Oct 10, 2014 at 6:39

2 Answers

1

Regular expressions are not adequate for parsing HTML. Any regex you try to use will be fragile. I suggest using the DOM extension for this instead.

The idea is to:

  1. Find all the <div> elements that have an id attribute that begins with "tab-" using the XPath query //div[starts-with(@id, "tab-")]
  2. Create a new <li> element for each of them.
  3. Move all the <div>'s attributes and child nodes to the new <li> element.
  4. Replace the old <div> with the new <li>.

Because your string doesn't have a single root element, we wrap it in a temporary <div> before loading it, then rebuild the string from that wrapper's child nodes afterwards.

Example:

$html = <<<'HTML'
<div id="tab-141285" class="my-class">
  <div class="my-subclass">
    <div>
     Lorem ipsum dolor sit amet consectetuer
    </div>
  </div>
</div>
<div id="tab-85429"  class="my-class">
  <div class="my-subclass">
    <div>
    Lorem ipsum dolor sit amet consectetuer
    </div>
  </div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML("<div>$html</div>", LIBXML_HTML_NOIMPLIED);
$xpath = new DOMXPath($dom);

$nodes = $xpath->query('//div[starts-with(@id, "tab-")]');

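foreach ($nodes as $node) {
    $li = $dom->createElement('li');

    // Move the attributes one at a time; each move shortens the list
    while ($node->attributes->length) {
        $li->setAttributeNode($node->attributes->item(0));
    }
    // Move the child nodes the same way, preserving their order
    while ($node->firstChild) {
        $li->appendChild($node->firstChild);
    }

    // Swap the now-empty <div> for the populated <li>
    $node->parentNode->replaceChild($li, $node);
}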

$html = '';
foreach ($dom->documentElement->childNodes as $node) {
    $html .= $dom->saveHTML($node);
}

echo $html;

Output:

<li id="tab-141285" class="my-class">
  <div class="my-subclass">
    <div>
     Lorem ipsum dolor sit amet consectetuer
    </div>
  </div>
</li>
<li id="tab-85429" class="my-class">
  <div class="my-subclass">
    <div>
    Lorem ipsum dolor sit amet consectetuer
    </div>
  </div>
</li>
2 Comments

Thank you, this is what I wanted. The only change I made was at the end: instead of echo $html; I used echo preg_replace('/(<\/?body>)|(<\/?html>)/i', '', $html);
Hi, I still have one question after all. Why is it $dom->loadHTML("<div>$html</div>", LIBXML_HTML_NOIMPLIED); and not $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED);?
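
One way to explore this question is to load the same string both ways and compare the results. The sketch below is not from the answer; the sample string is made up for illustration. My understanding is that LIBXML_HTML_NOIMPLIED stops the parser from adding the implied <html> and <body> elements, so a fragment with several top-level elements has no single root to hold them. How libxml arranges the elements in that case may depend on its version, so the second part just prints what you get rather than promising a particular result.

```php
<?php
// Illustrative input: two top-level elements, no common root.
$html = '<div id="tab-1">One</div><div id="tab-2">Two</div>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

// Wrapped: the temporary outer <div> is the single root element,
// and both original elements are its direct children.
$wrapped = new DOMDocument();
$wrapped->loadHTML("<div>$html</div>", LIBXML_HTML_NOIMPLIED);
echo $wrapped->documentElement->nodeName, ' with ',
     $wrapped->documentElement->childNodes->length, " child nodes\n";

// Unwrapped: there is no single root in the input, so inspect
// whatever structure the parser decided to build.
$bare = new DOMDocument();
$bare->loadHTML($html, LIBXML_HTML_NOIMPLIED);
echo $bare->saveHTML();
```

If the second dump shows the elements nested or reordered on your setup, that would explain why the answer adds the wrapper and then reads back only its children.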
0

Use PHP's DOMDocument component: load the string into a DOMDocument object, search for the elements, get each element's id attribute, check it against your pattern with a preg_* function, and remove the element if it meets your condition.
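
A minimal sketch of how that description might look in code, under a few assumptions: the function name divsToListItems and its $idPattern parameter are hypothetical, preg_match is used for the id check (preg_replace is not needed just to test a pattern), and the matching <div> is swapped for an <li> rather than simply removed, since that is what the question asks for. The wrapper <div> trick is borrowed from the other answer.

```php
<?php
// Hypothetical helper: turns every <div> whose id matches $idPattern
// into an <li>, keeping its attributes and child nodes.
function divsToListItems(string $html, string $idPattern = '/^tab-/'): string
{
    $dom = new DOMDocument();
    // Wrap the fragment in a temporary <div> so it has a single root.
    $dom->loadHTML("<div>$html</div>", LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    // getElementsByTagName() returns a live list, so copy it to an
    // array before changing the tree.
    $divs = iterator_to_array($dom->getElementsByTagName('div'), false);

    foreach ($divs as $div) {
        // Skip elements whose id does not match (a missing id is '').
        if (!preg_match($idPattern, $div->getAttribute('id'))) {
            continue;
        }

        $li = $dom->createElement('li');
        // Copy the attributes in their original order.
        foreach (iterator_to_array($div->attributes, false) as $attr) {
            $li->setAttribute($attr->nodeName, $attr->nodeValue);
        }
        // Move the child nodes across, preserving their order.
        while ($div->firstChild) {
            $li->appendChild($div->firstChild);
        }
        $div->parentNode->replaceChild($li, $div);
    }

    // Rebuild the string from the wrapper's children, dropping the wrapper.
    $out = '';
    foreach ($dom->documentElement->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo divsToListItems('<div id="tab-1" class="a"><div>One</div></div><div id="other">Two</div>');
```

Only the first <div> in the usage line has an id starting with "tab-", so it is the only one renamed; the inner <div> and the one with id "other" are left alone.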

1 Comment

What should that look like?