I am using XPath in PHP to retrieve part of an HTML document. Assume my HTML document looks like this:

<html>
    <head>
    </head>
    <body>
        <div id="first">
            <a href="some_link_address.com">Hello</a>
            <p>Some text here</p>
        </div>
        <div id="second">
            <p>Some other text here</p>
            <img src="src/to/image.jpg" />
        </div>
    </body>
</html>

And my PHP including the XPath call is:

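// $xpath is assumed to be a DOMXPath built from the source HTML document
$result_dom = new DOMDocument('1.0', 'utf-8');
$nodes_to_keep = $xpath->query("//div[@id='first']");

foreach ($nodes_to_keep as $node) {
    // nodeValue holds only the text content of the matched node
    $element = $result_dom->createElement('div', $node->nodeValue);
    $result_dom->appendChild($element);
}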

I expected the resulting DOM to contain the following:

<div>
    <a href="some_link_address.com">Hello</a>
    <p>Some text here</p>
</div>

However, this is the resulting DOM:

<div>
    Hello
    Some text here
</div>

So my question is: how do I make the resulting DOM keep the HTML tags? I do not want them removed.

Thanks.

1 Answer

The "nodeValue" of an element is its textual content, that is, the concatenated text nodes inside it. Those text nodes do not include markup such as <a ...>, only the text inside and between the child elements, so that text is all the new element receives.
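
To see the difference on the question's first div, here is a minimal sketch. The variable names ($doc, $div, $text, $markup) are made up for illustration, and it assumes the markup is loaded with DOMDocument::loadHTML():

```php
<?php
// Load only the first div from the question (illustrative input).
$doc = new DOMDocument();
$doc->loadHTML('<div id="first"><a href="some_link_address.com">Hello</a><p>Some text here</p></div>');

$div = $doc->getElementsByTagName('div')->item(0);

// nodeValue: the text of all descendant text nodes, with no tags.
$text = $div->nodeValue;

// saveHTML($node): the element serialized with its tags and attributes.
$markup = $doc->saveHTML($div);

echo $text, "\n", $markup, "\n";
```

Here $text holds only the text, while $markup still contains the <a> and <p> elements.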

Instead of creating a node manually, import a deep copy of the result node and append that:

$importedNode = $result_dom->importNode($node, true);
$result_dom->appendChild($importedNode);
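
Put together with the question's code, the whole thing might look like the sketch below. The helper name extractNodes() is hypothetical, and it assumes the source HTML is available as a string and that the query matches a single element, as in the question:

```php
<?php
// Hypothetical helper, not from the original post: copies the nodes matched
// by $query out of $html into a new DOMDocument and returns its markup.
function extractNodes(string $html, string $query): string
{
    $source = new DOMDocument();
    $source->loadHTML($html);
    $xpath = new DOMXPath($source);

    $result_dom = new DOMDocument('1.0', 'utf-8');
    foreach ($xpath->query($query) as $node) {
        // Second argument true = deep copy, so children and attributes are kept.
        $importedNode = $result_dom->importNode($node, true);
        $result_dom->appendChild($importedNode);
    }

    return $result_dom->saveHTML();
}

$html = '<html><head></head><body>'
      . '<div id="first"><a href="some_link_address.com">Hello</a><p>Some text here</p></div>'
      . '<div id="second"><p>Some other text here</p><img src="src/to/image.jpg" /></div>'
      . '</body></html>';

$output = extractNodes($html, "//div[@id='first']");
echo $output;
```

Note that the imported div is a copy of the original element, so it also keeps its id="first" attribute.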