I am processing very simple HTML using PHP's DOMDocument and DOMXPath, but I am getting the same value for every DOM node.

PHP Code

<?php

$html = <<<HTML

    <div id="my-cats">

        <ul class="category_list">

            <li class="item reference">
                <span class="the_score"><b>35</b></span>
                <span class="the_category">Reference / Education</span>
            </li>

            <li class="item computer">
                <span class="the_score"><b>50</b></span>
                <span class="the_category">Computer / Internet</span>
            </li>

        </ul>

        <ul class="category_list">

            <li class="item home">
                <span class="the_score"><b>22</b></span>
                <span class="the_category">Home / Gardening</span>
            </li>

            <li class="item home">
                <span class="the_score"><b>12</b></span>
                <span class="the_category">Home / Repair</span>
            </li>

        </ul>

    </div>

HTML;

$dom = new DOMDocument();
@$dom->loadHTML($html);

$finder = new DOMXPath($dom);

$cats = $finder->query('//div[@id="my-cats"]//ul[@class="category_list"]//li');

foreach( $cats as $li ){
    echo $li->getAttribute('class') . "\n";
    $value = trim($finder->query('//span[@class="the_score"]', $li)->item(0)->nodeValue);
    $key = trim($finder->query('//span[@class="the_category"]', $li)->item(0)->nodeValue);
    echo "$key : $value\n";
}

Output

item reference 
Reference / Education : 35 

item computer 
Reference / Education : 35 

item home 
Reference / Education : 35 

item home 
Reference / Education : 35

As you can see, I am echoing the class names, which shows that the $li element being processed is different on each iteration. Yet I only ever get the value from the first DOM node.

You can see the problem live here: https://3v4l.org/tjJB5

1 Answer

Change the inner queries to a relative path, e.g. $finder->query('span[@class="the_score"]', $li), so that they search only the span children of $li.
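As a minimal sketch of that change, the loop from the question could look roughly like the following. The markup is cut down to two items for brevity, and the $results array and the null checks are additions for illustration, not part of the original code.

```php
<?php
// Reduced version of the question's markup (two items only), for illustration.
$html = <<<HTML
<div id="my-cats">
    <ul class="category_list">
        <li class="item reference">
            <span class="the_score"><b>35</b></span>
            <span class="the_category">Reference / Education</span>
        </li>
        <li class="item computer">
            <span class="the_score"><b>50</b></span>
            <span class="the_category">Computer / Internet</span>
        </li>
    </ul>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$finder = new DOMXPath($dom);

// $results is a hypothetical helper array: category name => score.
$results = [];
foreach ($finder->query('//div[@id="my-cats"]//ul[@class="category_list"]/li') as $li) {
    // No leading slash: both paths are evaluated relative to the current $li.
    $score    = $finder->query('span[@class="the_score"]', $li)->item(0);
    $category = $finder->query('span[@class="the_category"]', $li)->item(0);
    if ($score === null || $category === null) {
        continue; // skip items that lack either span
    }
    $results[trim($category->nodeValue)] = trim($score->nodeValue);
}

foreach ($results as $key => $value) {
    echo "$key : $value\n";
}
// Prints:
// Reference / Education : 35
// Computer / Internet : 50
```

Each iteration now reads the spans of its own li, so the two categories appear with their own scores.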

2 Comments

Thank you! Can you explain a bit?
A path starting with / selects down from the root of the document containing the context node, so your attempts with $finder->query('//span[@class="the_score"]', $li) search the whole document in which the $li element is contained. If you want to select child elements, the shortest path is span, which is short for child::span. Descendants can be found with descendant::span or .//span.
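To make the difference between these path forms concrete, here is a small sketch that runs each of them against the same context node. The two-item list and the variable names are made up for the demonstration and are much simpler than the markup in the question.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<ul><li><span><b>35</b></span></li><li><span><b>50</b></span></li></ul>');
$xpath = new DOMXPath($dom);

// Use the second <li> as the context node for every query below.
$secondLi = $xpath->query('//li')->item(1);

$absolute   = $xpath->query('//span', $secondLi); // starts at the document root
$child      = $xpath->query('span', $secondLi);   // child::span of the context node
$descendant = $xpath->query('.//b', $secondLi);   // any <b> below the context node
$childB     = $xpath->query('b', $secondLi);      // <b> is a grandchild, not a child

echo $absolute->length, ' ', $absolute->item(0)->nodeValue, "\n"; // 2 35
echo $child->length, ' ', $child->item(0)->nodeValue, "\n";       // 1 50
echo $descendant->length, "\n";                                   // 1
echo $childB->length, "\n";                                       // 0
```

The absolute path ignores the context node and returns the first span in the document, which is the behaviour seen in the question. The relative forms stay inside the second li.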
