12

I need to extract some data from a webpage with PHP. The part that I'm interested in is structured similarly to this:

<a href="somepath" target="fruit">apple</a>
<a href="somepath" target="animal">cat</a>
<a href="somepath" target="fruit">orange</a>
<a href="somepath" target="animal">dog</a>
<a href="somepath" target="fruit">mango</a>
<a href="somepath" target="animal">monkey</a>

First, I want to extract all fruits, and then all animals, so that I have them nicely grouped.

I figured out how to loop through all attribute values. Here's the code:

$dom = new DOMDocument();
$html = file_get_contents('example.html');

@$dom->loadHTML($html);

$a = $dom->getElementsByTagName('a');

for ($i = 0; $i < $a->length; $i++) {
    $attr = $a->item($i)->getAttribute('target');

    echo $attr . "\n";
}

So I get:

fruit animal fruit animal fruit animal

I also found out how to get the elements' text content:

$a->item($i)->textContent

So, if I include that in the loop and echo it, I get:

apple cat orange dog mango monkey

I feel like I'm very close, but I can't get what I want. I need something like this:

if (target = "fruit") then give me "apple, orange, mango".

How can I do it?

3 Answers

17

Skip (continue past) the elements whose target attribute isn't fruit, and add the textContent of the remaining elements to an array.

$nodes = array();

for ($i = 0; $i < $a->length; $i++) {
    $attr = $a->item($i)->getAttribute('target');

    if ($attr != 'fruit') {
        continue;
    }

    $nodes[] = $a->item($i)->textContent;
}

$nodes now contains the text content of every element whose target attribute is set to fruit.
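
As a rough, self-contained sketch of the same filtering idea, the loop can be wrapped in a small function and called once per group. The markup is the sample from the question, inlined so the script runs without example.html; the helper name textsByTarget is hypothetical and not part of the answer.

```php
<?php
// Sample markup copied from the question, inlined for a self-contained run.
$html = <<<HTML
<a href="somepath" target="fruit">apple</a>
<a href="somepath" target="animal">cat</a>
<a href="somepath" target="fruit">orange</a>
<a href="somepath" target="animal">dog</a>
<a href="somepath" target="fruit">mango</a>
<a href="somepath" target="animal">monkey</a>
HTML;

// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Hypothetical helper: text content of every <a> whose target matches.
function textsByTarget(DOMDocument $dom, string $target): array
{
    $texts = array();
    foreach ($dom->getElementsByTagName('a') as $link) {
        if ($link->getAttribute('target') !== $target) {
            continue; // element belongs to another group
        }
        $texts[] = $link->textContent;
    }
    return $texts;
}

$fruits  = textsByTarget($dom, 'fruit');
$animals = textsByTarget($dom, 'animal');

echo implode(', ', $fruits), "\n";  // apple, orange, mango
echo implode(', ', $animals), "\n"; // cat, dog, monkey
```

This walks the node list once per group, which is fine for a handful of links; for many groups a single pass that sorts elements as it goes may be preferable.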

13

Use DOMXPath and queries:

$doc = new DOMDocument();
$doc->loadHTMLFile('yourFile.html'); // load() expects well-formed XML; loadHTMLFile() parses HTML

$xpath = new DOMXPath($doc);

$fruits = $xpath->query("//a[@target='fruit']");
foreach($fruits as $fruit) {
    // ...
}

$animals = $xpath->query("//a[@target='animal']");
foreach($animals as $animal) {
    // ...
}

See this demo.
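
One way the loop bodies might be filled in, as a minimal sketch: each query returns a DOMNodeList, and the text of each matched node can be pushed onto an array. The markup is the sample from the question, inlined so the script is self-contained; the variable names $fruitNames and $animalNames are illustrative assumptions.

```php
<?php
// Sample markup copied from the question, inlined for a self-contained run.
$html = <<<HTML
<a href="somepath" target="fruit">apple</a>
<a href="somepath" target="animal">cat</a>
<a href="somepath" target="fruit">orange</a>
<a href="somepath" target="animal">dog</a>
<a href="somepath" target="fruit">mango</a>
<a href="somepath" target="animal">monkey</a>
HTML;

// Collect parser warnings internally rather than printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);

// Select only the <a> elements whose target attribute equals 'fruit'.
$fruitNames = array();
foreach ($xpath->query("//a[@target='fruit']") as $fruit) {
    $fruitNames[] = $fruit->textContent;
}

// Same query shape for the other group.
$animalNames = array();
foreach ($xpath->query("//a[@target='animal']") as $animal) {
    $animalNames[] = $animal->textContent;
}

echo implode(', ', $fruitNames), "\n";  // apple, orange, mango
echo implode(', ', $animalNames), "\n"; // cat, dog, monkey
```

Here the filtering lives in the XPath expression, so no if or continue is needed inside the loops.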

3

Make two arrays:

$fruits = array();
$animals = array();

And in the loop, once you have the element's target attribute and its value:

if ($target == 'fruit') {
    array_push($fruits, $valueofelement);
} elseif ($target == 'animal') {
    array_push($animals, $valueofelement);
}
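
Put together with the DOM loop from the question, the two-array approach might look like the sketch below. It assumes that $valueofelement stands for the element's textContent; the markup is the question's sample, inlined so the script runs on its own.

```php
<?php
// Sample markup copied from the question, inlined for a self-contained run.
$html = <<<HTML
<a href="somepath" target="fruit">apple</a>
<a href="somepath" target="animal">cat</a>
<a href="somepath" target="fruit">orange</a>
<a href="somepath" target="animal">dog</a>
<a href="somepath" target="fruit">mango</a>
<a href="somepath" target="animal">monkey</a>
HTML;

// Collect parser warnings internally rather than printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

$fruits = array();
$animals = array();

// Single pass: sort each link into its group as it is visited.
foreach ($dom->getElementsByTagName('a') as $link) {
    $target = $link->getAttribute('target');
    $valueofelement = $link->textContent; // assumption: "value" means the link text

    if ($target == 'fruit') {
        array_push($fruits, $valueofelement);
    } elseif ($target == 'animal') {
        array_push($animals, $valueofelement);
    }
}

echo implode(', ', $fruits), "\n";  // apple, orange, mango
echo implode(', ', $animals), "\n"; // cat, dog, monkey
```

Links with any other target value are simply ignored, since neither branch matches them.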
