PHP CURL / XPATH - Links not working

Question

I'm using the following code to scrape some elements from an external page, http://psnc.org.uk/our-latest-news-category/psnc-news/

I want to scrape the "PSNC News" latest news section.

$ch = curl_init("http://psnc.org.uk/our-latest-news-category/psnc-news/");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec($ch);
curl_close($ch);

$document = new DOMDocument;
libxml_use_internal_errors(true);
$document->loadHTML($output);
$xpath = new DOMXPath($document);

$tweets = $xpath->query("//article[@class='news-template-box']");

echo "<html><body>";
foreach ($tweets as $tweet) {
    echo "\n<p>".$tweet->nodeValue."</p>\n";
}
echo "</body></html>";

It successfully scrapes the text, but the links (hrefs), images, and in fact all other elements do not appear.

Am I missing something?

When you use $xpath->query("*"); you get all the data.

– L. Vadim, Commented Jan 5, 2017 at 16:16
I only want to scrape one div, not the entire page.

– itguyme, Commented Jan 5, 2017 at 16:18
Which div?

– L. Vadim, Commented Jan 5, 2017 at 16:22
<article class="news-template-box">

– itguyme, Commented Jan 5, 2017 at 16:26
Or <div class="page-content twelve columns clear">

– itguyme, Commented Jan 5, 2017 at 16:26

harry · Accepted Answer · 2017-01-05 16:28:08Z

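For an element node, DOMNode::nodeValue returns the same thing as DOMNode::textContent: only the text content, with all markup (links, images, child tags) stripped. To output the matched elements with their markup intact, serialize each node with DOMDocument::saveHTML() instead.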

http://php.net/manual/en/class.domnode.php#domnode.props.nodevalue

$tweets = $xpath->query("//article[@class='news-template-box']");

foreach ($tweets as $tweet) {
    echo $document->saveHTML($tweet);
}
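
To see the difference without hitting the live site, here is a minimal sketch. It assumes a made-up snippet of markup in $html standing in for the fetched page, and the innerHtml() helper is a hypothetical addition (not part of the answer above) for the case where you want the children of the <article> without the wrapping tag itself.

```php
<?php
// Hypothetical sample markup standing in for the page fetched with cURL.
$html = '<html><body>'
      . '<article class="news-template-box">'
      . '<h2><a href="/news/1">Title</a></h2><p>Summary</p>'
      . '</article>'
      . '</body></html>';

$document = new DOMDocument();
libxml_use_internal_errors(true);   // keep libxml parse warnings out of the output
$document->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($document);
$article = $xpath->query("//article[@class='news-template-box']")->item(0);

// Text of all descendants concatenated; tags and attributes are lost.
$textOnly = $article->nodeValue;

// Serialized element, including the <article> tag and everything inside it.
$outer = $document->saveHTML($article);

// Hypothetical helper: serialize only the children of a node ("inner HTML").
function innerHtml(DOMNode $node): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveHTML($child);
    }
    return $out;
}

$inner = innerHtml($article);
```

Here $textOnly carries no markup at all, while $outer and $inner keep the <a href="..."> link. Note that hrefs scraped this way stay exactly as they appear in the source page, so relative URLs will not resolve on your own domain unless you rewrite them.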

