I'm struggling to find an answer to the following. I suspect I don't really know what I'm asking for or how to ask it, so let me describe it:
I would like to grab some links from a page. I only want the links that have the following word as part of the URL: "advertid". For example, the URL would be something like http://thisisanadvertis.com/questions/ask.
I've got this far:
<?php
// This is our starting point. Change this to whatever URL you want.
$start = "https://example.com";

function follow_links($url) {
    // Create a new instance of PHP's DOMDocument class.
    $doc = new DOMDocument();
    // Use file_get_contents() to download the page, pass the output of file_get_contents()
    // to PHP's DOMDocument class.
    @$doc->loadHTML(@file_get_contents($url));
    // Create an array of all of the links we find on the page.
    $linklist = $doc->getElementsByTagName("a");
    // Loop through all of the links we find.
    foreach ($linklist as $link) {
        echo $link->getAttribute("href")."\n";
    }
}

// Begin the crawling process by crawling the starting link first.
follow_links($start);
?>
This returns all URLs on the page, which is OK. To get only the URLs I wanted, I tried a few things, including amending the getAttribute() call:
echo $link->getAttribute("href"."*advertid*")."\n";
I've tried a few other things but can't get what I want. Can someone point me in the right direction? I'm a bit stuck.
Many thanks in advance.
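One possible direction, offered as a minimal sketch rather than a definitive fix: getAttribute() only takes an attribute name and does no pattern matching, so the filtering probably has to happen on the string it returns. The sketch below assumes a plain substring check with strpos() is enough. The helper name filter_links() and the sample markup are hypothetical, and the page is passed in as a string instead of being downloaded, so the snippet runs on its own.

```php
<?php
// Hypothetical helper: returns the href of every <a> in $html whose URL contains $needle.
function filter_links(string $html, string $needle): array {
    $doc = new DOMDocument();
    // Suppress the warnings DOMDocument raises on imperfect real-world HTML.
    @$doc->loadHTML($html);

    $matches = [];
    foreach ($doc->getElementsByTagName("a") as $link) {
        $href = $link->getAttribute("href");
        // strpos() returns false when the needle is absent and 0 when it is at
        // the very start, so the comparison must be the strict !== false.
        if (strpos($href, $needle) !== false) {
            $matches[] = $href;
        }
    }
    return $matches;
}

// Sample markup standing in for a downloaded page (assumed for illustration).
$html = '<html><body>'
      . '<a href="http://example.com/advertid/1">one</a>'
      . '<a href="http://example.com/about">two</a>'
      . '<a href="http://example.com/item?advertid=42">three</a>'
      . '</body></html>';

$found = filter_links($html, "advertid");
foreach ($found as $href) {
    echo $href . "\n";
}
```

With the original follow_links() function, the same check would go inside the foreach loop, around the echo. If the match needs to be case-insensitive, stripos() could replace strpos(). An XPath query such as //a[contains(@href, "advertid")] run through DOMXPath would be another way to do the filtering.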