I am testing how to get all the anchor (a) tags from http://www.msnbc.msn.com/ using Simple HTML DOM.

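// Assumption: simple_html_dom.php sits in the same directory as this script.
include 'simple_html_dom.php';

// Fetch the page with cURL, sending a browser-like user agent.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.msnbc.msn.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5');
$htmls = curl_exec($ch);
curl_close($ch);

// Parse the downloaded HTML and print every link.
$html = str_get_html($htmls);
foreach ($html->find('a') as $element) {
    echo $element.'<br />';
}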

This code gets all the hyperlinks, but how can I ignore the links inside div#mainNav? I need all the links on http://www.msnbc.msn.com/ that are outside div#mainNav. Thanks.

1 Answer

Check each link's parent and skip the link when that parent is mainNav, like this:

foreach($html->find('a') as $element){ 
    if ($element->parent()->id == 'mainNav') {
        //do nothing
    } else {
        echo $element.'<br />';
    }
}

7 Comments

Do you want the links in mainNav or the links outside of mainNav?
I want the links outside of mainNav, but in my test I still get all the links on the page.
Are the links direct descendants of mainNav? parent() gets only the immediate parent. I recommend you do a var_dump() of $element->parent() and see what it shows you.
I see why it does not work: $element->parent() is the li, $element->parent()->parent() is the ul, and $element->parent()->parent()->parent() is #mainNav. Is there a better suggestion? Thanks.
I think that might be your best choice. Change the if condition to call parent() three times: if ($element->parent()->parent()->parent()->id == 'mainNav') {
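
Chaining parent() three times depends on the exact li/ul nesting described above, and a link with fewer than three ancestors would end up calling parent() on null. A possible alternative is to walk up the tree until the root is reached. The following is only a minimal sketch: hasAncestorWithId() and linksOutside() are hypothetical helper names, not part of Simple HTML DOM, and the sketch assumes that parent() returns null at the root of the document and that $html is the object returned by str_get_html() in the question.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): returns true when any
// ancestor of $node carries the given id attribute.
function hasAncestorWithId($node, $id)
{
    // Assumption: parent() returns null once the document root is passed.
    for ($p = $node->parent(); $p !== null; $p = $p->parent()) {
        if ($p->id === $id) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: collects every link that is not nested, at any depth,
// inside the element with the given id.
function linksOutside($html, $id)
{
    $links = array();
    foreach ($html->find('a') as $element) {
        if (!hasAncestorWithId($element, $id)) {
            $links[] = $element;
        }
    }
    return $links;
}
```

With these helpers, the loop from the question could become foreach (linksOutside($html, 'mainNav') as $element) { echo $element.'<br />'; } and it would keep working if the nesting depth inside mainNav changes.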