Hi all. I usually find my answers by searching the web and Stack Overflow, but this time I couldn't resolve my issue. I'm using PHP DOM to parse a website and extract some data from it, but for some reason every approach I have tried returns fewer items than there are on the page.
I tried "PHP Simple HTML DOM", "PHP Advanced HTML DOM" and the native PHP DOM, but in this case I still get only 14 article tags.
http://www.emol.com/movil/nacional/
This page has 28 elements tagged "article", but I always get 14 (or fewer).
I tried the classic find (from Simple and Advanced) with every combination I could think of, and with the native DOM I tried both an XPath query and getElementsByTagName.
$xpath->query('//article');
$xpath->query('//*[@id="listNews"]/article[6]'); // even this doesn't work
$html->find('article:not(.sec_mas_vistas_emol), article'); // returns 14
So my guess was that the problem is in how I load the URL. I tried the classic "file_get_html", curl, and some custom functions, and they all give the same result. What is stranger is that if I copy the full HTML into an online XPath tester and run the query '//article', it finds all of them. These are my two latest tests:
//Way 1
$html = file_get_html('http://www.emol.com/movil/nacional/');
$lidata = $html->find('article');
//Way 2
$url = 'http://www.emol.com/movil/nacional';
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$e = curl_exec($ch);
$dom = new DOMDocument;
@$dom->loadHTML($e); // also tried loadHTMLFile and libxml_use_internal_errors
$xpath = new DOMXPath($dom);
$xpath->query('//article');
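One check that might help narrow this down is to count the article tags in the raw response string and compare that with what the DOM returns for the same string. The following is only a minimal sketch: countArticles is a hypothetical helper name, and the inline $sample markup is a stand-in so the snippet runs without network access; in practice the string would be the curl response ($e above).

```php
<?php
// Hypothetical helper (not part of the tests above): compares how many
// <article> tags appear in the raw markup with how many DOMDocument keeps.
function countArticles(string $html): array
{
    // Count opening <article> tags directly in the raw string.
    $rawCount = preg_match_all('/<article\b/i', $html);

    // Parse the same string with the native DOM and count via XPath.
    $previous = libxml_use_internal_errors(true); // collect parser warnings quietly
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $domCount = $xpath->query('//article')->length;

    return ['raw' => $rawCount, 'dom' => $domCount];
}

// Assumed sample markup; replace with the fetched response to test the real page.
$sample = '<html><body><div id="listNews">'
        . '<article>one</article><article>two</article>'
        . '</div></body></html>';

$counts = countArticles($sample);
echo "raw: {$counts['raw']}, dom: {$counts['dom']}\n";
```

If the raw count for the fetched page is already 14, the difference would seem to come from what the server sends to the script rather than from the parser; if the raw count is 28 and the DOM count is 14, the parser would be the more likely suspect.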
Any suggestions on what the issue could be and how to fix it? This is my first attempt at using PHP DOM, so it is quite possible that I'm missing something.