I am trying to parse an HTML table to get the content of the <td> ID HERE </td> cells using XPath and PHP.
Executing the following line
$doc->loadHTMLFile($file);
gives me warnings like this:
PHP Warning: DOMDocument::loadHTMLFile(): Unexpected end tag : tr in...
To suppress these warnings, I am using the following block of code:
libxml_use_internal_errors(true);
$doc->loadHTMLFile($file);
libxml_clear_errors();
I am trying to parse this (the entire page here):
<table class="object-table" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<th width="8%">something here</th>
<th width="89%">something here</th>
<th width="3%">something here</th>
</tr>
<tr class="normal-row">
<td>ID number here</td>
<td><a href="/catalog/view/id/4127">something here</a>
</td>
<td align="center">
<img src="/design/img/hasnt_photo_icon.gif">
</td>
</tr>
<tr class="odd-row">
<td>ID number here</td>
<td><a href="/catalog/view/id/1865">something here</a>
</td>
<td align="center">
<img src="/design/img/hasnt_photo_icon.gif">
</td>
</tr>
</tbody>
</table>
with the following code:
$file = "http://www.sportsporudy.gov.ua/catalog/#c[1]=1";
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTMLFile($file);
libxml_clear_errors();
$xpath = new DOMXPath($doc);
$query = '//tr[@class="odd-row"]';
$elements = $xpath->query($query);
printf("Size of array: %d\n", sizeof($elements));
printElements($elements);
I also tried different queries, such as
//table[@class="object-table"]/tbody/tr ...
but none of them seems to give me the td tags I need. Maybe that's because of the broken HTML.
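For reference, here is a minimal sketch of what I would expect to work. It assumes the table markup shown above is actually present in the document that gets parsed, so it loads a shortened inline copy with loadHTML() instead of fetching the URL. The ID values 4127 and 1865 and the names $html, $cells and $ids are placeholders I made up for illustration, not data from the real page.

```php
<?php
// Shortened copy of the table above; the ID values are placeholders.
$html = <<<HTML
<table class="object-table" cellpadding="0" cellspacing="0">
<tbody>
<tr><th>something here</th><th>something here</th><th>something here</th></tr>
<tr class="normal-row"><td>4127</td><td><a href="/catalog/view/id/4127">something here</a></td><td align="center"></td></tr>
<tr class="odd-row"><td>1865</td><td><a href="/catalog/view/id/1865">something here</a></td><td align="center"></td></tr>
</tbody>
</table>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// First <td> of every row; "//tr" is used instead of "/tbody/tr" so the
// query matches whether or not a <tbody> element is present.
$cells = $xpath->query('//table[@class="object-table"]//tr/td[1]');

// A DOMNodeList reports its size through the ->length property.
printf("Cells found: %d\n", $cells->length);

$ids = [];
foreach ($cells as $cell) {
    $ids[] = trim($cell->textContent);
}
print_r($ids);
```

If the same query finds nothing when run against the live URL, that would suggest the table is not in the HTML that loadHTMLFile() receives, rather than a problem with the XPath expression itself.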
Thanks for your advice.