I'm running Simple HTML DOM on PHP 7.1.

But I can't even get past the first step: the HTML is not parsed.

My code:

<?php
include 'simple_html_dom.php';

$html = file_get_html('http://google.com');

echo $html;
?>

The page displays nothing (white background) with the above code.

But the code below runs:

<?php
include 'simple_html_dom.php';
//base url
$base = 'https://google.com';
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $base);
curl_setopt($curl, CURLOPT_REFERER, $base);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$str = curl_exec($curl);
curl_close($curl);
// Create a DOM object
$html_base = new simple_html_dom();
// Load HTML from a string
$html_base->load($str);
echo $html_base;
$html_base->clear(); 
unset($html_base);
?>

Then I tried to get an img by its class, using the code below together with the code above, but it does not work:

The image HTML I want to get:

<div class="product_thumb">
<a title="Me Before You" class="image-border" href="/me-before-you-a-novel-movie-tie-in-p69988.html">
<img class="   pict lazy-img" id="det_img_00069988" 
src="/images/thumbnails/product/115x/222614_me-before-you-a-novel-movie-tie-in.jpg">
</a></div>

My Simple HTML DOM attempts. None of them work (no HTML is output on my page):

//* Find all images 1st code
foreach($html_base->find('img[class=   pict lazy-img]') as $element) 
   echo '<img src="' . $element->src . '" />' . '<br>';
//* Find all images 2nd code
foreach($html_base->find('img[class=   pict lazy-img]',0) as $element) 
   echo '<img src="' . $element->src . '" />' . '<br>';
//* Find all images 3rd code
foreach($html_base->find('img[class$=pict lazy-img]',0) as $element) 
   echo '<img src="' . $element->src . '" />' . '<br>';
//* Find all images 4th code
foreach($html_base->find('img[class$=pict lazy-img]',0) as $element) 
   echo '<img src="' . $element->src . '" />' . '<br>';
4
  • file_get_html seems to return an object, use var_dump($html) instead of echo Commented Nov 10, 2017 at 2:08
  • PHP ini file_get_contents external url - it's probably just the allow_url_fopen PHP configuration. BUT, could you enable error reporting to see the actual error? That would help with debugging this. Commented Nov 10, 2017 at 2:10
  • Running var_dump($html) on PHP 7.1 gives the same result that echo $html gives on PHP 5.6. Commented Nov 10, 2017 at 2:28
  • It works fine when I follow stackoverflow.com/a/44131040/8916968. It now runs the same way it did on PHP 5.6. Commented Nov 10, 2017 at 3:19

3 Answers

8

The file_get_html function in the simple_html_dom include file needs to be changed. The change below worked for me. See https://sourceforge.net/p/simplehtmldom/bugs/161/

Since PHP 7.1, file_get_contents interprets a negative offset (it counts from the end of the stream), so the default value of $offset has to be changed from -1 to 0:

function file_get_html($url, $use_include_path = false, $context=null, $offset = 0, $maxLen=-1, $lowercase = true, $forceTagsClosed=true, $target_charset = DEFAULT_TARGET_CHARSET, $stripRN=true, $defaultBRText=DEFAULT_BR_TEXT, $defaultSpanText=DEFAULT_SPAN_TEXT)
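To see why the old default of -1 matters, here is a minimal sketch using only PHP's built-in file_get_contents. It assumes PHP 7.1 or later, and the temporary local file is a stand-in for the fetched page, not part of Simple HTML DOM. With a remote URL the stream cannot be seeked at all, so the call fails rather than returning a tail.

```php
<?php
// Illustration only: a throwaway local file stands in for the fetched page.
$path = tempnam(sys_get_temp_dir(), 'shd');
file_put_contents($path, '<html><body>hello</body></html>');

// Offset 0 reads from the start, so the whole document comes back.
$whole = file_get_contents($path, false, null, 0);

// On PHP 7.1+ a negative offset counts from the end of the stream,
// so -1 returns only the last byte instead of the whole document.
$tail = file_get_contents($path, false, null, -1);

var_dump(strlen($whole), $tail);

// Clean up the temporary file.
unlink($path);
```

This is why changing the default to 0 restores the behaviour seen on PHP 5.6.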

3

I know this is a wee bit old, but you can always just download the newest version here: https://sourceforge.net/projects/simplehtmldom/

The newest update as of this post is 10-08-19

2

I got around this by editing the "simple_html_dom.php" file. In the method "parse_selector()" (in line 386), I changed the pattern to

$pattern = "/([\w\-:\*]*)(?:\#([\w\-]+)|\.([\w\-]+))?(?:\[@?(!?[\w\-]+)(?:([!*^$]?=)[\"']?(.*?)[\"']?)?\])?([\/, ]+)/is";

and in method "read_tag()" (in line 722)

if (!preg_match("/^[\w\-:]+$/", $tag)) {
...
}

The trick is adding a backslash before "-" in the patterns.
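The escaped tag-name pattern can be checked on its own, without the library. This is a small sketch: the sample tag names are made up for illustration. The unescaped form ([\w-:]) is the one that newer PHP versions tend to reject, most likely because PHP 7.3 moved to PCRE2, which is stricter about a bare "-" after \w inside a character class.

```php
<?php
// Same tag-name pattern as in read_tag(), with the hyphen escaped.
$pattern = "/^[\w\-:]+$/";

// Hypothetical sample names: three valid tag names, then two invalid ones.
$tags = ['img', 'my-element', 'svg:rect', 'bad tag', ''];

$results = [];
foreach ($tags as $tag) {
    // preg_match returns 1 on a match and 0 on no match.
    $results[$tag] = preg_match($pattern, $tag) === 1;
}

var_dump($results);
```

Names made of word characters, hyphens and colons match. A name containing a space, or an empty string, does not.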

1 Comment

This worked for me (PHP 7.3), thanks a lot! I also removed the offset parameter from the file_get_contents call on line 75. Setting the offset to 0 in the file_get_html function on line 70 works too.
