I'm trying to scrape data from some websites. For most sites it works fine, but for one website I can't get any HTML at all. This is my code:
<?php
include_once('simple_html_dom.php');
// urlencode() the input so spaces and special characters don't break the query string
$html = file_get_html('https://www.magiccardmarket.eu/?mainPage=showSearchResult&searchFor=' . urlencode($_POST['data']));
echo $html;
?>
I'm using AJAX to fetch the data. When I log the returned value in my JS, it's completely empty.
Could it be because this website runs over https? And if so, is there any way to work around it? (I've tried changing the URL to http, but I get the same result.)
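To rule out the https theory, here's a quick, generic check (not specific to this site) of whether my PHP build supports the https wrapper at all:

<?php
// The https:// stream wrapper is only registered when PHP is built
// with OpenSSL support; without it, file_get_contents() on an https
// URL fails immediately.
var_dump(extension_loaded('openssl'));              // bool(true) if OpenSSL is loaded
var_dump(in_array('https', stream_get_wrappers())); // bool(true) if https:// is available
?>

If both come back true, https support itself isn't the issue.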
Update:
If I var_dump the $html variable, I get bool(false).
My PHP error log says this:
[27-Feb-2014 22:20:50 Europe/Amsterdam] PHP Warning: file_get_contents(http://www.magiccardmarket.eu/?mainPage=showSearchResult&searchFor=tarmogoyf): failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden in /Users/leondewit/PhpstormProjects/Magic/stores/simple_html_dom.php on line 75
Answer:

Try var_dump($html); instead of just echoing it, to see whether you get any result at all. As your update shows, file_get_html() returns bool(false), which means the request itself is failing.

The 403 Forbidden error code is sent by the server you are trying to contact (magiccardmarket) and is usually returned when the page you are requesting requires a login. It is also possible they are blocking automated requests from user agents that are not browsers. You could try changing your user agent, but that is really a guess. If that is the case, though, they are blocking it for a reason, most likely that they simply don't want people to abuse their website.
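If you want to test the user-agent theory, here is a minimal sketch. It uses PHP's built-in stream context to send a browser-like User-Agent header (the UA string is just an example value, and there's no guarantee the site accepts it), fetches the raw HTML with file_get_contents(), and only then hands it to simple_html_dom via str_get_html():

<?php
include_once('simple_html_dom.php');

// Send a browser-like User-Agent; ignore_errors lets us read the
// response body even when the server answers with 403.
$context = stream_context_create(array(
    'http' => array(
        'header'        => "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36\r\n",
        'ignore_errors' => true,
    ),
));

$url = 'https://www.magiccardmarket.eu/?mainPage=showSearchResult&searchFor=' . urlencode($_POST['data']);
$raw = file_get_contents($url, false, $context);

// The http(s) wrapper fills $http_response_header after the call,
// so you can see the exact status line the server returned.
var_dump($http_response_header[0]);

if ($raw !== false) {
    $html = str_get_html($raw); // parse the fetched HTML with simple_html_dom
    echo $html;
}
?>

If the status line still says 403 even with a browser User-Agent, the block is probably based on something else (cookies, IP, rate limiting), and at that point it's worth accepting that the site doesn't want to be scraped.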