5
$\begingroup$

I am trying to download some data from a website that requires JavaScript to be enabled and otherwise tells me that I cannot download the data. Using

Import[ "website...", "Data"] 

I get

{....."You need Javascript enabled to view live scores. Loading... Loading... ..}

Do I have to use something like Squid to change the requester_id so that the request shows up as coming from a web browser rather than from Mathematica?

I have now fixed this problem. The second problem is that Mathematica does not seem to support iframes. Does it? Is there a way around this issue?

$\endgroup$
4
  • $\begingroup$ Please post your code and the website you are trying to scrape. You should probably post your second question as a separate question. $\endgroup$ Commented Mar 12, 2012 at 19:55
  • $\begingroup$ You wrote: "I have now fixed this problem." Please post your solution as an answer to your own question, as soon as the system will allow it (a matter of "reputation" points and time). $\endgroup$ Commented Mar 13, 2012 at 7:48
  • $\begingroup$ No, my fault, I have not fixed anything. I deluded myself into believing I had fixed this issue. I just tried a different category on the same site which did not work either but gave the iframe error message. Sorry. $\endgroup$ Commented Mar 13, 2012 at 8:17
  • $\begingroup$ @Verbeia For example, livexscores.com/livescore/football. I am interested in live quotes. $\endgroup$ Commented Mar 13, 2012 at 8:29

2 Answers 2

3
$\begingroup$

The user agent (requester_id) probably isn't the issue. Mathematica does not present itself to web servers as a standard browser (it identifies itself as Mathematica), but that usually doesn't matter for normal HTML pages. Squid can indeed help if the page tests for a specific browser (see also an earlier question on user agents).
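
If you want to rule out the user agent without setting up a proxy, a minimal sketch along the following lines may be enough. It assumes Mathematica 11 or later (for `HTTPRequest` and `URLRead`); the URL and the browser string are hypothetical placeholders, not values taken from the question.

```mathematica
(* Hypothetical placeholders: substitute the real page and any browser-like string *)
url = "https://example.com/";
browserUA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)";

(* Build a request that announces a browser-like user agent *)
request = HTTPRequest[url, <|"UserAgent" -> browserUA|>];

(* Fetch the raw HTML of the page as a string *)
html = URLRead[request, "Body"];

(* Parse the fetched HTML in the same spirit as Import[url, "Data"] *)
data = ImportString[html, {"HTML", "Data"}];
```

If `data` still contains only the "You need Javascript enabled" placeholder, the user agent was not the cause.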

Pages that are partly generated by JavaScript are a different matter, because Mathematica does not interpret JavaScript. I don't think you can get this to work.
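
On the iframe part of the question: an iframe embeds a separate document that has its own URL, so one possible workaround is to read the raw page source, pull out the iframe `src` values, and import those addresses directly. The following is a minimal sketch under that assumption. `iframeSources` and the sample HTML are hypothetical, and none of this helps if the frame's content is itself filled in by JavaScript.

```mathematica
(* Hypothetical helper: return the src attribute of every <iframe> tag in an HTML string *)
iframeSources[html_String] := StringCases[html,
   RegularExpression["(?i)<iframe[^>]*\\ssrc\\s*=\\s*[\"']([^\"']*)[\"']"] -> "$1"];

(* Made-up sample standing in for the raw source of a real page *)
sample = "<html><body><iframe width=\"300\" src=\"https://example.com/frame.html\"></iframe></body></html>";

(* List of iframe addresses found in the sample *)
frames = iframeSources[sample];
```

With a real page, `Import[url, "Source"]` would supply the HTML string, and each entry of `frames` could then be passed to `Import[..., "Data"]`. Relative addresses would first have to be resolved against the page URL.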

$\endgroup$
6
  • 1
    $\begingroup$ If you are completely crazy, or desperate/curious, you could interface with a javascript interpreter via MathLink, like FireFox's SpiderMonkey. I'm almost tempted to do it, myself, and yes I fall under the crazy/curious header. $\endgroup$ Commented Mar 13, 2012 at 1:47
  • $\begingroup$ Would there be other ways to scrape the data? With Ruby or Scrapy? $\endgroup$ Commented Mar 13, 2012 at 1:47
  • $\begingroup$ @rcollyer For example, something like livexscores.com/livescore/football? I guess Ruby would be the way to go, no? $\endgroup$ Commented Mar 13, 2012 at 8:28
  • $\begingroup$ Never mind, I am closing this question. $\endgroup$ Commented Mar 15, 2012 at 17:42
  • $\begingroup$ I am very curious about updates here. I have Mathematica 9 and also can't download a page, either because of an iframe or because Mathematica identifies itself as Mathematica rather than as, say, Chrome. Any updates? $\endgroup$ Commented Jun 2, 2013 at 15:00
2
$\begingroup$

If anyone is reading this Q&A in 2018 or later, the situation is somewhat different: there is now a Chrome driver that can be used to evaluate JavaScript. It is still experimental at the time of writing. Please look for more recent Q&As, such as this one or newer ones.

$\endgroup$
