4

I'm doing some web scraping and the project is almost done, except for I need to click a javascript link and I can't work out how to with Python and mechanize.

On one of the pages, a list of javascript links appear and I want to follow them in turn, scrape some data, and repeat. I know mechanize doesn't work with javascript but does anyone know a workaround? Here's the code I use to isolate the links:

for Auth in iterAuths:
     Auth = str(Auth.contents[0]).strip()
     br.find_link(text=Auth)

now if I do br.follow_link(text=Auth), I get an error urllib2.URLError: <urlopen error unknown url type: javascript>.

If I do print br.click_link(text=Auth'), it prints as Request for javascript:SendThePage('5660')

I just need to get through the javascript link. Can anyone help?

1 Answer 1

2

When I needed to do something similar, I looked at the links I was trying to follow.

Some of them were static links generated with javascript. They were predictable/consistent enough that I could manually generate a list before hand.

Others were just constructed URLs with parameters. These too could be analyzed before hand and generated python-side and passed as a request instead of a "click on this link."

If you need to actually execute the javascript, you could run a PyV8 + Mechanize hybrid. I've been playing with this a bit and it seems pretty cool. PyV8 bridges Python with the V8 Javascript engine allowing you to create JS environments and execute arbitrary code. It does a great job going back and forth between the two languages.

I don't have any sample code, but one of these 3 solutions will work for you :) Good luck!

Sign up to request clarification or add additional context in comments.

2 Comments

Do you have any resources on installing PyV8, I'm having some trouble.
@ConstantlyConfused at this point, I'd just consider using Selenium with a PhantomJS backend so you can run it headless. That workflow has come so far, and mechanize is pretty dead.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.