4

I am trying to log in to a website with Scrapy, but the response I receive is an HTML document containing only inline JavaScript. That JavaScript redirects to the page I want to scrape. Scrapy does not execute JavaScript, so it never reaches that page.

I use the following code to submit the login form:

    def parse(self, response):
        # Read the request_id value from the login form so it can be sent back.
        request_id = response.css('input[name="request_id"]::attr(value)').extract_first()
        data = {
            'userid_placeholder': self.login_user,
            'foilautofill': '',
            'password': self.login_pass,
            'request_id': request_id,
            'username': self.login_user[1:],  # login name without its first character
        }
        # POST the credentials; the response is handled by print_p.
        yield scrapy.FormRequest(
            url='https://www1.up.ac.za/oam/server/auth_cred_submit',
            formdata=data,
            callback=self.print_p,
        )

The print_p callback function is as follows:

    def print_p(self, response):
        print(response.text)

I have looked at scrapy-splash, but I could not find a way to make it execute the JS in the response.

  • Have you tried going manually to the page the JS redirects to? That is, extract the URL in print_p and yield a request to that page. Commented Jun 22, 2017 at 10:30
  • @Pablo The JS builds a URL, which it then redirects to. Commented Jun 22, 2017 at 10:38
  • docs.scrapy.org/en/latest/topics/dynamic-content.html Commented Nov 20, 2019 at 9:42

2 Answers

5

I'd suggest using Splash as a rendering service. Personally, I found it more reliable than Selenium. Using scripts, you can instruct it to interact with the page.
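
As a minimal sketch of that approach, the login POST can be routed through Splash's execute endpoint with a short Lua script that waits for the inline JavaScript to redirect. This assumes the scrapy-splash package is installed and configured (SPLASH_URL and its middlewares in settings.py) and that a Splash instance is running. The helper names build_splash_args and build_login_request and the 2-second default wait are illustrative choices, not something from the question, and session cookies are not handled here.

```python
# Lua script run by Splash's "execute" endpoint.
# scrapy-splash fills args.url, args.http_method, args.body and args.headers
# from the Scrapy request; args.wait comes from build_splash_args below.
LUA_SCRIPT = """
function main(splash, args)
  -- Replay the request Scrapy built (POST body and headers included).
  assert(splash:go{
    args.url,
    http_method = args.http_method,
    headers = args.headers,
    body = args.body,
  })
  -- Give the inline JavaScript time to run and redirect.
  assert(splash:wait(args.wait))
  return {url = splash:url(), html = splash:html()}
end
"""


def build_splash_args(wait=2.0):
    """Arguments passed to Splash alongside the request (hypothetical helper)."""
    return {"lua_source": LUA_SCRIPT, "wait": wait}


def build_login_request(url, formdata, callback, wait=2.0):
    """Build a form POST that is rendered by Splash instead of fetched directly."""
    # Imported here so the script above can be inspected without scrapy-splash.
    from scrapy_splash import SplashFormRequest

    return SplashFormRequest(
        url,
        formdata=formdata,
        callback=callback,
        endpoint="execute",
        args=build_splash_args(wait),
    )
```

In the spider from the question, parse could yield build_login_request('https://www1.up.ac.za/oam/server/auth_cred_submit', data, self.print_p) instead of scrapy.FormRequest, and the callback could then read response.data['url'] and response.data['html']. If the redirect takes longer than the wait, the script returns the intermediate page, so the wait may need tuning.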

2

Selenium can probably get you past this JS.

If you haven't tried it yet, you can start from examples like this. If you manage to reach the target page, you can get its URL with:

    self.driver.current_url

and scrape the page afterwards.
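
A rough sketch of that flow, assuming Selenium 4 with Firefox and geckodriver available. The helper names wait_for_url and login_and_get_landing_page are hypothetical, and login_url stands for the page that serves the login form, which the question does not show. The field names come from the form data in the question; that the visible login box is the one named userid_placeholder, and that the page's own JS fills in the rest, is a guess.

```python
import time


def wait_for_url(driver, is_target, timeout=10.0, poll=0.25):
    """Poll driver.current_url until is_target(url) is true, then return that URL."""
    deadline = time.monotonic() + timeout
    while True:
        url = driver.current_url
        if is_target(url):
            return url
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still at {url!r} after {timeout} s")
        time.sleep(poll)


def login_and_get_landing_page(login_url, username, password):
    """Log in with a real browser and return (final_url, page_source)."""
    # Imported lazily so wait_for_url stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get(login_url)
        start_url = driver.current_url

        # Assumed field names, taken from the form data in the question.
        driver.find_element(By.NAME, "userid_placeholder").send_keys(username)
        password_box = driver.find_element(By.NAME, "password")
        password_box.send_keys(password)
        password_box.submit()

        # The browser runs the inline JS itself; wait until it has left both
        # the login page and the intermediate auth_cred_submit page.
        final_url = wait_for_url(
            driver,
            lambda url: url != start_url and "auth_cred_submit" not in url,
        )
        return final_url, driver.page_source
    finally:
        driver.quit()
```

The returned page source can be parsed directly, or the final URL can be passed back to Scrapy as a normal request if the session cookies are copied over as well.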
