I'd like to scrape a page whose content seems to be rendered by an app referenced in the HTML like this:

<div id="app" class="app-mobile-pusher"></div>

I'm using the render() method from the Requests-HTML Python library like so:

from requests_html import HTMLSession

with HTMLSession() as session:
    p = session.post(login_url, data=payload)
    r = session.get(content_url)
    r.html.render()
    print(r.text)

This code returns the HTML for the page without any errors, but also without any content (just HTML tags). Notes:

  • I've tried adding timeout arguments to session.get to give the page more time to render before accessing it, and other variations on the syntax above.

  • I've also tried adding user agent information in the headers, based on this answer (in order to circumvent rejection of my automated scrape).

  • The Chromium browser did download when I first ran render().

The lack of any error messages is stumping me, and it is difficult to replicate the context of this request to test on another site.

Any specific suggestions for how to solve this, or ideas for how to go about troubleshooting, are appreciated. (Python 3.6, macOS)

1 Answer

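Have you tried print(r.html.html) instead? render() updates r.html in place, so the newly rendered markup lives under that object path. r.text still holds the original response body from before rendering, which is why you only see the empty tags.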
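
A minimal sketch of the difference, in case it helps. The helper name rendered_html, the main() wrapper, and the example.com URLs and payload are placeholders of mine, not from the question; sleep is an optional render() argument that waits after the initial render, and whether the page needs it is an assumption.

```python
def rendered_html(response, **render_kwargs):
    """Run render() on a Requests-HTML response and return the updated markup.

    render() replaces the contents of response.html in place;
    response.text is left as the original, un-rendered body.
    """
    response.html.render(**render_kwargs)
    return response.html.html


def main():
    # Imported here so the helper above can be reused without the library loaded.
    from requests_html import HTMLSession

    # Hypothetical values; substitute your own.
    login_url = "https://example.com/login"
    content_url = "https://example.com/content"
    payload = {"username": "user", "password": "secret"}

    with HTMLSession() as session:
        session.post(login_url, data=payload)
        r = session.get(content_url)
        # sleep=1 gives the page's scripts a moment to fill in the #app div.
        print(rendered_html(r, sleep=1))
```

If r.html.html still shows an empty div after this, the next thing I would check is whether the login actually succeeded, since render() loads the page in a separate headless browser.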
