
I am trying to use R to scrape a website:

http://divulgacandcontas.tse.jus.br/divulga/#/candidato/2018/2022802018/GO/90000609234

It has several fields with lots of information. I am only interested in the URL shown above the field "site do candidato". In this example, the URL I want is "http://vanderlansenador111.com.br".

The problem is that the content is not visible in the page's HTML source, so I don't think rvest is helpful here (or at least I don't know how to use it for this). Is there a way to scrape it without using Selenium? (I have never used RSelenium and had some problems trying to get it running.)

Any pointers in the right direction are much appreciated.

2 Answers


Don't waste your time with Selenium. Use your browser's Developer Tools to find the XHR request: http://divulgacandcontas.tse.jus.br/divulga/rest/v1/candidatura/buscar/2018/GO/2022802018/candidato/90000609234

and just use jsonlite::fromJSON():

str(jsonlite::fromJSON("http://divulgacandcontas.tse.jus.br/divulga/rest/v1/candidatura/buscar/2018/GO/2022802018/candidato/90000609234"))

The str() output is large & complete. You should be able to find what you need there.
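For illustration, here is a minimal sketch of pulling that JSON into R and digging out the candidate's website. The element name "sites" is an assumption about the response schema, so check it against the str() output and adjust:

library(jsonlite)

# The XHR endpoint found via the browser's Developer Tools
url <- "http://divulgacandcontas.tse.jus.br/divulga/rest/v1/candidatura/buscar/2018/GO/2022802018/candidato/90000609234"
cand <- fromJSON(url)

# Inspect the top level of the parsed list to locate the website field
str(cand, max.level = 1)

# "sites" is a guess at where the endpoint keeps the candidate's URLs;
# replace it with whatever element actually holds them
cand$sites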


Selenium is a good choice for this; an alternative is PhantomJS. There is a good tutorial on the process over at DataCamp (not as clean a solution as Selenium):

https://www.datacamp.com/community/tutorials/scraping-javascript-generated-data-with-r
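If you go that route, the R side looks roughly like the sketch below. It assumes a PhantomJS script saved as scrape.js (as in the tutorial) that renders the page and writes the result to rendered.html; the CSS selector is a placeholder you would replace after inspecting the rendered HTML:

library(rvest)

# Run PhantomJS (assumes the phantomjs binary is on your PATH and that
# scrape.js writes the rendered page to "rendered.html")
system("phantomjs scrape.js")

page <- read_html("rendered.html")

# Placeholder selector -- inspect the rendered page to find the element
# that holds the "site do candidato" link
page %>%
  html_node("a.candidate-site") %>%
  html_attr("href")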

