I have a basic R script, cobbled together with RSelenium, that logs me into a website. Once authenticated, the script goes to the first page of interest and pulls 3 pieces of text from the page.

Luckily for me, the URL is built in such a way that I can pass in a vector of numbers to reach each subsequent page of interest, hence the use of map().
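As a rough illustration of that URL pattern, here is a minimal sketch that expands a template into one URL per page number. The template below is a shortened placeholder (an assumption, not the real address), and `url_template`, `rc_ids` and `urls` are names made up for the example.

```r
# Build one URL per rcId by substituting the integer into the template.
# The host and query string are placeholders, not the real site.
url_template <- "https://www.myWebsite.com/rc_redesign/#/layout/jcard/drugCard?accountId=XXXXXX&rcId=%d&searchType=R"

rc_ids <- 1:3

# sprintf() is vectorised, so a vector of ids yields a vector of URLs.
urls <- sprintf(url_template, rc_ids)
print(urls)
```

Each element of `urls` could then be handed to `remdr$navigate()` inside a function passed to map().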

While on each page I want to scrape the same 3 elements off the page and store them in a master data frame for later analysis.

I wish to use the map family of functions so that I can become more familiar with them, but I am really struggling to get them to work. Could anyone kindly tell me where I am going wrong?

Here is the main part of my code (go to the website and log in):

library(RSelenium)
# https://stackoverflow.com/questions/55201226/session-not-created-this-version-of-chromedriver-only-supports-chrome-version-7/56173984
rd <- rsDriver(browser = "chrome",
               chromever = "88.0.4324.27",
               port = netstat::free_port())

remdr <- rd[["client"]]

# url of the site's login page
url <- "https://www.myWebsite.com/"

# Navigating to the page
remdr$navigate(url)

# Wait 5 secs for the page to load
Sys.sleep(5)

# Find the initial login button to bring up the username and password fields
loginbutton <- remdr$findElement(using = 'css selector','.plain')

# Click the initial login button to bring up the username and password fields
loginbutton$clickElement()

# Find the username box
username <- remdr$findElement(using = 'css selector','#username')

# Find the password box
password <- remdr$findElement(using = 'css selector','#password')

# Find the final login button
login <- remdr$findElement(using = 'css selector','#btnLoginSubmit1')

# Input the username
username$sendKeysToElement(list("myUsername"))

# Input the password
password$sendKeysToElement(list("myPassword"))

# Click login
login$clickElement()

And hey presto we're in!

Now my code takes me to the initial page of interest (index = 1)

Above I mentioned that I am looking to increment through each page. I can do this by substituting an integer into the URL at the rcId parameter, see below.

#remdr$navigate("https://myWebsite.com/rc_redesign/#/layout/jcard/drugCard?accountId=XXXXXX&rcId=1&searchType=R&reimbCode=&searchTerm=&searchTexts=*") # Navigating to the page

For each rcId in 1:9999 I wish to grab the following 3 elements and store them in a data frame

hcpcs_info <- remdr$findElement(using = 'class','is-jcard-heading')

hcpcs <- hcpcs_info$getElementText()[[1]]

hcpcs_description <- remdr$findElement(using = 'class','is-jcard-desc')

hcpcs_desc <- hcpcs_description$getElementText()[[1]]

tc_info <- remdr$findElement(using = 'css selector','.col-12.ng-star-inserted')

therapeutic_class <- tc_info$getElementText()[[1]]
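The three lookups above share the same find-then-read shape, so they could perhaps be factored into one small helper. This is only a sketch: `get_text` is a hypothetical helper, and `fake_driver` is a stand-in object (an assumption for illustration) that imitates just the two RSelenium methods used above so the snippet runs without a browser.

```r
# Hypothetical helper: find one element and return its text as a string.
get_text <- function(driver, using, value) {
  element <- driver$findElement(using = using, value = value)
  element$getElementText()[[1]]
}

# Stand-in for the RSelenium client so the sketch runs without a browser.
# It mimics only findElement() and getElementText().
fake_driver <- list(
  findElement = function(using, value) {
    list(getElementText = function() list(paste("text for", value)))
  }
)

hcpcs <- get_text(fake_driver, "class", "is-jcard-heading")
print(hcpcs)
```

With a live session, the same call would presumably be `get_text(remdr, "class", "is-jcard-heading")`.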

I have tried creating a separate function and passing it to map, but I am not advanced enough to piece this together. Below is what I have tried.

my_function <- function(index) {
  remdr$navigate(sprintf("https://rc2.reimbursementcodes.com/rc_redesign/#/layout/jcard/drugCard?accountId=113479&rcId=%d&searchType=R&reimbCode=*&searchTerm=*&searchTexts=*",index)
                 Sys.sleep(5)
                 hcpcs_info[index] <- remdr$findElement(using = 'class','is-jcard-heading')
                 hcpcs[index] <- hcpcs_info$getElementText()[index][[1]])
}

x <- 1:10 %>% 
map(~ my_function(.x))

Any help would be greatly appreciated.

  • What output does your code return, or does it return an error? Which are the 3 elements that you want to store? Commented Feb 5, 2021 at 2:03
  • Hi Ronak, I wish to return hcpcs, hcpcs_desc and therapeutic_class Commented Feb 5, 2021 at 2:43

1 Answer

Try the following:

library(RSelenium)

purrr::map_df(1:10, ~{
          remdr$navigate(sprintf("https://rc2.reimbursementcodes.com/rc_redesign/#/layout/jcard/drugCard?accountId=113479&rcId=%d&searchType=R&reimbCode=*&searchTerm=*&searchTexts=*",.x))
          Sys.sleep(5)
          hcpcs_info <- remdr$findElement(using = 'class','is-jcard-heading')
          hcpcs <- hcpcs_info$getElementText()[[1]]
          hcpcs_description <- remdr$findElement(using = 'class','is-jcard-desc')
          hcpcs_desc <- hcpcs_description$getElementText()[[1]]
          tc_info <- remdr$findElement(using = 'css selector','.col-12.ng-star-inserted')
          therapeutic_class <- tc_info$getElementText()[[1]]
          # tibble is not attached above, so call it with its namespace
          tibble::tibble(hcpcs, hcpcs_desc, therapeutic_class)
          }) -> result
result
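If some of the page numbers have no matching elements, `findElement()` would likely raise an error and stop the whole loop. One possible way around that, sketched here under stated assumptions, is to wrap the per-page step in `purrr::possibly()`. The function `scrape_page` is a hypothetical stand-in that fails on page 2 on purpose, so the example runs without a browser.

```r
library(purrr)

# Hypothetical stand-in for the per-page scraping step; it fails on page 2
# to imitate findElement() raising an error when an element is absent.
scrape_page <- function(rc_id) {
  if (rc_id == 2) stop("element not found")
  data.frame(rc_id = rc_id, hcpcs = paste0("J", rc_id), stringsAsFactors = FALSE)
}

# possibly() returns a wrapped function that yields `otherwise` on error.
safe_scrape <- possibly(scrape_page, otherwise = NULL)

# map() collects one result per page; NULLs are dropped before row-binding.
pages <- map(1:3, safe_scrape)
result <- do.call(rbind, compact(pages))
print(result)
```

Pages that fail are simply skipped here; returning a row of `NA` values from `otherwise` instead would keep a record of which ids failed.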
1 Comment

Thank you so much, this is awesome. I've just tried it out and it works, woo hoo! Thanks again.
