
I have the following function to get some URLs from a website using RSelenium and PhantomJS.

library(XML)  # for htmlParse, getNodeSet, xmlGetAttr

get_url <- function(url){
  rdr$navigate(url)
  li <- rdr$findElements(using = 'xpath', "//div[@data-id]")
  str <- sapply(li, function(x) x$getElementAttribute('outerHTML'))
  if(length(str) > 1){
    # parse the collected fragments as one document and extract the link targets
    tree <- htmlParse(paste(str, collapse = "\n"), asText = TRUE)
    links <- getNodeSet(tree, '//div//a[@class="link url"]')
    return(sapply(links, xmlGetAttr, 'href'))
  }
  NULL  # no matching elements on this page
}
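
Here, rdr is a remote driver that is already open. A minimal setup sketch, assuming a PhantomJS/Selenium server is already listening on the default port, looks like:

library(RSelenium)
rdr <- remoteDriver(browserName = "phantomjs")  # assumes a server on localhost:4444
rdr$open()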

The URLs to scrape are stored in a 30 x 60 character matrix, offset_url.
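
For illustration only, a matrix of that shape could be built along these lines (the base URL and offset scheme are hypothetical):

# hypothetical URL scheme; the real site and query string differ
offset_url <- matrix(
  sprintf("http://example.com/list?offset=%d", seq_len(30 * 60)),
  nrow = 30, ncol = 60
)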

I tried doing this using the following nested loop.

url_list <- NULL
for(i in 1:ncol(offset_url)){
  for(j in 1:nrow(offset_url)){
    url_list <- rbind(url_list, get_url(offset_url[j, i]))
  }
}

However, it takes a lot of time to execute.

Is there a way I can use apply functions to reduce the time?

There are a number of syntactic issues with the posted code, which you might have thrown together as an example. Can you post the actual code? Commented Jul 27, 2016 at 3:26

1 Answer


Is this helpful?

res <- mapply(function(x, y) get_url(offset_url[x, y]),
              x = row(offset_url), y = col(offset_url),
              SIMPLIFY = FALSE)
url_list <- do.call(rbind, res)
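
Note that mapply still visits the pages one at a time, so the gain is mainly cleaner code: the run time is dominated by the rdr$navigate() round trip for each of the 1800 pages, not by the loop construct. If you don't need the row/column indices, an equivalent form treats the matrix as a plain vector:

res <- lapply(as.vector(offset_url), get_url)  # column-major, same order as the nested loop
url_list <- do.call(rbind, res)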
