I am currently using a for loop to geocode a large number of addresses with the googleway package. Initially, "500 internal server error" responses stopped the loop. I was able to get around this using tryCatch(). However, since this tends to be a transient error, I would like the function to retry the address that throws the error until it receives a result or reaches some arbitrary number of attempts, say 10.

Unfortunately, I've found tryCatch() and the documentation associated with it confusing, so I'm at a loss for how to do anything other than get it to throw an error message and move on. Here is my current code:

rugeocoder.fun <- function(addr){
  require(googleway)
  output <- vector("list", length=length(addr))
  tryCatch({
    for(i in 1:length(addr)){
      output[[i]] <- google_geocode(address=addr[i], key="myapikey", language="ru", simplify=T)
      print(i)
    }
  }, error=function(e) output[[i]] <- "Error: reattempt")
  return(output)
}

1 Answer

You probably want to separate the logic for calling google_geocode() safely from the logic for looping over the addresses.

Here's a function that modifies other functions to call them repeatedly until they work, or they fail max_attempts times. Functions that modify other functions are sometimes called "adverbs".

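safely <- function(fn, max_attempts = 5) {
  function(...) {
    # Environment of this call, used to store the result from inside tryCatch().
    this_env <- environment()
    for(i in seq_len(max_attempts)) {
      # ok is TRUE if fn(...) ran without an error, FALSE otherwise.
      ok <- tryCatch({
          assign("result", fn(...), envir = this_env)
          TRUE
        },
        error = function(e) {
          FALSE
        }
      )
      if(ok) {
        return(this_env$result)
      }
    }
    # Every attempt failed: warn and return NULL.
    msg <- sprintf(
      "%s failed after %d tries; returning NULL.",
      paste(deparse(match.call()), collapse = " "), # deparse() may return several strings
      max_attempts
    )
    warning(msg)
    NULL
  }
}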

Try it out on this simple function that generates a random number, and throws an error if it is too small.

random <- function(lo = 0, hi = 1) { # defaults let safe_random() be called with no arguments
  y <- runif(1, lo, hi)
  if(y < 0.75) {
    stop("y is less than 0.75")
  }
  y
}
safe_random <- safely(random)
safe_random() # will sometimes work, will sometimes return NULL
safe_random(0, 10) # will usually work
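
Because random() fails at random, it is hard to confirm the retry logic from its output alone. A deterministic check can help. The sketch below is not the code from this answer but a compact variant under made-up names (with_retries() and make_flaky() are hypothetical). It captures the value of tryCatch() directly instead of using assign(), which works because the expression passed to tryCatch() is evaluated in the calling function's frame.

```r
# Hypothetical helper: wrap fn so it is retried up to max_attempts times.
with_retries <- function(fn, max_attempts = 5) {
  force(fn)  # evaluate the fn argument now rather than inside tryCatch()
  function(...) {
    for (i in seq_len(max_attempts)) {
      # tryCatch() returns the value of fn(...) on success,
      # or the condition object if an error was signalled.
      # Assumption: fn never returns an error condition as a normal value.
      result <- tryCatch(fn(...), error = function(e) e)
      if (!inherits(result, "error")) {
        return(result)
      }
    }
    warning(sprintf("failed after %d tries; returning NULL.", max_attempts))
    NULL
  }
}

# Hypothetical test function: fails on its first n_failures calls, then works.
make_flaky <- function(n_failures) {
  calls <- 0
  function(x) {
    calls <<- calls + 1
    if (calls <= n_failures) stop("transient failure")
    x * 2
  }
}

# Fails twice, succeeds on the third attempt.
ok <- with_retries(make_flaky(2))(21)

# Needs 11 attempts but only gets 3, so the wrapper gives up.
gave_up <- suppressWarnings(with_retries(make_flaky(10), max_attempts = 3)(21))

print(ok)                # [1] 42
print(is.null(gave_up))  # [1] TRUE
```

Returning the condition object from the handler keeps everything in one frame, at the cost of the assumption noted in the comment.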

In your case, you want to modify the google_geocode() function.

safe_google_geocode <- safely(google_geocode)

Then loop over addresses calling this.

geocodes <- lapply( # purrr::map() is an alternative
  addresses,
  safe_google_geocode,
  key = "myapikey", 
  language = "ru", 
  simplify = TRUE
)
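
With this approach, an address that still fails after max_attempts tries shows up as NULL in the result list, so it is worth checking which ones were missed. A minimal sketch, using a made-up stand-in (fake_geocode() is hypothetical and not part of googleway) so that it runs without an API key:

```r
# Stand-in for a wrapped geocoder: returns NULL for addresses it "cannot" resolve.
fake_geocode <- function(address) {
  if (grepl("bad", address)) NULL else list(address = address, status = "OK")
}

addresses <- c("good street 1", "bad street 2", "good street 3")
geocodes <- lapply(addresses, fake_geocode)

# TRUE where the call gave up and returned NULL.
failed <- vapply(geocodes, is.null, logical(1))

# Addresses that could be retried later.
to_retry <- addresses[failed]
print(to_retry)  # [1] "bad street 2"

# Naming the results keeps each one matched to its input address.
names(geocodes) <- addresses
```

The same vapply(geocodes, is.null, logical(1)) check should apply unchanged to the list returned by the lapply() call above.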

7 Comments

Would you mind clarifying why you used this_env <- environment()? The environments are a bit complex here, and I'm having trouble following how the result moves from the closure created by the safely() function to the global environment in order to be assigned to an object.
@SeanNorton Yeah, it is confusing. tryCatch() seems to use its own environment, so just doing result <- fn(...) doesn't work.
@SeanNorton this_env is, I think, the environment belonging to the safe_whatever() closure.
Hm, unfortunately the assign() doesn't appear to be working correctly - I receive the error In assign("result", fn(...), envir = this_env) : restarting interrupted promise evaluation for every iteration, returning a list of entirely null elements. Any thoughts on how to get around this?
@SeanNorton Hmm. I'm not sure where the problem occurs. Does evaluating fn(...) with eval.parent() help? stackoverflow.com/q/20596902/134830
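
On the environment question: as far as I can tell, the expression passed to tryCatch() is evaluated in the frame of the function that calls it, whereas each handler is an ordinary function with its own frame. An assignment made with <- inside a handler is therefore local to that handler and is lost, which would also explain why output[[i]] <- "Error: reattempt" in the question's handler leaves output unchanged. A small sketch (all function names are made up for illustration):

```r
# Assignment inside the tryCatch() expression lands in the caller's frame.
assign_in_expr <- function() {
  tryCatch({
    result <- "set in expr"
  }, error = function(e) NULL)
  result
}

# Assignment with <- inside the handler only creates a local variable there.
assign_in_handler <- function() {
  status <- "unset"
  tryCatch(
    stop("boom"),
    error = function(e) status <- "handled"
  )
  status
}

# <<- reaches the variable in the enclosing function instead.
superassign_in_handler <- function() {
  status <- "unset"
  tryCatch(
    stop("boom"),
    error = function(e) status <<- "handled"
  )
  status
}

assign_in_expr()          # "set in expr"
assign_in_handler()       # "unset"
superassign_in_handler()  # "handled"
```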