I have a data.frame containing travel durations from several cities to some destinations. The (simplified) structure looks like the table below:

city  | dest1 | dest2 | closest
------+-------+-------+--------
cityA | NA    | NA    | cityC
cityB | NA    | NA    | cityD
cityC | 100   | 200   | cityA
cityD | 300   | 400   | cityB

Now I want to approximate the travel duration from cityA to dest1 by the travel duration from cityC to dest1 (because cityC is closest to cityA, last column), i.e. I want to replace the NA value in the upper left with 100.

Is there a clean way to do this with dplyr functions?

1 Answer

You can do this with a left_join (after some selecting and renaming), a mutate with coalesce to merge the columns, and a select to drop the helper column from the output.

library(dplyr)

df <- tibble(city = c("CityA", "CityB", "CityC", "CityD"),
             dest1 = c(NA, NA, 100, 300),
             dest2 = c(NA, NA, 200, 400),
             closest = c("CityC", "CityD", "CityA", "CityB"))


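df %>%
  # For each row, look up its closest city and fetch that city's dest1 as dist
  left_join(select(., closest = city, dist = dest1), by = "closest") %>%
  # Keep an existing dest1; fall back to the closest city's value only when it is NA
  mutate(dest1 = coalesce(dest1, dist)) %>%
  # Drop the helper column
  select(-dist)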
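
The same idea can probably be extended to every destination column at once. The following is a minimal sketch that swaps the join for a base R match() lookup combined with dplyr's across(); it assumes the columns to fill all start with "dest" and that every value in closest also appears in city. The names idx and filled are illustrative and not part of the original answer.

```r
library(dplyr)

df <- tibble(city = c("CityA", "CityB", "CityC", "CityD"),
             dest1 = c(NA, NA, 100, 300),
             dest2 = c(NA, NA, 200, 400),
             closest = c("CityC", "CityD", "CityA", "CityB"))

# Row position of each city's closest neighbour within the city column
# (assumption: every value in closest also occurs in city)
idx <- match(df$closest, df$city)

filled <- df %>%
  # For every dest* column, keep the existing value and fall back to the
  # neighbour's value (.x[idx]) only where the original is NA
  mutate(across(starts_with("dest"), ~ coalesce(.x, .x[idx])))

filled
```

With the sample data, dest1 becomes 100, 300, 100, 300 and dest2 becomes 200, 400, 200, 400. If the closest city itself has no value, the NA simply remains.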