
I am trying to create a new column in a data frame using mutate. It should match values in two columns, an ID and a step number, between two different data frames, and then return the value from a third column in my second data frame. Hopefully my code below makes it a little clearer what I'm trying to achieve!

Is this the right way to go about it? I've looked into using merge, but I don't think that quite does what I need.

Step1 <- iData %>%
        filter(IndicatorID == 43) %>%
        mutate(Step = 1) %>%
        mutate(iresult = InputA + InputB) %>%
        mutate(stepname = ifelse(IndicatorID == Step$IndicatorID & Step == Step$Step, Step$StepName, ""))

Basically, it should find the row in Step where IndicatorID is 43 and Step is 1, then put that row's StepName value in the new column; in this case it would be "Gross value added". Any help would be really appreciated!

  • Can you make this post reproducible by adding data and show expected output for the same? Commented Oct 2, 2019 at 10:54

1 Answer

If I'm interpreting this correctly, thinking about it as a join rather than a mutate might make it a lot easier.

I've created dummy data; hopefully that makes clear the assumptions I'm making about the data.

So we have two tables, and both contain IndicatorID and Step. The step data frame also has a variable 'StepName', and we want to bring those values into a third table called Step1 by matching on IndicatorID and Step.

library(dplyr)

# Lookup table: one StepName per IndicatorID/Step combination
step <- tibble(
        IndicatorID = c(41, 42, 43, 44, 45, 46), 
        Step = c(1, 2, 1, 4, 5, 6), 
        StepName = c('left', 'right', 'up', 'down', 'under', 'over'))


iData <- tibble(
        IndicatorID = c(seq(from = 1, to = 43)), 
        InputA = runif(43), 
        InputB = runif(43)) %>%
        mutate(iresult = InputA + InputB)

Step1 <- iData %>%
        filter(IndicatorID == 43) %>%
        mutate(Step = 1) %>%
        left_join(step, by = c('IndicatorID', 'Step'))

IndicatorID InputA InputB iresult  Step StepName
        <dbl>  <dbl>  <dbl>   <dbl> <dbl> <chr>   
          43  0.773  0.124   0.898     1 up   
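
The original attempt fell back to "" when nothing matched, whereas left_join leaves NA in StepName for unmatched rows. A minimal sketch of one way to get the same fallback, assuming dplyr is loaded; the IndicatorID 99 row is hypothetical and only there to produce a non-match:

```r
library(dplyr)

# Small lookup table, mirroring the dummy data above
step <- tibble(
        IndicatorID = c(41, 42, 43),
        Step = c(1, 2, 1),
        StepName = c('left', 'right', 'up'))

# Hypothetical input: 43/1 has a match in step, 99/1 does not
iData <- tibble(
        IndicatorID = c(43, 99),
        Step = c(1, 1))

result <- iData %>%
        left_join(step, by = c('IndicatorID', 'Step')) %>%
        # unmatched rows come back as NA; replace them with ""
        mutate(StepName = coalesce(StepName, ""))
```

Whether "" or NA is the better marker for a missing step name depends on what happens downstream; NA is usually easier to detect later.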


### Example where we select only the columns from step 
### that we are interested in keeping, without doing a semi_join

Step1 <- iData %>%
        filter(IndicatorID == 43) %>%
        mutate(Step = 1) %>%
        left_join(step %>%
             select(IndicatorID, Step, StepName), 
             by = c('IndicatorID', 'Step'))
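
For contrast, here is a small sketch of how semi_join differs from the left_join plus select approach. semi_join only filters the rows of the left table and adds no columns from the right one, so it cannot bring StepName across. The Extra column and the IndicatorID 44 row are hypothetical, added only for illustration:

```r
library(dplyr)

step <- tibble(
        IndicatorID = c(42, 43),
        Step = c(2, 1),
        StepName = c('right', 'up'),
        Extra = c('a', 'b'))   # hypothetical column we do not want

iData <- tibble(
        IndicatorID = c(43, 44),
        Step = c(1, 1))

# semi_join: keeps rows of iData that have a match in step, adds no columns
filtered <- iData %>%
        semi_join(step, by = c('IndicatorID', 'Step'))

# left_join on a trimmed lookup: keeps all rows of iData, adds only StepName
joined <- iData %>%
        left_join(step %>%
             select(IndicatorID, Step, StepName),
             by = c('IndicatorID', 'Step'))
```

Here filtered has only the IndicatorID and Step columns and the single matching row, while joined keeps both rows and gains StepName (NA for the unmatched row).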


2 Comments

That worked perfectly, thank you! Out of curiosity, if the step data frame had more columns than just the one I wanted, is there a way to tell left_join to only take certain columns? Thanks!
Doing a select() on step inside the left_join is the way to go in that case, and it makes what you're doing a little more obvious. (semi_join is not a fit here: it only filters the rows of the left table and does not add any columns from the right one.) This link is a great resource: stat545.com/… I've also found that in lots of cases remembering the names and meanings of joins can be tricky. I'll add an example of the select() option to the original post.
