Recode multiple column values using mutate_at and creating a new column based on mutated columns in a pipe

Question

I have a data frame of questionnaire data in wide format, with each column representing one questionnaire item.

Individually, I know how to recode the values within columns and create new columns based on values found in other columns. However, I am experiencing problems trying to do both in a single pipe.

My data looks like the following:

df <- data.frame(Q1 = c(1, 2, 1, 4), Q2 = c(4, 2, 3, 1), Q3 = c(3, 3, 2, 3),
             Q4 = c(4, 4, 2, 4), Q5 = c(4, 2, 3, 1), Q6 = c(7, 2, 3, 1))

Using my sample dataset as an example, I intend to subtract 1 from columns Q1, Q2, and Q3 and replace the original values with the new (subtracted) values. Concurrently, I want to create a new column that contains the mean of Q1, Q2, and Q3 while ignoring any NA values or values that are 3.

I have tried the following code, but the Q1, Q2, and Q3 columns are not updated with the subtracted value.

library(dplyr)

df$mean <- df %>%
  select(Q1, Q2, Q3) %>%
  mutate_all(funs(. - 1)) %>%
  apply(1, function(x) {
    round(mean(x[!is.na(x) & x != 3]), digits = 2)
  })

I have tried using mutate_at followed by mutate in a pipe. However, the end result deletes every other column that is not selected. I still want the other columns to be in the final dataset:

df <- df %>%
  select(Q1, Q2, Q3) %>%
  mutate_all(funs(. - 1)) %>%
  mutate(mean = apply(., 1, function(x)
    round(mean(x[!is.na(x) & x != 3]), digits = 2)))

Thanks and much appreciated!

MeetMrMet · Accepted Answer · 2018-07-23 09:22:36Z


We can define a vector of the variables you want to act on, then pass it to mutate_at to do the subtraction. The other columns are left untouched. For the mean, we can nest a select inside the apply you already have, so that only those variables go into the calculation:

subtract <- c("Q1", "Q2", "Q3")
df2 <- df %>%
  mutate_at(subtract, funs(. - 1)) %>%
  mutate(mean = apply(select(., one_of(subtract)), 1, function(x)
    round(mean(x[!is.na(x) & x != 3]), digits = 2)))

df2
#   Q1 Q2 Q3 Q4 Q5 Q6 mean
# 1  0  3  2  4  4  7 1.00
# 2  1  1  2  4  2  2 1.33
# 3  0  2  1  2  3  3 1.00
# 4  3  0  2  4  1  1 1.00
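
For readers on newer dplyr releases: funs() has been deprecated since dplyr 0.8.0, and the scoped verbs such as mutate_at() were superseded by across() in dplyr 1.0.0. The following is a minimal sketch of the same pipeline under that assumption (dplyr 1.0.0 or later installed). It is a rewrite for illustration, not part of the original answer, and it rebuilds the sample data so it runs on its own.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(Q1 = c(1, 2, 1, 4), Q2 = c(4, 2, 3, 1), Q3 = c(3, 3, 2, 3),
                 Q4 = c(4, 4, 2, 4), Q5 = c(4, 2, 3, 1), Q6 = c(7, 2, 3, 1))

subtract <- c("Q1", "Q2", "Q3")

df2 <- df %>%
  # across() replaces mutate_at(); the lambda replaces funs(. - 1)
  mutate(across(all_of(subtract), ~ .x - 1)) %>%
  # Row-wise mean over the recoded columns, skipping NA and 3
  mutate(mean = apply(select(., all_of(subtract)), 1, function(x)
    round(mean(x[!is.na(x) & x != 3]), digits = 2)))

df2
```

The result should contain all six original columns plus mean, matching the output shown above.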


5 Comments

DTYK Over a year ago

I tried assigning your code to df instead of df2. Why does yours retain all other columns while mine only retains Q1, Q2, and Q3? Thanks!

MeetMrMet Over a year ago

When I run it on my system reassigning to df I still retain all columns. Are you sure you didn't run it on a result of one of your earlier attempts that deleted those columns?

DTYK Over a year ago

Your code works perfectly fine for me. I'm just asking for your help in explaining the logic of your code to me. How did your correct code retain the other columns while my second example failed to do so? Thanks!

MeetMrMet Over a year ago

Ah! The default action of select when you positively specify variables is to only bring forward those specified, so when you do select(Q1, Q2, Q3) the other three are dropped. In mine, the mutate_at specifies which variables to apply the function to, but doesn't affect the other variables.

DTYK Over a year ago

Oh! I see. Thanks for the explanation. Much appreciated!
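
The difference discussed in the comments above can be checked directly by comparing column names. This is a small sketch using the sample data from the question; the variable names kept_by_select and kept_by_mutate_at are made up for illustration.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(Q1 = c(1, 2, 1, 4), Q2 = c(4, 2, 3, 1), Q3 = c(3, 3, 2, 3),
                 Q4 = c(4, 4, 2, 4), Q5 = c(4, 2, 3, 1), Q6 = c(7, 2, 3, 1))

# select() carries forward only the columns that are named
kept_by_select <- names(df %>% select(Q1, Q2, Q3))

# mutate_at() changes the named columns and leaves the rest in place
kept_by_mutate_at <- names(df %>% mutate_at(c("Q1", "Q2", "Q3"), function(x) x - 1))

kept_by_select     # "Q1" "Q2" "Q3"
kept_by_mutate_at  # "Q1" "Q2" "Q3" "Q4" "Q5" "Q6"
```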

Ronak Shah · Accepted Answer · 2018-07-23 09:45:23Z

One option is to select the required columns, subtract 1 from each of them, then take the rowwise mean of those columns and add it as a new column. Note that because of the select, only Q1, Q2, and Q3 (plus the new mean) appear in the result.

library(tidyverse)

df %>%
  select(1:3) %>%
  mutate_all(funs(. - 1)) %>%
  rowwise() %>%
  do( (.) %>% as.data.frame %>% 
      mutate(mean = mean(.[. != 3], na.rm = TRUE)))

#    Q1    Q2    Q3  mean
#* <dbl> <dbl> <dbl> <dbl>
#1  0     3.00  2.00  1.00
#2  1.00  1.00  2.00  1.33
#3  0     2.00  1.00  1.00
#4  3.00  0     2.00  1.00

which can also be written as

(df[1:3] - 1) %>%
    rowwise() %>%
    do( (.) %>% as.data.frame %>% 
    mutate(mean = mean(.[. != 3], na.rm = TRUE)))

Or, to avoid the do call completely, we can create a function that calculates the mean and apply it rowwise:

apply_fun <- function(x) {
  mean(x[x != 3], na.rm = TRUE)
}

(df[1:3] - 1) %>%
  rowwise() %>%
  mutate(mean = apply_fun(c(Q1, Q2, Q3)))

 #    Q1    Q2    Q3  mean
 #  <dbl> <dbl> <dbl> <dbl>
 #1  0     3.00  2.00  1.00
 #2  1.00  1.00  2.00  1.33
 #3  0     2.00  1.00  1.00
 #4  3.00  0     2.00  1.00
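
The rowwise approach above returns only Q1, Q2, and Q3. If the remaining columns should be kept, as the question asks, one possible variant is to recode in place and use c_across() for the row-wise mean. This is a sketch that assumes dplyr 1.0.0 or later (where across() and c_across() are available); it is not part of the original answer, and unlike the first answer it does not round the mean.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(Q1 = c(1, 2, 1, 4), Q2 = c(4, 2, 3, 1), Q3 = c(3, 3, 2, 3),
                 Q4 = c(4, 4, 2, 4), Q5 = c(4, 2, 3, 1), Q6 = c(7, 2, 3, 1))

# Mean of a vector, ignoring NA values and values equal to 3
apply_fun <- function(x) {
  mean(x[x != 3], na.rm = TRUE)
}

res <- df %>%
  mutate(across(Q1:Q3, ~ .x - 1)) %>%           # recode Q1-Q3 in place
  rowwise() %>%                                  # evaluate the next step per row
  mutate(mean = apply_fun(c_across(Q1:Q3))) %>%  # row values of Q1-Q3 as a vector
  ungroup()                                      # drop the rowwise grouping

res
```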
