3

I'm lost, so any directions would be helpful. Let's say I have a dataframe:

df <- data.frame(
  id = 1:12,
  v1 = rep(c(1:4), 3),
  v2 = rep(c(1:3), 4),
  v3 = rep(c(1:6), 2),
  v4 = rep(c(1:2), 6))

My goal is to swap the values 2 and 4 (recode 2 to 4 and 4 to 2) in variables v3 and v4, but only for the first 4 cases (id < 5). I'm looking for a solution that works for up to twenty variables. I know how to do basic recoding, but I don't see a simple way to apply the subset condition while manipulating multiple variables.

4 Answers

3

Here is a base R solution:

# rows 1:4 are the cases with id < 5
df[1:4, c('v3', 'v4')] <- lapply(df[1:4, c('v3', 'v4')], function(i) 
                                       ifelse(i == 2, 4, ifelse(i == 4, 2, i)))

which gives:

   id v1 v2 v3 v4
1   1  1  1  1  1
2   2  2  2  4  4
3   3  3  3  3  1
4   4  4  1  2  4
5   5  1  2  5  1
6   6  2  3  6  2
7   7  3  1  1  1
8   8  4  2  2  2
9   9  1  3  3  1
10 10  2  1  4  2
11 11  3  2  5  1
12 12  4  3  6  2
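
Since the question asks about up to twenty variables and a condition on id, the same idea can be sketched with a logical row filter and a vector of column names. The names rows and cols are only illustrative, and this assumes the sample df from the question:

```r
# Same sample data as in the question
df <- data.frame(
  id = 1:12,
  v1 = rep(1:4, 3),
  v2 = rep(1:3, 4),
  v3 = rep(1:6, 2),
  v4 = rep(1:2, 6))

rows <- df$id < 5        # logical row filter instead of a hard-coded 1:4
cols <- c("v3", "v4")    # extend this vector to cover more variables

# Swap 2 and 4 in every selected column, only in the selected rows
df[rows, cols] <- lapply(df[rows, cols, drop = FALSE], function(x)
  ifelse(x == 2, 4, ifelse(x == 4, 2, x)))

head(df, 5)
```

With twenty variables, cols could be built with something like paste0("v", 3:22), assuming the columns follow that naming pattern.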

3 Comments

Thanks as well. I think it's very useful to know the base R version of a solution!
or df[1:4, c('v3', 'v4')][df[1:4, c('v3', 'v4')] == 2 | df[1:4, c('v3', 'v4')] == 4] <- 6-df[1:4, c('v3', 'v4')][df[1:4, c('v3', 'v4')] == 2 | df[1:4, c('v3', 'v4')] == 4] (which can probably be shortened with indices...)
@Cath You should put that in a new answer!
3

You can try mutate_at() with case_when() from dplyr:

library(dplyr)

df %>%
  mutate_at(vars(v3:v4), ~case_when(id < 5 & . == 4 ~ 2L, 
                                    id < 5 & . == 2 ~ 4L, 
                                    TRUE ~.))
#   id v1 v2 v3 v4
#1   1  1  1  1  1
#2   2  2  2  4  4
#3   3  3  3  3  1
#4   4  4  1  2  4
#5   5  1  2  5  1
#6   6  2  3  6  2
#7   7  3  1  1  1
#8   8  4  2  2  2
#9   9  1  3  3  1
#10 10  2  1  4  2
#11 11  3  2  5  1
#12 12  4  3  6  2

With mutate_at you can specify a range of columns (here v3:v4) to apply the function to.
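
In dplyr 1.0.0 and later, mutate_at() is superseded by across(). A rough equivalent of the call above, assuming a recent dplyr is installed and using the sample df from the question (res is just an illustrative name), might look like this:

```r
library(dplyr)

# Same sample data as in the question
df <- data.frame(
  id = 1:12,
  v1 = rep(1:4, 3),
  v2 = rep(1:3, 4),
  v3 = rep(1:6, 2),
  v4 = rep(1:2, 6))

# across() selects the columns; .x is the column currently being processed
res <- df %>%
  mutate(across(v3:v4, ~ case_when(
    id < 5 & .x == 4 ~ 2L,
    id < 5 & .x == 2 ~ 4L,
    TRUE             ~ .x)))

head(res, 5)
```

The result is stored in res rather than overwriting df, so the original data stays available for comparison.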

2 Comments

Ok, thank you! This looks like a nice solution. What I don't quite understand is the use of "~" and "." here, so I don't really see the role of "TRUE ~." at the end. The dot sort of works like a loop through the variables?
@2freet TRUE ~ . at the end covers all remaining cases, i.e. where the value is not 4 or 2 or where id is not below 5, so the original value is kept. The . stands for the column currently being processed, and ~ is formula-style syntax; you can read more about it at ?case_when.
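
To make the role of . and TRUE ~ . more concrete, here is a small sketch of roughly what happens for a single column. The vectors x and id are made-up values, with x standing in for the dot:

```r
library(dplyr)

x  <- c(2L, 4L, 3L, 2L, 4L)   # values of one column (plays the role of ".")
id <- c(1L, 2L, 3L, 5L, 6L)   # matching id values

# Conditions are checked top to bottom; the first one that is TRUE wins
out <- case_when(
  id < 5 & x == 4 ~ 2L,   # 4 becomes 2, but only where id < 5
  id < 5 & x == 2 ~ 4L,   # 2 becomes 4, but only where id < 5
  TRUE            ~ x)    # everything else keeps its value

out
```

The last two elements stay 2 and 4 because their id is not below 5, so only the TRUE branch applies to them.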
3

Another, more direct, option is to get the indices of the values to replace and to replace each one with 6 minus itself (6-4=2, 6-2=4):

whToChange <- which(df[1:4, c("v3", "v4")] == 2 | df[1:4, c("v3", "v4")] == 4, arr.ind=TRUE)

df[, c("v3", "v4")][whToChange] <- 6 - df[, c("v3", "v4")][whToChange]

head(df, 5)
#  id v1 v2 v3 v4
#1  1  1  1  1  1
#2  2  2  2  4  4
#3  3  3  3  3  1
#4  4  4  1  2  4
#5  5  1  2  5  1
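
To see what the index matrix looks like, the steps can be sketched as follows on the sample df from the question (cols and hit are illustrative names). Note that the row numbers returned by which() are positions within the subset, so using them on the full data frame only works here because the subset is the first four rows:

```r
# Same sample data as in the question
df <- data.frame(
  id = 1:12,
  v1 = rep(1:4, 3),
  v2 = rep(1:3, 4),
  v3 = rep(1:6, 2),
  v4 = rep(1:2, 6))

cols <- c("v3", "v4")
sub  <- df[df$id < 5, cols]

# Two-column matrix of (row, col) positions where the value is 2 or 4
hit <- which(sub == 2 | sub == 4, arr.ind = TRUE)
hit

# Matrix indexing: replace exactly those cells with 6 minus their value
df[, cols][hit] <- 6 - df[, cols][hit]

head(df, 5)
```

The 6 minus value trick only works because the two values being swapped sum to 6; for other recodings a lookup is more flexible.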

1

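You can use match and a lookup table, just in case you have to recode more than two values.

rosetta <- matrix(c(2,4,4,2), 2)  # row 1: old values, row 2: new values
df[1:4, c("v3", "v4")] <- lapply(df[1:4, c("v3", "v4")], function(x) {
  i <- match(x, rosetta[1,])      # position of each value in the old-value row (NA if absent)
  j <- !is.na(i)                  # TRUE for the values that need recoding
  "[<-"(x, j, rosetta[2, i[j]])   # replace only those values, keep the rest
})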
df
#   id v1 v2 v3 v4
#1   1  1  1  1  1
#2   2  2  2  4  4
#3   3  3  3  3  1
#4   4  4  1  2  4
#5   5  1  2  5  1
#6   6  2  3  6  2
#7   7  3  1  1  1
#8   8  4  2  2  2
#9   9  1  3  3  1
#10 10  2  1  4  2
#11 11  3  2  5  1
#12 12  4  3  6  2
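
A variation on the lookup idea, sketched with a named vector instead of a matrix. The helper recode_with and the names lookup, rows and cols are hypothetical and not part of any package:

```r
# Same sample data as in the question
df <- data.frame(
  id = 1:12,
  v1 = rep(1:4, 3),
  v2 = rep(1:3, 4),
  v3 = rep(1:6, 2),
  v4 = rep(1:2, 6))

# Names are the old values, elements are the new values; add pairs as needed
lookup <- c("2" = 4, "4" = 2)

# Hypothetical helper: map values found in the lookup, keep all others
recode_with <- function(x, lookup) {
  new <- unname(lookup[as.character(x)])   # NA where x has no mapping
  ifelse(is.na(new), x, new)
}

rows <- df$id < 5
cols <- c("v3", "v4")
df[rows, cols] <- lapply(df[rows, cols, drop = FALSE], recode_with, lookup = lookup)

head(df, 5)
```

Recoding more values then only means adding entries to lookup, for example "1" = 5.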

Also have a look at "R: How to recode multiple variables at once" or "Recoding multiple variables in R".
