I have several variables in my dataset that need to be recoded in exactly the same way, and several other variables that need to be recoded in a different way. I tried writing a function to help me with this, but I'm having trouble.

library(dplyr)
recode_liberalSupport = function(arg1){
  arg1 = recode(arg1, "1=-1;2=1;else=NA")
  return(arg1)
}

liberals = c(df$var1, df$var4, df$var8)
for(i in unique(liberals)){
  paste(df$liberals[i] <- sapply(liberals, FUN = recode_liberalSupport))
}

RStudio runs for about 5 minutes and then gives me this error message:

Error in `$<-.data.frame`(`*tmp*`, liberals, value = c(NA_real_, NA_real_,  : 
  replacement has 9 rows, data has 64600
In addition: Warning messages:
1: Unknown or uninitialised column: 'liberals'. 
2: In df$liberals[i] <- sapply(liberals, FUN = recode_liberalSupport) :
  number of items to replace is not a multiple of replacement length

Any help would be really appreciated! Thank you

  • You probably want to use mutate_at instead of apply here. I think your syntax for recode is also not correct. Providing sample data is the best way to get working answers. Commented Feb 7, 2018 at 21:51
  • One issue is your for loop: unique(liberals) is going to have fewer values than liberals. Commented Feb 7, 2018 at 21:51
  • Does this make sense: paste(df$liberals[i] <- sapply(liberals, FUN = recode_liberalSupport))? (The issue is with paste.) Commented Feb 7, 2018 at 21:57
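
The loop problem raised in the comments can be seen on a small example. This is only a sketch: the data frame below is a made-up stand-in for the asker's `df` (the values are invented), and it uses base R only. It shows that `c()` applied to columns concatenates their values into one long vector, so the information about which column is which is lost, whereas a character vector of column names keeps it.

```r
# Hypothetical stand-in for the asker's data; the values are made up.
df <- data.frame(var1 = c(1, 2, 3), var4 = c(2, 2, 1), var8 = c(1, 9, 2))

# c() concatenates the column *values* into one long vector ...
liberals_values <- c(df$var1, df$var4, df$var8)
length(liberals_values)   # 3 columns x 3 rows, with no column names left

# ... so unique() of it iterates over distinct values, not over columns.
unique(liberals_values)

# Storing the column *names* instead keeps track of which columns to recode.
liberals <- c("var1", "var4", "var8")
df[liberals]              # selects the three columns as a data frame
```

With the names in hand, a recoding function can be applied to `df[liberals]` column by column instead of looping over values.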

2 Answers

I think this is neater with dplyr. Using recode() correctly (with named replacement arguments rather than a single string) is a good idea. mutate_all() operates on the whole data frame, while mutate_at() operates on selected variables only. There are many ways to specify variables in dplyr.

mydata <- data.frame(arg1=c(1,2,4,5),arg2=c(1,1,2,0),arg3=c(1,1,1,1))

mydata

  arg1 arg2 arg3
1    1    1    1
2    2    1    1
3    4    2    1
4    5    0    1

library(dplyr)

# The ~ lambda form is used here because funs() is deprecated in dplyr >= 0.8.0
mydata <- mydata %>% 
     mutate_at(c("arg1","arg2"), ~ recode(., `1`=-1, `2`=1, .default = NaN))

mydata

  arg1 arg2 arg3
1   -1   -1    1
2    1   -1    1
3  NaN    1    1
4  NaN  NaN    1

I use NaN instead of NA because it is numeric, which makes it simpler to manage within a column of other numbers.

4 Comments

I see how this works, but how can I put the recoded variables in mydata back into my original dataframe?
@Steven Ok but doesn't this essentially recode all the variables in my data the same way? What if I want to recode some variables like 1=-1, 2=1, .default = NaN and some like 1=1, 2=-1, .default = NaN? Then put them all back into the same dataframe
Use mutate_at(var1, var3, ...etc)
Does .default mean "every other value not specified"?
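
The last questions in these comments can be illustrated together. The following is a sketch, not the answerer's own code: it assumes dplyr 1.0.0 or later and uses `across()` in place of `mutate_at()`, and the groupings `group_a` and `group_b` are hypothetical names chosen for the example. Each group of columns gets its own recoding inside a single `mutate()` call, so all recoded variables stay in the same data frame. In `recode()`, any value that is not matched by one of the named replacements receives the `.default` value.

```r
library(dplyr)

mydata <- data.frame(arg1 = c(1, 2, 4, 5),
                     arg2 = c(1, 1, 2, 0),
                     arg3 = c(1, 2, 1, 3))

# Hypothetical groupings: which columns get which recoding.
group_a <- c("arg1", "arg2")   # 1 -> -1, 2 -> 1, anything else -> NaN
group_b <- c("arg3")           # 1 -> 1,  2 -> -1, anything else -> NaN

recoded <- mydata %>%
  mutate(
    across(all_of(group_a), ~ recode(.x, `1` = -1, `2` = 1, .default = NaN)),
    across(all_of(group_b), ~ recode(.x, `1` = 1, `2` = -1, .default = NaN))
  )

recoded
```

Assigning the result back to the original name (`mydata <- mydata %>% ...`) overwrites the old columns in place; columns that are not listed in either group are left unchanged.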

As always there are many ways of doing this. I don't know dplyr well enough to use that function, but this seems to be what you are looking for.

mydata <- data.frame(arg1=c(1,2,4,5),arg2=c(1,1,2,0))
mydata
  arg1 arg2
1    1    1
2    2    1
3    4    2
4    5    0

Function to recode using a nested ifelse()

recode_liberalSupport <- function(var = "arg1", data = mydata) {
  # Read from the `data` argument rather than the global mydata
  recoded <- ifelse(data[[var]] == 1, -1,
                    ifelse(data[[var]] == 2, 1, NA))
  return(recoded)
}

Call the function

recode_liberalSupport(var = "arg1")
[1] -1  1 NA NA

Replace the variable arg1 with recoded values.

mydata$arg1 <- recode_liberalSupport(var = "arg1") 
mydata
  arg1 arg2
1   -1    1
2    1    1
3   NA    2
4   NA    0
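
To recode several variables at once with the same base R approach, one option is to let the function take a vector and apply it over a set of columns with `lapply()`. This is a sketch under assumptions: `recode_vec` and `cols` are hypothetical names introduced here, and the third column `arg3` is added only to show that unlisted columns are left alone.

```r
mydata <- data.frame(arg1 = c(1, 2, 4, 5),
                     arg2 = c(1, 1, 2, 0),
                     arg3 = c(1, 1, 1, 1))

# Vector-in, vector-out version of the nested ifelse() recoding.
recode_vec <- function(x) {
  ifelse(x == 1, -1, ifelse(x == 2, 1, NA))
}

# Hypothetical choice of columns to recode; arg3 is left untouched.
cols <- c("arg1", "arg2")
mydata[cols] <- lapply(mydata[cols], recode_vec)

mydata
```

A second function with the opposite mapping could be applied to a different vector of column names in the same way.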
