1

I have a rather basic question. I have several values in a column that I would like to replace with a single one, for instance:

a<-data.frame(T=LETTERS[5:20],V=rnorm(16,10,1))

and I would like to change all "E", "S", "T" in T to "AB", so I tried

a[a$T==c("E","S","T")]<-"AB"

but it gives me several warnings and ends up replacing everything with "AB".

I think it has something to do with levels and level labels, but I was not able to replace only some of the values without re-labelling each one. Sorry for the trouble, and thanks for any help!

4 Answers

7

You can use the recode() function from the car package, which also works on factors.

library(car)
a$T<-recode(a$T,"c('E','S','T')='AB'")

If you need to map several groups of values to different replacements, all the rules can be written in one call, separated by semicolons.

recode(a$T,"c('E','S','T')='AB';c('F','G','H')='CD'")

4 Comments

...but why such an unfriendly interface (not blaming you @Didzis but the author). I read about plyr::mapvalues but I can't test it here on my old R version. I suppose plyr::mapvalues(levels(a$T), c("E","S","T"), "AB") might work if someone can give it a try.
@flodel mapvalues will work with some modification mapvalues(a$T, c("E","S","T"), rep("AB",3))
Hello, this is really helpful! Thank you very much. I have now realized I need to do this for several different groups of T, say (L,M,N). Is it possible to do it for all variables in a single line instead of doing it group by group?
@DidzisElferts Echoing your comment, a$T <- plyr::mapvalues(a$T, c("E","S","T"), rep("AB",3)) is more friendly and readable than car::recode(...)
4

This would maintain your data structure (a factor like you guessed):

x <- levels(a$T)
levels(a$T) <- ifelse(x %in%  c("E","S","T"), "AB", x)

or

levels(a$T)[levels(a$T) %in%  c("E","S","T")] <- "AB"
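As a minimal sketch of why renaming levels works, the snippet below rebuilds the example data (with stringsAsFactors = TRUE set explicitly, which is an assumption so that T is a factor on any R version) and applies the one-line replacement. Levels that end up with the same name are merged, so the three old levels collapse into a single "AB" level.

```r
# Rebuild the example; force T to be a factor regardless of R version.
set.seed(1)
a <- data.frame(T = LETTERS[5:20], V = rnorm(16, 10, 1),
                stringsAsFactors = TRUE)

n_before <- nlevels(a$T)   # 16 distinct letters, E through T

# Rename the matching levels; duplicate level names are merged.
levels(a$T)[levels(a$T) %in% c("E", "S", "T")] <- "AB"

n_after <- nlevels(a$T)    # three levels collapsed into one
ab_rows <- which(a$T == "AB")
```

The column is still a factor afterwards, and the V column is untouched.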

Edit: if you have many such replacements, it is a little more complicated but not impossible:

from <- list(c("E","S","T"), c("J", "K", "L"))
to   <- c("AB", "YZ")

find.in.list <- function(x, y) match(TRUE, sapply(y, `%in%`, x = x))
idx.in.list  <- sapply(levels(a$T), find.in.list, from)
levels(a$T)  <- ifelse(is.na(idx.in.list), levels(a$T), to[idx.in.list])

a$T
#  [1] AB F  G  H  I  YZ YZ YZ M  N  O  P  Q  R  AB AB
# Levels: AB F G H I YZ M N O P Q R
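A different base R technique for the same task, shown here only as a sketch, is to assign a named list to levels(): each list name becomes a new level and its elements are the old levels to merge. Any old level not mentioned in the list becomes NA, so the sketch adds the untouched levels back explicitly. The names groups, keep and spec are illustrative, not from the answer above.

```r
# Fresh copy of the factor column (factor forced explicitly).
a <- data.frame(T = LETTERS[5:20], stringsAsFactors = TRUE)

# New level name -> old levels to merge into it.
groups <- list(AB = c("E", "S", "T"), YZ = c("J", "K", "L"))

# Levels that should stay as they are; map each one to itself,
# because levels missing from the list would turn into NA.
keep <- setdiff(levels(a$T), unlist(groups))
spec <- c(groups, setNames(as.list(keep), keep))

levels(a$T) <- spec
result <- as.character(a$T)
```

The values match the output shown above, but the level order differs: with this approach the merged levels come first, in the order they appear in the list.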

1 Comment

+1. this is the right answer, rather than messing around with character class.
1

Do you really need factors here? If not (and I think you do not), set options(stringsAsFactors=FALSE) before creating the data frame, and it becomes much simpler: a[a$T %in% c("E","S","T"),"T"]<-"AB"

2 Comments

If T is a factor (as in this case), this will give NA values, not "AB".
Oh, I see! That is because I have options()$stringsAsFactors==FALSE. Since R 2.12.x, R maintains hash tables for strings, so I think there is no need for factors in those cases...
0

A base R solution (assuming T is a character column rather than a factor) is:

a$T[a$T %in% c("E","S","T")] <- "AB"
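To illustrate the points raised in this thread, here is a small sketch comparing the same assignment on a character vector and on a factor, and showing why %in% is used instead of ==. The variable names are made up for the illustration.

```r
targets <- c("E", "S", "T")
chr <- LETTERS[5:20]   # character version of the column
fct <- factor(chr)     # factor version of the column

# == recycles the 3 targets along the 16 values, so it compares
# position by position and finds only the first "E".
n_eq <- sum(suppressWarnings(chr == targets))
# %in% tests membership, which is what the question needs.
n_in <- sum(chr %in% targets)

# Character vector: direct assignment works.
chr_new <- chr
chr_new[chr_new %in% targets] <- "AB"

# Factor: "AB" is not an existing level, so R warns and inserts NA.
fct_na <- fct
suppressWarnings(fct_na[fct_na %in% targets] <- "AB")

# One workaround: go through character and convert back.
tmp <- as.character(fct)
tmp[tmp %in% targets] <- "AB"
fct_ok <- factor(tmp)
```

If the column has to stay a factor, renaming its levels as in the answer above avoids the round trip through character.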
