Replace strings in a data frame that match strings in a matrix

Question

I have a data frame df containing NA values and strings, and two matrices, match and value, with the same number of rows and columns. match contains every string that can occur in df.

I would like to replace the strings in df with those in value: if a string in df equals an element of match, it should be replaced by the element of value at the same position.

I believe the first step is to create a new data frame holding the position in match of each element of df, but my attempt returns nothing useful:

df1 <- which(df %in% match) #nothing valuable...

Apologies for not having more code to show.

df <- as.data.frame(matrix(c("ab","bc",NA,"aa",NA,NA,"de","aa",NA,"bc","ab","ab"),ncol = 4))
match <- matrix(c("ab","bc","de","aa"),nrow = 2)
value <- matrix(c("Good","Bad","Average","Stop"),nrow = 2)

output <- as.data.frame(matrix(c("Good","Bad",NA,"Stop",NA,NA,"Average","Stop",NA,"Bad","Good","Good"),ncol = 4))

Or using plyr: matrix(mapvalues(unlist(df), c(match), c(value)), dim(df))
– count, Commented Mar 10, 2017 at 8:32

Niek · Accepted Answer · 2017-03-10 08:37:59Z

2

This should also work:

> m <- apply(df, 2, function(x) match(x, match))
> df2 <- as.data.frame(matrix(value[m], ncol = ncol(df), nrow = nrow(df)))
> df2
    V1   V2      V3   V4
1 Good Stop Average  Bad
2  Bad <NA>    Stop Good
3 <NA> <NA>    <NA> Good

answered Mar 10, 2017 at 8:37


Ronak Shah · Accepted Answer · 2017-03-10 08:45:09Z

1

We can unlist the data frame, match its elements against m1, and use the resulting indices to pull the corresponding entries from value.

df[] <- value[match(unlist(df), m1)]
df

#    V1   V2      V3   V4
#1 Good Stop Average  Bad
#2  Bad <NA>    Stop Good
#3 <NA> <NA>    <NA> Good

Note: the question's match matrix is renamed to m1 here.
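
For anyone who wants to run this end to end, here is a self-contained sketch of the same approach. It assumes the question's data, with match stored as m1 as in this answer. The stringsAsFactors = FALSE argument and the idx variable are additions for illustration, so that the columns stay character regardless of R version.

```r
# Data from the question; stringsAsFactors = FALSE keeps the columns as
# character on R versions before 4.0.0, where the default was TRUE.
df <- as.data.frame(
  matrix(c("ab", "bc", NA, "aa", NA, NA, "de", "aa", NA, "bc", "ab", "ab"), ncol = 4),
  stringsAsFactors = FALSE
)
m1    <- matrix(c("ab", "bc", "de", "aa"), nrow = 2)  # the question's `match`
value <- matrix(c("Good", "Bad", "Average", "Stop"), nrow = 2)

# match() treats m1 as a plain vector in column-major order, so idx holds a
# position from 1 to 4, or NA where the element of df is NA or absent from m1.
idx <- match(unlist(df), m1)

# value is indexed in the same column-major order; assigning through df[]
# keeps the data frame's shape and column names while replacing its contents.
df[] <- value[idx]
df
```

Because both matrices are read in column-major order, the lookup only relies on match and value having their elements laid out in the same order.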

edited Mar 10, 2017 at 8:45

answered Mar 10, 2017 at 8:30


4 Comments

Chrisftw Over a year ago

Simple and clear. I would like to add a nomatch so that unmatched strings are kept, but 'df[] <- value[match(unlist(df), m1,nomatch=unlist(df))]' does not keep the original strings. Any ideas? Thanks.

Ronak Shah Over a year ago

@Chrisftw I didn't get you. Keep the string? Do you mean an empty string instead of NAs? What output are you aiming for?

Chrisftw Over a year ago

Your answer is correct. In case some strings in df have no match, I would like to add a nomatch argument that keeps the unmatched string.

Ronak Shah Over a year ago

@Chrisftw That won't be straightforward with nomatch. I think you'll need something like v1 <- match(unlist(df), m1); df[] <- ifelse(is.na(v1) & !is.na(unlist(df)), unlist(df), value[v1]) in that case.
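
A runnable sketch of the keep-unmatched idea from the comment above. The two-column data frame and the string "zz" are made up for illustration (the question's data has no unmatched strings), and old and idx are hypothetical names. It assumes character columns: if the columns are factors, as they were by default before R 4.0.0, ifelse may return integer codes instead of the strings.

```r
# Hypothetical data: "zz" has no counterpart in m1.
df <- data.frame(
  V1 = c("ab", "zz", NA),
  V2 = c("aa", NA, "de"),
  stringsAsFactors = FALSE
)
m1    <- matrix(c("ab", "bc", "de", "aa"), nrow = 2)
value <- matrix(c("Good", "Bad", "Average", "Stop"), nrow = 2)

old <- unlist(df, use.names = FALSE)  # original strings, column by column
idx <- match(old, m1)                 # NA for both NA and unmatched elements

# Where there is no match, fall back to the original element: an unmatched
# string is kept as is, and an original NA stays NA.
df[] <- ifelse(is.na(idx), old, value[idx])
df
```

Since an NA in df falls back to itself, the extra !is.na() test from the comment is not needed in this form.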

akrun · Accepted Answer · 2017-03-10 09:27:12Z

1

We can use lapply with match.

df[] <- lapply(df, function(x) value[match(x, match)])
df
#    V1   V2      V3   V4
#1 Good Stop Average  Bad
#2  Bad <NA>    Stop Good
#3 <NA> <NA>    <NA> Good
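
A self-contained sketch of the column-wise version, using the question's data. The call match(x, match) works even though a matrix is also named match, because R skips non-function objects when it looks up the function being called. The stringsAsFactors = FALSE argument is an addition for illustration.

```r
# Data from the question, kept as character columns on any R version.
df <- as.data.frame(
  matrix(c("ab", "bc", NA, "aa", NA, NA, "de", "aa", NA, "bc", "ab", "ab"), ncol = 4),
  stringsAsFactors = FALSE
)
match <- matrix(c("ab", "bc", "de", "aa"), nrow = 2)
value <- matrix(c("Good", "Bad", "Average", "Stop"), nrow = 2)

# In match(x, match), the first `match` resolves to the base function and the
# second to the matrix defined above.
# lapply() visits one column at a time, so no unlist() is needed, and
# assigning through df[] keeps the data frame structure and column names.
df[] <- lapply(df, function(x) value[match(x, match)])
df
```

Renaming the matrix (for example to m1, as in the other answer) avoids the name clash and is easier to read, but it is not required for the code to run.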

answered Mar 10, 2017 at 9:27
