I have a data frame df containing NAs and strings, and two matrices, match and value, with the same number of rows and columns. match contains all the possible strings in df.

I would like to replace the strings in df with those in value: if a string in df equals an element of match, it should be replaced by the string at the same position in value.

I believe the first step is to create a new data frame holding the position in match of each element of df:

df1 <- which(df %in% match) #nothing valuable...

Apologies for the lack of code on my side.


df <- as.data.frame(matrix(c("ab","bc",NA,"aa",NA,NA,"de","aa",NA,"bc","ab","ab"),ncol = 4))
match <- matrix(c("ab","bc","de","aa"),nrow = 2)
value <- matrix(c("Good","Bad","Average","Stop"),nrow = 2) 

output <- as.data.frame(matrix(c("Good","Bad",NA,"Stop",NA,NA,"Average","Stop",NA,"Bad","Good","Good"),ncol = 4))
  • Or using plyr: matrix(mapvalues(unlist(df), c(match), c(value)), dim(df)) Commented Mar 10, 2017 at 8:32

3 Answers

This should also work:

> m <- apply(df, 2, function(x) match(x, match))
> df2 <- as.data.frame(matrix(value[m], ncol = ncol(df), nrow = nrow(df)))
> df2
    V1   V2      V3   V4
1 Good Stop Average  Bad
2  Bad <NA>    Stop Good
3 <NA> <NA>    <NA> Good

We can unlist the data frame, match its elements against those of m1, and use the resulting index to pull the corresponding entries from value.

df[] <- value[match(unlist(df), m1)]
df

#    V1   V2      V3   V4
#1 Good Stop Average  Bad
#2  Bad <NA>    Stop Good
#3 <NA> <NA>    <NA> Good

Note: match has been renamed to m1.
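
The same position-based lookup can also be written with a named character vector. This is only a sketch of an alternative, not part of the answer above: the name lookup is hypothetical, and as.character() is used so that the indexing goes by string even if the columns happen to be factors.

```r
# Data from the question; stringsAsFactors = FALSE keeps the columns as character
df <- as.data.frame(matrix(c("ab","bc",NA,"aa",NA,NA,"de","aa",NA,"bc","ab","ab"),
                           ncol = 4), stringsAsFactors = FALSE)
m1    <- matrix(c("ab","bc","de","aa"), nrow = 2)
value <- matrix(c("Good","Bad","Average","Stop"), nrow = 2)

# Hypothetical helper: names are the old strings, elements are the replacements.
# c() flattens both matrices in the same column-major order, so positions line up.
lookup <- setNames(c(value), c(m1))

# Index the lookup by name, column by column; NA cells stay NA
df[] <- lapply(df, function(x) unname(lookup[as.character(x)]))
df
```

Indexing by name returns NA for any string that is not among the names, so the result behaves like the match() version for NA and unmatched cells.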

4 Comments

Simple and clear. To keep the original string when there is no match, I tried adding a nomatch argument: df[] <- value[match(unlist(df), m1, nomatch = unlist(df))], but that does not keep the unmatched strings. Any ideas? Thanks.
@Chrisftw I didn't follow. Keep the string? Do you mean an empty string instead of NAs? What output are you aiming for?
Your answer is correct. In case some strings in df have no match, I would like to add a nomatch argument that keeps the unmatched string.
@Chrisftw That won't be straightforward with nomatch. I think you'll need something like v1 <- match(unlist(df), m1); df[] <- ifelse(is.na(v1) & !is.na(unlist(df)), unlist(df), value[v1]) in that case.
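
The idea in the last comment (keep a cell as it is when it has no match) might look roughly like this. It is a sketch under assumptions: the small data frame with the unmatched string "zz" is made up for illustration, the names old, idx and new are hypothetical, and the columns are assumed to be character rather than factor.

```r
# Hypothetical data: "zz" has no entry in m1, NA is a missing cell
df <- as.data.frame(matrix(c("ab", "zz", NA, "aa"), ncol = 2),
                    stringsAsFactors = FALSE)
m1    <- matrix(c("ab","bc","de","aa"), nrow = 2)
value <- matrix(c("Good","Bad","Average","Stop"), nrow = 2)

old <- unlist(df, use.names = FALSE)  # all cells as one character vector
idx <- match(old, m1)                 # position in m1, NA where there is no match
new <- value[idx]                     # replacement string, NA where there is no match
new[is.na(idx)] <- old[is.na(idx)]    # fall back to the original cell (NA stays NA)

df[] <- new                           # write back, keeping the shape of df
df
# V1 is now "Good", "zz"; V2 is NA, "Stop"
```

The nomatch argument of match() only accepts a single integer to return for unmatched elements, which is why the fallback is done in a separate step here.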
We can use lapply with match.

df[] <- lapply(df, function(x) value[match(x, match)])
df
#    V1   V2      V3   V4
#1 Good Stop Average  Bad
#2  Bad <NA>    Stop Good
#3 <NA> <NA>    <NA> Good
