Hopefully this is an easy one, but I can't seem to piece together an answer. I have a data frame in which, for each row, some values need to be changed to NA. The value to replace is not the same for every row: it is whatever value appears in a specified column of that row.

    mydata = as.data.frame(rbind(c("AA","CC","BB","DC","CC"),c("CC","CC","BB","DC","BB"),c("BB","BB","BB","DC","DC")))

    > mydata
      V1 V2 V3 V4 V5
    1 AA CC BB DC CC
    2 CC CC BB DC BB
    3 BB BB BB DC DC

    #for each row, replace values that match the value in column 5 with NA
    apply(mydata[,1:4], 1, function(x){
    x[x %in% x$V5]  = NA
    })

Desired output

    > mydata
      V1 V2 V3 V4 V5
    1 AA NA BB DC CC
    2 CC CC NA DC BB
    3 BB BB BB NA DC

Thanks!

----UPDATE----

Using the code below from arvi1000 works great for comparing values in a row to a single column of values. Is there a way to do something like this but comparing the values to 2 or more columns?

Current code

    mydata[,1:4][mydata[,1:4]==mydata[,5]] <- NA

Let's say I also have a column 6. By row, I want to change to NA any value that equals neither the value in column 5 nor the value in column 6.

    mydata = as.data.frame(rbind(c("AA","CC","BB","DC","CC","AA"),c("CC","CC","BB","DC","BB","CC"),c("BB","BB","BB","DC","DC","BB")),stringsAsFactors=F)

    > mydata
      V1 V2 V3 V4 V5 V6
    1 AA CC BB DC CC AA
    2 CC CC BB DC BB CC
    3 BB BB BB DC DC BB

Desired output

    > mydata
      V1 V2 V3 V4 V5 V6
    1 AA CC NA NA CC AA
    2 CC CC BB NA BB CC
    3 BB BB BB DC DC BB

I tried to do this, but received an error

    mydata[,1:4][mydata[,1:4]==mydata[,5]|mydata[,6]] <- NA
    Error in mydata[, 1:4] == mydata[, 5] | mydata[, 6] : 
      operations are possible only for numeric, logical or complex types

2 Answers

Add stringsAsFactors=FALSE to as.data.frame (this has been the default since R 4.0.0; before that, strings were converted to factors). This is key because 'CC'!='CC' when they are levels of different factors.

    mydata = as.data.frame(rbind(c("AA","CC","BB","DC","CC"),c("CC","CC","BB","DC","BB"),c("BB","BB","BB","DC","DC")),
                           stringsAsFactors=FALSE)

Then:

    mydata[,1:4][mydata[,1:4]==mydata[,5]] <- NA

Voila:

      V1   V2   V3   V4 V5
    1 AA <NA>   BB   DC CC
    2 CC   CC <NA>   DC BB
    3 BB   BB   BB <NA> DC
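To see why the one-liner works, it can help to look at the comparison on its own before assigning through it. The sketch below assumes the five-column `mydata` from this answer; `mask` is only an illustrative name. The column `mydata[, 5]` holds one value per row, so it is compared against each of the first four columns row by row, and the only TRUE cells should be V2 in row 1, V3 in row 2, and V4 in row 3.

```r
mydata <- as.data.frame(
  rbind(c("AA", "CC", "BB", "DC", "CC"),
        c("CC", "CC", "BB", "DC", "BB"),
        c("BB", "BB", "BB", "DC", "DC")),
  stringsAsFactors = FALSE
)

# Logical mask with the same shape as mydata[, 1:4].
# mydata[, 5] has one element per row, so each cell is compared
# with the V5 value of its own row.
mask <- mydata[, 1:4] == mydata[, 5]
print(mask)

# Assigning through the mask changes only the TRUE cells.
mydata[, 1:4][mask] <- NA
print(mydata)
```

Inspecting `mask` first is also a convenient way to debug more complicated conditions before any data is overwritten.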

4 Comments

Hi, this works great! Is there a way I could do this for comparing the data to values in 2 or more columns? I tried using a conditional (see my above edit) but that didn't work out so well. Thanks!
You were close! mydata[,1:4]==mydata[,5] | mydata[,1:4]==mydata[,6] will do it
That's great! Doesn't seem to work the same if I want to do != instead of == though. If I want to do != do I need to put together a different statement altogether?
Troubleshooting logical operators is probably a different question. The basic idea, though, is that you create a set of TRUE/FALSE values of the same shape as mydata[,1:4] and use that to index which 'cells' you want to make NA. Look at the logical index by itself and try to get that sorted as a first step (i.e. look at mydata[,1:4]!=mydata[,5] and build from there; you probably want & rather than | to combine != statements)
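Following the hint in the last comment, here is a minimal sketch of the two-column case from the question's update, assuming the six-column `mydata` shown there (`mask` is an illustrative name). The two `!=` comparisons are combined with `&`, so a cell is blanked only when it differs from both V5 and V6 in its row.

```r
mydata <- as.data.frame(
  rbind(c("AA", "CC", "BB", "DC", "CC", "AA"),
        c("CC", "CC", "BB", "DC", "BB", "CC"),
        c("BB", "BB", "BB", "DC", "DC", "BB")),
  stringsAsFactors = FALSE
)

# TRUE where a cell matches neither V5 nor V6 of its own row.
# Combining the two != tests with | instead would be TRUE for any
# cell that differs from at least one of them, which is too broad.
mask <- mydata[, 1:4] != mydata[, 5] & mydata[, 1:4] != mydata[, 6]

mydata[, 1:4][mask] <- NA
print(mydata)
```

The attempt in the question failed because `mydata[,1:4]==mydata[,5]|mydata[,6]` applies `|` to the character column `mydata[,6]` directly; each side of `|` or `&` needs to be a complete comparison.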

Another way would be using apply:

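    mydata = as.data.frame(rbind(c("AA","CC","BB","DC","CC"),c("CC","CC","BB","DC","BB"),c("BB","BB","BB","DC","DC")))

    # apply() hands each row to the function as a character vector
    mydata <- data.frame(t(apply(mydata, 1, function(x) {
      last <- length(x)                # position of the reference (last) column
      for (i in seq_len(last - 1)) {
        if (x[i] == x[last]) {
          x[i] <- NA                   # blank out values that match the last column
        }
      }
      return(x)
    })))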

output:

    > mydata
      V1   V2   V3   V4 V5
    1 AA <NA>   BB   DC CC
    2 CC   CC <NA>   DC BB
    3 BB   BB   BB <NA> DC
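The same row-wise idea can be written without the inner loop. This is only a sketch, not a replacement for the answer above: `blank_matches` is a hypothetical helper name, and it assumes the reference value is always in the last column.

```r
mydata <- as.data.frame(
  rbind(c("AA", "CC", "BB", "DC", "CC"),
        c("CC", "CC", "BB", "DC", "BB"),
        c("BB", "BB", "BB", "DC", "DC")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: x is one row as a character vector.
blank_matches <- function(x) {
  last <- length(x)
  # Positions among the first (last - 1) elements that equal the last one.
  hits <- which(x[-last] == x[last])
  x[hits] <- NA
  x
}

# apply() returns one column per row, so transpose back to rows.
result <- as.data.frame(t(apply(mydata, 1, blank_matches)),
                        stringsAsFactors = FALSE)
print(result)
```

For large data frames the vectorised `==` approach from the other answer avoids calling a function once per row.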
