The solution in the question marked as a duplicate gave me an error when I tried it on my dataset, which also contains categorical data.

I have a table with several columns. One column, column A, has 0, 1, 2, 3, 4 as values. These are codes for a certain condition. I'm trying to create/add another column, column Z, to the table which has 0 if the value in column A is 0 and 1 if the value in column A is 3 or 4. I'm trying to do it via this:

for (i in 1:nrow(pheno_table))
    if pheno_table$columnA == 0
     then pheno_table$newcolumnZ<-0
    elsif pheno_table$columnA == 3 | pheno_table$columnA == 4
     then pheno_table$newcolumnZ<-0

Thanks so much @see24! Also, I did try this and set the working directory and such, but I am not able to see the file in the folder (I checked the paths):

    setwd('/pathtofolder/')

    library(dplyr)
    df <- data.frame(A = originaltablefile$column_of_interest)
    newcolumn <- df %>%
      mutate(newcolumn = case_when(A == 0 ~ 0,
                                   A %in% c(3, 4) ~ 1,
                                   TRUE ~ NA_real_))
    finaltablefile <- cbind(originaltablefile, newcolumn)

I'm not able to see finaltablefile in my folder.

1 Answer

I like to use the mutate and case_when functions from the dplyr package:

library(dplyr)
df <- data.frame(A = c(1,2,3,4,0),B = c(3,4,5,6,7))
df2 <- df %>% mutate(Z = case_when(A == 0 ~ 0,
                            A %in% c(3,4) ~ 1,
                            TRUE ~ NA_real_))

I'm assuming that you want NA for rows where A is not 0, 3, or 4. The TRUE part is the fallback: if none of the conditions above are true, then that value is used. You have to use NA_real_ because case_when requires all the outputs to be of the same type, and the other outputs here are numbers.
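
For comparison, the same recoding can be sketched in base R without a loop or dplyr. This is only a minimal illustration: it reuses the names pheno_table, columnA and newcolumnZ from the question, the sample values are made up, and it assumes columnA is numeric.

```r
# Made-up sample data standing in for the asker's table (assumption)
pheno_table <- data.frame(columnA = c(0, 1, 2, 3, 4))

# Start with NA everywhere, then fill in the rows that match each rule.
# The comparisons are vectorised, so no for loop is needed.
pheno_table$newcolumnZ <- NA_real_
pheno_table$newcolumnZ[pheno_table$columnA == 0] <- 0
pheno_table$newcolumnZ[pheno_table$columnA %in% c(3, 4)] <- 1

print(pheno_table)
```

Rows with codes 1 and 2 keep NA, which matches the TRUE ~ NA_real_ fallback above.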

6 Comments

Alternately, could do a left join with mdf = data.frame(A = c(0,3,4), Z = c(0, 1, 1)); df %>% left_join(mdf) if all comparisons are with equality.
Thanks so much! I'm not too familiar with dplyr; could you please explain NA_real_ a bit more? Also, could you suggest a good resource for learning dplyr? Thank you!
To learn more about dplyr, I suggest the excellent free online book "R for Data Science". NA is of type logical, and case_when requires that all the outputs are the same type. So if your other outputs are numbers you need NA_real_, and if they are character you need NA_character_. See the examples in the case_when documentation for more.
Thanks so much @see24! Also, I did try this and set the working directory and such, but I am not able to see the file in the folder (I checked the paths): `setwd('/pathtofolder/'); library(dplyr); df <- data.frame(A = originaltablefile$column_of_interest); newcolumn <- df %>% mutate(newcolumn = case_when(A == 0 ~ 0, A %in% c(3,4) ~ 1, TRUE ~ NA_real_)); finaltablefile <- cbind(originaltablefile, newcolumn)` I'm not able to see finaltablefile in my folder.
You have created finaltablefile in the R environment. If you want to save it to a folder, you need to do that explicitly. You can use write.csv to save it as a csv file, or you could use save(finaltablefile, file = "finaltablefile.RData") to save it as an R object that you can load in the future with load(). If you type ls() you will see the objects that you have created in the R environment. I recommend getting RStudio, as it makes keeping track of the objects in the environment much easier.
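
The lookup-table idea and the explicit save from the comments above can be sketched together as follows. This is a rough illustration, not the commenters' exact code: it uses base R's match() in place of left_join() so that it runs without dplyr, the sample data are made up, and tempdir() is only a placeholder for whatever folder you actually want to write to.

```r
# Made-up sample data standing in for the original table (assumption)
df <- data.frame(A = c(1, 2, 3, 4, 0), B = c(3, 4, 5, 6, 7))

# Lookup table mapping codes to the new value, as in the left join comment
mdf <- data.frame(A = c(0, 3, 4), Z = c(0, 1, 1))

# match() returns the row of mdf for each code, or NA when the code is absent,
# so codes 1 and 2 end up with Z = NA
finaltablefile <- df
finaltablefile$Z <- mdf$Z[match(df$A, mdf$A)]

# The object exists only in the R session until it is written out explicitly
out_dir <- tempdir()  # placeholder destination; substitute your own folder
csv_path <- file.path(out_dir, "finaltablefile.csv")
rdata_path <- file.path(out_dir, "finaltablefile.RData")

write.csv(finaltablefile, csv_path, row.names = FALSE)  # plain-text copy
save(finaltablefile, file = rdata_path)                 # R object copy

# Confirm that both files now exist on disk
print(file.exists(c(csv_path, rdata_path)))
```

In a later session, load(rdata_path) restores the object under its original name, while read.csv(csv_path) reads the csv copy back into a new data frame.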