I am looking for answers similar to these posts: "R: Replace multiple values in multiple columns of dataframes with NA" and "Multiple replacement in R".

My dataframe my.df contains NAs.

dput(my.df)
structure(list(`AICAR (GDSC1:1001)_GDSC1` = c(10.1253052794007, 
NA, NA, NA, NA, NA, 9.3362273693641, NA, NA, NA), `vinblastine (GDSC1:1004)_GDSC1` = c(-5.56689193211021, 
NA, NA, NA, NA, NA, -3.49808657768651, NA, NA, -5.7323006155361
), `cisplatin (GDSC1:1005)_GDSC1` = c(3.20680858158152, NA, NA, 
NA, NA, NA, NA, NA, NA, NA), `cytarabine (GDSC1:1006)_GDSC1` = c(-1.29089026889862, 
NA, NA, NA, NA, NA, NA, NA, NA, NA), `docetaxel (GDSC1:1007)_GDSC1` = c(-9.21190331946225, 
NA, NA, NA, NA, NA, NA, NA, NA, -6.51430196744496), `methotrexate (GDSC1:1008)_GDSC1` = c(NA, 
NA, NA, NA, NA, NA, -4.96153980941858, NA, NA, NA), `gefitinib (GDSC1:1010)_GDSC1` = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, -4.65609368323825), `navitoclax (GDSC1:1011)_GDSC1` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_), `vorinostat (GDSC1:1012)_GDSC1` = c(-0.1834250603902, 
1.80666265545084, 0.503152683902549, 1.78569632218743, NA, 1.01934567070847, 
0.321867836558935, NA, 2.18003424956055, 0.143794452798708)), row.names = c(NA, 
10L), class = "data.frame")

I get the cell location of each NA using idx <- my.df %>% lapply(., function(x) which(is.na(x))). I then convert these NAs to 0 with my.df %>% mutate_if(., is.numeric, ~replace(., is.na(.), 0)) before I calculate correlations. Now, how can I put the NAs back into their original cells based on idx?

I reckon loops, the tidyverse, purrr or something similar can do this fast? It would be great if the column names of my.df could be matched against the names of idx as a quality check in the code.

Thanks!

  • Do not use lapply. Use idx <- is.na(my.df), then do whatever you want to my.df. Once done, you can do is.na(my.df) <- idx. Commented Nov 24, 2022 at 10:55
  • Or... use cor(..., use = 'pairwise.complete.obs') to calculate the correlation and leave your data unaffected? Your correlation is likely biased after imputation, unless you know beforehand that the NAs are in fact 0s, in which case they should be replaced and not returned. Commented Nov 24, 2022 at 11:02
  • Thanks for the suggestion. If I do the first alternative I get Error in [<-.data.frame(*tmp*, value, value = NA) : unsupported matrix index in replacement Commented Nov 24, 2022 at 12:51
  • Could you give an example of the expected output for 'how can I return the NAs into their dedicated cells based on their idx'? Commented Nov 26, 2022 at 10:27
  • If you consider the example above, I turn the NAs in the cells into 0 and do some stuff with the data, but the cell locations stay the same. So now I want to insert the NAs back into their original cells. Commented Nov 30, 2022 at 9:35
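
The logical-mask idea from the first comment can be sketched roughly as below. This is only an illustration under assumptions: the small my.df is a made-up stand-in for the real data, filled and restored are hypothetical names, and the step in the middle is assumed to keep the row and column layout unchanged.

```r
# Made-up stand-in for my.df (the real data are in the dput output above)
my.df <- data.frame(
  a = c(10.1, NA, 9.3, NA),
  b = c(NA, 1.8, 0.5, NA)
)

# Logical matrix with the same shape and column names as my.df;
# TRUE marks a cell that is NA
idx <- is.na(my.df)

# Work on a zero-filled copy so the original stays untouched
filled <- my.df
filled[idx] <- 0

# ... calculations on `filled` that keep rows and columns in place ...

# Quality control: the mask must still line up with the data frame
stopifnot(
  identical(dim(idx), dim(filled)),
  identical(colnames(idx), names(filled))
)

# Put the NAs back into their original cells
restored <- filled
restored[idx] <- NA
```

Because idx carries the column names of my.df, the stopifnot() call covers the name-matching check asked for in the question. If the intermediate step reorders or drops columns, the check fails instead of silently writing NAs into the wrong cells.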
