I have created two dataframes, with df.1 containing my main data.

ID  A_ratio   B_ratio  C_ratio
1    0.9       7.6      3.5
2    3.1       4.4      0.7     
3    6.3       8.2      1.2

The dataframe cut only contains one row.

A_cut  B_cut  C_cut
 4.5    5.3    2.0

I now want to use the values stored in cut to binarize df.1, setting X_ratio <= X_cut to 1 and X_ratio > X_cut to 0. The new columns could be called X_bin. I've tried the following dplyr approach:

df.2 <- df.1 %>%
  mutate(across(ends_with("ratio"), ~if_else(. <= get(cut[str_replace(cur_column(),"ratio","cut")]), 1, 0)
            .names = "{.col}_bin"))%>%
  rename_with(~str_replace(.,"_ratio",""),contains("_ratio_"))
  select(ID, ends_with("bin"))

But I'm unfortunately getting an Error: unexpected symbol. Could someone point out my mistake? The desired output in df.2 would be

ID A_bin B_bin C_bin
1   1     0     0
2   1     1     1
3   0     0     1

Thanks a lot in advance!

3 Answers

There is a , missing before .names. Also, since we are extracting the column from cut with [[, there is no need for get. Finally, using transmute instead of mutate returns only the columns that are needed, so the last step with select can be removed.

library(dplyr)
library(stringr)
df.1 %>%
  transmute(ID, across(ends_with("ratio"), 
      ~if_else(. <=  cut[[str_replace(cur_column(),"ratio","cut")]], 
            1, 0),
        .names = "{.col}_bin")) %>% 
   rename_with(~str_replace(.,"_ratio",""),contains("_ratio_"))

-output

#  ID A_bin B_bin C_bin
#1  1     1     0     0
#2  2     1     1     1
#3  3     0     0     1
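
To illustrate why get can be dropped, here is a minimal sketch of the lookup on its own, using the cut data frame from the question. The variable col_name is only an illustrative name. As far as I can tell, get expects a character string naming an object, so passing it a data frame column fails, whereas [[ returns the stored value directly.

```r
# One-row data frame of thresholds, as in the question
cut <- data.frame(A_cut = 4.5, B_cut = 5.3, C_cut = 2)

# Build the threshold column name from a ratio column name
# (col_name is an illustrative helper variable)
col_name <- sub("ratio", "cut", "A_ratio")

cut[col_name]    # single brackets: still a one-column data frame
cut[[col_name]]  # double brackets: the bare numeric value
```

The double-bracket form is what the comparison inside across needs, because it yields a plain number to compare each ratio column against.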

As we are returning binary columns, if_else is not really needed. The logical vector can be coerced to binary with as.integer or by wrapping it in +( ).

df.1 %>%
  transmute(ID, across(ends_with("ratio"), 
      ~as.integer(. <=  cut[[str_replace(cur_column(),"ratio","cut")]]),
        .names = "{.col}_bin")) %>% 
   rename_with(~str_replace(.,"_ratio",""),contains("_ratio_"))

Note: cut is also the name of a base R function, so it is better not to reuse function names for objects.
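
As a sketch of that advice, the same pipeline can be run with the thresholds stored under a different name. The name cutoffs below is just an assumption for illustration; any name that does not clash with an existing function would do.

```r
library(dplyr)
library(stringr)

# Same example data as in the question
df.1 <- data.frame(
  ID = 1:3,
  A_ratio = c(0.9, 3.1, 6.3),
  B_ratio = c(7.6, 4.4, 8.2),
  C_ratio = c(3.5, 0.7, 1.2)
)

# 'cutoffs' is a hypothetical name chosen to avoid masking base::cut()
cutoffs <- data.frame(A_cut = 4.5, B_cut = 5.3, C_cut = 2)

df.2 <- df.1 %>%
  transmute(ID, across(ends_with("ratio"),
      # compare each ratio column with its matching threshold
      ~ as.integer(.x <= cutoffs[[str_replace(cur_column(), "ratio", "cut")]]),
      .names = "{.col}_bin")) %>%
  # A_ratio_bin -> A_bin, and so on
  rename_with(~ str_replace(., "_ratio", ""), contains("_ratio_"))
```

With this naming, base::cut() stays usable in the same session.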

data

df.1 <- structure(list(ID = 1:3, A_ratio = c(0.9, 3.1, 6.3), B_ratio = c(7.6, 
4.4, 8.2), C_ratio = c(3.5, 0.7, 1.2)), class = "data.frame", row.names = c(NA, 
-3L))

cut <- structure(list(A_cut = 4.5, B_cut = 5.3, C_cut = 2), class = "data.frame",
row.names = c(NA, 
-1L))

1 Comment

Ah, thank you so much for this answer, works out perfectly. Also thanks for the very helpful remarks, will definitely try to follow them from now on!

Base R answer:

df.1[-1] <- +(sweep(df.1[-1], 2, unlist(cut), `<=`))
df.1

#  ID A_ratio B_ratio C_ratio
#1  1       1       0       0
#2  2       1       1       1
#3  3       0       0       1
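
Note that the one-liner above overwrites the ratio columns in place and pairs columns with thresholds by position. If you would rather keep df.1 untouched, get X_bin names, and match thresholds by name, a possible variation looks like the sketch below. The names cutoffs, ratio_cols, thresholds and bin are illustrative assumptions, not part of the original code.

```r
# Same example data as in the question
df.1 <- data.frame(
  ID = 1:3,
  A_ratio = c(0.9, 3.1, 6.3),
  B_ratio = c(7.6, 4.4, 8.2),
  C_ratio = c(3.5, 0.7, 1.2)
)

# 'cutoffs' is a hypothetical name used instead of 'cut'
cutoffs <- data.frame(A_cut = 4.5, B_cut = 5.3, C_cut = 2)

# Columns to binarize
ratio_cols <- grep("_ratio$", names(df.1), value = TRUE)

# Look up each threshold by name, so the column order in 'cutoffs' does not matter
thresholds <- unlist(cutoffs[sub("_ratio$", "_cut", ratio_cols)])

# Column-wise comparison; unary + turns TRUE/FALSE into 1/0
bin <- +(sweep(df.1[ratio_cols], 2, thresholds, `<=`))
colnames(bin) <- sub("_ratio$", "_bin", ratio_cols)

# New data frame with ID plus the binarized columns; df.1 is left as it was
df.2 <- data.frame(ID = df.1$ID, bin)
```

This trades the brevity of the one-liner for output in the shape asked for in the question.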

1 Comment

Thanks for that! Didn't know there was such an easy and elegant way using base R!

purrr

df <- structure(list(ID = 1:3, A_ratio = c(0.9, 3.1, 6.3), B_ratio = c(7.6, 
4.4, 8.2), C_ratio = c(3.5, 0.7, 1.2)), class = "data.frame", row.names = c(NA, 
-3L))

cut <- structure(list(A_cut = 4.5, B_cut = 5.3, C_cut = 2), class = "data.frame",
row.names = c(NA, 
-1L))
library(purrr)
df[-1] <- +map2_dfc(df[-1], cut, ~.x <= .y)
df
#>   ID A_ratio B_ratio C_ratio
#> 1  1       1       0       0
#> 2  2       1       1       1
#> 3  3       0       0       1

Created on 2021-04-02 by the reprex package (v1.0.0)
