2

I would like to convert data frame df1 into data frame df2.

# Input
id <- c(1,2,3)
outcome_1 <- c(1,0,1)
outcome_2 <- c(1,1,0)
df1 <- data.frame(id,outcome_1,outcome_2)

# Desired output
id <- c(1,2,3)
outcome <- c("1,2","2","1")
df2 <- data.frame(id,outcome)

The answers to the following question almost do what I want, but in my case a row can have more than one positive outcome (e.g. the first row needs to be "1,2"). Also, I would like the resulting column to be a character column.

R: Converting multiple binary columns into one factor variable whose factors are binary column names

Please kindly help. Thank you.

5 Answers

2

Strip the "outcome_" prefix from the column names with substring, then, row by row, subset the remaining labels using the binary values coerced with as.logical.

apply(df1[-1], 1, \(x) toString(substring(names(df1)[-1], 9)[as.logical(x)]))
# [1] "1, 2" "2"    "1" 

or

apply(df1[-1], 1, \(x) paste(substring(names(df1)[-1], 9)[as.logical(x)], collapse=','))
# [1] "1,2" "2"   "1"  

Using the first method:

cbind(df1[1], outcome=apply(df1[-1], 1, \(x) toString(substring(names(df1)[-1], 9)[as.logical(x)])))
#   id outcome
# 1  1    1, 2
# 2  2       2
# 3  3       1
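
The substring(..., 9) call above relies on every outcome column starting with the eight-character prefix "outcome_". As a minimal sketch under the same df1, the prefix can be stripped by pattern instead, which also makes the column selection explicit. The names outcome_cols and labels are illustrative and not part of the original answer.

```r
df1 <- data.frame(id = c(1, 2, 3), outcome_1 = c(1, 0, 1), outcome_2 = c(1, 1, 0))

# Pick the outcome columns by name rather than by position.
outcome_cols <- grep("^outcome_", names(df1), value = TRUE)

# Strip the prefix by pattern, so the code does not depend on its length.
labels <- sub("^outcome_", "", outcome_cols)

# For each row, keep the labels whose binary value is 1 and join them.
outcome <- apply(df1[outcome_cols], 1,
                 function(x) paste(labels[x == 1], collapse = ","))

df2 <- data.frame(id = df1$id, outcome = outcome, stringsAsFactors = FALSE)
df2
```

A row with no positive outcome would end up as an empty string here, since paste() on a zero-length vector with collapse returns "".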

If you want a list column (one numeric vector per row) instead, you may use list2DF.

l <- list2DF(c(df1[1],
               outcome=list(apply(df1[-1], 1, \(x) 
                                  as.numeric(substring(names(df1)[-1], 9))[as.logical(x)]))))
l
#   id outcome
# 1  1    1, 2
# 2  2       2
# 3  3       1

where

str(l)
# 'data.frame': 3 obs. of  2 variables:
#  $ id     : num  1 2 3
#  $ outcome:List of 3
#   ..$ : num  1 2
#   ..$ : num 2
#   ..$ : num 1
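
One caveat worth illustrating: apply only returns a list when the rows yield results of different lengths. If every row had the same number of positive outcomes, it would simplify to a vector or matrix. The following is a rough sketch, not the answerer's method, that builds the list column with lapply so the result is always a list. The names m, labels, outcome_list and l2 are made up for the example.

```r
df1 <- data.frame(id = c(1, 2, 3), outcome_1 = c(1, 0, 1), outcome_2 = c(1, 1, 0))

m <- as.matrix(df1[-1])                                  # binary outcome matrix
labels <- as.numeric(sub("^outcome_", "", colnames(m)))  # numeric suffixes 1, 2

# lapply() always returns a list: one numeric vector per row.
outcome_list <- lapply(seq_len(nrow(m)), function(i) labels[m[i, ] == 1])

l2 <- df1[1]
l2$outcome <- outcome_list                               # stored as a list column

# Collapse the list column back to character when needed.
collapsed <- sapply(l2$outcome, toString)
collapsed
```

Keeping the values as a list column makes it easy to count or filter outcomes per row, for example with lengths(l2$outcome).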

Data:

df1 <- structure(list(id = c(1, 2, 3), outcome_1 = c(1, 0, 1), outcome_2 = c(1, 
1, 0)), class = "data.frame", row.names = c(NA, -3L))

2 Comments

This is giving me an error message: Error: unexpected input in "cbind(df1[1], outcome=apply(df1[-1], 1, \" It might be that I am missing something obvious as I am still relatively new to R.
@marc.th Oh, you are probably newer than your R version. Since R 4.1 we can write the shorthand \(x) for function(x), so if you change \(x) to function(x) in the code, it will work. I recommend always using the newest R, though. Cheers!
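
For readers on an R version before 4.1, here is a sketch of the first method from this answer with the \(x) shorthand written out as function(x). The variable name res is only there for illustration.

```r
df1 <- data.frame(id = c(1, 2, 3), outcome_1 = c(1, 0, 1), outcome_2 = c(1, 1, 0))

# Same call as in the answer, with \(x) spelled out as function(x)
# so that it also parses on R versions before 4.1.
res <- cbind(
  df1[1],
  outcome = apply(df1[-1], 1, function(x) {
    toString(substring(names(df1)[-1], 9)[as.logical(x)])
  })
)
res
```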
0

Here is one more tidyverse approach:

library(dplyr)
library(tidyr)

df1 %>% 
  mutate(across(-id, ~case_when(. == 1 ~ cur_column()), .names = 'new_{col}'), .keep="unused") %>% 
  unite(outcome, starts_with('new'), na.rm = TRUE, sep = ', ') %>% 
  mutate(outcome = gsub('outcome_', '', outcome))
#   id outcome
# 1  1    1, 2
# 2  2       2
# 3  3       1


0

How many outcome_ columns are there? If just 2, this will work fine.

library(dplyr) 

df1 %>% 
    rowwise() %>% 
    summarise(id = id, 
              outcome = paste(which(c(outcome_1,outcome_2)==1), collapse =",")) 

# A tibble: 3 x 2
     id outcome
  <dbl> <chr>  
1     1 1,2    
2     2 2      
3     3 1
               

If there are more than 2, try this:

df1 %>% 
    rowwise() %>% 
    summarise(id=id, 
              outcome = paste(which(c_across(-id)== 1), collapse =",")) 
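
Note that which() returns positions among the selected columns, so the result matches the column suffixes only when the columns are named outcome_1, outcome_2, ... in that order. A small base R sketch of the difference follows, using a hypothetical data frame df_gap whose suffixes are 3 and 7. Base R is used here only to keep the example self-contained.

```r
# Hypothetical data whose suffixes are not 1, 2, ... in order.
df_gap <- data.frame(id = c(1, 2, 3), outcome_3 = c(1, 0, 1), outcome_7 = c(1, 1, 0))

labels <- sub("^outcome_", "", names(df_gap)[-1])

# which() gives positions among the outcome columns ...
positions <- apply(df_gap[-1], 1,
                   function(x) paste(which(x == 1), collapse = ","))

# ... while indexing the labels with those positions gives the suffixes.
suffixes <- apply(df_gap[-1], 1,
                  function(x) paste(labels[which(x == 1)], collapse = ","))

positions
suffixes
```

If the real column names carry meaningful suffixes, the second form is probably the safer choice.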
               


0

Another possible solution, based on dplyr and purrr::pmap:

library(tidyverse)

df1 %>%
  transmute(id, outcome = pmap(., ~ c(1*..2, 2*..3) %>% .[. != 0] %>% toString))

#>   id outcome
#> 1  1    1, 2
#> 2  2       2
#> 3  3       1

Or simply:

library(tidyverse)

pmap_dfr(df1, ~ data.frame(id = ..1, outcome = c(1*..2, 2*..3) %>% .[. != 0]
   %>% toString))

#>   id outcome
#> 1  1    1, 2
#> 2  2       2
#> 3  3       1
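
The c(1*..2, 2*..3) idiom multiplies each binary column by its index and then drops the zeros, but it hard-codes two columns. As a sketch of the same idea for any number of outcome columns, assuming they are ordered outcome_1, outcome_2, ..., the multiplication can be done on the whole matrix in base R. The names m and idx are illustrative.

```r
df1 <- data.frame(id = c(1, 2, 3), outcome_1 = c(1, 0, 1), outcome_2 = c(1, 1, 0))

m <- as.matrix(df1[-1])

# col(m) holds the column index of every cell, so each 1 becomes its
# column index and each 0 stays 0.
idx <- m * col(m)

# Drop the zeros in every row and join what is left.
outcome <- apply(idx, 1, function(x) toString(x[x != 0]))

res <- data.frame(id = df1$id, outcome = outcome, stringsAsFactors = FALSE)
res
```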


-1
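# Logical index of the outcome columns
outcome_col_idx <- grepl("outcome", colnames(df1))
cbind(
  df1[, !outcome_col_idx, drop = FALSE],
  outcome = apply(
    # Mark zeros as NA so that only positive outcomes remain
    replace(df1, df1 == 0, NA)[, outcome_col_idx, drop = FALSE],
    1,
    function(x) {
      # Join the suffixes of the non-missing columns;
      # toString() keeps the result a character string, as requested
      toString(gsub("outcome_", "", names(x)[!is.na(x)]))
    }
  )
)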

1 Comment

outcome_col_idx is not defined in this solution
