I have a dataframe test with a single column, exposure:

    exposure                       
1    CD177 
2    RFESD 
3    IL12B                       
4   IL18R1 
5      CEL

I want to add a column to test based on the count_type column of another dataframe, test1, shown below:

  Exposure cis.trans count_type
 1:    CD177       cis          1
 2:    CD177       cis          1
 3:    CD177       cis          1
 4:    CD177       cis          1
 5:    CD177       cis          1
 6:    CD177       cis          1
 7:    CD177       cis          1
 8:      CEL       cis          1
 9:    IL12B     trans          2
10:    IL12B       cis          2
11:   IL18R1       cis          1
12:   IL18R1       cis          1
13:   IL18R1       cis          1
14:    RFESD       cis          1

If count_type is 1, I want to take the value from the cis.trans column; otherwise the value should be "mix". In this example I want to get this:

 exposure  typ
1    CD177 cis 
2    RFESD cis 
3    IL12B mix
4   IL18R1 cis
5      CEL cis

Here is my code:

test<-test%>%
  mutate( typ=ifelse(test1[match(test$exposure,test1$Exposure),"count_type"]==1,
                     test1[match(test$exposure,test1$Exposure),"cis.trans"],
                     "mix"))

What I am getting is the following:

exposure                       typ
1    CD177 cis, cis, trans, cis, cis
2    RFESD cis, cis, trans, cis, cis
3    IL12B                       mix
4   IL18R1 cis, cis, trans, cis, cis
5      CEL cis, cis, trans, cis, cis

I don't know where the problem is. To check what match is returning, I tried the following, and it does return the index of the wanted row in the test1 dataframe:

test<-test%>%
  mutate( typ_ind=ifelse(test1[match(test$exposure,test1$Exposure),"count_type"]==1,
                     match(test$exposure,test1$Exposure),
                     "mix"))

test
  exposure                       typ    typ_ind
1    CD177 cis, cis, trans, cis, cis          1
2    RFESD cis, cis, trans, cis, cis         14
3    IL12B                       mix        mix
4   IL18R1 cis, cis, trans, cis, cis         11
5      CEL cis, cis, trans, cis, cis          8

Any idea what's happening and how to fix it?

1 Answer

Keep only the unique rows of test1 based on the Exposure and count_type columns, then join the result with test. Finally, change the value of cis.trans to "mix" where count_type is 2.

library(dplyr)

test1 %>%
  distinct(Exposure, count_type, .keep_all = TRUE) %>%
  inner_join(test, by = c('Exposure' = 'exposure')) %>%
  mutate(cis.trans  = ifelse(count_type == 2, 'mix', cis.trans))

#  Exposure cis.trans count_type
#1    CD177       cis          1
#2      CEL       cis          1
#3    IL12B       mix          2
#4   IL18R1       cis          1
#5    RFESD       cis          1
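
As for why the original ifelse call misbehaves, one likely explanation, assuming test1 is a data.table (the "1:" row labels in its printout suggest so) or a tibble: test1[rows, "cis.trans"] then returns a one-column table rather than a plain vector. ifelse recycles that table as a list, so every matching cell receives the whole extracted column (cis, cis, trans, cis, cis). Below is a minimal sketch of a fix that extracts plain vectors with [[ ]] before indexing. The sample data are rebuilt from the question as base data frames, and idx is a helper name introduced here for illustration.

```r
# Sample data reconstructed from the question (base data frames for portability)
test <- data.frame(exposure = c("CD177", "RFESD", "IL12B", "IL18R1", "CEL"))
test1 <- data.frame(
  Exposure   = c(rep("CD177", 7), "CEL", "IL12B", "IL12B", rep("IL18R1", 3), "RFESD"),
  cis.trans  = c(rep("cis", 8), "trans", rep("cis", 5)),
  count_type = c(rep(1, 8), 2, 2, rep(1, 4))
)

# Row of test1 holding the first match for each exposure in test
idx <- match(test$exposure, test1$Exposure)

# [[ ]] always extracts a plain vector, so ifelse() works element-wise
test$typ <- ifelse(test1[["count_type"]][idx] == 1,
                   test1[["cis.trans"]][idx],
                   "mix")

print(test)
```

This relies on match returning only the first matching row, which is enough here because count_type is the same for every row of a given Exposure.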

3 Comments

Thanks, the answer worked for this case, but the column names are taken from the test1 dataframe. I need the names to stay the same as in test, because the real dataframes have many more columns than this example. When I apply the code to my data, all columns from test1 are added to test.
I manually selected the columns I wanted to keep and renamed them, using select and rename.
You can change the order of the join and use select to keep the columns you need in the final dataframe: test %>% inner_join(test1 %>% distinct(Exposure, count_type, .keep_all = TRUE), by = c('exposure' = 'Exposure')) %>% mutate(cis.trans = ifelse(count_type == 2, 'mix', cis.trans)) %>% select(col1, col2). The mutate step comes before select so that count_type and cis.trans are still available when it runs. Replace col1, col2 with the actual column names in your data.
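
To keep the names and row order of test without renaming afterwards, one option is to shrink test1 to a two-column lookup first and left-join it onto test. This is only a sketch under the assumptions of the example data: lookup and result are names introduced here, and the data frames are rebuilt from the question.

```r
library(dplyr)

# Sample data reconstructed from the question
test <- data.frame(exposure = c("CD177", "RFESD", "IL12B", "IL18R1", "CEL"))
test1 <- data.frame(
  Exposure   = c(rep("CD177", 7), "CEL", "IL12B", "IL12B", rep("IL18R1", 3), "RFESD"),
  cis.trans  = c(rep("cis", 8), "trans", rep("cis", 5)),
  count_type = c(rep(1, 8), 2, 2, rep(1, 4))
)

# One row per Exposure, carrying only the key (renamed to match test) and the new column
lookup <- test1 %>%
  distinct(Exposure, count_type, .keep_all = TRUE) %>%
  mutate(typ = ifelse(count_type == 1, cis.trans, "mix")) %>%
  select(exposure = Exposure, typ)

# left_join keeps every row and column of test, in its original order
result <- test %>%
  left_join(lookup, by = "exposure")

print(result)
```

Because lookup contains only exposure and typ, no other columns from test1 leak into the result, however many columns the real data have.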
