Hello, folks!
I have been trying to solve the following problem, which I thought would be pretty simple. Perhaps it is (for some of you), but I haven't been able to solve it yet. What I want is to replace every 0 and 1 in columns 6 to 10 with an allele from the same row: each 0 becomes the value in the fourth column (allele1) and each 1 becomes the value in the fifth column (allele2).
Here is a reproducible example (the 0/1 columns are drawn with sample() without a seed, so your values will differ from the ones printed below):
# Creating dataframe vectors
chr <- rep(10, 10)
id <- paste0("name", 1:10)
pos <- seq(1, 1000, length.out = 10)
allele1 <- c("T", "T", "G", "G", "C", "T", "C", "C", "G", "C")
allele2 <- c("A", "A", "T", "T", "C", "T", "C", "C", "T", "T")
col6 <- sample(c(0, 1), 10, TRUE)
col7 <- sample(c(0, 1), 10, TRUE)
col8 <- sample(c(0, 1), 10, TRUE)
col9 <- sample(c(0, 1), 10, TRUE)
col10 <- sample(c(0, 1), 10, TRUE)
df <- data.frame(chr, id, pos, allele1, allele2, col6, col7, col8, col9, col10)
df
   chr     id  pos allele1 allele2 col6 col7 col8 col9 col10
1   10  name1    1       T       A    1    1    1    1     1
2   10  name2  112       T       A    0    0    0    1     1
3   10  name3  223       G       T    1    0    1    1     0
4   10  name4  334       G       T    1    1    0    1     1
5   10  name5  445       C       C    0    0    1    0     1
6   10  name6  556       T       T    0    1    0    1     1
7   10  name7  667       C       C    0    1    0    0     1
8   10  name8  778       C       C    0    0    1    1     1
9   10  name9  889       G       T    1    1    1    1     0
10  10 name10 1000       C       T    0    1    1    0     1
Given this output, I would expect:
df
   chr     id  pos allele1 allele2 col6 col7 col8 col9 col10
1   10  name1    1       T       A    A    A    A    A     A
2   10  name2  112       T       A    T    T    T    A     A
3   10  name3  223       G       T    T    G    T    T     G
4   10  name4  334       G       T    T    T    G    T     T
5   10  name5  445       C       C    C    C    C    C     C
6   10  name6  556       T       T    T    T    T    T     T
7   10  name7  667       C       C    C    C    C    C     C
8   10  name8  778       C       C    C    C    C    C     C
9   10  name9  889       G       T    T    T    T    T     G
10  10 name10 1000       C       T    C    T    T    C     T
I have tried using the functions 'within' and 'apply' inside a for loop, but it seems I am indexing incorrectly. I bet this task is much easier in Perl, but I'd really like to use R for practice.
Here's an example of the code I've tried:
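# Attempt 1: within() around a row-by-row loop
within(df, {
  for (i in 1:nrow(df)) {
    df[i, 6:length(df)] = ifelse(df[i, 6:length(df)] == 0, df[i, 4], df[i, 5])
  }
})
# Attempt 2: apply() over the 0/1 columns inside a row loop
for (i in 1:nrow(df)) {
  df[, 6:length(df)] = apply(df[, 6:length(df)] == 0, 2, ifelse, df[i, 4], df[i, 5])
}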
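Neither loop gives me the expected table. A vectorized direction that seems to reproduce it is sketched below, though I am not sure it is the most idiomatic way. It rests on a few assumptions: the 0/1 columns are always columns 6 onward, the allele columns might be factors (hence as.character), and the small data frame is just the first four rows of the example, hard-coded so the result does not depend on sample(). The names geno_cols and is_zero are made up for illustration.

```r
# First four rows of the example, hard-coded so the result is reproducible
df <- data.frame(
  chr = rep(10, 4),
  id = paste0("name", 1:4),
  pos = c(1, 112, 223, 334),
  allele1 = c("T", "T", "G", "G"),
  allele2 = c("A", "A", "T", "T"),
  col6 = c(1, 0, 1, 1),
  col7 = c(1, 0, 0, 1),
  col8 = c(1, 0, 1, 0),
  col9 = c(1, 1, 1, 1),
  col10 = c(1, 1, 0, 1),
  stringsAsFactors = FALSE
)

# Assumption: every column from the 6th onward holds 0/1 codes
geno_cols <- 6:ncol(df)

# Logical matrix (rows x 0/1 columns): TRUE where the code is 0
is_zero <- as.matrix(df[geno_cols]) == 0

# ifelse() returns a result shaped like is_zero and recycles the allele
# vectors down each column, so cell [i, j] is filled from row i.
# as.character() guards against the allele columns being factors.
df[geno_cols] <- ifelse(is_zero,
                        as.character(df$allele1),
                        as.character(df$allele2))

# Row 3 (alleles G/T, codes 1 0 1 1 0) becomes T G T T G
df
```

This relies on ifelse() recycling the allele vectors down the columns of the logical matrix, so no explicit row loop is needed.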
I would appreciate any help!
Sincerely yours