I have a table like the one below. I would like to recode each sample value by comparing it with the first three columns of its row, using the codes 0, 1, 2, 3. For example, if the sample value equals REF_REF it becomes 1; if it equals REF_ALT it becomes 2; if it equals ALT_ALT it becomes 3; and 0/0 becomes 0.

REF_REF   REF_ALT   ALT_ALT   sample1   sample2   sample3
A/A       A/G       G/G       0/0       A/G       G/G
T/T       T/C       C/C       T/T       T/C       T/C
C/C       C/G       G/G       0/0       C/G       C/G

I would like to produce a table like this:

REF_REF   REF_ALT   ALT_ALT   sample1   sample2   sample3
A/A       A/G       G/G       0         2         3
T/T       T/C       C/C       1         2         2
C/C       C/G       G/G       0         2         2

I tried the following code, but it does not work:

l=c()
for (i in seq_along(data))
{
 data=data[1,]
 Ref_Ref=data$Ref_Ref
 Alt_Alt=data$Ref_Ref
 Ref_Alt=data$Ref_Alt
 with( data[], ifelse( data == Ref_Ref, 1, ifelse(data == Alt_Alt, 3, 
if((data==Ref_Alt) 2))))
 if(data=Ref_Ref, data=1)
l[1,]=if(data==Ref_Ref, 1)
  l[1] <- if (data %in% data$Ref_Ref) 1 else if (data %in% data$Alt_Alt) 3  else if (data %in% data$Alt_Alt) 2 else 0  
}
  • Anything you tried yourself? Why did it not work? Commented Dec 16, 2015 at 15:29
  • l=c() for (i in seq_along(data)) { data=data[1,] Ref_Ref=data$Ref_Ref Alt_Alt=data$Ref_Ref Ref_Alt=data$Ref_Alt with( data[], ifelse( data == Ref_Ref, 1, ifelse(data == Alt_Alt, 3, if((data==Ref_Alt) 2)))) if(data=Ref_Ref, data=1) l[1,]=if(data==Ref_Ref, 1) l[1] <- if (data %in% data$Ref_Ref) 1 else if (data %in% data$Alt_Alt) 3 else if (data %in% data$Alt_Alt) 2 else 0 } Commented Dec 16, 2015 at 15:33
  • Can you put that in the question please? Don't put extra information in comments. Commented Dec 16, 2015 at 15:34

2 Answers

This might work for you. It processes your data row by row and relies on how factors work in R. For each row, we create a factor from the samples, with the levels 0/0, REF_REF, REF_ALT and ALT_ALT, in that order. Then we convert this factor to numeric and subtract 1 to get the desired output.

recoded_samples <- apply(dat, 1, function(x) {
  # Levels "0/0", REF_REF, REF_ALT, ALT_ALT get integer codes 1..4; subtract 1 for 0..3
  res <- as.numeric(factor(x[4:6], levels = c("0/0", x[1:3]))) - 1
  res
})

Then we can copy dat to an outcome variable (I don't like overwriting variables) and replace the sample columns. Note that we need to transpose recoded_samples, because apply returns one column per row of dat.

outcome <- dat
outcome[,4:6] <- t(recoded_samples)

> outcome
  REF_REF REF_ALT ALT_ALT sample1 sample2 sample3
1     A/A     A/G     G/G       0       2       3
2     T/T     T/C     C/C       1       2       2
3     C/C     C/G     G/G       0       2       2
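
To see why the factor trick works, here is a minimal sketch on the second row alone. The vector names `genotypes` and `samples` are made up for illustration and are not part of the code above. Levels are numbered in the order they are given, so subtracting 1 yields the 0 to 3 coding. A sample value that matches none of the levels would come out as NA.

```r
# Second row of the example table; variable names are illustrative only
genotypes <- c(REF_REF = "T/T", REF_ALT = "T/C", ALT_ALT = "C/C")
samples   <- c(sample1 = "T/T", sample2 = "T/C", sample3 = "T/C")

# Levels in the order 0/0, REF_REF, REF_ALT, ALT_ALT map to integer codes 1..4
f <- factor(samples, levels = c("0/0", genotypes))

# Subtract 1 so that 0/0 -> 0, REF_REF -> 1, REF_ALT -> 2, ALT_ALT -> 3
codes <- as.integer(f) - 1L
codes
```

For this row the three samples should come out as 1, 2, 2, matching the second row of the output above.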

A naive approach:

text1 <- "REF_REF   REF_ALT   ALT_ALT       sample1       sample2     sample3
A/A         A/G     G/G             0/0          A/G          G/G
T/T          T/C     C/C             T/T          T/C          T/C
C/C          C/G    G/G              0/0          C/G          C/G"

df <- read.table(text = text1, header = TRUE, as.is = TRUE)

for (x in 4:ncol(df)) {
  df[,x][df[,x]=="0/0"] <- 0
  df[,x][df[,x]==df[,1]] <- 1
  df[,x][df[,x]==df[,2]] <- 2
  df[,x][df[,x]==df[,3]] <- 3
}
# convert the recoded sample columns from character to integer
df[, 4:ncol(df)] <- lapply(df[, 4:ncol(df)], as.integer)
df
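
The same comparison can also be done on matrices, which avoids writing the codes into character columns and converting them afterwards. This is only a sketch, assuming the first three columns hold the genotype definitions and the next three hold the samples. The names `ref`, `smp` and `codes` are illustrative. Unlike a factor-based recoding, any sample value that matches none of the three genotypes is left at 0 here rather than becoming NA.

```r
# Same example data as above, repeated so the snippet runs on its own
df <- read.table(text = "REF_REF REF_ALT ALT_ALT sample1 sample2 sample3
A/A A/G G/G 0/0 A/G G/G
T/T T/C C/C T/T T/C T/C
C/C C/G G/G 0/0 C/G C/G", header = TRUE, as.is = TRUE)

ref <- as.matrix(df[, 1:3])  # genotype definitions, one row per variant
smp <- as.matrix(df[, 4:6])  # observed sample genotypes

# Start with 0 everywhere (this covers "0/0"), then fill in 1, 2 and 3
codes <- matrix(0L, nrow(smp), ncol(smp), dimnames = dimnames(smp))
for (k in 1:3) {
  # ref[, k] is recycled down each sample column, so rows stay aligned
  codes[smp == ref[, k]] <- k
}

df[, 4:6] <- codes
df
```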

2 Comments

Won't this convert the numbers to character?
Yes. @Heroka But I think it's okay as characters. And we can change character to integers if needed.
