Retrieving matching rows between two data frames using a for loop

Question

I have been trying to do this but not getting anywhere. Any help will be very much appreciated.

df1 <- data.frame(chrom = "chr1", start=c(10,20,30), end = c(100,200,300), stringsAsFactors=FALSE)
df2 <- data.frame(chrom = c("chr1", "chr2", "chr3"),start=c(15,500,150), end = c(75,1000,300), stringsAsFactors=FALSE)

I want to get all rows of df2 where df1$chrom == df2$chrom. Or better yet: I want to generate the output in a new vector and display the rows of df1 followed by df2 or vice versa where df1$chrom == df2$chrom.

I have tried this using a for loop as follows:

for(i in 1:nrow(df2)){
    x[i] <- df2[which(df1$chrom == df2$chrom[i])]
}

Not working!

What are you trying to accomplish with this comparison between data frames? There may be an easier solution to your workflow than the approach you're taking. For example, if you only want a vector out of a data frame, will you need many such vectors? A new data frame? What is the end goal? That context is important to the questions you ask.
– Bryan Goodrich, Commented Apr 9, 2012 at 20:17

Bryan Goodrich · Accepted Answer · 2012-04-11 23:47:22Z


Is this what you want?

df2[df2$chrom == df1$chrom, ]
#   chrom start end
# 1  chr1    15  75
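One caveat, offered as a side note rather than as part of the original answer: `==` compares the two columns position by position, so it matches here only because every `chrom` in df1 is "chr1" and both frames have three rows. The sketch below uses a made-up variant of df1 (`df1_alt`, not in the question) to show where positional comparison and `%in%` diverge.

```r
df2 <- data.frame(chrom = c("chr1", "chr2", "chr3"), start = c(15, 500, 150),
                  end = c(75, 1000, 300), stringsAsFactors = FALSE)

# Hypothetical variant of df1 (not in the question): mixed chromosomes
df1_alt <- data.frame(chrom = c("chr2", "chr1", "chr1"), start = c(10, 20, 30),
                      end = c(100, 200, 300), stringsAsFactors = FALSE)

# Positional comparison: row i of df2 is compared only with row i of df1_alt
by_position <- df2[df2$chrom == df1_alt$chrom, ]

# Set membership: each df2$chrom is checked against all of df1_alt$chrom
by_membership <- df2[df2$chrom %in% df1_alt$chrom, ]

nrow(by_position)     # 0, although chr1 and chr2 occur in both frames
by_membership$chrom   # "chr1" "chr2"
```

If the goal is "rows of df2 whose chrom appears anywhere in df1", `%in%` expresses that directly and does not depend on row order or equal lengths.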

Per your comment, you might also want to try the following.

merge(df1, df2, by = 'chrom')

This will do a database "join" on the two frames ("tables"). The result is this.

  chrom start.x end.x start.y end.y
1  chr1      10   100      15    75
2  chr1      20   200      15    75
3  chr1      30   300      15    75

It isn't always an efficient approach to take in R, but it is convenient. You can control the ".x" and ".y" suffixes with the suffixes argument (see the help page: ?merge). If you want the rows of df2 that have no match in df1 included as well, add all.y = TRUE (or all = TRUE to keep unmatched rows from both frames).
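A minimal sketch of those merge options, assuming the df1 and df2 from the question. The suffix strings and the variable names `joined` and `with_unmatched` are chosen only for illustration; `suffixes` and `all.y` are documented arguments of `merge`.

```r
df1 <- data.frame(chrom = "chr1", start = c(10, 20, 30), end = c(100, 200, 300),
                  stringsAsFactors = FALSE)
df2 <- data.frame(chrom = c("chr1", "chr2", "chr3"), start = c(15, 500, 150),
                  end = c(75, 1000, 300), stringsAsFactors = FALSE)

# suffixes replaces the default ".x"/".y" on column names shared by both frames
joined <- merge(df1, df2, by = "chrom", suffixes = c(".df1", ".df2"))
names(joined)
# "chrom" "start.df1" "end.df1" "start.df2" "end.df2"

# all.y = TRUE also keeps df2 rows with no match in df1 (NA in the df1 columns)
with_unmatched <- merge(df1, df2, by = "chrom", all.y = TRUE)
nrow(with_unmatched)   # 5: three chr1 matches plus chr2 and chr3
```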

As I alluded to before, it is best to consider the overall approach. A merge isn't necessarily an efficient way to process your data, because it introduces a lot of redundancy into the resulting frame. In database terms, we can instead think of df2 as a "lookup" table: the "chr1" in df1 acts as a foreign key that references information in df2, which is associated with df1 but distinct from it. Rather than repeating the information of df2 on every row, as the merge above does, we can simply look it up when required. The merge is the convenient option when you do want everything side by side in one frame.
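One way to sketch the lookup idea without building the merged frame is `match()`, which returns the position of each df1 key in df2. This assumes df2 has a single row per chrom, since `match()` only reports the first hit; `idx` is just an illustrative name.

```r
df1 <- data.frame(chrom = "chr1", start = c(10, 20, 30), end = c(100, 200, 300),
                  stringsAsFactors = FALSE)
df2 <- data.frame(chrom = c("chr1", "chr2", "chr3"), start = c(15, 500, 150),
                  end = c(75, 1000, 300), stringsAsFactors = FALSE)

# For each df1 row, the row number in df2 holding the same chrom
idx <- match(df1$chrom, df2$chrom)
idx            # 1 1 1

# Pull a single df2 field on demand instead of storing the whole join
df2$end[idx]   # 75 75 75
```

Rows of df1 with no counterpart in df2 would get `NA` in `idx`, so the looked-up values for them would be `NA` as well.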

edited Apr 11, 2012 at 23:47

answered Apr 9, 2012 at 20:16

Bryan Goodrich


7 Comments

user1079898 Over a year ago

Yes, that is exactly the format that I want. It would be nice to have the matching rows of both data frames side by side in a new data frame with 6 columns. Actually, my ultimate goal is much more complicated: the comparison will have to satisfy many conditions between the two data frames. The above condition is just one of them.

user1079898 Over a year ago

The statement you sent works very well. Thanks. I am having a hard time wrapping my head around it...but it works! Thank you so much

Tyler Rinker Over a year ago

It's hard to get at first (particularly if you're used to another language that uses loops), but once you get it, it's pretty straightforward. If you have multiple conditions, remember that %in% and the logical operators & and | are great tools for indexing, which is the method Bryan showed (rather than an explicit loop).
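As a rough illustration of combining conditions this way, using the df1 and df2 from the question. The specific conditions are invented for the example and are not taken from the question.

```r
df1 <- data.frame(chrom = "chr1", start = c(10, 20, 30), end = c(100, 200, 300),
                  stringsAsFactors = FALSE)
df2 <- data.frame(chrom = c("chr1", "chr2", "chr3"), start = c(15, 500, 150),
                  end = c(75, 1000, 300), stringsAsFactors = FALSE)

# AND: chrom present in df1, and the interval lies within df1's overall range
keep_and <- df2$chrom %in% df1$chrom &
  df2$start >= min(df1$start) &
  df2$end <= max(df1$end)
df2[keep_and, ]        # the chr1 row only

# OR: chrom present in df1, or the interval ends at 1000 or later
keep_or <- df2$chrom %in% df1$chrom | df2$end >= 1000
df2[keep_or, "chrom"]  # "chr1" "chr2"
```

Each condition yields one TRUE/FALSE per row of df2, so they can be combined freely before indexing.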

thelatemail Over a year ago

Would df2[df2$chrom %in% df1$chrom, ] be more robust in this circumstance?

user1079898 Over a year ago

Thanks a bunch, Tyler. It's kinda complicated to explain what I am trying to do, and it's harder because I am very new to R. I will work on it a bit more, and hopefully my next post will make more sense. Anyway, thanks for taking the time.
