Correlation between two corresponding columns from separate datasets

Question

I have two data sets whose columns have the same names but different values, e.g.:

m1 <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE,
             dimnames = list(c("s1", "s2", "s3"),c("cow", "dog","cat")))
m2 <- matrix(1:9, nrow = 3, ncol = 3, byrow = FALSE,
             dimnames = list(c("s1", "s2", "s3"),c("dog", "cow","cat")))
> m1
   cow dog cat
s1   1   2   3
s2   4   5   6
s3   7   8   9
> m2
   dog cow cat
s1   1   4   7
s2   2   5   8
s3   3   6   9

I would like to write a function that uses cor.test() to calculate the correlation between corresponding columns, e.g. cow vs. cow and dog vs. dog. I'm using cor.test() because I want both the correlation coefficient and the p-value, so I'm open to other ways of getting that information. The actual data set has thousands of columns in random order, so I'm looking for a way to match the columns first and then calculate the correlation. Any ideas?

jlesuffleur · Accepted Answer · 2016-10-05 12:48:08Z

Here is a solution, using lapply on common columns:

# Common columns
cols <- intersect(colnames(m1), colnames(m2))

# For each column, compute cor test
res <- lapply(cols, function(x) cor.test(
  m1[, x],
  m2[, x]
))

names(res) <- cols

The result is a named list of htest objects, one per common column, which you can access by column name, e.g. res[["cow"]]
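Since the question asks for the correlation coefficient and the p-value, here is a minimal sketch of one way to pull both out of each htest object into a single data frame. The random matrices and the name summary_df are made up for illustration (random values are used so that the correlations are not all exactly 1, as they would be with the 3-row example). It also assumes the rows of both data sets are in the same order, because only the columns are matched by name.

```r
# Illustrative data only: same column names as in the question,
# in a different order, filled with random values.
set.seed(1)
m1 <- matrix(rnorm(30), nrow = 10,
             dimnames = list(NULL, c("cow", "dog", "cat")))
m2 <- matrix(rnorm(30), nrow = 10,
             dimnames = list(NULL, c("dog", "cow", "cat")))

# Same approach as above: one cor.test per common column
cols <- intersect(colnames(m1), colnames(m2))
res <- lapply(cols, function(x) cor.test(m1[, x], m2[, x]))
names(res) <- cols

# Extract the estimate and the p-value from each htest object
summary_df <- data.frame(
  column   = cols,
  estimate = vapply(res, function(h) unname(h$estimate), numeric(1),
                    USE.NAMES = FALSE),
  p_value  = vapply(res, function(h) h$p.value, numeric(1),
                    USE.NAMES = FALSE)
)
summary_df
```

With thousands of columns, a flat table like this is usually easier to sort and filter than the list of htest objects.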

edited Oct 5, 2016 at 12:48

answered Oct 5, 2016 at 12:44
