How is the correlation matrix using cor calculated?

Question

I implemented my own correlation function in R. Surprisingly, I get slightly different results from the built-in cor function. The differences seem to disappear when n, the number of observations, is large enough.

My function:

corr = function(X) {
  Q = X - colMeans(X)
  S_ = colSums(Q**2)
  S = sqrt(S_ %*% t(S_))
  covarr = t(Q) %*% Q
  corrr_ = covarr / S
  return(corrr_)
}

library(mvtnorm)
set.seed(247)
X = rmvnorm(10, sigma = matrix(c(1,0.8,0.8,1), ncol=2)) # change 10 to 100, 1000, or 10000
corr(X)
cor(X)

For n=10 I get 0.8490966 vs. 0.8465363, so the change is in the 3rd decimal. For n=1000 I get 0.7960206 vs. 0.7960925, so the change is in the 5th decimal.

Look into the C source code of C_cor that cor() calls internally.
– jay.sf, Commented Apr 29, 2023 at 14:49
@jay.sf I've looked; it's complicated code, which is why I'm asking here.
– Maverick Meerkat, Commented Apr 29, 2023 at 16:03

G. Grothendieck · Accepted Answer · 2023-04-29 17:26:29Z

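The first line of the function should be the following. R stores matrices column by column, not row by row, so X - colMeans(X) recycles the vector of means down each column instead of subtracting one mean per column: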

Q = t(t(X) - colMeans(X))

or

Q = X - matrix(colMeans(X), nrow(X), ncol(X), byrow = TRUE)

or

Q = scale(X, TRUE, FALSE)

or even this, which does not produce the same Q but gives the same final answer, because the correlation is unchanged when the columns are rescaled:

Q = scale(X)
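To make the recycling issue concrete, here is a minimal sketch on a toy matrix (the data are made up for illustration and are not from the question). It shows that X - colMeans(X) leaves the columns uncentered, while the transposed version and two of the alternatives above agree with each other. R gives no warning here because the length of X is a multiple of the number of columns.

```r
# Toy matrix (made up for illustration): columns (1,2,3) and (10,20,30)
X = matrix(c(1, 2, 3, 10, 20, 30), nrow = 3)
m = colMeans(X)                 # 2 and 20

# X is stored as 1,2,3,10,20,30, so m is recycled as 2,20,2,20,2,20
wrong = X - m
# After transposing, each column mean lines up with its own column
right = t(t(X) - m)

print(colMeans(wrong))          # -6 and 6: the columns are not centered
print(colMeans(right))          # 0 and 0

# Two of the other centering variants agree with `right`
alt1 = X - matrix(m, nrow(X), ncol(X), byrow = TRUE)
alt2 = scale(X, TRUE, FALSE)    # carries an extra "scaled:center" attribute
print(all.equal(right, alt1))
print(all.equal(right, alt2, check.attributes = FALSE))
```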

Using cov2cor from base R, the function can then be written as:

corr = function(X) {
  Q = scale(X)
  cov2cor(crossprod(Q))
}
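As a quick check, the corrected function can be compared against cor() directly. The sketch below uses base R only; the function names corr_fixed and corr_recycled and the simulated data are assumptions made for illustration. The second column is shifted to mean 5 on purpose: in the question both columns have true mean 0, so the sample means shrink as n grows, which is a plausible reason the discrepancy appeared to vanish for large n.

```r
# Corrected function from the answer, under a hypothetical name
corr_fixed = function(X) {
  Q = scale(X)
  cov2cor(crossprod(Q))
}

# Same formula as the question's corr(), keeping the misaligned centering
corr_recycled = function(X) {
  Q = X - colMeans(X)
  cov2cor(crossprod(Q))
}

# Simulated data (base R only); the second column is shifted to mean 5
set.seed(1)
z = matrix(rnorm(20), ncol = 2)
X = cbind(z[, 1], 5 + 0.8 * z[, 1] + 0.6 * z[, 2])

# The corrected version matches the built-in result
print(all.equal(corr_fixed(X), cor(X), check.attributes = FALSE))

# The misaligned version does not, once the column means differ
print(corr_recycled(X)[1, 2] - cor(X)[1, 2])
```

Because cor() is invariant to shifting a column, any dependence on the shift in the last line comes from the centering step alone.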

edited Apr 29, 2023 at 17:26

answered Apr 29, 2023 at 16:31


2 Comments

Stéphane Laurent Over a year ago

Yep. Or Q = sweep(X, 2, colMeans(X)).

Maverick Meerkat Over a year ago

Thanks. I always mix up R and Python "broadcasting".
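The sweep() suggestion from the comments can be checked the same way. This is a minimal sketch on made-up toy data: with MARGIN = 2 the statistic is applied per column, and the default FUN is "-", so the call subtracts each column's mean from that column. It is the closest R counterpart to what NumPy-style broadcasting would do for X - X.mean(axis=0).

```r
# Toy data (made up for illustration)
X = matrix(c(1, 2, 3, 10, 20, 30), nrow = 3)

# MARGIN = 2 applies the statistic per column; FUN defaults to "-"
Q = sweep(X, 2, colMeans(X))

print(colMeans(Q))   # 0 and 0
print(all.equal(Q, scale(X, TRUE, FALSE), check.attributes = FALSE))
```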
