I am trying to extend the answer to the question "R: filtering data and calculating correlation".

With that approach, obtaining the correlation between temperature and humidity for each month of the year (1 = January) means repeating the same call once per month (12 times):

cor(airquality[airquality$Month == 1, c("Temp", "Humidity")])

Is there any way to do each month automatically?

In my case I have more than 30 groups (species rather than months) for which I would like to test correlations, so I want to know whether there is a faster way than doing it one by one.

Thank you!

1 Answer

cor(airquality[airquality$Month == 1, c("Temp", "Humidity")])

gives you a 2 x 2 correlation matrix rather than a single number. I bet you want a single number for each Month, so use

## cor(Temp, Humidity | Month)
with(airquality, mapply(cor, split(Temp, Month), split(Humidity, Month)) )

and you will obtain a vector.

Have a read around ?split and ?mapply; they are very useful for "by group" operations, although they are not the only option. Also read around ?cor, and compare the difference between

a <- rnorm(10)
b <- rnorm(10)
cor(a, b)
cor(cbind(a, b))

The answer you linked in your question is doing something similar to cor(cbind(a, b)).
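To make the difference concrete, here is a small sketch (the seed is arbitrary and only there for reproducibility). It checks that the off-diagonal entry of the 2 x 2 correlation matrix is the same number that cor(a, b) returns directly:

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
a <- rnorm(10)
b <- rnorm(10)

r_scalar <- cor(a, b)            # a single number
r_matrix <- cor(cbind(a, b))     # 2 x 2 correlation matrix with dimnames a, b

# The off-diagonal entry of the matrix equals the scalar correlation
same <- isTRUE(all.equal(r_scalar, r_matrix["a", "b"]))
same
```

So the matrix form carries no extra information for two variables; it just repeats the one correlation alongside a diagonal of ones.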


Reproducible example

The airquality dataset in R does not have a Humidity column, so I will use Wind for testing:

## cor(Temp, Wind | Month)
x <- with(airquality, mapply(cor, split(Temp, Month), split(Wind, Month)) )

#         5          6          7          8          9 
#-0.3732760 -0.1210353 -0.3052355 -0.5076146 -0.5704701 

We get a named vector, where names(x) gives Month, and unname(x) gives correlation.
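For the case with 30+ species, the same idea should carry over by swapping in the grouping column. Below is a minimal sketch of one alternative, assuming the data sit in a data frame with one grouping column; airquality and Month stand in for the real data, and cor_by_group is a hypothetical helper name, not an existing function:

```r
# Hypothetical helper: correlation of two columns within each group
cor_by_group <- function(data, xcol, ycol, group) {
  pieces <- split(data, data[[group]])          # one data frame per group
  sapply(pieces, function(d) cor(d[[xcol]], d[[ycol]]))
}

x2 <- cor_by_group(airquality, "Temp", "Wind", "Month")

# Turn the named vector into a data frame, one row per group
res <- data.frame(group = names(x2), cor = unname(x2))
res
```

This splits the data frame once instead of splitting each column separately, which can be more convenient when several column pairs are needed. If the columns contain missing values, cor() accepts a use argument (for example use = "complete.obs") that can be passed inside the anonymous function.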


Thank you very much! It worked just perfectly! I was trying to figure out how to obtain a vector with the R^2 for each correlation too, but I can't... Any ideas?

cor(x, y) is like fitting a standardised linear regression model:

coef(lm(scale(y) ~ scale(x) - 1))  ## remember to drop intercept

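The R-squared in this simple linear regression is just the square of the slope, that is, the square of the correlation. The vector x from the reproducible example above (not the x in cor(x, y)) already stores the correlation per group, so the per-group R-squared is just x ^ 2.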
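As a quick check of that claim, the sketch below compares the squared per-month correlation with the R-squared reported by an ordinary simple regression fitted within each month. The names r and r2_lm are introduced here for illustration only:

```r
# Per-month correlation, computed as in the reproducible example
r <- with(airquality, mapply(cor, split(Temp, Month), split(Wind, Month)))

# Per-month R-squared from a simple linear regression (intercept included)
r2_lm <- sapply(split(airquality, airquality$Month),
                function(d) summary(lm(Temp ~ Wind, data = d))$r.squared)

# Squared correlation and regression R-squared agree group by group
all.equal(r^2, r2_lm)
```

The comparison holds for a regression with a single predictor; with more predictors, R-squared is no longer the square of one pairwise correlation.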
