I'm trying to write a function that takes a dataframe, a main variable, and a list of variables and uses the cor.test function. I'm looking for it to return a dataframe with the variable names and the correlation coefficient and p-value.

The code I have so far is:

myCorTest = function(dat, mainVar, varlist)
{
  result = data.frame()
  mainV = dat[[mainVar]]
  for (i in 1:length(varlist)){
    var_select = dat[[varlist[i]]]
    x = cor.test(mainV, var_select)
    R = x$estimate
    p = x$p.value
    result = cbind(mainVar, varlist, R, p)
}
  return(result)
}

I want the output to look like this:

> myCorTest(chol, "bmi", c("sbp", "dbp", "vldl", "hdl", "ldl"))
     var1 var2           R            p
sbp   bmi  sbp  0.14927952 3.877523e-02
dbp   bmi  dbp  0.42636371 6.997094e-10
vldl  bmi vldl  0.41033688 4.107925e-09
hdl   bmi  hdl -0.11984422 9.956239e-02
ldl   bmi  ldl  0.03449137 6.366170e-01

But my actual output is:

> myCorTest(chol, "bmi", c("sbp","dbp", "vldl", "hdl", "ldl"))
     mainVar varlist R                    p                  
[1,] "bmi"   "sbp"   "0.0344913724648321" "0.636617020943996"
[2,] "bmi"   "dbp"   "0.0344913724648321" "0.636617020943996"
[3,] "bmi"   "vldl"  "0.0344913724648321" "0.636617020943996"
[4,] "bmi"   "hdl"   "0.0344913724648321" "0.636617020943996"
[5,] "bmi"   "ldl"   "0.0344913724648321" "0.636617020943996"

2 Answers

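There are two problems with your code. First, cbind creates a matrix, and a matrix requires all of its values to have the same data type, so your numeric results are coerced to character. Second, result is overwritten on every iteration using the whole varlist, so only the R and p from the last variable survive and get recycled across all rows. What you need is to build a one-row data.frame for each variable and append it with rbind. Try this: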

myCorTest = function(dat, mainVar, varlist)
{
  # Create an empty data.frame with typed columns to collect the results
  result = data.frame(var1 = character(),
                      var2 = character(),
                      R = numeric(),
                      p = numeric())
  mainV = dat[[mainVar]]
  for (i in seq_along(varlist)) {
    var_select = dat[[varlist[i]]]
    x = cor.test(mainV, var_select)
    R = x$estimate
    p = x$p.value
    # One row per tested variable, with the column names set up front
    result_temp = data.frame(var1 = mainVar, var2 = varlist[i], R = R, p = p)
    row.names(result_temp) = varlist[i]
    result = rbind(result, result_temp)
  }
  return(result)
}

myCorTest(chol, "bmi", c("sbp", "dbp", "vldl", "hdl", "ldl"))
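
To see the type coercion in isolation, here is a minimal sketch. It uses the built-in mtcars data as a stand-in, since chol is not available outside the question, and the variable names m and d are only for illustration:

```r
# Illustration only: mtcars stands in for the asker's chol data
x = cor.test(mtcars$mpg, mtcars$cyl)

# cbind on mixed character/numeric inputs returns a character matrix
m = cbind("mpg", "cyl", x$estimate, x$p.value)
is.character(m)   # TRUE: the numbers were coerced to strings

# data.frame keeps each column's own type
d = data.frame(var1 = "mpg", var2 = "cyl", R = x$estimate, p = x$p.value)
is.numeric(d$R)   # TRUE
```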

Growing objects such as data frames inside a loop is inefficient, because every rbind call copies everything collected so far. I would use lapply instead:

myCorTest = function(dat, mainVar, varlist) {
  mainV = dat[[mainVar]]
  # Build one single-row data.frame per variable, then bind them all at once
  result = do.call(rbind, lapply(varlist, function(x) {
    temp = cor.test(mainV, dat[[x]])
    R = temp$estimate
    p = temp$p.value
    data.frame(mainVar = mainVar, varlist = x, R, p)
  }))
  rownames(result) = NULL
  return(result)
}

myCorTest(mtcars, 'mpg', c('cyl', 'am'))

#  mainVar varlist      R        p
#1     mpg     cyl -0.852 6.11e-10
#2     mpg      am  0.600 2.85e-04
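
If you want the exact layout from the question (columns var1, var2, R, p, with the tested variable as the row name), one possible variant collects the numbers with vapply and builds the data frame once at the end. This is only a sketch: the name myCorTest2 is hypothetical, chosen so it does not clash with the function above, and it assumes the entries of varlist are unique so they can serve as row names.

```r
# Sketch: myCorTest2 is a hypothetical name, not part of the answers above
myCorTest2 = function(dat, mainVar, varlist) {
  mainV = dat[[mainVar]]
  # vapply returns a numeric matrix with rows "R" and "p",
  # and one column per entry of varlist
  stats = vapply(varlist, function(v) {
    x = cor.test(mainV, dat[[v]])
    c(R = unname(x$estimate), p = x$p.value)
  }, numeric(2))
  # Build the data frame in one step; assumes varlist has no duplicates
  data.frame(var1 = mainVar, var2 = varlist,
             R = stats["R", ], p = stats["p", ],
             row.names = varlist)
}

res = myCorTest2(mtcars, "mpg", c("cyl", "am"))
res
```

Because vapply checks that every call returns exactly two numbers, a variable that cannot be tested fails loudly instead of silently producing a malformed row.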
