I would like to write a function that uses a for loop to create multiple datasets. These datasets should be combined into a single dataset, which will be the output of my function.

I wrote the following code. It works when the for loop is outside a function, but not when the loop is inside one. The problem is that my function only returns the dataset from the first iteration (i = 1).

library(broom)
library(dplyr)

# My function
validation <- function(x, y) {
  df <- NULL
  for (i in 1:ncol(x)) {
    coln <- colnames(x)[i]
    covariate <- as.vector(x[, i])
    models <- tidy(glm(y ~ covariate, data = x, family = binomial))
    df <- rbind(df, cbind(models, coln)) %>% filter(term != "(Intercept)")
    return(df)
  }
}

# Test function
validation(mtcars, mtcars$am)

term         estimate  std.error  statistic        p.value coln
covariate   0.3070282   0.1148416   2.673493    0.007506579 mpg

This function should give me the following output:

  term     estimate    std.error     statistic     p.value coln
1  covariate  0.307028190 1.148416e-01  2.6734932353 0.007506579  mpg
2  covariate -0.691175096 2.536145e-01 -2.7252982408 0.006424343  cyl
3  covariate -0.014604292 5.167837e-03 -2.8259972293 0.004713367 disp
4  covariate -0.008117121 6.074337e-03 -1.3362973916 0.181452089   hp
5  covariate  5.577358500 2.062575e+00  2.7040753425 0.006849476 drat
6  covariate -4.023969940 1.436416e+00 -2.8013963535 0.005088198   wt
7  covariate -0.288189820 2.278968e-01 -1.2645629995 0.206028024 qsec
8  covariate  0.693147181 7.319250e-01  0.9470194188 0.343628884   vs
9  covariate 51.132135568 7.774641e+04  0.0006576784 0.999475249   am
10 covariate 21.006490452 3.876257e+03  0.0054192724 0.995676067 gear
11 covariate  0.073173343 2.254018e-01  0.3246350695 0.745457282 carb

I guess your return(df) should be outside the loop. Commented Apr 26, 2020 at 23:42
@akrun it works! Commented Apr 26, 2020 at 23:43

1 Answer

Move return(df) out of the for loop so that it runs only after the loop has finished. With return(df) inside the loop, the function exits at the end of the first iteration, so 'df' holds only the result for the first column.
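A stripped-down illustration of the same effect, using only base R. The functions first_only() and all_values() are hypothetical names made up for this sketch and are not part of the original code:

```r
# Hypothetical example: return() inside the loop ends the function
# during the first iteration, so only one value is ever collected.
first_only <- function(n) {
  out <- NULL
  for (i in seq_len(n)) {
    out <- c(out, i^2)
    return(out)
  }
}

# Same loop, but the result is returned after the loop has finished.
all_values <- function(n) {
  out <- NULL
  for (i in seq_len(n)) {
    out <- c(out, i^2)
  }
  out
}

first_only(3)  # 1
all_values(3)  # 1 4 9
```

The same reasoning applies to validation(): the accumulating 'df' is only complete once every column has been processed.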

validation <- function(x, y) {
  df <- NULL
  for (i in 1:ncol(x)) {
    coln <- colnames(x)[i]
    covariate <- as.vector(x[, i])
    models <- tidy(glm(y ~ covariate, data = x, family = binomial))
    df <- rbind(df, cbind(models, coln)) %>% filter(term != "(Intercept)")
    # print statements to show what 'df' looks like in each iteration
    print(sprintf("column index : %d", i))
    print('-----------------')
    print('df in each loop')
    print(df)
    print(sprintf("%dth loop ends", i))
  }
  # return the accumulated result only after the loop has finished
  df
}

Checking:

validation(mtcars, mtcars$am)
#       term     estimate    std.error     statistic     p.value coln
#1  covariate  0.307028190 1.148416e-01  2.6734932353 0.007506579  mpg
#2  covariate -0.691175096 2.536145e-01 -2.7252982408 0.006424343  cyl
#3  covariate -0.014604292 5.167837e-03 -2.8259972293 0.004713367 disp
#4  covariate -0.008117121 6.074337e-03 -1.3362973916 0.181452089   hp
#5  covariate  5.577358500 2.062575e+00  2.7040753425 0.006849476 drat
#6  covariate -4.023969940 1.436416e+00 -2.8013963535 0.005088198   wt
#7  covariate -0.288189820 2.278968e-01 -1.2645629995 0.206028024 qsec
#8  covariate  0.693147181 7.319250e-01  0.9470194188 0.343628884   vs
#9  covariate 51.132135568 7.774641e+04  0.0006576784 0.999475249   am
#10 covariate 21.006490452 3.876257e+03  0.0054192724 0.995676067 gear
#11 covariate  0.073173343 2.254018e-01  0.3246350695 0.745457282 carb
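
As a possible alternative to growing 'df' with rbind() inside the loop, the per-column results can be collected in a list and bound once at the end. This is only a sketch and not the approach used above: validation_list() and col_name are hypothetical names, and it assumes that tidy() returns a tibble, as it does in recent broom versions, so the result is a tibble rather than a plain data.frame.

```r
library(broom)
library(dplyr)

# Hypothetical variant of validation(): one tidy row per column,
# all rows bound together once after every model has been fitted.
validation_list <- function(x, y) {
  rows <- lapply(colnames(x), function(col_name) {
    covariate <- x[[col_name]]
    fit <- glm(y ~ covariate, family = binomial)
    tidy(fit) %>%
      filter(term != "(Intercept)") %>%
      mutate(coln = col_name)
  })
  bind_rows(rows)
}

# glm() may warn for columns that separate the outcome perfectly
# (for example 'am' against itself), hence suppressWarnings() here.
res <- suppressWarnings(validation_list(mtcars, mtcars$am))
res
```

Because the function returns the value of its last expression, bind_rows(rows), there is no return() call that could end up inside the loop by mistake.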