2

How can I loop over rows of a data frame containing variable names permutations when running a mediation analysis with lavaan?

Say I have 4 variables var1, var2, var3, var4:

df<- data.frame(var1 = rnorm(100), 
                var2 = rnorm(100), 
                var3 = rnorm(100),
                var4 = rnorm(100))

Using gtools::permutations() I save all possible permutations of the 4 variables in sets of 3:

permut <- 
  gtools::permutations(n = 4, r = 3, v = names(df), repeats.allowed = FALSE)

colnames(permut) <- c("Y", "X", "M")

> head(permut)
     Y      X      M     
[1,] "var1" "var2" "var3"
[2,] "var1" "var2" "var4"
[3,] "var1" "var3" "var2"
[4,] "var1" "var3" "var4"
[5,] "var1" "var4" "var2"
[6,] "var1" "var4" "var3"

Then I set the medation model using lavaan syntax, where I'm interested in the mediating effect of M over the relationship between X and Y:

mod <- "
    M ~ a * X
    Y ~ c * X + b * M
    ind := a*b
    tot := c + (a*b)
    "

I want to run the model and store it's results for future inspection:

library(lavaan)
library(dplyr)

#fit the model
fit <- sem(mod, df, se = "robust")

#save results
result <-
parameterestimates(fit) %>% filter(op != "~~")

My question here is this:

How to instruct R to use as Y,X,M variable names from each row of permut, fit the model using data from df and model syntax from mod and eventually store results for every model fit?

The code above is the simplest possible scenario that I'd like to use for running more complex models the same way.

I'm aware of answers about looping linear models of different variables: loop over all possible combinations, Looping over combinations of regression model terms, Linear Regression loop for each independent variable individually against dependent, and possibly the closest one: How to use reference variables by character string in a formula?, but still I'm stuck and coudn't solve this over the weekend.

1 Answer 1

2

Here's one way of doing it:

fits <- apply(permut, 1, function (p) {
    permuted.df <- df[p]
    colnames(permuted.df) <- names(p)
    sem(mod, permuted.df, se="robust")
})

fits contains the SEM results for every 3-permutation in permut. To see the estimates of, e.g., the first fit, you can proceed as usual:

> parameterestimates(fits[[1]]) %>% filter(op != "~~")
  lhs op     rhs label         est         se          z     pvalue    ci.lower
1   M  ~       X     a -0.18393765 0.10977670 -1.6755618 0.09382406 -0.39909603
2   Y  ~       X     c  0.07314372 0.09891034  0.7394952 0.45960637 -0.12071699
3   Y  ~       M     b  0.01944518 0.08852450  0.2196587 0.82613697 -0.15405965
4 ind :=     a*b   ind -0.00357670 0.01600038 -0.2235385 0.82311644 -0.03493686
5 tot := c+(a*b)   tot  0.06956702 0.09816192  0.7086966 0.47851276 -0.12282680
    ci.upper
1 0.03122074
2 0.26700443
3 0.19295001
4 0.02778346
5 0.26196084
Sign up to request clarification or add additional context in comments.

2 Comments

Great, thank you! Your post answers my question and I will accept it. I didn't think of it before but it looks like I will need some way to know what was actually picked for Y,X,M at a given run. Any chance you tell me, how to do that?
Well, fit[[n]] was performed with permut[n,] as variables. Alternatively, you can replace sem(mod, permuted.df, se="robust") by something like list(fit=sem(mod, permuted.df, se="robust"), vars=p), and then do fits[[5]]$fit to access the fit and fits[[5]]$vars to inspect the variables.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.