
I am trying to fit a linear mixed model for my study in R, and I would like to know whether my code is correct. My design: I have 5 sites, 2 subsites within each site, and 2 permanent quadrats within each subsite, so 5 sites, 10 subsites and 20 quadrats in total. I have measured colony size (of corals) in all the quadrats. My question is: does the size structure vary between sites? In my data, quadrats are nested within subsites and subsites are nested within sites. I will use site as my fixed factor and subsite and quadrat as my random effects. I can think of two possible ways of doing this:

library(lme4)

option 1: lmer(size ~ site + (1|subsite) + (1|quadrat))

option 2: lmer(size ~ site + (1|site:subsite) + (1|subsite:quadrat))

Which one of these would be correct to use?

Thanks

  • Option 1 is correct. Option 2 would estimate the variance in the intercept associated with the interaction between sites and subsite (first term) and the variance associated with the interaction between subsite and quadrat. Commented Nov 3, 2016 at 15:26

1 Answer

It depends a bit on how your subsites and quadrats are coded. Let's consider two schemes.

explicit nesting: this means that the subsites within sites and quadrats within subsites don't have unique names, e.g.

site subsite quadrat
A    a       1
A    a       2
A    b       1
A    b       2
B    a       1
B    a       2
... etc.

In this case, you must use interaction/nesting syntax to let R know that quadrat 1 in site A, subsite a has nothing in common with all of the other quadrats labeled "1" ...

size ~ site + (1|site:subsite) + (1|site:subsite:quadrat)

(The shorthand size ~ site + (1|site:(subsite/quadrat)) might work, but I haven't tested it.)
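To see why the interaction syntax matters with reused labels, here is a small base-R sketch. The data frame `d` is hypothetical (it just reproduces the labelling scheme above for 5 sites); it counts how many groups R sees with and without the interaction.

```r
# Hypothetical layout: 5 sites x 2 subsites x 2 quadrats, labels reused across sites
d <- expand.grid(quadrat = 1:2, subsite = c("a", "b"), site = LETTERS[1:5])
d[] <- lapply(d, factor)  # make sure every column is a factor

# Used on their own, the reused labels collapse into far too few groups
nlevels(d$subsite)  # 2
nlevels(d$quadrat)  # 2

# Crossing with the higher levels recovers the real grouping units
nlevels(interaction(d$site, d$subsite, drop = TRUE))             # 10
nlevels(interaction(d$site, d$subsite, d$quadrat, drop = TRUE))  # 20
```

With this coding, (1|subsite) would fit only 2 random-effect levels instead of 10, which is why the site:subsite form is needed.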

implicit nesting: in this case, everything is uniquely named.

site subsite quadrat
A    Aa      Aa1
A    Aa      Aa2
A    Ab      Ab1
A    Ab      Ab2
B    Ba      Ba1
B    Ba      Ba2
... etc.

In this case, you can use either the syntax above (R automatically drops the redundant levels) or

size ~ site + (1|subsite) + (1|quadrat)

and you should get identical results. (You can always test this experimentally!)
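A minimal sketch of such an experimental check, assuming simulated data: the data frame `d`, the effect sizes, and the 10 colonies per quadrat are all made up for illustration and are not from the question.

```r
library(lme4)

set.seed(1)
# Hypothetical balanced design: 5 sites x 2 subsites x 2 quadrats x 10 colonies
d <- expand.grid(colony = 1:10, q = 1:2, ss = c("a", "b"), site = LETTERS[1:5])
d$subsite <- factor(paste0(d$site, d$ss))    # unique labels: Aa, Ab, Ba, ...
d$quadrat <- factor(paste0(d$subsite, d$q))  # unique labels: Aa1, Aa2, Ab1, ...

# Simulated response with subsite- and quadrat-level variation (arbitrary values)
ss_eff <- rnorm(nlevels(d$subsite), sd = 0.5)
q_eff  <- rnorm(nlevels(d$quadrat), sd = 0.3)
d$size <- 2 + ss_eff[as.integer(d$subsite)] + q_eff[as.integer(d$quadrat)] +
  rnorm(nrow(d), sd = 1)

# Fit the same model with both syntaxes
m_nested <- lmer(size ~ site + (1 | site:subsite) + (1 | site:subsite:quadrat),
                 data = d)
m_unique <- lmer(size ~ site + (1 | subsite) + (1 | quadrat), data = d)

# Compare log-likelihoods and fixed effects
all.equal(as.numeric(logLik(m_nested)), as.numeric(logLik(m_unique)), tolerance = 1e-6)
all.equal(fixef(m_nested), fixef(m_unique), tolerance = 1e-6)
```

With only 10 subsites and 20 quadrats, lmer may report a singular fit on simulated data like this; that does not affect the comparison between the two syntaxes.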

A couple of other points:

  • in general I recommend unique labels/implicit nesting (explicit nesting may be more convenient for humans gathering data on field notes, but you should convert to implicit nesting early in your data cleaning process), because it slightly reduces the chances of error
  • I always recommend using the data argument with lme4
  • if you don't care about quantifying within-site variation, and if your design is balanced, and your data are Normal (i.e. you're using lmer and not glmer) you can greatly simplify your life by simply aggregating to the mean values per site and running a 1-way ANOVA (see Murtaugh 2007, Ecology, "Simplicity and complexity in ecological data analysis").
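The first and last points can be sketched in base R. Everything here is hypothetical: the data frame `raw`, the `_id` column names, and the simulated sizes. The sketch averages to one value per subsite rather than per site, so that each site keeps two replicates for the ANOVA; that choice of aggregation level is an interpretation of the advice above, not something it spells out.

```r
set.seed(42)
# Hypothetical raw data with reused labels, as they might come from field notes
raw <- expand.grid(colony = 1:10, quadrat = 1:2,
                   subsite = c("a", "b"), site = LETTERS[1:5])
raw$size <- rlnorm(nrow(raw), meanlog = 1, sdlog = 0.5)  # made-up colony sizes

# Convert to unique labels early in data cleaning
raw$subsite_id <- factor(paste0(raw$site, raw$subsite))        # Aa, Ab, Ba, ...
raw$quadrat_id <- factor(paste0(raw$subsite_id, raw$quadrat))  # Aa1, Aa2, ...

# Aggregate to subsite means (assumed unit of replication within each site)
sub_means <- aggregate(size ~ site + subsite_id, data = raw, FUN = mean)

# One-way ANOVA on the aggregated values
fit <- aov(size ~ site, data = sub_means)
summary(fit)
```

This shortcut relies on the balanced design mentioned above; with unequal numbers of quadrats or colonies the simple means would no longer be equivalent to the mixed model.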

1 Comment

Thank you, this really helped me. My quadrats and subsites are uniquely named, so I can use the first option. Actually, my data are not normal; I have log10-transformed them. I will check the residual pattern and will move to glmer with a Gamma family if lmer doesn't work.
