Online course

Introduction to GLLVM and multivariate GLMM

The central theme of this course is the analysis of multiple correlated response (or dependent) variables using GLMs and GLMMs. Rather than applying multiple univariate GLMs or GLMMs, we will focus on multivariate GLMMs, particularly generalised linear latent variable models (GLLVMs), for the simultaneous analysis of all variables.

During the course, we cover a large number of exercises with examples such as trait variables from turtle hatchlings from multiple clutches, biomass data from fish species sampled at multiple sites, count data from 250 freshwater benthic species sampled at 200 sites, abundances of multiple parasite species on fish, counts of 60 different debris types in water samples, abundances of multiple spider species in traps, multiple morphometric variables sampled from honeybees, and absence/presence of diet variables from faecal samples of brown bears.

In all these examples, we could analyse each variable separately with a univariate GLM(M). Although such analyses are relatively simple, this approach has several drawbacks:

  • Extra Work: Individual analyses are computationally less efficient and require separate validation, interpretation, and reporting.
  • Lack of Multivariate Relationships: Analysing the variables individually neglects the interconnected relationships and interactions between them.
  • No Shared Variation: Univariate models might overlook consistent residual patterns across species, while multivariate models can capture shared variations due to common environmental factors.
  • Multiple Testing: Conducting separate analyses increases the risk of Type I errors, especially when the response variables are highly correlated.
  • Loss of Community-Level Insights: Analysing species separately misses out on a comprehensive, community-level viewpoint and can lead to inconsistent conclusions.
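
The "shared variation" point can be illustrated with a small simulation. This is a minimal sketch in plain Python (the course itself works in R), not course material: the function names, the Gaussian responses, and the loading and noise values are all assumptions chosen for illustration. Two species respond to the same unobserved site-level factor, which induces a correlation between them that separate univariate models would leave in the residuals.

```python
import random


def pearson(a, b):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def simulate(n_sites, loading, noise_sd, seed):
    """Simulate two species that share one latent site-level factor.

    Hypothetical set-up: each response is loading * z + independent noise,
    where z is an unobserved standard-normal value per site.
    """
    rng = random.Random(seed)
    species_1, species_2 = [], []
    for _ in range(n_sites):
        z = rng.gauss(0, 1)  # unobserved factor shared by both species
        species_1.append(loading * z + rng.gauss(0, noise_sd))
        species_2.append(loading * z + rng.gauss(0, noise_sd))
    return species_1, species_2


# With a shared latent factor the two species are clearly correlated.
shared = pearson(*simulate(2000, loading=1.0, noise_sd=0.5, seed=1))

# With loading 0 there is no shared factor and the correlation is near zero.
independent = pearson(*simulate(2000, loading=0.0, noise_sd=0.5, seed=1))

print(f"correlation with shared latent factor: {shared:.2f}")
print(f"correlation without shared factor:     {independent:.2f}")
```

Under these assumed values the theoretical correlation in the shared case is 1 / (1 + 0.5²) = 0.8. A latent variable model estimates such loadings jointly for all species instead of ignoring them.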

 

COURSE CONTENT

  • Preparation material with on-demand video:
    • Exercise on Poisson and negative binomial GLM.
    • DHARMa for model validation.
    • Matrix notation.

Module 1:

  • General introduction.
  • A short theoretical presentation revising linear mixed-effects models.
  • One exercise on linear mixed-effects models.
  • Theory presentation on multivariate GLM and multivariate GLMM.
  • One exercise on multivariate GLM(M).

Module 2:

  • Theory presentation on generalised linear latent variable models (GLLVM) for the analysis of multivariate
    data.
  • Three exercises on GLLVM for the analysis of count data (Poisson/negative binomial).

Module 3:

  • Catching up.
  • Theory presentation on constrained GLLVM (reduced rank regression and concurrent ordination).
  • Three exercises on constrained GLLVM.

Module 4:

  • Three exercises using GLLVM with Tweedie, Gamma, and Bernoulli distributions. The Tweedie distribution
    can be used for continuous data with zeros (e.g., biomass), and the Gamma distribution for continuous data
    without zeros. The Bernoulli distribution can be used for the analysis of absence/presence data.
  • Time allowing: Adding spatial correlation to GLLVMs.
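
The pairing of data types and distributions used across the modules (counts, continuous data with or without zeros, absence/presence) can be summarised as a rough rule of thumb. The helper below is a hypothetical Python sketch, not part of the course or of any package; it deliberately ignores issues such as overdispersion or integer-valued continuous measurements, which need the model validation steps covered in the exercises.

```python
def suggest_family(values):
    """Suggest a candidate distribution for one response variable.

    Hypothetical helper based only on the value range of the data:
      - only 0/1 values                  -> Bernoulli
      - non-negative integers            -> Poisson or negative binomial
      - continuous, non-negative, zeros  -> Tweedie
      - continuous, strictly positive    -> Gamma
    """
    values = list(values)
    if not values:
        raise ValueError("no data supplied")
    if any(v < 0 for v in values):
        raise ValueError("negative values are not covered by these families")
    if all(v in (0, 1) for v in values):
        return "Bernoulli"
    if all(float(v).is_integer() for v in values):
        return "Poisson or negative binomial"
    if any(v == 0 for v in values):
        return "Tweedie"
    return "Gamma"


print(suggest_family([0, 1, 1, 0]))       # absence/presence data
print(suggest_family([0, 3, 12, 1]))      # counts
print(suggest_family([0.0, 2.5, 7.1]))    # biomass with zeros
print(suggest_family([1.2, 0.4, 3.3]))    # continuous, no zeros
```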


PRE-REQUIRED KNOWLEDGE

You need a good understanding of data exploration, multiple linear regression, and Poisson and negative binomial GLMs. Working knowledge of R is required. Familiarity with linear mixed-effects models is not required but is an advantage.