www.highstat.com
Highland Statistics Ltd

On demand: Intro to R using a protocol for regression-type models

Please use the discussion board below to ask questions relevant to the course.

Comments (10)


Hi there,

I was just wondering how you would go about dealing with heterogeneity in your data? I have uploaded a picture of my residuals vs. fitted values plot for my model which I think is showing heterogeneity.

Thanks,

Zoe

Zoe Melvin

Zoe,

By using generalised least squares (GLS), which allows the residual variance to differ between observations. See Chapter 4 in our mixed-effects modeling book (with the penguin cover).

Kind regards,

Alain

Alain Zuur
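
As a rough illustration of the idea behind GLS (this is not the course's R code and is not taken from the book): when the residual variance is assumed known up to a constant and the observations are independent, GLS reduces to weighted least squares. The sketch below is plain Python. The data, the function name `weighted_least_squares`, and the assumption that the variance grows with the square of the covariate are all hypothetical.

```python
def weighted_least_squares(x, y, w):
    """Fit y = a + b*x by minimising sum(w_i * (y_i - a - b*x_i)**2)."""
    sw = sum(w)
    # Weighted means of covariate and response
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted cross-products and sums of squares
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


# Hypothetical data in which the spread of y increases with x
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.5, 7.2, 11.0, 10.5]

# Assumed variance structure: var(residual_i) proportional to x_i**2,
# so each observation is weighted by 1 / x_i**2
weights = [1.0 / xi ** 2 for xi in x]

gls_fit = weighted_least_squares(x, y, weights)
ols_fit = weighted_least_squares(x, y, [1.0] * len(x))  # equal weights = ordinary least squares
print("weighted fit (intercept, slope):", gls_fit)
print("unweighted fit (intercept, slope):", ols_fit)
```

In practice the variance structure is usually estimated from the data rather than fixed in advance, so treat the fixed weights here as a simplification.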

Hi there,

I was just wondering what your opinion is on identifying overly influential data points in a model, using e.g. Cook's distance, as part of the model validation? And should we remove such influential data points?

Best,

Zoe

Zoe Melvin

Zoe, if you have done a good data exploration then most likely you will not have influential points. But yes, Cook's distance and friends can sometimes be useful tools. As to removing influential points, that depends on why they are influential. They may be influential because of extreme covariate values, or because of extreme response variable values, or because you are fitting a model with a 3-way interaction and there are only 3 observations for a specific combination.

So the answer to your question is very much an 'it depends' answer. If the majority of your covariate values are between 1 and 2, and there is one value equal to 100 with a high Cook's distance, then remove it. But if model complexity causes it, then simplify the model.


The topic is discussed in more detail in our 'Data exploration, regression, GLM and GAM' course.

Kind regards,
Alain

Alain Zuur
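
To make the 'one covariate value equal to 100' scenario concrete, here is a minimal sketch in plain Python (not the course's R code) that computes Cook's distance for a simple linear regression with one covariate. The data and the function name `cooks_distance` are hypothetical. The formula used is the standard one, D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2, with p = 2 parameters.

```python
def cooks_distance(x, y):
    """Cook's distance for each observation in the regression y = a + b*x."""
    n = len(x)
    p = 2  # number of regression parameters: intercept and slope
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    # Ordinary residuals and the residual variance estimate
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in resid) / (n - p)
    # Leverage of each observation in simple linear regression
    lev = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]
    return [e ** 2 / (p * s2) * h / (1.0 - h) ** 2 for e, h in zip(resid, lev)]


# Hypothetical data: covariate values between 1 and 2, plus one value of 100
x = [1.0, 1.2, 1.4, 1.5, 1.7, 1.9, 2.0, 100.0]
y = [2.0, 2.3, 2.9, 3.1, 3.3, 3.9, 4.1, 50.0]

for xi, d in zip(x, cooks_distance(x, y)):
    print(f"x = {xi:6.1f}  Cook's distance = {d:.4g}")
```

With these made-up numbers the observation at x = 100 dominates because of its leverage, which is the 'extreme covariate value' case described above. The sketch says nothing about the other causes, such as sparse combinations in an interaction.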

Hi there,

When using the variance inflation factor (VIF) method for excluding collinear covariates, is it still true that we cannot determine which of the covariates is driving the pattern in the response variable?

Best wishes,

Zoe

Zoe Melvin

That is indeed the case.

Alain

Alain Zuur
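
One way to see why VIFs cannot identify the driving covariate: with exactly two covariates, the VIF of each is 1 / (1 - r^2), where r is their Pearson correlation, so both covariates get the same value and the response variable plays no role in the calculation. The plain Python sketch below (not the course's R code) shows this. The data and the function names `pearson_r` and `vif_two_covariates` are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    abar = sum(a) / n
    bbar = sum(b) / n
    sab = sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b))
    saa = sum((ai - abar) ** 2 for ai in a)
    sbb = sum((bi - bbar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_covariates(x1, x2):
    """VIF shared by both covariates when the model has exactly two."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical covariates; note that no response variable is needed
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

print("VIF for x1:", vif_two_covariates(x1, x2))
print("VIF for x2:", vif_two_covariates(x2, x1))
```

Dropping either covariate resolves the collinearity equally well as far as the VIF is concerned, so the choice of which one to keep has to come from subject knowledge.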

Hi there,

In exercise 5, you mention that you cannot analyse two variables separately if they could be related. So your example was measuring two diseases in a fish; they can't be treated as separate response variables because the fish is more likely to have the second disease if it has the first one.

In this case, how would you analyse that data?

Best,
Zoe

Zoe Melvin

Dear Zoe,

With a multivariate GLM, one that allows for correlation between the two response variables via a correlated random effect.

You can start here:
Miran A Jaffa, Mulugeta Gebregziabher and Ayad A Jaffa (2015). Analysis of multivariate longitudinal kidney function outcomes using generalized linear mixed models. J Transl Med 13:192. DOI 10.1186/s12967-015-0557-2


Alain

Alain Zuur

Hello, I have a question about using site-level averages in a model. My data has 21 locations, with 8 replicates at each location, repeated over multiple years (response is fish growth rate, covariates include depth, coral cover and reef structural complexity). When discussing the model with supervisors it was suggested that location-level means are used, for both response and predictor variables. This would reduce the data from 168 rows each year to 21, and also lose info/variation at the replicate level.

My instinct is not to do this, and to include data at the replicate level as the model could provide location-level estimates anyway. I wondered if both methods could be considered acceptable, and what pros/cons there might be to using the less detailed means (e.g. lower sample size)?

Mark Hamilton

Mark,
This is not really a question that is linked to the course content (the Discussion board was meant for 'course-related' questions). But given that the answer only takes 10 seconds: what you were told is very poor statistical advice. No, you should not take site averages. Instead, you should apply mixed-effects models. If you do not, you risk having your paper rejected.

Alain

This sounds a little bit commercial, but in the frequentist GLMM course that we will run in February, we have a nearly identical example.

Alain Zuur