Notes

Author

Steffi LaZerte

Published

September 23, 2024

GAM reference - https://noamross.github.io/gams-in-r-course/

GAM plotting using mgcv functions:

library(readr)
library(mgcv)
library(dplyr)
v <- read_csv("Data/Datasets/vultures_clean_2023.csv")
g <- gam(count ~ s(doy, k = 10), 
         data = filter(v, year == 2000) |> mutate(l = 3), 
         method = "REML", family = "nb")
summary(g)

Family: Negative Binomial(1.117) 
Link function: log 

Formula:
count ~ s(doy, k = 10)

Parametric coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)    2.338      0.122   19.16   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
         edf Ref.df Chi.sq p-value    
s(doy) 5.482  6.627  170.7  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.407   Deviance explained = 71.2%
-REML = 267.73  Scale est. = 1         n = 75
plot(g, shade = TRUE, trans = exp, residuals = TRUE, pch = 20, 
     shift = coef(g)[1])

Model assessment

  • Looks for significant patterns in residuals (check that we have given high enough basis k to estimate ‘wiggliness’)
p0 <- par(mfrow = c(2,2))
gam.check(g, pch = 19, cex = 0.5)


Method: REML   Optimizer: outer newton
full convergence after 4 iterations.
Gradient range [9.247488e-07,1.632843e-06]
(score 267.7298 & scale 1).
Hessian positive definite, eigenvalue range [2.251584,28.18106].
Model rank =  10 / 10 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

         k'  edf k-index p-value
s(doy) 9.00 5.48    1.01    0.78
par(p0)
  • Compare to DHARMa
s <- DHARMa::simulateResiduals(g, plot = TRUE)
Registered S3 method overwritten by 'GGally':
  method from   
  +.gg   ggplot2
Registered S3 method overwritten by 'mgcViz':
  method from  
  +.gg   GGally

We want

  • Full convergence
  • K > edf
  • High p-value
Back to top