Notes

Author

Steffi LaZerte

Published

June 6, 2025

GAM reference - https://noamross.github.io/gams-in-r-course/

GAM plotting using mgcv functions:

library(readr)
library(mgcv)
library(dplyr)
v <- read_csv("Data/Datasets/vultures_clean_2023.csv")
g <- gam(count ~ s(doy, k = 10), 
         data = filter(v, year == 2000) |> mutate(l = 3), 
         method = "REML", family = "nb")
summary(g)


Family: Negative Binomial(1.129) 
Link function: log 

Formula:
count ~ s(doy, k = 10)

Parametric coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   2.3393     0.1207   19.38   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
         edf Ref.df Chi.sq p-value    
s(doy) 5.475  6.616  175.5  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.391   Deviance explained = 71.3%
-REML =  271.2  Scale est. = 1         n = 76

plot(g, shade = TRUE, trans = exp, residuals = TRUE, pch = 20, 
     shift = coef(g)[1])

Model assessment

Looks for significant patterns in residuals (check that we have given high enough basis k to estimate ‘wiggliness’)

p0 <- par(mfrow = c(2,2))
gam.check(g, pch = 19, cex = 0.5)


Method: REML   Optimizer: outer newton
full convergence after 4 iterations.
Gradient range [8.674563e-10,1.084888e-07]
(score 271.1993 & scale 1).
Hessian positive definite, eigenvalue range [2.264941,28.62155].
Model rank =  10 / 10 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

         k'  edf k-index p-value
s(doy) 9.00 5.48    0.96    0.61

par(p0)

Compare to DHARMa

s <- DHARMa::simulateResiduals(g, plot = TRUE)

Registered S3 method overwritten by 'GGally':
  method from   
  +.gg   ggplot2

Registered S3 method overwritten by 'mgcViz':
  method from  
  +.gg   GGally

We want

Full convergence
K > edf
High p-value