World Journal of Environmental Biosciences
2026 Volume 15 Issue 2

Statistical Assessment of Climate Change Effects on Biodiversity Using Generalized Linear Models


  1. Facultad de Ingeniería Estadística e Informática, Universidad Nacional del Altiplano de Puno, Puno-Perú.
  2. Departamento Académico de Educación y Humanidades de la Universidad Nacional José María Arguedas, Andahuaylas, Perú.
Abstract

Climate change is a major driver of biodiversity loss, altering species distributions, population abundances, phenology, and extinction risk. Statistical assessment of these changes requires models that can accommodate non-normal ecological response variables such as species counts, abundance indices, and presence/absence observations. Ordinary linear regression is poorly suited to biodiversity count data because it assumes normally distributed errors, constant variance, and unbounded continuous responses. In ecological monitoring, these assumptions are often violated by overdispersion, excess zeros, unequal sampling effort, and nonlinear climate responses. This article develops a generalized linear model framework for assessing the effects of temperature and precipitation on biodiversity metrics. The objective is to estimate climate effects on species richness, abundance, and occurrence while accounting for sampling effort, confounding variables, and appropriate error structures. The primary model is a negative binomial generalized linear model for count data with a log link and an effort offset. A complementary logistic generalized linear model is specified for presence/absence responses, with climate predictors from gridded climate products and adjustment for land cover, elevation, and year. Conceptually, biodiversity is expected to show a nonlinear temperature response, with richness increasing up to an optimum and declining under high thermal stress. Precipitation is expected to show a positive linear association in water-limited systems, while negative binomial dispersion estimates between 1.2 and 2.5 would indicate meaningful overdispersion relative to a Poisson model. Generalized linear models provide a statistically sound and interpretable framework for climate-biodiversity assessment. When paired with diagnostic testing, offsets, and validation, they can support conservation prioritization under future climate scenarios.


Keywords: Climate change, Biodiversity, Generalized linear models, Negative binomial regression, Species richness, Overdispersion

INTRODUCTION

 

The accelerating loss of global biodiversity is increasingly linked to climate change through warming, precipitation shifts, drought intensification, and extreme weather events. Projected exposure to warming indicates that insects, vertebrates, and plants face substantially different levels of risk under 1.5°C and 2°C warming pathways, while observed biodiversity redistribution is already altering ecosystem structure and human well-being (Pecl et al., 2017; Warren et al., 2018). Large-scale biodiversity time series further show that community composition is changing across marine and terrestrial systems, although the direction and magnitude of local change vary by region and taxonomic group (Blowes et al., 2019; Antão et al., 2020). These patterns require statistical approaches that distinguish climate signals from background ecological variability, land-use pressure, and imperfect observation.

Quantifying climate-biodiversity relationships is challenging because many response variables are counts, proportions, or binary outcomes rather than continuous normally distributed measurements (Carpio-Vargas et al., 2023; Huata-Panca et al., 2025). Species richness and abundance are non-negative integers, occupancy is often recorded as presence or absence, and community change may be summarized through indices with bounded or skewed distributions (Hudson et al., 2017; Dornelas et al., 2018). Citizen-science and monitoring data also contain uneven spatial coverage, variable sampling intensity, and repeated observations through time, all of which complicate direct comparison among sites (Johnston et al., 2020; Torres-Cruz et al., 2025). As a result, ordinary least squares regression may produce inefficient, biased, or biologically impossible predictions when applied to biodiversity responses without distributional adjustment.

Generalized linear models offer a flexible statistical framework for ecological data because they connect the expected value of a response to predictors through a link function while allowing non-normal error distributions. Poisson, binomial, and negative binomial models can represent count, binary, and overdispersed count responses, respectively, making them suitable for richness, abundance, and occurrence analyses (Brooks et al., 2017; Warton, 2018). Species distribution modeling standards emphasize transparent model specification, evaluation, and reporting because inference about climate effects depends strongly on the response distribution, predictor selection, and validation strategy (Araújo et al., 2019; Zurell et al., 2020; Carita et al., 2025). For ecological count data, negative binomial models are especially important when variance exceeds the mean and transformation-based approaches fail to solve the underlying distributional problem (Warton, 2018).

This article argues that a rigorous GLM framework can produce interpretable and statistically defensible estimates of climate effects on biodiversity when it correctly specifies error structure, tests overdispersion, includes effort offsets, and evaluates nonlinear climate responses. Model comparison using information criteria, cross-validation, and structured residual diagnostics is necessary because biodiversity data often include spatial, temporal, and hierarchical dependence (Roberts et al., 2017; Valavi et al., 2018; Norberg et al., 2019). Mixed-effects extensions and diagnostic tools improve robustness when site-level heterogeneity, repeated observations, or excess zeros violate simple independence assumptions (Brooks et al., 2017; Harrison et al., 2018; Lüdecke et al., 2021). The resulting effect estimates can be translated into incidence rate ratios, odds ratios, and marginal predictions that directly inform conservation planning under climate change.

Figure 1 presents the hierarchical analytical workflow linking biodiversity monitoring data, climate covariates, GLM family selection, diagnostic evaluation, and conservation interpretation.

 

 

Figure 1. Hierarchical GLM workflow for assessing climate change effects on biodiversity

 

Background

Climate change impacts on biodiversity

Climate change affects biodiversity through phenological shifts, range contractions, local extinctions, altered abundance, and restructuring of ecological communities. Empirical and projected studies indicate that species responses differ among taxa, biomes, and climate zones, with tropical and temperate systems often showing different capacities for tracking suitable climates (Román-Palacios & Wiens, 2020; Freeman et al., 2021). Abrupt ecological disruption is expected when climate exposure exceeds species-specific tolerances across assemblages, and extinction risk increases when climatic change interacts with dispersal limitation and habitat degradation (Trisos et al., 2020; Wiens & Zelinka, 2024). These processes justify statistical models that can estimate both linear and nonlinear climate effects across repeated biodiversity observations.

Key climate predictors in ecological studies

Key climate predictors include annual mean temperature, seasonal temperature variability, total precipitation, precipitation anomalies, drought indices, growing degree days, heatwave frequency, and extreme weather events. High-resolution products such as WorldClim, CHELSA, CRU TS, and ERA5-Land provide gridded climate variables that can be matched to ecological monitoring locations at annual or seasonal resolutions (Fick & Hijmans, 2017; Karger et al., 2017; Harris et al., 2020; Muñoz-Sabater et al., 2021). Temperature-related biodiversity change has been detected across temperate marine and terrestrial systems, while precipitation and drought are especially important where water availability constrains productivity, occupancy, and population growth (Antão et al., 2020; Outhwaite et al., 2022). Including both central-tendency variables and climate extremes is therefore necessary to avoid underestimating climate stress.

Biodiversity response metrics

Biodiversity responses can be expressed as species richness counts, abundance indices, occupancy states, community composition, or diversity indices such as Shannon and Simpson measures. Global time-series resources and human-impact databases show that richness and abundance data often arise from repeated monitoring, transects, plots, or opportunistic occurrence records, each with distinct sampling properties (Hudson et al., 2017; Dornelas et al., 2018). Count responses are discrete and frequently overdispersed, while occupancy responses are binary and require a probability model rather than a Gaussian mean model (Araújo et al., 2019). These statistical properties determine whether the appropriate GLM family is Poisson, negative binomial, binomial, or an extension such as a zero-inflated or mixed-effects model.

Limitations of linear models for count and binary data

Linear models are limited for ecological count and binary data because they assume constant variance, normally distributed residuals, and unbounded continuous outcomes. For species counts, this can generate negative fitted values and misleading standard errors, particularly when the count variance increases with the mean or contains many zeros (Warton, 2018). For presence/absence responses, linear probability models can predict probabilities below zero or above one, whereas binomial GLMs constrain predictions to the valid probability range through the logit link (Araújo et al., 2019). These limitations are amplified in climate-biodiversity applications because climate gradients often create nonlinear responses and heterogeneous uncertainty across sites.

Generalized linear models as a solution

Generalized linear models address these limitations by modeling the conditional mean of the response through a link function and an exponential-family or related distribution. For richness counts, the log link ensures positive fitted means, while for occurrence data, the logit link maps linear predictors to probabilities between zero and one (Brooks et al., 2017; Warton, 2018). Maximum likelihood estimation allows direct comparison among candidate models through AIC, likelihood ratio tests, and information-theoretic evidence, provided the model family reflects the observed mean-variance relationship (Harrison et al., 2018; Norberg et al., 2019). This makes GLMs a practical foundation for climate ecology because effect sizes can be interpreted as multiplicative changes in counts or odds under specified climate increments.

Data sources and study design

Biodiversity response data

The study design uses a hypothetical longitudinal dataset from 50 monitoring sites surveyed annually over 10 years, producing 500 site-year observations. The primary response is species richness count for vascular plants or breeding birds, with supplementary abundance indices and presence/absence outcomes available for sensitivity analysis. Sampling effort is recorded as person-hours, transect length, checklist duration, trap nights, or number of survey visits, allowing effort to be incorporated as an offset rather than treated as an ordinary covariate. This design is consistent with long-term biodiversity time-series databases and monitoring schemes that compile repeated ecological observations across sites and years (Dornelas et al., 2018; Johnston et al., 2020).

Climate covariates

Climate covariates are assigned to each site-year using gridded climate products at matched spatial and temporal resolution. Annual mean temperature, represented by Bio1, and annual precipitation, represented by Bio12, can be extracted from WorldClim v2.1 at approximately 1-km resolution, while CHELSA, CRU TS, and ERA5-Land provide complementary climate surfaces and time-varying meteorological estimates (Fick & Hijmans, 2017; Karger et al., 2017; Harris et al., 2020; Muñoz-Sabater et al., 2021). Growing-season temperature from April to September and summer drought intensity, summarized as SPEI-3 or an analogous drought index, are included to capture biologically relevant seasonal stress. This covariate structure allows the GLM to test whether biodiversity responds more strongly to annual climate means, seasonal constraints, or short-term drought exposure.

Confounding and contextual variables

Confounding variables include land use type, elevation, year, habitat fragmentation, and spatial coordinates, all of which can influence biodiversity independently of climate. Land-use change and agricultural intensification are known to interact with climate pressure, particularly for insects and other taxa sensitive to both habitat conversion and thermal stress (Hudson et al., 2017; Outhwaite et al., 2022). Elevation is included because it structures local temperature, precipitation, dispersal barriers, and species pools, while year captures unmeasured temporal trends such as policy shifts, disease outbreaks, or observer protocol changes. Spatial coordinates are retained for residual autocorrelation checks and for possible transition to clustered standard errors or mixed models if independence assumptions are not supported (Roberts et al., 2017; Valavi et al., 2018).

Exploratory data analysis and dispersion assessment

Distribution of biodiversity responses

Exploratory analysis begins with histograms, frequency tables, and mean-variance plots for species richness and abundance counts. Under a Poisson GLM, the theoretical variance equals the mean, so an empirical variance-mean ratio greater than one indicates overdispersion and motivates a negative binomial specification (Warton, 2018). Biodiversity time-series data commonly display heterogeneity among sites, years, and taxa, making overdispersion plausible before formal modeling (Dornelas et al., 2018; Blowes et al., 2019). The count distribution should also be inspected for excess zeros, high-leverage observations, and unusually rich sites that may dominate the fitted climate response.
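The variance-mean check described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `variance_mean_ratio` and the richness values are hypothetical and are not taken from any dataset in this article.

```python
from statistics import mean, variance


def variance_mean_ratio(counts):
    """Return the sample variance divided by the sample mean of a list of counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; the ratio is undefined")
    return variance(counts) / m


# Hypothetical richness counts from ten site-years (illustrative values only).
richness = [3, 7, 0, 12, 5, 21, 4, 9, 0, 15]

ratio = variance_mean_ratio(richness)
zero_share = richness.count(0) / len(richness)

# A ratio well above one suggests overdispersion relative to a Poisson model.
print(f"variance/mean = {ratio:.2f}, share of zeros = {zero_share:.2f}")
```

A raw ratio of this kind ignores covariates, so it is only a screening step before the model-based dispersion test.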

Pairwise relationships with climate variables

Pairwise relationships between biodiversity responses and climate predictors are examined using scatterplots, binned summaries, and loess smoothing. These visual checks are important because temperature effects may be unimodal, with richness increasing under moderate warming but declining beyond physiological or ecological thresholds (Antão et al., 2020; Freeman et al., 2021). Boxplots across temperature and precipitation quantiles help reveal whether extreme heat, drought, or precipitation anomalies are associated with compressed richness distributions or elevated absence probabilities (Trisos et al., 2020; Outhwaite et al., 2022). Such exploratory evidence guides whether the GLM should include quadratic temperature terms, interaction terms, or later sensitivity analysis with generalized additive models.

Preliminary dispersion testing

Preliminary dispersion testing is conducted by fitting a Poisson GLM and calculating the Pearson chi-square statistic divided by the residual degrees of freedom. A value near one supports the Poisson mean-variance assumption, whereas values above 1.5 indicate overdispersion that can inflate type I error if ignored (Warton, 2018; Lüdecke et al., 2021). Simulated residual diagnostics can supplement this calculation by evaluating whether residuals are uniform, whether zeros are more frequent than expected, and whether dispersion remains after covariate adjustment (Lüdecke et al., 2021). If overdispersion persists, the negative binomial GLM becomes the primary count model rather than a secondary robustness check.
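As a rough sketch of the Pearson dispersion calculation, the function below takes observed counts and fitted Poisson means and divides the Pearson chi-square by the residual degrees of freedom. The function name, the example counts, and the use of an intercept-only fit (where the fitted mean is simply the sample mean) are assumptions made for illustration; in practice the fitted means would come from the Poisson GLM with climate covariates.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    Assumes the Poisson variance function, i.e. Var(y) = mu.
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    df_resid = len(observed) - n_params
    if df_resid <= 0:
        raise ValueError("not enough observations for the number of parameters")
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / df_resid


# Hypothetical counts; an intercept-only Poisson fit has fitted mean = sample mean.
counts = [2, 5, 1, 9, 4, 14, 3, 6]
sample_mean = sum(counts) / len(counts)
fitted_means = [sample_mean] * len(counts)

dispersion = pearson_dispersion(counts, fitted_means, n_params=1)
print(f"Pearson dispersion = {dispersion:.2f}")
```

A value above the 1.5 threshold used in this section would point toward the negative binomial specification.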

Generalized linear model specification

Table 1 provides a decision matrix for matching biodiversity response variables to GLM families, link functions, diagnostic risks, and interpretable climate effect measures.

 

 

Table 1. Decision matrix for selecting GLM-family models in climate-biodiversity analysis

| Biodiversity response | Statistical form | Recommended primary model | Link function | Key climate predictors | Main diagnostic risk | Preferred effect-size interpretation | Conservation-relevant output |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Species richness | Non-negative count | Negative binomial GLM | Log | Annual mean temperature, precipitation, drought index | Overdispersion, influential high-richness sites | Incidence rate ratio for expected richness | Predicted richness change under climate gradients |
| Total abundance | Non-negative count | Negative binomial GLM or Poisson GLM after dispersion check | Log | Growing-season temperature, precipitation anomaly, heatwave days | Extra-Poisson variance, aggregation, effort imbalance | Incidence rate ratio for expected abundance | Population-level sensitivity to warming or drought |
| Presence/absence | Binary | Logistic GLM | Logit | Temperature, precipitation, drought, land-cover interaction | Separation, imperfect detection, spatial bias | Odds ratio for occurrence probability | Probability of persistence or local absence |
| Proportion of occupied plots | Binomial proportion | Binomial GLM | Logit | Seasonal temperature, precipitation, elevation interaction | Overdispersion among plots, non-independence | Odds ratio for proportional occupancy | Change in occupied habitat fraction |
| Many zero observations | Count with excess zeros | Zero-inflated negative binomial sensitivity model | Log for count component; logit for zero component | Drought, heatwaves, habitat suitability | Structural zeros vs. sampling zeros | Separate effects on occurrence and conditional abundance | Identification of unsuitable or climate-excluded sites |
| Repeated site-year observations | Hierarchical count or binary response | GLMM extension | Log or logit | Temperature trend, year, climate anomaly | Site dependence, temporal autocorrelation | Conditional and marginal climate effects | Site-specific vulnerability and regional heterogeneity |
| Nonlinear climate response | Count or binary response with threshold | GLM with quadratic terms or GAM sensitivity model | Log or logit | Temperature, Temp², precipitation, Temp×Precip | Mis-specified linearity | Temperature optimum or threshold response | Identification of climatic tolerance limits |

 

Base model for count data

The base count model is a negative binomial generalized linear model with a log link because species richness is non-negative, discrete, and expected to be overdispersed. The model is specified as $\log(\mu_i) = \beta_0 + \beta_1 \mathrm{Temp}_i + \beta_2 \mathrm{Precip}_i + \log(\mathrm{Effort}_i)$, where $\mu_i$ is the expected richness for site-year $i$ and $\mathrm{Effort}_i$, entering as an offset, represents survey duration, transect length, or another measure of sampling intensity. The negative binomial dispersion parameter θ is estimated separately, with smaller θ values indicating stronger overdispersion relative to a Poisson model (Brooks et al., 2017; Warton, 2018). Coefficients are interpreted after exponentiation as incidence rate ratios, giving the multiplicative change in expected richness for a one-unit change in each climate predictor.
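To make the log link and the effort offset concrete, the sketch below evaluates the expected count for one site-year and the corresponding negative binomial variance. It assumes a linear predictor with an intercept, a temperature term, a precipitation term, and log effort as the offset, together with the common parameterization in which the variance is μ + μ²/θ. The coefficient values are hypothetical placeholders, not estimates from this study.

```python
import math


def expected_richness(temp, precip, effort, b0, b1, b2):
    """Expected count under a log link: exp(b0 + b1*temp + b2*precip + log(effort))."""
    return math.exp(b0 + b1 * temp + b2 * precip + math.log(effort))


def nb_variance(mu, theta):
    """Negative binomial variance mu + mu^2/theta; tends to the Poisson variance as theta grows."""
    return mu + mu ** 2 / theta


# Hypothetical coefficients (assumptions for illustration only).
b0, b1, b2 = 1.0, 0.04, 0.0005

mu_2h = expected_richness(temp=12.0, precip=900.0, effort=2.0, b0=b0, b1=b1, b2=b2)
mu_4h = expected_richness(temp=12.0, precip=900.0, effort=4.0, b0=b0, b1=b1, b2=b2)

# Doubling effort doubles the expected count because the offset coefficient is fixed at one.
print(f"expected richness: {mu_2h:.1f} (2 h) vs {mu_4h:.1f} (4 h)")
print(f"NB variance at theta=2: {nb_variance(mu_2h, 2.0):.1f}")
```

The variance line shows why a small θ matters: for the same mean, the variance grows with μ²/θ, which a Poisson model cannot represent.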

Alternative model for presence/absence

The alternative presence/absence model uses a logistic GLM with binomial error and a logit link. The model is specified as $\mathrm{logit}(p_i) = \beta_0 + \beta_1 \mathrm{Temp}_i + \beta_2 \mathrm{Precip}_i$, where $p_i$ is the probability of recorded presence in site-year $i$, with optional adjustment for land cover, elevation, and year when occurrence records are drawn from heterogeneous monitoring contexts. This formulation is appropriate for occupancy-style responses because it estimates the probability that a species or taxonomic group is recorded at a site-year rather than treating presence as a continuous outcome (Araújo et al., 2019; Johnston et al., 2020). Exponentiated coefficients are interpreted as odds ratios, which provide a direct measure of how climatic conditions alter the odds of observed presence.
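The logit link and the odds-ratio interpretation can be illustrated with a short sketch. It assumes a linear predictor with an intercept, a temperature term, and a precipitation term; the coefficient values and function names are hypothetical.

```python
import math


def occurrence_probability(temp, precip, b0, b1, b2):
    """Inverse logit of the linear predictor; always strictly between 0 and 1."""
    eta = b0 + b1 * temp + b2 * precip
    return 1.0 / (1.0 + math.exp(-eta))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Hypothetical coefficients (assumptions for illustration only).
b0, b1, b2 = 2.0, -0.15, 0.001

p_cool = occurrence_probability(temp=14.0, precip=800.0, b0=b0, b1=b1, b2=b2)
p_warm = occurrence_probability(temp=15.0, precip=800.0, b0=b0, b1=b1, b2=b2)

# The odds ratio for a 1 degree C increase equals exp(b1), whatever the starting temperature.
odds_ratio = odds(p_warm) / odds(p_cool)
print(f"p(14 C) = {p_cool:.3f}, p(15 C) = {p_warm:.3f}, odds ratio = {odds_ratio:.3f}")
```

The change in probability depends on the starting point, whereas the odds ratio does not, which is why both are usually reported.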

Inclusion of nonlinear and interaction terms

Nonlinear and interaction terms are included to reflect the ecological expectation that climate effects are not always monotonic. A quadratic temperature term is specified as $\beta_1 \mathrm{Temp}_i + \beta_3 \mathrm{Temp}_i^2$, allowing estimation of a temperature optimum when β3 is negative. A Temp×Precip interaction tests whether warming effects differ under wet and dry conditions, which is important when thermal stress is intensified by aridity or reduced by water availability (Antão et al., 2020; Trisos et al., 2020; Outhwaite et al., 2022). Candidate models are compared using AIC, cross-validation, and ecological interpretability rather than relying only on statistical significance (Roberts et al., 2017; Valavi et al., 2018; Norberg et al., 2019).

Handling overdispersion, zero inflation, and offsets

Comparing Poisson vs. negative binomial

The Poisson and negative binomial count models are compared by examining the dispersion statistic, likelihood-based evidence, and AIC. Because the Poisson model is the limiting case of the negative binomial model as θ becomes large, a likelihood ratio test can be used to compare them, while an AIC difference greater than two provides practical evidence favoring the negative binomial model if it improves fit without excessive complexity (Harrison et al., 2018; Norberg et al., 2019). The dispersion parameter θ is reported because smaller θ indicates stronger extra-Poisson variation, which is common in biodiversity monitoring where unobserved habitat quality, observer differences, and local population aggregation inflate count variance (Brooks et al., 2017; Warton, 2018). In the present GLM framework, negative binomial superiority over Poisson would support the conclusion that richness responses to temperature and precipitation cannot be modeled reliably under an equidispersion assumption.
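The comparison can be sketched from the two maximized log-likelihoods alone. The example below computes AIC for each model and a likelihood ratio statistic with one degree of freedom for the extra dispersion parameter. Halving the chi-square p-value is a commonly used adjustment, applied here as an assumption, because the Poisson model lies on the boundary of the negative binomial parameter space. The log-likelihood values and parameter counts are hypothetical.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2*logL."""
    return 2 * n_params - 2 * log_lik


def lrt_poisson_vs_nb(ll_poisson, ll_nb):
    """Likelihood ratio test for one extra (dispersion) parameter.

    Returns the test statistic and a boundary-adjusted p-value, taken as
    half the upper tail of a chi-square distribution with 1 df.
    """
    stat = max(0.0, 2.0 * (ll_nb - ll_poisson))
    # Upper tail of chi-square(1) expressed through the complementary error function.
    p_chi2_1 = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_chi2_1 / 2.0


# Hypothetical maximized log-likelihoods (assumptions for illustration only).
ll_pois, k_pois = -1520.4, 3   # intercept, temperature, precipitation
ll_nb, k_nb = -1388.9, 4       # same terms plus theta

delta_aic = aic(ll_pois, k_pois) - aic(ll_nb, k_nb)
stat, p_value = lrt_poisson_vs_nb(ll_pois, ll_nb)
print(f"delta AIC = {delta_aic:.1f}, LRT statistic = {stat:.1f}, p = {p_value:.3g}")
```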

Zero-inflated models as sensitivity test

Zero-inflated models are used as a sensitivity test when observed zeros exceed the number expected under the fitted Poisson or negative binomial model. In climate-biodiversity data, excess zeros may arise from unsuitable habitat, local extinction, dispersal limitation, imperfect detection, or stochastic absence during extreme weather years (Román-Palacios & Wiens, 2020; Trisos et al., 2020). A zero-inflated negative binomial model separates the ecological process generating structural zeros from the count process generating observed abundance or richness, and packages such as glmmTMB allow this structure to be estimated flexibly (Brooks et al., 2017). Comparing standard and zero-inflated specifications helps determine whether apparent climate effects reflect changes in expected richness, changes in occupancy suitability, or both.
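A quick screen for excess zeros compares the observed number of zeros with the number expected under the fitted count model. The sketch below uses the analytic zero probabilities, exp(-μ) for Poisson and (θ/(θ+μ))^θ for the negative binomial, rather than simulation; the fitted means, θ, and observed zero count are hypothetical values chosen for illustration.

```python
import math


def poisson_zero_prob(mu):
    """Probability of a zero count under a Poisson distribution with mean mu."""
    return math.exp(-mu)


def nb_zero_prob(mu, theta):
    """Probability of a zero count under a negative binomial with mean mu and dispersion theta."""
    return (theta / (theta + mu)) ** theta


def expected_zeros(fitted_means, theta=None):
    """Sum of per-observation zero probabilities; Poisson if theta is None."""
    if theta is None:
        return sum(poisson_zero_prob(mu) for mu in fitted_means)
    return sum(nb_zero_prob(mu, theta) for mu in fitted_means)


# Hypothetical fitted means for eight site-years and a hypothetical theta.
fitted = [0.8, 1.5, 2.2, 3.0, 4.1, 5.5, 6.0, 7.4]
observed_zero_count = 4

exp_pois = expected_zeros(fitted)
exp_nb = expected_zeros(fitted, theta=1.5)
print(f"observed zeros = {observed_zero_count}, "
      f"expected: Poisson {exp_pois:.2f}, negative binomial {exp_nb:.2f}")
```

If observed zeros still exceed the negative binomial expectation by a wide margin, a zero-inflated specification becomes worth fitting as the sensitivity model.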

Inclusion of offsets for sampling effort

An offset for sampling effort is included when the number of species or individuals observed depends partly on survey duration, transect length, checklist number, trap nights, or observer time. The offset term offset(log(Effort_i)) fixes the effort coefficient at one, so the model estimates biodiversity rates rather than raw counts that are confounded by unequal sampling intensity (Johnston et al., 2020). This is essential for citizen-science and long-term monitoring data because spatial and temporal variation in recorder effort can mimic ecological change if not explicitly modeled (Hudson et al., 2017; Dornelas et al., 2018). In practical terms, the fitted model estimates how expected richness changes with temperature and precipitation after standardizing observations to a common effort scale.
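The effect of the offset can be written out explicitly. In the short derivation below, $\mathbf{x}_i^{\top}\boldsymbol{\beta}$ is shorthand, introduced here as an assumption, for the intercept plus all climate and contextual terms in the linear predictor.

```latex
\begin{align*}
\log(\mu_i) &= \mathbf{x}_i^{\top}\boldsymbol{\beta} + 1 \cdot \log(\mathrm{Effort}_i) \\
\log\!\left(\frac{\mu_i}{\mathrm{Effort}_i}\right) &= \mathbf{x}_i^{\top}\boldsymbol{\beta} \\
\mu_i &= \mathrm{Effort}_i \, \exp\!\left(\mathbf{x}_i^{\top}\boldsymbol{\beta}\right)
\end{align*}
```

The second line shows that the coefficients describe the expected count per unit of effort, and the third shows that expected counts scale proportionally with effort.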

Interpreting climate effect sizes and uncertainty

Incidence rate ratios for count models

In the negative binomial GLM, exponentiated coefficients are interpreted as incidence rate ratios. If β1 represents annual mean temperature, then e^(β1) gives the multiplicative change in expected species richness associated with a 1°C increase, holding precipitation, effort, land cover, elevation, and year constant (Brooks et al., 2017; Warton, 2018). Confidence intervals can be estimated using profile likelihood or robust standard errors, and their interpretation should emphasize effect magnitude as well as statistical uncertainty. This approach is more informative for conservation planning than reporting coefficients only on the log scale because it translates climate effects into proportional gains or losses in biodiversity (Belfiore et al., 2024; Figueroa-Valverde et al., 2024; Karatas, 2024; Lee & Ferreira, 2024; Negreiros & Ory, 2024).
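The conversion from a log-scale coefficient to an incidence rate ratio can be sketched as follows. For simplicity the example uses a Wald interval, exp(β ± 1.96·SE), in place of the profile-likelihood or robust intervals mentioned above; the coefficient and standard error are hypothetical.

```python
import math


def incidence_rate_ratio(beta, se, z=1.96):
    """Return (IRR, lower, upper) using a Wald interval on the log scale."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def percent_change(irr):
    """Express an incidence rate ratio as a percentage change in the expected count."""
    return (irr - 1.0) * 100.0


# Hypothetical temperature coefficient and standard error (illustration only).
beta_temp, se_temp = -0.05, 0.02

irr, lower, upper = incidence_rate_ratio(beta_temp, se_temp)
print(f"IRR = {irr:.3f} (95% CI {lower:.3f} to {upper:.3f}); "
      f"{percent_change(irr):.1f}% change in expected richness per 1 degree C")
```

Because the interval is built on the log scale and then exponentiated, it is asymmetric around the IRR and never crosses zero.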

Marginal effects and predicted biodiversity surfaces

Marginal effects plots are used to show predicted richness or occurrence across climate gradients while holding other covariates at representative values. For the quadratic temperature model, the temperature optimum is identified where the derivative of the linear predictor with respect to temperature equals zero, giving Temp_optimum = -β1/(2*β3) when β3 is negative (Antão et al., 2020; Freeman et al., 2021). Predicted biodiversity surfaces over temperature and precipitation gradients can reveal whether richness declines most strongly under hot-dry conditions, moderate warming, or precipitation deficits (Trisos et al., 2020; Outhwaite et al., 2022). These visual summaries support interpretation of nonlinear GLM results without requiring readers to infer ecological thresholds from coefficient tables alone (Abdullah et al., 2025; Jagsi et al., 2025; Kęska & Suchy, 2024; Kounatidis et al., 2024; Lee & Ferreira, 2024; Noor et al., 2024; Petronis et al., 2025; Schneider & Krüger, 2025; Wong et al., 2025; Yu et al., 2025).
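The temperature optimum formula can be checked numerically. The sketch below computes -β1/(2β3) and confirms that a grid search over the quadratic part of the linear predictor peaks at the same temperature. The coefficient values are hypothetical, and the function name is introduced only for this illustration.

```python
def temperature_optimum(b1, b3):
    """Temperature at which b1*T + b3*T^2 is maximized; requires b3 < 0."""
    if b3 >= 0:
        raise ValueError("b3 must be negative for an interior maximum")
    return -b1 / (2.0 * b3)


# Hypothetical linear and quadratic temperature coefficients (illustration only).
b1, b3 = 0.30, -0.01

optimum = temperature_optimum(b1, b3)

# Grid check: evaluate the temperature part of the linear predictor from 0 to 30 C.
grid = [t / 2.0 for t in range(0, 61)]
grid_peak = max(grid, key=lambda t: b1 * t + b3 * t ** 2)

print(f"analytic optimum = {optimum:.1f} C, grid peak = {grid_peak:.1f} C")
```

Because the log and logit links are monotonic, the optimum on the linear-predictor scale is also the optimum for predicted richness or occurrence probability.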

Contribution of climate vs. land use

The relative contribution of climate and land use is assessed by comparing nested models: a full model containing climate, land use, elevation, effort, and year against reduced models that remove climate terms or remove land-use terms. Changes in AIC, likelihood, and pseudo-R² indicate whether climate predictors explain variation beyond habitat conversion, agricultural intensity, and other human pressures (Hudson et al., 2017; Outhwaite et al., 2022). Because agriculture and climate change can jointly reshape biodiversity, the interpretation should avoid treating climate coefficients as purely causal unless confounders and spatial structure are adequately addressed (Outhwaite et al., 2022). Measures such as marginal and conditional R² are particularly useful when the GLM is extended to a mixed-effects model with site or year random effects (Nakagawa et al., 2017).

Practical implications for conservation planning

Identifying climate-sensitive biodiversity hotspots

Fitted GLMs can be used to predict biodiversity under alternative future climate scenarios, including moderate and high-emissions pathways, by applying estimated climate coefficients to projected temperature and precipitation surfaces. Areas where predicted richness or occurrence declines sharply can be classified as climate-sensitive biodiversity hotspots and prioritized for protection, restoration, or assisted connectivity (Trisos et al., 2020; Warren et al., 2018). This is especially important because projected ecological disruption may occur abruptly when multiple species cross climatic thresholds within a short period (Trisos et al., 2020). The same modeling framework can also identify potential climate refugia where predicted biodiversity remains relatively stable despite regional warming.
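A scenario projection of this kind reduces to comparing linear predictors under current and projected climate. The sketch below assumes a count model with linear and quadratic temperature terms and a precipitation term; the intercept and effort offset cancel in the ratio. Coefficients, site values, and the 10% decline threshold used to flag a site are all hypothetical.

```python
import math


def projected_percent_change(b1, b2, b3, temp_now, precip_now, temp_future, precip_future):
    """Percent change in expected richness between current and projected climate.

    Other covariates and sampling effort are held fixed, so they cancel.
    """
    def eta(temp, precip):
        return b1 * temp + b3 * temp ** 2 + b2 * precip

    ratio = math.exp(eta(temp_future, precip_future) - eta(temp_now, precip_now))
    return (ratio - 1.0) * 100.0


# Hypothetical coefficients and sites (assumptions for illustration only).
b1, b2, b3 = 0.30, 0.0005, -0.01
sites = {
    "cool_upland": (8.0, 1200.0, 10.0, 1200.0),   # temp_now, precip_now, temp_future, precip_future
    "hot_lowland": (20.0, 800.0, 22.0, 750.0),
}

changes = {
    name: projected_percent_change(b1, b2, b3, *climate)
    for name, climate in sites.items()
}
# Flag sites whose projected decline exceeds a hypothetical 10% threshold.
sensitive = [name for name, change in changes.items() if change < -10.0]
print(changes, sensitive)
```

In this toy example the site already above the temperature optimum is flagged, while the cooler site is projected to gain richness and would be a candidate refugium.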

Designing climate-adapted monitoring

GLM outputs can guide climate-adapted monitoring by estimating the number of sites, years, or repeated surveys needed to detect a specified biodiversity trend with acceptable uncertainty. Standard errors from negative binomial and logistic models can be incorporated into simulation-based power analysis, allowing monitoring designs to account for overdispersion, baseline richness, and expected climate variability (Roberts et al., 2017; Johnston et al., 2020). Long-term biodiversity databases show that repeated observation is essential because climate signals are often embedded within substantial spatial, temporal, and taxonomic variation (Dornelas et al., 2018; Blowes et al., 2019). A model-based monitoring design is therefore more efficient than uniform sampling when climate risk is spatially structured and sampling resources are limited.

Model evaluation and validation strategy

Table 2 consolidates the diagnostic and validation checkpoints required to distinguish robust climate-biodiversity inference from model artifacts caused by dispersion, sampling effort, dependence, or extreme years (Carpio-Vargas et al., 2023; Belfiore et al., 2024; Figueroa-Valverde et al., 2024; Karatas, 2024; Lee & Ferreira, 2024; Negreiros & Ory, 2024; Wolderslund et al., 2024; Carita et al., 2025; Huata-Panca et al., 2025; Torres-Cruz et al., 2025).

 

 

Table 2. Analytical checkpoints for strengthening inference from climate-biodiversity GLMs

| Analytical checkpoint | Statistical question addressed | Operational test or implementation | Evidence of concern | Model response | Interpretive implication |
| --- | --- | --- | --- | --- | --- |
| Mean-variance comparison | Does count variance exceed the Poisson assumption? | Plot mean versus variance; calculate Pearson dispersion statistic | Variance substantially greater than mean; dispersion statistic > 1.5 | Use negative binomial GLM or quasi-Poisson sensitivity model | Climate effects may be overstated if Poisson standard errors are retained |
| Zero-frequency assessment | Are absences or zero counts more frequent than expected? | Compare observed zeros with simulated zeros from fitted model | Observed zeros exceed model-predicted zeros | Fit zero-inflated negative binomial sensitivity model | Climate may affect both suitability and conditional richness |
| Effort standardization | Are counts confounded by unequal sampling intensity? | Include offset(log(Effort_i)) for duration, visits, transect length, or trap nights | Richness increases strongly with effort before adjustment | Retain effort offset in all count models | Estimated biodiversity change reflects rate-standardized response |
| Nonlinearity screening | Is the climate response monotonic or threshold-like? | Fit Temp² term; compare AIC and marginal effects | Significant curvature or improved AIC | Retain quadratic term or test GAM sensitivity model | Biodiversity may peak at intermediate temperature or precipitation levels |
| Climate-land-use confounding | Are climate effects independent of habitat context? | Compare nested models with and without land use and elevation | Climate coefficients shift after contextual adjustment | Retain confounders and report adjusted estimates | Apparent climate effects may partly reflect habitat gradients |
| Spatial dependence | Are nearby sites statistically independent? | Examine spatial residual correlograms or blocked cross-validation | Residual clustering by location | Use clustered standard errors, spatial blocking, or GLMM extension | Naive uncertainty estimates may be too narrow |
| Temporal dependence | Are repeated observations independent through time? | Check residual autocorrelation by year and site | Persistent residual trends or lag structure | Add year effects, site random effects, or temporal validation | Climate trend estimates may absorb unmodeled temporal processes |
| Predictive robustness | Does the model generalize beyond fitted observations? | k-fold, spatial-block, or leave-one-year-out cross-validation | Poor out-of-sample prediction or unstable coefficients | Revise predictors, interactions, or random effects | Conservation projections require validated predictive performance |
| Extreme-year sensitivity | Are effects driven by drought or heatwave years? | Refit excluding major drought or heatwave years | Large coefficient changes after exclusion | Report average-climate and extreme-year models separately | Biodiversity response may reflect episodic disturbance rather than gradual trend |

 

Residual diagnostics for GLMs

Residual diagnostics are conducted using deviance residuals, Pearson residuals, and simulated residuals. Simulated residual tools are especially useful for testing uniformity, overdispersion, zero inflation, and residual temporal or spatial autocorrelation in fitted GLMs and GLMMs (Lüdecke et al., 2021). If residual patterns remain after including climate, land cover, effort, and year, the model may require additional covariates, nonlinear terms, random effects, or spatially structured validation (Roberts et al., 2017; Valavi et al., 2018). Diagnostic evidence should be reported alongside coefficient estimates because apparent climate effects may be unreliable when residual assumptions are visibly violated.

Cross-validation and predictive performance

Predictive performance is evaluated using k-fold cross-validation, with folds constructed to reduce leakage across spatially or temporally dependent observations. For count models, predictive accuracy can be summarized using root mean squared error, mean absolute error, log predictive density, or calibration of predicted richness; for binary models, AUC and calibration plots are appropriate (Roberts et al., 2017; Norberg et al., 2019). Spatial blocking is preferred when nearby sites share similar climate and species pools, because random cross-validation can overestimate transferability under spatial autocorrelation (Valavi et al., 2018). Comparing the fitted GLM to a null model and to alternative climate specifications provides evidence on whether temperature and precipitation improve prediction beyond baseline site and year structure.
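The idea of spatial blocking can be sketched without any specialized package. The example below is a simplified illustration, not the blockCV algorithm: it assumes planar site coordinates and a user-chosen square block size, and the names `spatial_block_folds`, `rmse`, and `mae` are hypothetical. Whole blocks are assigned to folds so that neighboring sites inside a block are never split between training and test sets.

```python
import math
from collections import defaultdict


def spatial_block_folds(coords, block_size, k):
    """Assign each site to one of k folds by square spatial block.

    coords: list of (x, y) site coordinates in a projected unit.
    block_size: side length of a block in the same unit (an analyst
    choice, for example guided by the residual correlogram range).
    Returns a list of fold indices, one per site.
    """
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        key = (math.floor(x / block_size), math.floor(y / block_size))
        blocks[key].append(i)
    folds = [None] * len(coords)
    # Deal blocks to folds round-robin in sorted order so the split is
    # deterministic and every block stays intact within one fold.
    for rank, key in enumerate(sorted(blocks)):
        for i in blocks[key]:
            folds[i] = rank % k
    return folds


def rmse(observed, predicted):
    """Root mean squared error of held-out predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error of held-out predictions."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


# Hypothetical site coordinates (km), for illustration only.
sites = [(0.5, 0.5), (0.9, 0.1), (5.2, 0.3), (5.8, 0.7), (0.2, 5.5)]
print(spatial_block_folds(sites, block_size=1.0, k=2))
```

In practice the model would be refitted once per fold on the remaining folds, and `rmse` or `mae` computed on the held-out block predictions.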

Sensitivity to extreme climate years

Sensitivity to extreme climate years is evaluated with leave-one-year-out cross-validation and targeted refitting that excludes major drought, heatwave, or anomalous precipitation years. This analysis tests whether the estimated climate response is stable or whether it is driven primarily by rare events that exert disproportionate influence on biodiversity observations (Trisos et al., 2020; Leclerc et al., 2025). Because climate extremes can restructure ecological networks and interact with biological invasions, excluding or isolating extreme years can clarify whether coefficients represent average climate gradients or episodic disturbance effects (Leclerc et al., 2025). If model conclusions change substantially, the analysis should report both average-climate and extreme-year interpretations rather than presenting a single pooled estimate as universally stable.
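The leave-one-year-out refit can be illustrated with a deliberately small model. The sketch below makes several simplifying assumptions: it uses a Poisson log-link model with a single temperature predictor and an effort offset, fitted by Newton-Raphson, in place of the negative binomial model with full covariates; the function names `fit_poisson_slope` and `leave_one_year_out` and the example records are hypothetical.

```python
import math


def fit_poisson_slope(x, y, effort, n_iter=50, tol=1e-10):
    """Fit log(mu) = log(effort) + b0 + b1 * x by Newton-Raphson.

    Returns (b0, b1). Poisson is used here only to keep the sketch short.
    """
    b0 = math.log(sum(y) / sum(effort))  # intercept-only starting value
    b1 = 0.0
    for _ in range(n_iter):
        mu = [e * math.exp(b0 + b1 * xi) for xi, e in zip(x, effort)]
        # Score vector components.
        s0 = sum(yi - mi for yi, mi in zip(y, mu))
        s1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix entries.
        i00 = sum(mu)
        i01 = sum(mi * xi for xi, mi in zip(x, mu))
        i11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        d0 = (i11 * s0 - i01 * s1) / det
        d1 = (i00 * s1 - i01 * s0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def leave_one_year_out(records):
    """records: list of (year, temperature, count, effort) tuples.

    Returns {year: temperature slope estimated with that year excluded}.
    """
    slopes = {}
    for held_out in sorted({r[0] for r in records}):
        kept = [r for r in records if r[0] != held_out]
        _, b1 = fit_poisson_slope(
            [r[1] for r in kept], [r[2] for r in kept], [r[3] for r in kept]
        )
        slopes[held_out] = b1
    return slopes


# Hypothetical records; 2005 stands in for a heatwave year with low counts.
records = [
    (2001, -1.5, 2, 1.0), (2002, -0.5, 3, 1.0), (2003, 0.5, 3, 1.0),
    (2004, 1.5, 4, 1.0), (2005, 3.0, 1, 1.0),
]
full_slope = fit_poisson_slope(
    [r[1] for r in records], [r[2] for r in records], [r[3] for r in records]
)[1]
for year, slope in leave_one_year_out(records).items():
    print(year, "slope shift:", round(slope - full_slope, 3))
```

A year whose exclusion shifts the slope far more than any other is a candidate for the separate extreme-year reporting described above.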

Limitations

Confounding and causality

GLMs estimate statistical associations between climate predictors and biodiversity responses, but they do not by themselves establish causality. Climate covariates are often correlated with land use, elevation, habitat fragmentation, productivity, invasive species, pollution, and biotic interactions, making omitted-variable bias a serious concern (Hudson et al., 2017; Outhwaite et al., 2022). Even when temperature and precipitation coefficients are significant, causal interpretation requires careful design, sensitivity analysis, temporal ordering, and ideally independent validation or quasi-experimental evidence. Therefore, the GLM framework should be interpreted as an inferential and predictive tool rather than as proof that climate alone caused the observed biodiversity change.

Spatial autocorrelation and detection biases

Spatial autocorrelation can inflate statistical significance when nearby sites share unmeasured environmental conditions, species pools, or observer networks. Detection bias can also distort abundance and occurrence estimates because false absences, variable observer skill, and unequal effort are common in biodiversity datasets, especially citizen-science records (Roberts et al., 2017; Johnston et al., 2020). Mixed-effects models, clustered standard errors, spatial blocking, and explicit effort offsets reduce these risks but may not fully resolve imperfect detection or unobserved site heterogeneity (Harrison et al., 2018; Valavi et al., 2018). Future extensions should integrate hierarchical occupancy models, explicit detection probabilities, and spatial random effects when the monitoring design supports that level of model complexity.
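One common way to quantify residual spatial autocorrelation is Moran's I. The sketch below is a minimal pure-Python version, assuming planar coordinates and binary neighbor weights within a hypothetical distance threshold `max_distance`; the function name and arguments are illustrative and are not part of the framework described here.

```python
import math


def morans_i(coords, residuals, max_distance):
    """Moran's I for model residuals with binary distance-band weights.

    Two sites are neighbors (weight 1) when their Euclidean distance is
    at most max_distance; all other weights are 0. Values near 0 suggest
    little spatial structure, positive values suggest clustering of
    similar residuals, negative values suggest alternation.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    cross_products = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if math.dist(coords[i], coords[j]) <= max_distance:
                cross_products += dev[i] * dev[j]
                weight_sum += 1.0
    total_sq = sum(d * d for d in dev)
    if weight_sum == 0 or total_sq == 0:
        raise ValueError("no neighbor pairs or no residual variation")
    return (n / weight_sum) * (cross_products / total_sq)


# Hypothetical transect of six sites, for illustration only.
transect = [(float(i), 0.0) for i in range(6)]
clustered = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
print(round(morans_i(transect, clustered, max_distance=1.0), 3))
```

A permutation test, in which residuals are shuffled across sites many times, would be one way to judge whether an observed value differs from what chance alone produces; that step is omitted here for brevity.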

CONCLUSION

Generalized linear models provide a coherent statistical framework for assessing climate change effects on biodiversity because they align model structure with ecological response types. Count outcomes such as richness and abundance can be analyzed with log-linked Poisson or negative binomial models, while occurrence outcomes can be analyzed with logit-linked binomial models. Including effort offsets, confounders, and nonlinear climate terms strengthens inference by reducing avoidable bias and improving biological realism.

The central statistical point of this framework is that ecological count data often require negative binomial rather than Poisson assumptions. By estimating incidence rate ratios, odds ratios, confidence intervals, and marginal effects, the model converts abstract climate coefficients into interpretable measures of biodiversity change. Quadratic temperature terms and climate interactions further allow the model to detect optima, thresholds, and hot-dry stress conditions.

The practical value of this GLM approach lies in its ability to support conservation planning under climate uncertainty. Predicted richness and occurrence surfaces can identify climate-sensitive hotspots, potential refugia, and monitoring locations where biodiversity is most likely to change. Model-based power analysis can also improve survey design by estimating how many sites, years, or repeated visits are required to detect climate-driven trends.

Future work should integrate long-term biodiversity monitoring networks, transparent GLM code repositories, and hierarchical model extensions that account for detection probability, site dependence, and spatial random effects. Such integration would improve reproducibility and strengthen the link between ecological observation, statistical inference, and conservation decision-making. As climate pressures intensify, statistically defensible models will be essential for identifying where biodiversity loss is occurring, which climate and habitat factors are associated with it, and how conservation responses should be prioritized.

ACKNOWLEDGMENTS: None

CONFLICT OF INTEREST: None

FINANCIAL SUPPORT: None

ETHICS STATEMENT: None

References

Abdullah, N. A., Zulkifli, M. I., & Mohamed, A. S. (2025). Refinement of the 8th AJCC staging system for medullary thyroid cancer: Integrating tumor size and lymph node characteristics with SEER and multicenter validation. Archives of International Journal of Cancer and Allied Sciences, 5(2), 34–43. doi:10.51847/R1sIaONOms

Antão, L. H., Bates, A. E., Blowes, S. A., Waldock, C., Supp, S. R., Magurran, A. E., Dornelas, M., & Schipper, A. M. (2020). Temperature-related biodiversity change across temperate marine and terrestrial systems. Nature Ecology & Evolution, 4(7), 927–933.

Araújo, M. B., Anderson, R. P., Márcia Barbosa, A., Beale, C. M., Dormann, C. F., Early, R., Garcia, R. A., Guisan, A., Maiorano, L., Naimi, B., et al. (2019). Standards for distribution models in biodiversity assessments. Science Advances, 5(1), eaat4858.

Belfiore, C. I., Galofaro, V., Cotroneo, D., Lopis, A., Tringali, I., Denaro, V., & Casu, M. (2024). Studying the effect of mindfulness, dissociative experiences, and feelings of loneliness in predicting the tendency to use substances in nurses. Journal of Integrative Nursing and Palliative Care, 5, 1–7. doi:10.51847/LASijYayRi

Blowes, S. A., Supp, S. R., Antão, L. H., Bates, A., Bruelheide, H., Chase, J. M., Moyes, F., Magurran, A., McGill, B., Myers-Smith, I. H., et al. (2019). The geography of biodiversity change in marine and terrestrial assemblages. Science, 366(6463), 339–345.

Brooks, M. E., Kristensen, K., Van Benthem, K. J., Magnusson, A., Berg, C. W., Nielsen, A., Skaug, H. J., Mächler, M., & Bolker, B. M. (2017). glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. The R Journal, 9(2), 378–400.

Carita, A. J. Q., Cutipa, R. A., Vargas, J. C. J., Cueva, A. L., Figueroa, E. N. T., & Torres-Cruz, F. (2025). Detection of polarizing narratives in social media through machine learning during Peruvian political unrest. Journal of Organisational Behaviour Research, 10(4), 106–115. doi:10.51847/ePYLFVct7c

Carpio-Vargas, E. E., Ibarra-Cabrera, E. M., Ibarra, M. J., Choquejahua-Acero, R., Calderon-Vilca, H. D., & Torres-Cruz, F. (2023). Categorical stress predictors in higher education students amidst remote learning in COVID-19 pandemic. Journal of Advanced Pharmacy Education & Research, 13(2), 131–139. doi:10.51847/ImofrnDDZg

Clark, A., & Foster, H. (2025). Network pharmacology integration and experimental verification to elucidate the molecular mechanisms of triptolide in treating membranous nephropathy. Pharmaceutical Sciences and Drug Design, 5, 33–47. doi:10.51847/X9UVmVSJ4E

Csep, A. N., Voiţă-Mekereş, F., Tudoran, C., & Manole, F. (2024). Understanding and managing polypharmacy in the aging population. Annals of Pharmaceutical Practice and Pharmacotherapy, 4, 17–23. doi:10.51847/VdKr0egSln

Dornelas, M., Antão, L. H., Moyes, F., Bates, A. E., Magurran, A. E., Adam, D., Akhmetzhanova, A. A., Appeltans, W., Arcos, J. M., Arnold, H., et al. (2018). BioTIME: A database of biodiversity time series for the Anthropocene. Global Ecology and Biogeography, 27(7), 760–786.

Fick, S. E., & Hijmans, R. J. (2017). WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas. International Journal of Climatology, 37(12), 4302–4315.

Figueroa-Valverde, L., Marcela, R., Alvarez-Ramirez, M., Lopez-Ramos, M., Mateu-Armand, V., & Emilio, A. (2024). Statistical data from 1979 to 2022 on prostate cancer in populations of Northern and Central Mexico. Bulletin of Pioneer Research in Medical and Clinical Sciences, 4(1), 24–30. doi:10.51847/snclnafVdg

Freeman, B. G., Song, Y., Feeley, K. J., & Zhu, K. (2021). Montane species track rising temperatures better in the tropics than in the temperate zone. Ecology Letters, 24(8), 1697–1708.

García Criado, M., Myers-Smith, I. H., Bjorkman, A. D., Elmendorf, S. C., Normand, S., Aastrup, P., Aerts, R., Alatalo, J. M., Baeten, L., Björk, R. G., et al. (2025). Plant diversity dynamics over space and time in a warming Arctic. Nature, 642(8068), 653–661.

Ghiga, I., Pitchforth, E., Lundborg, C. S., & Machowska, A. (2024). Bacterial infections and antibiotic resistance in Romanian children: Insights from a hospital-based study. Interdisciplinary Research in Medical Sciences Special, 4(2), 1–8. doi:10.51847/pISlxaQJVu

Harris, I., Osborn, T. J., Jones, P., & Lister, D. (2020). Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset. Scientific Data, 7(1), 109.

Harrison, X. A., Donaldson, L., Correa-Cano, M. E., Evans, J., Fisher, D. N., Goodwin, C. E., Robinson, B. S., Hodgson, D. J., & Inger, R. (2018). A brief introduction to mixed effects modelling and multi-model inference in ecology. PeerJ, 6, e4794.

Huata-Panca, P., Apaza, J. M. H., Carita, A. J. Q., Mamani, G. Q., & Torres-Cruz, F. (2025). Determinants of mortality type in a high altitude Andean context using a multivariable logit regression model in Puno, Peru. Journal of Advanced Pharmacy Education & Research, 15(3), 198–204. doi:10.51847/1vvhNPv5Vy

Hudson, L. N., Newbold, T., Contu, S., Hill, S. L., Lysenko, I., De Palma, A., Phillips, H. R., Alhusseini, T. I., Bedford, F. E., Bennett, D. J., et al. (2017). The database of the PREDICTS (projecting responses of ecological diversity in changing terrestrial systems) project. Ecology and Evolution, 7(1), 145–188.

Jagsi, R., Lee, J., Roselin, D., Ira, K., & Williams, J. (2025). Do U.S. medical schools follow medical associations’ recommendations on paid parental leave for faculty? Annals of Pharmaceutical Education, Safety, Public Health Advocacy, 5, 1–11. doi:10.51847/r117In8wdi

Jin, L. W., Tahir, N. A. M., Islahudin, F., & Chuen, L. S. (2024). Exploring treatment adherence and quality of life among patients with transfusion-dependent thalassemia. Annals of Pharmaceutical Practice and Pharmacotherapy, 4, 8–16. doi:10.51847/B8R85qakUv

Johnston, A., Moran, N., Musgrove, A., Fink, D., & Baillie, S. R. (2020). Estimating species distributions from spatially biased citizen science data. Ecological Modelling, 422, 108927.

Joungtrakul, J., & Smith, I. D. (2025). Exploring the path from organizational justice to organizational citizenship behavior: Job commitment as a mediator. Annals of Organizational Culture, Leadership and External Engagement Journal, 6, 31–35. doi:10.51847/DBvez9u8O9

Karatas, K. S. (2024). First episode psychotic disorder and COVID-19: A case study. Bulletin of Pioneer Research in Medical and Clinical Sciences, 4(1), 19–23. doi:10.51847/VP5xOKglSX

Karger, D. N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R. W., Zimmermann, N. E., Linder, H. P., & Kessler, M. (2017). Climatologies at high resolution for the earth’s land surface areas. Scientific Data, 4, 1–20.

Kebe, I. A., Kahl, C., & Liu, Y. (2025). The role of transformational leadership in enhancing employee performance: A study of the Vietnamese banking industry. Annals of Organizational Culture, Leadership and External Engagement Journal, 6, 21–30. doi:10.51847/g7jtt7Qgxk

Kęska, M., & Suchy, W. (2024). Cardiovascular risk and systemic inflammation in rheumatoid arthritis: A comparative analysis with psoriatic arthritis. Journal of Medical Sciences Interdisciplinary Research, 4(2), 30–40. doi:10.51847/PvcqitKMgB

Kounatidis, D., Dalamaga, M., Grivakou, E., Karampela, I., Koufopoulos, P., Dalopoulos, V., Adamidis, N., Mylona, E., Kaziani, A., & Valliano, N. G. (2024). Evaluation of blood-aqueous barrier permeability in response to tetracycline antibiotics under normal and pathological conditions. Interdisciplinary Research in Medical Sciences Special, 4(2), 9–17. doi:10.51847/wu4fOEjgDv

Leclerc, C., Frossard, V., Sharaf, N., Bazin, S., Bruel, R., & Sentis, A. (2025). Climate impacts on lake food-webs are mediated by biological invasions. Global Change Biology, 31(3), e70144.

Lee, M. J., & Ferreira, J. (2024). COVID-19 and children as an afterthought: Establishing an ethical framework for pandemic policy that includes children. Asian Journal of Ethics in Health and Medicine, 4, 1–19. doi:10.51847/haLKYCQorD

Lüdecke, D., Ben-Shachar, M. S., Patil, I., Waggoner, P., & Makowski, D. (2021). performance: An R package for assessment, comparison and testing of statistical models. Journal of Open Source Software, 6(60).

Muñoz-Sabater, J., Dutra, E., Agustí-Panareda, A., Albergel, C., Arduini, G., Balsamo, G., Boussetta, S., Choulga, M., Harrigan, S., Hersbach, H., et al. (2021). ERA5-Land: A state-of-the-art global reanalysis dataset for land applications. Earth System Science Data, 13(9), 4349–4383.

Musa, K., Noor, O., Ibrahim, M., & Saleh, A. (2025). A validated whole-body PBPK model of dextromethorphan and its metabolites for genotype-based prediction of CYP2D6 phenotype and urinary metabolic ratio. Special Journal of Pharmacognosy, Phytochemistry and Biotechnology, 5, 50–76. doi:10.51847/xbESBJHHcx

Nakagawa, S., Johnson, P. C., & Schielzeth, H. (2017). The coefficient of determination R² and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded. Journal of the Royal Society Interface, 14(134), 20170213.

Negreiros, A. B., & Ory, M. G. (2024). Navigating uncertain outcomes: Returning genomic results in children with developmental delays. Asian Journal of Ethics in Health and Medicine, 4, 20–27. doi:10.51847/grOfZd8oyo

Newbold, T. (2018). Future effects of climate and land-use change on terrestrial vertebrate community diversity under different scenarios. Proceedings of the Royal Society B: Biological Sciences, 285(1881).

Njoroge, E., & Odhiambo, S. (2025). Elucidating the therapeutic mechanisms of Agrimonia pilosa Ledeb. extract for acute myocardial infarction via network pharmacology and experimental validation. Pharmaceutical Sciences and Drug Design, 5, 48–63. doi:10.51847/eZOWCUj80m

Noor, H., Sabău, D., Coțe, A., Mihetiu, A. F., Pirvut, V., Mălinescu, B., & Bratu, D. G. (2024). Advancements in esophageal stricture treatment: The role of stents in benign and malignant conditions. Journal of Medical Sciences and Interdisciplinary Research, 4(2), 47–52. doi:10.51847/LtuxAzRl0M

Norberg, A., Abrego, N., Blanchet, F. G., Adler, F. R., Anderson, B. J., Anttila, J., Araújo, M. B., Dallas, T., Dunson, D., Elith, J., et al. (2019). A comprehensive evaluation of predictive performance of 33 species distribution models at species and community levels. Ecological Monographs, 89(3), e01370.

Osluf, A. S. H., Shoukeer, M., & Almarzoog, N. A. (2024). Case report on persistent fetal vasculature accompanied by congenital hydrocephalus. Asian Journal of Current Research in Clinical Cancer, 4(1), 25–30. doi:10.51847/0gjOEudJNr

Outhwaite, C. L., McCann, P., & Newbold, T. (2022). Agriculture and climate change are reshaping insect biodiversity worldwide. Nature, 605(7908), 97–102.

Pecl, G. T., Araújo, M. B., Bell, J. D., Blanchard, J., Bonebrake, T. C., Chen, I. C., Clark, T. D., Colwell, R. K., Danielsen, F., Evengård, B., et al. (2017). Biodiversity redistribution under climate change: Impacts on ecosystems and human well-being. Science, 355(6332), eaai9214.

Petronis, Z., Golubevas, R., Rokicki, J. P., Guzeviciene, V., Sakavicius, D., & Lukosiunas, A. (2025). A systematic review and meta-analysis on trigeminal neuralgia linked to neurovascular compression using MRI analysis. Journal of Current Research in Oral Surgery, 5, 17–24. doi:10.51847/sptZWIrWeo

Raza, S., Khan, A., Mehmood, F., & Farooq, U. (2025). Nationwide implementation of essential pharmacogenomic testing in the Netherlands: A decision-analytic model of lives saved and cost-effectiveness. Special Journal of Pharmacognosy, Phytochemistry and Biotechnology, 5, 39–49. doi:10.51847/PUWEymkYkk

Roberts, D. R., Bahn, V., Ciuti, S., Boyce, M. S., Elith, J., Guillera-Arroita, G., Hauenstein, S., Lahoz-Monfort, J. J., Schröder, B., Thuiller, W., et al. (2017). Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography, 40(8), 913–929.

Román-Palacios, C., & Wiens, J. J. (2020). Recent responses to climate change reveal the drivers of species extinction and survival. Proceedings of the National Academy of Sciences, 117(8), 4211–4217.

Rypel, J., Kubacka, P., Mykała-Cieśla, J., Pająk, J., Bulska-Będkowska, W., & Chudek, J. (2024). Case presentation of breast adenoid cystic carcinoma. Asian Journal of Current Research in Clinical Cancer, 4(1), 18–24. doi:10.51847/6eOqq2KFjp

Schneider, T. L., & Krüger, B. E. (2025). Breast cancer-specific mortality in stage IV patients with small tumors: Insights from a population-based cohort. Archives of International Journal of Cancer and Allied Sciences, 5(2), 1–12. doi:10.51847/b9vFcweAVg

Torres-Cruz, F., Pari-Condori, E. Y., Tumi-Figueroa, E. N., Coyla-Idme, L., Tito-Lipa, J., Gonzalez, L. A., & Tumi-Figueroa, A. (2025). Prediction of university dropouts through random forest-based models. Journal of Advanced Pharmacy Education & Research, 15(1), 78–83. doi:10.51847/PFb18QB60j

Trisos, C. H., Merow, C., & Pigot, A. L. (2020). The projected timing of abrupt ecological disruption from climate change. Nature, 580(7804), 496–501.

Valavi, R., Elith, J., Lahoz-Monfort, J. J., & Guillera-Arroita, G. (2018). blockCV: An R package for generating spatially or environmentally separated folds for k-fold cross-validation of species distribution models. Methods in Ecology and Evolution, 10(2), 225–232.

Warren, R., Price, J., Graham, E., Forstenhaeusler, N., & VanDerWal, J. (2018). The projected effect on insects, vertebrates, and plants of limiting global warming to 1.5°C rather than 2°C. Science, 360(6390), 791–795.

Warton, D. I. (2018). Why you cannot transform your way out of trouble for small counts. Biometrics, 74(1), 362–368.

Wiens, J. J., & Zelinka, J. (2024). How many species will Earth lose to climate change? Global Change Biology, 30(1), e17125.

Wolderslund, M., Kofoed, P., & Ammentorp, J. (2024). Investigating the effectiveness of communication skills training on nurses' self-efficacy and quality of care. Journal of Integrative Nursing and Palliative Care, 5, 14–20. doi:10.51847/55M0sHLo3Z

Wong, Y., Lin, S., Cheng, H., Hsieh, T., Hsiue, T., Chung, H., Tsai, M., & Wang, M. (2025). Understanding the impact of medical humanities on internship training and performance. Annals of Pharmacy Education, Safety, and Public Health Advocacy, 5, 12–21. doi:10.51847/Z1fogzPksy

Yu, M., Ma, Y., Han, F., & Gao, X. (2025). Effectiveness of mandibular advancement splint in treating obstructive sleep apnea: A systematic review. Journal of Current Research in Oral Surgery, 5, 25–32. doi:10.51847/AInSXrD9rc

Zurell, D., Franklin, J., König, C., Bouchet, P. J., Dormann, C. F., Elith, J., Fandos, G., Feng, X., Guillera-Arroita, G., Guisan, A., et al. (2020). A standard protocol for reporting species distribution models. Ecography, 43(9), 1261–1277.

 


How to cite this article
Vancouver
Flores BC, Idme LC, Altamirano EBR, Quispe VI, Acero RC, Mamani GQ, et al. Statistical Assessment of Climate Change Effects on Biodiversity Using Generalized Linear Models. World J Environ Biosci. 2026;15(2):17-26. https://doi.org/10.51847/XJoWyaN9M2
APA
Flores, B. C., Idme, L. C., Altamirano, E. B. R., Quispe, V. I., Acero, R. C., Mamani, G. Q., & Panca, P. H. (2026). Statistical Assessment of Climate Change Effects on Biodiversity Using Generalized Linear Models. World Journal of Environmental Biosciences, 15(2), 17-26. https://doi.org/10.51847/XJoWyaN9M2
Copyright © 2026 World Journal of Environmental Biosciences. Authors retain copyright of their article if they are accepted for publication.
This work is licensed under a Creative Commons Attribution 4.0 International License.