library(rethinking) # bayesian hierarchical modelling: https://github.com/rmcelreath/rethinking/tree/Experimental
library(tidyverse)
In this model the response is range size (range
), the predictor is species’ trajectory (status
). Each coefficient of status
(i.e., mean range size for persisting, colonizing and extinct species) is allowed to vary by each study.
# clean list with data that are used in the model
dat_list <- list(
status = as.integer( relevel(d$status, ref="persisting") ),
study = as.integer(d$studyID),
range = d$range_normal
)
model0<- ulam(
alist(
range ~ dnorm(mu, sigma) , # likelihood of the data
mu <- b[status] + a[study, status] , # linear model
# adaptive prior
vector[3]:a[study] ~ multi_normal(0, Rho, sigma_a),
# fixed priors
b[status] ~ dnorm(0, 1.5),
sigma_a ~ dexp(1) ,
sigma ~ dexp(1),
# prior for correlation
Rho ~ lkj_corr(2)
), data=dat_list, iter=2000, chains=4 , cores=4
)
In the top level, the outcome is range
. The variable was transformed and is normal distributed with a mean, mu
, and standard deviation, sigma
). We make mu
into a linear function containing a coefficient for each trajectory, b[status]
, and an effect of each trajectory in each study a[study, status]
. The next part of the model is the adaptive prior. The multivariate Gaussian prior is 3-dimensional, because there are 3 possible trajectories (persisting, colonizing, extinct). There are no means in this priors, because we already placed the average trajectory effects, b[status]
, in the linear model. The covariance matrix of this prior needs two parameters, sigma_a
and Rho
. The next part of the model are the fixed priors. We define Normal priors for b[status]
. The range of the normalized range sizes is -3 to +3, so if we set the standard deviation of the b[status]
prior to half of 3, we are saying that we expect 95% of plausible coefficients to be within the -3 to +3 interval. For sigma
from the Normal distribution of the outcome variable and sigma_a
from the covariance matrix of the multivariate Gaussian prior, we take broad exponential priors with \(\lambda=1\). This constrains the sigmas to be zero or positive. The prior for Rho
uses the LKJ correlation distribution. lkj_corr(2)
is a weakly informative prior on the correlation between coefficients that is skeptical of extreme correlations near -1 or +1.
To produce more effective samples for each parameter we re-parametrized the model to a non-centered version.
model00 <- ulam(
alist(
range ~ dnorm(mu, sigma) ,
mu <- b[status] + a[study, status] ,
# adaptive priors - non-centered
transpars> matrix[study,3]:a <-
compose_noncentered( sigma_a , L_Rho_a , z_a ),
matrix[3,study]:z_a ~ normal( 0 , 1 ),
# fixed priors
b[status] ~ dnorm(0, 1.5),
vector[3]:sigma_a ~ dexp(1),
cholesky_factor_corr[3]:L_Rho_a ~ lkj_corr_cholesky( 2 ),
sigma ~ dexp(1),
# compute ordinary correlation matrices from Cholesky factors
gq> matrix[3,3]:Rho_a <<- multiply_lower_tri_self_transpose(L_Rho_a)
), data=dat_list, iter=2000, chains=4 , cores=4
)
We are interested in the difference between the estimated range size of persisting, colonizing and extinct species.
post <- extract.samples(model00)
diffs <- list( percol = post$b[,1] - post$b[,2],
perext = post$b[,1] - post$b[,3],
colext = post$b[,2] - post$b[,3] )
# omit the species that have not been present at the baseline survey: drop colonizing species
d%>%
filter(!status=="colonizing")%>%
droplevels() -> d_ep
# convert status into dummy variable
d_ep$extinct <- ifelse(d_ep$status=="extinct", 1, 0)
Empty model (one cluster type): We start by fitting an empty (no predictor) multilevel model. For this model, there are two cluster types in our data that are relevant. One cluster type is study, the other is species. First, we include only one cluster type (study) and then add species as a crossed varying effect. Finally, we compare the two models.
# clean list with data that are used in the model
dat_list <- list(
extinct = as.integer(d_ep$extinct), # 1/0 indicator for extinct
study = as.integer(d_ep$studyID) # study id
)
# model
model0.1 <- ulam(
alist(
# first level:
extinct ~ dbinom(1, p),
logit(p) <- a[study],
# second level:
# adaptive priors
a[study] ~ dnorm( a_bar , sigma_a ),
# hyper-priors
a_bar ~ dnorm( 0 , 2 ), # prior for mean study
sigma_a ~ dexp(1) # prior for standard deviation of studies
), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)
In the top level, the outcome is extinct
. This variable can take the values 0 and 1 and is Bernoulli distributed. The logit link maps the parameter p
, which is only defined from 0 to 1 onto a linear model. In this model the parameter is the vector a[study]
. The prior for these intercepts is a[study]~Normal(a_bar; sigma_a)
(second level).
The parameters for this prior are a_bar
and sigma_a
, and they have priors again. The relatively concetrated prior a_bar~Normal(0; 2)
produces a reasonably broad prior distribution on the outcome scale. Because sigma can only be positive we take a broad exponential prior with \(\lambda=1\), which constrains sigma to be zero or positive.
Empty model (two cluster types): The next model includes species as an additional crossed varying effect. This allows us to use partial pooling on both categorical variables, study and species, at the same time.
To add the second cluster type, species, we replicate the structure for the study cluster. This means the linear model gets yet another varying intercept, g[species]
, and the model gets another adaptive prior and yet another standard deviation parameter that adapts the amount of pooling across units.
# clean list with data that are used in the model
dat_list <- list(
extinct = as.integer(d_ep$extinct),
study = as.integer(d_ep$studyID),
species = as.integer( as.factor(d_ep$species) ) # species id
)
# model
model0.2 <- ulam(
alist(
extinct ~ dbinom(1, p),
# first level
logit(p) <- a_bar + a[study] + g[species],
a_bar ~ dnorm( 0 , 2 ), # prior for global mean
# second level
# adaptive priors
a[study] ~ dnorm( 0 , sigma_a ),
g[species] ~ dnorm( 0 , sigma_g ),
# hyper-priors
a_bar ~ dnorm( 0 , 2 ), # prior for global mean
sigma_a ~ dexp(1), # prior for standard deviation of studies
sigma_g ~ dexp(1) # prior for standard deviation of species
), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)
We compare model0.1
to the model with both clusters, model0.2
. We focus on a criterion that is more general than both AIC and DIC, WAIC. The Widely Applicable Information Criterion (WAIC) makes no assumption about the shape of the posterior. It provides an approximation of the out-of-sample deviance.
mcomparison <- compare(model0.1, model0.2)
Predictor model (local occupancy) without varying slopes:
We add local occupancy, initf
, to model0.2
.
# clean list with data that are used in the model
dat_list <- list(
extinct = as.integer(d_ep$extinct),
study = as.integer(d_ep$studyID),
initf = standardize(d_ep$B) # standardize species' local occupancy at the baseline survey
)
# model
model1.1 <- ulam(
alist(
extinct ~ dbinom(1, p),
# first level
logit(p) <- a_bar + a[study] + g[species] + b_f * initf,
a_bar ~ dnorm( 0, 2 ), # prior for global mean
b_f ~ dnorm( 0 , 2), # prior for the effect (slope) of initf
# second level
# adaptive priors
a[study] ~ dnorm( 0 , sigma_a ),
g[species] ~ dnorm( 0 , sigma_g ),
# hyper-priors
a_bar ~ dnorm( 0 , 2),
sigma_a ~ dexp(1),
sigma_g ~ dexp(1)
), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)
Predictor model (local occupancy) with varying slopes:
We let the effect of local occupancy vary across localities (studies). We have to model the joint population of intercepts and slopes, which means modeling their covariance. In conventional multilevel models, this is made possible by a joint multivariate Gaussian distribution for all of the varying effects, both intercepts and slopes.
Instead of having two independent Gaussian distributions of intercepts and of slopes, we assign a two-dimensional Gaussian distribution to both the intercepts (first dimension) and the slopes (second dimension). That means each study has an intercept, a[study]
and slope b[study]
with a prior distribution defined by the two-dimensional Gaussian distribution with means a_bar
and b_bar
and covariance matrix S. The covariance matrix needs three parameters, the intercept standard deviation, the slope standard deviation (vector sigma_dt
) and a correlation matrix with the correlation between intercept and slopes Rho
.
# clean list with data that are used in the model
dat_list <- list(
extinct = as.integer(d_ep$extinct),
study = as.integer(d_ep$studyID),
initf = standardize(d_ep$B)
)
# model
model1.2 <- ulam(
alist(
extinct ~ dbinom(1, p),
# first level
logit(p) <- a[study] + g[species] + b[study] * initf,
# second level
# adaptive priors
c(a, b)[study] ~ multi_normal(c(a_bar, b_bar), Rho, sigma_dt),
g[species] ~ dnorm( 0 , sigma_g ),
# fixed priors
# prior for a_bar (average intercept) and b_bar (average slope)
c(a_bar, b_bar) ~ dnorm(0, 2),
# prior for standard deviation of intercepts and slopes of studies
sigma_dt ~ dexp(1),
# prior for std. dev of intercept of species
sigma_g ~ dexp(1),
# prior for the correlation matrix
Rho ~ lkj_corr(2)
), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)
The model above shows a varying slope logit model for the relationship between the probability of going extinct and local occupancy. The prior for Rho
, lkj_corr(2)
is a weakly informative prior on the correlation between intercepts and slopes that is skeptical of extreme correlations near -1 or 1.
We compare the varying slopes model to the varying intercept only model.
mcomparison <- compare(model1.2, model1.1)
Model1.2 with range size, without varying slopes:
We add range size, range
, to model1.2
.
# clean list with data that are used in the model
dat_list <- list(
extinct = as.integer(d_ep$extinct),
study = as.integer(d_ep$studyID),
species = as.integer( as.factor(d_ep$species) ),
initf = standardize(d_ep$B),
range = standardize(d_ep$range_mid) # standardized range size estimated at a resolution of 10.7 km2 (mid-resolution)
)
# model
model1.3 <- ulam(
alist(
extinct ~ dbinom(1, p),
# first level
logit(p) <- a[study] + g[species] + b[study] * initf + b_r * range,
# prior for slope of range size
b_r ~ dnorm(0, 2),
# second level
# adaptive priors
c(a, b)[study] ~ multi_normal(c(a_bar, b_bar), Rho, sigma_dt),
g[species] ~ dnorm( 0 , sigma_g ),
# fixed priors
# prior for a_bar (average intercept) and b_bar (average slope)
c(a_bar, b_bar) ~ dnorm(0, 2),
# prior for standard deviation of intercepts and slopes of studies
sigma_dt ~ dexp(1),
# prior for std. dev of intercept of species
sigma_g ~ dexp(1),
# prior for the correlation matrix
Rho ~ lkj_corr(2)
), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)
Model1.2 with range size, with varying slopes: The effect of range size might also vary between studies, we include varying slopes for range size.
# clean list with data that are used in the model
dat_list <- list(
extinct = as.integer(d_ep$extinct),
study = as.integer(d_ep$studyID),
species = as.integer( as.factor(d_ep$species) ),
initf = standardize(d_ep$B),
range = standardize(d_ep$range_mid)
)
# model
model1.4 <- ulam(
alist(
extinct ~ dbinom(1, p),
# first level
logit(p) <- a[study] + g[species] + b_f[study] * initf + b_r[study] * range,
# second level
# adaptive priors
c(a, b_f, b_r)[study] ~ multi_normal(c(a_bar, b_f_bar, b_r_bar), Rho, sigma_dt),
g[species] ~ dnorm( 0 , sigma_g ),
# fixed priors
# prior for a_bar (average intercept) and b_bar (average slope)
c(a_bar, b_f_bar, b_r_bar) ~ dnorm(0, 2) ,
# prior for standard deviation of studies
sigma_dt ~ dexp(1) ,
sigma_g ~ dexp(1) ,
# prior for correlation
Rho ~ lkj_corr(2)
), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)
We compare the model with varying slopes on range size, model1.4
to the model without model1.3
.
mcomparison <- compare(model1.3, model1.4)
We compare the effect of range sizem b_r
, in the model that controls for local occupancy, model1.3
, with the effect in a model that does not control for local occupancy.
# clean list with data that are used in the model
dat_list <- list(
extinct = as.integer(d_ep$extinct),
study = as.integer(d_ep$studyID),
species = as.integer( as.factor(d_ep$species) ),
range = standardize(d_ep$range_mid)
)
# model
model1.5 <- ulam(
alist(
extinct ~ dbinom(1, p),
logit(p) <- a_bar + a[study] + g[species] + b_r * range,
a_bar ~ dnorm(0 , 2 ), # prior for global mean
b_r ~ dnorm( 0 , 2 ), # prior for effect (slope) of initf
# adaptive priors
a[study] ~ dnorm( 0 , sigma_a),
g[species] ~ dnorm( 0 , sigma_g ),
# hyper-priors
sigma_a ~ dexp(1), # prior for standard deviation of studys
sigma_g ~ dexp(1) # prior for sta
), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)
So far we have considered the effects of species level explanatory variables. Now we add study-level explanatory variables. This allows us to explain variation among studies (localities) in average extinction probability across species. Additionally, we can use these variables to ask whether the effect of range size varies with any contextual effect.
Model of average extinction probability across species with inter-census nitrogen deposition, and other covariates as study-level predictors:
# clean list with data that are used in the model
dat_list <- list(
extinct = as.integer(d_ep$extinct),
dataset = as.integer( as.factor(d_ep$studyID) ),
species = as.integer( as.factor(d_ep$species) ),
initf = standardize(d_ep$B),
range = standardize(d_ep$range_mid),
ndepo = standardize(d_ep$n_delta_r), # inter-census n-deposition
survint = standardize(d_ep$survey_interval), # time between surveys
nplot = standardize(d_ep$n_plots), # number of plots
splot = standardize(d_ep$plot_size), # size of plots
area = standardize(d_ep$site_area), # size of study area
lati = standardize(d_ep$latitude) # latitude
)
model2.1 <- ulam
alist(
extinct ~ dbinom(1, p) ,
logit(p) <-
a[study] +
g[species] +
b[study] * initf +
b_r*range +
# contextual variables
b_nd*ndepo + # n-deposition
b_si*survint + # time between surveys
b_np*nplot + # number of plots
b_sp*splot + # size of plots
b_ar*area + # size of study area
b_la*lati , # latitude
c(b_r, b_nd, b_si , b_np, b_sp, b_ar, b_la) ~ dnorm(0, 2), # priors for slopes
# second level
# adpative priors
c(a, b)[study] ~ multi_normal(c(a_bar, b_bar), Rho, sigma_dt) ,
g[species] ~ dnorm( 0 , sigma_g ),
# fixed priors
# prior for a_bar (average intercept) and b_bar (average slope)
c(a_bar, b_bar) ~ dnorm(0, 2) ,
# prior for standard deviation of intercept and slope of study
sigma_dt ~ dexp(1),
# prior for std. dev of intercept of species
sigma_g ~ dexp(1),
# prior for the correlation matrix
Rho ~ lkj_corr(2)
), data=dat_list , iter=2000 , chains=4, cores=4
)
Model for species’ extinction probability with the interaction effect between inter-census nitrogen deposition and species’ range size:
model2.2 <- ulam(
alist(
extinct ~ dbinom(1, p) ,
logit(p) <-
a[study] +
g[species] +
b[study] * initf +
b_r*range +
# contextual variables
b_nd*ndepo + # n-deposition
b_si*survint + # time between surveys
b_np*nplot + # number of plots
b_sp*splot + # size of plots
b_ar*area + # size of study area
b_la*lati + # latitude
# crosslevel interaction
b_nd_r*ndepo*range,
c(b_r, b_nd, b_si , b_np, b_sp, b_ar, b_la, b_nd_r) ~ dnorm(0, 2), # priors for slopes
# second level
# adpative priors
c(a, b)[study] ~ multi_normal(c(a_bar, b_bar), Rho, sigma_dt) ,
g[species] ~ dnorm( 0 , sigma_g ),
# fixed priors
# prior for a_bar (average intercept) and b_bar (average slope)
c(a_bar, b_bar) ~ dnorm(0, 2) ,
# prior for standard deviation of intercept and slope of study
sigma_dt ~ dexp(1),
# prior for std. dev of intercept of species
sigma_g ~ dexp(1),
# prior for the correlation matrix
Rho ~ lkj_corr(2)
), data=dat_list , iter=2000 , chains=4, cores=4
)
The following models are for the change in species number (delta_sr
) only. The models remain however the same for the other response variable, percentage point change in alien (alien_dp
).
# prepare a data.frame with all relevant variables
dat<- unique( data.frame(studyID = d$studyID,
delta_sr = d$delta_sr, # change in species number
alien_dp = d$alien_dp, # percentage point change in alien species
meann_ext = d$meann_ext, # mean N value of extinct species
meann = d$delta_meann, # change in community mean N value
meann_b = d$meann_b, # community mean N value baseline
meann_r = d$meann_r, # community mean N value resurvey
ndepo = d$n_delta_r, # n-deposition
survint = d$survey_interval, # time between surveys
nplot = d$n_plots, # number of plots
splot = d_ep$plot_size, # size of plots
area = d$site_area, # size of study area
lati = d$latitude ) # latitude
We start by modelling the change in species number from baseline survey to resurvey with inter-census nitrogen deposition, survey interval and number of plots as predictors (Supplementary Table 5).
# clean data list
dat_list <- list(
delta_sr = standardize(dat$delta_sr),
ndepo = standardize(dat$ndepo),
survint = standardize(dat$survint),
nplot = standardize(dat$nplot),
splot = standardize(dat$splot),
area = standardize(dat$site_area),
lati = standardize(dat$lati),
)
# model
model_sr <- ulam(
alist(
delta_sr ~ dnorm(mu, sigma),
mu <- a + b*ndepo + b1*survint + b2*nplot + b3*area + b4*splot + b5*lati,
a ~ dnorm(0, 1),
c(b,b1,b2,b3,b4,b5) ~ dnorm(0, 1),
sigma ~ dexp(1)
), data=dat_list , iter=2000 , chains=4, cores=4
)
We want to visualize the effect of ndepo
with a predictor residual response plot. This plot displays the linear relationship between change in species number and inter-census nitrogen deposition, having statistically controlled for number of plots and survey interval, with the actual data. In order to achieve that, we model the predictor of interest ndepo
as a function of the other two predictors survint
and nplot
. Then we calculate residuals by substracting the predicted mean values from this model from the observed values for ndepo
. Then in a second model, we regress the change in species number on these just calculated residuals.
# clean data list
dat_list <- list(
nplot = standardize(dat$nplot),
splot = standardize(dat$splot),
area = standardize(dat$site_area),
lati = standardize(dat$lati),
)
# predictor residual model
model_res <- ulam(
alist(
ndepo ~ dnorm(mu, sigma),
mu <- a + b1*survint + b2*nplot + b3*area + b4*splot + b5*lati,
a ~ dnorm(0, 1),
c(b1,b2,b3) ~ dnorm(0, 1),
sigma ~ dexp(1)
), data=dat_list , iter=2000 , chains=4, cores=4
)
# calculate residuals
mu <- link(model_res)
mu_mean <- apply( mu , 2 , mean )
mu_resid <- dat_list$ndepo - mu_mean
# clean data list
dat_list <- list(
delta_sr = standardize(dat$delta_sr),
resids = mu_resid)
# predictor residual response model
model_sr_res <- ulam(
alist(
delta_sr ~ dnorm(mu, sigma),
mu <- a + b*resids,
a ~ dnorm(0, 1),
b ~ dnorm(0, 1),
sigma ~ dexp(1)
), data=dat_list , iter=2000 , chains=4, cores=4,
)
For plotting, we make a a scatterplot of species richness change and inter-census nitrogen deposition residuals and overlay this with the linear regression of the two variables (predictor residual response model). Because now we only have 68 datapoints (number of studies), it is important to visualize if there are any highly influential data points that give leverage and diffuse a general trend. Therefore we will calculate each data point’s pareto k value and scale the point size with these values.
k <- LOOPk(model_sr_res)
We tested whether community composition shifts towards more N-demanding species with higher N-deposition by regressing 1) the average N-demand of extinct species (meann_ext
) and 2) and the change in mean N-demand of the entire community (meann
) against N-deposition.
dat_list <- list(meann_ext = standardize(dat$meann_ext),
meann = standardize(dat$delta_meann),
ndepo = standardize(dat$n_delta_r))
model_meann<-ulam(
alist(
meann_ext ~ dnorm(mu, sigma), # replace meann_ext with meann for second model
mu <- a + b*ndepo,
a ~ dnorm(0, 1),
b ~ dnorm(0, 1),
sigma ~ dexp(1)
), data=dat_list , iter=2000 , chains=4, cores=4, log_lik = T
)
To test whether variances between the baseline survey and resurvey differ, we estimate two sigmas. That is we allow sigma
to vary between surveys (group
). We do this using a bayesian estimation of the t-test with an unequal variances model.
# clean data list
dat_list <- list(
meann = c(dat$meann_b, dat$meann_r), # mean n-number, proxy for nutrient availability
group = c(rep(1, nrow(dat)), rep(2, nrow(dat)))) # baseline / resurvey
# model
model_var <- ulam(
alist(meann ~ dnorm(mu, sigma[group]) ,
mu <- a + b*group ,
a ~ dnorm(5, 2) , # the mean for group 1, the baseline survey
b ~ dnorm(0, .5) , # the slope (difference between group 1 and 2)
sigma[group] ~ dexp(1) # prior for sigmas
), data = dat_list, iter = 2000 , chains=4, cores=4
)
To calculate the difference between these sigmas, we sample their posterior distributions.
# sample posterior distribution
post <- extract.samples(model_var)
# calculate the difference (square for obtaining variance instead of std. dev.)
diff_sigma <- post$sigma[,2]^2 - post$sigma[,1]^2