Preamble

library(rethinking)  # bayesian hierarchical modelling: https://github.com/rmcelreath/rethinking/tree/Experimental
library(tidyverse)

R code for range size comparison (Fig. 2a and Supplementary Table 2).

In this model the response is range size (range), the predictor is species’ trajectory (status). Each coefficient of status (i.e., mean range size for persisting, colonizing and extinct species) is allowed to vary by each study.

# clean list with data that are used in the model
dat_list <- list(
  status  = as.integer( relevel(d$status, ref="persisting") ),
  study   = as.integer(d$studyID),
  range   = d$range_normal
)


model0<- ulam(
  alist(
    range ~ dnorm(mu, sigma) ,               # likelihood of the data
    mu <- b[status] + a[study, status] ,     # linear model
    
    # adaptive prior
    vector[3]:a[study] ~ multi_normal(0, Rho, sigma_a),
    
    # fixed priors
    b[status] ~ dnorm(0, 1.5),
    sigma_a ~ dexp(1) ,
    sigma ~ dexp(1),
    
    # prior for correlation
    Rho ~ lkj_corr(2)
    
  ), data=dat_list, iter=2000, chains=4 , cores=4
)

In the top level, the outcome is range. The variable was transformed and is normal distributed with a mean, mu, and standard deviation, sigma). We make mu into a linear function containing a coefficient for each trajectory, b[status], and an effect of each trajectory in each study a[study, status]. The next part of the model is the adaptive prior. The multivariate Gaussian prior is 3-dimensional, because there are 3 possible trajectories (persisting, colonizing, extinct). There are no means in this priors, because we already placed the average trajectory effects, b[status], in the linear model. The covariance matrix of this prior needs two parameters, sigma_a and Rho. The next part of the model are the fixed priors. We define Normal priors for b[status]. The range of the normalized range sizes is -3 to +3, so if we set the standard deviation of the b[status] prior to half of 3, we are saying that we expect 95% of plausible coefficients to be within the -3 to +3 interval. For sigma from the Normal distribution of the outcome variable and sigma_a from the covariance matrix of the multivariate Gaussian prior, we take broad exponential priors with \(\lambda=1\). This constrains the sigmas to be zero or positive. The prior for Rho uses the LKJ correlation distribution. lkj_corr(2) is a weakly informative prior on the correlation between coefficients that is skeptical of extreme correlations near -1 or +1.

To produce more effective samples for each parameter we re-parametrized the model to a non-centered version.

model00 <- ulam(
  alist(
    range ~ dnorm(mu, sigma) ,
    mu <- b[status] + a[study, status] ,
    
    # adaptive priors - non-centered
    transpars> matrix[study,3]:a <- 
      compose_noncentered( sigma_a , L_Rho_a , z_a ),
    matrix[3,study]:z_a ~ normal( 0 , 1 ),
    
    # fixed priors
    b[status] ~ dnorm(0, 1.5),
    vector[3]:sigma_a ~ dexp(1),
    cholesky_factor_corr[3]:L_Rho_a ~ lkj_corr_cholesky( 2 ),
    sigma ~ dexp(1),
    
    # compute ordinary correlation matrices from Cholesky factors
    gq> matrix[3,3]:Rho_a <<- multiply_lower_tri_self_transpose(L_Rho_a)
  
  ), data=dat_list, iter=2000, chains=4 , cores=4
)

We are interested in the difference between the estimated range size of persisting, colonizing and extinct species.

post <- extract.samples(model00)
diffs <- list( percol = post$b[,1] - post$b[,2],
               perext = post$b[,1] - post$b[,3],
               colext = post$b[,2] - post$b[,3] )

R code for the effect of abundance on species’ extinction probability (Fig. 2b and Supplementary Table 3).

# omit the species that have not been present at the baseline survey: drop colonizing species
d%>%
  filter(!status=="colonizing")%>% 
  droplevels() -> d_ep

# convert status into dummy variable
d_ep$extinct <- ifelse(d_ep$status=="extinct", 1, 0)

Empty model (one cluster type): We start by fitting an empty (no predictor) multilevel model. For this model, there are two cluster types in our data that are relevant. One cluster type is study, the other is species. First, we include only one cluster type (study) and then add species as a crossed varying effect. Finally, we compare the two models.

# clean list with data that are used in the model
dat_list <- list(
  extinct = as.integer(d_ep$extinct),      # 1/0 indicator for extinct
  study   = as.integer(d_ep$studyID)       # study id
)


# model
model0.1 <- ulam(
  alist(
    # first level:
    extinct ~ dbinom(1, p),
    logit(p) <- a[study],
    
    # second level:
    # adaptive priors
    a[study] ~ dnorm( a_bar , sigma_a ),
    
    # hyper-priors
    a_bar ~ dnorm( 0 , 2 ),   # prior for mean study
    sigma_a ~ dexp(1)         # prior for standard deviation of studies
    
  ), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)

In the top level, the outcome is extinct. This variable can take the values 0 and 1 and is Bernoulli distributed. The logit link maps the parameter p, which is only defined from 0 to 1 onto a linear model. In this model the parameter is the vector a[study]. The prior for these intercepts is a[study]~Normal(a_bar; sigma_a) (second level). The parameters for this prior are a_bar and sigma_a, and they have priors again. The relatively concetrated prior a_bar~Normal(0; 2) produces a reasonably broad prior distribution on the outcome scale. Because sigma can only be positive we take a broad exponential prior with \(\lambda=1\), which constrains sigma to be zero or positive.

Empty model (two cluster types): The next model includes species as an additional crossed varying effect. This allows us to use partial pooling on both categorical variables, study and species, at the same time.

To add the second cluster type, species, we replicate the structure for the study cluster. This means the linear model gets yet another varying intercept, g[species], and the model gets another adaptive prior and yet another standard deviation parameter that adapts the amount of pooling across units.

# clean list with data that are used in the model
dat_list <- list(
  extinct = as.integer(d_ep$extinct),
  study   = as.integer(d_ep$studyID),
  species = as.integer( as.factor(d_ep$species) )  # species id
)


# model
model0.2 <- ulam(
  alist(
    extinct ~ dbinom(1, p),
    # first level
    logit(p) <- a_bar + a[study] + g[species],
    a_bar ~ dnorm( 0 , 2 ), # prior for global mean
    
    # second level
    # adaptive priors
    a[study] ~ dnorm( 0 , sigma_a ),
    g[species] ~ dnorm( 0 , sigma_g ),
    
    # hyper-priors
    a_bar ~ dnorm( 0 , 2 ), # prior for global mean
    sigma_a ~ dexp(1),      # prior for standard deviation of studies
    sigma_g ~ dexp(1)       # prior for standard deviation of species
    
  ), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)

We compare model0.1 to the model with both clusters, model0.2. We focus on a criterion that is more general than both AIC and DIC, WAIC. The Widely Applicable Information Criterion (WAIC) makes no assumption about the shape of the posterior. It provides an approximation of the out-of-sample deviance.

mcomparison <- compare(model0.1, model0.2)

Predictor model (local occupancy) without varying slopes: We add local occupancy, initf, to model0.2.

# clean list with data that are used in the model
dat_list <- list(
  extinct = as.integer(d_ep$extinct),
  study   = as.integer(d_ep$studyID),
  initf   = standardize(d_ep$B)       # standardize species' local occupancy at the baseline survey
)


# model
model1.1 <- ulam(
  alist(
    extinct ~ dbinom(1, p),
    # first level
    logit(p) <- a_bar + a[study] + g[species] + b_f * initf,
    a_bar ~ dnorm( 0, 2 ),  # prior for global mean
    b_f ~ dnorm( 0 , 2),    # prior for the effect (slope) of initf
    
    # second level
    # adaptive priors
    a[study] ~ dnorm( 0 , sigma_a ),
    g[species] ~ dnorm( 0 , sigma_g ),
    
    # hyper-priors
    a_bar ~ dnorm( 0 , 2), 
    sigma_a ~ dexp(1), 
    sigma_g ~ dexp(1) 
    
  ), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)

Predictor model (local occupancy) with varying slopes: We let the effect of local occupancy vary across localities (studies). We have to model the joint population of intercepts and slopes, which means modeling their covariance. In conventional multilevel models, this is made possible by a joint multivariate Gaussian distribution for all of the varying effects, both intercepts and slopes. Instead of having two independent Gaussian distributions of intercepts and of slopes, we assign a two-dimensional Gaussian distribution to both the intercepts (first dimension) and the slopes (second dimension). That means each study has an intercept, a[study] and slope b[study] with a prior distribution defined by the two-dimensional Gaussian distribution with means a_bar and b_bar and covariance matrix S. The covariance matrix needs three parameters, the intercept standard deviation, the slope standard deviation (vector sigma_dt) and a correlation matrix with the correlation between intercept and slopes Rho.

# clean list with data that are used in the model
dat_list <- list(
  extinct = as.integer(d_ep$extinct),
  study   = as.integer(d_ep$studyID),
  initf   = standardize(d_ep$B)         
)


# model
model1.2 <- ulam(
  alist(
    extinct ~ dbinom(1, p),
    # first level
    logit(p) <- a[study] + g[species] + b[study] * initf,
    
    # second level
    # adaptive priors
    c(a, b)[study] ~ multi_normal(c(a_bar, b_bar), Rho, sigma_dt),
    g[species] ~ dnorm( 0 , sigma_g ),
    
    # fixed priors
    # prior for a_bar (average intercept) and b_bar (average slope)
    c(a_bar, b_bar) ~ dnorm(0, 2),  
    
    # prior for standard deviation of intercepts and slopes of studies
    sigma_dt ~ dexp(1), 
    # prior for std. dev of intercept of species 
    sigma_g ~ dexp(1),
    # prior for the correlation matrix
    Rho ~ lkj_corr(2)
    
  ), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)

The model above shows a varying slope logit model for the relationship between the probability of going extinct and local occupancy. The prior for Rho, lkj_corr(2) is a weakly informative prior on the correlation between intercepts and slopes that is skeptical of extreme correlations near -1 or 1.

We compare the varying slopes model to the varying intercept only model.

mcomparison <- compare(model1.2, model1.1)

R code for the effect of range size on species’ extinction probability controlling for local occupancy (Fig. 2c and Supplementary Table 3).

Model1.2 with range size, without varying slopes: We add range size, range, to model1.2.

# clean list with data that are used in the model
dat_list <- list(
  extinct = as.integer(d_ep$extinct),
  study   = as.integer(d_ep$studyID),
  species = as.integer( as.factor(d_ep$species) ),
  initf   = standardize(d_ep$B),
  range   = standardize(d_ep$range_mid) # standardized range size estimated at a resolution of 10.7 km2 (mid-resolution)
)


# model
model1.3 <- ulam(
  alist(
    extinct ~ dbinom(1, p),
    # first level
    logit(p) <- a[study] + g[species] + b[study] * initf + b_r * range,
    # prior for slope of range size
    b_r ~ dnorm(0, 2), 
    
    # second level
    # adaptive priors
    c(a, b)[study] ~ multi_normal(c(a_bar, b_bar), Rho, sigma_dt),
    g[species] ~ dnorm( 0 , sigma_g ),
    
    # fixed priors
    # prior for a_bar (average intercept) and b_bar (average slope)
    c(a_bar, b_bar) ~ dnorm(0, 2),  
    
    # prior for standard deviation of intercepts and slopes of studies
    sigma_dt ~ dexp(1),
    # prior for std. dev of intercept of species 
    sigma_g ~ dexp(1),
    # prior for the correlation matrix
    Rho ~ lkj_corr(2)
    
  ), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)

Model1.2 with range size, with varying slopes: The effect of range size might also vary between studies, we include varying slopes for range size.

# clean list with data that are used in the model
dat_list <- list(
  extinct = as.integer(d_ep$extinct),
  study   = as.integer(d_ep$studyID),
  species = as.integer( as.factor(d_ep$species) ),
  initf   = standardize(d_ep$B),
  range   = standardize(d_ep$range_mid) 
)


# model
model1.4 <- ulam(
  alist(
    extinct ~ dbinom(1, p),
    # first level
    logit(p) <- a[study] + g[species] + b_f[study] * initf + b_r[study] * range,
    
    # second level
    # adaptive priors
    c(a, b_f, b_r)[study] ~ multi_normal(c(a_bar, b_f_bar, b_r_bar), Rho, sigma_dt),
    
    g[species] ~ dnorm( 0 , sigma_g ),
    
    # fixed priors
    # prior for a_bar (average intercept) and b_bar (average slope)
    c(a_bar, b_f_bar, b_r_bar) ~ dnorm(0, 2) ,  
    
    # prior for standard deviation of studies
    sigma_dt ~ dexp(1) ,  
    sigma_g ~ dexp(1) ,
    # prior for correlation
    Rho ~ lkj_corr(2)
    
  ), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)

We compare the model with varying slopes on range size, model1.4 to the model without model1.3.

mcomparison <- compare(model1.3, model1.4)

R code for the effect of range size on species’ extinction probability without controlling for local occupancy (Supplementary Table 3).

We compare the effect of range sizem b_r, in the model that controls for local occupancy, model1.3, with the effect in a model that does not control for local occupancy.

# clean list with data that are used in the model
dat_list <- list(
  extinct = as.integer(d_ep$extinct),
  study   = as.integer(d_ep$studyID),
  species = as.integer( as.factor(d_ep$species) ),
  range   = standardize(d_ep$range_mid)
)


# model
model1.5 <- ulam(
  alist(
    extinct ~ dbinom(1, p),
    logit(p) <- a_bar + a[study] + g[species] + b_r * range,
    a_bar ~ dnorm(0 , 2 ), # prior for global mean
    b_r ~ dnorm( 0 , 2 ),  # prior for effect (slope) of initf
    
    # adaptive priors
    a[study] ~ dnorm( 0 , sigma_a),
    g[species] ~ dnorm( 0 , sigma_g ),
    
    # hyper-priors
    sigma_a ~ dexp(1),  # prior for standard deviation of studys
    sigma_g ~ dexp(1)   # prior for sta
    
  ), data=dat_list , iter=2000 , chains=4, cores=4, log_lik=T
)

R code for the effect of nitrogen deposition on species’ extinction probability (Fig. 3a and b, Supplementary Table 4).

So far we have considered the effects of species level explanatory variables. Now we add study-level explanatory variables. This allows us to explain variation among studies (localities) in average extinction probability across species. Additionally, we can use these variables to ask whether the effect of range size varies with any contextual effect.

Model of average extinction probability across species with inter-census nitrogen deposition, and other covariates as study-level predictors:

# clean list with data that are used in the model
dat_list <- list(
  extinct = as.integer(d_ep$extinct),
  dataset = as.integer( as.factor(d_ep$studyID) ),
  species = as.integer( as.factor(d_ep$species) ), 
  initf   = standardize(d_ep$B),
  range   = standardize(d_ep$range_mid),
  ndepo   = standardize(d_ep$n_delta_r),            # inter-census n-deposition
  survint = standardize(d_ep$survey_interval),      # time between surveys
  nplot   = standardize(d_ep$n_plots),              # number of plots
  splot   = standardize(d_ep$plot_size),            # size of plots
  area    = standardize(d_ep$site_area),            # size of study area
  lati    = standardize(d_ep$latitude)              # latitude
)


model2.1 <- ulam
  alist(
    extinct ~ dbinom(1, p) , 
    logit(p) <- 
      a[study] + 
      g[species] +
      b[study] * initf + 
      b_r*range + 
      
      # contextual variables
      b_nd*ndepo +          # n-deposition
      b_si*survint +        # time between surveys
      b_np*nplot +          # number of plots
      b_sp*splot +          # size of plots
      b_ar*area +           # size of study area
      b_la*lati ,           # latitude
    
    c(b_r, b_nd, b_si , b_np, b_sp, b_ar, b_la) ~ dnorm(0, 2), # priors for slopes
    
    # second level
    # adpative priors
    c(a, b)[study] ~ multi_normal(c(a_bar, b_bar), Rho, sigma_dt) ,
    g[species] ~ dnorm( 0 , sigma_g ),
    
    # fixed priors
    # prior for a_bar (average intercept) and b_bar (average slope)
    c(a_bar, b_bar) ~ dnorm(0, 2) ,  
    
    # prior for standard deviation of intercept and slope of study
    sigma_dt ~ dexp(1),
    # prior for std. dev of intercept of species 
    sigma_g ~ dexp(1),
    # prior for the correlation matrix
    Rho ~ lkj_corr(2)  
    
  ), data=dat_list , iter=2000 , chains=4, cores=4
)

Model for species’ extinction probability with the interaction effect between inter-census nitrogen deposition and species’ range size:

model2.2 <- ulam(
  alist(
    extinct ~ dbinom(1, p) , 
    logit(p) <- 
      a[study] + 
      g[species] +
      b[study] * initf + 
      b_r*range + 
      
      # contextual variables
      b_nd*ndepo +          # n-deposition
      b_si*survint +        # time between surveys
      b_np*nplot +          # number of plots
      b_sp*splot +          # size of plots
      b_ar*area +           # size of study area
      b_la*lati +           # latitude
      
      # crosslevel interaction 
      b_nd_r*ndepo*range, 
    
    c(b_r, b_nd, b_si , b_np, b_sp, b_ar, b_la, b_nd_r) ~ dnorm(0, 2), # priors for slopes
    
    # second level
    # adpative priors
    c(a, b)[study] ~ multi_normal(c(a_bar, b_bar), Rho, sigma_dt) ,
    g[species] ~ dnorm( 0 , sigma_g ),
    
    # fixed priors
    # prior for a_bar (average intercept) and b_bar (average slope)
    c(a_bar, b_bar) ~ dnorm(0, 2) ,  
    
    # prior for standard deviation of intercept and slope of study
    sigma_dt ~ dexp(1),
    # prior for std. dev of intercept of species 
    sigma_g ~ dexp(1),
    # prior for the correlation matrix
    Rho ~ lkj_corr(2)  
    
  ), data=dat_list , iter=2000 , chains=4, cores=4
)

R code for the effect of nitrogen deposition on colonizing species (Fig. 3c,d and Supplementary Table 5).

The following models are for the change in species number (delta_sr) only. The models remain however the same for the other response variable, percentage point change in alien (alien_dp).

# prepare a data.frame with all relevant variables
dat<- unique( data.frame(studyID   = d$studyID, 
                         delta_sr  = d$delta_sr,            # change in species number
                         alien_dp  = d$alien_dp,            # percentage point change in alien species
                         meann_ext = d$meann_ext,           # mean N value of extinct species
                         meann     = d$delta_meann,         # change in community mean N value
                         meann_b   = d$meann_b,             # community mean N value baseline
                         meann_r   = d$meann_r,             # community mean N value resurvey
                         ndepo     = d$n_delta_r,           # n-deposition
                         survint   = d$survey_interval,     # time between surveys
                         nplot     = d$n_plots,             # number of plots
                         splot     = d_ep$plot_size,        # size of plots
                         area      = d$site_area,           # size of study area
                         lati      = d$latitude )           # latitude

We start by modelling the change in species number from baseline survey to resurvey with inter-census nitrogen deposition, survey interval and number of plots as predictors (Supplementary Table 5).

# clean data list
dat_list <- list(
  delta_sr = standardize(dat$delta_sr),
  ndepo    = standardize(dat$ndepo),
  survint  = standardize(dat$survint),
  nplot    = standardize(dat$nplot),
  splot    = standardize(dat$splot),
  area     = standardize(dat$site_area),
  lati    = standardize(dat$lati),
)

# model 
model_sr <- ulam(
  alist(
    delta_sr ~ dnorm(mu, sigma),
    mu <- a + b*ndepo + b1*survint + b2*nplot + b3*area + b4*splot + b5*lati,
    
    a ~ dnorm(0, 1),
    c(b,b1,b2,b3,b4,b5) ~ dnorm(0, 1),
    
    sigma ~ dexp(1)
  ), data=dat_list , iter=2000 , chains=4, cores=4
)

We want to visualize the effect of ndepo with a predictor residual response plot. This plot displays the linear relationship between change in species number and inter-census nitrogen deposition, having statistically controlled for number of plots and survey interval, with the actual data. In order to achieve that, we model the predictor of interest ndepo as a function of the other two predictors survint and nplot. Then we calculate residuals by substracting the predicted mean values from this model from the observed values for ndepo. Then in a second model, we regress the change in species number on these just calculated residuals.

# clean data list
dat_list <- list(
  nplot    = standardize(dat$nplot),
  splot    = standardize(dat$splot),
  area     = standardize(dat$site_area),
  lati    = standardize(dat$lati),
)

# predictor residual model
model_res <- ulam(
  alist(
    ndepo ~ dnorm(mu, sigma),
    mu <- a + b1*survint + b2*nplot + b3*area + b4*splot + b5*lati,
    
    a ~ dnorm(0, 1),
    c(b1,b2,b3) ~ dnorm(0, 1),
    
    sigma ~ dexp(1)
  ), data=dat_list , iter=2000 , chains=4, cores=4
)

# calculate residuals
mu <- link(model_res)
mu_mean <- apply( mu , 2 , mean )
mu_resid <- dat_list$ndepo - mu_mean



# clean data list
dat_list <- list(
  delta_sr = standardize(dat$delta_sr),
  resids   = mu_resid)



# predictor residual response model
model_sr_res <- ulam(
  alist(
    delta_sr ~ dnorm(mu, sigma),
    mu <- a + b*resids,
    
    a ~ dnorm(0, 1),
    b ~ dnorm(0, 1),
    
    sigma ~ dexp(1)
  ), data=dat_list , iter=2000 , chains=4, cores=4,
)

For plotting, we make a a scatterplot of species richness change and inter-census nitrogen deposition residuals and overlay this with the linear regression of the two variables (predictor residual response model). Because now we only have 68 datapoints (number of studies), it is important to visualize if there are any highly influential data points that give leverage and diffuse a general trend. Therefore we will calculate each data point’s pareto k value and scale the point size with these values.

k <- LOOPk(model_sr_res)

R code for change in nitrophiles (Fig 3e and Supplementary Table 6).

We tested whether community composition shifts towards more N-demanding species with higher N-deposition by regressing 1) the average N-demand of extinct species (meann_ext) and 2) and the change in mean N-demand of the entire community (meann) against N-deposition.

dat_list <- list(meann_ext = standardize(dat$meann_ext),
                 meann = standardize(dat$delta_meann),
                 ndepo = standardize(dat$n_delta_r))

model_meann<-ulam(
  alist(
    meann_ext ~ dnorm(mu, sigma),        # replace meann_ext with meann for second model
    mu <- a + b*ndepo,
    
    a ~ dnorm(0, 1),
    b ~ dnorm(0, 1),
    
    sigma ~ dexp(1)
  ), data=dat_list , iter=2000 , chains=4, cores=4, log_lik = T
)

R code for change in nitrogen availability over time (Fig 3f and Supplementary Table 7).

To test whether variances between the baseline survey and resurvey differ, we estimate two sigmas. That is we allow sigma to vary between surveys (group). We do this using a bayesian estimation of the t-test with an unequal variances model.

# clean data list
dat_list <- list(
  meann = c(dat$meann_b, dat$meann_r),                 # mean n-number, proxy for nutrient availability
  group = c(rep(1, nrow(dat)), rep(2, nrow(dat))))     # baseline / resurvey


# model
model_var <- ulam(
  alist(meann ~ dnorm(mu, sigma[group]) ,
        mu <- a + b*group ,
        
        a ~ dnorm(5, 2) ,        #  the mean for group 1, the baseline survey
        b ~ dnorm(0, .5) ,       #  the slope (difference between group 1 and 2)
        sigma[group] ~ dexp(1)   #  prior for sigmas
        
  ), data = dat_list, iter = 2000 , chains=4, cores=4
)

To calculate the difference between these sigmas, we sample their posterior distributions.

# sample posterior distribution
post <- extract.samples(model_var)

# calculate the difference (square for obtaining variance instead of std. dev.)
diff_sigma <-   post$sigma[,2]^2 - post$sigma[,1]^2