This document contains additional detail on the demographic structure of the individual countries in the study area, as well as model fitting information and modelling diagnostics that were not presented in the main text of the primary article.
Individual country population pyramids were produced from the WorldPop Age/Sex Structured data. Change over time in the demographic structure can be observed for each country in the study.
Figure 1. The pyramids above show the demographic breakdown and change for each of the six countries over the period represented in the modeling data.
We present here additional detail and alternate model specifications for the model presented in the main text. All code used to produce the final, presented model (including tables and plots) is included below, with additional diagnostic detail retained that may not have been included in the final manuscript.
All data and code presented here are uploaded to Figshare (DOI: 10.6084/m9.figshare.20459802) for archival purposes and to facilitate the reproducibility of this research.
Loading required packages:
## Load packages and set working directory:
rm(list = ls())
library(tidyverse)
library(ggExtra)
library(reshape2)
library(GGally)
library(MuMIn)
library(fixest)
library(corrplot)
library(modelsummary)
library(jtools)
library(lme4)
library(merTools)
library(rstan)
library(brms)
library(bayesplot)
library(tidybayes)
library(broom.mixed) ## provides the tidy() method for brmsfit objects used below
library(DHARMa)
library(sjPlot)
library(ggeffects)
library(marginaleffects)
library(patchwork)
library(ggthemes)
#library(knitr)
#library(lemon)
#library(htmlTable)
library(table1)
The following seed, together with the seeds specified for the Stan models below, is required to reproduce the models as fit for the paper:
set.seed(2025)
The following will load the data and create the subsets needed for modeling. We subset the data into two distinct modeling datasets: the first contains the primary urbanizing areas of interest (>5,000 population in 2000 and <300,000 population in 2015), while the second also includes all larger cities:
## Set to location for base directory with associated /data and /output
## folders:
setwd('/Users/frstev01/Desktop/SEAsia/')
## Load summarized data file for data exploration and modeling:
seasia_variables <- read.csv('data/SEAsia/seasia_variables_for_modeling_2000-2015.csv')
seasia_variables$country_iso <- as.factor(seasia_variables$country_iso)
### Our primary urban areas of interest are "Medium-sized Urban Areas (MUAs)",
### which have >5k people in 2000 but remain <300k in 2015. We create two
### subsets for comparison: one that also includes larger UAs, and one
### restricted to areas with fewer than 300,000 people total in 2015:
ModelCitiesMLUA <- subset(seasia_variables, TotalPop_2000 > 5000) ## All cities in the DB except for small areas...
ModelCitiesMUA <- subset(seasia_variables, TotalPop_2015 < 300000 & TotalPop_2000 > 5000) ## Our primary study thresholds of interest
## Add large city dummy variable:
ModelCitiesMLUA$big_city <- as.numeric(ModelCitiesMLUA$TotalPop_2015 >= 300000)
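As a quick sanity check on the subsets (a sketch; the complete-case counts used by the models below are 505 medium-sized and 572 medium-plus-large urban areas), the subset sizes and the large-city flag can be inspected directly:
#nrow(ModelCitiesMUA)
#nrow(ModelCitiesMLUA)
#table(ModelCitiesMLUA$big_city)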
Now we create the model formulations that are tested and compared for the selection of the final models presented in the paper:
## Final model formulae, to be fit using mixed effect modelling:
frm_lmer_pop_iso_fe <- formula(
AnnGrRate ~
(
1 +
scale(elevation) +
scale(mean_temp2000) +
scale(mean_precip2000) +
scale(built2000) +
scale(built_change) +
scale(mean_green2000) +
scale(mean_green_change) +
scale(gdp_percap2000) +
scale(tt_capital) +
scale(ntl) +
scale(YoungDR_2000) +
scale(YoungDRChange) +
scale(LogTotalPop_2000)
) +
( 1 | country_iso )
## The following can be used instead of the former to create a varying
## slope model instead of simply a varying intercept model. The data
## don't support the complexity of this modeling approach.
#( 1 + scale(LogTotalPop_2000) | country_iso )
)
frm_lmer_pop_iso_fe_bc <- formula(
AnnGrRate ~
(
1 +
scale(elevation) +
scale(mean_temp2000) +
scale(mean_precip2000) +
scale(built2000) +
scale(built_change) +
scale(mean_green2000) +
scale(mean_green_change) +
scale(gdp_percap2000) +
scale(tt_capital) +
scale(ntl) +
scale(YoungDR_2000) +
scale(YoungDRChange) +
scale(LogTotalPop_2000)
) +
( 1 | country_iso )
#( 1 + scale(LogTotalPop_2000) | country_iso )
)
Next we present the models as fit using (restricted) maximum likelihood. These may be more familiar for readers from certain statistical backgrounds, and with the weakly informative priors used below they are effectively equivalent to the fits and results obtained from the Bayesian process presented in the results.
## Fit the mixed-effects models (REML) for the two data subsets
## (rows with missing covariates are dropped automatically by lmer):
lmer_pop_iso_fe <- lmer(frm_lmer_pop_iso_fe, data = ModelCitiesMUA)
lmer_pop_iso_fe_bc <- lmer(frm_lmer_pop_iso_fe_bc, data = ModelCitiesMLUA)
summary(lmer_pop_iso_fe)
## Linear mixed model fit by REML ['lmerMod']
## Formula:
## AnnGrRate ~ (1 + scale(elevation) + scale(mean_temp2000) + scale(mean_precip2000) +
## scale(built2000) + scale(built_change) + scale(mean_green2000) +
## scale(mean_green_change) + scale(gdp_percap2000) + scale(tt_capital) +
## scale(ntl) + scale(YoungDR_2000) + scale(YoungDRChange) +
## scale(LogTotalPop_2000)) + (1 | country_iso)
## Data: ModelCitiesMUA
##
## REML criterion at convergence: -2600.7
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.6197 -0.5735 0.0347 0.5528 3.1892
##
## Random effects:
## Groups Name Variance Std.Dev.
## country_iso (Intercept) 6.798e-05 0.008245
## Residual 2.433e-04 0.015598
## Number of obs: 505, groups: country_iso, 6
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 2.649e-02 4.055e-03 6.533
## scale(elevation) 7.638e-05 8.478e-04 0.090
## scale(mean_temp2000) -7.205e-04 8.798e-04 -0.819
## scale(mean_precip2000) -1.512e-03 8.322e-04 -1.816
## scale(built2000) 3.189e-03 1.107e-03 2.880
## scale(built_change) 3.802e-03 8.114e-04 4.686
## scale(mean_green2000) 1.426e-03 9.090e-04 1.569
## scale(mean_green_change) -2.732e-03 8.229e-04 -3.320
## scale(gdp_percap2000) 1.589e-03 8.322e-04 1.909
## scale(tt_capital) 2.564e-03 8.529e-04 3.006
## scale(ntl) 9.464e-04 7.729e-04 1.225
## scale(YoungDR_2000) -2.701e-03 1.031e-03 -2.620
## scale(YoungDRChange) 2.442e-03 2.606e-03 0.937
## scale(LogTotalPop_2000) -6.673e-03 1.180e-03 -5.656
##
## Correlation matrix not shown by default, as p = 14 > 12.
## Use print(x, correlation=TRUE) or
## vcov(x) if you need it
summary(lmer_pop_iso_fe_bc)
## Linear mixed model fit by REML ['lmerMod']
## Formula:
## AnnGrRate ~ (1 + scale(elevation) + scale(mean_temp2000) + scale(mean_precip2000) +
## scale(built2000) + scale(built_change) + scale(mean_green2000) +
## scale(mean_green_change) + scale(gdp_percap2000) + scale(tt_capital) +
## scale(ntl) + scale(YoungDR_2000) + scale(YoungDRChange) +
## scale(LogTotalPop_2000)) + (1 | country_iso)
## Data: ModelCitiesMLUA
##
## REML criterion at convergence: -2968.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.8786 -0.5561 -0.0267 0.5521 3.6100
##
## Random effects:
## Groups Name Variance Std.Dev.
## country_iso (Intercept) 0.0001357 0.01165
## Residual 0.0002406 0.01551
## Number of obs: 572, groups: country_iso, 6
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.0291384 0.0053385 5.458
## scale(elevation) 0.0001434 0.0007852 0.183
## scale(mean_temp2000) -0.0007402 0.0008165 -0.907
## scale(mean_precip2000) -0.0011267 0.0007718 -1.460
## scale(built2000) 0.0017233 0.0007808 2.207
## scale(built_change) 0.0031951 0.0007493 4.264
## scale(mean_green2000) 0.0010003 0.0008643 1.157
## scale(mean_green_change) -0.0030970 0.0007569 -4.092
## scale(gdp_percap2000) 0.0021481 0.0007687 2.795
## scale(tt_capital) 0.0025294 0.0007912 3.197
## scale(ntl) 0.0009450 0.0007333 1.289
## scale(YoungDR_2000) -0.0031184 0.0010044 -3.105
## scale(YoungDRChange) 0.0073220 0.0027848 2.629
## scale(LogTotalPop_2000) -0.0031024 0.0009016 -3.441
##
## Correlation matrix not shown by default, as p = 14 > 12.
## Use print(x, correlation=TRUE) or
## vcov(x) if you need it
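The comments in the model formula above mention a varying-slope alternative that the data did not support. A minimal sketch of how that comparison could be made (not run for the paper) is to refit both specifications with maximum likelihood and compare them with a likelihood-ratio test:
#fit_int <- lmer(frm_lmer_pop_iso_fe, data = ModelCitiesMUA, REML = FALSE)
#fit_slope <- update(fit_int, . ~ . - (1 | country_iso) +
#                      (1 + scale(LogTotalPop_2000) | country_iso))
#anova(fit_int, fit_slope)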
## Construct model summaries for model with and without LUAs included:
mods <- list(
"Med. Only"=lmer_pop_iso_fe,
"Med./Lrg."=lmer_pop_iso_fe_bc
)
modelsummary(mods, estimate="{estimate}{stars}\n[{conf.low}, {conf.high}]", statistic="p={p.value}{stars}", stars=c('*' = .05))
|                            | Med. Only                | Med./Lrg.                |
|----------------------------|--------------------------|--------------------------|
| (Intercept)                | 0.026* [0.019, 0.034]    | 0.029* [0.019, 0.040]    |
|                            | p=<0.001*                | p=<0.001*                |
| scale(elevation)           | 0.000 [-0.002, 0.002]    | 0.000 [-0.001, 0.002]    |
|                            | p=0.928                  | p=0.855                  |
| scale(mean_temp2000)       | -0.001 [-0.002, 0.001]   | -0.001 [-0.002, 0.001]   |
|                            | p=0.413                  | p=0.365                  |
| scale(mean_precip2000)     | -0.002 [-0.003, 0.000]   | -0.001 [-0.003, 0.000]   |
|                            | p=0.070                  | p=0.145                  |
| scale(built2000)           | 0.003* [0.001, 0.005]    | 0.002* [0.000, 0.003]    |
|                            | p=0.004*                 | p=0.028*                 |
| scale(built_change)        | 0.004* [0.002, 0.005]    | 0.003* [0.002, 0.005]    |
|                            | p=<0.001*                | p=<0.001*                |
| scale(mean_green2000)      | 0.001 [0.000, 0.003]     | 0.001 [-0.001, 0.003]    |
|                            | p=0.117                  | p=0.248                  |
| scale(mean_green_change)   | -0.003* [-0.004, -0.001] | -0.003* [-0.005, -0.002] |
|                            | p=<0.001*                | p=<0.001*                |
| scale(gdp_percap2000)      | 0.002 [0.000, 0.003]     | 0.002* [0.001, 0.004]    |
|                            | p=0.057                  | p=0.005*                 |
| scale(tt_capital)          | 0.003* [0.001, 0.004]    | 0.003* [0.001, 0.004]    |
|                            | p=0.003*                 | p=0.001*                 |
| scale(ntl)                 | 0.001 [-0.001, 0.002]    | 0.001 [0.000, 0.002]     |
|                            | p=0.221                  | p=0.198                  |
| scale(YoungDR_2000)        | -0.003* [-0.005, -0.001] | -0.003* [-0.005, -0.001] |
|                            | p=0.009*                 | p=0.002*                 |
| scale(YoungDRChange)       | 0.002 [-0.003, 0.008]    | 0.007* [0.002, 0.013]    |
|                            | p=0.349                  | p=0.009*                 |
| scale(LogTotalPop_2000)    | -0.007* [-0.009, -0.004] | -0.003* [-0.005, -0.001] |
|                            | p=<0.001*                | p=<0.001*                |
| SD (Intercept country_iso) | 0.008                    | 0.012                    |
| SD (Observations)          | 0.016                    | 0.016                    |
| Num.Obs.                   | 505                      | 572                      |
| R2 Marg.                   | 0.235                    | 0.262                    |
| R2 Cond.                   | 0.402                    | 0.528                    |
| AIC                        | -2568.7                  | -2936.5                  |
| BIC                        | -2501.1                  | -2866.9                  |
| ICC                        | 0.2                      | 0.4                      |
| RMSE                       | 0.02                     | 0.02                     |
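If desired, the marginal and conditional R2 values reported in the table can be cross-checked against MuMIn (already loaded above), which implements the Nakagawa and Schielzeth R2 for mixed models:
#MuMIn::r.squaredGLMM(lmer_pop_iso_fe)
#MuMIn::r.squaredGLMM(lmer_pop_iso_fe_bc)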
Here we create complete-case subsets of the data, based on the model variables, for the models with and without large cities:
## Create NoNA datasets for further modeling:
ModelVarsScaled <- c(model_response_variables(lmer_pop_iso_fe), model_beta_variables(lmer_pop_iso_fe))
ModelVars <- gsub('.*\\((.*)\\)','\\1', ModelVarsScaled)
ModelCitiesMLUANoNA <- ModelCitiesMLUA[complete.cases(ModelCitiesMLUA[,ModelVars]), ]
ModelCitiesMUANoNA <- ModelCitiesMUA[complete.cases(ModelCitiesMUA[,ModelVars]), ]
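If the variable-name helper functions used above are not available in your session, an equivalent list of model variables can be recovered directly from the model formula (a sketch; this keeps the response and all fixed-effect covariates and drops only the grouping factor):
#ModelVars <- setdiff(all.vars(frm_lmer_pop_iso_fe), "country_iso")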
Now we fit the models of the finalized structure in a Bayesian context using the brms package (with the Stan backend). The prior definition specifies only weakly informative priors, which impose very little information and should result in coefficient estimates similar to the maximum likelihood results above:
## Fit models for all data:
## Estimate the mixed effects model with Bayesian approach:
## Using cmdstanr with brms (must be installed separately):
options(brms.backend = "cmdstanr", mc.cores = 4)
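## Note: the cmdstanr backend requires the cmdstanr package and a working
## CmdStan installation; a minimal pre-flight check (sketch):
#cmdstanr::check_cmdstan_toolchain()
#cmdstanr::cmdstan_version()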
#bprior <- prior_string("normal(0,1.5)", class = "b")
#bprior <- c(prior_string("normal(0,1.5)", class = "b"),
# prior_(~cauchy(0,2), class = ~sd,
# group = ~country_iso, coef = ~Intercept))
bprior <- c(prior(normal(0, 1.5), class = Intercept),
prior(normal(0, 1.5), class = b),
prior(exponential(5), class = sigma),
prior(exponential(5), class = sd))
## Check code:
#form <- bf(frm_lmer_pop_iso_fe)
#make_stancode(
# form,
# data = ModelCitiesMUANoNA,
# prior = bprior,
# cores = 4,
# iter=2000, warmup=1000, chains=4, seed=2025,
# save_pars = save_pars(all = TRUE),
# adapt_delta = 0.98,
# backend="cmdstanr"
#)
brm_pop_iso_fe <- brm(
frm_lmer_pop_iso_fe,
data = ModelCitiesMUANoNA,
prior = bprior,
cores = 4,
iter=2000, warmup=1000, chains=4, seed=20252025,
save_pars = save_pars(all = TRUE),
adapt_delta = 0.98,
backend="cmdstanr")
## Running MCMC with 4 parallel chains...
##
## (iteration-by-iteration progress output omitted)
##
## Chain 4 finished in 8.4 seconds.
## Chain 1 finished in 8.8 seconds.
## Chain 3 finished in 8.9 seconds.
## Chain 2 finished in 9.7 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 8.9 seconds.
## Total execution time: 9.8 seconds.
brm_pop_iso_fe_prior <- brm(
frm_lmer_pop_iso_fe,
data = ModelCitiesMUANoNA,
prior = bprior,
cores = 4,
iter=2000, warmup=1000, chains=4, seed=20252025,
save_pars = save_pars(all = TRUE),
adapt_delta = 0.98,
sample_prior = "only",
backend="cmdstanr")
## Running MCMC with 4 parallel chains...
##
## (iteration-by-iteration progress output omitted)
##
## Chain 1 finished in 0.2 seconds.
## Chain 2 finished in 0.2 seconds.
## Chain 3 finished in 0.2 seconds.
## Chain 4 finished in 0.2 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 0.2 seconds.
## Total execution time: 0.4 seconds.
brm_pop_iso_fe_bc <- brm(
frm_lmer_pop_iso_fe_bc,
data = ModelCitiesMLUANoNA,
prior = bprior,
cores = 4,
iter=2000, warmup=1000, chains=4, seed=20252025,
save_pars = save_pars(all = TRUE),
adapt_delta = 0.98,
backend="cmdstanr")
## Running MCMC with 4 parallel chains...
##
## (iteration-by-iteration progress output omitted)
##
## Chain 1 finished in 9.5 seconds.
## Chain 4 finished in 10.3 seconds.
## Chain 2 finished in 11.6 seconds.
## Chain 3 finished in 11.9 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 10.8 seconds.
## Total execution time: 12.1 seconds.
brm_pop_iso_fe_bc_prior <- brm(
frm_lmer_pop_iso_fe_bc,
data = ModelCitiesMLUANoNA,
prior = bprior,
cores = 4,
iter=2000, warmup=1000, chains=4, seed=20252025,
save_pars = save_pars(all = TRUE),
adapt_delta = 0.98,
sample_prior = "only",
backend="cmdstanr")
## Running MCMC with 4 parallel chains...
##
## (iteration-by-iteration progress output omitted)
##
## Chain 2 finished in 0.2 seconds.
## Chain 1 finished in 0.2 seconds.
## Chain 3 finished in 0.2 seconds.
## Chain 4 finished in 0.2 seconds.
##
## All 4 chains finished successfully.
## Mean chain execution time: 0.2 seconds.
## Total execution time: 0.5 seconds.
Now we can perform some fit diagnostics that, while not presented in the main manuscript, indicate that the models are reasonably well specified given the constraints placed on the set of explanatory variables included:
## Initial fit diagnostics:
pp_check(brm_pop_iso_fe)
## Using 10 posterior draws for ppc type 'dens_overlay' by default.
pp_check(brm_pop_iso_fe_prior)
## Using 10 posterior draws for ppc type 'dens_overlay' by default.
brms::pp_check(brm_pop_iso_fe_prior) + xlim(-2,2)
## Using 10 posterior draws for ppc type 'dens_overlay' by default.
## Warning: Removed 3192 rows containing non-finite outside the scale range
## (`stat_density()`).
pp_check(brm_pop_iso_fe, type="error_hist", ndraws=12)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
pp_check(brm_pop_iso_fe, type="scatter_avg", ndraws=12)
pp_check(brm_pop_iso_fe, type="stat_2d")
## Using all posterior draws for ppc type 'stat_2d' by default.
#pp_check(brm_pop_iso_fe, type="rootogram")
pp_check(brm_pop_iso_fe, type="loo_pit")
## Using all posterior draws for ppc type 'loo_pit' by default.
## Warning: Found 1 observations with a pareto_k > 0.7 in model '.x1'. We
## recommend to set 'moment_match = TRUE' in order to perform moment matching for
## problematic observations.
## Warning in .fun(y = .x1, yrep = .x2, lw = .x3): '.fun' is deprecated.
## Use 'ppc_loo_pit_qq or ppc_loo_pit_overlay' instead.
## See help("Deprecated")
pp_check(brm_pop_iso_fe_bc)
## Using 10 posterior draws for ppc type 'dens_overlay' by default.
pp_check(brm_pop_iso_fe_bc_prior)
## Using 10 posterior draws for ppc type 'dens_overlay' by default.
pp_check(brm_pop_iso_fe_bc, type="error_hist", ndraws=12)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
pp_check(brm_pop_iso_fe_bc, type="scatter_avg", ndraws=12)
pp_check(brm_pop_iso_fe_bc, type="stat_2d")
## Using all posterior draws for ppc type 'stat_2d' by default.
#pp_check(brm_pop_iso_fe_bc, type="rootogram")
pp_check(brm_pop_iso_fe_bc, type="loo_pit")
## Using all posterior draws for ppc type 'loo_pit' by default.
## Warning: Found 1 observations with a pareto_k > 0.7 in model '.x1'. We recommend to set 'moment_match = TRUE' in order to perform moment matching for problematic observations.
## Warning: '.fun' is deprecated.
## Use 'ppc_loo_pit_qq or ppc_loo_pit_overlay' instead.
## See help("Deprecated")
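## As the deprecation warnings above suggest, the equivalent non-deprecated
## PIT checks can be requested directly (a sketch):
#pp_check(brm_pop_iso_fe, type = "loo_pit_overlay")
#pp_check(brm_pop_iso_fe_bc, type = "loo_pit_overlay")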
### Compare models:
#loo1 <- loo(brm_pop_iso_fe, moment_match=T)
#loo2 <- loo(brm_pop_iso_fe_bc, moment_match=T)
#
#summary(loo1)
#loo1
#plot(loo1)
#summary(loo2)
#loo2
#plot(loo2)
## Models are fit to different data, so they are not directly comparable:
#loo_compare(loo1, loo2)
## Check for divergent sampling problems:
plot(brm_pop_iso_fe, variable="^b_", regex=T)
plot(brm_pop_iso_fe_bc, variable="^b_", regex=T)
plot(brm_pop_iso_fe, variable="^r_", regex=T)
plot(brm_pop_iso_fe_bc, variable="^r_", regex=T)
#pairs(brm_pop_iso_fe_bc, variable="^b_", regex=T) ## Too many samples, removing plot...
## Check Rhat (investigate further if > 1.05) and ESS (we want roughly 1000 or more)
## (remove the rest of the output so it doesn't distract)
summary(brm_pop_iso_fe)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: AnnGrRate ~ (1 + scale(elevation) + scale(mean_temp2000) + scale(mean_precip2000) + scale(built2000) + scale(built_change) + scale(mean_green2000) + scale(mean_green_change) + scale(gdp_percap2000) + scale(tt_capital) + scale(ntl) + scale(YoungDR_2000) + scale(YoungDRChange) + scale(LogTotalPop_2000)) + (1 | country_iso)
## Data: ModelCitiesMUANoNA (Number of observations: 505)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Multilevel Hyperparameters:
## ~country_iso (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.01 0.01 0.00 0.03 1.00 816 1161
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept 0.03 0.01 0.02 0.04 1.00 859
## scaleelevation 0.00 0.00 -0.00 0.00 1.00 5229
## scalemean_temp2000 -0.00 0.00 -0.00 0.00 1.00 4431
## scalemean_precip2000 -0.00 0.00 -0.00 0.00 1.00 5133
## scalebuilt2000 0.00 0.00 0.00 0.01 1.00 5058
## scalebuilt_change 0.00 0.00 0.00 0.00 1.00 5022
## scalemean_green2000 0.00 0.00 -0.00 0.00 1.00 4724
## scalemean_green_change -0.00 0.00 -0.00 -0.00 1.00 4103
## scalegdp_percap2000 0.00 0.00 -0.00 0.00 1.00 5529
## scalett_capital 0.00 0.00 0.00 0.00 1.00 4971
## scalentl 0.00 0.00 -0.00 0.00 1.00 4533
## scaleYoungDR_2000 -0.00 0.00 -0.01 -0.00 1.00 2366
## scaleYoungDRChange 0.00 0.00 -0.00 0.01 1.00 1255
## scaleLogTotalPop_2000 -0.01 0.00 -0.01 -0.00 1.00 4975
## Tail_ESS
## Intercept 718
## scaleelevation 3605
## scalemean_temp2000 3429
## scalemean_precip2000 3372
## scalebuilt2000 2974
## scalebuilt_change 3113
## scalemean_green2000 3109
## scalemean_green_change 3488
## scalegdp_percap2000 3257
## scalett_capital 3549
## scalentl 3231
## scaleYoungDR_2000 3385
## scaleYoungDRChange 1937
## scaleLogTotalPop_2000 3189
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.02 0.00 0.01 0.02 1.00 2501 2161
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
summary(brm_pop_iso_fe)$fixed[, 5:7]
## Rhat Bulk_ESS Tail_ESS
## Intercept 1.0030116 859.0421 717.9956
## scaleelevation 0.9994183 5229.3174 3604.6584
## scalemean_temp2000 1.0007199 4430.9688 3428.8847
## scalemean_precip2000 1.0018525 5132.6465 3372.1415
## scalebuilt2000 1.0008495 5058.1890 2973.6414
## scalebuilt_change 0.9999777 5022.2024 3113.0604
## scalemean_green2000 1.0014908 4723.6344 3109.3395
## scalemean_green_change 1.0018215 4103.0990 3488.1436
## scalegdp_percap2000 1.0001965 5529.0897 3257.2762
## scalett_capital 0.9998616 4971.0345 3549.0244
## scalentl 1.0005880 4532.8939 3230.8248
## scaleYoungDR_2000 1.0004720 2366.4508 3384.6903
## scaleYoungDRChange 1.0038512 1255.2582 1937.0812
## scaleLogTotalPop_2000 1.0005350 4975.1041 3189.1934
summary(brm_pop_iso_fe)$random
## $country_iso
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 0.01241187 0.007581854 0.003583361 0.03186361 1.002337 815.9535
## Tail_ESS
## sd(Intercept) 1161.046
summary(brm_pop_iso_fe_bc)$fixed[, 5:7]
## Warning: There were 1 divergent transitions after warmup. Increasing
## adapt_delta above 0.98 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Rhat Bulk_ESS Tail_ESS
## Intercept 1.0072162 821.4922 854.1114
## scaleelevation 1.0003352 5486.1646 3437.9287
## scalemean_temp2000 1.0000858 4208.5567 3113.6503
## scalemean_precip2000 0.9999573 5170.7274 3033.7069
## scalebuilt2000 1.0002823 5157.9913 3167.9244
## scalebuilt_change 1.0008245 4192.6830 3011.5852
## scalemean_green2000 1.0017713 4245.9034 3216.0334
## scalemean_green_change 1.0001315 3801.3119 3023.7415
## scalegdp_percap2000 0.9997254 4491.6660 3492.0933
## scalett_capital 0.9996427 4493.1383 3688.1690
## scalentl 1.0003313 4923.1134 3291.2180
## scaleYoungDR_2000 1.0009930 1969.7077 2626.3613
## scaleYoungDRChange 1.0035146 983.0126 1280.2737
## scaleLogTotalPop_2000 1.0027634 4580.0260 3016.6401
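To screen the convergence diagnostics programmatically rather than by eye (a sketch, using the thresholds noted in the comments above), parameters with high Rhat or low bulk ESS can be flagged directly:
#rhats <- brms::rhat(brm_pop_iso_fe)
#rhats[rhats > 1.05]
#ess_bulk <- summary(brm_pop_iso_fe)$fixed[, "Bulk_ESS"]
#ess_bulk[ess_bulk < 1000]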
The following indicate that the chains of the brms models are relatively well mixed and that the coefficient estimates come from a well-behaved fitting process:
ratios_cp <- neff_ratio(brm_pop_iso_fe)
print(ratios_cp)
## b_Intercept b_scaleelevation
## 0.1794989 0.9011646
## b_scalemean_temp2000 b_scalemean_precip2000
## 0.8572212 0.8430354
## b_scalebuilt2000 b_scalebuilt_change
## 0.7434104 0.7782651
## b_scalemean_green2000 b_scalemean_green_change
## 0.7773349 0.8720359
## b_scalegdp_percap2000 b_scalett_capital
## 0.8143191 0.8872561
## b_scalentl b_scaleYoungDR_2000
## 0.8077062 0.5916127
## b_scaleYoungDRChange b_scaleLogTotalPop_2000
## 0.3138145 0.7972983
## sd_country_iso__Intercept sigma
## 0.2039884 0.5403220
## Intercept r_country_iso[IDN,Intercept]
## 0.1794989 0.2425511
## r_country_iso[KHM,Intercept] r_country_iso[LAO,Intercept]
## 0.2886084 0.4449692
## r_country_iso[MYS,Intercept] r_country_iso[THA,Intercept]
## 0.2262211 0.2094262
## r_country_iso[VNM,Intercept] lprior
## 0.2228171 0.2052878
## lp__ z_1[1,1]
## 0.2375673 0.2728928
## z_1[1,2] z_1[1,3]
## 0.3458966 0.5304514
## z_1[1,4] z_1[1,5]
## 0.2665590 0.2657729
## z_1[1,6]
## 0.2538544
mcmc_neff(ratios_cp, size = 2)
ratios_cp_bc <- neff_ratio(brm_pop_iso_fe_bc)
print(ratios_cp_bc)
## b_Intercept b_scaleelevation
## 0.2053731 0.8594822
## b_scalemean_temp2000 b_scalemean_precip2000
## 0.7784126 0.7584267
## b_scalebuilt2000 b_scalebuilt_change
## 0.7919811 0.7528963
## b_scalemean_green2000 b_scalemean_green_change
## 0.8040083 0.7559354
## b_scalegdp_percap2000 b_scalett_capital
## 0.8730233 0.9220422
## b_scalentl b_scaleYoungDR_2000
## 0.8228045 0.4924269
## b_scaleYoungDRChange b_scaleLogTotalPop_2000
## 0.2457531 0.7541600
## sd_country_iso__Intercept sigma
## 0.1165064 0.4869824
## Intercept r_country_iso[IDN,Intercept]
## 0.2053731 0.2149330
## r_country_iso[KHM,Intercept] r_country_iso[LAO,Intercept]
## 0.2341039 0.2675276
## r_country_iso[MYS,Intercept] r_country_iso[THA,Intercept]
## 0.2107217 0.2238969
## r_country_iso[VNM,Intercept] lprior
## 0.2022518 0.1173942
## lp__ z_1[1,1]
## 0.1820167 0.2379916
## z_1[1,2] z_1[1,3]
## 0.2950684 0.3724044
## z_1[1,4] z_1[1,5]
## 0.2493860 0.2862420
## z_1[1,6]
## 0.2490672
mcmc_neff(ratios_cp_bc, size = 2)
posterior_brm <- as.array(brm_pop_iso_fe)
mcmc_acf(posterior_brm, lags = 100, regex_pars="^b_")
mcmc_acf(posterior_brm, lags = 100, regex_pars="^r_")
posterior_brm_bc <- as.array(brm_pop_iso_fe_bc)
mcmc_acf(posterior_brm_bc, lags = 100, regex_pars="^b_")
mcmc_acf(posterior_brm_bc, lags = 100, regex_pars="^r_")
We can then produce a variety of checks on the model fits. These indicate that, while the overall variability in growth rates could be explained more fully with better covariates, we are not producing overly biased coefficient estimates due to distributional aspects of the model assumptions:
## Model residual checking:
check_brms <- function(model, # brms model
integer = FALSE, # integer response? (TRUE/FALSE)
plot = TRUE, # make plot?
... # further arguments for DHARMa::plotResiduals
) {
mdata <- brms::standata(model)
if (!"Y" %in% names(mdata))
stop("Cannot extract the required information from this brms model")
dharma.obj <- DHARMa::createDHARMa(
simulatedResponse = t(brms::posterior_predict(model, ndraws = 1000)),
observedResponse = mdata$Y,
fittedPredictedResponse = apply(
t(brms::posterior_epred(model, ndraws = 1000, re.form = NA)),
1,
mean),
integerResponse = integer)
if (isTRUE(plot)) {
plot(dharma.obj, ...)
}
invisible(dharma.obj)
}
model.check <- check_brms(brm_pop_iso_fe, integer = F)
model.check_bc <- check_brms(brm_pop_iso_fe_bc, integer = F)
plot(model.check, form=ModelCitiesMUANoNA$country_iso)
plot(model.check_bc, form=ModelCitiesMLUANoNA$country_iso)
testDispersion(model.check)
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.94208, p-value = 0.506
## alternative hypothesis: two.sided
testDispersion(model.check_bc)
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.9605, p-value = 0.666
## alternative hypothesis: two.sided
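## Additional DHARMa residual checks, not shown in the manuscript, can be run
## on the same simulated-residual objects (a sketch):
#testUniformity(model.check)
#testOutliers(model.check_bc)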
## Summarize models for publication:
prior_summary(brm_pop_iso_fe)
## prior class coef group resp dpar nlpar lb
## normal(0, 1.5) b
## normal(0, 1.5) b scalebuilt_change
## normal(0, 1.5) b scalebuilt2000
## normal(0, 1.5) b scaleelevation
## normal(0, 1.5) b scalegdp_percap2000
## normal(0, 1.5) b scaleLogTotalPop_2000
## normal(0, 1.5) b scalemean_green_change
## normal(0, 1.5) b scalemean_green2000
## normal(0, 1.5) b scalemean_precip2000
## normal(0, 1.5) b scalemean_temp2000
## normal(0, 1.5) b scalentl
## normal(0, 1.5) b scalett_capital
## normal(0, 1.5) b scaleYoungDR_2000
## normal(0, 1.5) b scaleYoungDRChange
## normal(0, 1.5) Intercept
## exponential(5) sd 0
## exponential(5) sd country_iso 0
## exponential(5) sd Intercept country_iso 0
## exponential(5) sigma 0
## ub source
## user
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## user
## user
## (vectorized)
## (vectorized)
## user
prior_summary(brm_pop_iso_fe_bc)
## prior class coef group resp dpar nlpar lb
## normal(0, 1.5) b
## normal(0, 1.5) b scalebuilt_change
## normal(0, 1.5) b scalebuilt2000
## normal(0, 1.5) b scaleelevation
## normal(0, 1.5) b scalegdp_percap2000
## normal(0, 1.5) b scaleLogTotalPop_2000
## normal(0, 1.5) b scalemean_green_change
## normal(0, 1.5) b scalemean_green2000
## normal(0, 1.5) b scalemean_precip2000
## normal(0, 1.5) b scalemean_temp2000
## normal(0, 1.5) b scalentl
## normal(0, 1.5) b scalett_capital
## normal(0, 1.5) b scaleYoungDR_2000
## normal(0, 1.5) b scaleYoungDRChange
## normal(0, 1.5) Intercept
## exponential(5) sd 0
## exponential(5) sd country_iso 0
## exponential(5) sd Intercept country_iso 0
## exponential(5) sigma 0
## ub source
## user
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
## user
## user
## (vectorized)
## (vectorized)
## user
print(summary(brm_pop_iso_fe), digits=4)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: AnnGrRate ~ (1 + scale(elevation) + scale(mean_temp2000) + scale(mean_precip2000) + scale(built2000) + scale(built_change) + scale(mean_green2000) + scale(mean_green_change) + scale(gdp_percap2000) + scale(tt_capital) + scale(ntl) + scale(YoungDR_2000) + scale(YoungDRChange) + scale(LogTotalPop_2000)) + (1 | country_iso)
## Data: ModelCitiesMUANoNA (Number of observations: 505)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Multilevel Hyperparameters:
## ~country_iso (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.0124 0.0076 0.0036 0.0319 1.0023 816 1161
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept 0.0270 0.0065 0.0157 0.0414 1.0030 859
## scaleelevation 0.0001 0.0009 -0.0015 0.0019 0.9994 5229
## scalemean_temp2000 -0.0008 0.0009 -0.0026 0.0010 1.0007 4431
## scalemean_precip2000 -0.0015 0.0008 -0.0031 0.0001 1.0019 5133
## scalebuilt2000 0.0032 0.0011 0.0009 0.0054 1.0008 5058
## scalebuilt_change 0.0035 0.0008 0.0020 0.0049 1.0000 5022
## scalemean_green2000 0.0015 0.0009 -0.0004 0.0033 1.0015 4724
## scalemean_green_change -0.0027 0.0008 -0.0043 -0.0010 1.0018 4103
## scalegdp_percap2000 0.0016 0.0008 -0.0001 0.0032 1.0002 5529
## scalett_capital 0.0026 0.0009 0.0010 0.0042 0.9999 4971
## scalentl 0.0009 0.0008 -0.0006 0.0026 1.0006 4533
## scaleYoungDR_2000 -0.0029 0.0011 -0.0052 -0.0007 1.0005 2366
## scaleYoungDRChange 0.0034 0.0034 -0.0026 0.0106 1.0039 1255
## scaleLogTotalPop_2000 -0.0066 0.0012 -0.0090 -0.0043 1.0005 4975
## Tail_ESS
## Intercept 718
## scaleelevation 3605
## scalemean_temp2000 3429
## scalemean_precip2000 3372
## scalebuilt2000 2974
## scalebuilt_change 3113
## scalemean_green2000 3109
## scalemean_green_change 3488
## scalegdp_percap2000 3257
## scalett_capital 3549
## scalentl 3231
## scaleYoungDR_2000 3385
## scaleYoungDRChange 1937
## scaleLogTotalPop_2000 3189
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.0156 0.0005 0.0147 0.0167 1.0017 2501 2161
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
bayes_R2(brm_pop_iso_fe)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.2893417 0.02841066 0.2332424 0.3438243
#loo_R2(brm_pop_iso_fe)
print(summary(brm_pop_iso_fe_bc), digits=4)
## Warning: There were 1 divergent transitions after warmup. Increasing
## adapt_delta above 0.98 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: AnnGrRate ~ (1 + scale(elevation) + scale(mean_temp2000) + scale(mean_precip2000) + scale(built2000) + scale(built_change) + scale(mean_green2000) + scale(mean_green_change) + scale(gdp_percap2000) + scale(tt_capital) + scale(ntl) + scale(YoungDR_2000) + scale(YoungDRChange) + scale(LogTotalPop_2000)) + (1 | country_iso)
## Data: ModelCitiesMLUANoNA (Number of observations: 572)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Multilevel Hyperparameters:
## ~country_iso (Number of levels: 6)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.0170 0.0116 0.0051 0.0478 1.0113 466 706
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept 0.0294 0.0089 0.0142 0.0474 1.0072 821
## scaleelevation 0.0001 0.0008 -0.0013 0.0017 1.0003 5486
## scalemean_temp2000 -0.0008 0.0008 -0.0024 0.0009 1.0001 4209
## scalemean_precip2000 -0.0011 0.0008 -0.0026 0.0005 1.0000 5171
## scalebuilt2000 0.0017 0.0008 0.0002 0.0033 1.0003 5158
## scalebuilt_change 0.0029 0.0007 0.0016 0.0043 1.0008 4193
## scalemean_green2000 0.0010 0.0009 -0.0008 0.0027 1.0018 4246
## scalemean_green_change -0.0031 0.0008 -0.0046 -0.0015 1.0001 3801
## scalegdp_percap2000 0.0021 0.0008 0.0006 0.0036 0.9997 4492
## scalett_capital 0.0026 0.0008 0.0010 0.0041 0.9996 4493
## scalentl 0.0009 0.0007 -0.0005 0.0024 1.0003 4923
## scaleYoungDR_2000 -0.0032 0.0011 -0.0054 -0.0011 1.0010 1970
## scaleYoungDRChange 0.0078 0.0036 0.0010 0.0149 1.0035 983
## scaleLogTotalPop_2000 -0.0031 0.0009 -0.0049 -0.0013 1.0028 4580
## Tail_ESS
## Intercept 854
## scaleelevation 3438
## scalemean_temp2000 3114
## scalemean_precip2000 3034
## scalebuilt2000 3168
## scalebuilt_change 3012
## scalemean_green2000 3216
## scalemean_green_change 3024
## scalegdp_percap2000 3492
## scalett_capital 3688
## scalentl 3291
## scaleYoungDR_2000 2626
## scaleYoungDRChange 1280
## scaleLogTotalPop_2000 3017
##
## Further Distributional Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 0.0156 0.0004 0.0147 0.0165 1.0009 1948 2190
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
bayes_R2(brm_pop_iso_fe_bc)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.2590172 0.02689917 0.2080968 0.3103759
#loo_R2(brm_pop_iso_fe_bc)
## Construct model summaries for model with and without LUAs included:
mods <- list(
"Med. Only"=brm_pop_iso_fe,
"Med./Lrg."=brm_pop_iso_fe_bc
)
modelsummary(mods, estimate="{estimate}{stars}\n[{conf.low}, {conf.high}]", statistic="p={p.value}{stars}", stars=c('*' = .05), output="modelsummary.html")
## Warning:
## `modelsummary` uses the `performance` package to extract goodness-of-fit
## statistics from models of this class. You can specify the statistics you wish
## to compute by supplying a `metrics` argument to `modelsummary`, which will then
## push it forward to `performance`. Acceptable values are: "all", "common",
## "none", or a character vector of metrics names. For example: `modelsummary(mod,
## metrics = c("RMSE", "R2")` Note that some metrics are computationally
## expensive. See `?performance::performance` for details.
## This warning appears once per session.
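As the note above indicates, the goodness-of-fit statistics that modelsummary computes can be restricted via the metrics argument to skip the more computationally expensive ones (a sketch):
#modelsummary(mods, metrics = c("RMSE", "R2"))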
We can also compare the Bayesian estimates to the maximum likelihood estimates, for skeptics wondering about the fitting process:
## Compare random effects from lmer and brm outputs:
ranef(brm_pop_iso_fe)
## $country_iso
## , , Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## IDN -0.0109902176 0.007906323 -0.029123745 0.001542428
## KHM 0.0094522688 0.008588844 -0.004576353 0.028479475
## LAO 0.0001045683 0.008412536 -0.017449407 0.017985878
## MYS 0.0091623521 0.006907387 -0.003000531 0.024108935
## THA -0.0045867921 0.007778196 -0.022031090 0.008196024
## VNM -0.0021576671 0.006507469 -0.013759842 0.011497110
ranef(brm_pop_iso_fe_bc)
## $country_iso
## , , Intercept
##
## Estimate Est.Error Q2.5 Q97.5
## IDN -0.0162200835 0.010085567 -0.037630845 -9.262495e-06
## KHM 0.0132505917 0.010681845 -0.003287151 3.637161e-02
## LAO 0.0002963401 0.010162628 -0.018979557 1.994737e-02
## MYS 0.0110165433 0.009211021 -0.004900797 3.009089e-02
## THA -0.0085145604 0.010101630 -0.029435290 8.057141e-03
## VNM 0.0009601022 0.009035787 -0.013963447 1.943230e-02
## Reasonable plots but interactive:
#conditional_effects(brm_pop_iso_fe)
#conditional_effects(brm_pop_iso_fe_bc)
## Test within and between country variability:
brm_pop_iso_fe |>
tidy(effects = c("ran_pars"), conf.int = FALSE) |>
dplyr::select(-component, -effect, -std.error) |>
mutate(sigma_2 = estimate^2) |>
mutate(props = sigma_2 / sum(sigma_2))
## Warning in tidy.brmsfit(brm_pop_iso_fe, effects = c("ran_pars"), conf.int =
## FALSE): some parameter names contain underscores: term naming may be
## unreliable!
## # A tibble: 2 × 5
## group term estimate sigma_2 props
## <chr> <chr> <dbl> <dbl> <dbl>
## 1 country_iso sd__(Intercept) 0.0124 0.000154 0.386
## 2 Residual sd__Observation 0.0156 0.000245 0.614
## ~40% of the variability in Ann. Gr. Rates comes from between-country differences, while ~60% comes from variation within individual countries.
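## The same within/between-country decomposition can also be obtained from the
## performance package (a sketch; variance_decomposition() is the
## posterior-based analogue of icc() for brms models):
#performance::icc(brm_pop_iso_fe)
#performance::variance_decomposition(brm_pop_iso_fe)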
## Construct posterior predictions for summaries:
posterior <- as.matrix(brm_pop_iso_fe)
posterior_bc <- as.matrix(brm_pop_iso_fe_bc)
plot_title <- ggtitle("Posterior distributions",
"with medians and 80% (thick)/95% (thin) intervals")
mcmc_intervals(posterior,
pars = colnames(posterior)[grepl("^b_", colnames(posterior))],
prob = 0.8, prob_outer=0.95) + plot_title
mcmc_intervals(posterior,
pars = colnames(posterior)[grepl("^r_", colnames(posterior))],
prob = 0.8, prob_outer=0.95) + plot_title
mcmc_intervals(posterior_bc,
pars = colnames(posterior_bc)[grepl("^b_", colnames(posterior_bc))],
prob = 0.8, prob_outer=0.95) + plot_title
mcmc_intervals(posterior_bc,
pars = colnames(posterior_bc)[grepl("^r_", colnames(posterior_bc))],
prob = 0.8, prob_outer=0.95) + plot_title
#color_scheme_set("red")
#ppc_dens_overlay(y = ModelCitiesMLUANoNA$AnnGrRate,
# yrep = posterior_predict(brm_pop_iso_fe_bc, draws = 50))
#ppc_dens_overlay_grouped(y = ModelCitiesMLUANoNA$AnnGrRate, group=ModelCitiesMLUANoNA$country_iso,
# yrep = posterior_predict(brm_pop_iso_fe_bc, draws = 10))
ppc_intervals(
y = brm_pop_iso_fe$data$AnnGrRate,
yrep = posterior_predict(brm_pop_iso_fe),
x = brm_pop_iso_fe$data$AnnGrRate,
prob = 0.5
) +
labs(
x = "Ann. Gr. Rate 2000-2015",
y = "Predicted Ann. Gr. Rate",
title = "50% Posterior Predictive Intervals vs. Obs. Population Change Pct.",
subtitle = "vs. Ann. Gr. Rate 2000-2015"
) +
panel_bg(fill = "gray95", color = NA) +
grid_lines(color = "white")
A variety of plots illustrate the posterior predictive effects estimated from our fitting process:
ggplot(aes(x=AnnGrRate, y=Estimate, ymin=Q2.5, ymax=Q97.5, color=country_iso), data=cbind(ModelCitiesMUANoNA, predict(brm_pop_iso_fe, robust=T))) +
geom_point() +
geom_linerange(alpha=0.25) +
geom_abline(intercept = 0, slope = 1, color = "red", linetype = "dashed") +
labs(x="Observed Ann. Gr. Rate", y="Prediction w/ 95% PI") + theme_bw()
## Marginal posterior prediction effect plots:
p <- ggplot(aes(x=TotalPop_2015, y=Estimate, color=country_iso), data=cbind(ModelCitiesMLUANoNA, data.frame("Estimate"=apply(t(posterior_epred(brm_pop_iso_fe_bc)), 1, mean)))) +
geom_point() +
geom_abline(intercept = 0, slope = 1, color = "red", linetype = "dashed") +
labs(x="Observed Pop. 2015", y="Mean of Posterior Predictions") + theme_bw()
ggMarginal(p + theme(legend.position="left"), groupColour = TRUE, groupFill = TRUE, size=6)
p <- ggplot(aes(x=log(TotalPop_2015), y=Estimate, color=country_iso), data=cbind(ModelCitiesMLUANoNA, data.frame("Estimate"=apply(t(posterior_epred(brm_pop_iso_fe_bc)), 1, mean)))) +
geom_point() +
geom_abline(intercept = 0, slope = 1, color = "red", linetype = "dashed") +
labs(x="ln(Observed Pop. 2015)", y="Mean of Posterior Predictions") + theme_bw()
ggMarginal(p + theme(legend.position="left"), groupColour = TRUE, groupFill = TRUE, size=6)
By looking at predictions relative to observed growth rates we can diagnose potential problems (over- or under-prediction) and conclude that we can have some faith in the modeled coefficients, while acknowledging bias at the lowest and highest values, just as we would in maximum likelihood models:
## Primary Obs. vs. Predicted Scatter Plot:
ggplot(aes(x=Observed, y=Estimate, color=country_iso), data=cbind(ModelCitiesMLUANoNA, data.frame("Observed"=brm_pop_iso_fe_bc$data$AnnGrRate, "Estimate"=apply(t(posterior_epred(brm_pop_iso_fe_bc)), 1, mean)))) +
geom_point() +
geom_abline(intercept = 0, slope = 1, color = "red", linetype = "dashed") +
labs(x="Observed Annual Pop. Growth Rate\n(Log(Pop. 2015/Pop. 2000)/15)", y="Mean of Posterior Predictions") + theme_bw()
## Observed vs. Predicted for models, with marginal plots:
## Two ways of extracting expected posterior predictions are shown below
## (per-observation means of posterior_epred() draws, and tidy draws via
## tidybayes::epred_draws()); the plots that follow recompute the draw means directly:
estimates <- apply(t(posterior_epred(brm_pop_iso_fe)), 1, mean)
estimates_draws <- epred_draws(brm_pop_iso_fe, newdata=ModelCitiesMUANoNA)
p <- ggplot(aes(x=Observed, y=Estimate, color=country_iso), data=cbind(ModelCitiesMUANoNA, data.frame("Observed"=brm_pop_iso_fe$data$AnnGrRate, "Estimate"=apply(t(posterior_epred(brm_pop_iso_fe)), 1, mean)))) +
geom_point() +
geom_abline(intercept = 0, slope = 1, color = "red", linetype = "dashed") +
labs(x="Observed Annual Pop. Growth Rate\n(Log(Pop. 2015/Pop. 2000)/15)", y="Mean of Posterior Predictions") + theme_bw()
ggMarginal(p + theme(legend.position="left"), groupColour = TRUE, groupFill = TRUE, size=6)
p <- ggplot(aes(x=Observed, y=Estimate, color=country_iso), data=cbind(ModelCitiesMLUANoNA, data.frame("Observed"=brm_pop_iso_fe_bc$data$AnnGrRate, "Estimate"=apply(t(posterior_epred(brm_pop_iso_fe_bc)), 1, mean)))) +
geom_point() +
geom_abline(intercept = 0, slope = 1, color = "red", linetype = "dashed") +
labs(x="Observed Annual Pop. Growth Rate\n(Log(Pop. 2015/Pop. 2000)/15)", y="Mean of Posterior Predictions") + theme_bw()
ggMarginal(p + theme(legend.position="left"), groupColour = TRUE, groupFill = TRUE, size=6)
## Same as above but uses predictions() from marginaleffects package:
p <- ggplot(aes(x=Observed, y=Estimate, color=country_iso), data=cbind(ModelCitiesMLUANoNA, data.frame("Observed"=brm_pop_iso_fe_bc$data$AnnGrRate, "Estimate"=predictions(brm_pop_iso_fe_bc)$estimate))) +
geom_point() +
geom_abline(intercept = 0, slope = 1, color = "red", linetype = "dashed") +
labs(x="Observed Annual Pop. Growth Rate\n(Log(Pop. 2015/Pop. 2000)/15)", y="Mean of Posterior Predictions") + theme_bw()
ggMarginal(p + theme(legend.position="left"), groupColour = TRUE, groupFill = TRUE, size=6)
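The over- and under-prediction noted above can also be quantified; a minimal sketch summarizing residuals (observed minus posterior mean prediction) by country for the model including larger cities:
## Mean and SD of residuals by country (illustrative; uses the same
## posterior mean predictions as the plots above):
resid_df <- data.frame(
  country_iso = brm_pop_iso_fe_bc$data$country_iso,
  observed = brm_pop_iso_fe_bc$data$AnnGrRate,
  predicted = apply(t(posterior_epred(brm_pop_iso_fe_bc)), 1, mean)
)
resid_df %>%
  mutate(residual = observed - predicted) %>%
  group_by(country_iso) %>%
  dplyr::summarize(MeanResid = mean(residual), SDResid = sd(residual))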
The following plots show relationships estimated by holding covariates at fixed levels across countries and examining the estimated prediction intervals (upper and lower bounds) for combinations of covariates:
## Candidate discussion plots:
plot_model(brm_pop_iso_fe, type = "pred", interval="prediction", terms = c("LogTotalPop_2000", "YoungDR_2000"))
plot_model(brm_pop_iso_fe_bc, type = "pred", interval="prediction", terms = c("YoungDR_2000", "LogTotalPop_2000 [2, 3, 4, 5, 6]"))
plot_model(brm_pop_iso_fe_bc, type = "pred", interval="prediction", terms = c("built_change", "LogTotalPop_2000 [2, 3, 4, 5, 6]"))
ggpredict(model = brm_pop_iso_fe, interval="prediction", typical="median", type="random", terms = c("LogTotalPop_2000", "country_iso [VNM]"), back_transform=F) %>%
plot(show_data=T)
## Data points may overlap. Use the `jitter` argument to add some amount of
## random variation to the location of data points and avoid overplotting.
ggpredict(model = brm_pop_iso_fe, interval="prediction", typical="median", type="random", terms = c("LogTotalPop_2000", "country_iso [IDN]"), back_transform=F) %>%
plot(show_data=T)
ggpredict(model = brm_pop_iso_fe, interval="prediction", typical="median", type="random", terms = c("LogTotalPop_2000", "country_iso [VNM,IDN]"), back_transform=F) %>%
plot(show_data=T)
ggpredict(model = brm_pop_iso_fe_bc, interval="prediction", typical="median", type="random", terms = c("built_change", "LogTotalPop_2000 [2, 3, 4, 5, 6]", "country_iso [VNM]"), back_transform=F) %>%
plot(show_data=T)
ggpredict(model = brm_pop_iso_fe_bc, interval="prediction", typical="median", type="random", terms = c("LogTotalPop_2000", "built_change [terciles]", "country_iso [VNM]"), back_transform=F) %>%
plot(show_data=T)
ggpredict(model = brm_pop_iso_fe, interval="prediction", typical="median", type="random", ci_level=0.95, terms = c("built_change", "YoungDR_2000 [terciles]", "country_iso"), facet="country_iso", condition=c(TotalPop_2000=150000), back_transform=F) %>%
plot(show_data=T)
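brms' built-in conditional_effects() offers a quick cross-check on the ggpredict()/plot_model() figures above; a minimal sketch for two of the covariates (the choice of effects here is illustrative):
## Conditional effects drawn directly from the fitted brms model:
ce <- conditional_effects(brm_pop_iso_fe, effects = c("built_change", "LogTotalPop_2000"))
plot(ce, points = TRUE, ask = FALSE)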
Further diagnostics could not be included in the manuscript for reasons of space, but are shown here for illustrative purposes:
## Diagnostic plots (ggplot() used in place of the deprecated qplot()):
ggplot(ModelCitiesMLUANoNA, aes(x = YoungDRChange, y = OldDRChange, color = country_iso)) +
  geom_point()
ggplot(ModelCitiesMLUANoNA, aes(x = LogTotalPop_2000, y = AnnGrRate, color = country_iso)) +
  geom_point()
ggplot(ModelCitiesMLUANoNA, aes(x = TotalPop_2000, y = AnnGrRate, color = country_iso)) +
  geom_point()
## Export model data for small/medium urban areas with predictions:
#write.csv(cbind(ModelCitiesMUANoNA, data.frame("Observed"=brm_pop_iso_fe$data$AnnGrRate, "Estimate"=apply(t(posterior_epred(brm_pop_iso_fe)), 1, mean))), file="./output/Model_Data_With_Mean_Estimate.csv")
The following code was used to create plots for presentations and the main manuscript. It is included here for illustration and reproducibility, but the plots are not recreated in this Supplement:
## Final plots for presentations:
p <- ggplot(aes(x=Observed, y=Estimate, color=country_iso), data=cbind(ModelCitiesMUANoNA, data.frame("Observed"=brm_pop_iso_fe$data$AnnGrRate, "Estimate"=predictions(brm_pop_iso_fe)$estimate))) +
geom_point(alpha=0.5) +
geom_abline(intercept = 0, slope = 1, color = "red", linetype = "dashed") +
labs(x="Observed Annual Pop. Growth Rate\n(Ln(Pop. 2015/Pop. 2000)/15)", y="Mean of Posterior Predictions") +
guides(color=guide_legend(title="Country")) +
scale_color_gdocs() +
theme_bw()
#theme_gdocs()
#theme_fivethirtyeight()
#pdf(file="./output/Predicted_Observed.pdf", width=7, height=5)
ggMarginal(p + theme(legend.position="left") + theme(panel.background = element_blank()), groupColour = TRUE, groupFill = TRUE, size=6)
#dev.off()
posterior <- as.matrix(brm_pop_iso_fe)
plot_title <- ggtitle("Posterior Distributions of Standardized Coefficient Estimates",
"Median Point Estimate, 80% (thick), and 95% (thin) Intervals")
color_scheme_set("teal")
mcmc_areas(posterior,
pars = colnames(posterior)[grepl("^b_", colnames(posterior))],
prob = 0.8, prob_outer=0.95) + plot_title + scale_color_gdocs() + theme_bw() +
geom_vline(xintercept = 0, color = "#F0B73E99", linetype = "dashed", size=1.25)
mcmc_intervals(posterior,
pars = colnames(posterior)[grepl("^r_", colnames(posterior))],
prob = 0.8, prob_outer=0.95) + plot_title + scale_color_gdocs() + theme_bw()
p <- list()
group_levels <- c("Environmental", "Land System", "Economic", "Demographic")
name_levels <- list(
#c("b_Intercept"),
c("Elevation", "Mean Temp. '00", "Mean Precip. '00"),
c("Built-up Area '00", "% Built-up Change '15-'00", "Mean Greenness '00", "% Greenness Change '15-'00"),
c("GDP Per Capita '00", "Travel Time to Capital", "Mean NTL Intensity"),
c("Young Dep. Ratio '00", "% Dep. Ratio Change '15-'00", "Log(Total Pop. '00)")
)
parameter_vec <- list(
#c("b_Intercept"),
c("b_scaleelevation", "b_scalemean_temp2000", "b_scalemean_precip2000"),
c("b_scalebuilt2000", "b_scalebuilt_change", "b_scalemean_green2000", "b_scalemean_green_change"),
c("b_scalegdp_percap2000", "b_scalett_capital", "b_scalentl"),
c("b_scaleYoungDR_2000", "b_scaleYoungDRChange", "b_scaleLogTotalPop_2000")
)
color_vec <- c("green", "purple", "yellow", "brightblue")
for (i in 1:length(parameter_vec)){
color_scheme_set(color_vec[i])
p[[i]] <- mcmc_areas(
posterior,
pars = rev(parameter_vec[[i]]),
prob = 0.8,
prob_outer = 0.95,
point_est = "median") +
scale_y_discrete(labels = rev(name_levels[[i]])) +
scale_x_continuous(limits = c(-0.015, 0.015)) +
geom_vline(xintercept = 0, color = "#F0B73E99", linetype = "dashed", size=1.25) +
scale_color_gdocs() + theme_bw()
if (i == 1) {
p[[i]] <- p[[i]] + plot_title
}
if (i != length(parameter_vec)) {
p[[i]] <- p[[i]] +
theme(axis.ticks.x = element_blank(),
axis.text.x = element_blank(),
axis.title.x = element_blank(),
axis.line.x = element_blank(),
plot.margin = unit(c(0,0,0,0), "cm")
)
}
if (i != 1) {
p[[i]] <- p[[i]] +
theme(title = element_blank())
}
}
layout <- '
A
B
C
D
'
#pdf(file="./output/Coefficients.pdf", width=7, height=5)
wrap_plots(A=p[[1]], B=p[[2]], C=p[[3]], D=p[[4]], design = layout)
#dev.off()
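As an alternative to the commented-out pdf()/dev.off() pairs, ggsave() can write the assembled patchwork figure directly; a minimal sketch (the output path mirrors the commented call above):
## Write the assembled coefficient figure to the output folder (illustrative):
ggsave("./output/Coefficients.pdf",
       plot = wrap_plots(A = p[[1]], B = p[[2]], C = p[[3]], D = p[[4]], design = layout),
       width = 7, height = 5)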
## Plot intercept and random, country effects:
plot_title <- ggtitle("Posterior Distributions of Overall and Country-level Intercepts",
"Median Point Estimate, 80% (thick), and 95% (thin) Intervals")
#pdf(file="./output/Intercepts.pdf", width=7, height=5)
mcmc_areas(posterior,
pars = rev(c("b_Intercept", colnames(posterior)[grepl("^r_", colnames(posterior))])),
prob = 0.8, prob_outer=0.95) +
plot_title + theme_bw() +
## Note, can use scale_y_discrete(limits=rev) instead of the rev() call:
scale_y_discrete(labels = rev(c(
"Model Intercept",
"IDN",
"KHM",
"LAO",
"MYS",
"THA",
"VNM"
))) +
geom_vline(xintercept = 0, color = "#F0B73E99", linetype = "dashed", size=1.25)
#dev.off()
## Density of an Exponential(5) distribution, illustrating the prior on the
## within-country (sigma_y) and between-country (sigma_0) SDs of growth rates:
ggplot() +
stat_function(fun = ~dexp(., 5),
geom = "area", fill = "black") +
xlim(c(0, 1)) +
labs(x = "**σ<sub>y</sub>** and **σ<sub>0</sub>**<br>Variation *within* countries ann. growth rates<br>and *between* countries growth rates") +
theme(axis.title.x = element_text(), axis.text.y = element_blank(),
axis.title.y = element_blank(), axis.ticks.y = element_blank(),
panel.grid = element_blank())
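The curve above illustrates an Exponential(5) density; the priors actually attached to the fitted model can also be listed directly, e.g.:
## Priors used in the fitted model (quick check):
prior_summary(brm_pop_iso_fe)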
## Updated correlation plot for final model variables:
name_levels <- list(
c("Ann. Growth Rt."),
c("Elevation", "Mean Temp. '00", "Mean Precip. '00"),
c("Built-up Area '00", "% Built-up Change '15-'00", "Mean Greenness '00", "% Greenness Change '15-'00"),
c("GDP Per Capita '00", "Travel Time to Capital", "Mean NTL Intensity"),
c("Young Dep. Ratio '00", "% Dep. Ratio Change '15-'00", "Log(Total Pop. '00)")
)
parameter_vec <- list(
c("AnnGrRate"),
c("elevation", "mean_temp2000", "mean_precip2000"),
c("built2000", "built_change", "mean_green2000", "mean_green_change"),
c("gdp_percap2000", "tt_capital", "ntl"),
c("YoungDR_2000", "YoungDRChange", "LogTotalPop_2000")
)
color_vec <- list(
c("black"),
c("#2d8659", "#2d8659", "#2d8659"),
c("#660066", "#660066", "#660066", "#660066"),
c("#aa975c", "#aa975c", "#aa975c"),
c("#0065cc", "#0065cc", "#0065cc")
)
corrs <- cor(ModelCitiesMUANoNA[,as.character(flatten(parameter_vec))])
colnames(corrs) <- as.character(flatten(name_levels))
rownames(corrs) <- as.character(flatten(name_levels))
#corrplot::corrplot(corrs, method="square", type = "upper", order = "hclust",
# tl.col = "black", tl.srt = 45)
#corrplot::corrplot(corrs, method="square", type = 'lower', order = 'hclust', tl.col = 'black',
# cl.ratio = 0.2, tl.srt = 45, col = COL2('PuOr', 10))
#corrplot::corrplot(corrs, method="square", type = 'lower', tl.col = as.character(flatten(color_vec)),
# cl.ratio = 0.2, tl.srt = 45, col = COL2('PuOr', 10), tl.cex=0.8)
## Final correlation plot for publication:
#pdf(file="./output/Correlations.pdf", width=10, height=5)
corrplot::corrplot(corrs, method="square", type = 'lower', tl.col = as.character(flatten(color_vec)),
cl.ratio = 0.2, tl.srt = 45, col = COL2('PuOr', 10), tl.cex=0.8)
#dev.off()
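As a numeric complement to the correlation plot, the strongest absolute pairwise correlations among the final model variables can be listed; a small sketch using the corrs matrix computed above:
## Strongest absolute pairwise correlations (upper triangle only):
corr_pairs <- as.data.frame(as.table(corrs)) %>%
  filter(as.integer(Var1) < as.integer(Var2)) %>%
  arrange(desc(abs(Freq)))
head(corr_pairs)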
## Calculate average annualized growth rate and %
## change in YDR by country:
summ <- ModelCitiesMUANoNA %>%
  group_by(country_iso) %>%
  dplyr::summarize(
    n = dplyr::n(),  ## number of urban areas per country
    MeanAnnGrRate = mean(AnnGrRate, na.rm=T),
    MinAnnGrRate = min(AnnGrRate, na.rm=T),
    MaxAnnGrRate = max(AnnGrRate, na.rm=T),
    SDAnnGrRate = sd(AnnGrRate, na.rm=T),
    MeanYoungDRChange = mean(YoungDRChange, na.rm=T),
    MinYoungDRChange = min(YoungDRChange, na.rm=T),
    MaxYoungDRChange = max(YoungDRChange, na.rm=T),
    SDYoungDRChange = sd(YoungDRChange, na.rm=T)
  ) %>%
  arrange(desc(MeanAnnGrRate))
summ
#table1(summ)
table1(~ AnnGrRate + YoungDRChange | country_iso, data=ModelCitiesMUANoNA)
summary(ModelCitiesMUANoNA)
## Quick correlation:
corrs