The Gibbs sampler for our model is based on iterative sampling from the full conditional density for each parameter (Gelfand and Smith 1990, Gelman et al. 1995). Here we summarize our method.
Regression parameters can be directly
sampled based on distributions derived from theory of linear models. Parameters
for fixed effects conditionally
depend on all tree fecundity schedules and on a bivariate normal prior,
where and
.
The posterior conditional for the
i
contains only the fecundity schedule for the ith tree,
=
where
and
.
The variance on hyperparameters has conditional posterior
=
.
Prior parameter values were c
= [2, 0.5]T, ,
and a
= b
= 0.01.
Seed production from the ith tree conditionally depends on the likelihood for all seed traps in all years (because there is dispersal and autocorrelation) and on its own fecundity schedule. For the ith tree we have the full posterior conditional
.
We use a Metropolis step based on
the proposal density . Thus,
we propose a full time series for each tree, and we accept or reject on the
basis of the full series, not on specific years.
Total variance 2
was sampled directly from the posterior conditional
= |
Prior parameter values were a
= b
= 0.01.
For
we used the symmetric proposal density
and
a Metropolis step, where
(g)
is the current value of
, and
to sample from
the conditional posterior
.
Unless near –1 or 1, the proposal
density has width 0.1. Near either limit, it narrows. Because correlations are
always bounded away from limits, the algorithm does not bog down there. As with
2, the covariance
matrix is assembled as basis for evaluating the conditional posterior.
Dispersal parameter
The dispersal parameter conditionally depends on all of the seed trap data,
.
We used a Metropolis-Hastings step with a gamma proposal density. Gamma distributions Gam(a,b) have mean a/b. Prior parameter values were au = 1 and bu = 0.01.
Tree-status parameters
Conditional posteriors for tree
status parameters contain both full likelihoods, because statuses qi
directly influence all seed traps through expected seed rain (Eqs. 13 and 14). For
a, the conditional
is
with that for b
having the appropriate prior. For both a
prior parameter values were a0 = 0.2 and b0
= 0.1, and b
prior
parameter values were a0 = 0.1 and b0 =
1. For both we used Gamma proposals and Metropolis-Hastings. Likewise, the female
fraction full conditional is
.
Recognition error was
.
The weighting parameter k was introduced as a means for manipulating the contribution of the two data sets. A range of values was explored. We ultimately used k = 1, i.e., both data sets weighted strictly in terms of their representation in the data sets.
Convergence
Despite the high dimensionality
of the model, Markov chains converged for all but a few species. We were not
satisfied that convergence was reached for several species having few seeds;
these species were not analyzed further (see Results in text). To initialize
seed production rates yit, plot-by-plot maximum-likelihood
(ML) estimates of a0 were obtained for each year under the
simplifying assumption of
= 1, and a0 = 0.5. These estimates were used to initialize
fecundity for each tree. A burn-in of 2000 to 20,000 iterations was sufficient
for all species included in this analysis, with steps to convergence depending
in large part on number of trees. Parameters that could be sampled directly
converged to values near the target distribution rapidly, with some slow drift
as the stickier yit’s converged more slowly.
|
FIG.
A1. Markov-chain Monte Carlo (MCMC) output for 20 parallel chains after
discarding 20,000 burn-in iterations. Parameters are scaled to familiar
or consistent units. Gelman and Rubin’s (1992) potential
scale-reduction values are in parentheses at right. Posterior densities
in the original scales are provided in the text figures. The four terms
of log fecundity are shown in the upper panel, where the diameter used to
scale a1 is di1 = 1.46 (or
28.6 cm), the diameter of the example tree in Fig. A2. Thus,
the intercept term (a0) dominates, followed by process
variability ( |
An example from Acer rubrum demonstrates a "worst-case" scenario. When there are many trees in a stand, there are many potential combinations of fecundity values that might "satisfy" the annual trap data, and there is more "overlap" in seed shadows, making it difficult to resolve parameters. Moreover, when interannual variability is large, there can be years of low seed recovery for which identification of source trees is especially challenging. A. rubrum is the most abundant species, and it is at highest density in stand C1 (720 trees/ha). After 20,000 burn-in iterations on parallel chains, all population level parameters are well-identified (Fig. A1). Convergence occurred from multiple initial conditions, and potential scale reduction factors are at acceptable values (Gelman and Rubin 1992).
An example of fecundity estimates for a tree from plot C1 shows how parameters are well identified for years with adequate seed recovery (say, more than 100 seeds), when trees produce more than 1000 seeds (1994, 1995, 1997, 2000, and 2002 in Fig. A2). In years of low seed recovery, fecundity estimates are low, and poorly identified on the log-scale plots of Fig. A2 (1992, 1993, 1996, 1998, 1999, 2001). The years of low fecundity estimates do not contribute substantially to overall fecundity, as is evident from the linear scale on the lower panel of Fig. A3. As this is an example of the most challenging species on the most challenging plot, most simulations converged much more rapidly.
|
FIG. A3. Annual estimates of fecundity from the example tree in Fig. A2 with 95% CIs. Poorly resolved estimates come from years of especially low fecundity. |
Extraction of species effects
The extraction of species effects
is summarized here. From the Gibbs output for a data set containing all trees
of a genus, we extracted the chain of i
and yi estimates for each individual of a given species
p. For the ith tree, the estimate of the full (fixed
plus random) effect at Gibbs step g is
. The
fixed effect for species p is the mean of random effects taken over all
individuals of that species,
where P is set of mp individuals belonging to species p. The individual effect for tree i of species p is centered on the species p population effect
.
The gth estimate of 2
is
.
Fecundity schedules yi were simply extracted from trees belonging to species p.
Gelfand, A. E., and A. F. M. Smith. 1990. Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association 85:398–409.
Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. 1995. Bayesian data analysis. Chapman and Hall, London, UK.
Gelman, A., and D. B. Rubin. 1992. Inference from iterative simulation using multiple sequences. Statistical Science 7:457–511.