Ecological Archives M074-010-A1

James S. Clark, Shannon LaDeau, and Ines Ibanez. 2004. Fecundity of trees and the colonization-competition hypothesis. Ecological Monographs 74:415–442.

Appendix A. Markov-chain Monte Carlo (MCMC) algorithm: conditional posteriors and convergence. A pdf file is also available.

The Gibbs sampler for our model is based on iterative sampling from the full conditional density for each parameter (Gelfand and Smith 1990, Gelman et al. 1995). Here we summarize our method.  

Fecundity parameters

Regression parameters can be directly sampled based on distributions derived from theory of linear models. Parameters for fixed effects conditionally depend on all tree fecundity schedules and on a bivariate normal prior,

where  and

The posterior conditional for the i contains only the fecundity schedule for the ith tree,

                             =

where

 

and

   

    .

The variance on hyperparameters has conditional posterior

             = .

Prior parameter values were c = [2, 0.5]T, , and a = b = 0.01.

Seed production from the ith tree conditionally depends on the likelihood for all seed traps in all years (because there is dispersal and autocorrelation) and on its own fecundity schedule. For the ith tree we have the full posterior conditional

.

We use a Metropolis step based on the proposal density . Thus, we propose a full time series for each tree, and we accept or reject on the basis of the full series, not on specific years.

Total variance 2 was sampled directly from the posterior conditional

 
 = .

 

Prior parameter values were a = b = 0.01. 

For we used the symmetric proposal density  and a Metropolis step, where (g) is the current value of , and  to sample from the conditional posterior

.

Unless near –1 or 1, the proposal density has width 0.1. Near either limit, it narrows. Because correlations are always bounded away from limits, the algorithm does not bog down there. As with 2, the covariance matrix is assembled as basis for evaluating the conditional posterior.

 

Dispersal parameter

The dispersal parameter conditionally depends on all of the seed trap data,

.

We used a Metropolis-Hastings step with a gamma proposal density. Gamma distributions Gam(a,b) have mean a/b. Prior parameter values were au = 1 and bu = 0.01.

 

Tree-status parameters

Conditional posteriors for tree status parameters contain both full likelihoods, because statuses qi directly influence all seed traps through expected seed rain (Eqs. 13 and 14). For a, the conditional is

with that for b having the appropriate prior. For both a prior parameter values were a0 = 0.2 and b0 = 0.1, and b prior parameter values were a0 = 0.1 and b0 = 1. For both we used Gamma proposals and Metropolis-Hastings. Likewise, the female fraction full conditional is

.

Recognition error was

.

The weighting parameter k was introduced as a means for manipulating the contribution of the two data sets. A range of values was explored. We ultimately used k = 1, i.e., both data sets weighted strictly in terms of their representation in the data sets.

 

Convergence

Despite the high dimensionality of the model, Markov chains converged for all but a few species. We were not satisfied that convergence was reached for several species having few seeds; these species were not analyzed further (see Results in text). To initialize seed production rates yit, plot-by-plot maximum-likelihood (ML) estimates of a0 were obtained for each year under the simplifying assumption of = 1, and a0 = 0.5. These estimates were used to initialize fecundity for each tree. A burn-in of 2000 to 20,000 iterations was sufficient for all species included in this analysis, with steps to convergence depending in large part on number of trees. Parameters that could be sampled directly converged to values near the target distribution rapidly, with some slow drift as the stickier yit’s converged more slowly.

 
   FIG. A1. Markov-chain Monte Carlo (MCMC) output for 20 parallel chains after discarding 20,000 burn-in iterations. Parameters are scaled to familiar or consistent units.  Gelman and Rubin’s (1992) potential scale-reduction values are in parentheses at right. Posterior densities in the original scales are provided in the text figures. The four terms of log fecundity are shown in the upper panel, where the diameter used to scale a1 is di1 = 1.46 (or 28.6 cm), the diameter of the example tree in Fig. A2. Thus, the intercept term (a0) dominates, followed by process variability (), diameter effect (a1dit), and individual effects (). The middle panel shows the diameter at which the maturation schedule   = 0.5 (a/b) and the mean dispersal distance . The lower panel shows the recognition error (v), female fraction (), and AR(1) parameter ().

 

An example from Acer rubrum demonstrates a "worst-case" scenario. When there are many trees in a stand, there are many potential combinations of fecundity values that might "satisfy" the annual trap data, and there is more "overlap" in seed shadows, making it difficult to resolve parameters. Moreover, when interannual variability is large, there can be years of low seed recovery for which identification of source trees is especially challenging. A. rubrum is the most abundant species, and it is at highest density in stand C1 (720 trees/ha).  After 20,000 burn-in iterations on parallel chains, all population level parameters are well-identified (Fig. A1). Convergence occurred from multiple initial conditions, and potential scale reduction factors are at acceptable values (Gelman and Rubin 1992).

 
    FIG. A2. Twenty parallel MCMC chains for an example Acer rubrum tree on plot C1 corresponding to those in Fig. A1. Numbers of seeds collected for each year are indicated in the upper left of each panel. Potential scale-reduction factors and smoothed posterior densities are at right. For years of low seed recovery (<100 seeds), estimated fecundities are low (<104 seeds) and not well identified.

 

An example of fecundity estimates for a tree from plot C1 shows how parameters are well identified for years with adequate seed recovery (say, more than 100 seeds), when trees produce more than 1000 seeds (1994, 1995, 1997, 2000, and 2002 in Fig. A2). In years of low seed recovery, fecundity estimates are low, and poorly identified on the log-scale plots of Fig. A2 (1992, 1993, 1996, 1998, 1999, 2001). The years of low fecundity estimates do not contribute substantially to overall fecundity, as is evident from the linear scale on the lower panel of Fig. A3. As this is an example of the most challenging species on the most challenging plot, most simulations converged much more rapidly.

 
   FIG. A3. Annual estimates of fecundity from the example tree in Fig. A2 with 95% CIs. Poorly resolved estimates come from years of especially low fecundity.

 

Extraction of species effects

The extraction of species effects is summarized here. From the Gibbs output for a data set containing all trees of a genus, we extracted the chain of i and yi estimates for each individual of a given species p. For the ith tree, the estimate of the full (fixed plus random) effect at Gibbs step g is . The fixed effect for species p is the mean of random effects taken over all individuals of that species,

                                                                            

where P is set of mp individuals belonging to species p. The individual effect for tree i of species p is centered on the species p population effect

.

The gth estimate of 2 is

.

Fecundity schedules yi were simply extracted from trees belonging to species p.

 

Literature cited

Gelfand, A. E., and A. F. M. Smith. 1990. Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association 85:398–409.

Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin. 1995. Bayesian data analysis. Chapman and Hall, London, UK.

Gelman, A., and D. B. Rubin. 1992. Inference from iterative simulation using multiple sequences. Statistical Science 7:457–511.



[Back to M074-010]