Spatial heterogeneity analysis for estimating breeding values of tree height in a hybrid larch progeny test plantation

ABSTRACT Microenvironmental heterogeneity in forest stands results in spatial variations in growth traits. Especially in progeny tests in tree breeding, this spatial variation can prevent the accurate estimation of genetic parameters, including breeding values. To quantify spatial heterogeneity, genetic models incorporating spatial coordinate information have been effectively used to accurately estimate breeding values of individuals. In this study, we measured the height of all Japanese larch and hybrid larch individuals in a progeny trial at 1, 2, 5, 10, and 15 years after planting and calculated genetic parameters, including breeding values, using genetic models. According to the fittings of the candidate models, a genetic model incorporating spatial coordinate information is more suitable for estimating genetic parameters. Overall, a genetic model incorporating spatial coordinate information could be effective in explaining the genetic effects of tree height, which will be useful for estimating more accurate breeding values in hybrid larch breeding programs.


Introduction
In most breeding programs for animals and perennial plant species, predicting the breeding values of individuals is essential to choose candidate individuals with higher economic values, along with helping in ranking the parents, using data from progeny tests (White and Hodge 1989). Best Linear Unbiased Prediction (BLUP), first proposed by Henderson (1949) is the most widely used method for estimating the breeding values of individuals (Piepho et al. 2008). Combined with Residual/Restricted Maximum Likelihood (REML) (Robinson 1987), REML/BLUP has been widely used to estimate variance components.
The estimation of breeding values is also especially important for forest trees breeding programs. However, progeny trials of forest trees require large spaces provided for testing individuals' responses that are time-consuming, costly, and laborious (Zas 2006;White et al. 2007). Furthermore, these progeny trials are often established on hilly and irregular terrains and may be affected by adjacent stands, leading to spatial heterogeneity in micro-environmental factors, such as soil fertility, soil moisture, soil temperature, air temperature, slope, and solar radiation (Epperson 1990;Fukatsu et al. 2018). Therefore, understanding the effects of spatial variation caused by micro-environmental heterogeneity is essential for improving the accuracy of genetic parameters (Hamann et al. 2002).
Generally, multiple field test designs, such as randomized complete or incomplete block designs, can be used to handle micro-environmental effects (Fins et al. 1992). However, designing an appropriate blocking is difficult because high levels of environmental variation often exist within a block or replication in a field test design (Y-B et al. 1999;Costa e Silva et al. 2001). Spatial gradients and patchiness have been demonstrated in numerous progeny trials of forest trees (Y-B et al. 1999;Costa e Silva et al. 2001;Dutkowski et al. 2002). Spatial analytical methods have been used to adjust for the effects of spatial variation (Brownie et al. 1993;Stroup et al. 1994;Zas 2006;White et al. 2007;Cappa et al. 2015). Particularly, mixed models, incorporating autoregressive models, have shown a reduction in experimental error variance caused by spatial variation, with an increase in heritability estimates (Bian et al. 2017;Belaber et al. 2019). Additionally, this model could improve the accuracy of breeding values compared to that of other models without spatial effects, as shown in related empirical studies (Costa e Silva et al. 2001;Dutkowski et al. 2006;Ye and Jayawickrama 2008;Chen et al. 2018).
In Japan, the Japanese larch (Larix kaempferi) has been introduced from an area of natural distribution, Honshu (center of Japan), to Hokkaido (north of Japan) since 1872, and has been widely planted as it grows well in Hokkaido. However, this species exhibits low resistance to damage by field voles (Takahashi et al. 1966). To overcome this limitation, Kurile larch (L. gmelinii var. japonica) with high resistance to damage by field voles (Takahashi et al. 1966) was introduced mainly from the south of Sakhalin and bred with Japanese larch. The resultant hybrid larch (F 1 : L. gmelinii var. japonica × L. kaempferi) not only shows rapid juvenile growth but also high resistance to damage by field voles, as compared to those for Japanese larch and other hybrid larches (Hatakeyama and Kaji 1982;Kurahashi 1988). Currently, a specific half-sib progeny of F 1 named "Clean-Larch," produced by mating a maternal clone "Nakashibetsu-5" with paternal clones of Japanese larch, shows superior general combining ability and has advantages in height growth and survival as compared to those of the Japanese larch at several planting sites (Kita et al. 2017). In addition, this cultivar possesses desirable wood quality regarding low stem crookedness, high wood density, and superior wood strength (Fujimoto et al. 2007), as well as a greater capacity for carbon accumulation, which is useful for mitigating global warming (Kita et al. 2009). Spatial heterogeneity within planting sites has never been considered in the previous studies, when they evaluated the superiority of Clean-Larch over Japanese larch. Recently, this cultivar has been commercially used for afforestation in Hokkaido, Japan. The seed orchard producing Clean-Larch is composed of single-clone Kuril larch "Nakashibetsu-5" as female parents and multiclone Japanese larch as male parents, which was named as the "interspecific seed orchard with a single maternal clone" (Moriguchi et al. 2008). Since Clean-Larch that was used to establish this progeny test site was derived from such interspecific seed orchard, genetic variability of F 1 progeny mainly resulted from difference of paternal clones, thereby, genetic background of F 1 progeny is relatively simpler than progeny derived from other seed orchards with multi-clone parents. Owing to that, confounding of genetic effects with spatial variation may be relatively weaker, which suggests this progeny test site may be useful to quantify the spatial heterogeneity. Accurate estimation of breeding values for F 1 progeny is essential for detecting superior paternal clones to improve the constituting clones of this interspecific seed orchard with a single maternal clone. Especially, if combing DNA marker-based pedigree information, superior paternal clones can be appropriately selected based on more accurate breeding values (i.e. genetic gains) estimated by spatial heterogeneity analysis.
In this study, a progeny test plantation, which includes F 1 "Clean-Larch" (hereafter, hybrid larch) individuals and a few Japanese larch individuals, was chosen as the study site. The objectives of this study were (1) to explore the availability of an REML/BLUP model that incorporates the spatial autoregressive model (AR) (hereafter, spatial model) for height data compared to the standard REML/BLUP model (hereafter, non-spatial model); (2) to estimate the breeding values of individual trees for tree height using the spatial model; and (3) to re-evaluate the superiority of height growth of hybrid larch compared to that of Japanese larch based on the spatial model.

Study materials
The study site was a hybrid larch progeny test plantation established in 2006 spring at the University of Tokyo Hokkaido Forest, Furano-city, central Hokkaido (43°22N, 142°44E) (Figure 1), and it included hybrid larch individuals and a few Japanese larch seedling stocks ( Table 1). The hybrid larch individuals were produced in 2003 by the open pollination of a specific maternal clone (L. gmelinii var. japonica; "Nakashibetsu-5") that shows beneficial properties (e.g. high wood strength and density), with its nearby different paternal clones of plus trees of Japanese larch Figure 1. Progeny test site before planting (left) and plot design (right). Each black dot represents a transplant. The number inside each grid represents the plotnumber. The Japanese larch seedlings were planted randomly in the gray plot. (L. kaempferi) in an interspecific seed orchard in Kunnepputown, eastern Hokkaido (43°75N, 143°68E). The seeds of hybrid larch to harvest seedlings used in this study were collected from 11 clonal ramets of single mother clone "Nakashibetsu-5" in the seed orchard, then seedlings were raised at a nursery of the Forest Research Institute, Hokkaido Research Organization located at Bibai-city, western Hokkaido (43°29N, 141°85E) for 2 years since 2004. Seedlings were managed according to per mother tree. Transplantation of hybrid larch and Japanese larch was conducted on 11 May 2006. A plot-plantation was introduced for the design of this progeny test. A total of 117 plots were designed ( Figure 1). One tree row plot (4 m × 15 m) was composed of five seedlings including either hybrid larch seedlings derived from one specific mother tree, or Japanese larch seedlings. Inter-tree distance was 3 m. Semirandomization was used for the plot arrangement. Additionally, eight plots out of 117 plots were randomly selected to plant a total of 40 Japanese larch seedlings as a control treatment of the progeny test site. These seedlings were produced by seedling stocks derived from commercial larch seeds distributed in Hokkaido. Therefore, pedigree information including number of Japanese larch families were not available. Japanese larch seedlings used in this study were produced at the nursery of Sasaki Seedling Co. Ltd., located at eastern Hokkaido.
The tree height of all planted progenies was measured in the fall of 1 year (H_1 st y) and 2 years after planting (H_2 nd y) by a measuring pole (TAKETANI Co. LTD, Japan), and in the fall of 5 years (H_5 th y), 10 years (H_10 th y), and 15 years after planting (H_15 th y) by an ultrasonic vertex hypsometer (Haglöf Sweden AB, Långsele, Sweden). Notably, approximately 30% of the trees were thinned in 2017.

Data analysis
Two models (spatial and non-spatial) were used to estimate genetic parameters using the breedR package (Muñoz and Sanchez 2022) in R (R Core Team 2022). All data analyses were performed in a linear mixed model based on restricted maximum likelihood estimations, as follows: Silva et al. 2001) where y is the vector of observed phenotypic data,β is the vector of overall mean as a fixed effect, X is an incidence matrix for β; u is a vector of random additive genetic effect of individuals, u,Nð0; Aσ 2 u ), A is a matrix of additive genetic relationships based on pedigree, Z is an incidence matrix for u; e is the vector of random residuals, e,N 0; R ð Þ. For the non-spatial model, R ¼ Iσ 2 e , where σ 2 e is the residual variance, I is the identity matrix with a size the same as the number of data points. For the spatial model, Þ are first-order autoregressive correlation matrices for column and row directions of individuals with autocorrelation parameters � c and � r , � is the Kronecker product, σ 2 η is uncorrelated random residual variance.
For autocorrelation parameters, a set of rho in the row direction (� r Þ and column direction (� c Þ was selected to make the spatial model achieve the lowest Bayesian information criteria (BIC) values. Candidate rho values were 0.8 (positive), 0.2 (weakly positive), 0, −0.2 (weakly negative), and−0.8 (negative) for both parameters.
The availability of the spatial model was examined by comparing it to the non-spatial model based on the estimated genetic parameters of height trait at different ages. Spatial heterogeneity analysis for estimating the breeding values of 15-year-old individuals was conducted using the spatial model. Due to a lack of pedigree information for Japanese larch, the estimated value for this species was considered as an overall mean after eliminating the spatial effects.
The narrow-sense heritability (h 2 ) for tree height at different ages was computed automatically by fitting an animal model using breedR package.
The accuracy of the estimated breeding values (r g )of different ages' height trait, i.e. the correlation between the true (g) and predicted genetic values (ĝ), was calculated for offspring according to the following formula (Dutkowski et al. 2002): where PEV is the prediction error variance (square of the standard errors).

Comparison of height growth between the hybrid larch and Japanese larch
Based on the mean height at different ages, the results showed that the height growth of hybrid larch was greater than that of Japanese larch at an early stage until 15 years after planting ( Table 1). The mean height growth of the hybrid larch individuals was consistently greater by >10% compared to that of the Japanese larch before the thinning treatment (1 st y, 2 nd y, 5 th y, and 10 th y). The superiority in height growth of the hybrid against that of Japanese larch decreased after thinning (15 th y); nevertheless, the mean value of the hybrid larch was 6% higher than that of the Japanese larch.

Availability of the spatial model
Significant spatial heterogeneity was observed for height trait, as shown in Figure 2, where the spatial distribution of height growth was increasingly uneven at the test site over the years. Therefore, spatial model should be considered to use for data analysis.
Estimates of the genetic parameters of the non-spatial and spatial models for tree height at different ages are listed in Table 2. The results showed that the residual variance of the models (σ 2 η ) as well as Bayesian information criteria (BIC) in the spatial models were lower than those in the non-spatial models (σ 2 e ). Additionally, heritabilities (h 2 ) were higher in the spatial models, except for that of H_5 th y, which ranged from 0.207 to 0.561 for tree height at different ages. The average accuracies of the estimated breeding value (r g ) for all individuals were mostly higher in the spatial models, except for that of H_5 th y. Estimates of genetic variance (σ 2 u ) increased at ages 1, 2, and 15, and decreased at ages 5 and 10 in the non-spatial models compared to those in the spatial models.
For autocorrelation parameters by the spatial models, rho selected in the row direction showed two patterns: 0.2 for H_1 st y and H_5 th y, and 0.8 for H_2 nd y, H_10 th y, and H_15 th y. Rho selected in the column direction was consistent among the models; 0.8 for all ages.

Estimation of breeding values of all 15-year-old individuals using the spatial model
The spatial distribution of the breeding values of all 15-yearold individuals estimated by the non-spatial model was different from that of the spatial model (Figures 3(a, c)). In addition, spatial effects were remarkable (Figure 3(d)). The residuals obtained from the non-spatial model were spatially clustered (Figure 3(b)) and these spatial patterns were similar to those of the spatial effect obtained from the spatial model (Figure 3(d)). Therefore, considering the spatial effects, the residuals obtained from the spatial model were especially small (Figure 3(e)).
More accurate breeding values for the tree height of all 15year-old individuals ranged from −2.405 to 2.412 (Figure 4(a)). The relationship between the measured tree height and the estimated breeding values of all 15-year-old individuals was significant according to Pearson's correlation test (r = 0.678, p < 0.001). Based on the relationship, even though the values were dispersed from the regression line due to spatial effects (Figure 4(a)), the result seemed to show the larger the observed values, the larger the estimated breeding values. Additionally, the hybrid larch was significantly better than the Japanese larch based on the estimated breeding values in H_15 th y (p < 0.01by t-test; Figure 4(b)).

Discussion
In this study, some factors such as the shade of adjacent stands and slope were obvious in the progeny test site (Figure 1), which are highly likely to lead to spatial variation. Moreover, the patchy spatial distribution and gradient of tree height at the test site have been evident over the years (Figure 2), indicating the existence of spatial variation due to environmental heterogeneity. This is apparent because, during the selection of superior hybrid larch individuals based on the estimated genetic parameters of tree height, spatial variation may affect the outcome.
Analyses with spatial coordinates have been used effectively to quantify environmental heterogeneity, including the spatial gradient and patchiness of tree height, and to improve predicted genetic responses (Joyce et al. 2002;Dutkowski et al. 2006;Chen et al. 2018;Dong et al. 2020). In this study, spatial variance (σ 2 � ) seemed to be higher and higher over years (Table 2), which is compatible with spatial distribution patterns of height trait at different ages shown in Figure 2 which showed that the spatial distribution was increasingly uneven at the test site over the years.  N, number of individuals in the progeny test plantation; σ 2 u ; estimated additive genetic variance; σ 2 e residual variance in the non-spatial model; σ 2 � : variance of the spatial parameter; σ 2 η : residual variance in the spatial model; h2, narrow-sense heritability; BIC, Bayesian information criteria; r g , accuracy of breeding values of the offspring.
Nevertheless, spatial variance (σ 2 � ) was lower than residual variance (σ 2 η ) by the spatial model in H_1 st y, H_2 nd y, and H_5 th y, whereas much higher in H_10 th y, and H_15 th y (Table 2), which suggested that spatial heterogeneity was probably slowly taking over the dominant role in environment variation over years. Additionally, we considered that the spatial autocorrelation structure was not stable at early stage, because the selected autocorrelation parameters (rho) were not fixed in the column direction. As mentioned above, spatial heterogeneity might be taking over the dominant role over years to show consistent and positive correlations (rho = 0.8) at 10-and 15-year-old.
On the other hand, the residual variance (σ 2 η ) and BIC of tree heights using the spatial model were lower than those of the non-spatial model (σ 2 e ) at different ages, which indicated that the spatial model was more suitable (Table 2). A decrease in the residual variance using a spatial model has been reported for height traits in previous studies (Ye and Jayawickrama 2008;Cappa et al. 2015;Belaber et al. 2019). In our study, although model fitting was significantly improved and the residual variance decreased by the spatial model, changes in the additive genetic variance (σ 2 u ) showed no obvious tendency to increase or decrease. The effect of the spatial model on additive genetic variance was concluded to be not constant, which is consistent with previous studies (Costa e Silva et al. 2001;Dutkowski et al. 2006). Owing to this reason, Chen et al. (2018) considered that ill-fitting of the model or confounding of genetic effects with spatial variation, might result in an inconsistent effect on the σ 2 u . Although it is very difficult to have an unambiguous separation of genetic and environmental effects, using a spatial model is expected to adjust spatial heterogeneity for improving estimates of genetic parameters.
Several studies demonstrated that a spatial model to adjust spatial variation is effective for improving heritability (h 2 ) (Hamann et al. 2002;Zas 2006;Bian et al. 2017;Belaber et al. 2019), which reflects the degree of genetic control for tree height. In our study, there were small to moderate increases using the spatial model in the h 2 at different ages, except for the h 2 of H_5 th y (Table 2), which suggested that spatial  variation might have an adverse effect on the genetic analysis of tree height, and that the spatial model, by adjusting spatial variation, was helpful for improving the h 2 . Heritability of growth traits estimated by spatial models can be expected in practical tree-improvement programs. In addition, the indicators of accuracy of estimated breeding values (r g ) for H_1 st y, H_2 nd y, H_10 th y, and H_15 th y (Table 2) obtained from the spatial model were higher than those obtained from the nonspatial model. These findings are in agreement with those of several previous studies (Costa e Silva et al. 2001;Zas 2006;Ye and Jayawickrama 2008;Cappa et al. 2015;Chen et al. 2018). Thus, the spatial model is a better method for extracting the genetic effects of tree height and is valuable in breeding programs.
In this study, a comparison was made between the distribution of model parameters in the spatial and non-spatial models in H_15 th y (Figure 3). Remarkable heterogeneity in the spatial parameters were represented in the spatial model, as well as constantly small residuals, which could be explained by the high level of sensitivity of tree height to spatial heterogeneity. In general, spatial effects can cause erroneous variance estimations, leading to unadvisable decisions during the implementation of an advanced breeding strategy, ultimately resulting in incorrect estimations of breeding values and genetic gains (Hamann et al. 2002;Zas 2006). In our study, it was obvious that spatial heterogeneity affected the estimation of breeding values, giving rise to a bias in the estimated breeding values of 15-year-old individuals from the non-spatial model to the spatial model (Figure 3(a, c)).
The third objective of this study was to re-evaluate the superiority of height growth of F 1 hybrids compared to that of Japanese larch. The advantages of F 1 hybrids include rapid juvenile growth, inherited from the Japanese larch, and high resistance to field vole damage, inherited from the Kurile larch (Hatakeyama and Kaji 1982;Kurahashi 1988). Kita et al. (2017) reported that the height growth of the F 1 hybrid larch produced by mating a maternal clone ("Nakashibetsu-5") with paternal clones of Japanese larch at three out of four planting sites was faster compared to that of the Japanese larch at an early stage. Similar to the previous studies, the superiority of the F 1 hybrids was demonstrated in this study based on both the overall mean (Table 1) and the model estimates with spatial effects (Figure 4(b)). Therefore, although Japanese larch offspring with pedigree information is needed to estimate the robust breeding values in the future study, we firstly addressed the superiority of Clean-Larch against Japanese larch considering spatial heterogeneity within progeny test site in the present study.
Moreover, the individual breeding values in H_15 th y estimated by the spatial model were found to be different from those of the non-spatial model and more divergent from the regression line in the relationship with the observed height values (Figures 4(a), S1). This suggests that it may lead to misunderstandings of the genetic properties of growth-related performance for each progeny when estimating breeding values using a non-spatial model, as well as using observed height values directly. In contrast, as spatial model improved the accuracies of estimated breeding values, in our study, the individual with maximum breeding value about 2.4 m could be determined as a superior individual and would be practical, alongside high genetic gain. Moreover, combing DNA marker-based pedigree constructure, it could be applied to appropriately select superior paternal clones in hybrid larch breeding programs (Chen et al. 2022).

Conclusions
Microenvironmental heterogeneity in forest stands results in spatial variations in tree height. In this study, we investigated the spatial distribution and heterogeneity of tree heights at different ages using a spatial model. The reduction in the residuals and BICs from the non-spatial to the spatial model at different ages revealed that the spatial model was useful for removing spatial variation in tree height, as well as for improving model fitting. In addition, the accuracy of the genetic parameters was improved in the spatial model. Overall, the spatial model could be effective for extracting the genetic effects of growth-related performance in hybrid larch breeding programs.