Long-term experiments and strip plot designs

In a long-term experiment the experimenter usually needs to know whether the effect of a treatment varies over time. But time usually has both a fixed and a random effect on the response, and the difficulty of the analysis depends on the particular design considered and on the availability of covariates. As shown in the paper, covariates can be very useful for modeling the random effect of time. In this paper a model to analyze data from a long-term strip plot design with covariates is proposed. Its effectiveness is tested using both simulated data and real data from a crop rotation experiment.


Introduction
Long-term experiments (LTEs) are commonly used in agronomy, soil science, ecology, biology, medicine and other disciplines to compare the effects of different treatment regimes over an extended length of time, usually years [8]. Treatments are usually assigned to experimental units at the start of the study, and measurements of interest are taken regularly over the course of the trial. Usually the same manipulation is applied one or more times over the course of the trial (annual fertilization, tillage regimes, and so on), and/or a manipulation that varies in a planned manner over time, such as crop rotation, is involved. Different experimental designs can be considered, ranging from randomized complete block (RCB) designs to split plot and strip plot designs. Even though in real experiments the two aspects (repeated measures and experimental design) are usually combined, very often only one of the two is emphasized during the analysis of the data.
In an LTE the purpose is usually to test and estimate the time × treatment interaction: that is, the experimenter needs to know not only the effect of a treatment but, most of all, whether this effect varies over time. Standard methods for the analysis of repeated measures data cannot be applied, since one of their fundamental assumptions is that subjects (units) respond independently of one another. This is not the case for LTEs employing standard designs, since all plots are affected simultaneously by the same random effect of year.
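The dependence induced by a shared year effect can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: two plots are observed over many hypothetical years, and each response is a common year effect plus an independent plot-level error. The function names and the standard deviations are assumptions made for the example, not quantities taken from the experiment.

```python
import random

def simulate_two_plots(n_years, sd_year, sd_error, seed=0):
    """Responses of two plots that share the same random year effect g_t."""
    rng = random.Random(seed)
    plot1, plot2 = [], []
    for _ in range(n_years):
        g_t = rng.gauss(0.0, sd_year)            # year effect common to both plots
        plot1.append(g_t + rng.gauss(0.0, sd_error))
        plot2.append(g_t + rng.gauss(0.0, sd_error))
    return plot1, plot2

def pearson(u, v):
    """Sample Pearson correlation between two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    suu = sum((x - mu) ** 2 for x in u)
    svv = sum((y - mv) ** 2 for y in v)
    return suv / (suu * svv) ** 0.5

p1, p2 = simulate_two_plots(n_years=2000, sd_year=2.0, sd_error=1.0)
r = pearson(p1, p2)
# Theoretical correlation: sd_year^2 / (sd_year^2 + sd_error^2) = 0.8
print(round(r, 2))
```

With no year effect (`sd_year=0.0`) the two series are uncorrelated, which is the situation that standard repeated measures methods assume.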
In order to quantify and account for random environmental conditions, additional information must be supplied, or some assumptions must be made about the nature of fixed and random effects ('cycle' and 'year') of time.
In order to separate the fixed and random effects of time, Loughin et al. [8] suggest making assumptions on their nature, that is:
• to make an assumption about the structure of the fixed effect of time, for example hypothesizing a particular trend for the mean effect of time;
• to add covariates to the model to account for the random effects of time and time × treatment.
The first solution can be considered only if we have an idea of what trend the mean follows, and a graphical representation of the data can help in this sense. The second approach requires that yearly recorded covariates be available, so that they can be included in the model as fixed effects.
Considering a single factor A with $a$ levels, let $Y_{ikt}$ represent a measurement taken at time $t = 1, 2, \ldots, T$ on a unit from block $k = 1, 2, \ldots, K$ receiving treatment $i = 1, 2, \ldots, a$. The univariate model for repeated measures taken in an RCB design is
$$Y_{ikt} = \mu + \alpha_i + r_k + (r\alpha)_{ik} + \gamma_t + g_t + (\alpha\gamma)_{it} + (\alpha g)_{it} + \varepsilon_{ikt} \qquad (1)$$
where $\mu$ is a general mean effect, $\alpha_i$ is the effect of the $i$th level of factor A, $r_k$ is the effect of block $k$, $(r\alpha)_{ik}$ is the random plot error, that is, a random error effect for factor A, $\gamma_t$ is the fixed main effect of 'cycle' $t$, $g_t \sim N(0, \sigma^2_g)$ is the random effect of 'year', $(\alpha\gamma)_{it}$ is the fixed interaction of cycle (time) and treatment, $(\alpha g)_{it} \sim N(0, \sigma^2_{ga})$ is the random interaction of time and treatment, and $\varepsilon_{ikt}$ is the random error associated with the measurement taken at time $t$ on a unit in block $k$ receiving treatment $i$ and is $N(0, \sigma^2_\varepsilon)$. $g_t$ and $(\alpha g)_{it}$ are independent of each other; $\varepsilon_{ikt}$ and $\varepsilon_{i'k't'}$ are independent of each other if $i \neq i'$ or $k \neq k'$, but may be correlated with correlation $\rho_{tt'}$ if $i = i'$ and $k = k'$; and finally the $\varepsilon_{ikt}$ are independent of $(r\alpha)_{ik}$ for all $i, k, t$.
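The terms $g_t$ and $(\alpha g)_{it}$ account for the random variation associated with years and for possible random variations in treatment effect associated with yearly fluctuation. These two terms can be modeled as functions of covariates, whenever available; for a single yearly covariate $x_t$, and in the notation used for model (4), this amounts to
$$g_t = \delta x_t + g'_t, \qquad (\alpha g)_{it} = (\alpha x)_{it} + (\alpha g)'_{it}.$$
Table 1s (online supplemental material) shows the variances of the components for model (1).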
Under model (1) with covariates the terms $\sigma^2_g$ and $\sigma^2_{ga}$ are replaced by the residual variances $\sigma'^{2}_{g}$ and $\sigma'^{2}_{ga}$, respectively, and, if the covariates are effective in explaining the random variation, $\sigma'^{2}_{g} < \sigma^2_g$ and $\sigma'^{2}_{ga} < \sigma^2_{ga}$.
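The variance reduction described above can be sketched numerically. In the example below the year effects are generated directly as a linear function of a covariate plus a residual, and the residual variance after a least-squares fit is compared with the raw variance. This is a toy illustration under stated assumptions: in a real LTE the year effects are not observed directly, and the values of `delta`, `sd_resid` and the number of years are hypothetical.

```python
import random

def ols_on_covariate(x, g):
    """Least-squares fit of g on x (with intercept); returns slope and residuals."""
    n = len(x)
    mx, mg = sum(x) / n, sum(g) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxg = sum((xi - mx) * (gi - mg) for xi, gi in zip(x, g))
    slope = sxg / sxx
    resid = [gi - mg - slope * (xi - mx) for xi, gi in zip(x, g)]
    return slope, resid

rng = random.Random(42)
n_years = 200                      # hypothetical: far more years than a real LTE
delta, sd_resid = 2.0, 0.5         # assumed covariate effect and residual sd
x = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
g = [delta * xi + rng.gauss(0.0, sd_resid) for xi in x]   # year effect = delta * x_t + residual

mean_g = sum(g) / n_years
var_g = sum((gi - mean_g) ** 2 for gi in g) / (n_years - 1)     # variance without covariate
slope, resid = ols_on_covariate(x, g)
var_g_resid = sum(r ** 2 for r in resid) / (n_years - 2)        # variance left after covariate
print(round(var_g, 2), round(var_g_resid, 2), round(slope, 2))
```

The more of the year-to-year variation the covariate explains, the smaller the residual variance becomes relative to the raw one.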

Strip plot experimental design
Strip plot designs are common in plant breeding, animal science and health science experiments, especially when two factors (A and B) requiring large experimental units are to be tested in the same experiment. Lansky [6] shows how strip plot designs can be a powerful tool for identifying subtle effects of factors within biological assays, while Farewell and Herzberg [3] show their usefulness in studies of the training of medical practitioners. Strip plot designs also go under different names, including strip block design, split block experiment design, two-way whole plot design, sub-treatments in strips across blocks and criss-cross design (for complete references see [4]). In such a design, the experimental field is divided into blocks and each block is divided into strips perpendicular to each other (Figure 1). Factor A is randomly applied to strips in one direction, while Factor B is randomly applied to strips, which are actually a new set of whole plots, orthogonal to the original plots used for factor A. Here, unlike in a split plot design, which is characterized by whole plots and subplots, there are two whole plot treatments, A and B: an RCB design is used for the factor A treatments, and the factor B treatments are also arranged in an RCB design. The levels of factor B go across all levels of factor A and vice versa in a criss-cross manner. This arrangement, with a different randomization, is repeated in each of the (say K) complete blocks. The usual response model equation is
$$Y_{ijk} = \mu + r_k + \alpha_i + (r\alpha)_{ik} + \beta_j + (r\beta)_{jk} + (\alpha\beta)_{ij} + \varepsilon_{ijk} \qquad (2)$$

A. Plaia
with $k = 1, 2, \ldots, K$, $i = 1, 2, \ldots, a$ and $j = 1, 2, \ldots, b$, where $Y_{ijk}$ is the response for the $ijk$th experimental unit; $\mu$ is a general mean effect; $r_k$ is the $k$th block effect and is IID$(0, \sigma^2_r)$; $\alpha_i$ is the effect of the $i$th level of factor A; $(r\alpha)_{ik}$ is a random error effect for factor A and is IID$(0, \sigma^2_a)$; $\beta_j$ is the effect of the $j$th level of factor B; $(r\beta)_{jk}$ is a random error effect for factor B and is IID$(0, \sigma^2_b)$; $(\alpha\beta)_{ij}$ is the $ij$th interaction effect of the two factors A and B; and $\varepsilon_{ijk}$ is a random error effect for the interaction effects and is IID$(0, \sigma^2_\varepsilon)$. The different random effects $(r\alpha)_{ik}$, $(r\beta)_{jk}$ and $\varepsilon_{ijk}$ are assumed to be independent. Since the experimental units are different for factors A and B and for their interaction, three different error terms are required in this design. Tables 1 and 2 show, respectively, the expected values of the various mean squares and the ANOVA table of a standard strip plot design with fixed A and B and random block.
With the assumption of normality for the random error effects in Equation (2), the F-tests in Table 2 can be used for fixed effects for both factors. Each of the three F-tests requires a different error term. Therefore, three different parts can be distinguished in the ANOVA table (Table 2):
• a first part with factor A sum of squares and the corresponding error term (Error(a));
• a second part with factor B sum of squares and the corresponding error term (Error(b));
• a third part with interaction A × B sum of squares and the corresponding error term (Error(ab)).

Table 1. Expected mean squares for the ANOVA of a standard strip plot design (columns: Source of variation; Degrees of freedom; Expected mean square values).
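For reference, the expected mean squares implied by model (2) with fixed A and B and random blocks can be written as follows. This is a sketch of the standard textbook result, derived from the model assumptions under the usual sum-to-zero constraints on the fixed effects; it is not a copy of Table 1, and the layout is an assumption.

```latex
\begin{array}{lll}
\text{Source} & \text{d.f.} & \text{Expected mean square} \\ \hline
\text{Block} & K-1 & \sigma^2_\varepsilon + b\,\sigma^2_a + a\,\sigma^2_b + ab\,\sigma^2_r \\
A & a-1 & \sigma^2_\varepsilon + b\,\sigma^2_a + \dfrac{bK\sum_i \alpha_i^2}{a-1} \\
\text{Error}(a) & (a-1)(K-1) & \sigma^2_\varepsilon + b\,\sigma^2_a \\
B & b-1 & \sigma^2_\varepsilon + a\,\sigma^2_b + \dfrac{aK\sum_j \beta_j^2}{b-1} \\
\text{Error}(b) & (b-1)(K-1) & \sigma^2_\varepsilon + a\,\sigma^2_b \\
A \times B & (a-1)(b-1) & \sigma^2_\varepsilon + \dfrac{K\sum_{i,j} (\alpha\beta)_{ij}^2}{(a-1)(b-1)} \\
\text{Error}(ab) & (a-1)(b-1)(K-1) & \sigma^2_\varepsilon
\end{array}
```

Each fixed-effect line differs from its error line only by the term involving the fixed effects, which is why each of the three F-tests uses its own denominator.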

Artificial example
To make the ideas clearer, consider a simulated example in which data are generated according to model (2), with A (significant) as the row factor and B (significant) as the column factor, three levels each, and the field divided into K = 2 blocks. Figure 2 shows the interaction plot for the simulated data, while Table 3 shows the corresponding ANOVA table.
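A simulation of this kind can be sketched in a few lines. The code below draws one data set from model (2) and computes the balanced strip plot ANOVA with its three error terms. It is a minimal sketch, not the code used for Table 3: the effect sizes, standard deviations, seed and function names are hypothetical, and the A × B interaction is set to zero for simplicity.

```python
import random
from itertools import product

def simulate_strip_plot(alpha, beta, K, sd_r, sd_a, sd_b, sd_e, mu=10.0, seed=1):
    """Draw y[i][j][k] from model (2), here with no A x B interaction."""
    rng = random.Random(seed)
    a, b = len(alpha), len(beta)
    r = [rng.gauss(0.0, sd_r) for _ in range(K)]                        # block effects
    ra = [[rng.gauss(0.0, sd_a) for _ in range(K)] for _ in range(a)]   # error for A strips
    rb = [[rng.gauss(0.0, sd_b) for _ in range(K)] for _ in range(b)]   # error for B strips
    return [[[mu + r[k] + alpha[i] + ra[i][k] + beta[j] + rb[j][k] + rng.gauss(0.0, sd_e)
              for k in range(K)] for j in range(b)] for i in range(a)]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def strip_plot_anova(y):
    """Sums of squares, degrees of freedom and F ratios for a balanced strip plot."""
    a, b, K = len(y), len(y[0]), len(y[0][0])
    A, B, R = range(a), range(b), range(K)
    m = mean(y[i][j][k] for i, j, k in product(A, B, R))
    mA = [mean(y[i][j][k] for j in B for k in R) for i in A]
    mB = [mean(y[i][j][k] for i in A for k in R) for j in B]
    mR = [mean(y[i][j][k] for i in A for j in B) for k in R]
    mAR = [[mean(y[i][j][k] for j in B) for k in R] for i in A]
    mBR = [[mean(y[i][j][k] for i in A) for k in R] for j in B]
    mAB = [[mean(y[i][j][k] for k in R) for j in B] for i in A]
    ss = {
        "Block": a * b * sum((mR[k] - m) ** 2 for k in R),
        "A": b * K * sum((mA[i] - m) ** 2 for i in A),
        "Error(a)": b * sum((mAR[i][k] - mA[i] - mR[k] + m) ** 2 for i in A for k in R),
        "B": a * K * sum((mB[j] - m) ** 2 for j in B),
        "Error(b)": a * sum((mBR[j][k] - mB[j] - mR[k] + m) ** 2 for j in B for k in R),
        "AxB": K * sum((mAB[i][j] - mA[i] - mB[j] + m) ** 2 for i in A for j in B),
        "Error(ab)": sum((y[i][j][k] - mAB[i][j] - mAR[i][k] - mBR[j][k]
                          + mA[i] + mB[j] + mR[k] - m) ** 2
                         for i, j, k in product(A, B, R)),
    }
    df = {"Block": K - 1, "A": a - 1, "Error(a)": (a - 1) * (K - 1),
          "B": b - 1, "Error(b)": (b - 1) * (K - 1),
          "AxB": (a - 1) * (b - 1), "Error(ab)": (a - 1) * (b - 1) * (K - 1)}
    ms = {source: ss[source] / df[source] for source in ss}
    # Each fixed effect is tested against its own error term
    f_ratios = {"A": ms["A"] / ms["Error(a)"],
                "B": ms["B"] / ms["Error(b)"],
                "AxB": ms["AxB"] / ms["Error(ab)"]}
    ss_total = sum((y[i][j][k] - m) ** 2 for i, j, k in product(A, B, R))
    return ss, df, f_ratios, ss_total

y = simulate_strip_plot(alpha=[-2.0, 0.0, 2.0], beta=[-1.5, 0.0, 1.5], K=2,
                        sd_r=1.0, sd_a=0.5, sd_b=0.5, sd_e=0.3)
ss, df, f_ratios, ss_total = strip_plot_anova(y)
for source in ss:
    print(source, df[source], round(ss[source], 3))
print({term: round(f, 2) for term, f in f_ratios.items()})
```

With only K = 2 blocks the error terms for A and B have 2 degrees of freedom each, so the corresponding F ratios are quite variable from one simulated data set to another.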

Long-term strip plot
When a standard strip plot designed experiment is conducted at several sites, in several years, or repeated in some other way, the researcher may want to combine the results from the individual experiments, even if the results of the individual experiments will need to be obtained and interpreted as well. LTEs and strip plot designs are both schemes followed in agriculture and other field experiments but, to our knowledge, a complete analysis of data coming from a long-term strip plot design is absent in the literature. Even when an experiment follows a long-term strip plot design, the data are often not analyzed appropriately: in [5], for example, an experiment is considered in which treatments were arranged in a strip plot design and observed over six years, but separate analyses for each year are presented. Federer and King [4] describe how to combine results from a strip plot experiment over several sites, but the approach cannot be immediately extended to the case of several years (long-term), though this case is theoretically considered by the authors at the beginning of the book chapter. As a matter of fact, Federer and King [4] do not distinguish between the fixed and random effects of time. Here, we will try to merge the model for data coming from a strip plot experimental design (2) with the model to analyze data from an RCB long-term experimental design (1): to do this we will consider 'time' as the 'outermost block'.
The new model (3), obtained by suitably merging models (1) and (2), makes it possible to account simultaneously for the effect of time and for the specific experimental design (strip plot):
$$Y_{ijkt} = \mu + \gamma_t + g_t + (rg)_{kt} + \alpha_i + (\alpha\gamma)_{it} + (\alpha rg)_{ikt} + \beta_j + (\beta\gamma)_{jt} + (\beta rg)_{jkt} + (\alpha\beta)_{ij} + (\alpha\beta\gamma)_{ijt} + \varepsilon_{ijkt} \qquad (3)$$
where $Y_{ijkt}$ is the response for the $ijkt$th experimental unit; $\mu$ is a general mean effect; $\alpha_i$ is the effect of the $i$th level of factor A; $\beta_j$ is the effect of the $j$th level of factor B; $\gamma_t$ is the fixed effect of the $t$th time (cycle); $g_t$ is the random effect of year and is $N(0, \sigma^2_g)$; $(rg)_{kt}$ is the random error effect for time and is $N(0, \sigma^2_{rg})$; $(\alpha rg)_{ikt}$ is a random error effect for factor A and is $N(0, \sigma^2_a)$; $(\beta rg)_{jkt}$ is a random error effect for factor B and is $N(0, \sigma^2_b)$; $(\alpha\beta)_{ij}$ is the $ij$th interaction effect of the two factors A and B; $(\alpha\gamma)_{it}$ is the fixed interaction of cycle (time) and treatment A; $(\beta\gamma)_{jt}$ is the fixed interaction of cycle (time) and treatment B; $(\alpha\beta\gamma)_{ijt}$ is the fixed interaction of cycle (time), treatment A and treatment B; and $\varepsilon_{ijkt}$ is a random error effect for the interaction effects and is $N(0, \sigma^2_\varepsilon)$. Model (3) has been obtained by considering time as the outermost factor and, therefore, starting from models (1) and (2):
• by substituting $r_k$ with $(rg)_{kt}$, the random interaction of time and block effects.
If we look at model (3) as the extension of model (2) in [8] to the strip plot design, its appropriateness for modeling long-term strip plot experiments is immediately apparent. In model (3) four different error terms can be distinguished: $\sigma^2_{rg}$ to test the effect of time, $\sigma^2_a$ and $\sigma^2_b$ to test the effects of A and B, respectively, and their interactions with time, while $\sigma^2_\varepsilon$ makes it possible to test the A × B and A × B × time interactions. Table 4 shows the expected mean squares for the ANOVA of a long-term strip plot design. The corresponding ANOVA table is shown in Table 5. With respect to Table 2, one more stratum has been added, with the sum of squares for Time and the corresponding error term (Error(T)).
As can be seen from Table 4, there are no proper error terms for the interactions A × Time, B × Time and A × B × Time. The naive F-values computed by using $E_a$, $E_b$ and $E_{ab}$, respectively, as denominators (Table 5) can lead to biased tests, because the term $\sigma^2_g$ (random effect of time) inflates the numerators.
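The strata of the long-term strip plot ANOVA can be checked with simple degrees-of-freedom bookkeeping. The function below is a sketch derived from the structure of model (3), in which blocks and strips are nested within time; it is not a reproduction of Table 5, and the function name and source labels are assumptions.

```python
def long_term_strip_plot_df(a, b, K, T):
    """Degrees of freedom per source, with blocks nested within time."""
    return {
        "Time": T - 1,
        "Error(T)": T * (K - 1),
        "A": a - 1,
        "A x Time": (a - 1) * (T - 1),
        "Error(a)": T * (a - 1) * (K - 1),
        "B": b - 1,
        "B x Time": (b - 1) * (T - 1),
        "Error(b)": T * (b - 1) * (K - 1),
        "A x B": (a - 1) * (b - 1),
        "A x B x Time": (a - 1) * (b - 1) * (T - 1),
        "Error(ab)": T * (a - 1) * (b - 1) * (K - 1),
    }

# Layout of the simulated examples: 3 x 3 treatments, 2 blocks, 5 years
df = long_term_strip_plot_df(a=3, b=3, K=2, T=5)
for source, value in df.items():
    print(f"{source:14s}{value:4d}")
print("Total", sum(df.values()))   # equals a * b * K * T - 1
```

The degrees of freedom add up to the number of observations minus one, which is a quick consistency check on the four strata.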

Artificial example
To make the ideas clearer, consider 100 simulated examples in which data are generated according to model (3) (without the second-level interaction $(\alpha\beta\gamma)_{ijt}$), with A as the row factor and B as the column factor, three levels each, the field divided into two blocks, and the experiment conducted for 5 years. Figure 3 shows the interaction plots for a single simulated data set, while Table 6 shows the true model parameters and the means of their estimates over the 100 simulated data sets, obtained by fitting model (3). As expected, in some cases the parameter estimates are inaccurate: this is especially true for the two variances $\sigma^2_g$ and $\sigma^2_{rg}$. The problem is that the fixed and random effects of time are partially confounded. A solution to this problem is proposed in the next section.

Table 4. Expected mean squares for the ANOVA of a long-term strip plot design (columns: Source of variation; Degrees of freedom; Expected mean square values).

Long-term strip plot with covariates
In model (3), as already explained for model (1), the fixed effect of time, $\gamma_t$, is partially confounded with its random effect $g_t$, and the naive F values can lead to biased tests. As already stated in Section 2, two alternative solutions can be considered to solve this problem: a model for the fixed effect or, if available, covariates to explain the random variation.
The first choice needs to be supported at least by an idea of a possible model for the mean, but if the chosen model is inadequate the estimated mean results will be misleading. If covariate data are available, a better choice can be to model $g_t$ as a function of covariates: many 'random' variations are actually conglomerations of fixed effects [8]. For example, in agronomy, rainfall, solar radiation, etc., or synthetic measures that account for a combined effect of weather variables (like the Water Stress Index, WSI) can be considered, since they are known to affect crop yields and other agronomic measurements. According to this solution, a new model can be introduced by substituting $g_t$ in model (3) with
$$\delta x_t + (\alpha x)_{it} + (\beta x)_{jt} + (\alpha\beta x)_{ijt} + g'_t$$
where $\delta x_t$ is the effect of the continuous covariate $x$ at time $t$, $(\alpha x)_{it}$ is the interaction of $x$ and treatment A, $(\beta x)_{jt}$ is the interaction of $x$ and treatment B, $(\alpha\beta x)_{ijt}$ is the interaction of $x$, treatment A and treatment B, and $g'_t$ is the residual random effect of time that, if the covariates are effective in explaining the random variations, is $N(0, \sigma'^{2}_{g})$, with $\sigma'^{2}_{g} \ll \sigma^2_g$. As a result, the bias in the F tests in Table 5 is reduced.
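The complete model will be:
$$Y_{ijkt} = \mu + \gamma_t + \delta x_t + g'_t + (rg)_{kt} + \alpha_i + (\alpha\gamma)_{it} + (\alpha x)_{it} + (\alpha rg)_{ikt} + \beta_j + (\beta\gamma)_{jt} + (\beta x)_{jt} + (\beta rg)_{jkt} + (\alpha\beta)_{ij} + (\alpha\beta\gamma)_{ijt} + (\alpha\beta x)_{ijt} + \varepsilon_{ijkt} \qquad (4)$$
An application of this model, which may help the reader to understand it, is presented in Section 6.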

Artificial example
Again, consider 100 simulated examples in which data are generated according to model (4) (without the second-level interaction $(\alpha\beta\gamma)_{ijt}$), with A as the row factor and B as the column factor, three levels each, the field divided into 2 blocks, and the experiment conducted for 5 years (parameter values are shown in Table 7). A covariate $x$ has now been introduced in the model. Figure 4 shows the interaction plots for a single simulated data set. Table 7 shows the true model parameters and their mean estimates over the 100 simulated data sets, obtained by fitting both model (4) and model (3). As expected, variances estimated according to model (3)

A real long-term strip plot trial
Data are here presented from a rotation experiment established in a region in the South of Italy in 1992 [1]. A crop rotation is a cropping system in which different crops are grown in a sequence on the same piece of land, one after the other [13]. Figure 1 shows the layout of the considered experiment. It is a typical strip plot design, with two factors, Tillage and Precession, with three levels each, replicated over two Blocks and observed over 17 years (starting in 1992-1993).
Three crop sequences (Precession) as horizontal treatments, and three soil Tillage systems as vertical treatments have been considered. Different aspects of crop productivity have been considered in the experiment: grain yield will be analyzed here.
Each phase of each rotation is present each year since, within each replication, there are two groups of plots (established in the first year): one group of plots receives all the Tillage-Precession combinations according to a strip plot design in their grain phase, while the second group of plots is planted to the second phase of the rotations.
Thus, although the year-wise observations (grain yield) are available for combinations of Tillage, Precession, and replication, they actually come (1) from one of the two groups of plots and (2) from the same plot only in alternate years.
As an example, consider the year-wise yield of grain (Y) from Plots 1 and 2 under the same crop-sequence/tillage combination and replication. We assume that in the W-FB rotation, Plot 1 is in the wheat (W) phase and Plot 2 in the faba-bean (FB) phase in the first year. The series of grain yields $Y_{11}, Y_{12}, \ldots$ arises from Plot 1 (in the grain phase (W) in odd years), while the grain yields $Y_{21}, Y_{22}, \ldots$ arise from Plot 2 (beginning with the faba-bean phase in Year 1 and receiving the grain phase in even years). The observations over years within each plot can be correlated, but are independent (or have ignorable correlation) when they come from different plots.
Since data come from the same plot in alternate years, and emphasis is given to random environmental effects and their interaction with treatments, serial correlation will not be accounted for. This is justified by the autocorrelation plots, which show the autocorrelation at different lags among measures (y) coming from the same plot, for each of the 18 = 3 (levels of Crop Sequence) × 3 (levels of Tillage) × 2 (blocks) plot units shown in Figure 1. As Payne [9] suggests, one of the issues to consider is the possibility of different amounts of random variation in different years. For this reason, a test for homogeneity of variances has been carried out, and it was not significant.
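The within-plot check described above relies on the sample autocorrelation at different lags. A minimal sketch of the standard estimator is given below; the yield values are hypothetical and serve only to show the computation, and the band of about 1.96 divided by the square root of the series length is the usual rough reference under no serial correlation.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of one plot's series at a given lag."""
    n = len(series)
    m = sum(series) / n
    denom = sum((v - m) ** 2 for v in series)
    num = sum((series[t] - m) * (series[t + lag] - m) for t in range(n - lag))
    return num / denom

# Hypothetical yields from a single plot, for illustration only
yields = [3.1, 2.4, 3.6, 2.9, 3.3, 2.2, 3.8, 2.7, 3.0]
acf = [autocorrelation(yields, lag) for lag in range(4)]
bound = 1.96 / len(yields) ** 0.5    # approximate 95% band under no serial correlation
print([round(r, 2) for r in acf], round(bound, 2))
```

Repeating the computation for each plot unit and comparing the values with the band gives a numerical counterpart of the autocorrelation plots.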
For the same years meteorological data are available, which, for the sake of simplicity, have been transformed into a crop WSI according to [12].
Since the experiment was conducted for several years, we are interested in combining the results from the single-year experiments. The availability of a covariate (WSI) allows model (4) to be fitted to the data.
The ANOVA table for the whole data set is reported in Table 8, and the results can be verified by looking at the figures in the online supplemental material.
In particular:
(1) the first row (from the top) in Figure 2s shows the effect of Time (left, not significant) and WSI (right, significant), and corresponds to the tests in the Time:Rep stratum in Table 8;
(2) the second row in Figure 2s shows the interaction between Tillage and Time (not significant) and between Tillage and WSI (significant), and corresponds to the tests in the Time:Rep:Tillage stratum in Table 8;
(3) the third row in Figure 2s shows the interaction between Crop-Sequence and Time (not significant) and between Crop-Sequence and WSI (significant), and corresponds to the tests in the Time:Rep:Prec stratum in Table 8;
(4) Figure 3s shows the interaction between Tillage, Prec and Time (significant), while Figure 4s shows the interaction between Tillage, Prec and WSI (not significant); these correspond to the tests in the 'Within' stratum in Table 8.
The effects of Tillage varied greatly by climate (the WSI-Tillage interaction). When the WSI was high, grain yield was greater with conservative tillage techniques (especially NT) than with CT, whereas the opposite was true when the WSI was low; in growing seasons of medium water stress, no differences in grain yield were observed among the three tillage techniques. The effects of Tillage also varied greatly by crop sequence (the Tillage-Prec interaction was significant). On average, grain yield was higher with NT than with CT when wheat was grown after FB or BC, and lower with NT than with CT in continuous wheat (W level). The latter effect grew stronger with time (the Time-Tillage-Prec interaction was significant). The effects of the Tillage technique on grain yield were stable over time when wheat was grown after FB or BC, whereas RT and especially NT had a detrimental effect on continuous wheat (W) that increased with time. The effects of crop sequence were stable over time (the Time-Prec interaction was not significant) but varied greatly by WSI (the WSI-Prec interaction was significant).

Discussion and conclusions
In an LTE the purpose is usually to test and estimate the time × treatment interaction: that is, the experimenter needs to know whether the effect of the treatment varies over time. In this sense time should be considered as a fixed effect. On the other hand, measurements are usually influenced by uncontrollable environmental factors, which can be modeled by random year-to-year fluctuations and year × treatment random effects. Even if these random effects are only a nuisance, they need to be taken into account, since variations in the measurements across years are simultaneously due to both fixed and random effects [7]. As a result, time can have both a fixed and a random effect on the response.
Moreover, the analysis can be made more difficult by the particular design followed in the experiment, which is not necessarily a standard RCB one, and by the presence of covariates that, suitably introduced in the model, can be very useful to model the random effect of time.
For example, a strip plot is a quite common experimental design and, at the same time, long-term experiments are frequently considered to account for the fixed and/or random effect of time. But the full potential of a particular experimental design established for a long time is exploited only if a model is formulated that properly accounts, at the same time, for the particular design and for the presence of replications in time.
Even if LTEs and strip plot designs are both schemes followed in agriculture and other field experiments, to our knowledge a complete analysis of data coming from a long-term strip plot design is absent in the literature. Even when an experiment follows a long-term strip plot design, the data are often not analyzed appropriately.
In this paper a model to analyze data from a long-term strip plot design with covariates is proposed. The model allows the fixed and random effects of time to be separated by introducing a covariate that can account for the random effect. Although only a linear effect of the covariate has been proposed, the model can easily be extended to a nonlinear effect of this covariate.
Both simulated and real data are used to show the adequacy of the proposed model, which is also verified by appropriate graphical representations highlighting the significant and non-significant effects of the main factors, time and covariate, together with their two- and three-way interactions.