Introduction

Grapevine is a crop of great social and economic importance in many countries worldwide. It is thought that the ancient grapevine varieties resulted from the domestication of individual wild plants, which were subsequently multiplied by vegetative propagation over centuries and millennia until the present time. As such, a variety would have been a single homogeneous clone at the beginning, but recurrent somatic mutations and other factors of variation would have transformed it into a vast group of genotypes, with some morphological homogeneity, yet with differences in many quantitative characteristics of agronomical and technological interest (yield, sugar, acidity of the must and many others). Consequently, high genetic variation was generated within varieties along their evolutionary history.

Field experimentation with perennial plants is comparatively difficult and the selection methodology of grapevine varieties used in the vine-growing world in the last 50–100 years emphasizes the sanitary side of the selection process based on virus diagnosis (OIV, 1991) but neglects the potential of genetic variation.

However, understanding the entire variability existing within a variety and its distribution across the different regions and countries where it is grown is a very important matter, because it will allow new views on the history of agriculture and people (based on the relation between variability and evolutionary age of populations), a more efficient recognition and preservation of genetic resources (slowing down genetic erosion) as well as higher genetic gains through selection. As a consequence, our selection strategy in Portugal is mainly focussed on the knowledge about genetic variability (quantification, geographic distribution) through a method composed of three phases: (i) sampling variability in the different regions where the variety is grown (hundreds of mother plants in ancient vineyards); (ii) planting a large field trial with the sampled plants (each one multiplied by vegetative propagation, originating a clone); this phase has the objective of quantifying the genetic variability of the most important traits (typically yield, but also sugar content of the must, acidity and anthocyanins) and of carrying out mass genotypic selection (selecting a group of clones); (iii) establishing multi-environmental trials with the group selected in phase 2, with the aim of clonal selection. This paper refers to the second phase of this methodology.

To quantify the genetic variation within a variety and to perform efficient selection, through quantitative genetics as well as other methods, it is necessary to plant a very large field trial (normally from 200 to 400 clones). Only large trials contain a representative sample of all the variation within the variety across the different regions where it is grown. The fact that grapevine is perennial and the field trials are maintained for many years, allows us to make various genotypic mass selections, responding to future environmental alterations as well as to ever changing consumer demands. As a result, large field trials are in fact at the cutting edge of a new strategy for the evaluation and the usage of grapevine genetic resources.

However, such a field trial covers an unusually large area (from 0.75 to 1.5 ha) which, by itself, causes large environmental variation. Therefore, experimental design is crucial in this type of trial if the objectives referred to above are to be reached.

After Fisher (1935) introduced randomized complete block (RCB) designs, Yates (1936, 1940) described balanced incomplete block designs for the first time, including balanced square lattice designs. Since then, many variants of these designs have appeared. Although this group of designs is very large, we restrict ourselves to those most relevant for working with a high number of treatments, as frequently happens with initial trials of grapevine selection. Of note are α designs (Patterson and Williams, 1976), which constitute a particular class of generalized lattice designs; row-column (RC) designs (Williams and John, 1989), which correspond to groups of more complex Latin square designs; t-latinized designs (John and Williams, 1998); and resolvable spatial RC designs (Williams et al., 2006). In sum, α designs are resolvable designs and are recommended whenever the number of treatments is large. For these block designs there is no limitation on block size. For resolvable RC designs the plots in each replicate are arranged in rows and columns. A spatial resolvable RC design takes into account the separation of different treatments in rows and columns. For a latinized design the replicates are contiguous and form long blocks (or columns) of plots (for more details see the references mentioned above and Whitaker et al., 2007).

Many algorithms have been developed to build these designs, suitable for trials of more than 100 treatments (Patterson and Williams, 1976; Williams, 1985; Nguyen, 1994; Nguyen, 1997; Whitaker et al., 2007), as well as statistical tools to assess their efficiency (John and Whitaker, 2000).

The use of these experimental designs has been intensively recommended and discussed for decades. In the agricultural field, studies have reported greater effectiveness of balanced incomplete block and α designs compared to RCB designs (Patterson et al., 1978; Patterson and Silvey, 1980; Patterson and Hunter, 1983; Kempton et al., 1994; Yau, 1997; Qiao et al., 2000), and in the area of forestry similar results have been obtained, for example by Fu et al. (1998, 1999) and Gezan et al. (2006).

Although all these contributions are important, their use in initial trials of grapevines has not been studied. In this article we compare through simulation several of the aforementioned experimental designs (RCB, α and RC designs), with the aim of identifying those most suitable for quantifying and using the genetic variability under various different conditions: different field layouts, different population sizes, different levels of genetic variability of yield in the population, different levels of environmental variance. Through the simulations we attempted to clarify the effects of different experimental designs on the control of spatial variation, on the accuracy and precision of the estimates for genetic variance and on the prediction of genetic gain.

Materials and methods

Simulation procedure

Yield data were simulated because this trait has a general interest in all selection programmes and it is currently used for quantification of genetic variability under field conditions.

The simulated yield data were generated according to the model

$$y_{ilm} = \mu + u_{g_i} + \varepsilon_{ilm} + \eta_{ilm}$$

where $y_{ilm}$ is the observed yield located in the lth column and mth row, generated as the sum of the overall mean of the population ($\mu$), the genotypic effect of clone $g_i$ ($u_{g_i}$) and the errors associated with the observation $y_{ilm}$, spatially dependent ($\varepsilon_{ilm}$) and independent ($\eta_{ilm}$).

The parametric values were established such that the generated data showed good agreement with actual trials for grapevine selection. Thus, populations with an overall mean ($\mu$) of 3 kg per plant were used. The genotypic effects were assumed to be independent and identically distributed (iid) random variables, normally distributed with mean 0 and variance $\sigma_g^2$. Two values of genotypic variance were considered, $\sigma_g^2=0.2025$ and $\sigma_g^2=0.81$, corresponding, respectively, to populations with a smaller and a larger genetic variation (Martins, 2007). Populations of 100, 200 and 300 genotypes were simulated, with both levels of variability. These populations were generated using SAS code, version 9.1 (SAS Institute, 2003), RANNOR function (100 simulations for each population type).

In this simulation study, a high value for the error variance component ($\sigma_e^2$) is justified because grapevine is a perennial plant that is influenced by many environmental factors, is grown in poor and heterogeneous soils and requires intensive management, all of which cause a high level of error in the observations. Thus, according to what is observed in large grapevine field trials (around 70 in Portugal), two values of error variance were considered, $\sigma_e^2=1$ and $\sigma_e^2=3$, corresponding respectively to level 1 (the most frequent) and level 2 of environmental variation. With respect to spatial variability, in these trials the percentage of spatially independent variation is usually around 60%, but can oscillate between 50 and 70% of the total error variance (values obtained from current yield data analysis of 70 grapevine initial selection trials and also supported by Gonçalves et al., 2007). In this study, we assumed that 60% of the total error variance is attributable to the spatially independent variance and 40% to the spatially dependent variance (that is, $\sigma_\eta^2=0.6$ and $\sigma_\varepsilon^2=0.4$ for level 1, and $\sigma_\eta^2=1.8$ and $\sigma_\varepsilon^2=1.2$ for level 2). As a consequence, two error components were considered: an iid normal component $\eta_{ilm}$, defined as $\eta_{ilm} \sim N(0, \sigma_\eta^2)$, and a spatially dependent component $\varepsilon_{ilm}$, defined as $N(0, \sigma_\varepsilon^2 f)$, with

$$f(h_{ab}^{row}, h_{ab}^{col}) = \exp\left(-\frac{h_{ab}^{row}}{\theta_{row}}\right)\exp\left(-\frac{h_{ab}^{col}}{\theta_{col}}\right)$$

This is an anisotropic exponential model, where $h_{ab}^{row} = |s_a^{row} - s_b^{row}|$ is the Euclidean distance between the centre of the plot located at $s_a$ and the centre of the plot located at $s_b$ in the row direction, $h_{ab}^{col} = |s_a^{col} - s_b^{col}|$ is the corresponding distance in the column direction, and the parameters $\theta_{row}$ and $\theta_{col}$ are related to the ranges of correlation in the row and column directions, respectively. Under the exponential model the correlation reaches 0 only asymptotically. Therefore, the term practical range ($3\theta_{row}$, $3\theta_{col}$) is used for the distance at which the covariance is reduced to 5% of $\sigma_\varepsilon^2$ (since $e^{-3} \approx 0.05$), that is, the distance at which the correlation is regarded as approximately 0 (Littell et al., 2006).
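The practical-range statement can be checked numerically. The sketch below assumes that the anisotropic exponential correlation is the separable product of two one-dimensional exponential terms; the function name and the default values (the ranges of 20 and 15 m and the plot-centre spacings of 2.5 and 4.8 m used in this study) are chosen for illustration only.

```python
import math

def exp_aniso_corr(h_row, h_col, theta_row=20.0, theta_col=15.0):
    """Correlation between two plot centres under an anisotropic exponential
    model, ASSUMED here to be exp(-h_row/theta_row) * exp(-h_col/theta_col)."""
    return math.exp(-h_row / theta_row) * math.exp(-h_col / theta_col)

# At the practical range (3 * theta) in one direction the correlation is exp(-3),
# that is, about 5% of its value at distance 0.
corr_at_practical_range = exp_aniso_corr(3 * 20.0, 0.0)

# Correlation between adjacent plots in each direction, for plot centres
# 2.5 m apart in the row direction and 4.8 m apart in the column direction.
corr_adjacent_row = exp_aniso_corr(2.5, 0.0)
corr_adjacent_col = exp_aniso_corr(0.0, 4.8)

print(corr_at_practical_range, corr_adjacent_row, corr_adjacent_col)
```

Under these assumptions adjacent plots remain strongly correlated in both directions, which is why the choice of blocking structure matters for trials of this size.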

Values of 20 and 15 m were established for the parameters θrow and θcol, respectively (values frequently obtained from current yield data analysis of 70 grapevine initial selection trials and also supported by Gonçalves et al., 2007). These errors were generated in Proc SIM2D of SAS version 9.1 (SAS Institute, 2003), with 100 simulations for each field layout.

Regarding the field layouts, in many of these trials conflicts frequently arise between what is theoretically more correct and what is feasible in practice. Theoretically, better estimates of genetic variability and a more successful selection would be obtained from the greatest number of replicates, that is, from replicates with single-plant plots. However, in practice, the management of a grapevine initial trial with thousands of plots would be very difficult, increasing the experimental errors associated with data collection. Consequently, four replicates were adopted with four plants per plot, to save on the trial area needed, for greater security of plot boundaries (so that they coincide with vine trellis posts) and because the wood of a mother plant is often insufficient to make a clone with more than 16 plants. With a spacing of 1.2 m × 2.5 m, the conditions above produce a distance between the centres of adjacent plots of 2.5 m in the row direction and 4.8 m in the column direction.
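To make the data-generating model concrete, the following minimal sketch simulates one small field. It is not the SAS procedure used in this study: it assumes the separable exponential correlation, draws the spatially dependent errors through a Cholesky factor of their covariance matrix, and uses a hypothetical layout of 15 clones in two complete replicates of 3 rows × 5 columns, with plot centres 2.5 m apart between rows and 4.8 m apart between columns. All function and argument names are illustrative.

```python
import math
import random

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def simulate_field(n_clones=15, n_reps=2, rows_per_rep=3, n_cols=5,
                   mu=3.0, var_g=0.81, var_e=1.0, spatial_share=0.4,
                   theta_row=20.0, theta_col=15.0, d_row=2.5, d_col=4.8, seed=1):
    """Return (layout, genotypic effects, yields) for one simulated trial."""
    if rows_per_rep * n_cols != n_clones:
        raise ValueError("each replicate must hold every clone exactly once")
    rng = random.Random(seed)
    n_rows = n_reps * rows_per_rep
    # Plot-centre coordinates in metres, stored row by row.
    coords = [(m * d_row, l * d_col) for m in range(n_rows) for l in range(n_cols)]
    n = len(coords)
    var_eps = spatial_share * var_e          # spatially dependent part
    var_eta = (1.0 - spatial_share) * var_e  # spatially independent part
    cov = [[var_eps * math.exp(-abs(a[0] - b[0]) / theta_row
                               - abs(a[1] - b[1]) / theta_col)
            for b in coords] for a in coords]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    eps = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    u = [rng.gauss(0.0, math.sqrt(var_g)) for _ in range(n_clones)]
    # Randomized complete blocks: every replicate contains each clone once.
    layout = []
    for _ in range(n_reps):
        order = list(range(n_clones))
        rng.shuffle(order)
        layout.extend(order)
    y = [mu + u[layout[p]] + eps[p] + rng.gauss(0.0, math.sqrt(var_eta))
         for p in range(n)]
    return layout, u, y

layout, effects, yields = simulate_field()
print(len(yields), sum(yields) / len(yields))
```

The dense Cholesky factorization is adequate only for small fields such as this one; a trial with more than a thousand plots would call for a dedicated numerical library.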

In grapevine trials the replicates are usually contiguous, therefore latinized designs were simulated and the respective effect (latinized block or latinized column) will be included in the model for data analysis. However, according to the parsimony principle, this effect can always be discarded when it does not improve the fit of the model to the data.

Various types of experimental designs were applied to each field layout arranged by rows (1,…,r × k) and columns (1,…,s) (Table 1). The RCB design was generated using Proc Plan of SAS version 9.1 (SAS Institute, 2003). The α designs, latinized by block, and the RC designs and spatial RC (RCSpatial) designs, latinized by column, were generated using the package CycDesigN 3.0 (Whitaker et al., 2007). In the RCSpatial design the separation of different genotypes in rows and columns was ensured according to a modified exponential variance weight function, with a value of 0.9 for the decay factor, which is coherent with values from real grapevine selection trials.

Table 1 Field layout, experimental designs for r=4 resolvable replicates and other design parameters: g—no. of genotypes, s—no. of incomplete blocks per replicate, for α design, or number of columns, for row-column designs, k—incomplete block size, for α design, or number of rows, for row-column designs

For each situation described in Table 1, 100 different randomizations were done. Altogether, 11 200 simulations were generated, 100 for each of 112 studied cases: population with 100 clones × 2 genetic variances × 2 error variances × 2 field layouts × 4 experimental designs (32 situations); population with 200 clones × 2 genetic variances × 2 error variances × 2 field layouts × 4 experimental designs (32 situations); population with 300 clones × 2 genetic variances × 2 error variances × 3 field layouts × 4 experimental designs (48 situations).
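The count of studied cases follows directly from the factorial structure just described, as the short enumeration below illustrates. The layout indices are placeholders, because the actual field layouts are those listed in Table 1.

```python
# Number of field layouts studied for each population size (from the text).
layouts_per_size = {100: 2, 200: 2, 300: 3}
genetic_variances = (0.2025, 0.81)
error_variances = (1, 3)
designs = ("RCB", "alpha", "RC", "RCSpatial")

cases = [
    (g, var_g, var_e, layout, design)
    for g, n_layouts in layouts_per_size.items()
    for var_g in genetic_variances
    for var_e in error_variances
    for layout in range(n_layouts)  # placeholder index for a Table 1 layout
    for design in designs
]

n_cases = len(cases)
n_simulations = 100 * n_cases  # 100 randomizations per case
print(n_cases, n_simulations)
```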

Models for data analysis

The linear model used for data analysis of an experiment with an RCB design was

$$y_{ij} = \mu + u_{g_i} + u_{r_j} + e_{ij} \tag{1}$$

for i=1,…,g and j=1,…,r. The $y_{ij}$ represent the observations, $\mu$ the population mean, $u_{g_i}$ the genotypic effects, $u_{r_j}$ the resolvable replicate effects (complete block effects) and $e_{ij}$ the random errors associated with individual plots.

The linear model for an α design, latinized by block, was

$$y_{ijtl} = \mu + u_{g_i} + u_{r_j} + u_{\mathrm{blat}_t} + u_{\mathrm{b(r)}_{jl}} + e_{ijtl} \tag{2}$$

for i=1,…,g, j=1,…,r, t=1,…,s, l=1,…,s. The $y_{ijtl}$ represent the observations, $\mu$ the population mean, $u_{g_i}$ the genotypic effects, $u_{r_j}$ the resolvable replicate effects, $u_{\mathrm{blat}_t}$ the latinized block effects, $u_{\mathrm{b(r)}_{jl}}$ the incomplete block effects within replicates and $e_{ijtl}$ the random errors associated with individual plots.

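The linear model for a resolvable RC design, latinized by column, was

$$y_{ijtlm} = \mu + u_{g_i} + u_{r_j} + u_{\mathrm{lcol}_t} + u_{\mathrm{col(r)}_{jl}} + u_{\mathrm{row(r)}_{jm}} + e_{ijtlm} \tag{3}$$

for i=1,…,g, j=1,…,r, t=1,…,s, l=1,…,s, m=1,…,k. The $y_{ijtlm}$ represent the observations, $\mu$ the population mean, $u_{g_i}$ the genotypic effects, $u_{r_j}$ the resolvable replicate effects, $u_{\mathrm{lcol}_t}$ the latinized column effects, $u_{\mathrm{col(r)}_{jl}}$ the column effects within replicates, $u_{\mathrm{row(r)}_{jm}}$ the row effects within replicates and $e_{ijtlm}$ the random errors associated with individual plots.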

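In all cases, model effects (with the exception of $\mu$) were assumed to be iid normal variables with mean 0 and respective variances $\sigma_g^2$, $\sigma_r^2$, $\sigma_{\mathrm{blat}}^2$, $\sigma_{\mathrm{b(r)}}^2$, $\sigma_{\mathrm{lcol}}^2$, $\sigma_{\mathrm{col(r)}}^2$, $\sigma_{\mathrm{row(r)}}^2$ and $\sigma_e^2$. All random effects were assumed to be mutually independent.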

It should be noted that, although we know the real error variance–covariance structure, and it is possible to incorporate it into models 1, 2 and 3, this was not the objective and so we considered iid errors. What one would expect is that the spatially dependent error component, which was initially simulated and incorporated into the data, would now be captured, to some extent, by the effects of the design factors. A second justification for not fitting spatial models was to avoid comparing models that are influenced by the way the data were generated. We always followed this strategy, even for the data analysis of spatial RC designs, which were analysed as classical RC designs.

All models were fitted in Proc Mixed (Littell et al., 2006) of SAS version 9.1 (SAS Institute, 2003).

Model parameters evaluation and effects on genetic selection

Model parameters were estimated by the residual or restricted maximum likelihood method (REML, Patterson and Thompson, 1971), using the Fisher-scoring algorithm (Jennrich and Sampson, 1976).
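For intuition about what is being estimated in the simplest case (model 1), the sketch below computes the classical ANOVA (method-of-moments) estimators of the genotypic and error variances for a balanced RCB trial. This is a swapped-in technique, not the REML fit by Fisher scoring used in this study; for balanced data the two approaches give the same estimates whenever the ANOVA solutions are non-negative. The function name is illustrative.

```python
from statistics import mean

def anova_variance_components(y):
    """ANOVA estimators for a balanced RCB trial.

    y[i][j] is the yield of clone i in replicate j (no missing plots).
    Returns (genotypic variance, error variance); a negative genotypic
    solution is truncated at zero, which corresponds to a null estimate.
    """
    g, r = len(y), len(y[0])
    grand = mean(v for row in y for v in row)
    clone_means = [mean(row) for row in y]
    rep_means = [mean(y[i][j] for i in range(g)) for j in range(r)]
    # Mean square between clones: expectation sigma_e^2 + r * sigma_g^2.
    ms_clone = r * sum((m - grand) ** 2 for m in clone_means) / (g - 1)
    # Residual mean square after removing clone and replicate means.
    ms_error = sum(
        (y[i][j] - clone_means[i] - rep_means[j] + grand) ** 2
        for i in range(g) for j in range(r)
    ) / ((g - 1) * (r - 1))
    var_g = max(0.0, (ms_clone - ms_error) / r)
    return var_g, ms_error

# Three clones in two replicates, purely additive data (no residual error).
print(anova_variance_components([[1, 2], [2, 3], [3, 4]]))
```

When the truncation at zero is active, the remaining components of a REML fit would be re-estimated, so the equivalence no longer holds exactly in that case.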

To understand the fraction of the total variance accounted for by each of the design effects, the results were expressed in terms of the percentage of each component of variance resulting from each effect of the model. In addition, the relative bias (RB) and the mean squared error (MSE) of the estimates of the genotypic variance component were calculated to assess its accuracy and its precision.
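The two criteria can be written compactly as follows. This is a sketch assuming the usual definitions (relative bias as the mean deviation from the parametric value, expressed as a percentage of that value, and MSE as the mean squared deviation); the estimates in the example are invented numbers, not results of this study.

```python
from statistics import mean, pvariance

def relative_bias(estimates, true_value):
    """Relative bias (%) of a set of estimates of one parameter."""
    return 100.0 * (mean(estimates) - true_value) / true_value

def mean_squared_error(estimates, true_value):
    """Mean squared deviation of the estimates from the parametric value."""
    return mean((e - true_value) ** 2 for e in estimates)

# Hypothetical genotypic variance estimates from four simulations,
# compared with the parametric value 0.81.
estimates = [0.7, 0.9, 0.8, 1.0]
rb = relative_bias(estimates, 0.81)
mse = mean_squared_error(estimates, 0.81)

# The MSE splits into the variance of the estimates plus the squared bias,
# so a design can lower the MSE by reducing variance at unchanged bias.
decomposed = pvariance(estimates) + (mean(estimates) - 0.81) ** 2
print(rb, mse, decomposed)
```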

To compare the effects of different experimental designs and, therefore, of the respective models for data analysis (models 1, 2 and 3) on efficiency of genetic selection, the following indicators were calculated (expressed as the average of the 100 simulations, for each case studied).

(i) For the prediction of the genotypic effects of the clones, empirical best linear unbiased predictors—EBLUPs—of genotypic effects of the clones (ũg) were obtained through mixed model equations (Henderson 1975; Searle et al., 1992).

(ii) For evaluating the precision in the prediction of genotypic effects, the prediction standard errors (PSEs) based on the comparison of ũg with the simulated genotypic random effects were computed.

(iii) Relative efficiency (RE) of experimental design D1 compared with experimental design D2, defined as

$$\mathrm{RE}(D_1, D_2) = \frac{\mathrm{APSE}_{D_2}}{\mathrm{APSE}_{D_1}} \times 100$$

where APSE is the average prediction standard error of ũg.

(iv) Spearman's rank correlation coefficient (rs) and the associated standard error, for the comparison between the rankings of EBLUPs of the genotypic effects and the true genotypic effects (the simulated effects).

(v) RB and the MSE of the predicted genetic gain (PGG) (RBPGG and MSEPGG, respectively) for the selection of the group of 30 top-ranking clones for the populations of 100 and 200 clones, and 45 top-ranking clones for the population of 300 clones. The PGG was calculated for each simulation as the average of the EBLUPs of the top selected clones. The true genetic gain was calculated in the same way, but using genetic effects generated for each situation.
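The indicators (i) to (v) can be sketched in a few lines. Several points below are assumptions rather than statements of the method used: the predictor is the shrinkage form that holds for a balanced RCB trial with given variance components, the PSE of one simulation is taken as the root mean squared difference between predicted and simulated effects, RE is the ratio of the APSE of D2 to that of D1, and the true genetic gain is taken as the mean of the top-ranking simulated effects. All function names are illustrative.

```python
import math
from statistics import mean

def eblup_balanced(clone_means, var_g, var_e, n_reps):
    """Shrink clone-mean deviations by var_g / (var_g + var_e / n_reps)."""
    grand_mean = mean(clone_means)
    shrink = var_g / (var_g + var_e / n_reps)
    return [shrink * (m - grand_mean) for m in clone_means]

def pse(predicted, true):
    """Root mean squared difference between predicted and simulated effects."""
    return math.sqrt(mean((p - t) ** 2 for p, t in zip(predicted, true)))

def relative_efficiency(apse_d1, apse_d2):
    """RE (%) of design D1 relative to D2; above 100 favours D1."""
    return 100.0 * apse_d2 / apse_d1

def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = math.sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))
    return num / den

def genetic_gain(effects, n_selected):
    """Average effect of the n_selected top-ranking clones."""
    return mean(sorted(effects, reverse=True)[:n_selected])

# Invented example: four clones, four replicates, selection of the top two.
true_effects = [0.9, -0.3, 0.1, -0.7]
predicted = eblup_balanced([4.0, 2.6, 3.4, 2.0], var_g=0.81, var_e=1.0, n_reps=4)
pgg = genetic_gain(predicted, 2)
true_gain = genetic_gain(true_effects, 2)
print(pse(predicted, true_effects), spearman(predicted, true_effects),
      100.0 * (pgg - true_gain) / true_gain)
```

In a full simulation study these quantities would be computed for every simulated data set and then averaged over the 100 simulations of each case.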

To clarify if it is possible to control the effect of the spatial autocorrelation through the experimental design, semivariograms of the errors in simulated populations were compared with those of the residuals from various fitted models (which describe the spatial correlation not accounted for by the design effects). More precisely, a simulation of two field layouts for populations with 300 genotypes was studied, and plots of the sample semivariogram (Matheron, 1963) were computed through Proc Variogram and Proc Gplot of SAS.
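A directional sample semivariogram of the kind compared here can be computed as in the sketch below, which uses the classical estimator (half the average squared difference between values separated by a given lag). It assumes a complete rectangular grid of values and measures lags in plots rather than metres; the function name and arguments are illustrative and do not reproduce the options of Proc Variogram.

```python
def directional_semivariogram(grid, max_lag, direction="row"):
    """Classical sample semivariogram on a complete rectangular grid.

    grid[m][l] is the value (simulated error or model residual) of the plot
    in row m and column l. With direction="row" the pairs differ in their
    row index; with direction="col" they differ in their column index.
    Returns a dict mapping each lag (in plots) to the semivariance.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    result = {}
    for lag in range(1, max_lag + 1):
        squared = []
        for m in range(n_rows):
            for l in range(n_cols):
                m2, l2 = (m + lag, l) if direction == "row" else (m, l + lag)
                if m2 < n_rows and l2 < n_cols:
                    squared.append((grid[m][l] - grid[m2][l2]) ** 2)
        if squared:
            result[lag] = sum(squared) / (2 * len(squared))
    return result

# A pure trend across columns: semivariance grows with the column lag
# and is zero between rows.
grid = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
print(directional_semivariogram(grid, 2, "col"))
print(directional_semivariogram(grid, 1, "row"))
```

A semivariogram of residuals that stays flat across lags indicates that little spatial correlation is left after fitting the design effects, whereas a rising curve indicates remaining spatial dependence.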

Results

For level 1 of error variance, the REML estimates of genotypic variance were close to the parametric values imposed during the simulation (that is, 0.2025 and 0.81), revealing a very low RB (varying between −4 and 6%). This occurred for all the experimental designs (including the RCB design), for any population size (that is, for 100, 200 and 300 genotypes) and for the two levels of genetic variation (Table 2). However, the MSE of the genetic variance estimates was lower for α and RC designs (chiefly through a reduction in the variance of the estimates), and decreased as the number of clones in the trial increased. For level 2 of error variance similar results were obtained, but a higher RB and MSE associated with these estimates were observed, especially for trials with 300 clones under low genetic variability and RCB designs (Table 3).

Table 2 Relative bias (RB) and mean squared error (MSE) of the genotypic variance estimate and fraction of the total variance estimate (FTV) attributable to each variance component, for two levels of genotypic variance and level 1 of error variance and for 100, 200 and 300 genotypes (g) (results expressed as the average of 100 simulations)
Table 3 Relative bias (RB) and mean squared error (MSE) of the genotypic variance estimate and fraction of the total variance estimate (FTV) attributable to each variance component, for two levels of genotypic variance and level 2 of error variance and for 100, 200 and 300 genotypes (g) (results expressed as the average of 100 simulations)

The RCB design showed the greatest percentage of total variance attributable to error; in the α and RC designs this component was about one percentage point lower. Furthermore, among the RCB designs, the closer the shape of the resolvable complete block was to a square, the greater the percentage of total variance attributable to the component $\sigma_r^2$ and the smaller the percentage attributable to $\sigma_e^2$.

In Tables 2 and 3, it is clear that among α designs, the greater the number of incomplete blocks and the smaller their size, the greater is the percentage of total variance attributable to incomplete blocks ($\sigma_{\mathrm{b(r)}}^2$) and to latinized blocks ($\sigma_{\mathrm{blat}}^2$) and the less is attributable to the components $\sigma_r^2$ and $\sigma_e^2$. Among RC designs, it was also observed that the greater the number of rows per replicate, the bigger is $\sigma_{\mathrm{row(r)}}^2$ and the smaller is $\sigma_{\mathrm{col(r)}}^2$. On the other hand, the greater the number of columns, the greater are the estimates of the variance components $\sigma_{\mathrm{col(r)}}^2$ and $\sigma_{\mathrm{lcol}}^2$. The greatest proximity between the values of $\sigma_{\mathrm{row(r)}}^2$ and $\sigma_{\mathrm{col(r)}}^2$ was observed for RC10 × 20 (trial with 200 clones) and for RC10 × 30 (300 clones). In these situations, a reduction in the error variance was also observed. In addition, the results obtained with spatial and non-spatial RC designs were similar (which is to be expected, because the model for data analysis was the same).

From the aforementioned comments it follows that the higher the number of levels of design effects, the greater is the percentage of the total variance attributable to design effects (incomplete blocks, latinized blocks, row, column, latinized columns effects). This occurs because when random effects are assumed, reduced numbers of levels give rise to more frequent null REML estimates for the respective variance components.

For level 1 of error variance the frequency of null estimates of genotypic variance was 0. In situations of higher error variance (level 2) and low genetic variability, the frequency of simulations that produced null estimates of the genotypic variance component $\sigma_g^2$ was greatest in trials with 100 genotypes (6–11%) and decreased as the number of genotypes rose. The choice of an α design or an RC design led to a reduction in null estimates of genotypic variance (Table 3).

As the level 1 of error variance is the more frequent situation, we just illustrate the results related to EBLUPs of genetic effects for this case. For the three population sizes and for the two genetic variation levels, the PSEs of EBLUPs of genotypic effects were smaller for α and RC designs than for RCB design (Table 4). The greatest efficiency of those designs relative to RCB design can be seen when working with populations with 300 clones and with higher genetic variation, thereby obtaining RE values of 108.4%.

Table 4 Average prediction standard error (APSE) of EBLUPs of genotypic effects and relative efficiency (RE) for populations with 100, 200 and 300 genotypes (g) and level 1 of error variance

The greatest efficiency of the RC designs relative to the α designs occurs when the number of plots per incomplete block is greater than or equal to 10. In other words, for populations of 100 genotypes, the efficiency of the RC10 × 10 and of the RCSpatial10 × 10 is greater relative to the α-10; for populations of 200 genotypes the efficiency of the RC10 × 20 and RCSpatial10 × 20 is greater relative to the α-20; and for populations of 300 genotypes the efficiency of the RC20 × 15 and the RCSpatial20 × 15 is greater relative to the α-15. In the latter case, the RE of RC relative to the α designs is in the region of 107%, for populations with higher genetic variation (Table 4).

Comparing the rankings of EBLUPs of the genotypic effects and the true genotypic effects, Spearman's rank correlation coefficient values with lower standard error were obtained for the α and RC designs (Table 5).

Table 5 Spearman's rank correlation coefficient (rs) and associated standard error (s.e.), predicted genetic gain (kg per plant) (PGG) and the corresponding relative bias (RBPGG) and mean squared error (MSEPGG) for populations with 100, 200 and 300 genotypes (g) and level 1 of error variance

As expected, all these results are reflected in the RB and in the MSE of genetic gains obtained with the selection of a group of clones (genotypic mass selection). With the α designs, smaller RBPGG and MSEPGG were obtained as the number of plots per incomplete block decreased, confirming once more that smaller incomplete blocks are more efficient for controlling environmental variation. In contrast, when the number of plots per column increased, smaller values for the RBPGG and MSEPGG were observed with the RC designs, mainly for trials with 200 and 300 clones (Table 5). On the basis of these results it is easy to see how much can be gained in precision when working with populations with high genetic variation. An RB of around −13% can be observed with the α and RC designs (that is much closer to the true genetic gain), whereas with low variation, this RB is in the region of −33%.

Discussion

As for the estimates of the component of genotypic variance (quantification of the yield genetic variability in a population), α and RC designs were observed to be more beneficial than RCB design (estimates with less variance), especially when the number of clones increased. However, no single design was entirely inefficient for quantification of genetic variability. These results were predictable as we are working with large samples of genotypes (100, 200 and 300). In fact, when trying to obtain an estimate of the genetic parameter relating to a group of genotypes (as in this case of genotypic variance), individual deviations are minimized by increasing the number of genotypes and the final estimate is reliable, even with a certain amount of environmental variation. However, the same conclusion cannot be reached with regard to the accuracy of the mass genotypic selection. This is explicable if we consider that making selection involves taking decisions (acceptance or rejection) regarding each individual genotype, so that the presence of a large experimental error may lead to wrong decisions. With the RC designs (spatial and non-spatial) lower PSEs of EBLUPs of the genotypic effects and higher values for rs with lower standard error were obtained, thus these results indicate that with these experimental designs more accurate selections can be made.

Spatial autocorrelation was simulated, as the problem of spatial variability has had a major effect on our grapevine selection trials. These effects are unpredictable over the course of a trial: various natural phenomena cause this spatial correlation (fertility, water and other environmental factors), and so they are more difficult to control for at the experimental design stage. Thus, the challenge was to identify which type of experimental design best controls the type of spatial autocorrelation usually observed in large experimental populations of grapevine clones.

From the results obtained, it is clear that the component of spatially dependent error (present in the data but not modelled through the error structure) was partially absorbed by variance components associated with design factors: $\sigma_{\mathrm{b(r)}}^2$ and $\sigma_{\mathrm{blat}}^2$, in the case of the α designs, and $\sigma_{\mathrm{row(r)}}^2$, $\sigma_{\mathrm{col(r)}}^2$ and $\sigma_{\mathrm{lcol}}^2$, in the case of the RC designs. However, this result raises the question: will we be able to control the effect of the spatial autocorrelation through the experimental design?

From the observation of Figures 1 and 2, it was clear that residuals resulting from the RCB model fit showed the same pattern of spatial variation as the simulated errors. However, this correlation starts to weaken with the adjustment of the models corresponding to the α and RC designs, in the case of α-60, RC5 × 60 and RCSpatial 5 × 60 (Figures 1c–e, respectively), and to the RC designs, in cases RC20 × 15 and RCSpatial20 × 15 (Figures 2d and e, respectively). Hence, these results support the ability of the latter designs to control the spatial correlation.

Figure 1 Empirical semivariograms for row and column directions from one simulation of a population with high genetic variability containing 300 genotypes and a field layout 20 plots (rows) × 60 plots (columns): (a) simulated errors, (b) model 1 fit residuals (RCB5 × 60 design), (c) model 2 fit residuals (α-60 design), (d) model 3 fit residuals (RC5 × 60 design), (e) model 3 fit residuals (RCSpatial5 × 60 design).

Figure 2 Empirical semivariograms for row and column directions from one simulation of a population with high genetic variability containing 300 genotypes and a field layout 80 plots (rows) × 15 plots (columns): (a) simulated errors, (b) model 1 fit residuals (RCB20 × 15 design), (c) model 2 fit residuals (α-15 design), (d) model 3 fit residuals (RC20 × 15 design), (e) model 3 fit residuals (RCSpatial20 × 15 design).

From this simulation study, it is also possible to state that if we have field layouts allowing square replicates with 10 or more plots per row and column, we should opt for an RC design. If the field layout allows only small incomplete blocks (around five plots), we could opt for an α design. As for the best size of incomplete block, similar conclusions have been obtained in trials on other crops (Patterson and Hunter, 1983; Fu et al., 1998; Fu et al., 1999; Gezan et al., 2006), suggesting incomplete blocks of around five plots for α designs.

The comparative efficiency of α and RC designs relative to RCB design varied between 101 and 108% (Table 4). No direct comparison can be made of these values with others obtained in field experiments with other crops, as the PSE of the genotypic effects is not always used as a criterion of comparison between experimental designs. For example, if the RE was calculated, not based on PSE, but on prediction error variance, those values would be higher. However, it is worth pointing out that the biggest differences in efficiency among experimental designs will be found whenever the size of the trial is large (higher number of genotypes) and when there is more environmental heterogeneity. Hence, as environmental variation is never entirely predictable at the beginning of a field trial, we should be ambitious in our choice of experimental design for large trials of grapevine clones, always opting for α or resolvable RC designs, in preference to an RCB design. Besides, a point in favour of those more complex resolvable designs is that there is always the option of analysing them as an RCB design.

We should also remember that an experimental design assumed to be entirely suitable might later prove to be inefficient in controlling for spatial variation. That is why recourse to spatial models for data analysis is another important option for the success of selection (Cullis and Gleeson, 1991; Federer, 1998; Qiao et al., 2000; Dutkowski et al., 2006; Gonçalves et al., 2007). It is important to emphasize, however, that even when spatial analysis is used, a good experimental design is always essential. In sum, the correct strategy is to start with a classical incomplete block or RC model and then check whether the addition of a spatial error model improves the fit (note that the option for a spatial design clearly supports this sequential fitting process). This approach, also defended by other authors (Williams et al., 2006; Piepho et al., 2008), is the most adequate for data analysis of large grapevine field trials. In fact, these are perennial plants, experimenting on them is costly and prolonged, and experimental failure is very hard to bear. Therefore, demanding statistical methods are entirely justified.

Finally, we should note that the conclusions of this work represent foundations for a new knowledge on genetic variability within grapevine varieties, allowing novel strategies for genetic resources conservation as well as for selection with high genetic gains.