Chasing genetic correlation breakers to stimulate population resilience to climate change

Klápště, Jaroslav; Telfer, Emily J; Dungey, Heidi S; Graham, Natalie J

doi:10.1038/s41598-022-12320-3

Download PDF

Article
Open access
Published: 17 May 2022

Chasing genetic correlation breakers to stimulate population resilience to climate change

Jaroslav Klápště¹,
Emily J Telfer¹,
Heidi S Dungey¹ &
…
Natalie J Graham¹

Scientific Reports volume 12, Article number: 8238 (2022) Cite this article

874 Accesses
1 Citations
2 Altmetric
Metrics details

Subjects

Abstract

Global climate change introduces new combinations of environmental conditions, which is expected to increase stress on plants. This could affect many traits in multiple ways that are as yet unknown but will likely require the modification of existing genetic relationships among functional traits potentially involved in local adaptation. Theoretical evolutionary studies have determined that it is an advantage to have an excess of recombination events under heterogeneous environmental conditions. Our study, conducted on a population of radiata pine (Pinus radiata D. Don), was able to identify individuals that show high genetic recombination at genomic regions, which potentially include pleiotropic or collocating QTLs responsible for the studied traits, reaching a prediction accuracy of 0.80 in random cross-validation and 0.72 when whole family was removed from the training population and predicted. To identify these highly recombined individuals, a training population was constructed from correlation breakers, created through tandem selection of parents in the previous generation and their consequent mating. Although the correlation breakers showed lower observed heterogeneity possibly due to direct selection in both studied traits, the genomic regions with statistically significant differences in the linkage disequilibrium pattern showed higher level of heretozygosity, which has the effect of decomposing unfavourable genetic correlation. We propose undertaking selection of correlation breakers under current environmental conditions and using genomic predictions to increase the frequency of these ’recombined’ individuals in future plantations, ensuring the resilience of planted forests to changing climates. The increased frequency of such individuals will decrease the strength of the population-level genetic correlations among traits, increasing the opportunity for new trait combinations to be developed in the future.

Efficient genomics-based ‘end-to-end’ selective tree breeding framework

Article Open access 03 January 2024

Adaptive evolution in a conifer hybrid zone is driven by a mosaic of recently introgressed and background genetic variants

Article Open access 05 February 2021

Massive haplotypes underlie ecotypic differentiation in sunflowers

Article 08 July 2020

Introduction

Global climate change will likely introduce new combinations of environmental conditions into our forest systems, increasing the physiological stress affecting plants and affecting multiple traits in multiple ways. Coping with these ecological changes in the long-term may require the modification of the underlying genetic correlation matrix among functional traits¹. Genetic correlations are pairwise trait associations based on pleiotropic mutations², and linkage disequilibrium (LD) (co-segregation of quantitative trait loci (QTLs))³. Pleiotropic mutations can be present at the allelic level (single causal variant contributes to multiple phenotypes) or at the gene level (multiple causal variants within a gene contributing to multiple phenotypes)² and appear to be the primary cause of the genetic correlations observed among traits in populations under random mating^4,5. Linkage disequilibrium or a shared environment can also contribute to genetic correlations in a population under non-random mating with the presence of inbreeding⁶.

Genetic correlations represent evolutionary constraints that have developed over time, and their values and directions can vary according to differences in micro-evolutionary processes along environmental gradients^1,7. Alternatively, high positive genetic correlations between functional traits provide a mean for their co-evolution⁸. Sgró and Hoffmann⁹ found changes in environmental conditions to be the cause of changes in the value and direction of genetic correlations in life history traits that were observed in samples from the same population, stressing the need to investigate genetic correlations among different sets of environmental conditions. Previous studies found substantial changes in genetic parameters when estimated in favourable versus stressful environmental conditions^10,11. Although the presence of high recombination through migration might be counteracting local adaptation to the current environmental conditions¹², theoretical evolutionary genetics studies have found the evolution of high recombination rates when selection targeted different loci under heterogeneous environmental conditions^13,14. Thus, high recombination rates are crucial in adaptation under rapidly changing environments.

Adverse genetic correlations constraining genetic improvement progress can similarly be found in breeding populations¹⁵. In these circumstances, the undesirable genetic patterns between traits can be challenged by specific breeding (complementary mating) and/or selection procedures in order to select ’correlation breakers’ (individuals from the studied population that strongly deviate from the general trend between investigated traits) and eliminate the unfavourable trends¹⁵. One of the most critical adverse genetic correlation in forest trees is the relationship between tree productivity and wood density. This correlation is often strongly negative e.g. $\sim$ − 0.51¹⁶. Additionally, both traits seem to be involved in adaptive processes. While productivity in terms of growth speed is considered a means to deal with competition for the sun light from surrounding plants¹⁷, wood density is related to resistance to drought stress, through thicker cell walls which prevent cell implosion due to higher negative pressure in the xylem pipelines or embolism^18,19, and to wood stiffness²⁰. Breaking this strongly negative genetic correlation could unlock the genetic potential of trees. Therefore, finding individuals with a much greater recombination rate than normal is strategy to explore the potential of this approach.

Unravelling the mechanisms underlying the observed phenotypic correlations among traits is one of the greatest challenges in quantitative genetics¹⁵. A phenotypic correlation is the product of genetic and environmental correlations and their interplay²¹. Genetic correlations represent associations among the genetic causes of the traits investigated, which can be affected by many factors such as sampling error, environmental heterogeneity and developmental stage. Genetic correlations can also vary with the expression of different genes in different environments or developmental phases²². While genetic correlations reflect general trends in populations, they can also be modified by the selection of correlation breakers to shift the association between traits in a favourable direction¹⁵.

Current progress in next-generation sequencing has enabled the development of genomic resources for non-model organisms and allowed the closer investigation of correlations among genotypes and phenotypes through association mapping^23,24,25. Such analyses can be used effectively to select recombinants of known quantitative trait nucleotides (QTNs) to break adverse genetic correlations, especially in cases where genetic correlations are built-up through chromosomal linkages rather than pleiotropic effects²⁶.

Genetic analyses investigating single-trait versus multi-trait models have found that multi-trait models were superior in the accuracy of estimated genetic parameters such as additive genetic variance, heritability and breeding values, particularly where there are significant genetic correlations between traits^27,28. However, multi-trait prediction models appear to be less advantageous when traits are uncorrelated, even proving inferior to single-trait models. Additionally, the multi-trait models are not useful for identifying individuals that break unfavourable genetic correlations²⁷.

Our analysis investigates the genomic features of genetic correlation breakers and the ability to predict these types of individuals using genomics. We propose that genomics will allow us to select individuals with high level of recombination and that these individuals will have favourable combinations of traits that are otherwise negatively correlated at the population level. We used a radiata pine (Pinus radiata D. Don) population established using tandem selection based on New Zealand’s nation-wide breeding values as a test case¹⁵. Using genomic data, we identified individuals that were correlation breakers for growth and wood density traits, and used genomic data to test for high levels of recombination in these individuals. We also investigated how recombination within the population described differs with a another reference population of the same species. With the knowledge that recombination and heterogeneity has been associated with genetic robustness¹⁴, we discuss the implications of our results on the potential to select for genotypes that will remain robust under future climate change.

Materials and methods

Plant material

The New Zealand (NZ) Radiata Pine Breeding Company’s (RPBC) breeding population includes populations selected for either high wood density or growth and form attributes, using NZ nation-wide breeding values that reflect a common genetic effect across tested environments. The breeding program strategy proposed further developments are outlined in detail in previous studies^29,30. Briefly, the program is based on an open nucleus breeding strategy with two independent sublines. The main population within each sublines is structured into different breeding goals such as growth and form, long internode, high wood density, structural timber, and tested as an open pollinated population. The elite population comprises genetically narrow material that is tested through control pollinated progenies, with or without vegetative propagation (clonal trials). Selection is usually performed on the basis of a ’growth and form’ (GF) score, which is an artificially created scale from 7 (unimproved) to 30 (highly improved) representing weighted breeding values for growth and form attributes³¹.

The sample under study was composed of two populations: POP1 was used as the training population, and POP2GF was the population in which to predict individuals with excess recombination using a genomic prediction model. All trials were clonal full-sib families and replicated among sites at Tarawera, Woodhill and Kinleith in the North Island of New Zealand. The POP1 population was established from parents chosen through two selection strategies. One was based on a tandem selection, where 19 parents were first selected for growth and form using GF scores followed by a second round of selection focused on high wood density (HD). This group (POP1HD) formed a population of 160 individuals termed ’correlation breakers’. The second selection strategy in POP1 was based purely on GF scores where 33 parents were selected to create POP1GF, consisting of 304 individuals. POP1GF and POP1HD were established using a single-pair mating design which produced 33 (POP1GF) and 19 (POP1HD) full-sib families. Both populations were planted at Tarawera using a single tree plot, set within replications design with six replications. Each family was represented by 10 genotypes and each genotype was tested in six copies.

The POP2GF population, consisting of 523 individuals, was also selected based only on GF scores. POP2GF was established using a factorial mating design which produced 42 full-sib families from 24 parents. The POP2GF population was planted across two sites, Woodhill and Kinleith, using an incomplete block design with five replications and nine incomplete blocks (each representing six families) within each replication. Each family included 10 genotypes which were tested in five copies. There were two common parents between POP1GF and POP1HD, 12 common parents between POP1GF and POP2GF and one parent between POP1HD and POP2GF. Full trial details are reported in Li et al.³².

All 987 clonally replicated genotypes were measured for the traits branch cluster (BR9), using a 9 points scale from 1 (uninodal) to 9 (extreme multinodal)³³, straightness (ST9), using a 9 points scale from 1 (crooked) to 9 (very straight)³⁴, diameter at breast height (DBH [cm]), and wood density (WD [${\text{kg/m}}^{3}$]) measured as basic wood density through the maximum moisture content method³⁵.

Genomic resources

Genomic data were generated using a previously developed exome capture—genotyping by sequencing platform³⁶ as described in Telfer et al.³⁷. In brief, transcriptomic resources, that represented gene expression across broad range of tissues, including compression wood xylem, spring xylem, summer xylem, summer phloem, spring buds, autumn buds, healthy needles, needles infected by Phytophtora pluvialis, seedling phloem and seedling xylem³⁷, were aligned to Pinus taeda reference genome v. 1.01e and used to develop 120 base capture probes. Captured markers were removed if heterozygosity shown in megagametophyte tissues was higher than 5%, average read depth was less than 10, multiple alleles were detected, or only singletons were observed. Additionally, individual datapoints were classified as missing if the ratio between reference and alternative allele was lower than 0.1 and the number of read was less than 10³⁸. The average read depth was 59.2 per marker and 59.04 per individual. Data were further filtered for minor allele frequency (MAF) > 0.01 and missing data were imputed with mean genotype. Total number of markers was 80,159 SNPs after filtering.

Genomic data analysis

The analysis of linkage disequilibrium (LD) was based on composite LD correlation ’r’ equivalent to the Pearson’s product-moment correlation coefficient between genotypes of investigated loci³⁹. To reduce bias in estimation of LD due to familial structure, we implemented LD corrected for relatedness as proposed by Mangin et al.⁴⁰ as follows:

$$\begin{aligned} {\hat{r}}_v(i,j)=\frac{\left( \sum ^G_{X^i,X^j}\right) ^2}{\sum ^G_{X^i,X^i}\sum ^G_{X^j,X^j}} \end{aligned}$$

where $\sum _{X^i,X^j}^G$ is the sample variance-covariance matrix defined as $([{\varvec{X}}^i,{\varvec{X}}^j ]-\frac{(1_N 1_N^T {\varvec{G}}^{-1})}{(1_N^T {\varvec{G}}^{-1} 1_N )} [{\varvec{X}}^i,{\varvec{X}}^j ]) {\varvec{G}}^{-1} ([{\varvec{X}}^i,{\varvec{X}}^j ]-\frac{(1_N 1_N^T {\varvec{G}}^{-1})}{(1_N^T {\varvec{G}}^{-1} 1_N )} [{\varvec{X}}^i,{\varvec{X}}^j ])$, where ${\varvec{X}}^i$ and ${\varvec{X}}^j$ are the vectors of genotypes for $i^{th}$ and $j^{th}$ marker, N is the sample size and ${\varvec{G}}$ is the marker-based relationship matrix⁴¹. The analysis was performed using ’LDcorSV’ R package⁴². Additionally, the genomic differences between POP1HD (correlation breakers) and POP1GF populations were investigated through comparison of the mean observed and expected heterozygosity (approximating effective population size) using t test⁴³. The 100 individuals randomly selected from each population were used to estimate the sample mean observed heterozygosity, which was performed 100 times. The trend line of decay in LD was estimated using the Hill and Weir expectation⁴⁴ as follows: $E(r^2 )=[\frac{10+C}{(2+C)(11+C)]}[1+\frac{(3+C)(12+12C+C^2)}{n(2+C)(11+C)}]$ where n is sample size and C is the parameter to be estimated and represents the product of the population recombination parameter (C $=$ 4Nc), where N is the effective population size and c the recombination rate. The nonlinear least squares were used to fit the data using ’nls’ R package⁴⁵. The statistical significance in LD pattern difference between POP1GF and POP1HD in the training population was investigated through a Jennrich test⁴⁶, testing the null hypothesis that the two correlation matrices are not different from each other. Only scaffolds with at least three overlapping markers between samples were included in this analysis.

Statistical analysis

Phenotypic data were standardised for each trait at each site to avoid the problems associated with combining breeding values with different scales into a multi-trait selection index. The ’ASReml-R’ package⁴⁷ was used to estimate genetic parameters such as variance components, heritability, and genetic correlations. The multivariate linear mixed model was implemented in the POP1 populations (POP1HD and POP1GF) as follows:

$$\begin{aligned} {\varvec{Y}}={\varvec{X\beta }}+{\varvec{Za}}+{\varvec{Zg}}+{\varvec{Zr}}+{\varvec{Zs}}+{\varvec{e}} \end{aligned}$$

where ${\varvec{Y}}$ is the matrix of measurements, ${\varvec{\beta }}$ is the vector of fixed effects (intercept), ${\varvec{a}}$ is the vector of genomic estimated breeding values following var(${\varvec{a}}$)$\sim$N(0,G1) where G1 is the variance-covariance structure for genomic estimated breeding values following

$$\begin{aligned} G1=\begin{bmatrix} \sigma _{a_1}^2 &{}\ldots &{}\sigma _{a_1 a_n} \\ \vdots &{}\ddots &{}\vdots \\ \sigma _{a_n a_1} &{}\ldots &{}\sigma _{a_n}^2\\ \end{bmatrix}\otimes {\varvec{A}} \end{aligned}$$

where ${\varvec{A}}$ is the average numerator relationship matrix⁴⁸, $\sigma _{a_1}^2$ and $\sigma _{a_n}^2$ are additive genetic variances for the 1st and nth trait, $\sigma _{a_1 a_n}$ and $\sigma _{a_n a_1}$ are additive genetic covariances between the 1st and nth trait, and $\otimes$ is the Kronecker product. The marker-based analysis was performed by substituting the pedigree-based relationship matrix ${\varvec{A}}$ with the marker-based relationship matrix ${\varvec{G}}$, estimated following⁴¹:

$$\begin{aligned} {\varvec{G}}=\frac{{\varvec{ZZ'}}}{2\sum _ip_i (1-p_i)} \end{aligned}$$

where ${\varvec{Z}}$ = ${\varvec{M}} - {\varvec{P}}$, where ${\varvec{M}}$ is a matrix of genotypes coded 0, 1 and 2, indicating the number of alternative alleles in the genotype (relative to the loblolly pine (Pinus taeda) reference genome v. 1.01e⁴⁹) , ${\varvec{P}}$ is twice the alternative allele, ${\varvec{g}}$ is the vector of random non-additive genetic effects following var(${\varvec{g}}$)$\sim$N(0,G2), where G2 is the variance-covariance structure for non-additive genetic effects following

$$\begin{aligned} G2=\begin{bmatrix} \sigma _{g_1}^2 &{}\ldots &{}\sigma _{g_1 g_n} \\ \vdots &{}\ddots &{}\vdots \\ \sigma _{g_n g_1} &{}\ldots &{}\sigma _{g_n}^2\\ \end{bmatrix}\otimes {\varvec{I}} \end{aligned}$$

where $\sigma _{g_1}^2$ and $\sigma _{g_n}^2$ are non-additive genetic variances for the 1st and the nth trait, $\sigma _{g_1 g_n}$ and $\sigma _{g_n g_1}$ are non-additive genetic covariances among the 1st and nth trait, ${\varvec{I}}$ is the identity matrix, ${\varvec{r}}$ is the vector of random replication effects following var(${\varvec{r}}$)$\sim$N(0,G3), where G3 is the variance-covariance structure for replication effects following

$$\begin{aligned} G3=\begin{bmatrix} \sigma _{r_1}^2 &{}\ldots &{}\sigma _{r_1 r_n} \\ \vdots &{}\ddots &{}\vdots \\ \sigma _{r_n r_1} &{}\ldots &{}\sigma _{r_n}^2\\ \end{bmatrix}\otimes {\varvec{I}} \end{aligned}$$

where $\sigma _{r_1}^2$ and $\sigma _{r_n}^2$ are replication variances for the 1st and the nth trait, $\sigma _{r_1 r_n}$ and $\sigma _{r_n r_1}$ are replication covariances between the 1st and the nth trait, ${\varvec{s}}$ is the vector of random set effects nested within replicate effects following var(${\varvec{s}}$)$\sim$N(0,G4), where G4 is the variance-covariance structure for set nested within replication effects following

$$\begin{aligned} G4=\begin{bmatrix} \sigma _{s_1}^2 &{}\ldots &{}\sigma _{s_1 s_n} \\ \vdots &{}\ddots &{}\vdots \\ \sigma _{s_n s_1} &{}\ldots &{}\sigma _{s_n}^2\\ \end{bmatrix}\otimes {\varvec{I}} \end{aligned}$$

where $\sigma _{s_1}^2$ and $\sigma _{s_n}^2$ are set nested within replication variances for the 1st and the nth trait, $\sigma _{s_1 s_n}$ and $\sigma _{s_n s_1}$ are set nested within replication covariances between the 1st and the nth trait, ${\varvec{e}}$ is the vector of random residuals following var(${\varvec{e}}$)$\sim$N(0,R), where R is the variance-covariance structure for residual effects following

$$\begin{aligned} R=\begin{bmatrix} \sigma _{e_1}^2 &{}\ldots &{}\sigma _{e_1 e_n} \\ \vdots &{}\ddots &{}\vdots \\ \sigma _{e_n e_1} &{}\ldots &{}\sigma _{e_n}^2\\ \end{bmatrix}\otimes {\varvec{I}} \end{aligned}$$

where $\sigma _{e_1}^2$ and $\sigma _{e_n}^2$ are residual variances for the 1st and the nth trait, $\sigma _{e_1 e_n}$ and $\sigma _{e_n e_1}$ are residual covariances between the 1st and the nth trait, and ${\varvec{X}}$ and ${\varvec{Z}}$ are incidence matrices assigning fixed and random effects to measurements.

The multivariate linear mixed model was implemented in the POP2GF population as follows:

$$\begin{aligned} {\varvec{Y}}={\varvec{X\beta }}+{\varvec{Za}}+{\varvec{Zg}}+{\varvec{Zr}}+{\varvec{Zb}}+{\varvec{e}} \end{aligned}$$

where ${\varvec{b}}$ is the vector of random incomplete block effect following var(${\varvec{b}}$)$\sim$N(0,G5), where G5 is the variance-covariance structure for incomplete block effects following

$$\begin{aligned} G5=\begin{bmatrix} \sigma _{b_1}^2 &{}\ldots &{}\sigma _{b_1 b_n} \\ \vdots &{}\ddots &{}\vdots \\ \sigma _{b_n b_1} &{}\ldots &{}\sigma _{b_n}^2\\ \end{bmatrix}\otimes {\varvec{I}} \end{aligned}$$

where $\sigma _{b_1}^2$ and $\sigma _{b_n}^2$ are incomplete block variances for the 1st and the nth trait, and $\sigma _{b_1 b_n}$ and $\sigma _{b_n b_1}$ are incomplete block covariances between the 1st and the nth trait.

Trait narrow-sense heritability was estimated as follows:

$$\begin{aligned} h^2=\frac{\sigma ^2_a}{\sigma ^2_a+\sigma ^2_e} \end{aligned}$$

The pair-wise marker-based genetic correlation was estimated in terms of Pearson’s product moment as follows:

$$\begin{aligned} r_g=\frac{\sigma _{a_xa_y}}{\sqrt{\sigma ^2_{a_x}\sigma ^2_{a_y}}} \end{aligned}$$

where $\sigma _{a_x a_y }$ is the additive genetic covariance between trait x and y explained by genetic markers, and $\sigma _{a_x}^2$ and $\sigma _{a_y}^2$ are the additive genetic variances for trait x and y explained by markers. The standard errors of all genetic parameters were estimated through ’delta method’ based on Taylor expansion⁵⁰. The similarity of correlation matrices between samples was compared using the Krzanowski test⁵¹ measuring the similarity between first principal components by using ’KrzCor’ function implemented in ’evolQG’ R package⁵² as follows:

$$\begin{aligned} r_{Krz}=\frac{1}{k} \sum _{i=1}^k\sum _{j=1}^k cos^2 (\Lambda _i^A,\Lambda _j^B) \end{aligned}$$

where k is the number of principal components derived from eigendecomposition of the additive genetic variance-covariance matrix and considered in test $k=\frac{n}{2}-1$ where n is the number of traits used in the multivariate analysis, and $\Lambda _i^A$ is the ith principal component of the additive genetic variance-covariance matrix of the Ath sample. The phenotypes in the multivariate analysis used to construct additive genetic variance-covariance matrix were standardized in this case.

Selection response decomposition⁵³ was performed to investigate similarity/dissimilarity between pairs of variance/covariance matrices obtained for each tested population and identify traits which cause differences in these matrices. The method implements the decomposition of evolutionary responses inferred from selection gradient vectors simulated through random skewers⁵⁴. The similarity in response to selection between two investigated genetic variance/covariance matrices is estimated as the average correlation between vectors of evolutionary responses inferred from the same simulated vectors of selection gradients. The multivariate response to selection is estimated as follows:

$$\begin{aligned} \begin{bmatrix} \Delta _{z_1} \\ \Delta _{z_2} \\ \vdots \\ \Delta _{z_n} \end{bmatrix} = \begin{bmatrix} G_{11}\beta _1+G_{12}\beta _2+\ldots +G_{1n}\beta _n\\ G_{21}\beta _1+G_{22}\beta _2+\ldots +G_{2n}\beta _n\\ \vdots \\ G_{n1}\beta _1+G_{n2}\beta _2+\ldots +G_{nn}\beta _n\\ \end{bmatrix} \end{aligned}$$

where ${\varvec{\Delta _z}}$ is the vector of response to selection, ${\varvec{G}}$ is the additive genetic variance/covariance matrix, ${\varvec{\beta }}$ is the simulated vector of directional selection gradients. The identification of statistically different traits between the investigated populations is then performed through the decomposition of the above-mentioned product of the variance/covariance matrix ${\varvec{G}}$ and the vector of directional selection gradients ${\varvec{\beta }}$ to trait specific vectors of response as follows:

$$\begin{aligned} \begin{bmatrix} G_{11}\beta _1+G_{12}\beta _2+\ldots +G_{1n}\beta _n\\ G_{21}\beta _1+G_{22}\beta _2+\ldots +G_{2n}\beta _n\\ \vdots \\ G_{n1}\beta _1+G_{n2}\beta _2+\ldots +G_{nn}\beta _n\\ \end{bmatrix} \Rightarrow \begin{matrix} (G_{11}\beta _1+G_{12}\beta _2+\ldots +G_{1n}\beta _n)\\ (G_{21}\beta _1+G_{22}\beta _2+\ldots +G_{2n}\beta _n)\\ \vdots \\ (G_{n1}\beta _1+G_{n2}\beta _2+\ldots +G_{nn}\beta _n)\\ \end{matrix} \end{aligned}$$

The correlations of trait-specific vectors are estimated across tested populations for all simulated scenarios, and their average is called the SRD score. If ${\varvec{G}}$ matrices are different, the traits will show a different response to direct or indirect selection and result in low SRD score and large variance in correlations. The opposite trends will be observed in case when ${\varvec{G}}$ matrices are similar.

The prediction of correlation breakers was performed through logistic regression, where the aim was to predict the binary status of the correlation breakers through the genetic markers. The individuals originating from the correlation breakers (POP1HD) population were marked as 1 while individuals from POP1GF population were marked as 0. The POP2GF population was then used as an independent population to identify correlation breakers through the genomic prediction model. The logistic regression was performed in the ‘ASReml-R’ statistical package⁴⁷ as follows:

$$\begin{aligned} {\varvec{y}}={\varvec{X\beta }}+{\varvec{Zu}}+{\varvec{e}} \end{aligned}$$

where ${\varvec{y}}$ is the vector of binary responses defining correlation breaker status, ${\varvec{\beta }}$ is the vector of fixed effects, and ${\varvec{u}}$ is the vector of pedigree or marker-based breeding values following var(${\varvec{u}}$)$\sim$N(0,${\varvec{A}}\sigma _u^2$), where ${\varvec{A}}$ is the average numerator relationship matrix which is substituted by the marker-based relationship matrix ${\varvec{G}}$ when genomic estimated breeding values are predicted. The leave-one-out strategy was selected to perform an independent evaluation due to the restricted number of individuals. The prediction accuracy was estimated as follows:

$$\begin{aligned} r=cor({\varvec{EBV}},{\varvec{GEBV}}) \end{aligned}$$

where ${\varvec{EBV}}$ is a vector of pedigree-based estimated breeding values, and ${\varvec{GEBV}}$ is a vector of predicted genomic breeding values. In the case of correlation breaker status (binary trait), two scenarios of cross-validation were performed: leave-one-out at (1) individual level (the same strategy as described above for quantitative traits); (2) family level where each time a whole single family was replaced as missing data to be predicted. The prediction accuracy was estimated in terms of area under the ROC (Receiver operating characteristic) curve (AUC). The AUC is the measure of the ability of a classifier to distinguish between classes (correlation breaker versus common individual status) and is used as a summary of the ROC curve. The higher the AUC, the better the performance of the model at distinguishing between the statuses of correlation breaker or common individual.

Table 1 Pedigree-based genetic correlation estimates and their standard errors (in parentheses).

Full size table

Ethics approval and consent to participate

The study complies with Scion internal rules and guidelines for field operations and sampling of genetic material. All permissions required for data collection and sampling of plant tissues for DNA extraction were obtained. There is no permission required for the research on radiata pine in New Zealand.

Results

Linkage disequilibrium

Linkage disequilibrium (LD) decay was investigated within scaffolds which contained at least three Single Nucleotide Polymorphisms (SNPs), and that mapped to the loblolly pine (Pinus taeda) reference genome v. 1.01e. We found intensive LD decay in the correlation breaker population (POP1HD) where a composite estimate of linkage disequilibrium $r^2$ of 0.2 (considered as the threshold beyond which LD has been completely eroded⁵⁵) was reached at 1 kb. By comparison, this same level of $r^2$ was reached at $\sim$2.5 kb in the POP1GF population and $\sim$ 2.1 kb in the POP2GF population. Therefore, the population of correlation breakers appeared to capture a higher amount of recombination events compared with the other two samples (Fig. 1). When LD was corrected for bias produced by familial relatedness, the observed linkage disequilibrium followed a similar patterns as before (i.e., including bias) but showed even faster decay along the scaffolds: the threshold of 0.2 was reached within $\sim$ 0.3 kb in correlation breakers (POP1HD) and within $\sim$ 0.6 kb in POP1GF population, while POP2GF reached $r^2$ of 0.2 within $\sim$ 0.3 kb (Fig. 1). Investigation of differences in LD patterns between the populations included in the training set (i.e., POP1GF and POP1HD) found that around 57% (LD including familial structure) and 48% (LD corrected for familial relatedness) of scaffolds had statistically significant differences in LD patterns. Therefore, familial relatedness contributed about 9% of statistically significant differences in LD patterns between populations. The correspondence in statistical significance between LD estimated (biased versus unbiased by familial relatedness) was relatively stable at the most significant and the most non-significant case with large changes in the middle of the distribution (Fig. 2). Spectral decomposition of the marker-based relationship matrix estimated across all samples showed that the POP1HD population created a compact cluster of related individuals when compared with populations selected for only growth and form. These correlation breakers are positioned at the centre of the investigated space (Fig. 3).

The statistical significance of the difference in effective population sizes between POP1GF and POP1HD was investigated using the resulting vectors of observed and/or expected heterozygosity in a t test. This analysis found statistically significant differences in both the mean observed as well as the expected heterozygosity between POP1GF and POP1HD. Although the mean expected heterozygosity was similar between both populations (POP1HD versus POP1GF) and reached values from 0.236 to 0.243, the average observed heterozygosity was lower in POP1HD (0.203) compared with POP1GF (0.226). Additionally, we investigated observed heterozygosity in markers from genomic regions with distinct LD pattern between POP1HD and POP1GF (based on results from the Jennrich test) and found higher observed heterozygosity of 0.23 compared to whole-genome observed heterozygosity of 0.203 in POP1HD compared to 0.22 which is similar to the whole-genome observed heterozygosity of 0.226 in POP1GF. Moreover, identity by state (IBS) coefficients were estimated to look at genetic similarity, unbiased by allelic frequencies. We found only subtle differences between the distributions of IBS within each investigated population as well as between them. While the average IBS coefficient within populations ranged from 0.766 (POP2GF) to 0.775 (POP1HD), slightly lower average IBS coefficients were found between populations ranging from 0.764 (POP1HD–POP2GF) to 0.767 (POP1GF–POP1HD and POP1GF–POP2GF) (Fig. 4).

Genetic parameters

Both pedigree and marker-based analyses were able to recover statistically significant variance components and heritability estimates across all traits and populations/sites. The lowest heritability was observed for straightness (ST9), estimated to be 0.093 (POP2GF in Woodhill) − 0.212 (POP1HD) in the pedigree-based analysis. The highest heritability was observed for wood density (WD), reaching 0.350 (POP2GF in Kinleith) − 0.585 (POP1HD) when estimated from the pedigree across all samples. The marker-based analysis showed a similar pattern with the lowest heritability estimated for ST9, ranging from 0.087 (POP2GF in Kinleith) − 0.225 (POP1GF), and highest heritability estimated for WD, ranging from 0.518 (POP1GF) − 0.608 (POP2GF in Kinleith) (Table S1).

Genetic correlations were estimated within each population at each site separately using both pedigree and marker-based analyses. These correlations serve as input parameters to compare differences in the multivariate response to selection between populations and environments. The genetic correlations generally corresponded across the methods. However, the 95% confidence limits were larger in pedigree-based estimates compared to marker-based equivalents, especially in the middle part of the distribution (Tables 1, 2; Fig. 5). The weakest genetic correlations were found between BR9 and DBH in POP2GF in Kinleith (0.061) and between DBH and WD in POP1HD (0.026), while the strongest genetic correlations were found between BR9 and ST9 in POP1HD (0.829) and in POP1GF (0.635) in pedigree-based analysis. It is worth noting that moderate negative correlations were observed between DBH and WD in all populations except for the above mentioned POP1HD, which reached values from − 0.318 to − 0.441 (Table 1). When marker-based analysis was implemented, the weakest genetic correlations were found between ST9 and WD in POP1GF (− 0.001) and between DBH and WD in POP1HD (0.004), while the strongest genetic correlations were found between BR9 and ST9 in POP1HD (0.774) and POP1GF (0.509) (Table 2). Substantial differences were observed in the estimation of genetic correlations at the sample level. The Krzanowski test (Table 3) found large differences between correlation matrices obtained in the correlation breaker population (POP1HD) compared with both other populations (POP1GF and POP2GF) across all sites. The correlation between correlation matrices obtained in POP1HD and POP1GF reached only 0.090, compared to 0.107 and 0.315 obtained between POP1HD and POP2GF. As expected, the highest correlations have been achieved between POP1GF and POP2GF, reaching from 0.770 to 0.913, since these two populations have undergone a similar selection history (Table 3).

Selection response decomposition⁵³ based on simulated vectors of selection responses was generally in an agreement between genetic variance-covariance matrices and differences appeared to be caused by environmental rather than genetic heterogeneity. For example, POP1GF and POP2GF, which are both tested in Woodhill and share the highest number of parents, also showed the highest correspondence among the estimates. On the other hand, the lowest correspondence among estimates was obtained between POP2GF tested in two different environments, which indicates the high impact of GxE on the genetic correlation matrix. The lowest correspondence of estimates between the identical genetic material planted at two different environments indicates strong influence or environmental conditions, especially on traits related to growth and stem form. Surprisingly, at high correspondence among estimates was also found between POP1GF and POP1HD, populations with different selection histories but planted in the same environment. A low correspondence was estimated among the POP1HD and POP2GF populations compared with the correspondence among the POP1GF and POP2GF populations, which could be attributable to the difference in selection history (Fig. 6).

Table 2 Marker-based genetic correlation estimates and their standard errors (in parentheses).

Full size table

Genomic prediction accuracy

A genomic prediction model using correlation breaker status as a binary trait was implemented in two scenarios: (1) testing prediction accuracy of correlation breaker status within a training population that combined POP1HD and POP1GF, and (2) identifying correlation breakers in POP2GF. The ability to predict the status of correlation breakers in term of prediction accuracy was high and reached 0.80 when leave-one-out scenario was implemented and 0.72 when whole family was removed the from training population and their status predicted. Analysis of the POP2GF population resulted in the identification of 19 individuals with a predicted GEBV higher than 0.75, based on their genomic profile and after back transformation. When these individuals are plotted against their breeding values for DBH and WD, the predicted correlation breakers are in the centre of the distribution, with some cases that are superior for both traits and some that are inferior for both traits (Fig. 7). This might be caused by GxE interaction. The identification of correlation breaker status was performed on the GEBV identified at the 95 percentile of cumulative distribution of genomic breeding values. Since there was 35% of correlation breakers in training population, the 95 percentile represents very conservative approach to detect individuals with excess of heterozygosity within the genomic regions involving pleiotropic or collocating QTLs for investigated traits. In our case, the threshold GEBV value corresponding to 95 percentile was 0.75.

Table 3 Krzanowski correlations of marker-based (below diagonal) and pedigree-based (above diagonal) additive genetic variance-covariance matrices between samples.

Full size table

Discussion

Evolutionary advantages of correlation breakers

Our study investigated the potential of genomic selection to identify individuals with high levels of recombination once the proper training population is established. These individuals are called “correlation breakers” and usually represent individuals that deviate strongly from general trends in genetic correlation between traits (traits that usually have unfavourable genetic relationships). We proposed that the identification of such individuals will lead to improvements in the response to selection in both traits, when traits are adversely correlated, due to decayed genetic correlation. We also postulated that high recombination might also improve population resilience to climate change. Under random mating, the cause of genetic correlations is proposed to be the result of mainly pleiotropic mutations⁴. In populations under non-random mating, genetic correlations are the results of LD between loci affecting the correlated traits, causing inbreeding⁶. However, the process of adaptation to local environmental conditions may result in the development of genomic rearrangements, such as inversions⁵⁶, that produce clusters of tightly linked adaptive loci and are referred as super-genes⁵⁷. Such genomic rearrangements can potentially reshape the network of genetic correlations between adaptive traits. Genetic correlations represent evolutionary constraints, and their unfavourable combination can compromise progress in evolutionary changes. This can be detrimental under intensive environmental changes that require a fast genetic response. Theoretical studies⁵⁸ have shown the evolutionary advantages of recombination due to the ability of linked loci to interfere with each other’s response to selection.

Additionally, an excess of recombination events has been shown to be favourable under heterogeneous environmental conditions¹⁴. Correlation breakers represent individuals that have accumulated an excess of recombination events, and propose their infusion into breeding program/forest plantations to support population resilience under changing environmental conditions. This is especially important in forest trees, as they are long-living organisms facing temporal variability in environmental conditions along their ontogenetical development, exacerbated by climate change. Mathematical modelling found an advantage of recombination at the early stage of adaptation, while selection against recombination was pronounced at later stages to eliminate maladaptive gene flow^12,59. Additionally, the empirical study in lodgepole pine did not find any higher recombination rates between genes associated to different aspects of current climate patterns⁶⁰. However, if we assume continuously changing environmental conditions with a higher frequency of extreme weather and climate events^61,62, evolutionary processes might change direction which would shift back to the early stage of adaptation, and an excess of recombination would be advantageous. The decomposition of contemporary genetic correlations will not only release new genetic variation but also increase progress in evolutionary responses to new environmental conditions. Nevertheless, such a putative positive effect is restricted only to the genetic correlations caused by LD, while those caused by pleiotropic mutations should remain unaffected by recombination rates. However, pleiotropy-related genetic correlations could still be influenced by new environmental conditions through other mechanisms⁹.

The identification of individuals with an excess of recombination events is not straightforward since LD decay is a population-specific parameter which depends on breeding and selection history⁶³. Therefore, the establishment of a specific population containing correlation breakers is required. Such populations can be created through tandem selection as implemented in this study, or by complementary mating where the best performing individuals for each unfavourably correlated trait are mated¹⁵. Therefore, being a correlation breaker is more a relative concept than heritable feature. However, our analysis of observed heterogeneity discovered that, while the population of correlation breakers showed lower levels of heterozygosity across the whole genome, possibly due to directional selection for both traits (DBH and WD), they also showed increased level of observed heterozygosity in regions with statistically different patterns in LD. These regions might contain groups of pleiotropic or collocating QTLs for selected traits. Our multivariate analysis discovered that the tandem selection strategy disrupted most of the commonly expected correlations in the population apart from the correlation among BR9 and ST9 (Tables 1, 2). Such results are probably the product of the pleiotropic architecture of the traits rather than a function of the genetic correlations themselves⁶⁴. The clear distinction in networks of genetic correlation was confirmed in the results from the Krzanowski test⁵¹. However, a more obvious pattern was observed in the marker-based compared with pedigree-based analysis. This shows the strong ability of markers generated through exome capture genotyping by sequencing to track differences in selection and breeding history compared with the expected genealogy captured by pedigrees. While the information included in pedigrees is limited by the definition of the base population (pedigree founders) as having with no relatedness and inbreeding, the markers can trace both the relatedness created recently within breeding program and also relatedness and population structure that existed prior to the formation of the base population⁶⁵. Additionally, genomic markers enable tracing of the Mendelian sampling term and linkage disequilibrium between QTLs and markers as oppose to the expected identity by descent derived from pedigree information⁶⁶. On the other hand, such a clear pattern was not observed in the selection response decomposition, where the differences between correlation patterns appeared to be caused by environmental rather than genetic heterogeneity (Fig. 5). However, the sample tested in field experiments had passed several cycles of selection, and thus the genetic diversity needed for robust estimates of genetic correlation was limited, in addition to the contribution of the small sample size to the unclear pattern in genetic correlation comparisons⁶⁷.

The establishment of a robust training population is crucial to be able to predict individuals with a high levels of recombination events (correlation breakers). Our training population was established from sets of progenies derived from parents selected with a tandem selection strategy, and progeny of parents from a directional selection strategy. The ability to distinguish LD decay pattern developed from exome capture markers (Fig. 1) was crucial to clearly identify the individuals belonging to the correlations breakers set when population structure was investigated (Fig. 2). The accuracy of predictions for correlation breaker status was high, reaching 0.80, which provides strong evidence to support the selection of individuals with an excess of recombination events. Such individuals are preferred under heterogeneous environmental conditions¹⁴, and their infusion into populations should help erode unfavourable genetic correlations and initiate faster evolutionary processes, in response to climate change. The genomic selection prediction model was also able to identify 19 individuals as having correlation breaker status in an independent population (POP2GF). These individuals are centred in the middle of the distribution with regard to breeding values for DBH and WD, with some having breeding values above average in both traits to some having breeding values below average in both traits (Fig. 6). However, since the LD pattern is population specific, the status of correlation breakers has to be considered in the context of the training population and corresponds to the level of genetic diversity present. Therefore, genetically broader samples should be used in training and in the definition of correlation breakers.

Mitigation of climate change through genomics

Forest trees are mostly widespread species occupying geographically large and environmentally highly heterogeneous areas which, accompanied with intensive long-distance gene flow, should support the rapid adaptation of populations to new environmental conditions⁶⁸. However, there are concerns that such natural mechanisms are not fast enough to cope with the current speed of climate change without human intervention, and a more active approach needs to be applied. An assisted gene flow approach⁶⁹ was proposed as a mechanism to shift current populations towards their future optimal conditions as predicted by climate models^70,71,72,73. The current development of genomic resources in forest trees enables a deeper insight into adaptation processes through the detection of selection signals at the genome level^23,74,75. However, adaptive traits are rather complex, and capturing selection signals through association mapping can be challenging⁷⁶. Therefore, genomic selection⁷⁷ or genome-wide selection scans^78,79 based on the deployment of genetic markers to predict phenotype through multivariate regression models seems to be a more feasible solution to predict adaptive traits. Arenas et al.⁸⁰ proposed genomic prediction of putative adaptive traits in small natural population of relict species to optimize conservation management.

Genomic selection has been successfully implemented in animals^81,82, agriculture crops^83,84 and forest trees^{85,86,87,88,89,90,91}. Compared to animals and plants, however, forest trees have high genetic diversity, extreme genome lengths (e.g., $\sim$ 25 Gb in radiata pine), rapid LD decay^23,92, and breeding programs are generally only in their early stages due to late expression of sexual maturity and long breeding cycles. All these factors pose challenges to the implementation of genomic prediction in forest trees. Grattapaglia and Resende⁹³ performed a deterministic simulation of scenarios relevant to forest trees and found that a sufficient density of markers is a crucial requirement to perform genomic prediction successfully.

The extreme length of the genome of forest trees prohibits easily capturing the whole genome through whole genome re-sequencing, and genotyping platforms based on reduced representation approaches have to be deployed^36,94. Since prediction models can get overwhelmed with the exhaustive amount of genomic data⁶⁶, the effort invested into the reduction of genome complexity should not be seen not as a handicap, but rather as a challenge to identify relevant genomic fragments. Our study is based on genomic resources that were developed from the sequencing of transcriptomes from a range of different tissues, including buds, needles, xylem, and phloem³⁸. These genomic resources generated $\sim$ 800 K SNPs representing $\sim$ 40 K genes. However, most SNPs were rare variants, likely due to the fact that the exome is a highly conserved region, and only $\sim$ 80 K SNPs were used to generate genotypes for each individual.

The radiata pine genomic resources generated within our project were assembled against the loblolly pine draft reference genome⁹⁵, which allowed us to look at LD decay with physical distance. The population of correlation breakers (POP1HD) showed much more rapid LD decay compared with the other samples (POP1GF and POP2GF; Fig. 1). However, all samples appeared to aggregate in a cloud around $r^2$ of 0.6 in the correlation breakers (POP1HD) and POP2GF, and around 0.7 in POP1GF across the investigated range of physical distances (Fig. 1). This is likely due to differences in synteny between the loblolly pine and radiata pine genomes. The estimates of 1 kb, 2.1 kb and 2.5 kb around an $r^2$ of 0.2 in the correlation breakers and the other two populations (Fig. 1) are probably upwardly biased due to low sample sizes⁹⁶ and the actual decay in linkage disequilibrium is most likely more intensive (especially in the correlation breakers population where the sample size was even lower). Chao et al.⁶³ found differences in the long-range level of LD between different wheat populations and attributed it to differences in breeding and selection history. However, they did not find any differences in LD decay, and each population captured a comparable amount of recombinant events.

In contrast, our samples do show a difference in LD decay, with the fastest decay apparent in the correlation breakers population, which provides evidence that a higher amount of recombination events is being captured in this sample compared with others. This pattern is expected, as the correlation breakers population was selected to break the adverse pattern in genetic correlations between two highly complex traits, which would require a higher level of recombination between a large number of QTLs associated with the different traits. Additionally, the increased recombination rate in this population could stimulate an increased response to selection and a decreased loss of additive genetic variance over time⁸².

Since the expectation for $r^2$ is $E(r^2 )=\frac{1}{1+4Nc}$ where N is effective population size and c is recombination rate, the difference in the effective population size investigated through the mean observed heterozygosity was statistically significant among populations and thus N is likely to contribute to the LD estimate. This was especially observed when LD was corrected for bias due to familial relatedness, resulting in much faster decrease in LD decay in POP1GF compared to POP1HD. However, even after correcting for LD bias due to familial relatedness, POP1HD showed LD that decayed twice as fast as POP1GF. Additionally, correction of LD for familial relatedness resulted in only 9% decrease in the number of scaffolds showing statistically significant difference in LD patterns between POP1GF and POP1HD (Fig. 2), and thus difference in recombination rates is likely the driver of differences in LD patterns. The lower mean observed heterozygosity in correlation breakers population can be seen as contradictory to the notion that the higher level of heterozygosity is favourable in heterogeneous environmental conditions⁹⁷. The reduced level of heterozygosity can be connected to accumulation of favourable alleles for QTLs involved in genetic architecture of both traits under selection (DBH and WD). Additionally, despite the relatively high conservedness of exome regions, as evidenced by similar IBS within and between populations—Fig. 7, we were able to track differences in both general LD decay patterns and differences in local LD patterns between investigated populations to identify individuals with high levels of recombination (Figs. 1, 2).

The key advantage of the genomics-based approach in breeding is the recovery of both temporal and historical relatedness in studied populations⁶⁵ through the construction of a marker-based relationship matrix^41,98. The implementation of this type of relationship matrix in genetic evaluations involves the simple substitution of the pedigree-based alternative and does not require any additional data treatment. The spectral decomposition of the marker-based relationship matrix found a strong difference in population structure between the correlation breakers that were selected for HD and other samples selected for GF. While samples selected for GF show high dispersion across the investigated space, the correlation breakers population was concentrated in the middle (Fig. 2). Therefore, it is likely that stronger selection in the trait with a higher heritability (HD) is reflected in the reduced genetic diversity of POP1HD sample. As reported in this study, genetic correlation networks change with changes in environmental conditions. This might be the consequence of gene x environment interactions and the resulting impact on multivariate responses. Although the QTL effects might fluctuate with changes in environmental conditions, we assume the underlying genetic architecture will remain the same. Thus, the selection for increased heterozygosity in genomic regions accumulating pleiotropic or collocating QTLs for traits under selection might be an effective way to decompose unfavourable genetic correlations. The tandem selection implemented in the radiata pine breeding program proved to be a successful approach to decompose unfavourable genetic correlations and stimulate multivariate response to selection. However, we expect that the efficiency of this approach will depend on the genetic architecture involved in genetic correlations. In cases, where pleiotropic effects are the primary driver of genetic correlations, the tandem selection might be less efficient compared to cases where genetic correlation is caused by collocating QTLs.

Conclusions

We propose that environment-specific (representing the average New Zealand environment) selection be undertaken using genomic predictions by increasing the frequency of ’recombined’ individuals. We propose that high levels of recombination will also confer long-term resilience in planted forests, vital to the success of these populations under a changing climate. The increased frequency of such individuals may decrease the strength of the population-level genetic correlations among traits, which will, in turn, increase the opportunity for new trait combinations to be developed in the future.

References

Kremer, A., Potts, B. M. & Delzon, S. Genetic divergence in forest trees: Understanding the consequences of climate change. Funct. Ecol. 28, 22–36 (2014).
Article Google Scholar
Solovieff, N., Cotsapas, C., Lee, P. H., Purcell, S. M. & Smoller, J. W. Pleiotropy in complex traits: Challenges and strategies. Nat. Rev. Genet. 14, 483–495 (2013).
Article CAS PubMed PubMed Central Google Scholar
Falconer, D. S. & Mackay, T. F. Introduction to Quantitative Genetics (Addison Wesley Longman, Essex, 1996).
Google Scholar
Lande, R. The genetic covariance between characters maintained by pleiotropic mutations. Genetics 94, 203–215 (1980).
Article CAS PubMed PubMed Central Google Scholar
Conner, J. K. Genetic mechanisms of floral trait correlations in a natural population. Nature 420, 407–410 (2002).
Article ADS CAS PubMed Google Scholar
Lande, R. The genetic correlation between characters maintained by selection, linkage and inbreeding. Genet. Res. 44, 309–320 (1984).
Article ADS CAS PubMed Google Scholar
Sedlacek, J. et al. Evolutionary potential in the Alpine: Trait heritabilities and performance variation of the dwarf willow Salix herbacea from different elevations and microhabitats. Ecol. Evol. 6, 3940–3952 (2016).
Article PubMed PubMed Central Google Scholar
Rolian, C., Lieberman, D. E. & Hallgrímsson, B. The coevolution of human hands and feet. Evol. Int. J. Organ. Evol. 64, 1558–1568 (2010).
Article Google Scholar
Sgrò, C. M. & Hoffmann, A. A. Genetic correlations, tradeoffs and environmental variation. Heredity 93, 241–248 (2004).
Article PubMed Google Scholar
Hoffmann, A. A. et al. Evolutionary Genetics and Environmental Stress (Oxford University Press, Oxford, 1991).
Google Scholar
Hoffmann, A. A. & Merilä, J. Heritable variation and evolution under favourable and unfavourable conditions. Trends Ecol. Evol. 14, 96–101 (1999).
Article CAS PubMed Google Scholar
Losos, J. B. Convergence, adaptation, and constraint. Evol. Int. J. Organ. Evol. 65, 1827–1840 (2011).
Article Google Scholar
Bürger, R. Evolution of genetic variability and the advantage of sex and recombination in changing environments. Genetics 153, 1055–1069 (1999).
Article PubMed PubMed Central Google Scholar
Lenormand, T. & Otto, S. P. The evolution of recombination in a heterogeneous environment. Genetics 156, 423–438 (2000).
Article CAS PubMed PubMed Central Google Scholar
White, T. L., Adams, W. T. & Neale, D. B. Forest Genetics (Cabi, 2007).
Wu, H. et al. Breeding for wood quality and profit in Pinus radiata: A review of genetic parameter estimates and implications for breeding and deployment. NZ J. For. Sci. 38, 56–87 (2008).
Google Scholar
King, D. A. The adaptive significance of tree height. Am. Nat. 135, 809–828 (1990).
Article Google Scholar
Greenwood, S. et al. Tree mortality across biomes is promoted by drought intensity, lower wood density and higher specific leaf area. Ecol. Lett. 20, 539–553 (2017).
Article PubMed Google Scholar
Hacke, U. G., Sperry, J. S., Pockman, W. T., Davis, S. D. & McCulloh, K. A. Trends in wood density and structure are linked to prevention of xylem implosion by negative pressure. Oecologia 126, 457–461 (2001).
Article ADS PubMed Google Scholar
Van Gelder, H., Poorter, L. & Sterck, F. Wood mechanics, allometry, and life-history variation in a tropical rain forest tree community. New Phytol. 171, 367–378 (2006).
Article PubMed Google Scholar
Searle, S. Phenotypic, genetic and environmental correlations. Biometrics 17, 474–480 (1961).
Article Google Scholar
Stearns, S., de Jong, G. & Newman, B. The effects of phenotypic plasticity on genetic correlations. Trends Ecol. Evol. 6, 122–126 (1991).
Article CAS PubMed Google Scholar
Neale, D. B. & Savolainen, O. Association genetics of complex traits in conifers. Trends Plant Sci. 9, 325–330 (2004).
Article CAS PubMed Google Scholar
Yu, J. et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38, 203–208 (2006).
Article CAS PubMed Google Scholar
Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006).
Article CAS PubMed Google Scholar
Oraguzie, N. C., Gardiner, S. E., Rikkerink, E. H. & Silva, H. N. Association Mapping in Plants (Springer, New York, 2007).
Book Google Scholar
Jia, Y. & Jannink, J.-L. Multiple-trait genomic selection methods increase genetic value prediction accuracy. Genetics 192, 1513–1522 (2012).
Article PubMed PubMed Central Google Scholar
Marchal, A. et al. Multivariate genomic model improves analysis of oil palm (Elaeis guineensis jacq.) progeny tests. Mol. Breed. 36, 2 (2016).
Article Google Scholar
Jayawickrama, K. et al. A breeding strategy for the New Zealand radiata pine breeding cooperative. Silvae Genet. 49, 82–89 (2000).
Google Scholar
Dungey, H. et al. A new breeding strategy for Pinus radiata in New Zealand and New South Wales. Silvae Genet. 58, 28–38 (2009).
Article Google Scholar
Vincent, T. G. Certification system for forest tree seed and planting stock (Report, New Zealand Forest Research Institute, 1987).
Li, Y., Wilcox, P., Telfer, E., Graham, N. & Stanbra, L. Association of single nucleotide polymorphisms with form traits in three New Zealand populations of radiata pine in the presence of genotype by environment interactions. Tree Genet. Genomes 12, 63 (2016).
Article Google Scholar
Carson, S. Genotype x environment interaction and optimal number of progeny test sites for improving Pinus radiata in New Zealand. NZJ For. Sci 21, 32–49 (1991).
Google Scholar
Carson, M. J. Control-Pollinated Seed Orchards of Best General Combiners: A New Strategy for Radiata Pine Improvement (New Zealand Forest Service Rotorua, 1986).
Smith, D. M. Maximum Moisture Content Method for Determining Specific Gravity of Small Wood Samples (USDA, Forest Service (Forest Product Laboratory. Madison, Wisc, 1954).
Neves, L. G., Davis, J. M., Barbazuk, W. B. & Kirst, M. Whole-exome targeted sequencing of the uncharacterized pine genome. Plant J. 75, 146–156 (2013).
Article CAS PubMed Google Scholar
Telfer, E. et al. Approaches to variant discovery for conifer transcriptome sequencing. PLoS ONE 13, e0205835 (2018).
Article PubMed PubMed Central CAS Google Scholar
Telfer, E. et al. A high-density exome capture genotype-by-sequencing panel for forestry breeding in Pinus radiata. PLoS ONE 14, e0222640 (2019).
Article CAS PubMed PubMed Central Google Scholar
Weir, B. S. et al. Genetic data analysis. In Methods for Discrete Population Genetic Data (Sinauer Associates, Inc. Publishers, 1979).
Mangin, B. et al. Novel measures of linkage disequilibrium that correct the bias due to population structure and relatedness. Heredity 108, 285–291 (2012).
Article CAS PubMed Google Scholar
VanRaden, P. M. Efficient methods to compute genomic predictions. J. Dairy Sci. 91, 4414–4423 (2008).
Article CAS PubMed Google Scholar
Desrousseaux, D., Sandron, F., Siberchicot, A., Cierco-Ayrolles, C. & Mangin, B. LDcorSV: Linkage Disequilibrium Corrected by the Structure and the Relatedness 1. R package version 1.3 (2016).
Archie, J. W. Statistical analysis of heterozygosity data: Independent sample comparisons. Evolution 39, 623–637 (1985).
Article PubMed Google Scholar
Hill, W. & Weir, B. Variances and covariances of squared linkage disequilibria in finite populations. Theor. Popul. Biol. 33, 54–78 (1988).
Article MathSciNet CAS PubMed MATH Google Scholar
Remington, D. L. et al. Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proc. Natl. Acad. Sci. 98, 11479–11484 (2001).
Article ADS CAS PubMed PubMed Central Google Scholar
Jennrich, R. I. An asymptotic $\chi$2 test for the equality of two correlation matrices. J. Am. Stat. Assoc. 65, 904–912 (1970).
MathSciNet MATH Google Scholar
Butler, D., Cullis, B. R., Gilmour, A. & Gogel, B. ASReml-R Reference Manual (The State of Queensland, Department of Primary Industries and Fisheries, Brisbane, 2009).
Wright, S. Coefficients of inbreeding and relationship. Am. Nat. 56, 330–338 (1922).
Article Google Scholar
Zimin, A. et al. Sequencing and assembly of the 22-Gb loblolly pine genome. Genetics 196, 875–890 (2014).
Article CAS PubMed PubMed Central Google Scholar
Lynch, M. et al. Genetics and Analysis of Quantitative Traits Vol. 1 (Sinauer Sunderland, Sunderland, 1998).
Google Scholar
Krzanowski, W. Between-groups comparison of principal components. J. Am. Stat. Assoc. 74, 703–707 (1979).
Article MathSciNet MATH Google Scholar
Melo, D., Garcia, G., Hubbe, A., Assis, A. P. & Marroig, G. EvolQG—an R package for evolutionary quantitative genetics. F1000Res. 4, 925 (2015).
Article Google Scholar
Marroig, G., Melo, D., Porto, A., Sebastiao, H. & Garcia, G. Selection response decomposition (SRD): A new tool for dissecting differences and similarities between matrices. Evol. Biol. 38, 225–241 (2011).
Article Google Scholar
Cheverud, J. M. Quantitative genetic analysis of cranial morphology in the cotton-top (Saguinus oedipus) and saddle-back (S. fuscicollis) tamarins. J. Evol. Biol. 9, 5–42 (1996).
Article Google Scholar
Otyama, P. I. et al. Evaluation of linkage disequilibrium, population structure, and genetic diversity in the us peanut mini core collection. BMC Genom. 20, 1–17 (2019).
Article CAS Google Scholar
Huang, K. & Rieseberg, L. H. Frequency, origins, and evolutionary role of chromosomal inversions in plants. Front. Plant Sci. 11, 296 (2020).
Article PubMed PubMed Central Google Scholar
Yeaman, S. Genomic rearrangements and the evolution of clusters of locally adaptive loci. Proc. Natl. Acad. Sci. 110, E1743–E1751 (2013).
Article ADS CAS PubMed PubMed Central Google Scholar
Hill, W. G. & Robertson, A. The effect of linkage on limits to artificial selection. Genet. Res. 8, 269–294 (1966).
Article CAS PubMed Google Scholar
Reeve, J., Ortiz-Barrientos, D. & Engelstädter, J. The evolution of recombination rates in finite populations during ecological speciation. Proc. R. Soc. B Biol. Sci. 283, 20161243 (2016).
Article Google Scholar
Lotterhos, K. E., Yeaman, S., Degner, J., Aitken, S. & Hodgins, K. A. Modularity of genes involved in local adaptation to climate despite physical linkage. Genome Biol. 19, 157 (2018).
Article PubMed PubMed Central Google Scholar
Easterling, D. R. et al. Climate extremes: Observations, modeling, and impacts. Science 289, 2068–2074 (2000).
Article ADS CAS PubMed Google Scholar
Scranton, K. & Amarasekare, P. Predicting phenological shifts in a changing climate. Proc. Natl. Acad. Sci. 114, 13212–13217 (2017).
Article CAS PubMed PubMed Central Google Scholar
Chao, S. et al. Population-and genome-specific patterns of linkage disequilibrium and SNP variation in spring and winter wheat (Triticum aestivum l.). BMC Genom. 11, 1–17 (2010).
Article Google Scholar
Gromko, M. H. Unpredictability of correlated response to selection: Pleiotropy and sampling interact. Evolution 49, 685–693 (1995).
Article PubMed Google Scholar
Powell, J. E., Visscher, P. M. & Goddard, M. E. Reconciling the analysis of IBD and IBS in complex trait studies. Nat. Rev. Genet. 11, 800–805 (2010).
Article CAS PubMed Google Scholar
Habier, D., Fernando, R. L. & Garrick, D. J. Genomic BLUP decoded: A look into the black box of genomic prediction. Genetics 194, 597–607 (2013).
Article CAS PubMed PubMed Central Google Scholar
Bijma, P. & Bastiaansen, J. W. Standard error of the genetic correlation: How much data do we need to estimate a purebred-crossbred genetic correlation?. Genet. Sel. Evol. 46, 79 (2014).
Article PubMed PubMed Central Google Scholar
Kremer, A. et al. Long-distance gene flow and adaptation of forest trees to rapid climate change. Ecol. Lett. 15, 378–392 (2012).
Article PubMed PubMed Central Google Scholar
Vitt, P., Havens, K., Kramer, A. T., Sollenberger, D. & Yates, E. Assisted migration of plants: Changes in latitudes, changes in attitudes. Biol. Conserv. 143, 18–27 (2010).
Article Google Scholar
Gray, L. K., Gylander, T., Mbogga, M. S., Chen, P.-Y. & Hamann, A. Assisted migration to address climate change: Recommendations for aspen reforestation in western Canada. Ecol. Appl. 21, 1591–1603 (2011).
Article PubMed Google Scholar
McLane, S. C. & Aitken, S. N. Whitebark pine (Pinus albicaulis) assisted migration potential: Testing establishment north of the species range. Ecol. Appl. 22, 142–153 (2012).
Article PubMed Google Scholar
Aitken, S. N. & Whitlock, M. C. Assisted gene flow to facilitate local adaptation to climate change. Ann. Rev. Ecol. Evol. Syst.44, 367–388 (2013).
Article Google Scholar
Aitken, S. N. & Bemmels, J. B. Time to get moving: Assisted gene flow of forest trees. Evol. Appl. 9, 271–290 (2016).
Article PubMed Google Scholar
Neale, D. B. & Kremer, A. Forest tree genomics: Growing resources and applications. Nat. Rev. Genet. 12, 111–122 (2011).
Article CAS PubMed Google Scholar
Yeaman, S. et al. Convergent local adaptation to climate in distantly related conifers. Science 353, 1431–1433 (2016).
Article ADS CAS PubMed Google Scholar
Ćalić, I., Bussotti, F., Martínez-García, P. J. & Neale, D. B. Recent landscape genomics studies in forest trees—what can we believe?. Tree Genet. Genomes 12, 3 (2016).
Article Google Scholar
Meuwissen, T., Hayes, B. & Goddard, M. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157, 1819–1829 (2001).
Article CAS PubMed PubMed Central Google Scholar
Cortés, A. J., Restrepo-Montoya, M. & Bedoya-Canas, L. E. Modern strategies to assess and breed forest tree adaptation to changing climate. Front. Plant Sci. 11, 1606 (2020).
Article Google Scholar
Cortés, A. J., López-Hernández, F. & Osorio-Rodriguez, D. Predicting thermal adaptation by looking into populations’ genomic past. Front. Genet. 11, 1093 (2020).
Article Google Scholar
Arenas, S., Cortés, A. J., Mastretta-Yanes, A. & Jaramillo-Correa, J. P. Evaluating the accuracy of genomic prediction for the management and conservation of relictual natural tree populations. Tree Genet. Genomes 17, 1–19 (2021).
Article Google Scholar
Goddard, M. E., Hayes, B. J. & Meuwissen, T. H. Genomic selection in livestock populations. Genet. Res. 92, 413–421 (2010).
Article CAS Google Scholar
Battagin, M., Gorjanc, G., Faux, A.-M., Johnston, S. E. & Hickey, J. M. Effect of manipulating recombination rates on response to selection in livestock breeding programs. Genet. Sel. Evol. 48, 1–12 (2016).
Article CAS Google Scholar
Heffner, E. L., Sorrells, M. E. & Jannink, J.-L. Genomic selection for crop improvement. Crop Sci. 49, 1–12 (2009).
Article CAS Google Scholar
Heffner, E. L., Lorenz, A. J., Jannink, J.-L. & Sorrells, M. E. Plant breeding with genomic selection: Gain per unit time and cost. Crop Sci. 50, 1681–1690 (2010).
Article Google Scholar
Resende, M. Jr. et al. Accelerating the domestication of trees using genomic selection: Accuracy of prediction models across ages and environments. New Phytol. 193, 617–624 (2012).
Article PubMed Google Scholar
Resende, M. F. et al. Accuracy of genomic selection methods in a standard data set of loblolly pine(Pinus taeda l.). Genetics 190, 1503–1510 (2012).
Article PubMed PubMed Central Google Scholar
Beaulieu, J., Doerksen, T., Clément, S., MacKay, J. & Bousquet, J. Accuracy of genomic selection models in a large population of open-pollinated families in white spruce. Heredity 113, 343–352 (2014).
Article CAS PubMed PubMed Central Google Scholar
Ratcliffe, B. et al. A comparison of genomic selection models across time in interior spruce (Picea engelmannii$\times$ glauca) using unordered SNP imputation methods. Heredity 115, 547–555 (2015).
Article CAS PubMed PubMed Central Google Scholar
El-Dien, O. G. et al. Prediction accuracies for growth and wood attributes of interior spruce in space using genotyping-by-sequencing. BMC Genom. 16, 370 (2015).
Article CAS Google Scholar
Bartholomé, J. et al. Performance of genomic prediction within and across generations in maritime pine. BMC Genom. 17, 1–14 (2016).
Article Google Scholar
Lenz, P. R. et al. Factors affecting the accuracy of genomic selection for growth and wood quality traits in an advanced-breeding population of black spruce (Picea mariana). BMC Genom. 18, 335 (2017).
Article Google Scholar
Neale, D. B. Genomics to tree breeding and forest health. Curr. Opin. Genet. Dev. 17, 539–544 (2007).
Article CAS PubMed Google Scholar
Grattapaglia, D. & Resende, M. D. Genomic selection in forest tree breeding. Tree Genet. Genomes 7, 241–255 (2011).
Article Google Scholar
Elshire, R. J. et al. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE 6, e19379 (2011).
Article ADS CAS PubMed PubMed Central Google Scholar
Neale, D. B. et al. Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies. Genome Biol. 15, 1–13 (2014).
Article CAS Google Scholar
Terwilliger, J. D. & Hiekkalinna, T. An utter refutation of the ‘Fundamental Theorem of the HapMap’. Eur. J. Hum. Genet. 14, 426–437 (2006).
Article CAS PubMed Google Scholar
Reed, D. H. & Frankham, R. Correlation between fitness and genetic diversity. Conserv. Biol. 17, 230–237 (2003).
Article Google Scholar
Nejati-Javaremi, A., Smith, C. & Gibson, J. Effect of total allelic relationship on accuracy of evaluation and response to selection. J. Anim. Sci. 75, 1738–1745 (1997).
Article CAS PubMed Google Scholar

Download references

Acknowledgements

We would like to thank the New Zealand Radiata Pine Breeding Company and Ministry of Business, Innovation and Employment (MBIE) joint project RPBC1301 and MBIE Strategic Science Investment Fund contract nr. C04X1703 for financial support of this study.

Author information

Authors and Affiliations

Scion New Zealand Forest Research Institute Ltd., Rotorua, 3046, New Zealand
Jaroslav Klápště, Emily J Telfer, Heidi S Dungey & Natalie J Graham

Authors

Jaroslav Klápště
View author publications
You can also search for this author in PubMed Google Scholar
Emily J Telfer
View author publications
You can also search for this author in PubMed Google Scholar
Heidi S Dungey
View author publications
You can also search for this author in PubMed Google Scholar
Natalie J Graham
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

J.K. performed the statistical analysis and wrote the first draft of the manuscript, E.T. and N.G. developed genomic resources implemented in this study, H.D. supervised the research activity. All authors reviewed the manuscript.

Corresponding author

Correspondence to Jaroslav Klápště.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Klápště, J., Telfer, E.J., Dungey, H.S. et al. Chasing genetic correlation breakers to stimulate population resilience to climate change. Sci Rep 12, 8238 (2022). https://doi.org/10.1038/s41598-022-12320-3

Download citation

Received: 18 November 2021
Accepted: 09 May 2022
Published: 17 May 2022
DOI: https://doi.org/10.1038/s41598-022-12320-3

This article is cited by

Genetic consequence of widespread plantations of Cryptomeria japonica var. sinensis in Southern China: implications for afforestation strategies under climate change
- Mengying Cai
- Kentaro Uchiyama
- Yoshihiko Tsumura
Tree Genetics & Genomes (2023)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Efficient genomics-based ‘end-to-end’ selective tree breeding framework

Adaptive evolution in a conifer hybrid zone is driven by a mosaic of recently introgressed and background genetic variants

Massive haplotypes underlie ecotypic differentiation in sunflowers

Introduction

Materials and methods

Plant material

Genomic resources

Genomic data analysis

Statistical analysis

Ethics approval and consent to participate

Results

Linkage disequilibrium

Genetic parameters

Genomic prediction accuracy

Discussion

Evolutionary advantages of correlation breakers

Mitigation of climate change through genomics

Conclusions

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher's note

Supplementary Information

Supplementary Information.

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Genetic consequence of widespread plantations of Cryptomeria japonica var. sinensis in Southern China: implications for afforestation strategies under climate change

Comments

Search

Quick links