Introduction

Lentil (Lens culinaris ssp. culinaris Medik.) is an important legume widely grown in many countries. Being versatile in cooking and a good source of protein and various micronutrients, lentil is an essential element for human health and a major component of cereal- and rice-based diets1. In addition, its straw has nutritional value as animal feed2. Lentil improves soil fertility through its capacity to fix nitrogen and increases soil aeration through its shallow root system3.

During the past two decades, the area under lentil cultivation has expanded by 28%, leading to a 42% increase in production and an 18% increase in yield4. The yield improvement and the expansion of cultivation to new regions result from the development of appropriate varieties for various market segments and the application of good agronomic practices. Despite these achievements, various biotic and abiotic stresses, such as heat, drought, diseases, and poor weed management, still affect its productivity in farmers’ fields. Parasitic and annual broad-leaved weeds cause significant yield losses of up to 95%, especially when mismanaged5,6. Herbicide tolerance is the most effective technique to control weeds in lentils as other techniques are expensive and time consuming. Sources of tolerance to pre-emergence herbicides (metribuzin and imazethapyr) were identified in lentil7,8,9 and other crops such as faba bean10, chickpea11, and soybean12. To date, the major efforts to develop herbicide-tolerant lentil breeding lines have been made through field selection, with limited progress due to the low selection accuracy of visual assessment. Selection accuracy can be improved by using modern breeding methods such as marker-assisted selection (MAS) and genomic selection13.

Lentil is a diploid (2n = 14), self-pollinating crop with a large genome of 4 gigabases (Gb) (Arumuganathan and Earle14); its genome is larger than those of many previously sequenced crops such as soybean, chickpea, maize, and rice. However, lentil genome sequencing is possible today thanks to advances in sequencing technologies and bioinformatics tools15. In fact, several linkage maps have been constructed and used to identify many genes and quantitative trait loci (QTL) controlling a range of biotic and abiotic traits16,17,18,19. However, these markers have proved to be of limited value because their associations are restricted to biparental genetic backgrounds. Genome-wide association mapping (GWAS) is an alternative approach that utilizes genome-wide single nucleotide polymorphism (SNP) markers to identify marker-trait associations in diverse germplasm panels13. Studies have shown that single-trait GWAS is optimally implemented under controlled conditions involving one environment, which allows genetic and environmental effects to be differentiated. Moreover, unlike the meta-GWAS approach, single-trait GWAS does not account for correlated traits or pleiotropic effects20. However, the results of multiple single-trait GWAS can be combined using the meta-GWAS approach21 to increase the population size and consequently improve the power of the GWAS analysis22. Most meta-GWAS methods require only the SNP effects, calculated using single-trait GWAS for different variants, and their standard errors to calculate a global p-value that is equivalent to the one calculated when combining the actual phenotypic and genotypic data for all variants23.

The purpose of this study is to deploy meta-GWAS analysis to identify SNP markers associated with herbicide damage as well as different agronomic traits of lentil with and without herbicide treatments, using multi-location, multi-season phenotypic data.

Results

Phenotypic results

Herbicide damage score

HDS1 and HDS2 scores ranged from 1 to 5 for imazethapyr and metribuzin at different dosages, showing significant variation for herbicide tolerance among the lentil accessions. At Marchouch in 2014/15, 2 weeks after applying imazethapyr at 75 g a.i.ha−1, 1% of the total accessions scored 2 (slight damage with marginal leaf yellowness), 77% scored 3 (moderate damage with leaf necrosis), 18% scored 4 (severe damage with 25–75% mortality), and 5% scored 5 (total mortality). The HDS2 score taken 5 weeks after herbicide treatment showed recovery from the injuries: 3% of the tested accessions scored 2 (marginal leaf yellowness), 88% scored 3 (moderate damage), 9% scored 4 (severe damage), and no accession scored 5. When imazethapyr was applied at 150 g a.i.ha−1, HDS1 was 3 (21% of accessions), 4 (57%) and 5 (22%), confirming that the damage was more severe at the higher dosage. Five weeks after the treatment, HDS2 was 3, 4 and 5 in 2%, 51% and 47% of the accessions, respectively, showing that no recovery occurred. At Terbol in 2014/15, 2 weeks after applying imazethapyr at 112.5 g a.i.ha−1, HDS1 was 2 (10% of accessions), 3 (55%) and 4 (35%), whereas HDS2 was 2 (6%), 3 (40%), 4 (48%) and 5 (6%), showing that the toxicity symptoms were aggravated. The observations made at Terbol in 2019/20 at the same dosage of imazethapyr (112.5 g a.i.ha−1) showed milder toxicity symptoms than those at Terbol in 2014/15, and HDS2 ranged between 1 and 4, showing that the accessions recovered (Fig. 1).

Figure 1

Distribution of lentil genotypes for herbicide damage scores (HDS1 and HDS2) under different dosages of imazethapyr and metribuzin during different locations and cropping seasons.

For the metribuzin treatment at 210 g a.i.ha−1, HDS1 showed wide variation, with 17% of the total accessions scoring 2 with minimum damage (marginal leaf burning), 70% scoring 3 with moderate damage (leaf necrosis and lower vegetative growth), and 13% scoring 4 with high damage (severe leaf burning). The HDS2 score showed recovery from the herbicide damage with the formation of new leaves: 5% of total accessions scored 1 with no visible damage, 68% scored 2 with slight damage, 25% scored 3 with moderate damage and 2% scored 4 with a mortality rate between 25 and 75%. As with imazethapyr, when the dosage of metribuzin was doubled (420 g a.i.ha−1), HDS1 ranged between 3 and 5, showing aggravation of toxicity symptoms. Five weeks after the treatment, HDS2 ranged between 2 and 5, showing recovery from the toxicity symptoms. At Terbol in 2014/15, when metribuzin (315 g a.i.ha−1) was applied, HDS1 was 2 (10%), 3 (55%) and 4 (35%), while HDS2 was 2 (6%), 3 (40%), 4 (48%) and 5 (6%), showing that the toxicity symptoms were aggravated. The observations made at Terbol in 2019/20 at the same dosage of metribuzin (315 g a.i.ha−1) showed milder toxicity symptoms than those that occurred at Terbol in 2014/15, while HDS2 ranged between 2 and 5, showing that the toxicity symptoms were aggravated (Fig. 1).

Crop phenology, yield and yield components

The combined analysis of variance revealed highly significant differences (p < 0.001) among the accessions (G), indicating that the tested germplasm was significantly diverse. Significant differences also existed among treatments (T) and locations (L) for all the traits except the number of pods per plant (NP). The genotype × treatment (G × T) interaction across trials and the genotype × location (G × L) interaction across treatments were also significant. The genotype × treatment × location (G × T × L) interaction showed that the genotypes’ response to herbicide treatments was not affected by the environment, except for DF and DM and their reduction indexes (Tables 1 and 2). At Terbol in 2019/20, plant height (PH) was significantly lower in plots treated with imazethapyr at 112.5 g a.i.ha−1. A similar effect was observed at Marchouch in 2014/15 with imazethapyr at 75 g a.i.ha−1, where PH was reduced by 28.8% (Tables 3 and 4). The reduction in plant height was severe when imazethapyr was applied at the higher dosage (150 g a.i.ha−1). When metribuzin was applied at 210 g a.i.ha−1 at Marchouch in 2014/15 and at 315 g a.i.ha−1 at Terbol in 2019/20, plant height was not significantly reduced in comparison with the untreated plots. On the other hand, when metribuzin was applied at 420 g a.i.ha−1, PH was significantly lower (by 33%) than in the untreated plots (Tables 3 and 4).

Table 1 Combined analysis performed for detecting differences among genotypes (G), Environment (E), Treatments (T) and G × T, G × E and G × T × E interactions for phenological and agronomic traits performed for the trials at Marchouch during 2014/15 and at Terbol during 2014/15 and 2019/20.
Table 2 Combined analysis performed for detecting differences among genotype (G), Environment (E), Treatment (T) and G × T, G × E and G × T × E interactions for reduction indexes (RI) of phenological and agronomic traits performed for the trials at Marchouch during 2014/15 and at Terbol during 2014/15 and 2019/20.
Table 3 Mean ± standard error (SE) for different traits under different environments and treatments.
Table 4 Mean ± Standard error (SE) for reduction indexes (RI) of different traits under different environments and treatments.

At Terbol in 2019/20, the biological yield per plant (BY) in plots treated with imazethapyr at 112.5 g a.i.ha−1 or metribuzin at 315 g a.i.ha−1 was not significantly lower than in the untreated plots. The same observation was made at Marchouch in 2014/15 with different dosages of imazethapyr and metribuzin. However, when imazethapyr was applied at 150 g a.i.ha−1 or metribuzin at 420 g a.i.ha−1, the reduction in BY (RIBY) increased to 78.2% and 58%, respectively. Similar results were obtained for seed yield per plant (SY) with imazethapyr or metribuzin at both Terbol and Marchouch. When the dosage was increased to 150 g a.i.ha−1 for imazethapyr or 420 g a.i.ha−1 for metribuzin, the reduction in SY (RISY) increased to 101.7% and 63.3%, respectively.

Genotyping and population structure

After applying the quality control criteria, the final dataset consisted of 7642 SNPs distributed along the lentil genome. The sequence variations of the SNP markers were as follows: A/C (1433 SNPs), A/G (3675 SNPs), C/T (3764 SNPs), and G/T (1399 SNPs). Phylogenetic analysis was used to investigate the genetic relationships among the lentil accessions. The resulting phylogenetic tree revealed three main clusters that evenly accommodated the accessions under investigation, but no clustering of genotypes based on their country of origin was observed (Fig. 2).

Figure 2

Phylogenetic tree of the studied lentil accessions using SNP genotyping data. The samples are color-coded based on their country of origin.

GWAS and annotation analyses

Among the 7642 SNP markers that were assessed, 125 (clustered in 85 unique QTL) were found to be associated with herbicide tolerance and other traits, of which 36 (clustered in 30 unique QTL) were highly significant (Table 5), while the remaining SNPs were considered suggestive associations (see Supplementary Table S3 online). Notably, traits such as RIPH, RIBY, and RISY were excluded because no SNP was found to be associated with them. Based on the Bonferroni-corrected threshold (0.05/n) of p < 4.6 × 10−6, SNPs with − log10(p value) ≥ 5.2 were considered to have significant associations; 36 SNP markers were highly significantly associated with diverse traits. Table 5 describes the positions and significance of these SNP markers for the recorded traits as follows: one SNP (AVR-Lc-01885.02–000,213,238) was associated with HDS (− log10(p) = 6.3), eight SNPs (the most significant being AVR-Lc-03987.03–231,578,053, AVR-Lc-03983.03–230,295,656 and AVR-Lc-05801.04–318,079,189) were associated with DF (− log10(p) = 5.6 to 9.3), four SNPs (AVR-Lc-03458.02–007,762,915, AVR-Lc-08010.06–011,899,372, AVR-Lc-03379.02–609,257,610 and AVR-Lc-05740.04–302,184,757) were associated with RIDF (− log10(p) = 5.4 to 8), three SNPs (AVR-Lc-02189.02–307,011,079, AVR-Lc-02200.02–309,350,505 and AVR-Lc-10007.07–447,269,681) with RIDM (− log10(p) = 5.4 to 9.1), three SNPs (AVR-Lc-02714.02–436,766,259, AVR-Lc-02715.02–436,994,699, and AVR-Lc-06969.05–022,095,933) with BY (− log10(p) = 5.2 to 5.9), two SNPs (AVR-Lc-00835.01–430,931,278 and AVR-Lc-03296.02–599,856,144) were associated with NP (− log10(p) = 5.7 and 6.0), and fourteen SNPs (the most significant being AVR-Lc-01352.01–535,793,448, AVR-Lc-06527.04–050,552,783, and AVR-Lc-05203.04–122,734,802) were associated with RINP (− log10(p) = 5.2 to 8.1). The significance of the associations and the chromosomal locations of these SNPs are also presented in the Manhattan and QQ plots (Fig. 3). The Manhattan plot showed that the SNP markers were dispersed randomly across chromosomes 1 to 7.
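
The Bonferroni arithmetic behind a cut-off of this kind can be sketched in a few lines. The snippet below is only an illustration: it assumes that n in 0.05/n equals the 7642 SNPs retained after quality control, which the text does not state explicitly, and the function name is hypothetical. Under that assumption the −log10 value rounds to the 5.2 cut-off quoted above.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumption: n is the number of SNPs retained after quality control.
n_snps = 7642
alpha = 0.05

threshold = bonferroni_threshold(alpha, n_snps)
neg_log10 = -math.log10(threshold)

print(f"threshold p-value : {threshold:.3e}")
print(f"-log10(threshold) : {neg_log10:.2f}")
```

A different n (for example, the number of independent tests rather than the number of markers) would shift the threshold accordingly.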

Table 5 Highly significant SNP-Trait associations revealed by the Meta-GWAS analysis.
Figure 3

Manhattan plot and QQ plot of the highly significant associations between the SNP markers and the recorded traits. HDS2: second herbicide damage score, DF: days to flowering, RIDF: DF reduction index, RIDM: Days to maturity reduction index, BY: Biological yield per plant, NP: number of pods per plant, RINP: NP reduction index, NS: number of seeds per plant.

Physical map and gene annotation

The physical map presents the SNPs located within genes, which are composed of exons (coding regions) interrupted by introns (non-coding regions) (Fig. 4). Of the eighteen SNPs (A to R) located within genes, only nine (A, D, F, H, J, K, O, Q and R) were located in exons, whereas the rest were located in introns, on chromosomes 2, 4, 5, 6, and 7.

Figure 4

Physical map showing the eighteen SNP markers (A to R) located within genes. Green zones are the exons (coding regions), interrupted by black zones representing the introns (non-coding regions); the yellow vertical line represents the location of the SNP marker of interest.

Based on the gene annotation table, of the eighteen SNPs located within genes, four (AVR-Lc-01885.02–000,213,238, AVR-Lc-03458.02–007,762,915, AVR-Lc-03373.02–608,709,301, and AVR-Lc-10007.07–447,269,681) were found to be highly associated with herbicide tolerance (Table 6). Gene annotation showed that SNP AVR-Lc-01885.02–000,213,238, highly associated with HDS2 (− log10(p) = 6.3), is located on chromosome 2 within a gene annotated as Peptide and nitrate transporter type I and II extracellular region ABC transporter related; SNP AVR-Lc-03458.02–007,762,915, highly associated with RIDF (− log10(p) = 10.1), is located on chromosome 2 within a gene annotated as Allantoinase and Dihydroorotase; SNP AVR-Lc-03373.02–608,709,301, highly associated with RINP (− log10(p) = 6.2), is located on chromosome 2 within a gene annotated as Biotin carboxyl carrier acetyl-CoA carboxylase; and SNP AVR-Lc-10007.07–447,269,681, highly associated with RIDM (− log10(p) = 5.4), is located on chromosome 7 within a gene annotated as Myelodysplasia-myeloid leukemia factor 1-interacting protein. However, only SNPs AVR-Lc-01885.02–000,213,238, AVR-Lc-03373.02–608,709,301, and AVR-Lc-10007.07–447,269,681 were located in exons (coding regions) (Fig. 4).

Table 6 Gene annotation table showing the herbicide tolerance SNP marker and the associated gene and their location; in red are the SNPs detected highly associated with herbicide tolerance.

Discussion

In Mediterranean environments with cool winters, lentil shows slow growth and crop development, which allows weeds to compete for water, nutrients, sunlight, and space and to host diseases and pests that cause severe yield losses in this crop5,8. Imazethapyr and metribuzin have been reported to control weeds effectively when applied to herbicide-tolerant lentil accessions. Sources of tolerance to both herbicides were detected in lentil by Balech et al.7,24 and Sharma et al.8,9, which allowed the accessions to escape the phytotoxicity symptoms caused by the herbicides.

In this study, phytotoxicity symptoms were observed when applying imazethapyr or metribuzin herbicides. The herbicide damage score evaluated the degree of phenotypic phytotoxicity, and considerable variability was observed in the phenotypic response. The recovery from or aggravation of the phytotoxicity symptoms depends on the ability of accessions to metabolize and detoxify the herbicides25. Additionally, the phenology of the tested accessions was affected: the herbicides delayed the flowering and maturity dates of some lentil accessions and reduced yield and its components. Similar results were obtained in lentil7,8,9,24, faba bean10, and chickpea26,27. Consequently, the phytotoxicity symptoms were ascribed to the inhibition of photosynthesis and plant growth caused by these herbicides, as reported by Sharma et al.8.

The phylogenetic tree analysis did not reveal any specific grouping of genotypes based on their country of origin, which suggests that seed exchange has occurred between countries. Thus, it appears that lentil possesses broad genetic diversity that is not particular to a specific geographic location because of long-term seed migration and trading across borders.

Limited progress has been made in identifying herbicide-tolerant lentil cultivars through conventional breeding methods, especially since these approaches have proved relatively slow in achieving considerable advances. Hence, it is necessary to develop genetic markers linked to traits associated with herbicide tolerance in lentil in order to enhance selection accuracy and facilitate early-stage selection. These markers serve as effective tools for selecting adapted and tolerant accessions. Many studies have shown that GWAS is the most successful tool for identifying significant SNPs and candidate genes related to various traits. However, only a limited number of GWAS have been reported in lentil, covering traits such as aphanomyces root rot resistance28, prebiotic carbohydrates29, anthracnose resistance30, ascochyta blight resistance31 and seed protein and amino acid content32. Compared with other crops such as maize and sorghum, the development of genetic resources for lentil has been relatively slow. Nevertheless, new horizons in next-generation sequencing (NGS) technologies will open now that the lentil genome has recently been published15,33.

The meta-GWAS method applied in the present study was initially employed in human genetics, where it is impossible to gather multi-environment data for the same population34,35. Its effectiveness over standard GWAS analysis was demonstrated, which encouraged its use in crops35,36. In fact, standard GWAS is more powerful when experiments are conducted under controlled conditions37,38,39. Moreover, conducting experiments in a single environment for a diverse set of accessions that are intended to be grown all over the world can give a misleading picture of the environmental effects on the genetics of the tested set. Many quantitative traits are raw measurements collected from different environments; if standard GWAS analysis is applied, bias may be introduced, which negatively affects the detection of significant QTL40. Therefore, meta-analysis is an adequate alternative to bypass these challenges of standard GWAS. In our case, meta-GWAS was the best option since we have an unbalanced dataset collected on 292 accessions, with different treatments at two locations and in two cropping seasons, with a total sample size of 11,956. This approach was also applied by Shook et al.41 on a sample of 17,556 accessions of soybean from 73 published studies, by Joukhadar et al.35 on a sample of 2571 accessions of wheat, by Battenfield et al.42 on wheat with a total sample size of 4095 and by Fikere et al.43 on a sample of 585 canola accessions. To the best of our knowledge, this is the first meta-GWAS study in lentil targeting QTL associated with herbicide tolerance. Hence, most of the QTL identified in this study appear to be new and have not been reported previously.

Based on the physical map results, four SNPs located within genes were found to be highly associated with the recorded traits related to herbicide tolerance. The associations between the detected SNP markers and the phenotypic traits, and the possible mechanisms of herbicide tolerance, are discussed below.

The Peptide and nitrate transporter type I and II extracellular region ABC transporter related protein belongs to the ATP-binding cassette (ABC) transporter family and was found to be associated with the herbicide damage score (HDS). This protein transports amino acids, peptides, and nitrate across the plant cell membrane using the energy of ATP hydrolysis44,45,46. Several studies have shown that plants have the highest diversity of ABC transporter genes, such as Arabidopsis and rice with 120 and 121 coding sequences, respectively47,48. Some ABC transporters are responsible for defence mechanisms against biotic and abiotic stresses and others are involved in basic functions indispensable for plant growth49. Furthermore, Van Eerd et al.50 reported that these transporters are typically associated with herbicide metabolism and plant detoxification. Moreover, genes encoding ABC transporter proteins were also detected in wheat (Triticum aestivum L.)51,52, Arabidopsis thaliana53, and soybean54, where they contributed to the detoxification of imazethapyr and metribuzin. In this study, the HDS revealed the recovery of some accessions from phytotoxicity symptoms after imazethapyr or metribuzin treatment, which might be due to the detoxification role of ABC transporters.

Allantoinase and Dihydroorotase proteins belong to the same superfamily of amidohydrolases55; they were found to be highly associated with RIDF. They participate in various stages of plant development through the de novo pathway, which uses simple molecules such as CO2, amino acids and tetrahydrofolate to build purine and pyrimidine nucleotides56,57. Imazethapyr and metribuzin have an indirect effect on Allantoinase and Dihydroorotase proteins: imazethapyr disrupts amino acid synthesis and metribuzin (triazine herbicide) inhibits tetrahydrofolate synthesis58. In addition, Duran59 and Kafer60 reported that the flowering stage requires high concentrations of purine and pyrimidine. In rice61 and in Arabidopsis thaliana60, the genes involved in purine and pyrimidine metabolism were responsible for tolerance to stresses that plants might encounter during the flowering stage. In this study, when either herbicide was applied, flowering was delayed in some accessions but not in others. This observation might be explained by the differing concentrations of purine and pyrimidine available in the plants, especially during the flowering stage, which depend on the lentil variety and its level of tolerance to the applied herbicide.

Biotin carboxyl carrier (BCC) and acetyl-CoA carboxylase (ACC) proteins were found to be highly associated with RINP. BCC is used by the enzyme biotin carboxylase to form carboxybiotin, which is transferred to the ACC enzyme (ACCase). ACCase catalyzes the carboxylation of acetyl-CoA to form malonyl-CoA, which is essential for the synthesis of fatty acids and other secondary compounds such as flavonoids62. This enzyme plays an essential role in embryo morphogenesis and in apical meristem development63. This explains the association detected with the reduction index of the number of pods (RINP) in this study, as has been reported in Arabidopsis thaliana62 and Populus simonii64. Moreover, ACCase plays a role in biotic and abiotic stress tolerance in plants. Studies in lentil65, Brassica napus66,67, Arabidopsis thaliana68, and tobacco69 showed that plants can improve their resilience to stress by stimulating the accumulation of ACCase, consequently improving seed yield. This explains the different levels of tolerance to the applied herbicides expressed in the RINP.

Myelodysplasia-myeloid leukemia factor 1-interacting protein was found to be highly associated with RIDM in this study. This protein, encoded by the MLF1IP gene, is a transcription factor that was first detected in mammals and Drosophila70. MLF1IP interacts as a transcriptional repressor with MLF1 and nucleophosmin-MLF1 (NPM-MLF1) to prevent apoptosis (programmed cell death), thus facilitating cell growth and proliferation in different cell types71. To our knowledge, very few studies have reported the presence of MLF1IP in plants, and this is the first study to report its presence in lentil. This gene was also found in tea (Camellia sinensis), but limited information is available online (A database of gene co-expression network for tea plant (Camellia sinensis)). Thus, the function of MLF1IP in plants remains to be elucidated, but since it is a transcription factor, its role may be to regulate cell death triggered by abiotic and biotic stresses72,73.

Moreover, several studies have reported that herbicides cause oxidative stress in plants, similar to other abiotic stresses74,75. This supports the hypothesis that herbicide tolerance in lentil could result from several mechanisms enabling plants to tolerate the stress caused by herbicide treatment in much the same way as they respond to other abiotic stresses. Thus, the tolerance observed in this study is attributed to mechanisms that contribute significantly to the detoxification of herbicides in lentil.

Conclusion

Weed management in lentil has become crucial for attaining high yields and good quality to meet the growing global demand. Therefore, the natural genetic variability that lentil accessions have shown in previous studies encouraged us to screen a large germplasm collection in search of more powerful and diverse sources of post-emergence herbicide tolerance. This will promote the use of herbicide-tolerant varieties in conservation agriculture systems at a lower cost to farmers. However, traditional field screening for herbicide tolerance is time consuming, costly, and laborious. Therefore, genomic selection and marker-assisted selection for herbicide tolerance will greatly improve the precision and efficiency of breeding and will help plant breeders accelerate the breeding process. In this study, we identified four SNP markers that were highly associated with traits related to imazethapyr and metribuzin tolerance using the meta-GWAS method. These SNPs could be studied further and used to facilitate selection in breeding programs.

Materials and methods

Materials and experiments

A set of 292 lentil accessions, including 175 landraces collected from 49 countries and 117 breeding lines developed at ICARDA, were evaluated for their response to imazethapyr and metribuzin treatments, applied separately at different doses (Supplementary Table S1).

Four field experiments were conducted at Marchouch, Morocco (33.56°N, 6.69°W) during 2013/14 and 2014/15 and at Terbol, Lebanon (33.81°N, 35.98°E) during 2014/15 (ref. 7) and 2019/20 (Supplementary Fig. S2), in an alpha lattice design with two replicates and a plot size of one row of 1 m length, with rows spaced 0.3 m apart. Different dosages of imazethapyr and metribuzin and a control treatment (no herbicide) were applied separately at both locations at the pre-flowering stage (5th–6th node stage). The details of each experiment and the applied treatments are presented in Table 7.

Table 7 Environmental conditions of different location-season-treatment combinations of lentil screening.

Phenotypic data for herbicide tolerance

Based on the Lentil ontology76 the following phenotypic data were recorded:

Herbicide damage score (HDS) was recorded on a scale of 1–5, as described in Balech et al.7, at 2 weeks (HDS1) and at 5 weeks (HDS2) after herbicide application at Terbol in 2014/15; at Marchouch in 2013/14, only HDS2 was recorded. This scale was proposed by Gaur et al.11 to assess the ability of accessions to recover from herbicide treatments.

Crop phenology traits, namely the number of days to 50% flowering (DF) and days to 95% maturity (DM) from the sowing date, were recorded on a plot basis at Terbol in 2014/15 and 2019/20.

Agronomic and yield traits, namely plant height (PH) (cm), biological yield/plant (BY) (g) and seed yield/plant (SY) (g), were recorded on three randomly selected plants per plot and averaged in the trials at Marchouch 2014/15, Terbol 2014/15 and Terbol 2019/20. In addition, the number of pods/plant (NP) was recorded and averaged in the same way as PH, BY and SY at Terbol 2019/20.

The reduction indices: The reduction index (\({RI}_{trait}\)) was estimated to measure the performance of selected tolerant accessions, as follows9:

$${RI}_{trait}=100-\frac{(100\times \overline{{\text{T}}})}{\overline{{\text{C}}}}$$

where \(\overline{{\text{T}}}\) is the trait value of the evaluated accession under herbicide treatment and \(\overline{{\text{C}}}\) is the value of the same accession under control conditions without any herbicide treatment. This reduction index was calculated for DF, DM, PH, NP, BY and SY at Terbol in 2019/20. At Marchouch in 2014/15, only the reduction indices for PH, BY and SY were calculated.
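
As a minimal sketch of the reduction index defined above, the following Python function applies the formula to a single accession. The function name and the plant height values are hypothetical and chosen only for illustration; they are not data from this study.

```python
def reduction_index(treated_mean: float, control_mean: float) -> float:
    """Reduction index (%) of a trait: RI = 100 - (100 * T) / C.

    treated_mean: trait value of the accession under herbicide treatment (T).
    control_mean: trait value of the same accession without herbicide (C).
    """
    if control_mean == 0:
        raise ValueError("control_mean must be non-zero")
    return 100.0 - (100.0 * treated_mean) / control_mean


# Hypothetical example: plant height of 30.0 cm in the control plot
# and 21.36 cm in the treated plot.
ri_ph = reduction_index(treated_mean=21.36, control_mean=30.0)
print(f"RI for plant height: {ri_ph:.1f}%")
```

A value of 0 means the treatment had no effect on the trait, and a negative value means the treated plot exceeded the control.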

DNA extraction and genotyping by sequencing analysis

DNA was extracted from young leaves of seedlings aged between 4 and 6 weeks, prior to the application of salt treatment, using the CTAB method, as outlined by Rogers and Bendich77. A total of 50 μl of 100 ng/μl DNA from each sample was sent to Agriculture Victoria, Melbourne, where Multispecies Pulse SNP chip was used for genotyping. To ensure the quality of the markers, we filtered them by call rates greater than 80%, minor allele frequency (MAF) of ≥ 5%, and heterozygosity of ≤ 15%. Only those markers that met these criteria were selected for genome-wide association analysis.
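
The three marker filters described above can be sketched as follows. This is an illustrative sketch only, not the pipeline used in the study: it assumes genotype calls coded as 0, 1 or 2 copies of one allele with `None` for a missing call, and the function names and toy markers are hypothetical.

```python
from typing import List, Optional, Tuple

Calls = List[Optional[int]]  # 0/1/2 allele copies, None = missing call


def snp_qc_stats(calls: Calls) -> Tuple[float, float, float]:
    """Return (call rate, minor allele frequency, heterozygosity) for one SNP."""
    called = [g for g in calls if g is not None]
    call_rate = len(called) / len(calls) if calls else 0.0
    if not called:
        return call_rate, 0.0, 0.0
    allele_freq = sum(called) / (2 * len(called))
    maf = min(allele_freq, 1.0 - allele_freq)
    heterozygosity = sum(1 for g in called if g == 1) / len(called)
    return call_rate, maf, heterozygosity


def passes_qc(calls: Calls, min_call_rate: float = 0.80,
              min_maf: float = 0.05, max_het: float = 0.15) -> bool:
    """Apply the thresholds from the text: call rate > 80%, MAF >= 5%, het <= 15%."""
    call_rate, maf, het = snp_qc_stats(calls)
    return call_rate > min_call_rate and maf >= min_maf and het <= max_het


# Hypothetical toy markers scored on ten accessions each.
markers = {
    "good": [0, 0, 0, 0, 0, 0, 2, 2, 2, None],
    "low_call_rate": [0, 2, None, None, None, 0, 2, 0, 2, 0],
    "monomorphic": [0] * 10,
    "too_heterozygous": [1, 1, 1, 0, 2, 0, 2, 0, 2, 0],
}
kept = [name for name, calls in markers.items() if passes_qc(calls)]
print(kept)
```

Heterozygosity is filtered because lentil is self-pollinating, so a marker with many heterozygous calls is more likely to reflect a genotyping problem than true variation.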

Phenotypic data analysis

A spatial row-column model was used to detect differences among genotypes (G), herbicide treatments (T), locations (L) and their interactions (G × T), (G × L) and (G × T × L) for phenological and agronomic traits using Genstat V. 19 (ref. 78). The significance of variation among accessions and herbicide treatments was tested using p values. The best linear unbiased predictions (BLUPs) of genotypes, treatments and genotype × treatment interactions were also estimated with Genstat V. 19.

Genetic diversity study

The phylogenetic analysis was carried out in R using hierarchical clustering with the “complete” agglomeration method. The similarity matrix obtained from the SNP genotyping data was used to construct the phylogenetic tree. To visualize the tree, we used the online tool iTOL (Interactive Tree Of Life), which allowed us to color-code the samples based on their country of origin and provided a user-friendly interface for exploring and analyzing the data (Fig. 2).
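
To make the "complete" agglomeration rule concrete, the sketch below implements complete-linkage clustering in plain Python. It is not the R workflow used in the study: the mismatch-proportion distance, the function names and the five toy accessions are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, List


def mismatch_distance(a: List[int], b: List[int]) -> float:
    """Proportion of markers at which two genotype vectors differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def complete_linkage(genotypes: Dict[str, List[int]], n_clusters: int) -> List[List[str]]:
    """Merge clusters bottom-up until n_clusters remain, using complete linkage."""
    names = list(genotypes)
    dist = {frozenset(pair): mismatch_distance(genotypes[pair[0]], genotypes[pair[1]])
            for pair in combinations(names, 2)}
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster distance is the LARGEST pairwise distance.
            d = max(dist[frozenset((p, q))] for p in clusters[i] for q in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]                          # j > i, so index i stays valid
    return [sorted(c) for c in clusters]


# Hypothetical genotypes (0/2 = homozygous for either allele) at six SNPs.
toy = {
    "acc1": [0, 0, 2, 2, 0, 0],
    "acc2": [0, 0, 2, 2, 0, 2],
    "acc3": [2, 2, 0, 0, 2, 2],
    "acc4": [2, 2, 0, 0, 2, 0],
    "acc5": [0, 2, 0, 2, 0, 2],
}
groups = complete_linkage(toy, n_clusters=3)
print(groups)
```

Because complete linkage scores two clusters by their most distant members, it tends to produce compact groups of similar diameter.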

Single-trait GWAS

The univariate linear mixed model implemented in the software GEMMA79,80 was used to analyze the association between each phenotype measured in each environment and the SNP data. The model used the following equation:

$${\text{y}}=\upmu +\mathrm{X\beta }+\mathrm{I\alpha }+{\text{e}}$$

where y is a vector of the phenotypes, \(\upmu\) is the intercept, X is the incidence matrix assigning individuals to genotypes, \(\upbeta\) is the SNP substitution effect, I is the identity matrix, \(\mathrm{\alpha }\) is a vector of random effects, and e is a vector for the residuals.

Meta-GWAS

Meta-GWAS analysis was performed following the method described in Bolormaa et al.22. Briefly, the following equation was used to calculate a chi-squared statistic (\(\chi^{2}\)) with n degrees of freedom, where n is the number of environments per trait:

$$\chi_{i}^{2} = t_{i}^{\prime} V^{-1} t_{i}$$

where \({{\text{t}}}_{{\text{i}}}\) is the vector of signed t-values for SNP i across all environments, the prime denotes its transpose, and \({{\text{V}}}^{-1}\) is the inverse of the correlation matrix of the t-values among all environments. The following equation was used to calculate \({{\text{t}}}_{{\text{i}}}\):

$${{\text{t}}}_{{\text{i}}}=\frac{{{\text{b}}}_{{\text{i}}}}{{\text{se}}({{\text{b}}}_{{\text{i}}})}$$

where \({{\text{b}}}_{{\text{i}}}\) is the SNP effect calculated in the single-trait GWAS analysis for each environment and \({\text{se}}({{\text{b}}}_{{\text{i}}})\) is its standard error. Bonferroni correction was used to declare significance. In addition, all associations with p < 0.0001 were reported in the supplementary materials as suggestive associations.
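
The two equations above can be sketched for a single SNP in plain Python. This is an illustration under stated assumptions rather than the implementation used in the study: the SNP effects, standard errors and the correlation matrix V are hypothetical numbers supplied by hand (in practice V would be estimated from the t-values of all SNPs), and the helper names are invented for this example.

```python
import math
from typing import List


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable with integer df."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:  # recurrence Q(a + 1, y) = Q(a, y) + y^a e^-y / Gamma(a + 1)
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def meta_chi2(effects: List[float], std_errors: List[float],
              v: List[List[float]]) -> float:
    """chi2_i = t_i' V^-1 t_i, with t_i = b_i / se(b_i) per environment."""
    t = [b / se for b, se in zip(effects, std_errors)]
    v_inv_t = solve_linear(v, t)  # V^-1 t without forming the inverse explicitly
    return sum(ti * zi for ti, zi in zip(t, v_inv_t))


# Hypothetical SNP tested in two environments.
b = [0.6, 0.4]
se = [0.3, 0.2]
V = [[1.0, 0.5], [0.5, 1.0]]

chi2 = meta_chi2(b, se, V)
p_value = chi2_sf(chi2, df=len(b))
print(f"chi2 = {chi2:.4f}, p = {p_value:.4f}")
```

Solving the linear system V z = t instead of inverting V is a numerical convenience and gives the same statistic.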

Ethical approval

The authors confirm that the study complies with local and national regulations. The seeds were obtained from the genebank of the International Center for Agricultural Research in the Dry Areas (ICARDA) for research purposes in accordance with the International Treaty on Plant Genetic Resources for Food and Agriculture (ITPGRFA). All relevant permits and permissions for the collection of seeds were obtained. The transfer of seeds from the ICARDA genebank at Terbol to Morocco followed the phytosanitary regulations of both countries and used the Standard Material Transfer Agreement (SMTA) governed by the ITPGRFA. The experiments were conducted at the ICARDA sites at Terbol and Marchouch in accordance with national and international regulations.