Dear Editor,

Identifying pathogenic gene mutations and their combinations is critical but challenging when dissecting the etiology of complex diseases in which more than one gene is involved.1 The digenic, oligogenic, and omnigenic models, which hold that multiple genes can act synergistically, invoke a wide range of genes to explain complex phenotypes.1,2,3,4 These genetic models have advanced our understanding of the genetic factors underlying complex phenotypes, yet a correspondingly rapid and efficient experimental assay for identifying pathogenic combinations of genetic variants in animal models is lacking and urgently needed.

Recapitulating multiple human genetic variants in mice allows them to be examined in a fixed genetic background, which is especially powerful for establishing oligogenicity.5 In a recently published study, a combination of triple-compound heterozygous variants was identified in a family affected by a heart disease and was further verified in a mouse model by zygotic injection of the CRISPR-Cas9 system followed by multiple generations of intercrossing.1 These findings revealed the joint contributions of gene variants to the etiology of complex diseases. However, sequential intercrossing over several generations to obtain mice harboring multiple genetic variants is time-consuming. Additionally, founders generated by zygotic injection of the CRISPR-Cas9 system are usually genetically mosaic,6 which makes phenotyping of founders extremely difficult. In our previous work, we derived androgenetic haploid embryonic stem cells (AG-haESCs) from sperm-originated haploid embryos; these cells can maintain haploidy via periodic cell sorting.7 Importantly, AG-haESCs can serve as a “sperm replacement” to deliver multiple genetic modifications into descendants via intracytoplasmic AG-haESC injection (ICAHCI), enabling rapid phenotyping of uniform, non-mosaic founders in one generation.8 Consequently, this AG-haESC-mediated semi-cloning technology may serve as a rapid and efficient experimental assay for identifying pathogenic combinations of genetic variants in complex diseases.

Müllerian anomalies (MA) include a wide variety of anatomic malformations of the uterus, cervix, fallopian tubes, or vagina, stemming from various aberrations during the development of the Müllerian ducts.9,10,11 MA not only impairs women's reproductive capacity but also causes psychological distress. Genetic risk factors are thought to have a strong influence on the etiology of MA; however, the wide phenotypic and genotypic heterogeneity across individuals with MA makes it extremely difficult to determine the underlying genetic risk factors. To date, no major gene has been found to account for human MA under a monogenic inheritance model,11 which implies that MA is genetically complex and highlights the urgent need to interrogate the potential involvement of pathogenic variant combinations in MA and other analogous complex diseases.

Several previous studies have implicated genomic copy number variants (CNVs, especially genomic deletions) in the complex etiology of MA.11 To explore the etiology of MA, we therefore first employed comparative genomic hybridization (CGH) microarrays to analyze genomic CNVs in 25 women with MA. We found that 7/25 (28%) of the subjects carried rare and large (>100 kb) CNVs (Supplementary information, Tables S1, S2 and Data S1). Changes in gene dosage caused by CNVs are critical for the pathogenesis of human developmental disorders.12 We next sought to identify the CNV-affected genes that may play crucial roles in MA. We focused on genes involved in MA or related to the female reproductive tract and examined whether they were deleted in these 7 subjects. Three genes, GEN1, TBX6, and LHX1, met these criteria.11,13 Thus, the newly identified GEN1/2p24.2 deletion (carried by subject M45; 1/25), along with the two previously reported deletions, TBX6/16p11.2 (carried by subject B03; 1/25)11 and LHX1/17q12 (carried by subject B07; 1/25),11 are potential risk CNVs for MA (Fig. 1a; Supplementary information, Tables S2, S3). The frequency of rare MA-associated CNVs is therefore at least 12% (3/25) in our discovery cohort. Subsequently, we explored whether these rare CNVs (GEN1/2p24.2, TBX6/16p11.2 and LHX1/17q12) are recurrent in a larger MA population. To this end, we enrolled another 100 women with MA and conducted targeted qPCR analysis as a preliminary screen for CNVs in the above regions. We identified another 6 cases carrying potential deletions in TBX6/16p11.2 (4/100) or LHX1/17q12 (2/100). To verify these potential CNVs identified by qPCR, we further conducted CGH microarray analysis in these 6 cases and found that all of them are indeed carriers of the corresponding TBX6/16p11.2 or LHX1/17q12 deletion CNVs (Fig. 1a; Supplementary information, Table S3). In total, we identified 9 cases carrying potentially pathogenic CNVs in the 125-patient cohort (9/125) and demonstrated that two of these CNVs are recurrent in Chinese MA cases (Fig. 1a; Supplementary information, Tables S1, S3).
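The carrier counts reported above can be pooled across the two cohorts as a simple consistency check. The following is a minimal Python sketch; the cohort labels and function names are illustrative assumptions, and it assumes that each carrier harbors exactly one of the three deletions, as the per-subject counts in the text imply.

```python
# Carrier counts as reported in the text. The cohort labels ("discovery",
# "replication") and all function names are illustrative, not from the study.
COHORTS = {
    "discovery": {
        "size": 25,
        "carriers": {"GEN1/2p24.2": 1, "TBX6/16p11.2": 1, "LHX1/17q12": 1},
    },
    "replication": {
        "size": 100,
        # No GEN1/2p24.2 deletion is reported for this cohort in the text.
        "carriers": {"TBX6/16p11.2": 4, "LHX1/17q12": 2},
    },
}


def pooled_counts(cohorts):
    """Sum carrier counts per CNV region across all cohorts."""
    totals = {}
    for cohort in cohorts.values():
        for region, n in cohort["carriers"].items():
            totals[region] = totals.get(region, 0) + n
    return totals


def pooled_frequency(cohorts):
    """Return (carriers, cohort size, percentage) for all regions combined.

    Assumes no subject carries more than one of the listed deletions.
    """
    carriers = sum(pooled_counts(cohorts).values())
    size = sum(cohort["size"] for cohort in cohorts.values())
    return carriers, size, 100 * carriers / size


if __name__ == "__main__":
    for region, n in pooled_counts(COHORTS).items():
        print(f"{region}: {n} carriers")
    carriers, size, pct = pooled_frequency(COHORTS)
    print(f"All regions: {carriers}/{size} ({pct:.1f}%)")
```

Pooling in this way gives five TBX6/16p11.2 and three LHX1/17q12 carriers, matching the counts given in the legend of Fig. 1a, and an overall frequency of 9/125 (7.2%).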

Fig. 1

Identification of a digenic variant combination in MA through genetic analysis and semi-cloning technology. a Deleterious deletion CNVs identified in subjects with MA by CGH microarrays. A novel 1.62 Mb deletion of the human chromosome region 2p24.2 involving GEN1 was identified in subject M45. Recurrent 16p11.2 and 17q12 deletions were identified in five and three subjects with MA, respectively. The genomic deletion regions are shaded in green. b The working hypothesis of digenic/oligogenic inheritance modes for MA. The deletion CNVs of GEN1/2p24.2, TBX6/16p11.2 or LHX1/17q12 show low penetrance in MA, suggesting that a monogenic or Mendelian inheritance mode cannot completely account for the etiology of MA. In addition to a single novel or previously identified CNV, one or more other genetic variants may simultaneously contribute to MA; digenic/oligogenic combinations of variants may thus reflect the genetic complexity of MA risk. c An additional frameshift mutation (c.6dupC; p.R2fs*55) of WNT9B was identified in subject M45, who carried a GEN1/2p24.2 deletion. These double mutation hits suggest a potential digenic model for MA in subject M45. d Experimental procedures for producing mouse models of double heterozygous mutants through semi-cloning technology. Wild type AG-haESCs were also injected into oocytes to obtain wild type mice, serving as controls for the mutant founders. e Representative mouse uteri dissected from wild type (WT) and Gen1+/−Wnt9b+/− female mice generated via the semi-cloning technology. Scale bar: 2.5 mm. Mouse uterine lengths (f) or diameters (g) of WT and Gen1+/−Wnt9b+/− female mice generated via the semi-cloning technology. Mouse uterine lengths (h) or diameters (i) of WT, Gen1+/−, Wnt9b+/−, and Gen1+/−Wnt9b+/− female mice generated by conventional mating. Female mice in proestrus were collected for analyses. Each side of the uterus was measured and counted separately. Diameters were measured at the middle of each uterine side. All data are presented as mean ± SEM. One-way ANOVA test, **P < 0.01; ***P < 0.001; n.s., not significant.

CNVs may contribute to human diseases with variable clinical manifestations. As shown in our cohort and in previous studies, the TBX6/16p11.2 and LHX1/17q12 deletions are recurrent in human subjects with MA.11 However, deletion CNVs of 16p11.2 or 17q12 can also lead to other diseases without MA phenotypes.12,14 As for the GEN1/2p24.2 deletion, female mice heterozygous for a Gen1 null mutation showed no obvious MA phenotypes.13 Given this evidence of incomplete penetrance, we hypothesized that a single genetic variant may be insufficient for MA manifestation and that other genetic factors may contribute synergistically to pathogenesis or increase penetrance, as in the digenic/oligogenic models (Fig. 1b).1,3

Deleterious genetic variants, such as deletion CNVs and single nucleotide variants (SNVs), can disrupt gene function. CGH microarray analysis efficiently identifies CNVs but cannot detect SNVs. Therefore, to gain comprehensive insight into the genetic etiology of MA, we screened for SNVs using whole exome sequencing (WES) in the 9 CNV carriers. Genes implicated in the literature in MA or in the development of the female reproductive tract were prioritized in further analyses. Remarkably, we identified a frameshift mutation in WNT9B (c.6dupC; p.R2fs*55) and a missense mutation in GATA3 (c.581T>G; p.M194R), both outside the CNV regions, in the CNV carriers M45 and B03, respectively (Fig. 1c; Supplementary information, Fig. S1, Table S4). Hence, the double hits in 2 of the 9 CNV carriers (i.e., “GEN1 deletion CNV + WNT9B frameshift SNV” and “TBX6 deletion CNV + GATA3 missense SNV”, respectively) suggest a potential digenic etiology of MA.

To investigate the pathogenicity of the variant combinations “GEN1 + WNT9B” and “TBX6 + GATA3”, we introduced the corresponding double heterozygous variants into mice. Using our semi-cloning technology, Gen1+/−Wnt9b+/− and Tbx6+/−Gata3+/KI founder mice, as well as wild type controls (mutants and controls share the same mixed background of C57BL/6, DBA2 and 129X1/SvJ), were rapidly generated without mosaicism in one generation (Fig. 1d; Supplementary information, Fig. S2). Dissection of the respective 8–11-week-old founders in proestrus showed that Gen1+/−Wnt9b+/− female mice, but not Tbx6+/−Gata3+/KI female mice, exhibited significantly longer uteri than wild type controls (Fig. 1e, f; Supplementary information, Figs. S3, S4). In addition, the diameters of Gen1+/−Wnt9b+/− mouse uteri tended to deviate from normal values, although their average was not significantly different from that of wild type uteri (Fig. 1g). More than 60% (7/11) of Gen1+/−Wnt9b+/− female mice exhibited a disordered, looser stromal structure than controls (Supplementary information, Fig. S5). These observations suggest that double heterozygous mutations in Gen1 and Wnt9b can lead to abnormal development of the mouse uterus, a critical component of the female reproductive tract.

To investigate whether the abnormality reflects a synergistic effect of the two variants, we examined 8–11-week-old wild type, Gen1+/−, Wnt9b+/− and Gen1+/−Wnt9b+/− female mice in proestrus derived by conventional mating (mice were backcrossed to C57BL/6 to obtain mouse lines for analysis). Notably, the Gen1+/−Wnt9b+/− female mice exhibited significantly longer uteri than Gen1+/− and Wnt9b+/− female mice (Fig. 1h; Supplementary information, Fig. S6a). A tendency for uterine diameter to deviate from the normal value was also observed in Gen1+/−Wnt9b+/− female mice derived by mating, consistent with the observation in Gen1+/−Wnt9b+/− founders (Fig. 1g, i). Furthermore, comparison of Gen1+/−Wnt9b+/− female mice with single mutant females (Gen1+/− or Wnt9b+/−) showed a synergistic effect of the two genetic variants on the disordered, loose stromal structure (Supplementary information, Fig. S6b, c). Collectively, these observations in mouse models suggest a synergistic effect of Gen1 and Wnt9b on development of the uterus, presenting the first evidence in mice for a digenic etiology of MA.
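The genotype comparisons in Fig. 1f–i are summarized as mean ± SEM and tested by one-way ANOVA. A minimal Python sketch of those two calculations is given here for readers who wish to reproduce this type of analysis; the function names are illustrative assumptions, the numbers in the usage example are toy values rather than measurements from this study, and the sketch returns only the F statistic and its degrees of freedom, not a P value or any post hoc comparison.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, SEM), where SEM = sample standard deviation / sqrt(n)."""
    return mean(values), stdev(values) / sqrt(len(values))


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of measurements per genotype.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Toy values for illustration only; NOT data from this study.
    toy_groups = {
        "group_A": [1.0, 2.0, 3.0],
        "group_B": [2.0, 3.0, 4.0],
        "group_C": [5.0, 6.0, 7.0],
    }
    for name, values in toy_groups.items():
        m, sem = mean_sem(values)
        print(f"{name}: {m:.2f} ± {sem:.2f} (mean ± SEM)")
    f_stat, df_b, df_w = one_way_anova_f(list(toy_groups.values()))
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Converting the F statistic into a P value requires the F distribution, for which a library routine such as SciPy's `scipy.stats.f_oneway` could be used instead of this hand-written version.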

Finally, we explored the recurrence of double deleterious variants in GEN1 and WNT9B among MA subjects. To this end, we conducted gene-targeted sequencing of the coding regions of GEN1 and WNT9B in our MA cohort. Remarkably, we identified one more MA-affected case (from the M82 family) carrying double deleterious variants (one in GEN1 and the other in WNT9B), resembling the observations in the M45 family (Fig. 1c; Supplementary information, Fig. S7 and Table S5). In both the M45 and M82 families, MA-affected subjects carried deleterious variants in both GEN1 and WNT9B, whereas the unaffected mothers and sister carried only a single deleterious variant in either GEN1 or WNT9B, consistent with the observations in our corresponding mutant mice (Fig. 1h; Supplementary information, Figs. S6, S7). This result further supports a digenic etiology of MA. Interestingly, we also identified 4 subjects carrying single deleterious variants in GEN1 but no candidate variants in WNT9B, supporting a genetic contribution of GEN1 variants to MA (Supplementary information, Table S6). Since a single variant in GEN1 is insufficient to cause MA in humans or mice (Fig. 1h; Supplementary information, Figs. S6, S7), it is likely that deleterious variants of other genes (not just WNT9B) act synergistically with GEN1 variants to cause MA, in agreement with the high genetic heterogeneity of MA. In summary, these results demonstrate that GEN1 + WNT9B is an important genetic combination in MA manifestation and suggest, for the first time, that GEN1 is a major risk factor for MA.

Altogether, our findings provide evidence for a digenic etiology of MA in both humans and mice, and highlight the semi-cloning technology as a rapid and efficient experimental assay for identifying pathogenic combinations of genetic variants in mouse models of complex diseases. Given the wide phenotypic and genotypic heterogeneity of MA, different digenic/oligogenic combinations of variants are likely to differ in their contributions. Therefore, the joint use of genetic analysis and semi-cloning technology holds great potential for screening additional pathogenic variant combinations responsible for MA and other similar complex diseases.