Introduction

According to the literature, several well-described disorders are associated with a high prevalence of autism spectrum disorder (ASD), including Rett syndrome, Smith-Lemli-Opitz syndrome, PTEN hamartoma tumor syndrome and tuberous sclerosis complex (TSC). Patients with TSC, Rett syndrome and Smith-Lemli-Opitz syndrome have a prevalence of ASD in the range of 16–61%, 17.5–58%,1, 2 and 50–60%3, 4 respectively. Interestingly, up to 80–90% of patients with Smith-Lemli-Opitz syndrome appear to fulfill some ASD criteria.5 In the fourth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV), Rett syndrome was classified as a subtype of autism. In DSM-5,6 however, an individual with Rett syndrome is not automatically assumed to have a diagnosis of ASD. Because the precise prevalence of mental disorders that lack visible physical signs is inherently difficult to assess, the prevalence of each of these disorders within ASD cohorts is unclear, which makes the indication for routine molecular testing in all patients with ASD uncertain. We analyzed whole exome sequencing (WES) data from 2392 families with ASD for variants within the genes causing the above-mentioned conditions. Our goal was to assess, in a large cohort of individuals diagnosed with ASD who do not exhibit severe neurological defects, the prevalence of patients with genetic alterations in the genes corresponding to the four targeted conditions.

Materials and methods

Whole-exome data from 2392 families (1800 quads and 592 trios) with ASD, obtained from the National Database for Autism Research7 (NDAR), were analyzed for pathogenic or likely pathogenic variants in TSC1, TSC2, PTEN, DHCR7 and MECP2. After McGill REB approval, variant calls were downloaded from NDAR (study 348) and annotated against the reference genome hg19/GRCh37. Rare variants (minor allele frequency, MAF ≤ 0.005) with a functional impact (defined as ‘missense’, ‘frameshift’, ‘stopgain’, ‘stoploss’, ‘startloss’ and ‘splicing’) were selected and manually inspected using the Integrative Genomics Viewer (IGV; study 334). Variant calls that were not supported by visualization in IGV were removed from further analysis. Variant classification was performed in accordance with ACMG guidelines.8 Clarifications and adaptations of the criteria are provided in Supplementary Table S1.
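The selection step described above can be illustrated with a minimal sketch. It assumes each annotated variant call is available as a simple record; the `Variant` class, its field names and the toy records are hypothetical and are not taken from the NDAR data or from the pipeline actually used in this study. Only the MAF threshold, the functional categories and the gene list come from the text.

```python
from dataclasses import dataclass

# Values stated in the Methods; everything else in this sketch is hypothetical.
MAX_MAF = 0.005
FUNCTIONAL_EFFECTS = {
    "missense", "frameshift", "stopgain", "stoploss", "startloss", "splicing",
}
TARGET_GENES = {"TSC1", "TSC2", "PTEN", "DHCR7", "MECP2"}


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one annotated variant call."""
    gene: str
    label: str
    effect: str
    maf: float  # population minor allele frequency; 0.0 if never observed


def passes_filter(v: Variant) -> bool:
    """Keep rare variants with a functional impact in one of the target genes."""
    return (
        v.gene in TARGET_GENES
        and v.maf <= MAX_MAF
        and v.effect in FUNCTIONAL_EFFECTS
    )


def select_candidates(variants):
    """Return the variants that would go forward to manual inspection."""
    return [v for v in variants if passes_filter(v)]


# Toy records invented for illustration; these are not variants from the study.
example_variants = [
    Variant("PTEN", "toy_1", "stopgain", 0.0),
    Variant("TSC2", "toy_2", "missense", 0.02),      # too common
    Variant("DHCR7", "toy_3", "synonymous", 0.001),  # no functional category
    Variant("BRCA1", "toy_4", "frameshift", 0.0),    # gene not targeted
    Variant("MECP2", "toy_5", "missense", 0.005),    # exactly at the threshold
]

candidates = select_candidates(example_variants)
print([v.label for v in candidates])
```

Manual inspection in IGV and ACMG classification are judgment-based steps and are deliberately left out of the sketch.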

Results

Please refer to Table 1 and Supplementary Section for the variants identified.

Table 1 Pathogenic or likely pathogenic variants identified in PTEN, TSC1, TSC2 and MECP2

Discussion

Based on the selection criteria, 148 variants in the targeted genes were classified. Following variant classification, seven patients were found to carry a variant in one of these five genes that was likely responsible for the patient’s phenotype (Table 1), with the majority (6 out of 7) occurring in PTEN. It is possible that this prevalence is an underestimate, as ~10% of Cowden syndrome patients with negative WES and copy number variant (CNV) analysis have been found to harbor pathogenic promoter variants.9 We did not have access to clinical information on the patients in the cohort, so cephalic measurements could not be reviewed.

No patients were determined to harbor biallelic DHCR7 variants (Supplementary Table S3). Twenty-eight patients carried a pathogenic or likely pathogenic DHCR7 variant, giving a carrier frequency of 1 in 85. If variants of uncertain significance (VUS) are included, the carrier frequency is ~1 in 40. Carrier frequency estimates for DHCR7 are ~1–2% for individuals of Caucasian ancestry.10 Therefore, it is possible that some of the identified VUS may in fact be disease-causing alleles.
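The carrier-frequency arithmetic in this paragraph can be checked with a short sketch. The counts (28 carriers, 2392 probands, ~1 in 40 when VUS are included, and the 1–2% population estimate) come from the text; the function and variable names are illustrative only.

```python
def one_in_n(carriers: int, cohort_size: int) -> int:
    """Express a frequency as '1 in N', rounded to the nearest integer."""
    if carriers <= 0:
        raise ValueError("carriers must be positive")
    return round(cohort_size / carriers)


COHORT_SIZE = 2392   # probands analyzed
PLP_CARRIERS = 28    # carriers of a pathogenic or likely pathogenic DHCR7 variant

plp_one_in = one_in_n(PLP_CARRIERS, COHORT_SIZE)
plp_percent = 100 * PLP_CARRIERS / COHORT_SIZE

# The text reports ~1 in 40 when VUS are included; convert that to a percentage.
vus_inclusive_percent = 100 / 40

print(f"P/LP only: 1 in {plp_one_in} ({plp_percent:.2f}%)")
print(f"Including VUS: ~1 in 40 ({vus_inclusive_percent:.1f}%)")
```

On these numbers the pathogenic and likely pathogenic variants alone sit near the lower end of the reported 1–2% range, while the VUS-inclusive figure sits above it, which is compatible with some, but probably not all, of the VUS being disease-causing alleles.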

No causative mutations for Rett syndrome were identified in our study. However, one VUS in MECP2 was particularly notable. A male proband was hemizygous for the variant c.691G>A, p.G231R, which has been previously reported in a female patient with seizures and intellectual disability. This variant is absent from population databases.11 Although this variant was also present in the proband’s unaffected mother and sister, males carrying MECP2 variants have been diagnosed with X-linked mental retardation (OMIM 300260). This variant was therefore classified as a VUS. An additional 38 patients carried one of 36 heterozygous or hemizygous VUS in PTEN, TSC1, TSC2 or MECP2 (Supplementary Table S2). One patient (12621) harbored both a likely pathogenic de novo variant and a paternally inherited VUS in TSC2.

De novo status was observed in 3 out of the 6 PTEN variants and in the one TSC2 variant. In total, six variants have been previously reported in patients, including four reported in patients with ASD or related phenotypes. Among unselected patients with autism, the estimated frequency of a causative variant in PTEN and TSC2 is ~1 in 399 and 1 in 2392, respectively. In addition, two of the probands were previously determined to carry large pathogenic de novo duplications of TSC2 (Supplementary Table S4).12 Only 53% of the probands analyzed in the present study had been previously tested for CNVs, so additional probands may also carry pathogenic CNVs. Taken together, at least 3 of 2392 probands (1 in 797) carried a pathogenic TSC2 variant; this appears to be more frequent than in the general population, where TSC has been reported to occur in ~1 in 5800 live births.13
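As a rough illustration of the comparison made above, the following sketch reproduces the reported frequencies and derives the ratio between the cohort frequency of TSC2 pathogenic variants and the reported general-population incidence of TSC. The fold difference is a back-of-the-envelope figure computed here from the numbers in the text; it is not a statistic reported by the study, and it ignores sampling uncertainty from the small counts.

```python
from fractions import Fraction

COHORT_SIZE = 2392  # probands analyzed

# Counts taken from the text.
pten_freq = Fraction(6, COHORT_SIZE)      # six probands with a PTEN variant
tsc2_freq = Fraction(3, COHORT_SIZE)      # one sequence variant plus two duplications
tsc_population = Fraction(1, 5800)        # reported TSC incidence in live births


def one_in(freq: Fraction) -> int:
    """Express a frequency as '1 in N', rounded to the nearest integer."""
    return round(1 / freq)


# Ratio of cohort frequency to general-population incidence (illustrative only).
fold_difference = float(tsc2_freq / tsc_population)

print(f"PTEN: 1 in {one_in(pten_freq)}")
print(f"TSC2: 1 in {one_in(tsc2_freq)}")
print(f"TSC2 in cohort vs TSC in population: ~{fold_difference:.1f}-fold")
```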

There are several limitations to this study. This analysis was performed on previously obtained research WES data, and variants with low coverage could not be confirmed by Sanger sequencing; in addition, some variants may have been missed because of low coverage in certain regions or because they lie outside the exome. Taking this into consideration, our analysis was not consistent with a high prevalence of the four targeted disorders in ASD patient cohorts. Although this study was not designed to evaluate WES in the clinical setting, it highlights the issue of uncovering a large burden of VUS while making only a small number of clear diagnoses. Despite our analysis being limited to five genes, 54 unique VUS that require further exploration (and resources) were identified. Therefore, preparation for efficient VUS interpretation must be considered to ensure that widespread implementation of WES in ASD will be cost-effective. This study emphasizes the importance of data sharing to further scientific advancement in the field of genomics, as previously published studies using the NDAR data set also illustrate.14, 15, 16, 17 It also emphasizes the need for a large prospective study evaluating the prevalence of different genetic syndromes in patients with ASD. Ideally, detailed phenotypic information should be available for these patients, and individuals with severe neurological deficits should not be excluded; the absence of such information and the exclusion of such individuals were inevitable limitations of the current study.