Finnish Parkinson’s disease study integrating protein-protein interaction network data with exome sequencing analysis

Siitonen, Ari; Kytövuori, Laura; Nalls, Mike A.; Gibbs, Raphael; Hernandez, Dena G.; Ylikotila, Pauli; Peltonen, Markku; Singleton, Andrew B.; Majamaa, Kari

doi:10.1038/s41598-019-55479-y

Download PDF

Article
Open access
Published: 11 December 2019

Finnish Parkinson’s disease study integrating protein-protein interaction network data with exome sequencing analysis

Ari Siitonen ORCID: orcid.org/0000-0002-4695-9861^1,2,
Laura Kytövuori ORCID: orcid.org/0000-0003-2790-5646^1,2,
Mike A. Nalls^3,4,
Raphael Gibbs³,
Dena G. Hernandez³,
Pauli Ylikotila^5,6,
Markku Peltonen⁷,
Andrew B. Singleton³ &
…
Kari Majamaa^1,2

Scientific Reports volume 9, Article number: 18865 (2019) Cite this article

2079 Accesses
6 Citations
1 Altmetric
Metrics details

Subjects

Abstract

Variants associated with Parkinson’s disease (PD) have generally a small effect size and, therefore, large sample sizes or targeted analyses are required to detect significant associations in a whole exome sequencing (WES) study. Here, we used protein-protein interaction (PPI) information on 36 genes with established or suggested associations with PD to target the analysis of the WES data. We performed an association analysis on WES data from 439 Finnish PD subjects and 855 controls, and included a Finnish population cohort as the replication dataset with 60 PD subjects and 8214 controls. Single variant association (SVA) test in the discovery dataset yielded 11 candidate variants in seven genes, but the associations were not significant in the replication cohort after correction for multiple testing. Polygenic risk score using variants rs2230288 and rs2291312, however, was associated to PD with odds ratio of 2.7 (95% confidence interval 1.4–5.2; p < 2.56e-03). Furthermore, an analysis of the PPI network revealed enriched clusters of biological processes among established and candidate genes, and these functional networks were visualized in the study. We identified novel candidate variants for PD using a gene prioritization based on PPI information, and described why these variants may be involved in the pathogenesis of PD.

Proteome wide association studies of LRRK2 variants identify novel causal and druggable proteins for Parkinson’s disease

Article Open access 08 July 2023

Expanding causal genes for Parkinson’s disease via multi-omics analysis

Article Open access 21 October 2023

Genome-wide case-only analysis of gene-gene interactions with known Parkinson’s disease risk variants reveals link between LRRK2 and SYT10

Article Open access 29 June 2023

Introduction

The genetic etiology of Parkinson’s disease (PD) is complex (see e.g.¹). Many variants are associated with PD, but the effect of each variant seems to be small². Hence, large sample sizes will be required in a successful search for new variants³, and the probability of success can be further increased by employing information on genes possibly associated with the disease. Reduction of variants in the analysis can be accomplished by combining protein-protein interaction (PPI) data with genomic data (see e.g.⁴).

We have previously conducted a whole exome sequencing (WES) study on Finnish PD patients and population controls⁵. Here, we carried out a single variant re-analysis of these subjects and included an additional replication cohort of Finnish ancestry in the analysis. PPI information enabled us to focus the analysis on 36 genes with established or suggested associations with PD (PD36) and their interaction partners. Single variant analysis (SVA) in the discovery dataset yielded 11 candidate variants in seven genes, which were then analyzed in the replication dataset. Polygenic risk score (PRS) was calculated with two of the candidate variants and association to PD was tested in the replication dataset. In order to visualize possible biological processes related to these genes, we created PPI networks that included functional information of PD36 genes and functional information of novel candidate genes and genes of PD associated loci^6,7.

Subjects and Methods

Study populations and WES data preparation

Details of the three Finnish studies, Mitopark, Stampeed and FINRISK, have been described previously^5,8. Exome sequencing of the FINRISK-study has been performed at McDonnell Genome Institute, Washington University, and variant calling at Broad Institute as described previously⁹ and summarized in Supplementary Material.

Quality control of the exome sequences in Mitopark, Stampeed and FINRISK datasets was carried out with methodology described previously⁵. The discovery dataset (men 46%) consisted of exome sequences from Mitopark (N = 392 cases), Stampeed (N = 493 controls) and FINRISK (N = 47 cases; N = 362 controls). Total genotyping rate was 0.98. A portion of FINRISK cases (N = 107) were randomly assigned to the discovery and replication datasets, and in the discovery dataset the FINRISK controls were matched to cases with respect to sex and age. Only variants found in all the three studies were included in the merged discovery dataset. The replication dataset (men 48%) consisted of 60 FINRISK cases and 8214 FINRISK controls.

The study was approved by the Ethics Committee of the Turku University Hospital. All the methods were carried out in accordance with the relevant guidelines and regulations and informed consent was obtained from all participants.

Gene prioritization using protein-protein interaction data

UniProt database (version 2018-02) was queried with the phrase parkinson disease:disease AND organism:“Homo sapiens (Human) [9606]” in order to find proteins related to PD. In total, 36 proteins (Table 1) with established or suggested associations with Parkinson’s disease (PD36 proteins) were found. Information on protein-protein interactions was downloaded from Integrated interactions database (IID)¹⁰. Experimentally detected human specific interactions (version 2017-04) contained 18,627 vertices and 280,845 interactions. We then created a network (Supplementary Fig. S1) that included PD36 proteins and proteins that have direct interactions with these proteins. In total, the network (PD2300net) consisted of 2305 UniProt protein identifiers.

Table 1 Genes with suggested associations with Parkinson’s disease that were used to build PD2300net.

Full size table

Whole exome sequencing data analysis

Analysis workflow is shown in Fig. 1. Whole exome sequences (WES) from the three studies were subjected to single-variant association (SVA) test and polygenic risk score (PRS) association test. Analysis was focused to 36 genes with established or suggested associations with PD (PD36) and 2269 genes that interacted with them in PD2300net. The discovery dataset was filtered to include only variants in genes in PD2300net and variants 20 kbp upstream or downstream of PD2300net genes. In total, there were 8091 variants in the discovery set.

SVA test and PRS association test were performed using logistic regression with the first ten principal components as covariates. Genomic inflation factor lambda (based on median chi-squared test) was 1.0 in the discovery dataset. Variants with empirically set p value threshold of 0.0005 (N = 11) were used in logistic regression analysis in replication dataset, using the same settings as in the discovery dataset except that age was added into the covariates. Bonferroni correction for multiple testing was applied to replication results. Sanger sequencing of samples in the discovery dataset confirmed that variants rs113574896 and chr10_105048270_AGAG_A were false positive findings. Variants rs2627037, rs922984, rs2291310, rs2291311, rs2291312 in TTN gene were in linkage disequilibrium (LD) with each other and had almost identical frequency profiles (Table 2).

Table 2 Single variants in the discovery and replication datasets.

Full size table

Polygenic risk score was calculated using variants rs2230288 and rs2291312. These variants were selected on the basis of p-value cutoff 0.0005 in the SVA test and it was required that minor allele frequency was similar in the discovery and replication datasets. False positive variants rs113574896 and chr10_105048270_AGAG_A were removed from the analysis. Furthermore, all the combinations of the five variants in LD in TTN gene yielded similar PRS results and therefore variant rs2291312 was included to represent the LD block. Selected variants rs2230288 and rs2291312 were weighted by logarithm of odds ratio, and these values were added together for each sample in order to obtain PRS.

For disease prediction, the discovery dataset was used in training and the replication dataset in testing the model. Separate models were built on the two variants as features (variant model) and the polygenic risk score as the feature (PRS model). Variant model used random forest and PRS model logistic regression as classifier. The models were evaluated with the aid of sensitivity, specificity, area under curve (AUC), balanced accuracy score and mean decrease in impurity as the main metrics^11,12 (MDI). Balanced accuracy score is equal to the arithmetic mean of sensitivity and specificity.

Depth of coverage was analysed in a random sample of 400 cases or controls from the discovery dataset.

Visualization of protein-protein interaction network

Visualization network was built using experimentally detected human-specific interactions of the IID database. Interactors in this network were PD36 proteins, corresponding proteins of the candidate genes, and corresponding proteins of the genes from two genome-wide association study (GWAS) meta-analysis studies and from our previous WES and GWAS study^5,6,7 (Fig. 2). In order to select genes related to loci in GWAS studies, genes were retrieved within 250kbp up- and downstream of the reported chromosome positions (Supplementary Fig. S2). Genes that belonged to PD2300net were selected and the original chromosome positions of these genes (GWAS hits) are shown in Supplementary Table S1-S3. Furthermore, genes that belonged to PD2300net and were either significant in our previous WES study in gene-level analysis or contained de-novo variants with high effect size (Supplementary Table S4).

The visualization network (PD network 1) included edges between PD36 proteins, GWAS or WES hits and candidate proteins. This network of 74 proteins incorporated 29 PD36 proteins, 38 GWAS or WES hits and seven novel candidate protein hits with 165 edges (Fig. 3, Supplementary Table S5). The largest connected component of PD network 1 consisted of 64 genes.

Randomization test was performed on PD network 1. The main metrics were average path length (APL) and average clustering coefficient (ACC). Random graphs (N = 10,000) with size and degree distribution similar to PD network 1 were generated from IID database data by label permutation, and one-sided Monte Carlo p value was calculated for the true APL and ACC values. The degree distributions of the random graphs were preserved by binning proteins into 30 equal sized bins by their network degree distribution and swapping the protein labels between the proteins in the same degree bin without replacement.

Programs and databases

Plink 1.9b¹³ and R version 3.4.3 were used to prepare and analyze the exome sequences. Python version 3.5 with Scikit-learn library version 0.19.2 was used for logistic regression and random forest prediction models and MDI analysis¹⁴. Linkage disequilibrium between TTN variants was tested with LDlink webtool setting Europeans as the population¹⁵. Depth of coverage analysis was conducted using Genome Analysis Toolkit version 3.1¹⁶.

All networks were built and visualized using Python 3.5, Pyensembl version 1.1.0, Spark version 2.1.0, Graphframes version 0.5.0, Python library NetworkX version 2.1 and Gephi version 0.9.2. Random graphs for PPI network randomization test were built using in-house scripts. Average path length and average clustering coefficient was calculated using Stanford Network Analysis Platform (SNAP) version 5.0¹⁷. Pathway and gene enrichment information was acquired from GSEA v6.1 (software.broadinstitute.org/gsea) and STRING v10¹⁸ databases.

Results

Single-variant association test

We identified 8091 variants in PD2300net genes among 439 cases and 855 controls in the discovery dataset. Several p value thresholds from 0.05 to 0.00005 were tested and the lowest threshold 0.0005 that identified more than ten variants was selected for further evaluation in order to keep the false positive rate low. Eleven variants with p value less than 0.0005 in six novel genes and one established risk gene (candidate genes) were selected for replication, but none of the associations was significant in the replication dataset after correction for multiple testing (Table 2). One of the 11 variants was rs2230288 in the GBA gene leading to p.E365K (legacy name p.E326K). The frequency of this allele was 8.5% in PD cases and 4.7% in controls in the discovery dataset and similar frequencies were observed in the replication dataset (Table 2) giving an odds ratio of 2.1. The association of the variant with PD was not significant after correction for multiple testing, however.

We estimated the depth of coverage in discovery dataset. At the whole exome level 90% of all the contigs in our study were covered at 20x depth. However, only 19% of the PD36 genes were covered at 20×, with average total coverage being about 14x for these genes.

Polygenic risk score based on two variants is associated with the risk of PD

The variants rs2230288 and rs2291312 that passed the selection criteria were included in the calculation of polygenic risk score (PRS). The mean of PRS was 0.17 (range; 0.00 to 1.78) and 18% of the cases and 9% of the controls belonged to the upper quartile of the PRS values. The association of PRS with PD was tested in the replication dataset giving a mean odds ratio of 2.7 (95% confidence interval 1.4–5.2; p < 2.56e-03) (Table 3).

Table 3 Logistic regression results of polygenic risk score in the replication dataset.

Full size table

Two prediction models (variant model, PRS model) were then trained with the discovery dataset and tested in the replication dataset. Predictive power of the two models was generally low (Tables 4 and 5). Balanced accuracy score was 0.56 and the area under curve (AUC) score was 57%. The models classified 27% of the controls and 38% of the cases as cases.

Table 4 Metrics of prediction models in the replication dataset.

Full size table

Table 5 Confusion matrix of models in the replication dataset.

Full size table

Analysis of the protein-protein interaction network

The protein-protein interaction network (PD network 1) included 29 of the PD36 proteins, 38 GWAS hits and seven novel candidate proteins with 165 edges (Fig. 3, Supplementary Table S5). The largest connected component of PD network 1 consisted of 64 genes.

Protein-protein interactions in PD network 1 revealed that the proteins encoded by the seven novel candidate genes interacted directly with seven PD36 genes and five GWAS or WES hits (Fig. 3). Furthermore, RALGDS was identified in PD network 1 as a possible source of the GWAS hit in locus chr9:135955826 (rs11243993), identified in our previous study⁵. Variant rs11243993 was identified in 14 cases and none of the controls in GWAS analysis, but WES analysis did not identify any significant RALGDS variants.

Randomization test was performed to PD network 1 in order to estimate the statistical significance of the created PPI network. The largest connected component had significantly shorter average path length (p < 0.01066) and significantly greater average clustering coefficient (p < 0.000414) than what was expected of random network with the same size and node degree distribution (Supplementary Fig. S3). This suggests that PD Network 1 could be considered as relatively small world in comparison to random graph.

Gene set enrichment analysis (GSEA) revealed that 28 proteins in PD network 1 were involved in phosphate metabolism (p < 5.72e-8), 29 proteins in phosphorylation (p < 1.62E-15) and 20 proteins in organonitrogen compound metabolism (p < 5.03E-11) (Supplementary Table S6).

An interaction between the established PD proteins and candidate proteins in PD network 1 was also evident on inspection of the information on cell signalling by protein phosphorylation (uniprot.org) (Supplementary Fig. S4). Similarly, we observed a functional network, where candidate proteins interact with the rest of the network, if we merged information on previously suggested biological processes in PD (such as ubiquitination, mitochondrial function, signaling cascades, transportation and RNA processing) into PD network 1 (Supplementary Fig. S5).

Discussion

We analyzed here our previous WES data now focusing on a targeted set of 2305 genes. We compared 439 Finnish PD cases and 855 Finnish controls in the discovery phase and replicated the results in another dataset consisting of subjects with Finnish ancestry. SVA test was not significant in the replication dataset, but an association was found between PD and PRS. In addition, protein-protein interaction network showed that the novel variants identified here, loci identified in a recent GWAS meta-analysis^6,7, selected significant hits in our previous GWAS and WES study⁵, and known PD genes formed a network with clusters of biological processes, further suggesting the involvement of these proteins in PD.

We found a significant association between PRS and PD in the replication dataset with an odds ratio of 2.7. Predictive PRS model was able to capture 38% of the cases, although with a high rate of false negative cases. The prediction rate was generally low both in the PRS model and in the variant model. The relative effect of the p.E365K variant in GBA in the classification was higher than that of TTN in the variant model with mean decrease in impurity (MDI) of 58%, leaving MDI of 42% to the TTN variant.

The clinical significance of the GBA variant p.E365K has been controversial (see Clinvar: RCV000487503), but a recent meta-analysis provided some evidence that p.E365K may indeed be associated with PD¹⁹. The association was not significant in our study after correction for multiple testing, but ten out of 60 patients carried this mutation in the replication dataset giving an allele frequency of 0.08, which was similar to that in the discovery dataset. Interestingly, the allele frequency of the variant in non-Finnish European populations is 0.01 according to gnomAD²⁰, whereas we found a frequency of 0.04 in the Finnish population. Without genome-wide significant (p < 5e-8) results, our study cannot completely define the role of rs2230288 in PD.

Five variants (rs2627037, rs922984, rs2291310, rs2291311, rs2291312) were located in the titin (TTN) gene. TTN is a large gene, rich in variants and, therefore, it is possible that the association reflects variant ascertainment or sequencing bias. The Clinvar database reports 11,148 TTN variants, among which there are at least 251 pathogenic variants in 17 different conditions including cardiomyopathies, skeletal muscle phenotypes and congenital diseases²¹.

Titin acts in sarcomere assembly and has role in elasticity and resting tension of striated muscles^22,23. The variants identified in our study were located in or in close proximity of immunoglobulin-like domains 19 and 20 that account for the elasticity of titin. In addition, oxidation of the domains has been shown to lead to stiffening of the protein²⁴. Interestingly, the molecular spring titin determines, at least in part, muscle stiffness and rigidity and tremor are the clinical hallmarks in PD^25,26. Muscle stiffness likely plays a role in determining the frequency of oscillatory motion and therefore the changes in titin structure could impact whole-animal movement by modulating muscle stiffness.

Only 19% of the PD36 genes in WES data had a depth of coverage of 20×. This may have caused a loss of significant findings in established PD genes and should be taken into consideration when estimating the results.

We composed a PPI network that was based on 36 PD genes. The network enabled us to identify genes that interact, in addition to the established PD genes, with genes in the vicinity of GWAS hits reported in previous meta-analyses, genes in the vicinity of significant GWAS hits in our previous study and selected WES hits from our previous study. Not all plausible PD genes were used to build the initial gene set and although this approach may reduce the number of candidate genes identified, the small initial set of PD genes should have reduced the number of false positive findings. PPI network visualizations and GSEA described the network context around the identified candidate genes and supported their relation to established PD genes. Furthermore, a review of the literature on the seven candidate genes indicated that they may be involved in neurodegenerative diseases (Supplementary Table S7, Supplementary Material). Interestingly, 64 proteins in PD network 1 were connected via interaction suggesting a linking factor between them or a common signaling cascade. Phosphate metabolism and phosphorylation were among the most common processes identified in the GSEA analysis, but also other previously known biological processes, such as mitochondrial processing, ubiquitination and response to stress were identified.

GWAS hit at locus rs11243993 from our previous study was suggested to originate from RALGDS gene in PD network 1. RALGDS has a role in GTPase regulation and in PD network 1 the protein interacts directly with SNCA²⁷. Interestingly, GTPase signaling have been suggested to be the link between genomics and etiology of PD²⁸.

PPI networks were built using experimental subset of IID, which integrates data from primary data sources, such as BioGRID or IntAct. These datasets use various experimental techniques as the source and each have its own strengths and pitfalls. Here we did not filter the PPI data for interaction confidence or characterization score, but instead expected the genetic association test to serve as evidence for plausible interactions.

We identified novel candidate variants in PD using a combination of WES data and PPI network data. Targeted gene analysis, polygenic risk score association analysis and PPI network analysis indicated that these variants may be involved in the pathogenesis of PD. The power in our study was limited, and therefore, our findings can be seen as hypothesis generating and they require further investigation.

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

References

Ferreira, M. & Massano, J. An updated review of Parkinson’s disease genetics and clinicopathological correlations. Acta Neurol. Scand. 135, 273–284 (2017).
Article CAS PubMed Google Scholar
Nalls, Mike A. et al. Identification of novel risk loci, causal insights, and heritable risk for Parkinson's disease: a meta-analysis of genome-wide association studies. The Lancet Neurology 18(12), 1091–1102 (2019).
Sham, P. C. & Purcell, S. M. Statistical power and significance testing in large-scale genetic studies. Nat. Rev. Genet. 15, 335–346 (2014).
Article CAS PubMed Google Scholar
Lage, K. Protein-protein interactions and genetic diseases: The interactome. Biochim. Biophys. Acta 1842, 1971–1980 (2014).
Article CAS PubMed PubMed Central Google Scholar
Siitonen, A. et al. Genetics of early-onset Parkinson’s disease in Finland: exome sequencing and genome-wide association study. Neurobiol. Aging 53, 195.e–195.e10 (2017).
Article Google Scholar
Chang, D. et al. A meta-analysis of genome-wide association studies identifies 17 new Parkinson’s disease risk loci. Nat. Genet. 49, 1511–1516 (2017).
Article CAS PubMed PubMed Central Google Scholar
Nalls, M. A. et al. Large-scale meta-analysis of genome-wide association data identifies six new risk loci for Parkinson’s disease. Nat. Genet. 46, 989–993 (2014).
Article CAS PubMed PubMed Central Google Scholar
Zekavat, S. M. et al. Deep coverage whole genome sequences and plasma lipoprotein(a) in individuals of European and African ancestries. Nat. Commun. 9, 260–w (2018).
Article Google Scholar
Ganna, A. et al. Quantifying the Impact of Rare and Ultra-rare Coding Variation across the Phenotypic Spectrum. Am. J. Hum. Genet. 102, 1204–1211 (2018).
Article CAS PubMed PubMed Central Google Scholar
Kotlyar, M., Pastrello, C., Sheahan, N. & Jurisica, I. Integrated interactions database: tissue-specific view of the human and model organism interactomes. Nucleic Acids Res. 44, 536 (2016).
Article Google Scholar
Breiman, L., Friedman, J. H. & Olshen, R. A. Stone, cj (1984) classification and regression trees. Wadsworth, Belmont, California (2009).
Louppe, G. Understanding random forests: From theory to practice. arXiv preprint arXiv 1407, 7502 (2014).
Google Scholar
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 8. eCollection 2015 (2015).
Pedregosa, F. et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12, 2825–2830 (2011).
MathSciNet MATH Google Scholar
Machiela, M. J. & Chanock, S. J. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics 31, 3555–3557 (2015).
Article CAS PubMed PubMed Central Google Scholar
DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
Article CAS PubMed PubMed Central Google Scholar
Leskovec, J. & Sosič, R. Snap: A general-purpose network analysis and graph-mining library. ACM Transactions on Intelligent Systems and Technology (TIST) 8, 1 (2016).
Article Google Scholar
Szklarczyk, D. et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 43, 447 (2015).
Article Google Scholar
Huang, Y., Deng, L., Zhong, Y. & Yi, M. The Association between E326K of GBA and the Risk of Parkinson’s Disease. Parkinsons Dis. 2018, 1048084 (2018).
PubMed PubMed Central Google Scholar
Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
Article CAS PubMed PubMed Central Google Scholar
Henrie, A. et al. ClinVar Miner: Demonstrating utility of a Web-based tool for viewing and filtering ClinVar data. Hum. Mutat. 39, 1051–1060 (2018).
Article PubMed PubMed Central Google Scholar
Trinick, J. Titin as a scaffold and spring. Cytoskeleton. Curr. Biol. 6, 258–260 (1996).
Article CAS PubMed Google Scholar
Maruyama, K. Connectin, an elastic protein of striated muscle. Biophys. Chem. 50, 73–85 (1994).
Article CAS PubMed Google Scholar
Giganti, D., Yan, K., Badilla, C. L., Fernandez, J. M. & Alegre-Cebollada, J. Disulfide isomerization reactions in titin immunoglobulin domains enable a mode of protein elasticity. Nat. Commun. 9, 18–7 (2018).
Article Google Scholar
Monroy, J. A., Powers, K. L., Pace, C. M., Uyeno, T. & Nishikawa, K. C. Effects of activation on the elastic properties of intact soleus muscles with a deletion in titin. J. Exp. Biol. 220, 828–836 (2017).
PubMed Google Scholar
Taylor-Burt, K. R., Monroy, J., Pace, C., Lindstedt, S. & Nishikawa, K. C. Shiver me titin! Elucidating titin’s role in shivering thermogenesis. J. Exp. Biol. 218, 694–702 (2015).
Article PubMed Google Scholar
Spaargaren, M. & Bischoff, J. R. Identification of the guanine nucleotide dissociation stimulator for Ral as a putative effector molecule of R-ras, H-ras, K-ras, and Rap. Proceedings of the National Academy of Sciences 91, 12609–12613 (1994).
Article CAS ADS Google Scholar
Obergasteiger, J., Frapporti, G., Pramstaller, P. P., Hicks, A. A. & Volta, M. A new hypothesis for Parkinson’s disease pathogenesis: GTPase-p38 MAPK signaling and autophagy as convergence points of etiology and genomics. Molecular neurodegeneration 13, 40 (2018).
Article PubMed PubMed Central Google Scholar

Download references

Acknowledgements

We acknowledge our use of the gene set enrichment analysis, GSEA software, and Molecular Signature Database (MSigDB) (Subramanian, Tamayo, et al. (2005), PNAS 102, 15545–15550, http://www.broad.mit.edu/gsea/). The study was supported in part by a grant from the Sigrid Juselius Foundation. For funding details and acknowledgments, please see the Supplementary Material.

Author information

Authors and Affiliations

Research Unit of Clinical Neuroscience, University of Oulu, Oulu, Finland
Ari Siitonen, Laura Kytövuori & Kari Majamaa
Department of Neurology and Medical Research Center Oulu, Oulu University Hospital and University of Oulu, Oulu, Finland
Ari Siitonen, Laura Kytövuori & Kari Majamaa
Laboratory for Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
Mike A. Nalls, Raphael Gibbs, Dena G. Hernandez & Andrew B. Singleton
Data Tecnica International, Glen Echo, MD, 20812, USA
Mike A. Nalls
Institute of Clinical Medicine, Department of Neurology, University of Turku, Turku, Finland
Pauli Ylikotila
Division of Clinical Neurosciences, Turku University Hospital, Turku, Finland
Pauli Ylikotila
THL, Helsinki, Finland
Markku Peltonen

Authors

Ari Siitonen
View author publications
You can also search for this author in PubMed Google Scholar
Laura Kytövuori
View author publications
You can also search for this author in PubMed Google Scholar
Mike A. Nalls
View author publications
You can also search for this author in PubMed Google Scholar
Raphael Gibbs
View author publications
You can also search for this author in PubMed Google Scholar
Dena G. Hernandez
View author publications
You can also search for this author in PubMed Google Scholar
Pauli Ylikotila
View author publications
You can also search for this author in PubMed Google Scholar
Markku Peltonen
View author publications
You can also search for this author in PubMed Google Scholar
Andrew B. Singleton
View author publications
You can also search for this author in PubMed Google Scholar
Kari Majamaa
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Data acquisition A.S., D.G.H., P.Y., M.P., A.B.S., K.M. Data preprocessing. A.S., M.N., R.G. Study design and study level analysis. A.S., L.K., K.M. Critical review and writing the manuscript. A.S., L.K., M.N., R.B., D.G.H., P.Y., M.P., A.B.S., K.M.

Corresponding author

Correspondence to Ari Siitonen.

Ethics declarations

Competing Interests

Dr Nalls’ participation is supported by a consulting contract between Data Tecnica International LLC and the National Institute on Aging, NIH, Bethesda, MD, USA. Dr Nalls also consults for Genoom Health, Illumina Inc, The Michael J. Fox Foundation for Parkinson’s Research and University of California Healthcare. The other authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Siitonen, A., Kytövuori, L., Nalls, M.A. et al. Finnish Parkinson’s disease study integrating protein-protein interaction network data with exome sequencing analysis. Sci Rep 9, 18865 (2019). https://doi.org/10.1038/s41598-019-55479-y

Download citation

Received: 26 July 2019
Accepted: 29 November 2019
Published: 11 December 2019
DOI: https://doi.org/10.1038/s41598-019-55479-y

This article is cited by

Joint analysis of functionally related genes yields further candidates associated with Tetralogy of Fallot
- Alexandru Chelu
- Simon G. Williams
- David Talavera
Journal of Human Genetics (2022)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.