Exome sequencing of Finnish isolates enhances rare-variant association power

Locke, Adam E.; Steinberg, Karyn Meltz; Chiang, Charleston W. K.; Service, Susan K.; Havulinna, Aki S.; Stell, Laurel; Pirinen, Matti; Abel, Haley J.; Chiang, Colby C.; Fulton, Robert S.; Jackson, Anne U.; Kang, Chul Joo; Kanchi, Krishna L.; Koboldt, Daniel C.; Larson, David E.; Nelson, Joanne; Nicholas, Thomas J.; Pietilä, Arto; Ramensky, Vasily; Ray, Debashree; Scott, Laura J.; Stringham, Heather M.; Vangipurapu, Jagadish; Welch, Ryan; Yajnik, Pranav; Yin, Xianyong; Eriksson, Johan G.; Ala-Korpela, Mika; Järvelin, Marjo-Riitta; Männikkö, Minna; Laivuori, Hannele; Dutcher, Susan K.; Stitziel, Nathan O.; Wilson, Richard K.; Hall, Ira M.; Sabatti, Chiara; Palotie, Aarno; Salomaa, Veikko; Laakso, Markku; Ripatti, Samuli; Boehnke, Michael; Freimer, Nelson B.

doi:10.1038/s41586-019-1457-z

Article
Published: 31 July 2019

Nature volume 572, pages 323–328 (2019)

An Author Correction to this article was published on 04 November 2019

This article has been updated

Abstract

Exome-sequencing studies have generally been underpowered to identify deleterious alleles with a large effect on complex traits as such alleles are mostly rare. Because the population of northern and eastern Finland has expanded considerably and in isolation following a series of bottlenecks, individuals of these populations have numerous deleterious alleles at a relatively high frequency. Here, using exome sequencing of nearly 20,000 individuals from these regions, we investigate the role of rare coding variants in clinically relevant quantitative cardiometabolic traits. Exome-wide association studies for 64 quantitative traits identified 26 newly associated deleterious alleles. Of these 26 alleles, 19 are either unique to or more than 20 times more frequent in Finnish individuals than in other Europeans and show geographical clustering comparable to Mendelian disease mutations that are characteristic of the Finnish population. We estimate that sequencing studies of populations without this unique history would require hundreds of thousands to millions of participants to achieve comparable association power.
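The scale of that estimate can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' power analysis. It assumes a simple additive model for a standardized quantitative trait, in which the non-centrality parameter of a single-variant test is approximately N · 2p(1 − p) · β², so that for a fixed effect size β the sample size needed for equal power scales inversely with 2p(1 − p). The function name and the example allele frequencies are hypothetical.

```python
def sample_size_multiplier(maf_enriched: float, maf_reference: float) -> float:
    """Return how many times more participants a reference population needs
    to match the association power of an enriched population.

    Assumption (not from the source): additive model, standardized trait,
    equal per-allele effect in both populations, so the non-centrality
    parameter is proportional to N * 2p(1 - p).
    """
    for maf in (maf_enriched, maf_reference):
        if not 0.0 < maf < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    genotype_variance_enriched = 2.0 * maf_enriched * (1.0 - maf_enriched)
    genotype_variance_reference = 2.0 * maf_reference * (1.0 - maf_reference)
    return genotype_variance_enriched / genotype_variance_reference


if __name__ == "__main__":
    # Hypothetical example: a variant at 1% in the enriched population
    # and 0.05% elsewhere, i.e. a 20-fold frequency enrichment.
    multiplier = sample_size_multiplier(0.01, 0.0005)
    print(f"sample size multiplier: {multiplier:.2f}")
    print(f"participants needed to match a cohort of 20,000: {20000 * multiplier:,.0f}")
```

Under these assumptions, a 20-fold enrichment translates into roughly a 20-fold larger sample elsewhere, so a cohort of about 20,000 corresponds to roughly 400,000 participants for a single such variant, and to more for variants with larger enrichment.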

**Fig. 1: Characterization of associations.**

**Fig. 2: Allelic enrichment in the Finnish population and its effect on genetic discovery.**

**Fig. 3: Geographical clustering of associated variants.**

Data availability

The sequencing data can be accessed through dbGaP (https://www.ncbi.nlm.nih.gov/gap/) using study numbers phs000756 and phs000752. Association results can be accessed at http://pheweb.sph.umich.edu/FinMetSeq/ and are searchable via the Type 2 Diabetes Knowledge Portal (http://www.type2diabetesgenetics.org/). Summary statistics are also available through the NHGRI-EBI GWAS Catalog at https://www.ebi.ac.uk/gwas/downloads/summary-statistics.

Change history

04 November 2019
An Amendment to this paper has been published and can be accessed via a link at the top of the paper.

Acknowledgements

We thank T. Teshiba for coordinating ethical permissions and samples; S. Kerminen, D. Lawson and G. Busby for discussions and providing scripts to run fineSTRUCTURE. S.R. was supported by the Academy of Finland Center of Excellence in Complex Disease Genetics (312062), Academy of Finland (285380), the Finnish Foundation for Cardiovascular Research, the Sigrid Juselius Foundation, Biocentrum Helsinki and University of Helsinki HiLIFE Fellow grant. V.R. acknowledges support by RFBR, research project 18-04-00789 A. V.S. was supported by the Finnish Foundation for Cardiovascular Research. C.S. and L.S. received funding from HG006695, HL113315 and MH105578. M.A.-K. is supported by a Senior Research Fellowship from the National Health and Medical Research Council (NHMRC) of Australia (APP1158958) and works in a unit that is supported by the University of Bristol and UK Medical Research Council (MC_UU_12013/1). The Baker Institute is supported in part by the Victorian Government’s Operational Infrastructure Support Program. A.U.J., D.R., L.J.S., H.M.S., R.W., P.Y., X.Y. and M.B. received funding from DK062370. S.K.S., C.W.K.C. and N.B.F. received funding from HL113315 and NS062691. The METSIM study was supported by grants from the Academy of Finland (321428), the Sigrid Juselius Foundation, the Finnish Foundation for Cardiovascular Research, Kuopio University Hospital and the Centre of Excellence of Cardiovascular and Metabolic Diseases, which is supported by the Academy of Finland (M.L.). Sequencing was funded by 5U54HG003079. A.E.L., K.M.S., H.J.A., C.C.C., C.J.K., K.L.K., D.C.K., D.E.L., J.N., T.J.N., S.K.D., N.O.S., I.M.H. and R.K.W. were funded by 5U54HG003079 and 5UM1HG008853-03.

Author information

A list of participants and their affiliations appears in the Supplementary Information.
These authors contributed equally: Adam E. Locke, Karyn Meltz Steinberg, Charleston W. K. Chiang, Susan K. Service.
These authors jointly supervised this work: Michael Boehnke, Nelson B. Freimer.

Authors and Affiliations

Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
Adam E. Locke & Ira M. Hall
McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO, USA
Adam E. Locke, Karyn Meltz Steinberg, Haley J. Abel, Colby C. Chiang, Robert S. Fulton, Chul Joo Kang, Krishna L. Kanchi, Daniel C. Koboldt, David E. Larson, Joanne Nelson, Thomas J. Nicholas, Susan K. Dutcher, Nathan O. Stitziel, Richard K. Wilson & Ira M. Hall
Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, MI, USA
Adam E. Locke, Anne U. Jackson, Debashree Ray, Laura J. Scott, Heather M. Stringham, Ryan Welch, Pranav Yajnik, Xianyong Yin & Michael Boehnke
Department of Pediatrics, Washington University School of Medicine, St Louis, MO, USA
Karyn Meltz Steinberg
Center for Neurobehavioral Genetics, Jane and Terry Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, USA
Charleston W. K. Chiang, Susan K. Service, Vasily Ramensky & Nelson B. Freimer
Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
Charleston W. K. Chiang
Quantitative and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
Charleston W. K. Chiang
Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
Aki S. Havulinna, Matti Pirinen, Hannele Laivuori, Aarno Palotie & Samuli Ripatti
National Institute for Health and Welfare, Helsinki, Finland
Aki S. Havulinna, Arto Pietilä & Veikko Salomaa
Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
Laurel Stell & Chiara Sabatti
Department of Public Health, University of Helsinki, Helsinki, Finland
Matti Pirinen & Samuli Ripatti
Helsinki Institute for Information Technology HIIT and Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
Matti Pirinen
Department of Genetics, Washington University School of Medicine, St Louis, MO, USA
Haley J. Abel, Robert S. Fulton, David E. Larson & Susan K. Dutcher
The Institute for Genomic Medicine, Nationwide Children’s Hospital, Columbus, OH, USA
Daniel C. Koboldt & Richard K. Wilson
Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
Daniel C. Koboldt & Richard K. Wilson
USTAR Center for Genetic Discovery and Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
Thomas J. Nicholas
Federal State Institution “National Medical Research Center for Preventive Medicine” of the Ministry of Healthcare of the Russian Federation, Moscow, Russia
Vasily Ramensky
Departments of Epidemiology and Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
Debashree Ray
Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland
Jagadish Vangipurapu & Markku Laakso
Department of Public Health Solutions, National Institute for Health and Welfare, Helsinki, Finland
Johan G. Eriksson
Folkhälsan Research Center, Helsinki, Finland
Johan G. Eriksson
Department of General Practice and Primary Health Care, University of Helsinki, Helsinki and Helsinki University Hospital, Helsinki, Finland
Johan G. Eriksson
Systems Epidemiology, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
Mika Ala-Korpela
Computational Medicine, Faculty of Medicine, University of Oulu and Biocenter Oulu, University of Oulu, Oulu, Finland
Mika Ala-Korpela
NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland
Mika Ala-Korpela
Population Health Science, Bristol Medical School, University of Bristol, Bristol, UK
Mika Ala-Korpela
Medical Research Council Integrative Epidemiology Unit at the University of Bristol, Bristol, UK
Mika Ala-Korpela
Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, The Alfred Hospital, Monash University, Melbourne, Victoria, Australia
Mika Ala-Korpela
Biocenter Oulu, University of Oulu, Oulu, Finland
Marjo-Riitta Järvelin
Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu, Finland
Marjo-Riitta Järvelin & Minna Männikkö
Unit of Primary Health Care, Oulu University Hospital, Oulu, Finland
Marjo-Riitta Järvelin
Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London, UK
Marjo-Riitta Järvelin
Department of Life Sciences, College of Health and Life Sciences, Brunel University London, London, UK
Marjo-Riitta Järvelin
Northern Finland Birth Cohorts, Faculty of Medicine, University of Oulu, Oulu, Finland
Minna Männikkö
Medical and Clinical Genetics, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
Hannele Laivuori
Department of Obstetrics and Gynecology, Tampere University Hospital and University of Tampere, Faculty of Medicine and Health Technology, Tampere, Finland
Hannele Laivuori
Cardiovascular Division, Department of Medicine, Washington University School of Medicine, St Louis, MO, USA
Nathan O. Stitziel
Department of Statistics, Stanford University, Stanford, CA, USA
Chiara Sabatti
Analytical and Translational Genetics Unit (ATGU), Psychiatric & Neurodevelopmental Genetics Unit, Departments of Psychiatry and Neurology, Massachusetts General Hospital, Boston, MA, USA
Aarno Palotie
Broad Institute of MIT and Harvard, Cambridge, MA, USA
Aarno Palotie & Samuli Ripatti
Department of Medicine, Kuopio University Hospital, Kuopio, Finland
Markku Laakso

Consortia

FinnGen Project

Contributions

A.E.L., L.J.S., R.K.W., A. Palotie, V.S., M.L., S.R., M.B. and N.B.F. designed the study. A.E.L., K.M.S., H.J.A., R.S.F., D.C.K., D.E.L., J.N., T.J.N. and J.V. produced and quality-controlled the sequence data. A.E.L., A.S.H., A.U.J., A. Pietilä, H.M.S., M.A.-K., V.S. and M.L. collected, quality-controlled and/or prepared the clinical data for association analysis. A.E.L., K.M.S., C.W.K.C., S.K.S., A.S.H., L.S., M.P., C.C.C., A.U.J., C.J.K., K.L.K., V.R., D.R., J.V., R.W., P.Y. and X.Y. analysed data. A.S.H., J.G.E., M.A.-K., M.-R.J. and M.M. collected, quality-controlled and analysed replication data. H.L., S.K.D., N.O.S., I.M.H., C.S., S.R., M.B. and N.B.F. supervised experiments and analyses. A.E.L., K.M.S., C.W.K.C., S.K.S., C.S., M.B. and N.B.F. wrote the paper.

Corresponding authors

Correspondence to Michael Boehnke or Nelson B. Freimer.

Ethics declarations

Competing interests

V.S. has participated in a conference trip sponsored by Novo Nordisk and received an honorarium from the same source for participating in an advisory board meeting. He also has an ongoing research collaboration with Bayer. H.L. is a member of the Nordic Expert group unconditionally supported by Gedeon Richter Nordics and has received an honorarium from Orion. All other authors have no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Peer review information: Nature thanks Timothy Frayling, Alan Shuldiner, André G. Uitterlinden and Daniel E. Weeks for their contribution to the peer review of this work.

Extended data figures and tables

Extended Data Fig. 1 Allele frequency comparisons between FinMetSeq and NFE from gnomAD.

a, Distribution of allelic frequencies between FinMetSeq and gnomAD NFE. The comparison of allele frequencies shows the excess of variants at higher frequency in Finland as a result of the multiple bottlenecks experienced in Finnish population history. b, Proportional site frequency spectra between FinMetSeq and gnomAD NFE by variant annotation class. In general, we find a depletion of the variants in the rarest frequency class, as well as enrichment of variants in the intermediate to common frequency range. The site frequency spectra were down-sampled to 18,000 chromosomes for each data set. c, Comparison of MAFs for trait-associated variants in FinMetSeq and NFE gnomAD. Plotted in the grey background is a two-dimensional histogram of variants with non-zero allele frequencies in both gnomAD and FinMetSeq but no trait associations. Variants associated with at least one trait are coloured and scaled inversely proportional to the logarithm of the association P value. Variants >10× enriched in FinMetSeq compared to NFE are pink, those <10× enriched are in blue. The dashed line is the line of equal frequency. Two-sided uncorrected P values are from a regression of trait on the count of alternative allele at each variant. The number of independent individuals used in each point is listed in Supplementary Table 5.
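The caption for panel b does not say how the site frequency spectra were down-sampled. One common approach is hypergeometric projection, sketched below as an illustration only. The function name `project_sfs` and the toy counts are hypothetical and are not taken from the study.

```python
from math import comb


def project_sfs(sfs, n, m):
    """Project a site frequency spectrum from n chromosomes down to m.

    sfs[k] is the number of variants with k copies of the alternative
    allele among n chromosomes, so len(sfs) must equal n + 1.
    Returns a list of length m + 1 of expected variant counts.
    """
    if len(sfs) != n + 1:
        raise ValueError("sfs must have n + 1 entries")
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= n")

    total = comb(n, m)  # number of ways to draw m chromosomes from n
    projected = [0.0] * (m + 1)
    for k, count in enumerate(sfs):
        if count == 0:
            continue
        # j copies in the subsample: hypergeometric probability
        for j in range(max(0, m - (n - k)), min(k, m) + 1):
            prob = comb(k, j) * comb(n - k, m - j) / total
            projected[j] += count * prob
    return projected


# Toy input (hypothetical): 4 chromosomes, 10 singletons and 4 doubletons.
toy_sfs = [0, 10, 4, 0, 0]
toy_projected = project_sfs(toy_sfs, n=4, m=2)
```

Projection preserves the total number of variants, but some of them move into the zero-copy class, which is why rare classes shrink after down-sampling.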

Extended Data Fig. 2 Heritability of and correlations between traits.

a, b, Traits are in the same order, clockwise in a, and left to right and top to bottom in b, following the trait group colour key. a, Heritability estimated in 13,342 unrelated individuals (for abbreviations see Supplementary Table 4; for details see Supplementary Table 6). b, Heat map of the absolute Pearson correlations of standardized trait values (top right triangle) and the absolute values of estimated pairwise genetic correlations (bottom left triangle). Genetic correlations are estimated in 13,342 unrelated individuals. Values in grey below the diagonal had trait heritability less than 1.5× the s.e. of heritability.

Extended Data Fig. 3 Properties of associations shared between traits.

a, Shared genomic associations by pairs of traits. For traits x and y, colour in row x and column y reflects the number of loci associated with both traits divided by the number of loci associated with trait x. Traits are presented in the same order as in Extended Data Fig. 2a, and the side and top colour bars reflect trait groups. b, Relationship between estimated genetic correlation and extent of sharing of genetic associations. For each trait pair, the extent of locus sharing is defined as the number of loci associated with both traits divided by the total number of loci associated with either trait. Analysis using the absolute value of the Pearson correlation of the residual series results in a very similar pattern. The number of trait pairs in each x-axis category is as follows: 0–1%, 819; 1–10%, 204; 11–20%, 102; 21–30%, 41; 31–40%, 29; 41–50%, 16; >50%, 13. The bar within each box is the median, the box represents the upper and lower quartiles, whiskers extend to 1.5× the interquartile range and points represent outliers.
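The two sharing measures defined in this caption differ only in their denominator. The minimal sketch below computes both from sets of locus labels. The function name `locus_sharing` and the example labels are hypothetical and do not come from the study data.

```python
def locus_sharing(loci_x, loci_y):
    """Return (directional, symmetric) locus sharing for traits x and y.

    directional: loci associated with both traits / loci associated with x
                 (the quantity described for panel a)
    symmetric:   loci associated with both traits / loci associated with
                 either trait (the quantity described for panel b)
    """
    x, y = set(loci_x), set(loci_y)
    both = x & y
    either = x | y
    directional = len(both) / len(x) if x else float("nan")
    symmetric = len(both) / len(either) if either else float("nan")
    return directional, symmetric


# Hypothetical locus labels for two traits.
trait_x = {"locus_A", "locus_B", "locus_C", "locus_D"}
trait_y = {"locus_A", "locus_B", "locus_E"}
directional, symmetric = locus_sharing(trait_x, trait_y)
```

The directional measure is not symmetric in x and y, which is why the heat map in panel a can differ above and below its diagonal.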

Extended Data Fig. 4 Gene-based association of extremely rare variants in APOB with serum total cholesterol.

Top, the distribution of the covariate-adjusted and inverse-normal transformed phenotype. Bottom, the association statistics for each variant included in the gene-based test along with the trait value for minor allele carriers of each variant (orange triangles). SV.P is the P value from the analysis of each variant in a single-variant analysis. The number of independent individuals in the analysis is 19,291.

Extended Data Fig. 5 Gene-based association of rare variants in SECTM1 with HDL2 cholesterol.

Top, the distribution of the covariate-adjusted and inverse-normal transformed phenotype. Bottom, the association statistics for each variant included in the gene-based test, along with the trait value for minor allele carriers of each variant (orange triangles). SV.P is the P value from the analysis of each variant in a single-variant analysis. The number of independent individuals in the analysis is 10,984.

Extended Data Fig. 6 Gene-based association of extremely rare variants in ALDH1L1 with glycine levels.

Top, the distribution of the covariate-adjusted and inverse-normal transformed phenotype. Bottom, the association statistics for each variant included in the gene-based test, along with the trait value for minor allele carriers of each variant (orange triangles). SV.P is the P value from the analysis of each variant in a single-variant analysis. The number of independent individuals in the analysis is 8,206.

Extended Data Fig. 7 Population structure of the FinMetSeq dataset, by region.

Population structure, by region, from a principal component analysis of exome-sequencing variant data (MAF > 1%) for 14,874 unrelated individuals with known parental birthplaces. Colour indicates individuals with both parents born in the same region; grey indicates individuals with different parental birth regions or missing information for one parent. Ctf, Central Finland; COs, Central Ostrobothnia; Kai, Kainuu; Khm, Kanta-Hame; Kyl, Kymenlaakso; Lap, Lapland; Nka, Northern Karelia; NOs, Northern Ostrobothnia; NSv, Northern Savonia; Nfi, individuals born outside Finland and lacking data on parental birthplace; Osb, Ostrobothnia; Phm, Paijat-Hame; Prk, Pirkanmaa; SKa, Southern Karelia; SuK, surrendered Karelia; SOs, Southern Ostrobothnia; SSv, Southern Savonia; Stk, Satakunta; Swf, Southwest Finland; Usm, Uusimaa; X, split parental birthplaces. Large solid circles represent the centre of each region. A map of Finland with regions labelled is supplied for reference.

Extended Data Fig. 8 Hierarchical clustering tree produced by fineSTRUCTURE.

We identified 16 subpopulations within the FinMetSeq dataset by applying a haplotype-based clustering algorithm, fineSTRUCTURE, on 2,644 unrelated individuals born by 1955 whose parents were both born in the same municipality (Methods). Each subpopulation is named based on the most common parental birth location among its members. Kai, Kainuu; Lap, Lapland; NKa, North Karelia; NOs, North Ostrobothnia; NSv, North Savonia; SOs, South Ostrobothnia; SuK, Surrendered Karelia. A map of Finland with regions labelled is supplied for reference. If multiple subpopulations share the same location label, the subpopulation is further distinguished with a numeral. NSv3 is used as an internal reference for the enrichment analysis. See Supplementary Table 17 for more detailed demographic descriptions of each subpopulation.

Extended Data Fig. 9 Regional variation in allele frequencies by functional annotation.

Enrichment of variants by allelic class in regional subpopulations of late-settlement Finland (defined in Supplementary Table 17). Each bin represents the ratio of variants in the subpopulation compared to the reference subpopulation (NSv3), after down-sampling the frequency spectra of all populations to 200 chromosomes. Pink cells represent enrichment (ratio >1), blue cells represent depletion (ratio <1). Sample sizes and confidence intervals for each enrichment ratio and the associated P values are presented in Supplementary Table 18. The results are consistent with multiple bottlenecks in late-settlement Finland, particularly for populations in Lapland and Northern Ostrobothnia. *P < 0.05; **P < 0.01; ***P < 0.005.

Supplementary information

Supplementary Information

This file contains the Supplementary Results, Supplementary Methods, Supplementary References and a full list of members of FinnGen.

Reporting Summary

Supplementary Tables

This file contains Supplementary Tables 1–22 with a full guide.

About this article

Cite this article

Locke, A.E., Steinberg, K.M., Chiang, C.W.K. et al. Exome sequencing of Finnish isolates enhances rare-variant association power. Nature 572, 323–328 (2019). https://doi.org/10.1038/s41586-019-1457-z

Received: 05 November 2018
Accepted: 02 July 2019
Published: 31 July 2019
Issue Date: 15 August 2019
DOI: https://doi.org/10.1038/s41586-019-1457-z
