Insights into the genetic architecture of the human face

White, Julie D.; Indencleef, Karlijne; Naqvi, Sahin; Eller, Ryan J.; Hoskens, Hanne; Roosenboom, Jasmien; Lee, Myoung Keun; Li, Jiarui; Mohammed, Jaaved; Richmond, Stephen; Quillen, Ellen E.; Norton, Heather L.; Feingold, Eleanor; Swigut, Tomek; Marazita, Mary L.; Peeters, Hilde; Hens, Greet; Shaffer, John R.; Wysocka, Joanna; Walsh, Susan; Weinberg, Seth M.; Shriver, Mark D.; Claes, Peter

doi:10.1038/s41588-020-00741-7

Article
Published: 07 December 2020

Nature Genetics volume 53, pages 45–53 (2021)

Abstract

The human face is complex and multipartite, and characterization of its genetic architecture remains challenging. Using a multivariate genome-wide association study meta-analysis of 8,246 European individuals, we identified 203 genome-wide-significant signals (120 also study-wide significant) associated with normal-range facial variation. Follow-up analyses indicate that the regions surrounding these signals are enriched for enhancer activity in cranial neural crest cells and craniofacial tissues, several regions harbor multiple signals with associations to different facial phenotypes, and there is evidence for potential coordinated actions of variants. In summary, our analyses provide insights into the understanding of how complex morphological traits are shaped by both individual and coordinated genetic actions.

**Fig. 1: Overall results of US-driven and UK-driven meta-analyses.**

**Fig. 2: Regions near the 203 genome-wide-significant lead SNPs are enriched for enhancers preferentially active in cranial neural crest cells and embryonic craniofacial tissue.**

**Fig. 3: Activity of 203 genome-wide-significant lead SNPs in all cell types studied.**

**Fig. 4: *TBX15-WARS2* multi-peak locus.**

**Fig. 5: Phenotypic and marginal distributions for the rs62443772–rs76244841 epistatic pair.**

Data availability

All of the genotypic markers for the 3DFN dataset are available to the research community through the dbGaP controlled-access repository (http://www.ncbi.nlm.nih.gov/gap) at accession no. phs000949.v1.p1. The raw source data for the phenotypes—the 3D facial surface models in .obj format—are available through the FaceBase Consortium (https://www.facebase.org) at accession no. FB00000491.01. Access to these 3D facial surface models requires proper institutional ethics approval and approval from the FaceBase data access committee. Additional details can be requested from S.M.W.

The participants making up the PSU and IUPUI datasets were not collected with broad data sharing consent. Given the highly identifiable nature of both facial and genomic information and unresolved issues regarding risk to participants, we opted for a more conservative approach to participant recruitment. Broad data sharing of the raw data from these collections would thus be in legal and ethical violation of the informed consent obtained from the participants. This restriction is not because of any personal or commercial interests. Additional details can be requested from M.D.S. and S.W. for the PSU and IUPUI datasets, respectively.

The ALSPAC (UK) data will be made available to bona fide researchers on application to the ALSPAC Executive Committee (http://www.bris.ac.uk/alspac/researchers/data-access). Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees.

Publicly available data used were the 1000G Phase 3 data (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/), the list of HapMap 3 SNPs excluding the MHC region (http://ldsc.broadinstitute.org/static/media/w_hm3.noMHC.snplist.zip), and ChIP–seq files from Prescott et al.³⁹ (GSE70751), Najafova et al.⁸⁵ (GSE82295), Baumgart et al.⁸⁶ (GSE89179), Nott et al.⁸⁷ (https://genome.ucsc.edu/s/nottalexi/glassLab_BrainCellTypes_hg19), Pattison et al.⁸⁸ (GSE119997), Wilderman et al.⁴⁰ (GSE97752) and the Roadmap Epigenomics Project⁸⁹ (https://egg2.wustl.edu/roadmap/data/byFileType/alignments/consolidated/). Meta-analysis GWAS statistics are available on GWAS Catalog (GCP000044). All data relevant to run future replications and meta-analysis efforts are provided in the FigShare repository for this work³⁴, along with additional figures (https://doi.org/10.6084/m9.figshare.c.4667261). Items available in the FigShare repository are (1) anthropometric mask: a Matfile of the anthropometric mask used; (2) association statistics and effects of the 203 lead SNPs: facial effects, LocusZoom plots and association statistics from each stage of the analysis for the 203 lead SNPs; (3) calculation of study-wide-significance threshold: script and permutation outcomes needed to replicate the calculation of the study-wide-significance threshold; (4) facial segment assignments: segment assignments for each quasi-landmark in the anthropometric mask; (5) Fig. 2a labeled: a larger version of Fig. 2a, with all cell types and tissues labeled; (6) GREAT Export: raw output of the GREAT analysis; (7) PCA shape constructs: PCA shape spaces for all 63 facial segments; (8) QQ plots: QQ plots for each segment in all stages of the analysis; (9) script to explore facial segments and GWAS hits: MatLab script for select data exploration functions; (10) SNPs reaching suggestive significance in either meta-analysis track: association statistics of all SNPs with P < 5 × 10⁻⁷ in METAUS or METAUK tracks; (11) source data for manuscript figures: source data in Excel format for all figures, where possible.

Code availability

KU Leuven provides the MeshMonk (v.0.0.6) spatially dense facial-mapping software, free to use for academic purposes (https://github.com/TheWebMonks/meshmonk). Matlab 2017b implementations of the hierarchical spectral clustering to obtain facial segmentations are available from a previous publication²⁵ (https://doi.org/10.6084/m9.figshare.7649024).

The statistical analyses in this work were based on functions of the statistical toolbox in Matlab 2017b, SHAPEIT2 (v.2.r900), Sanger Imputation Server (v.0.0.6), PBWT pipeline (v.3.1), MeshMonk (v.0.0.6), LDSC (v.1.0.1), FUMA (v.1.3.3), GREAT (v.3.0.0), Plink v.1.9, lavaan (v.0.6-3), R (>v.3.4), agricolae (v.1.3-0), cowplot (v.1.0.0), ggplot2 (v.3.1.1), ggpubr (v.0.2), gridExtra (v.2.3), gtable (v.0.3.0), grid (v.3.6.2), Hmisc (v.4.2-0), psych (v.1.8.12), data.table (v.1.12.0), Genotype Harmonizer (v.1.4.20), KING (v.2.1.3), bowtie2 (v.2.3.4.2), bedtools (v.2.27.1) and Bioconductor (v.3.7), as mentioned throughout the Methods.

References

1. Atchley, W. R. & Hall, B. K. A model for development and evolution of complex morphological structures. Biol. Rev. 66, 101–157 (1991).
2. Gratten, J., Wray, N. R., Keller, M. C. & Visscher, P. M. Large-scale genomics unveils the genetic architecture of psychiatric disorders. Nat. Neurosci. 17, 782–790 (2014).
3. Timpson, N. J., Greenwood, C. M. T., Soranzo, N., Lawson, D. J. & Richards, J. B. Genetic architecture: the shape of the genetic contribution to human traits and disease. Nat. Rev. Genet. 19, 110–124 (2018).
4. Weinberg, S. M. et al. Hunting for genes that shape human faces: initial successes and challenges for the future. Orthod. Craniofac. Res. 22, 207–212 (2019).
5. Weinberg, S. M., Cornell, R. & Leslie, E. J. Craniofacial genetics: where have we been and where are we going? PLoS Genet. 14, e1007438 (2018).
6. Dixon, M. J., Marazita, M. L., Beaty, T. H. & Murray, J. C. Cleft lip and palate: understanding genetic and environmental influences. Nat. Rev. Genet. 12, 167–178 (2011).
7. Paternoster, L. et al. Genome-wide association study of three-dimensional facial morphology identifies a variant in PAX3 associated with nasion position. Am. J. Hum. Genet. 90, 478–485 (2012).
8. Liu, F. et al. A genome-wide association study identifies five loci influencing facial morphology in Europeans. PLoS Genet. 8, e1002932 (2012).
9. Jacobs, L. C. et al. Intrinsic and extrinsic risk factors for sagging eyelids. JAMA Dermatol. 150, 836–843 (2014).
10. Adhikari, K. et al. A genome-wide association scan implicates DCHS2, RUNX2, GLI3, PAX1 and EDAR in human facial variation. Nat. Commun. 7, 11616 (2016).
11. Pickrell, J. K. et al. Detection and interpretation of shared genetic influences on 42 human traits. Nat. Genet. 48, 709–717 (2016).
12. Shaffer, J. R. et al. Genome-wide association study reveals multiple loci influencing normal human facial morphology. PLoS Genet. 12, e1006149 (2016).
13. Cole, J. B. et al. Genome-wide association study of African children identifies association of SCHIP1 and PDE8A with facial size and shape. PLoS Genet. 12, e1006174 (2016).
14. Lee, M. K. et al. Genome-wide association study of facial morphology reveals novel associations with FREM1 and PARK2. PLoS One 12, e0176566 (2017).
15. Crouch, D. J. M. et al. Genetics of the human face: identification of large-effect single gene variants. Proc. Natl Acad. Sci. USA 115, E676–E685 (2018).
16. Claes, P. et al. Genome-wide mapping of global-to-local genetic effects on human facial shape. Nat. Genet. 50, 414–423 (2018).
17. Endo, C. et al. Genome-wide association study in Japanese females identifies fifteen novel skin-related trait associations. Sci. Rep. 8, 8974 (2018).
18. Cha, S. et al. Identification of five novel genetic loci related to facial morphology by genome-wide association studies. BMC Genomics 19, 481 (2018).
19. Howe, L. J. et al. Investigating the shared genetics of non-syndromic cleft lip/palate and facial morphology. PLoS Genet. 14, e1007501 (2018).
20. Qiao, L. et al. Genome-wide variants of Eurasian facial shape differentiation and a prospective model of DNA based face prediction. J. Genet. Genomics 45, 419–432 (2018).
21. Wu, W. et al. Whole-exome sequencing identified four loci influencing craniofacial morphology in northern Han Chinese. Hum. Genet. 138, 601–611 (2019).
22. Li, Y. et al. EDAR, LYPLAL1, PRDM16, PAX3, DKK1, TNFSF12, CACNA2D3 and SUPT3H gene variants influence facial morphology in a Eurasian population. Hum. Genet. 138, 681–689 (2019).
23. Xiong, Z. et al. Novel genetic loci affecting facial shape variation in humans. eLife 8, e49898 (2019).
24. White, J. D. et al. MeshMonk: open-source large-scale intensive 3D phenotyping. Sci. Rep. 9, 6085 (2019).
25. Sero, D. et al. Facial recognition from DNA using face-to-DNA classifiers. Nat. Commun. 10, 2557 (2019).
26. Hayton, J. C., Allen, D. G. & Scarpello, V. Factor retention decisions in exploratory factor analysis: a tutorial on parallel analysis. Organ. Res. Methods 7, 191–205 (2004).
27. Franklin, S. B., Gibson, D. J., Robertson, P. A., Pohlmann, J. T. & Fralish, J. S. Parallel analysis: a method for determining significant principal components. J. Veg. Sci. 6, 99–106 (1995).
28. Stouffer, S. A., Suchman, E. A., Devinney, L. C., Star, S. A. & Williams, R. M. Jr. The American Soldier: Adjustment During Army Life. Vol. 1 (Princeton Univ. Press, 1949).
29. Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
30. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
31. Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
32. Som, P. M., Streit, A. & Naidich, T. P. Illustrated review of the embryology and development of the facial region, part 3: an overview of the molecular interactions responsible for facial development. Am. J. Neuroradiol. 35, 223–229 (2014).
33. Pruim, R. J. et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 26, 2336–2337 (2010).
34. White, J. & Indencleef, K. Insights into the genetic architecture of the human face. FigShare https://doi.org/10.6084/m9.figshare.c.4667261 (2020).
35. McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010).
36. Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
37. Rada-Iglesias, A. et al. A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283 (2011).
38. Creyghton, M. P. et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl Acad. Sci. USA 107, 21931–21936 (2010).
39. Prescott, S. L. et al. Enhancer divergence and cis-regulatory evolution in the human and chimp neural crest. Cell 163, 68–83 (2015).
40. Wilderman, A., VanOudenhove, J., Kron, J., Noonan, J. P. & Cotney, J. High-resolution epigenomic atlas of human embryonic craniofacial development. Cell Rep. 23, 1581–1597 (2018).
41. Kraus, P. & Lufkin, T. Dlx homeobox gene control of mammalian limb and craniofacial development. Am. J. Med. Genet. A 140, 1366–1374 (2006).
42. Hennekam, R. C. M., Krantz, I. D. & Allanson, J. E. Gorlin’s Syndromes of the Head and Neck (Oxford Univ. Press, 2010).
43. Attanasio, C. et al. Fine tuning of craniofacial morphology by distant-acting enhancers. Science 342, 1241006 (2013).
44. Beaty, T. H. et al. Testing candidate genes for non-syndromic oral clefts using a case-parent trio design. Genet. Epidemiol. 22, 1–11 (2002).
45. Alappat, S., Zhang, Z. Y. & Chen, Y. P. Msx homeobox gene family and craniofacial development. Cell Res. 13, 429–442 (2003).
46. Satokata, I. & Maas, R. Msx1 deficient mice exhibit cleft palate and abnormalities of craniofacial and tooth development. Nat. Genet. 6, 348–356 (1994).
47. Nakatomi, M. et al. Genetic interactions between Pax9 and Msx1 regulate lip development and several stages of tooth morphogenesis. Dev. Biol. 340, 438–449 (2010).
48. Wang, J.-L. et al. TGF-β signaling regulates DACT1 expression in intestinal epithelial cells. Biomed. Pharmacother. 97, 864–869 (2018).
49. Rabadán, M. A. et al. Delamination of neural crest cells requires transient and reversible Wnt inhibition mediated by Dact1/2. Development 143, 2194–2205 (2016).
50. Stegman, M. A. et al. Identification of a tetrameric hedgehog signaling complex. J. Biol. Chem. 275, 21809–21812 (2000).
51. Méthot, N. & Basler, K. Suppressor of fused opposes hedgehog signal transduction by impeding nuclear accumulation of the activator form of Cubitus interruptus. Development 127, 4001–4010 (2000).
52. Monnier, V., Dussillol, F., Alves, G., Lamour-Isnard, C. & Plessis, A. Suppressor of fused links fused and Cubitus interruptus on the hedgehog signalling pathway. Curr. Biol. 8, 583–586 (1998).
53. Krzywinski, M. I. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).
54. Brown, G. W. & Mood, A. M. On median tests for linear hypotheses. In Proc. 2nd Berkeley Symposium on Mathematical Statistics and Probability (ed. Neyman, J.) 159–166 (Univ. of California Press, 1951).
55. Weinberg, S. M. et al. The 3D facial norms database: part 1. A web-based craniofacial anthropometric and image repository for the clinical and research community. Cleft Palate Craniofac. J. 53, e185–e197 (2016).
56. Boyd, A. et al. Cohort profile: the ‘children of the 90s’—the index offspring of the Avon longitudinal study of parents and children. Int. J. Epidemiol. 42, 111–127 (2013).
57. Fraser, A. et al. Cohort profile: the Avon longitudinal study of parents and children: ALSPAC mothers cohort. Int. J. Epidemiol. 42, 97–110 (2013).
58. Verma, S. S. et al. Imputation and quality control steps for combining multiple genome-wide datasets. Front. Genet. 5, 370 (2014).
59. Delaneau, O., Zagury, J.-F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013).
60. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
61. Durbin, R. Efficient haplotype matching and storage using the positional Burrows-Wheeler transform (PBWT). Bioinformatics 30, 1266–1272 (2014).
62. McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).
63. The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
64. Howie, B., Marchini, J. & Stephens, M. Genotype imputation with thousands of genomes. G3 Genes Genomics Genet. 1, 457–470 (2011).
65. Heike, C. L., Upson, K., Stuhaug, E. & Weinberg, S. M. 3D digital stereophotogrammetry: a practical guide to facial image acquisition. Head Face Med. 6, 18 (2010).
66. Robert, P. & Escoufier, Y. A unifying tool for linear multivariate statistical methods: the RV-coefficient. J. R. Stat. Soc. Ser. C. Appl. Stat. 25, 257–265 (1976).
67. Klingenberg, C. P. Morphometric integration and modularity in configurations of landmarks: tools for evaluating a priori hypotheses. Evol. Dev. 11, 405–421 (2009).
68. Rohlf, F. J. & Slice, D. Extensions of the Procrustes method for the optimal superimposition of landmarks. Syst. Biol. 39, 40–59 (1990).
69. Olson, C. L. On choosing a test statistic in multivariate analysis of variance. Psychol. Bull. 83, 579–586 (1976).
70. Ferreira, M. A. R. & Purcell, S. M. A multivariate test of association. Bioinformatics 25, 132–133 (2009).
71. Galesloot, T. E., van Steen, K., Kiemeney, L. A. L. M., Janss, L. L. & Vermeulen, S. H. A comparison of multivariate genome-wide association methods. PLoS One 9, e95923 (2014).
72. Porter, H. F. & O’Reilly, P. F. Multivariate simulation framework reveals performance of multi-trait GWAS methods. Sci. Rep. 7, 38837 (2017).
73. O’Reilly, P. F. et al. MultiPhen: joint model of multiple phenotypes can increase discovery in GWAS. PLoS One 7, e34861 (2012).
74. Korte, A. et al. A mixed-model approach for genome-wide association studies of correlated traits in structured populations. Nat. Genet. 44, 1066–1071 (2012).
75. Stephens, M. A unified framework for association analysis with multiple related phenotypes. PLoS One 8, e65245 (2013).
76. Zhou, X. & Stephens, M. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nat. Methods 11, 407–409 (2014).
77. Devroye, L. Non-uniform Random Variate Generation (Springer, 1986).
78. Zerbino, D. R. et al. Ensembl 2018. Nucleic Acids Res. 46, D754–D761 (2018).
79. Karolchik, D. et al. The UCSC table browser data retrieval tool. Nucleic Acids Res. 32, D493–D496 (2004).
80. Hooper, J. E. et al. Systems biology of facial development: contributions of ectoderm and mesenchyme. Dev. Biol. 426, 97–114 (2017).
81. Pers, T. H., Timshel, P. & Hirschhorn, J. N. SNPsnap: a Web-based tool for identification and annotation of matched SNPs. Bioinformatics 31, 418–420 (2015).
82. Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
83. Rosseel, Y. lavaan: an R package for structural equation modeling. J. Stat. Softw. 48, 1–36 (2012).
84. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, 7 (2015).
85. Najafova, Z. et al. BRD4 localization to lineage-specific enhancers is associated with a distinct transcription factor repertoire. Nucleic Acids Res. 45, 127–141 (2017).
86. Baumgart, S. J. et al. CHD1 regulates cell fate determination by activation of differentiation-induced genes. Nucleic Acids Res. 45, 7722–7735 (2017).
87. Nott, A. et al. Brain cell type-specific enhancer-promoter interactome maps and disease risk association. Science 366, 1134–1139 (2019).
88. Pattison, J. M. et al. Retinoic acid and BMP4 cooperate with TP63 to alter chromatin dynamics during surface epithelial commitment. Nat. Genet. 50, 1658–1665 (2018).
89. Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).

Acknowledgements

We are extremely grateful to all the individuals and families who took part in this study, the midwives for their help in recruiting them and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists and nurses. We are also very grateful to all of the US participants for generously donating their time to our research, and to present and former laboratory members who worked tirelessly to make these analyses possible. Pittsburgh personnel, data collection and analyses were supported by the National Institute of Dental and Craniofacial Research (U01-DE020078, program director/principal investigators (PD/PIs): M.L.M./S.M.W.; R01-DE016148, PD/PIs: M.L.M./S.M.W.; and R01-DE027023, PD/PIs: S.M.W./J.R.S.). Funding for genotyping was provided by the National Human Genome Research Institute (X01-HG007821 and X01-HG007485, PD/PI: M.L.M.), and funding for initial genomic data cleaning by the University of Washington was provided by contract HHSN268201200008I from the National Institute for Dental and Craniofacial Research awarded to the Center for Inherited Disease Research (https://www.cidr.jhmi.edu/). Penn State personnel, data collection and analyses were supported by Procter & Gamble, Company (UCRI-2015-1117-HN-532, PD/PIs: H.L.N.), the Center for Human Evolution and Development at Penn State, the Science Foundation of Ireland Walton Fellowship (04.W4/B643, PD/PI: M.D.S.), the US National Institute of Justice (2008-DN-BX-K125, PD/PI: M.D.S.; and 2018-DU-BX-0219, PD/PIs: S.W.) and by the US Department of Defense. IUPUI personnel, data collection and analyses were supported by the National Institute of Justice (2015-R2-CX-0023, 2014-DN-BX-K031 and 2018-DU-BX-0219, PD/PI: S.W.). University of Cincinnati personnel and data collection were supported by Procter & Gamble, Company (UCRI-2015-1117-HN-532, PD/PI: H.L.N.). The UK Medical Research Council and Wellcome (grant no. 102215/2/13/2) and the University of Bristol provide core support for ALSPAC. The publication is the work of the authors and K.I. and P.C. will serve as guarantors for the contents of this paper. A comprehensive list of grant funding is available on the ALSPAC website (http://www.bristol.ac.uk/alspac/external/documents/grant-acknowledgements.pdf). ALSPAC GWAS data was generated by Sample Logistics and Genotyping Facilities at Wellcome Sanger Institute and LabCorp (Laboratory Corporation of America) using support from 23andMe. The KU Leuven research team and analyses were supported by the National Institute of Dental and Craniofacial Research (R01-DE027023, PD/PIs: S.M.W./J.R.S.), The Research Fund KU Leuven (BOF-C1, C14/15/081 and C14/20/081, PD/PI: P.C.), The Research Program of the Research Foundation—Flanders (FWO, G078518N, PD/PI: P.C.) and a Senior Clinical Investigator Fellowship of The Research Foundation—Flanders (G078714N, PD/PI: G.H.). Stanford University personnel and analyses were supported by the National Institute of Dental and Craniofacial Research (R01-DE027023, PD/PIs: S.M.W./J.R.S.; and U01-DE024430, PD/PIs: J.W./L. Selleri), the Howard Hughes Medical Institute and the March of Dimes Foundation (1-FY15-312, PD/PI: J.W.).

Author information

These authors contributed equally: Julie D. White, Karlijne Indencleef.

Authors and Affiliations

Department of Anthropology, Pennsylvania State University, State College, PA, USA
Julie D. White & Mark D. Shriver
Department of Electrical Engineering, ESAT/PSI, KU Leuven, Leuven, Belgium
Karlijne Indencleef, Jiarui Li & Peter Claes
Medical Imaging Research Center, UZ Leuven, Leuven, Belgium
Karlijne Indencleef, Hanne Hoskens, Jiarui Li & Peter Claes
Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA, USA
Sahin Naqvi, Jaaved Mohammed, Tomek Swigut & Joanna Wysocka
Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
Sahin Naqvi
Department of Biology, Indiana University Purdue University Indianapolis, Indianapolis, IN, USA
Ryan J. Eller & Susan Walsh
Department of Human Genetics, KU Leuven, Leuven, Belgium
Hanne Hoskens, Hilde Peeters & Peter Claes
Department of Oral Biology, Center for Craniofacial and Dental Genetics, University of Pittsburgh, Pittsburgh, PA, USA
Jasmien Roosenboom, Myoung Keun Lee, Mary L. Marazita, John R. Shaffer & Seth M. Weinberg
Applied Clinical Research and Public Health, School of Dentistry, Cardiff University, Cardiff, UK
Stephen Richmond
Department of Internal Medicine, Section of Molecular Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA
Ellen E. Quillen
Center for Precision Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA
Ellen E. Quillen
Department of Anthropology, University of Cincinnati, Cincinnati, OH, USA
Heather L. Norton
Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, USA
Eleanor Feingold, Mary L. Marazita, John R. Shaffer & Seth M. Weinberg
Department of Neurosciences, Experimental Oto-Rhino-Laryngology, KU Leuven, Leuven, Belgium
Greet Hens
Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA
Joanna Wysocka
Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA, USA
Joanna Wysocka
Department of Anthropology, University of Pittsburgh, Pittsburgh, PA, USA
Seth M. Weinberg
Murdoch Children’s Research Institute, Melbourne, Victoria, Australia
Peter Claes

Contributions

P.C., M.D.S., S.M.W., J.R.S., J.W. and S.W. conceptualized the study (ideas; formulation or evolution of overarching research goals and aims). J.D.W., K.I., R.J.E., M.K.L., J.L., S.W. and P.C. carried out the data curation (management activities to annotate (produce metadata), scrub data and maintain research data for initial use and later re-use). J.D.W., K.I., S.N., R.J.E., H.H., J.R., J.L. and P.C. carried out the formal analysis (application of statistical, mathematical, computational or other formal techniques to analyze or synthesize study data). S.R., H.L.N., E.F., T.S., M.L.M., J.R.S., J.W., S.W., S.M.W., M.D.S. and P.C. were responsible for funding acquisition (acquisition of the financial support for the project leading to this publication). J.D.W., K.I., S.N., R.J.E., H.H., J.R., M.K.L., J.L. and P.C. carried out the investigation (conducting a research and investigation process, specifically performing the experiments or data/evidence collection). J.D.W., S.N., R.J.E., J.M., S.R., E.E.Q., H.L.N., T.S., M.L.M., J.W., S.W., S.M.W. and M.D.S. provided the resources (provision of study materials, computing resources or other analysis tools). P.C., S.M.W., M.D.S., S.W., J.W., J.R.S., M.L.M., T.S., H.P. and G.H. carried out the supervision (oversight and leadership responsibility for the research activity planning and execution, including mentorship external to the core team). J.D.W., K.I., S.N., R.J.E., H.H., J.R., M.K.L. and P.C. did the visualization (preparation, creation and/or presentation of the published work, specifically visualization/data presentation). J.D.W., K.I., S.N., R.J.E. and J.R. wrote the original draft. J.D.W., K.I., S.N., R.J.E., H.H., J.R., S.R., E.E.Q., M.L.M., H.P., J.R.S., J.W., S.W., S.M.W., M.D.S. and P.C. reviewed and edited the final manuscript.

Corresponding authors

Correspondence to Julie D. White, Karlijne Indencleef or Peter Claes.

Ethics declarations

Competing interests

H.L.N. has received $6,000 in consulting fees from Procter & Gamble, Company. Procter & Gamble, Company had no role in the conceptualization, design, data analysis, decision to publish or preparation of this manuscript. All other authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Hierarchical spectral clustering of facial shape.

a, Global-to-local facial segmentation of all 3D images (n_Total = 8,246), obtained using hierarchical spectral clustering. Segments are colored in teal and identical to those in Fig. 1. Roman numerals represent ‘quadrants’ of facial segments. b, The number of principal components retained after parallel analysis for each facial segment.

Extended Data Fig. 2 Study design.

Sample Wrangling: Images and genotypes from each study were intersected, and unrelated participants of European ancestry with quality-controlled images, covariates and imputed genetic data were selected to obtain the analyzed data. Identification: For each facial segment, canonical correlation analysis (CCA) and Rao’s F-test approximation were used to identify the multivariate combination of facial principal components most correlated with the genotypes, which led to a P value (P_CCA-US or P_CCA-UK) and the multivariate phenotypic trait most correlated with each SNP (Trait_US and Trait_UK). Verification: The principal components of the other dataset were then projected onto this trait to obtain a univariate variable representing the distribution of participants from the verification dataset for the trait identified in the identification dataset (UniVar_UK and UniVar_US). The genotypes of the verification dataset were then tested against this variable via linear regression, resulting in an additional P value (P_UniVar-UK and P_UniVar-US). Meta-Analysis: The P values from identification and verification were meta-analyzed using Stouffer’s method, resulting in the final set of P values from each meta-analysis track (P_META-US and P_META-UK).
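The meta-analysis step can be illustrated with a minimal sketch of Stouffer's Z-score method for combining two one-sided P values, such as an identification P value and a verification P value. The function name `stouffer_combine`, the equal weighting of the two inputs and the example values are assumptions made for illustration; the caption does not specify the weighting scheme, and this is not the authors' code.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def stouffer_combine(p_values):
    """Combine one-sided (right-tailed) P values with unweighted Stouffer's method.

    Equal weights are an assumption of this sketch.
    """
    # Map each P value to a Z-score; a small P gives a large positive Z.
    z_scores = [-_STD_NORMAL.inv_cdf(p) for p in p_values]
    # The sum of k independent standard normal Z-scores has variance k.
    combined_z = sum(z_scores) / sqrt(len(z_scores))
    # Right-tail probability of the combined Z-score.
    return _STD_NORMAL.cdf(-combined_z)


# Hypothetical identification and verification P values for one SNP.
p_meta = stouffer_combine([0.05, 0.05])
print(p_meta)
```

Two concordant pieces of moderate evidence combine into a smaller P value than either alone, which is the behavior the two-stage identification and verification design relies on.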

Extended Data Fig. 3 Genomic signal correlations.

LDSC correlations between segments. a, Correlations between segments from different quadrants, ranging from 0.8 to 0.88, which seem to reflect both physical proximity of segments on the face and shared embryological origins. b, Correlations ranging from 0.88 to 1, which are mostly between segments within the same facial quadrant.

Extended Data Fig. 4 Clustering of facial segments on the basis of shared genetic signals.

Correlations between facial segments on the basis of SNP P values were calculated using LDSC, as described in Methods, and average-linkage hierarchical clustering was performed using the matrix of correlation values. Quadrant colors in the legend refer to the quadrant of the polar dendrogram in which the facial segment lies, also represented by the facial images at the top, and embryonic facial prominences are assigned to each facial segment.

Extended Data Fig. 5 GREAT and FUMA analyses showing enrichment for craniofacial and limb development.

a, GREAT analysis. For the top ten GO terms in each category, plotted is the binomial test Bonferroni-corrected P value (red; negative values) and binomial region fold enrichment (blue; positive values). After every GO term, we indicate in parentheses the number of genes in the test set with the annotation (Observed) and the total number of genes in the genome with the annotation (Total), with the format (Observed/Total). Dashed line represents significance at log₁₀(P) = log₁₀(0.05) ≈ -1.3. b, FUMA analysis, indicating the KEGG pathways that were significantly enriched in our results. Multiple pathways are relevant for craniofacial development. The right panel shows the genes that are involved in the pathways.

Extended Data Fig. 6 H3K27ac signal is significantly different in 203 lead vs. 203 random SNPs for relevant facial tissues.

For all cell types and tissues, each represented by a point above, the median difference in H3K27ac RPM signal between the 203 lead SNPs and 203 random SNPs was tested for significance using a two-sided Wilcoxon rank-sum test. The thin dashed line represents the 5% false discovery rate P value of 0.0094, using the Benjamini–Hochberg method. Relative to the random, MAF-matched SNPs, the lead SNPs are significantly enriched for H3K27ac signal in many cell types, with the highest magnitude differences being from CNCCs (blue) and embryonic craniofacial tissues (orange). Test statistics used to create this plot are available in Supplementary Table 4.
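As a rough illustration of how a false discovery rate cutoff of this kind is derived, the sketch below applies the Benjamini–Hochberg step-up rule to a list of P values and returns the largest P value still declared significant. The function name `bh_threshold` and the example P values are hypothetical; the actual per-sample test statistics are in Supplementary Table 4.

```python
def bh_threshold(p_values, fdr=0.05):
    """Return the largest P value passing the Benjamini-Hochberg step-up rule.

    Returns 0.0 when no P value passes at the requested FDR.
    """
    m = len(p_values)
    threshold = 0.0
    for rank, p in enumerate(sorted(p_values), start=1):
        # The P value of rank k is compared with (k / m) * FDR.
        if p <= rank / m * fdr:
            # Keep the largest passing P value; all smaller ones pass as well.
            threshold = p
    return threshold


# Hypothetical P values, one per cell type or tissue.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(bh_threshold(example))
```

Every test with a P value at or below the returned cutoff is called significant at the chosen FDR, which is how a single dashed line can summarize the correction across all samples.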

Extended Data Fig. 7 Correlation of H3K27ac activity among SEM models.

a, For all segments (aka ‘masks’), we compared the H3K27ac activity for significant SNPs from the refined SEM model for variation in that facial segment. Plotted is the Spearman’s rho correlation between pairs of SNPs significant in the same SEM model (‘Within Mask’); pairs of SNPs where one is from the SEM model and the other is not (‘Within To Out’), and where both SNPs in the pair are from a different SEM model (‘Out To Out’). Segments where the distribution of correlation across all cell types was significantly different (Benjamini–Hochberg adjusted P < 0.05) based on a two-sided Kruskal–Wallis test are indicated in black. b, For all cell types, the median correlation across all segments is plotted for each of the three SNP groupings. Significance between the means was determined using a two-sided Kruskal–Wallis test. Boxplots plot the first and third quartiles, with a dark black line representing the median. Whiskers extend to the largest and smallest values no further than 1.5 × the inter-quartile range from the first and third quartiles, respectively.

Extended Data Fig. 8 Phenotypic and marginal distributions for diplotype combinations.

For a random SNP pairing (a) and each significant epistasis pair (b–d), boxplots are plotted to visualize the epistatic effect on the phenotype. The marginal phenotypic medians of the singular genotypes (non-shaded boxplots) were used to calculate and visualize the predicted diplotype phenotypic distribution that would occur if the two genotypes were acting alone. The median phenotype was also calculated for each diplotype as the average of the marginal medians of the singular genotypes (blue dashed lines on the colored plots). This median was compared to the observed medians of the diplotypes (solid black lines; colored boxplots) via Mood’s Median test with one degree of freedom. Negative log₁₀-transformed P values were used to color boxplots if there was a significant (P < 0.05; -log₁₀(P) > 1.30) difference between the expected phenotype of the combined genotype and observed diplotype. Boxplots plot the first and third quartiles, with a dark black line representing the median. Whiskers extend to the largest and smallest values no further than 1.5 × the inter-quartile range from the first and third quartiles, respectively.
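The expected diplotype median described above (the average of the two marginal genotype medians) can be sketched as follows. The function name, the 0/1/2 genotype coding and the toy data are assumptions for illustration only; the comparison with observed medians via Mood's median test is not shown.

```python
from statistics import median


def expected_diplotype_medians(phenotype, genotype_a, genotype_b):
    """Expected median per diplotype if the two SNPs acted independently.

    Each expected value is the average of the marginal median for the
    genotype at SNP A and the marginal median for the genotype at SNP B.
    """
    by_a, by_b = {}, {}
    for y, a, b in zip(phenotype, genotype_a, genotype_b):
        by_a.setdefault(a, []).append(y)
        by_b.setdefault(b, []).append(y)
    # Marginal medians: phenotype grouped by one SNP, ignoring the other.
    med_a = {a: median(ys) for a, ys in by_a.items()}
    med_b = {b: median(ys) for b, ys in by_b.items()}
    return {(a, b): (med_a[a] + med_b[b]) / 2 for a in med_a for b in med_b}


# Toy data: four participants, genotypes coded as minor-allele counts.
expected = expected_diplotype_medians([1, 2, 3, 4], [0, 0, 1, 1], [0, 1, 0, 1])
print(expected[(0, 0)], expected[(1, 1)])
```

A diplotype whose observed median departs from this expectation is the kind of deviation the colored boxplots are meant to highlight.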

Extended Data Fig. 9 MSX1 and DACT1 loci.

LocusZoom plots for the two association signals nearby MSX1 (a), which has previously been implicated in orofacial clefting in humans and mice, and DACT1 (f), which is a novel result. Points represent one-sided -log₁₀(P) of the META_UK meta-analysis track for the facial segment illustrated in the normal displacement figures (b, d, g) and are colored based on linkage disequilibrium with the labeled SNP. Asterisks indicate genotyped SNPs and circles indicate imputed SNPs. Facial effects for the two association signals nearby MSX1: rs3910659 (b) and rs13117653 (d) and the signal nearby DACT1: rs10047930 (g). Effects are the normal displacement (displacement in the direction locally normal to the facial surface) in each quasi landmark of the lowest facial segment reaching genome-wide significance in META_UK, going from the minor to the major allele. Blue indicates inward depression; red indicates outward protrusion. Yellow rosette plots depict the -log₁₀(P) of the meta-analysis P value (one-sided, right-tailed) per facial segment in META_UK track. Black-encircled facial segments have reached genome-wide significance (P = 5 × 10⁻⁸). (c) rs3910659; (e) rs13117653; (h) rs10047930.

Extended Data Fig. 10 Regions nearby previously published SNPs associated with risk for Crohn’s disease are preferentially active in immune cells and tissues.

Each boxplot represents the distribution of H3K27ac signal in 20 kb regions around 619 Crohn’s disease-associated SNPs from the NCBI-EBI GWAS catalog in one sample. See Methods for details on calculation of H3K27ac signal. Samples corresponding to immune cells and tissues are highlighted in red. Thin dashed line at ~2.9 is the median level of signal across all cell types and tissues. Boxplots plot the first and third quartiles, with a dark black line representing the median. Whiskers extend to the largest and smallest values no further than 1.5 × the inter-quartile range from the first and third quartiles, respectively.
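The whisker rule used in these boxplots can be made concrete with a short sketch. The function name `boxplot_whiskers` and the quartile convention (linear interpolation, Python's "inclusive" method) are assumptions; plotting libraries differ slightly in how they compute quartiles.

```python
from statistics import quantiles


def boxplot_whiskers(values, k=1.5):
    """Return (lower, upper) whisker ends under the 1.5 x IQR rule."""
    data = sorted(values)
    # Quartile convention is an assumption of this sketch.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    lower = min(v for v in data if v >= lower_fence)
    upper = max(v for v in data if v <= upper_fence)
    return lower, upper


# The value 100 lies beyond the upper fence and would be drawn as an outlier.
print(boxplot_whiskers([1, 2, 3, 4, 5, 100]))
```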

Supplementary information

Supplementary Information

Supplementary Notes 1–3, Methods, Figs. 1 and 2, and Data 1 and 2.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–5.

Supplementary Data 1

For each of the 24 multi-peak loci (listed in Supplementary Table 5): (A) -log₁₀(P) of the meta-analysis one-sided, right-tailed P value per facial segment in META_US and META_UK tracks. Black-encircled facial segments have reached genome-wide significance (P = 5 × 10⁻⁸). (B) The normal displacement (displacement in the direction locally normal to the facial surface) in each quasi-landmark of the facial segment reaching the lowest P value in META_US and META_UK, going from the minor to the major allele. Blue indicates inward depression; red indicates outward protrusion. (C) LocusZoom plots in META_US (top) and META_UK (bottom), for the segment in which the SNP had its lowest P value (one-sided). Points are colored based on linkage disequilibrium (r²) in the 1000 Genomes Phase 3 EUR population. Asterisks represent genotyped SNPs and circles represent imputed SNPs.

Supplementary Data 2

For each of the 50 segments with well-fitting SEM models, in this table we provide the number of PCs included to represent shape variation in that segment, the number of SNPs that survived the model refinement process (see Methods), the P value cutoff used to perform the model refinement and determine the SNPs to be used for epistasis, the number of SNPs used in the epistasis analysis for this segment and values for the 𝛘², CFI, RMSE and SRMR model fit indices, which were used to evaluate the models for our analysis. We also include the TLI and GFI model fit indices for completeness. This table also contains internal links to separate tabs, where, for each surviving model, we have listed the parameters used and the estimate, standard error, z-score, two-sided P value and 95% CIs. SNPs which were selected for epistasis testing are highlighted in green.

About this article

Cite this article

White, J.D., Indencleef, K., Naqvi, S. et al. Insights into the genetic architecture of the human face. Nat Genet 53, 45–53 (2021). https://doi.org/10.1038/s41588-020-00741-7

Received: 10 October 2019
Accepted: 23 October 2020
Published: 07 December 2020
Issue Date: January 2021
DOI: https://doi.org/10.1038/s41588-020-00741-7
