Insights into the genetic architecture of the human face

Abstract

The human face is complex and multipartite, and characterization of its genetic architecture remains challenging. Using a multivariate genome-wide association study meta-analysis of 8,246 European individuals, we identified 203 genome-wide-significant signals (120 also study-wide significant) associated with normal-range facial variation. Follow-up analyses indicate that the regions surrounding these signals are enriched for enhancer activity in cranial neural crest cells and craniofacial tissues, several regions harbor multiple signals with associations to different facial phenotypes, and there is evidence for potential coordinated actions of variants. In summary, our analyses provide insights into the understanding of how complex morphological traits are shaped by both individual and coordinated genetic actions.

Fig. 1: Overall results of US-driven and UK-driven meta-analyses.
Fig. 2: Regions near the 203 genome-wide-significant lead SNPs are enriched for enhancers preferentially active in cranial neural crest cells and embryonic craniofacial tissue.
Fig. 3: Activity of 203 genome-wide-significant lead SNPs in all cell types studied.
Fig. 4: TBX15-WARS2 multi-peak locus.
Fig. 5: Phenotypic and marginal distributions for the rs62443772–rs76244841 epistatic pair.

Data availability

All of the genotypic markers for the 3DFN dataset are available to the research community through the dbGaP controlled-access repository (http://www.ncbi.nlm.nih.gov/gap) at accession no. phs000949.v1.p1. The raw source data for the phenotypes (the 3D facial surface models in .obj format) are available through the FaceBase Consortium (https://www.facebase.org) at accession no. FB00000491.01. Access to these 3D facial surface models requires proper institutional ethics approval and approval from the FaceBase data access committee. Additional details can be requested from S.M.W.

The participants making up the PSU and IUPUI datasets were not collected with broad data sharing consent. Given the highly identifiable nature of both facial and genomic information and unresolved issues regarding risk to participants, we opted for a more conservative approach to participant recruitment. Broad data sharing of the raw data from these collections would thus be in legal and ethical violation of the informed consent obtained from the participants. This restriction is not because of any personal or commercial interests. Additional details can be requested from M.D.S. and S.W. for the PSU and IUPUI datasets, respectively.

The ALSPAC (UK) data will be made available to bona fide researchers on application to the ALSPAC Executive Committee (http://www.bris.ac.uk/alspac/researchers/data-access). Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees.

Publicly available data used were the 1000G Phase 3 data (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/), the list of HapMap 3 SNPs excluding the MHC region (http://ldsc.broadinstitute.org/static/media/w_hm3.noMHC.snplist.zip), and ChIP–seq files from Prescott et al.39 (GSE70751), Najafova et al.85 (GSE82295), Baumgart et al.86 (GSE89179), Nott et al.87 (https://genome.ucsc.edu/s/nottalexi/glassLab_BrainCellTypes_hg19), Pattison et al.88 (GSE119997), Wilderman et al.40 (GSE97752) and the Roadmap Epigenomics Project89 (https://egg2.wustl.edu/roadmap/data/byFileType/alignments/consolidated/). Meta-analysis GWAS statistics are available on GWAS Catalog (GCP000044). All data relevant to run future replications and meta-analysis efforts are provided in the FigShare repository for this work34, along with additional figures (https://doi.org/10.6084/m9.figshare.c.4667261). Items available in the FigShare repository are (1) anthropometric mask: a Matfile of the anthropometric mask used; (2) association statistics and effects of the 203 lead SNPs: facial effects, LocusZoom plots and association statistics from each stage of the analysis for the 203 lead SNPs; (3) calculation of study-wide-significance threshold: script and permutation outcomes needed to replicate the calculation of the study-wide-significance threshold; (4) facial segment assignments: segment assignments for each quasi-landmark in the anthropometric mask; (5) Fig. 2a labeled: a larger version of Fig. 2a, with all cell types and tissues labeled; (6) GREAT Export: raw output of the GREAT analysis; (7) PCA shape constructs: PCA shape spaces for all 63 facial segments; (8) QQ plots: QQ plots for each segment in all stages of the analysis; (9) script to explore facial segments and GWAS hits: MatLab script for select data exploration functions; (10) SNPs reaching suggestive significance in either meta-analysis track: association statistics of all SNPs with P < 5 × 10⁻⁷ in METAUS or METAUK tracks; (11) source data for manuscript figures: source data in Excel format for all figures, where possible.

Code availability

KU Leuven provides the MeshMonk (v.0.0.6) spatially dense facial-mapping software, free to use for academic purposes (https://github.com/TheWebMonks/meshmonk). Matlab 2017b implementations of the hierarchical spectral clustering to obtain facial segmentations are available from a previous publication25 (https://doi.org/10.6084/m9.figshare.7649024).

The statistical analyses in this work were based on functions of the statistical toolbox in Matlab 2017b, SHAPEIT2 (v.2.r900), Sanger Imputation Server (v.0.0.6), PBWT pipeline (v.3.1), MeshMonk (v.0.0.6), LDSC (v.1.0.1), FUMA (v.1.3.3), GREAT (v.3.0.0), Plink v.1.9, lavaan (v.0.6-3), R (>v.3.4), agricolae (v.1.3-0), cowplot (v.1.0.0), ggplot2 (v.3.1.1), ggpubr (v.0.2), gridExtra (v.2.3), gtable (v.0.3.0), grid (v.3.6.2), Hmisc (v.4.2-0), psych (v.1.8.12), data.table (v.1.12.0), Genotype Harmonizer (v.1.4.20), KING (v.2.1.3), bowtie2 (v.2.3.4.2), bedtools (v.2.27.1) and Bioconductor (v.3.7), as mentioned throughout the Methods.

References

  1. Atchley, W. R. & Hall, B. K. A model for development and evolution of complex morphological structures. Biol. Rev. 66, 101–157 (1991).

  2. Gratten, J., Wray, N. R., Keller, M. C. & Visscher, P. M. Large-scale genomics unveils the genetic architecture of psychiatric disorders. Nat. Neurosci. 17, 782–790 (2014).

  3. Timpson, N. J., Greenwood, C. M. T., Soranzo, N., Lawson, D. J. & Richards, J. B. Genetic architecture: the shape of the genetic contribution to human traits and disease. Nat. Rev. Genet. 19, 110–124 (2018).

  4. Weinberg, S. M. et al. Hunting for genes that shape human faces: initial successes and challenges for the future. Orthod. Craniofac. Res. 22, 207–212 (2019).

  5. Weinberg, S. M., Cornell, R. & Leslie, E. J. Craniofacial genetics: where have we been and where are we going? PLoS Genet. 14, e1007438 (2018).

  6. Dixon, M. J., Marazita, M. L., Beaty, T. H. & Murray, J. C. Cleft lip and palate: understanding genetic and environmental influences. Nat. Rev. Genet. 12, 167–178 (2011).

  7. Paternoster, L. et al. Genome-wide association study of three-dimensional facial morphology identifies a variant in PAX3 associated with nasion position. Am. J. Hum. Genet. 90, 478–485 (2012).

  8. Liu, F. et al. A genome-wide association study identifies five loci influencing facial morphology in Europeans. PLoS Genet. 8, e1002932 (2012).

  9. Jacobs, L. C. et al. Intrinsic and extrinsic risk factors for sagging eyelids. JAMA Dermatol. 150, 836–843 (2014).

  10. Adhikari, K. et al. A genome-wide association scan implicates DCHS2, RUNX2, GLI3, PAX1 and EDAR in human facial variation. Nat. Commun. 7, 11616 (2016).

  11. Pickrell, J. K. et al. Detection and interpretation of shared genetic influences on 42 human traits. Nat. Genet. 48, 709–717 (2016).

  12. Shaffer, J. R. et al. Genome-wide association study reveals multiple loci influencing normal human facial morphology. PLoS Genet. 12, e1006149 (2016).

  13. Cole, J. B. et al. Genome-wide association study of African children identifies association of SCHIP1 and PDE8A with facial size and shape. PLoS Genet. 12, e1006174 (2016).

  14. Lee, M. K. et al. Genome-wide association study of facial morphology reveals novel associations with FREM1 and PARK2. PLoS One 12, e0176566 (2017).

  15. Crouch, D. J. M. et al. Genetics of the human face: identification of large-effect single gene variants. Proc. Natl Acad. Sci. USA 115, E676–E685 (2018).

  16. Claes, P. et al. Genome-wide mapping of global-to-local genetic effects on human facial shape. Nat. Genet. 50, 414–423 (2018).

  17. Endo, C. et al. Genome-wide association study in Japanese females identifies fifteen novel skin-related trait associations. Sci. Rep. 8, 8974 (2018).

  18. Cha, S. et al. Identification of five novel genetic loci related to facial morphology by genome-wide association studies. BMC Genomics 19, 481 (2018).

  19. Howe, L. J. et al. Investigating the shared genetics of non-syndromic cleft lip/palate and facial morphology. PLoS Genet. 14, e1007501 (2018).

  20. Qiao, L. et al. Genome-wide variants of Eurasian facial shape differentiation and a prospective model of DNA based face prediction. J. Genet. Genomics 45, 419–432 (2018).

  21. Wu, W. et al. Whole-exome sequencing identified four loci influencing craniofacial morphology in northern Han Chinese. Hum. Genet. 138, 601–611 (2019).

  22. Li, Y. et al. EDAR, LYPLAL1, PRDM16, PAX3, DKK1, TNFSF12, CACNA2D3 and SUPT3H gene variants influence facial morphology in a Eurasian population. Hum. Genet. 138, 681–689 (2019).

  23. Xiong, Z. et al. Novel genetic loci affecting facial shape variation in humans. eLife 8, e49898 (2019).

  24. White, J. D. et al. MeshMonk: open-source large-scale intensive 3D phenotyping. Sci. Rep. 9, 6085 (2019).

  25. Sero, D. et al. Facial recognition from DNA using face-to-DNA classifiers. Nat. Commun. 10, 2557 (2019).

  26. Hayton, J. C., Allen, D. G. & Scarpello, V. Factor retention decisions in exploratory factor analysis: a tutorial on parallel analysis. Organ. Res. Methods 7, 191–205 (2004).

  27. Franklin, S. B., Gibson, D. J., Robertson, P. A., Pohlmann, J. T. & Fralish, J. S. Parallel analysis: a method for determining significant principal components. J. Veg. Sci. 6, 99–106 (1995).

  28. Stouffer, S. A., Suchman, E. A., Devinney, L. C., Star, S. A. & Williams, R. M. Jr. The American Soldier: Adjustment During Army Life. Vol. 1 (Princeton Univ. Press, 1949).

  29. Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).

  30. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).

  31. Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).

  32. Som, P. M., Streit, A. & Naidich, T. P. Illustrated review of the embryology and development of the facial region, part 3: an overview of the molecular interactions responsible for facial development. Am. J. Neuroradiol. 35, 223–229 (2014).

  33. Pruim, R. J. et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 26, 2336–2337 (2010).

  34. White, J. & Indencleef, K. Insights into the genetic architecture of the human face. FigShare https://doi.org/10.6084/m9.figshare.c.4667261 (2020).

  35. McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010).

  36. Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).

  37. Rada-Iglesias, A. et al. A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283 (2011).

  38. Creyghton, M. P. et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl Acad. Sci. USA 107, 21931–21936 (2010).

  39. Prescott, S. L. et al. Enhancer divergence and cis-regulatory evolution in the human and chimp neural crest. Cell 163, 68–83 (2015).

  40. Wilderman, A., VanOudenhove, J., Kron, J., Noonan, J. P. & Cotney, J. High-resolution epigenomic atlas of human embryonic craniofacial development. Cell Rep. 23, 1581–1597 (2018).

  41. Kraus, P. & Lufkin, T. Dlx homeobox gene control of mammalian limb and craniofacial development. Am. J. Med. Genet. A 140, 1366–1374 (2006).

  42. Hennekam, R. C. M., Krantz, I. D. & Allanson, J. E. Gorlin’s Syndromes of the Head and Neck (Oxford Univ. Press, 2010).

  43. Attanasio, C. et al. Fine tuning of craniofacial morphology by distant-acting enhancers. Science 342, 1241006 (2013).

  44. Beaty, T. H. et al. Testing candidate genes for non-syndromic oral clefts using a case-parent trio design. Genet. Epidemiol. 22, 1–11 (2002).

  45. Alappat, S., Zhang, Z. Y. & Chen, Y. P. Msx homeobox gene family and craniofacial development. Cell Res. 13, 429–442 (2003).

  46. Satokata, I. & Maas, R. Msx1 deficient mice exhibit cleft palate and abnormalities of craniofacial and tooth development. Nat. Genet. 6, 348–356 (1994).

  47. Nakatomi, M. et al. Genetic interactions between Pax9 and Msx1 regulate lip development and several stages of tooth morphogenesis. Dev. Biol. 340, 438–449 (2010).

  48. Wang, J.-L. et al. TGF-β signaling regulates DACT1 expression in intestinal epithelial cells. Biomed. Pharmacother. 97, 864–869 (2018).

  49. Rabadán, M. A. et al. Delamination of neural crest cells requires transient and reversible Wnt inhibition mediated by Dact1/2. Development 143, 2194–2205 (2016).

  50. Stegman, M. A. et al. Identification of a tetrameric hedgehog signaling complex. J. Biol. Chem. 275, 21809–21812 (2000).

  51. Méthot, N. & Basler, K. Suppressor of fused opposes hedgehog signal transduction by impeding nuclear accumulation of the activator form of Cubitus interruptus. Development 127, 4001–4010 (2000).

  52. Monnier, V., Dussillol, F., Alves, G., Lamour-Isnard, C. & Plessis, A. Suppressor of fused links fused and Cubitus interruptus on the hedgehog signalling pathway. Curr. Biol. 8, 583–586 (1998).

  53. Krzywinski, M. I. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).

  54. Brown, G. W. & Mood, A. M. On median tests for linear hypotheses. In Proc. 2nd Berkeley Symposium on Mathematical Statistics and Probability (ed. Neyman, J.) 159–166 (Univ. of California Press, 1951).

  55. Weinberg, S. M. et al. The 3D facial norms database: part 1. A web-based craniofacial anthropometric and image repository for the clinical and research community. Cleft Palate Craniofac. J. 53, e185–e197 (2016).

  56. Boyd, A. et al. Cohort profile: the ‘children of the 90s’—the index offspring of the Avon longitudinal study of parents and children. Int. J. Epidemiol. 42, 111–127 (2013).

  57. Fraser, A. et al. Cohort profile: the Avon longitudinal study of parents and children: ALSPAC mothers cohort. Int. J. Epidemiol. 42, 97–110 (2013).

  58. Verma, S. S. et al. Imputation and quality control steps for combining multiple genome-wide datasets. Front. Genet. 5, 370 (2014).

  59. Delaneau, O., Zagury, J.-F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013).

  60. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  61. Durbin, R. Efficient haplotype matching and storage using the positional Burrows-Wheeler transform (PBWT). Bioinformatics 30, 1266–1272 (2014).

  62. McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).

  63. 1000 Genomes Project Consortium. et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).

  64. Howie, B., Marchini, J. & Stephens, M. Genotype imputation with thousands of genomes. G3 Genes Genomics Genet. 1, 457–470 (2011).

  65. Heike, C. L., Upson, K., Stuhaug, E. & Weinberg, S. M. 3D digital stereophotogrammetry: a practical guide to facial image acquisition. Head. Face Med. 6, 18 (2010).

  66. Robert, P. & Escoufier, Y. A unifying tool for linear multivariate statistical methods: the RV-coefficient. J. R. Stat. Soc. Ser. C. Appl. Stat. 25, 257–265 (1976).

  67. Klingenberg, C. P. Morphometric integration and modularity in configurations of landmarks: tools for evaluating a priori hypotheses. Evol. Dev. 11, 405–421 (2009).

  68. Rohlf, F. J. & Slice, D. Extensions of the Procrustes method for the optimal superimposition of landmarks. Syst. Biol. 39, 40–59 (1990).

  69. Olson, C. L. On choosing a test statistic in multivariate analysis of variance. Psychol. Bull. 83, 579–586 (1976).

  70. Ferreira, M. A. R. & Purcell, S. M. A multivariate test of association. Bioinformatics 25, 132–133 (2009).

  71. Galesloot, T. E., van Steen, K., Kiemeney, L. A. L. M., Janss, L. L. & Vermeulen, S. H. A comparison of multivariate genome-wide association methods. PLoS One 9, e95923 (2014).

  72. Porter, H. F. & O’Reilly, P. F. Multivariate simulation framework reveals performance of multi-trait GWAS methods. Sci. Rep. 7, 38837 (2017).

  73. O’Reilly, P. F. et al. MultiPhen: joint model of multiple phenotypes can increase discovery in GWAS. PLoS One 7, e34861 (2012).

  74. Korte, A. et al. A mixed-model approach for genome-wide association studies of correlated traits in structured populations. Nat. Genet. 44, 1066–1071 (2012).

  75. Stephens, M. A unified framework for association analysis with multiple related phenotypes. PLoS One 8, e65245 (2013).

  76. Zhou, X. & Stephens, M. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nat. Methods 11, 407–409 (2014).

  77. Devroye, L. Non-uniform Random Variate Generation (Springer, 1986).

  78. Zerbino, D. R. et al. Ensembl 2018. Nucleic Acids Res. 46, D754–D761 (2018).

  79. Karolchik, D. et al. The UCSC table browser data retrieval tool. Nucleic Acids Res. 32, D493–D496 (2004).

  80. Hooper, J. E. et al. Systems biology of facial development: contributions of ectoderm and mesenchyme. Dev. Biol. 426, 97–114 (2017).

  81. Pers, T. H., Timshel, P. & Hirschhorn, J. N. SNPsnap: a Web-based tool for identification and annotation of matched SNPs. Bioinformatics 31, 418–420 (2015).

  82. Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).

  83. Rosseel, Y. lavaan: an R package for structural equation modeling. J. Stat. Softw. 48, 1–36 (2012).

  84. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, 7 (2015).

  85. Najafova, Z. et al. BRD4 localization to lineage-specific enhancers is associated with a distinct transcription factor repertoire. Nucleic Acids Res. 45, 127–141 (2017).

  86. Baumgart, S. J. et al. CHD1 regulates cell fate determination by activation of differentiation-induced genes. Nucleic Acids Res. 45, 7722–7735 (2017).

  87. Nott, A. et al. Brain cell type-specific enhancer-promoter interactome maps and disease risk association. Science 366, 1134–1139 (2019).

  88. Pattison, J. M. et al. Retinoic acid and BMP4 cooperate with TP63 to alter chromatin dynamics during surface epithelial commitment. Nat. Genet. 50, 1658–1665 (2018).

  89. Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).

Acknowledgements

We are extremely grateful to all the individuals and families who took part in this study, the midwives for their help in recruiting them and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists and nurses. We are also very grateful to all of the US participants for generously donating their time to our research, and to present and former laboratory members who worked tirelessly to make these analyses possible. Pittsburgh personnel, data collection and analyses were supported by the National Institute of Dental and Craniofacial Research (U01-DE020078, program director/principal investigators (PD/PIs): M.L.M./S.M.W.; R01-DE016148, PD/PIs: M.L.M./S.M.W.; and R01-DE027023, PD/PIs: S.M.W./J.R.S.). Funding for genotyping was provided by the National Human Genome Research Institute (X01-HG007821 and X01-HG007485, PD/PI: M.L.M.), and funding for initial genomic data cleaning by the University of Washington was provided by contract HHSN268201200008I from the National Institute of Dental and Craniofacial Research awarded to the Center for Inherited Disease Research (https://www.cidr.jhmi.edu/). Penn State personnel, data collection and analyses were supported by Procter & Gamble, Company (UCRI-2015-1117-HN-532, PD/PIs: H.L.N.), the Center for Human Evolution and Development at Penn State, the Science Foundation of Ireland Walton Fellowship (04.W4/B643, PD/PI: M.D.S.), the US National Institute of Justice (2008-DN-BX-K125, PD/PI: M.D.S.; and 2018-DU-BX-0219, PD/PIs: S.W.) and by the US Department of Defense. IUPUI personnel, data collection and analyses were supported by the National Institute of Justice (2015-R2-CX-0023, 2014-DN-BX-K031 and 2018-DU-BX-0219, PD/PI: S.W.). University of Cincinnati personnel and data collection were supported by Procter & Gamble, Company (UCRI-2015-1117-HN-532, PD/PI: H.L.N.). The UK Medical Research Council and Wellcome (grant no. 102215/2/13/2) and the University of Bristol provide core support for ALSPAC. The publication is the work of the authors and K.I. and P.C. will serve as guarantors for the contents of this paper. A comprehensive list of grant funding is available on the ALSPAC website (http://www.bristol.ac.uk/alspac/external/documents/grant-acknowledgements.pdf). ALSPAC GWAS data was generated by Sample Logistics and Genotyping Facilities at Wellcome Sanger Institute and LabCorp (Laboratory Corporation of America) using support from 23andMe. The KU Leuven research team and analyses were supported by the National Institute of Dental and Craniofacial Research (R01-DE027023, PD/PIs: S.M.W./J.R.S.), The Research Fund KU Leuven (BOF-C1, C14/15/081 and C14/20/081, PD/PI: P.C.), The Research Program of the Research Foundation – Flanders (FWO, G078518N, PD/PI: P.C.) and a Senior Clinical Investigator Fellowship of The Research Foundation – Flanders (G078714N, PD/PI: G.H.). Stanford University personnel and analyses were supported by the National Institute of Dental and Craniofacial Research (R01-DE027023, PD/PIs: S.M.W./J.R.S.; and U01-DE024430, PD/PIs: J.W./L. Selleri), the Howard Hughes Medical Institute and the March of Dimes Foundation (1-FY15-312, PD/PI: J.W.).

Author information

Contributions

P.C., M.D.S., S.M.W., J.R.S., J.W. and S.W. conceptualized the study (ideas; formulation or evolution of overarching research goals and aims). J.D.W., K.I., R.J.E., M.K.L., J.L., S.W. and P.C. carried out the data curation (management activities to annotate (produce metadata), scrub data and maintain research data for initial use and later re-use). J.D.W., K.I., S.N., R.J.E., H.H., J.R., J.L. and P.C. carried out the formal analysis (application of statistical, mathematical, computational or other formal techniques to analyze or synthesize study data). S.R., H.L.N., E.F., T.S., M.L.M., J.R.S., J.W., S.W., S.M.W., M.D.S. and P.C. were responsible for funding acquisition (acquisition of the financial support for the project leading to this publication). J.D.W., K.I., S.N., R.J.E., H.H., J.R., M.K.L., J.L. and P.C. carried out the investigation (conducting a research and investigation process, specifically performing the experiments or data/evidence collection). J.D.W., S.N., R.J.E., J.M., S.R., E.E.Q., H.L.N., T.S., M.L.M., J.W., S.W., S.M.W. and M.D.S. provided the resources (provision of study materials, computing resources or other analysis tools). P.C., S.M.W., M.D.S., S.W., J.W., J.R.S., M.L.M., T.S., H.P. and G.H. carried out the supervision (oversight and leadership responsibility for the research activity planning and execution, including mentorship external to the core team). J.D.W., K.I., S.N., R.J.E., H.H., J.R., M.K.L. and P.C. did the visualization (preparation, creation and/or presentation of the published work, specifically visualization/data presentation). J.D.W., K.I., S.N., R.J.E. and J.R. wrote the original draft. J.D.W., K.I., S.N., R.J.E., H.H., J.R., S.R., E.E.Q., M.L.M., H.P., J.R.S., J.W., S.W., S.M.W., M.D.S. and P.C. reviewed and edited the final manuscript.

Corresponding authors

Correspondence to Julie D. White, Karlijne Indencleef or Peter Claes.

Ethics declarations

Competing interests

H.L.N. has received $6,000 in consulting fees from Procter & Gamble, Company. Procter & Gamble, Company had no role in the conceptualization, design, data analysis, decision to publish or preparation of this manuscript. All other authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Hierarchical spectral clustering of facial shape.

a, Global-to-local facial segmentation of all 3D images (nTotal = 8,246), obtained using hierarchical spectral clustering. Segments are colored in teal and identical to those in Fig. 1. Roman numerals represent ‘quadrants’ of facial segments. b, The number of principal components retained after parallel analysis for each facial segment.

Extended Data Fig. 2 Study design.

Sample Wrangling: Images and genotypes from each study were intersected, and unrelated participants of European ancestry with quality-controlled images, covariates and imputed genetic data were selected to obtain the analyzed data. Identification: For each facial segment, canonical correlation analysis (CCA) and Rao’s F-test approximation were used to identify the multivariate combination of facial principal components most correlated with the genotypes, which yielded a P value (PCCA-US or PCCA-UK) and the multivariate phenotypic trait most correlated with each SNP (TraitUS and TraitUK). Verification: The principal components of the other dataset were then projected onto this trait to obtain a univariate variable representing the distribution of participants from the verification dataset for the trait identified in the identification dataset (UniVarUK and UniVarUS). The genotypes of the verification dataset were then tested against this variable via linear regression, resulting in an additional P value (PUniVar-UK and PUniVar-US). Meta-Analysis: The P values from identification and verification were meta-analyzed using Stouffer’s method, resulting in the final set of P values from each meta-analysis track (PMETA-US and PMETA-UK).
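The meta-analysis step named in this caption can be illustrated with a minimal sketch of Stouffer's Z-score method. It assumes one-sided (upper-tail) P values and, by default, equal weights; the function name `stouffer_combine` and the optional `weights` argument are hypothetical and not taken from the authors' scripts, which may weight or orient the P values differently.

```python
from math import erfc, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def stouffer_combine(p_values, weights=None):
    """Combine one-sided P values with Stouffer's Z-score method.

    Each P value is converted to a Z score, the Z scores are summed
    (optionally weighted), and the sum is rescaled so that it is again
    standard normal under the null hypothesis.
    """
    if weights is None:
        weights = [1.0] * len(p_values)  # assumption: unweighted variant
    if len(weights) != len(p_values):
        raise ValueError("need exactly one weight per P value")
    # Upper-tail Z score; -inv_cdf(p) stays accurate for very small p.
    z_scores = [-_STD_NORMAL.inv_cdf(p) for p in p_values]
    combined_z = sum(w * z for w, z in zip(weights, z_scores))
    combined_z /= sqrt(sum(w * w for w in weights))
    # Upper-tail probability of the combined Z, computed via erfc.
    return 0.5 * erfc(combined_z / sqrt(2.0))


if __name__ == "__main__":
    # Illustrative inputs only: one identification and one verification P value.
    print(stouffer_combine([1e-6, 3e-4]))
```

Two moderately small P values reinforce each other under this rule: combining 0.05 with 0.05 gives a value close to 0.01, whereas a single P value is returned unchanged.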

Extended Data Fig. 3 Genomic signal correlations.

LDSC correlations between segments. a, Correlations between segments from different quadrants, ranging from 0.8 to 0.88, which seem to reflect both physical proximity of segments on the face and shared embryological origins. b, Correlations ranging from 0.88 to 1, which are mostly between segments within the same facial quadrant.

Extended Data Fig. 4 Clustering of facial segments on the basis of shared genetic signals.

Correlations between facial segments on the basis of SNP P values were calculated using LDSC, as described in Methods, and average-linkage hierarchical clustering was performed using the matrix of correlation values. Quadrant colors in the legend refer to the quadrant of the polar dendrogram in which the facial segment lies, also represented by the facial images at the top. Embryonic facial prominences are assigned to each facial segment.

Extended Data Fig. 5 GREAT and FUMA analyses showing enrichment for craniofacial and limb development.

a, GREAT analysis. For the top ten GO terms in each category, plotted are the binomial test Bonferroni-corrected P value (red; negative values) and the binomial region fold enrichment (blue; positive values). After each GO term, we indicate in parentheses the number of genes in the test set with the annotation (Observed) and the total number of genes in the genome with the annotation (Total), in the format (Observed/Total). The dashed line represents significance at log10(P) = log10(0.05) = −1.3. b, FUMA analysis, indicating the KEGG pathways that were significantly enriched in our results. Multiple pathways are relevant for craniofacial development. The right panel shows the genes that are involved in the pathways.

Extended Data Fig. 6 H3K27ac signal is significantly different in 203 lead vs. 203 random SNPs for relevant facial tissues.

For all cell types and tissues, each represented by a point above, the median difference in H3K27ac RPM signal between the 203 lead SNPs and 203 random SNPs was tested for significance using a two-sided Wilcoxon rank-sum test. The thin dashed line represents the 5% false discovery rate P value of 0.0094, using the Benjamini–Hochberg method. Relative to the random, MAF-matched SNPs, the lead SNPs are significantly enriched for H3K27ac signal in many cell types, with the highest magnitude differences being from CNCCs (blue) and embryonic craniofacial tissues (orange). Test statistics used to create this plot are available in Supplementary Table 4.
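As an illustration of how a Benjamini–Hochberg P value cutoff of this kind can be derived, the following Python sketch returns the largest P value that passes the step-up rule at a chosen false discovery rate. The function name and the example P values are hypothetical; the sketch does not reproduce the 0.0094 cutoff, which depends on the actual test statistics in Supplementary Table 4.

```python
def bh_threshold(p_values, fdr=0.05):
    """Return the largest P value declared significant by the
    Benjamini-Hochberg step-up procedure, or None if none pass."""
    m = len(p_values)
    cutoff = None
    for rank, p in enumerate(sorted(p_values), start=1):
        # The rank-th smallest P value is compared with fdr * rank / m.
        if p <= fdr * rank / m:
            cutoff = p  # keep the largest P value that passes its bound
    return cutoff


# Hypothetical P values, one per cell type or tissue.
example = [0.001, 0.008, 0.012, 0.041, 0.2, 0.6]
threshold = bh_threshold(example)
```

Every test with a P value at or below the returned cutoff is called significant, so the cutoff can be drawn as a single horizontal line on a plot of P values.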

Extended Data Fig. 7 Correlation of H3K27ac activity among SEM models.

a, For all segments (aka ‘masks’), we compared the H3K27ac activity for significant SNPs from the refined SEM model for variation in that facial segment. Plotted is the Spearman’s rho correlation between pairs of SNPs significant in the same SEM model (‘Within Mask’); pairs of SNPs where one is from the SEM model and the other is not (‘Within To Out’), and where both SNPs in the pair are from a different SEM model (‘Out To Out’). Segments where the distribution of correlation across all cell types was significantly different (Benjamini–Hochberg adjusted P < 0.05) based on a two-sided Kruskal–Wallis test are indicated in black. b, For all cell types, the median correlation across all segments is plotted for each of the three SNP groupings. Significance between the means was determined using a two-sided Kruskal–Wallis test. Boxplots plot the first and third quartiles, with a dark black line representing the median. Whiskers extend to the largest and smallest values no further than 1.5 × the inter-quartile range from the first and third quartiles, respectively.
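The boxplot convention described above can be sketched in Python as follows. The quartile interpolation rule is an assumption (plotting libraries differ in how they compute quartiles), and the function name and example values are hypothetical.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Quartiles and whisker ends under the 1.5 x IQR convention.

    Assumption: quartiles use the 'inclusive' interpolation method.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observed values inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


# Hypothetical values; 30 lies beyond the upper fence and is not reached
# by the whisker.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

Note that the whiskers stop at observed data points, not at the fences themselves, which is why values beyond the fences appear as separate outliers.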

Extended Data Fig. 8 Phenotypic and marginal distributions for diplotype combinations.

For a random SNP pairing (a) and each significant epistasis pair (b–d), boxplots are plotted to visualize the epistatic effect on the phenotype. The marginal phenotypic medians of the singular genotypes (non-shaded boxplots) were used to calculate and visualize the predicted diplotype phenotypic distribution that would occur if the two genotypes were acting alone. The median phenotype was also calculated for each diplotype as the average of the marginal medians of the singular genotypes (blue dashed lines on the colored plots). This median was compared to the observed medians of the diplotypes (solid black lines; colored boxplots) via Mood’s median test with one degree of freedom. Log-transformed P values were used to color boxplots if there was a significant (P < 0.05; −log10(P) > 1.30) difference between the expected phenotype of the combined genotype and observed diplotype. Boxplots plot the first and third quartiles, with a dark black line representing the median. Whiskers extend to the largest and smallest values no further than 1.5 × the inter-quartile range from the first and third quartiles, respectively.
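The expected diplotype median described in this legend, the average of the two marginal genotype medians, can be sketched in Python as below. The function name, genotype coding and toy data are hypothetical, and the sketch stops before the significance test (Mood’s median test is not implemented here).

```python
from statistics import median


def expected_vs_observed_medians(phenotype, geno_a, geno_b):
    """Map each observed (genotype_a, genotype_b) diplotype to a tuple
    (expected median, observed median).

    The expected median is the average of the marginal medians of the
    two single-SNP genotypes, as described in the legend.
    """
    by_a, by_b, by_pair = {}, {}, {}
    for y, a, b in zip(phenotype, geno_a, geno_b):
        by_a.setdefault(a, []).append(y)
        by_b.setdefault(b, []).append(y)
        by_pair.setdefault((a, b), []).append(y)
    result = {}
    for (a, b), values in by_pair.items():
        expected = (median(by_a[a]) + median(by_b[b])) / 2
        result[(a, b)] = (expected, median(values))
    return result


# Toy data: four participants, genotypes coded as minor-allele counts.
table = expected_vs_observed_medians(
    phenotype=[1.0, 2.0, 3.0, 4.0],
    geno_a=[0, 0, 1, 1],
    geno_b=[0, 1, 0, 1],
)
```

A large gap between the expected and observed median for a diplotype is what the colored boxplots are meant to highlight; whether that gap is significant is a separate testing step.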

Extended Data Fig. 9 MSX1 and DACT1 loci.

LocusZoom plots for the two association signals near MSX1 (a), which has previously been implicated in orofacial clefting in humans and mice, and DACT1 (f), which is a novel result. Points represent one-sided -log10(P) of the METAUK meta-analysis track for the facial segment illustrated in the normal displacement figures (b, d, g) and are colored based on linkage disequilibrium with the labeled SNP. Asterisks indicate genotyped SNPs and circles indicate imputed SNPs. Facial effects for the two association signals near MSX1: rs3910659 (b) and rs13117653 (d) and the signal near DACT1: rs10047930 (g). Effects are the normal displacement (displacement in the direction locally normal to the facial surface) in each quasi-landmark of the lowest facial segment reaching genome-wide significance in METAUK, going from the minor to the major allele. Blue indicates inward depression; red indicates outward protrusion. Yellow rosette plots depict the -log10(P) of the meta-analysis P value (one-sided, right-tailed) per facial segment in the METAUK track. Black-encircled facial segments have reached genome-wide significance (P = 5 × 10⁻⁸). (c) rs3910659; (e) rs13117653; (h) rs10047930.

Extended Data Fig. 10 Regions nearby previously published SNPs associated with risk for Crohn’s disease are preferentially active in immune cells and tissues.

Each boxplot represents the distribution of H3K27ac signal in 20 kb regions around 619 Crohn’s disease-associated SNPs from the NCBI-EBI GWAS catalog in one sample. See Methods for details on calculation of H3K27ac signal. Samples corresponding to immune cells and tissues are highlighted in red. Thin dashed line at ~2.9 is the median level of signal across all cell types and tissues. Boxplots plot the first and third quartiles, with a dark black line representing the median. Whiskers extend to the largest and smallest values no further than 1.5 × the inter-quartile range from the first and third quartiles, respectively.

Supplementary information

Supplementary Information

Supplementary Notes 1–3, Methods, Figs. 1 and 2, and Data 1 and 2.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–5.

Supplementary Data 1

For each of the 24 multi-peak loci (listed in Supplementary Table 5): (A) -log10(P) of the meta-analysis one-sided, right-tailed P value per facial segment in METAUS and METAUK tracks. Black-encircled facial segments have reached genome-wide significance (P = 5 × 10⁻⁸). (B) The normal displacement (displacement in the direction locally normal to the facial surface) in each quasi-landmark of the facial segment reaching the lowest P value in METAUS and METAUK, going from the minor to the major allele. Blue indicates inward depression; red indicates outward protrusion. (C) LocusZoom plots in METAUS (top) and METAUK (bottom), for the segment in which the SNP had its lowest P value (one-sided). Points are colored based on linkage disequilibrium (r²) in the 1000 Genomes Phase 3 EUR population. Asterisks represent genotyped SNPs and circles represent imputed SNPs.

Supplementary Data 2

For each of the 50 segments with well-fitting SEM models, in this table we provide the number of PCs included to represent shape variation in that segment, the number of SNPs that survived the model refinement process (see Methods), the P value cutoff used to perform the model refinement and determine the SNPs to be used for epistasis, the number of SNPs used in the epistasis analysis for this segment, and values for the χ², CFI, RMSE and SRMR model fit indices, which were used to evaluate the models for our analysis. We also include the TLI and GFI model fit indices for completeness. This table also contains internal links to separate tabs, where, for each surviving model, we have listed the parameters used and the estimate, standard error, z-score, two-sided P value and 95% CIs. SNPs that were selected for epistasis testing are highlighted in green.


Cite this article

White, J.D., Indencleef, K., Naqvi, S. et al. Insights into the genetic architecture of the human face. Nat Genet 53, 45–53 (2021). https://doi.org/10.1038/s41588-020-00741-7


  • DOI: https://doi.org/10.1038/s41588-020-00741-7
