Which genes are essential for mammalian life? Answering so large a question takes a major effort and substantial capacity for analyzing phenotype data. Using high-throughput technologies and a large collaboration of experts and resources, researchers now have an estimate of the percentage of genes in the mouse genome necessary for development and survival (Nature 537, 508–514; 2016).

The study, led by senior and corresponding author Stephen Murray at The Jackson Laboratory (Bar Harbor, ME), involved colleagues from several institutions and the International Mouse Phenotyping Consortium (IMPC). Using 1,751 knockout mice (the first set of the nearly 5,000 knockout mice the IMPC plans to generate) and a standardized phenotyping pipeline that included high-resolution 3-D imaging of whole embryos, the group identified 410 genes, about 23% of those screened, that were necessary for proper development and survival. Their systematic approach, combined with their imaging platform, allowed the group to observe and document phenotypes during early development on a very fine scale, and to uncover a remarkable amount of variability across embryos with the same mutation. The knockout mice were all generated on a defined C57BL/6N background, a popular strain used in development studies. “When looking across the seven or eight embryos generated for each knockout, we found variations in phenotype at a surprising frequency,” says Murray. “We expect diversity when we look across different genetic backgrounds, but this is the first large-scale documentation of pervasive variable expressivity in a defined genetic background.”

The work also serves as an example of how research into genotype-phenotype relationships is fundamentally changing. Rather than relying on one lab, one gene, and idiosyncratic phenotyping methods that are difficult for others to reproduce, the consortium-based effort takes advantage of scale to maximize efficiency and standardize processes across multiple research centers. The study is also distinctive in that the large amount of data it generated is completely open to the research community, which could aid scientists in both academia and industry. “This freely available and accessible dataset provides significant new gene-phenotype associations to enable scientists to prioritize gene candidates identified in their preclinical and discovery research,” says co-first author Ann Flenniken, manager of the Clinical Phenotyping Core at The Centre for Phenogenomics (Toronto, Canada).

In addition to the data, the genetically engineered mice underlying the work are available to the community, which should help generate more scientific returns on IMPC's investment. “This paper is just the tip of the iceberg,” says co-first author Mary Dickinson of the Baylor College of Medicine (Houston, TX). “We want the scientific community to know even more about IMPC efforts and that they have access to the mice as well as the phenotype data.”