When applied on a large scale to electronic medical record data, the PheWAS approach replicates GWAS associations and reveals potentially new pleiotropic associations.
By mathematically 'silencing' spurious, indirect correlations in networks, two groups devise approaches for improving many different types of network analyses.
Optimized algorithms from the field of electrical-signal processing improve the identification of genomic signals from diverse high-throughput sequencing experiments, such as ChIP-seq, DNase-seq and FAIRE-seq.
Genes that cause a mutant phenotype are efficiently identified in genetic screens of model and non-model organisms using whole-genome sequencing data, without requiring segregating populations, genetic maps or reference sequences.
The MuTect algorithm for calling somatic point mutations enables subclonal analysis of the whole-genome or whole-exome sequencing data being generated in large-scale cancer genomics projects.
The most comprehensive analysis to date of models of transcription-factor binding specificity reveals the best methods for predicting in vivo binding from in vitro data.
High-throughput network maps are used to automatically (or semi-automatically) reconstruct an ontology that recapitulates much of the Gene Ontology and finds additional terms and relations.
Tumors vary in their ratio of normal to cancerous cells and in their genomic copy number. Carter et al. describe an analytic method for inferring the purity and ploidy of a tumor sample, enabling longitudinal studies of subclonal mutations and tumor evolution.
Small sequencing machines no bigger than a laser printer have many potential applications in diagnostics and public health. Loman et al. compare the quality, throughput and cost of instruments from Illumina, Roche and Life Technologies.
Sites where RNA editing occurs can be found using RNA-Seq, but false positives confound the data analysis. Peng et al. describe algorithms for accurately calling editing events, and apply them to identify ~22,600 events, mostly A→G changes, in a human transcriptome.
Large-scale structural genomics and genome-wide association studies generate a wealth of data relevant to human disease. Wang et al. interpret these data in the context of a protein interaction network, showing that systematic analyses of the structural interfaces hit by mutations yield insights into pathogenesis.
Over 90% of human whole-genome sequencing has been performed using instruments from two companies, Illumina and Complete Genomics. Lam et al. sequence the same DNA samples with both instruments and compare their performance for calling insertions, deletions and single-nucleotide variants.
Data filters separate true genetic variants in sequencing data from sequencing errors, but their effectiveness is difficult to assess. Reumers et al. use the genome sequences of monozygotic twins to evaluate the performance of filters individually and in combination, leading to a 290-fold reduction in error rate in calling single-nucleotide variants.
Copy-number changes in cancer genomes may be caused by errors during the replication of colocalized DNA regions. De and Michor provide genome-wide evidence for this model by integrating data on DNA replication timing, the three-dimensional organization of the genome and copy-number alterations in cancer.
Copy-number changes, point mutations and rearrangements are all usually found in cancer genomes, but their relative frequencies are highly variable. Using statistical approaches to model different processes, Fudenberg et al. find that copy-number gain and loss are influenced by the three-dimensional organization of the genome in the nucleus.
New instruments can measure the presence of >30 molecular markers for massive numbers of single cells, but data analysis algorithms have lagged behind. Qiu et al. describe an approach called SPADE for recovering cellular hierarchies from mass or flow cytometry data.