Virtual gene cloning is here. Full-length gene sequences can be obtained in a few hours in front of a computer rather than after weeks of intensive labour at the bench. Genes, their protein products and functions, and their equivalents in other species (homologues and orthologues), are being identified, named and published at an astonishing rate.

But one side-effect is a growing chaos in nomenclature. A single protein is often studied simultaneously by a number of independent laboratories, each using its own pet name and refusing to acknowledge other names or to agree on a single label. This inevitably leads to problems for anyone trying to stay abreast of the literature.

Take, for example, the EphB2 receptor, a protein involved in signalling in the brain. This tyrosine kinase was originally isolated from the chick and published in 1991 as Cek5, but was subsequently referred to as Nuk, Erk, Qek5, Tyro6, Sek3, Hek5 and Drt according to species, tissue or function. Happily, in this case, following a workshop in Heidelberg, a proposal was drawn up, in consultation with the human and mouse gene-nomenclature committees, to systematically rename this protein and its family members. The scientific community at large was alerted to the agreed nomenclature of EphB2 (Cell 90, 403–404; 1997), which has been used ever since.

But the speed with which genes are being identified surpasses the rate at which any consolidated naming strategy is being developed. And there are other worrying sources of confusion. In describing a gene or protein, researchers should explicitly declare all other associated names and functions. For example, a paper submitted to any journal describing a new gene with an assigned function should unambiguously cite any previous literature on an orthologue or homologue, both to support an effective review process and to avoid unnecessary confusion in the literature, even though this might undermine the authors' ability to lay claim to a new label of their own. In practice, such referencing is sometimes not as diligent as is desirable. Thus, on occasion, what appears to be the characterization of many different genes actually reflects one protein studied by many independent laboratories. What is more, two or more groups simultaneously submitting papers that describe the same protein in different guises are seldom able to get together and agree on a single name before publication.

How might sanity, in the form of standard nomenclature, be helped to prevail? Those administering databases could play a much more important role. A database worthy of the name should be in a position to screen new entries and to insist on conformity to a nomenclature system as a condition of registration. With such a process in place, journals could impose a requirement of prior registration as a condition of publication. Whatever the process, it remains to be seen whether rival groups would agree to refer to proteins by their first published name unless otherwise decided by all authors concerned.

For its part, Nature will henceforth be more rigorous in requiring that authors of papers describing the function of a protein also state all other known names of that protein the first time it is mentioned in the text. This may seem like an incremental step in dealing with the problem, but it is a start towards urgently needed improvements in communication within and beyond the biology community.